were found to be essential to ensure the integration of ML tools into the clinical workflow. A well-defined
clinical utility with clinically relevant parameters and actionable alerts ensured the usefulness of the ML
tools. Securing funding was seen as a significant challenge to overcome to support all phases of ML model
development and deployment. Partnerships with vendors could be considered to help overcome challenges
associated with translation and the long-term sustainability of model deployment[33].
The generalizability of ML models to the real-world clinical realm can be limited despite rigorous internal
and external validation studies. It has been shown that the real-world introduction of ML models sometimes leads to lower accuracy and higher false-positive rates[34]. This discrepancy between experimentation and reality can be partially attributed to the datasets used: research datasets have been shown to be constrained by stringent inclusion and exclusion criteria[35,36]. Clinical deployment, therefore, requires close monitoring of the model and its outputs, followed by adjustments. Another aspect of validation concerns the challenges of data sharing. In that regard, “federated learning” could enable the use of large multi-institution datasets by decentralizing data analysis and sharing computational models rather than data[37].
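As a concrete illustration of this idea, the minimal sketch below implements the federated averaging pattern on synthetic data; the logistic model, site sizes, and training loop are illustrative assumptions rather than a production design, which would typically add secure aggregation and formal privacy safeguards.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Train a logistic-regression model locally by gradient descent.
    The raw patient data (X, y) never leave the institution."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1 / (1 + np.exp(-X @ w))      # sigmoid predictions
        grad = X.T @ (preds - y) / len(y)     # logistic-loss gradient
        w -= lr * grad
    return w

def federated_averaging(sites, rounds=20, n_features=10):
    """Aggregate model weights across sites; only weights are shared."""
    global_w = np.zeros(n_features)
    for _ in range(rounds):
        # Each site refines the current global model on its own data.
        local_ws = [local_update(global_w, X, y) for X, y in sites]
        # Average local weights, weighted by site size (the FedAvg step).
        sizes = np.array([len(y) for _, y in sites])
        global_w = np.average(local_ws, axis=0, weights=sizes)
    return global_w

# Hypothetical example: three institutions with synthetic cohorts.
rng = np.random.default_rng(0)
sites = [(rng.normal(size=(n, 10)), rng.integers(0, 2, size=n))
         for n in (120, 80, 200)]
model_weights = federated_averaging(sites)
```

Only the weight vectors cross institutional boundaries in this scheme, which is what allows multi-institution training without pooling patient records.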
A disconnect between developers and users may sometimes occur. The technical expert team developing
craniofacial surgery ML models may not be versed in the clinical needs and settings in which the technology
will be deployed. An in-depth understanding of the clinical environments is key for both the development
and translation of the ML tools to the bedside. Are there support team members available to perform data
entry? Is the information obtained novel or more accurate than that recorded through conventional clinical assessment? Can the outputs be easily interpreted? Are the output results clinically relevant and helpful in guiding the management of patients?[35] The clinical utility of ML models needs to be properly estimated, and clinical needs should therefore guide model development and tool creation.
Fostering clinical trust
Beyond weighing the recognized benefits and risks of the introduction of ML and AI in their practices,
surgeons may experience distrust of AI systems and their outputs. This skepticism may stem from a lack of transparency or an incomplete understanding of the underlying processes. Explainability is important for users in the context of clinical decision support systems[38,39]. Explainable AI (XAI) is an emerging field bridging humans with machines by enabling a better understanding of how a model’s decisions and predictions were made[40,41]. For clinicians to trust AI and ML models, such bidirectional dialogue and reasoning are crucial.
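For intuition, the short sketch below applies one common post hoc XAI technique, permutation feature importance, to a synthetic stand-in dataset; the data and model are hypothetical, and clinical deployments would more likely rely on richer per-patient methods such as SHAP or LIME.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for a craniofacial outcome dataset.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Permutation importance: how much does shuffling each feature degrade
# held-out performance? Larger drops mean the model relies on it more.
result = permutation_importance(model, X_test, y_test,
                                n_repeats=30, random_state=0)
for i, imp in enumerate(result.importances_mean):
    print(f"feature_{i}: {imp:.3f}")
```

An attribution of this kind gives the clinician a starting point for the dialogue described above: which inputs drove a given prediction, and do they match clinical reasoning?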
Using XAI involves significant trade-offs, such as the cost of its incorporation; these costs mostly stem from the additional computation required to create dialogue and learning capabilities between the model and the clinician.
Another identified trade-off lies between performance and interpretability: it appears that the models offering the best performance metrics are also often the least explainable[42]. As medicine strives for the best clinical performance and outcomes, the deployment of explainable yet less performant models may be
questionable. Ultimately, surgeons will have to justify clinical decisions with models that they trust and can
understand in order to provide optimal machine-augmented care. Explainability can help answer some
ethical, societal, and regulatory apprehensions and requirements if paired with rigorous model validation techniques and bias assessments[38]. To sustainably translate ML models into clinical practice, XAI appears
to be a fundamental investment that requires further attention and development. However, it is important
to note that model explainability is not a substitute for model evaluation following evidence-based medicine
best practices and robust statistical validation.
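To make the performance-interpretability trade-off concrete, the sketch below compares an inherently interpretable logistic regression with a gradient-boosted ensemble on synthetic data; the dataset and models are illustrative assumptions, and the size (or even direction) of the performance gap depends on the problem at hand.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical tabular dataset standing in for clinical features.
X, y = make_classification(n_samples=1000, n_features=15,
                           n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Interpretable model: each coefficient is directly readable.
lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Higher-capacity ensemble: often stronger on nonlinear problems,
# but its decision process is far harder to explain.
gb = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

print("logistic AUC:", roc_auc_score(y_te, lr.predict_proba(X_te)[:, 1]))
print("boosting AUC:", roc_auc_score(y_te, gb.predict_proba(X_te)[:, 1]))
print("logistic coefficients:", lr.coef_.round(2))
```

When the measured gap is small, the interpretable model may be the safer clinical choice; when it is large, the justification for a black-box model must be weighed against the trust and accountability concerns discussed above.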
Sustainability after clinical translation
The long-term sustainability of ML in practice requires financial support for data quality and access,
governance, security, continuous model validation, and operational deployment. The implementation of AI
models in clinical practice may also require the creation of new roles to facilitate the adoption and