Clinical challenges
Some of the most feared consequences of AI are the potential for patient harm and the increased accountability for clinicians. At baseline, introducing new technology into the medical field requires close monitoring and auditing. In the case of AI, a flawed algorithm can lead to the dissemination of iatrogenic harm, medical errors, and malpractice[1]. Incorrect model outputs leading to adverse patient outcomes also increase the potential liability for physicians, which can deter healthcare professionals from utilizing AI, as it is currently not essential to the delivery of care. Beyond inaccurate algorithms, there are also "black box" algorithms, whose operations and outputs cannot be explained; these are at the center of much controversy[27,28].
In the current climate, careful model validation, prospective auditing, and algorithm enhancements are therefore necessary, as are human support and oversight upon deployment of any AI model[1]. Transparency regarding model development, the data sources and algorithms used, and overall methodological disclosure according to the MINIMAR reporting guidelines can mitigate some of these challenges[2].
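For illustration, the following minimal Python sketch shows one way prospective auditing might be operationalized after deployment: predictions are logged against subsequently observed outcomes, and a rolling error rate is compared with a pre-specified review threshold. The class, window size, and threshold are hypothetical examples for exposition, not prescriptions drawn from the cited guidance.

```python
from collections import deque

class ProspectiveAudit:
    """Rolling audit of a deployed model's error rate (illustrative only)."""

    def __init__(self, window_size=500, error_threshold=0.10):
        # Keep only the most recent outcomes: 1 = model error, 0 = correct.
        self.outcomes = deque(maxlen=window_size)
        self.error_threshold = error_threshold

    def record(self, prediction, observed_outcome):
        """Log one prediction against the eventually observed outcome."""
        self.outcomes.append(int(prediction != observed_outcome))

    def error_rate(self):
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 0.0

    def needs_review(self):
        """Flag the model for human review once the window is full and the
        rolling error rate drifts above the pre-specified threshold."""
        return (len(self.outcomes) == self.outcomes.maxlen
                and self.error_rate() > self.error_threshold)
```

In practice, such monitoring would be one component of a broader governance process that also includes the human oversight described above.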
The Gender Shades project further underscores the importance of reporting guidelines in ensuring that rigorous bias assessments are completed[29]. The study highlighted significant racial and gender bias in commercial facial recognition technologies, revealing that AI systems from companies such as IBM, Microsoft, and Face++ were less accurate at identifying gender in darker-skinned individuals, particularly women, with error rates of up to 34.7% for darker-skinned females compared with less than 1% for lighter-skinned males. This disparity is linked to the underrepresentation of diverse phenotypes in the datasets used to train these models, which overwhelmingly consist of lighter-skinned individuals.
These findings are particularly relevant for photo-based AI applications in craniofacial surgery. As these technologies become integrated into surgical planning and diagnostics, biases could disproportionately affect individuals with darker skin tones, potentially leading to misdiagnoses or improper treatment recommendations. This underscores the necessity of using diverse and balanced datasets in the development of AI models in craniofacial surgery, as well as of conducting detailed subgroup bias assessments on gender, age, race, ethnicity, and, for image- or photography-based applications, skin tone. Of note, transparency to patients about how a model is expected to perform specifically for them, based on these subgroup bias assessments, is important to facilitate genuine informed consent. Variations in performance are expected; with transparency about a model's limitations, risk can be mitigated so that care delivery remains equitable even when a model does not perform equally across populations.
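To make the notion of a subgroup bias assessment concrete, the minimal Python sketch below computes per-subgroup error rates from paired predictions and outcomes, mirroring the disparity analysis popularized by Gender Shades. The record structure, keys, and sample data are hypothetical; a rigorous assessment would additionally report confidence intervals and pre-specified disparity thresholds.

```python
from collections import defaultdict

def subgroup_error_rates(records, group_key):
    """Compute a model's error rate within each subgroup (e.g., gender,
    age band, race, ethnicity, or skin-tone category).

    `records` is an iterable of dicts with hypothetical keys:
    'prediction', 'label', and one demographic attribute per group_key.
    """
    errors = defaultdict(int)
    totals = defaultdict(int)
    for r in records:
        group = r[group_key]
        totals[group] += 1
        errors[group] += int(r["prediction"] != r["label"])
    return {g: errors[g] / totals[g] for g in totals}

# Illustrative use with toy data: compare error rates across skin-tone
# categories and report the largest between-group disparity.
records = [
    {"prediction": 1, "label": 1, "skin_tone": "lighter"},
    {"prediction": 0, "label": 1, "skin_tone": "darker"},
    {"prediction": 1, "label": 1, "skin_tone": "darker"},
]
rates = subgroup_error_rates(records, "skin_tone")
disparity = max(rates.values()) - min(rates.values())
```

Reporting such per-subgroup figures alongside overall accuracy, as the MINIMAR guidelines encourage, gives patients and clinicians the information needed for the transparency discussed above.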
While large data pools are required to create reliable and generalizable ML models, the acquisition of such information may raise concerns regarding health data ownership, privacy, and security[30,31]. Possible malicious uses of AI technology have been reviewed and include breaches of data security and privacy, hacking of algorithms with the intent to harm, and manipulation of data, among others[32]. Governance bodies should seek to define best practices, mitigate security and safety threats, and have action plans in place should a malicious event occur[32]. Guidance navigating the expectations and consequences of using ML models in healthcare should be encouraged at all levels (patients, physicians, institutions, and governing bodies). Excessive or absolute dependence on experimental models should be avoided until robust foundations and infrastructures are in place to mitigate the risks associated with AI.
Clinical translation: real-world introduction
Watson et al. conducted semi-structured interviews with American academic medical centers regarding their use of predictive modeling and ML techniques within clinical care[33]. The team identified specific
barriers to the adoption and implementation of such models that encompassed several themes: culture and
personnel, clinical utility, financing, technology, and data. Overall, multidisciplinary development teams