Dababneh et al. Art Int Surg 2024;4:214-32 https://dx.doi.org/10.20517/ais.2024.50 Page 216
Table 1. Inclusion and exclusion criteria of the systematic review

Inclusion criteria                                           | Exclusion criteria
Articles on the integration of AI in hand and wrist surgery  | Letters to the editor
Articles published between 2014-2024                         | Systematic reviews
                                                             | Articles not related to hand or wrist surgery
                                                             | Languages other than English
                                                             | Articles related to prosthetic hand or arm

AI: Artificial intelligence.
terms for AI were included to ensure comprehensive coverage of ML-related articles in the field of hand
surgery.
In total, 1,228 articles were identified and screened using the Covidence platform. Two reviewers initially
screened the articles based on the relevance of titles and abstracts. All articles that did not refer to the
application of AI to specific concepts related to hand or wrist surgery were excluded. Two hundred and
twenty-five articles advanced to full-text screening. The inclusion and exclusion criteria applied are
listed in Table 1. Excluded articles comprised duplicates, letters to the editor, systematic reviews,
non-English publications, content unrelated to hand or wrist surgery, and articles published more than a decade ago. A
full-text review conducted by a single reviewer led to the extraction of 90 articles, which were subsequently
confirmed by a second reviewer. Each included study then underwent a manual bibliographic review to
identify other relevant studies that were not included in the primary search. This process led to the
inclusion of eight additional articles, bringing the total number of articles covered in this review to 98.
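
As a quick consistency check on the screening counts reported above, the following minimal Python sketch recomputes the totals. The variable names and the two derived exclusion counts are illustrative assumptions: only 1,228, 225, 90, 8, and 98 are reported in the text, and the derived counts assume that every record not advanced at a stage was excluded at that stage.

```python
# Counts reported in the text.
identified = 1228               # records identified and screened by title/abstract
full_text_screened = 225        # records advanced to full-text screening
included_from_search = 90       # records retained after full-text review
added_from_bibliographies = 8   # records added by manual bibliographic review

# Derived counts (not reported in the text; assumes no other attrition).
excluded_at_title_abstract = identified - full_text_screened
excluded_at_full_text = full_text_screened - included_from_search

# Total number of articles covered by the review.
total_included = included_from_search + added_from_bibliographies

if __name__ == "__main__":
    print("Excluded at title/abstract screening:", excluded_at_title_abstract)
    print("Excluded at full-text screening:", excluded_at_full_text)
    print("Total included:", total_included)
```

The recomputed total matches the 98 articles stated in the text.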
The focus of this study was to explore the application of ML in the diagnosis and management of various
hand conditions, including hand and wrist fractures, peripheral nerve injuries, carpal tunnel syndrome
(CTS), osteoarthritis (OA), and triangular fibrocartilage complex (TFCC) disorders. By examining these
innovative technologies, this study seeks to assist hand surgeons in integrating ML into their practice.
Therefore, an emphasis is placed on evaluating the performance of AI as well as its potential to enhance
resident training and improve patient communication. However, this research does not address
rehabilitation or the use of prosthetic arms and hands following nerve injury or amputations [Figure 1].
RESULTS
Use of large language models in AI
This review identified ten articles that focused on AI’s performance in executing various tasks relating to
hand surgery.
A common area in which AI’s performance was evaluated was answering hand surgery multiple-choice
exam questions. Thibaut et al. compared ChatGPT-3.5’s performance to Google’s Bard Chatbot[16]. Both
large language models (LLMs) were tasked with answering 18 questions from the European Board of Hand
Surgery (EBHS). This study showed that both platforms failed to obtain a passing score and did not adapt
their responses even after the authors provided the correct answer. A similar study carried out in 2024 tasked
ChatGPT-3.5 and ChatGPT-4 with answering the 2021 and 2022 Self-Assessment Examinations (SAE) of
the American Society for Surgery of the Hand (ASSH)[17]. ChatGPT-4 performed significantly better, with an
overall score of 68.9%, compared to ChatGPT-3.5’s 58.0%. These findings align with Ghanem et al.’s study,
which reported that ChatGPT-4 achieved an overall passing score of 61.98% in the ASSH 2019 exam[18].
Despite ChatGPT’s improvement with newer versions, most studies highlighted that it remains limited by its
inability to take clinical context into consideration[17-20].