Lim et al. Plast Aesthet Res 2023;10:43
DOI: 10.20517/2347-9264.2023.70
Plastic and Aesthetic Research
Case Report Open Access
Evaluating the efficacy of major language models in
providing guidance for hand trauma nerve laceration
patients: a case study on Google’s AI BARD, Bing
AI, and ChatGPT
Bryan Lim 1,2, Ishith Seth 1,2,3, Gabriella Bulloch 3, Yi Xie 1, David J Hunter-Smith 1,2, Warren M Rozen 1,2
1 Department of Plastic Surgery, Peninsula Health, Melbourne 3199, Australia.
2 Central Clinical School at Monash University, The Alfred Centre, Melbourne 3004, Australia.
3 Faculty of Medicine and Surgery, The University of Melbourne 3053, Australia.
Correspondence to: Dr. Ishith Seth, Central Clinical School at Monash University, The Alfred Centre, 99 Commercial Rd,
Melbourne 3004, Australia. E-mail: ishithseth1@gmail.com
How to cite this article: Lim B, Seth I, Bulloch G, Xie Y, Hunter-Smith DJ, Rozen WM. Evaluating the efficacy of major language
models in providing guidance for hand trauma nerve laceration patients: a case study on Google’s AI BARD, Bing AI, and
ChatGPT. Plast Aesthet Res 2023;10:43. https://dx.doi.org/10.20517/2347-9264.2023.70
Received: 16 Jul 2023 First Decision: 3 Aug 2023 Revised: 5 Aug 2023 Accepted: 10 Aug 2023 Published: 17 Aug 2023
Academic Editor: Samuel O. Poore Copy Editor: Dan Zhang Production Editor: Dan Zhang
Abstract
This study evaluated three prominent Large Language Models (LLMs), Google’s AI BARD, Bing’s AI, and
ChatGPT-4, in providing patient advice for hand laceration. Five simulated patient inquiries on hand trauma were
posed to each model. A panel of Board-certified plastic surgical residents evaluated the responses for accuracy,
comprehensiveness, and appropriate sources. Responses were also compared against existing literature and
guidelines. This study suggests that ChatGPT outperforms BARD and Bing AI in providing reliable, evidence-based
clinical advice, but all three models still face limitations in depth and specificity. Healthcare professionals are essential in
interpreting LLM recommendations, and future research should improve LLM performance by integrating
specialized databases and human expertise to advance nerve injury management and optimize patient-centred
care.
Keywords: Artificial intelligence, ChatGPT, BARD, Bing AI, large language model, nerve injury
© The Author(s) 2023. Open Access This article is licensed under a Creative Commons Attribution 4.0
International License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, sharing,
adaptation, distribution and reproduction in any medium or format, for any purpose, even commercially, as
long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and
indicate if changes were made.
www.parjournal.net