
               Boyd et al. Art Int Surg 2024;4:316-23
               DOI: 10.20517/ais.2024.53
               Artificial Intelligence Surgery



               Original Article                                                              Open Access



               Analyzing the precision and readability of a healthcare focused artificial
               intelligence platform on common questions regarding breast augmentation


               Carter J. Boyd, Lucas R. Perez Rivera, Kshipra Hemal, Thomas J. Sorenson, Chris Amro, Mihye Choi,
               Nolan S. Karp

               Hansjörg Wyss Department of Plastic Surgery, NYU Langone Health, New York, NY 10017, USA.
               Correspondence to: Prof. Nolan S. Karp, Hansjörg Wyss Department of Plastic Surgery, NYU Langone Health, 305 East 47th,
               suite 1A, New York, NY 10017, USA. E-mail: Nolan.karp@nyulangone.org

               How to cite this article: Boyd CJ, Perez Rivera LR, Hemal K, Sorenson TJ, Amro C, Choi M, Karp NS. Analyzing the precision and
               readability of a healthcare focused artificial intelligence platform on common questions regarding breast augmentation. Art Int
               Surg 2024;4:316-23. https://dx.doi.org/10.20517/ais.2024.53

               Received: 24 Jul 2024   First Decision: 19 Sep 2024   Revised: 25 Sep 2024   Accepted: 14 Oct 2024   Published: 19 Oct 2024

               Academic Editor: Andrew Gumbs   Copy Editor: Pei-Yun Wang   Production Editor: Pei-Yun Wang

               Abstract
               Aim: The purpose of this study was to determine the quality and accessibility of outputs from a
               healthcare-specific artificial intelligence (AI) platform for common perioperative questions about a
               common plastic surgery procedure.

               Methods: Doximity GPT (Doximity, San Francisco, CA) and ChatGPT 3.5 (OpenAI, San Francisco, CA) were
               utilized to search 20 common perioperative patient inquiries regarding breast augmentation. The structure,
               content, and readability of responses were compared using t-tests and chi-square tests, with P < 0.05 used as the
               cutoff for significance.
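The study does not specify the tooling used to score readability, but the Flesch Kincaid metrics it reports have standard published formulas. The following is a minimal, illustrative Python sketch of how such scores can be computed; the syllable counter is a rough heuristic, not the (unspecified) method used by the authors.

```python
import re

def count_syllables(word: str) -> int:
    # Heuristic: count contiguous vowel groups, drop a silent trailing 'e',
    # and floor the result at one syllable per word.
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def readability(text: str) -> dict:
    # Standard Flesch Kincaid formulas applied to simple token counts.
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / sentences   # average words per sentence
    spw = syllables / len(words)   # average syllables per word
    return {
        "flesch_reading_ease": 206.835 - 1.015 * wps - 84.6 * spw,
        "flesch_kincaid_grade": 0.39 * wps + 11.8 * spw - 15.59,
    }

# Example: a short, simple sentence scores as highly readable.
scores = readability("The cat sat on the mat.")
```

Higher Reading Ease values indicate easier text, while lower Grade Level values do; this is why, in the results below, the more readable platform scores higher on the first scale and lower on the others.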

               Results: Across the 80 total AI-generated outputs, ChatGPT responses were significantly longer than
               Doximity GPT responses (331 vs. 218 words, P < 0.001). Doximity GPT outputs were structured as a letter
               from a medical provider to the patient, whereas ChatGPT outputs were formatted as a bulleted list.
               Doximity GPT outputs were significantly more readable on all four validated scales: Flesch Kincaid
               Reading Ease (42.6 vs. 29.9, P < 0.001), Flesch Kincaid Grade Level (grade 11.4 vs. 14.1, P < 0.001),
               Coleman-Liau Index (grade 14.9 vs. 17, P < 0.001), and Automated Readability Index (grade 11.3 vs. 14.8,
               P < 0.001). In terms of content, there was no difference between the two platforms in the
               appropriateness of the topic (99% overall), and the medical advice in all outputs was deemed reasonable.
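The word-count and readability comparisons above rely on two-sample t-tests. As an illustration only (the data below are made up, not the study's, and the study does not state whether it assumed equal variances), a minimal Welch's t-statistic can be computed from the standard library:

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    # Welch's two-sample t-test statistic and Welch-Satterthwaite
    # degrees of freedom; does not assume equal variances.
    m1, m2 = mean(a), mean(b)
    v1, v2 = variance(a), variance(b)   # sample variances (n - 1 denominator)
    n1, n2 = len(a), len(b)
    se2 = v1 / n1 + v2 / n2             # squared standard error of the difference
    t = (m1 - m2) / sqrt(se2)
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

# Illustrative data only: two small samples with different spreads.
t, df = welch_t([1, 2, 3, 4], [2, 4, 6, 8])  # t ≈ -1.73
```

The resulting t-statistic and degrees of freedom would then be referred to the t-distribution to obtain the P-value compared against the 0.05 cutoff.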






                           © The Author(s) 2024. Open Access This article is licensed under a Creative Commons Attribution 4.0
                           International License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, sharing,
                           adaptation, distribution and reproduction in any medium or format, for any purpose, even commercially, as
               long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and
               indicate if changes were made.

                                                                                        www.oaepublish.com/ais