Page 299 Brenac et al. Art Int Surg 2024;4:296-315 https://dx.doi.org/10.20517/ais.2024.49
workers to respond to frequently asked questions during medical appointments. Additionally, ChatGPT can be utilized 24/7 to address patients’ concerns and promptly provide essential medical information [25]. As a result, chatbots may help decrease the need for in-person medical consultations and increase access to healthcare in rural regions at a reduced cost [26,27].
ChatGPT - enhancing readability of patient education
Readability refers to the ease with which a reader can comprehend written material, with common scoring systems including the Flesch Reading Ease Score, Flesch-Kincaid Grade Level, Gunning Fog Index, Coleman-Liau Index, and Simple Measure of Gobbledygook Index [19,21]. Each scoring system employs a unique mathematical formula to analyze factors such as the mean number of sentences, number of syllables per sentence, number of words per sentence, or number of complex words per sentence [19]. Despite the need for readable patient-facing information, multiple studies have demonstrated that the readability level of information for breast reconstruction, burn injuries, hand surgery, and gender-affirming surgery exceeds the sixth-grade readability level recommended by the American Medical Association and the National Institutes of Health (NIH) [17-19,28]. To increase the readability of online medical material, AI tools such as ChatGPT
have recently been evaluated in clinical settings [19,21]. In one study by Wang et al., the researchers performed an online search for “breast reconstruction”. They collected patient information from the top 10 websites based on “hits” and found most of them to exceed the NIH’s readability recommendation [28]. Information provided by the sources was then entered into ChatGPT with the command: “Rephrase this article to a 5th-grade readability level: ‘[Article]’” [28]. Paired t-tests of readability scores for Flesch Reading Ease, Flesch-Kincaid Grade Level, and Simple Measure of Gobbledygook Index were performed, comparing medical information provided by the websites to the same information adjusted by ChatGPT, with P < 0.05 indicating statistical significance. Results demonstrated that ChatGPT generally increased the readability of patient-facing PRS information; however, this improvement was statistically significant for only one of the ten websites, specifically “Plasticsurgery.org” [28]. Notably, readability scores still exceeded 5th-grade levels, highlighting an ongoing need to generate more readable patient-facing information [28]. Despite
this current limitation, Wang et al. demonstrated that ChatGPT has the potential to increase the readability of online information and may be utilized by plastic surgeons to help simplify complex online medical material [28]. Similar results were obtained by Baldwin et al., who evaluated ChatGPT’s ability to improve burn first aid information to an 11-year-old literacy level [19]. Baldwin et al. utilized a one-sample one-tailed t-test to compare readability scores before and after ChatGPT modification. Before ChatGPT modification, only 4% of the top 50 English webpages with burn first aid information met the 11-year-old literacy level according to the following readability formulas: Gunning Fog Index, Coleman-Liau Index, and Simple Measure of Gobbledygook Index [19]. However, after ChatGPT altered the material, 18% reached the 11-year-old literacy level [19]. Additionally, after ChatGPT modified online patient education materials, readability scores improved significantly according to all readability formulas employed (P < 0.001) [19].
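The grade-level formulas cited in these studies are simple functions of sentence, word, and syllable counts. A minimal Python sketch of two of them (Flesch-Kincaid Grade Level and Gunning Fog Index) follows; the syllable counter is a crude vowel-group heuristic, whereas production readability tools use dictionaries and more careful tokenization, so treat the numbers as approximate.

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, discounting a silent final 'e'."""
    word = word.lower()
    groups = re.findall(r"[aeiouy]+", word)
    n = len(groups)
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def _stats(text: str):
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = [count_syllables(w) for w in words]
    return len(sentences), words, syllables

def flesch_kincaid_grade(text: str) -> float:
    """FKGL = 0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59"""
    n_sent, words, syll = _stats(text)
    return 0.39 * (len(words) / n_sent) + 11.8 * (sum(syll) / len(words)) - 15.59

def gunning_fog(text: str) -> float:
    """Fog = 0.4 * ((words/sentences) + 100 * (complex_words/words)),
    where a 'complex' word has three or more syllables."""
    n_sent, words, syll = _stats(text)
    complex_words = sum(1 for s in syll if s >= 3)
    return 0.4 * ((len(words) / n_sent) + 100 * (complex_words / len(words)))
```

On a short, simple sentence both scores drop sharply relative to dense clinical prose, which is precisely the before/after contrast these studies measure.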
Likewise, Browne et al. investigated ChatGPT’s ability to enhance the readability of hand surgery information provided by the American Society for Surgery of the Hand and the British Society for Surgery of the Hand [17]. Browne et al. utilized a two-tailed paired Student’s t-test to compare readability scores before and after ChatGPT modification and set a significance level at 5% [17]. Specifically, the readability formulas utilized in this experiment include the Automated Readability Index, Gunning Fog Score, Flesch-Kincaid Grade Level, Flesch Reading Ease, Coleman-Liau Index, Simple Measure of Gobbledygook, and Linsear Write Formula. Wang et al. and Baldwin et al. utilized similar methodologies, yielding comparable results in different plastic surgery domains [19,28]. The readability of ChatGPT-modified hand surgery material improved significantly compared to unedited hand surgery information (P < 0.001) for all readability tests utilized and achieved a mean sixth-grade level for the Flesch-Kincaid Grade Level and Simple Measure of Gobbledygook tests [17]. Therefore, ChatGPT has demonstrated an ability to improve the readability of complex surgical information available to patients across multiple disciplines.
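The studies above all compare per-page readability scores before and after ChatGPT modification with a t-test. As a sketch of that arithmetic, the paired t-statistic can be computed directly in Python; the scores below are invented for illustration only (they are not data from any of the cited studies), and in practice a statistics library routine such as SciPy’s ttest_rel would also report the p-value.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(before: list[float], after: list[float]) -> float:
    """t = mean(d) / (stdev(d) / sqrt(n)) for paired differences d = before - after."""
    diffs = [b - a for b, a in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical grade-level scores for ten webpages, before and after simplification
before = [12.1, 11.4, 13.0, 10.8, 12.6, 11.9, 13.4, 12.2, 11.1, 12.8]
after = [7.9, 8.4, 8.1, 7.2, 8.8, 7.5, 9.0, 8.3, 7.7, 8.6]

t = paired_t_statistic(before, after)
# With n - 1 = 9 degrees of freedom, a two-tailed test at the 5% level
# rejects the null hypothesis of no change when |t| exceeds about 2.262.
```

A consistent drop in grade level across pages yields a large positive t-statistic, mirroring the significant improvements (P < 0.001) reported by Baldwin et al. and Browne et al.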