
               explain why it is not yet seen as a primary resource for making such important medical decisions.


               The underutilization of ChatGPT for GAS information could also be attributed to broader concerns about
               the accuracy and trustworthiness of LLMs in healthcare settings. Several studies have raised alarms about
               the potential for LLMs to disseminate misinformation or oversimplify complex medical concepts, which can
               lead to patient confusion or misinformed decisions[4,19]. Furthermore, the ability of LLMs to account for
               patient-specific factors, such as individual medical histories or co-occurring health conditions, remains
               limited. For transgender patients, whose healthcare needs are often highly specialized and require tailored
               interventions, this lack of personalization could further diminish trust in ChatGPT as a reliable resource.

               Interestingly, despite the limited use of ChatGPT for information on GAS, some participants did report that
               it positively influenced their decision making. This indicates that while ChatGPT may not currently serve as
               a primary resource, it may have greater utility as a supplementary tool, especially as LLMs evolve to
               integrate more specialized medical data and provide real-time, accurate patient feedback. Furthermore,
               research in fields such as ophthalmology and urology has shown that, while LLMs can provide reasonably
               accurate and comprehensive information, they often fall short of addressing the full range of patient
               needs[20,21], underscoring ChatGPT’s potential to complement, rather than replace, traditional sources of
               medical information for GAS[22].

               This study additionally identified several areas where ChatGPT’s content could be improved to better meet
               the needs of individuals seeking information on GAS. Participants specifically noted that ChatGPT
               provided insufficient details on financial considerations, surgical techniques, and recovery processes - key
               elements in decision making for GAS. This mirrors findings from other research in which LLMs have been
               criticized for their inability to provide comprehensive, context-specific medical information, particularly in
               areas that require detailed, patient-centered guidance[18]. These limitations are particularly concerning in
               fields like transgender healthcare, where access to accurate, personalized, and affirming medical
               information is often limited[23-25].


               From a broader perspective, the findings of this study emphasize the ongoing need to guide patients toward
               trusted, reputable sources of medical information, especially for GAS, where misinformation can have
               serious and life-long consequences. Healthcare providers should direct patients to high-quality resources,
               including peer-reviewed medical websites, such as GAS websites produced by academic institutions (e.g., our
               institution’s transgender care website, https://genderaffirmingsurgicalcare.ucsf.edu/), and consultations
               with trained professionals. It is equally important to approach the integration of
               LLMs into healthcare with caution, emphasizing that these technologies should complement, rather than
               replace, human expertise and empathy.

               Our study is not without limitations. The small number of participants who used ChatGPT specifically for
               information on GAS limits the generalizability of our results. Additionally, the reliance on self-reported data
               introduces potential biases, such as over-reporting or under-reporting the use of ChatGPT or other
               information sources. Self-reported demographics such as gender identity and sexual orientation may also
               introduce bias into the results given differences in participant understanding of specific terms, which could
               differ from widely accepted definitions. The use of Prolific carries its own limitations, including the
               potential for participant bias due to financial incentives and a participant pool that may not fully
               represent the general population. However, prior research has shown that data quality on Prolific is
               comparable to, or better than, that of other commonly used platforms such as MTurk. Efforts were also
               made to reduce the risk of