Page 47                                                             Talwar et al. Art Int Surg. 2025;5:46-52  https://dx.doi.org/10.20517/ais.2024.81

               Such models could understand text, but not create new text. With the release of Generative Pre-trained
               Transformer-3 (GPT-3) in 2020, generative large language models (LLMs) have performed well on both
               generative and non-generative tasks. Newer LLMs (e.g., GPT-4, Mistral, Claude, Bard, Perplexity) have only
               improved in these capacities. Plastic surgeons must understand how NLP tools can be harnessed to improve
               our workflows.

               In plastic surgery, patient consultations are a critical component of the care process. Consultation requires
               effective communication and thorough documentation. The integration of NLP into consultations can
               enhance the quality of care and streamline this process. In this manuscript, we review the current state of
               clinical NLP integration and provide a perspective for future growth.


               NLP is revolutionizing the two overarching domains of documentation and communication, summarized in
               Table 1. Examples of NLP tasks related to documentation include information extraction and
               summarization, ambient transcription, and coding. NLP tasks related to communication include
               understanding patient goals, patient-reported outcomes (PRO), translation, health literacy, and a patient-
               facing chatbot. We also discuss ethical considerations, limitations, and challenges of clinical NLP. We are
               still in the early stages of clinical NLP development. Plastic surgeons must help guide the field toward
               beneficial applications.


               DOCUMENTATION
               Information extraction and summarization
               Plastic surgery is a highly specialized discipline. Plastic surgeons operate across the whole body.
               Unsurprisingly, our experts require very specific knowledge about patients and their medical history. This
               includes how well their comorbidities are controlled (e.g., diabetes) and information about their surgical
               history (e.g., history of abdominoplasty). Electronic health records (EHRs) contain a wealth of this
               information. However, most patient EHRs contain hundreds of clinical documents. Manually searching for
               information can be tedious due to the volume and jargon. For example, in a provider note, “PT” can mean
               “patient”, “physical therapy”, “posterior tibial artery”, “posterior tibialis”, “prothrombin time”, or “part-
               time”.
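
               The ambiguity of “PT” can be illustrated with a deliberately simple sketch. It picks the expansion
               whose cue words overlap most with the surrounding sentence. The cue lists and the function name
               disambiguate_pt are hypothetical and chosen only for illustration; an LLM would infer the sense from
               context rather than from a hand-written keyword table.

```python
import re

# Hypothetical cue words for some expansions of "PT" (illustrative only,
# not drawn from any clinical vocabulary).
PT_SENSES = {
    "patient": {"reports", "denies", "presents", "states"},
    "physical therapy": {"referral", "sessions", "rehab", "exercises"},
    "prothrombin time": {"inr", "seconds", "coagulation", "warfarin"},
    "posterior tibial artery": {"pulse", "doppler", "palpable", "artery"},
}


def disambiguate_pt(sentence):
    """Return the sense whose cue words overlap most with the sentence, or None."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {sense: len(tokens & cues) for sense, cues in PT_SENSES.items()}
    best = max(scores, key=scores.get)
    # If no cue word appears at all, decline to guess.
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    for text in (
        "PT reports pain at the donor site",
        "PT and INR within normal limits on warfarin",
        "Referral to PT for hand rehab",
    ):
        print(text, "->", disambiguate_pt(text))
```

               A keyword table like this breaks as soon as the wording changes, which is the practical argument for
               context-aware models in this setting.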

               LLMs that comprehend clinical documentation should be able to understand the context and easily extract
               information. The most basic task is “named entity recognition” - deriving the names of patients, medical
               procedures, and medications directly written in a document [1]. Plastic surgeons might use named entity
               recognition to identify key surgical information, such as what tissue was resected, what type of mesh/
               implant was used, and what flaps were used in the reconstruction.
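
               As a rough illustration of what named entity recognition returns, the sketch below tags flaps,
               implants, and mesh in a made-up operative note using a small hand-written lexicon. This is a
               dictionary lookup standing in for a trained model, not the approach of any cited study; the lexicon,
               the labels, and the function name extract_entities are all assumptions for illustration.

```python
import re

# Hypothetical mini-lexicon; a real system would rely on a trained model
# or a curated clinical vocabulary rather than a fixed list.
LEXICON = {
    "FLAP": ["DIEP flap", "TRAM flap", "latissimus dorsi flap", "ALT flap"],
    "IMPLANT": ["silicone implant", "saline implant", "tissue expander"],
    "MESH": ["acellular dermal matrix", "polypropylene mesh"],
}


def extract_entities(note):
    """Return (label, matched text, start offset) tuples sorted by position."""
    found = []
    for label, terms in LEXICON.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, note, flags=re.IGNORECASE):
                found.append((label, match.group(0), match.start()))
    return sorted(found, key=lambda entity: entity[2])


# Made-up note text, not taken from any patient record.
note = (
    "Right breast reconstruction with DIEP flap. Left side reconstructed "
    "with tissue expander and acellular dermal matrix."
)
entities = extract_entities(note)

if __name__ == "__main__":
    for label, text, start in entities:
        print(f"{label:8s} {text!r} at offset {start}")
```

               The output is a structured list of labeled spans, which is the form a surgeon-facing tool could
               filter or display when reviewing prior operative notes.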

               Summarization, on the other hand, is a more complex and generative task. It requires a holistic
               understanding of a text. Here, the generative LLMs could help plastic surgeons by processing several clinical
               documents in the EHR to summarize a patient’s surgical history (e.g., all previous breast procedures) or
               overall health status. This would help plastic surgeons prepare before or during a consultation to guide
               operative planning. A group from Stanford applied eight different LLMs for clinical summarization and
               found that several untrained LLMs summarized patient history more completely than humans and with fewer
               errors [2]. Notably, their study highlighted the importance of “prompt engineering”, that is, phrasing
               questions so the model can generate useful summaries. Their study also found that GPT-4 had the best
               performance compared to other models. Indeed, industry leader Epic© has already announced a
               forthcoming integration of GPT-4 into its platform [3]. This would enable surgeons to use Epic to
               instantaneously extract and summarize information for patient care. It would also facilitate powerful