amounts of data, including sensitive patient information, to perform tasks effectively. Ensuring that these
data are stored and processed in a HIPAA-compliant manner is essential to protect patient confidentiality.
Another challenge is the potential for bias, a well-known limitation of LLMs. If the data used to train NLP
models are not diverse or representative, the algorithms may produce biased or inaccurate results. This
could lead to disparities in patient care, particularly for patients from underrepresented groups. Indeed, a
recent study published in NEJM AI found that nearly all large-scale clinical datasets for training medical
LLMs came from the Americas, Europe, and Asia, and covered nine languages in total[17]. This leaves many patient populations and languages underrepresented. To address this disparity, it is crucial to develop and train NLP models
using diverse datasets and to continuously monitor and address bias. Another challenge for using clinical
NLP will be integration into current systems and workflows. Many of the aforementioned technologies will
have to integrate with EHRs and access protected health information. In addition, future research should
compare multiple LLMs (GPT-4, Mistral, Claude, Bard, Perplexity, etc.) to find
which ones are most appropriate for each clinical application, as sketched below.
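As a purely illustrative sketch of what such a head-to-head comparison might look like, the Python snippet below sends one de-identified clinical prompt to several models and collects the responses for side-by-side clinician review. The query_model helper and the model identifiers are hypothetical placeholders standing in for each vendor's client library, not real API calls.

# Minimal sketch of a multi-model comparison harness. The model names and
# the query_model helper are hypothetical placeholders; a real study would
# wire each entry to the corresponding vendor SDK behind this interface.

from typing import Dict, List

def query_model(model_name: str, prompt: str) -> str:
    """Hypothetical placeholder for a vendor SDK call. Returns a stub
    string so the harness runs end to end without network access."""
    return f"[{model_name} response to: {prompt[:40]}...]"

CLINICAL_PROMPT = (
    "Summarize, at an eighth-grade reading level, the key risks a patient "
    "should understand before autologous breast reconstruction."
)

# Illustrative model identifiers only; not exact API model strings.
MODELS: List[str] = ["gpt-4", "mistral-large", "claude", "perplexity"]

def compare_models(models: List[str], prompt: str) -> Dict[str, str]:
    """Collect each model's answer to the same prompt so clinicians can
    rate the outputs side by side (e.g., for accuracy and readability)."""
    return {name: query_model(name, prompt) for name in models}

if __name__ == "__main__":
    for name, answer in compare_models(MODELS, CLINICAL_PROMPT).items():
        print(f"--- {name} ---\n{answer}\n")

In a real evaluation, the dictionary of responses would feed a blinded clinician rating step rather than a simple print loop.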
A critical challenge is that LLMs are prone to “hallucination”: the generation of false content due to an LLM’s extrapolation of its training data[18]. There are several examples in the literature[19,20]. Hallucinations and inaccuracies with LLMs will be a barrier to clinical adoption for medico-legal reasons, as the uncertain legal status of these models puts providers at risk. This must be addressed by the medical community before such technologies can be deployed. The key seems to be extensive prompt engineering, including the identification of best practices and close human oversight, as suggested by Shah[18]. Kwong et al. also suggest LLMs should indicate uncertainty when it is most appropriate[21].
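As a concrete, non-authoritative illustration of these suggestions, the sketch below combines a system prompt that instructs the model to mark statements it is unsure of with a small helper that routes flagged lines to human review. The prompt wording and the [UNCERTAIN] tagging convention are assumptions made for demonstration, not a validated clinical protocol.

# Illustrative prompt-engineering sketch: asking an LLM to surface its own
# uncertainty so a clinician can review flagged statements before use.
# The prompt wording and the [UNCERTAIN] tag are demonstration assumptions.

UNCERTAINTY_SYSTEM_PROMPT = """\
You are assisting a plastic surgeon with patient education material.
Rules:
1. If you are not confident a statement is accurate, prefix it with
   [UNCERTAIN] rather than stating it as fact.
2. If a question requires a physical exam or imaging to answer, say so
   explicitly instead of guessing.
3. Never invent citations, statistics, or drug dosages.
"""

def flag_uncertain_lines(response: str) -> list:
    """Return the lines the model itself marked as uncertain, so a human
    reviewer can check them before the text reaches a patient."""
    return [line for line in response.splitlines() if "[UNCERTAIN]" in line]

# Post-hoc triage on a mock model response:
mock_response = (
    "Recovery from a DIEP flap typically involves a hospital stay.\n"
    "[UNCERTAIN] Most patients return to desk work within two weeks."
)
for line in flag_uncertain_lines(mock_response):
    print("Needs human review:", line)

Coupling an instruction like this with downstream filtering keeps a human in the loop, consistent with the close-oversight practices described above.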
While NLP has been purported to help in diagnosis and treatment planning in other specialties[22-24], this
capacity may be limited in plastic surgery. This is because patients usually come in for a plastic surgery
consultation with a known or apparent diagnosis (e.g., breast cancer, lipodystrophy, facial aging). Even if the
diagnosis is unclear (e.g., hand pain), an astute plastic surgeon will rely on the physical exam and imaging to
establish a diagnosis. A study by Pressman et al. reinforced this notion: although they used LLMs to diagnose hand injuries based on clinical vignettes, those vignettes included detailed physical exam and imaging findings[25]. Additionally, treatment planning across all of plastic surgery also depends heavily on the physical exam, surgeon comfort, and patient goals, factors that NLP systems cannot assess.
CONCLUSION
This manuscript serves as an introduction for plastic surgeons to how NLP can be integrated into plastic surgery patient consultations to improve both documentation and communication. These applications include information extraction, summarization, ambient transcription, coding, patient understanding, translation, and a patient-facing chatbot. Through these applications, NLP has the potential to personalize care, enhance patient satisfaction, and improve workflows for plastic surgeons.
However, there are ethical considerations and challenges associated with NLP development. Plastic
surgeons should seek to create plastic surgery-specific models to maximize effectiveness. As NLP
technology continues to advance, its role in plastic surgery consultations is likely to expand, offering new
opportunities for innovation and patient-centered care.
DECLARATIONS
Authors’ contributions
Made substantial contributions to the conception and design of the review: Talwar A, Shen C, Shin JH

