Page 357 Hogue et al. Art Int Surg. 2025;5:350-60 https://dx.doi.org/10.20517/ais.2025.19
literature synthesis [9,10,12,14,15,17] . These functions align closely with adult learning theory by supporting self-
directed, flexible, and iterative learning. For assessors, LLMs may support feedback generation, formative
assessment, and curriculum gap identification [18,26,27] . However, these tools are not without limitations. Their
lack of domain specificity, potential for fabricated outputs, and reliance on generalized datasets raise
concerns about safety and validity in high-stakes educational settings. As LLMs become more integrated
into training environments, programs must implement guardrails to ensure clinical accuracy, ethical use,
and oversight by qualified faculty.
While many studies in this review focused on the development and validation of novel AI tools, few
demonstrated implementation within training programs. The gap between innovation and integration
reflects the challenge of translating AI advancements into sustainable, real-world practices. None of the
included studies reported routine, integrated use of AI in active plastic surgery curricula. Implementation of
AI-based tools, particularly predictive models that require large datasets or generative models that rely on
cloud infrastructure, requires thoughtful planning. At a minimum, programs considering integration should
assess their existing digital infrastructure (e.g., access to simulation labs, high-speed internet, secure data
storage) and designate faculty to oversee pilot testing; Vannaprathip et al., Fang et al., and Yilmaz et al.
illustrate the more technology-intensive requirements of predictive models [23,24,34] . In resource-limited settings,
lower-barrier tools such as ChatGPT or podcast-generating platforms can be introduced as supplements for
asynchronous learning without the need for extensive hardware or software upgrades [9,14,15] . Starting with
smaller-scale, low-cost implementations allows programs to evaluate feasibility and acceptability before
broader rollout. As programs prepare to integrate more robust AI tools, collaboration with affiliated
computer science programs or commercial AI vendors may be considered to facilitate access to technical
expertise and potentially reduce implementation costs. Ultimately, successful integration will depend on
institutional readiness, trainee engagement, and the presence of clear educational goals that AI can enhance
without replacing.
AI applications for resident feedback analysis within the field of general surgery could readily be applied in
plastic surgery training. NLP algorithms have proven capable of processing large volumes of text-based
feedback on resident performance. In the future, algorithms could be developed that summarize narrative
feedback into key components for each resident or compute summative scores from narrative comments so
that faculty can focus on narrative feedback [26,28] . NLP algorithms can also flag assessors who repeatedly
provide low-quality feedback [26,27] .
Early applications of AI within the operating room show potential to revolutionize surgical training. For
example, machine learning algorithms may assist with preoperative planning by accurately
identifying anthropometric landmarks prior to unilateral cleft lip repair [21] . When combined with VR and
surgical simulators, AI can continuously assess surgical skills and synthesize immediate feedback [19,20,22-24] .
Such applications during resident training could reduce the burden on faculty and provide a source of
effective operative instruction that is not limited by faculty schedules or biases.
Although AI-driven interventions in surgical education appear promising, it is important to recognize their
limitations. AI should not be the sole source of objective information and will always require human
oversight. For example, Koljonen et al. had to choose the most realistic AI-generated clinic image, and their
algorithm struggled to understand medical terminology [13] . AI is limited by the extent to which its reasoning
is logical and the inputs used to optimize its performance are accurate. The extent to which AI performance
is affected by inaccurate input may be underappreciated. Machine learning algorithms will continue to require
improvements to advance AI capabilities to parallel those of humans. For example, within the field of plastic

