Table 1. Participant demographics

                                                               All (n = 27)    Trainees (n = 20)    Attendings (n = 7)
Female, n (%)                                                  10 (37%)        9 (45%)              1 (14%)
Right dominant hand, n (%)                                     24 (88.9%)      18 (90%)             6 (86%)
Robotic case experience, median ± SD                           50 ± 156        45 ± 32              186 ± 231
PGY level (residents), median ± SD                             N/A             3 ± 1.27             N/A
Years of experience after training (attendings), median ± SD   N/A             N/A                  8.7 ± 6.01

PGY: Post graduate year; N/A: not applicable.
Table 2. Performance of each model in the system using 3-fold cross-validation across held-out surgeons. For technical score prediction models, we assume the type of suturing exercise (backhand, railroad) is known a priori and apply the corresponding model

Technique                                 Model                       Average weighted F-1 score    Average macro F-1 score    Average accuracy
Sub-stitch classification                 Video Swin Transformer      0.6452                        0.6400                     0.7023
Technical score prediction - backhand     Video Swin Transformer      0.7185                        0.7155                     0.7259
Technical score prediction - railroad     Video Swin Transformer      0.6430                        0.6364                     0.6411
Surgeon proficiency prediction            Random forest classifier    0.6266                        0.5805                     0.6665
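As a rough illustration of the evaluation protocol in Table 2, the sketch below pairs a surgeon-level 3-fold split with a random forest proficiency classifier. This is a minimal sketch, assuming scikit-learn's GroupKFold and placeholder features, labels, and surgeon IDs; it does not reproduce the actual feature representation used by our system.

# Minimal sketch (not the authors' code): 3-fold cross-validation across
# held-out surgeons for the proficiency classifier, using scikit-learn.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import GroupKFold

# Placeholder data: one feature vector per clip, a binary proficiency label,
# and the ID of the surgeon who performed the clip (27 participants).
rng = np.random.default_rng(0)
X = rng.normal(size=(270, 16))          # e.g., aggregated sub-stitch scores
y = rng.integers(0, 2, size=270)        # 0 = novice, 1 = proficient
surgeon_ids = rng.integers(0, 27, 270)

weighted_f1, macro_f1, accuracy = [], [], []
cv = GroupKFold(n_splits=3)             # folds never share a surgeon
for train_idx, test_idx in cv.split(X, y, groups=surgeon_ids):
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    pred = clf.predict(X[test_idx])
    weighted_f1.append(f1_score(y[test_idx], pred, average="weighted"))
    macro_f1.append(f1_score(y[test_idx], pred, average="macro"))
    accuracy.append(accuracy_score(y[test_idx], pred))

print("avg weighted F-1:", np.mean(weighted_f1))
print("avg macro F-1:   ", np.mean(macro_f1))
print("avg accuracy:    ", np.mean(accuracy))

Averaging the per-fold scores in this way corresponds to the "average" columns reported in Table 2, with each fold's test set drawn entirely from surgeons unseen during training.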
This study is one of the first to use AI to automatically assess surgical trainees on a specific robotic surgery task. It underlines the potential of AI-assisted assessment to serve as an effective educational tool for surgical trainees by identifying their proficiency and potentially providing feedback. While Ma et al. developed the first AI-based video feedback tool for robotic suturing, their study participants had no robotic surgical experience, so their work focused on improving task performance rather than determining proficiency[16]. Our model is able to provide feedback while also determining the skill level of the trainee.
Suturing is a fundamental surgical skill, and proficiency in it implies mastery of many technical elements, such as needle angulation, insertion point, depth, and tissue manipulation. Breaking the suturing task down into four sub-stitches (needle positioning, needle targeting, needle driving, and needle withdrawal) lets trainees identify the specific needle movements they need to practice while maintaining a standardized taxonomy. This suturing taxonomy, based on prior research, gives us and future researchers a reproducible methodology for automated, supervised-learning-based suturing assessment[15,17].
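As an illustration of how this taxonomy can drive a supervised sub-stitch classifier, the sketch below adapts a Video Swin Transformer to the four sub-stitch classes. It is a minimal sketch under stated assumptions: torchvision's swin3d_t is used as a stand-in backbone, and the clip length, resolution, and class names are placeholders rather than our exact configuration.

# Minimal sketch (assumptions, not the authors' implementation): a Video Swin
# Transformer from torchvision adapted to the four-class sub-stitch taxonomy.
import torch
import torch.nn as nn
from torchvision.models.video import swin3d_t

SUB_STITCHES = [
    "needle_positioning",
    "needle_targeting",
    "needle_driving",
    "needle_withdrawal",
]

# Start from the small Video Swin Transformer and replace its classification
# head with a 4-way layer for the sub-stitch classes.
model = swin3d_t(weights=None)
model.head = nn.Linear(model.head.in_features, len(SUB_STITCHES))

# A dummy input: batch of 2 clips, RGB, 16 frames, 224x224 crops.
clip = torch.randn(2, 3, 16, 224, 224)
logits = model(clip)                                    # shape: (2, 4)
predicted = [SUB_STITCHES[i] for i in logits.argmax(dim=1).tolist()]
print(predicted)

The same backbone, with a different output head and training labels, can be reused for the backhand and railroad technical score prediction models listed in Table 2.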
Such exercises are vital because they let surgical trainees practice specific movements in a controlled setting before they perform surgery on a patient. They also allow surgical attendings to gain trust in their trainees, informed by their robotic proficiency scores, before the trainees operate in a real clinical setting.
This preliminary study demonstrates only the feasibility of building this type of model to assess the skill level of a trainee, with an accuracy of 66.7%. While the model needs further improvement, its current accuracy already allows it to identify which residents need extra practice; only six participants were false negatives in our study [Figure 3]. Even at 66.7% accuracy, having flagged residents practice more and then reattempt the dry-lab exercise will only improve their skills, so the model can be used to pick out trainees who need more practice. However, we aim to improve accuracy in the future with a larger

