Page 105                                                                Liu et al. Art Int Surg 2024;4:92-108  https://dx.doi.org/10.20517/ais.2024.19

               its reliance on basic mesh-level features rather than image features that may vary widely among OR
               appearances. To ensure that the risk of overfitting our models was accurately assessed, we stratified our
               training, evaluation, and test sets such that action clips comprising each data subset were derived from
               separate videos. Furthermore, we experimented with various regularization strategies, such as sequence
               frame sampling and reducing the number of modeled mesh features, to improve model robustness within
               the action recognition task. Our models performed comparably on our test set relative to our training set,
               providing strong evidence of their ability to generalize to new domains.
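
               The video-level separation described above can be illustrated with a minimal Python sketch. This is not
               the authors' code: the function name, the (video_id, clip_id) pair format, the split fractions, and the
               seed are all assumptions chosen for illustration. The only property it aims to show is that whole videos,
               rather than individual clips, are assigned to a subset, so no video contributes clips to more than one
               subset.

```python
import random
from collections import defaultdict


def split_clips_by_video(clips, fractions=(0.7, 0.15, 0.15), seed=0):
    """Assign whole videos to train/eval/test so no video spans two subsets.

    `clips` is a list of (video_id, clip_id) pairs. The pair format, the
    default fractions, and the seed are hypothetical, not from the paper.
    """
    # Group clip ids under the video they were cut from.
    by_video = defaultdict(list)
    for video_id, clip_id in clips:
        by_video[video_id].append(clip_id)

    # Shuffle video ids (not clips) with a fixed seed for reproducibility.
    video_ids = sorted(by_video)
    random.Random(seed).shuffle(video_ids)

    # Cut the shuffled video list into three contiguous, disjoint slices.
    n_train = int(len(video_ids) * fractions[0])
    n_eval = int(len(video_ids) * fractions[1])
    groups = {
        "train": video_ids[:n_train],
        "eval": video_ids[n_train:n_train + n_eval],
        "test": video_ids[n_train + n_eval:],
    }

    # Expand each group of videos back into its clips.
    return {
        name: [(v, c) for v in vids for c in by_video[v]]
        for name, vids in groups.items()
    }


# Toy data: 10 hypothetical videos with 4 clips each.
clips = [(f"video_{v}", f"clip_{v}_{k}") for v in range(10) for k in range(4)]
splits = split_clips_by_video(clips)
videos_in = {name: {v for v, _ in items} for name, items in splits.items()}
```

               Because the split is made over video identifiers, the sets in `videos_in` are disjoint by construction,
               and test-set performance then reflects clips from videos the model never saw during training.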


               Limitations
               There are important limitations of our work, one of which is that we focused on simulated surgery videos.
               These videos are similar to those of real endovascular procedures, since the OR layout is identical, the
               same procedural steps are simulated, and similar equipment is used. However, the simulated videos featured
               fewer people, and not all standard protective accessories, such as operating gowns, were used. While these
               simulated videos strongly resembled those of real procedures, we plan to study any challenges that may arise
               from applying our methods to videos of real procedures in future work.


               An additional limitation of our study is that our simulated videos featured relatively similar room layouts
               and human appearances. To assess the capacity of our approach to generalize to new surgical settings, we
               hope to test our approach across a larger video dataset, comprising full-duration endovascular procedures
               that capture a wide range of OR layouts and human appearances. In these future experiments, we plan to
               adhere to the ethical and legal guidelines outlined by Doyen et al. on performing continuous video
               recordings of the OR [30]. Ensuring patient privacy and practicing data stewardship are critical considerations
               in the safe integration of computer vision approaches with surgical video analysis; hence, these criteria are
               important for future HMR studies analyzing OR videos with real patients.

               One final limitation of our study is that we focused on a small subset of short-duration surgical actions
               performed by individuals. While these actions are common across OR procedures and provided a
               conceptual foundation for our study, it is important for future efforts to focus on a larger range of surgical
               actions over longer time frames. Expanding the range of identifiable surgical actions would aid in the
               automatic reconstruction of procedure timelines and the identification of critical events, which are active
               areas of surgical research [31-34]. Specifically, sequences of human actions preceding critical milestones could
               be automatically parsed, and identified surgical events could be viewed collectively to attain a full picture of
               the procedure timeline. Computer vision provides a natural way to streamline this analysis in a scalable
               manner [31,35], and our study is a proof-of-concept for how this may be achieved with HMR. In addition to
               expanding the temporal dimension of our work, we also hope to investigate the extrapolation of subject-
               level behavior to an understanding of team dynamics and interpersonal communication, which are crucial
               hallmarks of success in the OR [6].


               Clinical relevance
               Our work has important implications for surgical video analysis that seeks to improve OR efficiency and
               patient outcomes. Previous studies have found that environmental distractions (i.e., auditory and visual)
               and workflow inefficiencies in the OR can have adverse effects on team performance, resulting in
               unfavorable patient outcomes [5,22,36]. This observation has been the basis of several observational studies that
               have focused on uncovering the root causes of OR inefficiencies through the lens of human behavior. Lynch
               et al., for example, performed a manual video review of 28 surgical cases to monitor OR foot traffic and
               associated infection risk [2]. Hazlehurst et al. performed an ethnographic study of audiovisual data from 20
               open-heart surgical cases to better understand team interactions in the OR [3]. Harders et al. examined
               perioperative flow patterns within 20 ORs during a three-month period to design interventions for reducing