Ding et al. Art Int Surg 2024;4:109-38 https://dx.doi.org/10.20517/ais.2024.16 Page 127
cannot currently be processed in real time for updating geometric information and require auxiliary
constraints like stereo matching for plausible output. The lack of observability due to limited operational
space is also a major challenge that leads to a paucity of features for geometric scene understanding.
Incorporating other modalities, such as robot kinematics [123] and temporal constraints [2], can compensate
in this situation. The emergence of foundation models trained on enormous collections of natural images
also offers an alternative approach to addressing the aforementioned challenges in the surgical
domain [145,146]. However, the domain gap may hinder the extraction of precise features, and further
work may be required to extend these models to the surgical domain [147].
CONCLUSION
Surgical data science, while benefiting from the advent of end-to-end deep learning architectures, is hindered
by their lack of reliability and interoperability. The DT paradigm is envisioned to advance the surgical data
science domain further, introducing new avenues of research in surgical planning, execution, training, and
postoperative analysis by providing a universal digital representation that enables robust and interpretable
surgical data science research. Geometric scene understanding is the core building block of DT and plays a
pivotal role in building and updating digital models. In this review, we find that existing geometric
representations and well-established tasks provide the fundamental materials and tools to implement the DT
framework and have led to the emergence of successful applications. However, challenges remain in
employing more advanced but data-hungry methods, especially for segmentation, detection, and
monocular depth estimation in the surgical domain, owing to a lack of annotations and a gap in the scale
of the data. The complexity of the surgical scene, with its large proportion of dynamic and deformable tissues,
and the lack of observability due to limited operational space are also common factors that hinder the
development of geometric scene understanding tasks, especially 3D reconstruction, which demands
multi-view observations. To address these challenges, numerous approaches, including synthetic image
generation, sim-to-real generalization, auxiliary data incorporation, and foundation model adaptation, are
being explored. Among these, auxiliary data incorporation and foundation models
show the most promise. Because auxiliary data are not always available and the
exploration of foundation models in surgical data science is still preliminary, further
advances in this direction are expected to improve geometric scene understanding performance and further
promote DT research. Developing an accurate, efficient, interactive, and reliable DT requires robust and
efficient holistic geometric representations, combined with effective geometric scene understanding methods, to
build and update digital model pipelines in real time.
DECLARATIONS
Authors’ contributions
Initial writing of the majority of the paper and coordination of the collaboration among authors: Ding H
Initial writing of the 3D reconstruction part, integration, and revision of the paper: Seenivasan L
Initial writing of the depth estimation and pose estimation parts and revision of the paper: Killeen BD
Initial writing of the pose estimation and application parts: Cho SM
The main idea of the paper, overall structure, and revision of the paper: Unberath M
Availability of data and materials
See Section “Availability of data and materials” in the main text.
Financial support and sponsorship
This research is in part supported by (1) the collaborative research agreement with the Multi-Scale Medical
Robotics Center at The Chinese University of Hong Kong; (2) the Link Foundation Fellowship for
Modeling, Training, and Simulation; and (3) NIH R01EB030511 and Johns Hopkins University Internal

