
Ding et al. Art Int Surg 2024;4:109-38  https://dx.doi.org/10.20517/ais.2024.16     Page 123

               neural radiance combined with a time-dependent neural displacement field. Advancing this further,
               EndoSurf[249] employs three neural fields to model the surgical dynamics, shape, and texture.
               Pose estimation
               Pose estimation aims to estimate the geometric relationship between an image and a prior model, which can
               take several forms[254]. These include rigid surface models, dense deformable surfaces, point-based skeletons,
               and robot kinematic models[254,255]. Many of the same techniques and variations employed in depth
               estimation and 3D reconstruction tasks also apply here, including feature-based matching with the
               perspective-n-point problem[255] and end-to-end learning-based approaches[256]. For point-based skeletons,
               such as the human body, pose estimation relied on handcrafted features[257] before the advent of deep neural
               networks[258]. In the context of surgical data science, pose estimation is highly relevant given the amount of a
               priori information going into any surgery, which can be leveraged to create 3D models. These include
               surgical tool models, robot models, and patient images. The 6DoF pose estimation of surgical tools, for
               example, in relation to patient anatomy, can enable algorithms that anticipate surgical errors and mitigate
               the risk of injuries[254]. By identifying tools’ proximity to critical structures, pose estimation technologies can
               ensure safer operations[259]. This is further advanced by precise 6DoF pose estimation of both instruments
               and tissue.
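
The proximity idea above can be illustrated with a minimal sketch. It assumes a tool pose (rotation R and translation t, tool frame to patient frame) has already been estimated by some method, and that a critical structure is available as a set of sampled 3D points. All names, coordinates, and the 5 mm threshold are hypothetical values chosen for illustration and are not taken from the cited works.

```python
import math

def rotation_z(theta):
    """3x3 rotation about the z-axis (radians), as nested lists."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transform_point(R, t, p):
    """Apply the rigid transform x -> R x + t to a 3D point."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

def min_distance(point, structure_points):
    """Smallest Euclidean distance from a point to a set of 3D points."""
    return min(math.dist(point, q) for q in structure_points)

def proximity_alert(R, t, tip_in_tool, structure_points, threshold_mm):
    """Return (distance, alert) for the tool tip expressed in the patient frame."""
    tip_in_patient = transform_point(R, t, tip_in_tool)
    d = min_distance(tip_in_patient, structure_points)
    return d, d < threshold_mm

# Hypothetical values, all in millimetres.
R = rotation_z(math.pi / 2)        # tool rotated 90 degrees about z
t = [10.0, 0.0, 5.0]               # tool origin in the patient frame
tip_in_tool = [20.0, 0.0, 0.0]     # tip offset along the tool's x-axis
vessel = [[10.0, 23.0, 5.0], [40.0, 40.0, 5.0]]  # sampled critical structure

distance, alert = proximity_alert(R, t, tip_in_tool, vessel, threshold_mm=5.0)
print(f"tip-to-structure distance: {distance:.2f} mm, alert: {alert}")
```

In practice the structure would more likely be a mesh or segmented volume rather than a handful of points, and the threshold would depend on the procedure; the sketch only shows how a 6DoF pose turns a tool model into a distance in the patient frame.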

               Deep neural networks have shown promising results for object pose estimation in
               RGB images[254,260-263]. Modern approaches often train models to regress 2D key points instead of
               directly estimating the object pose. These key points are then used to reconstruct the 6DoF object pose
               through the perspective-n-point (PnP) algorithm, with techniques showing robust performance, even in
               scenarios with occlusions[260].
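
As a rough illustration of the key-point route, the sketch below computes the reprojection error that PnP solvers typically minimize: 3D model points are projected through a pinhole camera under a candidate pose and compared with the 2D key points. It is not a PnP solver and not the method of any cited work. The intrinsics, model points, and poses are hypothetical, and the "detected" key points are synthesized from a known pose rather than regressed by a network.

```python
import math

def project(K, R, t, X):
    """Pinhole projection of a 3D model point X under pose (R, t)."""
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    fx, fy, cx, cy = K
    return (fx * Xc[0] / Xc[2] + cx, fy * Xc[1] / Xc[2] + cy)

def reprojection_rmse(K, R, t, model_points, keypoints_2d):
    """Root-mean-square pixel distance between projected and detected key points."""
    sq = 0.0
    for X, (u, v) in zip(model_points, keypoints_2d):
        pu, pv = project(K, R, t, X)
        sq += (pu - u) ** 2 + (pv - v) ** 2
    return math.sqrt(sq / len(model_points))

# Hypothetical camera intrinsics (fx, fy, cx, cy) in pixels.
K = (800.0, 800.0, 320.0, 240.0)

# Hypothetical key points on a tool model, in the model frame (mm).
model_points = [[0.0, 0.0, 0.0], [10.0, 0.0, 0.0],
                [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]

# Synthetic ground-truth pose: no rotation, 100 mm in front of the camera.
R_true = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_true = [0.0, 0.0, 100.0]

# Stand-in for network output: key points generated from the true pose.
keypoints = [project(K, R_true, t_true, X) for X in model_points]

err_true = reprojection_rmse(K, R_true, t_true, model_points, keypoints)
err_off = reprojection_rmse(K, R_true, [2.0, 0.0, 100.0], model_points, keypoints)
print(f"RMSE at true pose: {err_true:.2f} px, at shifted pose: {err_off:.2f} px")
```

A PnP solver searches for the rotation and translation that drive this error toward zero; with real, noisy key points the minimum is nonzero, and robust variants typically discard outlying key points, for instance those affected by occlusion.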

               Hand pose estimation also benefits from these technological advancements, with several methods proposed
               for deducing hand configurations from single-frame RGB images[264]. This capability is crucial for
               understanding the interactions between surgical tools and the operating environment, offering insights into
               the precise manipulation of instruments.

               Beyond tool and hand pose estimation, human pose estimation can be applied for a broad spectrum of
               clinical applications, including surgical workflow analysis, radiation safety monitoring, and enhancing
               human-robot cooperation[265,266]. By leveraging videos from ceiling-mounted cameras, which capture both
               personnel and equipment in the operating room, human pose estimation can identify the finer activities
               within surgical phases, such as interactions between clinicians, staff, and medical equipment. The feasibility
               of estimating the poses of the individuals in an operating room, utilizing color images, depth images, or a
               combination of both, opens possibilities for real-time analysis of clinical environments[267].

               APPLICATIONS OF GEOMETRIC SCENE UNDERSTANDING EMPOWERED DIGITAL TWINS
               Geometric scene understanding plays a pivotal role in developing the DT framework by enabling the
               creation and real-time refinement of digital models based on real-world observations. Geometric
               information processing is crucial here for precise representation, visualization, and model interaction.
               Section “GEOMETRIC SCENE UNDERSTANDING TASKS” outlined the methods for processing this
               information, critical for navigating the complex geometry of surgical settings - identifying shapes, positions,
               and movements of anatomical features and tools. This section delves into the integration of geometric scene
               understanding within the DT framework, emphasizing its successful applications. It offers valuable insights
               that could be leveraged or specifically adapted to further the development of DT technologies in surgery.