Figure 1. Illustration of our proposed KV-EndoNeRF for scale-aware monocular reconstruction from the robotic endoscope. NeRF: Neural radiance fields.


We evaluate our proposed pipeline, KV-EndoNeRF, both qualitatively and quantitatively on the publicly available Stereo Correspondence And Reconstruction of Endoscopic Data (SCARED) robotic endoscope dataset. The results demonstrate that KV-EndoNeRF outperforms previous methods, showcasing its ability to achieve 3D reconstruction with accurate scale in monocular surgical scenes.






               2. METHODS

               2.1 Overview of NeRF-based scale-aware reconstruction
Considering robot-assisted endoscopy, the goal of our proposed pipeline, KV-EndoNeRF, is to achieve scale-aware monocular reconstruction from limited multi-modal data (i.e., kinematics and endoscopic image sequences). It requires neither large numbers of endoscopic images for training nor other imaging modalities, such as computed tomography (CT) and magnetic resonance imaging (MRI), for the ground truth labels. The key to our pipeline is to effectively incorporate the scale information from robot kinematics into NeRF-represented surgical scenes for optimization. Following the modeling in NeRF [15], we represent the surgical scene as a neural radiance field for further volume rendering (Section 2.2). As shown in Figure 1, we first extract the absolute scale from kinematics and then fuse it into the sparse depth produced by SfM. Under this sparse supervision, we fine-tune a monocular depth estimation network to the current endoscopic scene to obtain scene-specific coarse depth (Section 2.3). After adjusting the scale of the coarse depth estimates, we integrate them into the ray marching of NeRF and optimize the volumetric field to obtain absolute depth (Section 2.4). Finally, the refined absolute depth maps are fused into a truncated signed distance function (TSDF)-based volumetric representation according to the endoscopic trajectory (Section 2.5). This results in a reconstructed 3D model of the surgical scene with global-scale information.
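To make the scale-fusion step concrete, a minimal sketch is given below. It assumes 4x4 homogeneous endoscope poses from kinematics and from SfM, and a sparse SfM depth map stored as a {(u, v): depth} dictionary; the function names and the median-ratio alignment are illustrative placeholders for the stages described above, not the released implementation.

```python
import numpy as np

def kinematic_scale(kin_poses: np.ndarray, sfm_poses: np.ndarray) -> float:
    """Recover the absolute scale by comparing inter-frame endoscope translations
    from robot kinematics with the up-to-scale SfM trajectory (both as Nx4x4 poses)."""
    kin_step = np.linalg.norm(np.diff(kin_poses[:, :3, 3], axis=0), axis=1)
    sfm_step = np.linalg.norm(np.diff(sfm_poses[:, :3, 3], axis=0), axis=1)
    return float(np.median(kin_step / np.clip(sfm_step, 1e-9, None)))

def scale_sparse_depth(sparse_depth: dict, scale: float) -> dict:
    """Fuse the kinematic scale into the sparse SfM depths {(u, v): depth}."""
    return {uv: d * scale for uv, d in sparse_depth.items()}

def align_coarse_depth(coarse: np.ndarray, sparse_metric: dict) -> np.ndarray:
    """Adjust the scene-specific coarse depth map to metric scale with a
    median-ratio fit against the metrically scaled sparse points."""
    ratios = [d / coarse[v, u] for (u, v), d in sparse_metric.items() if coarse[v, u] > 0]
    return coarse * float(np.median(ratios))
```

The metrically aligned coarse depth would then supervise the ray marching of NeRF (Section 2.4), and the refined per-frame depths would be fused into the TSDF volume along the endoscopic trajectory (Section 2.5).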



2.2 Surgical scene representation and rendering by NeRF
NeRF has achieved impressive success in view synthesis by optimizing a neural implicit field. Our pipeline explores its potential for the optimization of depth estimates. We represent a surgical scene as a neural radiance field $F_\Theta$, which is an 8-layer multilayer perceptron (MLP) with network parameters $\Theta$. The field, $F_\Theta : (\mathbf{x}, \mathbf{d}) \rightarrow (\mathbf{c}, \sigma)$, maps a 3D point $\mathbf{x} \in \mathbb{R}^3$ and a viewing direction $\mathbf{d} \in \mathbb{R}^3$ to an RGB value $\mathbf{c}(\mathbf{x}, \mathbf{d}) \in \mathbb{R}^3$ and a space occupancy $\sigma(\mathbf{x}) \in \mathbb{R}$. With this scene representation, we further adopt the volume rendering [21] in NeRF to generate rendered images for training. The volume rendering starts by shooting a batch of endoscope rays into the surgical scene from the endoscope center $\mathbf{o}$ along the direction $\mathbf{d}$. Each ray is constructed as $\mathbf{r}(t) = \mathbf{o} + t\mathbf{d}$, where $t$ is the ray parameter. We then proceed with ray marching to sample points in the space. Specifically, we partition each camera ray $\mathbf{r}(t)$ into a batch of points $\{\mathbf{x}_i \mid \mathbf{x}_i = \mathbf{r}(t_i)\}_{i=1}^{N}$. Then, the rendered color of each ray is obtained by accumulating the colors and occupancies of these sampled points along the ray.
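A minimal sketch of this ray construction and accumulation is shown below. The `field` callable stands in for the 8-layer MLP $F_\Theta$, and the stratified sampling of the ray parameter is an assumption for illustration; the sketch follows the standard NeRF volume rendering quadrature rather than the exact sampling scheme of the paper.

```python
import torch

def sample_points(o, d, t_near, t_far, n_samples):
    """Partition each ray r(t) = o + t*d into a batch of points x_i = r(t_i)
    using stratified (jittered uniform) samples of the ray parameter t."""
    bins = torch.linspace(t_near, t_far, n_samples + 1)
    t = bins[:-1] + (bins[1:] - bins[:-1]) * torch.rand(o.shape[0], n_samples)
    x = o[:, None, :] + t[..., None] * d[:, None, :]          # (rays, samples, 3)
    return x, t

def render_rays(field, o, d, t_near=0.0, t_far=1.0, n_samples=64):
    """Accumulate the field's colors and occupancies along each ray into a
    rendered RGB value (standard volume rendering quadrature)."""
    x, t = sample_points(o, d, t_near, t_far, n_samples)
    rgb, sigma = field(x, d[:, None, :].expand_as(x))         # (rays, samples, 3), (rays, samples)
    delta = torch.cat([t[:, 1:] - t[:, :-1],
                       torch.full_like(t[:, :1], 1e10)], dim=-1)
    alpha = 1.0 - torch.exp(-sigma * delta)                   # per-segment opacity
    trans = torch.cumprod(torch.cat([torch.ones_like(alpha[:, :1]),
                                     1.0 - alpha + 1e-10], dim=-1), dim=-1)[:, :-1]
    weights = alpha * trans                                    # contribution of each sample
    return (weights[..., None] * rgb).sum(dim=1)               # rendered color per ray
```

This marching loop is also where the scale-adjusted coarse depth of Section 2.3 is later integrated (Section 2.4).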