
Wei et al. Art Int Surg 2024;4:187-98 | http://dx.doi.org/10.20517/ais.2024.12     Page 193

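               Table 1. Depth evaluation metrics

               Metrics      Definition

               Abs Rel      $\frac{1}{|D|} \sum_{d \in D} |d - d^*| / d^*$
               Sq Rel       $\frac{1}{|D|} \sum_{d \in D} |d - d^*|^2 / d^*$
               RMSE         $\sqrt{\frac{1}{|D|} \sum_{d \in D} |d^* - d|^2}$
               RMSE log     $\sqrt{\frac{1}{|D|} \sum_{d \in D} |\log d^* - \log d|^2}$
               Threshold    $\frac{1}{|D|} \left| \{ d \in D \mid \max(d/d^*, d^*/d) < \delta \} \right| \times 100\%$

               $d$ and $d^*$ represent the estimated depth value and the corresponding ground truth. $D$ corresponds to
               the estimated depth map. RMSE: Root mean square error.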
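
               The definitions in Table 1 can be sketched in a few lines of Python. This is a minimal illustration,
               not the authors' evaluation code: the function name `depth_metrics`, the dictionary keys, and the
               default threshold of 1.25 (a common choice; no numeric value is given in Table 1) are assumptions,
               and all depths are assumed to be strictly positive.

```python
import math


def depth_metrics(pred, gt, threshold=1.25):
    """Compute the Table 1 metrics over paired, strictly positive depth values.

    pred, gt  : equal-length sequences of estimated and ground-truth depths.
    threshold : assumed accuracy threshold (1.25 is a common choice, not taken
                from the paper).
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    n = len(pred)
    pairs = list(zip(pred, gt))

    # Mean absolute error relative to the ground truth.
    abs_rel = sum(abs(d - g) / g for d, g in pairs) / n
    # Mean squared error relative to the ground truth.
    sq_rel = sum((d - g) ** 2 / g for d, g in pairs) / n
    # Root mean square error in depth units.
    rmse = math.sqrt(sum((g - d) ** 2 for d, g in pairs) / n)
    # Root mean square error in log-depth.
    rmse_log = math.sqrt(sum((math.log(g) - math.log(d)) ** 2 for d, g in pairs) / n)
    # Percentage of values whose ratio to the ground truth is below the threshold.
    inliers = sum(1 for d, g in pairs if max(d / g, g / d) < threshold)
    threshold_acc = 100.0 * inliers / n

    return {
        "abs_rel": abs_rel,
        "sq_rel": sq_rel,
        "rmse": rmse,
        "rmse_log": rmse_log,
        "threshold_acc": threshold_acc,
    }
```

               In practice the sequences would hold only the pixels with valid ground truth, since the metrics
               are undefined where the ground-truth depth is missing or zero.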
               3.3 Evaluation on scale-aware depth estimation
               We compare the depth estimation accuracy of the KV-EndoNeRF method with that of several other deep
               learning-based approaches and the SfM method, specifically COLMAP [7].
                • COLMAP [7] is a general-purpose SfM pipeline used for reconstructing 3D point clouds from
                  ordered and unordered image collections. In our study, we apply it to monocular surgical scene
                  reconstruction. The recovered points are then projected onto each image plane to obtain the sparse
                  depth maps for evaluation.
                • EndoSLAM [30] is an unsupervised relative monocular depth estimation method specifically designed for
                  gastrointestinal tract organs. It combines residual networks with a spatial attention module to focus on
                  highly textured tissue regions. We fine-tune the depth model using the SCARED data for comparison.
                • AF-SfMLearner [10] is a novel self-supervised network for estimating monocular depth in endoscopic scenes.
                  It is trained on the SCARED dataset, which contains severe brightness fluctuations induced by illumination
                  variations, non-Lambertian reflections, and inter-reflections.
                • DS-NeRF [17] is a general depth-supervised NeRF method that utilizes sparse reconstruction from the SfM
                  to recover dense 3D structures. We apply DS-NeRF to estimate dense depth maps for each endoscopic
                  image.
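
               The projection step used for the COLMAP baseline can be illustrated with a simple pinhole model.
               This is a sketch under stated assumptions rather than the procedure used in the paper: the function
               name `project_to_sparse_depth` is hypothetical, the points are assumed to be already expressed in
               the camera frame of the target image, lens distortion is ignored, and the intrinsics `fx`, `fy`,
               `cx`, `cy` are illustrative parameters.

```python
def project_to_sparse_depth(points_cam, fx, fy, cx, cy, width, height):
    """Project 3D points (camera frame) onto the image plane as a sparse depth map.

    points_cam : iterable of (x, y, z) tuples in the camera coordinate frame.
    Returns a height x width nested list; 0.0 marks pixels with no projected point.
    """
    depth = [[0.0] * width for _ in range(height)]
    for x, y, z in points_cam:
        if z <= 0:
            continue  # point lies behind the camera, so it is not visible
        # Pinhole projection to the nearest pixel.
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            # When several points hit the same pixel, keep the closest one.
            if depth[v][u] == 0.0 or z < depth[v][u]:
                depth[v][u] = z
    return depth
```

               Only pixels that receive a projected point carry a depth value, which is why the resulting maps
               are sparse and evaluation is restricted to those pixels.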


               We present the quantitative depth comparison results on SCARED data in Table 2, where the results are
               rescaled using the ground truth median scaling method. In addition to standard depth evaluation metrics,
               we calculate the means and standard errors of the rescaling factors to demonstrate the scale-awareness
               ability. KV-EndoNeRF achieves the best up-to-scale performance on five metrics and ranks second best on
               the other two. Notably, KV-EndoNeRF also achieves nearly perfect absolute scale estimation.
               These quantitative results show that our proposed method effectively extracts absolute scale information from
               kinematics and integrates it into NeRF for further depth optimization, resulting in accurate absolute depth
               estimation.
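
               Ground truth median scaling and the rescaling-factor statistics can be sketched as follows. The
               helper names are hypothetical, and the exact estimator is an assumption: here the factor for each
               frame is the ratio of the ground-truth median to the predicted median, and the standard error is
               the sample standard deviation of the per-frame factors divided by the square root of their count.

```python
import math
import statistics


def median_scale_factor(pred, gt):
    """Rescaling factor that aligns the predicted median depth with the ground truth."""
    return statistics.median(gt) / statistics.median(pred)


def rescale(pred, gt):
    """Return the rescaled prediction together with the factor that was applied."""
    s = median_scale_factor(pred, gt)
    return [s * d for d in pred], s


def scale_statistics(factors):
    """Mean and standard error of per-frame rescaling factors."""
    mean = statistics.mean(factors)
    if len(factors) < 2:
        return mean, 0.0  # a standard error needs at least two samples
    sem = statistics.stdev(factors) / math.sqrt(len(factors))
    return mean, sem
```

               Under this convention, a method that already predicts absolute depth would yield factors with a mean
               close to 1 and a small standard error, whereas a method that recovers depth only up to scale would
               yield factors that depart from 1 or vary from frame to frame.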


               Furthermore, we select four representative images from the SCARED dataset for qualitative depth comparison.
               As shown in Figure 3, our method with NeRF-based optimization produces depth predictions with sharp
               boundaries and fine-grained details, outperforming other approaches in terms of absolute depth estimation.
               In contrast, COLMAP can only recover sparse depth maps without the entire 3D geometry of the tissue surface.
               While EndoSLAM and AF-SfMLearner are capable of generating reasonable 3D structures of tissues, they lose
               many details in tissues with complex geometries and edges. Lastly, the estimated depth values from DS-NeRF
               contain significant noise, which could affect the surgeons’ observations of complicated tissue surfaces.