
Page 194                          Wei et al. Art Int Surg 2024;4:187-98  I http://dx.doi.org/10.20517/ais.2024.12

Table 2. Quantitative comparisons for scale-aware depth estimation on SCARED

Method              Scale          Abs Rel        Sq Rel         RMSE           RMSE log       < 1.25         < 1.25²
COLMAP [7]          4.04 ± 2.24    0.044 ± 0.028  0.391 ± 0.435  4.766 ± 2.506  0.065 ± 0.033  0.979 ± 0.036  0.998 ± 0.006
EndoSLAM [30]       77.77 ± 17.10  0.079 ± 0.047  0.897 ± 1.090  7.160 ± 4.818  0.099 ± 0.052  0.931 ± 0.124  0.997 ± 0.009
AF-SfMLearner [10]  2.12 ± 0.45    0.056 ± 0.028  0.437 ± 0.560  5.103 ± 3.143  0.073 ± 0.034  0.979 ± 0.047  0.999 ± 0.005
DS-NeRF [17]        22.04 ± 9.75   0.049 ± 0.034  0.458 ± 1.012  4.866 ± 3.432  0.070 ± 0.041  0.972 ± 0.067  0.997 ± 0.012
Ours                0.95 ± 0.07    0.048 ± 0.025  0.347 ± 0.351  4.583 ± 2.247  0.066 ± 0.030  0.984 ± 0.029  0.999 ± 0.003

Abs Rel, Sq Rel, RMSE, and RMSE log are error metrics (lower is better, ↓); < 1.25 and < 1.25² are accuracy metrics (higher is better, ↑). The closer the scale is to 1, the better. The best result is in bold; the second best is underlined. SCARED: Stereo Correspondence And Reconstruction of Endoscopic Data; RMSE: root mean square error; NeRF: neural radiance fields.
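The metrics in Table 2 are the standard monocular depth-evaluation quantities. As a minimal sketch, the helper below (a hypothetical function, not from the paper's code) computes them with NumPy; the scale factor is taken as the ratio of median depths, which a scale-aware method should keep close to 1 so that no rescaling of the prediction is needed before the errors are computed.

```python
import numpy as np

def depth_metrics(gt, pred):
    """Compute standard depth-estimation metrics for a ground-truth and a
    predicted depth map (both arrays of positive depths, same shape)."""
    gt = np.asarray(gt, dtype=float)
    pred = np.asarray(pred, dtype=float)

    # Scale factor (Table 2, "Scale" column): ratio of median depths.
    scale = np.median(gt) / np.median(pred)

    # Error metrics (lower is better).
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean((gt - pred) ** 2 / gt)
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))

    # Threshold accuracies (higher is better): fraction of pixels whose
    # depth ratio falls below 1.25 and 1.25^2, respectively.
    ratio = np.maximum(gt / pred, pred / gt)
    a1 = np.mean(ratio < 1.25)
    a2 = np.mean(ratio < 1.25 ** 2)

    return {"scale": scale, "abs_rel": abs_rel, "sq_rel": sq_rel,
            "rmse": rmse, "rmse_log": rmse_log, "a1": a1, "a2": a2}
```

A method whose predictions are off by a constant factor can still score well on these metrics after median rescaling; reporting the raw scale alongside them, as in Table 2, is what makes the evaluation scale-aware.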

Figure 3. Qualitative comparisons on SCARED. Our method outperforms COLMAP [7], EndoSLAM [30], AF-SfMLearner [10], and DS-NeRF [17] in terms of depth quality. Large depth values are encoded in yellow, and small depth values in purple. SCARED: Stereo Correspondence And Reconstruction of Endoscopic Data; NeRF: neural radiance fields.


               3.4 Comparison with state-of-the-art methods
We compare our method with state-of-the-art approaches in terms of 3D reconstruction and view synthesis. First, we quantitatively assess the reconstruction results against ground-truth 3D models captured by a structured light camera [26]. Unlike other monocular scene reconstruction methods, we do not rescale the structures during evaluation, thanks to our scale-aware depth estimation. KV-EndoNeRF achieves high accuracy in 3D reconstruction, with an average root mean square error (RMSE) of 1.259 ± 0.257 mm across all data. Figure 4A shows a qualitative comparison on SCARED data. As shown in the figure, the ground-truth models in the third column, represented by gray points, indicate that these tissues have complex surfaces. The sparse point clouds recovered by COLMAP are presented in the first column of the figure. Due to the sparsity of the 3D points, it is difficult to observe the geometric structures and the textures of the tissue surfaces. In comparison, our reconstructed meshes shown in the second column present reasonable structures and rich surface detail. Furthermore, we register the reconstruction results with the ground-truth structures, and the registration results show that our 3D reconstruction matches the ground truth well. In summary, our method can reconstruct smooth 3D structures from a monocular endoscope with accurate scale, high accuracy, and rich surface texture detail. Moreover, in Figure 4B, we observe that the proposed method benefits