
Glaser et al. Art Int Surg. 2025;5:1-15  https://dx.doi.org/10.20517/ais.2024.36     Page 11

               prone to rater-dependent error [50]. Advancements in imaging techniques, including full-body electron optic
               system (EOS) radiographs, CT scans, and MRI, have enabled more accurate measurements of spinopelvic
               parameters. The development of more sophisticated software has led to accelerated measurement times via
               semi-automated computer-aided tools, such as SurgiMap [50]. Software tools such as SurgiMap have
               demonstrated a mean time of 75 ± 25 s to perform a full spinopelvic analysis, significantly
               reducing the burden associated with manual measurements [50]. Our review of the existing literature on deep
               learning models for spinopelvic parameter measurement revealed processing times ranging from 0.2 to 1 s
               per image. A set of radiographs for spinopelvic parameter measurement typically involves 2-3 images on
               average: a lateral X-ray, an anteroposterior X-ray, and possibly a full-body EOS image in more
               complicated cases. In terms of time saved, deep learning models would require an estimated 0.6-3 s to
               analyze a full set of images, compared with the 75-second mean reported for SurgiMap. Deep
               learning models are, therefore, at least roughly 25× faster (75 s versus 3 s at the upper end of that
               range, and more at the lower end). Additionally, there were studies included in our
               analysis that involved pathological images, whereas the study using SurgiMap involved images with no
               pathology, further demonstrating the capability and efficiency of deep learning technology. To contextualize
               these efficiency gains with accuracy: manual measurements typically show inter-observer variability of
               5°-10° for the Cobb angle and similar ranges for other parameters. Semi-automated tools reduced this
               variability to 3°-7°. Our meta-analysis found AI measurement errors of 4.3° for Cobb angle, 3.9° for thoracic
               kyphosis, and 3.6° for lumbar lordosis - comparable to or better than both manual and semi-automated
               methods. This suggests AI can dramatically improve measurement efficiency without compromising
               accuracy, potentially offering both time savings and measurement reliability improvements in clinical
               practice.
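
The time comparison above can be reproduced with a short calculation. The following is a minimal sketch rather than part of the original analysis: the function name `speedup_range` is hypothetical, and it assumes three images per set, which is what the quoted 0.6-3 s range implies (0.2 s × 3 and 1 s × 3).

```python
# Figures quoted in the text; the set size of 3 is an assumption
# inferred from the 0.6-3 s range (0.2 s x 3 and 1 s x 3).
SEMI_AUTOMATED_MEAN_S = 75.0        # mean SurgiMap time per full analysis
DL_SECONDS_PER_IMAGE = (0.2, 1.0)   # reported deep learning range per image
IMAGES_PER_SET = 3                  # assumed number of radiographs per set


def speedup_range(baseline_s, per_image_s, n_images):
    """Return (lowest, highest) speedup factor relative to the baseline.

    The lowest factor uses the slowest per-image time, the highest
    factor uses the fastest per-image time.
    """
    fastest_set = min(per_image_s) * n_images
    slowest_set = max(per_image_s) * n_images
    return baseline_s / slowest_set, baseline_s / fastest_set


if __name__ == "__main__":
    low, high = speedup_range(
        SEMI_AUTOMATED_MEAN_S, DL_SECONDS_PER_IMAGE, IMAGES_PER_SET
    )
    print(f"Estimated speedup: {low:.0f}x to {high:.0f}x")
```

Under these assumptions, the "roughly 25×" figure corresponds to the conservative end of the range, where each image takes the full 1 s to process.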

               No one model stood out as superior to the others. Each study and the model it used had advantages and
               disadvantages that are open to interpretation. For example, the model used by Zerouali et al. was mainly
               tested in a pediatric population; therefore, this model would likely only be of interest to a surgeon who
               operates on this population [22]. Many studies only involved a single clinical dataset, which is a key reason
               why we argue for multicenter validation to demonstrate reproducibility. Additionally, some studies did not
               train their models on patients who had implants. Therefore, these models would require further validation
               to be useable in scenarios such as postoperative evaluation and planning for revision surgery. What was
               consistent across all models was that they all were more efficient than current methods without
               compromising accuracy.

               Despite the demonstrated accuracy and efficiency of these models, there remains a gap in understanding
               their practical utility for surgeons across various clinical contexts, including preoperative and intraoperative
               stages. Theoretically, the enhancement in efficiency should offer surgeons more time to review images and
               make surgical plans. Pending multicenter validation, future research should explore whether or not the
               integration of deep learning truly enhances efficiency throughout the entire perioperative continuum. For
               example, a surgeon may use deep learning as an adjunct for formulating a preoperative plan. During
               surgery, intraoperative X-ray image evaluation may allow synchronous measurement of spinopelvic
               parameters to assess the efficacy of hardware placement. Lastly, in the postoperative phase, the technology
               can be used to predict postoperative complications and 30-day readmission rates as stated earlier, with the
               potential for much more.


               A notable limitation in measuring PI deserves specific attention. Our meta-analysis found PI measurements
               had a relatively higher pooled error of 4.1° compared to other pelvic parameters such as PT (1.9°). This
               larger error can be attributed to several specific challenges: First, the presence of double-dome endplates can