Ambati et al. Art Int Surg. 2025;5:53-64 https://dx.doi.org/10.20517/ais.2024.45 Page 59
intraoperative blood loss, smoking, and preoperative medical comorbidities, but the exact predictors vary by
study [49-53]. Some models predict relatively common events such as postoperative delirium, hospital
readmissions, and length of stay, whereas others aim to predict rarer and potentially more catastrophic
events such as vascular injury during anterior lumbar surgery [51]. Across tools that aim to quantify adverse
events, those that are trained on large databases, undergo external validation and testing, and
are released as open-source or commercial software are the most likely to gain traction.
In addition to perioperative complications, ML is also well-suited to predict long-term outcomes. In cervical
spondylotic myelopathy, previous studies have accurately predicted outcomes years after surgery from
preoperative variables, with simple methods such as logistic regression performing well compared to more
advanced methods [54,55]. Examining feature importance in ML algorithms can clarify the drivers of long-term
outcomes. For example, in patients who underwent lumbar fusion, higher preoperative leg pain
and back pain were predictive of improvements in leg and back pain, respectively [56]. In a
separate study of both cervical and thoracolumbar fusion, preoperative axial pain and peripheral pain,
nationality, the number of previous spine surgeries, age, type of intervention, preoperative quality-of-life,
BMI, number of affected levels, and comorbidity were major predictors of outcome [57]. Similarly, using
preoperative MRIs, one study used neural network-based models to predict postoperative proximal
junctional kyphosis (PJK). Analysis of the model found that soft tissue features were the strongest drivers of
the accuracy of PJK prediction [58]. A natural question is: “How valuable are these models?” Indeed,
they primarily identify obvious risk factors as drivers of short-term complications (age, sex, comorbidities)
and of long-term outcomes (how much patients stand to gain given their preoperative level of
disability). We argue that the key value of these models is the ability to counsel patients about risks and
outcomes in a patient-specific manner, improving informed consent and shared decision making [Table 3].
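To make the idea of reading outcome drivers from a simple model concrete, the following is a minimal sketch of logistic regression fitted to a synthetic cohort, with standardized coefficients used as a rough feature importance ranking. Everything in it is an assumption made for illustration: the feature names (`preop_leg_pain`, `unrelated_noise`), the simulated effect size, and the gradient descent settings are hypothetical and are not taken from any of the cited studies.

```python
import math
import random

# Hypothetical feature names, chosen only for illustration.
FEATURES = ["preop_leg_pain", "unrelated_noise"]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_cohort(n, rng):
    """Simulate n patients; only the first feature drives the outcome."""
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0, 1), rng.gauss(0, 1)]  # standardized features
        p = sigmoid(2.0 * row[0])                 # assumed true effect
        X.append(row)
        y.append(1 if rng.random() < p else 0)
    return X, y


def fit_logistic(X, y, lr=0.5, steps=1000):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, target in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - target
            for j in range(d):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


rng = random.Random(0)  # fixed seed so the sketch is reproducible
X, y = make_synthetic_cohort(400, rng)
weights, bias = fit_logistic(X, y)

# With standardized inputs, |coefficient| gives a rough importance ranking.
for name, w in sorted(zip(FEATURES, weights), key=lambda t: -abs(t[1])):
    print(f"{name}: coefficient {w:+.2f}, odds ratio per SD {math.exp(w):.2f}")
```

In this toy setting the informative feature should receive a much larger coefficient than the noise feature. On real clinical data, correlated predictors and unmeasured confounders make such rankings far less clean, so the coefficients are better read as prompts for counseling than as causal effects.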
CHALLENGES AND OPPORTUNITIES
AI and ML tools across the spectrum of spine surgical care hold significant promise to improve patient
outcomes; however, the tools at each point of care face their own challenges. AI/ML tools for
preoperative planning may require prospective studies showing improved outcomes before they gain traction
with physicians and reimbursement from insurance companies. Intraoperative tools and robotics require
significant hardware investment, may face regulatory challenges on the way to clinical integration, and may
encounter resistance from surgeons who fear inefficiencies and potential patient harm associated with early
adoption of new technologies [59,60]. Models that predict postoperative complications and long-term outcomes
face difficulty in standardizing outcome metrics and in generalizing across centers [61]. However, several
critical challenges are common to all AI/ML tools in spine surgery; we detail them below, along with our
proposed solutions.
Challenge 1: patient and surgical heterogeneity
Our varied clinical and research efforts in spine surgery reflect the immense heterogeneity in the patients we
treat. Patients may undergo the same operation for a wide variety of indications, at a wide variety of initial
states of health, and similarly, outcomes are driven by a wide variety of physiologic and psychosocial factors.
In addition, the same patient with the same pathology may be offered differing surgical plans based on their
surgeon’s training and preference. A central challenge in ML is the tradeoff between variables and
observations (i.e., patients) [15]. In spine surgery, where patient variability is high, this limitation means that
for models to reach the expert level, they must incorporate both many variables and data from a large
number of patients. However, as model complexity increases, the ability to understand such models
decreases. To mitigate this tradeoff, it may be most expedient to focus AI development efforts on
applications that are specifically tailored to quickly and accurately perform highly specific, otherwise time-