Page 4 Shapey et al. Art Int Surg 2023;3:1-13 https://dx.doi.org/10.20517/ais.2022.31
Post-hoc prediction
Reviewing specified cases that experience mortality or significant morbidity is a long-standing feature of
most contemporary surgical departments. However, the systematic collection of data according to pre-
defined criteria and data variables is a relatively new concept that is gaining popularity. The National
Surgical Quality Improvement Programme (NSQIP), championed by the American College of Surgeons,
provides a structured framework from which to capture and analyse relevant data. NSQIP uses a
standardised Participant Use File to collect data at the individual patient level, which can then be analysed
according to the procedure [20]. Failure to rescue is an important binary outcome variable that is collected and reported
by NSQIP and reflects the inability to identify and ameliorate postoperative complications. Meanwhile, in
the UK, O’Reilly et al. showed that the process of instituting a prospective quality improvement programme
was a significant driver behind a reduction in postoperative complications [21]. In this instance, granular data
using standardised definitions of postoperative complications as agreed by the International Study Groups
of Liver Surgery and Pancreatic Surgery [22-27] were prospectively collected and validated in a weekly meeting
of senior HPB surgeons. Moreover, adoption of the Comprehensive Complication Index (CCI) as a
continuous outcome variable representing the full range of postoperative complications
provides a standardised tool for reliable comparison amongst cohorts [28]. The success of the Dutch
Pancreatic and Hepatobiliary National Audits in providing a data platform for practice-changing research
illustrates the potential for machine learning methods to tap into rich data repositories
that could help improve outcomes [29-30].
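To illustrate how the CCI condenses several graded complications into one continuous value, the following is a minimal sketch rather than a validated calculator. It assumes the commonly cited formulation in which each Clavien-Dindo grade carries a fixed weight and the index is half the square root of the summed weights. The weight table, the cap at 100 and the function name are assumptions for illustration and should be checked against the original CCI publication [28] before any real use.

```python
import math

# Assumed weights per Clavien-Dindo grade (illustrative; verify against
# the original CCI publication before relying on them).
CCI_WEIGHTS = {
    "I": 300,
    "II": 1750,
    "IIIa": 2750,
    "IIIb": 4550,
    "IVa": 7200,
    "IVb": 8550,
}


def comprehensive_complication_index(grades):
    """Return a CCI-style score (0 to 100) for one patient.

    `grades` is a list of Clavien-Dindo grades, one entry per
    postoperative complication, e.g. ["I", "IIIa"].
    """
    if not grades:
        # No complications recorded.
        return 0.0
    if "V" in grades:
        # Death is assigned the maximum score.
        return 100.0
    total_weight = sum(CCI_WEIGHTS[grade] for grade in grades)
    # Half the square root of the summed weights, capped at 100 (assumed cap).
    return min(100.0, math.sqrt(total_weight) / 2)


if __name__ == "__main__":
    # A patient with one grade I and one grade II complication.
    print(round(comprehensive_complication_index(["I", "II"]), 1))
```

Because every complication contributes to the sum, two patients whose most severe complication is the same grade can still receive different scores, which is what makes the index usable as a continuous outcome when comparing cohorts.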
Existing quality improvement and audit programmes highlight some important lessons that require due
consideration prior to instituting ML as an integral part of the analysis of postoperative complications. First,
variables and outcomes should only be reported according to clearly agreed definitions, while prospective
validation of recorded data is essential in order to ensure the accuracy and integrity of ML analyses. Second,
a mixture of data forms that includes qualitative and quantitative outcomes (both binary and continuous) is
necessary in order to capture the true impact of surgical care on patient experience. Third, measures of
optimal outcomes (e.g. return to normal physiological function, and length of stay adjusted for the
complexity of surgery) should be included alongside complication outcomes. Effective quality improvement
mandates both the reduction of errors, deriving from the analysis of complications, and an increase in
insight, deriving from the analysis of best practices. It can be challenging to gain consensus on best practice
outcomes because patients, populations and health systems are very heterogeneous groups. Nonetheless, it is
vitally important because the minimisation of complications is associated with improvements from multiple
marginal gains, whereas increasing insight can contribute to step-wise positive changes, although these occur on a
much less frequent basis. In the absence of detailed attention to the validity of data inputs and outcomes, the
contribution of ML to quality improvement is likely to be, at best, irrelevant and, at worst, damaging to
patient well-being.
Bile duct injuries occurring during minimally invasive cholecystectomy remain a problem. The
advent of minimally invasive surgery, including robotic systems with three-dimensional visualisation, has
made high-quality recording of surgical procedures possible. Artificial intelligence-assisted
post-hoc review of 290 laparoscopic cholecystectomies demonstrated the ability to accurately (0.95 ± 0.06)
and specifically (0.98 ± 0.05) identify “No-Go” zones that were representative of hazardous anatomical
regions associated with a higher probability of bile duct injury. However, the technology suffered from a
much lower rate of sensitivity (0.80 ± 0.21). In this instance, the discrepancy between sensitivity and
specificity is quite important, because the former has the capacity to identify a potential injury before it
occurs and thereby prevent it, whereas the value of the latter lies more in confirming whether an injury may