Table 4. Summary of studies leveraging large datasets for AI use in HPB surgery

| Author | Year of publication | Study description |
| --- | --- | --- |
| Roch et al. [16] | 2015 | 566,233 CT reports from 50,669 patients analysed for keywords associated with Pancreatic Cysts using NLP |
| Yang et al. [19] | 2017 | The Cancer Genome Atlas catalogs genes associated with 33 cancers. Genes associated with HCC were extracted from the database and checked for overlap with genes identified in 35 years of published literature using NLP |
| Merath et al. [49] | 2020 | 15,657 patients undergoing liver, pancreatic or colorectal surgery (685 liver and 6,012 pancreatic) were retrospectively identified from the American College of Surgeons National Surgical Quality Improvement Program database. Risk-prediction Machine Learning model created from pre-op characteristics |
| Szpakowski et al. [55] | 2020 | 365 Gallbladder Cancer and 35,970 Gallbladder Polyp patients were identified from 622,227 patients in a Californian health system. NLP was used to identify Polyps from Ultrasound reports |
| Xie et al. [58] | 2021 | 58,085 imaging reports from 6,346 Chronic Pancreatitis patients were used to develop an NLP algorithm that could characterize features of Chronic Pancreatitis |
| Yamashita et al. [30] | 2021 | 430,426 imaging reports from 199,783 patients were used to create an NLP algorithm to identify the presence and size of Pancreatic Cysts |
| Imler et al. [98] | 2021 | 23,674 ERCP reports were analyzed for quality measures using NLP |
| Noh et al. [61] | 2022 | Machine learning-based prediction models for survival applied to 10,742 HCC patients |
| Morris-Stiff et al. [62] | 2022 | Ultrasound reports identified 49,414 patients with gallstones. NLP algorithm trained to identify asymptomatic patients (22,257) |
| Narayan et al. [63] | 2022 | 25,494 images from 90 liver biopsies were used to develop Machine Learning Computer Vision models to score liver steatosis |
| Kooragayala et al. [35] | 2022 | NLP was used to identify pancreatic lesions from 18,769 adult trauma CT reports |

CT: Computed tomography; ERCP: endoscopic retrograde cholangiopancreatography; HCC: hepatocellular carcinoma; NLP: natural language processing.
exacerbating pre-existing healthcare disparities. This is a widely discussed and controversial topic in the
broader AI field. Inherent systematic biases in datasets clearly exist, with some of the most obvious
reflecting racial, socioeconomic and gender-based prejudices. Addressing these complex issues is crucial
across all AI work, including in HPB surgery. The majority of HPB disease occurs in LMICs[111], so it is essential that these populations are better represented in HPB research more broadly and in AI research specifically.
In addition to geographical disparities, concerns around the transparency of AI algorithms and lack of explainability are likely to hamper uptake and trust in clinical practice[112]. The need for explainability is rooted in evidence-based medicine, which relies on transparency and reproducibility in decision-making[113]. Without explainable AI, patient trust in healthcare will erode. Others have argued that true explainability represents a false hope, and that explainability methods cannot deliver meaningful patient-level interpretability[114]. The focus should be on robust internal and external validation. In this review, we found little reference to concepts of explainability in included studies. It is important that these issues are explored and addressed, particularly when developing algorithms orientated toward patient-facing prognostication. As AI systems transition from research to clinical practice, transparency and reliability are paramount if trust is to be built and maintained[115,116].
The ability to understand and reproduce scientific findings is imperative, yet the quality of reporting in the included studies was variable. A number of useful reporting guidelines specifically orientated toward AI now exist. In 2019, a rigorous process of literature review, expert consultation, Delphi survey, and consensus meeting resulted in the SPIRIT-AI (Standard Protocol Items: Recommendations for Interventional Trials - Artificial Intelligence) and CONSORT-AI (Consolidated Standards of Reporting Trials - Artificial Intelligence) standards[117]. In addition, two further tools are currently under