
McGivern et al. Art Int Surg 2023;3:27-47  https://dx.doi.org/10.20517/ais.2022.39   Page 41

Table 4. Summary of studies leveraging large datasets for AI use in HPB surgery

| Author | Year of publication | Study description |
|---|---|---|
| Roch et al.[16] | 2015 | 566,233 CT reports from 50,669 patients analysed for keywords associated with pancreatic cysts using NLP |
| Yang et al.[19] | 2017 | The Cancer Genome Atlas catalogues genes associated with 33 cancers. Genes associated with HCC were extracted from the database and checked for overlap with genes identified in 35 years of published literature using NLP |
| Merath et al.[49] | 2020 | 15,657 patients undergoing liver, pancreatic or colorectal surgery (685 liver and 6,012 pancreatic) were retrospectively identified from the American College of Surgeons National Surgical Quality Improvement Program database. A risk-prediction machine learning model was created from pre-operative characteristics |
| Szpakowski et al.[55] | 2020 | 365 gallbladder cancer and 35,970 gallbladder polyp patients were identified from 622,227 patients in a Californian health system. NLP was used to identify polyps from ultrasound reports |
| Xie et al.[58] | 2021 | 58,085 imaging reports from 6,346 chronic pancreatitis patients were used to develop an NLP algorithm that could characterise features of chronic pancreatitis |
| Yamashita et al.[30] | 2021 | 430,426 imaging reports from 199,783 patients were used to create an NLP algorithm to identify the presence and size of pancreatic cysts |
| Imler et al.[98] | 2021 | 23,674 ERCP reports were analysed for quality measures using NLP |
| Noh et al.[61] | 2022 | Machine learning-based prediction models for survival applied to 10,742 HCC patients |
| Morris-Stiff et al.[62] | 2022 | Ultrasound reports identified 49,414 patients with gallstones. An NLP algorithm was trained to identify asymptomatic patients (22,257) |
| Narayan et al.[63] | 2022 | 25,494 images from 90 liver biopsies were used to develop machine learning computer vision models to score liver steatosis |
| Kooragayala et al.[35] | 2022 | NLP was used to identify pancreatic lesions from 18,769 adult trauma CT reports |
               CT: Computed tomography; ERCP: endoscopic retrograde cholangiopancreatography; HCC: hepatocellular carcinoma; NLP: natural language
               processing.
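The keyword-based report screening that several of the studies above describe (e.g., flagging pancreatic cyst mentions in CT reports) can be illustrated with a minimal sketch. The keyword list, negation cues, and example reports below are assumptions for illustration, not the vocabularies or algorithms used in the published studies.

```python
import re

# Illustrative keyword and negation patterns -- assumptions for this sketch,
# not the published studies' lexicons.
CYST_KEYWORDS = re.compile(r"\b(pancreatic\s+cyst|IPMN|mucinous\s+cystic)\b", re.IGNORECASE)
NEGATION_CUES = re.compile(r"\b(no|without|negative\s+for)\b", re.IGNORECASE)

def flag_report(report: str) -> bool:
    """Flag a report if any sentence mentions a cyst keyword and no simple
    negation cue precedes the mention within that sentence."""
    for sentence in report.split("."):
        match = CYST_KEYWORDS.search(sentence)
        if match and not NEGATION_CUES.search(sentence[:match.start()]):
            return True
    return False

reports = [
    "CT abdomen: 14 mm pancreatic cyst in the uncinate process.",
    "No pancreatic cyst identified. Liver unremarkable.",
]
print([flag_report(r) for r in reports])  # -> [True, False]
```

Real pipelines add far richer negation handling (e.g., NegEx-style scoping) and section-aware parsing, but the pattern-match-then-filter structure is the same.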


               exacerbating pre-existing healthcare disparities. This is a widely discussed and controversial topic in the
               broader AI field. Inherent systematic biases in datasets clearly exist, with some of the most obvious
               reflecting racial, socioeconomic and gender-based prejudices. Addressing these complex issues is crucial
across all AI work, including in HPB surgery. The majority of HPB disease occurs in low- and middle-income countries (LMICs)[111], so it is essential that these populations are better represented both in HPB research broadly and in AI research specifically.


In addition to geographical disparities, concerns around the transparency of AI algorithms and their lack of explainability are likely to hamper uptake and trust in clinical practice[112]. The need for explainability is rooted in evidence-based medicine, which relies on transparency and reproducibility in decision-making[113]. Without explainable AI, patient trust in healthcare may erode. Others have argued that true explainability represents a false hope, that current explainability methods cannot deliver meaningful patient-level interpretability[114], and that the focus should instead be on robust internal and external validation. In this review, we found little reference to concepts of explainability in the included studies. It is important that these issues are explored and addressed, particularly when developing algorithms orientated toward patient-facing prognostication. As AI systems transition from research to clinical practice, transparency and reliability are paramount if trust is to be built and maintained[115,116].
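One widely used model-agnostic route to the transparency discussed above is permutation feature importance: shuffle one feature at a time and measure how much predictive accuracy drops. The toy pre-operative dataset and stand-in model below are assumptions for illustration only, not any model from the included studies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy synthetic "pre-operative" data: two informative columns plus pure noise
# (an illustrative assumption, not clinical data).
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 2 * X[:, 1] + 0.1 * rng.normal(size=500) > 0).astype(int)

def model_predict(X):
    # Stand-in "fitted" model: the known generating rule, used so the sketch
    # stays dependency-free; in practice this would be a trained classifier.
    return (X[:, 0] + 2 * X[:, 1] > 0).astype(int)

def permutation_importance(predict, X, y, n_repeats=20):
    """Mean drop in accuracy when each feature is shuffled; larger = more important."""
    baseline = (predict(X) == y).mean()
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            drops.append(baseline - (predict(Xp) == y).mean())
        importances[j] = np.mean(drops)
    return importances

imp = permutation_importance(model_predict, X, y)
print(imp.round(3))  # the noise column's importance is near zero
```

Such feature-level summaries explain a model's global behaviour, but, as the critique cited above notes, they stop well short of patient-level interpretability.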


The ability to understand and reproduce scientific findings is imperative, yet the reporting quality of the included studies was variable. A number of useful reporting guidelines now exist, specifically orientated toward AI. In 2019, a rigorous process of literature review, expert consultation, Delphi survey, and consensus meeting resulted in the SPIRIT-AI (Standard Protocol Items: Recommendations for Interventional Trials - Artificial Intelligence) and CONSORT-AI (Consolidated Standards of Reporting Trials - Artificial Intelligence) standards[117]. In addition, two further tools are currently under