Boutros et al. Art Int Surg 2022;2:213-23 https://dx.doi.org/10.20517/ais.2022.32 Page 219
applications. These obstacles exist across stakeholders, with clinicians, computer scientists, patients,
hospitals, and even industry encountering issues that have prevented further progress in the field [Figure 3].
Lack of data
In the realm of computer vision research, data are a vital and often scarce currency. For algorithms to
function, they must be fed a considerable quantity of training data before they are able to generate
conclusions “independently”. However, the success or failure of a given project depends not only on an abundance of data but on an abundance of the right kind of data. While simple in premise, the capture,
processing, and storage of data can represent a herculean task, especially in the realm of computer vision.
Firstly, institutional limitations on the sharing and acquisition of medical media have created necessary but ample logistical hurdles in the generation of new datasets[1]. Likewise, inter-institutional variability in image
quality, file format, and capture process has effectively quashed the consolidation of multiple datasets into a
single, more robust unit. Finally, the storage and processing of these data require significant resources, not so much by way of computing power as of manpower. Datasets must be carefully tracked, evaluated, and managed
depending on their use for a given project. These problems are compounded in the field of surgery, where
raw data often represent video footage of procedures. Unlike still images, surgical videos generate complex
spatiotemporal inputs that further exacerbate these issues.
Lack of ground truth
Once visual data have been appropriately captured, processed, and stored, they can begin to be used in the
development of algorithms. However, at this stage, datasets lack the context needed to generate conclusions
independently. That is, datasets are missing labels of objects, scenes, and events. The process of labeling, or
annotation, is carried out by humans and serves to train an algorithm on how to interpret the raw visual
data with which it is presented (e.g., supervised learning). Crucial to this premise is the establishment of a
fundamental basis for data interpretation, or ground truth.
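To make this premise concrete, frame-level annotations might pair raw surgical video frames with labels such as the instrument in view or the operative phase. The sketch below is illustrative only; the field names and label values are hypothetical assumptions, not drawn from the article.

```python
# Hypothetical frame-level ground-truth annotations for a surgical video.
# All field names and label values here are illustrative assumptions.
annotations = [
    {"frame": 101, "tool": "grasper", "phase": "dissection"},
    {"frame": 102, "tool": "clipper", "phase": "clipping"},
    {"frame": 103, "tool": "grasper", "phase": "dissection"},
]

# In supervised learning, each raw frame (the input) is paired with its
# human-assigned label (the target); the set of distinct labels defines
# the classes an algorithm learns to distinguish.
tool_classes = sorted({a["tool"] for a in annotations})
print(tool_classes)  # ['clipper', 'grasper']
```

In practice, such labels would be stored alongside the video files and versioned with the dataset, since the algorithm's conclusions can only ever be as reliable as this ground truth.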
In theory, annotators establish ground truth by defining parameters such as tools, anatomical structures, and clinically significant events[22]. In surgical applications, this process is limited by the degree of expertise
required to make such annotations. Besides surgeons or surgical trainees, most annotators would require
training on how to interpret the visual data presented from surgical procedures. For applications lacking a
standardized operative procedure, like many in the HPB space, this process becomes even more difficult.
Similarly, the lack of a standard operative procedure creates a wider margin for the interpretation of clinical
events, which translates to greater annotator-annotator variability, even amongst experts[23].
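The annotator-annotator variability described above can be quantified with standard agreement statistics such as Cohen's kappa. The sketch below computes kappa for two hypothetical experts labeling the same ten frames; the phase labels and counts are illustrative assumptions, not data from the article.

```python
# Illustrative sketch: quantifying annotator-annotator variability with
# Cohen's kappa, which corrects raw agreement for chance agreement.

def cohens_kappa(a, b):
    """Cohen's kappa for two annotators' labels on the same items."""
    assert len(a) == len(b)
    n = len(a)
    labels = set(a) | set(b)
    # Observed proportion of items on which the annotators agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected)

# Two hypothetical experts labeling the same ten video frames.
expert1 = ["dissection"] * 5 + ["clipping"] * 5
expert2 = ["dissection"] * 4 + ["clipping"] * 6
print(round(cohens_kappa(expert1, expert2), 2))  # → 0.8
```

A kappa near 1 indicates near-perfect agreement, while values well below 1 signal exactly the kind of expert disagreement that non-standardized procedures can produce.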
Explainability
The nature of so-called black box algorithms is a central concern of many ethicists when evaluating the
moral permissibility of artificial intelligence. Deep learning systems are trained to recognize patterns from
many thousands of data points but also to continuously adapt to novel data and modulate analyses accordingly[24]. The resultant algorithms are highly sophisticated and at times highly accurate; however, this
complexity comes at the expense of comprehension. While it is entirely possible to describe the annotation criteria or mathematical underpinnings of these algorithms, the self-learning aspect of black box algorithms produces outputs whose underlying reasoning is impossible for humans to trace.
The apparent lack of transparency in black box algorithms could serve to undermine trust in physicians and complicate the process of shared decision making[25]. Moreover, the principle of informed consent would be nearly impossible to uphold. After all, if the process behind a medical decision lacks a basis for explanation, educating patients on the risks and benefits of one outcome versus another would be moot, if not impossible. In the field of surgery, this problem is further compounded by the acuity of the domain.