
Su et al. Intell Robot 2022;2(3):244-74  |  http://dx.doi.org/10.20517/ir.2022.17    Page 262

               can be achieved using knowledge of a causal graph or by a controlled experiment making use of interventions.
               As such, causality-based fair machine learning algorithms have attracted growing attention, and
               several causality-based fairness approaches have been proposed. Although causal fairness models can
               help overcome many of the difficulties encountered in fair prediction tasks, they still face several
               challenges, which are discussed in the following subsections.


               7.1. Causal discovery
               Causality-based fairness approaches require a causal graph as additional prior knowledge. The causal
               graph describes the mechanism by which the data are generated; that is, it reveals the causal
               relationships between variables. In practice, however, the correct causal graph is difficult to obtain.
               A basic way to discover the causal relationships between variables is to conduct randomized controlled
               trials, which consist of randomly assigning subjects (e.g., individuals) to treatments (e.g., gender)
               and then comparing the outcomes of all treatment groups. Unfortunately, in many cases such experiments
               cannot be undertaken because of prohibitive costs, ethical concerns, or physical impossibility. For
               example, to understand the impact of smoking, it would be necessary to force different individuals to
               smoke or not smoke. As another example, to understand whether hiring decision models are gender-biased,
               it would be necessary to change the gender of a job applicant, which is impractical. Researchers are
               therefore often left with non-experimental, observational data, and they have developed numerous methods
               for uncovering causal relations from such data, i.e., causal discovery. Causal discovery algorithms can
               be roughly classified into the following three categories: constraint-based, score-based, and those
               exploiting structural asymmetries.


               Constraint-based approaches conduct numerous conditional independence tests to learn the structure
               of the underlying causal graph that reflects these conditional independencies. They have the advantage
               of being generally applicable, but the disadvantages are that faithfulness is a strong assumption and
               that very large sample sizes may be required for the conditional independence tests to be reliable.
               Furthermore, the solution of this approach to causal discovery is usually not unique; in particular, it
               does not help determine the causal direction in the two-variable case, where no conditional independence
               relationship is available.
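
As a minimal sketch of the kind of test such approaches rely on, the following Python code applies a partial-correlation test with the Fisher z-transform to simulated data. The chain X -> Z -> Y, its coefficients, the sample size, and the helper names (`corr`, `partial_corr`, `fisher_z_pvalue`) are illustrative assumptions, not taken from any of the cited methods, and the test presumes linear-Gaussian relationships.

```python
import math
import random


def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


def partial_corr(x, y, z):
    """Correlation of x and y after linearly adjusting both for one variable z."""
    rxy, rxz, ryz = corr(x, y), corr(x, z), corr(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def fisher_z_pvalue(r, n, k):
    """Two-sided p-value for H0: correlation is zero, with k conditioning variables."""
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - k - 3)
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical linear-Gaussian chain X -> Z -> Y (simulated, illustrative data only).
rng = random.Random(0)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
z = [0.8 * xi + rng.gauss(0, 1) for xi in x]
y = [0.8 * zi + rng.gauss(0, 1) for zi in z]

r_marginal = corr(x, y)            # X and Y are dependent marginally
r_given_z = partial_corr(x, y, z)  # but (approximately) independent given Z
p_marginal = fisher_z_pvalue(r_marginal, n, 0)
p_given_z = fisher_z_pvalue(r_given_z, n, 1)
print(f"X vs Y:     r = {r_marginal:.3f}, p = {p_marginal:.3g}")
print(f"X vs Y | Z: r = {r_given_z:.3f}, p = {p_given_z:.3g}")
```

A constraint-based search would use such a result to remove the edge between X and Y. Note that the chain X -> Z -> Y, the reversed chain X <- Z <- Y, and the fork X <- Z -> Y all imply the same conditional independence, which illustrates why the solution is usually not unique.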

               Score-based algorithms use the fact that each directed acyclic graph (DAG) can be scored in relation to the
               data, typically using a penalized likelihood score function. The algorithms then search for the DAG that yields
               the optimal score. Typical scoring functions include the Bayesian information criterion [96], Bayesian–Gaussian
               equivalent score [96], and minimum description length (as an approximation of Kolmogorov complexity) [97,98].
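
To make the scoring idea concrete, the sketch below computes a penalized likelihood score (here the Bayesian information criterion, with lower values being better) for three candidate graphs over two variables. It assumes linear-Gaussian mechanisms and at most one parent per node; the data, the function names, and the exhaustive comparison of three candidates are illustrative assumptions rather than a full search procedure.

```python
import math
import random


def residuals(child, parent=None):
    """Residuals of child after least-squares regression on an optional single parent."""
    n = len(child)
    mc = sum(child) / n
    if parent is None:
        return [c - mc for c in child]
    mp = sum(parent) / n
    slope = (sum((p - mp) * (c - mc) for p, c in zip(parent, child))
             / sum((p - mp) ** 2 for p in parent))
    return [(c - mc) - slope * (p - mp) for p, c in zip(parent, child)]


def gaussian_loglik(res):
    """Maximized Gaussian log-likelihood of zero-mean residuals."""
    n = len(res)
    var = sum(e * e for e in res) / n  # maximum-likelihood variance
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def bic(data, parents):
    """BIC of a DAG given as {node: parent name or None}; lower is better."""
    n = len(next(iter(data.values())))
    loglik, n_params = 0.0, 0
    for node, parent in parents.items():
        res = residuals(data[node], data[parent] if parent else None)
        loglik += gaussian_loglik(res)
        n_params += 2 + (1 if parent else 0)  # intercept, variance, optional slope
    return -2 * loglik + n_params * math.log(n)


# Simulated data from a hypothetical mechanism X -> Y (illustrative only).
rng = random.Random(0)
n = 500
xs = [rng.gauss(0, 1) for _ in range(n)]
ys = [1.5 * xi + rng.gauss(0, 1) for xi in xs]
data = {"X": xs, "Y": ys}

candidates = {
    "empty": {"X": None, "Y": None},
    "X -> Y": {"X": None, "Y": "X"},
    "Y -> X": {"X": "Y", "Y": None},
}
scores = {name: bic(data, dag) for name, dag in candidates.items()}
for name, score in scores.items():
    print(f"{name:7s} BIC = {score:.2f}")
```

Under these assumptions both graphs with an edge score far better than the empty graph, but X -> Y and Y -> X receive the same score up to rounding, because they are equally good fits of a bivariate Gaussian. The score alone therefore identifies the presence of the edge but not its direction.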

               Structural asymmetry-based algorithms address the setting in which the causal direction cannot be
               inferred from observations alone because the data distribution admits structural causal models for either
               of the structural directions X → Y or X ← Y. To address this problem, these algorithms make additional
               assumptions about the functional form of the underlying true data-generating structure, so that they
               can exploit asymmetries to identify the direction of a structural relationship. These asymmetries
               manifest in various ways, such as non-independent errors and measures of complexity. Existing methods
               that exploit such asymmetries are typically local solutions, as they can only test one edge at a time
               (pairwise/bivariate causal directionality) or a triple (with the third variable being an unobserved
               confounder) [99].
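
The non-independent-errors asymmetry can be sketched for a single pair of variables as follows. The example assumes a linear mechanism with non-Gaussian (uniform) noise and uses the correlation between squared quantities as a crude stand-in for a proper independence test; the simulated data, the function names, and this dependence score are illustrative assumptions, not the procedure of any cited method.

```python
import math
import random


def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


def ols_residuals(cause, effect):
    """Residuals of the least-squares regression of effect on cause (with intercept)."""
    n = len(cause)
    mc, me = sum(cause) / n, sum(effect) / n
    slope = (sum((c - mc) * (e - me) for c, e in zip(cause, effect))
             / sum((c - mc) ** 2 for c in cause))
    return [(e - me) - slope * (c - mc) for c, e in zip(cause, effect)]


def dependence(regressor, resid):
    """Crude dependence score: |corr| of squared centered regressor and squared residuals."""
    m = sum(regressor) / len(regressor)
    return abs(corr([(v - m) ** 2 for v in regressor], [r * r for r in resid]))


# Simulated data from a hypothetical mechanism X -> Y with uniform (non-Gaussian) noise.
rng = random.Random(0)
n = 2000
x = [rng.uniform(-1, 1) for _ in range(n)]
y = [xi + rng.uniform(-1, 1) for xi in x]

score_xy = dependence(x, ols_residuals(x, y))  # hypothesis X -> Y
score_yx = dependence(y, ols_residuals(y, x))  # hypothesis X <- Y
inferred = "X -> Y" if score_xy < score_yx else "X <- Y"
print(f"dependence under X -> Y: {score_xy:.3f}")
print(f"dependence under X <- Y: {score_yx:.3f}")
print("inferred direction:", inferred)
```

In the true direction the residuals are independent of the regressor, whereas in the reverse direction they are uncorrelated with it but still dependent. If the noise were Gaussian, both scores would be close to zero and the asymmetry would vanish, which is why the additional assumptions are essential.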

               In the absence of intervention and manipulation, observational data leave researchers facing a number of
               challenges. First, there may exist hidden confounders, a situation sometimes termed the third variable
               problem. Second, observational data may exhibit selection bias. For example, younger patients may
               generally prefer surgery, while older patients may prefer medication. Third, most causal discovery
               algorithms are based on strong but often untestable assumptions, and applying these assumptions to
               structural or graphical models has drawn harsh criticism.
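
The first of these challenges can be illustrated with a small simulation. In the sketch below, a hidden variable C drives both X and Y, and X has no causal effect on Y; the structure, coefficients, and sample size are illustrative assumptions. The observational data show a clear association between X and Y, which disappears once X is assigned at random, as it would be in a controlled experiment.

```python
import math
import random


def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


rng = random.Random(0)
n = 2000

# Observational regime: hidden confounder C causes both X and Y; X does not cause Y.
c_obs = [rng.gauss(0, 1) for _ in range(n)]
x_obs = [ci + rng.gauss(0, 1) for ci in c_obs]
y_obs = [ci + rng.gauss(0, 1) for ci in c_obs]

# Interventional regime: X is assigned at random, independently of C;
# the mechanism generating Y is unchanged and still depends only on C.
c_int = [rng.gauss(0, 1) for _ in range(n)]
x_int = [rng.gauss(0, 1) for _ in range(n)]
y_int = [ci + rng.gauss(0, 1) for ci in c_int]

r_observational = corr(x_obs, y_obs)
r_interventional = corr(x_int, y_int)
print(f"observational   corr(X, Y) = {r_observational:.3f}")
print(f"interventional  corr(X, Y) = {r_interventional:.3f}")
```

A discovery algorithm that sees only `x_obs` and `y_obs`, and not C, has no direct way to distinguish this situation from a genuine causal edge between X and Y.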