Su et al. Intell Robot 2022;2(3):24474 I Page 262
can be achieved using knowledge of a causal graph or by a controlled experiment making use of interventions.
As such, causality-based fairness-aware machine learning algorithms have attracted increasing attention, and
several causality-based fairness approaches have been proposed. Although causal fairness models can indeed
help overcome many of the difficulties encountered in fair prediction tasks, they still face several
challenges, which are discussed in the following subsections.
7.1. Causal discovery
Causality-based fairness approaches require a causal graph as additional prior knowledge, where the
causal graph describes the mechanism by which the data are generated; that is, it reveals the causal relationships
between variables. In practice, however, the correct causal graph is difficult to obtain. A basic way
to discover the causal relationship between variables is to conduct randomized controlled trials. Randomized
controlled trials consist of randomly assigning subjects (e.g., individuals) to treatments (e.g., gender) and
then comparing the outcomes of all treatment groups. Unfortunately, in many cases, it may not be possible to
carry out such trials. For example, to understand the impact of smoking, it would be necessary to force different individuals to
smoke or not smoke. As another example, to understand whether hiring decision models are gender-biased,
it would be necessary to change the gender of a job applicant, which is impractical. Researchers are
therefore often left with non-experimental, observational data, and they have developed numerous methods
for uncovering causal relations, i.e., causal discovery. Causal discovery algorithms can be roughly classified into
the following three categories: constraint-based, score-based, and those exploiting structural asymmetries.
Constraint-based approaches conduct numerous conditional independence tests to learn the structure
of the underlying causal graph that reflects these conditional independencies. Constraint-based approaches
have the advantage that they are generally applicable, but the disadvantages are that faithfulness is a strong
assumption and that very large sample sizes may be required for reliable conditional independence tests. Furthermore,
the solution of this approach to causal discovery is usually not unique, and, in particular, it does not
help determine the causal direction in the two-variable case, where no conditional independence relationship
is available.
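As an illustration of the kind of test on which constraint-based methods rely, the following minimal sketch checks (conditional) independence with a partial-correlation Fisher z-test. The linear-Gaussian chain X → Z → Y, the helper names `partial_corr` and `fisher_z_independent`, and the critical value 2.576 (two-sided level of about 0.01) are assumptions made for this example only; they are not taken from any particular algorithm in the literature.

```python
import numpy as np

def partial_corr(data, i, j, cond=()):
    """Partial correlation of columns i and j given the columns in cond."""
    idx = [i, j, *cond]
    # Invert the correlation matrix of the selected columns (precision matrix).
    prec = np.linalg.inv(np.corrcoef(data[:, idx], rowvar=False))
    return -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])

def fisher_z_independent(data, i, j, cond=(), z_crit=2.576):
    """Return True if zero partial correlation is NOT rejected (Fisher z-test)."""
    n = data.shape[0]
    r = partial_corr(data, i, j, cond)
    z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(n - len(cond) - 3)
    return abs(z) < z_crit

# Hypothetical data from the chain X -> Z -> Y (linear, Gaussian noise).
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
z = 0.8 * x + rng.normal(size=n)
y = 0.8 * z + rng.normal(size=n)
data = np.column_stack([x, y, z])  # columns: 0 = X, 1 = Y, 2 = Z

marginal = fisher_z_independent(data, 0, 1)            # X vs. Y
given_z = fisher_z_independent(data, 0, 1, cond=(2,))  # X vs. Y given Z
print("X, Y judged independent:", marginal)
print("X, Y judged independent given Z:", given_z)
```

The same pattern of independencies (X and Y dependent, but independent given Z) is also implied by X ← Z ← Y and X ← Z → Y, which illustrates why the output of such tests is generally a set of equivalent graphs rather than a single graph.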
Score-based algorithms use the fact that each directed acyclic graph (DAG) can be scored in relation to the
data, typically using a penalized likelihood score function. The algorithms then search for the DAG that yields
the optimal score. Typical scoring functions include the Bayesian information criterion [96], Bayesian–Gaussian
equivalent score [96], and minimum description length (as an approximation of Kolmogorov complexity) [97,98].
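The scoring step can be sketched as follows for a linear-Gaussian model, using a BIC-style penalized log-likelihood and an exhaustive comparison of three candidate graphs over two variables. The function names, the parameter count per node (regression weights, intercept, and noise variance), and the "higher is better" sign convention are assumptions of this sketch rather than a prescribed implementation; practical algorithms search a much larger space of DAGs.

```python
import numpy as np

def node_loglik(data, child, parents):
    """Maximized Gaussian log-likelihood of one node regressed on its parents."""
    n = data.shape[0]
    y = data[:, child]
    X = np.column_stack([np.ones(n)] + [data[:, p] for p in parents])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / n  # maximum-likelihood noise variance
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)

def bic_score(data, dag):
    """BIC-style score of a DAG given as {node: [parents]}; higher is better."""
    n = data.shape[0]
    loglik = sum(node_loglik(data, c, ps) for c, ps in dag.items())
    # Assumed parameter count per node: weights + intercept + noise variance.
    n_params = sum(len(ps) + 2 for ps in dag.values())
    return loglik - 0.5 * n_params * np.log(n)

# Hypothetical data generated from X -> Y (column 0 = X, column 1 = Y).
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y = 1.5 * x + rng.normal(size=n)
data = np.column_stack([x, y])

candidates = {
    "empty": {0: [], 1: []},
    "X -> Y": {0: [], 1: [0]},
    "X <- Y": {0: [1], 1: []},
}
scores = {name: bic_score(data, dag) for name, dag in candidates.items()}
for name, s in scores.items():
    print(f"{name}: {s:.2f}")
```

Under these assumptions, both graphs with an edge score higher than the empty graph, but they receive the same score as each other, so this score alone does not distinguish the two directions.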
Structural asymmetry-based algorithms address the setting in which it is impossible to infer the causal
direction from observations alone because the data distribution admits structural causal models indicating either
of the structural directions X → Y or X ← Y. To address this problem, structural asymmetry-based algorithms
make some additional assumptions about the functional form of the underlying true data-generating structure,
so that they can exploit asymmetries to identify the direction of a structural relationship. These asymmetries
manifest in various ways, such as non-independent errors, measures of complexity, etc. Existing methods that
exploit such asymmetries are typically local solutions, as they are only able to test one edge at a time
(pairwise/bivariate causal directionality), or a triple (with the third variable being an unobserved confounder) [99].
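One such asymmetry, non-independent errors, can be sketched for the bivariate case as follows. The example assumes a linear model with uniform (non-Gaussian) noise, and it uses the absolute correlation between the squared regressor and the squared residual as a crude stand-in for a proper independence test (a kernel-based test would be a more principled choice). The function names and the data-generating model are hypothetical and serve only as an illustration.

```python
import numpy as np

def residual_dependence(cause, effect):
    """Regress effect on cause and return |corr(cause^2, residual^2)|.

    Least squares makes the residual uncorrelated with the regressor in both
    directions, so a higher-order statistic is needed to expose dependence.
    """
    c = cause - cause.mean()
    e = effect - effect.mean()
    slope = (c @ e) / (c @ c)
    resid = e - slope * c
    return abs(np.corrcoef(c**2, resid**2)[0, 1])

def infer_direction(x, y):
    """Prefer the direction whose residual looks less dependent on the regressor."""
    fwd = residual_dependence(x, y)  # candidate model X -> Y
    bwd = residual_dependence(y, x)  # candidate model Y -> X
    return ("X -> Y" if fwd < bwd else "Y -> X"), fwd, bwd

# Hypothetical data: X -> Y, linear mechanism, uniform (non-Gaussian) noise.
rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(-1, 1, size=n)
y = x + rng.uniform(-1, 1, size=n)

direction, fwd, bwd = infer_direction(x, y)
print(direction, f"forward={fwd:.3f}", f"backward={bwd:.3f}")
```

If both the cause and the noise were Gaussian, the residual would be independent of the regressor in both directions and this comparison would carry no information, which is why the additional assumption on the data-generating mechanism is essential.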
In the absence of intervention and manipulation, observational data leave researchers facing a number of
challenges. First, there may exist hidden confounders, a situation sometimes termed the third-variable problem.
Second, observational data may exhibit selection bias. For example, younger patients may generally prefer
surgery, while older patients may prefer medication. Third, most causal discovery algorithms are based on
strong but often untestable assumptions, and applying these strong assumptions to structural or graphical
models has drawn harsh criticism.
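The first of these challenges can be illustrated with a small simulation in which a confounder U drives both X and Y, while X has no effect on Y. The linear-Gaussian model, the unit coefficients, and the helper `ols_coefs` are assumptions of this sketch; in real observational data U would be unobserved, so the adjusted regression shown here would not be available.

```python
import numpy as np

def ols_coefs(y, *regressors):
    """Least-squares coefficients of y on an intercept plus the given regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(0)
n = 5000
u = rng.normal(size=n)      # confounder (hidden in practice)
x = u + rng.normal(size=n)  # U -> X
y = u + rng.normal(size=n)  # U -> Y; there is no edge between X and Y

naive_slope = ols_coefs(y, x)[1]        # ignores U
adjusted_slope = ols_coefs(y, x, u)[1]  # adjusts for U (only possible in simulation)
print(f"naive: {naive_slope:.3f}, adjusted for U: {adjusted_slope:.3f}")
```

Under this model the naive slope is close to 0.5 even though X has no causal effect on Y, whereas the slope is close to zero once U is included in the regression.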