Page 251 Su et al. Intell Robot 2022;2(3):24474 I http://dx.doi.org/10.20517/ir.2022.17
white men) and $S = s^-$ denotes the disadvantaged one (e.g., non-white men). Table 1 summarizes various causality-based fairness notions falling under different types.
4.1. Group causality-based fairness notions
Group fairness notions aim to discover the difference in outcomes of AI decision models across different groups. The value of an individual's sensitive attribute reflects the group he (or she) belongs to. Consider an example of salary prediction where $s^+$ and $s^-$ represent the male and female groups, respectively. Some representative group causality-based fairness notions are introduced as follows.
4.1.1. Total effect
Before defining total effect (TE) [10], statistical parity (SP) is first introduced, since it resembles TE but is fundamentally different from it. SP is a common statistics-based fairness notion, which requires that individuals be treated similarly regardless of their sensitive attributes. Statistical parity is satisfied if
$$|\mathrm{SP}(y)| = |P(y \mid S = s^+) - P(y \mid S = s^-)| \le \tau \tag{2}$$
Intuitively, $\mathrm{SP}(y)$ measures the difference between the conditional distributions of $Y = y$ when one's sensitive attribute changes from $s^+$ to $s^-$, and the decision is considered fair if this difference is within the fairness threshold $\tau$. The main limitation of $\mathrm{SP}(y)$ is that it cannot reflect the causal relationship between $S$ and $Y$. Total effect is the causal version of statistical parity, which additionally considers the generation mechanism of the data. Formally, total effect can be computed as follows:
$$\mathrm{TE}(y) = P(y \mid do(S = s^+)) - P(y \mid do(S = s^-)) \tag{3}$$
TE measures the total causal effect on the decision $Y = y$ of the sensitive attribute $S$ changing from $s^+$ to $s^-$. Intuitively, statistical parity represents the difference in probabilities of $Y = y$ in the sampling population, while total effect represents the difference in probabilities of $Y = y$ in the entire population.
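The contrast between Equations (2) and (3) can be made concrete with a minimal sketch. The toy model below is an assumption introduced purely for illustration (it does not come from the reviewed works): a binary confounder `Z` influences both the sensitive attribute `S` and the decision `Y`, with `S = 1` standing for $s^+$ and `S = 0` for $s^-$. SP is computed from the observational conditionals, while TE is computed with the backdoor adjustment over `Z`, which is valid only under the assumed graph.

```python
# Hypothetical toy model (structure and numbers are illustrative assumptions):
# Z -> S, Z -> Y, S -> Y, all binary. S = 1 stands for s^+, S = 0 for s^-.
P_Z = {0: 0.5, 1: 0.5}                       # P(Z = z)
P_S1_GIVEN_Z = {0: 0.2, 1: 0.8}              # P(S = 1 | Z = z)
P_Y1_GIVEN_SZ = {(0, 0): 0.1, (0, 1): 0.5,   # P(Y = 1 | S = s, Z = z)
                 (1, 0): 0.3, (1, 1): 0.7}


def p_s_given_z(s, z):
    """P(S = s | Z = z)."""
    p1 = P_S1_GIVEN_Z[z]
    return p1 if s == 1 else 1.0 - p1


def p_y1_given_s(s):
    """Observational P(Y = 1 | S = s), obtained by marginalising over Z."""
    num = sum(P_Z[z] * p_s_given_z(s, z) * P_Y1_GIVEN_SZ[(s, z)] for z in P_Z)
    den = sum(P_Z[z] * p_s_given_z(s, z) for z in P_Z)
    return num / den


def p_y1_do_s(s):
    """Interventional P(Y = 1 | do(S = s)) via backdoor adjustment over Z."""
    return sum(P_Z[z] * P_Y1_GIVEN_SZ[(s, z)] for z in P_Z)


def statistical_parity():
    """SP(y) for y = 1, as in Equation (2) without the absolute value."""
    return p_y1_given_s(1) - p_y1_given_s(0)


def total_effect():
    """TE(y) for y = 1, as in Equation (3)."""
    return p_y1_do_s(1) - p_y1_do_s(0)
```

Under these assumed numbers, SP is 0.44 whereas TE is 0.20: part of the observed disparity is due to the confounder rather than to the sensitive attribute itself. When `S` has no incoming edges, the two quantities coincide.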
A more complex variant of total effect considers the effect of changing the sensitive attribute value on the outcome of automated decision making when the outcome for that individual has already been observed, which is known as the effect of treatment on the treated (ETT) [10]. This typically involves a counterfactual situation, which requires changing the sensitive attribute value of that individual at that time to examine whether the outcome changes. ETT can be mathematically formalized using counterfactual quantities as follows:
$$\mathrm{ETT}(y) = P(y_{s^+} \mid s^-) - P(y_{s^-} \mid s^-) \tag{4}$$
where $P(y_{s^+} \mid s^-)$ represents the probability of $Y = y$ had $S$ been $s^+$, given that $S$ had been observed to be $s^-$. $P(y_{s^-} \mid s^-) = P(y \mid s^-)$ represents the conditional distribution of $Y = y$ when we observe $S = s^-$. The counterfactual probability involves two worlds: an actual world where $S = s^-$, and a counterfactual world where, for the same individual, $S = s^+$. Notice that $P(y_{s^-} \mid s^-) = P(y \mid s^-)$ holds by consistency.
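As a hedged illustration of the two-world reasoning behind ETT, the sketch below evaluates $P(y_{s^+} \mid s^-)$ and $P(y_{s^-} \mid s^-)$ exactly on a small, fully specified structural causal model. The model, its thresholds, and all function names are hypothetical choices made for this example. The exogenous variables are enumerated, units inconsistent with the observed $S = s^-$ are discarded (abduction), $S$ is then set to the counterfactual value (action), and $Y$ is recomputed (prediction).

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy SCM (structure and numbers are illustrative assumptions):
#   Z = U_Z,  S = f_S(Z, U_S),  Y = f_Y(S, Z, U_Y), all binary.
# S = 1 plays the role of s^+, S = 0 the role of s^-.
U_Z_VALUES = range(2)    # uniform, so P(Z = 1) = 1/2
U_S_VALUES = range(5)    # uniform on {0, ..., 4}
U_Y_VALUES = range(10)   # uniform on {0, ..., 9}

S_THRESHOLD = {0: 1, 1: 4}   # P(S = 1 | Z = z) = threshold / 5
Y_THRESHOLD = {(0, 0): 1, (0, 1): 5,   # P(Y = 1 | S = s, Z = z) = threshold / 10
               (1, 0): 3, (1, 1): 9}


def f_s(z, u_s):
    """Structural equation for the sensitive attribute."""
    return int(u_s < S_THRESHOLD[z])


def f_y(s, z, u_y):
    """Structural equation for the decision."""
    return int(u_y < Y_THRESHOLD[(s, z)])


def counterfactual_prob(s_cf, s_obs):
    """P(Y_{s_cf} = 1 | S = s_obs), by enumerating the exogenous variables."""
    weight = Fraction(1, len(U_Z_VALUES) * len(U_S_VALUES) * len(U_Y_VALUES))
    num = Fraction(0)
    den = Fraction(0)
    for u_z, u_s, u_y in product(U_Z_VALUES, U_S_VALUES, U_Y_VALUES):
        z = u_z
        if f_s(z, u_s) != s_obs:
            continue                         # abduction: keep units with S = s_obs
        den += weight
        num += weight * f_y(s_cf, z, u_y)    # action and prediction with S := s_cf
    return num / den


def ett():
    """ETT(y) for y = 1, as in Equation (4)."""
    return counterfactual_prob(1, 0) - counterfactual_prob(0, 0)
```

With these assumed thresholds, `counterfactual_prob(1, 0)` is 21/50 and `counterfactual_prob(0, 0)` is 9/50, so `ett()` returns 6/25. The second term equals the observational $P(Y = 1 \mid S = s^-)$, which is the consistency property noted above.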
Other fairness notions similar to TE have also been proposed. For example, FACT (fair on average causal effect) [32] was proposed to detect discrimination in automated decision making and is based on the potential outcome framework [33,34]. It considers an outcome fair if the average causal effect on $Y$, over all individuals in the population, of changing the value of $S$ from $s^-$ to $s^+$ is zero, i.e., $E[Y^{(s^+)} - Y^{(s^-)}] = 0$, where $Y^{(s^+)}$ denotes the potential outcome of an individual had $S$ been $s^+$.
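The FACT condition can be checked on data only after the average causal effect has been estimated. The sketch below is not the estimator used in [32]; it is a plain stratification estimator, shown as one possible illustration under the assumption that potential outcomes are independent of $S$ given an observed covariate `z`. The helper names, the toy records, and the tolerance `tol` are all hypothetical.

```python
from collections import defaultdict


def average_causal_effect(records):
    """Stratified estimate of E[Y^(s+) - Y^(s-)].

    records: iterable of (z, s, y) tuples with s in {0, 1} (1 = s^+, 0 = s^-).
    Assumes ignorability given z and overlap within every stratum of z.
    """
    # stats[z][s] = [sum of y, number of records]
    stats = defaultdict(lambda: {0: [0.0, 0], 1: [0.0, 0]})
    n = 0
    for z, s, y in records:
        stats[z][s][0] += y
        stats[z][s][1] += 1
        n += 1
    ace = 0.0
    for z, by_s in stats.items():
        sum1, n1 = by_s[1]
        sum0, n0 = by_s[0]
        if n1 == 0 or n0 == 0:
            raise ValueError(f"stratum {z!r} lacks overlap between groups")
        # Weight each within-stratum difference by the stratum's share of the data.
        ace += (n1 + n0) / n * (sum1 / n1 - sum0 / n0)
    return ace


def satisfies_fact(records, tol=1e-9):
    """True if the estimated average causal effect is zero up to tol (assumed)."""
    return abs(average_causal_effect(records)) <= tol


def make_records(z, s, n_pos, n_total):
    """Build n_total toy records for stratum z and group s, n_pos of them with y = 1."""
    return [(z, s, 1)] * n_pos + [(z, s, 0)] * (n_total - n_pos)


# Toy data: within each stratum both groups have the same positive rate,
# but the s^+ group is concentrated in the stratum with the higher rate.
toy = (make_records(0, 0, 2, 8) + make_records(0, 1, 1, 4)
       + make_records(1, 0, 3, 4) + make_records(1, 1, 6, 8))
```

On this toy data the raw positive rates differ between the groups (7/12 versus 5/12), yet the stratified estimate is zero, so `satisfies_fact(toy)` holds. In practice an exact zero is rarely observed, which is why a tolerance or a statistical test would be needed.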
TE and ETT both aim to eliminate the decision bias on all causal paths from $S$ to $Y$. However, they cannot distinguish between direct discrimination, indirect discrimination, and explainable bias.