
Su et al. Intell Robot 2022;2(3):244­74  I http://dx.doi.org/10.20517/ir.2022.17    Page 248

               evaluate the causal effect of $S$ on $Y$ (more details of intervention can be seen in Section 3). Figure 2(b) shows
               the causal structure of such an example after intervening on $S$. The hiring decisions made by the company are
               fair if the hiring proportion when all applicants in the population have religious beliefs is the same as the
               hiring proportion when all applicants in the population have no religious beliefs, i.e.,
               $P(Y = 1 \mid do(S = 1)) = P(Y = 1 \mid do(S = 0))$. Formally, these probabilities in this example are obtained as below:

               $$P(Y = 1 \mid do(S = 1)) = \sum_{z \in \{0,1\}} P(Y = 1 \mid S = 1, Z = z) \cdot P(Z = z) = 0.02 \times 0.5 + 0.25 \times 0.5 = 0.135$$

               $$P(Y = 1 \mid do(S = 0)) = \sum_{z \in \{0,1\}} P(Y = 1 \mid S = 0, Z = z) \cdot P(Z = z) = 0.03 \times 0.5 + 0.24 \times 0.5 = 0.135$$


               These values confirm that the hiring decisions made by the company do not discriminate against applicants
               with religious beliefs. Therefore, it is critical to conduct a causal analysis of the problem, since understanding
               the causal mechanisms behind the problem can not only help to detect discrimination but also help to interpret
               the sources of discrimination.
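
The arithmetic behind the two interventional hiring rates can be checked with a few lines of Python. This is only a sketch of the adjustment sum using the numbers quoted in the example. The names `p_hire`, `p_z`, and `p_hire_do` are illustrative, `z` is a placeholder for the binary covariate being summed over, and the assignment of each conditional probability to `z = 0` or `z = 1` is assumed to follow the order of the terms in the text.

```python
# Adjustment formula: P(hired | do(S=s)) = sum_z P(hired | S=s, Z=z) * P(Z=z)

# P(hired | S=s, Z=z), keyed by (s, z). Which number belongs to z=0 and which
# to z=1 is an assumption based on the order of the terms in the example.
p_hire = {
    (1, 0): 0.02, (1, 1): 0.25,  # applicants with religious beliefs (s=1)
    (0, 0): 0.03, (0, 1): 0.24,  # applicants without religious beliefs (s=0)
}

# Marginal distribution P(Z=z) of the covariate.
p_z = {0: 0.5, 1: 0.5}


def p_hire_do(s):
    """Hiring rate after forcing the sensitive attribute to the value s."""
    return sum(p_hire[(s, z)] * p_z[z] for z in p_z)


rate_with_beliefs = p_hire_do(1)
rate_without_beliefs = p_hire_do(0)
print(f"{rate_with_beliefs:.3f} {rate_without_beliefs:.3f}")
```

Both rates agree up to floating-point rounding, so their difference is zero, which is the fairness condition stated in the example.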



               3. PRELIMINARIES AND NOTATION
               In this review, an attribute is denoted by an uppercase letter, e.g., $X$; a subset of attributes is denoted by a bold
               uppercase letter, e.g., $\mathbf{X}$; a domain value of attribute $X$ is denoted by a lowercase letter, e.g., $x$; and the value
               assignment of subset attributes $\mathbf{X}$ is denoted by a bold lowercase letter, e.g., $\mathbf{x}$. In particular, $S$ represents the
               sensitive attribute (e.g., race) and $Y$ represents the predicted result of the AI model system (e.g., loans).

               One of the most popular causal model frameworks is Pearl’s Structural Causal Model (SCM) [10]. A structural
               causal model $\mathcal{M}$ is represented by a quadruple $\langle \mathbf{U}, P(\mathbf{U}), \mathbf{V}, \mathbf{F} \rangle$:
               1. $\mathbf{U}$ denotes exogenous variables that cannot be observed but constitute the background knowledge behind
                  the model.
               2. $P(\mathbf{U})$ represents the joint probability distribution of $\mathbf{U}$.
               3. $\mathbf{V}$ denotes endogenous variables that can be observed.
               4. $\mathbf{F}$ denotes a set of functions mapping from $\mathbf{U} \cup \mathbf{V}$ to $\mathbf{V}$, which reflects the causal relationship between
                  variables. For each $X \in \mathbf{V}$, there is a mapping function $f_X \in \mathbf{F}$ from $\mathbf{U} \cup (\mathbf{V} \setminus \{X\})$ to $X$, i.e.,
                  $X = f_X(Pa(X), U_X)$, where the parent variables $Pa(X) \subset \mathbf{V} \setminus \{X\}$ are the endogenous variables that directly control
                  the value of $X$, and $U_X$ is a set of exogenous variables that directly determine $X$.
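
To make the quadruple concrete, the following Python sketch builds a toy SCM with three binary endogenous variables. Everything in it is invented for illustration and is not taken from the review: the variable names `S`, `Z`, `Y`, the thresholds inside the structural functions, and the helpers `sample_exogenous` and `solve`. It assumes one independent exogenous term per endogenous variable.

```python
import random


def sample_exogenous(rng):
    """P(U): draw the unobserved background variables (assumed independent here)."""
    return {"U_S": rng.random(), "U_Z": rng.random(), "U_Y": rng.random()}


# F: one structural function per endogenous variable. Each function reads the
# values of its parents from v and its own exogenous term from u.
F = {
    "S": lambda v, u: int(u["U_S"] < 0.5),                 # no endogenous parents
    "Z": lambda v, u: int(u["U_Z"] < 0.3 + 0.4 * v["S"]),  # Pa(Z) = {S}
    "Y": lambda v, u: int(u["U_Y"] < 0.1 + 0.5 * v["Z"]),  # Pa(Y) = {Z}
}

# Evaluation order in which every variable comes after its parents.
ORDER = ["S", "Z", "Y"]


def solve(u):
    """Compute the endogenous variables V that a given exogenous setting u induces."""
    v = {}
    for name in ORDER:
        v[name] = F[name](v, u)
    return v


rng = random.Random(0)
samples = [solve(sample_exogenous(rng)) for _ in range(5)]
```

Once the exogenous values are fixed, `solve` is fully deterministic. All randomness in the observed variables comes from $P(\mathbf{U})$, which is the point of separating the exogenous distribution from the structural functions.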

               A causal model $\mathcal{M}$ is associated with a causal graph $\mathcal{G} = \langle \mathbf{V}, \mathbf{E} \rangle$, a directed acyclic graph, where $\mathbf{V}$ is a set
               of nodes, each of which represents an endogenous variable of $\mathbf{V}$, and each element in $\mathbf{E}$ indicates a directed
               edge →, pointing from a node $W \in \mathbf{U} \cup \mathbf{V}$ to another node $X \in \mathbf{V}$ if $f_X$ uses values of $W$ as input, which
               represents a causal relationship between the corresponding variables. The exogenous variables $\mathbf{U}$ can be either
               independent or dependent. If the exogenous variables $\mathbf{U}$ are mutually independent, the causal model is called
               a Markovian model. In this case, exogenous variables typically are not represented in the causal diagram. If
               the exogenous variables are mutually dependent (hidden confounders), the causal model is called
               a semi-Markovian model. In the semi-Markovian model, dashed bi-directed edges are used to represent the
               hidden confounders between two variables. Figure 3 shows examples of causal graphs of a Markovian model
               [Figure 3(a)] and a semi-Markovian model [Figure 3(b)].
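
One possible way to hold such a graph in code is to store, for every endogenous node, the set of its parents, plus a separate set of bi-directed edges for hidden confounders. The sketch below makes that assumption. The graph itself and the helper names `directed_edges`, `is_acyclic`, and `model_type` are hypothetical, and the acyclicity check relies on the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical causal graph: each endogenous node maps to its endogenous parents.
parents = {"S": set(), "Z": {"S"}, "Y": {"S", "Z"}}

# Dashed bi-directed edges: pairs of nodes that share a hidden confounder.
# An empty set corresponds to the Markovian case.
bidirected = {frozenset({"Z", "Y"})}


def directed_edges(parents):
    """List the directed edges (parent, child) encoded by the parent sets."""
    return {(p, child) for child, ps in parents.items() for p in ps}


def is_acyclic(parents):
    """Return True if the directed part of the graph has no cycle."""
    try:
        # TopologicalSorter expects a mapping from node to its predecessors.
        list(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False


def model_type(bidirected):
    """Classify the model by whether any hidden confounder is present."""
    return "semi-Markovian" if bidirected else "Markovian"


print(sorted(directed_edges(parents)), is_acyclic(parents), model_type(bidirected))
```

Keeping the bi-directed edges apart from the parent sets mirrors the drawing convention described above, where exogenous variables are omitted from the diagram and only their dependence is marked.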

               An intervention simulates the physical interventions that force some variable $X$ to take certain values $x$ regardless
               of the corresponding function $f_X$, denoted by $do(x)$. In the causal graph, it is shown as discarding all edges