
learning models is that they can mine hidden patterns and useful information from data of huge volume and varied structure more quickly and effectively than human beings can. Most importantly, people often mix personal emotions into their decisions, which can make those decisions unfavorable to certain groups. It is canonically believed that the decisions made by automatic decision-making systems are more objective and, thus, that there will be no discrimination against specific groups or individuals. However, this assumption does not always hold. Owing to biased training data and the inherent bias of the adopted models, machine learning models are not always as neutral as people expect.


               Since many automated systems driven by AI techniques can significantly impact people’s lives, it is important
               to eliminate discrimination embedded in the AI models so that fair decisions are made with their assistance.
               Indeed, in recent years, fairness issues of AI models have been receiving wide attention. For example, auto-
               mated resume screening systems often give biased evaluations based on traits beyond the control of candidates
               (e.g., gender and race), which may not only discriminate against job applicants with certain characteristics but
               also cost employers by missing out on good employees. Early research on achieving fairness of algorithms
focused on statistical correlation and developed many correlation-based fairness notions (e.g., predictive parity [7], statistical parity [8], and equalized odds [9]), which primarily focus on discovering the discrepancy of
               statistical metrics between individuals or sub-populations. However, correlation-based fairness notions fail
               to detect discrimination in algorithms in some cases and cannot explain the causes of discrimination, since
they do not take into account the mechanism by which the data are generated. A classic example is Simpson's paradox [10], in which the statistical conclusions drawn from the sub-populations and from the whole population can differ. On the other hand, discrimination claims usually require demonstrating causal relationships between sensitive attributes and the questionable decisions, rather than mere association or correlation between them.
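To make these correlation-based notions concrete, the following sketch (not from the paper; the data, group encoding, and function names are hypothetical) computes the group-level gaps corresponding to statistical parity, equalized odds, and predictive parity from a binary sensitive attribute A, a true outcome Y, and a model prediction Y_hat.

    import numpy as np

    # Hypothetical data: a = sensitive attribute, y = true outcome, y_hat = model prediction.
    rng = np.random.default_rng(0)
    a = rng.integers(0, 2, size=1000)
    y = rng.integers(0, 2, size=1000)
    y_hat = rng.integers(0, 2, size=1000)

    def statistical_parity_diff(y_hat, a):
        # P(Y_hat = 1 | A = 0) - P(Y_hat = 1 | A = 1)
        return y_hat[a == 0].mean() - y_hat[a == 1].mean()

    def equalized_odds_diff(y_hat, y, a):
        # Largest gap between groups in the true-positive (y = 1) and false-positive (y = 0) rates.
        gaps = []
        for y_val in (1, 0):
            g0 = y_hat[(a == 0) & (y == y_val)].mean()
            g1 = y_hat[(a == 1) & (y == y_val)].mean()
            gaps.append(abs(g0 - g1))
        return max(gaps)

    def predictive_parity_diff(y_hat, y, a):
        # P(Y = 1 | Y_hat = 1, A = 0) - P(Y = 1 | Y_hat = 1, A = 1)
        p0 = y[(a == 0) & (y_hat == 1)].mean()
        p1 = y[(a == 1) & (y_hat == 1)].mean()
        return p0 - p1

    print(statistical_parity_diff(y_hat, a))
    print(equalized_odds_diff(y_hat, y, a))
    print(predictive_parity_diff(y_hat, y, a))

Each metric compares purely statistical quantities across groups; none of them examines how the data were generated, which is exactly the limitation discussed above.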


Consider the example of graduate admissions at the University of California, Berkeley in 1973, which illustrates the importance of adopting a causal perspective to detect and eliminate discrimination. From the statistical results of the historical data, roughly 44% of all men who applied were admitted, compared to 35% of the women who applied. A flawed conclusion may then be drawn from this difference in admission rates between males and females, namely that women were discriminated against in graduate admission. An in-depth examination of the case revealed no wrongdoing by the educational institution; rather, a larger proportion of women applied to the most competitive departments, resulting in a lower admission rate than that of men. However, the question of discrimination is far from resolved; e.g., there is no way of knowing from the available data alone why women tended to apply to the more competitive departments. Therefore, it is helpful to detect discrimination and interpret its sources by understanding the data-generating mechanism, namely the causality behind the problem of discrimination. In addition, causal models can be regarded as a mechanism for integrating scientific knowledge and exchanging credible assumptions for credible conclusions. In this admission case, it seems that, owing to women's socialization and education, they tended toward fields of study that are generally more crowded. Therefore, it is necessary to explore the causal structure
               of the problem. Fortunately, more and more researchers have paid attention to detecting and eliminating dis-
               crimination from the perspective of causality, and various fairness concepts and fairness-enhancing methods
               based on causality have been proposed.
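The reversal in the admission example is an instance of Simpson's paradox mentioned above. The following sketch uses purely illustrative numbers (not the actual 1973 Berkeley figures) to show how per-department admission rates can favor women while the aggregate rate favors men, simply because far more women apply to the competitive department.

    # Illustrative numbers only (not the actual 1973 Berkeley data).
    applications = {
        # department: {group: (admitted, applied)}
        "less competitive": {"men": (80, 100), "women": (90, 100)},
        "competitive":      {"men": (30, 100), "women": (120, 400)},
    }

    def rate(admitted, applied):
        return admitted / applied

    # Within each department, women are admitted at an equal or higher rate than men.
    for dept, groups in applications.items():
        print(f"{dept}: men {rate(*groups['men']):.0%}, women {rate(*groups['women']):.0%}")

    # Aggregating over departments reverses the comparison, because most women
    # apply to the competitive department with its low admission rate.
    for group in ("men", "women"):
        admitted = sum(applications[d][group][0] for d in applications)
        applied = sum(applications[d][group][1] for d in applications)
        print(f"overall {group}: {rate(admitted, applied):.0%}")

The aggregate gap here is driven by department choice, not by the admission decisions themselves, which is why reasoning about the data-generating mechanism, rather than the marginal statistics, is needed to assess discrimination.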


Compared with fairness notions based on correlation, causality-based fairness notions and methods additionally take into account knowledge that reflects the causal structure of the problem. This knowledge reveals the mechanism of data generation and helps us comprehend how changes in sensitive attributes propagate through the system, which is conducive to improving the interpretability of model decisions [11–14]. Therefore, causality-based fair machine learning algorithms help to enhance fairness. However, causality-based fairness approaches still face many challenges, one of which is the unidentifiability of causal effects [15]. In other words, the causal effect between two variables cannot be uniquely computed from