
Su et al. Intell Robot 2022;2(3):244-74 | Page 256

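               [Figure 6 diagram: starting from the data, causality-based methods branch into pre-processing (eliminate bias,
               then algorithm, then predict), in-processing (fairness constraints yielding a fair algorithm, then predict),
               and post-processing (algorithm, then prediction revision, then predict).]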

                                  Figure 6. The categorization of causality-based fairness-enhancing approaches.

               5.1. Pre-processing Causality-based methods
               Pre-processing methods update the training data before feeding them into a machine learning algorithm.
               Specifically, one idea is to change the labels of some instances or reweigh them before training to limit the
               causal effects of the sensitive attributes on the decision. As a result, the classifier can make a fairer
               prediction [36]. On the other hand, some studies propose to reconstruct the feature representations of the data to
               eliminate discrimination embedded in the data [38,39].
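
               The reweighing idea can be illustrated with a minimal sketch. It is not the method of any cited work: it
               uses a purely statistical stand-in (weights that make the sensitive attribute and the label independent
               in the weighted data) rather than a causal effect, and the function names and toy data are hypothetical.

```python
from collections import Counter

def reweigh(sensitive, labels):
    """Return one weight per instance so that, in the weighted data,
    the sensitive attribute and the label are statistically independent.
    (Assumption: association-based stand-in, not a causal criterion.)"""
    n = len(labels)
    count_s = Counter(sensitive)
    count_y = Counter(labels)
    count_sy = Counter(zip(sensitive, labels))
    # weight = frequency expected under independence / observed frequency
    return [
        (count_s[s] * count_y[y]) / (n * count_sy[(s, y)])
        for s, y in zip(sensitive, labels)
    ]

# Hypothetical toy data: group 0 receives positive labels more often.
sensitive = [0, 0, 0, 0, 1, 1, 1, 1]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
weights = reweigh(sensitive, labels)

def weighted_positive_rate(group):
    """Weighted share of positive labels within one sensitive group."""
    num = sum(w for w, s, y in zip(weights, sensitive, labels)
              if s == group and y == 1)
    den = sum(w for w, s in zip(weights, sensitive) if s == group)
    return num / den

print(weighted_positive_rate(0), weighted_positive_rate(1))
```

               After reweighing, both groups have the same weighted positive rate, so a learner that honours instance
               weights no longer sees an association between group membership and label in the training data.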

               For example, Zhang et al. [27] formalized the presence of discrimination as the presence of a certain path-specific
               effect, and then framed the problem as one of maximizing the likelihood subject to constraints that restrict the
               magnitude of the path-specific effect. To deal with unidentifiable cases, they mathematically bound the path-specific effect. CFGAN [40] ,
               which builds on Causal GAN [41] to learn the causal relationships between the attributes, adopts two
               generators that separately simulate the underlying causal model that generates the real data and the causal model
               after the intervention, together with two discriminators: one to keep the generated distribution close to the real one and
               one to achieve total effect fairness, counterfactual fairness, and path-specific fairness. Salimi et al. [42,43] leveraged the dependencies
               between sensitive and other attributes, which are provided by causal knowledge, to add or remove samples
               from the collected datasets in order to eliminate the discrimination. Nabi et al. [44] only considered mapping
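               generative models for $p(Y, \mathbf{V} \setminus \mathbf{W} \mid \mathbf{W})$, where $\mathbf{W}$ is a subset of the attributes, to “fair” versions of this distribution
               $p^{*}(Y, \mathbf{V} \setminus \mathbf{W} \mid \mathbf{W})$, and ensured that $p^{*}(\mathbf{W}) = p(\mathbf{W})$. PSCF-VAE [45] achieves path-specific counterfactual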
               fairness by modifying the observed values of the descendant attribute of the sensitive attribute on the unfair
               causal path during testing, leaving the underlying data-generation mechanism unaltered during training.
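
               The constrained likelihood formulation mentioned at the start of this paragraph can be written
               schematically as follows. The notation is an illustrative assumption and is not taken from the cited work:
               $\theta$ denotes the model parameters, $\mathcal{L}(\theta; \mathcal{D})$ the likelihood of the data $\mathcal{D}$,
               $\mathrm{PSE}_{\pi}(\theta)$ the path-specific effect along the unfair paths $\pi$, and $\epsilon_l$, $\epsilon_u$
               user-chosen tolerances.

```latex
\hat{\theta} \;=\; \arg\max_{\theta} \; \mathcal{L}(\theta; \mathcal{D})
\quad \text{subject to} \quad
\epsilon_l \;\le\; \mathrm{PSE}_{\pi}(\theta) \;\le\; \epsilon_u
```

               Under this reading, tightening the interval $[\epsilon_l, \epsilon_u]$ around the no-effect value trades
               likelihood for a smaller permitted effect along the unfair paths.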

               5.2. In-processing Causality-based methods
               In-processing methods eliminate discrimination by adding constraints or regularization terms to machine
               learning models [46–50]. When the learning procedure of a machine learning model can be modified,
               in-processing can be applied during training, either by incorporating changes into the objective
               function or by imposing a constraint.
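
               A minimal sketch of changing the objective function is given below. It is an illustration under strong
               assumptions rather than any cited method: the model is a plain logistic regression, the "counterfactual"
               of a sample is formed naively by flipping a binary sensitive attribute `a` while leaving all other
               features unchanged (which ignores descendants of the sensitive attribute in the causal graph), and the
               names `train` and `lam` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(X, a, y, lam, lr=0.1, epochs=500):
    """Gradient descent on: mean log-loss + lam * mean squared logit gap,
    where the gap compares each sample with its copy that has `a` flipped.
    Assumption: flipping `a` alone is a naive counterfactual."""
    d, n = len(X[0]), len(y)
    w = [0.0] * d   # weights of the non-sensitive features
    w_a = 0.0       # weight of the sensitive attribute
    b = 0.0         # intercept
    for _ in range(epochs):
        gw, gwa, gb = [0.0] * d, 0.0, 0.0
        for xi, ai, yi in zip(X, a, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + w_a * ai + b
            err = sigmoid(z) - yi          # gradient of log-loss w.r.t. z
            for j in range(d):
                gw[j] += err * xi[j] / n
            gwa += err * ai / n
            gb += err / n
            # logit(flipped) - logit(original) = w_a * ((1 - ai) - ai)
            gap = w_a * (1 - 2 * ai)
            # gradient of lam * gap**2 / n with respect to w_a
            gwa += lam * 2 * gap * (1 - 2 * ai) / n
        w = [wj - lr * g for wj, g in zip(w, gw)]
        w_a -= lr * gwa
        b -= lr * gb
    return w, w_a, b

# Hypothetical toy data in which the label depends on `a` given x.
X = [[0.2], [0.4], [0.6], [0.8], [0.2], [0.4], [0.6], [0.8]]
a = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 0, 0, 1, 0, 1, 1, 1]

_, w_a_plain, _ = train(X, a, y, lam=0.0)
_, w_a_fair, _ = train(X, a, y, lam=1.0)
print(abs(w_a_plain), abs(w_a_fair))
```

               For this linear model the penalty reduces to shrinking the weight on the sensitive attribute, so a larger
               `lam` yields predictions that change less when the attribute is flipped, typically at some cost in
               training loss.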

               For example, the multi-world fairness algorithm [12] adds constraints to the classification model that require it to
               satisfy counterfactual fairness. To address unidentifiable situations and alleviate the difficulty of
               determining the causal model, it combines multiple possible causal models to make approximately fair predictions. A
               tuning parameter is used to modulate the trade-off between fairness and accuracy. Hu et al. [51] proposed
               to learn multiple fair classifiers simultaneously from a static training dataset. Each classifier is treated as
               performing a soft intervention on the decision, and its influence is inferred as a post-intervention distribution,
               which is then used to formulate the loss functions and fairness constraints. Garg et al. [52] proposed to penalize the
               discrepancy between real samples and their counterfactual samples by adding a counterfactual logit pairing (CLP) term to the loss
               function of the algorithm. Similarly, the authors of [53] proposed to add constraint terms during training
               to eliminate the difference in outcome between two identical individuals, one from the real world and one