

6. APPLICATIONS OF FAIR MACHINE LEARNING
This section surveys several application domains of machine learning and the work each domain has produced to combat discrimination in its methods.


6.1. Missing data
One major challenge for fairness-enhancing algorithms is dealing with biases inherent in the dataset that are caused by missing data. Selection bias arises when the distribution of the collected data fails to reflect the real characteristics of disadvantaged groups. Martínez-Plumed et al. [63] observed that selection bias is mainly caused by individuals in disadvantaged groups being reluctant to disclose information; for example, people with high incomes are more willing to share their earnings than people with low incomes, which leads to the biased inference that attending a training institution helps to raise earnings. To address this problem, Bareinboim et al. [64] and Spirtes et al. [65] studied how to handle missing data and repair datasets containing selection bias through causal reasoning, in order to improve fairness.
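As a minimal, hypothetical illustration of this kind of selection bias (not taken from the cited works), the sketch below simulates earnings that high earners disclose more readily than low earners, and shows how inverse-probability weighting, one simple ingredient of such repair methods, recovers an estimate closer to the truth. All variable names and parameter values are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# True population of earnings (the training-institution effect is irrelevant
# here; we only care about estimating the population mean).
earnings = rng.lognormal(mean=10.0, sigma=0.5, size=n)

# High earners are more willing to disclose their earnings (non-random response).
p_disclose = 0.2 + 0.6 * (earnings > np.median(earnings))
disclosed = rng.random(n) < p_disclose

true_mean = earnings.mean()
naive_mean = earnings[disclosed].mean()          # selection-biased upward

# If the disclosure mechanism is known (or estimated), reweight the observed
# records by the inverse of their disclosure probability.
weights = 1.0 / p_disclose[disclosed]
ipw_mean = np.average(earnings[disclosed], weights=weights)

print(f"true mean     : {true_mean:10.1f}")
print(f"naive mean    : {naive_mean:10.1f}  (biased by selective disclosure)")
print(f"IPW-corrected : {ipw_mean:10.1f}")
```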


On the other hand, the collected data represent only one side of reality; that is, they contain no information about the population that was not selected. Biases may arise from the very process that decides which records are included in the dataset. For example, consider a dataset that records the information of individuals whose loans were approved, together with whether they repaid those loans. Even if an automatic decision system satisfying certain fairness requirements is trained on this dataset to predict whether applicants will repay their loans on time, such a predictor may be discriminatory when used to assess the credit scores of future applicants, since the populations whose loans were not approved are not sufficiently represented in the training data. Goel et al. [66] used a causal graph-based framework to model the process by which data may be missing under the different settings in which past decisions were made, and proved that some data distributions can be recovered from the incomplete available data based on the causal graph. Although the practical scenarios they discuss are not exhaustive, their work shows that the causal structure can be used to determine the recoverability of quantities of interest in any new scenario.
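To make the loan example concrete, the hypothetical sketch below (not Goel et al.'s framework) trains a repayment model only on approved applicants, where past approvals relied on an extra signal the model never observes; the resulting scores are optimistic for applicants who resemble the historically rejected population. All distributions and parameters are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 100_000

income = rng.normal(0, 1, n)   # feature the model sees
signal = rng.normal(0, 1, n)   # extra information seen only by past loan officers

# True repayment depends on both income and the hidden signal.
p_repay = 1 / (1 + np.exp(-(income + signal)))
repaid = rng.random(n) < p_repay

# Past approvals also used the hidden signal, so approved low-income
# applicants tend to have unusually good hidden signals.
p_approve = 1 / (1 + np.exp(-(income + 2.0 * signal)))
approved = rng.random(n) < p_approve

# Repayment labels are observed only for approved applicants.
X = income.reshape(-1, 1)
clf = LogisticRegression().fit(X[approved], repaid[approved])
pred = clf.predict_proba(X)[:, 1]

# Evaluate on the full applicant population, not just the approved subset.
low_income = income < -1.0
print(f"low-income applicants: true repay rate {repaid[low_income].mean():.3f}, "
      f"model predicts {pred[low_income].mean():.3f}  (optimistic)")
```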


Causality-based methods are therefore a promising direction for dealing with missing data: they provide tools to improve fairness when the dataset suffers from discrimination caused by missing data.


6.2. Fair recommender systems
Recommender systems are recognized as one of the most effective ways to alleviate information overload. Nowadays, they are widely used in various applications, such as e-commerce platforms, advertisements, news articles, and job postings. They are used not only to analyze user behavior and infer users' preferences so as to provide personalized recommendations, but also to give content providers more potential for making profits. Unfortunately, fairness issues exist in recommender systems [67], which are challenging to handle and may deteriorate the effectiveness of the recommendations.

The discrimination embedded in recommender systems is mainly caused by the following aspects. First, user behavior is observed only on exposed items, so the observational data are confounded by the exposure mechanism of the recommender and the preferences of the users. Second, disadvantaged items are not well represented in the observational data: some items are more popular than others and thus receive more user interactions, so recommender systems tend to expose users to these popular items, which discriminates against unpopular items and leaves insufficient opportunities for minority items. Finally, one characteristic of recommender systems is the feedback loop: the items the system exposes determine the observed user behavior, which is then circled back as training data for the recommender. Such a feedback loop not only creates biases but also intensifies them over time, resulting in the "rich get richer" Matthew effect.
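A minimal, hypothetical simulation of this feedback loop (the exposure policy and all parameters are assumptions, chosen only to make the dynamic visible) shows how a small initial popularity gap hardens when a recommender exposes only its currently most-clicked items and then retrains on the resulting logs:

```python
import numpy as np

rng = np.random.default_rng(2)
n_items, top_k, rounds, impressions = 20, 5, 30, 1_000

true_quality = np.full(n_items, 0.5)                  # every item is equally good
clicks = rng.integers(0, 3, n_items).astype(float)    # small random initial logs
initial_top = np.argsort(clicks)[-top_k:]             # items that start out "popular"

for _ in range(rounds):
    # Exposure policy: only the currently most-clicked items are ever shown.
    candidates = np.argsort(clicks)[-top_k:]
    shown = rng.choice(candidates, size=impressions)
    clicked = rng.random(impressions) < true_quality[shown]
    # The new interactions feed straight back into the training log (the loop).
    np.add.at(clicks, shown[clicked], 1)

final_share = clicks[initial_top].sum() / clicks.sum()
print(f"click share of the {top_k} initially most popular items "
      f"after {rounds} rounds: {final_share:.2f} "
      f"(uniform exposure would give {top_k / n_items:.2f})")
```

Even though every item has identical quality in this toy setup, the initially popular items end up holding nearly all logged interactions, which is exactly the "rich get richer" dynamic described above.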