Page 261 Su et al. Intell Robot 2022;2(3):244-74 | http://dx.doi.org/10.20517/ir.2022.17
Table 2. Typical packages or software for causal analysis

Type | Package name | Programming language | Description
Causal discovery | TETRAD [85] | Java | TETRAD is full-featured software for causal analysis; after considerable development, it can be used to discover the causal structure behind a dataset, estimate causal effects, simulate causal models, etc.
Causal discovery | Py-causal [87] | Python | Py-causal is a Python encapsulation of TETRAD, which can call the algorithms and related functions in TETRAD.
Causal discovery | Causal-learn [86] | Python | Causal-learn is the Python version of TETRAD. It provides implementations of the latest causal discovery methods, ranging from constraint-based, score-based, and constrained functional causal model-based methods to permutation-based methods.
Causal discovery | Tigramite [88] | Python | Tigramite focuses on searching for causal structure in observational time series data.
Causal discovery | gCastle [89] | Python | gCastle provides many gradient-based causal discovery approaches, as well as classic causal discovery algorithms.
Causal effect and inference | CausalML [90] | Python | CausalML encapsulates many causal learning and inference approaches. One highlight of this software package is uplift modeling, which is used to evaluate the conditional average treatment effect (CATE).
Causal effect and inference | Causaleffect [92] | R | Causaleffect is an implementation of the ID algorithm.
Causal effect and inference | DoWhy [93] | Python | DoWhy takes causal graphs as prior knowledge and uses Pearl's do-calculus to assess causal effects.
Causal effect and inference | Mediation [91] | R | Mediation provides model-based and design-based methods to evaluate potential causal mechanisms. It also provides approaches to deal with common problems in practice and in randomized trials, that is, to handle multiple mediators and evaluate causal mechanisms in case of intervention non-compliance.
One highlight of this package is uplift modeling, which is used to evaluate the conditional average treatment
effect (CATE), that is, to estimate the impact of a treatment on a specific individual’s behavior.
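As a rough illustration of what a CATE estimate looks like, the sketch below computes the difference in mean outcomes between treated and untreated units within each covariate segment. It is a minimal sketch on simulated data and does not use CausalML's API; the segment names, response rates, and function names are hypothetical, and treatment is assumed to be randomized so that the within-segment difference in means is a valid estimate.

```python
import random
from collections import defaultdict


def simulate(n=40000, seed=0):
    """Simulate a randomized experiment with one categorical covariate (hypothetical data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        segment = rng.choice(["new", "returning"])
        treated = rng.random() < 0.5                  # randomized treatment assignment
        base = 0.10 if segment == "new" else 0.30     # response rate without treatment
        lift = 0.15 if segment == "new" else 0.02     # true CATE of each segment (assumed)
        p = base + (lift if treated else 0.0)
        outcome = 1 if rng.random() < p else 0
        rows.append((segment, treated, outcome))
    return rows


def cate_by_segment(rows):
    """Estimate CATE(x) = E[Y | T=1, X=x] - E[Y | T=0, X=x] for every segment x."""
    total = defaultdict(int)   # (segment, treated) -> sum of outcomes
    count = defaultdict(int)   # (segment, treated) -> number of units
    for segment, treated, outcome in rows:
        total[(segment, treated)] += outcome
        count[(segment, treated)] += 1
    segments = {segment for segment, _ in count}
    return {
        s: total[(s, True)] / count[(s, True)] - total[(s, False)] / count[(s, False)]
        for s in segments
    }


estimates = cate_by_segment(simulate())
for segment, effect in sorted(estimates.items()):
    print(f"{segment}: estimated CATE = {effect:.3f}")
```

In this simulated setting, the "new" segment responds far more strongly to treatment than the "returning" segment, which is the kind of heterogeneity uplift modeling is meant to expose. With many or continuous covariates, the per-segment averages would be replaced by fitted outcome models.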
Mediation [91] is an R package used for causal mediation analysis. It provides model-based and design-based methods to evaluate potential causal mechanisms. It also provides approaches to deal with common problems in practice and in randomized trials, that is, to handle multiple mediators and evaluate causal mechanisms in case of intervention non-compliance.
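To make the idea of evaluating a causal mechanism concrete, the sketch below splits a total effect into a natural direct effect (NDE) and a natural indirect effect (NIE) through a single binary mediator, using Pearl's mediation formula. It is not the Mediation package's interface. It is a minimal sketch that assumes a randomized binary treatment, no unmeasured confounding of the mediator-outcome relation, and simulated data whose coefficients are hypothetical.

```python
import random
from collections import defaultdict


def simulate(n=20000, seed=1):
    """Simulate treatment T -> mediator M -> outcome Y, plus a direct T -> Y path."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        t = rng.randint(0, 1)                             # randomized treatment
        m = 1 if rng.random() < 0.2 + 0.5 * t else 0      # treatment shifts the mediator
        y = 1.0 * t + 2.0 * m + rng.gauss(0.0, 1.0)       # direct effect 1.0, mediator effect 2.0
        rows.append((t, m, y))
    return rows


def mediation_formula(rows):
    """Return (NDE, NIE) with T=0 as the reference level.

    NDE = sum_m (E[Y | T=1, m] - E[Y | T=0, m]) * P(m | T=0)
    NIE = sum_m E[Y | T=0, m] * (P(m | T=1) - P(m | T=0))
    """
    y_sum = defaultdict(float)
    count = defaultdict(int)
    for t, m, y in rows:
        y_sum[(t, m)] += y
        count[(t, m)] += 1
    mean_y = {key: y_sum[key] / count[key] for key in count}
    n_t = {t: count[(t, 0)] + count[(t, 1)] for t in (0, 1)}
    p_m = {(m, t): count[(t, m)] / n_t[t] for t in (0, 1) for m in (0, 1)}
    nde = sum((mean_y[(1, m)] - mean_y[(0, m)]) * p_m[(m, 0)] for m in (0, 1))
    nie = sum(mean_y[(0, m)] * (p_m[(m, 1)] - p_m[(m, 0)]) for m in (0, 1))
    return nde, nie


nde, nie = mediation_formula(simulate())
print(f"NDE = {nde:.2f}, NIE = {nie:.2f}, total = {nde + nie:.2f}")
```

By construction, the simulated direct effect is 1.0 and the indirect effect is 2.0 × 0.5 = 1.0, so both estimates should be close to 1. The sum equals the total effect here only because the simulated outcome has no treatment-mediator interaction.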
Causaleffect [92] is an R package that implements the ID algorithm. The ID algorithm is a complete algorithm for identifying causal effects: it outputs an expression for the causal effect when the effect is identifiable, and fails when the effect is unidentifiable. DoWhy [93], a Python package, also focuses on causal inference: it takes causal graphs as prior knowledge and uses Pearl's do-calculus to assess causal effects.
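For intuition about the kind of expression an identification step returns, consider the simplest confounded graph, Z → X, Z → Y, X → Y, with Z observed. There the interventional distribution is identifiable and equals the back-door adjustment P(y | do(x)) = Σ_z P(y | x, z) P(z). The sketch below hard-codes that single expression and evaluates it on simulated data. It does not implement the ID algorithm and uses neither Causaleffect nor DoWhy; the graph, probabilities, and function names are illustrative assumptions.

```python
import random
from collections import Counter


def simulate(n=50000, seed=2):
    """Simulate binary Z -> X, Z -> Y, X -> Y with Z an observed confounder."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0
        x = 1 if rng.random() < (0.8 if z else 0.2) else 0
        y = 1 if rng.random() < 0.1 + 0.2 * x + 0.5 * z else 0   # true effect of X is 0.2
        rows.append((z, x, y))
    return rows


def p_y1_do_x(rows, x):
    """Back-door adjustment: P(Y=1 | do(X=x)) = sum_z P(Y=1 | x, z) * P(z)."""
    n = len(rows)
    n_z = Counter(z for z, _, _ in rows)
    n_xz = Counter((xi, z) for z, xi, _ in rows)
    n_y1_xz = Counter((xi, z) for z, xi, y in rows if y == 1)
    return sum(n_y1_xz[(x, z)] / n_xz[(x, z)] * n_z[z] / n for z in n_z)


def p_y1_given_x(rows, x):
    """Plain conditional P(Y=1 | X=x), which mixes the causal effect with confounding."""
    selected = [y for _, xi, y in rows if xi == x]
    return sum(selected) / len(selected)


rows = simulate()
adjusted = p_y1_do_x(rows, 1) - p_y1_do_x(rows, 0)
naive = p_y1_given_x(rows, 1) - p_y1_given_x(rows, 0)
print(f"adjusted effect = {adjusted:.3f}, naive difference = {naive:.3f}")
```

The adjusted estimate should recover the simulated effect of about 0.2, whereas the naive difference is inflated by the confounder. If Z were unobserved, no such expression would exist for this graph, which corresponds to the case in which the ID algorithm fails.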
These packages for causal analysis assist in developing causality-based fairness-enhancing methods, mainly by exposing the causal relationships between variables and evaluating the causal effects of sensitive attributes on decision-making. However, they cannot be used directly to detect or eliminate discrimination. Although there are many software packages for detecting and eliminating discrimination, e.g., AI Fairness 360 Open Source Toolkit [94], Microsoft Research Fairlearn [95], etc., a package that integrates causality-based approaches is still lacking.
7. CHALLENGES
Decisions based on machine learning have gradually penetrated all aspects of human society, and the fairness of these decisions directly affects the daily lives of individuals and groups, as well as users' trust in and acceptance of deployed machine learning applications. Recently, fair machine learning has received extensive attention, and researchers are increasingly aware that relying solely on observable data, with no additional causal information, is of limited use in removing discrimination, since the dataset only represents the selected population, without any information on the groups who were not selected, while such information