Fairness-Aware Machine Learning
An Extensive Overview
Jannik Dunkelau
and Michael Leuschel
Heinrich-Heine-Universität Düsseldorf
Universitätsstraße 1 · 40225 Düsseldorf
[Link]@[Link] [Link]@[Link]
Table of Contents
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1 Legal Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Types and Causes of Discrimination . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Causes for Machine Bias . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3 Necessary Nomenclature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.1 Machine Learning Nomenclature . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.2 Fairness Nomenclature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
4 Mathematical Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
5 Notions of Fairness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
5.1 Unawareness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
5.2 Group fairness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
5.3 Predictive Parity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
5.4 Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
5.5 Individual Fairness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
5.6 Preference-Based Fairness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
5.7 Causality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
5.8 Note on the Selection of Fairness Metrics . . . . . . . . . . . . . . . . . . . . . 23
6 Pre-Processing: Discrimination-free Training Data . . . . . . . . . . . . . . . . . . 24
6.1 Relabelling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
6.2 Resampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
6.3 Fair Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
6.4 Further Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
7 In-Processing: Discrimination-aware Learners . . . . . . . . . . . . . . . . . . . . . . 28
7.1 Adjusted Learning Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
7.2 Adapted Loss Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
7.3 Adversarial Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
7.4 Optimisation Subject to Fairness Constraints . . . . . . . . . . . . . . . . . 32
7.5 Compositional Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
7.6 Further Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
8 Post-Processing: Correcting Biased Classifiers . . . . . . . . . . . . . . . . . . . . . 35
8.1 Output correction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
8.2 Input correction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
8.3 Classifier correction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
8.4 Further Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
9 Fairness Toolkits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
9.1 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
9.2 Frameworks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
10 Further Discussion and Final Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
10.1 Accountability and Transparency . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
10.2 Ethics and the Use of Protected Attributes . . . . . . . . . . . . . . . . . . . 46
10.3 Critiques and Perspectives on Current Practices . . . . . . . . . . . . . . . 47
1 Introduction
Unequal opportunities are known to exist in employment rates [17, 121] and the mortgage market [17, 124]. As a countermeasure, various legislations are in place to ensure the non-discrimination of minority groups. In the U.K., the Equality Act [188] has been in place since October 2010, consolidating the previous Sex Discrimination Act [185], the Race Relations Act [186], and the Disability Discrimination Act [187]. In the U.S.A., these matters are regulated by the Civil Rights Act of 1968 [190], the Equal Pay Act of 1963 [189], and the Equal Credit Opportunity Act of 1974 [191]. The European Union passed the Racial Equality Directive 2000 [66], the Employment Equality Framework Directive 2000 [67], and the Equal Treatment Directive 2006 [68]. A further EU directive for implementing the Equal Treatment Directive was proposed and partially agreed on in 2009 [64], yet is still pending.¹
Intuitively, employing AI systems to drive the decision process in such cases should give the benefit of being objective and hence free of any discrimination. Unfortunately, this is not the case. Google Ads shows fewer high-paying job offers to females [52], while searching for black-identifying names results in ads suggestive of an arrest [173]. Amazon provided same-day-delivery offers to certain neighbourhoods, chosen by an algorithm which ultimately reinforced racial bias and never offered same-day delivery for neighbourhoods consisting mainly of minority groups [105]. Commercial, image-based gender classifiers by Microsoft, IBM, and Face++ all mispredict increasingly often once the input individual is dark-skinned or female, with an error rate for dark-skinned females (20.8%–34.7%) significantly worse than that for light-skinned males (0.0%–0.3%) [36].

¹ As of October 2019.
Further examples include gender bias in word embeddings [32], discrimina-
tion in assigning credit scores [48], bias in image search results [116], racial bias
in recidivism prediction [7], or prejudice in New York City’s stop-and-frisk pol-
icy [81, 82].
Note that these systems did not explicitly discriminate, e.g. the Amazon
system did not explicitly have access to race in its decision making process.
Although ignoring sensitive attributes like race, sex, age, or religion intuitively
should be sufficient for a fair classification, it was shown that such systems
make use of indirect discrimination [15,40,151,162]. The discriminatory attribute
is deduced from seemingly unrelated data, which is also known as disparate
impact [15]. Further, the algorithms are trained on historical data, which can contain previous discrimination that is then learned by the model [15, 40].
Employing machine learning algorithms in such cases hence introduces new legislative as well as ethical problems. The Leadership Conference on Civil and Human Rights published a set of five principles
to respect the value of equal opportunity and justice [125], following up with a report on the current state of social justice and technology [159]. A White House
Report to the Obama administration points toward the discrimination potential
of such techniques [177, 178] with a consecutive report calling for equal opportu-
nity by design [179]. The European Union released a governance framework for
algorithmic transparency and accountability [65], giving policies for accountable
and transparent algorithms, as well as an in-depth overview on the subject.
1.3 Causes for Machine Bias

As the examples in Section 1.1 have shown, AI systems are well capable of discrimination despite possibly being perceived as impartial and objective [15, 40, 177]. In this section, we will give a brief overview of how machine bias can be introduced into such systems. For a more thorough assembly and discussion, we point the reader to the excellent works of Calders and Žliobaitė [40] as well as Barocas and Selbst [15], upon which this section is based.
Data Collection. The training data can contain an inherent bias which in turn leads to a discriminatory model.²
One kind of unfair training data is due to incorrect distribution of the ground
truths. These ground truths can either be objective or subjective [40]. Examples
for objective ground truths include whether a credit was repaid, or whether a
criminal did reoffend, i.e. outcomes which can objectively be determined without
any influence of personal opinion. Subjective ground truths on the other hand
are dependent on the individual creator and assignment might differ depending
on the assigner. Examples include whether an applicant was hired or a student
was admitted to university. Note that only subjective ground truths can be incor-
rect [40]. Objective ground truths however still do not imply a discrimination-free
dataset, as shown by the predictive policing practice presented in Section 1.2.
The collection of data from historical records is always at risk of carrying biases. Besides subjectively assigned ground truths, the dataset itself can vary in quality. On the one hand, the data collected by companies might contain several mistakes [195], and the collection process can easily be biased in the first place, as shown by Turner and Skidmore [184]. On the other hand, minority groups may simply be underrepresented in the data due to varying access to digitalisation or otherwise different exposure to the employed data collection technique [126]. This correlates with the notion of negative legacy [113], where prior discrimination resulted in fewer gathered data samples of a minority group because respective individuals were automatically denied due to disparate treatment. Training machine learning models on sample-biased data is known to lead to inaccurate results [201]. All of this can cause statistical biases which the machine learning model eventually reinforces.
Selecting the Features. Related to how the data is collected is the question
of what data is collected. This is known as feature selection.
One already discussed problem is disparate impact or redlining, where the outcome of a prediction model is determined via a proxy variable [182], i.e. a variable which is directly dependent on the sensitive attribute. However, incomplete information also introduces problems to the system. That is, not all information necessary for a correct prediction is taken into account as a feature; some remains unobserved, either due to oversight, lack of domain knowledge, privacy reasons, or simply because the information is difficult to observe to begin with [40]. This lack of correctly selected features might only affect specific subgroups of the population, leading to less accurate predictions for their individuals while the majority does not experience this issue. However, in certain cases the acquisition of more precise features can come at a rather expensive cost for only a marginal increase in prediction accuracy [129].

² This corresponds to the computer science proverb 'garbage in, garbage out'.
2 Related Work
This article is by no means the first attempt to survey the fair machine learning literature. Other authors have already tackled different parts of the field before us, and their works serve as excellent entry points into the field. In this section we will present their works and contrast them with ours.
Romei and Ruggieri [160] conducted a broad multidisciplinary survey on
discrimination analysis. Surveying core concepts, problems, methods, and more
learning algorithms from literature as baseline approaches. Note that they only
considered pre- and in-processing approaches. The work of Friedler et al. contrasts with ours in that we do not aim to give a performance evaluation of existing algorithms, but rather to describe their algorithms and set them into context with related ones. This allows us to capture a broader range of algorithms, as we do not burden ourselves with unified implementation and training procedures.
The book ‘Fairness in machine learning’ by Barocas et al. [14] appears to be a promising addition to the fairness literature, aiming for much the same goals as this article.³ Unfortunately, it is still a work in progress, with most of the essential chapters yet to be released. As the authors explicitly solicit feedback “[i]n the spirit of open review”, we think it is important for the
community to actively increase visibility of their project. Eventually the book could contribute to the fairness community in the same meaningful manner as the Deep Learning book [84] did to the deep learning community.
3 Necessary Nomenclature
Before Section 5 introduces various fairness notions, some common ground regarding proper nomenclature is needed. In the following, Sections 3.1 and 3.2 will define the terminology of machine learning and of fairness problems, respectively.
Training Data and Samples. Training Data is a collection of data samples which
are used to train the machine learning algorithm for its eventual prediction
task. Regarding fairness and discrimination such a data set usually refers to a
population of humans, with each individual’s personal attributes defining a data
sample.
Classifier. A machine learning predictor which assigns a class to each data sample is known as a classifier. Classification happens over a finite set of distinct classes. In the scope of fairness, this article is mainly concerned with binary classification, i.e. predicting between only two distinct classes. For instance, assume a binary classifier that is used to decide on the creditworthiness of an individual. The two classes in use would be ‘approve credit’ and ‘deny credit’.
³ This judgement is based on a comparison of their announced outline of chapters with our section outline.
[Figure 1. Illustration of binary classification outcomes: samples are partitioned by classifier prediction (positive vs. negative class) and ground truth into true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN).]
Positive and Negative Class. As binary classification is used for automated deci-
sion making, the output classes correspond to one positive class (the yes-decision,
‘approve credit’) and to one negative one (the no-decision, ‘deny credit’).
True and False Positives. A true positive (TP) is a sample which was correctly
classified to belong to the positive class, i.e. the ground truth corresponds to the
positive class as well. If the ground truth had actually been the negative class, it is called a false positive (FP). For instance, a non-creditworthy individual who still has their credit approved would be a false positive.
True and False Negatives. Analogous to true and false positives, a true negative (TN) is a sample which was correctly classified to belong to the negative class according to its ground truth. A false negative (FN) correspondingly is a negatively classified sample which actually should have been positive. For instance, a creditworthy individual who still has their credit denied would be a false negative.
[Figure 2. Classifier predictions per demographic group: privileged individuals (rounded rectangles) and unprivileged individuals (trapezoids), each either qualified (green, solid outline) or unqualified (red, no outline) according to the ground truth, are assigned the favourable or unfavourable outcome, yielding true/false positives and negatives per group.]
Privileged and Unprivileged Group. Given a binary protected attribute like sex (male, female)⁴, the individuals over which decisions are made are divided into two demographic groups: sex = male and sex = female.
Assuming discrimination against one group (e.g. females), the other group experiences favourable treatment. The latter group is referred to as the privileged group, whereas the group experiencing discrimination is known as the unprivileged group.
4 Mathematical Notation
As usual, we denote by P (A | B) = P (A ∩ B)/P (B) the conditional probability
of the event A happening given that B occurs.
In the following, assume a finite dataset D of n individuals in which each individual is defined as a triple (X, Y, Z):
– X are all attributes used for predictions regarding the data sample.
– Y is the corresponding ground-truth of the sample.
– Z is a binary protected attribute, Z ∈ {0, 1}, which might be included in X
and hence used by the predictor.
The privileged group will be denoted with Z = 1, whereas Z = 0 corresponds
to the unprivileged group. The favourable and unfavourable outcomes correspond
to Y = 1 and Y = 0 accordingly.
For a sample in Figure 2, we would have:
– Y=1 for green samples with a solid outline,
– Y=0 for red samples with no outline,
– Z=1 for rounded rectangles,
– Z=0 for trapezoids,
– the attributes X are not visible in the figure.
A (possibly unfair) classifier is a mapping h : X −→ [0, 1], yielding a score
S = h(X) which corresponds to the predicted probability of an individual to
belong to the positive class. For a given threshold σ the individual is predicted
to belong to the positive class Y = 1 iff h(X) > σ.
The final prediction based on the threshold is denoted as Ŷ with
Ŷ = 1 ⇔ h(X) > σ .
In the rest of the article we assume binary classifiers and always talk about
a given fixed classifier.
Hence,
P (Ŷ = 1 | Z = 1)
represents the probability that the favourable outcome will be predicted for
individuals from the privileged group, whereas
P (Y = 0 | Z = 0, Ŷ = 1)
represents the probability that a positively classified individual from the unprivileged group is actually unqualified. Positive examples for these two cases are illustrated in Figure 3.

[Figure 3. Two example individuals passed through a classifier: a privileged individual (Z = 1, X = ⟨3, 7⟩, Y = 1) receives h(X) = 1, the favourable outcome, while an unprivileged individual (Z = 0, X = ⟨2, 5⟩, Y = 0) receives h(X) = 0, the unfavourable outcome.]
To keep notation short, let

Pi(E) := P(E | Z = i) ,    i ∈ {0, 1}
Pi(E | C) := P(E | C, Z = i) ,    i ∈ {0, 1} .

For instance, P0(Ŷ = 0 | Y = 1) = P(Ŷ = 0 | Y = 1, Z = 0).
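As a quick illustration of this shorthand, the probabilities above can be estimated empirically from finite data. The following sketch is our own (the helper name `cond_prob` and the toy arrays are assumptions, not part of the survey) and uses NumPy boolean masks:

```python
import numpy as np

def cond_prob(event, cond):
    """Empirical conditional probability P(event | cond) over boolean arrays."""
    return event[cond].mean()

# Toy population: predictions y_hat, ground truths y, protected attribute z.
y_hat = np.array([1, 1, 0, 1, 0, 1, 0, 0])
y     = np.array([1, 0, 0, 1, 1, 1, 0, 0])
z     = np.array([1, 1, 1, 1, 0, 0, 0, 0])

# P_1(Y_hat = 1): favourable outcome rate within the privileged group.
p1_favourable = cond_prob(y_hat == 1, z == 1)

# P_0(Y_hat = 0 | Y = 1): false negative rate within the unprivileged group.
p0_fnr = cond_prob(y_hat == 0, (y == 1) & (z == 0))
```

On the toy arrays, `p1_favourable` evaluates to 0.75 and `p0_fnr` to 0.5.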
5 Notions of Fairness
For achieving a fair predictor, a metric on how to measure fairness is needed
first. Depending on the use case, however, what is to be perceived as fair differs.
This leads to multiple different notions of fairness, some of which were already compiled separately by Gajane and Pechenizkiy [76], Verma and Rubin [194], as well as Friedler et al. [75].
In line with Verma and Rubin [194], we will list the various fairness notions
together with their formal definitions. Besides those notions already compiled by
Verma and Rubin, the list of notions is expanded where applicable. Further, our
summary provides visual, minimal examples for the given, parity-based notions.
The example visualisations are defined in Table 1.
14 J. Dunkelau et al.
3 3 3
7 7 3
3 3 7
7 7 7
Table 1. Summary of example illustrations.
In line with Gajane and Pechenizkiy [76], we split the different notions into seven categories: unawareness, group fairness, predictive parity, calibration, individual fairness, preference-based fairness, and causality-based fairness (Sections 5.1 to 5.7). Further, we will provide some discussion present in the literature regarding the choice of fairness metrics in Section 5.8.
5.1 Unawareness

Fairness through unawareness [87, 122] is fulfilled when Z ∉ X, that is, when the protected attributes are not used by the predictor.
This fairness notion avoids disparate treatment [15, 202] as described in Section 1.2. Formally, a binary classifier avoids disparate treatment [202] if its predictions do not change with the protected attribute:

P(Ŷ = ŷ | X, Z) = P(Ŷ = ŷ | X) .
5.2 Group Fairness

Group fairness [60] (a.k.a. statistical parity [49, 60, 208], demographic parity [75, 122], equal acceptance rate [210], mean difference [212], benchmarking [168]) requires the favourable outcome to be predicted with equal probability for both demographic groups:

P1(Ŷ = 1) = P0(Ŷ = 1) .    (2)
Conditional Statistical Parity. Conditional statistical parity [49, 60, 112] ex-
pands upon group fairness by taking a set of legitimate factors L ⊂ X into
account, over which a decision needs to be equal regardless of the protected
attribute:
P1 (Ŷ = 1 | L = l) = P0 (Ŷ = 1 | L = l) . (4)
5.3 Predictive Parity

Group fairness is evaluated on the prediction outcome alone. This can be expanded by also taking into account the ground truth of the samples. Predictive parity [46] (a.k.a. outcome test [168]) requires equal precision for all demographic groups. Precision hereby is the positive predictive value (PPV), that is, the probability of a positively classified sample to be a true positive: PPV = TP/(TP + FP) [172, 194]. This results in the following parity:

P1(Y = 1 | Ŷ = 1) = P0(Y = 1 | Ŷ = 1) .    (6)
In short, individuals for which the favourable outcome was predicted need to
have equal probability to actually belong to the positive class in both groups.
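A minimal empirical check of Eq. (6) computes the PPV separately per group and compares; the sketch below is our own (the function name `ppv` and the toy arrays are assumptions):

```python
import numpy as np

def ppv(y_hat, y, z, group):
    """Positive predictive value P(Y = 1 | Y_hat = 1, Z = group) = TP / (TP + FP)."""
    predicted_positive = (y_hat == 1) & (z == group)
    return y[predicted_positive].mean()

y_hat = np.array([1, 1, 1, 0, 1, 1, 0, 0])
y     = np.array([1, 0, 1, 1, 1, 0, 0, 1])
z     = np.array([1, 1, 1, 1, 0, 0, 0, 0])

# Predictive parity holds when the group-wise PPVs (approximately) agree.
gap = abs(ppv(y_hat, y, z, 1) - ppv(y_hat, y, z, 0))
```

On the toy arrays the privileged PPV is 2/3, the unprivileged PPV is 1/2, so the parity is violated by a gap of 1/6.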
Note that in cases where the prevalence of qualified individuals is not equal across groups, a classifier can only satisfy both predictive parity and equalised odds once it achieves perfect predictions [46, 120, 202]. Hence, in domains where the prevalences differ between groups or a perfect predictor is impossible, only one of the two fairness notions can be satisfied at a time.
Overall Accuracy Equality. Overall accuracy equality requires

P1(Y = Ŷ) = P0(Y = Ŷ) .    (11)
As the name suggests, the accuracy [172] over both groups needs to be equal.
In contrast to conditional use accuracy equality, this notion combines the focus
on true positives and true negatives.
Treatment Equality. Treatment equality [25] demands an equal ratio of false negatives to false positives in both groups:

FN1 / FP1 = FN0 / FP0 .    (12)

The term ‘treatment’ is hereby used to convey that such ratios can be used as a policy lever [25]. If the classifier produces more false positives than false negatives for the privileged group, more unqualified individuals receive the favourable outcome than the other way around. Given that the unprivileged group has an even ratio, the misclassified privileged individuals receive an unfair advantage.
5.4 Calibration
Calibration [46] (a.k.a. matching conditional frequencies [97], test-fairness [194])
is a notion which is accompanied by a score S, which is the predicted probability
for an individual X to be qualified (i.e. the probability to have the favourable
outcome assigned): S = P (Ŷ = 1 | X).
A classifier is said to be calibrated if
P1 (Y = 1 | S = s) = P0 (Y = 1 | S = s) , ∀s ∈ [0, 1] . (13)
That is, the probabilities for individuals with the same score to actually be qualified have to be equal for each score value.
The aim of this notion is to ensure that if a set of individuals has a certain
probability of having the favourable outcome assigned, then approximately the
same percentage of individuals is indeed qualified [194].
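On finite data, Eq. (13) cannot be checked for every score value s individually, so a common approximation discretises the scores into bins. The sketch below is our own (the binning scheme and the name `calibration_gaps` are assumptions):

```python
import numpy as np

def calibration_gaps(scores, y, z, bins=5):
    """Per-bin difference |P_1(Y=1 | S in bin) - P_0(Y=1 | S in bin)|,
    a discretised surrogate for the calibration condition of Eq. (13)."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    gaps = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (scores >= lo) & (scores < hi)
        g1, g0 = in_bin & (z == 1), in_bin & (z == 0)
        if g1.any() and g0.any():  # skip bins one group never reaches
            gaps.append(abs(y[g1].mean() - y[g0].mean()))
    return gaps

# Toy data where both groups are identically calibrated: all gaps are zero.
scores = np.array([0.1, 0.9, 0.1, 0.9])
y      = np.array([0,   1,   0,   1  ])
z      = np.array([1,   1,   0,   0  ])
gaps = calibration_gaps(scores, y, z)
```

A calibrated classifier yields gaps near zero in every populated bin; large gaps indicate that the same score means different qualification probabilities per group.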
Balance for negative class. The balance for negative class notion [120] re-
quires equal average scores between the set of unqualified individuals in both
groups. Thus, no group’s unqualified individuals have a statistical advantage
over those of the other group to be misclassified for the favourable outcome.
Verma and Rubin [194] formalised this notion as
E1 (S | Y = 0) = E0 (S | Y = 0) . (15)
Balance for positive class. Similar to the previous notion, balance for the
positive class [120] is satisfied if both groups have an equal average score among
their qualified individuals. This ensures that no group is put at a disadvantage by receiving more false negatives.
It is formalised analogously to Eq. (15) [194]:

E1(S | Y = 1) = E0(S | Y = 1) .    (16)
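Both balance conditions compare group-wise mean scores conditioned on the ground truth. A compact sketch of our own (helper name and toy data are assumptions):

```python
import numpy as np

def score_balance(scores, y, z, label):
    """E_1(S | Y = label) - E_0(S | Y = label), cf. Eqs. (15) and (16)."""
    m1 = scores[(y == label) & (z == 1)].mean()
    m0 = scores[(y == label) & (z == 0)].mean()
    return m1 - m0

scores = np.array([0.9, 0.2, 0.8, 0.3, 0.7, 0.4])
y      = np.array([1,   0,   1,   0,   1,   0  ])
z      = np.array([1,   1,   1,   0,   0,   0  ])
pos_gap = score_balance(scores, y, z, 1)  # balance for the positive class
neg_gap = score_balance(scores, y, z, 0)  # balance for the negative class
```

A positive `pos_gap` means the privileged group's qualified individuals receive higher scores on average; a negative `neg_gap` means its unqualified individuals receive lower ones.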
with zk being the protected and xk being the unprotected attributes of individ-
ual k. Though similar to the notion of conditional statistical parity (cf. Eq. (4))
with L = X, this notion is not defined over probabilities. Hence, each pair of
individuals with equal attributes receives the same outcome, whereas in condi-
tional statistical parity it is sufficient when the same percentage of individuals
with equal legitimate attributes receives the favourable outcome.
That is the fraction of favourable outcomes for individuals with protected at-
tribute z.
In other words, each group receives a better outcome on average by their given
treatment as opposed to as treated as another group.
Note that if h satisfies fairness through unawareness, i.e. ha = h for all a, then h also satisfies preferred treatment.
Thus each group receives at least as often the favourable outcome over h as it
would have over h0 , maintaining the core fairness achievable by h0 [202].
5.7 Causality
This family of fairness notions assumes a given causal graph. A causal graph is a
directed, acyclic graph representation having the features of X as vertices [118,
122, 142, 149]. Let G = (V, E) be a causal graph; then two vertices vi, vj ∈ V have a directed edge (vi, vj) ∈ E between them if a direct causal relationship exists, i.e. vi is (potentially) a direct cause of vj [142].
Remark that if there exists no such path over P in G, fairness through un-
awareness satisfies no proxy discrimination [118].
Fair Inference. The notion of fair inference [142] proposes to select a set of
paths in G which are classified as legitimate paths. Which paths to choose is
hereby a domain-specific problem. Along legitimate paths the protected attribute
is treated as having its active value Z = z, whereas it is treated as baseline value
Z = z 0 on any non-legitimate path.
The idea stems from splitting the average causal effect of Z on Ŷ into path-
specific effects [150], which can be formulated as nested counterfactuals [167]. As
long as Ŷ is a descendant of Z only by legitimate paths, the outcome is deemed
as fair.
5.8 Note on the Selection of Fairness Metrics

The notion of total fairness, which corresponds to satisfying all fairness conditions simultaneously, is unfortunately shown to be impossible. For unequal distribution of groups in the population it was shown that group fairness (Eq. (2)), equalised odds (Eq. (9)), and conditional use accuracy equality (Eq. (10)) are mutually exclusive [25, 46, 74, 212]. Also, group fairness, equalised odds, and calibration (Eq. (13)) as well as group fairness and predictive parity contradict each other [46, 153, 212] given unequal base rates.
This leads to the problem of deciding which fairness measures are desirable.
Žliobaitė [212] recommends defaulting to normalised difference while discouraging the use of ratio-based measures due to interpretability issues. He further finds that core measures alone are insufficient as fairness criteria. Core measures are hereby measures which are unconditional over the whole population, for instance group fairness. Given an unequal distribution of qualification throughout the groups, these are not applicable to the problem [212] and should be set into a conditional context (i.e. segmenting the population beforehand).
Saleiro et al. provide a fairness tree for their Aequitas tool [165], guiding a user through the decision process of finding a suitable fairness notion to follow.⁵ The notions considered are group fairness, disparate impact, predictive parity, predictive equality, false omission rate parity, and equality of opportunity. More information on Aequitas and the fairness tree is given in Section 9.2.
Another discussion concerns the long-term impact on the population depending on the choice of fairness notion. Mouzannar et al. [141] consider affirmative action, which corresponds to either reducing the positive prediction rates of the privileged group or increasing those of the unprivileged group. Specifically, they analysed the conditions under which society equalises under both policies, i.e. achieves equal qualification rates among groups. For instance, consider the case of a company hiring equal rates of qualified individuals among groups, say 20% each. Given that 40% of the privileged and 10% of the unprivileged groups are indeed
⁵ [Link]
6.1 Relabelling
Relabelling approaches aim to alter the ground truth values in the training set
such that it satisfies the fairness notion.
Massaging. The pre-processing technique of massaging the data [38, 107, 109]
takes a number of individuals in the training data and changes their ground
truth values. This allows any classifying machine learning algorithm (Bayesian
networks, support vector machines, . . . ) to learn on a fair dataset, aiming to
fulfil group fairness (Section 5.2).
For this, a ranker R is employed which ranks the individuals by their proba-
bility to receive the favourable outcome. The more likely the favourable outcome
is, the higher the individual will rank.
The number M of relabellings needed to equalise the positive rates is computed as

M = disc(D) × (|D1| × |D0|) / (|D1| + |D0|) ,    (24)

where disc(D) = P1(Y = 1) − P0(Y = 1) denotes the observed discrimination,

D1 = {X | Z = 1} and
D0 = {X | Z = 0} .
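The massaging procedure can be sketched as follows, assuming the ranker's scores are available as an array; the function name `massage`, the rounding of M, and the toy data are our own assumptions, with M following Eq. (24):

```python
import numpy as np

def massage(y, z, scores):
    """Massaging sketch: flip the M highest-ranked unprivileged negatives to
    positive and the M lowest-ranked privileged positives to negative, with M
    as in Eq. (24), so that both groups end up with equal positive rates."""
    y = y.copy()
    n1, n0 = (z == 1).sum(), (z == 0).sum()
    disc = y[z == 1].mean() - y[z == 0].mean()  # observed discrimination
    m = int(round(disc * n1 * n0 / (n1 + n0)))
    # Promotion candidates: unprivileged with Y = 0, highest ranker scores first.
    promote = np.where((z == 0) & (y == 0))[0]
    promote = promote[np.argsort(-scores[promote])][:m]
    # Demotion candidates: privileged with Y = 1, lowest ranker scores first.
    demote = np.where((z == 1) & (y == 1))[0]
    demote = demote[np.argsort(scores[demote])][:m]
    y[promote], y[demote] = 1, 0
    return y

y      = np.array([1, 1, 1, 0, 1, 0, 0, 0])
z      = np.array([1, 1, 1, 1, 0, 0, 0, 0])
scores = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2])
y_fair = massage(y, z, scores)
```

On the toy data the initial positive rates are 0.75 vs. 0.25; one promotion and one demotion (M = 1) equalise them at 0.5 each.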
6.2 Resampling
Resampling methods impact the sampling rate of the training data by either
dropping or doubling specific samples or altering their relative impact at training
time.
The weighted dataset can now be used for learning a fair classifier. Note that there are only four different weights, depending on whether the individual is privileged or not and qualified or not.
A drawback of this method is that it requires a classifier which is able to incorporate sample weights.
The data is divided into four subsets D00 , D01 , D10 , and D11 , where
Dz = {(X, Y, Z) | Z = z} ,
Dy = {(X, Y, Z) | Y = y} , and
Dzy = Dz ∩ Dy = {(X, Y, Z) | Y = y ∧ Z = z}
for y, z ∈ {0, 1}. For instance, D01 is the set of all unprivileged, qualified individ-
uals. For each of these sets, the expected cardinality is calculated by [108]
Czy = (|Dz| × |Dy|) / |D|    (26)
and the sets are sorted according to their ranks: D01, D11 ascending, D00, D10 descending. Each of the four sets is then adjusted to match its respective Czy value by either deleting the top elements or iteratively duplicating them. The duplication step puts the respective top element and its copy at the bottom of the list before the next sample is duplicated.
Finally, the pre-processed dataset is the union of the modified sets Dzy .
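A simplified sketch of this resampling is given below; it is our own (the name `preferential_sample` is an assumption, and the iterative duplication step is approximated by round-robin copying of the borderline elements):

```python
import numpy as np

def preferential_sample(y, z, scores):
    """Preferential sampling sketch: resize each subset D_zy to its expected
    cardinality C_zy (Eq. 26), deleting or duplicating borderline samples first."""
    idx_out = []
    n = len(y)
    for zv in (0, 1):
        for yv in (0, 1):
            idx = np.where((z == zv) & (y == yv))[0]
            c = int(round((z == zv).sum() * (y == yv).sum() / n))
            order = np.argsort(scores[idx])   # ascending by ranker score
            if yv == 0:                       # D_z0 sorted descending
                order = order[::-1]
            ranked = idx[order]               # borderline samples come first
            if len(ranked) >= c:              # shrink: delete the top elements
                ranked = ranked[len(ranked) - c:]
            else:                             # grow: duplicate the top elements
                extra = [ranked[i % len(ranked)] for i in range(c - len(ranked))]
                ranked = np.concatenate([ranked, extra])
            idx_out.extend(ranked.tolist())
    return np.array(idx_out)

y      = np.array([1, 1, 1, 0, 1, 0, 0, 0])
z      = np.array([1, 1, 1, 1, 0, 0, 0, 0])
scores = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2])
idx = preferential_sample(y, z, scores)
```

Here every expected cardinality Czy is 2, so the resampled index set keeps the dataset size at eight while both groups end up with equal positive rates.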
ξ̃ = median_{z′} Fz′⁻¹(Fz(ξ)) .    (28)
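A sketch of this repair of Eq. (28) with empirical quantile functions follows; NumPy's `np.quantile` stands in for F⁻¹, and the function name and toy data are our own assumptions. After repair, the feature has the same distribution in every group:

```python
import numpy as np

def repair_feature(x, z):
    """Total-repair sketch of Eq. (28): each value is replaced by the median of
    all group-wise quantile functions evaluated at its within-group rank."""
    x_rep = x.astype(float)
    groups = np.unique(z)
    for g in groups:
        mask = z == g
        xs = np.sort(x[mask])
        # Within-group rank of each value, expressed as a quantile in (0, 1].
        q = np.searchsorted(xs, x[mask], side="right") / mask.sum()
        # Evaluate every group's empirical quantile function at q, take the median.
        vals = np.stack([np.quantile(x[z == g2], q) for g2 in groups])
        x_rep[mask] = np.median(vals, axis=0)
    return x_rep

# Two groups whose feature distributions are shifted by 10.
x = np.array([1.0, 2.0, 3.0, 4.0, 11.0, 12.0, 13.0, 14.0])
z = np.array([0, 0, 0, 0, 1, 1, 1, 1])
x_repaired = repair_feature(x, z)
```

After the repair, both groups' repaired values coincide, so the protected attribute can no longer be inferred from this feature alone.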
Further algorithms, which we did not summarise above, include rule protec-
tion [92, 93], adversarial learned fair representations [61], fairness through op-
timal transport theory [16], k-NN for discrimination prevention [135], situation
testing [21], statistical framework for fair predictive algorithms [134], contin-
uous framework for fairness [91], sensitive information remover [106], fairness
through awareness [60], provably fair representations [136], neural styling for
interpretable representations [154], and encrypted sensitive attributes [117].
Two Naïve Bayes. Early work by Calders and Verwer [39] proposed to train a naïve Bayes classifier for each value of the protected attribute and balance them in order to achieve group fairness. The models Mz for z ∈ {0, 1} are trained on Dz = {(X, Y, Z) ∈ D | Z = z} only, and the overall outcome for an individual is determined by the model corresponding to the individual's respective protected attribute.
Let X consist of m features X = ⟨X(1), . . . , X(m)⟩. As the overall model depends on Z, the model can be formalised as

P(X, Y, Z) = P(Y | Z) · ∏i=1..m P(X(i) | Y, Z)    (29)

which is equal to the two different naïve Bayes models for the values of Z [39]. Hereby, the probability P(Y | Z) is modified as in the authors' post-processing approach ‘modifying naïve Bayes’ in Section 8.
The authors argue that removing Z from the feature set results in too big of a loss in accuracy, hence keeping the protected attribute for classification is sensible. This, however, is unsuitable where any use of the protected attribute in decision making is forbidden by law (cf. Section 1.1).
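The per-group training and dispatch can be sketched as follows; `TinyNB`, `fit_per_group`, and the data layout are illustrative assumptions, not Calders and Verwer's exact estimator:

```python
from collections import defaultdict

class TinyNB:
    """Minimal categorical naive Bayes with Laplace smoothing (illustrative)."""
    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.n = {c: y.count(c) for c in self.classes}
        self.priors = {c: self.n[c] / len(y) for c in self.classes}
        self.counts = {c: [defaultdict(int) for _ in X[0]] for c in self.classes}
        for xi, yi in zip(X, y):
            for j, v in enumerate(xi):
                self.counts[yi][j][v] += 1
        return self

    def predict_one(self, x):
        best, best_p = None, -1.0
        for c in self.classes:
            p = self.priors[c]
            for j, v in enumerate(x):
                p *= (self.counts[c][j][v] + 1) / (self.n[c] + 2)
            if p > best_p:
                best, best_p = c, p
        return best

def fit_per_group(X, y, z):
    """Train one model M_z per value of the protected attribute."""
    models = {}
    for g in set(z):
        Xg = [xi for xi, zi in zip(X, z) if zi == g]
        yg = [yi for yi, zi in zip(y, z) if zi == g]
        models[g] = TinyNB().fit(Xg, yg)
    return models

def predict(models, x, z_value):
    # dispatch to the model matching the individual's protected attribute
    return models[z_value].predict_one(x)
```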
Naïve Bayes with Latent Variable. A more complex approach than the balanced Bayes ensemble, also proposed by Calders and Verwer [39], is to model the actual class labels L the dataset would have had if it had been discrimination-free to begin with. This is done by treating L as a latent variable under two assumptions:
1. L is independent from Z.
2. Y is determined by discriminating over L using Z uniformly at random.
Fairness-Aware Machine Learning 29
Indirect prejudice occurs when the outcome is not directly related to the protected attribute, but correlation can be observed given X.
To measure prejudice, Kamishima et al. defined the (indirect) prejudice index (PI) [113, 114]

PI = Σ_{Y,Z} P(Y, Z) ln [ P(Y, Z) / (P(Y) P(Z)) ] .   (32)
R_PR(D, Θ) = Σ_{(x,z)∈D} Σ_{y∈{0,1}} h(x; Θ) ln [ P(Ŷ = y | Z = z) / P(Ŷ = y) ]   (33)

is formed [113]. The authors further propose a general framework that utilises two regularizer terms: one to ensure fairness, e.g. Eq. (33), and one standard regularizer to reduce overfitting, e.g. L2 regularization ‖Θ‖²₂ [102].
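Since the PI of Eq. (32) is the mutual information between Y and Z, it can be estimated from empirical frequencies; the following sketch assumes discrete Y and Z given as plain lists:

```python
import math
from collections import Counter

def prejudice_index(y, z):
    """Mutual information between outcome Y and protected attribute Z
    (Eq. 32), estimated from empirical frequencies; measured in nats."""
    n = len(y)
    p_yz = Counter(zip(y, z))
    p_y = Counter(y)
    p_z = Counter(z)
    pi = 0.0
    for (yi, zi), c in p_yz.items():
        p_joint = c / n
        pi += p_joint * math.log(p_joint / ((p_y[yi] / n) * (p_z[zi] / n)))
    return pi
```

An independent pair (Y ⟂ Z) yields PI = 0, while a perfectly correlated binary pair yields PI = ln 2.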
P_1(C = k) = P_0(C = k) .   (34)

Due to the prototypes lying in the same space as X, they induce a natural probabilistic mapping via the softmax function σ [208]

P(C = k | X = x) = exp(−d(x, v_k)) / Σ_{j=1}^{K} exp(−d(x, v_j)) = σ(d(x, v))_k   (35)
L = A_C L_C + A_X L_X + A_Y L_Y   (36)
By optimizing α, {v_k}_k, {w_k}_k jointly, the model learns its own distance function
for individual fairness (Section 5.5). By utilising different weights αz for each de-
mographic group, this distance metric also addresses the inversion problem [60],
where features are of different impact with respect to the classification of the
two groups.
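The mapping of Eq. (35) is straightforward to sketch; the weighted squared distance d and all parameter values below are illustrative assumptions:

```python
import math

def prototype_probs(x, prototypes, alpha):
    """Softmax over negative weighted distances to the prototypes (Eq. 35).
    `alpha` are per-feature distance weights; a sketch of the LFR mapping."""
    def d(x, v):
        return sum(a * (xi - vi) ** 2 for a, xi, vi in zip(alpha, x, v))
    scores = [math.exp(-d(x, v)) for v in prototypes]
    total = sum(scores)
    return [s / total for s in scores]

# x sits on the first of two prototypes, so most mass goes to cluster 0
probs = prototype_probs([0.0, 0.0], [[0.0, 0.0], [1.0, 1.0]], [1.0, 1.0])
```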
Adversarial training [85] consists of employing two models which play against
each other. On one side, a model is trained to accurately predict the ground
truth, whereas a second model predicts the protected attribute by considering
the first model’s prediction.
The objective is defined over two loss functions, one for h and one for a, respectively:

min Σ_{(x,y,z)∈D} L_Y(h(g(x)), y) + Σ_{(x,y,z)∈E} L_Z(a(J_λ(g(x))), z)   (41)
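A minimal sketch of evaluating such an objective, with the gradient-reversal layer J_λ folded into a sign flip on the adversary's loss; the callables and the log-loss choice are assumptions, not the cited architecture:

```python
import math

def combined_objective(g, h, a, data, lam=1.0):
    """Value of an adversarial objective in the spirit of Eq. (41): predictor
    loss on the true label plus sign-flipped adversary loss on the protected
    attribute.  g maps inputs to representations, h predicts P(Y = 1),
    a predicts P(Z = 1); all three are hypothetical callables."""
    def log_loss(p, t):
        return -(t * math.log(p) + (1 - t) * math.log(1 - p))
    l_y = sum(log_loss(h(g(x)), y) for x, y, z in data)
    l_z = sum(log_loss(a(g(x)), z) for x, y, z in data)
    # gradient reversal: the representation minimises l_y while maximising l_z
    return l_y - lam * l_z

# uninformative models: both losses equal n*ln(2), so the objective is zero
data = [((0.0,), 1, 0), ((1.0,), 0, 1)]
value = combined_objective(lambda x: x, lambda r: 0.5, lambda r: 0.5, data)
```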
This set of algorithms leaves the loss function unaltered and instead treats the
loss optimisation as a constrained optimisation problem, having the fairness
criterion as a constraint.
where g_Θ(y, x) = min(0, y·d_Θ(x)) [202] and z̄ denotes the arithmetic mean over (z_i)_{i=1}^{n}. This follows the approach of the disparate impact proxy proposed in [203]. Hence, the constraints to the loss function formulate as

min L(Θ)
s.t. (1/n) Σ_{(x,y,z)∈D} (z − z̄) g_Θ(y, x) ≤ c ,
     (1/n) Σ_{(x,y,z)∈D} (z − z̄) g_Θ(y, x) ≥ −c ,   (44)
     −(|D_1|/n) Σ_{(x,y)∈D_0} g_Θ(y, x) + (|D_0|/n) Σ_{(x,y)∈D_1} g_Θ(y, x) ≥ −c .
Note that for Z ∉ X this not only avoids disparate mistreatment, but also disparate treatment. Remark that this approach is restricted to convex margin-based classifiers [203].
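The covariance proxy and the resulting constraint check can be sketched as follows; the toy decision boundary and the data layout (y ∈ {−1, +1}) are illustrative:

```python
def mistreatment_covariance(data, d_theta):
    """Empirical covariance proxy from Eq. (44) with
    g_theta(y, x) = min(0, y * d_theta(x)); a sketch of the constraint term."""
    n = len(data)
    z_bar = sum(z for _, _, z in data) / n
    return sum((z - z_bar) * min(0.0, y * d_theta(x)) for x, y, z in data) / n

def satisfies_constraint(data, d_theta, c):
    """Check -c <= covariance proxy <= c."""
    cov = mistreatment_covariance(data, d_theta)
    return -c <= cov <= c

# toy linear boundary d_theta(x) = x[0] - 0.5 classifying everything correctly
data = [((0.0,), -1, 0), ((1.0,), 1, 0), ((0.0,), -1, 1), ((1.0,), 1, 1)]
d = lambda x: x[0] - 0.5
```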
where L(Θ*) denotes the optimal loss of the respective unconstrained classifier and γ specifies the maximum additional loss to be accepted. For instance, γ = 0 ensures maximally achievable fairness by retaining optimal loss.
Given a loss which is additive over the data points, i.e. L(Θ) = Σ_{i=1}^{n} L_i(Θ) where L_i is the individual loss of the ith individual, it is possible to fine-grain the constraints with individual γ_i to be

min (1/n) Σ_{(x,y,z)∈D} (z − z̄) Θ^T x   (47)
s.t. L_i(Θ) ≤ (1 + γ_i) L_i(Θ*)   ∀i .
M μ(h) ≤ c ,   (48)

μ_z(h) − μ*(h) ≤ 0
−μ_z(h) + μ*(h) ≤ 0

with L(Q, λ) = err(Q) + λ^T (M μ(Q) − c), where λ ∈ R_+^{|K|} is a Lagrange multiplier. The problem is solved by the standard scheme of Freund and Schapire [73], which finds an equilibrium in a zero-sum game. Input to the algorithm hereby is the training data D, the fairness constraint expressed by g_j, E_j, M, c, bound B, learning rate η, and a minimum accuracy ν. If a deterministic classifier is preferred, the found saddle point yields a suitable set of candidates.
Algorithms in this category train a set of models with one classifier per group in Z. Thus, the subgroup accuracy is kept high while the overall classifier achieves fair results.
is not always legally feasible (cf. Section 1.1), it is shown that subgroup-specific thresholding leads to the fairest yet accurate results [49, 138].
We categorised the presented algorithms into
– output correction (Section 8.1),
– input correction (Section 8.2), and
– classifier correction (Section 8.3).
acc_T = N_0 + P_1 ,
disc_T = (C_1 + D_1)/z̄ − (A_2 + B_2)/z .
Let the impact of relabelling leaf l on accuracy and discrimination be defined as

Δacc_l = { n − p   if p > n ;  p − n   if p < n }
Δdisc_l = { (a + b)/z − (c + d)/z̄   if p > n ;  −(a + b)/z + (c + d)/z̄   if p < n }
The proposed algorithm now aims to find the minimal subset of leaves L which need to be relabelled to achieve a discrimination smaller than ε ∈ [0, 1]. This is done by defining I = {l | Δdisc_l < 0}, the set of all leaves which reduce the discrimination upon relabelling, and then iteratively constructing L: hereby, arg max_{l∈I\L} Δdisc_l/Δacc_l is added to the initially empty set L as long as rem_disc(L) > ε.
Note that this relabelling problem is equivalent to the NP-complete KNAPSACK problem [110].
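A sketch of the greedy construction of L, under the assumption that candidate leaves have Δacc_l < 0 and Δdisc_l < 0; the interfaces (`leaves`, `rem_disc`) are hypothetical:

```python
def greedy_relabel(leaves, rem_disc, eps):
    """Greedy selection of leaves to relabel (knapsack-style heuristic):
    among discrimination-reducing leaves, repeatedly pick the one with the
    best discrimination/accuracy trade-off until the remaining discrimination
    drops to eps.  `leaves` maps a leaf id to its (delta_acc, delta_disc)
    pair; `rem_disc` computes the remaining discrimination for a chosen set."""
    candidates = {l for l, (_, dd) in leaves.items() if dd < 0}
    chosen = set()
    while rem_disc(chosen) > eps and candidates - chosen:
        best = max(candidates - chosen,
                   key=lambda l: leaves[l][1] / leaves[l][0])
        chosen.add(best)
    return chosen

# toy leaves: (delta_acc, delta_disc); leaf 3 would increase discrimination
leaves = {1: (-2.0, -0.3), 2: (-1.0, -0.3), 3: (-1.0, 0.1)}
toy_rem_disc = lambda chosen: 0.5 + sum(leaves[l][1] for l in chosen)
picked = greedy_relabel(leaves, toy_rem_disc, eps=0.1)
```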
with Δ ∈ Z such that |Δ| is the minimal integer resulting in f(r) < a.
Note that this only corrects direct discrimination [151, 152]. However, the authors further give a correction method for indirect discrimination [151, 152], which we omit here.
Plugin Approach. Menon and Williamson propose a plugin approach for correcting the outputs of a classifier by thresholding the class-probability function on an instance basis [138]. This approach assumes the classification problem to be cost-sensitive.
Two logistic regression models are trained, η_A and η_F. Hereby, η_A is trained on X, Y and estimates the probability of x ∈ X to be qualified for the favourable outcome, η_A(x) = P(Y = 1 | X = x), i.e. η_A = h. The second model, η_F(x), estimates P(Z = 1 | X = x), i.e. the probability that an individual belongs to the privileged group. Note that this definition of η_F is meant to achieve group fairness. If η_F′(x) = P(Z = 1 | X = x, Y = 1) is estimated instead, the target fairness is equality of opportunity.
Given two cost parameters c_A, c_F, define s : x ↦ η_A(x) − c_A − λ(η_F(x) − c_F) with the tradeoff parameter λ.
The final classification happens by h(x) = H_α(s(x)), where H_α for α ∈ [0, 1] is the modified Heaviside step function H_α(s) = 1_{s>0} + α·1_{s=0}.
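The plugin rule itself is only a few lines; the probability estimators below are stand-ins for the trained logistic regression models, not Menon and Williamson's implementation:

```python
def heaviside(s, alpha=0.5):
    """Modified Heaviside step H_alpha: 1 for s > 0, alpha for s == 0, else 0."""
    return 1.0 if s > 0 else (alpha if s == 0 else 0.0)

def plugin_classifier(eta_a, eta_f, c_a, c_f, lam, alpha=0.5):
    """Instance-wise thresholding s(x) = eta_A(x) - c_A - lam*(eta_F(x) - c_F);
    eta_a and eta_f are user-supplied probability estimators."""
    def h(x):
        s = eta_a(x) - c_a - lam * (eta_f(x) - c_f)
        return heaviside(s, alpha)
    return h

# toy estimators: first coordinate plays eta_A, second plays eta_F
h = plugin_classifier(eta_a=lambda x: x[0], eta_f=lambda x: x[1],
                      c_a=0.5, c_f=0.5, lam=0.2)
```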
These approaches are related to pre-processing from Section 6, but add a pre-
processing layer in front of an already trained algorithm.
where BER denotes the balanced error rate of f (a.k.a. half total error rate [175]). By considering the difference in accuracy of h(X) and h(X \ X_i), the (indirect) influence of feature i is measured.
The features are ordered by their (indirect) influences. To remove a feature i (e.g. Z) from X, compute X \ X_i by applying the following obscuring procedure to each feature j ≠ i. For a numerical feature W = X^{(j)}, let W_x = P(W | X^{(i)} = x) and F_x(w) = P(W ≥ w | X^{(i)} = x) denote the marginal distribution and the cumulative distribution conditioned on X^{(i)} = x. Consider the median distribution A, which has its cumulative distribution F_A given by F_A^{-1}(u) = median_{x∈X^{(i)}} F_x^{-1}(u). It was already shown for the disparate impact remover that a distribution W̃ which is minimally changed to mimic the distribution of A maximally obscures X^{(j)} [70].
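The median-distribution repair can be sketched for equally sized groups by matching ranks instead of continuous quantiles; this simplification is ours, not the cited procedure:

```python
import statistics

def repair_feature(values_by_group):
    """Quantile-based repair onto the median distribution (sketch assuming
    equally sized groups with distinct values): a value of rank r within its
    own group is replaced by the median of the r-th smallest values across
    all groups, i.e. F_A^{-1}(F_x(w)) with F_A^{-1}(u) = median_x' F_x'^{-1}(u)."""
    sorted_groups = {g: sorted(vs) for g, vs in values_by_group.items()}
    repaired = {}
    for g, vs in values_by_group.items():
        ranks = {w: r for r, w in enumerate(sorted_groups[g])}
        repaired[g] = [statistics.median(o[ranks[w]]
                                         for o in sorted_groups.values())
                       for w in vs]
    return repaired

# after repair, both groups follow the same (median) distribution
out = repair_feature({0: [1, 2, 3], 1: [13, 11, 12]})
```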
denote the tuple of false and true positive rates for Ŷ and define the two-dimensional convex polytope
where convhull denotes the convex hull of the given set. Given a loss function L, Ỹ can be derived via the optimisation problem

min_{Ỹ} E(L(Ỹ, Y))
s.t. γ_0(Ỹ) = γ_1(Ỹ) .

This ensures Ỹ to minimise loss whilst satisfying equalised odds. For achieving equality of opportunity, the second condition weakens to only require equal true positive rates over the groups [97].
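The per-group rate tuples γ_z can be computed directly from predictions; the sketch below assumes binary labels and non-empty positive and negative sets per group:

```python
def group_rates(y_true, y_pred, z, group):
    """(FPR, TPR) of a predictor within one protected group -- the tuple
    gamma_z used above.  Assumes the group contains at least one positive
    and one negative ground-truth sample."""
    tp = fp = pos = neg = 0
    for yt, yp, zi in zip(y_true, y_pred, z):
        if zi != group:
            continue
        if yt == 1:
            pos += 1
            tp += yp == 1
        else:
            neg += 1
            fp += yp == 1
    return fp / neg, tp / pos

# a perfect predictor trivially satisfies equalised odds: gamma_0 == gamma_1
y_true = [1, 0, 1, 0]
y_pred = [1, 0, 1, 0]
z      = [0, 0, 1, 1]
```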
9 Fairness Toolkits
9.1 Datasets
German Credit. The German Credit Data [57] consist of 1000 samples spanning 20 features each. Containing features such as employment time, current credits, or marital status, it provides the prediction task of determining whether an individual has good or bad credit risk. Sensitive attributes are sex and age.
Adult Income. The Adult Data (a.k.a. Census Income Dataset) [57] consists of 48842 samples. Its 14 features include information such as relationship status, education level, or occupation, as well as the sensitive attributes race, sex, and age. The associated prediction task is to classify whether an individual's income exceeds $50,000 annually.
Dutch Virtual Census. The Dutch Virtual Census data are released by the Dutch Central Bureau for Statistics [44, 45] and consist of two sets, one from 2001,⁷ the other from 1971.⁸ The data contain 189,725 and 159,203 samples, respectively, with the classification objective of whether an individual has a prestigious occupation or not, providing the sensitive attribute sex.
As Kamiran et al. [110] pointed out, these two sets are unique in that the sex discrimination visible in the data has decreased from 1971 to 2001. Hence, a classifier can be trained on the 2001 data and then be evaluated on the discriminatory 1971 data.
⁶ [Link]score-data-and-analysis
⁷ [Link]
⁸ [Link]
9.2 Frameworks
FairML. In his master’s thesis, Adebayo developed FairML [1,2], an end-to-end
toolbox for quantifying the relative significance of the feature dimensions of a
given model. FairML is written in Python and available on GitHub.⁹
It employs four different ranking algorithms with which the final combined
scores for each feature are determined. Those algorithms are the iterative orthog-
onal feature projection algorithm [1], minimum redundancy maximum relevance
feature selection [54], the lasso regression algorithm [180], and the random forest
feature selection algorithm [34, 132].
Having the relative significance of each feature eases reasoning about the problem domain and helps to check the audited system for potential discrimination.
Themis. Themis [77] was developed by the Laboratory for Advanced Software
Engineering Research at the University of Massachusetts Amherst and is a tool
for testing software for discrimination. Themis is written in Python and available
on GitHub.¹¹
The tool can be used in three different ways. Firstly, it can be used for gen-
erating a test suite to compute the discrimination scores for a set of features.
Secondly, it can compute all feature subsets against which the software discrim-
inates more than a specified threshold. Thirdly, given a test suite, the apparent
discrimination scores for a feature set are calculated. The discrimination scores
hereby consist of group discrimination scores, as well as causal discrimination
scores.
Aequitas. The Center for Data Science and Public Policy of the University of Chicago published Aequitas [165], a bias and fairness audit toolkit which produces detailed bias reports over the input data. It is usable as a Python library or as a standalone command line utility and available on GitHub.¹⁵
The tool checks for six different fairness notions: group fairness, disparate im-
pact, predictive parity, predictive equality, false omission rate parity, and equality
of opportunity. The authors further provided a fairness tree, guiding the choice
of which notion to use via a decision tree. Given a set of notions to check for,
the tool outputs a bias report which captures which notions were violated and
to what extent on a per-subgroup basis.
FAT Forensics. FAT Forensics [171] is a Python toolbox for inspecting fairness, accountability, and transparency of all aspects of a machine learning system. It was started as an academic collaboration between the University of Bristol and the Thales Group, and is available on GitHub.¹⁹
FAT Forensics works throughout the whole machine learning pipeline, consisting of data, models, and predictions. Besides concerning itself with fairness, it further implements algorithms regarding accountability and transparency, giving reports on, e.g., neighbour-based density estimations regarding prediction robustness, inspection of counterfactual fairness, or finding explanations for blackbox models via local interpretable model-agnostic explanations [158].
While avoiding discrimination is the overall goal of the field, there are other important aspects to consider. The legal texts mentioned in Section 1.1, for example, are also concerned with the question of accountability. If employers of discriminatory systems can be held (legally) accountable, accountability serves as an important tool towards ensuring a more responsible use of machine learning, which ultimately already leads to an improvement of society [65, 145]. On a similar note, Taylor [174] makes a case for requiring data justice, based on the three pillars of (in)visibility, (dis)engagement with technology, and anti-discrimination.
Transparency is an important factor for individuals affected by decisions of an
AI system. Transparency of a decision making process allows a user to actually
inspect and understand why a specific decision was made, which ultimately also
allows the developers of such systems to more easily check for discrimination.
This feature can also be a legal requirement. The European Union states in the
General Data Protection Regulation [69] (effective since 2018) that “[the user]
should have the right [. . . ] to obtain an explanation of the decision reached after
such assessment [. . . ]”, hence stating a right to explanation for the affected users.
In the U.S.A., the Equal Credit Opportunity Act [191] states the same right for
the credit system. Creditors must provide specific reasoning to the applicants
for why a credit action was taken. The European Union identifies accountability
and transparency as tools to reach fairer systems [65].
Unfortunately, in current practice many of the popular machine learning algorithms, such as neural networks [84] or random forests [34], are only blackboxes. This is, on one hand, a transparency issue, as the concrete process of how the system derived a decision is unknown. On the other hand, this also impacts accountability, as it is harder to prove actual discrimination inside an algorithm (although statistical measures as listed in Section 5 can still be tested for). There was and is a lot of research concerned with symbolic learning for interpretability, such as neural-symbolic integration [11, 79, 95], or with explainable AI systems which can deliver an explanation for their decision process [56, 90, 100, 139, 158, 196].
Another addition to this topic is recourse by Ustun et al. [192], which denotes the ability of an individual to change the output of a given, fixed classifier by influencing the inputs. This can be understood as the ability of an individual to adapt its profile (i.e., its feature vector) in such a way that the favourable outcome is guaranteed. While this implies neither transparency nor accountability, nor vice versa, the authors emphasise that this gives some kind of agency to the individuals, which hence could increase the perceived fairness of the algorithms [31].
While AI offers a broad range of different ethical problems (e.g. AI warfare [88],
AI and robot ethics [37, 89, 128], harms via unethical use [183], or accountability
of autonomous systems [63]), we restrict this section only to the ethical questions
regarding application of machine learning for decision processes over individuals.
Goodall [83] discusses whether an autonomous vehicle can be programmed with a set of ethics to act by. As an example, they consider a vehicle having to decide whether to hit a pedestrian or swerve around and hence cause another kind of property damage. The vehicle's system could factor in the estimated financial cost of hitting the pedestrian and weigh it against the estimated property damage. Assuming the vehicle does the first estimate over historical data of civil settlements in the respective neighbourhood, this could lead to a higher crash risk for pedestrians in poorer areas, where lower settlements might be more common. The vehicle would then discriminate by social status. Keeping in mind that the neighbourhood may be correlated with race, this can also lead to racial discrimination.
In “The Authority of ‘Fair’ in Machine Learning” [170], Skirpan and Gorelick
discuss whether the employment of decision making systems is fair in the first
place and propose three guiding questions for developers to take into account,
and hence to take more responsibility for the implementation. The proposed
questions are whether it is fair to implement the decision system in the first
place, whether a fair technical approach exists, and whether it provides fair
results after it has been implemented. While the latter two questions correspond
to methods covered in Sections 5 to 8, the first question poses another problem
not yet covered: how do we determine whether the employment of an automated
decision system is fair in the first place?
This is related to the feature selection for the task at hand: are the features upon which the decision shall be conducted a fair set? The answer to this begins with whether protected attributes should be considered as features or not, which still remains a topic of open discussion. As already pointed out, inclusion of the protected attributes into the feature set might not always be legally feasible. A notable exception from the law is business necessity [15], i.e. an employer can prove that a decision over a protected attribute is actually justified in the context of the business.
Dwork et al. [59] argue that consideration of the protected attribute “may increase accuracy for all groups and may avoid biases”, yet constrain this to cases in which inclusion is “legal and ethical”. This is in line with the results of Berk et al. [23], who reported better overall results by taking race into account in their experiments. Žliobaitė [212] argued that a model which considers the protected attribute would not treat individuals with otherwise identical attributes similarly and hence would practice direct discrimination. However, he emphasises that utilising the protected attribute still aids in enforcing non-discrimination.
A generalisation of this to the fairness of the whole feature selection problem
is the notion of process fairness proposed by Grgić-Hlača et al. [87], where the
users are given some kind of agency over the employed features. Proposed are
As already indicated above, one of the main concerns for training a fairness-
aware classifier is the data it is trained on. Friedler et al. [75] have shown in their
study that the eventual fairness of the trained systems is strongly dependent on
the initial data split, as demonstrated by their cross-validation approach. This
indicates that fairness methods “[are] more brittle than previously thought”, as
stated by them. In their conclusion, they provide three recommendations of
how to approach future contributions to fairness research to increase quality
and close possible gaps. These recommendations are: emphasising pre-processing
requirements, i.e. providing multiple performance metrics if the training data
can be processed in different ways, avoiding proliferation of measures, i.e. new
notions should be introduced only if fundamentally different from existing ones,
and accounting for training instability, i.e. providing performances on multiple
training splits. Hence, they provide a means for more robust result replication
for readers of such papers, as well as a more unified ground of comparison of
provided performances between different papers.
A similar critique was given by Gebru et al. [80], who criticise the missing
documentation of datasets used in machine learning. They propose a unified ap-
proach to give crucial information, answering questions such as why a dataset was
created, by whom it was funded, whether pre-processing took place, or whether affected individuals knew about the assembly of their data into the set. Ambition
The Need for Proof. We conclude this survey by pointing in another direction which should be considered in future research. While most fairness evaluations are based on the performance on a designated test set, the problem arises whether the test set is a good representation of the eventual individuals over which the predictions are conducted. If not, there are no strong guarantees that the system will act reliably fair in real-world scenarios. A formal proof of system fairness would need to be conducted to show real fairness of a predictor, independent of training and test set.
McNamara et al. [136] presented an approach of provably fair representation learning. However, while being a big step in the right direction and proving that the inferred representation function indeed provides fair results, these proofs are still restricted to probabilistic guarantees over the test set.
Tramèr et al. [181] referred to unfairness in predictors as fairness bugs or
association bugs. In a sense, labelling discriminatory misbehaviour as bugs cap-
tures the problem quite well. It further allows us to look into other computer
science areas where the absence of bugs, i.e. the absence of programmatic misbe-
haviour, is crucial: safety-critical systems. Here, programs need to be rigorously
proven in a formal, mathematical manner, before they are put into production.
Formal methods [78, 197] are a means to do this.
Proof of machine learning algorithms (with a focus on guaranteed safety) is part of recent and current research. For instance, it is known that image detection neural networks can be manipulated in their output by simply changing certain pixels in the input, unnoticeable to the human eye [140]. In reaction, the community started to develop proof techniques to verify that the expectable input space of neural networks is safe from such adversarial perturbations [103, 115, 123, 199]. Proof-carrying code [71, 94, 144] is a mechanism in
which a piece of software is bundled with a formal proof which can be redone
by the host system for verification purposes. This could be used for the afore-
mentioned unified framework so that the prediction model ultimately is always
accompanied with its formal proof of non-discrimination. A similar idea was
presented by Ramadan et al. [156], who outlined a UML-based workflow which
allows for automated discrimination analysis.
If we relate discrimination to software bugs and fairness to software safety,
the intersection of the formal methods community and the fairness community
could actually give field to novel perspectives, algorithms, and applications which
ultimately can benefit not only both research groups, but also the individuals
affected by a more and more digital world, which we shape together to be safer
and fairer for everyone.
References
1. Adebayo, J., Kagal, L.: Iterative orthogonal feature projection for diagnosing bias
in black-box models. arXiv preprint arXiv:1611.04967 (2016)
2. Adebayo, J.A.: FairML: ToolBox for diagnosing bias in predictive modeling. Mas-
ter’s thesis, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cam-
bridge, MA 02139, United States (2016)
3. Adler, P., Falk, C., Friedler, S.A., Nix, T., Rybeck, G., Scheidegger, C., Smith,
B., Venkatasubramanian, S.: Auditing black-box models for indirect influence.
Knowledge and Information Systems 54(1), 95–122 (2018)
4. Agarwal, A., Beygelzimer, A., Dudik, M., Langford, J., Wallach, H.: A reductions
approach to fair classification. In: Dy, J., Krause, A. (eds.) Proceedings of the 35th
International Conference on Machine Learning. Proceedings of Machine Learning
Research, vol. 80, pp. 60–69. PMLR, Stockholmsmässan, Stockholm Sweden (10–
15 Jul 2018)
5. Agrawal, R., Srikant, R., et al.: Fast algorithms for mining association rules.
In: Proceedings of the 20th International Conference on Very Large Data Bases.
VLDB ’94, vol. 1215, pp. 487–499. Morgan Kaufmann Publishers Inc., San Fran-
cisco, CA, USA (1994)
6. Ahlman, L.C., Kurtz, E.M.: The APPD randomized controlled trial in low risk
supervision: The effects on low risk supervision on rearrest. Philadelphia Adult
Probation and Parole Department (Oct 2008)
7. Angwin, J., Larson, J., Mattu, S., Kirchner, L.: Machine bias: There’s software
used across the country to predict future criminals. And it’s biased against blacks.
(May 2016)
8. Arrow, K.: The theory of discrimination. Discrimination in Labor Markets 3(10),
3–33 (1973)
9. Aydemir, F.B., Dalpiaz, F.: A roadmap for ethics-aware software engineering. In:
Proceedings of the International Workshop on Software Fairness. pp. 15–21. ACM
(2018)
10. Ayres, I.: Outcome tests of racial disparities in police practices. Justice Research
and policy 4(1-2), 131–142 (2002)
11. Bader, S., Hitzler, P.: Dimensions of neural-symbolic integration – a structured
survey. arXiv preprint cs/0511042 (2005)
12. Bantilan, N.: Themis-ml: A fairness-aware machine learning interface for end-to-
end discrimination discovery and mitigation. Journal of Technology in Human
Services 36(1), 15–30 (2018)
13. Barocas, S., Bradley, E., Honavar, V., Provost, F.: Big data, data science, and
civil rights. arXiv preprint arXiv:1706.03102 (2017)
14. Barocas, S., Hardt, M., Narayanan, A.: Fairness and Machine Learning. fairml-
[Link] (2019), [Link]
15. Barocas, S., Selbst, A.D.: Big data’s disparate impact. Calif. L. Rev. 104, 671
(2016)
16. Barrio, E.D., Fabrice, G., Gordaliza, P., Loubes, J.M.: Obtaining fairness using
optimal transport theory. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings
of the 36th International Conference on Machine Learning. Proceedings of Ma-
chine Learning Research, vol. 97, pp. 2357–2365. PMLR, Long Beach, California,
USA (09–15 Jun 2019)
17. Barth, J.R., Cordes, J.J., Yezer, A.M.: Financial institution regulations, redlining
and mortgage markets. The regulation of financial institutions 21, 101–143 (1979)
18. Bechavod, Y., Ligett, K.: Penalizing unfairness in binary classification. arXiv
preprint arXiv:1707.00044 (2017)
19. Becker, G.S., et al.: The economics of discrimination. University of Chicago Press
Economics Books (1957)
20. Bellamy, R.K., Dey, K., Hind, M., Hoffman, S.C., Houde, S., Kannan, K., Lohia,
P., Martino, J., Mehta, S., Mojsilovic, A., et al.: AI Fairness 360: An extensible
toolkit for detecting, understanding, and mitigating unwanted algorithmic bias.
arXiv preprint arXiv:1810.01943 (2018)
21. Bendick, M.: Situation testing for employment discrimination in the United States
of America. Horizons stratégiques (3), 17–39 (2007)
22. Bengio, Y., Courville, A., Vincent, P.: Representation learning: A review and
new perspectives. IEEE transactions on pattern analysis and machine intelligence
35(8), 1798–1828 (2013)
23. Berk, R.: The role of race in forecasts of violent crime. Race and Social Problems
1(4), 231 (Nov 2009)
24. Berk, R.: Criminal Justice Forecasts of Risk: A Machine Learning Approach.
Springer Science & Business Media (2012)
25. Berk, R., Heidari, H., Jabbari, S., Kearns, M., Roth, A.: Fairness in criminal
justice risk assessments: The state of the art. Sociological Methods & Research
(2018)
26. Berk, R., Sherman, L., Barnes, G., Kurtz, E., Ahlman, L.: Forecasting murder
within a population of probationers and parolees: a high stakes application of
statistical learning. Journal of the Royal Statistical Society: Series A (Statistics
in Society) 172(1), 191–211 (2009)
27. Berkovec, J.A., Canner, G.B., Gabriel, S.A., Hannan, T.H.: Race, redlining, and
residential mortgage loan performance. The Journal of Real Estate Finance and
Economics 9(3), 263–294 (1994)
28. Berliant, M., Thomson, W., Dunz, K.: On the fair division of a heterogeneous
commodity. Journal of Mathematical Economics 21(3), 201–216 (1992)
29. Beutel, A., Chen, J., Zhao, Z., Chi, E.H.: Data decisions and theoretical impli-
cations when adversarially learning fair representations. In: Proceedings of 2017
Workshop on Fairness, Accountability, and Transparency in Machine Learning.
FAT/ML (2017)
30. Biddle, D.: Adverse Impact and Test Validation. Gower Publishing, Ltd., 2 edn.
(2006)
31. Binns, R., Van Kleek, M., Veale, M., Lyngs, U., Zhao, J., Shadbolt, N.: ‘it’s reduc-
ing a human being to a percentage’; perceptions of justice in algorithmic decisions.
In: Proceedings of the 2018 CHI Conference on Human Factors in Computing Sys-
tems. p. 377. ACM (2018)
32. Bolukbasi, T., Chang, K.W., Zou, J.Y., Saligrama, V., Kalai, A.T.: Man is to
computer programmer as woman is to homemaker? debiasing word embeddings.
In: Advances in neural information processing systems. pp. 4349–4357 (2016)
33. Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and regression
trees (1984)
34. Breiman, L.: Random forests. Machine Learning 45(1), 5–32 (Oct 2001)
35. Brescia, R.H.: Subprime communities: Reverse redlining, the fair housing act and
emerging issues in litigation regarding the subprime mortgage crisis. Albany Gov-
ernment Law Review 2, 164 (2009)
36. Buolamwini, J., Gebru, T.: Gender shades: Intersectional accuracy disparities in
commercial gender classification. In: Proceedings of the 1st Conference on Fair-
ness, Accountability and Transparency. Proceedings of Machine Learning, vol. 81,
pp. 77–91. PMLR, New York, NY, USA (2018)
37. Burton, E., Goldsmith, J., Mattei, N.: Teaching AI ethics using science fiction.
In: Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence
(2015)
38. Calders, T., Kamiran, F., Pechenizkiy, M.: Building classifiers with independency
constraints. In: 2009 IEEE International Conference on Data Mining Workshops.
pp. 13–18. IEEE (2009)
39. Calders, T., Verwer, S.: Three naive bayes approaches for discrimination-free clas-
sification. Data Mining and Knowledge Discovery 21(2), 277–292 (2010)
40. Calders, T., Žliobaitė, I.: Why unbiased computational processes can lead to dis-
criminative decision procedures. In: Discrimination and Privacy in the Informa-
tion Society, pp. 43–57. Springer (2013)
41. Calmon, F., Wei, D., Vinzamuri, B., Ramamurthy, K.N., Varshney, K.R.: Opti-
mized pre-processing for discrimination prevention. In: Advances in Neural Infor-
mation Processing Systems. pp. 3992–4001 (2017)
42. Card, D., Zhang, M., Smith, N.A.: Deep weighted averaging classifiers. In: Pro-
ceedings of the Conference on Fairness, Accountability, and Transparency. pp.
369–378. ACM (2019)
43. Celis, L.E., Huang, L., Keswani, V., Vishnoi, N.K.: Classification with fairness
constraints: A meta-algorithm with provable guarantees. In: Proceedings of the
Conference on Fairness, Accountability, and Transparency. pp. 319–328. ACM
(2019)
44. Centraal Bureau voor de Statistiek: Volkstelling (1971)
45. Centraal Bureau voor de Statistiek: Volkstelling (2001)
46. Chouldechova, A.: Fair prediction with disparate impact: A study of bias in re-
cidivism prediction instruments. Big data 5(2), 153–163 (2017)
47. Chouldechova, A., G’Sell, M.: Fairer and more accurate, but for whom? arXiv
preprint arXiv:1707.00046 (2017)
48. Citron, D.K., Pasquale, F.: The scored society: Due process for automated pre-
dictions. Wash. L. Rev. 89, 1 (2014)
49. Corbett-Davies, S., Pierson, E., Feller, A., Goel, S., Huq, A.: Algorithmic decision
making and the cost of fairness. In: Proceedings of the 23rd ACM SIGKDD In-
ternational Conference on Knowledge Discovery and Data Mining. pp. 797–806.
ACM (Aug 2017)
50. Cramer, J.S.: The origins of logistic regression (2002)
51. Danner, M.J., VanNostrand, M., Spruance, L.: Risk-based pretrial release recom-
mendation and supervision guidelines (Aug 2015)
52. Datta, A., Tschantz, M.C., Datta, A.: Automated experiments on ad privacy
settings: A tale of opacity, choice, and discrimination. Proceedings on Privacy
Enhancing Technologies 2015(1), 92–112 (2015)
53. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete
data via the EM algorithm. Journal of the Royal Statistical Society: Series B
(Methodological) 39(1), 1–22 (1977)
54. Ding, C., Peng, H.: Minimum redundancy feature selection from microarray gene
expression data. Journal of Bioinformatics and Computational Biology 3(2), 185–
205 (2005)
55. Donini, M., Oneto, L., Ben-David, S., Shawe-Taylor, J.S., Pontil, M.: Empirical
risk minimization under fairness constraints. In: Advances in Neural Information
Processing Systems. pp. 2791–2801 (2018)
56. Doran, D., Schulz, S., Besold, T.R.: What does explainable AI really mean? a new
conceptualization of perspectives. arXiv preprint arXiv:1710.00794 (Oct 2017)
57. Dua, D., Graff, C.: UCI machine learning repository (2017)
58. Duivesteijn, W., Feelders, A.: Nearest neighbour classification with monotonicity
constraints. In: Joint European Conference on Machine Learning and Knowledge
Discovery in Databases. pp. 301–316. Springer (2008)
59. Dwork, C., Immorlica, N., Tauman Kalai, A., Leiserson, M.: Decoupled classifiers
for fair and efficient machine learning. arXiv e-prints (Jul 2017)
60. Dwork, C., Hardt, M., Pitassi, T., Reingold, O., Zemel, R.: Fairness through
awareness. In: Proceedings of the 3rd Innovations in Theoretical Computer Science
Conference. pp. 214–226. ACM (2012)
Fairness-Aware Machine Learning 53
61. Edwards, H., Storkey, A.J.: Censoring representations with an adversary. CoRR
abs/1511.05897 (2015)
62. Eisenhauer, E.: In poor health: Supermarket redlining and urban nutrition. Geo-
Journal 53(2), 125–133 (Feb 2001)
63. Etzioni, A., Etzioni, O.: AI assisted ethics. Ethics and Information Technology
18(2), 149–156 (Jun 2016)
64. European Parliament: Legislative resolution of 2 April 2009 on the proposal for
a council directive on implementing the principle of equal treatment between
persons irrespective of religion or belief, disability, age or sexual orientation
(COM(2008)0426 – C6-0291/2008 – 2008/0140(CNS)), 2008/0140(APP)
65. European Parliamentary Research Service, Panel for the Future of Science and
Technology: A governance framework for algorithmic accountability and trans-
parency (Apr 2019), PE 624.262
66. European Union Legislation: Council Directive 2000/43/EC of 29 June 2000 im-
plementing the principle of equal treatment between persons irrespective of racial
or ethnic origin. Official Journal of the European Communities L 180/22 (2000)
67. European Union Legislation: Council Directive 2000/78/EC of 27 November 2000
establishing a general framework for equal treatment in employment and occupa-
tion. Official Journal of the European Communities L 303/16 (Nov 2000)
68. European Union Legislation: Council Directive 2006/54/EC of 5 July 2006 on the
implementation of the principle of equal opportunities and equal treatment of
men and women in matters of employment and occupation. Official Journal of
the European Communities L 204/23 (2006)
69. European Union Legislation: Regulation (EU) 2016/679 of the European Parlia-
ment and of the Council of 27 April 2016 on the protection of natural persons
with regard to the processing of personal data and on the free movement of such
data, and repealing Directive 95/46/EC (General Data Protection Regulation).
Official Journal of the European Union L 119/1 (Apr 2016)
70. Feldman, M., Friedler, S.A., Moeller, J., Scheidegger, C., Venkatasubramanian,
S.: Certifying and removing disparate impact. In: Proceedings of the 21st ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining.
pp. 259–268. ACM (2015)
71. Feng, X., Ni, Z., Shao, Z., Guo, Y.: An open framework for foundational proof-
carrying code. In: Proceedings of the 2007 ACM SIGPLAN International Work-
shop on Types in Languages Design and Implementation. pp. 67–78. ACM (2007)
72. Fish, B., Kun, J., Lelkes, Á.D.: A confidence-based approach for balancing fairness
and accuracy. In: Proceedings of the 2016 SIAM International Conference on Data
Mining. pp. 144–152. SIAM (2016)
73. Freund, Y.: Game theory, on-line prediction and boosting. In: Proceedings of the
Ninth Annual Conference on Computational Learning Theory. pp. 325–332 (1996)
74. Friedler, S.A., Scheidegger, C., Venkatasubramanian, S.: On the (im)possibility
of fairness. arXiv preprint arXiv:1609.07236 (2016)
75. Friedler, S.A., Scheidegger, C., Venkatasubramanian, S., Choudhary, S., Hamil-
ton, E.P., Roth, D.: A comparative study of fairness-enhancing interventions in
machine learning. In: Proceedings of the Conference on Fairness, Accountability,
and Transparency. pp. 329–338. ACM (2019)
76. Gajane, P., Pechenizkiy, M.: On formalizing fairness in prediction with machine
learning. arXiv preprint arXiv:1710.03184 (2017)
77. Galhotra, S., Brun, Y., Meliou, A.: Fairness testing: testing software for discrimi-
nation. In: Proceedings of the 2017 11th Joint Meeting on Foundations of Software
Engineering. pp. 498–510. ACM (2017)
78. Garavel, H., Graf, S.: Formal methods for safe and secure computer systems.
Federal Office for Information Security (2013)
79. Garcez, A.d., Besold, T.R., De Raedt, L., Földiak, P., Hitzler, P., Icard, T., Kühn-
berger, K.U., Lamb, L.C., Miikkulainen, R., Silver, D.L.: Neural-symbolic learning
and reasoning: Contributions and challenges. In: 2015 AAAI Spring Symposium
Series (2015)
80. Gebru, T., Morgenstern, J., Vecchione, B., Vaughan, J.W., Wallach, H.,
Daumé III, H., Crawford, K.: Datasheets for datasets. In: The 5th Workshop
on Fairness, Accountability, and Transparency in Machine Learning. Proceedings
of Machine Learning Research, PMLR (2018)
81. Goel, S., Rao, J.M., Shroff, R.: Personalized risk assessments in the criminal
justice system. American Economic Review 106(5), 119–123 (2016)
82. Goel, S., Rao, J.M., Shroff, R.: Precinct or prejudice? understanding racial dis-
parities in New York City’s stop-and-frisk policy. The Annals of Applied Statistics
10(1), 365–394 (2016)
83. Goodall, N.J.: Can you program ethics into a self-driving car? IEEE Spectrum
53(6), 28–58 (2016)
84. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press (2016),
http://[Link]
85. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair,
S., Courville, A., Bengio, Y.: Generative adversarial nets. In: Advances in Neural
Information Processing Systems. pp. 2672–2680 (2014)
86. Gretton, A., Borgwardt, K., Rasch, M., Schölkopf, B., Smola, A.J.: A kernel
method for the two-sample-problem. In: Advances in Neural Information Pro-
cessing Systems. pp. 513–520 (2007)
87. Grgić-Hlača, N., Zafar, M.B., Gummadi, K.P., Weller, A.: The case for process
fairness in learning: Feature selection for fair decision making. In: NIPS Sympo-
sium on Machine Learning and the Law. vol. 1, p. 2 (2016)
88. Guizzo, E., Ackerman, E.: When robots decide to kill. IEEE Spectrum 53(6),
38–43 (2016)
89. Gunkel, D.J.: The Machine Question: Critical Perspectives on AI, Robots, and
Ethics. MIT Press (2012)
90. Gurumoorthy, K.S., Dhurandhar, A., Cecchi, G.A., Aggarwal, C.: Efficient data
representation by selecting prototypes with importance weights (Aug 2019), to
be published at ICDM’19
91. Hacker, P., Wiedemann, E.: A continuous framework for fairness. arXiv preprint
arXiv:1712.07924 (2017)
92. Hajian, S., Domingo-Ferrer, J.: A methodology for direct and indirect discrim-
ination prevention in data mining. IEEE transactions on knowledge and data
engineering 25(7), 1445–1459 (2012)
93. Hajian, S., Domingo-Ferrer, J., Martinez-Balleste, A.: Rule protection for indi-
rect discrimination prevention in data mining. In: International Conference on
Modeling Decisions for Artificial Intelligence. pp. 211–222. Springer (2011)
94. Hamid, N.A., Shao, Z., Trifonov, V., Monnier, S., Ni, Z.: A syntactic approach
to foundational proof-carrying code. Journal of Automated Reasoning 31(3-4),
191–229 (2003)
95. Hammer, B., Hitzler, P.: Perspectives of Neural-Symbolic Integration, vol. 77.
Springer (2007)
96. Hand, D.J.: Classifier technology and the illusion of progress. Statistical Science
pp. 1–14 (2006)
97. Hardt, M., Price, E., Srebro, N., et al.: Equality of opportunity in supervised
learning. In: Advances in Neural Information Processing Systems. pp. 3315–3323
(2016)
98. Harris, R., Forrester, D.: The suburban origins of redlining: A Canadian case
study, 1935-54. Urban Studies 40(13), 2661–2686 (2003)
99. Hendricks, L.A., Burns, K., Saenko, K., Darrell, T., Rohrbach, A.: Women also
snowboard: Overcoming bias in captioning models. In: European Conference on
Computer Vision. pp. 793–811. Springer (2018)
100. Hind, M., Wei, D., Campbell, M., Codella, N.C., Dhurandhar, A., Mojsilović,
A., Natesan Ramamurthy, K., Varshney, K.R.: TED: Teaching AI to explain its
decisions. In: Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and
Society. pp. 123–129. ACM (2019)
101. Hoadley, B.: Comment on ‘Statistical Modeling: The Two Cultures’ by L. Breiman.
Statistical Science 16(3), 220–224 (2001)
102. Hoerl, A.E., Kennard, R.W.: Ridge regression: Biased estimation for nonorthog-
onal problems. Technometrics 12(1), 55–67 (1970)
103. Huang, X., Kroening, D., Kwiatkowska, M., Ruan, W., Sun, Y., Thamo, E., Wu,
M., Yi, X.: Safety and trustworthiness of deep neural networks: A survey. arXiv
preprint arXiv:1812.08342 (2018)
104. Hunt, D.B.: Redlining. Encyclopedia of Chicago (2005)
105. Ingold, D., Soper, S.: Amazon does not consider the race of its customers. Should
it? Bloomberg (Apr 2016)
106. Johndrow, J.E., Lum, K., et al.: An algorithm for removing sensitive information:
application to race-independent recidivism prediction. The Annals of Applied
Statistics 13(1), 189–220 (2019)
107. Kamiran, F., Calders, T.: Classifying without discriminating. In: 2009 2nd Inter-
national Conference on Computer, Control and Communication. pp. 1–6. IEEE
(2009)
108. Kamiran, F., Calders, T.: Classification with no discrimination by preferential
sampling. In: Proc. 19th Machine Learning Conf. Belgium and The Netherlands.
pp. 1–6. Citeseer (2010)
109. Kamiran, F., Calders, T.: Data preprocessing techniques for classification without
discrimination. Knowledge and Information Systems 33(1), 1–33 (2012)
110. Kamiran, F., Calders, T., Pechenizkiy, M.: Discrimination aware decision tree
learning. In: 2010 IEEE International Conference on Data Mining. pp. 869–874.
IEEE (2010)
111. Kamiran, F., Karim, A., Zhang, X.: Decision theory for discrimination-aware
classification. In: 2012 IEEE 12th International Conference on Data Mining. pp.
924–929. IEEE (2012)
112. Kamiran, F., Žliobaitė, I., Calders, T.: Quantifying explainable discrimination
and removing illegal discrimination in automated decision making. Knowledge
and information systems 35(3), 613–644 (2013)
113. Kamishima, T., Akaho, S., Asoh, H., Sakuma, J.: Fairness-aware classifier with
prejudice remover regularizer. In: Joint European Conference on Machine Learn-
ing and Knowledge Discovery in Databases. pp. 35–50. Springer (2012)
114. Kamishima, T., Akaho, S., Sakuma, J.: Fairness-aware learning through regular-
ization approach. In: 2011 IEEE 11th International Conference on Data Mining
Workshops. pp. 643–650. IEEE (2011)
115. Katz, G., Barrett, C., Dill, D.L., Julian, K., Kochenderfer, M.J.: Towards
proving the adversarial robustness of deep neural networks. arXiv preprint
arXiv:1709.02802 (2017)
116. Kay, M., Matuszek, C., Munson, S.A.: Unequal representation and gender stereo-
types in image search results for occupations. In: Proceedings of the 33rd Annual
ACM Conference on Human Factors in Computing Systems. pp. 3819–3828. ACM
(Apr 2015)
117. Kilbertus, N., Gascon, A., Kusner, M., Veale, M., Gummadi, K., Weller, A.: Blind
justice: Fairness with encrypted sensitive attributes. In: Proceedings of the 35th
International Conference on Machine Learning (ICML 2018). vol. 80, pp. 2635–
2644. International Machine Learning Society (IMLS) (2018)
118. Kilbertus, N., Carulla, M.R., Parascandolo, G., Hardt, M., Janzing, D., Schölkopf,
B.: Avoiding discrimination through causal reasoning. In: Advances in Neural
Information Processing Systems. pp. 656–666 (2017)
119. Kingma, D.P., Welling, M.: Auto-encoding variational Bayes. stat 1050, 10 (2014)
120. Kleinberg, J., Mullainathan, S., Raghavan, M.: Inherent trade-offs in the fair de-
termination of risk scores. In: Papadimitriou, C.H. (ed.) 8th Innovations in Theo-
retical Computer Science Conference (ITCS 2017). Leibniz International Proceed-
ings in Informatics (LIPIcs), vol. 67, pp. 43:1–43:23. Schloss Dagstuhl–Leibniz-
Zentrum fuer Informatik, Dagstuhl, Germany (2017)
121. Kuhn, P.: Sex discrimination in labor markets: The role of statistical evidence.
The American Economic Review pp. 567–583 (1987)
122. Kusner, M.J., Loftus, J., Russell, C., Silva, R.: Counterfactual fairness. In: Ad-
vances in Neural Information Processing Systems. pp. 4066–4076 (2017)
123. Kwiatkowska, M.Z.: Safety verification for deep neural networks with provable
guarantees (invited paper). In: Fokkink, W., van Glabbeek, R. (eds.) 30th In-
ternational Conference on Concurrency Theory (CONCUR 2019). Leibniz In-
ternational Proceedings in Informatics (LIPIcs), vol. 140, pp. 1:1–1:5. Schloss
Dagstuhl–Leibniz-Zentrum für Informatik, Dagstuhl, Germany (2019)
124. LaCour-Little, M.: Discrimination in mortgage lending: A critical review of the
literature. Journal of Real Estate Literature 7(1), 15–49 (Jan 1999)
125. Leadership Conference on Civil and Human Rights: Civil rights principles for the
era of big data. [Link]data/ (2014)
126. Lerman, J.: Big data and its exclusions. Stanford Law Review Online 66, 55
(2013)
127. Li, Y., Swersky, K., Zemel, R.: Learning unbiased features. arXiv preprint
arXiv:1412.5244 (2014)
128. Lin, P., Abney, K., Bekey, G.A.: Robot Ethics: The Ethical and Social Implica-
tions of Robotics. The MIT Press (2014)
129. Lippert-Rasmussen, K.: “We are all different”: Statistical discrimination and the
right to be treated as an individual. The Journal of Ethics 15(1-2), 47–59 (2011)
130. Loftus, J.R., Russell, C., Kusner, M.J., Silva, R.: Causal reasoning for algorithmic
fairness. arXiv preprint arXiv:1805.05859 (2018)
131. Louizos, C., Swersky, K., Li, Y., Welling, M., Zemel, R.: The variational fair
autoencoder. stat 1050, 12 (2015)
132. Louppe, G., Wehenkel, L., Sutera, A., Geurts, P.: Understanding variable im-
portances in forests of randomized trees. In: Advances in Neural Information
Processing Systems. pp. 431–439 (2013)
133. Lum, K., Isaac, W.: To predict and serve? Significance 13(5), 14–19 (2016)
134. Lum, K., Johndrow, J.E.: A statistical framework for fair predictive algorithms.
stat 1050, 25 (2016)
135. Luong, B.T., Ruggieri, S., Turini, F.: k-NN as an implementation of situation
testing for discrimination discovery and prevention. In: Proceedings of the 17th
ACM SIGKDD International Conference on Knowledge Discovery and Data Min-
ing. pp. 502–510. ACM (2011)
136. McNamara, D., Ong, C.S., Williamson, R.C.: Provably fair representations. arXiv
preprint arXiv:1710.04394 (2017)
137. McNamara, D., Ong, C.S., Williamson, R.C.: Costs and benefits of fair repre-
sentation learning. In: Proceedings of the 2019 AAAI/ACM Conference on AI,
Ethics, and Society. pp. 263–270. ACM (2019)
138. Menon, A.K., Williamson, R.C.: The cost of fairness in binary classification. In:
Conference on Fairness, Accountability and Transparency. pp. 107–118 (2018)
139. Mittelstadt, B., Russell, C., Wachter, S.: Explaining explanations in AI. In: Pro-
ceedings of the Conference on Fairness, Accountability, and Transparency. pp.
279–288. ACM (2019)
140. Moosavi-Dezfooli, S.M., Fawzi, A., Fawzi, O., Frossard, P.: Universal adversarial
perturbations. In: Proceedings of the IEEE conference on computer vision and
pattern recognition. pp. 1765–1773 (2017)
141. Mouzannar, H., Ohannessian, M.I., Srebro, N.: From fair decision making to so-
cial equality. In: Proceedings of the Conference on Fairness, Accountability, and
Transparency. pp. 359–368. ACM (2019)
142. Nabi, R., Shpitser, I.: Fair inference on outcomes. In: Thirty-Second AAAI Con-
ference on Artificial Intelligence (2018)
143. Nash Jr, J.F.: The bargaining problem. Econometrica: Journal of the Econometric
Society pp. 155–162 (1950)
144. Necula, G.C.: Proof-carrying code. In: Proceedings of the 24th ACM SIGPLAN-
SIGACT Symposium on Principles of Programming Languages. pp. 106–119. ACM
(1997)
145. Nissenbaum, H.: Computing and accountability. Communications of the ACM
37(1), 72–81 (1994)
146. Nocedal, J., Wright, S.: Numerical optimization. Springer Science & Business
Media (2006)
147. Paaßen, B., Bunge, A., Hainke, C., Sindelar, L., Vogelsang, M.: Dynamic fairness
– breaking vicious cycles in automatic decision making. In: European Symposium
on Artificial Neural Networks, Computational Intelligence and Machine Learning
(Apr 2019)
148. Page, S.E.: The Difference: How the Power of Diversity Creates Better Groups,
Firms, Schools, and Societies. Princeton University Press (2007)
149. Pearl, J.: Causality: models, reasoning and inference, vol. 29. Springer (2000)
150. Pearl, J.: Direct and indirect effects. In: Proceedings of the Seventeenth Conference
on Uncertainty in Artificial Intelligence. pp. 411–420. Morgan Kaufmann Publish-
ers Inc. (2001)
151. Pedreschi, D., Ruggieri, S., Turini, F.: Discrimination-aware data mining. In: Pro-
ceedings of the 14th ACM SIGKDD International Conference on Knowledge Dis-
covery and Data Mining. pp. 560–568. ACM (2008)
152. Pedreschi, D., Ruggieri, S., Turini, F.: Measuring discrimination in socially-
sensitive decision records. In: Proceedings of the 2009 SIAM International Con-
ference on Data Mining. pp. 581–592. SIAM (2009)
153. Pleiss, G., Raghavan, M., Wu, F., Kleinberg, J., Weinberger, K.Q.: On fairness and
calibration. In: Advances in Neural Information Processing Systems. pp. 5680–
5689 (2017)
154. Quadrianto, N., Sharmanska, V., Thomas, O.: Neural styling for interpretable fair
representations. arXiv preprint arXiv:1810.06755 (2018)
155. Quinlan, J.R.: C4.5: Programs for Machine Learning. The Morgan Kaufmann
Series in Machine Learning, San Mateo, CA: Morgan Kaufmann (1993)
156. Ramadan, Q., Ahmadian, A.S., Strüber, D., Jürjens, J., Staab, S.: Model-based
discrimination analysis: A position paper. In: 2018 IEEE/ACM International
Workshop on Software Fairness (FairWare). pp. 22–28. IEEE (2018)
157. Rezende, D.J., Mohamed, S., Wierstra, D.: Stochastic backpropagation and ap-
proximate inference in deep generative models. In: International Conference on
Machine Learning. pp. 1278–1286 (2014)
158. Ribeiro, M.T., Singh, S., Guestrin, C.: “Why should I trust you?”: Explaining the
predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD Interna-
tional Conference on Knowledge Discovery and Data Mining. pp. 1135–1144. ACM
(2016)
159. Robinson, D., Yu, H., Rieke, A.: Civil rights, big data, and our algorithmic future.
In: Leadership Conference on Civil and Human Rights. vol. 1 (2014), available
at: [Link]
160. Romei, A., Ruggieri, S.: A multidisciplinary survey on discrimination analysis.
The Knowledge Engineering Review 29(5), 582–638 (2014)
161. Rosenblatt, F.: Principles of Neurodynamics: Perceptrons and the Theory of Brain
Mechanisms. Tech. rep., Cornell Aeronautical Lab Inc Buffalo NY (1961)
162. Ruggieri, S., Pedreschi, D., Turini, F.: Data mining for discrimination discovery.
ACM Transactions on Knowledge Discovery from Data (TKDD) 4(2), 9 (May
2010)
163. Russell, C., Kusner, M.J., Loftus, J., Silva, R.: When worlds collide: integrating
different counterfactual assumptions in fairness. In: Advances in Neural Informa-
tion Processing Systems. pp. 6414–6423 (2017)
164. Ryu, H.J., Adam, H., Mitchell, M.: InclusiveFaceNet: Improving face attribute
detection with race and gender diversity. 2018 ICML Workshop on Fairness, Ac-
countability, and Transparency in Machine Learning (2018)
165. Saleiro, P., Kuester, B., Stevens, A., Anisfeld, A., Hinkson, L., London, J., Ghani,
R.: Aequitas: A bias and fairness audit toolkit. arXiv preprint arXiv:1811.05577
(2018)
166. Shen, X., Diamond, S., Gu, Y., Boyd, S.: Disciplined convex-concave program-
ming. In: 2016 IEEE 55th Conference on Decision and Control (CDC). pp. 1009–
1014. IEEE (2016)
167. Shpitser, I.: Counterfactual graphical models for longitudinal mediation analysis
with unobserved confounding. Cognitive science 37(6), 1011–1035 (2013)
168. Simoiu, C., Corbett-Davies, S., Goel, S., et al.: The problem of infra-marginality
in outcome tests for discrimination. The Annals of Applied Statistics 11(3), 1193–
1216 (2017)
169. Singh, A., Joachims, T.: Fairness of exposure in rankings. In: Proceedings of the
24th ACM SIGKDD International Conference on Knowledge Discovery &
Data Mining. pp. 2219–2228. KDD ’18, ACM, New York, NY, USA (2018)
170. Skirpan, M., Gorelick, M.: The authority of “fair” in machine learning. arXiv
preprint arXiv:1706.09976 (2017)
171. Sokol, K., Santos-Rodriguez, R., Flach, P.: FAT forensics: A python tool-
box for algorithmic fairness, accountability and transparency. arXiv preprint
arXiv:1909.05167 (2019)
172. Sokolova, M., Japkowicz, N., Szpakowicz, S.: Beyond accuracy, F-score and ROC:
a family of discriminant measures for performance evaluation. In: Australian Con-
ference on Artificial Intelligence. vol. 4304, pp. 1015–1021 (2006)
173. Sweeney, L.: Discrimination in online ad delivery. Commun. ACM 56(5), 44–54
(Jan 2013)
174. Taylor, L.: What is data justice? the case for connecting digital rights and free-
doms globally. Big Data & Society 4(2) (2017)
175. Tharwat, A.: Classification assessment methods. Applied Computing and Infor-
matics (2018)
176. The IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems:
Ethically Aligned Design: A Vision for Prioritizing Human Well-being with Au-
tonomous and Intelligent Systems. 1 edn. (2019)
177. The White House. Executive Office of the President: Big data: Seizing opportu-
nities and preserving values (May 2014)
178. The White House. Executive Office of the President: Big data: Seizing opportu-
nities and preserving values: Interim progress report. (Feb 2015)
179. The White House. Executive Office of the President: Big data: A report on algo-
rithmic systems, opportunity, and civil rights (May 2016)
180. Tibshirani, R.: Regression shrinkage and selection via the lasso. Journal of the
Royal Statistical Society: Series B (Methodological) 58(1), 267–288 (1996)
181. Tramèr, F., Atlidakis, V., Geambasu, R., Hsu, D., Hubaux, J.P., Humbert, M.,
Juels, A., Lin, H.: FairTest: Discovering unwarranted associations in data-driven
applications. In: 2017 IEEE European Symposium on Security and Privacy (Eu-
roS&P). pp. 401–416. IEEE (2017)
182. Trenkler, G., Stahlecker, P.: Dropping variables versus use of proxy variables
in linear regression. Journal of Statistical Planning and Inference 50(1), 65–75
(1996), econometric Methodology, Part III
183. Tufekci, Z.: Algorithmic harms beyond facebook and google: Emergent challenges
of computational agency. Colo. Tech. LJ 13, 203 (2015)
184. Turner, M.A., Skidmore, F.: Mortgage lending discrimination: A review of existing
evidence (1999)
185. UK Legislation: Sex Discrimination Act 1975 c. 65 (1975)
186. UK Legislation: Race Relations Act 1976 c. 74 (1976)
187. UK Legislation: Disability Discrimination Act 1995 c. 50 (1995)
188. UK Legislation: Equality Act 2010 c. 15 (Oct 2010)
189. U.S. Federal Legislation: The Equal Pay Act of 1963, Pub.L. 88–38, 77 Stat. 56
(Apr 1963)
190. U.S. Federal Legislation: Civil Rights Act of 1968, Pub.L. 90–284, 82 Stat. 73
(Apr 1968)
191. U.S. Federal Legislation: Equal Credit Opportunity Act of 1974, 15 U.S.C. § 1691
et seq. (Oct 1974)
192. Ustun, B., Spangher, A., Liu, Y.: Actionable recourse in linear classification. In:
Proceedings of the Conference on Fairness, Accountability, and Transparency. pp.
10–19. ACM (2019)
193. Varian, H.R.: Equity, envy, and efficiency (1973)
194. Verma, S., Rubin, J.: Fairness definitions explained. In: FairWare’18: IEEE/ACM
International Workshop on Software Fairness. ACM (May 2018)
195. Wang, R.Y., Strong, D.M.: Beyond accuracy: What data quality means to data
consumers. Journal of management information systems 12(4), 5–33 (1996)
196. Wei, D., Dash, S., Gao, T., Gunluk, O.: Generalized linear rule models. In: Chaud-
huri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Confer-
ence on Machine Learning. Proceedings of Machine Learning Research, vol. 97,
pp. 6687–6696. PMLR, Long Beach, California, USA (09–15 Jun 2019)
197. Woodcock, J., Larsen, P.G., Bicarregui, J., Fitzgerald, J.: Formal methods: Prac-
tice and experience. ACM computing surveys (CSUR) 41(4), 19 (2009)
198. Woodworth, B., Gunasekar, S., Ohannessian, M.I., Srebro, N.: Learning non-
discriminatory predictors. In: Kale, S., Shamir, O. (eds.) Proceedings of the
2017 Conference on Learning Theory. Proceedings of Machine Learning Research,
vol. 65, pp. 1920–1953. PMLR, Amsterdam, Netherlands (07–10 Jul 2017)
199. Wu, M., Wicker, M., Ruan, W., Huang, X., Kwiatkowska, M.: A game-based
approximate verification of deep neural networks with provable guarantees. The-
oretical Computer Science (2019)
200. Yin, X., Han, J.: CPAR: Classification based on predictive association rules. In:
Proceedings of the 2003 SIAM International Conference on Data Mining. pp.
331–335. SIAM (2003)
201. Zadrozny, B.: Learning and evaluating classifiers under sample selection bias. In:
Proceedings of the Twenty-First International Conference on Machine Learning.
p. 114. ACM (2004)
202. Zafar, M.B., Valera, I., Gomez Rodriguez, M., Gummadi, K.P.: Fairness beyond
disparate treatment & disparate impact: Learning classification without disparate
mistreatment. In: Proceedings of the 26th International Conference on World
Wide Web. pp. 1171–1180. International World Wide Web Conferences Steering
Committee (2017)
203. Zafar, M.B., Valera, I., Rogriguez, M.G., Gummadi, K.P.: Fairness constraints:
Mechanisms for fair classification. In: Artificial Intelligence and Statistics. pp.
962–970 (2017)
204. Zehlike, M., Bonchi, F., Castillo, C., Hajian, S., Megahed, M., Baeza-Yates, R.:
FA*IR: A fair top-k ranking algorithm. In: Proceedings of the 2017 ACM on
Conference on Information and Knowledge Management. pp. 1569–1578. CIKM
’17, ACM, New York, NY, USA (2017)
205. Zehlike, M., Castillo, C.: Reducing disparate exposure in ranking: A learning to
rank approach. arXiv preprint arXiv:1805.08716 (2018)
206. Zehlike, M., Castillo, C., Bonchi, F., Baeza-Yates, R., Hajian, S., Megahed,
M.: Fairness measures: A platform for data collection and benchmarking in
discrimination-aware ML. [Link] (Jun 2017)
207. Zehlike, M., Sühr, T., Castillo, C., Kitanovski, I.: FairSearch: A tool for fairness
in ranked search results (2019)
208. Zemel, R., Wu, Y., Swersky, K., Pitassi, T., Dwork, C.: Learning fair representa-
tions. In: International Conference on Machine Learning. pp. 325–333 (2013)
209. Zhang, B.H., Lemoine, B., Mitchell, M.: Mitigating unwanted biases with adver-
sarial learning. In: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics,
and Society. pp. 335–340. ACM (2018)
210. Žliobaitė, I.: On the relation between accuracy and fairness in binary classification.
In: The 2nd workshop on Fairness, Accountability, and Transparency in Machine
Learning (FATML) at ICML’15 (2015)
211. Žliobaitė, I.: Fairness-aware machine learning: a perspective. arXiv preprint
arXiv:1708.00754 (2017)
212. Žliobaitė, I.: Measuring discrimination in algorithmic decision making. Data Min-
ing and Knowledge Discovery 31(4), 1060–1089 (Jul 2017)