CAUSATION
Luke Fenton-Glynn
University College London
www.cambridge.org
Information on this title: www.cambridge.org/9781108706636
DOI: 10.1017/9781108588300
© Luke Fenton-Glynn 2021
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2021
A catalogue record for this publication is available from the British Library.
ISBN 978-1-108-70663-6 Paperback
ISSN 2517-7273 (online)
ISSN 2517-7265 (print)
Cambridge University Press has no responsibility for the persistence or accuracy of
URLs for external or third-party internet websites referred to in this publication
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
First published online: June 2021
Author for correspondence: Luke Fenton-Glynn, [email protected]
1 Introduction
5 Probabilistic Causation
6 Conclusion
References
1 Introduction
We humans take a great interest in causation. Causal knowledge helps us to
understand, predict, and influence the world around us. A baby quickly comes
to realise that pushing a button on her toy causes it to play a song; an adult
exploits her knowledge of the effects of chamomile to soothe the baby’s teething
gums. Because of the close relation between causation, understanding, pre-
diction, and control, natural and social scientists devote significant time and
resources to investigating causal questions. Among other things, they ask or
have asked: What are the causes of cancer? Of climate change? Of anoma-
lies in the orbit of Uranus? What caused the extinction of the dinosaurs? The
First World War? The 2007-8 financial crisis? Trump’s election? The Covid-19
outbreak? What causes mental health problems? Crime? Price inflation?
Causation is of importance to psychology: if we want to understand how
humans learn and reason, we need to understand their capacity for causal learn-
ing and reasoning. It’s of importance in AI: if we want computers and robots
to learn as well as (or better than!) humans, and to manipulate the world as (or
more!) effectively, we need to programme them to be able to acquire causal
knowledge and to use it.
Causation is also closely tied to questions of moral and legal responsibility.
In a landmark UK legal case, the wife of the late Arthur Fairchild successfully
sued Glenhaven Funeral Services1 over her husband’s death from mesotheli-
oma – a type of lung cancer caused by asbestos exposure. In order to establish
the company’s responsibility for Fairchild’s fatal illness, it was of course nec-
essary to establish that there was a causal link between something they’d
done – namely negligently expose Fairchild to elevated levels of asbestos –
and the illness itself.
Despite its ubiquity and importance, it’s surprisingly difficult to say exactly
what causation is. Difficult questions about the fundamental nature of the
world – especially those that don’t readily admit of empirical resolution – nat-
urally attract the attention of philosophers. But causation isn’t only of intrinsic
philosophical interest. Greater theoretical clarity on its nature has had signifi-
cant payoffs in the sciences and in law. And, close to home for philosophers, it
has payoffs in virtue of the fact that causation plays a role in key theories of a
variety of philosophically interesting phenomena including (but not limited to)
reference, perception, decision, knowledge, inference, action, and explanation.
It shouldn’t be thought that work on the theory of causation is the exclusive
preserve of philosophers. Much important theoretical work has been done by
1 Fairchild v Glenhaven Funeral Services Ltd ([2002] UKHL 22; [2003] 1 AC 32).
2 The notion that there might be more than one fundamental causal relation is taken up in Section
3.4.2.
Craver and Tabery (2019) note, ‘Mechanists have disagreed with one another
about how to understand the cause in causal mechanism. … Four ways of
unpacking [it] have been discussed: conserved quantity accounts, mechanistic
accounts, activities accounts, and counterfactual accounts’.
I’ve already said that it’s doubtful that the conserved quantity approach
can yield an understanding of the fundamental causal relation(s). The activ-
ities approach, on the other hand, is a primitivist approach (see Craver and
Tabery 2019), whereas we’ll be examining accounts that seek a deeper under-
standing of causation. Meanwhile, the counterfactual approach is one that
we’ll be exploring in Section 4. Finally, the mechanistic account – as advo-
cated by Glennan (1996) – is regressive. The proposal is that causal con-
nections that may seem basic at (say) the biological level can be understood
in terms of mechanisms at the chemical level, and those at the chemical
level in terms of mechanisms at the physical level. There’s thus a hierarchy
of mechanisms. The concern, though, is that this hierarchy bottoms out at
the level of fundamental physics, where we have causings that can’t
be mechanistically understood. Again, this favours the view that there are
fundamental causal relations in terms of which mechanisms can ultimately
be understood. The regularity, counterfactual, and probabilistic approaches
seem the most promising approaches to understanding these basic causal
relations.
caused by the prior states of the rock all the way back until we get to the throw
itself.
But we don’t have a case of causation just any time an event occurs prior
to and contiguously with another. Towards the end of the movie Saving Pri-
vate Ryan, in a defiant last stand, Captain Miller repeatedly fires his pistol at a
German Tiger tank. The bullets are of course completely incapable of piercing
the tank’s armour. Down to his last bullet, Miller points his gun and shoots at
the tank at the very moment the tank is blown to pieces by a bomb dropped by
a US P-51 aircraft. The impact of Miller’s bullet is immediately prior to, and
contiguous with, the explosion of the tank. Yet it’s the bomb and not the bullet
that causes the tank to explode.
Fortunately, Hume’s account doesn’t imply that the bullet impact was a
cause. That’s because, in addition to priority and contiguity, Hume adds a
third requirement: constant conjunction. For Hume, for an event c to be a
cause of an event e, it must be the case that events like c are always fol-
lowed contiguously by events like e.3 This criterion excludes the impact of
Miller’s bullet from counting as a cause of the tank’s explosion. That’s because
events like the former aren’t always followed by events like the latter. Indeed,
Miller had already fired his gun at the tank five times prior to firing his last
bullet: the impact of none of these previous five bullets was followed contig-
uously by an explosion. So Hume’s analysis yields the correct verdict about
this case.
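To make the structure of Hume’s test concrete, here is a minimal sketch in Python (the event records, timings, and the numerical ‘contiguity’ threshold are all invented for illustration; nothing here is Hume’s own formalism):

    # A toy check of Hume's constant-conjunction requirement: every event of
    # the candidate cause's type must be followed contiguously by an event of
    # the effect's type. 'Contiguity' is crudely modelled as 'within EPS'.
    EPS = 0.1
    events = [  # (type, time) records for the tank scene
        ("bullet_impact", 1.0), ("bullet_impact", 2.0), ("bullet_impact", 3.0),
        ("bullet_impact", 4.0), ("bullet_impact", 5.0), ("bullet_impact", 6.0),
        ("bomb_impact", 6.0), ("explosion", 6.05),
    ]

    def constantly_conjoined(c_type, e_type, events):
        """True iff every c_type event is followed within EPS by an e_type event."""
        return all(
            any(e == e_type and 0 < te - tc <= EPS for e, te in events)
            for c, tc in events if c == c_type
        )

    print(constantly_conjoined("bullet_impact", "explosion", events))  # False
    print(constantly_conjoined("bomb_impact", "explosion", events))    # True

The five earlier bullet impacts have no contiguous explosion, so the universal generalisation fails for bullet impacts while holding for the bomb impact, mirroring the verdict in the text.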
Although Hume doesn’t say this, it’s tempting to think that not any old con-
stant conjunction can ground a causal relation, but rather one might wish to
require that the constant conjunction be entailed by the laws of nature. This
avoids problems such as the following. Suppose there exists an extremely
rare isotope, call it ‘unobtanium-352’. Only one atom of this isotope ever
exists. Suppose this atom happened to decay on the afternoon of Novem-
ber 19, 1863, immediately before Lincoln delivered the Gettysburg Address
and contiguously with it. Now, for any type T of event of which the Gettys-
burg Address is a member (‘famous speeches’, say), it’s true that all cases
of unobtanium-352 decay are followed by events of type T. Nevertheless, it
clearly doesn’t follow that the decay of the unobtanium-352 atom was a cause
of the Gettysburg Address.4 A sophisticated regularity theory can avoid this
3 We could, on Hume’s behalf, distinguish direct from indirect causation, with c counting as a
direct cause of e iff c is prior to, and contiguous with, e and events like c are always followed
contiguously by events like e. Indirect causation would then be understood in terms of chains
of (i.e. ordered sequences of events that stand in relations of) direct causation.
4 Note that even if we take ‘constant conjunction’ to require multiple instances, we’re still
liable to get ‘accidental’ constant conjunctions that aren’t apt to underwrite causal relations
(see Armstrong 1983, 15–17).
conclusion by pointing out that the fact that all instances of unobtanium-352
decay are followed by events of type T isn’t entailed by the laws of nature
(rather, it’s an instance of an accidental regularity).5
2.2 Mill
An apparent problem with Hume’s account is that sometimes we seem to have
causation without constant conjunction. John Stuart Mill pointed this out in
making the following observation:
5 A couple of comments are worth making regarding this proposed appeal to laws of nature. First,
whether one regards it as marking a departure from a pure regularity theory of causation will
depend upon one’s preferred metaphysics of laws. It will not be a departure if one adopts a regu-
larity theory of laws. Whilst sophisticated regularity theories of laws – such as the Best System
Analysis (Lewis 1994) – regard laws as regularities, they don’t count just any old regularity as a
law of nature. Thus, for instance, it’s to be hoped that they wouldn’t count an unobtanium-352
decay/famous speech regularity as a law.
Second, for the appeal to laws of nature to be satisfactory, it will presumably be necessary
that a wide range of regularities outside the domain of fundamental physics (e.g. ‘aspirin con-
sumption is followed by pain relief’) are entailed by laws of nature. That’s because it’s clear that
our causal claims extend to such domains. There’s an extensive philosophical literature on the
status of generalisations outside of fundamental physics. For overviews of important aspects of
this literature, see Cat (2017) and Reutlinger et al. (2019). Thanks to an anonymous referee for
encouraging me to say something about both of the foregoing points.
is between eating the dish and (e.g.) having a severe peanut allergy (as Mill
puts it: having ‘a particular bodily constitution’) and death.
Mill thinks that, properly speaking, in such a case it’s the combination of
the severe peanut allergy with the eating of the dish that’s the cause of death.
However, he observes elsewhere (Mill 1843, III.v.3) that, in ordinary talk, we
often single out just one of the factors in such a combination as the cause
and regard the others as mere ‘conditions’. Specifically, he claims that we’re
inclined to pick out ‘events’ or ‘changes’ like the eating of the dish as causes
and treat long-standing ‘states’ like the possessing of the peanut allergy as mere
‘conditions’.
2.4 Mackie
As we saw, although Mill acknowledges that in ordinary talk we distinguish
between causes and conditions, he thinks that the ‘real Cause’ is the combina-
tion of all the factors needed to bring about the effect. We also saw that Hart
and Honoré seek to account for why we distinguish between causes and condi-
tions in terms of our explanatory interests, noting that these various factors may
be equally necessary for the effect. Especially if we think that the event/state
distinction does not always track the cause/condition distinction (as Hart and
Honoré’s example suggests), we might conclude that the cause/condition dis-
tinction isn’t really a metaphysical distinction at all but rather one that’s to be
accounted for by a suitable pragmatics of causal talk. It’s tempting to impute
this view to David Lewis when he says:
We sometimes single out one among all the causes of some event and call it
‘the’ cause, as if there were no others. Or we single out a few as the ‘causes,’
calling the rest mere ‘causal factors’ or ‘causal conditions.’ … We may select
the abnormal or extraordinary causes, or those under human control, or those
we deem good or bad, or just those we want to talk about. I have nothing
to say about these principles of invidious discrimination. I am concerned
with the prior question of what it is to be one of the causes (unselectively
speaking). (Lewis 1973a, 558–9)
We’ll return to Lewis’s account of causation in Section 4. For now it’s worth
noting that, even if we agree that there’s no deep metaphysical distinction
between causes and conditions (or indeed between the factor that we might
on some occasion be inclined to pick out as ‘the’ cause of some effect, and the
rest of that effect’s causes), we needn’t follow Mill in taking the conjunction
of all the factors requisite to produce an effect as the real cause. An alternative
approach – as suggested in the passage from Lewis – is that we allow that each
of the factors in question counts as a cause, so that a given effect may have a
plurality of causes.
The latter is the approach of Mackie, whose well-known account of causation
perhaps marked the zenith of the regularity approach.6 Mackie’s account can
be introduced by means of his own example:
Suppose that a fire has broken out in a certain house …. Experts … con-
clude that it was caused by an electrical short-circuit at a certain place. …
Clearly the experts are not saying that the short-circuit was a necessary con-
dition for this house’s catching fire at this time; they know perfectly well
that … the overturning of a lighted oil stove, or any one of a number of
other things might, if it had occurred, have set the house on fire. Equally,
they are not saying that the short-circuit was a sufficient condition for this
house’s catching fire; for if the short-circuit had occurred, but there had been
no inflammable material nearby, the fire would not have broken out …. In
what sense, then, is it said to have caused the fire? At least part of the answer
is that there is a set of conditions … including the presence of inflammable
material, the absence of a suitably placed sprinkler … which combined with
the short-circuit constituted a complex condition that was sufficient for the
house’s catching fire – sufficient, but not necessary, for the fire could have
started in other ways. Also, of this complex condition, the short-circuit was
an indispensable part: the other parts of this condition, conjoined with one
another in the absence of the short-circuit, would not have produced the fire.
… In this case, then, the so-called cause is … an insufficient but necessary
part of a condition which is itself unnecessary but sufficient for the result. …
[L]et us call such a condition (from the initial letters of the words italicized
above), an inus condition. (Mackie 1965, 245)
6 This isn’t to denigrate subsequent accounts within the regularity tradition, including the excel-
lent contributions of Strevens (2007) and Baumgartner (2013). But these can be regarded as
attempts to revive the tradition after a prolonged period in which it has been out of favour. It’s
fair to say that it currently remains a minority approach.
Clearly Mackie’s account makes heavy use of the notions of necessary and
sufficient conditions. The reason it’s usually classed as a regularity theory is
that necessary and sufficient conditions can themselves be understood in terms
of regularities.7 For instance, we might say that the set of conditions comprising
the short-circuit, the presence of inflammable material, the presence of oxygen,
the absence of a sprinkler, etc. is sufficient for the fire iff whenever such a
constellation of factors co-occurs, a fire always ensues. Likewise, we might
say that the short-circuit is a necessary (or non-redundant) element of this set
of conditions iff it’s not the case that whenever the remaining conditions in the
set co-occur then fire ensues.8
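These two regularity readings are mechanical enough to state algorithmically. Here is a minimal Python sketch, with invented toy case records (the factor names and data are mine, purely for illustration):

    # Each record maps factors to booleans; 'fire' is the outcome.
    cases = [
        {"short_circuit": True,  "flammable": True,  "no_sprinkler": True,  "fire": True},
        {"short_circuit": True,  "flammable": False, "no_sprinkler": True,  "fire": False},
        {"short_circuit": False, "flammable": True,  "no_sprinkler": True,  "fire": False},
    ]

    def sufficient(conds, outcome, cases):
        """True iff, whenever all the conditions co-occur, the outcome ensues."""
        return all(c[outcome] for c in cases if all(c[f] for f in conds))

    def non_redundant(factor, conds, outcome, cases):
        """True iff the remaining conditions aren't sufficient on their own."""
        return not sufficient([f for f in conds if f != factor], outcome, cases)

    conds = ["short_circuit", "flammable", "no_sprinkler"]
    print(sufficient(conds, "fire", cases))                      # True
    print(non_redundant("short_circuit", conds, "fire", cases))  # True

On this toy data the three-member set is sufficient for the fire, and the short-circuit is a non-redundant member of it: precisely the two features that make the short-circuit an inus condition of the fire.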
With the notions of ‘necessity’ and ‘sufficiency’ so interpreted, we can view
Hume as taking causes to be sufficient for their effects (and possibly necessary
too, if ‘constant conjunction’ is taken to cut both ways, so that not only does a
cause never occur without its associated effect, but an effect never occurs with-
out its associated cause), while Mill agrees that real causes are sufficient for
their effects but takes them to typically be complex (e.g. the complex condition
comprising the short-circuit, the presence of oxygen, the presence of inflam-
mable material, and the absence of a sprinkler system, etc.). Mackie, on the
other hand, allows non-redundant elements of such sets (e.g. the short-circuit)
to count as causes.
7 Though, drawing upon the point made in Section 2.1, we may wish to understand them in terms
of lawfully entailed regularities.
8 Actually, Mackie himself proposes that necessity and sufficiency be understood in terms of
regularities only indirectly: specifically, he proposes to interpret necessity and sufficiency in terms
of counterfactuals, with the counterfactuals understood in terms of regularities (Mackie 1965,
253–5). So one might take Mackie’s account to be a sort of hybrid regularity/counterfactual
theory. Counterfactual theories will be examined in Section 4.
might cause her to drink at time t2, which in turn causes her depressed state at
time t3, what we’d never have is a person’s depressed state at t1 causing her to
drink at t2 with her drinking at t2 causing her depression at t1. The point is that,
where a and b are particular events or states that obtain at specific times, it can’t
be the case both that a causes b and that b causes a. This point is sometimes
put by saying that token causation (that is, the causal relation between token –
i.e. particular, dated – events or states) is asymmetric. We’ll have more to say
about token causation – and its distinction from what’s sometimes known as
type causation – in Section 3.3.
One might wonder whether it’s really entirely impossible that a should be
a token cause of b and b a token cause of a. The General Theory of Relativ-
ity (GTR) allows for the possibility of what are known as ‘closed time-like
curves’ – paths in space-time of positive distance that lead from a given space-
time point p back to p that could be traversed without ever travelling at or above
the speed of light. Suppose object o traverses such a path from p to p and goes
through some point q on its way. Then the state of o at p (its position and veloc-
ity) is presumably a cause of its state at q and its state at q a cause of its state
at p. But, whilst GTR allows that this is a possibility, we don’t have reason to
think that closed time-like curves in fact exist in our universe.
The trouble for regularity theorists is that they face a challenge in accounting
for why we don’t have bi-directional causation even in quotidian cases. That’s
because necessity and sufficiency are two sides of the same coin: if a is suffi-
cient for b then b is necessary for a. Putting it in regularity-theoretic terms, if
events like a are always accompanied by events like b, then there are no a-like
events without b-like events. This means that a challenge arises for the accounts
of Hume and Mill when we have causes that are both necessary and sufficient
for their effects. It also turns out that often when a is an inus condition of b,
then b is also an inus condition of a.
This difficulty isn’t necessarily fatal to the regularity theory. What it shows
is that some extra element needs to be added to the analysis to distinguish cause
from effect. Hume takes this extra element to be time: he requires that the cause
be earlier in time than the effect. One potential objection to Hume’s approach
is that it’s not clear how to reconcile it with the possibility of closed causal
loops of the sort seemingly allowed by GTR. Mackie was reluctant to rule
out backwards-in-time causation, and so instead took the cause to be the event
that becomes fixed first (Mackie 1980, ch. 7). While, ordinarily, earlier events
become fixed before later ones (because becoming a past event is the usual
way in which an event becomes fixed), Mackie thought that in special circum-
stances (of the sort that would allow for retro-causation) a later event might
become fixed prior to an earlier one. Yet it’s not clear that Mackie’s proposal
Probabilistic Bomb
“[A] bomb is connected with a Geiger counter, so that it will go off if the
Geiger counter registers a certain reading; whether it will or not is not
determined, for it is so placed near some radioactive material that it may
or may not register that reading. … [T]here would be no doubt of the
cause of the reading or of the explosion if the bomb did go off.”
10 Likewise, the problem is one that Strevens (2007) seeks to deal with via an account that, while
incorporating elements of a Mackie-inspired regularity analysis, also appeals to a notion of
‘causal connection’. Though Strevens (2007, 110) suggests that this notion of ‘causal connec-
tion’ might potentially be understood via a ‘process theoretic’ account of causation (of the sort
briefly discussed in the Introduction of this Element) or a nuanced counterfactual approach, he
refrains from endorsing any specific analysis of the notion.
11 This representation isn’t to be confused with the graphs of causal models that will be introduced
in Sections 4 and 5. Rather, it’s merely an intuitive representation of the causal relations that
obtain in this case.
sufficient for the storm,12 while the latter two of these conditions aren’t on their
own sufficient. Requiring that causes temporally precede their effects doesn’t
help since barometers are prediction devices and therefore, if the barometer is
any good, it will signal a storm prior to the storm’s occurrence.
In the face of these difficulties, the regularity approach has fallen out of
favour. In Sections 4 and 5 we’ll explore alternatives to it. But first we’ll
pause to discuss some important general questions about the nature of cau-
sation and our methodology in investigating it. It’ll be easier to do this now
we’ve embarked on our discussion of theories of causation, since this will lend
greater concreteness to the discussion.
12 This means that Mill’s account yields an incorrect result about the causation of the storm.
Hume’s account, on the other hand, gives incorrect results whenever we have a common cause
c that’s necessary for an effect e1 and sufficient for an independent effect e2, for this will mean
that e1 is sufficient for c and therefore e2.
(from the fact that x causes y and y causes x, we can infer by transitivity that
x causes x).
Also on the assumption that causation is a relation, we might ask what the
relata of the causal relation are. The standard answer is events, where the lat-
ter category is broadly construed to include what might be considered ‘states’
or ‘conditions’ (such as the presence of oxygen). Thus, for instance, David-
son (1967) and Kim (1973a) advance considerations in favour of the view that
causation relates events, on a reasonably broad construal of what counts as an
‘event’.
There’s a hitch, though, with the view that causation is a relation with events
as its relata. That hitch is how to make sense of statements like ‘the crop failed
because there was no rain’, or ‘the absence of rain caused the crop to fail’, which
seems a legitimate paraphrase. The difficulty is that it’s not clear that an absence
of an event is itself an event. Moreover, we can’t just happily acknowledge that
and simply allow that causation is a relation that may have events or absences
as its relata. The reason is that it’s not obvious that absences are particulars that
could stand in such a relation to events or to other absences (Beebee 2004, cf.
Mellor 1995, 163–5).
What are our options? One is to deny that causation is a relation at all. This
option is pursued by Mellor (1995). Mellor (1995, 156) regards the basic form
of causal truth as ‘E because C’, where ‘C’ and ‘E’ are facts, with facts under-
stood as true sentences (Mellor 1995, 161). Facts aren’t particulars and so aren’t
apt to be the relata of a real relation. Moreover, in cases like ‘the crop failed
because there was no rain’, Mellor (1995, 163–5) claims there are no particu-
lars that serve as truthmakers for C and E that could themselves stand in a real
relation of causation.
A second option, discussed by Lewis (1986b, 190–3, 2004b, 281–2), would
be to insist that absences are particulars and therefore apt to stand in a causal
relation. Lewis discusses two versions of this view: (i) there are negative events
(events whose essence is the absence of something); (ii) there are (positive)
events that are contingently absences. To illustrate the idea behind (ii), turn-
ing right at a junction might be one way for a car to fail to turn left; another
way for it to fail to turn left might be by its continuing straight on. If the car
in fact turns right, then its turning right might be taken to be the absence of
its turning left, but only contingently because it’s possible that another event
(the car’s continuing straight on) should have instead been the absence of its
turning left.
The trouble with (i) is that negative events would be strange sorts of things,
and it’s not obvious how to reconcile them with standard accounts of the ontol-
ogy of events (e.g. Kim 1973a, Lewis 1986a). The trouble with (ii) is that
there don’t always seem to be events of the right sort to contingently serve
as absences. For instance, what positive event could contingently serve as the
absence of rain? The presence of sun? But sun tends to be beneficial to crops
(so its presence, unlike the absence of rain, wouldn’t explain a crop failure)
and, in any case, its presence isn’t incompatible with rain (fortunately for rain-
bow fans). The presence of a cloudless sky? But this is really an absence of
clouds.
A third option (see, e.g., Beebee 2004) is to maintain that causation is a rela-
tion, but admit that absences aren’t particulars that can stand in such a relation,
and so deny that statements like ‘the crop failed because there was no rain’ have
a causal relation as their truthmaker. If one nevertheless wishes to maintain that
such statements can be true, then one takes on the burden of giving a semantics
for them that doesn’t appeal to the existence of a corresponding causal relation.
Lewis (2004b, 282–3), for instance, appeals to the obtaining of certain counter-
factuals. For instance, the statement ‘the absence of rain caused the crop to fail’
is made true by the truth of the counterfactual ‘if there had been rain, then the
crop wouldn’t have failed’ (which, Lewis would claim, doesn’t itself require
a causal relation as its truthmaker – Lewis 1979). A regularity theorist, on the
other hand, might take this statement to be made true by the regularity that,
whenever there’s healthy crop growth (in the absence of a sprinkler system,
etc.), it’s preceded by rain. On the other hand, an advocate of a probabilistic
analysis (of the sort to be discussed in Section 5) might take it to be made true
by the fact that rain raises the probability of healthy crop growth.
In what follows, I’ll adopt the standard view that causation is a relation and
remain neutral between the second and third option for dealing with absences.
But, setting aside the problem of absences, are there particulars other than
events (construed liberally) that could serve as the relata of a causal relation?
One might think that objects could. For instance, if a brick is thrown at a win-
dow, which shatters, it sounds perfectly acceptable to say ‘the brick caused the
window to break’. Yet the brick is an object, not an event.
It is, however, tempting to think that event causation is fundamental. That’s
because, whenever it sounds felicitous to say ‘object o was a cause of e’, it
seems there’s some event c that o participated in such that it’s apt to say that ‘o
was a cause of e in virtue of o’s participation in c’ and also that ‘c was a cause
of e’. The brick caused the window to break, but it did so by hitting the window,
and the brick hitting the window is an event which itself caused the window to
break. On the other hand, whenever it sounds felicitous to say ‘a was a cause of
object o’ (e.g. ‘the collision of the Indo-Australian and Eurasian plates caused
the Himalayas’), an apt paraphrase seems to be ‘a was a cause of o’s coming
into existence’, and the coming into existence of o is an event.
13 This view is particularly associated with the causal modelling approach to causation to be
explored in sections 4.6 and 5.3–5.6. Pearl (2009), Spirtes et al. (2000), and Halpern (2016)
are prime examples of this approach.
14 As we’ll see in Section 4.6.5, taking the variables themselves to be the relata might sometimes
be appropriate depending on the variety of causation at issue.
such as “drinking lots of wine last night caused my hangover” is in most con-
texts evaluable as true without explicitly specified contrasts, that’s because
in most contexts the implicit contrast to my drinking all that wine is some
sort of default behaviour like my not drinking alcohol (and the implicit con-
trast to my having a hangover is just my not having a hangover). But perhaps
in rare contexts the implicit contrast might be different so that the sentence
expresses a false proposition. For example, suppose that last night I was par-
ticipating in an initiation ceremony where I was forced to choose to drink
either lots of wine or lots of whisky. Then the implicit contrast to my drink-
ing lots of wine is my drinking lots of whisky, so that the sentence can be
considered elliptical for “drinking lots of wine rather than lots of whisky last
night caused my hangover (rather than my not having a hangover)”, which is
presumably false.
15 Approaches to the semantics of the generic operator are discussed by, for example, the
contributors to Carlson and Pelletier (1995).
16 And, as we’ll see in Section 5.6, the question of the relation between token and type causation
is somewhat complex if one adopts a probabilistic approach.
17 Of necessity, the discussion here is brief. Readers are referred to Paul and Hall (2013, esp. ch.
2, sect. 3) for a more extensive discussion of the goals and methods of a range of contemporary
approaches to causation.
those objects, that resemble the latter’ (Hume 1739, I.iii.14). This, of course, is
Hume’s regularity account of causation.18
Hume, like other British empiricists (notably John Locke), was an
important influence on the later analytic tradition, which gave the method of
conceptual analysis a central role in philosophy. We’ve already seen that there
is some reason to think that analysis of the concept of causation isn’t auto-
matically an investigation of causation as it exists in the world. Still, the two
projects aren’t entirely unrelated. If one gave an account of causation in the
world that bore little resemblance to our concept of causation, then one could
reasonably be accused of changing the subject. For example, this would be a
fair charge if one asserted that causation ‘in the objects’ were identical to mere
correlation. After all, it’s a common aphorism that correlation doesn’t imply
causation: barometer readings don’t affect the weather, for instance.
There’s a certain amount of give and take here, however. Perhaps nothing ‘in
the objects’ could perfectly answer to our concept of causation. This would be
so if, for instance, there are no necessary connections in nature, but Hume is
right that the concept of causation involves the idea of ‘necessary connection’
as a key component. One response would be to simply deny that there is such
a thing as causation ‘in the world’. Yet if there is something in the world that
answers near enough to our concept of causation, or if there’s something in the
world that our concept of causation tracks with at least a reasonable degree of
systematicity (for instance, Hume thought that we were systematically willing
to apply our concept of causation in cases in which there’s a confluence of
priority, contiguity, and constant conjunction), then it may be reasonable to
call this thing ‘causation’.
If one regards causation ‘in the world’ as diverging from our concept of
causation, then it might be natural to suggest a revision to our ordinary con-
cept so that it better corresponds to causation in the world or so that it better
corresponds to something that it would be useful for our concept of causation
to track. This is one reason why one might endorse a ‘revisionary’ or ‘prescrip-
tive’ account of the concept of causation, as opposed to a purely descriptive
one. There are other (perhaps related) reasons for endorsing a revisionary anal-
ysis. For example, one might think that the ordinary concept of causation is
somehow imprecise or even incoherent and that we would be better served by
a concept that was coherent and precise. We’ll see in Sections 4 and 5 that
18 It’s a matter of controversy whether Hume thought that this ‘definition’ captures all that causa-
tion is in the world or merely all that we can know about causation as it is in the world – see
(Read 2014). The former thesis seems fairly clearly the one argued for in (Hume 1739). There
is, however, some evidence that he rows back from this in (Hume 1748).
especially some of the more formal accounts of causation (e.g. those appealing
to structural or probabilistic models) arguably involve a degree of precision
that our ordinary concept of causation lacks.19 Indeed, as we’ll see, some of
those who have developed such accounts (e.g. Spirtes et al. 2000, Hitchcock
2001b, Woodward 2003, Pearl 2009) define overtly technical causal notions –
for example, ‘total’, ‘contributing’, ‘direct’ causation – which, though arguably
useful, are sometimes only loosely related to the causal concept(s) of the lay-
person. Nevertheless, several of these authors also discuss the notion of what
they call ‘actual causation’ (Woodward 2003, sect. 2.7; Pearl 2009, ch. 10),
which is taken to be more-or-less the ordinary person’s token causal concept.
When theorising about causation, it can often be helpful to have in mind the
question of what the point is in our having a concept of causation, or perhaps
a concept of this or that sort of causation (e.g. a concept of type causation and
a concept of token or actual causation). This is perhaps particularly obvious if
one is in the business of developing a prescriptive analysis of causation – since
a consideration of what practical or cognitive needs are served by a concept (or
concepts) of causation will help guide the development of a concept that better
meets those needs. But it is also the case even when it comes to developing a
descriptive analysis: an understanding of why we have a concept (or concepts)
of cause can help us to more successfully delineate the contours of the con-
cept(s) that we do have, just as an understanding of the purpose or function of
a carburettor can help us to understand (and perhaps improve) its design.
Mellor (1995, 58–66, ch. 7) argues that causation has several important
‘connotations’, including that effects are evidence for their causes, that causes
explain their effects, and that causes are means for bringing about their effects
as ends. The connections to evidence, explanation, and practical reasoning can
well explain why we have a concept or concepts of causation and should inform
our theorising about it. For instance, if one were to devise a theory of causation
on which it were difficult or impossible to see why – on such a theory – causes
are evidence for, explanatory of, or means to, their effects, then this would be
at least prima facie evidence that we were barking up the wrong tree. It turns
out that this is a genuine issue for many accounts of causation. For instance,
each of the broad traditions that we consider in this Element – the regularity,
19 This is also true of some of the less technical accounts. For instance, we’ll see that Lewis’s
counterfactual analysis of causation implies that overdeterminers are not causes in cases of so-
called symmetric overdetermination. Lewis (1986b, 208) believes that there is a ‘lack of firm
common-sense judgements’ regarding whether overdeterminers are genuine causes and says
that when, as in cases of overdetermination, ‘common sense falls into indecision or controversy
… then … [w]e can reasonably accept as true whatever answer comes from the analysis [of
causation] that does best on the clearer cases’.
20 Thus, for instance, though the notion of ‘non-backtracking counterfactual dependence’ may be
obscure as far as the ordinary competent language user is concerned, if this relation is grounded
in some asymmetry of physics – as Lewis (1979) and Dunn (2011) have argued – then giv-
ing extensionally adequate truth-conditions for causal claims in terms of non-backtracking
counterfactual dependence potentially paves the way for an overall simplification of our
metaphysics.
21 In fact, it might be that deterministic and probabilistic causation coexist in some worlds (perhaps
at different scientific ‘levels’), and that ours is such a world. For relevant discussion, see Albert
(2000), Loewer (2001), Ismael (2009), Werndl (2011, 2263), Emery (2015), and List and Pivato
(2015, sect. 9). Potentially, then, there’s an interesting discussion to be had about the legitimacy
of seeking a philosophical account of causation that applies only to certain domains (e.g. levels)
even within a world.
And perhaps one can also usefully define technical causal notions at the token-
level provided that one is careful to distinguish these from our notion of actual
causation.22
But what would be particularly interesting, given our firm intuitions about
actual causation, is if there were some inconsistency in our thought or talk about
it that might motivate a distinction between two or more concepts of actual
causation. There are at least two ways in which this might occur. First, it might
be that the conception of actual causation in play varies across contexts. For
instance, it might be that lawyers or scientists or historians draw upon a dif-
ferent conception of actual causation from one another or from ordinary folk.
Second, it might be that there’s an inconsistency in our thought or talk about
actual causation even within a context.
The latter is a view argued for in some detail by Hall (2004), who thinks
that the ordinary folk concept of actual causation bifurcates into a concept of
‘dependence’ and a concept of ‘production’. In many cases, production and
dependence align. For example, a short circuit might produce a fire, with the
occurrence of the latter also depending upon the occurrence of the former.
Sometimes, though, they come apart: if two short circuits occur simultaneously,
each sufficient for the fire, then whilst both may play a role in the fire’s produc-
tion, the occurrence of the fire might depend on neither (since in the absence of
each, the other short circuit would have still been enough to produce the fire).
This is an instance of so-called ‘symmetric overdetermination’ (a phenomenon
that will be discussed in more detail in Section 4.4). To the extent that we feel
conflicted about whether each of the short circuits was a cause in such a case,
the breaking of the normal association of production and dependence might be
one explanation for this feeling of conflict.
There are also prima facie reasons for thinking that the conception of actual
causation varies across contexts. For instance, a simple counterfactual ‘but for’
test for actual causation is preponderant in the law (“But for the short circuit,
would the fire have occurred?”). In apparently stark contrast, there’s consid-
erable controversy about the legitimacy of using counterfactual reasoning to
establish causal claims in history (see, e.g., Evans 2013). Still, this prima facie
divergence may only be that. When application of the ‘but for’ test produces
counterintuitive consequences, judges tend to vary the test for causation.23
22 For instance, Hitchcock (2007, 503–4) introduces a technically defined notion of ‘token causal
structure’ which he clearly distinguishes from the ordinary notion of actual causation. And, as
we’ll see in Section 4.6.5, Woodwardian notions of ‘direct’, ‘total’, and ‘contributing’ cause
can also be defined for the token level.
23 Examples include R v Dyson [1908 2 KB 454 (CA)], Kingston v Chicago & N.W. Ry [191 Wis.
610, 211 N.W. 913 (1927)], McGhee v National Coal Board [1972 3 All E.R. 1008, 1 W.L.R.
1], and Fairchild v Glenhaven Funeral Services Ltd [2002 UKHL 22].
3.4.3 Reduction
Reductivity is often seen as a virtue of philosophical analyses. For those seek-
ing a metaphysical reduction of causation, success would – other things being
equal – allow for a simplification of their overall metaphysic. For example, if
causation could be reduced to constant conjunction with the latter interpreted
24 This ‘rule’ is a convention followed by the courts of England and Wales, with similar conven-
tions also adopted in other jurisdictions. It was articulated in the Sussex Peerage Case [1844; 11
Cl and Fin 85] as follows: ‘If the words of the Statute are in themselves precise and unambig-
uous, then no more can be necessary than to expound those words in that natural and ordinary
sense.’
25 This biconditional is implied by the (controversial) KK principle together with the factivity of
knowledge.
26 I don’t claim that knowledge is itself a propositional attitude, though this is a view defended by
Williamson (2000, ch. 1).
An event e causally depends on an event c just in case the following pair of counterfactuals is true,27 where O(c) is the proposition that c occurs:

(i) O(c) □→ O(e)
(ii) ¬O(c) □→ ¬O(e)

The first of these expresses the counterfactual ‘if it had been the case that c occurred, then it would have been the case that e occurred’ (for short: ‘if c had occurred, then e would have occurred’); the second expresses the counterfactual ‘if it hadn’t been the case that c occurred, then it wouldn’t have been the case that e occurred’ (for short: ‘if c hadn’t occurred, then e wouldn’t have occurred’).28
Lewis (1973a, 1973b, 1979) gives counterfactuals a possible worlds semantics. According to him, a counterfactual A □→ C is true iff either there are no worlds in which A (i.e. A is necessarily false) or there’s some world in which A & C that’s more similar (‘closer’) to our own, actual world (hereafter ‘@’), than is any world in which A & ¬C. Lewis (1973b, 14–15) subscribes to a principle known as Strong Centering, according to which @ is more similar to itself
than any other possible world is to it. Given his semantics for counterfactuals,
this has the immediate implication that any counterfactual with a true anteced-
ent has the same truth-value as its consequent. This means that, where c and
e are actual events (as they must be if they’re to be contenders to stand in a
causal relation in the actual world), the aforementioned counterfactual (i) is
automatically true. This means that whether e causally depends upon c hinges
upon whether counterfactual (ii) is true.
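Lewis’s truth condition can be rendered as a toy decision procedure. In the following Python sketch the worlds, their ‘distances’ from @, and the truth values of A and C at each world are all invented; the finite, distance-based setup also builds in the Limit Assumption, which Lewis’s official semantics does not require:

    # A []-> C is true iff there are no A-worlds, or some A & C world is
    # closer to @ than any A & not-C world is.
    worlds = [
        {"name": "@",  "dist": 0, "A": False, "C": False},  # Strong Centering: @ is closest
        {"name": "w1", "dist": 1, "A": True,  "C": True},
        {"name": "w2", "dist": 2, "A": True,  "C": False},
    ]

    def would(A, C, worlds):
        """Truth value of the counterfactual A []-> C at the actual world."""
        a_worlds = [w for w in worlds if w[A]]
        if not a_worlds:
            return True  # vacuously true: A is necessarily false
        closest_ac = min((w["dist"] for w in a_worlds if w[C]), default=float("inf"))
        closest_anc = min((w["dist"] for w in a_worlds if not w[C]), default=float("inf"))
        return closest_ac < closest_anc

    print(would("A", "C", worlds))  # True: the closest A-world, w1, is a C-world

Strong Centering does its work automatically here: if A were true at @ (distance 0), @ itself would be the closest A-world, so the counterfactual would simply inherit the truth value of C at @.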
Lewis, however, doesn’t equate causation with causal dependence between
events.29 Rather, he takes this merely to be a sufficient condition for causation.
27 Sometimes I’ll say ‘counterfactually depends’ rather than ‘causally depends’ to express this
relation in what follows. Lewis reserves ‘counterfactually depends’ for the relation between the
propositions O(e) and O(c) that obtains when the following pair of counterfactuals are true. I’ll
be more lax in also using it to describe the relation between the corresponding events too.
28 Following Lewis, I’m calling both subjunctive conditionals that have true antecedents and those
with false antecedents ‘counterfactuals’. Lewis gives a uniform semantics to both types of
subjunctive.
29 Strictly speaking, we should say ‘distinct’ events. To give Kim’s (1973b, 571) example: ‘[i]f I
had not written ‘r’ twice in succession, I would not have written “Larry” ’. Yet writing ‘r’ twice
in succession isn’t a cause of writing ‘Larry’ but rather an essential part of it.
Early Preemption
Alice and Bob are hunting deer. They have one gun between them and are
down to their last bullet. They spot a deer. Both are ace shots and the shot
is a simple one. If Alice doesn’t shoot, then Bob will. Alice shoots. The
bullet hits the deer and the deer dies.
In Early Preemption, although Alice’s shot is a cause of the deer’s death, the
deer’s death doesn’t counterfactually depend upon her shot: but for her shot,
Bob would have shot and so the deer would still have died. Bob’s intention to
shoot if Alice doesn’t can be considered the preempted backup. Had Alice not
shot, this would have initiated a process (involving Bob’s shooting) that would
itself have resulted in the deer’s death. However, such a process is stopped
in its tracks before the deer’s death (which is why this counts as a case of
early preemption). Specifically, as soon as Alice shoots, Bob is prevented from
shooting.
Cases of preemption pose a problem for regularity theories. Alice’s shooting
together with other prevailing conditions such as the shot being an easy one,
the gun’s being in good working order, etc., suffices for the deer’s death (recall
that, for now, we’re assuming determinism). The trouble is that Bob’s inten-
tion to shoot if Alice doesn’t, together with his being an excellent shot, Alice’s
being an excellent shot, the shot being an easy one, the gun’s being in good
working order, etc., is also a sufficient condition (it guarantees that, one way
or the other – by means of Alice’s shooting or Bob’s shooting – the deer’s death
will come about). Bob’s intention is a non-redundant element of this sufficient
condition (after all, the condition doesn’t include Alice’s shooting). So Bob’s
intention is an inus condition of the deer’s death. Yet Bob’s intention isn’t a
cause.30
Returning to Lewis’s account, observe that, despite the fact that Bob’s inten-
tion means there’s no counterfactual dependence of the deer’s death on Alice’s
shot, Lewis would claim there’s a chain of counterfactual dependence connect-
ing the two, and this is why Alice’s shot counts as a cause of the deer’s death.
Specifically, consider the event comprising the speeding bullet’s presence at the
mid-point between Alice and the deer. Call this event m. Lewis’s (1973a, 567,
1986b, 200–1) suggestion would be that there’s a two-link chain of counterfac-
tual dependence connecting Alice’s shot and the deer’s death, with the deer’s
30 Although he develops an account of causation that draws upon the notion of an inus condition,
Strevens (2007, esp. 5.4) recognises the need to supplement the notion of an inus condition with
a notion of ‘causal production’ (which he doesn’t define in regularity-theoretic terms) in order
to deal with cases like this.
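Lewis’s strategy, then, is to take causation to be the ancestral (the transitive closure) of causal dependence: c is a cause of e iff a chain of stepwise dependences runs from c to e. A minimal sketch of that closure, with event labels invented for the Early Preemption case:

    # Dependence links: the death depends on m (the bullet's presence at the
    # mid-point), and m depends on Alice's shot.
    depends = {("death", "m"), ("m", "shot")}

    def causes(c, e, links):
        """True iff a chain of dependence links runs from c to e."""
        return (e, c) in links or any(
            causes(c, mid, links) for (eff, mid) in links if eff == e
        )

    print(causes("shot", "death", depends))  # True, via the two-link chain through m

No single link runs directly from the shot to the death, but the two-link chain through m suffices on Lewis’s account.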
Lewis (1979) provides a detailed story about why his possible worlds seman-
tics for counterfactuals implies the ‘non-backtracking’ result.31 His approach
has, however, been criticised by several authors (Elga 2001; Woodward 2003,
ch. 3.6; Kment 2006). A now-popular alternative approach to providing a non-
backtracking semantics is the so-called ‘interventionist’ approach (Goldszmidt
and Pearl 1992; Meek and Glymour 1994; Woodward 2003), which we’ll
examine in Section 4.6.6. But however we understand the semantics for coun-
terfactuals, let’s grant that Lewis’s analysis correctly handles cases of early
preemption (even if it does, rather controversially, commit to causal transitiv-
ity). Still the account runs into trouble in cases of late preemption, as Lewis
(1986b, 203–5) himself admits.
Late Preemption
Alice and Bob are hunting deer. They each have a gun and each gun has
plenty of ammunition. They spot a deer. If Alice doesn’t shoot or if Alice
shoots but fails to kill the deer, then Bob will shoot and kill the deer. Alice
shoots and kills the deer.
It’s worth emphasising how Late Preemption differs from Early Preemption.
In Early Preemption, the backup process (the one via which Bob would have
brought about the deer’s death if Alice hadn’t shot) is cut short early in pro-
ceedings. Once Alice shoots, there’s no possibility of Bob’s shooting (because
there are no bullets left in their one rifle). Since this backup process is cut short
early, the effect (the deer’s death) depends upon the later stages of the process
via which Alice’s shot brings about the effect: for instance, upon m (cf. Lewis
1986b, 200). This allows us to complete a chain of causal dependence running from Alice’s shot, via m, to the deer’s death.
In contrast, in Late Preemption the backup process isn’t cut short until the
effect itself (the death) occurs. That’s because it’s only the effect itself that
prevents Bob from shooting. This means that the effect (the death) doesn’t
counterfactually depend upon any part of the process via which Alice’s shot
brought it about (cf. Lewis 1986b, 203–4). For instance, the death doesn’t coun-
terfactually depend upon the presence of a bullet at the midpoint between Alice
31 In fact, Lewis (1979, 475) claims that his semantics entails the falsity of backtracking coun-
terfactuals only where certain physical asymmetries that ordinarily correspond to the direction
of time obtain. He thinks that if these physical asymmetries broke down – as they might on a
closed time-like curve – then backwards causation might occur and that his semantics would
entail the truth of the corresponding backtracking counterfactuals.
and the deer since, if the bullet hadn’t reached that point, it wouldn’t have hit
the deer and so (seeing that the deer was initially unscathed) Bob would have
shot and killed it.
The counterfactual reasoning from the absence of the bullet at the mid-point
to Bob’s shooting and the deer’s death isn’t in this case the sort of ‘backtracking’
reasoning that Lewis takes to be illicit. That’s because, in Late Preemption,
the bullet’s absence would mean that it wouldn’t (later) hit the deer and so
Bob (later) would have shot. By contrast, in Early Preemption, since Bob was
prevented from shooting by Alice’s very act of shooting, we need to reason
backwards from the bullet’s absence to Alice’s (earlier) not shooting to reach
the conclusion that Bob would have shot and killed the deer.
So invoking transitivity doesn’t solve the problem posed by late preemption.
Lewis’s 1973 account of causation therefore doesn’t handle late preemption
correctly: it fails to diagnose the preempting cause as a cause. Can a counter-
factual account of causation be given that gets cases of late preemption right?
Before seeking to answer that question, let’s briefly consider another sort of
case that arguably poses problems for Lewis’s 1973 analysis.
Symmetric Overdetermination
Alice and Bob are hunting deer. Each has a loaded gun. They spot a deer.
They both shoot, with their bullets piercing the deer’s heart simultane-
ously. Each bullet alone would have been sufficient to bring about the
deer’s death.
No part of the process initiated by Alice’s shot (e.g. the presence of a speeding bullet at the mid-point between Alice and the deer) is such that the
deer’s death depends on it: after all, the process initiated by Bob’s shot was com-
plete and would have sufficed to bring about the deer’s death even had some
part of the Alice-process been missing. Exactly analogous reasoning shows that
the deer’s death neither counterfactually depends upon, nor is connected by a
chain of counterfactual dependence to, Bob’s shot.
Lewis (1986b, 194–5, 207, 211–12) claims that the intuition that symmetric
overdeterminers are causes isn’t strong. He therefore thinks it’s not a signifi-
cant strike against his counterfactual analysis of causation that it doesn’t deliver
the result that they are. But not everyone agrees with this assessment (cf. Hitch-
cock 2007, 522) and there are cases in which courts have recognised symmetric
overdeterminers as causes.32 Nevertheless, since late preemption cases are the
most uncontroversial counterexamples to Lewis’s 1973 account of causation,
we’ll focus on what might be done to deal with these cases, though we’ll see that
some of the potential solutions may help with symmetric overdetermination too
insofar as this is seen as problematic for the counterfactual approach.33
32 E.g. Kingston v Chicago & N. W. Railway [191 Wis. 610, 211 N.W. 913 (1927)].
Note that regularity theories have better prospects for counting overdeterminers as causes since
each overdeterminer is (in the circumstances) sufficient for the effect.
33 There are varieties of ‘overdetermination’ that differ in structure from the ones we’ve dis-
cussed in the last three subsections (e.g. Moore 2009, 418; Kroedel 2008, 130–1) but which
we lack space to discuss here. One particularly well-known variety of overdetermination not
discussed in the main text is so-called ‘trumping preemption’ (Schaffer 2004). However, sev-
eral philosophers (e.g. McDermott 2002, 89–92, 97; Halpern and Pearl 2005, 873–5; Hitchcock
2007, 512n) take the view that ‘trumping preemption’ is really just an instance of symmetric
overdetermination, and so has limited independent diagnostic value.
is somewhat different in manner from the death that would have occurred had
only one bullet been fired.
The response under consideration invokes what Lewis (1986b, 204) calls
the modal ‘fragility’ of the effect: the notion that a counterfactual version
of the effect (e.g. the death of the deer at Bob’s hands) isn’t to be identi-
fied with the actual effect if it is even slightly different in time or manner
from the actual effect. Lewis doesn’t avail himself of this solution because
he rejects extreme standards of event fragility (the adopted standards would
have to be fairly extreme if this solution were to work for all cases of late
preemption, including ones in which the preempted alternative would have
brought about a very similar and only very slightly later version of the effect).
The key reason for his doing so is that adopting extreme standards of fragility
appears to yield spurious cases of causation on the counterfactual approach.
For example, such things as the precise angle at which the deer was stand-
ing or even its precise blood pressure might have made a difference to the
precise time and/or manner of its death, even if they were entirely irrelevant
to whether or not it died. Yet it’s tempting to think that such factors are not
causes.34
34 Lewis (2004b) later came to advocate a counterfactual analysis of causation that appeals to event
fragility in a subtler way. His guiding idea is that a cause ‘influences’ (i.e. makes a difference to)
the timing and manner of its effect in a way that non-causes – such as preempted alternatives
– don’t. Though we lack the space to discuss the details here, this later account suffers from
serious counterexamples, as Bigaj (2012) shows, and has not widely caught on.
scenario described in Early Preemption plays out) and the de facto depend-
ence approach tells us to hold this feature fixed in seeking latent counterfactual
dependence of the deer’s death on Alice’s shot.35
The counterfactual that reveals this de facto dependence of the deer’s death
on Alice’s shot has a more complex antecedent than those to which Lewis
appeals in his analysis of causation. As well as counterfactually varying the
putative cause (Alice’s shooting), the antecedent serves to hold fixed a certain
feature of the actual situation (Bob’s non-shooting). This feature was something
that depended upon the putative cause (because Bob intended to shoot if Alice
didn’t) and therefore would have been absent under the simple counterfactual
supposition that the cause didn’t occur.
Hitchcock (2001a, 275) articulates this point well when he points out that
Lewis – in virtue of supposing that the counterfactuals relevant to analysing
causation don’t backtrack – takes history prior to the occurrence of the event
c to be implicitly held fixed (in virtue of the semantics for the counterfac-
tual) when evaluating ¬O(c) □→ ¬O(e). But Lewis (1979) takes the relevant
semantics to allow foretracking, so that events lying to the future of c are
allowed to vary under the counterfactual supposition that ¬O(c). Yet, as Hitchcock (2001a, 275) observes, 'The necessity of foretracking from c to e does
not prevent us from entertaining a counterfactual of the form ‘If c had not
occurred, but e had occurred anyway, then …’. Let us call a counterfactual
of this sort an explicitly nonforetracking or ENF counterfactual.’ In order
to give a full-fledged account of causation in terms of ENF counterfactuals/
de facto dependence, we need an account of which features of the actual world
should be held fixed. Several authors (e.g. Hitchcock 2001a, 2007; Halpern
and Pearl 2005; Halpern and Hitchcock 2015; Halpern 2016; Weslake 2020)
have appealed to causal models (more precisely: structural equation models)
in giving such an account.
35 Though we won’t go into the detals, Hitchcock (2001a) demonstrates that such an approach
doesn’t commit one to causal transitivity.
36 For simplicity, I’ll assume that the alternative to Bob’s having this intention is his not intending
to shoot under any circumstances. However, if we wanted to model the possibility that (e.g.)
Bob has the intention to shoot come what may, we could deploy a multi-valued variable.
37 Of course, Alice’s shooting might depend upon Bob’s not having shot even earlier, but if we
wanted to represent whether Bob shot even earlier, we’d need an additional variable in our
model. (Variable B can’t serve this function, since B represents a state that depends on A. Given
that this is a token causal scenario, A itself doesn’t depend upon the state represented by B. If
it did, we’d have a causal loop.) In general, whether a variable is exogenous or endogenous
depends on what other variables we include in our model.
EP
Variables: {A, BI, B, D}
Equations: BI = 1, A = 1, B = Min{BI, 1 − A}, D = Max{A, B}
38 As we’ll discuss further in Section 4.6.6, the equation replacement method for evaluating coun-
terfactuals is sometimes taken as modelling what would happen if the variables in the antecedent
SEA
of the counterfactual were set to the specified values by means of ‘interventions’ (Woodward
2003, 98). Glynn (2013) discusses the extent to which the equation replacement method could
alternatively be seen as modelling a version of Lewis’s (1979) counterfactual semantics.
Applied to EP, analysis SEA yields the verdict that A = 1 (rather than A = 0) causes D = 1 (rather than D = 0): that is, that Alice's shooting (rather than not) causes the deer to die (rather than survive).
To see this, focus on the path ⟨A, D⟩ in EP. SEA tells us that A = 1 is a cause of D = 1 if it's true that, when all variables that lie off this path are held fixed at their actual values, if A had taken the value A = 0, then D would have taken the value D = 0. In EP, there are two variables that lie off the path ⟨A, D⟩: namely BI and B, which have BI = 1 and B = 0 as their actual values. Thus the fact that A = 1 is a cause of D = 1 is indicated by the truth relative to EP of the ENF counterfactual A = 0 & BI = 1 & B = 0 □→ D = 0.39 The truth of this ENF counterfactual relative to EP can be verified by employing the equation replacement method described in the previous subsection.
Analysis SEA also yields the correct verdict that BI = 1 isn't a cause of D = 1 (confirming that it's 'merely' a preempted backup). To see this, consider the sole directed path from BI to D in EP: ⟨BI, B, D⟩. The sole variable that lies off this path is A, the actual value of which is A = 1. For BI = 1 to count as a cause of D = 1 according to SEA, it would be necessary that the following counterfactual hold in EP: BI = 0 & A = 1 □→ D = 0. However, it's easy to verify via the equation replacement method that this isn't the case.
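To make the equation replacement method concrete, here is a minimal Python sketch. It encodes EP's equations as reconstructed above; the function name and dictionary-based interface are mine, not part of any standard package:

def solve_EP(replacements):
    """Solve EP's equations, replacing the equation for any variable that
    appears in `replacements` with the stipulated constant value."""
    v = {}
    v['BI'] = replacements.get('BI', 1)
    v['A'] = replacements.get('A', 1)
    v['B'] = replacements.get('B', min(v['BI'], 1 - v['A']))
    v['D'] = replacements.get('D', max(v['A'], v['B']))
    return v

# A = 0 & BI = 1 & B = 0 □→ D = 0 is true in EP, so SEA counts
# A = 1 as a cause of D = 1:
assert solve_EP({'A': 0, 'BI': 1, 'B': 0})['D'] == 0

# BI = 0 & A = 1 □→ D = 0 is false in EP (D remains 1), so SEA
# doesn't count BI = 1 as a cause of D = 1:
assert solve_EP({'BI': 0, 'A': 1})['D'] == 1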
Analysis SEA can also potentially help with Late Preemption,40 an example
which – as we saw – Lewis struggles to deal with. To model Late Preemption it
will be helpful to recall that, since Bob first waits to see whether Alice succeeds
in killing the deer, the deer would have died slightly later if Alice hadn’t killed
it. Suppose the deer would have died by 1:00pm if Alice had shot, but only by
1:01pm if Alice hadn’t shot. Thus we can adopt the following model (‘LP’) of
Late Preemption.41
39 I previously said that A = 0 & B = 0 □→ D = 0 is the counterfactual that the advocate of the de facto dependence approach takes to be indicative of causation of D = 1 by A = 1. And so it would be, according to SEA, if our model only included variables A, B, and D. But what I said previously was a slight simplification in that, once BI is included in our model, it too must be held fixed according to SEA's general recipe for revealing the sort of de facto dependence that's indicative of causation. In this instance, holding fixed BI – although harmless – isn't doing any real work. It's the holding fixed of B at B = 0 that's the key to revealing latent dependence of D = 1 on A = 1.
40 The treatment suggested here follows that of Halpern and Pearl (2005, 863–5) and Hitchcock
(2007, 524–9).
41 It’s worth emphasising how the interpretation of variable B differs in LP as compared to EP.
In LP, B D 1 represents Bob’s intention to shoot iff Alice doesn’t kill the deer (rather than
iff Alice doesn’t shoot). As in the case of EP, I assume for simplicity that the alternative to
Bob’s having the intention he does is for him to have no intention to shoot the deer under any
circumstances.
LP
Variables: {A, BI, B, D_{1:00}, D_{1:01}}
Equations: A = 1, BI = 1, D_{1:00} = A, B = Min{BI, 1 − D_{1:00}}, D_{1:01} = Max{D_{1:00}, B}
Following Hitchcock (2007, 525), we might suppose that 'we will have adequately captured our intuitions about the case if we can show that' A = 1 is a cause of both D_{1:00} = 1 and D_{1:01} = 1.
It’s also easy to see that BI D 1 isn’t a cause of D1W00 D 1 or D1W01 D 1
according to SEA.42 First, since there’s not even a directed path from BI to D1W00
(reflecting the fact that Bob is going to give Alice until at least 1:00pm to kill
the deer before himself shooting), SEA straightforwardly doesn’t count BI D 1
as a cause of D1W00 D 1. Second, though there’s a directed path from BI to D1W01
– namely, hBI; B; D1W01 i – holding fixed the off-path variables A and D1W00 at
their actual values – A D 1 and D1W00 D 1 – doesn’t reveal de facto dependence
of D1W01 D 1 upon A D 1: it’s false that BI D 0&A D 1&D1W00 D 1 D D 0,
as can again be verified via the equation replacement method.
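The same procedure can be run on LP as reconstructed above. In this sketch (again mine), D100 and D101 stand in for D_{1:00} and D_{1:01}:

def solve_LP(replacements):
    """Solve LP's equations, with equation replacement as before."""
    v = {}
    v['A'] = replacements.get('A', 1)
    v['BI'] = replacements.get('BI', 1)
    v['D100'] = replacements.get('D100', v['A'])
    v['B'] = replacements.get('B', min(v['BI'], 1 - v['D100']))
    v['D101'] = replacements.get('D101', max(v['D100'], v['B']))
    return v

# Holding the off-path variables BI and B fixed at their actual values,
# D_{1:01} de facto depends on A, so SEA counts A = 1 as a cause:
assert solve_LP({'A': 0, 'BI': 1, 'B': 0})['D101'] == 0

# Holding the off-path variables A and D_{1:00} fixed at their actual
# values, D_{1:01} doesn't depend on BI, so BI = 1 isn't counted a cause:
assert solve_LP({'BI': 0, 'A': 1, 'D100': 1})['D101'] == 1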
Although SEA deals well with Early Preemption and at least reasonably
well with Late Preemption, it arguably runs into difficulties with Symmetric
Overdetermination. Before demonstrating this and discussing whether there
might be an alternative SEM analysis of causation that fares better, we ought
to return to the question – deferred earlier – of what makes for an apt SEM.
Several criteria for model appropriateness have been suggested in the literature.
Specifically, the following criteria for the variables of such a model have been
suggested:
1. (Partition) The values of each of the variables in the variable set ought
to form a partition (i.e. a set of mutually exclusive and jointly exhaustive
alternatives) (Halpern and Hitchcock 2010, 397–8; Blanchard and Schaffer
2015, 182).
2. (Independence) The values of distinct variables in the variable set shouldn’t
be logically or metaphysically related (Hitchcock 2001a, 287; Halpern and
Hitchcock 2010, 397).
3. (Naturalness) The values of the variables ought to represent reasonably
natural and intrinsic states of affairs (cf. Blanchard and Schaffer 2015, 182).
Since analyses like SEA take all and only values of distinct variables to be
potential causal relata, (Partition) ensures that we don’t thereby miss actual
causal relations because they obtain between the values of a single variable.
The motivation for (Veridicality) should be obvious: if the model entails false
counterfactuals (via the equation replacement method described in Section
4.6.1), then it’s not a reliable guide to genuine causal relations. (Serious Pos-
sibilities) is needed to avoid a variety of counterintuitive results (see, e.g.,
Hitchcock 2001a, 287; Woodward 2003, 86–91). Most straightforwardly, it
helps avoid the overgeneration of cases of causation by absence. For instance,
JFK’s death presumably counterfactually depends upon Paul McCartney’s fail-
ure to get in the way of the oncoming bullets. But most of us would be reluctant
to say that the latter is a cause of the former. This might be explained by the fact
that we view McCartney’s getting in the way as a non-serious possibility. (Seri-
ous Possibilities) ensures that no appropriate model of the events surrounding JFK's death includes a variable one of whose values represents McCartney's getting in the way.
43 The fact that (Naturalness) refers to ‘reasonably’ natural and intrinsic states of affairs introduces
a certain amount of vagueness into the account, but is necessary because demanding perfect
naturalness would presumably rule out high-level states of affairs (i.e. the states – such as price
inflation and asbestos exposure – of concern to the non-fundamental sciences) as causes and
effects, which would be undesirable. Since any account of causation will need a condition like
this, the SEM approach doesn’t suffer a relative disadvantage in this regard.
If absences are unnatural states of affairs (e.g. if they’re disjunctions of positive events, cf.
Lewis 1986b, 189–93), we might replace (Naturalness) with the requirement that each variable
has at most one value representing such a state of affairs if we wish to allow (as we will in the
following) that absences of reasonably natural events should count as causes and effects, and
also that positive events that are causes and effects may be represented by binary variables that
have the absence of these events as one of their values.
44 If one regards appeal to ‘serious possibilities’ in an account of causation as too problematic, one
option would be to simply admit that, e.g., McCartney’s failure to get in the way was a cause
of JFK’s death. If so, as Blanchard and Schaffer (2015, 198) point out, (Serious Possibilities)
‘may be reinterpreted, not as an aptness constraint on models, but as a descriptive psychological
claim about which causal models are most readily available to us when we form our causal
judgements’.
45 (Stability) thus renders the notion of an appropriate model relative to the causal claim being
evaluated.
46 For further discussion, see Hitchcock (2001a, 295–8).
Even if SEA fares well with respect to Early Preemption and Late Pre-
emption, it runs into difficulties with Symmetric Overdetermination. The
trouble is that, in this case, there doesn’t appear to be any feature of the
actual situation that we can hold fixed to reveal de facto dependence. The
following model (‘SO’) would seem to be a reasonable one for Symmetric
Overdetermination:
47 Paul and Hall focus in particular upon the influence that such presuppositions have on our judge-
ments about relations between variables (i.e. what structural equations we judge to be true). This
is liable to affect our judgements about (Veridicality) as well as (Stability).
SO
Variables: {A, B, D}
Equations: A = 1, B = 1, D = Max{A, B}
There’s only one path from A to D and only one variable off this path, namely
B. So the test for causation appealed to by SEA requires that D D 1 de facto
depends upon A D 1 when B is held fixed at its actual value B D 1. But there’s
no such dependence: the deer’s death doesn’t counterfactually depend upon
Alice’s shot when we hold fixed Bob’s shot. The falsity of the counterfactual
A D 0&B D 1 D D 0 can be verified via the equation replacement method
(described in Section 4.6.1) with respect to SO. Strictly analogous considera-
tions show that SEA yields the verdict that B D 1 isn’t a cause of D D 1 either.
This is a problem to the extent that we regard symmetric overdeterminers as
causes.
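The failure can be checked mechanically with the same style of sketch (mine) used for EP and LP:

def solve_SO(replacements):
    """Solve SO's equations (A = 1; B = 1; D = Max{A, B}), with
    equation replacement."""
    v = {}
    v['A'] = replacements.get('A', 1)
    v['B'] = replacements.get('B', 1)
    v['D'] = replacements.get('D', max(v['A'], v['B']))
    return v

# A = 0 & B = 1 □→ D = 0 is false: with Bob's shot held fixed at its
# actual value, the deer still dies, so no de facto dependence on A.
assert solve_SO({'A': 0, 'B': 1})['D'] == 1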
The most obvious way to try to deal with Symmetric Overdetermination
is to liberalise the analysis by allowing that we may sometimes vary features
of the actual situation in seeking latent dependence of effect on putative cause.
This is an approach taken by Hitchcock (2001a, 289) and Halpern and Pearl
(2005), inter alia. For instance, were we allowed to suppose that Bob didn’t
shoot (and hold this fixed by including it in the antecedent of the relevant ENF),
then we’d recover counterfactual dependence of the deer’s death on Alice’s
shooting.
But we don’t want to be too liberal: if we’re allowed to vary just any feature
of the actual situation, we’ll end up generating spurious cases of causation. For
instance, consider the intention of Bob to shoot if Alice doesn’t kill the deer in
Late Preemption. If we hold fixed the non-actual fact that Alice didn’t shoot
(by including it in the antecedent of the relevant ENF), then the deer’s death
comes to counterfactually depend upon Bob’s intention. Yet Bob’s intention
isn’t a cause of the deer’s death.
Briefly, the way this problem is standardly tackled (Hitchcock 2001a, 289; Halpern and Pearl 2005, 853–5) is by noting that, in Late Preemption, if Alice hadn't shot (A = 0), then the causal path between Bob's intention and the deer's death would have been crucially different: Bob would have shot rather than not doing so (i.e. B would have taken B = 1 rather than B = 0 in LP). In contrast, in SO, setting A = 0 doesn't make any difference to the causal pathway between Bob's shooting and the deer's dying (and this is so even if we enriched SO by interpolating variables between B and D, such as one representing the presence of the speeding bullet mid-air between Bob and the deer). Setting A = 0 is therefore permitted in the latter case but not the former.
It was remarked at the outset of this section that counterfactual analyses of cau-
sation have typically been directed at understanding token causation. Having
said that, when it comes to approaches that invoke variables, it’s possible to ana-
lyse a family of causal relations that aren’t in themselves type or token causal
relations, but in terms of which it might be possible to analyse type causation:
namely, a family of relations of causation between variables as opposed to the
(actual) values of those variables. Causal relations between variables are dis-
cussed quite extensively by Pearl (2009), Spirtes et al. (2000), and Woodward
(2003).
Woodward (2003) distinguishes various relations of causation between vari-
ables. Direct causation (Woodward 2003, 55), which is an inherently model-
relative notion of causation, is defined by him so that X counts as a direct cause
of Y in a model iff X is a parent of Y. On the other hand, X is defined as a total
cause of Y iff there’s a possible change to the value of X that makes a difference
to the value of Y (Woodward 2003, 51).48 This isn’t an inherently model-relative
notion of causation.
Finally, X is defined as a contributing cause of Y relative to a model iff there’s
a path P from X to Y and some possible setting of the off-path variables (which
doesn’t have to be their actual values) such that, when the off-path variables are
held fixed at that setting, there’s a possible change to the value of X that makes
a difference to the value of Y (Woodward 2003, 57).49 Though this definition is
model-relativised, one might define X to be a contributing cause of Y simpliciter
iff X is a contributing cause of Y relative to at least one apt model.50
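As an illustration of the difference (a sketch of mine, reusing the reconstructed EP equations): in EP, BI counts as a contributing cause of D relative to the model, but not as a total cause of D, since varying BI changes D only when the off-path variable A is held fixed at the non-actual setting A = 0.

def solve_EP(replacements):
    v = {}
    v['BI'] = replacements.get('BI', 1)
    v['A'] = replacements.get('A', 1)
    v['B'] = replacements.get('B', min(v['BI'], 1 - v['A']))
    v['D'] = replacements.get('D', max(v['A'], v['B']))
    return v

# Total cause? No: with nothing held fixed, changing BI never changes D.
assert solve_EP({'BI': 1})['D'] == solve_EP({'BI': 0})['D']

# Contributing cause? Yes: along the path <BI, B, D>, there's a setting
# of the off-path variable A (namely A = 0) under which changing BI
# makes a difference to D.
assert solve_EP({'A': 0, 'BI': 1})['D'] != solve_EP({'A': 0, 'BI': 0})['D']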
These definitions of various causal relations that might obtain between vari-
ables don’t invoke the actual values of the cause and effect variables. This
sets them apart from definitions of actual causation (i.e. the token causal rela-
tion that has been of central philosophical focus). Still, they don’t necessarily
capture type-causal notions either (cf. Hitchcock 2007, 503–4). Specifically,
there’s no requirement that the variables in question represent type-level phe-
nomena. For example, according to these definitions, A counts as a direct, total, and contributing cause of D_{1:00} in our model LP of Late Preemption. Yet A and D_{1:00} represent quite specific things: namely, whether Alice shoots when she has the chance, and whether the deer is dead at 1:00pm. In general, where
variables represent particulars such as token events, causation between such
variables isn’t ‘type causation’ in the usual sense. Hitchcock (2007, 503–4)
suggests that we might in such cases regard these relations as part of what he
calls the ‘token causal structure’ of a scenario, but not as instances of actual
causation.51
Still, it’s of course possible to use variables to represent things that are non-
particular. For instance, consider the Arrhenius Equation (AE) in chemistry.
The AE shows how the rate of a chemical reaction depends upon temperature.
It can be formulated as follows:
k = A exp(−E_a/RT)    (AE)

Here k is the reaction rate (the amount of reaction product produced per litre of the reaction solution per second), A and E_a are constants whose values depend upon the nature of the reactants, R is the universal gas constant, and T is absolute
temperature. The AE captures the fact that small increases in temperature result
in dramatic increases in reaction rate. AE is a single-equation SEM where k and
T are the variables.
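A quick calculation illustrates the sensitivity (the activation energy and pre-exponential factor below are illustrative values of mine, not drawn from any particular reaction):

import math

R = 8.314        # universal gas constant, J/(mol K)
Ea = 80_000.0    # illustrative activation energy, J/mol
A_factor = 1e13  # illustrative pre-exponential factor, 1/s

def k(T):
    """Arrhenius rate constant at absolute temperature T (kelvin)."""
    return A_factor * math.exp(-Ea / (R * T))

# A mere 10 K rise near room temperature roughly triples the rate:
print(k(298.0))             # ~ 9.5e-2
print(k(308.0) / k(298.0))  # ~ 2.9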
Relative to AE, temperature T counts as a direct, total, and contributing cause
of the reaction rate k by Woodward’s definitions. Moreover, the variables in
AE don’t concern particulars. They are general in the sense that the AE can be
applied to any reaction rather than being limited to a single one. As applied to
variables like these, Woodward’s notions of causation between variables can
reasonably be thought of as capturing type-level relations.
4.6.6 Interventions
Before closing our discussion of the counterfactual approach it’s worth noting
that, with the notion of a SEM in hand, we’re now in a position to more precisely
characterise the notion of an ‘intervention’, which plays an important role in
some theorists’ thinking about the SEM approach to causation, and which we’ll
appeal to again in our discussion of probabilistic causation. Our characterisa-
tion will be a slightly simplified version of that given by Woodward (2003,
94–114).
On Woodward’s approach, interventions are characterised in terms of ‘inter-
vention variables’. The notion of an intervention variable can be understood in
terms of a three-place relation between variables: ‘I is an intervention variable
for X with respect to Y’, where X is the variable that’s the subject of the inter-
vention, and the intervention is one that can test whether Y depends upon X. We
can keep track of this using subscripts: a variable that's a putative intervention variable for X with respect to Y can be denoted I_{X,Y}.
According to Woodward's definition, I_{X,Y} counts as an intervention variable for X with respect to Y iff (a) I_{X,Y} is a contributing cause of X (in the sense defined in the previous subsection); (b) I_{X,Y} acts as a 'switch' for X: that is, there are values of I_{X,Y} (call these 'switching values') such that when I_{X,Y} takes those values, the value of X no longer depends on the values of any variables besides I_{X,Y} that don't lie intermediate on a directed path from I_{X,Y} to X; (c) there's no directed path from I_{X,Y} to Y that doesn't go via X; (d) there's no variable that's an ancestor of I_{X,Y} that lies on a directed path to Y that doesn't run via I_{X,Y} (i.e. there's no 'common cause' of I_{X,Y} and Y).52
52 It seems there’s something of a regress in the offing in the definition of an intervention vari-
able. As we’ll see, intervention variables are supposed to test what counterfactuals are true and
The idea, then, is that there are circumstances under which I_{X,Y} makes a difference to the value of X (condition (a)) and indeed there's a certain set of values of I_{X,Y} such that when I_{X,Y} takes one among those values, the value of X doesn't depend upon those of any other variable besides I_{X,Y} (except any that lie on a path from I_{X,Y} to X) (condition (b)). Moreover, I_{X,Y} isn't a variable that potentially influences Y independently of its influence on X: any influence that I_{X,Y} has on Y goes by way of its influence upon X (condition (c)). Finally, there's no variable that's a 'common cause' of I_{X,Y} and Y so there's no correlation between I_{X,Y} and Y that isn't explained by the influence of I_{X,Y} upon X (condition (d)). The latter two conditions are designed to ensure that any changes in the value of Y associated with changes in I_{X,Y} are entirely due to changes that I_{X,Y} produces in X. This ensures that any such changes in the value of Y genuinely reflect its dependence on X.
It’s consistent with the definition of an intervention variable that, when the
value of IX;Y lies outside the set of ‘switching values’, its value doesn’t make a
difference to that of X. In such circumstances, there might be apt models from
which this variable can be omitted (i.e. left as latent). Of course if the variable
is then ‘added’ to the model, this will require an adjustment to the structural
equations so that the equation for X now reflects its dependence upon IX;Y in
addition to its other parents in the model.
To illustrate, recall the example Early Preemption. In this scenario, the bullet's passing the mid-point between Alice and the deer might be represented by a variable M which takes M = 1 if the bullet indeed passes this mid-point and M = 0 otherwise (see Figure 5). Now imagine a putative intervention on M with respect to D (the variable representing the deer's death) that comprises someone's placing an object that would intercept the speeding bullet before it reached this mid-point. Whether such an object is placed might be represented by a binary variable I_{M,D} that takes value I_{M,D} = 1 if one is or I_{M,D} = 0 otherwise.
Consider whether I_{M,D} is a genuine intervention variable for M with respect to D, the variable representing whether the deer dies. I_{M,D} acts as a 'switch' with respect to M in the sense that, when I_{M,D} = 0, the value of M depends upon the value of A – representing whether Alice shoots – but when I_{M,D} = 1, then M = 0 no matter what the value of A. Moreover, as is implied by its acting as a switch, I_{M,D} is a contributing cause of M (i.e. there are circumstances under
which it makes a difference to the value of M). Now if no one actually places an object to intercept the bullet (indeed, perhaps there's no one present who has any intention or capacity to do so), then it would be entirely reasonable to omit I_{M,D} from our model altogether. However, once I_{M,D} is included (as it is in the model represented in Figure 5), it can be verified that it satisfies the remaining conditions for an intervention variable for M with respect to D. Specifically, there's no directed path from I_{M,D} to D that doesn't go via M and there's no variable that's an ancestor of I_{M,D} that lies on a directed path to D that doesn't run via I_{M,D}.53
Note, by contrast, that if we added a variable that was a parent of A (perhaps
representing whether someone shouts at Alice to stop her from shooting) to our
model then this wouldn’t count as an intervention variable for M with respect
to D (though – so long as such a shout wouldn’t scare off the deer – it may
count as an intervention variable for A with respect to D). That’s because there
would be a directed path from such a variable to D that bypasses M: namely
the one that runs via A and B.
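A sketch (mine) of the expanded model makes the 'switch' behaviour explicit; the equations for M and D below are a reconstruction consistent with the description of Figure 5:

def solve(replacements):
    """EP expanded with the mid-point variable M and the putative
    intervention variable I_MD."""
    v = {}
    v['BI'] = replacements.get('BI', 1)
    v['A'] = replacements.get('A', 1)
    v['I_MD'] = replacements.get('I_MD', 0)
    v['M'] = replacements.get('M', v['A'] * (1 - v['I_MD']))
    v['B'] = replacements.get('B', min(v['BI'], 1 - v['A']))
    v['D'] = replacements.get('D', max(v['M'], v['B']))
    return v

# When I_MD = 0, M tracks A; when I_MD takes its switching value 1,
# M = 0 no matter what the value of A:
assert solve({'I_MD': 0, 'A': 1})['M'] == 1
assert solve({'I_MD': 0, 'A': 0})['M'] == 0
assert solve({'I_MD': 1, 'A': 1})['M'] == 0
assert solve({'I_MD': 1, 'A': 0})['M'] == 0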
When it comes to conditions (c) and (d) for an intervention variable, it's of course not enough that they be satisfied in just any old model that includes I_{X,Y}, X, and Y in its variable set. Rather it will have to be the case that, no matter how many variables we include in our model, then (provided the resulting model satisfies (Veridicality), (Naturalness), etc.) conditions (c) and (d) still hold. That's because if there's a path from I_{X,Y} to Y that bypasses X that could be revealed by the addition of more variables to the model, or if there are unmodeled common causes of I_{X,Y} and Y, then changes in the value of I_{X,Y} won't reliably test the influence of X upon Y.
53 It might be objected that – if in fact there's no one around with the intention and capacity to place an object that would intercept the bullet – a model including I_{M,D} violates (Serious Possibilities). One possible response would be to claim that, while we may wish to think about interventions (which may in some cases correspond to far-fetched possibilities) when considering whether the counterfactuals entailed by a causal model are true, and while to help us do so we may wish to expand the original model to include variables representing these interventions, it isn't the expanded model of which we need (Serious Possibilities) to hold, but only the original model.
So far we’ve only defined the notion of an ‘intervention variable’. But the
definition of an intervention in terms of an intervention variable is simple
enough. Setting aside some subtleties discussed by Woodward (2003, 94–114),
we can think of an intervention on X with respect to Y as simply X’s taking
some value X D x as a result of an intervention variable IX;Y taking one of its
switching values. So, in Early Preemption, IM;D taking IM;D D 1 serves as
an intervention on M with respect to D that sets M D 0. The interventionist’s
proposal is that when evaluating counterfactuals for the purposes of analyzing
causation, we should suppose their antecedents are realised by interventions.
This is taken to be equivalent to adopting the ‘equation replacement’ method
of evaluating counterfactuals introduced in Section 4.6.1. That’s because both
methods involve a change to the target variable, X, that preserves the functional
relationship between X and Y as well as affecting neither the values of ances-
tors of Y that don’t lie on a pathway between X and Y nor any influence that
those ancestors have on Y that isn’t mediated by X.54 In the next section, we’ll
see that the notion of an intervention can also potentially help with analysing
probabilistic causation.55
5 Probabilistic Causation
David Hume – whose regularity theory was examined in Section 2 – wrote
at a time when Newtonian mechanics ruled the roost. While it has recently
been argued that Newtonian mechanics isn’t strictly deterministic,56 it can be
treated as such for most intents and purposes, and in any case doesn’t yield well-
defined non-trivial (i.e. non-0 or 1) probabilities for the unfolding of events
over time. At a time when it was reasonable to think that the universe was
fundamentally deterministic, it might have seemed plausible to think that all
causes are, in the circumstances in which they occur, lawfully sufficient for
their effects.
But Newtonian mechanics is now known not to be strictly correct, and has
been supplanted by quantum mechanics (QM) and GTR. On the orthodox
54 The assumption that it’s possible to intervene upon each variable V in a model of a causal system
– thus overriding the usual functional dependence described by the equation for V – without
impacting the functional relationships described by any of the other equations in the model is
referred to as the assumption of ‘modularity’. For a defence of this assumption, see Hausman
and Woodward (1999); for criticism, see Cartwright (2002, 2004).
55 Note that, on the foregoing characterisation of an intervention, interventions needn’t be human
actions. Rather, natural processes can serve as interventions provided they have the formal char-
acteristics described in this section (see Woodward 2003, 103–4). So there’s nothing particularly
anthropocentric about adopting an interventionist semantics for the counterfactuals needed to
analyse causation.
56 See Earman (1986, esp. 29–40), Malament (2008, 803–4) and Norton (2008).
57 As Lewis (1986b, 180–4) argues, because the probabilities involved here are (assuming ortho-
dox QM) objective, it’s simply wrong to think that, although the bomb would still have had
a certain (small) probability of exploding, there’s nevertheless some fact of the matter about
whether it would or wouldn’t have exploded in such circumstances.
58 One might object as follows (thanks to an anonymous referee for pressing me on this point).
Suppose that, for example, the bomb’s detonation mechanism consists in the shooting of one
piece of U-235 into another and that in the actual world the impact between the two pieces occurs
at 12 noon. Then, one might argue, if an insignificant number of U-235 atoms in the bomb’s
core in fact spontaneously decayed prior to 12 noon, then it’s true after all that, if I hadn’t placed
my radioactive material near the Geiger, then the bomb wouldn’t have exploded. A worry about
this response, however, is that there would still have been a chance of significant spontaneous
decay occurring just after 12 noon. One may be forced to accept reasonably stringent standards
of event fragility if one is to maintain that the resultant explosion is a different event from that
which occurred in the actual world and therefore that the explosion counterfactually depended
upon my action.
59 These questions are discussed by, inter alia, Albert (2000), Loewer (2001), and Callender and
Cohen (2010).
P(O(e)|O(c)) > P(O(e)|¬O(c))    (1)

This says that the probability of e's occurring conditional upon c's occurring is greater than the probability of e's occurring conditional upon c's not occurring:
in Probabilistic Bomb, for example, the probability of the bomb’s explod-
ing conditional upon my placing the radioactive material near the Geiger is
greater than the probability of the bomb’s exploding conditional upon my not
doing so.
There are, however, at least three problems with understanding probability-
raising in terms of conditional probabilities if causation is, in turn, to be
understood in terms of probability-raising (see Lewis 1986b, 178–9). The first
60 This is the more traditional approach, adopted by Suppes (1970) and Reichenbach (1971),
among others.
61 Probabilistic type-causation is discussed in Section 5.6.
also helps with the problem of common causes. If, say, we take the probabilities
relevant to assessing whether the barometer reading was a cause of the storm to
be those that obtain immediately before the barometer indicates a storm, then
since the atmospheric pressure has already fallen by that time (and so has proba-
bility 1 of doing so), the barometer reading by then doesn’t raise the probability
of the atmospheric pressure falling and hence doesn’t raise the probability of
the storm.64
There is, however, a third problem with understanding probability-raising in terms of conditional probabilities. This problem is more technical. Conditional probabilities are standardly defined as a ratio:

P(A|B) = P(A&B)/P(B)    (3)
But ratios are undefined when the denominator is equal to zero. Now suppose that event c has probability 1 of occurring shortly before it does occur. Then, at that time, P(¬O(c)) = 0 and so the conditional probability on the RHS of Inequality 1 is undefined. The concern is thus that, if probability-raising is understood in terms of conditional probabilities, we appear not to be able to say that events that have probability 1 (shortly before their occurrence) raise the probability of (and therefore cause) anything.
It’s possible that, in our world, no event has probability 1 before – even
shortly before – it occurs. But even if so, wouldn’t we like our probabilistic
analysis of causation to apply to deterministic worlds too? A plausible thought
e’s causes won’t yet have occurred by that time and might retain a probability less than 1. If
so, we’re liable to get the result that e raises the probability of those causes in the conditional
probability sense.
64 The solution under consideration in the last two paragraphs of the main text has appealed to the notion that objective chance is time-relative. Certainly we often speak as though this is so. For instance, as I write, it's reasonable to say 'the chance of average global temperatures increasing by > 0.5°C by 2050 is increasing each day the international community puts off tough decisions'. But we can allow this without committing to the notion that time-relativity is a fundamental feature of chance. For example, Hoefer (2007, 562) regards chances as relativised to chance setups (e.g. a particular coin flip might be a chance setup relative to which the chance of heads is 0.5) and not as in any fundamental way relativised to times (Hoefer 2007, sect. 3.2). Commenting on David Lewis's view of chances as time-relative, Hoefer (2007, 564–5) says that '[f]or Lewis, a non-trivial time-indexed objective probability Pr_t(A) is, in effect, the chance of A occurring given the instantiation of a big setup: the entire history of the world up to time t.' (Hoefer himself makes the plausible allowance that objective chances may be defined given setups that are more limited: for instance, there's a well-defined chance of a silver atom's being deflected in a particular direction given that it's fired through a Stern-Gerlach device.) One way of construing this is to adopt an analysis of time-relativised objective probabilities according to which P_t(A) =_def P(A|H_t), where H_t is a proposition characterising the entire history of the world through time t. That is, objective probability isn't inherently time-relativised, but a time-relativised notion of objective probability can be defined by conditioning the objective probability distribution upon history up to the time in question.
is that deterministic worlds are simply the special case where the set of each
event’s causes raises its probability all the way to 1. But in a deterministic
world each event has probability 1 of occurring (shortly before it occurs). So it
seems that no event could raise the probability of any other in the conditional
probability sense. Since there surely is causation in deterministic worlds (if
Bohmian mechanics turns out to be true then ours is a deterministic world – at
least at the fundamental level), a probabilistic analysis that invoked conditional
probabilities would fail to apply to some possible cases of causation.65
The counterfactual approach to probability-raising, advocated by Lewis
(1986b, 175–9), seemingly overcomes these difficulties. To see how the counterfactual approach interprets probability-raising, suppose that c and e are actual events and that, at a time t just after the occurrence of c, the (unconditional) probability of e's occurring is P_t(O(e)) = x (the subscript to the probability function indicates that this is the objective probability distribution that obtains at t). Then consider the following counterfactual:

¬O(c) □→ P_t(O(e)) < x    (4)

This says that, if c hadn't happened, then the probability at time t of e occurring would have been less than x (the actual probability of e at time t). This is a counterfactual understanding of what it is for c to raise e's probability.
This approach has the capacity to overcome the problems associated with the conditional probability approach (Lewis 1986b, 175–9). For, firstly, the fact that ¬O(c) has zero probability in the actual world doesn't imply that there's no world in which it's true, and doesn't stand in the way of O(e) having a well-defined probability in such a world. For instance, in a deterministic world, it's plausible that, where c and e are actual events and c is a non-overdetermining cause of e, then P(¬O(c)) = 0 and P(O(e)) = 1 and yet, evaluated with respect to such a world, it's a true counterfactual that if c hadn't occurred, then the probability of e would have been 0.
Second, assuming counterfactuals don’t backtrack, this approach avoids the
problem of effects raising the probability of their causes and also the prob-
lem of common causes. That’s because, on the non-backtracking reading of
counterfactuals, it’s not true that (for example) if the barometer hadn’t said
there was going to be a storm, then the probability of the atmospheric pressure
having earlier fallen would have been lower. Rather, because the atmospheric
pressure has already fallen, its probability of doing so is 1 by a time t just after
65 At best it would seem that it could serve as an empirical analysis of causation, in the sense
discussed in Section 3.4.1.
the barometer says there’s going to be a storm, and it’s also 1 at t in the near-
est worlds (according to a similarity metric that implies non-backtracking) in
which the barometer doesn’t indicate that there’s going to be a storm. Moreo-
ver, because the barometer reading doesn’t raise the probability of the fall in
atmospheric pressure in the counterfactual sense, it doesn’t raise the probability
of the storm either.
As in the deterministic context, one can achieve the desired non-backtracking
results about counterfactuals by appealing to interventions (as an alternative to
Lewis’s similarity semantics). Though it will be easier to be formally precise
about the notion of an ‘intervention’ in the probabilistic case once we have
the notion of a probabilistic causal model – to be introduced in Section 5.3
– in hand, we can for the time being just think of interventions as exogenous
manipulations of variable values. As in the deterministic case, variables can be
used to represent the occurrence or non-occurrence of events.
Goldszmidt and Pearl (1992) introduce a special notation do(X = x) to represent that the value of variable X is set to X = x by means of an intervention. They then use P(·|do(X = x)) to represent the probability distribution that would result from setting X to X = x by means of an intervention. Using that notation, the interventionist counterfactual approach to probability-raising says that – where C and E are binary variables which each have the possible values 1 and 0 – C = 1 raises the probability of E = 1 iff:

P(E = 1|do(C = 1)) > P(E = 1|do(C = 0))    (5)

Despite the resemblance in the notation, terms like P(E = 1|do(C = 0)) don't represent conditional probabilities. Rather, they represent counterfactual probabilities: the probability for E = 1 that would obtain if C were intervened upon to set C = 0. Thus Inequality 5 says that the probability for E = 1 that would obtain if C were intervened upon to set C = 1 is higher than the probability of E = 1 that would obtain if C were intervened upon to set C = 0.66 In virtue of the non-backtracking character of interventionist counterfactuals, this inequality won't obtain if the events represented by C = 1 and E = 1 are independent effects of a common cause, or if the former is an effect of the latter.
66 Given that C = 1 must be the actual value of C if (the state represented by) C = 1 is to be an actual cause of (the state represented by) E = 1, couldn't we simply have the actual unconditional probability P(E = 1) that obtains shortly after the occurrence of the state represented by C = 1 figure on the LHS of Inequality 5 (as opposed to the probability for E = 1 that would have obtained under an intervention setting C = 1)? The answer is 'yes' if one is considering whether C = 1 straightforwardly raises the probability of E = 1. However, as we'll see in the next subsection, some approaches to probabilistic causation appeal to a notion of latent probability-raising that's revealed by holding certain other variables fixed (by interventions). On some approaches we're allowed to hold common causes of C and E fixed at non-actual values, in which case an intervention may well be needed to return C to its actual value.
5.2 Difficulties
Even when understood in a sense that rules out probability-raising between
independent effects of a common cause and probability-raising of causes by
their effects, there are (insurmountable) difficulties for any account that seeks
simply to identify causation with probability-raising. Cases of probabilistic pre-
emption can be used to illustrate that (even when understood in such a sense)
probability-raising is neither necessary nor sufficient for causation (Menzies
1989, 1996). An example is provided by the following variant on Probabilistic
Bomb.
Probabilistic Preemption
P(E = 1|do(M = 1)) < P(E = 1|do(M = 0))    (6)

That is, my placing my U-232 near the Geiger lowers the probability of the
bomb’s exploding. That’s because it strongly lowers the probability of your
placing your more potent Th-228 near the Geiger (your Th-228 is more potent
because Th-228 has a shorter half life than U-232 and your chunk of Th-228
contains at least as many atoms as my chunk of U-232). Probability-raising
(even in the interventionist counterfactual sense) is therefore unnecessary for
causation.
Probabilistic Preemption also illustrates the fact that probability-raising (even in the interventionist counterfactual sense) isn't sufficient for causation. Specifically, notice that your decision (D = 1) to place your Th-228 near the Geiger iff I don't place my U-232 there raises the probability of the explosion.67 That is:

P(E = 1|do(D = 1)) > P(E = 1|do(D = 0))    (7)
67 At least if we assume that the alternative is for you to decide not to place your Th-228 near the
Geiger come what may. More will be said about causal contrastivity in the probabilistic case in
Section 5.4.
68 Cases of probability-raising without causation are discussed by Menzies (1989), Schaffer
(2001), and Hitchcock (2004), inter alia. Cases of causation without probability-raising are
discussed by Hesslow (1976), Rosen (1978), and Salmon (1980), inter alia.
69 Such an approach is taken by Hitchcock (2001b), Kvart (2004), and Fenton-Glynn (2017),
inter alia. Cartwright (1979, 423) endorses an an analogous approach to understanding type-
causation: specifically that a type-level factor C is a cause of a type-level factor E iff ‘C increases
the probability of E in every [population] which is otherwise causally homogenous with respect
to E’, where a situation is ‘causally homogenous’ with respect to E iff all other causal factors
for E are held fixed. Note that in the case of type causation it generally makes no sense to speak
of holding factors fixed ‘at their actual values’, since the values will vary across populations.
So Cartwright effectively requires that C increases the probability of E no matter what values
we hold these other factors fixed at. For a critique of this requirement, see Carroll (1991). Eells
(1991, 86–7) offers a similar definition of what he calls ‘positive causal relevance’, but also
defines ‘negative causal relevance’ (when C lowers the probability of E no matter what values
we hold these other factors fixed at), and ‘mixed causal relevance’ (e.g. when C raises the prob-
ability of E for some assignments of values to these other factors, but fails to raise it for others).
although my placing my U-232 near the Geiger doesn't raise the probability of the bomb's exploding, latent probability-raising is revealed by holding fixed the fact that you don't place your Th-228 there. On the interventionist approach, 'holding fixed' your not placing your Th-228 there means intervening on the variable Y to keep it at Y = 0 whilst varying the value of M by means of a further intervention. The result, captured by Inequality 8, is that latent probability-raising is revealed:

P(E = 1|do(Y = 0 & M = 1)) > P(E = 1|do(Y = 0 & M = 0))    (8)
This inequality says that the probability of the explosion if interventions are
made to ensure that you don’t place your radioactive material near the Geiger
but I place mine there is higher than the probability of the explosion if inter-
ventions are made to ensure that neither of us places our radioactive material
near the Geiger.
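Under illustrative probability assignments (mine; the example as described fixes only the qualitative relationships), a simulation exhibits the latent probability-raising that Inequality 8 reports:

import random

def sample(do_M, do_Y=None):
    """One draw from a toy version of Probabilistic Preemption.
    All numerical probabilities are illustrative."""
    D = 1                 # your decision, as described in the example
    M = do_M              # my placing my U-232, set by intervention
    if do_Y is not None:
        Y = do_Y          # your placing your Th-228, held fixed
    else:
        Y = int(random.random() < (0.95 if (D and not M) else 0.01))
    # your Th-228 is the more potent source
    p_T = 0.9 if Y else (0.6 if M else 0.01)
    T = int(random.random() < p_T)
    E = int(random.random() < (0.99 if T else 0.0))
    return E

def p_E(do_M, do_Y, n=100_000):
    return sum(sample(do_M, do_Y) for _ in range(n)) / n

# Inequality 8: holding fixed by intervention that you don't place your
# material (Y = 0), my placing mine raises the probability of explosion.
print(p_E(do_M=1, do_Y=0))  # ~ 0.59
print(p_E(do_M=0, do_Y=0))  # ~ 0.01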
Just as the features of the actual situation to be held fixed in seeking de facto
dependence in the deterministic case could be more clearly identified with the
aid of SEMs, so the features of the actual situation to be held fixed in seeking de
facto probability-raising can be more clearly identified with the help of prob-
abilistic causal models. Though the converse problem of probability-raising
without causation doesn’t have such a clear analogue in the deterministic case
– it’s much more plausible to take counterfactual dependence to be sufficient
for causation than it is to take probability-raising to be sufficient – it turns out
that appeal to probabilistic causal models opens up possible avenues for dealing
with the problem of probability-raising without causation too.
We’ll return to the question of whether (latent) probability-raising rather than, say, probability-
lowering should – as many accounts seem to assume – play a special role in our causal thinking
in Section 5.7.
70 I’ll use asterisks to indicate that a model is a probabilistic model, rather than a SEM.
PP*
Variables: {M, D, Y, T, E}
That is, if interventions occur to ensure that (i) you don’t make the decision
that you do; (ii) you don’t place your radioactive material near the Geiger;
and (iii) the bomb doesn’t explode, then interventions changing whether or
not I place my radioactive material near the Geiger make a difference to the
probability that the threshold is reached.71 It’s key here that an intervention to
prevent the bomb exploding would be something – such as the severing of the
wire between the Geiger and the bomb – that doesn’t affect whether the thresh-
old is reached. This is needed to secure the relevant non-backtracking reading
of the counterfactuals. The justification for the inclusion of each of the other
arrows in Figure 6 is analogous to the justification for the inclusion of that
from M to T.
Now consider why there isn’t, for instance, an arrow from Y to E. That’s
because there’s no assignment of values to all other variables in the model
such that, when the variables are held fixed at those values by interventions,
intervening on Y makes a difference to the probability distribution over E. In
particular, any such assignment includes an assignment of a value to T but,
when the value of T is held fixed by an intervention (either at 1 or 0), then the
value of Y no longer has a bearing on the probability that E = 1. Next, consider
why there isn’t an arrow from D to M. This is simply because your decision
is completely irrelevant to my action and can’t be made probabilistically rele-
vant to it by holding fixed by interventions the other variables in the model.72
Finally, consider why there isn’t an arrow from E to T. This is because, as men-
tioned in the previous paragraph, an intervention on E would only count as such
if it doesn’t affect T. When the value of E is changed in such a way, its value
71 It’s important to say something about which time’s probabilities are in question. One suggestion
(assumed in what follows) is that, where Xi represents an event that occurs at t then, in evaluating
whether there’s an arrow from Xi to Xj , the probabilities to consider are those that would obtain
just after t if Xi had been intervened on and interventions had already occurred by t to fix the
values of all variables in the model besides Xj . The fact that some of those other variables
represent events that occur later than t doesn’t stand in the way of interventions fixing their
values having occurred prior to t. For example, the wire connecting the Geiger and the bomb
could already have been severed to ensure that the bomb doesn’t explode (an intervention on
the value of E) by the time I place my radioactive material near the Geiger.
One might wonder whether, in a probabilistic world, it’s reasonable to suppose that an inter-
vention occurring by t could deterministically fix the value of a variable E representing an event
occurring sometime later. In fact, it’s not strictly necessary to appeal to deterministic interven-
tions to get the results described in the main text. For instance, an intervention that (merely)
sufficiently lowered the probability of your placing your radioactive material near the Geiger
(even in the circumstance in which I don’t place mine there) would ensure that my action raises
the probability that the threshold is reached. Still, the complication of probabilistic interventions
isn’t an avenue we’ll pursue further here.
72 In the language of graph theory, Y acts as an ‘unshielded collider’ for D and M (Spirtes et al.
2000, 10): that is, there are arrows from each of D and M to Y but there’s no arrow connecting
D and M. While probabilistically independent variables can be rendered dependent by condi-
tioning upon an unshielded collider, they can’t be rendered dependent by intervening on an
unshielded collider.
has no bearing upon that of T (no matter what values we hold the other vari-
ables in the model fixed at). Each of the other arrow absences in Figure 6 has
a justification analogous to the justification for the absence of an arrow from Y
to E, from D to M, or from E to T.
Just as, in seeking to analyse deterministic causation in terms of SEMs, appeal to an 'apt' SEM is needed, so too in seeking an analysis of (prob-
abilistic) causation in terms of probabilistic models, the notion of an ‘apt’
probabilistic model is needed. The requirements (Partition), (Independence),
(Naturalness), and (Serious Possibilities) placed on SEMs carry across to prob-
abilistic models without need of modification (and with the same motivation).
(Veridicality) however needs some modification, or at least elaboration, to make clear what it means for a probabilistic model to 'entail only true counterfactuals'. The requirement (Veridicality*) is what we need to substitute in its place.
The requirement (Veridicality*) is thus the requirement that the model (specifically its do(·) function) entail true counterfactuals about what the objective probabilities would be under all combinations of values of the variables in the model's variable set. The models that I consider in what follows all satisfy (Veridicality*). Specifically, in describing various causal scenarios, I'll stipulate what the objective probabilities would be under various counterfactual suppositions about their variables. I shall then only describe models of these scenarios that entail the correct values for these counterfactual chances.
The only requirement left to discuss now is (Stability). This can be applied to probabilistic models in exactly the same form (and with the same motivation) as it was applied to SEMs in Section 4.6.3, except that its reference to (Veridicality) should be replaced by a reference to (Veridicality*).
PC
75 One disanalogy is that, while SEA builds in contrastivity on both the cause and the effect sides,
PC builds in contrastivity on the cause side only. Though it’s possible to incorporate contrastiv-
ity on the effect side too, this is a bit less natural in the probabilistic case where there’s often not
a determinate fact about what value the effect variable would have taken (but only a probability
distribution over alternative values) had the cause variable taken its contrast value.
P(E = 1|do(D = 1 & Y = 0 & M = 1)) > P(E = 1|do(D = 1 & Y = 0 & M = 0))    (10)
This simply says that, given your decision but your failure to place your
material near the Geiger, the probability of the explosion is higher if I place
my material there than if I don’t. This is of course true in Probabilistic
Preemption.
Analysis PC is, however, in danger of giving the incorrect result that your decision to place your material near the Geiger iff I don't place mine there, D = 1, is also a cause of the explosion, E = 1. To see this, first note that it's presumably the case that if both of us place our material near the Geiger then the probability of the threshold being reached is higher than if I alone place my material there. Next note that, since the case is probabilistic, we might reasonably allow that your decision merely makes it very likely that you won't place your radioactive material near the Geiger if I do rather than deterministically preventing you from doing so. But now consider the path ⟨D, Y, E⟩ from D to E. There's one variable – M – that lies off this path. But it could be that D = 1 raises the probability of E = 1 even holding M fixed at its actual value M = 1.76
P(E = 1|do(M = 1 & D = 1)) > P(E = 1|do(M = 1 & D = 0))    (11)
That is, holding fixed that I place my material near the Geiger, your decision
still raises the probability of the bomb’s exploding because there’s some chance
given your decision that you’ll place your material near the Geiger too. In this
case, PC counts your decision as a cause even though you don’t in fact place
your material near the Geiger.
To think about how to modify PC to give a more adequate analysis of causa-
tion, it helps to consider why we don’t regard your decision as a cause. The key
reason is presumably that your placing your material near the Geiger (some-
thing which you don’t in fact do) is an essential part of the causal process
via which your decision had the potential to bring about the explosion. Of
76 To secure this result we need to assume that the alternative to your making the decision that you
do is for you to make some decision (e.g. to simply walk away from the scene) that makes it
less likely than it in fact is that you will also place your material near the Geiger. Now we could
either simply stipulate this, or we could model the case with a multi-valued variable representing
the various possible decisions open to you. We could then make the explicitly contrastive point
that PC (incorrectly) treats your making the decision that you do rather than (e.g.) deciding to
walk away as a cause of the bomb’s explosion.
course, it wouldn’t be very helpful to simply build into our analysis the require-
ment that ‘there be a complete causal process’ from putative cause to putative
effect without further characterisation of what a ‘causal process’ is and what
it is for such a process to be incomplete.77 And, as discussed in Section 1, it
seems impossible to give an account of causal processes without falling back on
counterfactuals (which, in the probabilistic case, might include counterfactuals
about probabilities).
Fortunately, we can sidestep these issues because it appears there’s a probabi-
listic symptom of the incompleteness of a causal process. Take the relationship
between your decision and the bomb’s explosion and note that, although the
probability of the bomb’s exploding may (even holding fixed my placing my
radioactive material near the Geiger) be higher if you take your decision than
if you don’t, the probability of the bomb’s exploding given that you take your
decision but you don’t place your radioactive material near the Geiger is (even
holding fixed my placing my material near the Geiger) no higher than if you’d
never taken your decision in the first place. That is, although Inequality 11
holds, it’s also the case that:
P(E = 1|do(M = 1 & D = 1 & Y = 0)) ≤ P(E = 1|do(M = 1 & D = 0))    (12)

By contrast, the de facto probability-raising of E = 1 by M = 1 is robust in the corresponding sense:

P(E = 1|do(D = 1 & Y = 0 & M = 1 & T = 1)) > P(E = 1|do(D = 1 & Y = 0 & M = 0))    (13)
In words, holding fixed the fact that you make your decision but don’t place
your material near the Geiger, the probability of the bomb’s exploding is (not
only higher if I place my material there than if I don’t, but is) higher if I place
my material there and the threshold is reached than if I don’t place my material
there in the first place.
This suggests that we might try to revise PC so that the revised analysis
requires not merely that there’s de facto probability-raising of effect by cause,
but also that this probability-raising be suitably robust. This is the approach
taken by Fenton-Glynn (2017). The following is a slightly streamlined version
of that analysis.
PC1
Analysis PC1 is complex and one can best get to grips with it by seeing it in action. To start with, observe that it yields the correct results about Probabilistic Preemption. First, it correctly counts M = 1 as a cause of E = 1. To see this, consider the path ⟨M, T, E⟩ in PP*. There are two variables – D and Y – that lie off this path, and one – T – that lies intermediate between M and E upon it. The set of intermediate on-path variables is therefore just {T}, which has two subsets: {T} and ∅. Inequality 10 shows that M = 1 raises the probability of E = 1 when we hold the off-path variables fixed at their actual values. This de facto probability-raising is trivially robust (in the sense captured by Inequality IN1 of PC1) relative to the empty subset of {T}. But it's also robust relative to the non-empty subset of {T}. This is indicated by the holding of Inequality 13. The fact that the de facto probability-raising of E = 1 by M = 1 is robust relative to both subsets of the set of intermediate on-path variables, {T}, is precisely what is required for PC1 to diagnose M = 1 as a cause of E = 1.
Second, PC1 yields the correct result that D = 1 isn't a cause of E = 1. To
see this, note that ⟨D, Y, T, E⟩ is the only path from D to E in PP. The variable
M is the only one that lies off this path and its actual value is M = 1. Moreover,
although D = 1 may raise the probability of E = 1 when we hold fixed that
M = 1, as revealed by the holding of Inequality 11, this probability-raising
isn't robust relative to the subset {Y} of the intermediate on-path variables, as
is revealed by the holding of Inequality 12.
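The quantification over subsets of intermediate on-path variables can also be rendered programmatically. The following Python fragment is a rough sketch of the robustness requirement as just described, not Fenton-Glynn's official formulation; p is assumed to be a user-supplied function that returns the probability of the effect under a dictionary of interventions (for instance, a suitably wrapped version of the toy model above).

```python
from itertools import chain, combinations

def subsets(variables):
    """Every subset of `variables`, including the empty set."""
    return chain.from_iterable(combinations(variables, r)
                               for r in range(len(variables) + 1))

def robustly_raises(p, cause, mids, off_path, actual):
    """Rough rendering of PC1's requirement relative to one path: with the
    off-path variables held at their actual values, cause = 1 must raise the
    probability of the effect relative to cause = 0, and must continue to do
    so when any subset of the intermediate on-path variables (`mids`) is
    additionally held at its actual values. `p(do)` is assumed to return the
    probability of the effect under the intervention dictionary `do`."""
    held_off = {v: actual[v] for v in off_path}
    baseline = p({cause: 0, **held_off})
    return all(p({cause: 1, **held_off, **{v: actual[v] for v in s}}) > baseline
               for s in subsets(mids))
```

Applied to the toy model, the check succeeds for M = 1 along ⟨M, T, E⟩ but fails for D = 1 along ⟨D, Y, T, E⟩: the empty subset passes, but the subset {Y} triggers exactly the failure recorded by Inequality 12.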
Probabilistic Preemption is an early preemption case. But it’s quite easy
to verify that PC1 yields the correct results in certain cases of late preemption
too. Consider Figure 3 from Section 4.6.2. This was a graph of the SEM that we
constructed for Late Preemption. But Figure 3 could just as well represent a
version of Late Preemption which differs from the original in that each of the
relations is merely probabilistic: specifically, Bob’s intention merely makes it
likely that Bob will shoot if the deer isn’t dead by 1:00pm, Alice’s shot merely
makes it likely that the deer will be dead by 1:00pm, Bob’s shooting would
merely make it likely that the deer would be dead by 1:01pm if Alice didn’t
shoot. In such a case, PC1 entails that Alice's shot A = 1 was a cause of
D_{1:01} = 1. To see this, consider the path ⟨A, D_{1:00}, D_{1:01}⟩. A = 1 raises
the probability of D_{1:01} = 1 when we hold the two off-path variables – BI and
B – fixed at their actual values BI = 1 and B = 0. That is:

P(D_{1:01} = 1 | do(A = 1 & BI = 1 & B = 0)) > P(D_{1:01} = 1 | do(A = 0 & BI = 1 & B = 0))
That is, given that Bob intends to shoot if the deer isn’t dead by 1:00pm, but in
fact Bob doesn’t shoot, the probability that the deer is dead by 1:01pm is higher
if Alice shoots than if she doesn’t.
This de facto probability-raising is robust in the sense that it continues to hold
when we take into account the actual value of the (only) intermediate on-path
variable D_{1:00}, namely D_{1:00} = 1. That is:

P(D_{1:01} = 1 | do(A = 1 & D_{1:00} = 1 & BI = 1 & B = 0)) > P(D_{1:01} = 1 | do(A = 0 & BI = 1 & B = 0))
This says that, holding fixed that Bob intends to shoot if the deer isn’t dead by
1:00pm but in fact Bob doesn’t shoot, the probability that the deer is dead by
1:01pm is higher if Alice shoots and the deer is dead by 1:00pm than if simply
Alice doesn’t shoot in the first place.
Analysis PC1 also implies that Alice's shooting was a cause of the deer's
being dead by 1:00pm. To see this, consider the path ⟨A, D_{1:00}⟩. Holding fixed
the off-path variables BI, B, and D_{1:01} at their actual values BI = 1, B = 0, and
D_{1:01} = 1, A = 1 raises the probability of D_{1:00} = 1. That is:

P(D_{1:00} = 1 | do(A = 1 & BI = 1 & B = 0 & D_{1:01} = 1)) > P(D_{1:00} = 1 | do(A = 0 & BI = 1 & B = 0 & D_{1:01} = 1))
This might seem a bit of a 'cheat' because it relies on the fact that the connection
between the variables D_{1:00} and D_{1:01} is deterministic: given that the deer is
dead at 1:00pm, it's dead at 1:01pm with probability 1. But note that, even if this
link were merely probabilistic (perhaps there's some probability of the deer's
being resuscitated between 1:00pm and 1:01pm!), PC1 would still yield the
correct result. Specifically, in that case, although BI = 1 would de facto raise
the probability of D_{1:01} = 1 (Equality 17 would – assuming that, given Bob's
intention, there's some chance of his shooting even if the deer is dead at 1:00pm
– now be converted into an inequality, with the term on the LHS being greater
than the term on the RHS), this de facto probability-raising wouldn't be robust
once we took account of the actual value – B = 0 – of the intermediate on-path
variable in the usual way. That is:

P(D_{1:01} = 1 | do(BI = 1 & B = 0 & A = 1 & D_{1:00} = 1)) ≤ P(D_{1:01} = 1 | do(BI = 0 & A = 1 & D_{1:00} = 1))
78 Note that, since the causal process connecting Alice's shot to the deer's being dead at 1:00pm
is complete, interpolating variables along the path ⟨A, D_{1:00}⟩ wouldn't affect the fact that this
probability-raising is robust. For instance, (holding the off-path variables fixed at their actual
values) the probability of the deer's being dead at 1:00pm would be higher if Alice shot and her
speeding bullet passed mid-air between her and the deer than it would be if Alice hadn't shot
in the first place. So interpolating a variable M representing the presence of the speeding bullet
along the path ⟨A, D_{1:00}⟩ wouldn't affect the verdict of PC1.
79 Plausibly, the absence of such a path is no mere idiosyncrasy of our model, but rather will be
a feature of any model of the scenario satisfying (Veridicality*). That’s because the absence of
such a path reflects the assumption of the example that Bob’s intention is to give Alice enough
of an opportunity to kill the deer that he won’t himself shoot the deer dead by 1:00pm.
That is (even holding the off-path variables fixed at their actual values) the
probability of the deer’s being dead by 1:01pm given Bob’s intention but also
the fact that Bob doesn’t shoot is no higher than it would be if simply Bob
hadn’t had the intention in the first place.
Overdetermination of Probabilities
80 I stipulate that some of the probabilities in this example are 0 or 1 for simplicity. More complex
examples can be given in which all probabilities are between 0 and 1.
OP
81 Though, as noted in Footnote 71, in the probabilistic case we might also have use for the notion
of a probabilistic intervention.
might take her individual probability of suffering heart disease to equal the
population-level frequency with which people who on average smoke 10-a-
day, exercise for 30 minutes, and consume 20g of saturated fat develop heart
disease. Thus a type-level model might be used to calculate the risk of heart
disease in a particular person. But this is far from an innocuous move, since it
gives rise to the notorious reference-class problem.82
While this isn’t the place to enter into extended discussion of the issue, it’s
worth pointing out that there are alternative ways of understanding single-case
probabilities, including ‘propensity’ interpretations (e.g. Popper 1959; Hack-
ing 1965; Mellor 1971) and ‘best system’ interpretations (e.g. Lewis 1994).
When it comes to ‘type-level’ models, on the other hand, a frequency inter-
pretation is tempting, but it’s not the only possibility. For instance, the ‘best
system’ approach provides an interpretation of type-level as well as single-case
probabilities.
When we’re dealing with models that represent type-level phenomena and
frequency/statistical information about them, there are quite sophisticated tech-
niques for causal discovery (see, e.g., Spirtes et al. 2000). In particular, we can
often discover a lot about the causal structure (e.g. direct, total, and contribut-
ing causal relations among variables) simply by examining the conditional
and unconditional correlations between variables, at least if we’re prepared to
make certain general assumptions about the relationship between causation and
probability. Such general assumptions include that events that don’t cause one
another are probabilistically independent conditional upon any common causes
they might have, that if A causes C (only) by causing B then A and C are proba-
bilistically independent conditional upon B,83 and that independent causes of a
single effect become probabilistically dependent when that effect is conditioned
upon.84 We can get even further if we have statistics available from randomised
controlled trials (RCTs). That's because RCTs act like type-level 'interventions':
in an RCT, values of some variable X are assigned randomly to members of
a population, thus breaking, within that population, the usual dependence of X
82 Venn (1866) was the first to systematically describe this problem. Hájek (2007) provides a
contemporary discussion.
83 These two assumptions are captured by the so-called Causal Markov Condition (Spirtes et al.
2000, 11).
84 To illustrate: patients with MS are often misdiagnosed as having lupus and vice versa because
many of the symptoms are the same. Loss of coordination, for example, is an effect of both
conditions. So consider the class of patients presenting with coordination loss. We would expect
that the incidence of MS is lower in that subset of such patients who have lupus than it is among
that subset of such patients who don’t have lupus. This explains why, for instance, a doctor
presented with such a patient who rules out lupus becomes more confident that MS is the correct
diagnosis and, conversely, why a doctor who discovers that such a patient has lupus wouldn’t
usually investigate whether the patient additionally has MS.
upon its parents (i.e. acting as a ‘switch’). This allows us to determine the causal
influence of X upon a target variable Y while being reasonably confident that
any association between X and Y isn’t due to common causes or indeed due to
Y’s being a cause of X.
It’s unquestionable that our discovery – by such methods – of type-level
causal relations and statistical information about their strength informs our
judgements about causal relations and probabilities at the token level. But the
relationship between the type- and the token level, as ever, isn’t straightfor-
ward. For instance, statistical associations at the type level needn’t be reflected
by genuine indeterminism at the token level (still less need they match the
single-case probabilities): probabilities at the type level may reflect unmod-
elled causes, measurement errors and the like. Nor can actual causal relations
simply be ‘read off’ from those at the type level. For instance, driving with a
damaged tyre might be a type-level cause of car crashes (because of the risk of
blow-out) but, in a particular case where a driver drives with a damaged tyre
and crashes, the former needn’t be an actual cause of the latter (suppose the
driver fell asleep at the wheel and there was no blow-out). However, delving
deeper into the complexities surrounding the relation between type and token
levels is a task that must await another occasion.
Any apparent cause that failed to raise its effects’ chances would not only be
neither evidence for them nor explain them, it would not be a means to them if
they were ends. And nothing that satisfied none of these three connotations
would be a cause: the whole point of calling it a cause would be lost. In
particular, to call something a cause that provides no way of bringing about
its effects seems to me an obvious contradiction in terms. (Mellor 1995, 88)
The advocate of the de facto probability-raising account either has to deny that
causes satisfy the connotations that Mellor describes, or has to reject the idea
that probability-raising is neces-
sary to satisfy these connotations. Doing the former seems rather unpalatable,
but the advocate of the de facto probability-raising approach can explain why
doing the latter is less so.
Continuing to focus on the example of Probabilistic Preemption, note that,
given that you didn’t actually place your Th-228 near the Geiger in Probabi-
listic Preemption, my placing my U-232 there clearly explains why the bomb
exploded. Also fairly clear is that my placing my U-232 near the Geiger was a
means to the bomb’s exploding. Of course, there’s a sense in which my failure
to place my U-232 there would also have been a means (indeed a more effective
one) to the bomb’s exploding. Yet, in the scenario, I didn’t know this since I
didn’t know of your presence. Nevertheless, although it’s plausibly a platitude
about causation that causes are means to their effects, it’s surely not a plati-
tude that for anything to count as a cause it must be the best means to its effect
or even the best available in the circumstances. For example, if someone who
works in asbestos removal also smokes, surely her smoking has the potential
to cause her to suffer lung cancer even if she could have more surely brought
about the latter by removing her protective mask at work.85
Is my placing my U-232 evidence that the bomb exploded? It depends.
What is evidence for what presumably depends upon the epistemic state of
the explainee. For instance, it would be odd to say that the causes of the explo-
sion count as evidence of its occurrence for someone who actually witnessed
the explosion. Likewise, for someone who hasn’t witnessed the explosion but
who knows of your intention to place your Th-228 near to the Geiger if I don’t
place my U-232 there, the fact that I do place my U-232 there is plausibly
evidence against the explosion. Still, for someone who doesn’t know of your
intention, or who knows of your intention but also knows that in fact you don’t
place your Th-228 near to the Geiger, my action is presumably evidence that
the bomb explodes. I take it that this is sufficient for the relevant connotation
of causation to be satisfied.
Interestingly, then, if the de facto probability-raising account is correct, then
it might be true to say that c caused e (since c de facto raised the probability of
85 Of course there’s a disanalogy between this case and Probabilistic Preemption in that smoking
and not wearing a protective mask to work are not mutually exclusive, but my placing my U-
232 near the Geiger and my not doing so are. So perhaps one could argue that, where c is
a member of a set S of mutually exclusive (and jointly exhaustive?) possibilities, c is only a
cause of e if c is the best means to e in S? Yet I suspect that, in doing so, one leaves the realm
of straightforward ‘connotations’ or platitudes about causation for the realm of philosophical
theorising. Moreover, such an argument seems to simply beg the question against the advocate
of the de facto probability-raising account.
Spraying the plant lowers the probability of the plant’s survival. But if the plant
survives then, as Cartwright points out, the spraying can’t explain its survival.
Likewise the spraying is neither evidence for the survival nor is it in any sense
a means to bringing about the survival. Plausibly this is because the spraying
doesn’t even bear a de facto probability-raising relation to the survival, and for
this reason isn’t an actual cause.
6 Conclusion
In this Element, we’ve examined three broad traditions in the philosophy of
causation: the regularity, counterfactual, and probabilistic approaches. Our
discussion showed how contemporary thinking about causation has been influ-
enced by a long history of intellectual thought on the subject, but we’ve also
87 For debate over this latter point, see e.g. McGrath (2005), Halpern and Hitchcock (2015), and
Blanchard and Schaffer (2015).
Jacob Stegenga
University of Cambridge
Jacob Stegenga is Reader in the Department of History and Philosophy of Science at the
University of Cambridge. He has published widely on fundamental topics in reasoning
and rationality and philosophical problems in medicine and biology. Prior to joining
Cambridge, he taught in the United States and Canada, and he received his PhD from
the University of California, San Diego.
Big Data
Wolfgang Pietsch
Objectivity in Science
Stephen John
Causation
Luke Fenton-Glynn