
Evaluation
Copyright © 2007
SAGE Publications (Los Angeles, London, New Delhi and Singapore)
DOI: 10.1177/1356389007082129
Vol 13(4): 439–455

Theories of Change and Realistic Evaluation
Peas in a Pod or Apples and Oranges?

AVRIL BLAMEY
NHS Health Scotland, UK

MHAIRI MACKENZIE
University of Glasgow, Scotland

Two theory-based approaches to evaluation that have found favour in the
UK in recent years are Theories of Change and Realistic
Evaluation. In this article we share our evolving views on the points of
connection and digression between the approaches based on our reading
of the theory-based evaluation literature and our practice experience.
We provide a background to the two approaches that emphasizes the
importance of programme context in understanding how complex
programmes lead to changes in outcomes. We then explore some of the
differences in how ‘theory’ is conceptualized and used within the two
approaches and consider how knowledge is generated and cumulated in
subtly different ways depending on the approach that is taken. Finally, we
offer our thoughts on what this means for evaluators on the ground
seeking an appropriate framework for their practice.

KEYWORDS: context; knowledge generation; realistic evaluation; theories of change; theory

Introduction
We have recently completed a range of evaluations that utilized a theory-based
evaluation approach. In particular, our evaluations were informed by the Aspen
Institute’s Theories of Change framework (Connell et al., 1995; Fulbright-Anderson
et al., 1998). As we struggled to conceptualize this and put it into practice we
found ourselves debating how different our evaluation might have looked had we
instead used Realistic Evaluation (Pawson and Tilley, 1997). In the midst of this

439
Evaluation 13(4)
we noted that the two (currently favoured ways of applying theory-based evalu-
ation within the UK) are often used interchangeably in discussion. Our emerging
view is that they are different in many important ways.
In this article we share our evolving views on the points of connection and
digression between the approaches, based on our reading of the theory-based
evaluation literature1 and our practice experience. Needless to say, it is not only
our views that are evolving; as both approaches have been applied in the field,
and learning about their strengths and weaknesses generated, their proponents
have further refined their thinking. The versions of the approaches upon which
this article is based are those articulated in the early work of the Aspen Institute
in relation to Theories of Change (Connell et al., 1995; Fulbright-Anderson et al.,
1998) and Realistic Evaluation (Pawson and Tilley, 1997). Where relevant to our
discussions we highlight instances of later refinements. In addition, to aid explan-
ation we have used a hypothetical smoking cessation programme to illustrate the
issues that we discuss.
We have organized the article in four main sections. First, we provide a back-
ground to the two approaches that emphasizes the importance of programme
context in understanding how complex programmes lead to changes in outcomes.
In this section we also briefly outline the steps taken to conduct a Theories of
Change and a Realistic Evaluation. Second, we explore some of the differences
in how ‘theory’ is conceptualized and used within the two approaches. Third, we
consider how knowledge is generated and cumulated in subtly different ways
depending on the approach that is taken. Fourth, we offer our thoughts on what
(if anything) all of this relatively abstract discussion means for evaluators on the
ground seeking an appropriate framework for their practice.

Where do the Approaches Come from and How do They Manifest Themselves?
At their simplest both Theories of Change and Realistic Evaluation have emerged
to fill a deficit in policy and programme evaluation. The history of evaluation
shows that, despite significant government and philanthropic funding, previous
evaluation attempts to learn about the success or failure of social programmes
have resulted in disappointingly inconclusive findings (Pawson and Tilley, 1997;
Weiss, 1998). This is a charge that these authors level at both experimental and
qualitative approaches, although for very different reasons. They argue that ques-
tions of efficacy and effectiveness have proved difficult to answer within the experi-
mentalist paradigm because it conceives programmes as unified entities through
which recipients are processed, and where contextual factors are conceptualized
as confounding variables that it behoves the evaluator to control. This leads to
programme outcomes being aggregated across different groups of individuals in
heterogeneous contextual settings and across the many and varied manifestations
of a single social programme. For the theory-based evaluator, on the other hand,
programmes are not monoliths, people are not passive recipients of opportun-
ities to improve their health, wealth and social standing offered through various
initiatives, and context is key to understanding the interplay between programmes
and effects. Context itself is multifaceted and operates at a variety of levels. These
include: political, social, organizational and individual dimensions. Inevitably,
measuring or accounting for contexts is a difficult process in such evaluations. For
example, the political context for our hypothetical smoking cessation programme
if launched across the UK in 2006 would have been different due to the delay in
implementation of the smoking in public places legislation in England compared
to Scotland. At a local level such variation would be likely to be compounded. For
example, differences in the social make-up of intervention areas (including levels
of social capital and geographical deprivation) would impact on normative views
of smoking and associated motivations for cessation. At an organizational level
we might anticipate varying enthusiasm from local managers and those charged
with delivery. Local stakeholders would also welcome the newly launched pro-
gramme in different ways, ranging from open arms through indifference to down-
right hostility, depending on their previous experience of such initiatives and the
nature of existing service provision. These variations in enthusiasm would also be
anticipated in the responses of those targeted by the intervention. Flattening out
these variations in context through a traditional trial, it is argued, would eliminate
a key ingredient in the mix. In addition, for many such interventions, context is
not simply an interesting backdrop but is instead explicitly targeted for change
(Gambone, 1998).
Context, therefore, must be considered as part of the evaluation and can be
key to uncovering the circumstances in which, and the reasons why, a particular
intervention works. These approaches acknowledge that particular contexts can
enhance or detract from programme effectiveness and that such contexts may
include factors that are within or outside the control of programme implement-
ers. For example, a smoking cessation intervention implemented in the UK at
the present time might be much more successful than one established 10 years
earlier, due to legislation on tobacco advertising. Such legislation might impact
on the norms of smoking behaviour and service uptake through reducing expos-
ure to positive images of smoking. Its institution, however, would not be in the
gift of smoking cessation programme implementers. An example of a contextual
issue that would be within the control of the programme implementers would be
improving partnership links between health providers and local social housing
associations so that smoking cessation groups can be provided within their hous-
ing complexes. The anticipated impact would be an improvement in access and
referral routes for ‘hard-to-reach’ groups.
Both Theories of Change and Realistic Evaluation approaches indicate that the
impact of social programmes cannot be determined with any degree of confidence
if there is no knowledge about the context within which they have taken place. In
the absence of such knowledge, alternative possible explanations for any changes
uncovered (such as the existence of other similar interventions, secular trends
or environmental changes) cannot be dismissed. An understanding of context is,
therefore, vital in relation to attributing cause. Context is also seen as important
in terms of replicating the intervention in any future setting or in learning about
possible generalizable causal pathways (we discuss generalizability later).

So much for traditional experimental designs; theory-based proponents simi-
larly believe that qualitative methods are not fit for the evaluation purpose within
complex interventions. Their critique of these methods is equally dismissive but
much more limited. Pawson and Tilley are, for example, scathing about social con-
structionist approaches that aim to describe multiple truths, arguing that within
this paradigm relativist perspectives become mired in context.2 We perceive that
the gist of their argument is that, whilst such methods may improve internal valid-
ity by uncovering participants’ perspectives and detailing the unique contexts
in which they are grounded, they do not address external validity. They do not
address the issue of counterfactuals and so do not make an assessment of pro-
gramme impact – the question of ‘what works’, therefore, remains unanswered.
In addition, the purpose of qualitative approaches is not to draw representative
samples that allow generalizations to wider populations (Connell and Kubisch,
1998; Pawson and Tilley, 1997).
The perceived inadequacy of both methodological stances is exacerbated by
the increasingly complex nature of the health and social programmes devised to
tackle seemingly intractable problems. Such interventions invariably rely on part-
nership approaches involving multisectoral representation (Dowling et al., 2004)
and community collaboration (Barnes et al., 2005), and set themselves goals at the
level of individuals, communities, organizations, structure and policy (Fulbright-
Anderson et al., 1998). This type of programme is well documented as a signifi-
cant component of the current government’s approach to addressing long-term
and embedded social problems (Barnes et al., 2005; Martin and Sanderson, 1999).
Alongside the support for such programmes has come an increased emphasis on
evidence-based practice and a demand for accountability of public spending. Pro-
gramme evaluation has, therefore, flourished and is increasingly expected to pro-
vide formative learning for individual programmes and lessons for future policy
implementation as well as to address questions of ‘what works’ (that is, to address
questions best answered using traditional impact evaluations) (Davies et al., 2000;
Mackenzie et al., 2006; Martin and Sanderson, 1999).
This shared disenchantment with methods-driven evaluation is the starting
point for Theories of Change and Realistic Evaluation. The task of comparing their
approaches is, however, made difficult by the different concepts and foci within
their descriptions of how theory is articulated and how knowledge generation
should proceed. Before turning to a more substantive discussion of their similar-
ities and differences we summarize the main facets of a Theories of Change and a
Realistic Evaluation approach using the language of their proponents and using
the shared example of our hypothetical smoking cessation programme (see Boxes
I and II). We continue to use this example throughout the remainder of the article
as a means of grounding our analysis of the relatively abstract concepts employed
by the two approaches.

What is Meant by Theory?


Theory is used in a variety of ways within different theory-based approaches
(Stame, 2004).

Box I. Undertaking a Theories of Change Evaluation (adapted from Connell et al., 1995;
Fulbright-Anderson et al., 1998)

To elicit the theory of change underlying a planned programme, the evaluator works with
a wide range of stakeholders in a collaborative manner. Part of the evaluator’s role is to
facilitate the articulation of the relevant theories and to highlight conflicting and discrep-
ant theories. To help capture expectations of change, stakeholders are asked to focus
explicitly on the following steps and to reflect on the contextual factors that influence
their decision-making.
Step 1: The focus here is on the long-term vision of an initiative and is likely to relate
to a timescale that lies beyond the timeframe of the initiative. Its aim should be
closely linked to the existence of a local or national problem. For example, our
smoking cessation programme might have a long-term vision of eradicating inequal-
ities in smoking prevalence by 2020.
Step 2: Having agreed the ultimate aim of the programme, stakeholders are encour-
aged to consider the necessary outcomes that will be required by the end of the
programme if such an aim is to be met in the longer term. Within our programme
they might, for instance, anticipate a decrease in differential prevalence between the
most and least deprived areas from x percent to x − z percent.
Steps 3 and 4: Stakeholders are then asked to articulate the types of outputs and
short-term outcomes that will help them to achieve the specified targets. These
might include reductions in differential access to acceptable smoking cessation pro-
grammes.
Step 5: At this stage those involved with the programme consider the most appropriate
activities or interventions required to bring about the required change. Different
strategies of engagement might, for example, be utilized to target pregnant women,
middle-aged men and young adolescents.
Step 6: Finally, stakeholders are required to consider the resources that can realisti-
cally be brought to bear on the planned interventions. These will include staff and
organizational capacity, the existence of supportive networks and facilities as well
as financial capability.
Following a collective and iterative process the resulting programme theory must fulfil
a set of pre-specified criteria: that it must be plausible, doable and testable. First
then, the theory of change that is elicited should be interrogated to ensure that the
underlying logic is one that is acceptable to stakeholders either because of its existing
evidence base or because it seems likely to be true in a normative sense. Second, the
implementation theory itself should be questioned to ensure that timescales, financial
resources and capacities add up to the aspirations of the programme. Finally, the
Theory of Change needs to be articulated in such a way that it can be open to evalu-
ation; this is only possible where there is a high degree of specificity concerning the
outcomes of the programme. The proponents of the approach provide examples of
overall programme theories.
A Theories of Change evaluator then takes the programme map generated through
this process and, using standard multi-method approaches as relevant, monitors the
unfolding of the programme in practice and integrates the findings. Here the guid-
ance for the putative evaluator ends.
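To make the six steps in Box I more concrete, the sketch below models an elicited theory of change as a simple data structure and applies two of the interrogation criteria, doability and testability, as programmatic checks (plausibility, being a normative judgement, is left to stakeholders). This is our own illustrative rendering, not a tool prescribed by the Aspen authors; all class names, fields and figures are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Outcome:
    description: str
    indicator: str          # how the outcome will be measured
    target: float           # expected change, e.g. percentage points
    timescale_months: int   # when the change should be observable

@dataclass
class TheoryOfChange:
    long_term_vision: str                 # Step 1
    final_outcomes: list[Outcome]         # Step 2
    intermediate_outcomes: list[Outcome]  # Steps 3 and 4
    activities: list[str]                 # Step 5
    resources: list[str]                  # Step 6

    def is_testable(self) -> bool:
        # Testability demands specificity: every outcome carries an indicator.
        return all(o.indicator
                   for o in self.final_outcomes + self.intermediate_outcomes)

    def is_doable(self, programme_months: int) -> bool:
        # Doability (in part): outcomes the evaluation must observe cannot
        # lie beyond the programme's own timeframe.
        return all(o.timescale_months <= programme_months
                   for o in self.final_outcomes + self.intermediate_outcomes)

# Hypothetical figures for the article's smoking cessation example.
toc = TheoryOfChange(
    long_term_vision="Eradicate inequalities in smoking prevalence by 2020",
    final_outcomes=[Outcome("Reduced differential prevalence between most and "
                            "least deprived areas", "prevalence gap", -5.0, 36)],
    intermediate_outcomes=[Outcome("Reduced differential access to cessation "
                                   "programmes", "referrals per 1,000 smokers",
                                   10.0, 18)],
    activities=["midwife-led groups", "engagement strategies for adolescents"],
    resources=["trained advisers", "partnerships with housing associations"],
)
print(toc.is_testable(), toc.is_doable(programme_months=36))  # True True
```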

Box II. Undertaking a Realistic Evaluation (adapted from Pawson and Tilley, 1997)

Pawson and Tilley propose a Realistic Evaluation approach that is, superficially at least, less
circumscribed. They have no set steps for the would-be realist evaluator to follow. How-
ever, our reading of their map is as follows:
Step 1: The evaluator, through dialogue with programme implementers, attempts
to understand the nature of a social programme: what is the aim of our smoking ces-
sation programme; what is the nature of the target population at whom it is aimed;
in what kinds of contexts and settings will it operate; and what are the prevailing
theories about why smoking cessation services will work for some people in some
circumstances?
Step 2: The evaluator maps out a series of potential mini theories that relate the vari-
ous contexts of a programme to the multiple mechanisms by which it might oper-
ate to produce different outcomes. For example, practitioner knowledge and the
existing evidence base might suggest that focusing the educational component of a
midwife-led smoking cessation programme on the ill-effects on babies in utero will
be most effective in women with no previous children. On the other hand, young
(non-pregnant) female smokers who are less likely to find the distant health conse-
quences of their habit salient may relate more positively to those interventions that
are designed to appeal to their self-image.
Step 3: At this stage the evaluator undertakes an ‘outcome inquiry’ in relation to these
mini theories. This involves building up a quantitative and qualitative picture of the
programme in action. It might, for example, address how multiple different types of
smoker fare when it comes to breaking the habit following different kinds of cessa-
tion service delivered in a variety of ways. This picture includes an assessment of the
extent to which different underlying psychological motivations and mechanisms have
been triggered by specific services in particular smokers.
Step 4: Through an exploration of how context, mechanism and outcome (CMO)
configurations play out within a programme, the evaluator refines and develops
tentative theories of what works for whom in what circumstances.
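As a companion to Box II, the following sketch, again entirely our own illustration rather than part of Pawson and Tilley's method, represents CMO configurations as records, runs a crude 'outcome inquiry' over hypothetical monitoring data, and tallies how often each hypothesized mechanism fired and produced its expected outcome. All field names and findings are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CMO:
    context: str    # for whom / in what setting (Step 1)
    mechanism: str  # hypothesized trigger 'fired in people's minds' (Step 2)
    outcome: str    # expected response if the mechanism fires

# Step 2: mini theories drawn from practitioner knowledge and the evidence base.
theories = [
    CMO("first-time pregnant women, midwife-led service",
        "salience of harm to the baby in utero", "quit sustained at 6 months"),
    CMO("young non-pregnant female smokers",
        "appeal to self-image", "engagement with the service"),
]

# Step 3: outcome inquiry - hypothetical monitoring records, each noting
# whether fieldwork suggested the mechanism was triggered and the outcome seen.
records = [
    {"context": theories[0].context, "fired": True, "outcome_seen": True},
    {"context": theories[0].context, "fired": True, "outcome_seen": False},
    {"context": theories[1].context, "fired": False, "outcome_seen": False},
]

# Step 4: refine - which configurations look promising enough to test further?
for t in theories:
    hits = [r for r in records if r["context"] == t.context and r["fired"]]
    seen = sum(r["outcome_seen"] for r in hits)
    print(f"{t.context}: mechanism fired {len(hits)} time(s), outcome seen {seen}")
```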

A key problem in getting to grips with the literature on theory-based evaluation
is the fundamental lack of consistency on how different types of theory
are described. Different terms, for example, are used to describe the same type
of theory and similar labels are given to epistemologically separate kinds of the-
ory. Let us illustrate. We consider that there are two discrete conceptualizations of
theory that are relevant to this discussion. One relates to the hypothesized links
between a programme’s activities and its anticipated outcomes. In the case of our
smoking cessation example this would indicate that our intervention has an implicit
or explicit theory about the staffing levels required to deliver the service
and the subsequent numbers of one-to-one or group interactions they would need,
in order to achieve an x percent reduction in smoking in y timescale.
Weiss (1995) calls this ‘implementation theory’. She states, for example, that
implementation theory is ‘what is required to translate objectives into ongoing
service delivery and programme operation’ (Weiss, 1995: 58). Chen (1990), on
the other hand, calls this type of hypothesizing ‘prescriptive theory’; Saunders
(personal communication, 2005) has indicated that he uses the term ‘little t theory’
and, to confuse matters further, the performance management literature (Lindgren,
2001) uses ‘programme theory’ to refer to this theory. For the purposes of this
article we will use Weiss’s preferred term of ‘implementation theory’.
The second type of theory that we want to highlight refers to the hypothesized
causal links between mechanisms released by an intervention and their anticipated
outcomes. In our example this would relate to the anticipated response levels
amongst different target groups and the hypothesized triggers that might motivate
them to engage with and adhere to the programme (e.g. concerns over self-image
for young women or the health of their unborn child for pregnant women). Weiss
(1995) refers to the thinking about ‘the responses of the people to programme
activities’ as ‘programme theory’, Chen names this ‘descriptive theory’ and Saunders
(personal communication, 2005) contrasts it with implementation theory by call-
ing it ‘big T theory’. It is also referred to as ‘middle-range’ theory by Pawson and
Tilley (1997). Again, we will use Weiss’s term ‘programme theory’ to refer to this
type of theorizing.
Thus it is evident that terms are sometimes used interchangeably and can
often be counterintuitive. We suggest that this scenario adds to the difficulty in
understanding where approaches to theory-based evaluation, such as Theories of
Change and Realistic Evaluation, overlap and/or differ.
Both Theories of Change and Realistic Evaluation are concerned with under-
standing the theory of an initiative. Each approach requires that the interven-
tion’s theory be used to inform the evaluation’s purpose and focus, and the key
questions that it will address. Theory should also drive the selection of methods.
To an extent both approaches necessarily attempt to uncover elements of both
‘implementation’ and ‘programme’ theory (Stame, 2004). However, we suggest
that they each place a stronger emphasis on, and may indeed be best suited to
dealing with, one of the two types of theory already discussed.
Theories of Change explicitly recognize the importance of both programme
and implementation theory and the need to ultimately integrate these. Weiss, for
example, in the first volume of Aspen Institute papers, stated that ‘I call the com-
bination of program theory and implementation theory the program’s theories
of change’ (Weiss, 1995: 58). We suspect, however, that within the applications of
the approach to date, evaluation practitioners have been predominantly engaged
with explicating implementation theory. For instance, the examples provided in
Fulbright-Anderson et al. (1998) are predominantly descriptions of intervention
elements and their links, and Auspos and Kubisch (2004) suggest that uncovering
programme theory is perhaps more aspirational than practical. In addition the rec-
ommended process for articulating theories of change is concerned with the types
of activities, timescales and anticipated outcomes or thresholds of change. The
approach, as a result, entails mapping the nuts and bolts of the programme. Connell
and Kubisch, for example, state that:
A theory of change approach would seek the agreement from all stakeholders that, for
example, activities A1, A2, and A3, if properly implemented (and with the ongoing
presence of contextual factors X1, X2 and X3) should lead to outcomes O1, O2 and
O3; and if these activities, contextual supports, and outcomes all occur more or less as
expected, the outcomes will be attributable to the interventions. (Connell and Kubisch,
1998: 19)

Once an intervention is mapped out the evaluator’s focus turns to uncovering
the rationales for the programme design and then to establishing the plausibility,
do-ability and testability of the conjectured links between the programme activity
and the anticipated outcomes. The rationales, the interrogations of the theory of
change models and the focus on articulating outcomes (which are often expres-
sions of changes in knowledge, attitudes or behaviours of individuals or groups)
may then take Theories of Change into the realms of ‘programme theory’ (Stame,
2004). Nonetheless there is a dearth of published examples of the approach being
systematically used in this way. This failure to get beyond implementation theory
may be in part due to the strong emphasis the Theory of Change literature places on
the process of articulation relative to critiquing and testing. In addition, we would
suggest that the level of complexity, lack of existing evidence base and limited
capacity for planning within such community-building interventions simply makes
it unfeasible to uncover both implementation and programme theory (see later
section for what the approaches mean in practice).
Unlike Theories of Change we suggest that realistic evaluators have a more
explicit intent in uncovering programme theory. Such theory, rather than being
about the nuts and bolts of programmes and their possible linkages, is more
concerned with psychological and motivational responses leading to behaviour
change. The terminology of Realistic Evaluation, therefore, is of context, mech-
anisms and outcome configurations (CMOs) that attempt to hypothesize the causal
and situational triggers for changes in behaviour or responses to the interventions.
As Pawson and Tilley (1997) highlight, the realistic evaluator has to
. . . see what the initiative fires in people’s minds. This is what a realist means by mech-
anisms; we cannot simply treat programs as things, we have to follow them through into
the choices made by recipients. (Pawson and Tilley, 1997: 188)

The explanatory theory sought by Pawson and Tilley, therefore, becomes a
generalizable mechanism that explains why an individual or group of individu-
als (within a particular context) respond in a particular and relatively predict-
able way to an intervention (or aspects of an intervention) (Stame, 2004). Once
such CMO configurations are teased from a wider programme they themselves
become testable theories. In reality, for Realistic Evaluation to identify CMO
configurations it requires a reasonable understanding of the programme activ-
ities, target groups and intended outcomes and so its proponents must understand
the implementation theory. However they are only interested in implementation
theory as a route to hypothesizing the causal trigger that fires the appropriate
mechanism in certain circumstances.
Regardless of the type of theory that is being explored we think that it is fair
to say that the approaches differ with regard to the question of whose theory is
being tested. In the Theories of Change approach the theory is ideally articulated,
owned and approved by a wide range of stakeholders. The reasoning for this is that
it is these stakeholders who best understand the intervention and it is they who
will, at a later stage, require to be convinced that the outcomes that are measured
are attributable to the detail of the intervention theory they approved (see later
section on attribution). Such stakeholders not only own the theory of change gen-
erated but should also be involved in key decisions about which, of the many pos-
sibly competing elements of the theory, become the foci for the actual evaluation
activity:
Will these different theories of change be included as parallel, integrated or competing
strands in the overall theory, or will some be selected for inclusion in the implementa-
tion and evaluation and others not? The task of addressing these issues should not fall
solely, or even primarily, to the evaluator, but the evaluation discussion may serve as the
context within which they are played out. (Connell and Kubisch, 1998: 31)

This approach resonates with other aspects of the Theories of Change’s under-
lying philosophy such as a commitment to generating community engagement,
capacity-building and ownership, and to evaluation as a means of developing pro-
grammes. In fact, it might be argued that these principles are as important to the
Theories of Change approach as its contribution to attribution (as discussed later).
However, in practice within the UK, Sullivan and Stewart (2006) have argued
that, with regard to ownership of a programme’s theory, the ideal of community
ownership is rarely attained.
On the other hand, the realist evaluator articulates the theory through gen-
eral conversations and interviews with a more limited and purposive selection of
stakeholders. This approach shows less concern for prospectively articulated, well-
specified and consensual action plans and capacity building, and more interest in
identifying promising hypothesized causal triggers. S/he then goes on to suggest
the most promising theories at more of a distance from the programme than in The-
ories of Change. This decision is based on the existing evidence base and evaluator
knowledge and experience rather than on the relative importance placed on the
theories by implementers per se. In this sense the theories generated, whilst partly
emerging from discussions with stakeholders, are specified and owned more by the
evaluators rather than approved and ‘signed up to’ by the stakeholders. Similarly it
is the evaluator who prioritizes the CMO configurations that are worthy of further
investigation and that become the foci of the evaluation.

What Are the Implications for the Kinds of Knowledge Generated?


So far we have suggested that theory has subtle but important differences in
meaning across Theories of Change and Realistic Evaluation. The distinctions,
however, do not stop there. We suggest that there are two further points at which
some degree of divergence can be identified between the approaches. These are:
the ways in which knowledge is generated over time; and the kinds of claims that
can be made about the causal links between interventions and their outcomes. We
deal with each of these in turn.

The Process of Generating Knowledge


Both approaches take an explicitly cumulative approach to knowledge gener-
ation. This means that learning is believed to accrete slowly within and across
evaluations rather than delivering big bang answers to questions of programme
effectiveness. While Pawson and Tilley may be rather unfair to characterize more
traditional approaches as ‘one-off affairs [that] neither look back and build on
previous findings, nor look forward to future evaluations’ (1997: 115), there is a
tendency for traditional evaluations to be rather self-contained assessments that
do not build on learning from other disciplines or policy domains. In contrast,
Realistic Evaluation approaches aim to constantly refine learning about which
mechanisms are triggered in which circumstances for which individuals. In prac-
tice this would mean that the evaluator builds on what is already known about
the circumstances under which different strategies work and, in particular, to
think laterally about the existing literature. Lessons about the best way to engage
young women in smoking cessation, for example, may come from a rich range of
evaluations concerned with engaging young women in interventions outside the
boundaries of health promotion. Each evaluation should thus start with theory
and end with more refined propositions for future testing with the goal being
cumulatively to learn ‘more and more about less and less’ (Pawson and Tilley,
1997: 198). This notion of knowledge generation lies at the heart of the realist
synthesis proposed by Pawson, an approach to reviewing the evidence that again
focuses on developing greater understanding of context, mechanism and outcome
configurations (Pawson, 2002a, 2002b, 2006).
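One way to picture this accretion of learning is as a running ledger of support for each CMO configuration across evaluations, including studies from outside the immediate policy domain, as the lateral reading just described allows. The sketch below is a deliberately minimal illustration of that bookkeeping; the labels and findings are invented.

```python
from collections import defaultdict

# A ledger of support for each CMO theory, accumulated across evaluations.
ledger = defaultdict(lambda: {"supports": 0, "refutes": 0})

def add_finding(cmo_label: str, supports: bool) -> None:
    # Fold one study's finding into the cumulative picture.
    ledger[cmo_label]["supports" if supports else "refutes"] += 1

# Invented findings; note that two come from outside health promotion.
add_finding("young women / self-image appeal / engagement", True)   # cessation trial
add_finding("young women / self-image appeal / engagement", True)   # youth arts scheme
add_finding("young women / self-image appeal / engagement", False)  # school programme

for label, tally in ledger.items():
    print(label, tally)  # the refined proposition carried into future testing
```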
The Theories of Change approach, as described by its proponents in 1998, is
less directive about how knowledge should cumulate within and between evalu-
ations. There is an implicit assumption, however, that those theories that are pro-
spectively traced and found to be hardy specimens are taken as read in future
evaluations, thus freeing up the evaluator to track those theories about which
little is known. The focus is less on constant refining of what works for whom in
what circumstances, and more on acquiring more and more pieces of the know-
ledge jigsaw. However, as Weiss says, cumulating knowledge is problematic when
so very little is known to start with:
Unfortunately, program theory in many fields is at a low stage of development. In those
fields, the theories that evaluations are likely to be testing are low-level approximations,
riddled with inaccuracies and false paths. To discover their flaws and painstakingly revise
and improve them will take multiple iterations. This is an arduous way to generate gen-
eralisable knowledge. (Weiss, 1998: 69)

This chimes with the most recent views of the Aspen proponents who have found
that the reality of undertaking Theories of Change evaluations is hampered by a lack
of evidence about how to deliver change at an individual and community level. For
example, they cite the fact that there is limited generalizable knowledge about how
to build programmes that engage and strengthen multiply deprived communities
(Auspos and Kubisch, 2004). Thus, they suggest that acquiring reasonably robust
evidence (both formal and experiential) about community engagement processes
is an important prelude to testing theories of change. There is a whiff of chicken and
egg about this argument since it seems that to generate useful learning one needs
to start with good theory, yet this is in turn reliant on a strong evidence base. It
becomes difficult to see how those practices without a good evidence base are able
to generate one through a Theories of Change approach, despite the fact that it is
precisely these kinds of problem areas for which the approach was first lauded.

Causal Attribution
Regardless of how they cumulate knowledge between and within programme
evaluations both theory-based approaches are concerned with determining logi-
cal pathways between aspects of an intervention and its subsequent impacts on
different subpopulations, and thus making claims about causation. The received
wisdom in the evaluation canon, however, is that attributing outcomes to inter-
ventions cannot be done with any degree of rigour in the absence of controlled
comparisons (ideally, randomized controlled trials). How then do Theories of
Change and Realistic Evaluation face this challenge?
The Theories of Change approach argues that the attribution problem can
be partly addressed through the process of building consensus amongst a wide
group of stakeholders (for example, funders, planners, practitioners and recipi-
ents) about a programme’s theory and then testing the extent to which antici-
pated thresholds, timelines and outcomes are achieved.
Let us suppose, for example, that our smoking cessation stakeholders agree that
running a certain number of smoking cessation groups within prescribed settings
and with a prespecified number of individuals from particular subpopulations will
lead to x percent adherence to the programme, y percent abstinence at 6 weeks
and z percent quit rate at 6 months. Attribution would be enhanced and the path-
ways assumed to be causal2 (or at least associated) if the intervention plan turns
into reality with approximately similar numbers of people benefiting as predicted
and provided there are no other competing contextual explanations for the
observed shift in smoking behaviours (for example, radical changes to tobacco
control policies). Sullivan and colleagues (2002) suggest that the approach is
naive in its belief that consensus can be meaningfully reached with a group of
stakeholders where power is unequally distributed. However, the important elem-
ent to note here is that the audience that needs to be satisfied about the attribu-
tion question is not a scientific one but a more pragmatic one made up of those
who will be required to make decisions about future funding, local planning and
practice, and engagement with services. This leaves the door open for the stake-
holder group to determine the level of agreement that is required between its
prospectively agreed theory and its outcomes in practice and means that where
stakeholders are determined to see success and causal pathways in a programme
then they can do so even if an outside observer might remain sceptical. Else-
where we have suggested that, rather than being a technical concept, attribution
takes on a rather constructivist light within the early iterations of the Theories of
Change approach (Mackenzie and Blamey, 2005); in reality this means that it is
the evaluator who must assume the role of sceptical observer. The approach does
not, however, object to a randomized controlled approach to dealing with attri-
bution (Auspos and Kubisch, 2004) and evaluators using a Theories of Change
framework have found it possible to build traditional experimental methods into
programme learning (Weitzman et al., 2002).
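The logic of this threshold-matching form of attribution can be sketched as follows. The targets, the tolerance and the list of competing explanations are all hypothetical stand-ins for what would, in practice, be stakeholder judgements rather than anything prescribed by the Aspen authors.

```python
# Hypothetical prospective targets agreed by stakeholders (the x, y and z above).
targets = {"adherence_pct": 60.0, "abstinent_6wk_pct": 30.0, "quit_6mo_pct": 15.0}
observed = {"adherence_pct": 55.0, "abstinent_6wk_pct": 28.0, "quit_6mo_pct": 14.0}

# 'Approximately similar' is a stakeholder judgement; a 20 percent relative
# tolerance is our illustrative stand-in for that agreement.
TOLERANCE = 0.20
competing_explanations: list[str] = []  # e.g. "radical change to tobacco policy"

def attribution_supported(targets, observed, tolerance, competing) -> bool:
    # Attribution is strengthened when every observed outcome falls within
    # tolerance of its prospectively agreed target and no competing
    # contextual explanation has been recorded.
    on_track = all(abs(observed[k] - v) <= tolerance * v
                   for k, v in targets.items())
    return on_track and not competing

print(attribution_supported(targets, observed, TOLERANCE,
                            competing_explanations))  # True
```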

A Realistic Evaluation approach has a different take on attribution and, in par-
ticular, is epistemologically antagonistic to the use of controlled trials (Pawson and
Tilley, 1997; Tilley, 2004). Their argument goes as follows. Experimentalists employ
a successionist approach to causation. That is, with context held constant (through
randomization) outcomes are collected from those exposed or not exposed to the
intervention. Where these outcomes, in a statistically systematic fashion, achieve
the desired change, the intervention can be imputed to be causally related to its
outcomes. Realists, on the other hand, adopt a generative approach to attribu-
tion. This differs from its successionist neighbour in that it is explicitly focused on
a cumulative and iterative process of theory building, testing and refinement in
relation to specific programme subcomponents. Thus, as well as seeking patterns
between interventions and their outcomes, it focuses on the generative mechanism
by which the relationship is established. Once evidence about the success of particu-
lar configurations is mustered it can be used to generate refined theories worthy
of testing in the context of various other programmes. What this means in practice
is that the realist evaluator tests out whether theories are worth their salt by track-
ing the outcomes emerging from specific mechanism and context configurations.
For example, rather than testing whether the decrease in smoking rates is signifi-
cantly greater for those young women who receive one-to-one counselling within
health centres compared to those who do not, the realist evaluator would map
out a range of potential context, mechanism and outcome constellations within a
programme and observe which of these triggered the desired effect. Once tested it
may be that the generative mechanisms in particular configurations are sufficiently
promising that they might be imported to other interventions that aim to change
community-level behaviour (such interventions may or may not be within the field
of smoking cessation). It is important to note that the prospective nature of these
theories, and the potential for there to be literally hundreds of them, distinguishes
this approach from subgroup analyses within sophisticated experimental studies.
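The contrast between the two stances on causation can be illustrated with a toy analysis, which is our own sketch and not drawn from either set of authors. The successionist estimate compares aggregate outcomes between exposed and unexposed groups; the generative inquiry instead inspects outcome rates within prospectively specified context/mechanism cells. The data below are random, so any differences shown are pure noise; the point is the shape of the two questions, not the numbers.

```python
import random
random.seed(1)

# Synthetic records: quit status is independent of everything else here,
# so any differences below are noise.
people = [{"exposed": random.random() < 0.5,
           "context": random.choice(["health centre", "housing complex"]),
           "mechanism": random.choice(["self-image", "baby's health"]),
           "quit": random.random() < 0.2}
          for _ in range(1000)]

def rate(group):
    return sum(p["quit"] for p in group) / len(group)

# Successionist question: does exposure raise the aggregate quit rate?
exposed = [p for p in people if p["exposed"]]
control = [p for p in people if not p["exposed"]]
print("aggregate effect estimate:", round(rate(exposed) - rate(control), 3))

# Generative question: in which context/mechanism cells among the exposed
# does the quit rate depart from expectation, as prospective CMO theories
# predicted it would?
for ctx in ("health centre", "housing complex"):
    for mech in ("self-image", "baby's health"):
        cell = [p for p in exposed
                if p["context"] == ctx and p["mechanism"] == mech]
        if cell:
            print(ctx, "/", mech, "-> quit rate", round(rate(cell), 3))
```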

What does this Mean for Evaluation Practice?


The practical examples used within the key texts of the two approaches are tan-
gibly different. Pawson and Tilley use examples predominantly focused within
an individual policy domain such as prisoner education programmes, property-
marking schemes or CCTV interventions. The examples provided by the Aspen
Institute, on the other hand, tend to be large-scale geographical, whole-community,
multi-topic, multi-site interventions. These are often aimed at achieving social
inclusion, community building, youth development, crime reduction, improved
education and mainstream social service improvement simultaneously. It is likely
due to such complexity that recent applications of the approach have failed to get
beyond descriptions of implementation theory (Auspos and Kubisch, 2004).
This, aligned with the other differences outlined in this article, convinces us
that the approaches are best suited to different evaluation challenges. A The-
ories of Change approach, we argue, is concerned more with overall programme
outcomes and the synergies between the various strands of an intervention. Thus
it helps to provide a strategic perspective on a complex programme and equates
to what Pawson (2003) describes as staring complexity in the face. The downside,
however, is that the theories uncovered are relatively superficial and are more
likely to remain at the implementation level.
Realistic Evaluation, on the other hand, is concerned less with the overall pro-
gramme and more with the most promising CMO configurations. Pawson (2003:
486) refers to this as ‘concentrating your fire’. In this sense the approach delivers
more precise and substantive programme learning but deals less well with highly
complex, multi-site interventions with multiple outcomes.
One implication from the level at which the two approaches operate is that
there is no obvious reason for believing that Theories of Change and Realistic
Evaluation could not coexist within the one programme evaluation, with the
former providing broad strategic learning about implementation theory and the
latter bearing down on smaller and more promising elements of embedded pro-
gramme theory. Bonner (2003), however, has identified that a lack of programme
clarity, and the ensuing implementation failures that beset many complex pro-
grammes (Bauld et al., 2005), can prevent programmes from reaching the stage
where mechanisms of change can realistically be tested. An explicit attempt to
bring the two approaches together, however, might yield powerful policy as well
as methodological learning.
Turning now to the resource implications of utilizing theory-based approaches,
it is fair to say that both Theories of Change and Realistic Evaluation are likely to
be more time consuming than more traditional approaches that do not uncover the
theory underpinning interventions. The Theories of Change approach, in particu-
lar, requires substantial time to access and work with the wide range of stakehold-
ers necessarily involved in the theory generation. The need for agreement on the
theories articulated, and the specificity required to test that they are plausible, doable,
and testable, requires an iterative and time-consuming process (Mackenzie and
Blamey, 2005). The focus on highly complex, multi-site interventions consequently
requires measurement at multiple levels (the individual, group, organization and
community) making the processes fraught with practical and conceptual difficulty
(Barnes et al., 2003; Sullivan et al., 2002). In addition the approach’s commitment
to improving programme delivery and integrating process and outcome requires
the articulated theory to be revisited over the course of the intervention. Realistic
Evaluation may be less time-consuming since it places a lesser emphasis on gain-
ing stakeholder consensus on articulated theories or evaluation priorities. On the
other hand, additional time is required for evaluators to identify, prioritize and
explore the most promising CMOs.
Finally, the fact that evaluators using either approach work in close proximity
to implementers, and may influence the programme that they will go on to judge,
leaves them open to criticism with regard to their objectivity. Such fears can per-
haps be addressed through phasing the different elements of the approaches or
using different evaluators for formative and summative aspects of the evaluation.
However, the essence of both approaches requires a more intensive relationship
between evaluators and key stakeholders, as without this they would be unable to
uncover the programme theory and reality of programme delivery. It might be
argued that the distance between evaluators and implementers and the consequent
lack of knowledge of intervention theory caused by a preoccupation with supposed
‘objectivity’ has contributed to the limitations of traditional impact evaluations.

Conclusion
From what has gone before it should now be apparent that, while we believe that
Theories of Change and Realistic Evaluation may both be from the same stable,
they are in practice very different horses. The two are not synonymous and will
run best in different parts of the evaluation course. We suggest that they take
different approaches to articulating theory, in practice generate different types
of theory, require differing degrees of stakeholder involvement, are more or less
suited to different degrees of complexity, and ultimately provide different types
of knowledge for implementers, policy-makers and for use in future evaluations.
A challenge for evaluators is to test out how best the approaches might be
used synergistically to improve policy and methodological learning. We suggest
that many policy programmes lend themselves to the explicit testing of a dual
Theories of Change/Realistic Evaluation model. This might entail the use of The-
ories of Change as a means of explicating implementation theory for the purpose
of programme planning, improvement and the development of robust monitor-
ing systems at a macro programme level. Realistic Evaluation approaches might
then be brought to bear on more micro level aspects of the most promising pro-
gramme theories. For example, while Theories of Change might help develop a
more detailed understanding of how a smoking cessation programme was being
implemented at the national and local level and the challenges in giving life to
policy aspirations, Realistic Evaluation could help us to illuminate if and why
specific elements of the intervention work in particular groups or settings. Such
purposive utilization of the two approaches, we suggest, might yield synergistic
learning.
Nonetheless, for any evaluation approach to work effectively within the policy
process, there needs to be an improved understanding of the types of outcome
that can be generated. Policy-makers also need to respond to lessons from the
processes of theory critiquing and testing. Subsequent interventions can only be
improved if those with the power to influence future activity take cognizance of
evaluation learning and change existing systems and practice.
Perhaps it is also true to say that some evaluators have been too cavalier in
touting theory-based approaches as the answer to policy-makers’ and programme
planners’ problems, and what is needed is much greater realism about the size
of the evidence base that can be advanced within any one evaluation. Although
unpalatable, stakeholders in developing policy and in implementing and evalu-
ating programmes should, therefore, take counsel from Weiss (1998) who suggests
that our appetite for learning should be whetted with a dose of reality about how
knowledge accretes over time:
Evaluation will never provide all the answers. What it can do – and this is no minor
contribution – is help to rally support for effective programs, identify innovative pro-
grams that are making advances over current service, and expose the failings of existing
programs, along with indications of the kind of change that would make them work . . .
At one point I bemoaned this slow and indirect approach to social change and yearned
for bolder contributions. In recent years, however, I have come to appreciate how dif-
ficult social change is and how resistant social problems are to intervention. I am more
impressed with the utility of evaluation findings in stimulating incremental increases in
knowledge and in program effectiveness. Over time cumulative increments are not such
small potatoes after all. (Weiss, 1998: 319)

Notes
The authors would like to thank two anonymous reviewers for their comments and, in
particular, one for their notion of evaluator as ‘sceptical observer’. While the views
expressed represent the opinions of the authors, these views were shared with colleagues
(including Ray Pawson, one of the key proponents of Realistic Evaluation) at a Scottish
Evaluation Network event facilitated by the authors in 2003. The authors also presented
a refined version of this work more recently at the 2005 UKES conference. Those in
attendance at both events discussed the presented ideas and helped us to further clarify
our thoughts on these issues. We thank them for their contributions.

1. Theory-based approaches have, however, been criticized for setting up flimsy versions
of both experimental (Julnes et al., 1998) and social constructionist (Dahler-Larsen,
2001) approaches in order to knock them down.
2. Given the poorly specified nature of most programme theories of change, this level
of confidence can rarely be achieved (Mackenzie and Blamey, 2005) and the Aspen
proponents argue that attempts to articulate useful theories have suffered from a poor
knowledge base (Auspos and Kubisch, 2004).

References
Auspos, P. and A. Kubisch (2004) Building Knowledge about Community Change: Moving
beyond Evaluations. Washington, DC: The Aspen Institute.
Barnes, M., H. Sullivan and E. Matka (2003) ‘Evidence, Understanding and Complexity:
Evaluation in Non-Linear Systems’, Evaluation 9: 263–82.
Barnes, M., L. Bauld, M. Benzeval, K. Judge, M. Mackenzie and H. Sullivan (2005) Health
Action Zones: Partnerships for Health Equity. London: Routledge.
Bauld, L., K. Judge, M. Barnes, M. Benzeval, M. Mackenzie and H. Sullivan (2005)
‘Promoting Social Change: The Experience of Health Action Zones in England’,
Journal of Social Policy 34(3): 427–45.
Bonner, L. (2003) ‘Using Theory-Based Evaluations to Build Evidence-Based Health and
Social Care Policy and Practice’, Critical Public Health 13(1): 77–92.
Chen, H. T. (1990) Theory-Driven Evaluation. London: SAGE.
Connell, J. and A. Kubisch (1998) ‘Applying a Theory of Change Approach to the Evalu-
ation of Comprehensive Community Initiatives: Progress, Prospects and Problems’, in
K. Fulbright-Anderson, A. Kubisch and J. Connell (eds) New Approaches to Evaluat-
ing Community Initiatives, vol. 2, Theory, Measurement, and Analysis. Washington, DC:
Aspen Institute.
Connell, J. P., A. C. Kubisch, L. B. Schorr and C. H. Weiss (1995) New Approaches to Evalu-
ating Community Initiatives, vol. 1, Concepts, Methods and Contexts. Washington, DC:
Aspen Institute.

Dahler-Larsen, P. (2001) ‘From Programme Theory to Constructivism: On Tragic, Magic
and Competing Theories’, Evaluation 7: 331–49.
Davies, H. T. O., S. M. Nutley and P. C. Smith, eds (2000) What Works? Evidence Based
Policy and Practice in Public Services. Bristol: Policy Press.
Dowling, B., M. Powell and C. Glendinning (2004) ‘Conceptualising Successful Partner-
ships’, Health and Social Care in the Community 12(4): 309–17.
Fulbright-Anderson, K., A. Kubisch and J. Connell, eds (1998) New Approaches to Evaluat-
ing Community Initiatives, vol. 2, Theory, Measurement, and Analysis. Washington, DC:
Aspen Institute.
Gambone, M. (1998) ‘Challenges of Measurement in Community Change Initiatives’, in
K. Fulbright-Anderson, A. Kubisch and J. Connell (eds) New Approaches to Evaluat-
ing Community Initiatives, vol. 2, Theory, Measurement, and Analysis. Washington, DC:
Aspen Institute.
Julnes, G., M. M. Mark and G. Henry (1998) ‘Promoting Realism in Evaluation: Realistic
Evaluation and the Broader Context’, Evaluation 4(4): 483–506.
Lindgren, L. (2001) ‘The Non Profit Sector Meets the Performance-Management Move-
ment’, Evaluation 7(3): 285–392.
Mackenzie, M. and A. Blamey (2005) ‘The Practice and the Theory: Lessons from the
Application of a Theories of Change Approach’, Evaluation 11(2): 151–68.
Mackenzie, M., A. Blamey and P. Hanlon (2006) ‘Using and Generating Evidence: Policy
Makers’ Reflections on Commissioning and Learning from the Scottish Health Demon-
stration Projects’, Evidence and Policy 2(2): 211–26.
Martin, S. and I. Sanderson (1999) ‘Evaluating Public Policy Experiments: Measuring Out-
comes, Monitoring Processes or Managing Pilots?’, Evaluation 5: 245–58.
Pawson, R. (2002a) ‘Evidence-Based Policy: In Search of a Method’, Evaluation 8(2): 157–81.
Pawson, R. (2002b) ‘Evidence-Based Policy: The Promise of Realist Synthesis’, Evaluation
8(3): 340–58.
Pawson, R. (2003) ‘Nothing as Practical as a Good Theory’, Evaluation 9(4): 471–90.
Pawson, R. (2006) Evidence-Based Policy: A Realist Perspective. London: SAGE.
Pawson, R. and N. Tilley (1997) Realistic Evaluation. London: SAGE.
Stame, N. (2004) ‘Theory-Based Evaluation and Varieties of Complexity’, Evaluation 10(1):
58–76.
Sullivan, H. and M. Stewart (2006) ‘Who Owns the Theory of Change?’, Evaluation 12(2):
179–99.
Sullivan, H., M. Barnes and E. Matka (2002) ‘Building Collaborative Capacity through
“Theories of Change”: Early Lessons from the Evaluation of Health Action Zones in
England’, Evaluation 8: 205–26.
Tilley, N. (2004) ‘Applying Theory-Driven Evaluation to the British Crime Reduction
Programme’, Criminal Justice 4(3): 255–76.
Weiss, C. H. (1995) ‘Nothing as Practical as Good Theory: Exploring Theory-Based Evalu-
ation for Comprehensive Community-Based Initiatives for Children and Families’, in
J. P. Connell, A. C. Kubisch, L. B. Schorr and C. H. Weiss (eds) New Approaches to Evalu-
ating Community Initiatives, vol. 1, Concepts, Methods and Contexts. Washington, DC:
Aspen Institute.
Weiss, C. H. (1998) Evaluation: Methods for Studying Programs and Policies. Upper Saddle
River, NJ: Prentice Hall.
Weitzman, B. C., D. Silver and K.-N. Dillman (2002) ‘Integrating a Comparison Group
Design into a Theory of Change Evaluation: The Case of the Urban Health Initiative’,
American Journal of Evaluation 23(4): 371–85.

AVRIL BLAMEY is a Senior Public Health Advisor with NHS Health
Scotland. Please address correspondence to: NHS Health Scotland,
Clifton House, Clifton Place, Glasgow G3 7LS, UK.
[email: [email protected]]

MHAIRI MACKENZIE is a Senior Lecturer in Public Policy at the University
of Glasgow. Please address correspondence to: Department of Urban Studies,
University of Glasgow, 25 Bute Gardens, Glasgow G12 8RS, UK.
[email: [email protected]]
