Evaluation's influence:
the dream, the nightmare,
and some waking thoughts

Australasian Evaluation Society International Conference
Keynote Address
Sydney, Australia, 1 September 2011
Patricia Rogers [Link]@[Link]
Discussions about evaluation and complexity

1997: Realistic Evaluation. Ray Pawson and Nick Tilley
2002: Complicated and Complex Systems: What Would Successful Reform of Medicare Look Like? Commission on the Future of Health Care in Canada, Discussion Paper No. 8. Sholom Glouberman and Brenda Zimmerman
2003: The New Dynamics of Strategy: Sense-making in a Complex and Complicated World. IBM Systems Journal. Cynthia Kurtz and Dave Snowden
2006: Evidence-Based Policy: A Realist Perspective. Ray Pawson
2007: A Leader's Framework for Decision Making. Harvard Business Review. Dave Snowden and Mary Boone
2007: Getting to Maybe. Frances Westley, Brenda Zimmerman and Michael Quinn Patton
2008: Evaluation in Complex Adaptive Systems. Glenda Eoyang and Thomas Berkas
2008: Using Programme Theory to Evaluate Complicated and Complex Aspects of Interventions. Patricia Rogers
Sources for this thinking about evaluation and complexity

2008: Exploring the Science of Complexity: Ideas and Implications for Development and Humanitarian Efforts. ODI Working Paper 285. Ben Ramalingam and Harry Jones
2008: NORAD conference "Evaluating the Complex", Oslo, Norway
2009: Workshop, Cali, Colombia
2010: Developmental Evaluation. Michael Quinn Patton
2011: Purposeful Program Theory. Sue Funnell and Patricia Rogers
2011: Evaluating the Complex. Kim Forss, Mita Marra and Robert Schwartz (eds)
Two framings of simple, complicated and complex

Simple
  Glouberman and Zimmerman (2002): Tested recipes assure replicability; expertise is not needed.
  Kurtz and Snowden (2003): The domain of the known. Cause and effect are well understood; best practices can be confidently recommended.

Complicated
  Glouberman and Zimmerman (2002): Success requires a high level of expertise in many specialized fields, plus coordination.
  Kurtz and Snowden (2003): The domain of the knowable. Expert knowledge is required.

Complex
  Glouberman and Zimmerman (2002): Every situation is unique; previous success does not guarantee success.
  Kurtz and Snowden (2003): The domain of the unknowable. Patterns are only evident in retrospect. Expertise can help but is not sufficient; relationships are key.

Glouberman, S. and Zimmerman, B. (2002) Complicated and Complex Systems: What Would Successful Reform of Medicare Look Like? Ottawa: Commission on the Future of Health Care in Canada. [Link] les/Glouberman_E.pdf
Kurtz, C. F. and Snowden, D. J. (2003) 'The New Dynamics of Strategy: Sense-making in a Complex and Complicated World', IBM Systems Journal 42(3): 462-83. (They also discuss chaotic and disordered domains.)
Evaluation influence for simple aspects of interventions

What interventions look like: Discrete, standardized intervention
How interventions work: Pretty much the same everywhere
Questions asked in evaluation: What works? Are we doing it right?
Nature of advice given by evaluation: A single way to do it; best practices
Process needed for evaluation influence: Knowledge transfer
Metaphor for evaluation influence: Google directions (one way to do it; little skill needed to follow instructions)
Impact evaluation for complicated aspects of interventions

What interventions look like: Different in different situations
How interventions work: Differently in different situations (different people or different implementation environments)
Questions asked in evaluation: What works for whom in what contexts?
Nature of advice given by evaluation: Contingent; good practices in particular situations
Process needed for evaluation influence: Knowledge translation to new situations
Metaphor for evaluation influence: Transport map and timetable (need some skill to choose the most appropriate option for that time and place)
Impact evaluation for complex aspects of interventions

What interventions look like: Non-standardized and changing; adaptive and emergent
How interventions work: Results sensitive to initial conditions as well as to context; generalisations rapidly decay
Questions asked in evaluation: What is working?
Nature of advice given by evaluation: Dynamic and emergent; principles
Process needed for evaluation influence: Ongoing knowledge generation
Metaphor for evaluation influence: Topographical map and compass (need to work it out as you go along)
Positive influence: the dream
No influence and negative influence: the nightmares
Positive influence
Good evaluation provides... which can lead to...

- Accurate, timely and credible information that identifies and explains poor performance -> fixing problems
- Accurate, timely and credible information that identifies and explains good performance -> reinforcing, supporting, repeating, replicating and expanding good practice
- Clear signals about what is important -> more focus (attention and resources) on priority issues
- Clear accountability and consequences -> increased motivation to find ways to improve performance
- Increased ability to generate and use information -> ongoing capacity to learn
4 common barriers to positive evaluation influence

1. Technical: limitations of available evidence, delays in feedback, uncertainty/disagreement about what is needed in the intervention and the evaluation

[Diagram: inputs -> activities -> outputs -> outcomes, with a delayed feedback loop from evaluation (including all forms of evaluation, from needs assessment, program design, process evaluation and monitoring to impact assessment)]

2. Cognitive: taking in new information, overcoming assumptions

"It is impossible for someone to learn what they think they already know."
Epictetus (AD 55?-135?), Greek Stoic philosopher

3. Emotional: defensive routines in response to shame, fear and grief

"Great nations are like great people:
when they make a mistake, they realize it;
having realized it, they admit it;
having admitted it, they correct it;
they consider those who point out their faults
as their most benevolent teachers."
Lao Tzu

4. Organisational: incentives that support or restrict the generation and use of information for improvement, including organised self-interest, dysfunctional accountability systems and adversarial politics
No influence

According to GAO's analysis, replacing the $1 note with a $1 coin could save the government approximately $5.5 billion over 30 years.

GAO reports reaching similar conclusions:
- Replacing the $1 Note with a $1 Coin Would Provide a Financial Benefit to the Government. March 2011.
- A Dollar Coin Could Save Millions. July 1995.
- 1-Dollar Coin: Reintroduction Could Save Millions If It Replaced the 1-Dollar Note. May 1995.
- One-Dollar Coin: Reintroduction Could Save Millions if Properly Managed. March 1993.
- A New Dollar Coin Has Budgetary Savings Potential but Questionable Acceptability. Washington, D.C.: June 1990.
Ways that evaluation can and does have a negative influence

Negative influence, instrumental use: misleading evaluation leads to incorrect decisions about changes/continuation/termination due to:
- Error
- Data corruption
- Gaming
- Over-generalization (emphasis only on the average effect)
- Failure to appropriately address contributing factors
- Belief in the "randomization fairy"
Data corruption

2001: The company behind the Ambulance Emergency Dispatch Service illegally made phantom calls to boost its performance for financial gain.
2008: Victorian hospitals manipulated data, admitting patients to "virtual wards" and inconsistently measuring waiting times, to meet Government benchmarks for bonus payments.
2011: Prosecutors to review widespread cheating in Atlanta schools; "cheat and exclusion claims rock literacy tests".
Data corruption

Strategies used to manipulate results in drug trials:
- choice of placebo as comparator
- selection of subjects (Bodenheimer, 2000)
- manipulation of doses (Angell, 2004)
- method of drug administration (Bodenheimer, 2000)
- manipulation of timescales (Pollack & Abelson, 2006)
- suspect statistical analysis
- deceptive publication
- suppression of negative results (Mathews, 2005)
- selective publishing (Armstrong, 2006; Harris, 2006; Mathews, 2005; Zimmerman & Tomsho, 2005)
- opportunistic data analysis (Bodenheimer, 2000)
- control of authorship (Bodenheimer, 2000)

House, E. (2008) 'Blowback: the consequences of evaluation', American Journal of Evaluation, vol. 29, no. 4, 416-26.
Evaluation influence understood as knowledge transfer between researchers, policymakers and practitioners

[Diagram: researchers (in a single study, or several studies) find that Thing A works -> policymakers decide to do Thing A -> practitioners do Thing A]
Evaluation influence understood as knowledge translation between researchers, policymakers and practitioners

[Diagram: the same chain (find that Thing A works -> decide to do Thing A -> do Thing A), with caveats at each step.
Researchers: studies might be too narrow and ignore important evidence; there might be unintended negative effects.
Policymakers: there might be differential effects; Thing A might only work in some contexts.
Practitioners: Thing A might not be feasible in other locations, and might not be scaleable to multiple locations.]
Risks of over-generalizing "what works": example - Early Head Start

A review of early intervention programs for disadvantaged children found that some evidence-based programs which were effective on average were not only ineffective but actually damaging for the most disadvantaged, even when properly implemented, and even when the level of participation in the service was analysed (Westhorp, 2008).

For example, the Early Head Start program was found to have unfavourable outcomes for children in families with high levels of demographic risk factors (Mathematica Policy Research Inc, 2002).

Westhorp, G. (2008) Development of Realist Evaluation Methods for Small Scale Community Based Settings. Unpublished PhD thesis, Nottingham Trent University.
Mathematica Policy Research Inc (2002) Making a Difference in the Lives of Infants and Toddlers and Their Families: The Impacts of Early Head Start, Vol. 1. US Department of Health and Human Services.
Belief in the randomization fairy

Example: "Random assignment makes the two groups statistically equivalent in all aspects other than access to treatment, with the result that only the difference in treatment can cause a difference in outcomes between them." (Smith & Sweetman, 2008)
How the randomization fairy can be a dangerous myth

A study that compared results from RCTs to those from observational studies found that, while the overall average effect size was similar, results from RCTs were more varied.

Some of the RCTs produced paradoxical findings (that is, in some trials the intervention produced negative effects and in other trials the same intervention had positive effects), which could be explained by random variation between treatment and control groups in terms of contributing factors (Concato et al., 2000).

Concato, J., Shah, N. and Horwitz, R.I. (2000) 'Randomized, Controlled Trials, Observational Studies, and the Hierarchy of Research Designs', New England Journal of Medicine, 342(25), pp. 1887-1892.
Worrall, J. (2002) What Evidence in Evidence-Based Medicine? Causality: Metaphysics and Methods: Technical Report 01/03. London: Centre for Philosophy of Natural and Social Sciences, London School of Economics.
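The mechanism behind these paradoxical findings can be sketched with a small simulation (the model and all numbers are hypothetical): when trial arms are small, a strong unmeasured contributing factor can fall unevenly across treatment and control groups by chance, so individual trials of a genuinely beneficial intervention can estimate a negative effect.

```python
import random
import statistics

def run_small_rct(n_per_arm=10, true_effect=1.0, factor_effect=3.0, seed=None):
    """Simulate one small RCT where the outcome depends on the treatment
    plus a strong unmeasured contributing factor (present in roughly half
    of participants). Randomisation balances the factor only on average."""
    rng = random.Random(seed)

    def outcome(treated):
        has_factor = rng.random() < 0.5          # unmeasured contributing factor
        return ((true_effect if treated else 0.0)
                + (factor_effect if has_factor else 0.0)
                + rng.gauss(0, 1))               # measurement noise

    treatment = [outcome(True) for _ in range(n_per_arm)]
    control = [outcome(False) for _ in range(n_per_arm)]
    return statistics.mean(treatment) - statistics.mean(control)

estimates = [run_small_rct(seed=s) for s in range(500)]
print(f"mean estimated effect across 500 trials: {statistics.mean(estimates):+.2f}")
print(f"trials with a paradoxical (negative) estimate: {sum(e < 0 for e in estimates)}")
```

On average the estimate recovers the true effect, but a noticeable minority of the simulated trials point the wrong way purely because of chance imbalance in the contributing factor, the pattern Concato et al. describe.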
Failure to address contributing factors: 1st potted-plant thought experiment

If 200 potted plants are randomly assigned to either a treatment group that receives daily water or to a control group that receives none, and both groups are placed in a dark cupboard, the treatment group does not have better outcomes than the control.

Erroneous conclusion: watering plants is an ineffective intervention.

Failure to address combined attribution: 2nd potted-plant thought experiment

If 200 potted plants are randomly assigned to either a treatment group that receives daily water or to a control group that receives none, and both groups receive light, the treatment group has better outcomes than the control.

Inappropriate question: what proportion of survival and growth is due to the water?
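Both thought experiments can be reproduced in a toy simulation (the growth model and all numbers are invented for illustration): growth requires both water and light, so a trial run in a dark cupboard finds no effect of watering, while a trial run in the light shows a large effect that cannot be meaningfully apportioned between water and light.

```python
import random

def grow(watered, lit, rng):
    """Hypothetical growth model: plants grow only if they get BOTH water
    and light; light is the necessary contributing factor."""
    return (5.0 if (watered and lit) else 0.0) + rng.gauss(0, 0.5)

def rct_effect(lit, n_per_arm=100, seed=0):
    """Difference in mean growth: treatment (daily water) minus control."""
    rng = random.Random(seed)
    treatment = [grow(True, lit, rng) for _ in range(n_per_arm)]
    control = [grow(False, lit, rng) for _ in range(n_per_arm)]
    return sum(treatment) / n_per_arm - sum(control) / n_per_arm

print(f"dark cupboard: estimated effect of watering = {rct_effect(lit=False):+.2f}")
print(f"with light:    estimated effect of watering = {rct_effect(lit=True):+.2f}")
```

The first run makes watering look ineffective (the erroneous conclusion); the second shows a real effect, but asking what proportion of the growth is "due to" the water remains the inappropriate question, because without light there would be no growth at all.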
Ways that evaluation can and does have a negative influence

Negative influence, process influence:
- Goal displacement diverts efforts from achieving goals to meeting targets
- Gaming and data corruption lead to cynicism (process influence) as well as incorrect decisions (instrumental use)
- Shaming, reducing people's mana, encourages defensive routines and disengagement
- An overwhelming scale of deficiencies reduces motivation to try
- Dissing (disrespecting) inappropriately damages reputation, reducing support
- Opportunity cost: using resources that could be used for operations and for useful evaluation
Gaming

"High standards 'rob' Victoria"
Dan Harrison, Education Correspondent, June 28, 2011

Federal Schools Minister Peter Garrett, who will announce the payments today, said the result reflected the fact Victoria had set ambitious targets. "These results reflect the fact that Victoria was starting from a high base and was monitoring performances for the largest number of students of all the states, so may have been over-ambitious in its targets."

STATE   TARGET    PERFORMANCE   PAYMENT
QLD     81.5%     88.9%         $48.5 million
VIC     92.98%    91.5%         $9.4 million
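The perverse incentive in the table can be seen with a few lines of arithmetic (figures taken from the table above; the pay-on-target rule is a simplification of the actual scheme):

```python
# Reward-for-target results reported in the article (simplified).
results = {
    "QLD": {"target": 81.5, "performance": 88.9, "payment_m": 48.5},
    "VIC": {"target": 92.98, "performance": 91.5, "payment_m": 9.4},
}

for state, r in results.items():
    met = r["performance"] >= r["target"]
    print(f"{state}: target {r['target']}%, achieved {r['performance']}% "
          f"({'met' if met else 'missed'}), paid ${r['payment_m']} million")

# Victoria's absolute performance (91.5%) comfortably exceeds Queensland's
# target (81.5%), yet Victoria is paid far less: the scheme rewards setting
# modest targets, not performing well.
```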
Some strategies for sharing knowledge about the nightmares of no influence and negative influence

Genuine Evaluation: [Link]
With Jane Davidson and guest bloggers. Commentary on current examples of genuine and non-genuine evaluation, including barriers to influence and risks of negative influence.
Some strategies for sharing knowledge about the nightmares of no influence and negative influence

BetterEvaluation: [Link]
Improving evaluation practice and theory by sharing information about evaluation methods and approaches.
Founding partners: Overseas Development Institute, Pact, ILAC (Institutional Learning and Change Initiative of the Consultative Group on International Agricultural Research), RMIT University.
Current funders: Rockefeller Foundation and IFAD (International Fund for Agricultural Development).
[Diagram: documenting and sharing. An R&D community and events feed a website of descriptions, comments, examples, guides and tools.]
Sources of content: descriptions, examples, guides
Addressing complicated aspects of interventions

Support evaluation users to translate findings to a new situation:
- Maybe the same theory of change but a different theory of action
- Understand which contexts enable/prevent the mechanisms
- Recognise the multiple contributors to impacts

"Does it have to be pizza?"
Addressing complex aspects of interventions

Support evaluation users in short-run learning cycles and ongoing adaptation.
Support effective working relationships.
Traditional software development     Traditional evaluation development

Specification                        Terms of Reference / Evaluation Brief
Design                               Evaluation Design / Plan
Implementation                       Implementation
Delivery                             Delivery
Agile software development           Agile ongoing evaluative inquiry
(adaptive and responsive
as specs change)

Overall goal                         Overall goal
Quick build                          Initial data sweep (Bron McDonald);
                                     reality testing (Michael Patton)
Review and revise                    Iterative, ongoing inquiry;
Repeat quick build                   real-time data, self-monitoring
Example of real-time, collaborative mapping: Ushahidi