Article

American Journal of Evaluation
1-20
© The Author(s) 2018
Article reuse guidelines: [Link]/journals-permissions
DOI: 10.1177/1098214018804195
[Link]/home/aje

Assessing Triangulation Across Methodologies, Methods, and Stakeholder Groups: The Joys, Woes, and Politics of Interpreting Convergent and Divergent Data

Rebecca Campbell1, Rachael Goodman-Williams1, Hannah Feeney1, and Giannina Fehler-Cabral2

Abstract
The purpose of this study was to develop triangulation coding methods for a large-scale action
research and evaluation project and to examine how practitioners and policy makers interpreted
both convergent and divergent data. We created a color-coded system that evaluated the extent of
triangulation across methodologies (qualitative and quantitative), data collection methods (obser-
vations, interviews, and archival records), and stakeholder groups (five distinct disciplines/organi-
zations). Triangulation was assessed for both specific data points (e.g., a piece of historical/
contextual information or qualitative theme) and substantive findings that emanated from further
analysis of those data points (e.g., a statistical model or a mechanistic qualitative assertion that links
themes). We present five case study examples that explore the complexities of interpreting tri-
angulation data and determining whether data are deemed credible and actionable if not convergent.

Keywords
mixed methods, action research, participant observation, qualitative methods, quantitative methods

The concept of triangulation has a long history in the social sciences, capturing the interest of
quantitative, qualitative, and mixed methods scholars alike. In quantitative research, D. T. Campbell
and Fiske (1959) created the multitrait, multimethod matrix for systematically comparing findings
across different data collection methods; if results converged across methods, D. T. Campbell and
Fiske (1959) argued that researchers could have greater confidence in the validity of their

1 Michigan State University, East Lansing, MI, USA
2 Harder+Company Community Research, San Francisco, CA, USA

Corresponding Author:
Rebecca Campbell, Michigan State University, 130A Psychology Bldg., East Lansing, MI 48824, USA.
Email: rmc@[Link]

conclusions (see also Webb, Campbell, Schwartz, & Sechrest, 1966). In qualitative research, early
conceptualizations of triangulation highlighted how multiple methods can reveal shared perspec-
tives and realities, without making epistemological claims regarding the “truth” of the findings
(Denzin, 1978; Lincoln & Guba, 1985). In mixed methods research, triangulation has been lauded
as a strategy for exploring viewpoints revealed through divergent data (Greene, 2007; Greene &
Caracelli, 1997; Greene, Benjamin, & Goodyear, 2001). Although quantitative, qualitative, and
mixed methodologists have varied epistemological and practical reasons for studying triangulation,
there remains an enduring shared value across these methodological traditions for collecting and
comparing information generated by disparate methods and sources.
In mixed methodology scholarship specifically, the literature on triangulation has been centered
primarily on what Greene and Caracelli (1997) termed the theoretical level of inquiry to engage the
paradigmatic and epistemological issues inherent in evaluative comparisons of data collected by
different means. By contrast, the technical levels of inquiry are far less developed, as procedural
guidance for mixed method triangulation coding is sparse (see Farmer, Robinson, Elliott, & Eyles,
2006; McConney, Rudd, & Ayres, 2002; Sands & Roer-Stier, 2006, as notable exceptions). Like-
wise, few scholars have delved into the political level of inquiry to explore how studying triangula-
tion can reveal contested spaces in evaluation projects and influence what stakeholders deem
credible and actionable evidence (Donaldson, Christie, & Mark, 2015). The purpose of this article
is to advance these levels by presenting a practical framework for coding data convergence in a
large-scale, policy-focused mixed methodology project and to explore the political implications of
both convergent and divergent data. We present five case studies that illustrate how our triangulation
assessments influenced what actions our evaluation team and our community stakeholders decided
to take—or not to take—in this policy context. To set the stage for this project, we will begin with a
brief review of the literature on triangulation coding in mixed methods research and evaluation.

The Technical Level: What Is Triangulation and How Is It Assessed?


To begin, a working definition of triangulation is necessary, and we draw upon Denzin’s (1978)
influential conceptualization as a starting point. Given that social science methodologies have
evolved considerably since the 1970s, we also offer our reinterpretations of Denzin’s four types
of triangulation in this exploration. First, he defined methodological triangulation as the use of
multiple methods to study the phenomenon of interest, working either within method (e.g., multiple
variations of a survey) or between methods (e.g., a survey and observational data collection). Denzin
did not draw a distinction between methodology and methods, but it seems reasonable to infer that
this type of triangulation could include multiple methodologies (e.g., quantitative and qualitative
methodologies) and/or multiple data collection methods (interviews, surveys, observations, archival,
each of which could be quantitative or qualitative). Second, Denzin defined data triangulation as
collecting data from multiple sources (i.e., people/stakeholders), across time, space, and person.
Third, for investigator triangulation, more than one investigator/evaluator is involved in the process
of collecting, analyzing, and/or interpreting data to assess the extent to which the investigators reach
similar conclusions. Finally, Denzin proposed theoretical triangulation whereby multiple theories
are brought to bear in the study of the phenomenon of interest. Denzin noted this form of triangula-
tion is rare and likely impractical (see also critiques by Mathison, 1988), and he did not pursue this
type of triangulation in his later works.
Building upon Denzin’s (1978) three main types of triangulation, Figure 1 presents a visual
schematic of mixed methods triangulation design options. In this figure, methodological triangula-
tion is represented in the columns, including both qualitative and quantitative methodologies and
multiple methods. For simplicity, we include only three data collection methods (interviews/sur-
veys, observations, and archival records), though this could be expanded to include others.

Figure 1. Mixed Methods Triangulation Design Options. [Figure not reproduced: a grid crossing the two methodologies (qualitative, quantitative) and three data collection methods (interviews/surveys, observations, archival) in the columns with stakeholder Groups A, B, and C in the rows; every cell is marked as a possible data source. Shading in the original figure highlights four example designs: within methodology, across method, within stakeholder group; within methodology, within method, across stakeholder group; across methodology, across method, within stakeholder group; and across methodology, across method, across stakeholder group.]

In the rows, we represent stakeholder triangulation, a renaming of Denzin's data triangulation, to reflect
the practice of collecting the data from multiple individuals within and across stakeholder groups. As
Figure 1 illustrates, researchers/evaluators could employ narrowly defined mixed method triangula-
tion designs, such as within methodology (all qualitative), across data collection methods (inter-
views, observations, archival), and within stakeholder group (one stakeholder group). Another
example of a tightly focused mixed method triangulation could be a within methodology (qualita-
tive), within data collection method (interview), and across stakeholder groups. Broader interpreta-
tions of mixed methods triangulation would involve crossing the methodological boundary to
include both qualitative and quantitative work, with varied data collection methods, within or across
stakeholder groups. For any of these design options, investigator triangulation could be added
whereby the collection, analysis, and interpretation of the data would involve multiple investigators.
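To make these design options concrete, a triangulation design can be thought of as the set of methodology-method-stakeholder cells in Figure 1 that a project actually fills. The sketch below is a minimal illustration under that framing (the helper function and the generic group labels are ours, not part of the original figure); it classifies a design as "within" or "across" on each dimension.

# Illustrative only: represent a triangulation design as a set of
# (methodology, method, stakeholder) cells and describe it along Figure 1's dimensions.
METHODOLOGIES = ("qualitative", "quantitative")                      # Figure 1 columns (outer)
METHODS = ("interviews/surveys", "observations", "archival")         # Figure 1 columns (inner)
STAKEHOLDERS = ("Group A", "Group B", "Group C")                     # Figure 1 rows

def describe(design):
    """Classify a design (a set of (methodology, method, stakeholder) cells)."""
    dims = zip(*design)  # regroup the cells into one tuple per dimension
    labels = []
    for name, values in zip(("methodology", "method", "stakeholder"), dims):
        labels.append(f"{'across' if len(set(values)) > 1 else 'within'} {name}")
    return ", ".join(labels)

# Example: an all-qualitative design using multiple methods with one stakeholder group
design = {("qualitative", m, "Group A") for m in METHODS}
print(describe(design))  # -> "within methodology, across method, within stakeholder"

Any of the four shaded options in Figure 1 can be expressed this way; the Detroit project described later in this article fills cells across both methodologies, multiple data collection methods, and five stakeholder groups.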
Figure 1 highlights that researchers/evaluators could be collecting substantial amounts of data,
depending on the complexity of the triangulation design (different methodologies, methods, stake-
holders, and/or investigators), so how should this disparate information be analyzed to assess the
nature and extent of convergence? The technical literature consists primarily of case studies high-
lighting triangulation results (e.g., Bailey & Hutter, 2008; Deacon, Bryman, & Fenton, 1998; Flick,
Garms-Homolova, Herrmann, Kuck, & Röhnsch, 2012; Greene & McClintock, 1985; Murphy,
Banyard, & Fennessey, 2013), with fewer examples presenting methods for assessing convergence
and divergence. McConney and colleagues (2002) developed a mathematical coding and weighting
system whereby evaluators could rate conflicting findings on a number of dimensions (accuracy,
relevance, representativeness) to yield a final summative evaluation of a program’s merit or worth.
For exploratory formative projects, whereby researchers/evaluators are not tasked with making
summative judgments, coding the extent of convergence and reasons for divergence may be illu-
minating. For example, Farmer and colleagues (2006) proposed a four-category rating system,
whereby emerging themes from each stakeholder group and/or data collection method are compared
and evaluated for (1) full agreement across sources; (2) partial agreement across sources; (3) silence,
when a theme emerges from one stakeholder group or data collection method, but not others; and (4)
dissonance, when there is complete disagreement across sources. Farmer, Robinson, Elliott, and
Eyles (2006) employed these methods to assess convergence in an evaluation of the Canadian Heart

Health Dissemination Project, which involved collecting 40 qualitative interviews and 30 archival
records across three Canadian provinces (i.e., a within methodology, across method, across stake-
holder triangulation). They found that these coding methods worked well for documenting conver-
ging findings and for distinguishing incomplete from divergent data.
A similar process for coding convergence was proposed by Sands and Roer-Stier (2006) in their
work interviewing 17 mother–daughter dyads (i.e., a within methodology, within method, across
stakeholder triangulation). They examined the extent to which emerging themes in the mothers’
interviews were consistent with their daughters’, using a five-category rating system: (1) same
story–same meaning, (2) same story–different interpretation, (3) missing pieces (both women were
privy to a piece of information but only one mentioned it in her interview), (4) unique information
(information was known to only one woman), and (5) illuminating (different accounts were provided
by each participant). Like Farmer et al.’s (2006) coding scheme, this approach also distinguishes
agreement from completeness but refines the assessment of missingness to differentiate information
that was not shared by both but was (presumably) known to both, from information that was not
known to both. It may be difficult to ascertain what participants could have possibly known or not
known, but Sands and Roer-Stier (2006) were able to make these distinctions in their project.
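For readers who prefer to see these two rating systems side by side, the sketch below restates them as simple category sets; the descriptions follow the published categories, but the enumeration names themselves are our own shorthand.

# Shorthand restatement of the two convergence-rating systems discussed above.
from enum import Enum

class FarmerCode(Enum):          # Farmer, Robinson, Elliott, & Eyles (2006)
    FULL_AGREEMENT = 1           # full agreement across sources
    PARTIAL_AGREEMENT = 2        # partial agreement across sources
    SILENCE = 3                  # theme emerges from one source but not others
    DISSONANCE = 4               # complete disagreement across sources

class SandsRoerStierCode(Enum):  # Sands & Roer-Stier (2006)
    SAME_STORY_SAME_MEANING = 1
    SAME_STORY_DIFFERENT_INTERPRETATION = 2
    MISSING_PIECES = 3           # known to both participants, mentioned by only one
    UNIQUE_INFORMATION = 4       # known to only one participant
    ILLUMINATING = 5             # different accounts provided by each participant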
The coding methods proposed by Farmer and colleagues (2006) and Sands and Roer-Stier (2006)
are consistent with Greene’s (2002, 2007; Greene et al., 2001) calls for the systematic study of
diversity not merely the documentation of convergence. Furthermore, these coding methods assess
the extent of convergence and possible reasons for divergent points of view, rather than assigning a
simple yes/no determination of whether information converged. Both projects were within metho-
dology assessments—all qualitative—and relatively small in scale. How triangulation coding could
be employed in large projects that transcend methodological boundaries to include qualitative and
quantitative data remains in need of further exploration.

The Political Level: What Might Triangulation Reveal?


Much of the literature on triangulation focuses on convergence and the claims of credibility that can
be made when multiple methods and sources provide consistent information (Erzberger & Prein,
1997; Flick, 1992; Hammersley, 2008; Howe, 2012; Moran-Ellis et al., 2006; Morse, 2015), but
reviewing triangulation design options (Figure 1) and triangulation coding methods (above) under-
scores that divergent findings are perhaps more likely, as there are so many ways for data not to
converge. Indeed, Mathison (1988) explained that the expectation for convergence is
“unrealistic . . . in practice, triangulation as a strategy provides a rich and complex picture of some
social phenomenon being studied, but rarely does it provide a clear path to a singular view of what is
the case” (pp. 13, 15). Likewise, Kidder and Fine (1987) noted that stakeholder groups’ cultures may
be so vastly different that convergence of their narratives is simply not possible. Greene and
colleagues have argued that this is precisely why mixed method designs are valuable, as divergent
findings create an opportunity for “more insightful understanding,” which can “lead to conceptual
re-reframing . . . or to discovery of new concepts that constitute theoretical or practical insight”
(Greene et al., 2001, p. 30; see also Greene, 2002, 2007; Greene & Caracelli, 1997; Greene &
McClintock, 1985). Greene, Benjamin, and Goodyear (2001) further contended that “because dif-
ferent methods themselves advance different values and interests, mixed method evaluation invites
pluralism and thus critical commentary, public debate, and deliberative dialog about diverse per-
spectives on important public issues” (p. 30). As a type of mixed methods research, triangulation
coding will likely reveal disagreements and inconsistencies, so what are the potential political
implications of uncovering contested issues?
In an evaluation context, triangulation results may affect what stakeholders consider credible and
actionable evidence (Donaldson et al., 2015). If findings do not converge across methods and

sources, are they still actionable? Rallis (2015) noted that transparency of process is key to cred-
ibility: “stakeholders use findings, depending on whether they understand and accept how we have
created and told the story, that is, how we made meaning of words and images” (p. 143). Triangula-
tion coding is an excellent way to show where there are convergent and divergent perspectives, and
that transparency may be instrumental for identifying actionable next steps, even when there is
disagreement. At the same time, Miller (2015) noted that accuracy of evaluative data also determines
what stakeholders deem credible and actionable, and triangulation coding could reveal deeper
problems in the data. Comparing methods and sources could show that some information is not
simply not correct—not a difference in opinion or a different point of view, but an inaccuracy. Did
sources simply misunderstand or misremember, or did sources deliberately mislead the evaluator?
This is an uncomfortable reality, one readily acknowledged in evaluation practice (Patton, 2008;
Weiss, 1973), but triangulation coding may lay that reality bare, in incontrovertible detail. As such,
triangulation is not merely a methodological task of cross-checking data, it is also a commitment to
engaging political complexities and competing interests that may affect the credibility and usability
of the data.

The Current Study


The purpose of this study was to take up the technical challenge of developing triangulation coding
methods for large-scale, across-methodology projects and to advance political analysis by sharing
how a multidisciplinary team of evaluators, practitioners, and policy makers interpreted both con-
vergent and divergent data. Specifically, we had three primary goals. First, at a technical level, we
wanted our coding methods to capture convergence separate from missingness and uniqueness and
to provide some context as to why data did or did not converge, consistent with the coding proce-
dures developed by Farmer et al. (2006) and Sands and Roer-Stier (2006). However, given the large
volume of data collected in this project, we needed simplified and streamlined coding strategies that
would help us capture these key distinctions reliably but efficiently. To that end, we created a green-
yellow-red coding system to assess triangulation, evoking the commonly understood “go-caution-
stop” meaning implied in this color-coding scheme.1 The “green” codes signal triangulation across
methodology, data collection method, and stakeholder group, giving clear evidence from multiple
methods/sources regarding the presence and shared meaning of information. “Yellow” data merit
caution because the presence and/or meaning of a theme is in dispute, or the data could reflect an
insular point of view that is not shared by others. “Red” data are incomplete: Information was
provided by only one data source, so it may be necessary to gather additional data to determine
whether other methods/sources can “fill in the hole” (thereby converting “red” data to “yellow” or
“green”). Alternatively, the information may be unique and no one else or no other data source can
verify it, so the evaluator must decide whether further analysis and action is warranted or if there
may be reasons not to share that information (e.g., confidentiality).
Second, to address both technical and political concerns, we applied this green-yellow-red coding
method to individual data points (e.g., factual information or qualitative theme) and to findings/
results that emanated from further analysis of those data points (e.g., a statistical model or a
mechanistic qualitative assertion that links themes). In this project, we had reason to believe that
stakeholders might purposely provide incorrect information to try to sway the evaluation (see below
for further discussion of the context of this work), so we needed to check the accuracy of information
(for factual data points) and the degree of disagreement (for thematic data points) before proceeding
with further analysis to generate findings from those data. Assessing triangulation of both individual
data points and findings based on those data is not yet common in the literature, but we felt it was
necessary for promoting transparency (Rallis, 2015) and checking accuracy (Miller, 2015), which
are critical determinants of credibility and usability. Put another way, in high-stakes, high-conflict

projects, it is worth the effort to find misinformation before scaffolding those inaccuracies into
broader findings.
Third, at the political level of analysis, we wanted to promote discussion about what evaluators
should do with divergent data. The historical presumption has been that convergent data lend
credibility and confidence and are therefore actionable (e.g., D. T. Campbell & Fiske, 1959; Erzberger & Prein, 1997; Webb et al., 1966). The implied corollary is that divergent data are not
credible and therefore not actionable, but Greene’s (2002, 2007; Greene et al., 2001) theoretical
analysis challenges that presumption, so we wanted to explore how stakeholders decide whether
convergent and divergent data are credible and actionable (McConney et al., 2002). We will share
five case studies that illustrate how we implemented our color-coding system and what actions
we—both the evaluation team and our collaborators—decided to take based on green, yellow, and
red triangulation findings.

The Context for the Current Study: The Detroit Sexual Assault Kit (SAK) Action Research
Project
Although this is an article about methodological triangulation, it is necessary to provide some
context about the substantive focus of this project, as the results are case study examples of con-
vergence and divergence that require content grounding for comprehension. Briefly, this project
addressed the growing national problem of untested SAKs (also termed rape kits). A rape kit is
typically collected within 24–72 hr after a sexual assault in order to obtain biological evidence from
victims’ bodies (e.g., semen, blood, saliva; Department of Justice, 2013). This evidence can be
analyzed for DNA and compared against other criminal reference DNA samples in the Combined DNA
Index System (CODIS), the federal DNA database, which can be instrumental in solving crimes and prose-
cuting rapists (R. Campbell, Feeney, Fehler-Cabral, Shaw, & Horsford, 2017; Strom & Hickman,
2010). However, in jurisdictions throughout the United States, police frequently do not submit rape
kits for forensic DNA testing, and instead, kits are shelved in police property, unprocessed and
ignored for years (R. Campbell et al., 2017; Pinchevsky, 2018). Conservative estimates indicate
there are at least 200,000 untested SAKs in U.S. police departments, and large stockpiles of kits have
been documented in over five dozen jurisdictions, sometimes totaling more than 10,000 untested
SAKs in a single city (R. Campbell et al., 2017). The failure to test rape kits for DNA evidence has
drawn public outrage as well as the attention of Human Rights Watch (2009, 2010) and the U.S.
Department of Justice (2015), which highlighted the problem of untested SAKs as an example of
biased and discriminatory police practices in their policy report, Gender Bias in Law Enforcement
Response to Sexual Assault.
Detroit, MI, was one of many U.S. cities with large numbers of untested rape kits. In August
2009, approximately 11,000 SAKs were found in a remote police property storage facility. Local,
county, and state officials demanded immediate review and intervention. In fortuitous timing, the
National Institute of Justice released funding for collaborative action research projects whereby
researchers would work with community practitioners in jurisdictions with large numbers of
untested SAKs to develop and evaluate change strategies (NIJ, 2010). In the action research model,
researchers/evaluators collect data about the problem at hand, share the findings with community
collaborators, and together, informed by those results, the team develops, implements, and evaluates
change strategies until successful solutions are institutionalized (Klofas, Hipple, & McGarrell,
2010). Detroit, MI, was selected for one of these grants, and we (the first and last authors of this
article) were the primary research partners in this project.
One of the required aims of the action research project was to examine the underlying reasons
why so many SAKs were not submitted by the police for forensic DNA testing. At the time this
project began, law enforcement officials were defending their decisions not to test these rape kits for DNA and did not perceive that there was a problem to be solved.

Figure 2. Mixed Methods Triangulation Design for the Detroit SAK Action Research Project. [Figure not reproduced: a grid crossing the two methodologies (qualitative, quantitative) and three data collection methods (interviews/surveys, observations, archival) with the five stakeholder groups (A. Police; B. Prosecutors; C. Forensic Science/Crime Lab; D. Nursing/Medicine; E. Victim Services/Advocacy); dots mark the cells in which data were collected for each group. Totals: 42 formal interviews and 187 informal interviews; ~186 hours of observations over 30 months; 1,268 sexual assault case police reports and 93 intra-organization documents, coded both qualitatively and quantitatively.] Note. SAK = Sexual Assault Kit.

By contrast, practitioners from
other disciplines—victim advocacy, nursing/medicine, prosecution, and forensic sciences—were
alarmed that so many kits had not been tested, particularly because so many of these victims were
Black women and/or poor women (see R. Campbell et al., 2015). This was a point of tremendous
conflict between stakeholders, so the action research project began under a dark cloud of interper-
sonal and interorganizational tension. In this context, it seemed possible, perhaps even likely, that
stakeholders might try to sway our investigation into why rape kits were not tested, and thus,
triangulating information across stakeholder groups and data collection methods seemed prudent.
Elsewhere, we discuss the substantive findings of this component of the action research project (see
R. Campbell et al., 2015), but briefly, rape kits were not submitted for DNA testing because Detroit-
area organizations simply did not have adequate resources (staffing, time, financial) to test all kits
and investigate all reported sexual assault cases. However, there was also clear and compelling
evidence that rape kits were not tested because police did not believe rape victims, and their
adherence to rape myth stereotypes influenced their decisions not to invest their limited resources
in this crime and these victims. In this article, we will share how triangulation coding helped us
uncover these findings.

Method
Triangulation Design
Figure 2 summarizes the triangulation design we used in the Detroit SAK Action Research Project.
We collected data from all five stakeholder groups/disciplines that are involved in collecting and
testing rape kits, investigating and prosecuting sexual assault crimes, and providing services to
victims, including police, prosecution, forensic science/crime laboratory, nursing/medicine, and

victim services/advocacy. One critical stakeholder group was not formally represented in this study:
rape survivors whose SAKs had not been submitted for testing. In the field of sexual assault
research/evaluation, it is rare that rape survivors can be interviewed in the midst of pending legal
cases, as the researchers can inadvertently become parties to the case. Each one of the ~11,000
untested SAKs represented a potentially open, active legal case, given that the prosecutors’ office
had made clear that they intended to have these cases investigated and reevaluated for possible
prosecution. Our institutional review board (IRB) was concerned about possible negative iatrogenic
effects of the research on open case proceedings, a concern strongly shared by the prosecutors. As
such, we could not include rape survivors whose kits had not been tested as a stakeholder group. We
sought other avenues for obtaining survivors’ perspectives in the action research project (see R.
Campbell et al., 2015), but we could not include them as a stakeholder group in the triangulation
design.
We used multiple data collection methods with each stakeholder group, including qualitative
ethnographic observations, qualitative interviews, and archival records that were qualitatively and
quantitatively coded (e.g., sexual assault police reports, intraorganizational records of staffing
levels, policies and procedures, see Figure 2). For the qualitative observational and interview data,
we included high-level leadership and frontline practitioners from each stakeholder group, as well
as current and former employees (e.g., key individuals who had changed positions or retired but
were once closely involved in the issue of SAK processing). The data were reasonably well
distributed across stakeholder groups, such that we had multiple observations, interviews, and
archival records within each stakeholder group and across all stakeholder groups (see R. Campbell
et al. [2015] for more information regarding how data saturation was monitored throughout the
project).

Triangulation Coding Procedures


Throughout data collection, the evaluation team open coded all data sources to capture emerging
themes and monitor data saturation (Corbin & Strauss, 2008; Miles, Huberman, & Saldana, 2014).
We used a standard word processing program to conduct this coding, though specialized qualitative
software programs are also available. The open codes captured two kinds of data points: (1) factual
information that may be of interest in future analyses (e.g., dates that key events occurred, staffing
levels over time) and (2) emerging qualitative themes (e.g., beliefs about rape victims, level of
investigational effort invested in a case). We then reviewed all open codes across all data sources to
decide which data points should be subjected to triangulation analyses. Given the labor involved in
conducting such an analysis, we did not seek to triangulate every piece of factual information/
thematic code but only those pertaining to our focal research question regarding why this city had
amassed so many untested rape kits (see Mathison, 1988).
To assess triangulation of a data point, we compared the information that was provided about that
fact/theme by each stakeholder group to determine whether individuals within the same stakeholder
group provided the same information and whether individuals from different stakeholder groups also
provided the same information. Likewise, we checked whether the same information appeared
across different data collection methods (e.g., the same fact/theme that emerged in a qualitative
interview could be verified in archival records). Table 1 provides the operational definitions of
green, yellow, and red triangulation codes that guided our work. Once we were deeper in the data, we
decided to form two “green” codes: dark green to reflect that information converged across meth-
odology, data collection method, and stakeholder group (all three), and “regular” green to reflect
convergence across any two (methodology, method, or stakeholder). Again, we used standard word
processing and spreadsheet programs to track this work; specialized qualitative software also pro-
vides options for conducting these comparisons.

Table 1. Triangulation Coding Operational Definitions and Examples.

Dark Green (within and across stakeholder groups; across data collection methods): Individuals within the same AND different stakeholder groups, AND multiple data collection methods triangulate. Example: Multiple qualitative interviews with participants in Groups A AND B, AND archival records confirm the same information.

Green (across stakeholder groups within one data collection method, OR across data collection methods within one stakeholder group): Individuals within the same AND different stakeholder groups using one data collection method triangulate; or individuals within the same stakeholder group AND multiple data collection methods triangulate. Example: Multiple qualitative interviews with participants in Groups A AND B confirm the same information; or multiple qualitative interviews with participants in Group A AND archival records confirm the same information.

Yellow (within one stakeholder group and one data collection method only, or conflicting information): Individuals within the same stakeholder group using one data collection method triangulate, with NO triangulation from different stakeholders or different data collection methods; or using the same method with different stakeholder groups, or a different method within the same stakeholder group, reveals different meanings or unclear, equivocal, or no information to reconcile perspectives. Example: Multiple qualitative interviews with participants in Group A, BUT no other data collection method confirms the same information; OR multiple qualitative interviews with participants in Group A conflict with those in Group B, and other data collection methods provide no clear insights.

Red (single source only): A single individual provides information that is not triangulated by other stakeholders or by another data collection method. Example: One participant in Group A provides information in a qualitative interview that could not be confirmed by any other person or method.

Two members of the evaluation team completed
this process of comparing information across stakeholder groups and data types (i.e., investigator
triangulation), as this task required judgment regarding the extent to which the information con-
verged/did not converge (MacQueen, McLellan-Lemal, Bartholow, & Milstein, 2008). The coders
reviewed and discussed the compiled information and then, using our green-yellow-red color-coding
system, evaluated the extent to which each data point converged across methodology/method/
source. There was consistently high agreement between the coders for the dark green, green, and
red codes; yellow codes required more extensive review to determine the final triangulation code.
Data points that were green or yellow proceeded to the next stages of data analyses (see Results for
our rationale to analyze yellow data). For qualitative data, we employed Erickson’s (1986) analytic
induction method to identify mechanistic assertions that linked individual themes together into a
hypothesized explanation for why SAKs went untested for decades. The quantitative data were
analyzed with multilevel logistic regression to examine whether historical and contextual factors
(e.g., the date when federal funding for DNA testing became available) predicted SAK testing rates
(see R. Campbell et al. [2015] for details).
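Readers who wish to automate this comparison rather than track it in a spreadsheet could express the operational definitions in Table 1 as a simple decision rule. The Python sketch below is purely illustrative and is not the tooling we used; the record structure (a data point tagged with the stakeholder groups that confirmed it under each data collection method, a conflict flag, and a source count) is an assumption, and for brevity it collapses methodology and data collection method into a single dimension.

# Minimal sketch of the Table 1 decision rules (illustrative only; the project itself
# used standard word processing and spreadsheet programs, not this code).
from dataclasses import dataclass, field

@dataclass
class DataPoint:
    label: str
    # Which stakeholder groups confirmed the fact/theme, keyed by data collection method,
    # e.g., {"interview": {"police", "prosecutors"}, "archival": {"police"}}
    confirmations: dict = field(default_factory=dict)
    conflicting: bool = False  # other sources actively contradict the information
    n_sources: int = 0         # total number of individuals/records providing it

def triangulation_code(dp: DataPoint) -> str:
    methods_with_support = [m for m, groups in dp.confirmations.items() if groups]
    all_groups = set().union(*dp.confirmations.values()) if dp.confirmations else set()
    across_method = len(methods_with_support) >= 2
    across_stakeholder = len(all_groups) >= 2

    if dp.n_sources <= 1:
        return "red"          # single source; nothing to compare against
    if dp.conflicting:
        return "yellow"       # sources disagree; presence or meaning is in dispute
    if across_method and across_stakeholder:
        return "dark green"   # converges across methods AND stakeholder groups
    if across_method or across_stakeholder:
        return "green"        # converges across two of methodology/method/stakeholder
    return "yellow"           # one group, one method: possibly an insular point of view

# Hypothetical example loosely based on a theme discussed in the Results
policy_theme = DataPoint(
    label="presumption of prostitution influenced testing decisions",
    confirmations={"interview": {"police", "prosecutors"}, "archival": {"police"}},
    n_sources=12,  # invented count for illustration
)
print(triangulation_code(policy_theme))  # -> "dark green"

A rule of this kind simply mirrors Table 1; the judgment calls described in this section (e.g., deciding when accounts genuinely conflict) still require human coders.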
To assess triangulation of a finding that emanated from further analysis of a data point, we
applied the same codes in Table 1 to each finding. For our qualitative results, the themes within
each assertion had already been triangulated as data points, thus our task was to ascertain whether the
mechanistic assertion in the finding could be supported with other qualitative or quantitative find-
ings. For the quantitative results, we examined whether statistically significant effects in the multi-
level logistic regression model were also identified as salient and important in the qualitative data
(and vice versa for nonsignificant findings).
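To illustrate the quantitative side of these across-methodology comparisons, the sketch below fits a logistic regression predicting whether a kit was submitted for testing from indicators of key historical changes. It is a simplified, single-level stand-in with hypothetical variable and file names; the analysis reported in R. Campbell et al. (2015) was a multilevel logistic regression, which would additionally model the nesting of kits within years.

# Hedged sketch of the quantitative analysis: a single-level logistic regression used as a
# simplified stand-in for the multilevel model. All variable and file names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

kits = pd.read_csv("sak_records.csv")  # hypothetical file: one row per rape kit

# Binary outcome: was the kit submitted for DNA testing?
# Predictors: indicators for key historical/contextual changes, plus a centered time trend.
model = smf.logit(
    "submitted ~ post_federal_funding + post_sane_program + year_centered",
    data=kits,
).fit()
print(model.summary())

# Triangulation check: a predictor that is significant on its own but drops out once other
# historical changes are controlled would be compared against the qualitative themes and
# coded yellow rather than green.

In the "One Way or Another" case study below, this kind of comparison is what produced a yellow code for the federal funding effect and a dark green code for the SANE program effect.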

Triangulation Case Studies Selection


The final step in our analysis was to select specific triangulation case studies for discussion with our
stakeholders and for inclusion in this article. We brought green, yellow, and red triangulation results
(data points and findings) to our practitioner and policy colleagues for review, so that we could
consider implications of both convergent and divergent data. We had complex confidentiality
concerns to navigate in those discussions as IRB regulations stipulate that data cannot be released
in identifiable form. In close-knit, long-term projects, it is quite possible that deidentified data are
recognizable to those inside the project. Stakeholders might be able to recognize a source (by the
content, word choice, speech mannerisms, etc.), even if it was properly deidentified for outside
dissemination. Whenever possible, we used common strategies to protect confidentiality, such as
lightly editing interview excerpts to remove distinctive language, but our best option was to develop
case study summaries that synthesized which data were compared and what we found in that
comparison. We were mindful of Rallis’s (2015) arguments regarding transparency, and we worked
with our community partners to find an acceptable balance between protecting confidentiality and
presenting details (see R. Campbell et al. [2015] for extended discussion of IRB and ethics issues in
this project).

Results
“The Unreliable Narrator:” A Case Study of Red and Yellow Data Points
In literature, an unreliable narrator is someone who tells a story while layering a distorted lens over
that reality such that the resulting narrative becomes untrustworthy. Whereas the concept of “truth”
is a subject of debate in qualitative research (Lincoln, Lynham, & Guba, 2011; Randall & Phoenix,
2009), in an evaluation context, it may be helpful to at least know whether a participant’s views may
be unreliable and untrustworthy. For example, in our efforts to understand why so many rape kits in
this jurisdiction were not submitted for forensic DNA testing, we inquired about the focal police

department’s policies regarding submitting kits for testing. Multiple stakeholders in that organiza-
tion stated that the department had a written SAK submission policy, but when we asked (repeat-
edly) to see that policy, we were not provided with any documentation; other stakeholder groups said
they had never seen the policy and when they also tried to obtain a copy of the documentation, none
was provided. We could not verify the existence of this policy through any source external to this
organization or through another data collection method. The triangulation code for this data point
was yellow: Individuals within the same organization provided consistent information, using the
same method, but no outside source (i.e., another stakeholder group) or data collection method (i.e.,
archival record of the written policy) could verify this information. Given the salience of this matter
(whether there was a written policy regarding SAK submissions and testing), we decided to discuss
this data point in our final report, highlighting it as “yellow” information to be clear that the focal
organization stated it had a policy, but we could find no tangible evidence of its existence. As
evaluators, we felt that the yellow data were credible and actionable and highlighted this finding in
our dissemination, a decision supported by other organizations in the action research project.
Digging deeper into the transcript of the individual who first told us that the police department
had a written policy about SAK testing, we noticed that other key details—such as the sequence of
events regarding the discovery of the untested kits, descriptions of meetings about that discovery,
dates of key events, who was involved in these events—could not be triangulated at all. These data
points were coded as red because no other individuals within or across stakeholder groups—or
another data collection method—could verify the information as provided. Reviewing these red
data points revealed an interesting pattern: Each piece of (unverified) information presented the
focal organization in a better light—slight differences in the timing of events, sequences, and actors
involved that together helped portray the organization as less blameworthy. The interview was
replete with these systematic shifts and slights, interspersed with information that did triangulate
either across stakeholders or data collection method (i.e., green). When we rechecked our qualitative
field notes regarding our interactions with this individual (in team meetings and in the interview
itself), we had not made any mention that we were suspicious about the veracity of the information
provided; rather, our notes indicated that this individual was straightforward, informative, and
helpful. And, it turns out, incorrect—incorrect in a patterned, predictable way. It is not surprising
that someone might spin a narrative to favor a particular perspective, and again, some qualitative
researchers might find that an interesting issue to pursue in its own right. Our point is that we did not
know and could not tell that the information was inaccurate until we conducted this triangulation
analysis. Had we used these specific data points, as provided by this individual, we would have
disseminated incorrect information that would have cast this organization in a better light than was
warranted. Based on the results of the triangulation coding, we (the evaluation team) decided that
this transcript should not be included in subsequent analyses. We checked whether the exclusion of
the transcript would remove unique points of view from consideration (i.e., perspectives not shared
by others but not inaccurate). The transcript contained multiple inaccuracies but also other ideas and
comments that were raised by other stakeholders that could be independently verified.

“Don’t Go There:” A Case Study of a Red Data Point


This case example highlights a different contextual meaning of red data, not as information that is
unverifiable due to inaccuracies, but as a signal from the participants to the evaluator/researcher:
don’t go there, this is not an issue we want to discuss. In nearly all of our qualitative interviews,
stakeholders discussed the problem of chronic burnout: Staff were overworked and undercompen-
sated in extremely difficult working conditions with minimal external recognition for their work.
However, one participant stated that staff were experiencing not only burnout but also vicarious
trauma, a condition experienced by those in helping professions, such that repeated exposure to

highly upsetting, traumatic material starts to change people’s cognitions and behaviors, often
decreasing their compassion and empathy for others (Figley, 1995; Office for Victims of Crime,
2017). This participant posited that investigating sexual assault cases was causing vicarious trauma
among service providers, and this trauma was negatively affecting their personal health and well-
being and, ultimately, the quality of their professional work (e.g., treating rape survivors with
decreased respect and empathy). This was a distinctive point of view, one that readily stood out
while we were open-coding the data; however, as the analyses proceeded, we did not find other
mentions of this idea from other stakeholder groups in observational data, interview data, or archival
records.
When we initially reviewed this red data point, we were unsure whether this was an instance of
missing/incomplete data or unique data. Given that we were conducting follow-up interviews with
participants, as well as informal interviews throughout the project, we attempted to broach the
subject of vicarious trauma to ascertain whether others also felt it was a salient issue but simply
had not mentioned it in their interviews (i.e., red due to missingness that could convert to yellow or
green data) or whether this was indeed a unique perspective (i.e., red due to uniqueness that would
remain red). In that follow-up work, we realized that this code was red because participants did not
want to talk about this subject—it was not accidentally missing, it was purposely missing. It was an
off-limits topic as stakeholders gave numerous verbal and behavioral indications when directly
asked that this was not something they wished to discuss. Thus, the data point would remain red,
due to uniqueness, and we had to decide what to do with the information we had learned. This project
was not intended to be a study of vicarious trauma—there was no indication in any of the research
materials, consent forms, and so on, that indicated to the participants that this would be a subject of
inquiry, and we respected participants’ limits and did not pursue any further analyses on this topic.
However, we did disclose in our final written report (which was previewed by the stakeholders prior
to submission to the funder) that this topic came up in the project but was not deemed focal by the
stakeholders. We debated whether inclusion of this example here, in this article, pushed those limits,
but because we had already signaled to stakeholders that we retained the option to mention the
absence of this theme (as we did in the final report), we decided to tell this story, and its backstory,
here to advance discussion about the underlying reasons why data may be missing or unique in
triangulation coding.

“Saving Our Bacon:” A Case Study of a Dark Green Data Point


A long-standing theme in the triangulation literature is that converging data lend credibility to the
findings, so researchers/evaluators can feel more confident in their findings when responding to
challenges of their work leveled by the scientific community. In this case example, we explore how
well-triangulated data can also help researchers/evaluators withstand political fire and pushback
from powerful stakeholders. In our qualitative interviews with law enforcement personnel, we asked
why they did not submit some SAKs for DNA testing, probing whether there were particular kinds of
cases or situations in which their standard practice would be to store a kit instead of test it. Multiple
police representatives specifically stated that if they thought the victim had been engaged in pros-
titution at the time of the assault, they were suspicious as to whether a “real” crime did in fact occur,
so they were less likely to submit an SAK for DNA testing. When asked how they knew a victim
might be engaged in sex work, their answers tended to become vague, noting that maybe it was the
time of day, the location, or just a “you can tell” feel. Regardless of whether they had proof the
victim was engaged in prostitution, the suspicion thereof was sufficient, in their view, to question
whether a crime occurred and therefore to store a kit instead of testing it for DNA. Police discussed
this issue freely in their interviews, and sometimes publicly in team meetings, which were captured
in our observational field notes. This belief was also reflected in actual police reports (i.e., there was

written documentation that because they thought a victim might have been engaged in prostitution,
they did not believe her account of the sexual assault). In interviews with other stakeholder groups,
participants noted that they were aware of this practice by police and gave specific examples of cases
in which they had seen this practice. Given how strongly this theme triangulated (i.e., dark green),
we were confident in moving forward to explore how this “presumption of prostitution” theme was
related to other data points, such as police investigational effort and how victims were treated by the
police.
For the members of the evaluation team, this was clearly a credible and actionable finding, as it
revealed problematic practices that needed to be remedied with training, supervision, and broader
organizational norm-setting, so we brought it to our community partners for discussion. We antici-
pated that we might get some pushback from police personnel about our public airing of this theme
and subsequent findings pertaining to this theme, and indeed, law enforcement challenged us on this
specific issue. The triangulation assessment had helped us identify and organize incontrovertible
evidence, which we could—and did—lay out for stakeholders to refute their assertion that this
practice was not common. The triangulation data saved our bacon when we were attacked for
presenting controversial, politically sensitive information. As stakeholders continued to discuss
these results, they agreed that all organizations in the partnership would benefit from additional
training on best practices in working with sexual assault survivors, which was instituted a few
months later.

“Agree to Disagree:” A Case Study of a Yellow Finding


Triangulation coding requires judgment as to whether the perspectives of stakeholders and/or
information provided by different data collection methods/sources converge and coalesce into a
common, unified finding, and these judgments can be hard to make in practice. An issue in this case
example was a qualitative finding that race, gender, and social class oppressions were key con-
tributing reasons why the police did not submit rape kits for testing and investigate these reported
sexual assault cases. The vast majority of the rape survivors whose SAKs were not tested for DNA
evidence were Black women/girls, many of whom lived in poverty (per U.S. Census data for
Detroit). All stakeholder groups made note of this, except the police. In other words, we had
interview data from multiple stakeholders regarding the intentions and behavior of another group
(the police), who, when this issue was raised in the collaborative team meetings, vociferously refuted
the assertion that their actions were rooted in race, gender, and class bias. Police representatives
highlighted that many members of the department were African American and many were female,
and that among the higher leadership ranks of the department, women and African Americans have
been well represented throughout the years. To our ears, this rebuttal seemed wholly consistent with
theories of institutional racism, which emphasize that it is possible to have individuals in leadership
positions who are women and/or people of color within institutions that have systemic, structural-
level policies and practices that promote oppression and systematically deny or limit access to
goods, services, and opportunities in society, including the right to safety and protection (Alexander,
2012; Bonilla-Silva, 1997; Sidanius & Pratto, 2001). However, this is an interpretation of the
argument put forth by the police to challenge a finding; going back to the finding itself, how should
we characterize the extent to which this finding was triangulated across stakeholder groups and data
collection methods?
Within the evaluation team, we debated how to assign a triangulation code to this finding. The
interviews were consistent across stakeholder groups, except for the group at the center of this
finding. We had observational data documenting these arguments during stakeholder team meet-
ings, but these data seemed to be more a public airing of information that had been disclosed in
individual interviews rather than an independent data source. In the archival data, we saw direct

behavioral indicators of sexism in that victims were referred to as “ho’s” and “heffers” and other
derogatory names. Given that most of these victims were poor African American women and girls,
it seemed inconceivable to us that these derogatory references weren’t also steeped in racism and
classism, even though specific racialized language in the reports was rare. Taken together, some
coders felt this information merited a green triangulation code (i.e., a variation on across-method,
across stakeholder triangulation); however, other coders thought yellow was the more appropriate
code because the data from the stakeholder group at the heart of this finding (the police) did not
triangulate and the archival data did not provide, in their view, clear support either. All coders did
agree that if we were to label this finding as green, such a designation would obscure the contro-
versy about this matter and would not clearly convey that there were vastly different perspectives
on this issue.
In the end, we decided on yellow as the triangulation code, as it reflected the conflicting views
among stakeholders and the “agree to disagree” decision within the evaluation team, as we didn’t
feel entirely settled with this assessment. To the evaluators and to all project stakeholders except the
police, this was deemed an actionable finding, one that merited inclusion in our final report, in
subsequent publications, and in training curricula that emanated from this project. As expected, the
police “agreed to disagree” with the inclusion of this finding in the report (and other outlets) and
asked that we include their objection and their rationale for their objection, which we readily agreed
to do as it helps convey the disparate perspectives in these data. This case example highlights what
would be problematic about presenting only convergent data for analysis and dissemination. Includ-
ing “yellow data” provides a way for evaluators to document and explain the nature of contested
spaces.

“One Way or Another:” A Case Study of Yellow and Green Findings


In our efforts to understand how and why Detroit amassed so many untested rape kits, we examined
the resources available in these organizations over time to understand whether they realistically
could have responded to all reported sexual assault cases and tested all rape kits. We asked stake-
holders which specific resources (programs, funding, staff) were particularly helpful to them and
why. In our qualitative interviews, multiple stakeholder groups noted that when the crime lab
received a federal Department of Justice grant in 2005 to support outsourcing DNA testing, the
funding was instrumentally helpful in managing testing workloads and increasing the number of kits
that could be tested each year. We verified via archival records the date and amount of funds
received by the crime lab, but we were not able to determine how exactly those funds were allocated
and what percentage of the monies were used for SAKs specifically versus testing DNA in other
types of crimes. Nevertheless, stakeholders steadfastly maintained that the funds increased the
number of SAKs tested. Given that we had quantitative data regarding SAK submission rates over
a 30-year period of time, we decided to conduct an across-methodology triangulation assessment to
explore whether the receipt of funds was associated with a significant increase in SAK submissions.
The multilevel modeling results indicated that submissions for testing did increase after the crime
lab received these funds, but the effect became nonsignificant in a multivariate model that accounted
for other key historical events and resource changes over time. This led to a yellow triangulation
code, as the findings did not converge across methodologies. Stakeholders were not surprised by
these mixed, nontriangulated findings and argued that the funds—though helpful—were probably a
“drop in bucket” of what they truly needed. Some stakeholders expressed concern that the mixed
findings could be interpreted to suggest that the federal funding program was ineffective, but most
indicated that the triangulation assessment provided useful data that they would present in future
grant applications to make the case that additional funding was necessary to create demonstrable
change in this community.

Another resource highlighted by stakeholders as making an important positive contribution to serving rape victims and processing rape kits was the community's Sexual Assault Nurse Examiner
(SANE) program. SANE programs provide comprehensive postassault health care and forensic
evidence collection for rape victims and often function as a connector organization in their com-
munities by bringing together stakeholder groups that work with rape victims. Most U.S. SANE
programs were founded in the 1980s and 1990s (R. Campbell, Patterson, & Lichty, 2005; Depart-
ment of Justice, 2013), but Detroit’s SANE program was not established until 2006, substantially
later than what has been typical in many other communities. However, once Detroit did have a
SANE program, stakeholders called it a “game-changer.” In the qualitative interviews, all stake-
holder groups praised this program for the quality of the care they provided to victims and the
technical accuracy of their forensic evidence collection procedures. In the observational data, we
witnessed many meetings in which this program was heralded by other stakeholder groups. In
archival data, we saw a qualitative change in sexual assault police reports over time, as more
detectives were clearly working in collaboration with the forensic nurses to document medical
forensic evidence. Again, we had data available for an across-methodology triangulation to explore
whether SAK submission practices changed after the implementation of the SANE program. In the
multilevel model, this variable was associated with a significant increase in rape kit testing, account-
ing for linear time trends and other historical events and resources. These were clear, unequivocal
findings that the SANE program was instrumental in this community’s response to sexual assault
and the problem of untested rape kits (i.e., dark green triangulation code). Detroit-area and state-
level policy makers were quick to disseminate these findings and advocate for increased funding for
SANE programs. The triangulation coding provided directly useful data for practitioners and policy
makers, and “one way or another . . . whether it’s good or bad, now we know and we can use this
information,” one stakeholder noted, to help advocate for needed resources.

Discussion
In this study, we wanted to extend the technical literature on triangulation by developing and
implementing an assessment procedure for use in a large-scale project that spanned multiple meth-
odologies, data collection methods, and stakeholder groups. We created a color-coded system that
evaluated the extent of within- and/or across-source convergence of individual data points and larger
findings that came from further analysis of those data. The green-yellow-red color codes further
distinguished whether divergence was attributable to conflicting information or missing/unique data.
The coding procedures we established were straightforward to implement with standard word
processing and spreadsheet software and did not require specialized qualitative analysis software.
However, because the underlying logic of this approach is rooted in Glaser’s constant comparison
process (Glaser, 2007), the coding framework and operational definitions we developed (Table 1)
could be easily implemented in specialized analysis software. Overall, we spent 3 weeks (in a 30-
month project) working on these analyses; the volume of data to be scanned increased time-to-
completion, but the coders’ deep familiarity with the data prior to conducting the triangulation
analyses helped our timeline. Consistent with Mathison’s (1988) perspective that evaluators need
to think carefully about what merits triangulation and why, we did not assess convergence for every
data point, and instead restricted our analysis to a primary focal question and to specific data points/
themes pertaining to that question.
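As a minimal sketch of how this color coding could be automated in analysis software, the Python function below assigns a code from counts of agreeing, conflicting, and silent sources for a given data point. The thresholds and the dark green/green distinction shown here are assumptions for illustration; the project's actual operational definitions appear in its Table 1 and are not reproduced in this section.

from dataclasses import dataclass

@dataclass
class DataPoint:
    """One data point/theme tallied across the sources that were scanned."""
    sources_total: int        # methodologies, methods, and/or stakeholder groups scanned
    sources_agreeing: int     # sources reporting consistent information
    sources_conflicting: int  # sources reporting contradictory information

def triangulation_code(point: DataPoint) -> str:
    """Assign a color code; the cutoffs below are illustrative assumptions."""
    if point.sources_conflicting > 0:
        return "yellow"      # divergence attributable to conflicting information
    if point.sources_agreeing == point.sources_total:
        return "dark green"  # convergence across every source scanned
    if point.sources_agreeing > 1:
        return "green"       # convergence among the sources that addressed it
    return "red"             # divergence attributable to missing/unique data

# Example: a theme confirmed in interviews and observations but absent from archives
print(triangulation_code(DataPoint(sources_total=3, sources_agreeing=2, sources_conflicting=0)))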
We acknowledge that our coding system does not capture as much detail as the procedures
proposed by Farmer et al. (2006) and Sands and Roer-Strier (2006) because it does not "roll" an
interpretation of the disagreement (e.g., "silence," "dissonance," "illumination") into the coding.
Given the tremendous volume of data we had to scan, we felt this was a reasonable modification:
This coding system reliably locates and distinguishes convergent and divergent data, with some
initial context about the nature of disagreement (i.e., yellow/red) that can be explored with stake-
holders in later stages of analysis. We also note that our coding methods and case study examples do
not address the conceptual and technical issues of assessing triangulation longitudinally (see Denzin,
1978). Across-time triangulation has not been well explored in research to date, and such work
would need to consider whether change over time reflects a “failure” in triangulation or a substantive
outcome (akin to how low test–retest reliabilities may reflect development, not unreliability per se;
Singleton & Straits, 2018). In this action research project, our aim was to transform this city’s
response to sexual assault, so change over time was a desired outcome. What across-time triangula-
tion means in such contexts is complicated, as direction and rate of change over time may or may not
converge across stakeholder groups and across data collection methods. We hope that braver eva-
luation teams will venture into this conceptual and methodological work.
In this triangulation project, we also wanted to explore the intersection of the technical and
political levels of inquiry to examine how divergent data are interpreted and acted upon by evaluators
and stakeholders (Greene & Caracelli, 1997). Historically, triangulation has been conceptualized as
an indicator of validity, validity as a component of credibility, and credibility as a determinant of
action (Mark, 2015). Thus, it might be expected that only convergent data (i.e., dark green/green)
would be considered credible and actionable, as these data points/findings were consistent across
methodology, data collection method, and/or stakeholder group. However, in this project dark green/
green, yellow, and red data were viewed as credible by both the evaluation team and stakeholders;
direct action was taken based on dark green/green and yellow data. In other words, disputed,
conflicting data were considered a reasonable evidence base for action—why?
As Miller (2015) pointed out, the accuracy of evaluative findings influences stakeholders’ per-
ceptions of credibility, and triangulation coding provides a structured way of checking for inaccura-
cies. We were concerned that stakeholders might provide misinformation to try to influence the
findings, as was highlighted in "The Unreliable Narrator" case study. Our decisions to cross-
check information, to be transparent with our partners that we were doing this, and to remove
incorrect information from the analyses boosted our credibility with our stakeholders as it signaled
that we were aware of the politics at play. Collecting corroborating evidence is standard practice in
the criminal justice system, and our use of those techniques in the evaluation helped establish
trust, but we acknowledge that in other settings and systems, triangulation assessments might not be
as positively perceived by stakeholders.
Whether evaluation findings are perceived as credible and actionable is also influenced by the
extent to which stakeholders can “see” the process by which the findings were generated, particu-
larly so in qualitative evaluation (Rallis, 2015). As the case studies “Saving Our Bacon” and “Agree
to Disagree” illustrated, we had some rather damning stories to tell in this project. The data were
clear that rape myth acceptance was a key reason why police did not submit rape kits for DNA
testing, and the triangulation coding helped us compile and organize information to show the police
how we reached this conclusion. We were able to capture disagreements regarding the role of
sexism, racism, and classism in the law enforcement response to sexual assault, and these disputes
are part of the story (Greene et al., 2001). That our findings “showed” these conflicts was important
to stakeholders and increased the credibility of the work as a foundation for action.
Sometimes, we did not have sufficient evidence to tell a story and, therefore, nothing to "show"
and act upon. In the case study "Don't Go There," our initial thought was that our red data
point regarding vicarious trauma among service providers reflected incomplete data that would
triangulate with more data collection. It did not. Instead, we learned that this was an off-limits topic
to stakeholders. We did not feel that we had sufficient data to make a claim that vicarious trauma had
affected police actions in these cases of untested rape kits, but there was a short story to be told about
why we could not make this claim. We elected to tell that story here to emphasize that triangulation
coding can reveal more than simply convergence and divergence. Here, it uncovered what seemed to
be an important subject for future research and evaluation, but one that was ultimately out of scope
for this action research project, despite our best efforts to understand its relevance.
Finally, credibility and usability are evaluated by stakeholders in situ, and what is relevant and
needed in that context is paramount (Greene, 2015; Julnes & Rog, 2015; Schwandt, 2015). The
context of this work was what one stakeholder termed "an unimaginable public safety crisis":
11,000+ sexual assaults had been reported to the police, and it was wholly unclear how many of
these cases had been thoroughly investigated, given that key evidence sat untested. This city needed
to understand what went wrong and how to fix it—quickly—because the clock was literally running
as the statute of limitations on these criminal cases was expiring. Our action research project had to
address immediate, pressing needs: Did the city have any resources to leverage against the
problem, and what resources did it need? In the "One Way or Another" case study, we saw that
even conflicting information about the utility of specific resources was helpful and actionable for
stakeholders as they sought to remedy this problem. It wasn’t “perfect” information by any means,
nothing that would sit near the top of any “hierarchy of evidence” model (see Schwandt, 2015 for a
critical review). Yet, the triangulation assessments helped stakeholders understand “what was cer-
tain and what was uncertain,” as one leader noted, and that information, in this context, was
sufficient evidence for action.

Authors’ Note
The opinions or points of view expressed in this document are solely those of the authors and do not reflect the
official positions of any participating organization or the U.S. Department of Justice.

Acknowledgments
The authors thank the members of the action research project and academic colleagues who provided feedback
on prior drafts of this manuscript.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or pub-
lication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publica-
tion of this article: The action research project described in this article was supported by a grant from the
National Institute of Justice (2011-DN-BX-0001).

Note
1. We recognize that red-yellow-green color coding as a data visualization method is problematic for color-
blind individuals; therefore, any visualizations of data coded with this method would need to be modified
appropriately (see Evergreen, 2014).

References
Alexander, M. (2012). The new Jim Crow: Mass incarceration in the age of colorblindness. New York, NY:
The New Press.
Bailey, A., & Hutter, I. (2008). Qualitative to quantitative: Linked trajectory of method triangulation in a study
on HIV/AIDS in Goa, India. AIDS Care, 20, 1119–1124.
Bonilla-Silva, E. (1997). Rethinking racism: Toward a structural interpretation. American Sociological Review,
62, 465–480.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod
matrix. Psychological Bulletin, 56, 81–105.
Campbell, R., Feeney, H., Fehler-Cabral, G., Shaw, J., & Horsford, S. (2017). The national problem of untested
sexual assault kits (SAKs): Scope, causes, and future directions for research, policy, and practice. Trauma,
Violence, & Abuse, 18, 363–376.
Campbell, R., Fehler-Cabral, G., Pierce, S. J., Sharma, D., Bybee, D., Shaw, J., . . . Feeney, H. (2015). The
Detroit sexual assault kit (SAK) action research project (ARP). Washington DC: National Institute of
Justice.
Campbell, R., Patterson, D., & Lichty, L. F. (2005). The effectiveness of sexual assault nurse examiner (SANE)
programs: A review of psychological, medical, legal, and community outcomes. Trauma, Violence, &
Abuse, 6, 313–329.
Corbin, J., & Strauss, A. (2008). Basics of qualitative research: Techniques and procedures for developing
grounded theory (3rd ed.). Thousand Oaks, CA: Sage.
Deacon, D., Bryman, A., & Fenton, N. (1998). Collision or collusion? A discussion and case study of the
unplanned triangulation of quantitative and qualitative research methods. International Journal of Social
Research Methodology, 1, 47–63.
Denzin, N. (1978). Sociological methods. New York, NY: McGraw-Hill.
Department of Justice. (2013). A national protocol for sexual assault medical forensic examinations: Adults &
adolescents (2nd ed.). Washington, DC: Author.
Department of Justice. (2015). Identifying and preventing gender bias in law enforcement response to sexual
assault and domestic violence. Washington, DC: Author.
Donaldson, S., Christie, C., & Mark, M. (2015). Credible and actionable evidence. Thousand Oaks, CA: Sage.
Erickson, F. (1986). Qualitative methods in research on teaching. In M. C. Wittrock (Ed.), Handbook of
research on teaching (pp. 119–161). New York, NY: Macmillan.
Erzberger, C., & Prein, G. (1997). Triangulation: Validity and empirically-based hypothesis construction.
Quality and Quantity, 31, 141–154.
Farmer, T., Robinson, K., Elliott, S. J., & Eyles, J. (2006). Developing and implementing a triangulation
protocol for qualitative health research. Qualitative Health Research, 16, 377–394.
Figley, C. R. (1995). Compassion fatigue: Toward a new understanding of the costs of caring. In B. H. Stamm
(Ed.), Secondary traumatic stress: Self-care issues for clinicians, researchers, and educators (pp. 3–28).
Baltimore, MD: Sidran Press.
Flick, U. (1992). Triangulation revisited: Strategy of validation or alternative? Journal for the Theory of Social
Behaviour, 22, 175–197.
Flick, U., Garms-Homolova, V., Herrmann, W. J., Kuck, J., & Röhnsch, G. (2012). “I can’t prescribe something
just because someone asks for it . . . ”: Using mixed methods in the framework of triangulation. Journal of
Mixed Methods Research, 6, 97–110.
Glaser, B. G. (2007). Doing formal theory. In A. Bryant & K. Charmaz (Eds.), The SAGE handbook of
grounded theory (pp. 97–113). Thousand Oaks, CA: Sage.
Greene, J. C. (2002). With a splash of soda, please: Towards active engagement with difference. Evaluation, 8,
259–266.
Greene, J. C. (2007). Mixed methods in social inquiry. San Francisco, CA: John Wiley.
Greene, J. (2015). How evidence earns credibility in evaluation. In S. Donaldson, C. Christie, & M. Mark (Eds.),
Credible and actionable evidence (pp. 205–220). Thousand Oaks, CA: Sage.
Greene, J. C., Benjamin, L., & Goodyear, L. (2001). The merits of mixing methods in evaluation. Evaluation, 7,
25–44.
Greene, J. C., & Caracelli, V. J. (1997). Defining and describing the paradigm issue in mixed-method evalua-
tion. New Directions for Evaluation, 74, 5–17.
Greene, J., & McClintock, C. (1985). Triangulation in evaluation: Design and analysis issues. Evaluation
Review, 9, 523–545.
Hammersley, M. (2008). Questioning qualitative inquiry: Critical essays. London, England: Sage.
Howe, K. R. (2012). Mixed methods, triangulation, and causal explanation. Journal of Mixed Methods
Research, 6, 89–96.
Human Rights Watch. (2009). Testing justice: The rape kit backlog in Los Angeles City and County. New York,
NY: Author.
Human Rights Watch. (2010). "I used to think the law would protect me": Illinois's failure to test rape kits. New
York, NY: Author.
Julnes, G., & Rog, D. (2015). Actionable evidence in context. In S. Donaldson, C. Christie, & M. Mark (Eds.),
Credible and actionable evidence (pp. 221–258). Thousand Oaks, CA: Sage.
Kidder, L. H., & Fine, M. (1987). Qualitative and quantitative methods: When stories converge. New Directions
for Evaluation, 35, 57–75.
Klofas, J., Hipple, N. K., & McGarrell, E. (Eds.). (2010). The new criminal justice: American communities and
the changing world of crime control. New York, NY: Routledge.
Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic inquiry. Newbury Park, CA: Sage.
Lincoln, Y. S., Lynham, S. A., & Guba, E. G. (2011). Paradigmatic controversies, contradictions, and emerging
confluences, revisited. In N. K. Denzin & Y. S. Lincoln (Eds.), The SAGE handbook of qualitative research
(pp. 97–128). Thousand Oaks, CA: Sage.
MacQueen, K. M., McLellan-Lemal, E., Bartholow, K., & Milstein, B. (2008). Team-based codebook devel-
opment: Structure, process, and agreement. In G. Guest & K. M. MacQueen (Eds.), Handbook for team-
based qualitative research (pp. 119–135). Lanham, MD: Altamira.
Mark, M. (2015). Credible and actionable evidence. In S. Donaldson, C. Christie, & M. Mark (Eds.), Credible
and actionable evidence (pp. 275–302). Thousand Oaks, CA: Sage.
Mathison, S. (1988). Why triangulate? Educational Researcher, 17, 13–17.
McConney, A., Rudd, A., & Ayers, R. (2002). Getting to the bottom line: A method for synthesizing findings
within mixed-method program evaluation. American Journal of Evaluation, 23, 121–140.
Miles, M. B., Huberman, A. M., & Saldaña, J. (2014). Qualitative data analysis: A methods sourcebook.
Thousand Oaks, CA: Sage.
Miller, R. (2015). How people judge the credibility of information. In S. Donaldson, C. Christie, & M. Mark
(Eds.), Credible and actionable evidence (pp. 39–61). Thousand Oaks, CA: Sage.
Moran-Ellis, J., Alexander, V. D., Cronin, A., Dickinson, M., Fielding, J., Sleney, J., & Thomas, H. (2006).
Triangulation and integration: Processes, claims and implications. Qualitative Research, 6, 45–59.
Morse, J. M. (2015). Critical analysis of strategies for determining rigor in qualitative inquiry. Qualitative
Health Research, 25, 1212–1222.
Murphy, S. B., Banyard, V. L., & Fennessey, E. D. (2013). Exploring stakeholders’ perceptions of adult female
sexual assault case attrition. Psychology of Violence, 3, 172–184.
National Institute of Justice. (2010). Solicitation: Strategic approaches to sexual assault kit (SAK) evidence: An
action research project (SL #000947). Washington, DC: Author.
Office for Victims of Crime. (2017). The vicarious trauma toolkit. Washington, DC: Author.
Patton, M. Q. (2008). Utilization-focused evaluation (4th ed.). Thousand Oaks, CA: Sage.
Pinchevsky, G. M. (2018). Criminal justice considerations for unsubmitted and untested sexual assault kits: A
review of the literature and suggestions for moving forward. Criminal Justice Policy Review, 29, 925–945.
Rallis, S. (2015). When and how qualitative methods provide credible and actionable evidence. In S. Donald-
son, C. Christie, & M. Mark (Eds.), Credible and actionable evidence (pp. 137–156). Thousand Oaks, CA:
Sage.
Randall, W. L., & Phoenix, C. (2009). The problem with truth in qualitative interviews: Reflections from a
narrative perspective. Qualitative Research in Sport and Exercise, 1, 125–140.
Sands, R. G., & Roer-Strier, D. (2006). Using data triangulation of mother and daughter interviews to enhance
research about families. Qualitative Social Work, 5, 237–260.
Schwandt, T. (2015). Credible evidence of effectiveness: Necessary but not sufficient. In S. Donaldson, C.
Christie, & M. Mark (Eds.), Credible and actionable evidence (pp. 259–273). Thousand Oaks, CA: Sage.
Sidanius, J., & Pratto, F. (2001). Social dominance: An intergroup theory of social hierarchy and oppression.
Cambridge, England: Cambridge University Press.
Singleton, R. A., & Straits, B. C. (2018). Approaches to social research (6th ed.). New York, NY: Oxford
University Press.
Strom, K. J., & Hickman, M. J. (2010). Unanalyzed evidence in law-enforcement agencies. Criminology &
Public Policy, 9, 381–404.
Webb, E. J., Campbell, D. T., Schwartz, R. D., & Sechrest, L. (1966). Unobtrusive measures: Nonreactive
research in the social sciences. Chicago, IL: Rand McNally.
Weiss, C. H. (1973). Where politics and evaluation research meet. Evaluation Practice, 14, 93–106.
