Abstract
The purpose of this study was to develop triangulation coding methods for a large-scale action
research and evaluation project and to examine how practitioners and policy makers interpreted
both convergent and divergent data. We created a color-coded system that evaluated the extent of
triangulation across methodologies (qualitative and quantitative), data collection methods (obser-
vations, interviews, and archival records), and stakeholder groups (five distinct disciplines/organi-
zations). Triangulation was assessed for both specific data points (e.g., a piece of historical/
contextual information or qualitative theme) and substantive findings that emanated from further
analysis of those data points (e.g., a statistical model or a mechanistic qualitative assertion that links
themes). We present five case study examples that explore the complexities of interpreting tri-
angulation data and determining whether data are deemed credible and actionable if not convergent.
Keywords
mixed methods, action research, participant observation, qualitative methods, quantitative methods
The concept of triangulation has a long history in the social sciences, capturing the interest of
quantitative, qualitative, and mixed methods scholars alike. In quantitative research, D. T. Campbell
and Fiske (1959) created the multitrait, multimethod matrix for systematically comparing findings
across different data collection methods; if results converged across methods, D. T. Campbell and
Fiske (1959) argued that researchers could have greater confidence in the validity of their
conclusions (see also Webb, Campbell, Schwartz, & Sechrest, 1966). In qualitative research, early
conceptualizations of triangulation highlighted how multiple methods can reveal shared perspec-
tives and realities, without making epistemological claims regarding the “truth” of the findings
(Denzin, 1978; Lincoln & Guba, 1985). In mixed methods research, triangulation has been lauded
as a strategy for exploring viewpoints revealed through divergent data (Greene, 2007; Greene &
Caracelli, 1997; Greene, Benjamin, & Goodyear, 2001). Although quantitative, qualitative, and
mixed methodologists have varied epistemological and practical reasons for studying triangulation,
there remains an enduring shared value across these methodological traditions for collecting and
comparing information generated by disparate methods and sources.
In mixed methodology scholarship specifically, the literature on triangulation has been centered
primarily on what Greene and Caracelli (1997) termed the theoretical level of inquiry to engage the
paradigmatic and epistemological issues inherent in evaluative comparisons of data collected by
different means. By contrast, the technical levels of inquiry are far less developed, as procedural
guidance for mixed method triangulation coding is sparse (see Farmer, Robinson, Elliott, & Eyles,
2006; McConney, Rudd, & Ayres, 2002; Sands & Roer-Strier, 2006, as notable exceptions). Like-
wise, few scholars have delved into the political level of inquiry to explore how studying triangula-
tion can reveal contested spaces in evaluation projects and influence what stakeholders deem
credible and actionable evidence (Donaldson, Christie, & Mark, 2015). The purpose of this article
is to advance these levels by presenting a practical framework for coding data convergence in a
large-scale, policy-focused mixed methodology project and to explore the political implications of
both convergent and divergent data. We present five case studies that illustrate how our triangulation
assessments influenced what actions our evaluation team and our community stakeholders decided
to take—or not to take—in this policy context. To set the stage for this project, we will begin with a
brief review of the literature on triangulation coding in mixed methods research and evaluation.
Figure 1. Example Triangulation Design (within methodology, across method, within stakeholder group). Stakeholder groups (A, B, C) are crossed with qualitative data collection methods (interviews, observations, archival records) and quantitative data collection methods (interviews/surveys, observations, archival records).
Farmer et al. (2006), for example, applied their triangulation coding protocol to the Canadian Heart Health Dissemination Project, which involved collecting 40 qualitative interviews and 30 archival
records across three Canadian provinces (i.e., a within methodology, across method, across stake-
holder triangulation). They found that these coding methods worked well for documenting conver-
ging findings and for distinguishing incomplete from divergent data.
A similar process for coding convergence was proposed by Sands and Roer-Strier (2006) in their
work interviewing 17 mother–daughter dyads (i.e., a within methodology, within method, across
stakeholder triangulation). They examined the extent to which emerging themes in the mothers’
interviews were consistent with their daughters’, using a five-category rating system: (1) same
story–same meaning, (2) same story–different interpretation, (3) missing pieces (both women were
privy to a piece of information but only one mentioned it in her interview), (4) unique information
(information was known to only one woman), and (5) illuminating (different accounts were provided
by each participant). Like Farmer et al.’s (2006) coding scheme, this approach also distinguishes
agreement from completeness but refines the assessment of missingness to differentiate information
that was not shared by both but was (presumably) known to both, from information that was not
known to both. It may be difficult to ascertain what participants could have possibly known or not
known, but Sands and Roer-Strier (2006) were able to make these distinctions in their project.
The coding methods proposed by Farmer and colleagues (2006) and Sands and Roer-Strier (2006)
are consistent with Greene’s (2002, 2007; Greene et al., 2001) calls for the systematic study of
diversity not merely the documentation of convergence. Furthermore, these coding methods assess
the extent of convergence and possible reasons for divergent points of view, rather than assigning a
simple yes/no determination of whether information converged. Both projects were within metho-
dology assessments—all qualitative—and relatively small in scale. How triangulation coding could
be employed in large projects that transcend methodological boundaries to include qualitative and
quantitative data is in need of further exploration.
If data do not converge across methods and sources, are they still actionable? Rallis (2015) noted that transparency of process is key to cred-
ibility: “stakeholders use findings, depending on whether they understand and accept how we have
created and told the story, that is, how we made meaning of words and images” (p. 143). Triangula-
tion coding is an excellent way to show where there are convergent and divergent perspectives, and
that transparency may be instrumental for identifying actionable next steps, even when there is
disagreement. At the same time, Miller (2015) noted that accuracy of evaluative data also determines
what stakeholders deem credible and actionable, and triangulation coding could reveal deeper
problems in the data. Comparing methods and sources could show that some information is simply not correct—not a difference in opinion or a different point of view, but an inaccuracy. Did
sources simply misunderstand or misremember, or did sources deliberately mislead the evaluator?
This is an uncomfortable reality, one readily acknowledged in evaluation practice (Patton, 2008;
Weiss, 1973), but triangulation coding may lay that reality bare, in incontrovertible detail. As such,
triangulation is not merely a methodological task of cross-checking data, it is also a commitment to
engaging political complexities and competing interests that may affect the credibility and usability
of the data.
In large projects, it is worth the effort to find misinformation before scaffolding those inaccuracies into
broader findings.
Third, at the political level of analysis, we wanted to promote discussion about what evaluators
should do with divergent data. The historical presumption has been that convergent data lend
credibility and confidence and are therefore actionable (e.g., D. T. Campbell & Fiske, 1959; Erzberger & Prein, 1997; Webb et al., 1966). The implied corollary is that divergent data are not
credible and therefore not actionable, but Greene’s (2002, 2007; Greene et al., 2001) theoretical
analysis challenges that presumption, so we wanted to explore how stakeholders decide whether
convergent and divergent data are credible and actionable (McConney et al., 2002). We will share
five case studies that illustrate how we implemented our color-coding system and what actions
we—both the evaluation team and our collaborators—decided to take based on green, yellow, and
red triangulation findings.
The Context for the Current Study: The Detroit Sexual Assault Kit (SAK) Action Research
Project
Although this is an article about methodological triangulation, it is necessary to provide some
context about the substantive focus of this project, as the results are case study examples of con-
vergence and divergence that require content grounding for comprehension. Briefly, this project
addressed the growing national problem of untested SAKs (also termed rape kits). A rape kit is
typically collected within 24–72 hr after a sexual assault in order to obtain biological evidence from
victims’ bodies (e.g., semen, blood, saliva; Department of Justice, 2013). This evidence can be
analyzed for DNA and compared against other criminal reference DNA samples in the Combined DNA Index System (CODIS), the federal DNA database, which can be instrumental in solving crimes and prose-
cuting rapists (R. Campbell, Feeney, Fehler-Cabral, Shaw, & Horsford, 2017; Strom & Hickman,
2010). However, in jurisdictions throughout the United States, police frequently do not submit rape
kits for forensic DNA testing, and instead, kits are shelved in police property, unprocessed and
ignored for years (R. Campbell et al., 2017; Pinchevsky, 2018). Conservative estimates indicate
there are at least 200,000 untested SAKs in U.S. police departments, and large stockpiles of kits have
been documented in over five dozen jurisdictions, sometimes totaling more than 10,000 untested
SAKs in a single city (R. Campbell et al., 2017). The failure to test rape kits for DNA evidence has
drawn public outrage as well as the attention of Human Rights Watch (2009, 2010) and the U.S.
Department of Justice (2015), which highlighted the problem of untested SAKs as an example of
biased and discriminatory police practices in their policy report, Gender Bias in Law Enforcement
Response to Sexual Assault.
Detroit, MI, was one of many U.S. cities with large numbers of untested rape kits. In August
2009, approximately 11,000 SAKs were found in a remote police property storage facility. Local,
county, and state officials demanded immediate review and intervention. In fortuitous timing, the
National Institute of Justice released funding for collaborative action research projects whereby
researchers would work with community practitioners in jurisdictions with large numbers of
untested SAKs to develop and evaluate change strategies (NIJ, 2010). In the action research model,
researchers/evaluators collect data about the problem at hand, share the findings with community
collaborators, and together, informed by those results, the team develops, implements, and evaluates
change strategies until successful solutions are institutionalized (Klofas, Hipple, & McGarrell,
2010). Detroit, MI, was selected for one of these grants, and we (the first and last authors of this
article) were the primary research partners in this project.
One of the required aims of the action research project was to examine the underlying reasons
why so many SAKs were not submitted by the police for forensic DNA testing.

Figure 2. Mixed Methods Triangulation Design for the Detroit SAK Action Research Project. The design crossed five stakeholder groups (A. Police; B. Prosecutors; C. Forensic Science/Crime Lab; D. Nursing/Medicine; E. Victim Services/Advocacy) with qualitative data collection methods (interviews, observations, archival records) and quantitative data collection methods (interviews/surveys, observations, archival records). Note. SAK = Sexual Assault Kit.

At the time this project began, law enforcement officials were defending their decisions not to test these rape kits for
DNA and did not perceive that there was a problem to be solved. By contrast, practitioners from
other disciplines—victim advocacy, nursing/medicine, prosecution, and forensic sciences—were
alarmed that so many kits had not been tested, particularly because so many of these victims were
Black women and/or poor women (see R. Campbell et al., 2015). This was a point of tremendous
conflict between stakeholders, so the action research project began under a dark cloud of interper-
sonal and interorganizational tension. In this context, it seemed possible, perhaps even likely, that
stakeholders might try to sway our investigation into why rape kits were not tested, and thus,
triangulating information across stakeholder groups and data collection methods seemed prudent.
Elsewhere, we discuss the substantive findings of this component of the action research project (see
R. Campbell et al., 2015), but briefly, rape kits were not submitted for DNA testing because Detroit-
area organizations simply did not have adequate resources (staffing, time, financial) to test all kits
and investigate all reported sexual assault cases. However, there was also clear and compelling
evidence that rape kits were not tested because police did not believe rape victims, and their
adherence to rape myth stereotypes influenced their decisions not to invest their limited resources
in this crime and these victims. In this article, we will share how triangulation coding helped us
uncover these findings.
Method
Triangulation Design
Figure 2 summarizes the triangulation design we used in the Detroit SAK Action Research Project.
We collected data from all five stakeholder groups/disciplines that are involved in collecting and
testing rape kits, investigating and prosecuting sexual assault crimes, and providing services to
victims, including police, prosecution, forensic science/crime laboratory, nursing/medicine, and
victim services/advocacy. One critical stakeholder group was not formally represented in this study:
rape survivors whose SAKs had not been submitted for testing. In the field of sexual assault
research/evaluation, it is rare that rape survivors can be interviewed in the midst of pending legal
cases, as the researchers can inadvertently become parties to the case. Each one of the approximately 11,000
untested SAKs represented a potentially open, active legal case, given that the prosecutors’ office
had made clear that they intended to have these cases investigated and reevaluated for possible
prosecution. Our institutional review board (IRB) was concerned about possible negative iatrogenic
effects of the research on open case proceedings, a concern strongly shared by the prosecutors. As
such, we could not include rape survivors whose kits had not been tested as a stakeholder group. We
sought other avenues for obtaining survivors’ perspectives in the action research project (see R.
Campbell et al., 2015), but we could not include them as a stakeholder group in the triangulation
design.
We used multiple data collection methods with each stakeholder group, including qualitative
ethnographic observations, qualitative interviews, and archival records that were qualitatively and
quantitatively coded (e.g., sexual assault police reports, intraorganizational records of staffing
levels, policies and procedures, see Figure 2). For the qualitative observational and interview data,
we included high-level leadership and frontline practitioners from each stakeholder group, as well
as current and former employees (e.g., key individuals who had changed positions or retired but
were once closely involved in the issue of SAK processing). The data were reasonably well
distributed across stakeholder groups, such that we had multiple observations, interviews, and
archival records within each stakeholder group and across all stakeholder groups (see R. Campbell
et al. [2015] for more information regarding how data saturation was monitored throughout the
project).
Table 1. Triangulation Color Codes: Nature of Triangulation and Examples.

Dark Green. Nature of triangulation: Individuals within the same AND different stakeholder groups, AND multiple data collection methods triangulate. Example: Multiple qualitative interviews with participants in Groups A AND B, AND archival records confirm the same information.

Green. Nature of triangulation: Individuals within the same AND different stakeholder groups using one data collection method triangulate (example: multiple qualitative interviews with participants in Groups A AND B confirm the same information); or individuals within the same stakeholder group AND multiple data collection methods triangulate (example: multiple qualitative interviews with participants in Group A AND archival records confirm the same information).

Yellow. Nature of triangulation: Individuals within the same stakeholder group using one data collection method triangulate, with NO triangulation from different stakeholders or different data collection methods; using the same method with different stakeholder groups, or a different method within the same stakeholder group, reveals different meanings or unclear, equivocal, or no information to reconcile perspectives. Example: Multiple qualitative interviews with participants in Group A, BUT no other data collection method confirms the same information; OR multiple qualitative interviews with participants in Group A conflict with those in Group B, and other data collection methods provide no clear insights.

Red. Nature of triangulation: A single individual provides information that is not triangulated by other stakeholders or by another data collection method. Example: One participant in Group A provides information in a qualitative interview that could not be confirmed by any other person or method.
After compiling the relevant information for each data point, multiple coders conducted this process of comparing information across stakeholder groups and data types (i.e., investigator
triangulation), as this task required judgment regarding the extent to which the information con-
verged/did not converge (MacQueen, McLellan-Lemal, Bartholow, & Milstein, 2008). The coders
reviewed and discussed the compiled information and then, using our green-yellow-red color-coding
system, evaluated the extent to which each data point converged across methodology/method/
source. There was consistently high agreement between the coders for the dark green, green, and
red codes; yellow codes required more extensive review to determine the final triangulation code.
Data points that were green or yellow proceeded to the next stages of data analyses (see Results for
our rationale to analyze yellow data). For qualitative data, we employed Erickson’s (1986) analytic
induction method to identify mechanistic assertions that linked individual themes together into a
hypothesized explanation for why SAKs went untested for decades. The quantitative data were
analyzed with multilevel logistic regression to examine whether historical and contextual factors
(e.g., the date when federal funding for DNA testing became available) predicted SAK testing rates
(see R. Campbell et al. [2015] for details).
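As a rough illustration of the kind of quantitative model described above, the sketch below fits a logistic regression to synthetic data with Python and statsmodels. It is a simplified, single-level stand-in for the multilevel logistic regression we actually used, and every element of it is hypothetical: the sample, the variable names, and the funding cutoff year.

```python
# Toy illustration (synthetic data): does the availability of federal funding
# for DNA testing predict whether a kit was submitted for testing?
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
year = rng.integers(1980, 2010, size=n)              # year the kit was collected
after_funding = (year >= 1999).astype(int)           # hypothetical funding cutoff
true_logit = -1.5 + 1.0 * after_funding              # synthetic "true" effect
tested = rng.binomial(1, 1 / (1 + np.exp(-true_logit)))

df = pd.DataFrame({"tested": tested, "year": year, "after_funding": after_funding})

# A single-level logistic regression as a simplified stand-in for the multilevel
# model described in the text.
model = smf.logit("tested ~ after_funding + year", data=df).fit(disp=False)
print(model.summary())
```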
To assess triangulation of a finding that emanated from further analysis of a data point, we
applied the same codes in Table 1 to each finding. For our qualitative results, the themes within
each assertion had already been triangulated as data points, thus our task was to ascertain whether the
mechanistic assertion in the finding could be supported with other qualitative or quantitative find-
ings. For the quantitative results, we examined whether statistically significant effects in the multi-
level logistic regression model were also identified as salient and important in the qualitative data
(and vice versa for nonsignificant findings).
Results
“The Unreliable Narrator”: A Case Study of Red and Yellow Data Points
In literature, an unreliable narrator is someone who tells a story while layering a distorted lens over
that reality such that the resulting narrative becomes untrustworthy. Whereas the concept of “truth”
is a subject of debate in qualitative research (Lincoln, Lynham, & Guba, 2011; Randall & Phoenix,
2009), in an evaluation context, it may be helpful to at least know whether a participant’s views may
be unreliable and untrustworthy. For example, in our efforts to understand why so many rape kits in
this jurisdiction were not submitted for forensic DNA testing, we inquired about the focal police
department’s policies regarding submitting kits for testing. Multiple stakeholders in that organiza-
tion stated that the department had a written SAK submission policy, but when we asked (repeat-
edly) to see that policy, we were not provided with any documentation; other stakeholder groups said
they had never seen the policy and when they also tried to obtain a copy of the documentation, none
was provided. We could not verify the existence of this policy through any source external to this
organization or through another data collection method. The triangulation code for this data point
was yellow: Individuals within the same organization provided consistent information, using the
same method, but no outside source (i.e., another stakeholder group) or data collection method (i.e.,
archival record of the written policy) could verify this information. Given the salience of this matter
(whether there was a written policy regarding SAK submissions and testing), we decided to discuss
this data point in our final report, highlighting it as “yellow” information to be clear that the focal
organization stated it had a policy, but we could find no tangible evidence of its existence. As
evaluators, we felt that the yellow data were credible and actionable and highlighted this finding in
our dissemination, a decision supported by other organizations in the action research project.
Digging deeper into the transcript of the individual who first told us that the police department
had a written policy about SAK testing, we noticed that other key details—such as the sequence of
events regarding the discovery of the untested kits, descriptions of meetings about that discovery,
dates of key events, who was involved in these events—could not be triangulated at all. These data
points were coded as red because no other individuals within or across stakeholder groups—or
another data collection method—could verify the information as provided. Reviewing these red
data points revealed an interesting pattern: Each piece of (unverified) information presented the
focal organization in a better light—slight differences in the timing of events, sequences, and actors
involved that together helped portray the organization as less blameworthy. The interview was
replete with these systematic shifts and slights, interspersed with information that did triangulate
either across stakeholders or data collection method (i.e., green). When we rechecked our qualitative
field notes regarding our interactions with this individual (in team meetings and in the interview
itself), we had not made any mention that we were suspicious about the veracity of the information
provided; rather, our notes indicated that this individual was straightforward, informative, and
helpful. And, it turns out, incorrect—incorrect in a patterned, predictable way. It is not surprising
that someone might spin a narrative to favor a particular perspective, and again, some qualitative
researchers might find that an interesting issue to pursue in its own right. Our point is that we did not
know and could not tell that the information was inaccurate until we conducted this triangulation
analysis. Had we used these specific data points, as provided by this individual, we would have
disseminated incorrect information that would have cast this organization in a better light than was
warranted. Based on the results of the triangulation coding, we (the evaluation team) decided that
this transcript should not be included in subsequent analyses. We checked whether excluding the transcript would remove unique points of view from consideration (i.e., perspectives not shared by others but not inaccurate). It would not: the transcript contained multiple inaccuracies, but its other ideas and comments had also been raised by other stakeholders and could be independently verified.
“Don’t Go There”: A Case Study of a Red Data Point
In one qualitative interview, a participant raised the issue of vicarious trauma, a process in which repeated exposure to highly upsetting, traumatic material starts to change people’s cognitions and behaviors, often
decreasing their compassion and empathy for others (Figley, 1995; Office for Victims of Crime,
2017). This participant posited that investigating sexual assault cases was causing vicarious trauma
among service providers, and this trauma was negatively affecting their personal health and well-
being and, ultimately, the quality of their professional work (e.g., treating rape survivors with
decreased respect and empathy). This was a distinctive point of view, one that readily stood out
while we were open-coding the data; however, as the analyses proceeded, we did not find other
mentions of this idea from other stakeholder groups in observational data, interview data, or archival
records.
When we initially reviewed this red data point, we were unsure whether this was an instance of
missing/incomplete data or unique data. Given that we were conducting follow-up interviews with
participants, as well as informal interviews throughout the project, we attempted to broach the subject of vicarious trauma to ascertain whether others also felt it was a salient issue but simply had not mentioned it in their interviews (i.e., red due to missingness that could convert to yellow or green data) or whether this was indeed a unique perspective (i.e., red due to uniqueness that would
remain red). In that follow-up work, we realized that this code was red because participants did not
want to talk about this subject—it was not accidentally missing, it was purposely missing. It was an
off-limits topic: when asked directly, stakeholders gave numerous verbal and behavioral indications that this was not something they wished to discuss. Thus, the data point would remain red,
due to uniqueness, and we had to decide what to do with the information we had learned. This project
was not intended to be a study of vicarious trauma—nothing in the research materials, consent forms, and so on indicated to participants that this would be a subject of inquiry, and we respected participants’ limits and did not pursue any further analyses on this topic.
However, we did disclose in our final written report (which was previewed by the stakeholders prior
to submission to the funder) that this topic came up in the project but was not deemed focal by the
stakeholders. We debated whether inclusion of this example here, in this article, pushed those limits,
but because we had already signaled to stakeholders that we retained the option to mention the
absence of this theme (as we did in the final report), we decided to tell this story, and its backstory,
here to advance discussion about the underlying reasons why data may be missing or unique in
triangulation coding.
“Saving Our Bacon”: A Case Study of Dark Green Data
The archival police reports revealed a recurring pattern in which police discounted victims whom they suspected of engaging in prostitution (e.g., reports contained written documentation that because they thought a victim might have been engaged in prostitution, they did not believe her account of the sexual assault). In interviews with other stakeholder groups,
participants noted that they were aware of this practice by police and gave specific examples of cases
in which they had seen this practice. Given how strongly this theme triangulated (i.e., dark green),
we were confident in moving forward to explore how this “presumption of prostitution” theme was
related to other data points, such as police investigational effort and how victims were treated by the
police.
For the members of the evaluation team, this was clearly a credible and actionable finding, as it
revealed problematic practices that needed to be remedied with training, supervision, and broader
organizational norm-setting, so we brought it to our community partners for discussion. We antici-
pated that we might get some pushback from police personnel about our public airing of this theme
and subsequent findings pertaining to this theme, and indeed, law enforcement challenged us on this
specific issue. The triangulation assessment had helped us identify and organize incontrovertible
evidence, which we could—and did—lay out for stakeholders to refute their assertion that this practice was not common. The triangulation data saved our bacon when we were attacked for
presenting controversial, politically sensitive information. As stakeholders continued to discuss
these results, they agreed that all organizations in the partnership would benefit from additional
training on best practices in working with sexual assault survivors, which was instituted a few
months later.
“Agree to Disagree”: A Case Study of Yellow Data
Participants from several non-police stakeholder groups asserted that sexism, racism, and classism shaped the police response to these victims, a characterization the police disputed. The archival police reports did contain behavioral indicators of sexism in that victims were referred to as "ho's" and "heffers" and other
derogatory names. Given that most of these victims were poor African American women and girls,
it seemed inconceivable to us that these derogatory references weren’t also steeped in racism and
classism, even though specific racialized language in the reports was rare. Taken together, some
coders felt this information merited a green triangulation code (i.e., a variation on across-method,
across stakeholder triangulation); however, other coders thought yellow was the more appropriate
code because the data from the stakeholder group at the heart of this finding (the police) did not
triangulate and the archival data did not provide, in their view, clear support either. All coders did
agree that if we were to label this finding as green, such a designation would obscure the contro-
versy about this matter and would not clearly convey that there were vastly different perspectives
on this issue.
In the end, we decided on yellow as the triangulation code, as it reflected both the conflicting views among stakeholders and the "agree to disagree" decision within the evaluation team; we did not feel entirely settled with this assessment. To the evaluators and to all project stakeholders except the
police, this was deemed an actionable finding, one that merited inclusion in our final report, in
subsequent publications, and in training curricula that emanated from this project. As expected, the
police “agreed to disagree” with the inclusion of this finding in the report (and other outlets) and
asked that we include their objection and their rationale for their objection, which we readily agreed
to do as it helps convey the disparate perspectives in these data. This case example highlights what
would be problematic about presenting only convergent data for analysis and dissemination. Includ-
ing “yellow data” provides a way for evaluators to document and explain the nature of contested
spaces.
Discussion
In this study, we wanted to extend the technical literature on triangulation by developing and
implementing an assessment procedure for use in a large-scale project that spanned multiple meth-
odologies, data collection methods, and stakeholder groups. We created a color-coded system that
evaluated the extent of within and/or across source convergence of individual data points and larger
findings that came from further analysis of those data. The green-yellow-red color codes further
distinguished whether divergence was attributable to conflicting information or missing/unique data.
The coding procedures we established were straightforward to implement with standard word
processing and spreadsheet software and did not require specialized qualitative analysis software.
However, because the underlying logic of this approach is rooted in Glaser’s constant comparison
process (Glaser, 2007), the coding framework and operational definitions we developed (Table 1)
could be easily implemented in specialized analysis software. Overall, we spent 3 weeks (in a 30-
month project) working on these analyses; the volume of data to be scanned increased time-to-
completion, but the coders’ deep familiarity with the data prior to conducting the triangulation
analyses helped our timeline. Consistent with Mathison’s (1988) perspective that evaluators need
to think carefully about what merits triangulation and why, we did not assess convergence for every
data point, and instead restricted our analysis to a primary focal question and to specific data points/
themes pertaining to that question.
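To illustrate the spreadsheet-style workflow described above, the sketch below applies the same thresholds as Table 1 to a toy spreadsheet in which each row is a data point and each column is a stakeholder group/method combination holding a count of corroborating sources. The layout, column names, and counts are hypothetical rather than drawn from our actual coding files, and for brevity the sketch omits the conflict condition that can also produce a yellow code.

```python
from io import StringIO

import pandas as pd

# Hypothetical spreadsheet: one row per data point; each column is
# "<stakeholder group>|<method>" and holds a count of corroborating sources.
csv_text = """data_point,Police|Interview,Police|Archival,Prosecutors|Interview,Nursing|Interview
data point 1,3,0,0,0
data point 2,2,1,2,1
data point 3,0,0,1,0
"""
sheet = pd.read_csv(StringIO(csv_text)).set_index("data_point")


def suggested_code(row: pd.Series) -> str:
    """Suggest a Table 1 color code from the counts in one spreadsheet row."""
    hits = [col for col, count in row.items() if count > 0]
    groups = {col.split("|")[0] for col in hits}
    methods = {col.split("|")[1] for col in hits}
    if int(row.sum()) <= 1:
        return "red"           # a single uncorroborated source
    if len(groups) > 1 and len(methods) > 1:
        return "dark green"
    if len(groups) > 1 or len(methods) > 1:
        return "green"
    return "yellow"            # corroborated only within one group and one method


print(sheet.apply(suggested_code, axis=1))
```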
We acknowledge that our coding system does not capture as much detail as the procedures
proposed by Farmer et al. (2006) and Sands and Roer-Strier (2006) because it does not “roll in” to
the coding an interpretation of the disagreement (e.g., “silence,” “dissonance,” “illumination”).
Given the tremendous volume of data we had to scan, we felt this was a reasonable modification:
This coding system reliably locates and distinguishes convergent and divergent data, with some
initial context about the nature of disagreement (i.e., yellow/red) that can be explored with stake-
holders in later stages of analysis. We also note that our coding methods and case study examples do
not address the conceptual and technical issues of assessing triangulation longitudinally (see Denzin,
1978). Across-time triangulation has not been well explored in research to date, and such work
would need to consider whether change over time reflects a “failure” in triangulation or a substantive
outcome (akin to how low test–retest reliabilities may reflect development, not unreliability per se;
Singleton & Straits, 2018). In this action research project, our aim was to transform this city’s
response to sexual assault, so change over time was a desired outcome. What across-time triangula-
tion means in such contexts is complicated, as direction and rate of change over time may or may not
converge across stakeholder groups and across data collection methods. We hope that braver eva-
luation teams will venture into this conceptual and methodological work.
In this triangulation project, we also wanted to explore the intersection of the technical and
political levels of inquiry to explore how divergent data are interpreted and acted upon by evaluators
and stakeholders (Greene & Caracelli, 1997). Historically, triangulation has been conceptualized as
an indicator of validity, validity as a component of credibility, and credibility as a determinant of
action (Mark, 2015). Thus, it might be expected that only convergent data (i.e., dark green/green)
would be considered credible and actionable, as these data points/findings were consistent across
methodology, data collection method, and/or stakeholder group. However, in this project dark green/
green, yellow, and red data were viewed as credible by both the evaluation team and stakeholders;
direct action was taken based on dark green/green and yellow data. In other words, disputed,
conflicting data were considered a reasonable evidence base for action—why?
As Miller (2015) pointed out, the accuracy of evaluative findings influences stakeholders’ per-
ceptions of credibility, and triangulation coding provides a structured way of checking for inaccura-
cies. We were concerned that stakeholders might provide misinformation to try to influence the
findings, as was highlighted in the “The Unreliable Narrator” case study. Our decisions to cross-
check information, to be transparent with our partners that we were doing this, and to remove
incorrect information from the analyses boosted our credibility with our stakeholders as it signaled
that we were aware of the politics at play. Collecting corroborating evidence is standard practice in the criminal justice system, and our use of those techniques in the evaluation helped establish
trust, but we acknowledge that in other settings and systems, triangulation assessments might not be
as positively perceived by stakeholders.
Whether evaluation findings are perceived as credible and actionable is also influenced by the
extent to which stakeholders can “see” the process by which the findings were generated, particu-
larly so in qualitative evaluation (Rallis, 2015). As the case studies “Saving Our Bacon” and “Agree
to Disagree” illustrated, we had some rather damning stories to tell in this project. The data were
clear that rape myth acceptance was a key reason why police did not submit rape kits for DNA
testing, and the triangulation coding helped us compile and organize information to show the police
how we reached this conclusion. We were able to capture disagreements regarding the role of
sexism, racism, and classism in the law enforcement response to sexual assault, and these disputes
are part of the story (Greene et al., 2001). That our findings “showed” these conflicts was important
to stakeholders and increased the credibility of the work as a foundation for action.
Sometimes, we did not have sufficient evidence to be able to tell a story and, therefore, nothing to "show" and act upon. In the case study "Don't Go There," our initial thought was that our red data
point regarding vicarious trauma among service providers reflected incomplete data that would
triangulate with more data collection. It did not. Instead, we learned that this was an off-limits topic
to stakeholders. We did not feel that we had sufficient data to make a claim that vicarious trauma had
affected police actions in these cases of untested rape kits, but there was a short story to be told about
why we could not make this claim. We elected to tell that story here to emphasize that triangulation
coding can reveal more than simply convergence and divergence. Here, it uncovered what seemed to
be an important subject for future research and evaluation, but one that was ultimately out of scope
for this action research project, despite our best efforts to understand its relevance.
Finally, credibility and usability are evaluated by stakeholders in situ, and what is relevant and
needed in that context is paramount (Greene, 2015; Julnes & Rog, 2015; Schwandt, 2015). The
context of this work was what one stakeholder termed "an unimaginable public safety crisis": 11,000+ sexual assaults had been reported to the police and it was wholly unclear how many of
these cases had been thoroughly investigated, given that key evidence sat untested. This city needed
to understand what went wrong and how to fix it—quickly—because the clock was literally running
as the statute of limitations on these criminal cases was expiring. Our action research project had to
address immediate, pressing needs: Did the city have any resources to leverage to address the
problem, and what resources did they need? In the “One Way or Another” case study, we saw that
even conflictual information about the utility of specific resources was helpful and actionable for
stakeholders as they sought to remedy this problem. It wasn’t “perfect” information by any means,
nothing that would sit near the top of any “hierarchy of evidence” model (see Schwandt, 2015 for a
critical review). Yet, the triangulation assessments helped stakeholders understand “what was cer-
tain and what was uncertain,” as one leader noted, and that information, in this context, was
sufficient evidence for action.
Authors’ Note
The opinions or points of view expressed in this document are solely those of the authors and do not reflect the
official positions of any participating organization or the U.S. Department of Justice.
Acknowledgments
The authors thank the members of the action research project and academic colleagues who provided feedback
on prior drafts of this manuscript.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publica-
tion of this article: The action research project described in this article was supported by a grant from the
National Institute of Justice (2011-DN-BX-0001).
Note
1. We recognize that red-yellow-green color coding as a data visualization method is problematic for color-
blind individuals; therefore, any visualizations of data coded with this method would need to be modified
appropriately (see Evergreen, 2014).
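A minimal sketch of one such modification (not part of the original project): pair each color code with a redundant, non-color cue, so the coding remains legible without hue perception. The hex values below are illustrative picks from the Okabe-Ito palette, which is widely recommended as colorblind-safe.

```python
# Map each triangulation code to a fill color plus redundant text cues, so that
# meaning never depends on hue alone. Colors are illustrative Okabe-Ito values.
CODE_STYLES = {
    "dark green": {"fill": "#009E73", "symbol": "++", "label": "DG"},
    "green":      {"fill": "#0072B2", "symbol": "+",  "label": "G"},
    "yellow":     {"fill": "#F0E442", "symbol": "~",  "label": "Y"},
    "red":        {"fill": "#D55E00", "symbol": "x",  "label": "R"},
}


def render_cell(code: str) -> str:
    """Return a text rendering of a coded cell (e.g., for a report table)."""
    style = CODE_STYLES[code]
    return f"{style['label']} ({style['symbol']})"


print(render_cell("yellow"))  # -> "Y (~)"
```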
References
Alexander, M. (2012). The new Jim Crow: Mass incarceration in the age of colorblindness. New York, NY:
The New Press.
Bailey, A., & Hutter, I. (2008). Qualitative to quantitative: Linked trajectory of method triangulation in a study
on HIV/AIDS in Goa, India. AIDS Care, 20, 1119–1124.
Bonilla-Silva, E. (1997). Rethinking racism: Toward a structural interpretation. American Sociological Review,
62, 465–480.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod
matrix. Psychological Bulletin, 56, 81–105.
Campbell, R., Feeney, H., Fehler-Cabral, G., Shaw, J., & Horsford, S. (2017). The national problem of untested
sexual assault kits (SAKs): Scope, causes, and future directions for research, policy, and practice. Trauma,
Violence, & Abuse, 18, 363–376.
Campbell, R., Fehler-Cabral, G., Pierce, S. J., Sharma, D., Bybee, D., Shaw, J., . . . Feeney, H. (2015). The
Detroit sexual assault kit (SAK) action research project (ARP). Washington DC: National Institute of
Justice.
Campbell, R., Patterson, D., & Lichty, L. F. (2005). The effectiveness of sexual assault nurse examiner (SANE)
programs: A review of psychological, medical, legal, and community outcomes. Trauma, Violence, &
Abuse, 6, 313–329.
Corbin, J., & Strauss, A. (2008). Basics of qualitative research: Techniques and procedures for developing
grounded theory (3rd ed.). Thousand Oaks, CA: Sage.
Deacon, D., Bryman, A., & Fenton, N. (1998). Collision or collusion? A discussion and case study of the
unplanned triangulation of quantitative and qualitative research methods. International Journal of Social
Research Methodology, 1, 47–63.
Denzin, N. (1978). Sociological methods. New York, NY: McGraw-Hill.
Department of Justice. (2013). A national protocol for sexual assault medical forensic examinations: Adults &
adolescents (2nd ed.). Washington, DC: Author.
Department of Justice. (2015). Identifying and preventing gender bias in law enforcement response to sexual
assault and domestic violence. Washington, DC: Author.
Donaldson, S., Christie, C., & Mark, M. (2015). Credible and actionable evidence. Thousand Oaks, CA: Sage.
Erickson, F. (1986). Qualitative methods in research on teaching. In M. C. Wittrock (Ed.), Handbook of
research on teaching (pp. 119–161). New York, NY: Macmillan.
Erzberger, C., & Prein, G. (1997). Triangulation: Validity and empirically-based hypothesis construction.
Quality and Quantity, 31, 141–154.
Farmer, T., Robinson, K., Elliott, S. J., & Eyles, J. (2006). Developing and implementing a triangulation
protocol for qualitative health research. Qualitative Health Research, 16, 377–394.
Figley, C. R. (1995). Compassion fatigue: Toward a new understanding of the costs of caring. In B. H. Stamm
(Ed.), Secondary traumatic stress: Self-care issues for clinicians, researchers, and educators (pp. 3–28).
Baltimore, MD: Sidran Press.
Flick, U. (1992). Triangulation revisited: Strategy of validation or alternative? Journal for the Theory of Social
Behaviour, 22, 175–197.
Flick, U., Garms-Homolova, V., Herrmann, W. J., Kuck, J., & Röhnsch, G. (2012). “I can’t prescribe something
just because someone asks for it . . . ”: Using mixed methods in the framework of triangulation. Journal of
Mixed Methods Research, 6, 97–110.
Glaser, B. G. (2007). Doing formal theory. In A. Bryant & K. Charmaz (Eds.), The SAGE handbook of
grounded theory (pp. 97–113). Thousand Oaks, CA: Sage.
Greene, J. C. (2002). With a splash of soda, please: Towards active engagement with difference. Evaluation, 8,
259–266.
Greene, J. C. (2007). Mixed methods in social inquiry. San Francisco, CA: John Wiley.
Greene, J. (2015). How evidence earns credibility in evaluation. In S. Donaldson, C. Christie, & M. Mark (Eds.),
Credible and actionable evidence (pp. 205–220). Thousand Oaks, CA: Sage.
Greene, J. C., Benjamin, L., & Goodyear, L. (2001). The merits of mixing methods in evaluation. Evaluation, 7,
25–44.
Greene, J. C., & Caracelli, V. J. (1997). Defining and describing the paradigm issue in mixed-method evalua-
tion. New Directions for Evaluation, 74, 5–17.
Greene, J., & McClintock, C. (1985). Triangulation in evaluation: Design and analysis issues. Evaluation
Review, 9, 523–545.
Hammersley, M. (2008). Questioning qualitative inquiry: Critical essays. London, England: Sage.
Howe, K. R. (2012). Mixed methods, triangulation, and causal explanation. Journal of Mixed Methods
Research, 6, 89–96.
Human Rights Watch. (2009). Testing justice: The rape kit backlog in Los Angeles City and County. New York,
NY: Author.
Human Rights Watch. (2010). “I used to think the law would protect me”: Illinois’s failure to test rape kits. New
York, NY: Author.
Julnes, G., & Rog, D. (2015). Actionable evidence in context. In S. Donaldson, C. Christie, & M. Mark (Eds.),
Credible and actionable evidence (pp. 221–258). Thousand Oaks, CA: Sage.
Kidder, L. H., & Fine, M. (1987). Qualitative and quantitative methods: When stories converge. New Directions
for Evaluation, 35, 57–75.
Klofas, J., Hipple, N. K., & McGarrell, E. (Eds.). (2010). The new criminal justice: American communities and
the changing world of crime control. New York, NY: Routledge.
Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic inquiry. Newbury Park, CA: Sage.
Lincoln, Y. S., Lynham, S. A., & Guba, E. G. (2011). Paradigmatic controversies, contradictions, and emerging
confluences, revisited. In N. K. Denzin & Y. S. Lincoln (Eds.), The SAGE handbook of qualitative research
(pp. 97–128). Thousand Oaks, CA: Sage.
MacQueen, K. M., McLellan-Lemal, E., Bartholow, K., & Milstein, B. (2008). Team-based codebook devel-
opment: Structure, process, and agreement. In G. Guest & K. M. MacQueen (Eds.), Handbook for team-
based qualitative research (pp. 119–135). Lanham, MD: Altamira.
Mark, M. (2015). Credible and actionable evidence. In S. Donaldson, C. Christie, & M. Mark (Eds.), Credible
and actionable evidence (pp. 275–302). Thousand Oaks, CA: Sage.
Mathison, S. (1988). Why triangulate? Educational Researcher, 17, 13–17.
McConney, A., Rudd, A., & Ayres, R. (2002). Getting to the bottom line: A method for synthesizing findings
within mixed-method program evaluation. American Journal of Evaluation, 23, 121–140.
Miles, M. B., Huberman, A. M., & Saldana, J. (2014). Qualitative data analysis: A methods sourcebook.
Thousand Oaks, CA: Sage.
Miller, R. (2015). How people judge the credibility of information. In S. Donaldson, C. Christie, & M. Mark
(Eds.), Credible and actionable evidence (pp. 39–61). Thousand Oaks, CA: Sage.
Moran-Ellis, J., Alexander, V. D., Cronin, A., Dickinson, M., Fielding, J., Sleney, J., & Thomas, H. (2006).
Triangulation and integration: Processes, claims and implications. Qualitative Research, 6, 45–59.
Morse, J. M. (2015). Critical analysis of strategies for determining rigor in qualitative inquiry. Qualitative
Health Research, 25, 1212–1222.
Murphy, S. B., Banyard, V. L., & Fennessey, E. D. (2013). Exploring stakeholders’ perceptions of adult female
sexual assault case attrition. Psychology of Violence, 3, 172–184.
National Institute of Justice. (2010). Solicitation: Strategic approaches to sexual assault kit (SAK) evidence: An
action research project (SL #000947). Washington, DC: Author.
Office for Victims of Crime. (2017). The vicarious trauma toolkit. Washington, DC: Author.
Patton, M. Q. (2008). Utilization-focused evaluation (4th ed.). Thousand Oaks, CA: Sage.
Pinchevsky, G. M. (2018). Criminal justice considerations for unsubmitted and untested sexual assault kits: A
review of the literature and suggestions for moving forward. Criminal Justice Policy Review, 29, 925–945.
Rallis, S. (2015). When and how qualitative methods provide credible and actionable evidence. In S. Donald-
son, C. Christie, & M. Mark (Eds.), Credible and actionable evidence (pp. 137–156). Thousand Oaks, CA:
Sage.
Randall, W. L., & Phoenix, C. (2009). The problem with truth in qualitative interviews: Reflections from a
narrative perspective. Qualitative Research in Sport and Exercise, 1, 125–140.
Sands, R. G., & Roer-Strier, D. (2006). Using data triangulation of mother and daughter interviews to enhance
research about families. Qualitative Social Work, 5, 237–260.
Schwandt, T. (2015). Credible evidence of effectiveness: Necessary but not sufficient. In S. Donaldson, C.
Christie, & M. Mark (Eds.), Credible and actionable evidence (pp. 259–273). Thousand Oaks, CA: Sage.
Sidanius, J., & Pratto, F. (2001). Social dominance: An intergroup theory of social hierarchy and oppression.
Cambridge, England: Cambridge University Press.
Singleton, R. A., & Straits, B. C. (2018). Approaches to social research (6th ed.). New York, NY: Oxford University Press.
Strom, K. J., & Hickman, M. J. (2010). Unanalyzed evidence in law-enforcement agencies. Criminology &
Public Policy, 9, 381–404.
Webb, E. J., Campbell, D. T., Schwartz, R. D., & Sechrest, L. (1966). Unobtrusive measures: Nonreactive
research in the social sciences. Chicago, IL: Rand McNally.
Weiss, C. H. (1973). Where politics and evaluation research meet. Evaluation Practice, 14, 93–106.