Suurtamm & Koch (2014). DOI 10.1007/s11092-014-9195-0
There have been increasing calls in both the classroom assessment literature
(e.g., Gardner 2006; Stobart 2008) and in the mathematics education literature
(e.g., National Council of Teachers of Mathematics [NCTM] 1995, 2000a; Wiliam
2007) for teachers to shift their assessment practices. In both literatures, teachers are
C. Suurtamm (*)
Faculty of Education, University of Ottawa, 145 Jean Jacques Lussier,
Ottawa, ON K1N 6N5, Canada
e-mail: [email protected]
M. J. Koch
Faculty of Education, University of Manitoba, Rm 236 Education Bldg.,
71 Curry Place, Winnipeg, MB R3T 2N2, Canada
e-mail: [email protected]
1 Theoretical perspectives
Teachers’ opportunities for dialogue and collaborative work are well recognized as
ways of encouraging change in classroom practice and supporting the implementation
of new ideas (Cochran-Smith and Lytle 2009; Fullan 2001; Hargreaves 2009; Lachance
and Confrey 2003; Webb and Jones 2009). We see the value of collaboration as
grounded in work on situated learning and communities of practice that suggest that
social practice is the primary, generative source of learning (Lave and Wenger 1991;
Wenger 1998). However, in many cases, learning communities have been set up in
school districts as part of the rhetoric of school reform and often have an external
agenda that may have little meaning for teachers (Hargreaves 2009). In contrast, in a
community of practice that is meaningful for teachers, dialogue needs to be practice-
based and practitioner-directed (Lee and Shaari 2012). The resulting exploration of
practice can adapt to the immediate learning priorities of the teachers, thereby enabling
rich insights to emerge. In this project, we sought to provide such a forum wherein
teachers could explore their experiences with assessment, engage in dynamic and
iterative discussions about topics that are meaningful to their assessment practice,
and try new assessment ideas. As researchers, we sought to better understand the action
and thinking of these teachers as they implemented new assessment approaches and as
they participated in these communities of practice.
In Ontario,¹ recent policy documents and ongoing educational reform efforts reflect
current thinking about mathematics teaching and learning and about classroom assess-
ment practice. The provincial mathematics curriculum documents that teachers are
required to follow serve essentially the same purpose as standards in the USA. The
Ontario curriculum encourages teachers to engage students in mathematical activity
through problem solving and collaborative investigation. These activities help to
support students’ procedural and conceptual understanding of mathematical ideas
(OME 2005a, 2005b, 2007). The curriculum also includes important messages about
classroom assessment that align with current assessment literature. In addition, a recent
provincial assessment policy document, Growing Success (OME 2010), establishes
principles to guide teachers in their assessment practice. This document discusses the
multiple purposes of assessment and provides descriptions of assessment for learning
(AfL), assessment as learning (AaL), and assessment of learning (AoL). In addition,
Growing Success distinguishes between assessment (i.e., the process of gathering
information about how well students are achieving the curriculum expectations) and
evaluation which is defined as “the process of judging the quality of student learning on
the basis of established criteria” (OME 2010, p. 147).² Teachers are encouraged to use a
variety of assessment tools and to provide students with multiple opportunities to show
what they know and can do. The value of providing students with ongoing feedback
and developing students’ self-assessment skills is also stressed. Another important
dimension of this document is that it explicitly acknowledges the role of professional
¹ In Canada, education is a provincial responsibility.
² Though this distinction is clearly made in the Growing Success document, many Ontario teachers either use assessment and evaluation interchangeably or use the term assessment to encompass both activities.
3 Analytic framework
Both the literature we have reviewed and our experiences working with teachers in
various contexts suggest that as teachers incorporate new assessment practices and
implement inquiry-based approaches to mathematics teaching and learning, they are
likely to face multiple and varied challenges. In this study, we frame the challenges that
were discussed in the communities of practice as dilemmas of practice. Focusing on
these dilemmas of practice enables us to value the complexity of educational change, to
better understand the change process, and to suggest ways that teachers can be
supported as they further develop their practice (Adler 1998; Windschitl 2002). The
analytic framework we used to better understand these teachers’ dilemmas is adapted
from one developed by Windschitl (2002). Windschitl used a framework of four types
of dilemmas (conceptual, pedagogical, cultural, and political) to put forth a phenom-
enological perspective on what it means to enact constructivist teaching methodologies.
This phenomenological perspective recognizes that enacting constructivist methodolo-
gies is not a matter of simply applying instructional strategies but rather must consider
the multiple contexts of teaching and the complex web of beliefs and concerns of a
variety of stakeholders including teachers, administrators, students, and parents. The
enactment of new practices brings forth ambiguities, tensions, and compromises that
need to be negotiated in practice. These dilemmas are important aspects of teachers’
intellectual and lived experience and play a key role in day-to-day classroom practice.
We see a strong parallel between Windschitl’s (2002) characterization of the enact-
ment of constructivist teaching methodologies and our observations of the enactment of
new approaches to classroom assessment. We found that using an adaptation of
Windschitl’s framework suits the situation of teachers introducing new assessment
practices in their mathematics classrooms and helps us to more closely examine the
types of dilemmas teachers face. We first outline Windschitl’s definitions of the four
dilemmas and then present our adaptation of his framework. In the “Method” section,
we discuss how we used this adapted framework in the analysis of our data.
Windschitl (2002) presents conceptual dilemmas as arising from teachers’ grappling
with the philosophical, psychological, and epistemological underpinnings of construc-
tivism. For instance, teachers may question whether particular disciplinary knowledge
can be constructed by students or needs to be explicitly taught. Pedagogical dilemmas
arise from attempts to design learning experiences that are based on a constructivist
philosophy and to adopt strategies such as managing new kinds of discourse or
facilitating students’ collaborative work. Cultural dilemmas emerge from attempts to
shift the classroom culture which might include a reorientation of the roles of teachers
and students in constructivist learning contexts or managing the various expectations of
teachers and students. As an example, teachers may struggle with how to develop a
classroom culture where students take responsibility for their own learning. Political
dilemmas emerge when constructivist ideas meet stakeholder norms and policies that
may appear to conflict with constructivist ideas. For instance, teachers may wonder
whether constructivist ways of teaching will adequately prepare students for standard-
ized testing situations. While highlighting each type of dilemma can be very useful,
Windschitl points out that dilemmas are inherently complex and may not necessarily
fall neatly into one of these four domains. Therefore, the overlap and interconnections
between dilemmas must also be examined.
The adaptation of Windschitl’s (2002) framework to address the dilemmas teachers
face when implementing new assessment ideas emerged primarily from our previous
research with teachers where we were able to see examples of each category. We see
conceptual dilemmas in assessment arise as teachers attempt to understand the concep-
tual underpinnings of current views of assessment and of mathematics teaching and
learning. For instance, teachers often are not presented with the rationale for making
changes in assessment practice, and they question whether the changes will improve
student learning. They grapple with such things as the different purposes of assessment,
the role of formative assessment, the value of aligning instruction and assessment, or
what it means to understand mathematics. Pedagogical dilemmas arise as teachers
create and enact new assessment opportunities. These dilemmas are often connected
to how to create assessment tasks, strategies, and tools, and they may occur as teachers
design mathematics activities, determine ways of recording, or work to find time for
meeting with students and providing feedback. Cultural dilemmas focus on changes in
classroom and school culture with regard to assessment practice. Windschitl seems to
restrict this category to the cultural changes in the classroom experienced by teachers
and students and places other stakeholders in the political dilemma category. In our
adaptation, we see cultural dilemmas as those that arise within the broader school
culture, including administrators, parents, and other stakeholders. We see this broader
cultural context as strongly influencing or even co-constructing the classroom culture.
Teachers may face dilemmas when their new assessment practices threaten existing
cultural practices within a school or department setting or challenge parents’ notions of
assessment. Political dilemmas arise when teachers try to align their thinking and
practice with provincial, district, and school policies around assessment, particularly
with regard to accountability. For instance, teachers may be trying to make sense of a
new assessment policy that may conflict with the way they were thinking about
assessment. Table 1 summarizes our definitions of the dilemma categories in
assessment and offers additional examples to provide more clarity.

Table 1 Dilemma categories in assessment

Conceptual dilemmas
  Definition: Grappling with current thinking in assessment and in mathematics
  teaching and learning; considering the “why” of assessment
  Examples: Understanding the different purposes of assessment; questioning why a
  test alone will not suffice in the assessment of mathematics

Pedagogical dilemmas
  Definition: Grappling with the creation of assessment tasks, strategies, and tools;
  dealing with the “how to” of assessment
  Examples: Finding efficient ways to record observations; designing a meaningful
  rubric; finding ways to increase students’ involvement in the assessment process

Cultural dilemmas
  Definition: Focus on changes in classroom and school culture with regard to
  assessment practice; often arise when new assessment practices threaten existing
  cultural practices
  Examples: Dealing with student expectations with respect to marks; being influenced
  by colleagues’ concerns about new approaches to assessment; addressing parents’ or
  administrators’ concerns

Political dilemmas
  Definition: Dealing with school, district, or provincial policies on classroom and
  large-scale assessment that may or may not align with teachers’ assessment thinking
  and practices
  Examples: Being restricted to pre-made report card comments that do not align with
  teacher thinking; aligning assessment levels used on rubrics with required report
  card percentage grades; reconciling current thinking in classroom assessment with
  the requirements of test-based accountability assessments
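To make the non-exclusive coding summarized in Table 1 concrete, the short sketch below shows one way dilemma-coded conversations (the unit of analysis described in the “Method” section) might be represented in software. It is a minimal illustration only; the class and function names are hypothetical and this is not the coding procedure or software used in the study.

    from dataclasses import dataclass, field
    from enum import Enum
    from collections import Counter


    class DilemmaType(Enum):
        CONCEPTUAL = "conceptual"
        PEDAGOGICAL = "pedagogical"
        CULTURAL = "cultural"
        POLITICAL = "political"


    @dataclass
    class Conversation:
        # One conversation is the unit of analysis; it may carry several
        # dilemma tags because the categories are not mutually exclusive.
        district: str
        date: str
        summary: str
        dilemmas: set = field(default_factory=set)


    def tally_by_category(conversations):
        # Count how often each category was coded; a conversation tagged
        # with two categories contributes to both counts.
        counts = Counter()
        for conversation in conversations:
            counts.update(conversation.dilemmas)
        return counts


    # Hypothetical example: a conversation about "leveled" questions,
    # coded as both conceptual and pedagogical.
    example = Conversation(
        district="B",
        date="2008-11-17",
        summary="Leveled questions on unit tests",
        dilemmas={DilemmaType.CONCEPTUAL, DilemmaType.PEDAGOGICAL},
    )
    print(tally_by_category([example]))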
4 Method
and in meeting with others to discuss their assessment practice. They also assisted us
with establishing the communities of practice by helping to plan meetings throughout
the project and by participating in the sessions.
At the initial meeting in each district, we addressed questions about the project,
obtained participants’ consent, and provided an opportunity for participants to become
acquainted with the researchers and with one another. The participants also began to
share aspects of their assessment practice. We asked each participant to post an
assessment challenge on chart paper and then encouraged them to form small discus-
sion groups based on similar challenges. We also negotiated a tentative schedule of
future meetings to occur every 4–8 weeks depending on constraints such as school
breaks, examinations, or reporting periods. The initial meeting in each district took
place in November 2008, and we continued to meet with these communities of practice
until May 2010. Each meeting lasted approximately 2 hours.
In District A, we had eight meetings over the 2 years. These were held after school
hours. While a total of 16 teachers participated in the District A community of practice,
there was generally a core of seven teachers attending most meetings. In District B, a
total of 26 teachers participated. The meetings were held during school hours, and the
school district paid for coverage of the participating teachers’ classes while they
attended the meetings. In the first year, there were three meetings with a total of 12
teachers participating, though attendance varied with approximately 10 teachers at most
meetings. During the second year, 14 additional teachers joined bringing the group to a
total of 26. As a result, the group was divided into two communities of practice based
on geographic location within the district. In that year, each of the District B commu-
nities of practice met three times. Thus, over the 2 years, nine meetings were held in
District B. In summary, 42 teachers participated in this research, and we gathered data
from a total of 17 meetings.
For both districts, in each meeting, we focused on sharing assessment practices and
challenges. As the project proceeded, teachers often reported their experiences trying
out ideas that had been generated at previous meetings, and they brought samples of
performance tasks, recording sheets, or rubrics and discussed how they were using
them. As researchers, we also shared assessment strategies that we had used, either in
our previous elementary or secondary teaching experience or in our current university
teaching. Over time, the teachers appeared comfortable enough to bring drafts of
assessment materials to the meetings so that they could get feedback from the
group. During the second year of the project, each group also used one of two
assessment resources depending on the grade focus of the group (National Council of
Teachers of Mathematics [NCTM] 2000b, 2001). Short readings from these resources
were used, at times, to stimulate a discussion. These resources were also used as the
basis for critically examining the design of an assessment instrument, such as a rubric,
without critiquing the work of any individual participant. Thus, while we provided
some structures for the meetings by initiating discussion with a question or a brief
reading, the focus of each meeting emerged mostly from the teachers.
The meetings were audio recorded, and the assessment materials the teachers shared
were gathered. After transcribing the recordings, our first step in the analysis process
was to read through each transcript in order to separate it into a series of conversations.
Individual conversations became the unit of analysis, and these conversations were
considered during the layers of analysis that we conducted. In the first layer, we focused
5 Findings
To present our findings, we draw on our first layer of analysis and provide a summary
of the assessment practices these teachers were using. Using our second layer of
analysis, we then consider the dilemmas that emerged for these teachers as they
incorporated new assessment ideas into their practice. We not only provide examples
of conversations for each dilemma type but also show that many of the conversations
demonstrate the interconnections between the dilemma types. We also want to
reiterate that the goal of the project and of our analysis using Windschitl’s framework
was not to evaluate teachers’ assessment practices. Rather, we wanted to gain an
understanding of these teachers’ experiences and of the dilemmas that emerged in
their practice.
Our first layer of analysis suggests that the teachers in the study were using a variety of
assessment strategies to get a sense of students’ understanding and to provide feedback
to students and to themselves. Their assessments took a variety of forms, such as short
quizzes, math journals, conferencing, and observation. For instance, one Grade 6 teacher
in District B spent some time focusing on encouraging students to explain their solutions
and write answers in a way that could be understood by classmates. Our conversations
with the teachers revealed to us that the teachers wanted to make the students’ mathe-
matical thinking more apparent to both students and teachers. A Grade 7/8 teacher in
District A talked about the ways that she engaged students in a group activity where they
were creating algebraic patterns for one another to “break”. She explained that while the
students were busy creating and decoding algebraic patterns, she was freed up to listen to
students’ thinking. A Grade 9 teacher in District B chose a previously assigned homework
question and asked students to rewrite their responses with an emphasis on explaining
their thinking. He then provided students with descriptive feedback on their mathematical
understanding and the clarity with which they had explained their solution. At times,
rather than the teacher providing feedback, he worked with students to help them provide
descriptive feedback to one another.
Teachers were also working on ways of designing new assessment strategies and
tools, such as creating ways to record their observations or developing rubrics in which
students were involved in determining the criteria so that the rubric would be more
useful to students. Many discussions centered on creating summative assessment tasks that
mirrored the types of problem-solving tasks these teachers incorporated in their class-
room practice. These summative tasks often took the form of performance tasks that
were spread over several days.
In this section, we discuss the dilemmas of practice that emerged from our second layer
of analysis. We revisit the four dilemma categories presented in Table 1 and demonstrate
how they connect with the data analysis. For each category, we give a brief summary of
the dilemma category, describe some specific dilemmas from our data that connect with
the category, and provide an example of at least one conversation within each category.
Subsequently, we discuss some of the ways that these dilemmas are interconnected.
So, for these questions, I call them ‘Level 3 questions’ and the kids can get
up to a three on that stuff. The only way for them to get a Level 4 then is if
they choose a Level 4 question from the problems at the end. (Anita, District
B, November 17, 2008).
There was a further discussion about creating and scoring these “leveled” questions,
and it became apparent that several teachers felt that they needed to have questions at
each level of achievement in order to assign levels to students. During the next meeting
2 months later, teachers were working in small groups using tasks and samples of
student work taken from the province-wide assessment. The teachers were examining
the samples of student work and were asked what they thought the assessment criteria
for each level might be. The issue of leveled questions came up in a small group
discussion as they looked at one question and the student work samples.
Jim: Yeah. So ultimately it was a Level four question in the first place, right?
Jim: Yeah.
Jim: (Pause) Okay. You know and this is a good eye opener cause then it makes a
little more sense, too, with what we’ve been doing in our program. (District B,
January 29, 2009).
Unfortunately, reading this transcript may not do justice to the a-ha moment that Jim
had when he realized that it is not the question that is leveled, but the student responses
that are assessed in levels. Jim’s realization could certainly be heard in the recording,
both in the duration of the pause before each “okay” and in the tone of his voice. It
was a turning point as he realized that a well-designed question can elicit student
responses at a variety of levels. The ensuing discussion with the whole group
indicated that many of these teachers had collaboratively worked through a conceptual
shift about the idea that an effective question can elicit a full range of student solutions.
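The shift Jim articulated can be expressed compactly: a level attaches to a response measured against descriptors, not to the question itself. The sketch below, added purely for illustration with invented descriptors and responses (they are not drawn from the provincial assessment materials), shows a single rich question eliciting responses at every level.

    # Hypothetical descriptors for leveling student *responses*; the
    # question itself carries no level.
    RESPONSE_LEVEL_DESCRIPTORS = {
        1: "attempts the problem; strategy incomplete or incorrect",
        2: "partially correct strategy; gaps in reasoning",
        3: "correct strategy and solution; reasoning mostly clear",
        4: "correct, clearly justified solution; generalizes or extends",
    }

    # One well-designed question and several invented responses, each
    # assessed at a different level.
    question = "Describe the pattern and predict the 100th term."
    assessed_responses = [
        ("lists a few terms only", 1),
        ("extends the table correctly but cannot generalize", 2),
        ("finds the rule and the 100th term", 3),
        ("finds the rule, justifies it, and generalizes to the nth term", 4),
    ]

    for work, level in assessed_responses:
        print(f"Level {level}: {work} -- {RESPONSE_LEVEL_DESCRIPTORS[level]}")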
Amidst conversations about assessment, conceptual dilemmas of what mathematics
is and how students learn mathematics often surfaced. Teachers talked about what
inquiry learning means to them, the value of students struggling with a problem, and
just how much prompting or scaffolding is appropriate. Often, the nature of learning
was at the heart of the discussion. For instance, there was an extensive discussion in
District A, as one teacher struggled with how to provide questions that “move
students from A to B”. This prompted another teacher to question whether
Jason: This is my number one question ‘Do you need to record that somewhere?’
Number two—‘How should you record that?’ and number three, you know
‘What will we do with it if we had it recorded?’
Barbara: Yeah, yeah, exactly and, I mean, it’s not always going to be something
you would use to move forward. It depends on what it is, right.
Jason: Yeah.
This issue of recording and tracking the information that teachers receive from
observing problem-solving situations recurred during several conversations. Teachers
brought in the different recording sheets they used so that, by sharing their
strategies, they could refine this practice. For instance, during the second year of the
project, Jason revisited the issue of recording by bringing in a recording sheet one of his
colleagues was using that he wanted to share with the group.
Jason: So, I really want to go deeper into this whole idea of the assessment for
learning, in terms of having a record of the ongoing conversations we have with
kids while they're learning. And so we don’t know if this [the recording sheet] is
the right way because it's quite a bit of tracking and detail, but we’re
experimenting with it and we’re trying it out. (District A, January 21, 2010).
The group was interested in the way students’ conversations were recorded, as the sheet
involved a coding system with which the teacher could quickly jot down anecdotal notes
about students’ mathematical thinking.
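For readers interested in the mechanics of such a sheet, the sketch below models one possible structure: a row per observed conversation, with a short code plus a free-form anecdotal note. The fields and codes are hypothetical stand-ins; the colleague’s actual recording sheet is not reproduced in this study.

    import csv
    from datetime import date

    # Hypothetical quick-jot codes; the actual sheet's codes are not
    # reproduced in the study.
    CODES = {
        "R": "reasoning evident",
        "S": "appropriate strategy selected",
        "C": "thinking communicated clearly",
        "M": "misconception observed",
    }

    # One row per observed conversation: student, date, code, note.
    observations = [
        ("Student A", date(2010, 1, 21), "R",
         "justified why the pattern grows by 3"),
        ("Student B", date(2010, 1, 21), "M",
         "conflates area and perimeter"),
    ]

    with open("anecdotal_notes.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["student", "date", "code", "note"])
        for student, observed_on, code, note in observations:
            writer.writerow([student, observed_on.isoformat(), code, note])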
Many other conversations that suggest pedagogical dilemmas took place in our
meetings in the two districts. Conversations occurred in both districts about
designing rubrics that clearly communicate assessment criteria to students.
Teachers also grappled with developing and trusting students’ skills in self and
peer assessment. Many teachers focused on working with students to give written
feedback to one another and often struggled with ways to help students provide
meaningful peer assessment. Teachers brought in samples of strategies and tools
they were using; and in some cases, they decided to work together on the design
of a tool, try it out in their classroom, and bring back student work and feedback
to share with the group.
Cultural dilemmas in assessment focus on changes in classroom and school culture with
regard to assessment practice. In our study, dilemmas often arose when new assessment
practices threatened existing cultural practices. There were often discussions about
student expectations with respect to classroom practices, and specifically with respect
to marks. Teachers also wrestled with colleagues’ concerns about new approaches to
assessment, the role of consistency in assessment practices among department members,
and parents’ and administrators’ understanding of assessment.
Teachers often discussed the challenges they faced in shifting the classroom culture
so that students could become comfortable with new classroom practices. One such
discussion in District A revolved around the use of concrete materials, or manipula-
tives, in problem-solving activities. Claire began the discussion by describing a
situation where she had Grade 9 students work on a problem-solving activity and
had a container of square tiles available at the side of the classroom for students to
use. She remarked that while students readily got up from their seats to get a
calculator, ruler, or graph paper, none of the students got up to get the tiles. On
another occasion she tried something different. She placed a small bag of tiles on each
group’s table and observed as the students eagerly used the tiles to help them solve
the problem.
There must have been something about the fact that they had to go get them
that didn’t, that felt like they might be saying “I can’t do this without the
tiles”. Whereas when everybody had them, everybody used them. (Claire,
District A, June 9, 2009).
Several others talked about the cultural shift as their students accepted that they
could use concrete materials in both instructional and assessment tasks. As Demi
said of her Grade 10 students:
I tried to introduce algebra tiles. And this year we used the algebra tiles to factor
and one of the kids said to me “If I use the tiles, does that make me dumb?” And I
said “No, we all have different avenues to see things . . . It's, you know, it’s just
we all learn differently”. Now that you say that, it would be nice if I could give
them each a little kit of the algebra tiles . . . And then the other big surprise for
them was “Are you going to let us use this on the test or the exam?” I said “Well,
yeah, that’s perfectly fine because you are still showing me your understanding.”
So it's a big step for me to let them, you know, show me how they can do it
without prescribing one way. (Demi, District A, June 9, 2009).
This quotation demonstrates the classroom cultural shifts that take place for the
students and for the teacher. In this conversation, the teachers went on to discuss
other assessment opportunities they provide for students to show what they know and
can do.
Cultural dilemmas also occurred in interactions with colleagues or administrators.
The issue of consistency and colleagues’ expectations was a recurring theme. Many of
the participants felt fairly isolated in their efforts to find or develop alternative
forms of assessment. For instance, in a discussion in District A, teachers talked about
the difficulty of incorporating alternative forms of assessment rather than only using a
unit test, particularly when their colleagues were not doing so. Barbara, a secondary
mathematics teacher, mentioned that she found it much easier to engage in alternative
types of assessment, such as performance tasks or student presentations, when she was
teaching a course that was not being taught by other teachers in the department.
I find it easiest in a class that’s very different anyway where I don’t have to worry
about being consistent with other teachers or anything else. Like my gifted class, I
find it much more easy to be flexible and try different things. (Barbara, District A,
November 26, 2009).
Issues of consistency relate to the cultural dilemmas that teachers face as they work
within schools and mathematics departments where other teachers may be using more
traditional forms of assessment and the participants are often challenged by colleagues
about what they are doing and why. For the sake of consistency across classes, teachers
may be discouraged from using new assessment approaches that others may be
unwilling to use.
Teachers also talked about the role of parents in changing assessment practices.
Many parents are much more familiar with a test at the end of a unit and may lack
confidence in new forms of assessment. In one discussion, Brian mentions that parents
often have trouble interpreting a rubric that accompanies a task and seeing how it is
used to assess student work. He remarks,
This comment led to a discussion of the design of rubrics for problem-solving activities
that could provide clarity to both students and parents without being overly
prescriptive.
Political dilemmas in assessment arose as teachers’ thinking and practice were juxta-
posed with provincial, district, and school assessment policies. Many discussions
focused on understanding and interpreting policy about grading and reporting, such
as matching assessment levels used on rubrics with required report card percentage
grades. For instance, during a discussion of reporting with elementary and middle
school teachers in District B, teachers talked about particular ways that they are
instructed to make comments on a report card. Rather than being able to write their
own comments to accompany a mark, they must choose from an established list of
report card comments. As one Grade 6 teacher comments:
The way I would like to write my report cards and the way I’m expected to write
my report cards are two very different things. I often feel really bad signing my
name to something when I don’t agree with the way it is written. You know, it’s
written with Ministry [of Education] words that I would never say if I was
speaking face-to-face with a parent. Half of the parents don’t understand it. When
they see you know, ‘with considerable effectiveness’ eighteen times throughout
the report card, they just tune it out. They don’t even read that. . . I’d like to use
language they would understand but we’re told that we can’t do that (Melanie,
District B, April 27, 2010).
The conversation continued with others commenting on the value of using student-
friendly language in rubrics in class and parent-friendly language in reporting. Many
teachers felt constrained by the feeling that they had to use language consistent with
that used by the Ministry of Education.
There were many discussions in both districts over the course of the 2 years about
the “Achievement Chart,” which (as previously noted) is found in all Ontario curric-
ulum documents and is intended as a tool to guide the assessment of student work. The
Achievement Chart is divided into four categories of knowledge and skills, and for each
of these categories, descriptors for four levels of achievement are provided. The four
categories of knowledge and skills are the following: knowledge and understanding,
thinking, communication, and application. In some schools, particularly at the second-
ary level, teachers are expected to organize their marks into these four categories. At the
very least, the expectation is that teachers will provide assessments that cover these four
categories. From the very first meetings in each district, teachers claimed that they were
expected to identify assessment questions and/or tasks to match each of the categories.
They found this challenging, as they felt they needed several tasks or questions for each
category, and many of the questions or tasks that they designed spanned more than one
category. One conversation in District A began with Bob
suggesting
It’s [the Achievement Chart] pretty much a straightjacket, I think, and it doesn’t
make any sense. I mean, people are spending so much time trying to figure out ‘Is
this an application?’ or ‘Is this a knowledge item?’. There’s no way you can ever
decide whether something is one or the other. (Bob, District A, November 28, 2008).
Bob sees no real reason to separate assessment and marks into these categories, and he
feels that he has been incorporating and interconnecting these categories in his work for
quite some time. He finds the policy directing him to separate them to be counterpro-
ductive. However, there was some disagreement among the teachers in this conversa-
tion. One teacher stated that she found it helpful to differentiate where students were
doing well and where they needed assistance, and that the categories helped her with this
differentiation. Another teacher claimed that this policy encourages teachers to consider
incorporating more than just knowledge into the teaching and learning of mathematics.
There was also a discussion about what the categories mean, particularly the thinking
category. The participants went on to discuss the definition of thinking, ways to assess
thinking, and what formative assessment of thinking might look like.
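To give non-Ontario readers a sense of the alignment problem, the sketch below models the structure teachers describe: four categories of knowledge and skills, each assessed on four levels, with levels converted to report card percentages. The percentage bands follow the conversion commonly cited in Ontario policy (e.g., Level 3 corresponding to 70–79 %) but should be treated as an assumption here, and the averaging rule is entirely hypothetical: deciding how to combine categories is precisely one of the judgment calls teachers described.

    # The four Achievement Chart categories of knowledge and skills.
    CATEGORIES = [
        "knowledge and understanding",
        "thinking",
        "communication",
        "application",
    ]

    # Level-to-percentage bands as commonly cited in Ontario policy;
    # an assumption here, not an authoritative restatement of policy.
    LEVEL_TO_PERCENT_RANGE = {
        4: (80, 100),
        3: (70, 79),
        2: (60, 69),
        1: (50, 59),
    }


    def report_card_percent(levels_by_category):
        # Hypothetical combination rule: take the midpoint of each level's
        # band and average across the four categories. No such formula is
        # prescribed; this is one of the judgment calls teachers wrestled with.
        midpoints = []
        for category in CATEGORIES:
            low, high = LEVEL_TO_PERCENT_RANGE[levels_by_category[category]]
            midpoints.append((low + high) / 2)
        return sum(midpoints) / len(midpoints)


    print(report_card_percent({
        "knowledge and understanding": 3,
        "thinking": 4,
        "communication": 3,
        "application": 2,
    }))  # 75.875 under these assumptions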
Teachers also struggled with the role of province-wide accountability assessments in
their classroom practice. While the teachers involved in the study appeared to have
progressive ideas and practices with respect to assessment, many reported preparing
students for province-wide assessments by practicing multiple-choice questions. In
addition, while all of the teachers felt that they had a good sense of what
their students knew and could do, many felt that the results of the provincial assessment
were the measures valued by parents, school administrators, and the general public.
Like Windschitl, we do not see the categories of dilemmas as mutually exclusive and
have several examples that show the interconnectedness of these categories. For
instance, several conversations in District A focused on the alignment of collaborative
learning situations and assessment practices. One such conversation about collaborative
assessment appears to have participants grappling with pedagogical, conceptual, and
political dilemmas. The conversation began with Claire indicating that she finds it
difficult for students to work on their own in an assessment situation since they
typically work together on tasks in instructional situations. She wonders whether they
could work collaboratively on an assessment task but struggles with how she would
assess individual students if the solution to a problem was a group effort. She also
believes that the assessment messages in the policy documents require that summative
assessments evaluate individual student achievement only. Others in the group
discussed how separating students for an assessment task seems artificial and counter
to how the students are used to working. Some of the participants shared the ways that
they incorporate some group work in their assessment. Nancy, a secondary mathemat-
ics teacher, explains:
And so, when the students are working really well together, and that’s how
they’re learning, is it fair for a summative assessment to force them to generate
their own independent product? And we all kind of thought that we do things,
when we do performance tasks, that allow the students to work together, like
giving them an opportunity to brainstorm [collaboratively] and then go off and
create their own individual product. (Nancy, District A, November 28, 2008).
Barbara followed Nancy’s comment by describing how her use of student collaboration
in an assessment has evolved.
Something that’s been happening to me just maybe in the last 2 years . . . where
there’s a real challenging problem-solving activity, where they [students] have to
sit down and figure out how to solve it, and it incorporates maybe more than just
one unit. I start them off in pairs usually, and what happens is that I’m always
going to separate them, right? It’s always you start together in pairs for the first 10,
15 min, and then you’ll do the rest by yourselves because you’ll consolidate some
ideas. But what happens is, when the 10 or 15 min is up, they’re still working so
deeply, . . . and they’re getting excited because one kid will say ‘Well, okay, but
I’m stuck’ and the other kid will say ‘I couldn’t get to there, but now I know what
to do’. And they’re so excited about what’s going on that I often end up not
separating them, just because, you know what? ‘You’re working so well together,
just go ahead and finish that way’. And part of me says ‘Well, should it have been
independent?’ And then I figure so many other summative assessments, my tests
are definitely independent, so maybe it’s okay in problem solving or in thinking if
they end up doing it together. (Barbara, District A, November 28, 2008).
Nancy added that she often interviews individual students about their solution. She
explains,
We see a variety of types of dilemmas here. There are pedagogical dilemmas about
how one arranges students who are working on an assessment task. There are concep-
tual dilemmas as participants grapple with the idea that if instructional tasks are done
collaboratively, then assessment tasks should reflect that collaboration. This challenges
the teachers’ thinking about the alignment of curriculum and assessment. For some, like
Claire, collaborative assessment raises questions about feeling comfortable with what
an individual student really knows. This tends to bring the discussion back to peda-
gogical issues as participants offer ways of getting at individual students’ understand-
ing. In addition, the conversation is framed within a political context where teachers
interpret policy documents as stating that formative assessment can include collabora-
tive tasks, but summative assessment and evaluation must be based on individual
student work.
This is just one of many examples of conversations in which different categories of
dilemmas are interconnected. In fact, it is difficult to label each conversation with just
one dilemma category. In the examples provided within our description of each
category, the reader has probably already noticed other categories of dilemmas. For
instance, in the discussion of consistency in the section on cultural dilemmas, Jason’s
comments also bring forth conceptual dilemmas about the meaning of consistency. And,
in the section on political dilemmas, the discussion of the Achievement Chart categories
is also framed by conceptual dilemmas, particularly around the concern over “what is
thinking?”
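The degree of interconnection can also be made visible quantitatively. As a purely illustrative sketch (the coded data below are invented, not our research data), one can tally how often pairs of categories are coded on the same conversation; high pair counts flag interconnected dilemma types.

    from collections import Counter
    from itertools import combinations

    # Invented coded data: each conversation is represented by the set
    # of dilemma categories assigned to it.
    coded_conversations = [
        {"pedagogical", "conceptual", "political"},  # e.g., collaborative assessment
        {"cultural", "conceptual"},                  # e.g., manipulatives on assessments
        {"political", "conceptual"},                 # e.g., Achievement Chart categories
    ]


    def co_occurrence(conversations):
        # Count each unordered pair of categories coded on the same
        # conversation.
        pairs = Counter()
        for tags in conversations:
            for a, b in combinations(sorted(tags), 2):
                pairs[(a, b)] += 1
        return pairs


    print(co_occurrence(coded_conversations))
    # With this toy data, ("conceptual", "political") co-occurs twice.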
6 Discussion
This study provides many valuable insights into teachers’ thinking and actions as they
implement new ideas in assessment. In our discussion, we focus on three areas of
insight that emerge from this research: the value of using our adaptation of Windschitl’s
framework to better understand dilemmas in assessment practice, the role of coherence
in messages about classroom assessment, and the potential for communities of practice
to enable teachers to further develop their assessment practice and expand their capacity
for professional judgement.
We find that the adapted framework helps us to better understand the complex process
of changing assessment practice. Being able to parse out conversations into categories
helps to unravel the struggles that teachers grapple with and focuses attention on
particular types of dilemmas. Developing an in-depth understanding of the conceptual,
pedagogical, cultural, and political dimensions of the dilemmas that teachers face is the
first step in better understanding the complexity of changing assessment practice. For
instance, the pedagogical dilemmas that emerge as teachers work to develop effective
assessment strategies and tools are qualitatively different than the political dilemmas
teachers face as they implement assessment policy. Acknowledging and understanding
each type of dilemma is particularly valuable in helping us to recognize that teachers’
work needs to be supported in different ways and also in determining what those
supports might look like. Pedagogical dilemmas that deal with access to resources or
the specifics of assessment practice may be addressed through workshops or
enhanced support for sharing resources. Cultural dilemmas, such as dealing with the
differing views of assessment held by students, parents, and teachers, may be addressed
through administrative support as well as ongoing forms of communication that make
classroom assessment practice more transparent to everyone. Recognizing the different
types of dilemmas and that each needs to be supported in a variety of ways is essential
for those who design policy, professional development, and other supports for the
implementation of new assessment ideas.
Moreover, while our observations are based on the experiences of a small group of
teachers in two Ontario districts, the categories in the framework seem to reflect
observations made in other studies of teachers’ assessment practice. For instance,
Webb and Jones (2009) demonstrate that changing assessment practice to genuinely
support student learning constitutes a major change in a classroom culture. Their
observations of the challenges faced by six primary teachers and their classes in the
UK are quite similar to the cultural dilemmas we observed. Earl et al. (2011), drawing
on a number of studies of assessment practice in Canada, observe that teachers may
begin to change their assessment practice by trying out new tools and techniques, but a
more fundamental shift in thinking and beliefs about assessment is needed to enable
teachers to make adaptive assessment decisions in their classrooms. They highlight the
need for teachers to move beyond a surface-level understanding of assessment that
supports student learning toward a deeper understanding of these concepts. The
discussion by Earl et al. (2011) closely parallels the distinctions between pedagogical
and conceptual dilemmas in our study. Being able to identify the different dimensions
of dilemmas using the adapted framework helps to identify aspects of the change
process and points to the multiple and diverse ways that change must be supported.
Along with looking at the dilemma types individually, the framework also demon-
strates that high-quality assessment lies at the intersection of political, cultural, con-
ceptual, and pedagogical issues. Our data show that many conversations include more
than one dilemma category. While distilling the experiences into the four categories
emphasizes the different layers of concern that teachers face, considering the intercon-
nectedness of these layers foregrounds the complexity of implementing and supporting
new practices in classroom contexts. In other words, professional development that
provides new assessment strategies and tools, perhaps addressing pedagogical di-
lemmas, will not suffice as issues of classroom, department, school, and community
cultures may also need to be addressed. Likewise, grappling with teachers’ conceptual
dilemmas by developing their understanding of sound assessment practice may clash
with district, state or provincial, or federal assessment policies regarding large-scale
accountability. These sorts of clashes need to be acknowledged and addressed.
Furthermore, policy makers who mandate new assessment practices need to be aware
of the implications politically, conceptually, pedagogically, and culturally and recognize
that a mandate alone will not suffice.
Perhaps most clearly, our analysis of the interconnections between
dilemmas demonstrates the pivotal role of coherence in supporting
the development of new assessment practices. Our observations underscore the impor-
tance of coherence in the messages teachers receive from various sources including
current thinking in assessment, policy documents, professional development activities,
existing cultural practices, and available teacher resources. Similar messages within and
between curriculum documents and assessment policies and across the policies that are
established at the school, district, and provincial levels are one key dimension of
coherence. At the same time, policy documents need to echo current ways of thinking
about assessment and to reflect the views of assessment that teachers are encountering
in research-based professional development initiatives and publications. Coherence is
further enhanced when similar messages are given to administrators, teachers, parents,
and students with the goal of developing everyone’s conceptual understanding of
assessment. The importance of a coherent assessment system is emphasized in
NCTM’s Assessment Standards document (1995). The Coherence Standard is one of
six mathematics assessment standards, and it emphasizes alignment between curricu-
lum, instruction, and assessment. Furthermore, it suggests that assessment should not
produce conflicting messages.
In this research project, we found that coherence was an essential support for the
participating teachers as they worked to develop their assessment practice. For the most
part, the teachers in this study were receiving similar messages from the Ontario
curriculum, assessment policy documents such as Growing Success³ (OME 2010),
various teacher resources they were using, and in professional development opportu-
nities. As a result, many of the things they were doing collaboratively in their
departments or schools were aligned with current thinking about assessment and
facilitated the development of these teachers’ professional judgement with regard to
assessment. We also saw that in those instances where a policy was misunderstood or
was mandated without providing teachers with an opportunity to understand the
meaning or rationale of the policy, the ambiguity generated dilemmas for the teachers.
Our observations demonstrate how coherence supports teachers in further developing
their assessment practice and suggest that a lack of coherence is likely to exacerbate the
dilemmas teachers experience as they implement new assessment strategies. We are
encouraged that the recently released Growing Success policy document is consistent
with current thinking about assessment and brings greater coherence across curriculum
and policy documents in Ontario. Furthermore, it acknowledges the critical role of the
teachers’ professional judgement in classroom assessment practice.
³ The final version of Growing Success was released toward the end of our 2-year study, but an earlier draft of this document had been released, and many Ontario teachers had become familiar with its contents through the earlier drafts and through other policy documents.
We see communities of practice as an ideal forum for moving beyond tools and
techniques, for developing deep conceptual understandings, for working through dilemmas
in assessment practice, and ultimately for enhancing teachers’ capacity for professional
judgement. This sort of sustained, practice-based, teacher-led initiative can be very effective
in moving teachers beyond a surface level of change in their assessment practice.
7 Concluding comments
Our work has implications for both research and practice. We provide a rich
description of teachers engaged in a change process concerning assessment.
Many of the areas that teachers touched on are rich sources for further research,
such as issues of consistency, or assessment of collaborative inquiry. Issues such
as these require further understanding, and additional research would be benefi-
cial. Additionally, we suspect that the dilemmas framework may be useful to
other researchers when examining new initiatives in teacher practice. Identifying
dilemmas, determining how the dilemmas interact, and examining how teachers
are supported in negotiating dilemmas are important components to implementing
new initiatives. We found that Windschitl’s (2002) framework helped us to
examine the dilemmas in more depth and to recognize the variety of ways that
the implementation of new ideas needs to be supported. Our work also provides an
example of effective research methods that focus on teacher practice. The com-
munity of practice provided a forum for sustained professional development and
dialogue as well as a rich research setting. While we were able to identify certain
components that facilitated this, such as connections to practice and a non-
evaluative stance, each of these components could be the subject of more detailed
research.
In terms of implications for practice, the insights gained through our analysis are of
interest to teachers, those working with teachers, and policy makers seeking to influ-
ence assessment practice. We have had opportunities to share our work using this
framework with groups of these stakeholders. At a 3-day mathematics education
leadership conference, 300 participants were introduced to the dilemmas framework.
We engaged them in a discussion about examples of types of dilemmas and ways that
they could be supported. The majority of the conference audience consisted of mathematics
resource teachers, mathematics teaching coaches, school district consultants, and those
responsible for mathematics teacher education. Feedback from these discussions indi-
cated that the framework gave them ways of considering the complex task of profes-
sional development around assessment. Relating to the framework helped them realize
that in their work with teachers, they need to consider a multi-pronged approach to deal
with the political, cultural, conceptual, and pedagogical aspects of implementing new
ideas. As the 3-day conference was concluding, we could hear the categories of
Windschitl’s framework as part of the participants’ vocabulary. The ease with which
they adopted these ideas told us that our work was also useful for their work with
teachers.
In this paper, we have described some of the complex assessment dilemmas
that teachers face as they implement new assessment practices. While some
might think that dilemmas are problematic and need to be solved, we saw
discussions of these dilemmas as generative. The intense and sustained work
that our teacher participants engaged in as they wrestled with important assess-
ment issues helped to break down the isolation that teachers often feel. The
Acknowledgments The authors wish to thank the Canada Foundation for Innovation (CFI), which provided
an infrastructure grant for the research facilities (Pi Lab) used in this project. The second author would also
like to acknowledge the financial support she received from the Social Sciences and Humanities Research
Council of Canada. Most importantly, we wish to acknowledge the dedication and commitment of the research
project participants, without whom this work would not have been possible.
References
Adler, J. (1998). A language of teaching dilemmas: unlocking the complex multilingual secondary mathe-
matics classroom. For the Learning of Mathematics, 18(1), 24–33.
Ball, D. L. (Ed.). (2003). Mathematical proficiency for all students: toward a strategic research and
development program in mathematics education. Santa Monica, CA: RAND Institute.
Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles,
Policy, and Practice, 5(1), 7–74.
Brookhart, S. M. (2003). Developing measurement theory for classroom assessment purposes and uses.
Educational Measurement: Issues and Practice, 22(4), 5–12.
Carr, W., & Kemmis, S. (1986). Becoming critical: education, knowledge, and action research. London:
Falmer.
Cochran-Smith, M., & Lytle, S. L. (2009). Inquiry as stance: practitioner research for the next generation.
New York: Teachers College Press.
Delandshere, G., & Petrosky, A. R. (1998). Assessment of complex performances: limitations of key
measurement assumptions. Educational Researcher, 27(2), 14–24.
Earl, L. (2003). Assessment as learning: using classroom assessment to maximize student learning. Thousand
Oaks, CA: Corwin.
Earl, L. M., Volante, L., & Katz, S. (2011). Unleashing the promise of assessment for learning. Education
Canada, 51.
Forman, E. A. (2003). A sociocultural approach to mathematics reform: speaking, inscribing, and doing
mathematics within communities of practice. In J. Kilpatrick, W. G. Martin, & D. Schifter (Eds.), A
research companion to principles and standards for school mathematics (pp. 333–352). Reston, VA:
National Council of Teachers of Mathematics.
Fullan, M. (2001). The new meaning of educational change (3rd ed.). New York: Teachers College Press.
Gardner, J. (Ed.). (2006). Assessment and learning. Thousand Oaks, CA: Sage.
Gearhart, M., & Saxe, G. B. (2004). When teachers know what students know: integrating mathematics
assessment. Theory into Practice, 43(4), 304–313.
Gipps, C. V. (1999). Socio-cultural aspects of assessment. Review of Research in Education, 24, 355–392.
Graue, M. E., & Smith, S. Z. (1996). Shaping assessment through instructional innovation. Journal of
Mathematical Behavior, 15(2), 113–136.
Hargreaves, A. (1994). Changing teachers, changing times: teachers’ work and culture in the postmodern age.
London: Cassell.
Hargreaves, A. (2009). A decade of educational change and a defining moment of opportunity—an introduc-
tion. Journal of Educational Change, 10, 89–100.
Hargreaves, A., & Fullan, M. J. (2012). Professional capital: transforming teaching in every school. New
York: Teachers College Press.
Jordan, B., & Henderson, A. (1995). Interaction analysis: foundations and practice. The Journal of the
Learning Sciences, 4(1), 39–103.
Kincheloe, J. L. (2003). Teachers as researchers: qualitative inquiry as a path to empowerment. London:
Routledge Falmer.
Lachance, A., & Confrey, J. (2003). Interconnecting content and community: a qualitative study of secondary
mathematics teachers. Journal of Mathematics Teacher Education, 6(2), 107–137.
Lave, J., & Wenger, E. (1991). Situated learning: legitimate peripheral participation. New York: Cambridge
University Press.
Lee, D. H. L., & Shaari, I. (2012). Professional identity or best practices? An exploration of the synergies between
professional learning communities and communities of practices. Creative Education, 3(4), 457–460.
Lipman, P. (2009). Paradoxes of teaching in neo-liberal times: education 'reform' in Chicago. In S. Gewirtz, P.
Mahony, I. Hextall, & A. Cribb (Eds.), Changing teacher professionalism: international trends, chal-
lenges and ways forward (pp. 67–80). Abingdon, Oxon: Routledge.
Luke, A. (2011). Generalizing across borders: policy and the limits of educational science. Educational
Researcher, 40(8), 367–377.
Lund, A. (2008). Assessment made visible: individual and collective practices. Mind, Culture, and Activity, 15,
32–51.
Moss, C. M., & Brookhart, S. M. (2009). Advancing formative assessment in every classroom. Alexandria:
ASCD.
National Council of Teachers of Mathematics (NCTM). (1989). Curriculum and evaluation standards for
school mathematics. Reston, VA: NCTM.
National Council of Teachers of Mathematics (NCTM). (1995). Assessment standards for school mathematics.
Reston, VA: NCTM.
National Council of Teachers of Mathematics (NCTM). (2000a). Principles and standards for school mathe-
matics. Reston, VA: NCTM.
National Council of Teachers of Mathematics (NCTM). (2000b). Mathematics assessment: cases and discus-
sion questions (grades 6–12). Reston, VA: NCTM.
National Council of Teachers of Mathematics (NCTM). (2001). Mathematics assessment: cases and discus-
sion questions (grades K–5). Reston, VA: NCTM.
Ontario Ministry of Education (OME). (2005a). The Ontario curriculum grades 1–8 mathematics revised.
Toronto, ON: Queen's Printer.
Ontario Ministry of Education (OME). (2005b). The Ontario curriculum grades 9 and 10 mathematics
revised. Toronto, ON: Queen's Printer.
Ontario Ministry of Education (OME). (2007). The Ontario curriculum grades 11 and 12 mathematics revised.
Toronto, ON: Queen's Printer.
Ontario Ministry of Education (OME). (2010). Growing success: assessment, evaluation, and reporting in
Ontario schools. Toronto, ON: Queen's Printer.
Romagnano, L. (2001). The myth of objectivity in mathematics assessment. Mathematics Teacher, 94(1), 31–
37.
Sfard, A. (2003). Balancing the unbalanceable: the NCTM Standards in the light of theories of learning
mathematics. In J. Kilpatrick, W. G. Martin, & D. Schifter (Eds.), A research companion to principles and
standards for school mathematics (pp. 353–392). Reston, VA: National Council of Teachers of Mathematics.
Shepard, L. A. (2001). The role of classroom assessment in teaching and learning. In V. Richardson (Ed.),
Handbook of research on teaching (4th ed., pp. 1066–1101). Washington, DC: American Educational
Research Association.
Stobart, G. (2008). Testing times: the uses and abuses of assessment. London: Routledge.
Suurtamm, C. (2004). Developing authentic assessment: case studies of secondary school mathematics
teachers’ experiences. Canadian Journal of Science, Mathematics, and Technology Education, 4, 497–
513.
Suurtamm, C., & Graves, B. (2007). Curriculum implementation in intermediate mathematics (CIIM):
Research Report. www.edu.gov.on.ca/eng/studentsuccess/lms/files/CIIMResearchReport2007.pdf
Suurtamm, C., Koch, M., & Arden, A. (2010). Teachers’ emerging assessment practices in mathematics:
classrooms in the context of reform. Assessment in Education: Principles, Policy, and Practice, 17(4),
399–417.
Volante, L., & Beckett, D. (2011). Formative assessment and the contemporary classroom: synergies and
tensions between research and practice. Canadian Journal of Education, 34(2), 239–255.
Webb, M., & Jones, J. (2009). Exploring tensions in developing assessment for learning. Assessment in
Education: Principles, Policy & Practice, 16(2), 165–184.
Wenger, E. (1998). Communities of practice: learning, meaning and identity. Cambridge, UK: Cambridge
University Press.
Wiliam, D. (2007). Keeping learning on track: classroom assessment and the regulation of learning. In F. K.
Lester (Ed.), Second handbook of research on mathematics teaching and learning (Vol. II, pp. 1053–
1098). Reston, VA: NCTM.
Windschitl, M. (2002). Framing constructivism in practice as the negotiation of dilemmas: an analysis of the
conceptual, pedagogical, cultural, and political challenges facing teachers. Review of Educational
Research, 72(2), 131–175.