Journal of Teacher Education http://jte.sagepub.com
Assessing Teacher Education: The Usefulness of Multiple Measures for Assessing Program Outcomes
Linda Darling-Hammond
Journal of Teacher Education 2006; 57; 120
DOI: 10.1177/0022487105283796
The online version of this article can be found at:
http://jte.sagepub.com/cgi/content/abstract/57/2/120
Published by:
http://www.sagepublications.com
On behalf of:
American Association of Colleges for Teacher Education (AACTE)
Downloaded from http://jte.sagepub.com at Stanford University on March 19, 2007
© 2006 American Association of Colleges for Teacher Education. All rights reserved. Not for commercial use or unauthorized distribution.
Journal of Teacher Education, Vol. 57, No. 2, March/April 2006 10.1177/0022487105283796
ASSESSING TEACHER EDUCATION: THE USEFULNESS OF MULTIPLE MEASURES FOR ASSESSING PROGRAM OUTCOMES
Linda Darling-Hammond
Stanford University
Productive strategies for evaluating outcomes are becoming increasingly important for the improvement, and even the survival, of teacher education. This article describes a set of research and assessment strategies used to evaluate program outcomes in the Stanford Teacher Education Program during a period of program redesign over the past 5 years. These include perceptual data on what candidates feel they have learned in the program (through surveys and interviews) as well as independent measures of what they have learned (data from pretests and posttests, performance assessments, work samples, employers' surveys, and observations of practice). The article discusses the possibilities and limits of different tools for evaluating teachers and teacher education and describes future plans for assessing beginning teachers' performance in teacher education, their practices in the initial years of teaching, and their pupils' learning.
Keywords: teacher education reform; teacher education
Journal of Teacher Education, Vol. 57, No. 2, March/April 2006, 120-138. © 2006 by the American Association of Colleges for Teacher Education.

Productive strategies for evaluating outcomes are becoming increasingly important for the improvement, and even the survival, of teacher education. In the political arena, debates about the legitimacy and utility of teacher education as an enterprise are being fought on the basis of presumptions—and some evidence—about whether and how preparation influences teachers' effectiveness, especially their ability to increase student learning in measurable ways (see, e.g., Darling-Hammond, 2000, in response to Ballou & Podgursky, 2000; Darling-Hammond & Youngs, 2002, in response to U.S. Department of Education, 2002). The federal Higher Education Act now requires that schools of education be evaluated based on graduates' performance on licensing tests, and the National Council for Accreditation of Teacher Education now requires that programs provide evidence of outcomes as they respond to each of the accreditation standards (Wise, 1996). The Teachers for a New Era initiative launched by the Carnegie Corporation of New York and other foundations requires that the 11 institutions supported to redesign their programs collect evidence about how their teachers perform and how the students of these teachers achieve.

In light of these concerns, teacher educators are seeking to develop strategies for assessing the results of their efforts—strategies that appreciate the complexity of teaching and learning and that provide a variety of lenses on the process of learning to teach. Many programs are developing assessment tools for gauging their candidates' abilities and their own success as teacher educators in adding to those abilities. Commonly used measures range from candidate performance in courses, student teaching, and on various assessments used within programs to data on entry and retention in teaching, as well as perceptions of preparedness on the part of candidates and their employers once they are in the field. In rare cases, programs have developed evidence of teachers' "impact" based on analyses of changes in their pupils' learning gauged through measures of student attitudes or behavior, work samples, performance assessments, or scores on standardized tests.

The impact or "effectiveness" data increasingly demanded by policy makers are, of course, the most difficult to collect and interpret for several reasons: First is the difficulty of developing or obtaining comparable premeasures and postmeasures of student learning that can gauge change in valid ways that educators feel appropriately reflect genuine learning; second is the difficulty of attributing changes in student attitudes or performances to an individual teacher, given all of the other factors influencing children, including other teachers past and present; third is the difficulty of attributing what the teacher knows or does to the influence of teacher education. Complex and costly research designs are needed to deal with these issues.

In this article, I describe a set of research and assessment strategies used to evaluate program outcomes in the Stanford Teacher Education Program (STEP) for the period of program redesign during the past 5 years, along with some of the findings from this research. In addition, I describe future plans for assessing beginning teachers' performance in teacher education, their practices in the initial years of teaching, and their pupils' learning. These plans include Stanford and a consortium of more than 15 California universities involved in the Performance Assessment for California Teachers (PACT) project, which has developed and validated a teacher performance assessment (TPA) used to examine the planning, instruction, assessment, and reflection skills of student teachers against professional standards of practice. We believe that these authentic assessments offer more valid measures of teaching knowledge and skill than traditional teacher tests, and they inspire useful changes in programs as they provide rich information about candidate abilities—goals that are critical to an evaluation agenda that both documents and improves teacher education. Consortia of universities engaged in such assessments may also play a useful role in enabling the costly and difficult research on teacher effectiveness that policy makers desire. Finally, I discuss how these studies and tools have been and are being used to inform curriculum changes and program improvements.

BACKGROUND OF THE PROGRAM

The STEP program has historically been a 12-month postgraduate program in secondary education offering a master's degree and a California teaching credential.1 Following a strongly critical evaluation conducted in 1998 (Fetterman et al., 1999), the program was substantially redesigned to address a range of concerns that are perennial in teacher education. These included a lack of common vision across the program; uneven quality of clinical placements and supervision; a fragmented curriculum with inconsistent faculty participation and inadequate attention to practical concerns such as classroom management, technology use, and literacy development; limited use of effective pedagogical strategies and modeling in courses; little articulation between courses and clinical work; and little connection between theory and practice (see also critiques of teacher education outlined in Goodlad, 1990; National Commission on Teaching and America's Future, 1996).

The STEP program traditionally also had several strengths. These included the involvement of senior faculty throughout the program, an emphasis on content pedagogy and on learning to teach reflectively, and a year-long clinical experience running in parallel with course work in the 1-year credential and master's degree program. The redesign of STEP sought to build on these strengths while implementing reforms based on a conceptual framework that infused a common vision that draws on professional teaching standards into course design, program assessments, and clinical work.

The program's conceptual framework is grounded in a view of teachers as reflective practitioners and strategic decision makers who understand the processes of learning and development—including language acquisition and development—and who can use a wide repertoire of teaching strategies to enable diverse learners to master challenging content. A strong social justice orientation based on both commitment and skills for teaching diverse learners undergirds all aspects of the program. In addition to understanding learning and development in social and cultural contexts, professional knowledge bases include strong emphasis on content-specific pedagogical knowledge, literacy development across the curriculum, pedagogies for teaching special needs learners and English language learners, knowledge of how to develop and enact curriculum that includes ongoing formative and performance assessments, and skills for constructing and managing a purposeful classroom that incorporates skillful use of cooperative learning and student inquiry. Finally, candidates learn in a cohort and increasingly, in professional development school placements that create strong professional communities supporting skills for collaboration and leadership.

To create a more powerful program that would integrate theory and practice, faculty collaborated in redesigning courses to build on one another and add up to a coherent whole. Courses incorporated assignments and performance assessments (case studies of students, inquiries, analyses of teaching and learning, curriculum plans) to create concrete applications and connections to the year-long student teaching placement. Student teaching placements were overhauled to ensure that candidates would be placed with expert cooperating teachers (CTs) whose practice is compatible with the program's vision of good teaching. A "clinical curriculum" was developed on clearer expectations for what candidates would learn through carefully calibrated graduated responsibility and supervision on a detailed rubric articulating professional standards. Supervisors were trained in supervision strategies and the enactment of the standards-based evaluation system. In addition, technology uses were infused throughout the curriculum to ensure students' proficiency in integrating technology into their teaching.

Finally, the program sought to develop strong relationships with a smaller number of placement schools that are committed to strong equity-oriented practice with diverse learners. These have included several comprehensive high schools involved in restructuring and curriculum reform and several new, small, reform-minded high schools in low-income, "minority" communities, some of which were started in collaboration with the program. The guiding idea is that if prospective teachers are to learn about practice in practice (Ball & Cohen, 1999), the work of universities and schools must be tightly integrated and mutually reinforcing.

The secondary program has served between 60 and 75 candidates each year in five content areas—math, English, history/social science, sciences, and foreign language. A new elementary program will graduate about 25 candidates each year. During the course of the redesign, with enhanced recruitment, the diversity of the student body grew substantially, increasing from 15% to approximately 50% students of color in both the secondary and elementary cohorts.

It is clear that small programs like this one do not provide staff for large numbers of classrooms. Instead, they can play a special role in developing leaders for the profession if they can develop teachers who have sophisticated knowledge of teaching and are prepared not only to practice effectively in the classroom but also to take into account the "bigger picture" of schools and schooling—to both engage in state-of-the-art teaching and to be agents of change in their school communities. Indeed, in the San Francisco Bay Area, striking numbers of STEP graduates lead innovations and reforms as teachers, department chairpersons, school principals, school reform activists within and across schools, founders and leaders of special programs serving minority and low-income students, and increasingly, as new school founders. Thus, these leadership goals are explicit as part of the program's design for training. Described here are some of the studies and assessment tools thus far developed to evaluate how well these efforts are implemented and what the outcomes are for preparedness, practice, and effectiveness in supporting student learning.

CONCEPTUALIZING OUTCOMES OF TEACHER EDUCATION

Assessing outcomes requires, first, a definition of what we expect teacher education to accomplish and influence in terms of candidate knowledge, skills, and dispositions and, second, means for measuring these things. As Marilyn Cochran-Smith (2001) has observed,

The question that is currently driving reform and policy in teacher education is what I refer to as "the outcomes question." This question asks how we should conceptualize and define the outcomes of teacher education for teacher learning, professional practice, and student learning. (p. 2)

Cochran-Smith identified three ways that outcomes of teacher education are currently being considered:

1. through evidence about the professional performance of teacher candidates;
2. through evidence about teacher test scores; and
3. through evidence about impacts on teaching practice and student learning.

In what follows, I describe studies in each of these categories that seek to evaluate the candidate learning that occurs through particular courses and pedagogies, as well as through the program as a whole; the teaching performance of individuals as preservice candidates and as novice teachers; and the outcomes of this performance for students. With respect to the learning of students taught by STEP candidates, I describe the use of student learning evidence collected in the PACT teaching portfolio as a means for evaluating candidates' planning, instructional, and assessment abilities, and I describe a planned study that will examine evidence of student learning derived from standardized tests and performance assessments for students of beginning teachers who are graduates of STEP and other institutions. In addition, I describe the ways in which these studies and the assessment tools they have produced are used for ongoing program improvement, including changes in curriculum, pedagogy, and clinical supports.

Although we have conducted studies in all three of these categories, it is worth noting that most of the work falls in the first category—evidence about the professional performance of candidates. In this category, we include performance on teacher education assignments requiring analyses of teaching and learning—including a performance test of teacher knowledge (spilling over a bit into the second category)—as well as performance in the classroom during student teaching and (spilling into the third category) practices in the classroom during the 1st year of teaching. In all of these assessments, we agree with Cochran-Smith (2001) that a conception of standards is needed to productively examine teacher performance:

Constructing teacher education outcomes in terms of the professional performances of teacher candidates begins with the premise that there is a professional knowledge base in teaching and teacher education based on general consensus about what it is that teachers and teacher candidates should know and be able to do. The obvious next step, then, is to ask how teacher educators will know when and if individual teacher candidates know and can do what they ought to know and be able to do. A related and larger issue is how evaluators (i.e., higher education institutions themselves, state departments of education, or national accrediting agencies) will know when and if teacher education programs and institutions are preparing teachers who know and can do what they ought to know and be able to do. (p. 22)

This question is easier to address than it once was because of the performance-based standards developed during the past decade by the National Board for Professional Teaching Standards and the Interstate New Teacher Assessment and Support Consortium (INTASC), which has developed standards for beginning teacher licensing that have been adopted or adapted in more than 30 states. These have been integrated into the accreditation standards of the National Council for Accreditation of Teacher Education and reflect a consensual, research-grounded view of what teachers should know and be able to do. The studies presented here define outcomes related to candidates' knowledge and practice in ways that derive directly from these standards. Several use assessments developed on the standards (e.g., the INTASC test of teacher knowledge, a rubric used by supervisors for evaluating student teaching performance based on the California Standards for the Teaching Profession—derived in turn from the INTASC standards, and a survey of program graduates developed to represent the dimensions of teaching included in the standards of the National Board for Professional Teaching Standards and INTASC).

The development of these studies occurred as the teacher education program was explicitly moving to integrate these standards into its curriculum and assessments for both course work and clinical work. This standards integration process had the effect of clarifying goals, articulating for candidates the kinds of abilities they were expected to develop and, for faculty and supervisors, the kinds of supports and guidance they would need to provide. This created consonance between the program's efforts and the criteria against which candidate learning was being evaluated, and it made the results of the studies much more useful than would have been the case if measures of learning were out of sync with the program's aspirations.

The data represented in the studies include assessments of candidates' learning and performance from objective tests, from supervisors and CTs' observations in student teaching, and from researchers' observations in the early years of teaching, from work samples, from reports of candidates' practices, and from candidates' own perceptions of their preparedness and learning, both during the program and once they had begun teaching. The PACT performance assessment allows systematic analysis of candidates' performances across different domains of teaching and comparison with those of other California teacher education programs. That assessment and the consortium of institutions involved in developing the assessment will enable future studies (also described below) that examine the effectiveness of teachers in terms of their students' learning gains in their 1st year of teaching.

TRACKING CANDIDATES' LEARNING

To examine what candidates learn in the STEP program, we have collected perceptual data on what they feel they have learned in the program (through surveys and interviews) as well as independent measures of what they have learned (data from pretests and posttests, performance assessments, work samples, and observations of practice). Finally, to learn about what our candidates do after they have left STEP—whether they enter and stay in teaching and what kinds of practices they engage in—we have used data from graduate surveys augmented with data from employers and direct observations of practice. We have learned much about the possibilities and limits of different tools and strategies for evaluating teacher education candidates and program effects.

Perceptual Data About Candidate Learning

Surveys. We developed a survey of graduates that has now been used for six cohorts of graduates to track perceptions of preparedness across multiple dimensions of teaching and provide data about beliefs and practices and information about career paths. Although there are limitations to self-report data—in particular the fact that candidates' feelings of preparedness may not reflect their actual practices or their success with students—research finds significant correlations between these perceptions and teachers' sense of self-efficacy (itself correlated with student achievement) as well as their retention in teaching (for a discussion, see Darling-Hammond, Chung, & Frelow, 2002). To triangulate these data, a companion survey of employers collects information about how well prepared principals and superintendents believe our graduates are along those same dimensions in comparison to others they hire. The survey was substantially derived from a national study of teacher education programs by the National Center for Restructuring Education, Schools, and Teaching (Darling-Hammond, in press), which allowed us to compare our results on many items to that of a national sample of beginning teachers.2 Conducting the survey with four cohorts in the first round of research also allowed us to look at trends in graduates' perceptions of preparedness with time (Darling-Hammond, Eiler, & Marcus, 2002) and to examine how our redesign efforts were changing those perceptions.

We learned in a factor analysis that graduates' responses to the survey loaded onto factors that closely mirror the California Standards for the Teaching Profession, a finding that suggests the validity of the survey in representing distinct and important dimensions of teaching (see appendix). We were pleased to discover that employers felt very positively about the skills of STEP graduates: On all of the dimensions of teaching measured, employers' ratings were above 4 on a 5-point scale, and 97% of employers gave the program the top rating of 5 on the question, "Overall, how well do you feel STEP prepares teacher candidates?" Of the employers, 100% said they were likely to hire STEP graduates in the future, offering comments such as, "STEP graduates are so well prepared that they have a huge advantage over virtually all other candidates," and "I'd hire a STEP graduate in a minute. . . . They are well prepared and generally accept broad responsibilities in the overall programs of a school." Program strengths frequently listed include strong academic and research training for teaching, repertoire of teaching skills and commitment to diverse learners, and preparation for leadership and school reform. Employers were less critical of candidates' preparedness than were candidates themselves, a finding similar to that of another study of several teacher education programs (Darling-Hammond, in press).

We were also pleased to learn that 87% of our graduates continued to hold teaching or other education positions, most in very diverse schools, and that many had taken on leadership roles. Most useful to us were data showing graduates' differential feelings of preparedness along different dimensions of teaching, which were directly useful in shaping ongoing reforms. However, given the limits of self-report data, these needed to be combined with other sources of data, as discussed in the Using Data for Program Improvement section below.

We also want to know about the practices graduates engage in. Although 80% or more reported engaging in practices we would view as compatible with the goals of the program, there was noticeable variability in certain practices, such as using research to make decisions, involving students in goal setting, and involving parents. We found that the use of these and other teaching practices was highly correlated with teachers' sense of preparedness. Teachers who felt most prepared were most likely to adjust teaching based on student progress and learning styles, to use research in making decisions, and to have students set some of their own learning goals and assess their own work. Obvious questions arise about whether differences in the course sections to which candidates were assigned are related to these different practices.

Equally interesting is the fact that graduates who feel better prepared are significantly more likely to feel highly efficacious—to believe they are making a difference and can have more effect on student learning than peers, home environment, or other factors. Although we found no relationship between the type of school a graduate taught in and the extent to which she or he reported feeling efficacious or well prepared, there are many important questions to be pursued about the extent to which practices and feelings of efficacy are related to aspects of the preparation experience and aspects of the teaching setting.

Other research finds that graduates' assessments of the utility of their teacher education experiences evolve during their years in practice. With respect both to interviews and survey data, we would want to know how candidates who have been teaching for different amounts of time and in different contexts evaluate and reevaluate what has been useful to them and what they wish they had learned in their preservice program. Using survey data, it is not entirely possible to sort out these possible experience effects from those of program changes that affect cohorts differently. Interviews of graduates at different points in their careers that ask for such reflections about whether and when certain kinds of knowledge became meaningful for them would be needed to examine this more closely.
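As a purely illustrative aside (not part of the original study, and using invented data), the kind of factor analysis of survey responses described above rests on a simple statistical idea: items that measure the same dimension of teaching should correlate strongly with one another and weakly with items from other dimensions. A minimal sketch of that check, assuming synthetic Likert-style responses rather than the actual STEP survey data, might look like this:

```python
# Illustrative sketch only: synthetic data standing in for Likert-scale
# survey responses. Neither the items nor the analysis choices here come
# from the STEP survey itself.
import numpy as np

rng = np.random.default_rng(0)
n_respondents = 200

# Simulate two latent preparedness dimensions driving six survey items:
# items 0-2 reflect dimension A, items 3-5 reflect dimension B.
dim_a = rng.normal(size=n_respondents)
dim_b = rng.normal(size=n_respondents)
noise = rng.normal(scale=0.5, size=(n_respondents, 6))
responses = np.column_stack([dim_a] * 3 + [dim_b] * 3) + noise

# Items driven by the same dimension should correlate strongly with each
# other and weakly with items driven by the other dimension.
corr = np.corrcoef(responses, rowvar=False)

# A principal-axis style check: loadings on the two largest components of
# the item correlation matrix (a rough stand-in for a full factor analysis).
eigvals, eigvecs = np.linalg.eigh(corr)
loadings = eigvecs[:, -2:] * np.sqrt(eigvals[-2:])

print(np.round(corr, 2))
```

A formal factor analysis of the real survey data would involve more (rotation, model fit, and so on); this sketch only shows the underlying logic by which item responses "load onto factors" corresponding to distinct dimensions of teaching.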
the classroom and what influences their deci- what learning experiences were important to
sions about practice. Whether it is possible to them.
link such data on practices—which are con- In another study, researchers looked at learn-
nected to evidence about preparation—to evi- ing in the Crosscultural, Language and Aca-
dence about relevant kinds of student learning demic Development (CLAD) strand of courses
is a question that is examined further below. and experiences intended to prepare candidates
to teach culturally and linguistically diverse
Interviews of students and graduates. Interviews students (Bikle & Bunch, 2002). At the end of the
of students and graduates have been an impor- year, the researchers conducted hour-long inter-
tant adjunct to survey findings, as they have al- views with a set of students—selected to repre-
lowed us to triangulate findings and better
sent diverse subject areas and teaching place-
understand the perceptions of candidates about
ments—to understand how they felt their
how they were prepared. We have used inter-
courses addressed the three domains of CLAD:
views in a number of studies and highlight
(a) language structure and first and second lan-
three of them here as distinctive examples of
guage development; (b) methods of bilingual,
how they have been helpful. In one instance, we
English language development and content
explored the results of a particular course that
instruction; and (c) culture and cultural diver-
had been redesigned; in another, a strand of
sity. They reviewed course syllabi from eight
courses was evaluated; and in a third, the effects
courses that treated aspects of cultural and lin-
of the program as a whole were examined. In all
guistic diversity to assess what instructors
of these studies, candidates were asked not only about how prepared they felt but also about how they perceived the effects of specific courses and experiences. This explicit prompting—in conjunction with other data—allowed greater understanding of the relationships between program design decisions and student experiences.

In a study discussed by Roeser (2002), an instructor who had struggled with a course on adolescent development found that student evaluations improved significantly after the course was redesigned to include the introduction of an adolescent case study that linked all of the readings and class discussions into a clinical inquiry. The instructor conducted structured follow-up interviews with students after the conclusion of the course to examine their views of the learning experience as well as of adolescent students' development. He placed candidates' views of adolescent students in the context of a developmental trajectory of student teachers, documenting changes in their perspectives about adolescents as well as about their own roles as teachers. These reports of candidate perspectives on their students, combined with their reports of their own learning and the data from confidential course evaluations collected over time, provided a rich set of information on what candidates learned and intended for students to learn in terms of these domains, and they reviewed student teachers' capstone portfolios to examine the extent to which candidates integrated course work and clinical experiences regarding the needs of English language learners into specific portfolio assignments.

The interviews not only explored what candidates learned in classes and applied to their placements but also placed this learning in the context of previous life experiences and future plans. Researchers asked for specific instances in courses and student teaching in which participants were able to connect classroom learning to practice or, conversely, felt unprepared to deal with an issue of linguistic diversity. Finally, they asked candidates what would excite or concern them about teaching a large number of linguistically diverse students. The use of interview data—alongside samples of work from candidates' portfolios and syllabi—was extremely helpful in providing diagnostics that informed later program changes (discussed below).

A third study examines what already-experienced teachers felt they learned during this preservice program (Kunzman, 2002, 2003), providing insights about the value that formal teacher education may add to the learning teachers feel they can get from experience alone.
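The triangulation described above—looking for themes that recur across interviews, portfolio samples, and syllabi—can be sketched as a simple tally of coded excerpts. This is only an illustrative sketch: the data records, source types, and theme labels below are hypothetical and do not reproduce the studies' actual coding schemes.

```python
# Sketch: tallying researcher-coded excerpts by theme and data source,
# so that themes supported by multiple kinds of evidence stand out.
# All records and theme names here are invented for illustration.
from collections import Counter

# Each record: (data_source, theme) assigned during qualitative coding.
coded_excerpts = [
    ("interview", "preparedness for linguistic diversity"),
    ("interview", "connecting coursework to practice"),
    ("portfolio", "connecting coursework to practice"),
    ("syllabus", "preparedness for linguistic diversity"),
    ("interview", "preparedness for linguistic diversity"),
]

# Count how often each theme appears overall.
theme_counts = Counter(theme for _, theme in coded_excerpts)

# For each theme, report how many excerpts support it and how many
# distinct source types (interview, portfolio, syllabus) it spans.
for theme, n in theme_counts.most_common():
    sources = {src for src, t in coded_excerpts if t == theme}
    print(f"{theme}: {n} excerpts across {len(sources)} source type(s)")
```

A theme that appears in several source types (rather than only in interviews) offers stronger grounds for the kind of diagnostic claims about program strengths and weaknesses made in these studies.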
About 20% of STEP students have already had at least 1 year of teaching experience before entering the preservice program. Unlike some programs serving teachers with experience, these teachers are fully integrated into the cohort, taking all the same courses and engaging in a full year of supervised student teaching like other candidates. Using a semistructured protocol, the author interviewed 23 of these STEP graduates from 1999 and 2000, asking them about their teaching experience prior to STEP and any training they might have had, their year of STEP study, and, for 1999 graduates, their 1st year back in their own classroom since graduation.

Five themes emerged from interviews as areas of important learning for these experienced teachers: (a) increased effectiveness working with struggling students; (b) greater sophistication in curriculum planning, particularly in identifying and matching long-term objectives and assessment; (c) greater appreciation for collaborative teaching and ability to nurture collegial support; (d) structured opportunities for feedback and reflection on teaching practice; and (e) development of theoretical frameworks to support teaching skills and vision.

An analysis that tied this perceived learning back to specific courses and program experiences helped us to understand how various aspects of the program were working for these students. Discovering how much they valued certain kinds of learning opportunities encouraged us to maintain and expand certain components as we considered annual program changes. The study also confirmed some of our decisions about how to educate already-experienced teachers in a preservice program—a phenomenon that is common in California, where many individuals enter teaching without initial training. We concluded that these recruits appear to benefit at least as much as other candidates (in some cases perhaps more) from traditional student teaching in the classroom of an expert veteran and from a systematic set of courses that provides a conceptual framework and research base that both connects and corrects parts of their prior knowledge.

Analyses of Candidate Performance

Pretests and posttests of teaching knowledge. A more unusual strategy for gauging learning was the use of the INTASC pilot Test of Teaching Knowledge (TTK) to look at preprogram and postprogram evidence about candidate knowledge of learning, development, teaching, and assessment. The TTK was developed on the INTASC standards by a group of teacher educators and state officials from the INTASC consortium, in collaboration with Educational Testing Service. It aimed to respond to the problem of teacher tests that have been critiqued for not testing teaching knowledge well—either because they focus on only basic skills or subject matter knowledge or because they ask questions about teaching in ways that are overly simplified, inauthentic, or merely require careful reading to discern the "right" answer (Darling-Hammond, Wise, & Klein, 1999; Haertel, 1991). For many years there have been press accounts of journalists and others not trained to teach who could take teacher competency tests and do as well as trained teachers because the content of the test so poorly represented the professional knowledge base. Whereas tests in some other professions are validated by comparing the scores of untrained novices with those of individuals who have received preparation (e.g., new law students vs. graduates of law school), this approach has not been used to validate teacher tests in the past.

Our experience with using the TTK at the beginning of the first quarter and end of the fourth quarter of a four-quarter preparation program was instructive in this regard. We were able both to document growth in learning for our candidates and provide evidence that, for the most part, the instrument appears to measure teaching knowledge that is acquired in a teacher education program (Shultz, 2002). The 26 constructed-response items on the pilot test we used are distributed across four sections.

In the first section, candidates respond to 4 multiple-part questions addressing specific knowledge about learners and how that knowledge might influence the learning and/or teaching process. The second section asks candidates to read a case study or classroom
vignette focusing on aspects of learning, student behavior, or classroom instruction and to answer 7 questions related to the case study. The third section provides a "folio," or a collection of documents, and asks candidates to answer 7 questions dealing with a particular learner or aspect of learning or teaching illustrated in the documents. In the final section, candidates answer 8 short, focused questions assessing propositional knowledge about specific theories, learning needs, instructional strategies, or teaching concepts.

For most items, it was clear that most candidates knew very little at the start of their training—in the pretest, candidates often wrote "I have no idea" or "I'm looking forward to learning about this during my year at STEP"—and they knew a great deal more (usually attaining the maximum score) at the end. However, 7 of the 26 items appeared to suffer from some of the same flaws as items on earlier tests of teaching knowledge—that is, they were answerable by novices before they began their training because they required only a careful reading of the question or prompt to discern the desired response. In some cases, although the item appeared to be a valid measure of professional knowledge, the scoring rubric was designed in a way that did not detect qualitative differences in responses. These findings suggest both the value of the test and a need for further refinement to enhance the validity of such measures.

Samples of student work. We studied how students learn to analyze their teaching by analyzing the several drafts of a curriculum case study they wrote in a course on principles of learning for teaching. In this course, case writing is designed to promote the application of learning theory to practical experiences in the classroom; a student-written curriculum case analyzing an instance of the candidate's own teaching serves as the central product of the class. The case focuses on the teaching of a curriculum segment with specific disciplinary goals, so that students will address central questions concerned with engaging students in the learning of subject matter. Students are asked to write about an incident in which they were trying to teach a key concept, problem, topic, or issue that is central to the discipline, such as the concept of irony in English, evolution in science, pi in math, or the cultural differences in a foreign language. The incident may have been particularly successful, unsuccessful, surprising, or revealing and should have the potential to serve as a site for exploring interesting dilemmas or questions about teaching and learning. Student teachers must provide evidence of student learning to analyze how that learning (or lack of learning) was shaped by classroom decisions. (For a description of the process of developing this pedagogy, see Shulman, 1996.)

We examined data including students' cases (from outline to final draft), students' final self-assessment essays, interviews with instructors, and interviews with a sample of students (Hammerness, Darling-Hammond, & Shulman, 2002). Using the framework of "novice/expert" thinking proposed by Berliner (1986, 1991), we coded and scored student work, finding that students' successive case drafts demonstrated a development from naïve generalizations to sophisticated, theory-based explanations of the issues at play in their cases, characteristic of more "expert" thinking about teaching. We also found that certain aspects of the course pedagogy were important in helping student teachers learn to think like a teacher, including reading theoretical works in conjunction with writing cases; sharing cases with peer readers; receiving specific, theoretically grounded, concrete feedback from instructors; and revising the case several times in response to feedback about important elements of the context and teaching as well as potential theoretical explanations for what occurred.

Longitudinal observations of clinical practice. Another tool we developed to track candidates' learning is a detailed rubric for supervisors to use in evaluating student teaching progress, based on the California Standards for the Teaching Profession. This tool was informed by efforts at other institutions, especially the University of California campuses at Santa Barbara and Santa Cruz. Previous Stanford observation forms were entirely open-ended and produced widely differing kinds of observations of very different elements of teaching, de-
pending on what different observers thought to comment on. Research on assessment suggests that clear criteria are important for developing performance and that the usefulness of clinical experiences is weakened by a lack of distinction between outstanding and ineffective teaching in assessment processes (Diamonti, 1977; McIntyre, Byrd, & Foxx, 1996), inadequate formative assessment (Howey & Zimpher, 1989), and a lack of clear roles for many supervisors and CTs (Cole & Knowles, 1995; Williams, Ramanathan, Smith, Cruz, & Lipsett, 1997). Having specific indicators for each of the six California Standards for the Teaching Profession (the standards are noted in the appendix) and their associated substandards, outlined on a scale from novice to expert, provided guidance to supervisors and CTs in what to focus on (clarifying the content standards for clinical practice) and how to make judgments of performance—what counts as proficient performance adequate to sustain a recommendation for credentialing.

The relationship between these measures of performance in student teaching and what teachers do in "real" teaching is likely to depend in part on the nature and duration of the clinical experience. In this program, with a year-long student teaching placement, it is possible for candidates to gradually take on nearly all of the full responsibilities of a teacher, typically engaging in independent practice by February or March of the school year after assisting and coteaching for the 5 or 6 previous months. This allows teaching to be assessed both as a measure of candidate learning-in-progress and, by the end of the year, as a proximal "outcome" of the overall preparation process. Furthermore, both the standards-based assessment instrument and, to an even greater degree, the PACT assessment (described below) help to structure the kinds of performances candidates must engage in if they are to be assessed, thus creating more systematic opportunities to learn and perform for student teachers than might otherwise occur by chance, given different contexts and expectations held by CTs.

We learned several things about clinical assessment strategies from examining candidates' scores on this instrument: First, teacher candidates and supervisors viewed the rubric as helpful in focusing their efforts and clarifying goals. Second, we learned from using the instrument in multiple observations that consensus between university supervisors and CTs about the meaning of the rubric scores grew with time, probably as a function of repeated use, conversations between supervisors and CTs, and, perhaps, the modest training efforts conducted by the program. The exact-score correlations between CTs' and supervisors' evaluations were very low at the beginning of the year and improved noticeably as the year went on. However, the correlations were never as high as would ideally be desirable, even if the assessments were generally very close.

Thus, a third thing we learned is that the use of such assessments requires intensive, explicit efforts to develop shared meanings if they are to be viewed as reliable assessments for determining recommendations for certification and for conducting research on learning and performance. Finally, there are questions about how one can independently confirm the improvements in practice that seem to be indicated by scores on an observational instrument through other measures of practice. I turn to these next.

ANALYZING PRACTICE AS AN OUTCOME OF PREPARATION

Although it is very helpful to look at candidates' learning in courses and their views of what they have learned, it is critical to examine whether and how they can apply what they have learned in the classroom. The problem of "enacting" knowledge in practice (Kennedy, 1999) is shared by all professions, but the problem is particularly difficult in teaching, where professionals must deal with large numbers of clients at one time, draw on many disparate kinds of knowledge and skill, balance competing goals, and put into action what they have learned while evaluating what is working from moment to moment and changing course as needed. To begin to explore whether our candidates can enact their learning in the classroom, we conducted two kinds of studies to examine
candidates’ actual performance as teachers, formance assessment (TPA) as a basis for pro-
both in the independent portion of the year- grams’ credentialing recommendations, the
long student teaching they undertake as state developed its own TPA but gave colleges
preservice candidates and as beginning the option to develop their own and submit
teachers after they have graduated. them, with evidence of validity and reliability,
for approval. Twelve colleges created a consor-
Observations of graduates’ teaching practice. tium to develop a TPA—all of the University of
Hammerness (in press) first recorded program California campuses, Stanford University and
intentions through close analysis of syllabi and Mills College, plus 2 of the California State Uni-
program documents and through interviews versity campuses. This consortium has since
with faculty members; she then observed and grown to 17 programs and will continue to ex-
interviewed 10 novice teacher graduates of the pand. The TPA created by the PACT consortium
program using an observation form recording is modeled on both the National Board for Pro-
evidence of five key program elements in the fessional Teaching Standards’ portfolio and on
graduates’ practices. These elements include the portfolio for beginning teacher licensing
concern for students as learners and for their used by the state of Connecticut.
prior experiences and learning, the use of peda- The PACT includes a “teaching event” (TE)
gogical content strategies to make subject mat- portfolio in the subject area(s) candidates teach
ter accessible to students, commitment to plus “embedded signature assessments” used
equity, capacity to reflect, and commitment to in each teacher education program (e.g., the
change. Teachers’ practice was coded as to development of curriculum units, child case
whether there was “strong evidence,” “some studies, or analyses of learning). With modest
evidence,” or “little evidence” of practice philanthropic support and substantial in-kind
reflecting the 27 indicators of these elements. contributions from the universities themselves,
The Hammerness (in press) study found that the assessments were piloted, scored, revised,
efforts to create program coherence on a set of and piloted again in academic years 2002-2003
themes were generally reflected in strong evi- and 2003-2004. During this period of time, more
dence of practices related to these themes. In than 1,200 candidates at PACT institutions
particular, attention to students’ needs and piloted TEs in the areas of elementary literacy and
learning, use of well-grounded content peda- mathematics, English/language arts, history/
gogical strategies, and commitment to equity social science, mathematics, and science. More
for students were in strong evidence in virtually than 250 teachers and teacher educators were
all of the graduates’ practice. However, candi- trained to score these assessments in spring
dates felt less sure about their assessment prac- 2003 and spring 2004. Technical studies of reli-
tices than their other instructional approaches, ability and validity have been conducted on
and evidence of reflection and engagement in these data (see Pecheone & Chung, 2006 [this
school change was spottier. These were areas issue], for details.)
identified for further curriculum work. Because For each TE, candidates complete several en-
this study included a careful analysis of syllabi tries that are integrated on a unit or segment of
across the program, as well as detailed observa- instruction of about 1 week in length. These en-
tions of graduates’ practices, it could inform tries include
specific changes in the curriculum (discussed 1. a description of their teaching context, including
below). students and content;
2. a set of lesson plans from the segment of instruction;
The PACT teaching assessment. Finally, the 3. one or two videotapes of instruction during the unit
PACT assessment developed by a set of Califor- (depending on the field);
nia universities has provided a means to evalu- 4. samples of student work during the unit; and
5. written reflections on instruction and student learn-
ate elements of teaching skill systematically and
ing during the unit.
authentically within the program. When Cali-
fornia passed a law requiring a teacher per-
This collection of teacher and student artifacts is based on a planning, instruction, assessment, and reflection model in which candidates use knowledge of students' skills and abilities—as well as knowledge of content and how best to teach it—in planning, implementing, and assessing instruction. The planning, instruction, assessment, and reflection model is distinct in its placement of student learning at the center of the assessment system. Although many clinical assessments of preservice candidates focus on teacher activities and behaviors, paying little attention to evidence about student outcomes, the PACT TEs focus on evidence of student learning of defined objectives—including the learning of English language learners and students with learning differences—and ask candidates to consider the extent to which these objectives were attained for all students and how to adapt instruction to improve student learning.

There are several ways in which the PACT emphasizes attention to pupil learning. First, in the design of the instructional unit, candidates must describe how they have planned their unit based on what they know about their students' prior knowledge and learning and explain how their plans accommodate the needs of the group and individuals, including English language learners and students with exceptional needs. Second, as part of their planning, teachers show how they will incorporate formative as well as summative assessments in the unit and how they will use what they learn from the assessments to guide their teaching. Third, teachers teach the unit and record reflections each day about the students' responses and evidence of learning; then they describe how they will respond to students' needs in the next day's lesson. (Student teachers report this is a particularly powerful aspect of their PACT experience.) Fourth, candidates are asked to provide commentary on the videotapes they submit of themselves teaching part of the unit. The guiding questions they answer in this task, as well as others, focus on what they have observed about student learning of both specific disciplinary content and skills and of academic language.

Finally, candidates collect all of the student work from one assessment during the unit and analyze it in terms of what the work shows about student learning and areas for further teaching for different groups of students. This work is included in the portfolio for scoring, along with the teacher candidate's analysis and feedback to students. This evidence allows analysis of the kind and quality of work asked of and produced by students, how it reflects state standards and is aligned to what was taught, how well it was supported instructionally, and how closely and thoughtfully the teacher candidate can evaluate the work to understand what different students have learned and to plan for future instruction.

The PACT assessments provide evidence of candidate performance on authentic tasks of teaching scored in systematic ways that have allowed the participating universities to evaluate overall candidate performance, the relative strength of different areas of preparation (e.g., STEP candidates do better on planning, instruction, and reflection than they do on assessment), and the performance of candidates in comparison to those at other California institutions, which provides a broader perspective on our work and its success. Figure 1 illustrates some of the data available from the PACT, suggesting, for example, that scores are highest and most consistent across institutions on the planning task and increasingly variable for instruction, assessment, reflection, and language development. In general, scores are lowest on the assessment task, suggesting an area for attention across institutions. As described below, the PACT assessments will also provide a linchpin in a broader study of candidate effectiveness that examines practice and student learning gains.

RESEARCH ON GRADUATES' EFFECTIVENESS

As noted earlier, the most difficult and, to many, the most important question is how what teachers have learned ultimately influences what their pupils learn. Even if teacher education students are followed into their classrooms,
[FIGURE 1: Performance Assessment for California Teachers Scores by Institution and Area, 2004. The chart plots mean scores (scale 0.00 to 3.50) for institutions B through L on the Overall Mean Score and on the Planning, Instruction, Assessment, Reflection, and Academic Language tasks.]

[FIGURE 2: Data Collection Design. NOTE: PACT = Performance Assessment for California Teachers.]
there are many complexities in approaching this question, including the problem of linking what teachers have learned to what they later do in the classroom—and then linking what they do to what their students learn, accounting for the variability in what these pupils bring with them. It is very difficult for most individual programs to be able to secure adequate data on these questions, given the many and diverse districts and contexts their candidates leave to teach in, the small samples that can be tracked with any comparability, and the difficulty in securing useful and comparable pupil assessment data. We are seeking to approach this difficult question by capitalizing on the development of many of the assessments described earlier, including the PACT assessments, and by leveraging the cooperation of members of the PACT consortium to develop a large enough sample within a few large urban areas with enough variability in training to begin to link program features to practices and student outcomes.

The study will evaluate the practices and effectiveness of a sample of 300 to 400 elementary teacher education graduates from a number of the PACT universities, using measures of preservice teacher preparation experiences (documented components of programs and surveys measuring candidate perceptions of preparation and preparedness), preservice measures of teacher "quality" (e.g., grades, licensure test scores, supervisory ratings, and PACT scores in literacy and mathematics), teacher practices in the classroom, and teacher effectiveness as evaluated by their students' achievement on both state standardized tests in literacy and mathematics and curriculum-based performance assessments given at the beginning and end of the school year that are more sensitive to higher order thinking and performance skills (see Figure 2). These measures of student learning will allow analysis of both how students perform on large-scale assessments, controlling for their prior years' scores on these same tests, and how their performance has changed during the course of the school year on constructed-response performance tasks that reflect the development of key reading, writing, and mathematics skills.

To build a chain of evidence, beginning teachers will be followed from the last year of their preparation into their 1st year of teaching. The analyses, using an approach rather like a path analysis, will evaluate the multiple connections among candidates' entering characteristics, their preparation experiences, performance as preservice candidates on traditional measures and the PACT performance assessment, and their practice as teachers. Teaching practice will be examined through observations and analysis of teaching artifacts such as lesson plans, videotapes, and student work samples. Even absent the consideration of pupil learning, these analyses will be valuable for exploring relationships among measures of performance and for begin-
ning to understand how what candidates encounter in their programs may influence what they are able to do in their classrooms.

It is clear that school contexts will have a large effect on teacher practice as well. Data on school contexts, including student demographics, working conditions, leadership and culture, and the nature of beginning teacher induction and supports, will be used to explore these relationships and to provide appropriate statistical controls. Multivariate multilevel analyses of the predictors of teacher effectiveness will be conducted, exploring the correlates of both practices and pupil learning gains, including preservice components (e.g., course work elements, length and design of student teaching), other indicators of teacher quality (grades, test scores, background variables), indicators of teacher performance in preservice (supervisory ratings, PACT scores), and the amount and kind of induction support. While examining influences on teacher effectiveness, these analyses will also provide concurrent and predictive validity evidence about the PACT assessment as a measure of teacher quality.

The virtue of this design for examining teacher effectiveness is that, in contrast to existing large-scale databases, the study will include more detailed measures of teacher education content and performance as well as broader measures of student achievement. It will treat teacher preparation and teaching as more than a black box. And in contrast to many small qualitative studies of individual programs, it will allow us to examine variation in preparation using both qualitative and quantitative measures of teacher performance and effects. However, even with these advantages, this approach will just begin to scratch the surface of the work to be done in establishing the relationships among aspects of preparation, teacher learning, teaching practice, and student learning that the field is wrestling with.

USING DATA FOR PROGRAM IMPROVEMENT

An obvious goal for evaluations of program outcomes is to identify areas where it appears the program is succeeding more and less well. Another goal is to evaluate the effects of program reforms on candidates' opportunities to learn and on later performance. Using different strategies allowed us to triangulate data from several sources to look for patterns in responses.

Analyzing Strengths and Weaknesses

Looking across several measures, we found, for example, confirmations that candidates felt well prepared in terms of planning and organizing curriculum in their subject matter and using a wide repertoire of teaching and assessment strategies adapted to student needs, that their supervisors saw substantial growth in these areas in terms of practice during the course of the year (Lotan & Marcus, 2002), and that test measures recorded growth in knowledge about these areas (Shultz, 2002). When compared to a national sample of beginning teachers, these were areas in which the program also appeared relatively strong (Darling-Hammond, Eiler, & Marcus, 2002).

We noted that areas in which the program appeared relatively strong compared to other programs were not always areas where we were fully satisfied. For example, even though 90% of STEP graduates reported feeling adequately prepared to teach English language learners (as compared to 50% of a national random sample of beginning teachers), fewer students felt "very well" prepared in this than in some other areas, and our more in-depth examination of students' experiences in the CLAD strand of courses helped us to parse out which areas of their preparation were stronger (e.g., preparation to address diverse cultures and to use "sheltered" techniques to teach content) and which were weaker (e.g., preparation to teach English language skills to new English language learners; Bikle & Bunch, 2002).

In addition, although most candidates felt well prepared to use a range of assessments, there was variation across subject matter areas; we observed less sophistication in the practice of some in this area, compared to areas such as planning and instruction, in both the follow-up observations of graduates and the PACT. This
confirms the importance of supplementing self-report instruments with other sources of data, and it points to an area of needed curriculum development. At the time of the survey, the program relied mostly on the subject-specific curriculum and instruction courses to teach about assessment, and they treated this topic very unevenly. This was addressed both by adding an assessment module in the practicum and by asking faculty in subject-specific methods courses to discuss collectively what each was doing in the area of assessment and to define areas to develop further within those courses.

We found some other areas where graduates felt less well prepared. On our graduate survey, generally more than 80% of graduates felt adequately prepared for most of the tasks of teaching. However, somewhat smaller proportions (ranging from 73% to 79% when all 4 years of survey data were averaged) felt adequately prepared to identify and address special learning needs or difficulties, to work with parents, to use technology in the classroom, to create interdisciplinary curriculum, to resolve interpersonal conflict, and to assume leadership responsibilities in their school. Some of these are areas where teacher education programs have generally received lower ratings from their graduates (e.g., special education, technology use). Others, such as creating interdisciplinary curriculum, are areas where our secondary program, which is heavily focused on content pedagogy within the disciplines, does less work than many elementary programs or those with a different orientation.

Making sense of these findings in program terms required triangulation with other data and an examination of trends over time (see below). These survey responses were sometimes reinforced by performance on the TTK. For example, candidates' pretest-to-posttest score gains were partial in areas such as responding to students' special needs, in which they showed increased understanding of the content requested in the question but could not always discuss how they would apply their understanding to instructional practices (Shultz, 2002). These findings led to the redesign of an instructional module on special education.

Analyzing the Effects of Program Reforms

One of the goals of the research was to uncover whether there were changes in candidates' learning during the 3 years in which a number of program reforms were implemented (see Hammerness & Darling-Hammond, 2002, for a discussion of these changes). By collecting surveys from 4 years of program graduates, we were able to examine whether graduates' views of certain aspects of the program changed over time. Although there were no significant differences over time in most areas, there were some areas where program changes seemed to have made a large difference in graduates' feelings of preparedness. Some of these changes were positive and others were less so. On one hand, the introduction of much more explicit work on how to use technology in the classroom, how to work with parents, and how to address the special needs of exceptional students appeared to result in large increases in the proportions of graduates feeling adequately prepared in these domains (exceeding 80% in each category by 2000).

On the other hand, a sharp drop in candidates' self-reported readiness to create interdisciplinary curriculum could also be attributed to program reforms. As efforts were made to tie courses more tightly together and streamline the curriculum to allow for the introduction of new content, a course that had earlier required an interdisciplinary curriculum project allowed students to use their discipline-based curriculum unit as the site for embedding required group-work tasks. Thus, fewer students had the experience of constructing interdisciplinary curriculum. This project was reinstated. However, given the expectations for secondary school teachers in the field, their own felt needs, and the shortness of the program, we decided that other needs were more pressing than giving additional curriculum time to interdisciplinary curriculum.

As in many program decisions, the faculty needed to consider the trade-offs among competing goals for a 1-year teacher education program and decide which values should guide a decision about whether or how to rethink the curriculum.
134 Journal of Teacher Education, Vol. 57, No. 2, March/April 2006
Downloaded from http://jte.sagepub.com at Stanford University on March 19, 2007
© 2006 American Association of Colleges for Teacher Education. All rights reserved. Not for commercial use or unauthorized distribution.
Another change—the infusion of CLAD as a core part of the program design—increased the exposure many students received to the knowledge and skill base needed to teach culturally and linguistically diverse students but may have sacrificed depth in the area of English language development. That change and California's proposition outlawing bilingual education put a course on bilingual education into an odd position in the curriculum. Data about student perceptions of preparedness allowed the faculty to plan a redesign of this component in light of what students felt they knew and could do and where they wished they knew still more.

In using data to inform program changes, we found it crucial to have several sources of data on the same question, including information that explicitly examines the connections between particular findings and specific aspects of the curriculum, to draw inferences about what is working well, what is not, and what can be done about it. More nuanced and detailed student feedback is also gathered from evaluations of specific course sections and sessions, supervisory groups, student teaching placements, and student experiences. These data illuminate survey and interview findings and shed light on the results of the TTK, the clinical observations, and student work samples. Without course-specific information, it would be much more difficult to draw inferences from the data that are useful for evaluating and developing appropriate changes.

CONCLUSION

Each kind of tool described here has the potential to contribute different insights to an assessment of candidates' progress and program outcomes. Although each has limitations, we have found them powerful in the aggregate for shedding light on the development of professional performance and on how various program elements support this learning. We would like to develop even more powerful measures of performance—including further refinement of the teaching event that candidates develop, videotape, and reflect on as part of a culminating portfolio, as well as more extensive systematic observations of graduates' practice and their students' outcomes—to supplement and validate these kinds of measures. Having examined a range of strategies, it seems to us that in this era of intense focus on single measures of teacher education outcomes, it will be important to press for the use of multiple measures that allow a comprehensive view of what candidates learn and what a program contributes to their performance.
APPENDIX
Teacher Survey Factors

Responses on a 5-point scale to "How well do you think your teacher preparation prepared you to . . ." (N = 152). Each item is listed with its factor loading and mean value (standard deviation).

Factor 1: Design curriculum and instruction (CSTP Standards 3 and 4: Understanding and organizing subject matter for student learning; Planning instruction and designing learning experiences for all students)
Q5: Develop curriculum that builds on students' experiences, interests and abilities. Loading 0.720; M = 4.02 (SD = 0.79)
Q7: Create interdisciplinary curriculum. Loading 0.694; M = 3.24 (SD = 1.13)
Q1: Teach the concepts, knowledge, and skills of your discipline(s) in ways that enable students to learn. Loading 0.641; M = 4.09 (SD = 0.80)
Q25: Use knowledge of learning, subject matter, curriculum, and student development to plan instruction. Loading 0.582; M = 4.10 (SD = 0.79)
Q9: Relate classroom learning to the real world. Loading 0.581; M = 3.72 (SD = 0.94)
Q14: Provide a rationale for teaching decisions to students/parents/colleagues. Loading 0.500; M = 3.93 (SD = 0.96)
Q6: Evaluate curriculum materials for their usefulness and appropriateness for your students. Loading 0.484; M = 3.76 (SD = 0.95)
Q18: Develop students' questioning and discussion skills. Loading 0.457; M = 3.68 (SD = 0.91)
Q8: Use instructional strategies that promote active student learning. Loading 0.455; M = 4.30 (SD = 0.68)

Factor 2: Support diverse learners (CSTP Standard 1: Engaging and supporting all students in learning)
Q26: Understand how factors in the students' environment outside of school may influence their life and learning. Loading 0.707; M = 3.91 (SD = 0.90)
Q21: Teach students from a multicultural vantage point. Loading 0.690; M = 3.72 (SD = 0.90)
Q24: Encourage students to see, question, and interpret ideas from diverse perspectives. Loading 0.630; M = 3.78 (SD = 0.86)
Q10: Understand how students' social, emotional, physical, and cognitive development influences learning. Loading 0.552; M = 3.95 (SD = 0.83)
Q19: Engage students in cooperative work as well as independent learning. Loading 0.507; M = 4.27 (SD = 0.75)
Q2: Understand how different students are learning. Loading 0.472; M = 3.97 (SD = 0.84)

Factor 3: Use assessment to guide learning and teaching (CSTP Standard 5: Assessing student learning)
Q29: Give productive feedback to students to guide their learning. Loading 0.669; M = 3.68 (SD = 0.87)
Q30: Help students learn how to assess their own learning. Loading 0.643; M = 3.41 (SD = 0.82)
Q27: Work with parents and families to better understand students and to support their learning. Loading 0.582; M = 3.14 (SD = 0.86)
Q28: Use a variety of assessments (e.g., observation, portfolios, tests, performance tasks, anecdotal records) to determine student strengths, needs & programs. Loading 0.488; M = 4.09 (SD = 0.80)
Factor 4: Create a productive classroom environment (CSTP Standard 2: Creating and maintaining effective environments for student learning)
Q34: Maintain discipline and an orderly, purposeful learning environment. Loading 0.739; M = 3.63 (SD = 0.93)
Q4: Help all students achieve high academic standards. Loading 0.671; M = 3.62 (SD = 0.83)
Q3: Set challenging and appropriate expectations of learning and performance for students. Loading 0.603; M = 3.85 (SD = 0.88)
Q15: Help students become self-motivated and self-directed. Loading 0.482; M = 3.43 (SD = 0.92)
Q12: Teach in ways that support new English language learners. Loading 0.468; M = 3.71 (SD = 1.05)
Q20: Use effective verbal and nonverbal communication strategies to guide student learning and behavior. Loading 0.458; M = 3.87 (SD = 0.89)

Factor 5: Develop professionally (CSTP Standard 6: Developing as a professional educator)
Q36: Assume leadership responsibilities in your school. Loading 0.700; M = 3.32 (SD = 1.12)
Q35: Plan and solve problems with colleagues. Loading 0.572; M = 3.42 (SD = 1.06)
Q16: Use technology in the classroom. Loading 0.566; M = 3.21 (SD = 1.00)
Q33: Resolve interpersonal conflict. Loading 0.528; M = 3.14 (SD = 1.07)

NOTE: CSTP = California Standards for the Teaching Profession.
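For readers working with comparable survey data, the "mean value (standard deviation)" entries above summarize raw 5-point responses per item. A minimal sketch of that computation follows, using invented response data, since the underlying STEP survey responses are not reproduced in this article:

```python
# Sketch: computing an item's "mean value (standard deviation)" entry
# from raw 5-point survey responses. The responses below are invented
# for illustration; they are not the actual STEP survey data.
from statistics import mean, stdev

# Hypothetical responses to one survey item, each on the 5-point scale.
responses = [4, 5, 4, 3, 4, 5, 4, 4, 3, 5]

item_mean = mean(responses)
item_sd = stdev(responses)  # sample standard deviation

print(f"{item_mean:.2f} ({item_sd:.2f})")  # -> 4.10 (0.74)
```

The factor loadings in the table come from a separate factor analysis of the full item set, which requires the complete response matrix and is not sketched here.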
NOTES

1. In 2001, the program began a pathway for "co-term" students beginning in the undergraduate years; and in 2003, an elementary education program was added, beginning with Stanford undergraduates, who receive a disciplinary bachelor's degree and a master's in education in 5 years.
2. A very similar version of the survey was used in a study of 3,000 beginning teachers entering teaching through different programs and pathways in New York City. The results and the survey instrument are reported in Darling-Hammond, Chung, and Frelow (2002).

REFERENCES

Ball, D. L., & Cohen, D. K. (1999). Developing practice, developing practitioners: Toward a practice-based theory of professional education. In L. Darling-Hammond & G. Sykes (Eds.), Teaching as the learning profession: A handbook of policy and practice (pp. 3-32). San Francisco: Jossey-Bass.
Ballou, D., & Podgursky, M. (2000). Reforming teacher preparation and licensing: What is the evidence? Teachers College Record, 102(1), 1-27.
Berliner, D. (1986). In pursuit of the expert pedagogue. Educational Researcher, 15(7), 5-13.
Berliner, D. (1991). Perceptions of student behavior as a function of expertise. Journal of Classroom Interaction, 26(1), 1-8.
Bikle, K., & Bunch, G. C. (2002). CLAD in STEP: One program's efforts to prepare teachers for linguistic and cultural diversity. Issues in Teacher Education, 11(1), 85-98.
Cochran-Smith, M. (2001). Constructing outcomes in teacher education: Policy, practice and pitfalls. Education Policy Analysis Archives, 9(11). Retrieved from http://epaa.asu.edu/epaa/v9n11.html
Cole, A. L., & Knowles, J. G. (1995). University supervisors and preservice teachers: Clarifying roles and negotiating relationships. Teacher Educator, 30(3), 44-56.
Darling-Hammond, L. (2000). Reforming teacher preparation and licensing: Debating the evidence. Teachers College Record, 102(1), 28-56.
Darling-Hammond, L. (in press). Powerful teacher education: Lessons from exemplary programs. San Francisco: Jossey-Bass.
Darling-Hammond, L., Chung, R., & Frelow, F. (2002). Variation in teacher preparation: How well do different pathways prepare teachers to teach? Journal of Teacher Education, 53(4), 286-302.
Darling-Hammond, L., Eiler, M., & Marcus, A. (2002). Perceptions of preparation: Using survey data to assess teacher education outcomes. Issues in Teacher Education, 11(1), 65-84.
Darling-Hammond, L., Wise, A. E., & Klein, S. P. (1999). A license to teach. San Francisco: Jossey-Bass.
Darling-Hammond, L., & Youngs, P. (2002). Defining "highly qualified teachers": What does "scientifically-based research" actually tell us? Educational Researcher, 31(9), 13-25.
Diamonti, M. C. (1977). Student teacher supervision. Educational Forum, 41(4), 477-486.
Fetterman, D., Connors, W., Dunlap, K., Brower, G., Matos, T., & Paik, S. (1999). Stanford Teacher Education Program 1997-98 evaluation report. Stanford, CA: Stanford University.
Goodlad, J. I. (1990). Teachers for our nation's schools. San Francisco: Jossey-Bass.
Haertel, E. H. (1991). New forms of teacher assessment. In G. Grant (Ed.), Review of research in education (Vol. 17, pp. 3-29). Washington, DC: American Educational Research Association.
Hammerness, K. (in press). From coherence in theory to coherence in practice. Teachers College Record.
Hammerness, K., & Darling-Hammond, L. (2002). Meeting old challenges and new demands: The redesign of the Stanford Teacher Education Program. Issues in Teacher Education, 11(1), 17-30.
Hammerness, K., Darling-Hammond, L., & Shulman, L. (2002). Toward expert thinking: How curriculum case writing prompts the development of theory-based professional knowledge in student teachers. Teaching Education, 13(2), 221-245.
Howey, K. R., & Zimpher, N. L. (1989). Profiles of preservice teacher education. Albany: State University of New York Press.
Kennedy, M. (1999). The role of preservice teacher education. In L. Darling-Hammond & G. Sykes (Eds.), Teaching as the learning profession: Handbook of policy and practice (pp. 54-85). San Francisco: Jossey-Bass.
Kunzman, R. (2002). Preservice education for experienced teachers: What STEP teaches those who have already taught. Issues in Teacher Education, 11(1), 99-112.
Kunzman, R. (2003). From teacher to student: The value of teacher education for experienced teachers. Journal of Teacher Education, 54, 241-253.
Lotan, R., & Marcus, A. (2002). Standards-based assessment of teacher candidates' performance in clinical practice. Issues in Teacher Education, 11(1), 31-48.
McIntyre, J. D., Byrd, D. M., & Foxx, S. M. (1996). Field and laboratory experiences. In J. Sikula, T. J. Buttery, & E. Guyton (Eds.), Handbook of research on teacher education (2nd ed., pp. 171-193). New York: Macmillan.
National Commission on Teaching and America's Future. (1996). What matters most: Teaching for America's future. New York: Author.
Pecheone, R., & Chung, R. (2006). Evidence in teacher education: The Performance Assessment for California Teachers (PACT). Journal of Teacher Education, 57, 22-36.
Roeser, R. W. (2002). Bringing a "whole adolescent" perspective to secondary teacher education: A case study of the use of an adolescent case study. Teaching Education, 13(2), 155-179.
Shulman, L. S. (1996). Just in case: Reflections on learning from experience. In J. Colbert, P. Desberg, & K. Trimble (Eds.), The case for education: Contemporary approaches for using case methods (pp. 197-217). Boston: Allyn & Bacon.
Shultz, S. (2002). Assessing growth in teacher knowledge. Issues in Teacher Education, 11(1), 49-64.
U.S. Department of Education. (2002). The secretary's report on teacher quality. Washington, DC: Author.
Williams, D. A., Ramanathan, H., Smith, D., Cruz, J., & Lipsett, L. (1997). Problems related to participants' roles and programmatic goals in student teaching. Mid-Western Educational Researcher, 10(4), 2-10.
Wise, A. E. (1996). Building a system of quality assurance for the teaching profession: Moving into the 21st century. Phi Delta Kappan, 78(3), 191-192.

Linda Darling-Hammond is Charles E. Ducommun Professor of Education at Stanford University, where she was faculty sponsor of the Stanford Teacher Education Program from 1998 to 2004. Her research, teaching, and policy interests focus on teaching quality, school reform, and educational equity. She is coeditor of the National Academy of Education's recent volume Preparing Teachers for a Changing World: What Teachers Should Learn and Be Able To Do (Jossey-Bass, 2005) and its companion volume, A Good Teacher in Every Classroom: Preparing the Highly Qualified Teachers Our Children Deserve (Jossey-Bass, 2005).