Formative assessment in mathematics:
opportunities and challenges
 
Dylan Wiliam (@dylanwiliam)
 
Seminar at Teachers College, Columbia University
October 2013
 
A research agenda for formative assessment
 
Definitional issues
Domain-specificity issues
Effectiveness issues
Communication issues
Implementation issues
Adoption issues
 
Definitional issues
 
 
 
The evidence base for formative assessment
 
Fuchs & Fuchs (1986)
Natriello (1987)
Crooks (1988)
Bangert-Drowns, et al. (1991)
Dempster (1991, 1992)
Elshout-Mohr (1994)
Kluger & DeNisi (1996)
Black & Wiliam (1998)
 
Nyquist (2003)
Brookhart (2004)
Allal & Lopez (2005)
Köller (2005)
Brookhart (2007)
Wiliam (2007)
Hattie & Timperley (2007)
Shute (2008)
 
 
Definitions of formative assessment
 
“We use the general term assessment to refer to all those activities
undertaken by teachers—and by their students in assessing
themselves—that provide information to be used as feedback to
modify teaching and learning activities. Such assessment becomes
formative assessment when the evidence is actually used to adapt the
teaching to meet student needs” (Black & Wiliam, 1998, p. 140)
 
“the process used by teachers and students to recognise and respond
to student learning in order to enhance that learning, during the
learning” (Cowie & Bell, 1999 p. 32)
 
“assessment carried out during the instructional process for the
purpose of improving teaching or learning” (Shepard et al., 2005 p.
275)
 
 
“Formative assessment refers to frequent, interactive
assessments of students’ progress and understanding to identify
learning needs and adjust teaching appropriately” (Looney,
2005, p. 21)
 
“A formative assessment is a tool that teachers use to measure
student grasp of specific topics and skills they are teaching. It’s a
‘midstream’ tool to identify specific student misconceptions and
mistakes while the material is being taught” (Kahl, 2005 p. 11)
 
 
“Assessment for Learning is the process of seeking and interpreting
evidence for use by learners and their teachers to decide where the
learners are in their learning, where they need to go and how best to
get there”
 
(Assessment Reform Group, 2002, pp. 2-3)
 
“Assessment for learning is any assessment for which the first priority
in its design and practice is to serve the purpose of promoting
students’ learning. It thus differs from assessment designed primarily
to serve the purposes of accountability, or of ranking, or of certifying
competence. An assessment activity can help learning if it provides
information that teachers and their students can use as feedback in
assessing themselves and one another and in modifying the teaching
and learning activities in which they are engaged. Such assessment
becomes “formative assessment” when the evidence is actually used
to adapt the teaching work to meet learning needs.” (Black, Harrison,
Lee, Marshall & Wiliam, 2004 p. 10)
Theoretical questions
 
Need for clear definitions
So that research outcomes are commensurable
Theorization and definition
Possible variables
Category (instruments, outcomes, functions)
Beneficiaries (teachers, learners)
Timescale (months, weeks, days, hours, minutes)
Consequences (outcomes, instruction, decisions)
Theory of action (what gets formed?)
 
 
Formative assessment: a new definition
 
“An assessment functions formatively to the extent that
evidence about student achievement elicited by the
assessment is interpreted and used, by teachers,
learners, or their peers, to make decisions about the next
steps in instruction that are likely to be better, or better
founded, than the decisions that would have been taken
in the absence of that evidence.”
 
Unpacking formative assessment

Three agents (teacher, peer, learner) act on three processes (where the
learner is going, where the learner is, how to get there), giving five
strategies:

Clarifying, sharing and understanding learning intentions
Engineering effective discussions, tasks, and activities that elicit
evidence of learning
Providing feedback that moves learners forward
Activating students as learning resources for one another
Activating students as owners of their own learning
 
Definitional issues: potential research
 
How can formative assessment be
defined and what are the
consequences of different definitions,
for psychometrics, for communication,
and for adoption?
 
Domain-specificity issues
 
 
Pedagogy and didactics
 
Some aspects of formative assessment are generic
Some aspects of formative assessment are
domain-specific
There is a continuing debate about what aspects of
formative assessment are generic (pedagogy) and
which are domain-specific (didactics)
 
Clarifying, sharing and
understanding learning intentions
 
 
 
A standard middle school math problem…
 
Two farmers have adjoining fields with a common boundary that is not
straight. This is inconvenient for plowing. How can they divide the two
fields so that the boundary is straight, but the two fields have the
same area as they had before?
 
 
How many rectangles?

With m vertical and n horizontal lines, each rectangle is determined by
choosing two lines of each kind, giving m(m-1)/2 × n(n-1)/2 rectangles.
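A quick enumeration check of this count (my own sketch, not from the slides; the grid sizes are arbitrary):

```python
from itertools import combinations

def count_rectangles(m, n):
    """Enumerate the rectangles determined by m vertical and n horizontal
    lines: each pair of vertical lines combined with each pair of
    horizontal lines bounds exactly one rectangle."""
    return sum(1
               for _ in combinations(range(m), 2)
               for _ in combinations(range(n), 2))

# 5 vertical and 4 horizontal lines: (5*4/2) * (4*3/2) = 10 * 6 = 60
print(count_rectangles(5, 4))  # 60
```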
 
Engineering effective discussions,
activities, and classroom tasks that elicit
evidence of learning
 
Questioning in math: Diagnosis
 
In which of these right-angled triangles is a² + b² = c²?

[Six right-angled triangles, labeled A to F, with the sides labeled a, b,
and c in different positions]
 
Diagnostic item: medians
 
What is the median for the following data set?
 
38      74      22      44      96      22      19      53
 
a. 22
b. 38 and 44
c. 41
d. 46
e. 70
f. 77
g. This data set has no median
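A quick check of the keyed answer (not part of the original slide); Python's statistics.median averages the two middle values of an even-length list:

```python
import statistics

data = [38, 74, 22, 44, 96, 22, 19, 53]
# sorted: 19 22 22 38 44 53 74 96 -> the middle pair is 38 and 44
print(statistics.median(data))  # 41.0, option (c)
```

The distractors plausibly correspond to common errors: 22 is the mode, 38 and 44 stop short of averaging the middle pair, 46 is the mean, 70 is the average of the middle pair of the unsorted list, and 77 is the range.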
 
Diagnostic item: means
 
What can you say about the means of the following
two data sets?
Set 1:  10   12   13   15
Set 2:  10   12   13   15   0

A. The two sets have the same mean.
B. The two sets have different means.
C. It depends on whether you choose to count the zero.
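A worked check (not on the original slide): Set 1 has mean (10 + 12 + 13 + 15) / 4 = 12.5, while Set 2 has mean (10 + 12 + 13 + 15 + 0) / 5 = 10. The sets therefore have different means (option B): the zero is a data value, not an absence of data.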
 
Providing feedback that moves
learners forward
 
Getting feedback right is hard

Response type     Performance falls short of goal   Performance exceeds goal
Change behavior   Increase effort                   Exert less effort
Change goal       Reduce aspiration                 Increase aspiration
Abandon goal      Decide goal is too hard           Decide goal is too easy
Reject feedback   Feedback is ignored               Feedback is ignored
 
Activating students:
as learning resources for one another
as owners of their own learning
 
+/–/interesting: responses for “+”
 
 
I got that ball-park estimates are supposed to be simple
I know that you have to look at it and say “OK”
I know that when I am adding the number I end up with must
be bigger than the one I started at
I get most of the problems
It was easy for me because on the first one it says 328 so I
took the 2 and made it a 12
I know that we would have to regroup
I know how to do plus and minus because we have been
doing it for a long time
I get it when you cross out a number and make it a new one
I know that when you can’t – from both colomes you go to
the third colome and take that from it
I know that when my answer is right the ball park
estimate is close to it

+/–/interesting: responses for “–”
 
I am still a tiny bit confused about subtraction regrouping
I am a little bit confused about ball park estimates
I get confused because sometimes I don’t get the problem
I am confused when you subtract really big numbers like
1,000 something
I’m still a little bit confused about regrouping
Minus is confusing when you have to regroup twice
Minus is a little bit hard when you have to regroup
I don’t understand when you borrow which colome you
borrow from when both are 0
I am still confused about showing what I did to solve the
problem
I am a little confused about when you need to subtract

+/–/interesting: responses for “interesting”
 
Carrying the number over to the next number
It’s interesting how some people go to the nearest hundred
while some go to the nearest ten
It’s interesting how some have to regroup twice
It’s pretty interesting about how you have to work really hard
I am interested in borrowing because I didn’t just get it yet. I
want to really get to know it
I find it weird that you could just keep going from colome to
colome when you need to borrow
On the ball park estimate it is easy but sometimes hard
I really think that regrouping is pretty amazing
It is cool how addition and subtraction regrouping is just
moving numbers and you could get it right easily
Domain-specificity issues: potential research
 
How much domain-specific knowledge does a
teacher need in order to be able to implement
high-quality formative assessment routines
consistently?
Can domain-specific formative assessment tools be
independent of a particular curriculum?
 
The effectiveness issue
 
 
Effects of formative assessment
 
Standardized effect size: differences in means, measured
in population standard deviations

Source                       Effect size
Kluger & DeNisi (1996)       0.41
Black & Wiliam (1998)        0.4 to 0.7
Wiliam et al. (2004)         0.32
Hattie & Timperley (2007)    0.96
Shute (2008)                 0.4 to 0.8
 
Understanding meta-analysis:
“I think you’ll find it’s a bit more
complicated than that” (Goldacre, 2008)
 
 
 
Understanding meta-analysis
 
 
A technique for aggregating results from different
studies by converting empirical results to a
common measure (usually effect size)
Standardized effect size is defined as:

d = (mean of experimental group - mean of control group) / (population standard deviation)
Problems with meta-analysis
The “file drawer” problem
Variation in population variability
Selection of studies
Sensitivity of outcome measures
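To make the aggregation step concrete before taking up these problems, here is a minimal fixed-effect (inverse-variance weighted) pooling sketch in Python; the study effect sizes and variances are hypothetical:

```python
import numpy as np

def pool_fixed_effect(effects, variances):
    """Fixed-effect meta-analysis: weight each study's effect size by the
    inverse of its variance, then take the weighted mean."""
    effects = np.asarray(effects, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    pooled = np.sum(weights * effects) / np.sum(weights)
    pooled_se = np.sqrt(1.0 / np.sum(weights))  # standard error of pooled estimate
    return pooled, pooled_se

# hypothetical studies: effect sizes d and their variances
pooled, se = pool_fixed_effect([0.41, 0.32, 0.70], [0.01, 0.02, 0.05])
print(f"pooled d = {pooled:.2f}, 95% CI half-width = {1.96 * se:.2f}")
# pooled d = 0.42, 95% CI half-width = 0.15
```

Each of the four problems listed above attacks an assumption of this calculation: that the retained studies are an unbiased sample, and that their effect sizes sit on a genuinely common scale.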
 
 
 
 
The “file drawer” problem
 
 
The importance of statistical power
 
The statistical power of an experiment is the
probability that the experiment will yield an effect that
is large enough to be statistically significant.
In single-level designs, power depends on
significance level set
magnitude of effect
size of experiment
The power of most social science experiments is low
Psychology: 0.4 (Sedlmeier & Gigerenzer, 1989)
Neuroscience: 0.2 (Button et al., 2013)
Education: 0.4
Only lucky experiments get published…
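The low-power claim can be made concrete with a normal-approximation power calculation for a two-arm study (my own sketch; the sample size is hypothetical):

```python
from scipy.stats import norm

def power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test for a
    standardized mean difference d, with n participants per group."""
    z_crit = norm.ppf(1 - alpha / 2)      # critical value for the two-sided test
    ncp = d * (n_per_group / 2) ** 0.5    # noncentrality on the z scale
    return norm.cdf(ncp - z_crit) + norm.cdf(-ncp - z_crit)

# a modest education experiment: true effect d = 0.4, 25 students per arm
print(round(power_two_sample(0.4, 25), 2))  # about 0.29
```

With power around 0.3, most real effects go undetected, and the significant results that do get published over-represent lucky draws, which inflates meta-analytic averages.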
 
Variation in variability
 
Annual growth in achievement, by age (Bloom, Hill, Black, and Lipsey, 2008)

[Figure: annual growth in achievement, in standard deviations, plotted
against age from 5 to 16; growth falls from roughly 1.5 SD per year at
age six to roughly 0.2 SD per year by age 15]

A 50% increase in the rate of learning for six-year-olds is equivalent
to an effect size of 0.76; a 50% increase for 15-year-olds is equivalent
to an effect size of 0.1 (half of the annual growth at each age).
 
Variation in variability
 
 
Studies with younger children will produce larger
effect size estimates
Studies with restricted populations (e.g., children
with special needs, gifted students) will produce
larger effect size estimates
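A minimal numeric illustration of the restricted-population point (the numbers are invented): the same raw gain yields a much larger standardized effect when divided by the smaller standard deviation of a restricted group.

```python
raw_gain = 5.0         # hypothetical raw-score gain from an intervention

sd_full = 15.0         # score SD in an unrestricted population
sd_restricted = 7.0    # score SD in a restricted group (e.g., gifted students)

print(raw_gain / sd_full)        # 0.33: effect size against the full population
print(raw_gain / sd_restricted)  # 0.71: same gain, roughly double the effect size
```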
 
Selection of studies
 
 
Feedback in STEM subjects
 
Review of 9,000 papers on feedback in mathematics, science and
technology; only 238 papers were retained:

Background papers      24
Descriptive papers     79
Qualitative papers     24
Quantitative papers   111
  Mathematics          60
  Science              35
  Technology           16
 
Ruiz-Primo and Li (2013)
 
Classification of feedback studies

1. Who provided the feedback (teacher, peer, self, or technology-based)?
2. How was the feedback delivered (individual, small group, or whole class)?
3. What was the role of the student in the feedback (provider or receiver)?
4. What was the focus of the feedback (e.g., product, process, self-regulation for cognitive feedback; or goal orientation, self-efficacy for affective feedback)?
5. On what was the feedback based (student product or process)?
6. What type of feedback was provided (evaluative, descriptive, or holistic)?
7. How was the feedback provided or presented (written, oral, or video)?
8. What was the referent of the feedback (self, others, or mastery criteria)?
9. How, and how often, was feedback given in the study (one time or multiple times; with or without pedagogical use)?
 
Main findings

Characteristic of studies included                    Maths   Science
Feedback treatment is a single event lasting minutes   85%      72%
Reliability of outcome measures                        39%      63%
Validity of outcome measures                           24%       3%
Dealing only or mainly with declarative knowledge      12%      36%
Schematic knowledge (e.g., knowing why)                 9%       0%
Multiple feedback events in a week                     14%      17%
 
Sensitivity to instruction
 
 
Sensitivity of outcome measures
 
 
Distance of assessment from the curriculum
Immediate
e.g., science journals, notebooks, and classroom tests
Close
 e.g., where an immediate assessment asked about number of
pendulum swings in 15 seconds, a close assessment asks about the
time taken for 10 swings
Proximal
e.g., if an immediate assessment asked students to construct boats
out of paper cups, the proximal assessment would ask for an
explanation of what makes bottles float
Distal
e.g., where the assessment task is sampled from a different domain
and where the problem, procedures, materials and measurement
methods differed from those used in the original activities
Remote
standardized national achievement tests.
 
Ruiz-Primo, Shavelson, Hamilton, and Klein (2002)
 
Impact of sensitivity to instruction
 
[Chart: effect sizes obtained with close and proximal assessments]
 
Effectiveness issues: potential research
 
Under what kind of conditions does
the implementation of formative
assessment practices in classrooms
lead to student improvement?
What kinds of increases in the rate of
student learning are possible?
 
Communication issues
 
 
Dissemination models
 
Gas-pump attendant
FedEx
IKEA
Sherpa
Gardener
PhD supervisor
 
 
So much for the easy bit…
 
[Diagram linking ideas, theorization, products, evidence of impact,
and advocacy]
 
Communication issues: potential research
 
How can the vision of effective formative
assessment practice be communicated to
teachers?
 
Implementation issues
 
Hand hygiene in hospitals
Pittet (2001)
 
Implementation issues: potential research
 
What are the practical obstacles to the
introduction of formative assessment
practices, and how can they be
overcome?
What kinds of tools and supports can
be provided for teachers, and what
needs to be developed locally?
 
Adoption issues
 
The story so far…
 
1993-1998: Review of research on formative assessment
1998-2003: Face-to-face implementations with groups of teachers
2003-2008: Attempts to produce faithful implementations at scale
2008-2013: Creating the conditions for implementations at scale
 
Adoption issues: potential research
 
How can we support leaders in
prioritizing changes that make the
most difference to student outcomes?
 
Comments? Questions?
 
www.dylanwiliam.net