Proportionality and Growth Measures in School and Teacher Evaluations

Growth models in educational evaluations are a topic of intense debate, with discussions around which models best elicit optimal effort from agents, provide useful performance signals, and avoid exacerbating inequities. The proportional VAM model is proposed as a suitable choice given these key objectives, offering an approach to measuring student test-score growth that does not perpetuate existing disparities.



Presentation Transcript


  1. Selecting Growth Measures for School and Teacher Evaluations: Should Proportionality Matter? Mark Ehlert, Cory Koedel, Eric Parsons, Michael Podgursky. Department of Economics, University of Missouri-Columbia

  2. Motivation Growth models are increasingly being incorporated into district, school, and teacher evaluations across the United States. The question of how to model student test-score growth has resulted in lively policy debates. What are the objectives of the evaluation system?

  3. Summary of Findings We argue that the three key objectives of an evaluation system in education are: (1) elicit optimal effort from agents; (2) provide useful performance signals to educational actors; and (3) avoid exacerbating pre-existing inequities in the labor markets faced by advantaged and disadvantaged schools. Given these objectives, the proper growth model for use in evaluation systems is neither the sparse model nor a traditional VAM model. Instead, it is what we call the proportional VAM model.

  4. A Model Menu The growth-model choice set essentially comes down to these three choices:
  1) The sparse model (e.g., SGPs)
  2) A single-equation VAM (e.g., a standard value-added model from the research literature):
     Y_{isjt} = β_0 + β_1 Y_{isj,t-1} + β_2 Y_{isk,t-1} + β_3 X_{it} + β_4 S_{it} + θ_s + ε_{isjt}
  3) The proportional model (e.g., a two-step fixed-effects model or random-effects model, less common in research):
     Step 1: Y_{isjt} = β_0 + β_1 Y_{isj,t-1} + β_2 Y_{isk,t-1} + β_3 X_{it} + β_4 S_{it} + e_{isjt}
     Step 2: e_{isjt} = u_s + ε_{isjt}
  Here Y_{isjt} is the test score of student i in school s, subject j, and year t; Y_{isk,t-1} is the prior-year score in the other subject k; X_{it} and S_{it} are student and school characteristics; and θ_s and u_s are the school effects.
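To make the contrast concrete, here is a minimal Python sketch (not the authors' code) of how options (2) and (3) could be estimated from a student-level data set; the sparse model (SGPs) is omitted because it relies on a separate family of quantile regressions. The column names (score, lag_score_same, lag_score_other, frl, school_frl_share, school_id) are assumptions for illustration.

```python
# A minimal sketch (not the authors' actual code) of the two VAM-style options
# on the model menu, assuming a student-level DataFrame with hypothetical
# columns: score, lag_score_same, lag_score_other (prior scores in both
# subjects), frl (student FRL status), school_frl_share, and school_id.
import pandas as pd
import statsmodels.formula.api as smf


def one_step_vam(df: pd.DataFrame) -> pd.Series:
    """One-step VAM: school effects estimated jointly with the covariates."""
    fit = smf.ols(
        "score ~ lag_score_same + lag_score_other + frl + school_frl_share"
        " + C(school_id)",
        data=df,
    ).fit()
    # Pull out the school fixed-effect coefficients (relative to the omitted school).
    effects = {
        name.split("[T.")[1].rstrip("]"): coef
        for name, coef in fit.params.items()
        if name.startswith("C(school_id)")
    }
    return pd.Series(effects, name="one_step_effect")


def two_step_vam(df: pd.DataFrame) -> pd.Series:
    """Two-step 'proportional' model: purge student and school characteristics
    first (Step 1), then attribute the residual growth to schools (Step 2)."""
    step1 = smf.ols(
        "score ~ lag_score_same + lag_score_other + frl + school_frl_share",
        data=df,
    ).fit()
    residuals = df.assign(resid=step1.resid)
    return residuals.groupby("school_id")["resid"].mean().rename("two_step_effect")
```

In this sketch the second step simply averages the Step-1 residuals within each school; as the slide notes, a random-effects (shrinkage) version of the same step is another common implementation.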

  5. http://www.leg.state.nv.us/session/76th2011/exhibits/assembly/ed/aed1013c.pdf

  6. Comparing the One-Step and Two-Step VAMs The key difference is that the two-step VAM partials out variation in test scores attributable to student and school characteristics before estimating the school effects. Specific example: suppose that high-poverty schools really are of lower quality (causally). In the one-step VAM, the model identifies poverty effects (F/R lunch) using within-school variation in student poverty status, so it can separately identify differences in school quality between high- and low-poverty schools. In the two-step VAM, the first step attributes any and all systematic performance differences between high- and low-poverty students to the first-step variables (i.e., it purges them from the residuals), including systematic differences in school quality. The implication is that high- and low-poverty schools are only compared to each other in the model output, not to dissimilar schools.
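The consequence of this partialling-out can be seen in a toy simulation (hypothetical data and parameter values, not the Missouri data shown in the next slides): even when high-poverty schools are truly lower quality by construction, the two-step school effects come out roughly uncorrelated with school poverty, while the one-step effects retain a clear negative correlation.

```python
# Hypothetical simulation illustrating the one-step vs. two-step contrast.
# All parameter values are invented for illustration.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_schools, n_students = 200, 50

school_frl = rng.uniform(0, 1, n_schools)                    # school poverty share
quality = -0.5 * school_frl + rng.normal(0, 0.2, n_schools)  # poorer schools worse, by construction

rows = []
for s in range(n_schools):
    frl = rng.binomial(1, school_frl[s], n_students)         # student FRL status
    lag = rng.normal(-0.4 * frl, 1.0)                        # prior achievement
    score = 0.7 * lag - 0.2 * frl + quality[s] + rng.normal(0, 1.0, n_students)
    rows.append(pd.DataFrame({"school_id": s, "school_frl": school_frl[s],
                              "frl": frl, "lag": lag, "score": score}))
df = pd.concat(rows, ignore_index=True)

# One-step VAM: the poverty coefficient is identified within schools, so the
# school fixed effects keep the between-school poverty-quality gradient.
one = smf.ols("score ~ lag + frl + C(school_id)", data=df).fit()
one_fx = pd.Series({int(k.split("[T.")[1].rstrip("]")): v
                    for k, v in one.params.items() if k.startswith("C(school_id)")})

# Two-step model: purge student FRL and the school FRL share first,
# then average the residuals within schools.
step1 = smf.ols("score ~ lag + frl + school_frl", data=df).fit()
two_fx = df.assign(resid=step1.resid).groupby("school_id")["resid"].mean()

poverty = pd.Series(school_frl)
print("corr(one-step effects, poverty):", round(one_fx.corr(poverty), 2))  # expected: negative
print("corr(two-step effects, poverty):", round(two_fx.corr(poverty), 2))  # expected: near zero
```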

  7. Missouri Schools, Median SGPs (r = -.37)

  8. Missouri Schools, one-step fixed effects VAM (r = -.25)

  9. Missouri Schools, two-step fixed effects VAM (r = -.03)

  10. Implications

  Table 1. Correlations in School-Level Estimates Across Models.
                              SGP    One-step fixed effects    Two-step fixed effects
  SGP                        1.00                        --                        --
  One-step fixed effects     0.82                      1.00                        --
  Two-step fixed effects     0.85                      0.84                      1.00

  Table 3. Average Share of Students Eligible for Free/Reduced-Price Lunch in Non-Overlapping Top-Quartile Schools Across Models.
  (Columns give the model outside whose top quartile the schools fall.)
                                Outside: SGP   Outside: One-step FE   Outside: Two-step FE
  Top-Quartile: SGP                       --                   47.7                   32.8
  Top-Quartile: One-step FE             52.4                     --                   29.2
  Top-Quartile: Two-step FE             69.7                   60.5                     --
  Note: See text for a description of non-overlapping top-quartile schools. Sample average free/reduced-price lunch share: 48.2.
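As a rough illustration of how Table 3 can be read, the sketch below (hypothetical helpers, not the authors' code) computes the average FRL share of "non-overlapping" top-quartile schools, i.e., schools ranked in the top quartile by one model but outside it by another. The input series names are assumptions.

```python
# Hypothetical sketch of the "non-overlapping top quartile" comparison behind
# Table 3. Inputs are school-level pandas Series indexed by school ID:
# model estimates (e.g., SGP medians, one-step effects, two-step effects)
# and each school's free/reduced-price lunch share.
import pandas as pd


def top_quartile(estimates: pd.Series) -> set:
    """IDs of schools whose estimate falls in the top 25% for a given model."""
    return set(estimates[estimates >= estimates.quantile(0.75)].index)


def nonoverlap_frl_share(est_a: pd.Series, est_b: pd.Series,
                         frl_share: pd.Series) -> float:
    """Average FRL share of schools in model A's top quartile but outside model B's."""
    only_a = top_quartile(est_a) - top_quartile(est_b)
    return float(frl_share.loc[sorted(only_a)].mean())


# Example usage (names assumed): schools rewarded by the two-step model but
# not by SGPs correspond to the 69.7 entry in Table 3.
# nonoverlap_frl_share(two_step_effects, sgp_medians, school_frl_share)
```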

  11. Objective #1: Elicit optimal educator effort Barlevy and Neal (2012) cover this issue extensively. There is also a large literature in economics, outside of the education-evaluation context, that is very clear on how to design evaluation systems when some competitors are at an inherent disadvantage (e.g., see Schotter and Weigelt (1992), who study this issue in the context of affirmative action policy). A central lesson from these studies is that the right signal must be sent to agents in different circumstances to elicit optimal effort. This signal need not be a direct measure of absolute productivity; instead, it should be an indicator of performance relative to equally-circumstanced peers. This is precisely what the proportional model does (based on observable circumstances).

  12. Objective #1: Elicit optimal educator effort Limitation: there is some evidence that the effort-response margin in education in the United States is weak (Springer et al., 2010; on the other hand, see Fryer et al., 2012).

  13. Objective #2: Provide useful performance signals Conventional wisdom holds that growth-model output doesn't help educational actors improve. Is this really true? Growth-model output can: (1) encourage effective schools (districts/teachers) to continue to refine and augment existing instructional strategies; (2) serve as a point of departure for interventions or overhauls in ineffective schools (districts/teachers); and (3) facilitate productive educator-to-educator learning by pairing low- and high-performing schools (districts/teachers). The signaling value of an evaluation system is particularly important when it is difficult for individual schools (districts/teachers) to assess their own performance, and the performance of others, accurately.

  14. Objective #2: Provide useful performance signals We argue that the most useful performance signals come from the two-step proportional model. This is true even under the maintained assumption that the one-step VAM produces causal estimates. A key reason is that the causal estimates from the one-step VAM do not account for the counterfactual. Example: disadvantaged schools face weaker educator labor markets (Boyd et al., 2005; Jacob, 2007; Koedel et al., 2011; Reininger, 2012). Sparse models provide the least useful performance signals (not controversial; this is acknowledged in the SGP literature).

  15. Example What do we tell Rough Diamond and Gold Leaf? What do we tell other schools about Rough Diamond and Gold Leaf?

  16. Objective #3: Labor-market inequities The labor-market difficulties faced by disadvantaged schools have been well documented (Boyd et al., 2005; Jacob, 2007; Koedel et al., 2011; Reininger, 2012). As stakes become attached to school rankings based on growth models, systems that disproportionately identify poor schools as losers will make positions at these schools even less desirable to prospective educators.

  17. Summary thus far We identify three key objectives of an evaluation system in education: (1) elicit optimal effort from agents; (2) provide useful performance signals to educational actors; and (3) avoid exacerbating pre-existing inequities in the labor markets faced by advantaged and disadvantaged schools. When one considers these key objectives, the proportionality feature of the two-step model makes it preferred on all three.

  18. But what about... The fact remains that schools serving disadvantaged students really do have lower test scores, and lower unconditional growth, than schools serving advantaged students. There seems to be a general concern that this information will be hidden if we construct proportional growth models. Our view is that this concern is largely misguided.

  19. Test-Score Levels

  20. Concluding Remarks Growth models are quickly (very quickly) moving from the research space to the policy space. The policy uses for growth models are not the same as the research uses. Starting with the right question is important: what are the objectives of the evaluation system? Beginning with this question, in our view, leads us to conclude that a proportional growth model is best suited for use in educational evaluation programs for districts, schools, and teachers.
