Understanding Large-Scale Achievement Surveys and Error Analysis

Slide Note
Embed
Share

Large-scale achievement surveys play a crucial role in education research, utilizing complex survey designs and estimation methods. Error analysis, including measurement error and sampling errors, is essential in obtaining accurate point and interval estimates. Various tools and techniques like PISATOOLS and PIAACTOOLS are used to address these errors effectively.


Uploaded on Oct 05, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. Analyzing large-scale achievement surveys in Stata using PISATOOLS and PIAACTOOLS Dr Maciej Jakubowski Evidence Institute and Warsaw University November 2017

  2. What are large-scale achievement surveys? Complex survey design(s) Estimation without plausible values Point estimates Interval estimates with replicate weights Estimation with plausible values Point estimates Estimating sampling and measurement errors PISATOOLS PIAACTOOLS Agenda for today

  3. 2000s: PIRLS 2001 PIRLS 2006 TIMSS 2003 TIMSS 2007 TALIS 2008 before 1990: TED S-M ICCS 2009 FIMS 1964 PISA 2000 PISA 2003 FISS 1970 SIMS 1980 PISA 2006 PISA 2009 SISS 1983 1990s: 2010+ Reading L TALIS 2013 TALIS 2018 TIMSS 1995 ESLC 2012 ICILS 2013 ICILS 2018 IALS TIMSS 2011 TIMSS 2015 CIVIC PIRLS 2011 PIRLS 2016 PISA 2012 PISA 2015 PISA 2018

  4. Surveytechnical reports Data guides (TIMSS, PIRLS) Data analysis manual (PISA last version published in 2009) SVY documentationin Stata Where to find information?

  5. Measurement error Model-related errors Sampling schools and classrooms different probability of sampling a single school/classroom Sources of error Sampling students different probability of sampling a student (relatedmainly to school size) Non-response adjustments For trends: linking error

  6. The most important errors are: measurement error sampling errors Plausible values reflect measurement error Survey weight (main weight) to obtain unbiased point estimates for population Replicate weights to derive confidence intervals (interval estimates) reflecting sampling and non-response errors How to account for these errors?

  7. Stratum Urban Rural PSU Survey weights School1 School2 SchoolA SchoolB SchoolC Student1 Student2 Student1 Student2 Student1 Student2 Students

  8. Jackknife, BRR, bootstrap: re-sampling PSU units In Jackknife and BRR units are dropped by design and not randomly like in bootstrap PISA or PIAAC datasets contain sets of replicate weights BRR for PISA two different jackknife methods for PIAAC These weights usually contain additional information (often confidential), e.g. strata, non-response Replicate weights in Stata Easy to use by specifying svyset but Sometimes unclear how to specify svyset Some commands do not work with all replicate methods, e.g. qreg does not allow BRR

  9. . * open dataset . use int_stu09_jan27 if cnt=="POL".dta . . * create dummy for females How to do it in Stata? . recode st04q01 (2=0) (1=1), gen(female) (2443 differences between st04q01 and female) . . * create average SES for each school Example: regression with without plausible values . egen mean_escs=mean(escs), by(schoolid) . . sum joyread female escs mean_escs Variable Obs Mean Std. Dev. Min Max joyread 4,806 .0441885 1.096009 -3.2265 3.495 female 4,917 .5031523 .5000409 0 1 escs 4,870 -.2206496 .9144548 -3.1991 2.9589 mean_escs 4,917 -.2204822 .5147261 -1.258452 1.575866

  10. . * regression without weights - just sample estimates . reg joyread female escs mean_escs Source SS df MS Number of obs = 4,773 F(3, 4769) = 320.66 Model 960.2848 3 320.094933 Prob > F = 0.0000 Residual 4760.55098 4,769 .998228345 R-squared = 0.1679 Adj R-squared = 0.1673 Total 5720.83578 4,772 1.19883399 Root MSE = .99911 joyread Coef. Std. Err. t P>|t| [95% Conf. Interval] female .7517176 .0289555 25.96 0.000 .6949515 .8084837 escs .2443683 .0191595 12.75 0.000 .2068068 .2819297 mean_escs .0891822 .0339173 2.63 0.009 .0226887 .1556757 _cons -.2623389 .0216028 -12.14 0.000 -.3046904 -.2199874 . est store sample_reg

  11. . * regression with survey weights - population estimates . reg joyread female escs mean_escs [aw=w_fstuwt] (sum of wgt is 4.3541e+05) Source SS df MS Number of obs = 4,773 F(3, 4769) = 316.23 Model 929.986734 3 309.995578 Prob > F = 0.0000 Residual 4675.02843 4,769 .98029533 R-squared = 0.1659 Adj R-squared = 0.1654 Total 5605.01516 4,772 1.17456311 Root MSE = .9901 joyread Coef. Std. Err. t P>|t| [95% Conf. Interval] female .7654561 .0286986 26.67 0.000 .7091936 .8217186 escs .2442361 .0187771 13.01 0.000 .2074243 .2810479 mean_escs .0466393 .0372987 1.25 0.211 -.0264833 .1197619 _cons -.2862346 .0223631 -12.80 0.000 -.3300766 -.2423927 . est store aw_reg

  12. . * accounting for complex survey design with cluster S.E. . * ignores additional information on survey design and non-response . reg joyread female escs mean_escs [pw=w_fstuwt], vce(cluster schoolid) (sum of wgt is 4.3541e+05) Linear regression Number of obs = 4,773 F(3, 184) = 233.94 Prob > F = 0.0000 R-squared = 0.1659 Root MSE = .9901 (Std. Err. adjusted for 185 clusters in schoolid) Robust joyread Coef. Std. Err. t P>|t| [95% Conf. Interval] female .7654561 .0324301 23.60 0.000 .7014735 .8294387 escs .2442361 .0199941 12.22 0.000 .2047889 .2836833 mean_escs .0466393 .0486591 0.96 0.339 -.0493621 .1426408 _cons -.2862346 .0240522 -11.90 0.000 -.3336881 -.2387811 . est store pwcluster_reg

  13. . * proper BRR regression . * we tell Stata what is our survey design . * PISA uses BRR replicate weights with Fay correction . * every estimate is calculated 81 times so it takes time... . . svyset schoolid [pw=w_fstuwt], brrweight(w_fstr1-w_fstr80) vce(brr) fay(0.5) mse pweight: w_fstuwt VCE: brr MSE: on brrweight: w_fstr1 w_fstr2 w_fstr3 w_fstr4 w_fstr5 w_fstr6 w_fstr7 w_fstr8 w_fstr9 w_fstr10 w_fstr11 w_fstr12 w_fstr13 w_fstr14 w_fstr15 w_fstr16 w_fstr17 w_fstr18 w_fstr19 w_fstr20 w_fstr21 w_fstr22 w_fstr23 w_fstr24 w_fstr25 w_fstr26 w_fstr27 w_fstr28 w_fstr29 w_fstr30 w_fstr31 w_fstr32 w_fstr33 w_fstr34 w_fstr35 w_fstr36 w_fstr37 w_fstr38 w_fstr39 w_fstr40 w_fstr41 w_fstr42 w_fstr43 w_fstr44 w_fstr45 w_fstr46 w_fstr47 w_fstr48 w_fstr49 w_fstr50 w_fstr51 w_fstr52 w_fstr53 w_fstr54 w_fstr55 w_fstr56 w_fstr57 w_fstr58 w_fstr59 w_fstr60 w_fstr61 w_fstr62 w_fstr63 w_fstr64 w_fstr65 w_fstr66 w_fstr67 w_fstr68 w_fstr69 w_fstr70 w_fstr71 w_fstr72 w_fstr73 w_fstr74 w_fstr75 w_fstr76 w_fstr77 w_fstr78 w_fstr79 w_fstr80 fay: .5 Single unit: missing Strata 1: <one> SU 1: schoolid FPC 1: <zero>

  14. . svy: reg joyread female escs mean_escs (running regress on estimation sample) BRR replications (80) 1 2 3 4 5 .................................................. 50 .............................. Survey: Linear regression Number of obs = 4,773 Population size = 435,410.59 Replications = 80 Design df = 79 F( 3, 77) = 221.93 Prob > F = 0.0000 R-squared = 0.1659 BRR * joyread Coef. Std. Err. t P>|t| [95% Conf. Interval] female .7654561 .0340876 22.46 0.000 .6976063 .8333058 escs .2442361 .0201219 12.14 0.000 .2041845 .2842878 mean_escs .0466393 .0444229 1.05 0.297 -.0417823 .135061 _cons -.2862346 .0231624 -12.36 0.000 -.3323383 -.240131 . est store svybrr_reg

  15. . * comparing results of all regressions . est tab *, b(%7.5f) se(%7.5f) stats(N r2) modelwidth(13) Variable sample_reg aw_reg pwcluster_reg svybrr_reg female 0.75172 0.76546 0.76546 0.76546 0.02896 0.02870 0.03243 0.03409 escs 0.24437 0.24424 0.24424 0.24424 0.01916 0.01878 0.01999 0.02012 mean_escs 0.08918 0.04664 0.04664 0.04664 0.03392 0.03730 0.04866 0.04442 _cons -0.26234 -0.28623 -0.28623 -0.28623 0.02160 0.02236 0.02405 0.02316 N 4773 4773 4773 4773 r2 0.16786 0.16592 0.16592 0.16592 legend: b/se

  16. Plausible values are draws from posterior distribution of student latent achievement Usually 5, 10 or more plausible values are estimated Estimation with plausible values With each plausible value we can obtain unbiased estimates of student achievement Using one plausible values works well in initial analysis or for graphs However, only with five plausible values one can estimate measurement error

  17. .005 .004 .003 .002 .001 0 200 400 600 800 x 1st PV 2nd PV 3rd PV 4th PV 5th PV

  18. Point estimates: average of plausible value estimates Interval estimates obtained using Rubin s formula for multiple imputation (Rubin, 1987; Allison, 2000) Plausible values NEVER use average of plausible values as your variable

  19. Regression with plausible values point estimates Regression with plausible values using PISAREG Estimation algorithm with five plausible values: 1. Estimate your regression model with each plausible value and BRR replicate weights 2. Calculate regression coefficients by taking average of five coefficients Example in Stata 3. Your sampling variance is the average sampling variance from these regressions 4. Your measurement error is the variation of single plausible value regression coefficients around their average (point estimate). 5. Calculate S.E. using Rubin s formula It means you have to estimate each regression model with 405 regressions (5*(80+1))

  20. use int_stu09_jan27.dta if oecd==1, clear svyset schoolid [pw=w_fstuwt], brrweight(w_fstr1- w_fstr80) vce(brr) fay(0.5) mse Using forvalues loop to get a single coefficient recode st04q01 (2=0) (1=1), gen(female) local b=0 forvalues i=1(1)5 { svy: reg pv`i'read joyread female if cnt=="POL" local b=`b'+_b[joyread] } display "joyread coefficient: " %9.5f `b'/5

  21. pisareg depvar [indepvars] [if] [in] [,options] As depvar you can use math , scie , read and pisareg will know to use plausible values You can also use proflevel You should specify: cnt(string) save(filename, ...) pisareg example You can specify pvindep*(string). over(var) round(int) cycle(int) fast cons r2() pisareg read joyread female, cnt(OECD) cycle(2009) save(example_regOECD)

  22. Variable Country Australia Austria Belgium Canada Chile Czech Republic Denmark Estonia Finland France Germany Greece Hungary Iceland Ireland Israel Italy Japan Korea Luxembourg Mexico Netherlands New Zealand Norway Poland Portugal Slovak Republic Slovenia Spain Sweden Switzerland Turkey United Kingdom United States OECD Average joyread female r2 Coef. 43.75 35.42 40.27 34.94 27.5 42.09 42.06 39.68 39.04 45.05 35.35 42.22 42.68 40.65 42.83 27.05 36.64 33.81 37.93 38.16 19.05 38.58 45.65 38.74 31.21 32.55 34.08 33.29 37.29 43.95 36.39 17.02 44.66 38.15 36.99 S.E. 1.12 1.56 1.29 0.85 1.6 1.73 1.51 1.92 1.24 2.35 1.38 2.15 2.03 1.46 1.57 1.93 1.01 1.71 2.1 1.51 1.15 2.07 1.63 1.5 1.44 1.69 2.22 1.46 1.1 1.7 1.32 2.17 1.53 1.98 0.28 Coef. 8.65 12.24 5.02 4.67 9.29 19.46 7.16 15.57 20.36 17.99 7.88 21.51 13.35 18.6 20.45 20.41 19.63 25.18 24.37 11.57 17.87 -0.55 16.48 22.41 24.81 14.18 32.69 31.47 7.69 14.67 9.04 31.62 2.52 1.32 15.58 S.E. 2.88 4.95 3.99 1.87 4.1 3.92 2.79 2.8 2.5 3.51 3.6 3.69 3.25 2.87 4.27 4.98 2.58 5.89 5.01 2.88 1.62 2.65 4.03 2.66 2.65 2.51 3.4 2.37 2.23 2.87 2.53 3.86 4.08 3.6 0.6 0.26 0.2 0.17 0.2 0.09 0.22 0.22 0.21 0.28 0.21 0.21 0.18 0.21 0.23 0.25 0.09 0.17 0.17 0.2 0.18 0.05 0.17 0.23 0.24 0.2 0.15 0.17 0.2 0.18 0.22 0.23 0.1 0.22 0.17 0.19

  23. https://www.evidenceinstitute.pl/skorzystaj-z-danych/ https://www.evidenceinstitute.eu/pisa-data-and-tools/ pisastats for basic statistics pisareg for linear regression Other commands in the PISATOOLS package pisaqreg for quantiles regression pisacmd for different regression and estimation commands pisadeco and pisaoaxaca for decomposition analysis Output saved as HTML tables and in matrices Check also: pv repest

  24. ssc install piaactools piaacdes descriptive statistics including plausible values PIAACTOOLS piaacreg different regression models piaactab tabulation with proficiency levels

  25. Example: Gender distribution by proficiency levels recode pvlit1 (.=.) (0/175.9999=0) /// (176/225.9999=1) (226/275.9999=2) /// (276/325.9999=3) (326/375.9999=4) /// (376/999=5), gen(proflevel1) tabstat male, by(proflevel) piaacdes male, over(pvlit) save(test) Example: Regression with plausible values as an independent variable. piaacreg readytolearn gender_r, /// pvindep1(pvnum) round(5) cons save(example3) mat list r(b) mat list r(se) Examples PIAAC data Example 4. Logistic regression with plausible values as an independent variable. recode computerexperience (1=1) (2=0), /// gen(compexp) piaacreg compexp readytolearn gender_r, /// pvindep1(pvnum) cmd("logit") save(example4)

  26. mj@evidenceinstitute.pl www.facebook.com/EvidenceInstitutePL @JakubowskiEvid www.evidenceinstitute.pl Zapraszamy do kontaktu!

More Related Content