Gender Wage Gap Among Those Born in 1958: A Matching Estimator Approach


Examining the gender wage gap among individuals born in 1958 using a matching estimator approach reveals significant patterns over the life course. The study explores drawbacks in parametric estimation, the impact of conditioning on various variables, and contrasts with existing literature findings, shedding light on the importance of pre-labor market traits in wage formation. Results show a raw gap trend, regression-adjusted comparisons, and the value of matching estimators in addressing endogeneity concerns in wage differentials.


Uploaded on Sep 22, 2024



Presentation Transcript


  1. The Gender Wage Gap Among Those Born in 1958: A Matching Estimator Approach. Alex Bryson, UCL. IAB Symposium, November 23rd 2023, Nuremberg (ESRC Grant No. ES/S012583/1)

  2. Project Overview: Part of an ESRC-funded project examining the gender wage gap (GWG) over the life course using birth cohort data. The UCL team: Alex Bryson (PI); Heather Joshi (co-investigator); David Wilkinson (co-investigator); Francesca Foliano (Research Fellow); Bozena Wielgoszewska (Research Fellow). All information on the project can be found here: https://www.ucl.ac.uk/ioe/departments-and-centres/departments/social-science/gender-wage-gap-evidence-cohort-studies

  3. Motivation: 1. Drawbacks in parametric estimation of the gender wage gap (GWG): failure to compare like men and women. 2. It is common to condition on potentially endogenous variables, which biases estimates of the GWG. 3. Data from the National Child Development Study (NCDS) provide a good basis for tackling these issues: men and women can be matched on a rich set of variables that are liable to affect wage formation over the life cycle and might conceivably differ by gender; these are measured before labour market entry and are thus less liable to be endogenous with respect to wage formation; data at birth and at ages 7, 11 and 16 were collected prospectively.

  4. Preview of Results: 1. Large raw GWG, rising until cohort members are in their 40s, then falling but remaining sizeable to age 63. 2. The regression-adjusted GWG is similar to, or larger than, the raw gap when conditioning on pre-labour market variables. 3. This contrasts with findings in the literature, in which the regression-adjusted GWG is usually much smaller than the raw gap. 4. This is the case whether we use matching or linear estimation techniques. 5. The GWG estimated by propensity score matching (PSM) is above the raw gap when cohort members are in their 40s, 50s and 60s. 6. The implication is that women have pre-labour market traits which increase their earnings later in life relative to men: better maths and reading scores and fewer behavioural problems, but also very different occupational expectations.

  5. Previous Literature: 1. Studies indicate an inverted U-shape in the GWG over the life course: small in early years, widening in the 30s/40s, narrowing thereafter. 2. The GWG falls across cohorts. 3. The raw gap tends to close by (roughly) one half when conditioning on other variables, depending somewhat on the data set and conditioning variables. 4. The literature frequently treats education and fertility decisions as exogenous when, in fact, they might be endogenous, so conditioning on them partials out some of the GWG; the same could be said of job traits. 5. Some exceptions use structural estimation in an effort to tackle endogenous decision-making: Adda, Dustmann and Stevens (2017), The Career Costs of Children, Journal of Political Economy.

  6. Value of Matching Estimators: 1. Linear estimation (and decompositions based on it) relies on assumptions regarding functional form. 2. If common support is ignored, one might be comparing wages of women and men who are not reasonable comparators. 3. Matching may make a substantive difference to the estimation of the GWG (Ñopo, 2008): Strittmatter and Wunsch (2021) explain more of the GWG when it is estimated with PSM; there is a substantial common support issue in their data; they combine exact matching on key wage determinants with PSM (radius) matching.

  7. PSM v OLS: 1. Both assume that relevant differences between treated and non-treated are captured by their observed data (conditional independence assumption): this is violated if the analysis does not incorporate all factors affecting participation and the outcome of interest; the assumption is not testable. 2. Advantages of PSM relative to OLS: it is semi-parametric, so does not require the assumption of linearity in the outcome equation; the individual causal effect is completely unrestricted, so heterogeneous treatment effects can be captured (no assumption of constant additive effects); it highlights the problem of common support (CS), since women must have like counterparts in the male population. Thus it avoids extrapolating beyond CS, but there are implications if many treated individuals remain unmatched.
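The identifying assumptions on this slide can be written out as below. This is a standard textbook formulation, not notation taken from the slides: $D = 1$ denotes a woman, $Y(0)$ the wage she would receive if paid as a comparable man, $X$ the pre-labour market covariates and $p(X)$ the propensity score. All of these symbols are assumptions introduced here for illustration.

```latex
\begin{align*}
\text{Conditional independence:}\quad & Y(0) \perp D \mid X \\
\text{Common support:}\quad & p(X) = \Pr(D = 1 \mid X) < 1 \\
\text{ATT:}\quad & \tau_{ATT} = E[\,Y(1) \mid D = 1\,]
  - E\big[\, E[\,Y(0) \mid D = 0,\ p(X)\,] \;\big|\; D = 1 \,\big]
\end{align*}
```

Under these two conditions the second term can be estimated from men with propensity scores close to those of each woman, which is why unmatched women (those with no close male counterpart) fall outside the estimate.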

  8. Data and Methods: 1. National Child Development Study (NCDS). 2. Log hourly wages at ages 23, 33, 42, 50, 55, 61 and 63: deflated to January 2000 prices; matching rerun for each wage outcome. 3. Propensity score matching (PSM) used to match women to men on a single index (the propensity score) derived from a probit with a (0,1) outcome equal to 1 if woman. 4. Using pre-labour market covariates from mother, cohort member and teacher: parental background; pregnancy/birth; ages 7, 11, 16. 5. Theory driven as opposed to data driven (machine learning, ML): some advantages to ML (Bonaccolto-Töpfer and Briel, 2022). 6. Plausibility of the conditional independence assumption in this case. 7. 5 nearest neighbours (Froelich) to recover the average treatment effect on the treated (ATT): enforces common support with a 0.005 caliper; bootstrapping (50 reps); compare with OLS, OLS with match weights, and OLS with entropy weights (Hainmueller, 2012).
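The matching step in point 7 can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes propensity scores have already been estimated (the probit step is omitted), matches with replacement, and uses a hypothetical function name and made-up toy data. The defaults of 5 neighbours and a 0.005 caliper follow the slide.

```python
from statistics import mean


def att_nearest_neighbours(treated, controls, k=5, caliper=0.005):
    """Estimate the ATT by k-nearest-neighbour matching on the propensity score.

    treated, controls: lists of (propensity_score, outcome) tuples.
    Matching is with replacement. Treated units with no control inside
    the caliper are dropped as off common support.
    Returns (att, n_matched, n_off_support).
    """
    gaps = []
    off_support = 0
    for p_t, y_t in treated:
        # Controls within the caliper, closest first, at most k of them.
        near = sorted(
            (abs(p_c - p_t), y_c)
            for p_c, y_c in controls
            if abs(p_c - p_t) <= caliper
        )[:k]
        if not near:
            off_support += 1
            continue
        # Gap between the treated outcome and the mean matched-control outcome.
        gaps.append(y_t - mean(y for _, y in near))
    if not gaps:
        raise ValueError("no treated unit has a match inside the caliper")
    return mean(gaps), len(gaps), off_support


# Hypothetical toy data: (propensity score, log hourly wage).
women = [(0.500, 2.00), (0.700, 2.10), (0.950, 2.30)]
men = [(0.498, 2.40), (0.502, 2.50), (0.701, 2.45), (0.300, 2.20)]

att, n_matched, n_off = att_nearest_neighbours(women, men)
print(f"ATT = {att:.3f}, matched = {n_matched}, off support = {n_off}")
```

In the toy data the third woman has no man within the caliper, so she is counted as off common support rather than being compared with a distant match.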

  9. Covariates used in matching (wave: variables)
      Pre-birth/birth: White; country of birth; father's social class; mother smoked during pregnancy; birthweight (ounces); sibling birth order; mother smoking 4 months after birth
      Age 7: Southgate reading test score; arithmetic problems; N Rutter symptoms; score on Bristol Social Adjustment Guide; number of child illnesses
      Age 11: Occupational expectations when aged 25; standardized reading score; standardized maths score
      Age 16: In trouble with police; mother's assessment of over/underweight; disability; alcohol consumption; smoking behaviour
      Girls are 4 ounces lighter at birth; have higher reading scores at age 7; have lower Rutter scores at age 7; have lower BSAG scores at age 7; have very different occupational expectations (slide 11); have higher reading and maths scores at age 11; have fewer problems with the police at age 16; are more likely to be perceived as obese by parents at age 16; are less likely to be disabled at 16; drink less alcohol at age 16; are less likely to smoke at age 16

  10. Scales: 1. Bristol Social Adjustment Guide (BSAG; see Engel 1959): additive scale (0,64) from 12 syndromes at age 7: unforthcomingness; withdrawal; depression; anxiety for acceptance by adults; hostility towards adults; writing off of adults and adult standards; anxiety for acceptance by children; hostility towards children; restlessness; inconsequential behaviour; miscellaneous symptoms; and miscellaneous nervous symptoms. 2. Anti-social behaviours: Rutter scale (0,21) reported by mum at age 7 captures conduct problems such as destroys own/others' property; frequently fights with/is quarrelsome with other children; often disobedient; often tells lies; bullies other children (Rutter et al., 1970). Rutter at 16 impacts on lifetime employment: Parsons et al. (2022), Teenage conduct problems: a lifetime of disadvantage in the labour market, Oxford Economic Papers. 3. Southgate reading test score: score (0,30) at age 7 to test word recognition and comprehension. See https://closer.ac.uk/cross-study-data-guides/cognitive-measures-guide/ncds-cognition/ncds-age-7-southgate-group-reading-test-sgrt/

  11. Occupational Expectations At Age 25, Asked at Age 11
      Occupation                     Male    Female
      Professional                   9       4
      Other non-manual, scientific   6       4
      Typist, clerical               2       11
      Shop assistant                 1       7
      Junior non-managerial          3       1
      Personal services              1       9
      Foreman, manual                <1      <1
      Skilled manual                 18      1
      Semi-skilled manual            3       <2
      Unskilled manual               <1      <1
      Self-employed                  1       1
      Farm worker                    2       2
      HM Forces                      7       <1
      Sports man/woman               9       <1
      Student                        <1      <1
      Teacher/nurse                  2       20
      Unclassifiable                 34      38

  12. Match Bias
      Age                       23      33      42      50      55      61      63
      Pseudo R-sq, unmatched    0.398   0.390   0.380   0.384   0.395   0.408   0.408
      Pseudo R-sq, matched      0.010   0.015   0.014   0.018   0.030   0.062   0.062
      Rubin's B                 23.1    29.2*   27.5*   31.5*   40.2*   58.9*   58.9*
      Rubin's R                 1.04    1.07    0.95    1.07    1.08    1.34    1.34
      Rubin's B: absolute standardised difference of means of the linear index of the propensity score in the treated and matched non-treated groups (B<25 is ok)
      Rubin's R: ratio of treated to matched non-treated variances of the propensity score index (R between 0.5 and 2 is deemed balanced)
      * falls outside tolerable balance limits
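The two balance statistics defined on this slide can be computed along the following lines. This is a sketch under stated assumptions: the slide does not give the exact scaling of B, so the code uses a common convention (difference in means divided by the square root of the average of the two sample variances, expressed in per cent). The function names and toy index values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def rubins_b_and_r(index_treated, index_controls):
    """Return (B, R) for the linear index of the propensity score.

    B: absolute standardised difference in means, in per cent, scaled by
       the square root of the average of the two sample variances
       (an assumed convention, not stated on the slide).
    R: ratio of the treated variance to the matched-control variance.
    """
    var_t = variance(index_treated)
    var_c = variance(index_controls)
    diff = abs(mean(index_treated) - mean(index_controls))
    b = 100 * diff / sqrt((var_t + var_c) / 2)
    r = var_t / var_c
    return b, r


def is_balanced(b, r):
    """Apply the thresholds quoted on the slide: B < 25 and 0.5 <= R <= 2."""
    return b < 25 and 0.5 <= r <= 2


# Hypothetical linear-index values for treated and matched controls.
b, r = rubins_b_and_r([0.0, 1.0, 2.0], [0.2, 1.2, 2.2])
print(f"B = {b:.1f}, R = {r:.2f}, balanced = {is_balanced(b, r)}")

# Thresholds applied to two of the reported (B, R) pairs from the slide.
print(is_balanced(23.1, 1.04))  # age 23
print(is_balanced(58.9, 1.34))  # age 61
```

Applied to the reported values, only age 23 passes the B threshold; R stays inside the 0.5 to 2 band at every age.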

  13. Common Support 40 cases off common support. Zero at other ages

  14. GWG At Different Ages using PSM: differences in log mean hourly earnings
      Age                  23             33             42             50             55             61             63
      Female               1.536          1.843          1.908          2.080          2.022          2.611          2.635
      Male, unmatched      1.704          2.209          2.354          2.435          2.359          2.890          2.913
      Male, matched        1.693          2.206          2.378          2.460          2.369          2.950          2.950
      Raw difference       -.168 (21.41)  -.367 (25.91)  -.446 (25.38)  -.355 (21.81)  -.337 (17.35)  -.279 (10.89)  -.278 (11.08)
      Matched difference   -.156 (7.95)   -.363 (10.82)  -.471 (10.98)  -.381 (8.88)   -.347 (7.19)   -.339 (4.75)   -.315 (4.33)
      N                    8011           6881           7175           6031           4992           1668           1668
      GWG follows an inverted-U shape over the life course, peaking when women are in their 40s.
      Raw gap is very substantial, ranging from around .17 log points when cohort members are in their early 20s to .45 log points in their 40s.
      The PSM estimated GWG is similar to the raw gap when cohort members are in their 20s and 30s. However, the PSM estimated gap is above the raw gap when they are in their 40s, 50s and 60s.
      The implication is that women have pre-labour market traits which increase their earnings later in life relative to men.
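The raw and matched differences on this slide are the female mean log wage minus the corresponding male mean, and a gap in log points can be converted into an approximate percentage gap in wage levels with exp(gap) - 1. The sketch below reproduces that arithmetic from the reported means. The variable and function names are illustrative, and because the inputs are means of logs the percentages strictly describe ratios of geometric mean wages.

```python
from math import exp

# Mean log hourly wages as reported on the slide.
ages = [23, 33, 42, 50, 55, 61, 63]
female = [1.536, 1.843, 1.908, 2.080, 2.022, 2.611, 2.635]
male_unmatched = [1.704, 2.209, 2.354, 2.435, 2.359, 2.890, 2.913]
male_matched = [1.693, 2.206, 2.378, 2.460, 2.369, 2.950, 2.950]


def pct_gap(log_gap):
    """Convert a log-point gap into a percentage difference in wage levels."""
    return 100 * (exp(log_gap) - 1)


# Female minus male mean log wage, by age.
raw = {a: f - m for a, f, m in zip(ages, female, male_unmatched)}
matched = {a: f - m for a, f, m in zip(ages, female, male_matched)}

for a in ages:
    print(
        f"age {a}: raw {raw[a]:+.3f} ({pct_gap(raw[a]):+.1f}%), "
        f"matched {matched[a]:+.3f} ({pct_gap(matched[a]):+.1f}%)"
    )
```

At age 42, for example, the raw gap of about -.446 log points corresponds to women earning roughly 36 per cent less than men. Small discrepancies from the slide's differences (for instance -.470 here against -.471 reported at age 42) reflect rounding in the published means.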

  15. GWG from Log Hourly Wage Regressions 61 63 Age: 23 33 42 50 55 -.337 (17.35) -.279 (10.89) -.278 (11.08) Raw difference OLS -.168 (21.41) -.367 (25.91) -.446 (25.38) -.355 (21.81) -.375 (15.25) -.331 (10.27) -.348 (10.54) -.336 (7.98) -.332 (10.52) -.176 (17.37) -.350 (19.91) -.438 (19.79) -.352 (17.49) -.308 (7.38) OLS with PSM weights OLS with entropy weights N -.150 (10.58) -.343 (14.59) -.457 (12.86) -.356 (12.77) -.372 (10.46) -.325 (6.98) -.293 (6.56) -.162 (11.57) -.342 (15.03) -.460 (13.37) -.340 (12.91) 1668 1668 8011 6881 7175 6031 4992
      OLS regression-adjusted gaps are similar to raw gaps until age 55 and later, when the regression-adjusted estimates are larger than the raw gap.
      Unweighted OLS regression-adjusted gaps are larger than the OLS estimates with PSM weights or entropy weights.
      There is no systematic difference in the size of the GWG as indicated by the matched difference in the last slide and the regression-adjusted estimates here: the matched difference is larger at ages 33, 42 and 50, whereas the OLS estimate is larger at ages 23 and 63.

  16. Log Hourly Gender Wage Gap by Age [Figure: log hourly gender wage gap (vertical axis, -0.1 to -0.5) by age (horizontal axis: 23, 33, 42, 50, 55, 61, 63); series: Raw Gap, Matched, OLS, OLS with PSM wts, OLS with entropy wts]

  17. Summary: 1. Large raw GWG, rising until cohort members are in their 40s, then falling but remaining sizeable to age 63. 2. In contrast to findings in the literature, in which the regression-adjusted GWG is considerably smaller than the raw gap, differences in log mean hourly earnings between men and women conditioning on pre-labour market variables are of roughly similar size to, and in some cases wider than, raw gaps. 3. This is the case whether we use matching or linear estimation techniques. 4. However, the PSM estimated GWG is above the raw gap when cohort members are in their 40s, 50s and 60s. 5. The implication is that women have pre-labour market traits which increase their earnings later in life relative to men: better maths and reading scores and fewer behavioural problems, but also very different occupational expectations.

  18. What Next? 1. Specification for probit: have we got the right covariates? Dangers of matching on irrelevant variables (King and Nielsen, 2019). More flexible specification? 2. Whether to hard match on occupational expectations? 3. Alternative matching estimators (as per Froelich, 2007): NN, kernel; combine exact matching with PSM; entropy balancing. 4. Tackling the participation decision: likely under-estimating the GWG if negative selection into employment; bringing in the zeros results in a much larger GWG; using matching estimates to impute earnings to non-participants (as per Bryson et al. 2020). Is this the right thing to do? 5. Attrition.
