Assessing the 2020 Census Quality Data Report
An ad hoc panel appointed by the National Academies of Sciences, Engineering, and Medicine assessed the quality of the data collected in the 2020 Census. The panel reviewed the data collected, process measures, demographic analysis, and other evidence, and recommended further research by the Census Bureau to evaluate the 2020 data and to begin planning the 2030 Census.
Presentation Transcript
Assessing the 2020 Census: Final Report
Consensus Study
Committee on National Statistics (CNSTAT)
Released October 3, 2023
Assessing the 2020 Census: Final Report | The National Academies Press (www.nap.edu)
Statement of Task
The National Academies of Sciences, Engineering, and Medicine will appoint an ad hoc panel to review the quality of the data that were collected in the 2020 Census. As part of its work, the panel will:
1. Review information from the Census Bureau on the data collected as well as various process measures and indicators of data quality obtained as part of the 2020 Census operations;
2. Review other available information, such as results from demographic analysis, process measures and preliminary results from the post-enumeration survey; and analyses of administrative records; and
3. Consider the results from evaluations of similar indicators from the 2010 and 2000 Censuses.
The panel will produce a final report that includes conclusions about the quality of the data collected in the 2020 Census and makes recommendations for further research by the Census Bureau to evaluate the quality of the 2020 data and to begin planning the 2030 Census.
Panel to Evaluate the Quality of the 2020 Census
Teresa A. Sullivan (Chair), Department of Sociology, University of Virginia
Margo Anderson, Department of History, University of Wisconsin Milwaukee (emerita)
Robert M. Bell,* Google and AT&T Labs (retired)
Kathryn Edin (NAS), Department of Sociology, Princeton University
Marc Hamel, Statistics Canada (retired)
George T. Ligler (NAE), Department of Multidisciplinary Engineering, Texas A&M University
Thomas A. Louis,* Department of Biostatistics, Johns Hopkins University (emeritus)
Lloyd B. Potter,* Texas Demographic Center, University of Texas at San Antonio
Joseph J. Salvo,* University of Virginia Biocomplexity Institute
Regina Shih, Department of Epidemiology, Rollins School of Public Health, Emory University
C. Matthew Snipp, School of Humanities and Sciences, Stanford University
Edward Telles, Department of Sociology, University of California, Irvine
Wendy Underhill, National Conference of State Legislatures, Denver
David Van Riper,* Minnesota Population Center, University of Minnesota
* Denotes member of the panel's designated data analytic subgroup.

Daniel L. Cork, Study Director
Constance F. Citro, Senior Scholar
Michael L. Cohen, Senior Program Officer
Anthony Mann, Senior Program Associate
Katrina Baum Stone, Senior Program Officer
How We Worked: Data Analysis Subgroup
10 plenary meetings, half involving Census Bureau presentations
Dedicated data analysis subgroup was defining characteristic of this panel
Sworn/cleared to work behind Census Bureau IT firewall
Met twice a week (and, for several months, a third, biweekly session with Census Bureau staff)
All panel members sworn/cleared at level to facilitate discussions/deliberations via Census VDI
Only figures/tables cleared by Census Disclosure Review Board (and subject to rounding or noise infusion) were ported from behind firewall
Disclosure Review Board Approval Numbers
CBDRB-FY23-0171 (Nonresponse Followup [NRFU] resolutions by phase, figures)
CBDRB-FY23-0179 (race/ethnicity analysis)
CBDRB-FY23-0180 (age heaping analysis)
CBDRB-FY23-0197 (Master Address File [MAF] development, figures)
CBDRB-FY23-0206 (correlation coefficients)
CBDRB-FY23-0214 (NRFU resolutions by phase, tables)
CBDRB-FY23-0215 (MAF development, tables)
CBDRB-FY23-0221 (Group Quarters data quality)
CBDRB-FY23-0224 (NRFU and Self-Response return rate analysis by ventiles of American Community Survey variables)
The Census Bureau's Disclosure Review Board and Disclosure Avoidance Officers have reviewed the information products used to produce several figures and tables in this report for unauthorized disclosure of confidential information and have approved the disclosure avoidance practices applied to these releases.
By mutual agreement, Disclosure Review Board clearance was sought and obtained only for the generation of specific figures and tables published in this report, and not for the issuance of the underlying data files as standalone products or as addenda to this report.
SEPTEMBER 2023 BRIEFING Main Messages: Key Conclusions About 2020
Signature achievement of the 2020 Census was its very completion under exceptionally difficult circumstances
Census Bureau's focus on key innovation areas created the capability to react as well as could be expected to unprecedented difficulties
Four innovation areas driving 2020 Census planning worked very well:
1. Implement Internet Self-Response (and position it as modal response channel)
2. Shift precensus Address Canvassing work from fieldwork to in-office review of imagery, other data
3. Reengineer field management and case handling systems
4. Permit use of administrative records to enumerate some nonresponding households, for which reliable records were available and at least one field visit was unsuccessful
Other helpful design features for meeting 2020 challenges included:
Careful planning and necessary modification of mailing/contact strategy
Reinstating Update Leave rather than pure reliance on Update Enumerate (reduced contacts during COVID)
Attention to Master Address File coverage and updating in preceding decade
No census is, or can be, perfect, and 2020 had its share of data quality issues
Age Heaping
Age heaping in 2020 much more severe than in 2010, and analyses suggest cause was degradation in quality of proxy response in Nonresponse Follow-up
[Figure: Estimated Heaping Metric by Age in Census Edited File (ages 0 to 85), 2010 vs. 2020; panels: Total Population; NRFU, Proxy Respondents]
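Age heaping is the tendency of reported ages to pile up on round numbers, typically ages ending in 0 or 5, which is what one would expect when a proxy respondent guesses a neighbor's age. The slide does not define its "Estimated Heaping Metric", so the sketch below uses Whipple's index, a standard demographic measure, purely as a stand-in; it is not the panel's metric. The function name and the sample counts are hypothetical.

```python
def whipple_index(counts_by_age):
    """Whipple's index over ages 23-62 (inclusive).

    counts_by_age maps a single year of age to a population count.
    100 means no preference for ages ending in 0 or 5;
    500 means every reported age in the range ends in 0 or 5.
    NOTE: a stand-in for illustration, not the metric used in the report.
    """
    ages = range(23, 63)  # 23..62 inclusive, i.e. 40 single years of age
    total = sum(counts_by_age.get(age, 0) for age in ages)
    if total == 0:
        raise ValueError("no population in ages 23-62")
    # Ages ending in 0 or 5 are exactly the multiples of 5.
    heaped = sum(counts_by_age.get(age, 0) for age in ages if age % 5 == 0)
    # Without heaping, one fifth of the range total falls on these ages.
    return 100.0 * heaped / (total / 5.0)


if __name__ == "__main__":
    # Hypothetical data: a flat age distribution shows no heaping.
    flat = {age: 1000 for age in range(0, 86)}
    print(whipple_index(flat))  # 100.0

    # Hypothetical data: shift 300 people from each age ending in 4 or 6
    # onto the neighboring age ending in 5 or 0.
    heaped = dict(flat)
    for age in range(25, 61, 5):
        heaped[age - 1] -= 300
        heaped[age + 1] -= 300
        heaped[age] += 600
    print(whipple_index(heaped))
```

Computing an index like this separately for self-responses and proxy responses is one simple way to see whether heaping is concentrated in one response mode, which is the kind of comparison the two figure panels make.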
No census is, or can be, perfect, and 2020 had its share of data quality issues
Group Quarters
Group Quarters Enumeration is a challenge in any census, but difficulty made nearly insurmountable by 2020 Census circumstances
COVID-19-related effects, delays, and access restrictions hit GQs particularly hard
Upended college student count, both on-campus (GQ) and off-campus (cancelled Early NRFU)
Nursing, health care facilities more restricted
Good idea to convert GQs to eResponse, but problematic for college/university student count, re Family Educational Rights and Privacy Act (FERPA) restrictions on only providing directory info
[Figure: GQ Imputation Rates for Sex, Age/DOB, Ethnicity, and Race, 2020 Census vs. 2010 Census]
No census is, or can be, perfect, and 2020 had its share of data quality issues
Differential Net Undercount
Net percent undercoverage with respect to race and ethnicity in 2020 consistent with 2010 patterns but exacerbated in degree, for reasons not readily known (Note: Demographic Analysis estimates for Black and all other people not yet available)
[Figure: Net percent undercoverage for White Non-Hispanic Alone, Black AOIC, Asian AOIC, AIAN AOIC on Reservations, and Hispanic, 2020 PES vs. 2010 PES]
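As a reading aid for the net-undercount figures, the sketch below shows the arithmetic usually behind a percent net coverage error: the census count is compared with an independent population estimate, such as one derived from the Post-Enumeration Survey (PES). It assumes the common sign convention in which negative values indicate a net undercount and positive values a net overcount. The function name and all numbers are hypothetical and are not taken from the report.

```python
def net_coverage_error_pct(census_count, independent_estimate):
    """Percent net coverage error relative to an independent estimate.

    Negative values mean a net undercount, positive values a net overcount
    (assumed sign convention; check the source tables before relying on it).
    """
    if independent_estimate <= 0:
        raise ValueError("independent estimate must be positive")
    return 100.0 * (census_count - independent_estimate) / independent_estimate


if __name__ == "__main__":
    # Hypothetical groups, chosen only to illustrate the arithmetic.
    group_a = net_coverage_error_pct(census_count=950, independent_estimate=1000)
    group_b = net_coverage_error_pct(census_count=1020, independent_estimate=1000)
    print(group_a)  # -5.0, a net undercount
    print(group_b)  # 2.0, a net overcount
    # The "differential" is the gap between groups, here 7 percentage points.
    print(group_b - group_a)
```

A net figure can hide offsetting errors: omissions and erroneous enumerations cancel in the numerator, so a small net error does not by itself show that few people were missed.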
No census is, or can be, perfect, and 2020 had its share of data quality issues
Missing People
Omissions decreased for White and Asian people in 2020, stayed at the same high level for American Indians/Alaska Natives on reservations, and increased for Black and Hispanic people.
[Figure: Percent P-Sample Omissions for White Non-Hispanic Alone, Black AOIC, Asian AOIC, AIAN AOIC on Reservations, and Hispanic, 2020 PES vs. 2010 PES]
Net Percent Undercoverage of Black people by age, sex, and owner/renter status, 2020 and 2010 PES
[Figure: Black Percent Net Undercount, 2020 PES vs. 2010 PES]
Net Percent Undercoverage of Hispanic people by age, sex, and owner/renter status, 2020 and 2010 PES
[Figure: Hispanic Percent Net Undercount, 2020 PES vs. 2010 PES]
Item Imputation Rates by Response Mode, 2020 & 2010
[Figure: Item imputation rates for Age/DOB, Ethnicity, and Race by response mode; 2020 panel: Self, NRFU Household, NRFU Proxy, NRFU Ad Records; 2010 panel: Self, NRFU Household, NRFU Proxy]
Proxy responses: imputation rates for age/date of birth high in 2020 and 2010, but 2-3 times higher in 2020 for race and ethnicity.
Note high-ish rates of imputation for race and ethnicity for administrative records enumerations in NRFU in 2020.
If you can get a household member to respond (self or in NRFU), data quality will be good.
Some Other Findings
RESPONSE MODES
Households in the poorest census tracts least likely and households in census tracts with the largest size housing units (# rooms) most likely to self-respond
Households in the poorest census tracts least likely and households in census tracts with the most prevalent broadband & highest median incomes most likely to self-respond by internet
Households in census tracts with the most owner units least likely and households in census tracts with the most renter units most likely to be enumerated by proxy
Internet self-responders most prevalent among Asian households, followed by two or more races, White alone, some other race alone, Hispanic, Black alone, NHOPI alone, AIAN alone
RACE WRITE-INS
Internet responses produced 4 times as many write-in race responses (relative to check-box-only responses) as paper and NRFU household member responses, with implications for big changes in race reporting in 2020 among different groups (because use of internet varied across groups)
Conclusion 11.2: Disclosure Avoidance
The decision of the U.S. Census Bureau to respond to the threats to confidentiality protection at a very late date in 2020 Census planning with a new, more complex Disclosure Avoidance System (DAS) using differential privacy-based algorithms went counter to long-standing principles of decennial census planning. The approach had not been tested in a census environment nor had the ability of the algorithms to handle critical user data needs been assessed; the Census Bureau had no backup plan should implementation of the new DAS prove challenging. The decision to continue to deploy the new DAS in the face of serious implementation problems resulted in marked delays in delivery of data products, with some variables and types of geographic units of questionable utility, and others not provided at all. [Tools to help users understand and analyze the data are also pretty much lacking.] In addition, it is not clear that the chosen privacy budgets for the various 2020 Census data products, with high values of the ε parameter that trades off accuracy with confidentiality protection, provide much actual protection.
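To make the accuracy-versus-protection trade-off concrete, here is a minimal sketch of the textbook Laplace mechanism, in which noise with scale sensitivity/epsilon is added to a count. This is an illustration only and not the Census Bureau's production DAS, which is considerably more elaborate. The function names, the block count of 28, and the epsilon values are assumptions chosen for the example.

```python
import random


def laplace_noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    """Return true_count plus Laplace noise with scale sensitivity/epsilon.

    Textbook Laplace mechanism, shown for illustration only; it is NOT
    the algorithm used in the 2020 Census DAS.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


def mean_abs_error(epsilon, draws=20000, seed=2020):
    """Average absolute noise added to a hypothetical count of 28."""
    rng = random.Random(seed)
    true_count = 28  # hypothetical small-area count
    total = 0.0
    for _ in range(draws):
        total += abs(laplace_noisy_count(true_count, epsilon, rng) - true_count)
    return total / draws


if __name__ == "__main__":
    # Smaller epsilon: more noise (more protection, less accuracy).
    # Larger epsilon: less noise (more accuracy, less protection).
    for eps in (0.1, 1.0, 10.0):
        print(eps, round(mean_abs_error(eps), 3))
```

For this mechanism the expected absolute error equals the noise scale, 1/epsilon for a count with sensitivity 1. That is why large epsilon values yield accurate statistics but weaker formal guarantees, which is the concern the conclusion raises about the chosen privacy budgets.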
Corollaries on Disclosure Avoidance System (DAS)
We do not assess differential privacy in general, agreeing that it has desirable theoretical properties and is useful for some types of applications
We strongly agree that confidentiality protection is a vitally important responsibility for the decennial census, and for statistical agencies in general
We find fault in the particular implementation for the 2020 Census, done without prototyping and testing and without adequate engagement of the user/stakeholder community, counter to the careful implementation of other major changes in census methodology
The same criticisms attach to other major changes made in haste and without proper development, which is why we say in Conclusion 8.2 that major motion toward an administrative records-based census in 2030 is not feasible
SEPTEMBER 2023 BRIEFING Main Messages: Recommendations for 2030 and Beyond
Priority Goals for 2030 Census Testing and Development (Recommendation 12.1)
Like predecessor Panel to Review the 2010 Census, we recommend that the Census Bureau focus primary attention on a small, manageable number of major innovation areas:
1. Maximize self-response to the census, including better matching of contact and communication strategies to the desired response mode
2. Improve the quality of the data collection in NRFU, including reduction if not elimination of proxy reporting when a good alternative is available
3. Reduce gaps in coverage and data quality associated with race, ethnicity, and socioeconomic status
4. Improve the quality of address listings and contact strategies for conventional housing units and group quarters alike
5. Realign balance between utility, timeliness, and confidentiality protection in 2030 data products
Support improvements to census law: penalties for deliberate misuse of data; resolution of FERPA directory information limitation
Implications/Suggestions for APDU
APDU members will have their own interests in particular aspects of the census, such as the undercount for particular groups
APDU members collectively (I hope) will lobby for genuine interactive dialogue with Census Bureau staff in charge of 2030 data products
My suggested strategy:
1. Make user requirements clear: timeliness, usability, join variables (e.g., persons per occupied unit)
2. Meet Census Bureau partway on confidentiality concerns: say, agree to reduction of data for blocks (e.g., do we need race by every other race for people checking 4 and 5 races [~220K total]?); work with Census Bureau to reduce blocks with small populations (median block had 28 people in 2020)
3. With these kinds of changes, whatever method Census uses for protection (DP, swapping, etc.) should be much easier to implement and have much less of an adverse impact on quality
Bottom Line: 2030 Census data user working group needs to be established with Census blessing ASAP, operational plan developed by 2027, and implementation tested in 2028, to achieve timely, useful data in 2030!
Thank You
ccitro@nas.edu