Census data are foundational to democracy, but persistent undercounts of children, people of color, and immigrant households undermine the fairness and accuracy of the census.1 To better address demographic disparities in census counts, the Census Bureau has invested in administrative data projects. However, administrative data cannot replace known and reliable methods for enumerating historically undercounted populations—namely, community outreach and in-person enumeration efforts through Non-Response Follow-up.
Until the Census Bureau can address the inadequate representation of undercounted communities in both census data and its administrative data records, the Bureau’s administrative data projects may compound existing data disparities. Undercounts have enormous consequences. The resulting biases in population statistics impede equitable political representation and access to public benefits and government services. Because census data are core to social science and statistical research projects, undercounts also hinder our ability to understand inequality and create solutions to advance equity.
Administrative data from non-census sources often contain high-quality demographic information that can be used to fill in census omissions. Government entities, such as the Postal Service and the Internal Revenue Service, and third parties share administrative data with the Census Bureau. The Bureau then attempts to link individual administrative data records to response data from the Decennial Census and other census surveys. Files are linked when enough matching Personally Identifying Information (PII) is identified in both the administrative data and census response to verify that the data represent the same person. Using linked data, the Census Bureau can enumerate non-responding households expected at a certain address and attribute demographic information found in administrative data to individual census responses. For example, after hurricanes limited efforts to count non-responding households in parts of Louisiana, the Census Bureau used administrative data from the Medicare Enrollment Database, Indian Health Service Patient Registration, Selective Service System Registration, and past census survey data to enumerate households in the affected regions.
However, data linkage is an imperfect process. Though administrative data contain a wealth of information, integrating administrative and census data requires linking individual records to ensure that data are attributed to the correct person. One of data linkage’s major limitations is that it depends on the comparison of two already relatively high-quality datasets. Confidently identifying the correct person and linking their census and administrative data requires:
(a) access to two sets of records—at least one of the individual’s responses to the Decennial Census or another census survey and at least one set of administrative data records containing that person’s information; as well as
(b) a sufficient level of accurate and consistent personally identifying information across the individual’s available census and administrative data sources to ensure accuracy of the link.
When an individual’s identity in administrative data and census responses can be verified using multiple sets of matching values across records, the resulting link is more likely to be accurate. Conversely, a person’s census record that is missing identifying values cannot be linked to the administrative data records with sufficient certainty.
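The two requirements above can be illustrated with a minimal sketch. It is not the Census Bureau's actual linkage procedure: the PII fields, the exact-match rule, and the two-field threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical PII fields and threshold, chosen only for illustration.
# The Bureau's real matching criteria are not described in this brief.
PII_FIELDS = ("name", "date_of_birth", "address")
MIN_MATCHING_FIELDS = 2


@dataclass
class Record:
    """One person's record in either a census response or an administrative file."""
    name: Optional[str] = None
    date_of_birth: Optional[str] = None
    address: Optional[str] = None


def normalize(value: Optional[str]) -> Optional[str]:
    """Lowercase and collapse whitespace; treat empty values as missing."""
    return " ".join(value.lower().split()) if value else None


def matching_fields(census: Record, admin: Record) -> int:
    """Count PII fields that are present in both records and agree."""
    count = 0
    for field in PII_FIELDS:
        census_value = normalize(getattr(census, field))
        admin_value = normalize(getattr(admin, field))
        if census_value is not None and census_value == admin_value:
            count += 1
    return count


def can_link(census: Record, admin: Record,
             threshold: int = MIN_MATCHING_FIELDS) -> bool:
    """Link only when enough PII agrees to be confident it is the same person."""
    return matching_fields(census, admin) >= threshold


admin = Record("Ana Diaz", "1990-04-02", "12 Oak St")
complete_response = Record("ana  diaz", "1990-04-02", "12 Oak St")
proxy_response = Record(name="Ana Diaz")  # e.g., a neighbor supplied only a name

print(can_link(complete_response, admin))  # enough matching PII
print(can_link(proxy_response, admin))     # too little PII to verify identity
```

Under these assumptions, the complete response links and the sparse proxy response does not, even though both describe the same hypothetical person. A person with no administrative record at all would never reach this comparison.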
Data linkage improves coverage of white, higher-income households but is less effective at addressing undercounts of people of color, children, immigrants, people who are less familiar with English, people who have lower incomes, and residents of group quarters. Linkage requires sufficient PII in both the administrative data and census responses, a standard that the Bureau has not met for many members of undercounted communities. When census enumerators collect incomplete census responses or rely on tenuous proxy responders such as neighbors, the census responses are less likely to be linked to administrative data. Even households for whom comprehensive, high-quality census responses are collected still fail data linkage if they are wholly absent from available administrative datasets. Until the Census Bureau can improve coverage of undercounted communities in census data and its administrative data records, administrative data linkage will be unable to address gaps in demographic data collection and household enumeration.
Recent events and policy shifts outside of the Census Bureau have introduced additional challenges. COVID-19, extreme weather events, and the Trump Administration’s attempts to weaponize immigration status all compromised enumeration in 2020 and likely contributed to high linkage disparities in 2020 Census data. The share of unlinked census records nearly doubled, from 8.99 percent of census responses in 2010 to 16.39 percent in 2020. Group quarters (GQ) residents, often underrepresented in administrative data, were particularly affected by this breakdown in linkage rates.2 The census records of non-GQ residents in 2020 were more than twice as likely as GQ residents’ to be linked to administrative data. Even when GQ residents are represented in administrative data, their records often contain discordant addresses, which complicate linkage.
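As a quick check on the scale of that change, the short calculation below uses only the two unlinked shares cited above. The variable names are illustrative, and the linked shares are simply derived as 100 minus the unlinked share.

```python
# Unlinked shares of census responses, in percent, as cited in the text.
unlinked_2010 = 8.99
unlinked_2020 = 16.39

ratio = unlinked_2020 / unlinked_2010          # relative growth in the unlinked share
point_change = unlinked_2020 - unlinked_2010   # absolute change in percentage points

# Linked shares are derived here, not reported in the text.
linked_2010 = 100 - unlinked_2010
linked_2020 = 100 - unlinked_2020

print(f"Unlinked share grew {ratio:.2f}x (+{point_change:.2f} percentage points)")
print(f"Implied linked share: {linked_2010:.2f}% in 2010, {linked_2020:.2f}% in 2020")
```

The unlinked share grew by a factor of about 1.82, an increase of 7.40 percentage points, which is consistent with describing it as having nearly doubled.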
While the Bureau’s administrative data projects show promise for improving survey quality, reducing costs, and lowering barriers to respondent participation, biases in data linkage may amplify the impact of undercounts. In-person efforts to reach historically undercounted communities must remain a priority to improve representation in administrative and census data.
1 Groups may be underrepresented in the census for numerous, complex reasons, including inadequate Census Bureau outreach and a community’s lack of trust in government; these root causes must be addressed.
2 The Census Bureau has acknowledged widespread failures in the 2020 GQ enumeration and allowed local governments to submit limited requests for review.