Comparing Data Quality of Fertility and First Sexual Intercourse Histories

Wu, Lawrence, Steven P. Martin, and Daniel A. Long
Working paper no. 1999-08

Abstract

This paper evaluates the data quality of two demographic variables in light of hypotheses about respondent recall drawn from the literature on survey methodology. An emerging consensus in this literature is that the accuracy of recall of an event's timing declines with duration of recall unless the dating of the event is frequently “rehearsed.” We provide empirical evidence consistent with this hypothesis by assessing the quality of demographic data supplied by female respondents on selected event history outcomes in multiple nationally representative surveys. The first variable is the interval between a first and second birth. We compare second birth intervals using birth registration data from the Vital Statistics on Natality (VSN) and individual-level survey data from the June Current Population Survey (CPS), the 1979-94 waves of the National Longitudinal Survey of Youth (NLSY), and the 1988 National Survey of Family Growth (NSFG). Despite marked differences in survey design, we find relatively few differences in the quality of birth interval data across these four data sources. The second variable is age at first sexual intercourse, which we examine in two sets of analyses. First, we use data from the NLSY to analyze discrepancies between successive reports of age (to the nearest year) at first intercourse. Second, we analyze a form of partially missing data (respondent inability to recall the calendar month of first intercourse) that occurs in both the NLSY and NSFG. In both sets of analyses, we find that data quality varies significantly with duration of recall and with measures of respondent ability related to arithmetic facility and memory. Observed differences by race and ethnicity narrow substantially when controlling for these and other background factors. We also find evidence of a nonlinear association between duration of recall and data quality, with similar nonlinear patterns observed for both NLSY and NSFG respondents.