Large-Scale Data Harmonization Across Prospective Studies

Ke Pan, Lydia A. Bazzano, Kalpana Betha, Brittany M. Charlton, Jorge E. Chavarro, Christina Cordero, Erica P. Gunderson, Catherine L. Haggerty, Jaime E. Hart, Anne Marie Jukic, Sylvia H. Ley, Gita D. Mishra, Sunni L. Mumford, Enrique F. Schisterman, Karen Schliep, Jeffrey G. Shaffer, Daniela Sotres-Alvarez, Joseph B. Stanford, Allen J. Wilcox, Lauren A. Wise, Edwina Yeung, Emily W. Harville

Research output: Contribution to journal › Article › peer-review


The Preconception Period Analysis of Risks and Exposures Influencing Health and Development (PrePARED) Consortium creates a novel resource for addressing preconception health by merging data from numerous cohort studies. In this paper, we describe our data harmonization methods and results. Individual-level data from 12 prospective studies were pooled. The crosswalk-cataloging-harmonization procedure was used. The index pregnancy was defined as the first postbaseline pregnancy lasting more than 20 weeks. We assessed heterogeneity across studies by comparing preconception characteristics in different types of studies. The pooled data set included 114,762 women, and 25,531 (22%) reported at least 1 pregnancy of more than 20 weeks' gestation during the study period. Babies from the index pregnancies were delivered between 1976 and 2021 (median, 2008), at a mean maternal age of 29.7 (standard deviation, 4.6) years. Before the index pregnancy, 60% of women were nulligravid, 58% had a college degree or more, and 37% were overweight or obese. Other harmonized variables included race/ethnicity, household income, substance use, chronic conditions, and perinatal outcomes. Participants from pregnancy-planning studies had more education and were healthier. The prevalence of preexisting medical conditions did not vary substantially based on whether studies relied on self-reported data. Use of harmonized data presents opportunities to study uncommon preconception risk factors and pregnancy-related events. This harmonization effort laid the groundwork for future analyses and additional data harmonization.

Original language: English (US)
Pages (from-to): 2033-2049
Number of pages: 17
Journal: American Journal of Epidemiology
Issue number: 12
State: Published - Dec 1 2023
Externally published: Yes


Keywords

  • consortia
  • data harmonization
  • preconception period
  • pregnancy
  • pregnancy complications

ASJC Scopus subject areas

  • General Medicine

