Probabilistic record linkage and an automated procedure to minimize the undecided-matched pair problem.

Carla Jorge Machado, Kenneth Hill

Research output: Contribution to journal › Article › peer-review

12 Scopus citations


Probabilistic record linkage allows information from different data sources to be assembled. We present a procedure for cases in which a one-to-one relationship between records in different files is expected but not found. The data were births and infant deaths for the 1998 birth cohort of the city of São Paulo, Brazil. Pairs for which a one-to-one relationship was obtained, and for which the best link had the highest weight, were taken as unequivocally matched pairs and provided information to decide on the remaining pairs. Among the unequivocally matched pairs, an expected relationship was found between differences in dates of death and birth registration, and the places of birth and death registration for neonatal deaths were likely to be the same. This evidence was used to resolve the remaining pairs. We reduced the number of non-uniquely matched records and of uncertain matches, and increased the number of uniquely matched pairs from 2,249 to 2,827. Future research using record linkage should apply strategies derived from the first record linkage runs before undertaking a full clerical review (the standard procedure under uncertainty), in order to retrieve matches efficiently.
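The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Pair` fields, the function names `split_pairs` and `resolve`, and the single `same_place` rule are all assumptions made for illustration, and the date-difference evidence mentioned in the abstract is left out. The linkage weight is carried along but not used by the simplified rule.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Pair:
    birth_id: str
    death_id: str
    weight: float      # linkage weight from the probabilistic run (unused below)
    same_place: bool   # hypothetical flag: birth and death registered in the same place


def split_pairs(pairs):
    """Separate one-to-one pairs from pairs that share a birth or death record."""
    births = Counter(p.birth_id for p in pairs)
    deaths = Counter(p.death_id for p in pairs)
    unique, ambiguous = [], []
    for p in pairs:
        one_to_one = births[p.birth_id] == 1 and deaths[p.death_id] == 1
        (unique if one_to_one else ambiguous).append(p)
    return unique, ambiguous


def resolve(ambiguous):
    """Accept an ambiguous pair only when it is the sole candidate for its
    death record that passes the rule, and its birth record is not claimed
    by another accepted pair. Everything else stays for clerical review."""
    by_death = defaultdict(list)
    for p in ambiguous:
        by_death[p.death_id].append(p)
    picked = []
    for candidates in by_death.values():
        passing = [p for p in candidates if p.same_place]
        if len(passing) == 1:
            picked.append(passing[0])
    birth_counts = Counter(p.birth_id for p in picked)
    return [p for p in picked if birth_counts[p.birth_id] == 1]


# Made-up candidate pairs, for illustration only.
pairs = [
    Pair("b1", "d1", 18.2, True),   # one-to-one
    Pair("b2", "d2", 15.0, True),   # competes with b3 for d2
    Pair("b3", "d2", 14.1, False),
    Pair("b4", "d3", 12.0, False),  # neither candidate for d3 passes the rule
    Pair("b5", "d3", 11.5, False),
]

unique, ambiguous = split_pairs(pairs)
resolved = resolve(ambiguous)
print(len(unique), len(ambiguous), len(resolved))
```

In this toy data the pairs competing for `d3` remain unresolved, which corresponds to the cases that would still go to clerical review.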

Original language: English (US)
Pages (from-to): 915-925
Number of pages: 11
Journal: Cadernos de saúde pública / Ministério da Saúde, Fundação Oswaldo Cruz, Escola Nacional de Saúde Pública
Issue number: 4
State: Published - 2004
Externally published: Yes

ASJC Scopus subject areas

  • Public Health, Environmental and Occupational Health

