Size estimation of key populations in the HIV epidemic in eSwatini using incomplete and misaligned capture-recapture data

Abhirup Datta, Andrew Pita, Amrita Rao, Bhekie Sithole, Zandile Mnisi, Stefan Baral

Research output: Contribution to journal › Article › peer-review

1 Scopus citation


In 2020, our understanding of the distributions of HIV risks in the most burdened settings, including eSwatini, remains limited. In part, this is driven by the limited availability of estimates of the size and burden of the populations at the greatest risk for HIV. Given pervasive social and healthcare stigmas, the size estimations of these populations often rely on the multiplier method, a variant of the capture-recapture approach where the first survey is replaced by an enumeration of population members who used some service or attended an event. To characterize the distributions of marginalized communities in eSwatini, multiple data sources are available in each region for the multiplier method. Current practices in such circumstances produce multiple population size estimates for each region, ignoring the correlation among these estimates. We recast the multiple multiplier method as a special case of the capture-recapture problem with incomplete data and propose a fully model-based approach for size estimation using multiple capture-recapture data with an arbitrary pattern of incompleteness. We use a data augmentation scheme that allows us to model the correlations in the data and produce a unified estimate of population size per region. A hierarchical model ties together the models for multiple regions, allowing us to borrow strength across the regions and enabling extrapolation to areas without data. In eSwatini we also encounter data misalignment, where counts from some of the data sources are not available for each region but only as an aggregate over a few regions. We propose a solution to the general misalignment problem which considers data-source-specific patterns of misalignment. We use simulation studies to demonstrate the accurate inferential capabilities of our Bayesian multiplier method. This approach is then used to produce uncertainty-quantified population size estimates of key populations in eSwatini. Lastly, we propose a Bayesian nonparametric extension for incomplete capture-recapture that allows nonindependent data sources.
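
For readers unfamiliar with the multiplier method mentioned in the abstract, the following is a minimal sketch of the classical point estimate only, not the authors' Bayesian hierarchical model. It assumes a single region, a service listing with M members, and a survey of n population members of whom r report using that service, giving N = M * n / r (the same form as the Lincoln-Petersen capture-recapture estimate). The function name and all numbers are hypothetical and chosen for illustration.

```python
def multiplier_estimate(service_count, survey_size, survey_reporting_service):
    """Classical multiplier-method point estimate N = M * n / r.

    service_count            -- M, members counted by a service or event listing
    survey_size              -- n, number of population members surveyed
    survey_reporting_service -- r, surveyed members who report using the service
    """
    if service_count < 0:
        raise ValueError("service_count must be non-negative")
    if survey_size <= 0:
        raise ValueError("survey_size must be positive")
    if not 0 < survey_reporting_service <= survey_size:
        raise ValueError("survey_reporting_service must be in (0, survey_size]")
    return service_count * survey_size / survey_reporting_service


# Hypothetical inputs: two service listings paired with the same survey
# in one region. Each listing yields its own, separate estimate.
sources = [
    (300, 400, 120),  # (M, n, r) for a hypothetical clinic listing
    (150, 400, 50),   # (M, n, r) for a hypothetical outreach event
]
estimates = [multiplier_estimate(m, n, r) for m, n, r in sources]
print(estimates)
```

Applying the estimator separately to each data source gives one estimate per source for the same region. This is the situation the abstract describes as current practice; the paper's contribution is a joint model that replaces these separate estimates with a single unified one.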

Original language: English (US)
Pages (from-to): 1207-1241
Number of pages: 35
Journal: Annals of Applied Statistics
Issue number: 3
State: Published - 2020


Keywords

  • Bayesian
  • Capture-recapture
  • Epidemiology
  • HIV
  • Misalignment
  • Multiplier method

ASJC Scopus subject areas

  • Statistics and Probability
  • Modeling and Simulation
  • Statistics, Probability and Uncertainty

