Improved semiparametric time series models of air pollution and mortality

Francesca Dominici, Aidan McDermott, Trevor J. Hastie

Research output: Contribution to journal › Review article › peer-review

115 Scopus citations


In 2002, methodological issues around time series analyses of air pollution and health attracted the attention of the scientific community, policy makers, the press, and the diverse stakeholders concerned with air pollution. As the U.S. Environmental Protection Agency (EPA) was finalizing its most recent review of epidemiologic evidence on particulate matter air pollution (PM), statisticians and epidemiologists found that the S-PLUS implementation of generalized additive models (GAMs) can overestimate effects of air pollution and understate statistical uncertainty in time series studies of air pollution and health. This discovery delayed completion of the PM Criteria Document prepared as part of the review of the U.S. National Ambient Air Quality Standard, because the time series findings represented a critical component of the evidence. In addition, it raised concerns about the adequacy of current model formulations and their software implementations. In this article we provide improvements in semiparametric regression directly relevant to risk estimation in time series studies of air pollution. First, we introduce a closed-form estimate of the asymptotically exact covariance matrix of the linear component of a GAM. To ease the implementation of these calculations, we develop the S package gam.exact, an extended version of gam. Use of gam.exact allows a more robust assessment of the statistical uncertainty of the estimated pollution coefficients. Second, we develop a bandwidth selection method to reduce confounding bias in the pollution-mortality relationship due to unmeasured time-varying factors, such as season and influenza epidemics. Third, we introduce a conceptual framework to fully explore the sensitivity of the air pollution risk estimates to model choice. We apply our methods to data of the National Mortality Morbidity Air Pollution Study, which includes time series data from the 90 largest U.S. cities for the period 1987-1994.
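The confounding concern raised in the abstract can be illustrated with a small simulation. The sketch below is not the gam.exact method and uses no data from the study: it simulates daily death counts whose seasonal pattern is shared with the pollution series, then fits a log-linear Poisson regression with and without a seasonal adjustment. An annual sine/cosine pair stands in for the smooth function of time that a GAM would estimate, and every name and number in it (`fit_poisson`, `rpois`, `true_beta`, the baseline of 50 deaths per day) is a hypothetical chosen for illustration.

```python
import math
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_poisson(X, y, iters=25):
    """Fit log E[y] = X beta by Newton-Raphson; column 0 of X must be all ones."""
    p = len(X[0])
    beta = [math.log(sum(y) / len(y))] + [0.0] * (p - 1)  # intercept-only start
    for _ in range(iters):
        mu = [math.exp(sum(b * v for b, v in zip(beta, row))) for row in X]
        score = [sum(r[j] * (yi - mi) for r, yi, mi in zip(X, y, mu))
                 for j in range(p)]
        info = [[sum(r[j] * r[k] * mi for r, mi in zip(X, mu))
                 for k in range(p)] for j in range(p)]
        beta = [b + s for b, s in zip(beta, solve(info, score))]
    return beta


def rpois(rng, lam):
    """Draw one Poisson variate by Knuth's multiplication method (moderate lam only)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


# Simulated (not real) data: four years of daily counts.
rng = random.Random(0)
true_beta = 0.05                       # assumed log relative rate per unit of pollution
n_days = 4 * 365
sin_t = [math.sin(2 * math.pi * t / 365) for t in range(n_days)]
cos_t = [math.cos(2 * math.pi * t / 365) for t in range(n_days)]
pollution = [2.0 * s + rng.gauss(0, 1) for s in sin_t]   # pollution follows the season
deaths = [rpois(rng, math.exp(math.log(50) + true_beta * x + 0.2 * s))
          for x, s in zip(pollution, sin_t)]              # so does mortality

# Model 1 ignores season; model 2 adjusts for it with the sine/cosine pair.
beta_unadjusted = fit_poisson([[1.0, x] for x in pollution], deaths)[1]
beta_adjusted = fit_poisson(
    [[1.0, x, s, c] for x, s, c in zip(pollution, sin_t, cos_t)], deaths)[1]

print("true coefficient:      ", true_beta)
print("without seasonal term: ", round(beta_unadjusted, 4))
print("with seasonal term:    ", round(beta_adjusted, 4))
```

In this setup the unadjusted fit attributes part of the seasonal swing in mortality to pollution, while the adjusted fit lands close to the simulated coefficient. How flexibly to model time is the question that the bandwidth selection method described in the abstract addresses; the fixed sine/cosine pair here sidesteps that choice only because the simulated season is exactly sinusoidal.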

Original language: English (US)
Pages (from-to): 938-948
Number of pages: 11
Journal: Journal of the American Statistical Association
Issue number: 468
State: Published - Dec 2004


Keywords

  • Bandwidth selection
  • Generalized additive model
  • Generalized linear model
  • Mean squared error
  • Particulate matter
  • Semiparametric regression
  • Time series

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

