TY - JOUR
T1 - Confidence intervals for the population mean tailored to small sample sizes, with applications to survey sampling
AU - Rosenblum, Michael A.
AU - van der Laan, Mark J.
N1 - Funding Information: Michael Rosenblum was supported by a Ruth L. Kirschstein National Research Service Award (NRSA) under NIH/NIMH grant 5 T32 MH-19105-19. Mark van der Laan was supported by NIH grant R01 A1074345-01.
PY - 2009
Y1 - 2009
AB - The validity of standard confidence intervals constructed in survey sampling is based on the central limit theorem. For small sample sizes, the central limit theorem may give a poor approximation, resulting in confidence intervals that are misleading. We discuss this issue and propose methods for constructing confidence intervals for the population mean tailored to small sample sizes. We present a simple approach for constructing confidence intervals for the population mean based on tail bounds for the sample mean that are correct for all sample sizes. Bernstein's inequality provides one such tail bound. The resulting confidence intervals have guaranteed coverage probability under much weaker assumptions than are required for standard methods. A drawback of this approach, as we show, is that these confidence intervals are often quite wide. In response to this, we present a method for constructing much narrower confidence intervals, which are better suited for practical applications, and that are still more robust than confidence intervals based on standard methods, when dealing with small sample sizes. We show how to extend our approaches to much more general estimation problems than estimating the sample mean. We describe how these methods can be used to obtain more reliable confidence intervals in survey sampling. As a concrete example, we construct confidence intervals using our methods for the number of violent deaths between March 2003 and July 2006 in Iraq, based on data from the study "Mortality after the 2003 invasion of Iraq: A cross sectional cluster sample survey," by Burnham et al. (2006).
KW - Bernstein's inequality
KW - Central limit theorem
KW - Confidence interval
KW - Influence curve
KW - Normal distribution
KW - Survey sampling
UR - http://www.scopus.com/inward/record.url?scp=62749164866&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=62749164866&partnerID=8YFLogxK
U2 - 10.2202/1557-4679.1118
DO - 10.2202/1557-4679.1118
M3 - Article
C2 - 20231867
AN - SCOPUS:62749164866
SN - 1557-4679
VL - 5
JO - International Journal of Biostatistics
JF - International Journal of Biostatistics
IS - 1
M1 - 4
ER -