TY - JOUR
T1 - Methods for analysis of complex survey data
T2 - An application using the Tanzanian 2015 Demographic and Health Survey and Service Provision Assessment
AU - Sheffel, Ashley
AU - Wilson, Emily
AU - Munos, Melinda
AU - Zeger, Scott
N1 - Funding Information:
This work was completed as part of the National Evaluation Platform project with funding from the Department of Global Affairs Canada. The funder had no role in the study design, data collection, analysis and interpretation of data, or manuscript writing.
Publisher Copyright:
© 2019 ISoGH.
PY - 2019
Y1 - 2019
N2 - Background: Low-income and middle-income countries (LMICs) seek to better utilize household and health facility survey data for monitoring and evaluation, as well as for health program planning. However, analysis of these complex survey data is complicated. In Tanzania, the National Evaluation Platform project sought to analyze Demographic and Health Survey (DHS) data and Service Provision Assessment (SPA) data as part of an evaluation of the national One Plan for Maternal and Child Health. To support this evaluation, we used these survey data to answer two key methodological questions: 1) what are the benefits and costs of using sampling weights in rate estimation; and 2) what is the best method for calculating standard errors in these two surveys? Methods: We conducted a simulation study for each methodologic question. The first simulation study assessed the benefits and costs of using sampling weights in rate estimation. This simulation used weighted and unweighted estimates and examined bias, variance, and the mean squared error (MSE). The second simulation study assessed the best method for calculating standard errors, comparing cluster bootstrapped variance estimation, design-based asymptotic variance with one level (svy1), and design-based asymptotic variance with three levels (svy3). We compared coverage probability and confidence interval length. Results: Our results showed that although weighted estimates were less biased, unweighted estimates were less variable. The weighted estimates had a lower MSE, indicating that the effect of the bias trade-off was greater than the effect of the variance trade-off for most indicators assessed. The best performer for variance estimation was the cluster bootstrap method, followed by the svy3 method. The svy1 method was the worst performer for most indicators assessed.
Conclusions: As complex survey data become more widely used for policymaking in LMICs, there is a need for guidance on the best methods for analyzing these data. The standard of practice has been a design-based analysis using survey weights and the single-level svy method for calculating standard errors. This study puts forth an alternative approach to analysis. In addition, this study offers practical guidance on determining the best method for analysis of complex survey data.
AB - Background: Low-income and middle-income countries (LMICs) seek to better utilize household and health facility survey data for monitoring and evaluation, as well as for health program planning. However, analysis of these complex survey data is complicated. In Tanzania, the National Evaluation Platform project sought to analyze Demographic and Health Survey (DHS) data and Service Provision Assessment (SPA) data as part of an evaluation of the national One Plan for Maternal and Child Health. To support this evaluation, we used these survey data to answer two key methodological questions: 1) what are the benefits and costs of using sampling weights in rate estimation; and 2) what is the best method for calculating standard errors in these two surveys? Methods: We conducted a simulation study for each methodologic question. The first simulation study assessed the benefits and costs of using sampling weights in rate estimation. This simulation used weighted and unweighted estimates and examined bias, variance, and the mean squared error (MSE). The second simulation study assessed the best method for calculating standard errors, comparing cluster bootstrapped variance estimation, design-based asymptotic variance with one level (svy1), and design-based asymptotic variance with three levels (svy3). We compared coverage probability and confidence interval length. Results: Our results showed that although weighted estimates were less biased, unweighted estimates were less variable. The weighted estimates had a lower MSE, indicating that the effect of the bias trade-off was greater than the effect of the variance trade-off for most indicators assessed. The best performer for variance estimation was the cluster bootstrap method, followed by the svy3 method. The svy1 method was the worst performer for most indicators assessed.
Conclusions: As complex survey data become more widely used for policymaking in LMICs, there is a need for guidance on the best methods for analyzing these data. The standard of practice has been a design-based analysis using survey weights and the single-level svy method for calculating standard errors. This study puts forth an alternative approach to analysis. In addition, this study offers practical guidance on determining the best method for analysis of complex survey data.
UR - http://www.scopus.com/inward/record.url?scp=85077458402&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85077458402&partnerID=8YFLogxK
U2 - 10.7189/jogh.09.020902
DO - 10.7189/jogh.09.020902
M3 - Article
C2 - 31893037
AN - SCOPUS:85077458402
SN - 2047-2978
VL - 9
JO - Journal of Global Health
JF - Journal of Global Health
IS - 2
M1 - 020902
ER -