TY - JOUR

T1 - Modelling multivariate binary data with alternating logistic regressions

AU - Carey, Vincent

AU - Zeger, Scott L.

AU - Diggle, Peter

N1 - Funding Information:
The authors thank Karen Bandeen-Roche and John Hart, Jr., for helpful discussions, and John Hart, Jr., and Barry Gordon for permission to use the Wada test data. We also thank the referees and associate editor for suggestions which have improved upon an earlier draft. Scott L. Zeger gratefully acknowledges support from the US National Institutes of Health grant AI 25529 and from the Merck, Sharp and Dohme Research Laboratory.

PY - 1993/9

Y1 - 1993/9

N2 - SUMMARY: Marginal models for multivariate binary data permit separate modelling of the relationship of the response with explanatory variables, and the association between pairs of responses. When the former is the scientific focus, a first-order generalized estimating equation method (Liang & Zeger, 1986) is easy to implement and gives efficient estimates of regression coefficients, although estimates of the association among the binary outcomes can be inefficient. When the association model is a focus, simultaneous modelling of the responses and all pairwise products (Prentice, 1988) using second-order estimating equations gives more efficient estimates of association parameters as well. However, this procedure can become computationally infeasible as the cluster size gets large. This paper proposes an alternative approach, alternating logistic regressions, for simultaneously regressing the response on explanatory variables as well as modelling the association among responses in terms of pairwise odds ratios. This algorithm iterates between a logistic regression using first-order generalized estimating equations to estimate regression coefficients and a logistic regression of each response on others from the same cluster using an appropriate offset to update the odds ratio parameters. For clusters of size n, alternating logistic regression involves evaluation and inversion of matrices of order n^2 rather than n^4 as required for second-order generalized estimating equations. The alternating logistic regression estimates are shown to be reasonably efficient relative to solutions of second-order equations in a few problems. The new method is illustrated with an analysis of neuropsychological tests on patients with epileptic seizures.

AB - SUMMARY: Marginal models for multivariate binary data permit separate modelling of the relationship of the response with explanatory variables, and the association between pairs of responses. When the former is the scientific focus, a first-order generalized estimating equation method (Liang & Zeger, 1986) is easy to implement and gives efficient estimates of regression coefficients, although estimates of the association among the binary outcomes can be inefficient. When the association model is a focus, simultaneous modelling of the responses and all pairwise products (Prentice, 1988) using second-order estimating equations gives more efficient estimates of association parameters as well. However, this procedure can become computationally infeasible as the cluster size gets large. This paper proposes an alternative approach, alternating logistic regressions, for simultaneously regressing the response on explanatory variables as well as modelling the association among responses in terms of pairwise odds ratios. This algorithm iterates between a logistic regression using first-order generalized estimating equations to estimate regression coefficients and a logistic regression of each response on others from the same cluster using an appropriate offset to update the odds ratio parameters. For clusters of size n, alternating logistic regression involves evaluation and inversion of matrices of order n^2 rather than n^4 as required for second-order generalized estimating equations. The alternating logistic regression estimates are shown to be reasonably efficient relative to solutions of second-order equations in a few problems. The new method is illustrated with an analysis of neuropsychological tests on patients with epileptic seizures.

KW - Clustered data

KW - Generalized estimating equation

KW - Logistic regression

UR - http://www.scopus.com/inward/record.url?scp=0000429149&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0000429149&partnerID=8YFLogxK

U2 - 10.1093/biomet/80.3.517

DO - 10.1093/biomet/80.3.517

M3 - Article

AN - SCOPUS:0000429149

SN - 0006-3444

VL - 80

SP - 517

EP - 526

JO - Biometrika

JF - Biometrika

IS - 3

ER -