A hierarchical multivariate two-part model for profiling providers' effects on health care charges

John W. Robinson, Scott L. Zeger, Christopher B. Forrest

Research output: Contribution to journal › Review article › peer-review

7 Scopus citations


Procedures for analyzing and comparing health care providers' effects on health services delivery and outcomes have been referred to as provider profiling. In a typical profiling procedure, patient-level responses are measured for clusters of patients treated by providers that in turn can be considered statistically exchangeable. Thus a hierarchical model naturally represents the structure of the data. When provider effects on multiple responses are profiled, a multivariate model rather than a series of univariate models can capture associations among responses at both the provider and patient levels. When responses are in the form of charges for health care services and sampled patients include nonusers of services, charge variables are a mix of 0's and highly skewed positive values that present a modeling challenge. For analysis of covariate effects on charges for a single service, a frequently used approach is a two-part model that combines logistic or probit regression on any use of the service and linear regression on log-positive charges given use of the service. Here we extend the two-part model to the case of charges for multiple services, using a log-linear model and a general multivariate lognormal model, and use the resultant multivariate two-part model as the within-provider component of a hierarchical model. The log-linear likelihood is reparameterized as proposed by Fitzmaurice and Laird, so that covariate effects on any use of each service are marginal with respect to any use of other services. The general multivariate lognormal likelihood is structured in such a way that the variance of log-positive charges for each service is provider-specific but correlations among logs of positive charges for different services are uniform across providers. A data augmentation step is included in the Gibbs sampler used to fit the hierarchical model to accommodate the fact that values of log-positive charges are undefined for unused services. 
We apply this hierarchical, multivariate, two-part model to analyze the effects of primary care physicians on their patients' annual charges for two services, primary care and specialty care. We also demonstrate an approach for incorporating prior information about the effects of patient morbidity on response variables, to improve the accuracy of provider profiles based on patient samples of limited size.
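The single-service two-part model that the abstract takes as its starting point can be illustrated with a minimal sketch. This is not the authors' hierarchical multivariate model. It assumes one service, no covariates, and no provider level, so the logistic part reduces to an intercept whose maximum-likelihood estimate is the logit of the observed proportion of users, and the linear part reduces to the mean and variance of log-positive charges. The function names `fit_two_part` and `expected_charge` are hypothetical and chosen only for illustration.

```python
import math
from statistics import fmean, pvariance


def fit_two_part(charges):
    """Fit an intercept-only two-part model to charges for one service.

    Part 1 models any use of the service (charge > 0).
    Part 2 models log charges among users as normally distributed.
    Assumption: no covariates and no provider-level hierarchy.
    """
    positive = [c for c in charges if c > 0]
    if not positive or len(positive) == len(charges):
        raise ValueError("sample must contain both users and nonusers")

    # Part 1: the MLE of an intercept-only logistic regression is the
    # logit of the sample proportion of users.
    p_use = len(positive) / len(charges)
    logit_intercept = math.log(p_use / (1.0 - p_use))

    # Part 2: normal model for log-positive charges given use.
    logs = [math.log(c) for c in positive]
    mu = fmean(logs)
    sigma2 = pvariance(logs, mu)  # MLE variance (divides by n)

    return {
        "logit_intercept": logit_intercept,
        "p_use": p_use,
        "mu": mu,
        "sigma2": sigma2,
    }


def expected_charge(fit):
    """Combine both parts: E[Y] = P(Y > 0) * E[Y | Y > 0].

    Under the lognormal assumption, E[Y | Y > 0] = exp(mu + sigma2 / 2).
    """
    return fit["p_use"] * math.exp(fit["mu"] + fit["sigma2"] / 2.0)


if __name__ == "__main__":
    # Made-up annual charges for illustration; zeros are nonusers.
    sample = [0.0, 0.0, 120.0, 85.0, 0.0, 430.0, 60.0]
    fit = fit_two_part(sample)
    print(fit)
    print(expected_charge(fit))
```

Keeping the two parts separate mirrors the structure described in the abstract: nonusers contribute only to the any-use part, so their undefined log charges never enter the second part.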

Original language: English (US)
Pages (from-to): 911-923
Number of pages: 13
Journal: Journal of the American Statistical Association
Issue number: 475
State: Published - Sep 2006
Externally published: Yes


Keywords

  • Data augmentation
  • Gibbs sampler
  • Point-of-service health plan
  • Primary care
  • Referral to specialists
  • Rejection sampling

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

