We consider a regression-to-the-mean problem with a very large sample for the first measurement and a relatively small subsample for the second measurement, selected on the basis of the initial measurement. This situation often arises in screening trials. We propose estimating the unselected population mean and variance from the first measurement in the larger sample. Using these estimates, the correlation between the two measurements, as well as the effect of treatment, can be estimated in simple and explicit form. Provided that the size of the subsample is of a smaller order than that of the full sample, the new estimators for all four parameters are asymptotically as efficient as the usual maximum likelihood estimators. Tests based on this new approach are also discussed. An illustration from a cholesterol screening study is included.
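The two-stage idea described above can be sketched in Python. This is an illustrative sketch only, not the estimators derived in the paper, which the abstract does not state. It assumes a bivariate normal model in which both measurements share the same variance, the treatment shifts the second-measurement mean down by an amount `delta`, and selection depends only on the first measurement. Under those assumptions the least-squares slope of the second measurement on the first within the selected subsample estimates the correlation, and the treatment effect follows from the usual regression-to-the-mean correction. The function name `rtm_estimates` and its arguments are hypothetical.

```python
import statistics


def rtm_estimates(first_all, first_sub, second_sub):
    """Illustrative two-stage estimates (hypothetical helper, not from the paper).

    first_all  : first measurements for the whole screened sample (a list)
    first_sub  : first measurements for the selected subsample
    second_sub : second measurements for the same selected subjects

    Assumes equal variances at both occasions and selection on the first
    measurement only. Returns (mu_hat, var_hat, rho_hat, delta_hat).
    """
    if len(first_sub) != len(second_sub):
        raise ValueError("subsample measurements must be paired")
    if len(first_sub) < 2:
        raise ValueError("need at least two selected subjects")

    # Stage 1: unselected population mean and variance from the large sample.
    mu_hat = statistics.fmean(first_all)
    var_hat = statistics.pvariance(first_all, mu=mu_hat)

    # Stage 2: regression of the second measurement on the first in the
    # subsample. Selecting on the first measurement leaves this slope
    # unbiased, and with equal variances the slope equals the correlation.
    mean1 = statistics.fmean(first_sub)
    mean2 = statistics.fmean(second_sub)
    sxy = sum((x - mean1) * (y - mean2) for x, y in zip(first_sub, second_sub))
    sxx = sum((x - mean1) ** 2 for x in first_sub)
    rho_hat = sxy / sxx

    # Observed change minus the part expected from regression to the mean.
    delta_hat = (mean1 - mean2) - (1.0 - rho_hat) * (mean1 - mu_hat)

    return mu_hat, var_hat, rho_hat, delta_hat
```

In this simplified sketch the variance estimate is reported but does not enter the correlation or treatment-effect formulas; that is a consequence of the equal-variance assumption made here and need not hold for the paper's own estimators.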
ASJC Scopus subject areas
- Statistics and Probability
- General Biochemistry, Genetics and Molecular Biology
- General Immunology and Microbiology
- General Agricultural and Biological Sciences
- Applied Mathematics