One of the main reasons for the slow progress in detecting susceptibility genes in complex diseases may be that the clinical diagnoses used as phenotypes are genetically heterogeneous. The general objective of this paper is to develop a latent class model that identifies homogeneous disease sub-types from multivariate disease measurements in pedigrees from genetic studies. Our hypothesis is that the resulting disease sub-types will be influenced by a small number of genes, which will therefore be easier to detect. Specifically, we extend latent class analysis to allow dependence between the latent disease class statuses of relatives within nuclear families as a function of their kinship. Such a dependence model is expected to capture the underlying Mendelian transmission of alleles within families. An EM algorithm maximizes the likelihood, and a cross-validation approach selects the optimal model. Through a simulation study under a genetic disease class model, we show that taking familial dependence into account improves the classification of individuals into their true classes, compared to a traditional model assuming independence. An application of our approach to a dataset from the Autism Genetic Resource Exchange is also presented.
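For orientation, the traditional comparison model mentioned above (latent class analysis with individuals treated as independent) can be sketched in a few lines. This is only an illustrative baseline, not the familial-dependence model of the paper: it assumes binary disease measurements that are conditionally independent given the class, and the function names `simulate` and `fit_lca`, the parameter values, and the toy data are all hypothetical. Cross-validation for model selection is not shown.

```python
import math
import random


def simulate(n, class_probs, item_probs, rng):
    """Draw n individuals: a latent class first, then independent binary items."""
    data, labels = [], []
    for _ in range(n):
        c = rng.choices(range(len(class_probs)), weights=class_probs)[0]
        labels.append(c)
        data.append([1 if rng.random() < p else 0 for p in item_probs[c]])
    return data, labels


def fit_lca(data, n_classes, n_iter=200, seed=0):
    """EM for a latent class model ASSUMING independent individuals.

    pi[c]       : estimated prevalence of class c
    theta[c][j] : estimated P(item j = 1 | class c)
    Returns (pi, theta, posteriors, log-likelihood trace).
    """
    rng = random.Random(seed)
    n_items = len(data[0])
    pi = [1.0 / n_classes] * n_classes
    # Random starting values break the symmetry between classes.
    theta = [[rng.uniform(0.25, 0.75) for _ in range(n_items)]
             for _ in range(n_classes)]
    trace = []
    for _ in range(n_iter):
        # E-step: posterior class probabilities for each individual.
        post, loglik = [], 0.0
        for y in data:
            joint = []
            for c in range(n_classes):
                p = pi[c]
                for j, yj in enumerate(y):
                    p *= theta[c][j] if yj else 1.0 - theta[c][j]
                joint.append(p)
            total = sum(joint)
            loglik += math.log(total)
            post.append([p / total for p in joint])
        trace.append(loglik)
        # M-step: weighted proportions, kept strictly inside (0, 1).
        for c in range(n_classes):
            weight = sum(r[c] for r in post)
            pi[c] = weight / len(data)
            for j in range(n_items):
                num = sum(r[c] * y[j] for r, y in zip(post, data))
                theta[c][j] = min(max(num / weight, 1e-6), 1.0 - 1e-6)
    return pi, theta, post, trace


if __name__ == "__main__":
    rng = random.Random(1)
    # Toy setting (hypothetical values): two classes, six binary symptoms.
    true_theta = [[0.9] * 6, [0.1] * 6]
    data, labels = simulate(300, [0.4, 0.6], true_theta, rng)
    pi, theta, post, trace = fit_lca(data, n_classes=2)
    assigned = [max(range(2), key=lambda c: r[c]) for r in post]
    agree = sum(a == b for a, b in zip(assigned, labels)) / len(labels)
    # Class labels are arbitrary, so report agreement up to relabelling.
    print("estimated prevalences:", [round(p, 3) for p in pi])
    print("agreement with true classes:", round(max(agree, 1 - agree), 3))
```

In this sketch every individual contributes an independent term to the likelihood. The approach described in the abstract would replace that assumption with a joint distribution over the class statuses of relatives in a nuclear family, which changes the E-step from a per-individual computation to a per-family one.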
Keywords
- EM algorithm
- Latent class models
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty