The sequence kernel association test for multicategorical outcomes

Zhiwen Jiang, Haoyu Zhang, Thomas U. Ahearn, Montserrat Garcia-Closas, Nilanjan Chatterjee, Hongtu Zhu, Xiang Zhan, Ni Zhao

Research output: Contribution to journal > Article > peer-review


Disease heterogeneity is ubiquitous in biomedical and clinical studies. In genetic studies, researchers are increasingly interested in understanding the distinct genetic underpinnings of disease subtypes. However, existing set-based analysis methods for genome-wide association studies are either inadequate or inefficient for handling such multicategorical outcomes. In this paper, we proposed a novel set-based association analysis method, SKAT-MC, the sequence kernel association test (SKAT) for multicategorical outcomes (nominal or ordinal), which jointly evaluates the relationship between a set of variants (common and rare) and disease subtypes. Through comprehensive simulation studies, we showed that SKAT-MC effectively preserves the nominal type I error rate while substantially increasing statistical power compared to existing methods under various scenarios. We applied SKAT-MC to the Polish breast cancer study (PBCS) and found that the gene FGFR2 was significantly associated with estrogen receptor (ER)+ and ER− breast cancer subtypes. We also investigated educational attainment using UK Biobank data with SKAT-MC and identified 21 significant genes in the genome. Consequently, SKAT-MC is a powerful and efficient analysis tool for genetic association studies with multicategorical outcomes. A freely distributed R package SKAT-MC can be accessed at

Original language: English (US)
Pages (from-to): 432-449
Number of pages: 18
Journal: Genetic Epidemiology
Issue number: 6
State: Published - Sep 2023


  • SKAT
  • multicategorical data
  • the generalized logit model
  • the proportional odds model

ASJC Scopus subject areas

  • Epidemiology
  • Genetics (clinical)

