A Bayesian graphical model for integrative analysis of TCGA data: Bayesgraph for TCGA integration

Yanxun Xu, Yitan Zhu, Yuan Ji

Research output: Chapter in Book/Report/Conference proceeding › Chapter


The Cancer Genome Atlas (TCGA) is a research project supported by the National Cancer Institute and the National Human Genome Research Institute to chart the genomic changes involved in more than 20 types of cancer (Network, 2008, 2012). TCGA generates the most comprehensive cancer genomic data, consisting of whole-genome measurements of multiple features (such as DNA sequence, copy number, methylation, and expression) on thousands of matched cancer patient samples. TCGA cancer genomic data have been widely used for cancer research during the past few years (The Cancer Genome Atlas, https://tcga-data.nci.nih.gov/tcga/). At the end of 2012, the number of monthly unique visitors to the TCGA data portal reached close to 1000. More than 200 grant applications cited TCGA data in 2012, and 157 papers using TCGA data were published. We expect the growth of TCGA data usage to be dramatic in the next few years. A hallmark of TCGA and TCGA data is multimodality. That is, multiple genomic characterizations, such as DNA copy number, gene expression, and protein expression, are measured for the same set of biological samples across multiple cancers. Integrative analyses of the multimodal data provide opportunities for a systematic examination across the genomic spectrum of cancer. In particular, we apply a class of Bayesian graphical models to study intragenic interactions among three genomic features: mRNA gene expression, DNA copy number variation (CNV), and DNA methylation. Transcription is a critical genetic process in which DNA is transcribed to RNA. Perturbation of transcription directly affects mRNA expression and hence the subsequent protein production, leading to pathological states. Genetic variations such as CNVs and DNA methylation of the same gene frequently contribute to disrupted gene expression. Such disruption can be detected by learning the intragenic functional interaction between the associated variations.
CNVs result in an abnormal number of copies of DNA and thus change the gene expression level and associated phenotypes. For example, a deletion (loss of both DNA copies) of PAX5 has been found to be associated with acute lymphoblastic leukemia (Shlien and Malkin, 2009). DNA methylation is a biochemical modification that adds a methyl group to the 5 position of the cytosine pyrimidine ring or the number 6 nitrogen of the adenine purine ring. There is strong evidence that abnormal hypermethylation at the gene promoter region results in transcriptional silencing of tumor suppressor genes.

Original language: English (US)
Title of host publication: Integrating Omics Data
Publisher: Cambridge University Press
Number of pages: 16
ISBN (Electronic): 9781107706484
ISBN (Print): 9781107069114
State: Published - Jan 1 2015

ASJC Scopus subject areas

  • Medicine(all)

