TY - JOUR
T1 - Identifying differential correlation in gene/pathway combinations
AU - Braun, Rosemary
AU - Cope, Leslie
AU - Parmigiani, Giovanni
N1 - Funding Information: RB was supported by the Cancer Prevention Fellowship Program, Division of Cancer Prevention, National Cancer Institute, National Institutes of Health. LC and GP were supported by National Science Foundation grant DMS034211.
PY - 2008/11/18
Y1 - 2008/11/18
N2 - Background: An important emerging trend in the analysis of microarray data is to incorporate known pathway information a priori. Expression level "summaries" for pathways, obtained from the expression data for the genes constituting the pathway, permit the inclusion of pathway information, reduce the high dimensionality of microarray data, and have the power to elucidate gene-interaction dependencies which are not already accounted for through known pathway identification. Results: We present a novel method for the analysis of microarray data that identifies joint differential expression in gene-pathway pairs. This method takes advantage of known gene pathway memberships to compute a summary expression level for each pathway as a whole. Correlations between the pathway expression summary and the expression levels of genes not already known to be associated with the pathway provide clues to gene interaction dependencies that are not already accounted for through known pathway identification, and statistically significant differences between gene-pathway correlations in phenotypically different cells (e.g., where the expression level of a single gene and a given pathway summary correlate strongly in normal cells but weakly in tumor cells) may indicate biologically relevant gene-pathway interactions. Here, we detail the methodology and present the results of this method applied to two gene-expression datasets, identifying gene-pathway pairs which exhibit differential joint expression by phenotype. Conclusion: The method described herein provides a means by which interactions between large numbers of genes may be identified by incorporating known pathway information to reduce the dimensionality of gene interactions. The method is efficient and easily applied to data sets of ∼10^2 arrays. Application of this method to two publicly-available cancer data sets yields suggestive and promising results. This method has the potential to complement gene-at-a-time analysis techniques for microarray analysis by indicating relationships between pathways and genes that have not previously been identified and which may play a role in disease.
AB - Background: An important emerging trend in the analysis of microarray data is to incorporate known pathway information a priori. Expression level "summaries" for pathways, obtained from the expression data for the genes constituting the pathway, permit the inclusion of pathway information, reduce the high dimensionality of microarray data, and have the power to elucidate gene-interaction dependencies which are not already accounted for through known pathway identification. Results: We present a novel method for the analysis of microarray data that identifies joint differential expression in gene-pathway pairs. This method takes advantage of known gene pathway memberships to compute a summary expression level for each pathway as a whole. Correlations between the pathway expression summary and the expression levels of genes not already known to be associated with the pathway provide clues to gene interaction dependencies that are not already accounted for through known pathway identification, and statistically significant differences between gene-pathway correlations in phenotypically different cells (e.g., where the expression level of a single gene and a given pathway summary correlate strongly in normal cells but weakly in tumor cells) may indicate biologically relevant gene-pathway interactions. Here, we detail the methodology and present the results of this method applied to two gene-expression datasets, identifying gene-pathway pairs which exhibit differential joint expression by phenotype. Conclusion: The method described herein provides a means by which interactions between large numbers of genes may be identified by incorporating known pathway information to reduce the dimensionality of gene interactions. The method is efficient and easily applied to data sets of ∼10^2 arrays. Application of this method to two publicly-available cancer data sets yields suggestive and promising results. This method has the potential to complement gene-at-a-time analysis techniques for microarray analysis by indicating relationships between pathways and genes that have not previously been identified and which may play a role in disease.
UR - http://www.scopus.com/inward/record.url?scp=58149301302&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=58149301302&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-9-488
DO - 10.1186/1471-2105-9-488
M3 - Article
C2 - 19017408
AN - SCOPUS:58149301302
SN - 1471-2105
VL - 9
JO - BMC Bioinformatics
JF - BMC Bioinformatics
M1 - 488
ER -