TY - GEN
T1 - Sampling-based subnetwork identification from microarray data and protein-protein interaction network
AU - Wang, Xiao
AU - Gu, Jinghua
AU - Xuan, Jianhua
AU - Shajahan, Ayesha N.
AU - Clarke, Robert
AU - Chen, Li
N1 - Funding Information:
This work was supported by Polish Ministry of Science and Higher Education, Grant N N401 267339. Authors are very grateful to Dr. David C. Kilpatrick for critical reading of the manuscript.
PY - 2012
Y1 - 2012
AB - Identification of condition-specific protein interaction subnetworks has emerged as an attractive research field to reveal molecular mechanisms of diseases and provide reliable network biomarkers for disease diagnosis. Several methods have been proposed, which integrate gene expression and protein-protein interaction (PPI) data to identify subnetworks. However, existing methods treat differential expression of genes and network topology independently, which is an oversimplified assumption to model real biological systems. In this paper, we propose a sampling-based subnetwork identification approach to take into account the dependency between gene expression and network topology. Specifically, we apply Markov random field (MRF) theory to model the dependency of genes in the PPI network using a Bayesian framework, followed by a Markov Chain Monte Carlo (MCMC) approach to identify significant subnetworks. The MCMC approach estimates the posterior distribution of genes' significance scores and network structure iteratively. Experimental results on both synthetic data and real breast cancer data demonstrated the effectiveness of the proposed method in identifying subnetworks, especially several functionally important, aberrant subnetworks associated with pathways involved in the development and recurrence of breast cancer.
KW - Breast cancer
KW - Gene expression
KW - Markov Chain Monte Carlo (MCMC)
KW - Markov random field (MRF)
KW - Protein-protein interaction (PPI)
KW - Subnetwork identification
UR - http://www.scopus.com/inward/record.url?scp=84873607968&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84873607968&partnerID=8YFLogxK
U2 - 10.1109/ICMLA.2012.221
DO - 10.1109/ICMLA.2012.221
M3 - Conference contribution
AN - SCOPUS:84873607968
SN - 9780769549132
T3 - Proceedings - 2012 11th International Conference on Machine Learning and Applications, ICMLA 2012
SP - 158
EP - 163
BT - Proceedings - 2012 11th International Conference on Machine Learning and Applications, ICMLA 2012
PB - IEEE Computer Society
T2 - 11th IEEE International Conference on Machine Learning and Applications, ICMLA 2012
Y2 - 12 December 2012 through 15 December 2012
ER -