Abstract
Omics data contain signals from the molecular, physical, and kinetic inter- and intracellular interactions that control biological systems. Matrix factorization (MF) techniques can reveal low-dimensional structure from high-dimensional data that reflect these interactions. These techniques can uncover new biological knowledge from diverse high-throughput omics data in applications ranging from pathway discovery to timecourse analysis. We review exemplary applications of MF for systems-level analyses. We discuss appropriate applications of these methods, their limitations, and focus on the analysis of results to facilitate optimal biological interpretation. The inference of biologically relevant features with MF enables discovery from high-throughput data beyond the limits of current biological knowledge – answering questions from high-dimensional data that we have not yet thought to ask.
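As a rough illustration of what "low-dimensional structure" means here, the sketch below factors a toy data matrix into two smaller matrices. It is not a method taken from the article: it uses a plain truncated SVD from NumPy, the matrix sizes and random data are made up, and the names `factorize`, `amplitude`, and `pattern` are illustrative assumptions only.

```python
import numpy as np


def factorize(data, k):
    """Rank-k factorization data ~= amplitude @ pattern via truncated SVD.

    Assumption: SVD is used only as a simple stand-in for the family of
    MF techniques; other factorizations impose different constraints.
    """
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    amplitude = u[:, :k] * s[:k]  # features x k (feature weights per factor)
    pattern = vt[:k, :]           # k x samples (factor activity per sample)
    return amplitude, pattern


# Toy, noise-free data: 100 hypothetical features driven by 2 latent processes
# measured across 12 hypothetical samples.
rng = np.random.default_rng(0)
true_amplitude = rng.random((100, 2))
true_pattern = rng.random((2, 12))
data = true_amplitude @ true_pattern

amplitude, pattern = factorize(data, k=2)

# Fraction of the data not explained by the two recovered factors.
relative_error = np.linalg.norm(data - amplitude @ pattern) / np.linalg.norm(data)
print(amplitude.shape, pattern.shape)
```

Because the toy matrix has exactly two underlying processes and no noise, two factors reconstruct it almost perfectly; real omics data are noisy, so the choice of `k` and of the factorization method would need more care.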
Original language | English (US) |
---|---|
Pages (from-to) | 790-805 |
Number of pages | 16 |
Journal | Trends in Genetics |
Volume | 34 |
Issue number | 10 |
DOIs | 10.1016/j.tig.2018.07.003 |
State | Published - Oct 2018 |
Keywords
- deconvolution
- dimension reduction
- genomics
- matrix factorization
- single cell
- unsupervised learning
ASJC Scopus subject areas
- Genetics
In: Trends in Genetics, Vol. 34, No. 10, 10.2018, p. 790-805.
Research output: Contribution to journal › Review article › peer-review
TY - JOUR
T1 - Enter the Matrix
T2 - Factorization Uncovers Knowledge from Omics
AU - Stein-O'Brien, Genevieve L.
AU - Arora, Raman
AU - Culhane, Aedin C.
AU - Favorov, Alexander V.
AU - Garmire, Lana X.
AU - Greene, Casey S.
AU - Goff, Loyal A.
AU - Li, Yifeng
AU - Ngom, Aloune
AU - Ochs, Michael F.
AU - Xu, Yanxun
AU - Fertig, Elana J.
N1 - Funding Information: We thank Orly Alter, David Berman, J. Brian Byrd, Michael Love, Irene Gallego Romero, Lillian Fritz-Laylin, Luciane Kagohara, Louise Klein, Craig Mak, Matthew Stephens, Daniela Witten, and other members of ‘New PI Slack’ for their insightful feedback. This work was supported by the National Institutes of Health (NIH) National Cancer Institute (NCI) and the National Library of Medicine (NLM) (grants NCI 2P30CA006516-52 and 2P50CA101942-11 to A.C.C., NCI R01CA177669 to E.J.F., NCI U01CA212007 to E.J.F., NLM R01LM011000 to M.F.O., and NCI P30 CA006973), Johns Hopkins University Catalyst and Discovery Awards to E.J.F., a Johns Hopkins University Institute for Data Intensive Engineering and Science (IDIES) Award to E.J.F. and R.A., a Johns Hopkins School of Medicine Synergy award to E.J.F. and L.A.G., a grant from The Gordon and Betty Moore Foundation (GBMF 4552) to C.S.G., Alex’s Lemonade Stand Foundation’s Childhood Cancer Data Lab (C.S.G.), award K01ES025434 from the National Institute of Environmental Health Sciences (NIEHS) through funds provided by the trans-NIH Big Data to Knowledge (BD2K) initiative (L.X.G.), award P20 COBRE GM103457 from the NIH/National Institute of General Medical Sciences (NIGMS; to L.X.G.), R01 LM012373 from the NLM (L.X.G.), R01 HD084633 from the National Institute of Child Health and Human Development (NICHD; to L.X.G.), the Department of Defense Breast Cancer Research Program (BCRP; award BC140682P1, A.C.C.), the Natural Sciences and Engineering Research Council of Canada (NSERC; DG grant number RGPIN-2016-05017, A.N.), the Windsor-Essex County Cancer Centre Foundation (Seeds4Hope grant 814221, A.N.), Hopkins inHealth and Booz Allen Hamilton (90056858) to Y.X., the Russian Foundation for Basic Research (KOMFI 17-00-00208) and NIH (NCI P30 CA006973) to A.V.F., and the National Research Council of Canada to Y.L. This project has been made possible in part by grants 2018-183444 (E.J.F.), 2018-128827 (L.A.G.), and 2018-182718 (C.S.G.) from the Chan Zuckerberg Initiative Donor-Advised Fund (DAF), an advised fund of the Silicon Valley Community Foundation. The views and opinions of, and endorsements by, the author(s) do not reflect those of the US Army or the Department of Defense. Publisher Copyright: © 2018 Elsevier Ltd
PY - 2018/10
Y1 - 2018/10
AB - Omics data contain signals from the molecular, physical, and kinetic inter- and intracellular interactions that control biological systems. Matrix factorization (MF) techniques can reveal low-dimensional structure from high-dimensional data that reflect these interactions. These techniques can uncover new biological knowledge from diverse high-throughput omics data in applications ranging from pathway discovery to timecourse analysis. We review exemplary applications of MF for systems-level analyses. We discuss appropriate applications of these methods, their limitations, and focus on the analysis of results to facilitate optimal biological interpretation. The inference of biologically relevant features with MF enables discovery from high-throughput data beyond the limits of current biological knowledge – answering questions from high-dimensional data that we have not yet thought to ask.
KW - deconvolution
KW - dimension reduction
KW - genomics
KW - matrix factorization
KW - single cell
KW - unsupervised learning
UR - http://www.scopus.com/inward/record.url?scp=85051770269&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85051770269&partnerID=8YFLogxK
U2 - 10.1016/j.tig.2018.07.003
DO - 10.1016/j.tig.2018.07.003
M3 - Review article
C2 - 30143323
AN - SCOPUS:85051770269
SN - 0168-9525
VL - 34
SP - 790
EP - 805
JO - Trends in Genetics
JF - Trends in Genetics
IS - 10
ER -