TY - JOUR
T1 - State-of-the-art methods for exposure-health studies
T2 - Results from the exposome data challenge event
AU - The Exposome Data Challenge Participant Consortium
AU - Maitre, Léa
AU - Guimbaud, Jean Baptiste
AU - Warembourg, Charline
AU - Güil-Oumrait, Nuria
AU - Petrone, Paula Marcela
AU - Chadeau-Hyam, Marc
AU - Vrijheid, Martine
AU - Basagaña, Xavier
AU - Gonzalez, Juan R.
AU - Alfano, Rossella
AU - Basu, Sanjib
AU - Benavides, Jaime
AU - Broséus, Lucile
AU - Brunius, Carl
AU - Caceres, Alejandro
AU - Carli, Matthew
AU - Cazabet, Rémy
AU - Chattopadhyay, Shounak
AU - Chen, Yun Hua
AU - Chillrud, Lawrence
AU - Conti, David
AU - Gennings, Chris
AU - Gouripeddi, Ramkiran
AU - Iyer, Hari S.
AU - Jedynak, Paulina
AU - Li, Huichu
AU - McGee, Glen
AU - Midya, Vishal
AU - Mistry, Sejal
AU - Moccia, Chiara
AU - Mork, Daniel S.
AU - Pearce, John L.
AU - Peruzzi, Michele
AU - Pescador Jimenez, Marcia
AU - Reimann, Brigitte
AU - Roscoe, Charlotte J.
AU - Shen, Xiaotao
AU - Stratakis, Nikos
AU - Wang, Ziyue
AU - Wang, Congrong
AU - Wheeler, David
AU - Wilson, Ander
AU - Wu, Qiong
AU - Yu, Miao
AU - Zhao, Yinqi
AU - Zou, Fei
AU - Zugna, Daniela
AU - Chen, Ruizhe
AU - Chung, Yu Che
AU - Jang, Jiyeong
N1 - Publisher Copyright: © 2022
PY - 2022/10
Y1 - 2022/10
AB - The exposome recognizes that individuals are exposed simultaneously to a multitude of different environmental factors and takes a holistic approach to the discovery of etiological factors for disease. However, challenges arise when trying to quantify the health effects of complex exposure mixtures. Analytical challenges include dealing with high dimensionality, studying the combined effects of these exposures and their interactions, integrating causal pathways, and integrating high-throughput omics layers. To tackle these challenges, the Barcelona Institute for Global Health (ISGlobal) held a data challenge event open to researchers from all over the world and from all expertises. Analysts had a chance to compete and apply state-of-the-art methods on a common partially simulated exposome dataset (based on real case data from the HELIX project) with multiple correlated exposure variables (P > 100 exposure variables) arising from general and personal environments at different time points, biological molecular data (multi-omics: DNA methylation, gene expression, proteins, metabolomics) and multiple clinical phenotypes in 1301 mother–child pairs. Most of the methods presented included feature selection or feature reduction to deal with the high dimensionality of the exposome dataset. Several approaches explicitly searched for combined effects of exposures and/or their interactions using linear index models or response surface methods, including Bayesian methods. Other methods dealt with the multi-omics dataset in mediation analyses using multiple-step approaches. Here we discuss features of the statistical models used and provide the data and codes used, so that analysts have examples of implementation and can learn how to use these methods. Overall, the exposome data challenge presented a unique opportunity for researchers from different disciplines to create and share state-of-the-art analytical methods, setting a new standard for open science in the exposome and environmental health field.
KW - Environmental exposures
KW - Exposome
KW - Multi-omics
KW - Multiple exposures
KW - Statistical models
UR - http://www.scopus.com/inward/record.url?scp=85137034873&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85137034873&partnerID=8YFLogxK
U2 - 10.1016/j.envint.2022.107422
DO - 10.1016/j.envint.2022.107422
M3 - Article
C2 - 36058017
AN - SCOPUS:85137034873
SN - 0160-4120
VL - 168
JO - Environment International
JF - Environment International
M1 - 107422
ER -