A machine learning based approach towards high-dimensional mediation analysis

Tanmay Nath; Brian Caffo; Tor Wager; Martin A. Lindquist

doi:10.1016/j.neuroimage.2022.119843

Bloomberg School of Public Health

Research output: Contribution to journal › Article › peer-review

Abstract

Mediation analysis is used to investigate the role of intermediate variables (mediators) that lie in the path between an exposure and an outcome variable. While significant research has focused on developing methods for assessing the influence of mediators on the exposure-outcome relationship, current approaches do not easily extend to settings where the mediator is high-dimensional. These situations are becoming increasingly common with the rapid increase of new applications measuring massive numbers of variables, including brain imaging, genomics, and metabolomics. In this work, we introduce a novel machine learning based method for identifying high dimensional mediators. The proposed algorithm iterates between using a machine learning model to map the high-dimensional mediators onto a lower-dimensional space, and using the predicted values as input in a standard three-variable mediation model. Hence, the machine learning model is trained to maximize the likelihood of the mediation model. Importantly, the proposed algorithm is agnostic to the machine learning model that is used, providing significant flexibility in the types of situations where it can be used. We illustrate the proposed methodology using data from two functional Magnetic Resonance Imaging (fMRI) studies. First, using data from a task-based fMRI study of thermal pain, we combine the proposed algorithm with a deep learning model to detect distributed, network-level brain patterns mediating the relationship between stimulus intensity (temperature) and reported pain at the single trial level. Second, using resting-state fMRI data from the Human Connectome Project, we combine the proposed algorithm with a connectome-based predictive modeling approach to determine brain functional connectivity measures that mediate the relationship between fluid intelligence and working memory accuracy. 
In both cases, our multivariate mediation model links exposure variables (thermal pain or fluid intelligence), high dimensional brain measures (single-trial brain activation maps or resting-state brain connectivity) and behavioral outcomes (pain report or working memory accuracy) into a single unified model. Using the proposed approach, we are able to identify brain-based measures that simultaneously encode the exposure variable and correlate with the behavioral outcome.
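The "standard three-variable mediation model" mentioned in the abstract can be illustrated with a short sketch. This is not the authors' implementation: it assumes linear paths fit by ordinary least squares, and it replaces the learned machine learning mapping with a fixed linear projection to one dimension. The helper names (`ols`, `three_variable_mediation`) and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np


def ols(design, target):
    """Least-squares coefficients for target ~ design (design carries its own intercept column)."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def three_variable_mediation(x, m, y):
    """Fit a linear mediation model for scalar exposure x, mediator m, and outcome y."""
    ones = np.ones_like(x)
    a = ols(np.column_stack([ones, x]), m)[1]                # path a: exposure -> mediator
    c_total = ols(np.column_stack([ones, x]), y)[1]          # total effect c: exposure -> outcome
    _, c_direct, b = ols(np.column_stack([ones, x, m]), y)   # direct effect c' and path b
    return {"a": a, "b": b, "c_total": c_total,
            "c_direct": c_direct, "indirect": a * b}


# Synthetic data (assumption): 200 observations, 50-dimensional mediator.
rng = np.random.default_rng(0)
n, p = 200, 50
x = rng.normal(size=n)
w_true = rng.normal(size=p) / np.sqrt(p)
M = np.outer(x, w_true) + rng.normal(size=(n, p))   # mediators driven by the exposure
y = 0.5 * x + M @ w_true + rng.normal(size=n)       # outcome depends on exposure and mediators

# Stand-in for the machine learning step: a fixed projection onto one dimension.
# In the paper this mapping is learned; w_true is used here purely for illustration.
m_low = M @ w_true

paths = three_variable_mediation(x, m_low, y)
print(paths)
```

With least-squares fits on the same sample, the total effect decomposes exactly as `c_total = c_direct + a * b`, so the product `a * b` is the part of the exposure-outcome relationship carried by the low-dimensional mediator. An iterative scheme like the one the abstract describes would update the mapping from `M` to `m_low` between such fits rather than holding it fixed.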

Original language	English (US)
Article number	119843
Journal	NeuroImage
Volume	268
DOIs	https://doi.org/10.1016/j.neuroimage.2022.119843
State	Published - Mar 2023

Keywords

Deep learning
Machine learning
Mediation analysis
Pain
Resting-state functional connectivity
fMRI

ASJC Scopus subject areas

Neurology
Cognitive Neuroscience

Cite this

@article{e45c9d70366e45efa5d265b6edc707e6,

title = "A machine learning based approach towards high-dimensional mediation analysis",

abstract = "Mediation analysis is used to investigate the role of intermediate variables (mediators) that lie in the path between an exposure and an outcome variable. While significant research has focused on developing methods for assessing the influence of mediators on the exposure-outcome relationship, current approaches do not easily extend to settings where the mediator is high-dimensional. These situations are becoming increasingly common with the rapid increase of new applications measuring massive numbers of variables, including brain imaging, genomics, and metabolomics. In this work, we introduce a novel machine learning based method for identifying high dimensional mediators. The proposed algorithm iterates between using a machine learning model to map the high-dimensional mediators onto a lower-dimensional space, and using the predicted values as input in a standard three-variable mediation model. Hence, the machine learning model is trained to maximize the likelihood of the mediation model. Importantly, the proposed algorithm is agnostic to the machine learning model that is used, providing significant flexibility in the types of situations where it can be used. We illustrate the proposed methodology using data from two functional Magnetic Resonance Imaging (fMRI) studies. First, using data from a task-based fMRI study of thermal pain, we combine the proposed algorithm with a deep learning model to detect distributed, network-level brain patterns mediating the relationship between stimulus intensity (temperature) and reported pain at the single trial level. Second, using resting-state fMRI data from the Human Connectome Project, we combine the proposed algorithm with a connectome-based predictive modeling approach to determine brain functional connectivity measures that mediate the relationship between fluid intelligence and working memory accuracy. 
In both cases, our multivariate mediation model links exposure variables (thermal pain or fluid intelligence), high dimensional brain measures (single-trial brain activation maps or resting-state brain connectivity) and behavioral outcomes (pain report or working memory accuracy) into a single unified model. Using the proposed approach, we are able to identify brain-based measures that simultaneously encode the exposure variable and correlate with the behavioral outcome.",

keywords = "Deep learning, Machine learning, Mediation analysis, Pain, Resting-state functional connectivity, fMRI",

author = "Tanmay Nath and Brian Caffo and Tor Wager and Lindquist, {Martin A.}",

note = "Funding Information: Data were provided by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657), which was funded by the McDonnell Center for Systems Neuroscience at Washington University and the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research. This research was supported in part by NIH grants R01EB016061 and R01EB026549 from the National Institute of Biomedical Imaging and Bioengineering, R01MH076136 from National Institute of Mental Health, and Oracle Cloud credits and related resources provided by the Oracle for Research program. We are particularly thankful to Bryan Barker and Rajib Ghosh for providing support with the Oracle cluster. Publisher Copyright: {\textcopyright} 2022 The Authors",

year = "2023",

month = mar,

doi = "10.1016/j.neuroimage.2022.119843",

language = "English (US)",

volume = "268",

journal = "NeuroImage",

issn = "1053-8119",

publisher = "Academic Press Inc.",

}

TY - JOUR

T1 - A machine learning based approach towards high-dimensional mediation analysis

AU - Nath, Tanmay

AU - Caffo, Brian

AU - Wager, Tor

AU - Lindquist, Martin A.

N1 - Funding Information: Data were provided by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657), which was funded by the McDonnell Center for Systems Neuroscience at Washington University and the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research. This research was supported in part by NIH grants R01EB016061 and R01EB026549 from the National Institute of Biomedical Imaging and Bioengineering, R01MH076136 from National Institute of Mental Health, and Oracle Cloud credits and related resources provided by the Oracle for Research program. We are particularly thankful to Bryan Barker and Rajib Ghosh for providing support with the Oracle cluster. Publisher Copyright: © 2022 The Authors

PY - 2023/3

Y1 - 2023/3

N2 - Mediation analysis is used to investigate the role of intermediate variables (mediators) that lie in the path between an exposure and an outcome variable. While significant research has focused on developing methods for assessing the influence of mediators on the exposure-outcome relationship, current approaches do not easily extend to settings where the mediator is high-dimensional. These situations are becoming increasingly common with the rapid increase of new applications measuring massive numbers of variables, including brain imaging, genomics, and metabolomics. In this work, we introduce a novel machine learning based method for identifying high dimensional mediators. The proposed algorithm iterates between using a machine learning model to map the high-dimensional mediators onto a lower-dimensional space, and using the predicted values as input in a standard three-variable mediation model. Hence, the machine learning model is trained to maximize the likelihood of the mediation model. Importantly, the proposed algorithm is agnostic to the machine learning model that is used, providing significant flexibility in the types of situations where it can be used. We illustrate the proposed methodology using data from two functional Magnetic Resonance Imaging (fMRI) studies. First, using data from a task-based fMRI study of thermal pain, we combine the proposed algorithm with a deep learning model to detect distributed, network-level brain patterns mediating the relationship between stimulus intensity (temperature) and reported pain at the single trial level. Second, using resting-state fMRI data from the Human Connectome Project, we combine the proposed algorithm with a connectome-based predictive modeling approach to determine brain functional connectivity measures that mediate the relationship between fluid intelligence and working memory accuracy. 
In both cases, our multivariate mediation model links exposure variables (thermal pain or fluid intelligence), high dimensional brain measures (single-trial brain activation maps or resting-state brain connectivity) and behavioral outcomes (pain report or working memory accuracy) into a single unified model. Using the proposed approach, we are able to identify brain-based measures that simultaneously encode the exposure variable and correlate with the behavioral outcome.

KW - Deep learning

KW - Machine learning

KW - Mediation analysis

KW - Pain

KW - Resting-state functional connectivity

KW - fMRI

UR - http://www.scopus.com/inward/record.url?scp=85147782916&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85147782916&partnerID=8YFLogxK

U2 - 10.1016/j.neuroimage.2022.119843

DO - 10.1016/j.neuroimage.2022.119843

M3 - Article

C2 - 36586543

AN - SCOPUS:85147782916

SN - 1053-8119

VL - 268

JO - NeuroImage

JF - NeuroImage

M1 - 119843

ER -
