TY - JOUR
T1 - Addressing the mean-correlation relationship in co-expression analysis
AU - Wang, Yi
AU - Hicks, Stephanie C.
AU - Hansen, Kasper D.
N1 - Publisher Copyright: © 2022 Wang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2022/3
Y1 - 2022/3
N2 - Estimates of correlation between pairs of genes in co-expression analysis are commonly used to construct networks among genes using gene expression data. As previously noted, the distribution of such correlations depends on the observed expression level of the involved genes, which we refer to as a mean-correlation relationship in RNA-seq data, both bulk and single-cell. This dependence introduces an unwanted technical bias in co-expression analysis whereby highly expressed genes are more likely to be highly correlated. Such a relationship is not observed in protein-protein interaction data, suggesting that it does not reflect biology. Ignoring this bias can lead to missing potentially biologically relevant pairs of genes that are lowly expressed, such as transcription factors. To address this problem, we introduce spatial quantile normalization (SpQN), a method for normalizing local distributions in a correlation matrix. We show that spatial quantile normalization removes the mean-correlation relationship and corrects the expression bias in network reconstruction.
AB - Estimates of correlation between pairs of genes in co-expression analysis are commonly used to construct networks among genes using gene expression data. As previously noted, the distribution of such correlations depends on the observed expression level of the involved genes, which we refer to as a mean-correlation relationship in RNA-seq data, both bulk and single-cell. This dependence introduces an unwanted technical bias in co-expression analysis whereby highly expressed genes are more likely to be highly correlated. Such a relationship is not observed in protein-protein interaction data, suggesting that it does not reflect biology. Ignoring this bias can lead to missing potentially biologically relevant pairs of genes that are lowly expressed, such as transcription factors. To address this problem, we introduce spatial quantile normalization (SpQN), a method for normalizing local distributions in a correlation matrix. We show that spatial quantile normalization removes the mean-correlation relationship and corrects the expression bias in network reconstruction.
UR - http://www.scopus.com/inward/record.url?scp=85128126968&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85128126968&partnerID=8YFLogxK
U2 - 10.1371/journal.pcbi.1009954
DO - 10.1371/journal.pcbi.1009954
M3 - Article
C2 - 35353807
AN - SCOPUS:85128126968
SN - 1553-734X
VL - 18
JO - PLoS Computational Biology
JF - PLoS Computational Biology
IS - 3
M1 - e1009954
ER -