TY - JOUR
T1 - Machine learning to detect the SINEs of cancer
AU - Douville, Christopher
AU - Lahouel, Kamel
AU - Kuo, Albert
AU - Grant, Haley
AU - Avigdor, Bracha Erlanger
AU - Curtis, Samuel D.
AU - Summers, Mahmoud
AU - Cohen, Joshua D.
AU - Wang, Yuxuan
AU - Mattox, Austin
AU - Dudley, Jonathan
AU - Dobbyn, Lisa
AU - Popoli, Maria
AU - Ptak, Janine
AU - Nehme, Nadine
AU - Silliman, Natalie
AU - Blair, Cherie
AU - Romans, Katharine
AU - Thoburn, Christopher
AU - Gizzi, Jennifer
AU - Schoen, Robert E.
AU - Tie, Jeanne
AU - Gibbs, Peter
AU - Ho-Pham, Lan T.
AU - Tran, Bich N.H.
AU - Tran, Thach S.
AU - Nguyen, Tuan V.
AU - Goggins, Michael
AU - Wolfgang, Christopher L.
AU - Wang, Tian-Li
AU - Shih, Ie-Ming
AU - Lennon, Anne Marie
AU - Hruban, Ralph H.
AU - Bettegowda, Chetan
AU - Kinzler, Kenneth W.
AU - Papadopoulos, Nickolas
AU - Vogelstein, Bert
AU - Tomasetti, Cristian
N1 - Publisher Copyright:
© 2024 American Association for the Advancement of Science. All rights reserved.
PY - 2024/1/24
Y1 - 2024/1/24
AB - We previously described an approach called RealSeqS to evaluate aneuploidy in plasma cell-free DNA through the amplification of ~350,000 repeated elements with a single primer. We hypothesized that an unbiased evaluation of the large amount of sequencing data obtained with RealSeqS might reveal other differences between plasma samples from patients with and without cancer. This hypothesis was tested through the development of a machine learning approach called Alu Profile Learning Using Sequencing (A-PLUS) and its application to 7615 samples from 5178 individuals, 2073 with solid cancer and the remainder without cancer. Samples from patients with cancer and controls were prespecified into four cohorts used for model training; analyte integration and threshold determination; validation; and reproducibility. A-PLUS alone provided a sensitivity of 40.5% across 11 different cancer types in the validation cohort, at a specificity of 98.5%. Combining A-PLUS with aneuploidy and eight common protein biomarkers detected 51% of the cancers at 98.9% specificity. We found that part of the power of A-PLUS could be ascribed to a single feature: the global reduction of AluS subfamily elements in the circulating DNA of patients with solid cancer. We confirmed this reduction through the analysis of another independent dataset obtained with a different approach (whole-genome sequencing). The evaluation of Alu elements may therefore have the potential to enhance the performance of several methods designed for the earlier detection of cancer.
UR - http://www.scopus.com/inward/record.url?scp=85183334590&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85183334590&partnerID=8YFLogxK
U2 - 10.1126/scitranslmed.adi3883
DO - 10.1126/scitranslmed.adi3883
M3 - Article
C2 - 38266106
AN - SCOPUS:85183334590
SN - 1946-6234
VL - 16
JO - Science Translational Medicine
JF - Science Translational Medicine
IS - 731
M1 - eadi3883
ER -