TY - JOUR
T1 - Genome-wide repeat landscapes in cancer and cell-free DNA
AU - Annapragada, Akshaya V.
AU - Niknafs, Noushin
AU - White, James R.
AU - Bruhm, Daniel C.
AU - Cherry, Christopher
AU - Medina, Jamie E.
AU - Adleff, Vilmos
AU - Hruban, Carolyn
AU - Mathios, Dimitrios
AU - Foda, Zachariah H.
AU - Phallen, Jillian
AU - Scharpf, Robert B.
AU - Velculescu, Victor E.
N1 - Publisher Copyright: © 2024 American Association for the Advancement of Science. All rights reserved.
PY - 2024/3/13
Y1 - 2024/3/13
N2 - Genetic changes in repetitive sequences are a hallmark of cancer and other diseases, but characterizing these has been challenging using standard sequencing approaches. We developed a de novo kmer finding approach, called ARTEMIS (Analysis of RepeaT EleMents in dISease), to identify repeat elements from whole-genome sequencing. Using this method, we analyzed 1.2 billion kmers in 2837 tissue and plasma samples from 1975 patients, including those with lung, breast, colorectal, ovarian, liver, gastric, head and neck, bladder, cervical, thyroid, or prostate cancer. We identified tumor-specific changes in these patients in 1280 repeat element types from the LINE, SINE, LTR, transposable element, and human satellite families. These included changes to known repeats and 820 elements that were not previously known to be altered in human cancer. Repeat elements were enriched in regions of driver genes, and their representation was altered by structural changes and epigenetic states. Machine learning analyses of genome-wide repeat landscapes and fragmentation profiles in cfDNA detected patients with early-stage lung or liver cancer in cross-validated and externally validated cohorts. In addition, these repeat landscapes could be used to noninvasively identify the tissue of origin of tumors. These analyses reveal widespread changes in repeat landscapes of human cancers and provide an approach for their detection and characterization that could benefit early detection and disease monitoring of patients with cancer.
UR - http://www.scopus.com/inward/record.url?scp=85187741865&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85187741865&partnerID=8YFLogxK
U2 - 10.1126/scitranslmed.adj9283
DO - 10.1126/scitranslmed.adj9283
M3 - Article
C2 - 38478628
AN - SCOPUS:85187741865
SN - 1946-6234
VL - 16
JO - Science Translational Medicine
JF - Science Translational Medicine
IS - 738
M1 - eadj9283
ER -