TY - JOUR
T1 - CloneRetriever
T2 - An Automated Algorithm to Identify Clonal B and T Cell Gene Rearrangements by Next-Generation Sequencing for the Diagnosis of Lymphoid Malignancies
AU - Halper-Stromberg, Eitan
AU - McCall, Chad M.
AU - Haley, Lisa M.
AU - Lin, Ming Tseh
AU - Vogt, Samantha
AU - Gocke, Christopher D.
AU - Eshleman, James R.
AU - Stevens, Wendy
AU - Martinson, Neil A.
AU - Epeldegui, Marta
AU - Holdhoff, Matthias
AU - Bettegowda, Chetan
AU - Glantz, Michael J.
AU - Ambinder, Richard F.
AU - Xian, Rena R.
N1 - Funding Information:
U.S. Department of Health & Human Services | NIH | National Cancer Institute (NCI) - R01CA250069, R21CA232891, P30CA006973, P30AI094189, UM1CA121947, R21CA220475, U01AI035040, and R01CA228157. N.A. Martinson has received institutional research funding from Pfizer Inc.
Publisher Copyright:
© 2021 American Association for Clinical Chemistry Inc. All rights reserved.
PY - 2021/11/1
Y1 - 2021/11/1
N2 - BACKGROUND: Clonal immunoglobulin and T-cell receptor rearrangements serve as tumor-specific markers that have become mainstays of the diagnosis and monitoring of lymphoid malignancy. Next-generation sequencing (NGS) techniques targeting these loci have been successfully applied to lymphoblastic leukemia and multiple myeloma for minimal residual disease detection. However, adoption of NGS for primary diagnosis remains limited. METHODS: We addressed the bioinformatics challenges associated with immune cell sequencing and clone detection by designing a novel web tool, CloneRetriever (CR), which uses machine-learning principles to generate clone classification schemes that are customizable and can be applied to large datasets. CR has 2 applications: a “validation” mode to derive a clonality classifier, and a “live” mode to screen for clones by applying a validated and/or customized classifier. In this study, CR generated multiple classifiers using 2 datasets comprising 106 annotated patient samples. A custom classifier was then applied to 36 unannotated samples. RESULTS: The optimal classifier for clonality required clonal dominance >4.5 above background, read representation >8% of all reads, and technical replicate agreement. Depending on the dataset and analysis step, the optimal algorithm yielded sensitivities of 81%–90%, specificities of 97%–100%, areas under the curve of 91%–94%, positive predictive values of 92%–100%, and negative predictive values of 88%–98%. Customization of the algorithms yielded 95%–100% concordance with gold-standard clonality determination, including rescue of indeterminate samples. Application to a set of unknowns showed concordance rates of 83%–96%. CONCLUSIONS: CR is an out-of-the-box ready and user-friendly software designed to identify clonal rearrangements in large NGS datasets for the diagnosis of lymphoid malignancies.
AB - BACKGROUND: Clonal immunoglobulin and T-cell receptor rearrangements serve as tumor-specific markers that have become mainstays of the diagnosis and monitoring of lymphoid malignancy. Next-generation sequencing (NGS) techniques targeting these loci have been successfully applied to lymphoblastic leukemia and multiple myeloma for minimal residual disease detection. However, adoption of NGS for primary diagnosis remains limited. METHODS: We addressed the bioinformatics challenges associated with immune cell sequencing and clone detection by designing a novel web tool, CloneRetriever (CR), which uses machine-learning principles to generate clone classification schemes that are customizable and can be applied to large datasets. CR has 2 applications: a “validation” mode to derive a clonality classifier, and a “live” mode to screen for clones by applying a validated and/or customized classifier. In this study, CR generated multiple classifiers using 2 datasets comprising 106 annotated patient samples. A custom classifier was then applied to 36 unannotated samples. RESULTS: The optimal classifier for clonality required clonal dominance >4.5 above background, read representation >8% of all reads, and technical replicate agreement. Depending on the dataset and analysis step, the optimal algorithm yielded sensitivities of 81%–90%, specificities of 97%–100%, areas under the curve of 91%–94%, positive predictive values of 92%–100%, and negative predictive values of 88%–98%. Customization of the algorithms yielded 95%–100% concordance with gold-standard clonality determination, including rescue of indeterminate samples. Application to a set of unknowns showed concordance rates of 83%–96%. CONCLUSIONS: CR is an out-of-the-box ready and user-friendly software designed to identify clonal rearrangements in large NGS datasets for the diagnosis of lymphoid malignancies.
UR - http://www.scopus.com/inward/record.url?scp=85121477117&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85121477117&partnerID=8YFLogxK
U2 - 10.1093/clinchem/hvab141
DO - 10.1093/clinchem/hvab141
M3 - Article
C2 - 34491318
AN - SCOPUS:85121477117
SN - 0009-9147
VL - 67
SP - 1524
EP - 1533
JO - Clinical Chemistry
JF - Clinical Chemistry
IS - 11
ER -