BACKGROUND: Clonal immunoglobulin and T-cell receptor rearrangements serve as tumor-specific markers that have become mainstays of the diagnosis and monitoring of lymphoid malignancy. Next-generation sequencing (NGS) techniques targeting these loci have been successfully applied to lymphoblastic leukemia and multiple myeloma for minimal residual disease detection. However, adoption of NGS for primary diagnosis remains limited. METHODS: We addressed the bioinformatics challenges associated with immune cell sequencing and clone detection by designing a novel web tool, CloneRetriever (CR), which uses machine-learning principles to generate clone classification schemes that are customizable and can be applied to large datasets. CR has 2 modes: a “validation” mode to derive a clonality classifier, and a “live” mode to screen for clones by applying a validated and/or customized classifier. In this study, CR generated multiple classifiers using 2 datasets comprising 106 annotated patient samples. A custom classifier was then applied to 36 unannotated samples. RESULTS: The optimal classifier for clonality required clonal dominance >4.5 above background, read representation >8% of all reads, and technical replicate agreement. Depending on the dataset and analysis step, the optimal algorithm yielded sensitivities of 81%–90%, specificities of 97%–100%, areas under the curve of 91%–94%, positive predictive values of 92%–100%, and negative predictive values of 88%–98%. Customization of the algorithms yielded 95%–100% concordance with gold-standard clonality determination, including rescue of indeterminate samples. Application to a set of unknowns showed concordance rates of 83%–96%. CONCLUSIONS: CR is a ready-to-use, user-friendly software tool designed to identify clonal rearrangements in large NGS datasets for the diagnosis of lymphoid malignancies.
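The three-part rule reported in RESULTS (dominance, read representation, replicate agreement) can be illustrated with a minimal Python sketch. This is not CloneRetriever's implementation: the function names are hypothetical, and the abstract does not define "background", so the sketch assumes dominance is the ratio of the top clonotype's read count to that of the next most abundant clonotype, and that replicate agreement means both technical replicates share the same top clonotype and each passes both thresholds.

```python
from collections import Counter

# Thresholds quoted in the abstract. Treating 4.5 as a fold ratio is an assumption.
DOMINANCE_MIN = 4.5
READ_FRACTION_MIN = 0.08


def top_clone(counts):
    """Return (clonotype, dominance, read_fraction) for the most abundant clonotype.

    `counts` maps a clonotype identifier to its read count in one replicate.
    """
    if not counts:
        return None, 0.0, 0.0
    total = sum(counts.values())
    ranked = Counter(counts).most_common()
    clonotype, top = ranked[0]
    # Assumption: "background" is the next most abundant clonotype.
    background = ranked[1][1] if len(ranked) > 1 else 0
    dominance = top / background if background else float("inf")
    return clonotype, dominance, top / total


def passes_thresholds(counts):
    """Check the dominance and read-representation criteria for one replicate."""
    clonotype, dominance, fraction = top_clone(counts)
    if clonotype is None:
        return False
    return dominance > DOMINANCE_MIN and fraction > READ_FRACTION_MIN


def is_clonal(rep_a, rep_b):
    """Call a sample clonal only if both replicates agree and both pass."""
    same_top = top_clone(rep_a)[0] == top_clone(rep_b)[0]
    return same_top and passes_thresholds(rep_a) and passes_thresholds(rep_b)


# Illustrative (made-up) read counts, not data from the study.
rep_a = {"CASSL": 900, "CASSP": 60, "CASSQ": 40}
rep_b = {"CASSL": 850, "CASSP": 100, "CASSQ": 50}
polyclonal = {"A": 120, "B": 110, "C": 100}
discordant = {"CASSP": 900, "CASSL": 50}

print(is_clonal(rep_a, rep_b))
print(is_clonal(polyclonal, polyclonal))
print(is_clonal(rep_a, discordant))
```

With these made-up counts, only the first pair satisfies all three criteria. The polyclonal sample fails the dominance threshold, and the discordant pair fails replicate agreement even though each replicate passes on its own.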