Predicting the functional consequences of somatic missense mutations found in tumors

Hannah Carter, Rachel Karchin

Research output: Chapter in Book/Report/Conference proceeding > Chapter

Abstract

Cancer-specific High-throughput Annotation of Somatic Mutations (CHASM) is a computational method that uses supervised machine learning to prioritize somatic missense mutations detected in tumor sequencing studies. Missense mutations are a key mechanism by which important cellular behaviors, such as cell growth, proliferation, and survival, are disrupted in cancer. However, only a fraction of the missense mutations observed in tumor genomes are expected to be cancer causing. Distinguishing tumorigenic "driver" mutations from their neutral "passenger" counterparts is currently a pressing problem in cancer research. CHASM trains a Random Forest classifier on driver mutations from the COSMIC database and uses background nucleotide substitution rates observed in tumor sequencing data to model tumor type-specific passenger mutations. Each missense mutation is represented by quantitative features that fall into five major categories: physicochemical properties of amino acid residues; scores derived from multiple sequence alignments of protein or DNA; region-based amino acid sequence composition; predicted properties of local protein structure; and annotations from the UniProt feature tables. Both a software package and a Web server implementation of CHASM are available to facilitate high-throughput prioritization of somatic missense mutations from large, multi-tumor exome sequencing studies. After ranking candidate driver mutations with CHASM, the vector of features describing each mutation can be used to suggest possible mechanisms by which mutations alter protein activity in tumorigenesis. This chapter details the application of both implementations of CHASM to tumor sequencing data.
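
The passenger-modeling idea in the abstract (drawing synthetic passenger mutations according to background nucleotide substitution rates) can be illustrated with a minimal sketch. This is not CHASM's implementation: the function names, the `BACKGROUND_RATES` table, and every rate value in it are hypothetical placeholders chosen only for illustration, and a real analysis would estimate the rates from tumor sequencing data for the tumor type of interest.

```python
import random

# Hypothetical relative substitution rates for one tumor type.
# These are placeholder values, NOT estimates from any real dataset.
BACKGROUND_RATES = {
    ("C", "T"): 0.40,
    ("C", "A"): 0.15,
    ("C", "G"): 0.10,
    ("T", "C"): 0.20,
    ("T", "A"): 0.08,
    ("T", "G"): 0.07,
}


def sample_passenger_substitutions(rates, n, seed=None):
    """Draw n (ref, alt) substitutions with probability proportional to rate."""
    rng = random.Random(seed)
    substitutions = list(rates)
    weights = [rates[sub] for sub in substitutions]
    return rng.choices(substitutions, weights=weights, k=n)


def empirical_frequencies(samples):
    """Return the observed fraction of each substitution in the sample."""
    counts = {}
    for sub in samples:
        counts[sub] = counts.get(sub, 0) + 1
    total = len(samples)
    return {sub: count / total for sub, count in counts.items()}


if __name__ == "__main__":
    passengers = sample_passenger_substitutions(BACKGROUND_RATES, 10_000, seed=0)
    for (ref, alt), freq in sorted(empirical_frequencies(passengers).items()):
        print(f"{ref}>{alt}: {freq:.3f}")
```

In a pipeline of the kind the abstract describes, substitutions sampled this way would then be placed in coding sequence, converted to missense changes, and described by the same feature vector as the driver mutations before a classifier is trained; those steps are outside the scope of this sketch.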

Original language: English (US)
Title of host publication: Gene Function Analysis
Publisher: Humana Press Inc.
Pages: 135-159
Number of pages: 25
ISBN (Print): 9781627037204
State: Published - 2014

Publication series

Name: Methods in Molecular Biology
Volume: 1101
ISSN (Print): 1064-3745

Keywords

  • CHASM
  • Drivers and passengers
  • Machine learning
  • Random Forest
  • Somatic mutation analysis
  • Tumor sequencing

ASJC Scopus subject areas

  • Molecular Biology
  • Genetics
