Finding Genes in DNA using Decision Trees and Dynamic Programming

Steven Salzberg, Xin Chen, John Henderson, Kenneth Fasman

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

This study demonstrates the use of decision tree classifiers as the basis for a general gene-finding system. The system uses a dynamic programming algorithm that finds the optimal segmentation of a DNA sequence into coding and non-coding regions (exons and introns). The optimality of the segmentation depends on a separate scoring function that takes a subsequence and assigns it a score reflecting the probability that the subsequence is an exon. In this study, the scoring functions were sets of decision trees and rules whose outputs were combined to give the probability estimate. Experimental results on a newly collected database of human DNA sequences are encouraging, and the study yielded some new observations about the structure of classifiers for the gene-finding problem. We also describe a new probability chain model that produces very accurate filters for finding donor and acceptor sites.
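
The segmentation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the recurrence, so the one below is a generic optimal-segmentation dynamic program, and `toy_score` is a hypothetical stand-in for the decision-tree scoring functions (it simply rewards G/C bases in exons and A/T bases in introns). The names `best_segmentation`, `toy_score`, and `LABELS` are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Segment = Tuple[int, int, str]  # (start, end_exclusive, label)
LABELS = ("exon", "intron")


def toy_score(subseq: str, label: str) -> float:
    """Hypothetical scorer, NOT the paper's decision trees.

    Rewards G/C bases when the segment is called an exon and
    A/T bases when it is called an intron.
    """
    gc = sum(base in "GC" for base in subseq)
    at = len(subseq) - gc
    return float(gc - at) if label == "exon" else float(at - gc)


def best_segmentation(
    seq: str, score: Callable[[str, str], float] = toy_score
) -> Tuple[float, List[Segment]]:
    """Split seq into alternating exon/intron segments with maximal total score."""
    n = len(seq)
    neg_inf = float("-inf")
    # best[j][k]: best score for seq[:j] when the last segment has label LABELS[k]
    best = [[neg_inf, neg_inf] for _ in range(n + 1)]
    # back[j][k]: start index of that last segment
    back = [[None, None] for _ in range(n + 1)]
    best[0] = [0.0, 0.0]  # empty prefix: either label may come first

    for j in range(1, n + 1):
        for k, label in enumerate(LABELS):
            other = 1 - k  # adjacent segments must carry different labels
            for i in range(j):
                cand = best[i][other] + score(seq[i:j], label)
                if cand > best[j][k]:
                    best[j][k] = cand
                    back[j][k] = i

    # Trace back from the better of the two final labels.
    k = 0 if best[n][0] >= best[n][1] else 1
    total = best[n][k]
    segments: List[Segment] = []
    j = n
    while j > 0:
        i = back[j][k]
        segments.append((i, j, LABELS[k]))
        j, k = i, 1 - k
    segments.reverse()
    return total, segments
```

For example, `best_segmentation("GCGCATATGCGC")` splits the toy sequence into an exon, an intron, and a second exon. The sketch calls the scorer on every candidate subsequence, so it is quadratic in the number of scorer calls; a practical gene finder would restrict candidate boundaries, for instance to predicted donor and acceptor sites.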

Original language: English (US)
Title of host publication: Proceedings of the 4th International Conference on Intelligent Systems for Molecular Biology, ISMB 1996
Publisher: AAAI Press
Pages: 201-210
Number of pages: 10
ISBN (Electronic): 1577350022, 9781577350026
State: Published - 1996
Externally published: Yes
Event: 4th International Conference on Intelligent Systems for Molecular Biology, ISMB 1996 - St. Louis, United States
Duration: Jun 12 1996 to Jun 15 1996

Publication series

Name: Proceedings of the 4th International Conference on Intelligent Systems for Molecular Biology, ISMB 1996

Conference

Conference: 4th International Conference on Intelligent Systems for Molecular Biology, ISMB 1996
Country/Territory: United States
City: St. Louis
Period: 6/12/96 to 6/15/96

ASJC Scopus subject areas

  • General Biochemistry, Genetics and Molecular Biology
  • Artificial Intelligence
  • Information Systems
