Abstract
Chromosomal DNA is characterized by variation between individuals at the level of entire chromosomes (e.g., aneuploidy in which the chromosome copy number is altered), segmental changes (including insertions, deletions, inversions, and translocations), and changes to small genomic regions (including single nucleotide polymorphisms). A variety of alterations that occur in chromosomal DNA, many of which can be detected using high density single nucleotide polymorphism (SNP) microarrays, are linked to normal variation as well as disease and are therefore of particular interest. These include changes in copy number (deletions and duplications) and genotype (e.g., the occurrence of regions of homozygosity). Hidden Markov models (HMM) are particularly useful for detecting such alterations, modeling the spatial dependence between neighboring SNPs. Here, we improve previous approaches that utilize HMM frameworks for inference in high throughput SNP arrays by integrating copy number, genotype calls, and the corresponding measures of uncertainty when available. Using simulated and experimental data, we, in particular, demonstrate how confidence scores control smoothing in a probabilistic framework. Software for fitting HMMs to SNP array data is available in the R package VanillaICE.
Original language | English (US) |
---|---|
Pages (from-to) | 687-713 |
Number of pages | 27 |
Journal | Annals of Applied Statistics |
Volume | 2 |
Issue number | 2 |
DOIs | |
State | Published - Jun 2008 |
Keywords
- Chromosomal alterations
- DNA copy number
- Hidden Markov models
- Loss of heterozygosity
- SNP arrays
- Single nucleotide polymorphisms
ASJC Scopus subject areas
- Statistics and Probability
- Modeling and Simulation
- Statistics, Probability and Uncertainty