TY - JOUR
T1 - A postgenomic method for predicting essential genes at subsaturation levels of mutagenesis
T2 - Application to Mycobacterium tuberculosis
AU - Lamichhane, Gyanu
AU - Zignol, Matteo
AU - Blades, Natalie J.
AU - Geiman, Deborah E.
AU - Dougherty, Annette
AU - Grosset, Jacques
AU - Broman, Karl W.
AU - Bishai, William R.
PY - 2003/6/10
Y1 - 2003/6/10
N2 - We describe a postgenomic in silico approach for identifying genes that are likely to be essential and estimate their proportion in haploid genomes. With the knowledge of all sites eligible for mutagenesis and an experimentally determined partial list of nonessential genes from genome mutagenesis, a Bayesian statistical method provides reasonable predictions of essential genes with a subsaturation level of random mutagenesis. For mutagenesis, a transposon such as Himar1 is suitable as it inserts randomly into TA sites. All of the possible insertion sites may be determined a priori from the genome sequence and with this information, data on experimentally hit TA sites may be used to predict the proportion of genes that cannot be mutated. As a model, we used the Mycobacterium tuberculosis genome. Using the Himar1 transposon, we created a genetically defined collection of 1,425 insertion mutants. Based on our Bayesian statistical analysis using Markov chain Monte Carlo and the observed frequencies of transposon insertions in all of the genes, we estimated that the M. tuberculosis genome contains 35% (95% confidence interval, 28%-41%) essential genes. This analysis further revealed seven functional groups with high probabilities of being enriched in essential genes. The PE-PGRS (Pro-Glu polymorphic GC-rich repetitive sequence) family of genes, which are unique to mycobacteria, the polyketide/nonribosomal peptide synthase family, and mycolic and fatty acid biosynthesis gene families were disproportionately enriched in essential genes. At subsaturation levels of mutagenesis with a random transposon such as Himar1, this approach permits a statistical prediction of both the proportion and identities of essential genes of sequenced genomes.
AB - We describe a postgenomic in silico approach for identifying genes that are likely to be essential and estimate their proportion in haploid genomes. With the knowledge of all sites eligible for mutagenesis and an experimentally determined partial list of nonessential genes from genome mutagenesis, a Bayesian statistical method provides reasonable predictions of essential genes with a subsaturation level of random mutagenesis. For mutagenesis, a transposon such as Himar1 is suitable as it inserts randomly into TA sites. All of the possible insertion sites may be determined a priori from the genome sequence and with this information, data on experimentally hit TA sites may be used to predict the proportion of genes that cannot be mutated. As a model, we used the Mycobacterium tuberculosis genome. Using the Himar1 transposon, we created a genetically defined collection of 1,425 insertion mutants. Based on our Bayesian statistical analysis using Markov chain Monte Carlo and the observed frequencies of transposon insertions in all of the genes, we estimated that the M. tuberculosis genome contains 35% (95% confidence interval, 28%-41%) essential genes. This analysis further revealed seven functional groups with high probabilities of being enriched in essential genes. The PE-PGRS (Pro-Glu polymorphic GC-rich repetitive sequence) family of genes, which are unique to mycobacteria, the polyketide/nonribosomal peptide synthase family, and mycolic and fatty acid biosynthesis gene families were disproportionately enriched in essential genes. At subsaturation levels of mutagenesis with a random transposon such as Himar1, this approach permits a statistical prediction of both the proportion and identities of essential genes of sequenced genomes.
UR - http://www.scopus.com/inward/record.url?scp=0038472048&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0038472048&partnerID=8YFLogxK
U2 - 10.1073/pnas.1231432100
DO - 10.1073/pnas.1231432100
M3 - Article
C2 - 12775759
AN - SCOPUS:0038472048
SN - 0027-8424
VL - 100
SP - 7213
EP - 7218
JO - Proceedings of the National Academy of Sciences of the United States of America
JF - Proceedings of the National Academy of Sciences of the United States of America
IS - 12
ER -