TY - JOUR
T1 - Conserved structure and inferred evolutionary history of long terminal repeats (LTRs)
AU - Benachenhou, Farid
AU - Sperber, Göran O.
AU - Bongcam-Rudloff, Erik
AU - Andersson, Göran
AU - Boeke, Jef D.
AU - Blomberg, Jonas
N1 - Funding Information:
We thank Oscar Eriksson for his invaluable help with computers and software, Hans-Henrik Fuxelius for his useful advice and Aris Katzourakis for valuable initial comments. This work was financially supported by funds given to JB and GA for ERV work from the Swedish Medical Research Council, to JB for bioinformatic development from the Uppsala Academic Hospital, and to EBR and GA from the Swedish University of Agricultural Sciences.
PY - 2013
Y1 - 2013
N2 - Background: Long terminal repeats (LTRs, consisting of U3-R-U5 portions) are important elements of retroviruses and related retrotransposons. They are difficult to analyse due to their variability. The aim was to obtain a more comprehensive view of structure, diversity and phylogeny of LTRs than hitherto possible. Results: Hidden Markov models (HMMs) were created for 11 clades of LTRs belonging to Retroviridae (class III retroviruses), animal Metaviridae (Gypsy/Ty3) elements and plant Pseudoviridae (Copia/Ty1) elements, complementing our work with Orthoretrovirus HMMs. The great variation in LTR length of plant Metaviridae and the few divergent animal Pseudoviridae prevented building HMMs from both of these groups. Animal Metaviridae LTRs had the same conserved motifs as retroviral LTRs, confirming that the two groups are closely related. The conserved motifs were the short inverted repeats (SIRs), integrase recognition signals (5'-TGTTRNRYNYAACA-3'); the polyadenylation signal or AATAAA motif; a GT-rich stretch downstream of the polyadenylation signal; and a less conserved AT-rich stretch corresponding to the core promoter element, the TATA box. Plant Pseudoviridae LTRs differed slightly in having a conserved TATA box, TATATA, but no conserved polyadenylation signal, plus a much shorter R region. The sensitivity of the HMMs for detection in genomic sequences was around 50% for most models, at a relatively high specificity, suitable for genome screening. The HMMs yielded consensus sequences, which were aligned by creating an HMM (a Superviterbi alignment). This yielded a phylogenetic tree that was compared with a Pol-based tree. Both LTR and Pol trees supported monophyly of retroviruses. In both, Pseudoviridae was ancestral to all other LTR retrotransposons. However, the LTR trees showed the chromovirus portion of Metaviridae clustering together with Pseudoviridae, dividing Metaviridae into two portions with distinct phylogeny.
Conclusion: The HMMs clearly demonstrated a unitary conserved structure of LTRs, supporting the view that they arose once during evolution. We attempted to follow the evolution of LTRs by tracing their functional foundations, that is, acquisition of RNase H, a combined promoter/polyadenylation site, integrase, hairpin priming and the primer binding site (PBS). Available information did not support a simple evolutionary chain of events.
AB - Background: Long terminal repeats (LTRs, consisting of U3-R-U5 portions) are important elements of retroviruses and related retrotransposons. They are difficult to analyse due to their variability. The aim was to obtain a more comprehensive view of structure, diversity and phylogeny of LTRs than hitherto possible. Results: Hidden Markov models (HMMs) were created for 11 clades of LTRs belonging to Retroviridae (class III retroviruses), animal Metaviridae (Gypsy/Ty3) elements and plant Pseudoviridae (Copia/Ty1) elements, complementing our work with Orthoretrovirus HMMs. The great variation in LTR length of plant Metaviridae and the few divergent animal Pseudoviridae prevented building HMMs from both of these groups. Animal Metaviridae LTRs had the same conserved motifs as retroviral LTRs, confirming that the two groups are closely related. The conserved motifs were the short inverted repeats (SIRs), integrase recognition signals (5'-TGTTRNRYNYAACA-3'); the polyadenylation signal or AATAAA motif; a GT-rich stretch downstream of the polyadenylation signal; and a less conserved AT-rich stretch corresponding to the core promoter element, the TATA box. Plant Pseudoviridae LTRs differed slightly in having a conserved TATA box, TATATA, but no conserved polyadenylation signal, plus a much shorter R region. The sensitivity of the HMMs for detection in genomic sequences was around 50% for most models, at a relatively high specificity, suitable for genome screening. The HMMs yielded consensus sequences, which were aligned by creating an HMM (a Superviterbi alignment). This yielded a phylogenetic tree that was compared with a Pol-based tree. Both LTR and Pol trees supported monophyly of retroviruses. In both, Pseudoviridae was ancestral to all other LTR retrotransposons. However, the LTR trees showed the chromovirus portion of Metaviridae clustering together with Pseudoviridae, dividing Metaviridae into two portions with distinct phylogeny.
Conclusion: The HMMs clearly demonstrated a unitary conserved structure of LTRs, supporting the view that they arose once during evolution. We attempted to follow the evolution of LTRs by tracing their functional foundations, that is, acquisition of RNase H, a combined promoter/polyadenylation site, integrase, hairpin priming and the primer binding site (PBS). Available information did not support a simple evolutionary chain of events.
KW - Genome evolution
KW - LTR
KW - Long terminal repeat
KW - Phylogeny
KW - Retrotransposon
KW - Retrovirus
UR - http://www.scopus.com/inward/record.url?scp=84875348477&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84875348477&partnerID=8YFLogxK
U2 - 10.1186/1759-8753-4-5
DO - 10.1186/1759-8753-4-5
M3 - Review article
C2 - 23369192
AN - SCOPUS:84875348477
SN - 1759-8753
VL - 4
JO - Mobile DNA
JF - Mobile DNA
IS - 1
M1 - 5
ER -