soloLTRseeker: a new tool to identify soloLTR sequences
Estela Perez-Roman 1@, Alexandros Bousios 1
1: School of Life Sciences, University of Sussex
Brighton BN1 9RH, United Kingdom

Transposable elements (TEs) are mobile DNA sequences present in virtually all eukaryotic organisms. They are repetitive sequences that can form complex structures, which are problematic to annotate and resolve. Recent advances in sequencing technologies have accelerated the publication of high-quality genome assemblies, often sequenced from telomere to telomere and including regions of high TE density. These developments have brought about a flurry of algorithms and annotation tools that collectively aim to help researchers study TEs instead of masking out their sequences. One type of TE structure, however, remains hard to identify, with scientists resorting to ad hoc approaches for its annotation. These structures, known as soloLTRs, are the product of unequal homologous recombination between the LTRs of the same or closely located LTR retrotransposons. During soloLTR formation, the internal region between the two LTRs and one of the two LTRs are deleted, leaving a single LTR. This is a dynamic process that has been observed across species. Quantifying these events, and comparing their intensity and impact between hosts, requires accurate and efficient soloLTR annotation. We have therefore developed a computational pipeline, soloLTRseeker, tailored for this task. soloLTRseeker requires as input the fasta and gff3 files of the full-length sequences and LTRs of intact LTR retrotransposons. Through a series of steps and filters, soloLTRseeker annotates high-quality soloLTRs that contain target-site duplications, and also generates multiple intermediate files and analyses. The pipeline incorporates the TEsorter tool to allocate elements of the Ty1/Copia and Ty3 superfamilies into lineages (e.g. ATHILA, SIRE), and is therefore suited for plant genomes only. Although soloLTRseeker is still under development, we have run it on several angiosperm and gymnosperm species, and our tests show that it generates high-quality annotations.
Overall, our aim with soloLTRseeker is to develop a tool specifically designed to identify soloLTRs, and thereby to fill this gap in TE annotation pipelines.
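
To illustrate the target-site duplication (TSD) criterion mentioned above, the following is a minimal Python sketch of how a candidate soloLTR could be checked for flanking TSDs. It is not taken from soloLTRseeker: the function name find_tsd, the 0-based half-open coordinates, the 4 to 6 bp length range and the exact-match requirement are all assumptions made for illustration only.

```python
def find_tsd(seq, start, end, min_len=4, max_len=6):
    """Return the longest exact TSD flanking seq[start:end], or None.

    Hypothetical helper, not part of soloLTRseeker.
    Assumptions: coordinates are 0-based and half-open, the TSD is
    4-6 bp long, and both copies must match exactly.
    """
    # Try the longest candidate length first, then shorter ones.
    for k in range(max_len, min_len - 1, -1):
        # Skip lengths that would run off either end of the sequence.
        if start - k < 0 or end + k > len(seq):
            continue
        left = seq[start - k:start].upper()   # k bases upstream of the LTR
        right = seq[end:end + k].upper()      # k bases downstream of the LTR
        if left == right:
            return left
    return None


# Toy sequence: flank + TSD + candidate LTR + TSD + flank
toy = "AAAC" + "GTCAT" + "TGTTGGCCAACA" + "GTCAT" + "CCC"
tsd = find_tsd(toy, 9, 21)
print(tsd if tsd else "no TSD found")
```

A real filter would probably tolerate a mismatch and take its coordinates from a gff3 file; exact matching keeps the sketch short.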

