TETrimmer: a tool to replace and assist transposable element manual curation
Jiangzhao Qian  1, 2, 3, 4, 5, 6@  
1 : Hang Xue
Department of Plant and Microbial Biology, University of California, Berkeley, 221A Koshland Hall, Berkeley -  United States
2 : Shujun Ou
Department of Molecular Genetics, Ohio State University, 592 Aronoff Laboratory, 318 W 12th Avenue, Columbus -  United States
3 : Mary Wildermuth
Department of Plant and Microbial Biology, University of California, Berkeley, 221A Koshland Hall, Berkeley -  United States
4 : Lisa Fuertauer
Unit of Plant Molecular Cell Biology, Institute for Biology III, RWTH Aachen University, Worringerweg 1, 52056 Aachen -  Germany
5 : Stefan Kusch
Forschungszentrum Jülich GmbH, Wilhelm-Johnen-Straße, 52428 Jülich -  Germany
6 : Ralph Panstruga
Unit of Plant Molecular Cell Biology, Institute for Biology I, RWTH Aachen University, Worringerweg 1, 52056 Aachen -  Germany

Transposable elements (TEs) are repetitive DNA elements that can change their position within a genome, and they can occupy large proportions of eukaryotic genomes. Many tools have been developed for de novo TE identification, such as RepeatModeler, EDTA, REPET, HiTE, and EarlGrey. However, a high-quality TE annotation still requires manual curation by experts, which is very time-consuming. We developed TETrimmer, a software tool that can replace the main tasks of manual TE curation.

Because the sequence divergence among TE subfamilies can be small, a single file often contains more than one subfamily after BLASTn and multiple sequence alignment (MSA). TETrimmer combines maximum-likelihood phylogenetic trees with the DBSCAN clustering method to efficiently cluster and separate MSAs based on sequence relatedness. TEs annotated by de novo TE annotation software can also be fragmented. TETrimmer automatically identifies the proper extension size, cleans the MSA, and defines TE boundaries. Its cleaning module uses a new algorithm to efficiently remove MSA gaps and poorly conserved regions. Finally, TETrimmer provides a graphical user interface that allows the user to easily review and modify TETrimmer outputs. So far, we have tested TETrimmer on Drosophila melanogaster, Danio rerio, Oryza sativa, Zea mays, Blumeria hordei, and Homo sapiens. Compared with the direct RepeatModeler2 outputs, TETrimmer can dramatically increase TE annotation quality.
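To illustrate what cleaning an MSA can involve, the following minimal Python sketch drops alignment columns that are mostly gaps or poorly conserved. It is not TETrimmer's algorithm, which is not described here; the function name `clean_msa` and the thresholds `max_gap_fraction` and `min_conservation` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def clean_msa(seqs, max_gap_fraction=0.5, min_conservation=0.6):
    """Remove gap-rich and poorly conserved columns from an alignment.

    Illustrative only: the thresholds are assumed values, not TETrimmer defaults.
    seqs is a list of equal-length aligned sequences using '-' for gaps.
    """
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")

    keep = []
    for col in range(length):
        column = [s[col].upper() for s in seqs]
        gaps = column.count("-")
        non_gap = len(column) - gaps
        # Skip columns that are empty or dominated by gaps.
        if non_gap == 0 or gaps / len(column) > max_gap_fraction:
            continue
        # Conservation = share of non-gap residues matching the most common one.
        top_count = Counter(c for c in column if c != "-").most_common(1)[0][1]
        if top_count / non_gap < min_conservation:
            continue
        keep.append(col)

    # Rebuild every sequence from the retained columns only.
    return ["".join(s[i] for i in keep) for s in seqs]


if __name__ == "__main__":
    toy_msa = ["ACG-TA", "ACG-TC", "ACGATG", "ATG-CT"]
    for row in clean_msa(toy_msa):
        print(row)
```

In the toy alignment, the fourth column is removed because three of four sequences have a gap, and the last column is removed because no nucleotide reaches the conservation threshold.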



  • Poster