Identification of Transposable Element families from pangenome polymorphisms with pantera
Pío Sierra 1, Richard Durbin
1 : Department of Genetics, University of Cambridge
Downing Pl, Cambridge CB2 3EH -  United Kingdom

Most methods used to identify transposable elements (TEs) in a newly sequenced genome rely either on matches to known families from other species or on the elements being repeated at high copy number. As species with more than one high-quality assembly become more common, including those from diploid individuals with both haplotypes assembled, an alternative strategy becomes possible: focusing on the polymorphic character of TEs caused by their mobility. We present a method, pantera, that uses structural polymorphisms found in pangenomes to create a library of TE families recently active in a species or in a closely related group of species. This approach is particularly strong at finding full-length TE sequences and low-copy-number families. We will show the results of applying pantera both to well-studied species with curated TE libraries and to a wide range of species from the Darwin Tree of Life project. As an illustration, we will discuss more than 700 Mavericks identified in 411 Lepidoptera species, which fall into at least four groups differentiated by sequence homology, internal structure and length distribution.

