Retrotransposons are the only type of transposable element that remains active in the human genome. Although most copies are truncated, a few remain able to retrotranspose. Somatic retrotransposition is a hallmark of various cancer types, with up to a hundred insertions detected in colorectal cancers (CRCs).
Characterizing retrotransposon insertions has been challenging with traditional short-read sequencing technologies. The emergence of long-read sequencing has made it possible to explore the structural features of insertions and to identify retrotransposon insertions nested within other repetitive sequences. To systematically detect and annotate both somatic and germline retrotransposon insertions in 104 uterine leiomyomas and 62 CRCs, we developed a pipeline, Transposon Detection in Oxford Nanopore Sequencing data (TraDetIONS). Using TraDetIONS, we identified 1495 somatic insertions in colorectal samples. In contrast, uterine leiomyomas, which are benign neoplasms originating from mesenchymal tissue, exhibited no somatic insertions.
A comparative analysis of somatic and germline insertions revealed differences in transposon class, insertion length, and target site preference. Furthermore, we detected insertions with 5' inversions, processed pseudogenes, and nested retrotransposons (transposons inserted within other transposons). This analysis enabled the characterization of somatic and germline retrotransposition events, leveraging the long sequencing reads provided by Oxford Nanopore technologies.