Transposons, often called "jumping genes," are DNA sequences that have the capacity to move from one chromosomal site to another. More than three million copies of transposons have accumulated in humans throughout the course of evolution and now comprise an estimated 45% of the total DNA content in the human genome.
These mobile genetic elements are scattered throughout the human genome, separated on average by only 500 base pairs. But Dr. John Mattick's laboratory at the University of Queensland, Australia, identified long tracts of genomic sequence (greater than 10 kilobases in length) that lack transposable elements. His team identified 860 such sequences in humans, 993 in mice, and 559 in opossums. They named these segments TFRs, or transposon-free regions.
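To make the definition concrete, here is a minimal Python sketch of how gaps longer than 10 kilobases between annotated transposons could be found on a single chromosome. It is an illustration only, not the pipeline used by Mattick's laboratory: the function name `transposon_free_regions`, the half-open coordinate convention, and the toy coordinates are all assumptions made for this example.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation: half-open [start, end) coordinates in base pairs.
Interval = Tuple[int, int]


def transposon_free_regions(
    transposons: Iterable[Interval],
    chrom_length: int,
    min_length: int = 10_000,
) -> List[Interval]:
    """Return gaps longer than min_length that contain no annotated transposon."""
    regions: List[Interval] = []
    cursor = 0  # end of the right-most transposon seen so far
    for start, end in sorted(transposons):
        # A gap exists only if this transposon starts beyond everything seen so far.
        if start - cursor > min_length:
            regions.append((cursor, start))
        # max() handles overlapping or nested annotations.
        cursor = max(cursor, end)
    # Check the stretch between the last transposon and the chromosome end.
    if chrom_length - cursor > min_length:
        regions.append((cursor, chrom_length))
    return regions


# Toy annotations on an imaginary 30 kb chromosome (made-up coordinates).
toy_transposons = [(1_000, 1_300), (1_200, 2_000), (15_000, 15_500), (16_000, 16_400)]
toy_tfrs = transposon_free_regions(toy_transposons, chrom_length=30_000)
print(toy_tfrs)
```

Sorting the annotations and tracking the furthest transposon end seen so far lets the scan handle overlapping elements in a single pass. A real analysis would also need to account for assembly gaps and run separately for each chromosome.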
"Strikingly," says Mattick, "many TFRs in the human genome occur in the same position in the mouse and opossum genomes, despite the fact that transposons entered each lineage independently, after each species diverged from a common ancestor. It appears that many TFRs are evolutionarily conserved features that existed prior to – and have been largely maintained since – the divergence of eutherian mammals and marsupials approximately 170 million years ago."
The opossum was chosen for inclusion in the analysis because it is a marsupial that carries a load of transposable elements similar to that of mice and humans but is evolutionarily distant from both species. In contrast, the genomes of chicken and fish, which diverged from humans more than 300 million years ago, do not carry a significant density of transposons.
Given the strong evolutionary conservation of the TFRs, Mattick's group hypothesized that they are regions of significant biological importance. Upon further characterizing the TFRs, they discovered that most (85%) overlap at least one annotated gene and that almost all (94%) overlap at least one known RNA transcript. In addition, the TFRs are enriched in microRNAs, in genes that encode proteins with putative DNA-binding activity, and in genes involved in developmental processes. Another striking feature of TFRs is their association with ultra-conserved regions, which are genomic segments longer than 200 base pairs with 100% identity among human, mouse, and rat. All of these observations strongly support an important role for TFRs in critical biological processes.
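The overlap percentages quoted above are simple interval statistics. The sketch below shows one way such a fraction could be computed. It is an assumption-laden illustration, not the group's actual method: `fraction_overlapping` is a hypothetical helper, coordinates are half-open and on a single chromosome, and the numbers are invented.

```python
from typing import List, Tuple

# Hypothetical representation: half-open [start, end) coordinates in base pairs.
Interval = Tuple[int, int]


def fraction_overlapping(regions: List[Interval], features: List[Interval]) -> float:
    """Fraction of regions that overlap at least one feature (e.g. a gene)."""
    if not regions:
        return 0.0
    hits = sum(
        1
        for r_start, r_end in regions
        # Two half-open intervals overlap when each starts before the other ends.
        if any(f_start < r_end and r_start < f_end for f_start, f_end in features)
    )
    return hits / len(regions)


# Invented coordinates: four regions, two annotated features.
toy_regions = [(0, 10), (20, 30), (40, 50), (60, 70)]
toy_genes = [(5, 8), (29, 45)]
toy_fraction = fraction_overlapping(toy_regions, toy_genes)
print(f"{toy_fraction:.0%} of regions overlap at least one gene")
```

The nested loop is quadratic and fine for a few hundred regions. For genome-scale annotation sets, a sorted sweep or an interval tree would be the usual choice.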
"The majority of the TFRs lie outside of protein-coding sequences, so they presumably represent regions of regulatory information or RNA transcripts that cannot be disrupted. However, it's difficult to explain mechanistically the requirement of 10 or more kilobases of uninterrupted sequence in terms of the current paradigms of transcriptional regulation," explains Mattick. "It appears that TFRs might be the passive signatures of one or more poorly understood mechanisms of gene regulation that operate in higher organisms, suggesting a wider role for noncoding sequences than has hitherto been appreciated."
The work was conducted under Mattick's guidance by graduate students Cas Simons and Michael Pheasant, as well as by Dr. Igor Makunin, a postdoctoral researcher.