Transposable elements as genomic diseases

Andreas Wagner a,b,c,d
a University of Zurich, Dept. of Biochemistry, Bldg. Y27, Winterthurerstrasse 190, CH-8057, Zurich, Switzerland. E-mail: aw@bioc.unizh.ch; Tel: +41-44-635-6141
b University of New Mexico, Department of Biology, 167A Castetter Hall, Albuquerque, NM 87131, USA
c The Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA
d The Swiss Institute of Bioinformatics

Received 21st August 2008, Accepted 22nd September 2008

First published on 27th October 2008


Abstract

Human disease agents can get transmitted both horizontally—through infection—and vertically—from parent to offspring. Depending on details of their evolutionary dynamics, they may increase or decrease in virulence over time. The evolutionary dynamics of bacterial transposable elements resembles that of human pathogens in these and other respects. I here briefly highlight similarities and differences in the two evolutionary processes. I also suggest that an epidemiological perspective, combined with future estimates of parameters of transposable element evolution from hundreds of genomes, may yield insights into the forces that maintain transposable elements in bacterial populations.


Transposable element “life” histories are complex

We have come a long way since the first assertion that transposable elements constitute selfish DNA, and that they are thus bad for their hosts.1,2 For instance, we now know that the simple question of why hosts do not rid themselves of transposable elements does not have a simple answer. First, transposable elements may on occasion cause beneficial mutations,3–8 which may drive their fixation in a population, although their effects are, on average, deleterious.9–15 Second, even in largely clonally reproducing asexual organisms, transposable elements can be shuttled from genome to genome through horizontal gene transfer.16 Third, in populations of sexually reproducing organisms, transposable elements may spread rapidly, even if they cause deleterious mutations.17 Sex has a role analogous to horizontal gene transfer in facilitating the spread of transposable elements. Part of the reason is that if a host organism harbors at least two (unlinked) copies of a transposable element, then more than 50% of its sexual offspring will carry the transposable element. The same would hold for any two copies of non-mobile DNA, but transposable elements can subsequently increase their own copy number. Thus, the combined action of sex and transposition allows transposable elements to gain copies in a population more rapidly than other genes. Fourth, in higher organisms, heterochromatin can form a reservoir for transposable elements,10,13,18,19 where they do little damage, and from where they can continually invade the remainder of the genome, making it difficult to purge them. Fifth, transposable elements can become “domesticated”, a process by which their deleterious effects on the host become smaller.20 Sixth, some transposable elements have particularly insidious reproductive features that make them difficult to eliminate from a genome.
Consider, for instance, the two major classes of transposable elements, DNA transposons and retrotransposons.21 Many DNA transposons transpose through a cut-and-paste mechanism, whereas retrotransposons are transcribed into an RNA molecule, from which copies of the transposon are then made. A few RNA transcripts of a retrotransposon may suffice to make many copies of the transposable element. Also, as opposed to DNA transposons, whose excision (and thus occasional accidental loss) is an integral part of their reproductive cycle, retrotransposons excise very rarely.21 This combination of high amplification and low excision rate allows rapid copy number increases, and may be part of the reason why our genomes are packed with retrotransposons.
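
The statement that more than 50% of sexual offspring inherit the element when a host carries at least two unlinked copies can be checked with a short calculation. The sketch below is an illustration and not part of the original analysis. It assumes that each copy sits on only one homolog (is heterozygous), that copies assort independently, that segregation is Mendelian, and that the mate carries no copies. The function name is hypothetical.

```python
def fraction_offspring_with_element(n_copies: int) -> float:
    """Fraction of offspring that inherit at least one element copy.

    Assumptions (illustrative, not from the article): every copy is
    heterozygous, copies are unlinked and assort independently, and the
    other parent carries no copies. Each copy is then passed on with
    probability 1/2, so all n copies are missed with probability (1/2)**n.
    """
    if n_copies < 0:
        raise ValueError("n_copies must be non-negative")
    return 1.0 - 0.5 ** n_copies


# Print the inherited fraction for one to four unlinked copies.
for n in (1, 2, 3, 4):
    print(n, fraction_offspring_with_element(n))
```

Under these assumptions a single copy is transmitted to half of the offspring, and any additional unlinked copy pushes the fraction above one half. Transposition then adds further copies, which is what distinguishes a mobile element from two copies of non-mobile DNA.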

To disentangle all these interacting factors may be impossible. However, we can examine systems where only a few of them play a role, and where their interaction can thus be better studied. These will preferably be systems with (i) simple genomes, and thus little heterochromatin and strong mutational effects of individual transpositions, (ii) asexual reproduction, to avoid the complications elicited by regular sex, and (iii) abundant available genetic information. The systems that meet these criteria are prokaryotes, with their simple genomes, their merely sporadic sex through horizontal gene transfer, and the availability of hundreds of completely sequenced genomes.

The kiss of death

In these systems, we and others have recently carried out large-scale surveys of multiple families of insertion sequences, arguably the simplest transposable elements.11,22–25 One emerging observation is that, as a rule with some exceptions, transposable elements within any one genome are extremely similar to one another.22,23 This mirrors earlier observations derived from more limited data.26 After excluding gene conversion as an unlikely cause of this extraordinary homogeneity,22 one is left with the conclusion that transposable elements have entered individual bacterial genomes only recently. This observation stands in stark contrast to the often high divergence of transposable elements among genomes, which suggests that any one transposable element family is ancient.22,27,28 How can one reconcile the old age of transposable element families with their recent presence in any one genome? Easily, if transposable elements travel extensively between host genomes. That is, while inhabiting one host, they can be transferred to another host through horizontal gene transfer. If the old host perishes, they persist in the new host. This process may occur concurrently among multiple genomes and populations, thus securing the persistence of transposable elements, even though their individual hosts perish. These evolutionary dynamics can also explain the appearance that individual genomes have acquired transposable elements only recently. They suggest that bacterial genomes infected by insertion sequences typically do not persist on long evolutionary time scales (otherwise the transposable elements would show greater within-genome divergence). In other words, insertion sequences may be the kiss of death for a bacterial genome.

Epidemiology of diseases and transposable elements

These observations complement previous evidence that transposable elements are on average deleterious to their hosts.9–15,29,30 They also cast a spotlight on the evolutionary dynamics of transposable elements among genomes and populations. These dynamics exhibit more than a passing resemblance to the dynamics of human diseases. Many human diseases (Chagas’ disease, HIV, etc.) are transmitted not only horizontally but also vertically from parent to offspring, the analogue of a transposable element in a genome inherited from mother to daughter cell. Also, human diseases can persist in human populations partly through infection (the analogue of horizontal transfer), despite the sometimes lethal damage they can cause. They thus enter new hosts before their old host either dies or expels them. For many human diseases, the restriction of infection, for example through vaccination, can lead to the extinction of the disease. Much the same may hold for some transposable elements and organisms where horizontal transfer or sexual reproduction probably ceased a long time ago.12,31–35 Ancient intracellular prokaryotic symbionts provide a case in point. These organisms live inside the cells of their host. They may thus have little or no opportunity to exchange genetic material with their relatives in other hosts of the same species. As a result, some ancient endosymbionts show fewer transposable elements in their genomes.32,34 Related examples include the bdelloid rotifers, eukaryotes that have probably reproduced asexually for many millions of years. In these species, the incidence of deleterious retrotransposons, which rely on spreading through sexual recombination, is reduced.12,35

Other phenomena observed for infectious diseases have similar analogies in the dynamics of transposable elements. For example, some disease agents can become less virulent (less damaging to the host) over time. Prominent examples include the myxoma viruses, which were introduced to control rabbit populations in Australia in the 1950s, but rapidly evolved reduced virulence (ref. 36, pp. 649–650). Analogously, eventually successful endosymbioses may have begun as parasitic interactions. The counterpart of this phenomenon is the “domestication” of transposable elements mentioned above. Thus, in the evolution of both disease and mobile DNA, an initially damaging agent might eventually come to coexist with or even serve the host. The opposite of domestication takes place when various disease agents (microbial pathogens or transposable elements) compete for reproduction in the same host (multicellular organism or genome). In that case, the most aggressively replicating agent will outcompete the rest, which may lead to increased virulence. It is tempting to speculate that this phenomenon may be at work in some bacterial genomes that are overrun with insertion sequences.23

The challenge of time scale

Despite such similarities, important differences between the evolutionary dynamics of human pathogens and that of transposable elements also exist. These differences render the elucidation of the time scale of transposable element evolution challenging. First, as opposed to many human diseases, transposable elements are transmitted vertically as a rule, and horizontally as the exception. Horizontal transfer rates, for example, are typically much smaller than 10^-2 per cell generation.37–40 The infectious mode of transmission occurs only rarely. Second, and relatedly, the disease dynamics of transposable elements takes place at time scales very different from that of human pathogens. Human microparasites, for example, typically replicate at rates vastly larger than that of their hosts. Thus, many pathogen generations elapse within one host generation. Not so for transposable elements, which are replicated passively with the host genome, that is, at the same time scale. Their second mode of replication, through transposition, takes place at even slower time scales than that of the host, typically at rates between 10^-3 and 10^-5 per host cell generation.14,41–45 Thus, transposable elements may be very slow-acting diseases. In fact, the close relatedness of transposable elements within a genome suggests that their evolutionary dynamics plays out on a time scale that is an uncomfortable intermediate between laboratory time scales (<10^4 generations; Fig. 1) and time scales at which molecular clocks measure time through nucleotide substitutions (>10^7 generations). Whereas the dynamics on laboratory time scales can be measured directly, and whereas the dynamics on evolutionary time scales can be measured by sequence comparison, the dynamics on intermediate time scales can be difficult to measure. If laboratory evolution experiments fail to eliminate transposable elements from a genome,7 the reasons may be found in the time scale at which such elimination would occur.
Fig. 1 Time scales for laboratory evolution experiments, molecular evolution, and transposable element dynamics. Red (blue) regions reflect time scales in generations at which a particular approach can detect evolutionary change well (or poorly). The information in this figure is based on the following considerations. The longest laboratory evolution experiments in a microbial system to date extend through some 10^4 generations. They have been ongoing for more than a decade.49 This is the time scale on which an evolutionary process would need to unfold in order to be detectable in the laboratory. The time resolution of molecular clocks is limited by the time needed to accumulate a single (synonymous) nucleotide substitution, the elementary time unit of molecular evolution. For E. coli in the wild, for example, synonymous nucleotide substitutions accumulate at a rate of circa Ks = 0.009 per gene pair and per 10^8 to 3 × 10^8 generations.50 For a transposable element of approximately 1 kilobase pair in length, one would expect of the order of 9 substitutions in this amount of time. This translates into a rate of substitution that is of the order of one substitution per 10^7 generations. The median within-genome synonymous divergence of insertion sequences in 20 families and 438 genomes is Ks = 0.0087.51 With the substitution rates given above for E. coli, it would take of the order of 10^8 generations to accumulate this amount of change. This corresponds to fewer than ten substitutions per 1 kilobase pair of sequence, a small amount of change for which the statistical power to test evolutionary hypotheses is still very limited.
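
The arithmetic in the caption of Fig. 1 can be reproduced in a few lines. The following sketch uses only the numbers quoted in the caption. Like the caption, it makes the simplifying assumption that the full 1 kilobase pair of an element accumulates substitutions at the synonymous rate. The function and constant names are illustrative.

```python
# Values quoted in the Fig. 1 caption.
KS_ECOLI = 0.009           # synonymous divergence per gene pair in E. coli
GENERATIONS_LOW = 1e8      # lower bound of the interval needed to reach KS_ECOLI
GENERATIONS_HIGH = 3e8     # upper bound of the same interval
ELEMENT_LENGTH_BP = 1000   # approximate length of a transposable element
KS_WITHIN_GENOME = 0.0087  # median within-genome divergence of insertion sequences


def expected_substitutions(ks: float, length_bp: int) -> float:
    """Expected number of substitutions in a sequence of the given length.

    Simplification taken from the caption: every site is treated as if it
    diverged at the synonymous rate.
    """
    return ks * length_bp


def generations_per_substitution(ks: float, generations: float, length_bp: int) -> float:
    """Average number of generations needed to accumulate one substitution."""
    return generations / expected_substitutions(ks, length_bp)


def generations_to_reach(ks_target: float, ks_reference: float, generations: float) -> float:
    """Generations needed to reach ks_target if ks_reference takes `generations`."""
    return generations * ks_target / ks_reference


for generations in (GENERATIONS_LOW, GENERATIONS_HIGH):
    per_substitution = generations_per_substitution(KS_ECOLI, generations, ELEMENT_LENGTH_BP)
    age = generations_to_reach(KS_WITHIN_GENOME, KS_ECOLI, generations)
    print(f"interval {generations:.0e}: one substitution per {per_substitution:.1e} "
          f"generations, observed divergence reached after {age:.1e} generations")
```

Both bounds give about one substitution per 10^7 generations and about 10^8 generations to reach the observed within-genome divergence, in line with the orders of magnitude stated in the caption.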

Many decades of epidemiological and evolutionary studies have yielded fundamental insights into the dynamics of human diseases. By comparison, with few exceptions,46,47 the evolutionary dynamics of transposable elements among multiple genomes is poorly explored. Until recently, we lacked both data and quantitative models for such an exploration. The availability of hundreds, and soon thousands, of bacterial genome sequences is about to remedy this lack of data. This is another reason why prokaryotes are attractive for this type of work. In addition, a long history of quantitative modeling in epidemiology may help adapt existing models to transposable element dynamics.36 Abundant sequence data may be helpful in specifying the structure and parameters of these models. For example, an important epidemiological question is whether horizontal transfer of insertion sequences often occurs between distantly related genomes, or exclusively between closely related genomes. Our data suggest that the horizontal transfer of transposable elements among distantly related genomes occurs, but that it is the exception rather than the rule (Fig. 2). With the accumulation of more and more sequence data, it may be possible to estimate important parameters of epidemiological processes, such as horizontal transfer rates or transposition/deletion rates, at least semiquantitatively. This information will improve our understanding of the evolutionary dynamics of transposable elements in prokaryotes. It remains to be seen whether this epidemiological perspective will also be productive for eukaryotes, where life cycles and genome architectures are more complex, and where populations are smaller, rendering selection less effective.48
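
To illustrate what adapting an epidemiological model might look like, the following toy sketch tracks the fraction of genomes in a population that carry a transposable element. It is not one of the models in refs. 36, 46 or 47, and all of its ingredients are assumptions made for illustration: carriers pay a fixed fitness cost, the element spreads horizontally by mass action, and carriers lose the element at a constant rate. The parameter values in the usage lines are arbitrary and are not estimates from data.

```python
def next_frequency(x: float, cost: float, transfer_rate: float, loss_rate: float) -> float:
    """Advance the carrier frequency x by one host generation.

    Hypothetical toy model (all terms are illustrative assumptions):
    1. Selection: carriers have relative fitness 1 - cost.
    2. Horizontal transfer: non-carriers become carriers in proportion to
       the product of carrier and non-carrier frequencies (mass action).
    3. Loss: carriers lose the element at a constant per-generation rate.
    """
    mean_fitness = 1.0 - cost * x
    x_selected = x * (1.0 - cost) / mean_fitness
    gained = transfer_rate * x_selected * (1.0 - x_selected)
    lost = loss_rate * x_selected
    # Clamp to [0, 1] to guard against overshoot for large rates.
    return min(1.0, max(0.0, x_selected + gained - lost))


def simulate(x0: float, cost: float, transfer_rate: float, loss_rate: float,
             generations: int) -> float:
    """Iterate the model and return the carrier frequency at the end."""
    x = x0
    for _ in range(generations):
        x = next_frequency(x, cost, transfer_rate, loss_rate)
    return x


# Arbitrary illustrative parameters: a costly element with and without transfer.
print(simulate(0.001, cost=1e-4, transfer_rate=1e-3, loss_rate=1e-5, generations=50_000))
print(simulate(0.5, cost=1e-2, transfer_rate=0.0, loss_rate=0.0, generations=1_000))
```

In this sketch a rare element increases in frequency only if (1 - cost) × (1 + transfer_rate - loss_rate) exceeds 1, that is, if horizontal transfer outpaces the combined effect of the fitness cost and loss. Without transfer, a deleterious element declines. Sequence-based estimates of transfer and deletion rates would be needed to place real insertion sequences on either side of such a threshold.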


Fig. 2 Distant horizontal gene transfer is rare. The figure shows a prokaryotic phylogenetic tree based on 16S rDNA sequences from 438 bacterial genomes. Colored bars indicate species that contain insertion sequences from three families: IS1, IS5, and IS110. The length of the bars indicates the number of insertion sequences per genome. Note the patchy and disjoint distribution of insertion sequences across clades. The inset shows, for all species pairs where both members contain IS elements, the evolutionary distance between the species based on their 16S rDNA divergence (horizontal axis) versus the evolutionary distance of IS elements, based on their synonymous divergence Ks (vertical axis).52 The size of the green circles corresponds to the number of species pairs with a given divergence. If horizontal transfer among distant species were frequent, one would expect data points with high 16S divergence and low IS divergence, but such data points are absent. A larger-scale survey,51 from which these data were taken, finds only a small number of such distant transfer events in 20 IS families and 438 species.
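
The reasoning behind the inset of Fig. 2 can be expressed as a simple filter: a recent transfer between distant species would appear as a species pair with high 16S rDNA divergence but low IS divergence. The sketch below is illustrative and is not the procedure used in the survey. The record type, the thresholds, and the example values are all hypothetical.

```python
from typing import List, NamedTuple


class SpeciesPair(NamedTuple):
    """Divergence measurements for one pair of IS-containing species."""
    name_a: str
    name_b: str
    rdna_divergence: float  # 16S rDNA divergence between the two species
    is_ks: float            # synonymous divergence (Ks) between their IS elements


def candidate_distant_transfers(pairs: List[SpeciesPair],
                                min_rdna_divergence: float,
                                max_is_ks: float) -> List[SpeciesPair]:
    """Return pairs whose hosts are distant but whose IS elements are similar.

    Both thresholds are hypothetical cut-offs chosen by the user; the
    article does not specify numerical values.
    """
    return [
        pair for pair in pairs
        if pair.rdna_divergence >= min_rdna_divergence and pair.is_ks <= max_is_ks
    ]


# Made-up example values, for illustration only.
example_pairs = [
    SpeciesPair("species A", "species B", rdna_divergence=0.02, is_ks=0.01),
    SpeciesPair("species A", "species C", rdna_divergence=0.25, is_ks=0.90),
    SpeciesPair("species B", "species D", rdna_divergence=0.30, is_ks=0.02),
]
for pair in candidate_distant_transfers(example_pairs, min_rdna_divergence=0.2, max_is_ks=0.05):
    print(pair.name_a, pair.name_b)
```

In the made-up example only the last pair is flagged. In the data behind Fig. 2, such pairs are absent for the three families shown.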

References

  1. L. E. Orgel and F. H. C. Crick, Nature, 1980, 284, 604–607.
  2. W. F. Doolittle and C. Sapienza, Nature, 1980, 284, 601–603.
  3. R. Koszul, B. Dujon and G. Fischer, Yeast, 2003, 20, S97–S97.
  4. J. F. Y. Brookfield and P. M. Sharp, Trends in Genetics, 1994, 10, 109–111.
  5. M. Blot, Genetica, 1994, 93, 5–12.
  6. M. J. Dunham, H. Badrane, T. Ferea, J. Adams, P. O. Brown, F. Rosenzweig and D. Botstein, Proceedings of the National Academy of Sciences of the United States of America, 2002, 99, 16144–16149.
  7. D. Schneider and R. E. Lenski, Research in Microbiology, 2004, 155, 319–327.
  8. P. Capy, G. Gasperi, C. Biemont and C. Bazin, Heredity, 2000, 85, 101–106.
  9. E. G. Pasyukova, S. V. Nuzhdin, T. V. Morozova and T. F. C. Mackay, Journal of Heredity, 2004, 95, 284–290.
  10. C. Bartolome, X. Maside and B. Charlesworth, Molecular Biology and Evolution, 2002, 19, 926–937.
  11. M. Touchon and E. P. C. Rocha, Molecular Biology and Evolution, 2007, 24, 969–981.
  12. I. R. Arkhipova, Cytogenetic and Genome Research, 2005, 110, 372–382.
  13. B. Charlesworth, P. Sniegowski and W. Stephan, Nature, 1994, 371, 215–220.
  14. B. Charlesworth and C. H. Langley, Annual Review of Genetics, 1989, 23, 251–287.
  15. S. V. Nuzhdin, Genetica, 1999, 107, 129–137.
  16. H. Ochman, J. Lawrence and E. Groisman, Nature, 2000, 405, 299–304.
  17. D. Hickey, Genetics, 1982, 101, 519–531.
  18. P. Dimitri, N. Corradini, F. Rossi, E. Mei, I. F. Zhimulev and F. Verni, Cytogenetic and Genome Research, 2005, 110, 165–172.
  19. L. Bouneau, C. Fischer, C. Ozouf-Costaz, A. Froschauer, O. Jaillon, J. P. Coutanceau, C. Korting, J. Weissenbach, A. Bernot and J. N. Volff, Genome Research, 2003, 13, 1686–1695.
  20. W. J. Miller, J. F. McDonald, D. Nouaud and D. Anxolabehere, Genetica, 1999, 107, 197–207.
  21. Mobile DNA II, eds. N. Craig, R. Craigie, M. Gellert and A. L. Lambowitz, ASM Press, Washington, DC, 2002.
  22. A. Wagner, Molecular Biology and Evolution, 2006, 23, 723–733.
  23. A. Wagner, C. Lewis and M. Bichsel, Nucleic Acids Research, 2007, 35, 5284–5293.
  24. J. Filee, P. Siguier and M. Chandler, Microbiology and Molecular Biology Reviews, 2007, 71, 121–157.
  25. P. Siguier, J. Filee and M. Chandler, Current Opinion in Microbiology, 2006, 9, 526–531.
  26. S. A. Sawyer, D. E. Dykhuizen, R. F. DuBose, L. Green, T. Mutangadura-Mhlanga, D. F. Wolczyk and D. L. Hartl, Genetics, 1987, 115, 51–63.
  27. J. Mahillon and M. Chandler, Microbiology and Molecular Biology Reviews, 1998, 62, 725–774.
  28. P. Siguier, J. Perochon, L. Lestrade, J. Mahillon and M. Chandler, Nucleic Acids Research (Database Issue), 2006, 34, D34–D36.
  29. J. F. Y. Brookfield, Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 1985, 312, 217–226.
  30. B. Charlesworth, The population genetics of transposable elements, in Population genetics and molecular evolution, eds. T. Ohta and K. Aoki, Springer-Verlag, New York, NY, 1985, pp. 213–232.
  31. E. S. Dolgin and B. Charlesworth, Genetics, 2006, 174, 817–827.
  32. N. Moran and G. Plague, Current Opinion in Genetics and Development, 2004, 14, 627–633.
  33. S. V. Nuzhdin and D. A. Petrov, Biological Journal of the Linnean Society, 2003, 79, 33–41.
  34. S. R. Bordenstein and W. S. Reznikoff, Nature Reviews Microbiology, 2005, 3, 688–699.
  35. I. R. Arkhipova and M. Meselson, Proceedings of the National Academy of Sciences of the United States of America, 2005, 102, 11781–11786.
  36. R. Anderson and R. May, Infectious Diseases of Humans: Dynamics and Control, Oxford University Press, Oxford, UK, 1991.
  37. M. Droge, A. Puhler and W. Selbitschka, Biology and Fertility of Soils, 1999, 29, 221–245.
  38. J. D. van Elsas, J. T. Trevors and M. E. Starodub, FEMS Microbiology Ecology, 1988, 53, 299–306.
  39. S. J. Billington, J. G. Songer and B. H. Jost, Antimicrobial Agents and Chemotherapy, 2002, 46, 1281–1287.
  40. S. C. Jiang and J. H. Paul, Applied and Environmental Microbiology, 1998, 64, 2780–2787.
  41. Mobile DNA, eds. D. E. Berg and M. M. Howe, ASM Press, Washington, DC, 1989.
  42. D. L. Hartl, D. E. Dykhuizen, R. D. Miller, J. Green and J. de Framond, Cell, 1983, 35, 503–510.
  43. C. Egner and D. E. Berg, Proceedings of the National Academy of Sciences of the United States of America, 1981, 78, 459–463.
  44. M. M. Shen, E. A. Raleigh and N. Kleckner, Genetics, 1987, 116, 359–369.
  45. N. Kleckner, Genetics, 1990, 124, 449–454.
  46. R. Condit, F. Stewart and B. Levin, American Naturalist, 1988, 132, 129–147.
  47. C. Basten and M. Moody, Journal of Mathematical Biology, 1991, 29, 743–761.
  48. M. Lynch, The origins of genome architecture, Sinauer, Sunderland, MA, 2007.
  49. D. E. Rozen, D. Schneider and R. E. Lenski, Journal of Molecular Evolution, 2005, 61, 171–180.
  50. H. Ochman, S. Elwyn and N. A. Moran, Proceedings of the National Academy of Sciences of the United States of America, 1999, 96, 12638–12643.
  51. A. Wagner and N. de la Chaux, Molecular Genetics and Genomics, 2008 (in press).
  52. W.-H. Li, Molecular Evolution, Sinauer, Massachusetts, 1997.

This journal is © The Royal Society of Chemistry 2009