Phylogeographic variation in recombination rates within a global clone of methicillin-resistant Staphylococcus aureus
© Castillo-Ramírez et al.; licensee BioMed Central Ltd. 2013
Received: 26 July 2012
Accepted: 27 December 2012
Published: 27 December 2012
Next-generation sequencing (NGS) is a powerful tool for understanding both patterns of descent over time and space (phylogeography) and the molecular processes underpinning genome divergence in pathogenic bacteria. Here, we describe a synthesis between these perspectives by employing a recently developed Bayesian approach, BRATNextGen, for detecting recombination on an expanded NGS dataset of the globally disseminated methicillin-resistant Staphylococcus aureus (MRSA) clone ST239.
The data confirm strong geographical clustering at continental, national and city scales and demonstrate that the rate of recombination varies significantly between phylogeographic sub-groups representing independent introductions from Europe. These differences are most striking when mobile non-core genes are included, but remain apparent even when only considering the stable core genome. The monophyletic ST239 sub-group corresponding to isolates from South America shows heightened recombination, the sub-group predominantly from Asia shows an intermediate level, and a very low level of recombination is noted in a third sub-group representing a large collection from Turkey.
We show that the rapid global dissemination of a single pathogenic bacterial clone results in local variation in measured recombination rates. Possible explanatory variables include the size and time since emergence of each defined sub-population (as determined by the sampling frame), variation in transmission dynamics due to host movement, and changes in the bacterial genome affecting the propensity for recombination.
The wave of molecular data generated through the advent of full genome sequencing and molecular typing methodologies such as multilocus sequence typing (MLST) has revolutionized our understanding of micro-evolutionary dynamics in bacterial pathogens. Although MLST has proved very successful both as a typing scheme for epidemiological surveillance and for elucidating population structure, these data are based on <1% of the genome and thus have limited resolving power for elucidating patterns of dissemination and microevolution within very recently emerged clones. However, the recent deployment of next-generation sequencing (NGS) platforms for phylogeographic studies has provided the means to generate ultra-high resolution SNP data (covering >95% of the genome) for large population samples. This technology has shed light on the divergence of highly virulent or resistant clones over years or months [1–3] within a defined locale [4, 5], transmission within a single health-care setting [6, 7] and even the molecular changes associated with prolonged carriage and invasive disease in a single host . Such studies have laid the foundation for a new era for evolutionary biology and molecular epidemiology in bacterial pathogens, particularly when considered alongside extensive metadata and social contact networks .
In addition to understanding long- and short-term transmission dynamics, NGS data can also inform on questions relating to molecular processes such as mutation and recombination, and the evolutionary fate of this variation . Current studies tend to emphasize one of these perspectives over the other. For example, Croucher et al.  comprehensively described the role and clinical significance of recombination within a resistant clone of Streptococcus pneumoniae, but did not explicitly consider the epidemiological context of the isolates. In contrast, the study of Harris et al.  in the hospital-acquired methicillin-resistant Staphylococcus aureus (MRSA) clone ST239 provided detailed evidence concerning inter-continental transmission, but had less focus on recombination. However, just as horizontal gene transfer may impact on the epidemiological properties of a clone (for example, via the acquisition of a gene conferring heightened virulence ), so the dynamics of clonal dissemination, due to factors such as host movement, will in turn impact on genome evolution. For example, the conditions necessary for sub-populations to first diversify, and then subsequently recombine, will clearly be determined by the rates of transmission over a range of geographical scales.
Here we describe a novel synthesis between global patterns of dissemination and diversification (phylogeography) and the relative rates of recombination (compared to mutation) within the hospital-acquired MRSA clone ST239. We supplement the data of Harris et al. with a further 102 isolates, and employ a recently developed Bayesian method for identifying recombination events on a genome-wide scale, both within the stable core and in the non-core (accessory) genome. Thirty-one of the extra isolates are from a global collection and 71 were recovered from 4 Turkish hospitals, thus providing detailed evidence as to the transmission dynamics within this country. We considered Turkey to be an excellent focal country for this study as it has been described as the 'epicenter' of ST239, owing to the extreme predominance of this clone (>95%) amongst hospital-acquired MRSA infections . We also considered that the location of Turkey straddling the border of Europe and Asia would provide evidence as to the spread of this clone out of Eastern Europe. Our analysis indicates significant variation in the relative impact of recombination (compared to mutation) between the ST239 sub-groups associated with different geographical regions.
Patterns of diversity and recombination in the core and non-core genomes
We supplemented the original global data of Harris et al. (62 isolates)  with full-genome SNP data from a further 31 isolates from a global collection and 71 isolates from Turkey (Additional file 1). The Turkish isolates were recovered from four hospitals representing the three largest cities in Turkey; Istanbul, Ankara and Izmir. The combined dataset of 165 isolates (62 previously sequenced + 102 newly sequenced + the reference genome, S. aureus TW20, ) resulted in a total of 16,172 high-quality SNPs. These SNPs were derived both from the previously defined core regions , which are present in all strains, but also from the non-core regions , which are variably present or absent, and correspond largely to mobile genetic elements (MGEs). Although recombination is known to occasionally affect the core genome of S. aureus [14, 15], this occurs at a low rate compared to other bacterial pathogens . In contrast, non-core genes are highly mosaic and contain a much higher SNP density than core genes, indicative of high rates of recombination [2, 10].
The global phylogeography of ST239
The identification of recombination events also facilitates robust phylogenetic analysis through their subsequent exclusion. Although the impact of recombination on the core genome was confirmed to be low using BRATNextGen, we supplemented this analysis with a second method for detecting recombination recently applied to NGS data of S. pneumoniae . We took a conservative approach and excluded SNPs corresponding to recombined regions in either one, or both, of these approaches for use in downstream phylogeographic analyses. The number of genes that were found to be affected by recombination in both analyses was 312 (compared with 441 on the basis of BRATNextGen alone; Additional file 6), but we note that the distribution of these genes according to functional category was similar as described previously, with the vast majority of recombined genes corresponding to MGEs, and higher proportion of cell envelope genes than other core categories (Additional file 7).
Harris et al. noted that 20 isolates from a single hospital in northern Thailand resolved into two distinct clades. The current data reveal that isolates from Malaysia (MAL 1, 3, 119, 9, 215, 11) cluster basally, with 100% bootstrap support, to one of these clades but not the other. This indicates that this clade may have been introduced into Thailand via Malaysia (or another neighboring country). Additional Thai isolates (M170, M705, H202) cluster with the second Thai clade, suggesting this clade represents an older and more established population of ST239 in Thailand. Whereas the study of Harris et al. focused on two extremes of geographical scale (global and a single hospital), the current data also reveal evidence for strong structuring on an intermediate, countrywide, scale. For example, three isolates from Lithuania (LIT 89,76,68) form a tight cluster, having a bootstrap value of 100, which is distinct from other European isolates, and three newly sequenced isolates from Argentina (RA3, 6, 7) cluster with four original isolates from this country (AGT 120, 9, 1). Although in this case the bootstrap value (44) is low, this is consistent with a distinct Argentinean sub-cluster within the South American group.
Most striking, however, is the observation that the Turkish isolates form a single monophyletic cluster. There are no Turkish isolates that fall outside this cluster. The two isolates of Turkish origin from the original study of Harris et al. (TUR 1, 27) are positioned basally within this cluster, which is expected as they were isolated a decade earlier (1995/1996) than the newly characterized isolates (2006/2007). Four isolates of Eastern European origin are also clustered basally within this group (yellow branches within the Turkish clade; HU109, 7, HUR18 (Hungary), HK3 (Czech Republic), H482 (Romania)). Taken together, these observations provide strong evidence of a single introduction of ST239 into Turkey from Eastern Europe, with very limited transmission of ST239 into or out of Turkey subsequent to this event.
The phylogeography of ST239 within Turkey
Dating the population sub-groups
In order to further define the global structure of the ST239 population, we used the program Bayesian Analysis of Population Structure (BAPS) . This approach resolved the three major groups revealed by the phylogenetic analysis, corresponding to the Turkish clade (including the eastern European isolates), South American, Asian and a fourth group that mainly contains isolates from Europe and USA (the strains composing each group are listed in Additional file 9). Of these, only the fourth group is polyphyletic; the three BAPS groups that are monophyletic are highlighted by the colored rectangles in Figure 3. We used the Bayesian Evolutionary Analysis by Sampling Trees (BEAST) program  to construct a calibrated phylogeny (Additional file 10), this phylogeny suggests that ST239 first emerged in Europe in 1965 (late 1960, early 1973), which is consistent with the previous analysis by Harris et al. . The observation that the three monophyletic clades appear unrelated to each other points to multiple independent exports from Europe, rather than a progressive wave of global dissemination. We estimate that the Turkish and Asian clades both emerged around 1982 (Turkish clade 1982 (1975, 1988) and Asian clade 1983 (late 1976, 1989)), whereas we estimate the introduction into South America to have occurred approximately a decade later in 1992 (late 1989, 1993) (Additional file 10).
Recombination between and within groups
Admixture analysis between groups
Turkish (BAPS group 1)
Asian (BAPS group 2)
South American (BAPS group 3)
Fourth (BAPS group 4)
The r/m ratios
r/m with MGE
r/m without MGE
1.1 (m 1839, r 2032)
0.29 (m 895, r 264)
0.55 (m 3080, r 1707)
0.05 (m 2071, r 104)
0.26 (m 3207, r 839)
0.05 (m 2017, r 92)
1 (m 4276, r 4277)
0.06 (m 2730, r 173)
The principal goal of this study was to demonstrate the utility of NGS data for simultaneously exploring both molecular processes and phylogeographic patterns in pathogenic bacteria over very short time scales. Our data confirm that the overwhelming majority of recombination is restricted to the highly diverse non-core genes corresponding to MGEs and points to similar rates of recombination in the core genome suggested previously using MLST data [14, 16, 22]. Somewhat higher estimates have recently been published from full genomic comparisons [23, 24], but this discrepancy might be due to the inclusion of non-core genes in these studies, or other methodological differences. For example, we note that in the study by Chan et al. , hypothetical proteins and genes of unknown function categories were overrepresented in whole-gene transfers. Our data also point to higher rates of recombination in core genes encoding proteins associated with the cell envelope than in other core categories, which is consistent with the view that such genes tend to be under higher levels of diversifying selection . We note that the analysis of the non-core is limited by read-mapping to only a single ST239 reference genome (TW20). This means that any MGEs present in other ST239 genomes, but absent in TW20, are not taken into account, and the true diversity of the non-core is under-estimated. Nevertheless, the comparisons we draw between the core and non-core remain indicative of general trends between sub-groups, as discussed below. We also do not consider the lack of alternative reference genomes to be a problem for capturing core genome divergence. We note that the very recent emergence of ST239 means that the ST239-specific core will represent a higher proportion of the whole genome than the core of the species as a whole. Thus, broadly speaking, we consider a single reference genome to be sufficient for a detailed analysis of core divergence on this very fine micro-evolutionary scale, and for generating useful, if incomplete, evidence concerning the non-core.
In order to objectively define population sub-groups prior to comparative analysis, we used a combination of robust phylogenetic reconstruction using maximum likelihood, and a Bayesian approach for defining sub-groups implemented in BAPS. As discussed previously , this integrated approach is critical. Without the use of BAPS it would be unclear as to which clusters on the tree could be objectively ring-fenced and promoted to a sub-group, and without the phylogenetic analysis it would not be clear which of the BAPS sub-groups are monophyletic. The four sub-groups defined by BAPS in our analysis show strong geographical correspondence, being composed primarily of isolates from Asia, South America, Turkey, and a fourth sub-group consisting mainly of strains from Europe and USA. Three of these sub-groups are robustly supported by the phylogeny; the fourth BAPS group (Europe + USA) is an exception as it is polyphyletic and is defined on the basis that the isolates do not belong in any of the other groups. Without the accompanying phylogeny, the nature of this fourth sub-group would not be obvious.
The power of any population level analysis is ultimately dependent on the sampling frame, and we note that the promotion of a phylogenetic cluster to a population sub-group, through the use of BAPS, is largely dependent on how well the cluster is represented in the sample. For example, the three isolates from Lithuania form a clear monophyletic cluster but, unlike the Turkish cluster, they are not defined as a population sub-group by BAPS, presumably because there are so few isolates. To examine this more quantitatively, we re-ran the BAPS analysis excluding random isolates from the Turkish cluster, in order to gauge the approximate number of Turkish isolates required for this cluster to be defined as a BAPS sub-group. When the analysis was run with only 20 or 15 Turkish isolates, the same four BAPS groups were recovered, but when only 5 or 10 Turkish isolates were included, just three groups were defined and the Turkish isolates were assimilated into the fourth group. This suggests that, at least for our dataset, between 10 and 15 isolates are required before a monophyletic cluster is assigned as a population sub-group by BAPS, even if it is a cluster defined by a long branch.
We then combined the two strands of analysis by considering the relative impact of recombination compared to mutation (r/m) in the core and non-core within different population sub-groups. This analysis demonstrated statistically significant variation in r/m between the population sub-groups, thus highlighting an association between molecular processes and phylogeography. Although these differences were far more striking when MGEs were considered, our data suggest broadly similar trends when these elements are excluded. Gene flow has previously been shown to be structured in bacterial populations [26, 27], and links have been drawn with shifts in epidemiological properties. For example, epidemic serogroup A Neisseria meningitidis, which has caused pandemic waves in sub-Saharan Africa , appears to be recombining at a lower rate than other serogroups, and it has been proposed that this may be due to the rapid spread of this clone . However, to our knowledge the current study represents the first attempt to consider how recombination varies within a single clone (which has emerged and globally disseminated in under 50 years), and within the context of a robust phylogeographic framework. Below, we discuss three broad (and non-exclusive) classes of explanation for this phylogeographical variation in recombination: time-dependence, variation in transmission associated with host movement, and changes in the bacterial genome.
Regarding the possibility of time-dependence, we consider if there are any a priori expectations that a very recently emerged local population should show lower rates of recombination (relative to mutation) than a more established one. A nascent population expanding from a single founder event will initially contain very low levels of diversity. If recombination mostly occurs within the clone, the statistical detection of recombination events within a sample drawn from a very recently emerged population will prove problematic, and detectable levels of recombination should increase over time as diversity accumulates in the population. This then might explain the decrease in r/m associated with the founder events from Eastern Europe into Turkey and within Turkey to Izmir (Figure 5c,d). An important caveat is that this argument assumes very limited recombination between ST239 and other S. aureus lineages carried asymptomatically on the skin or in the anterior nares. ST239 is multiply antibiotic resistant, and generally very rarely observed outside of the hospital environment (with the notable exception of a recently described variant in Russia ). Further, at least 90% of the hospital-acquired MRSA within Turkey correspond to this clone ; thus, this ecological specialism of ST239 may limit the opportunities for recombination with other lineages. Genetic factors such as sequence divergence and the presence of lineage-specific restriction modification systems have also been proposed to impact on the relative rates of recombination within and between lineages [31, 32]. On the other hand, it is also clear that recombination between different S. aureus lineages does occur occasionally, the hybrid genome of ST239 being a striking example .
It is also interesting to speculate as to whether differences in the selective consequences of recombination and de novo mutation might be playing a role, as this might also result in a time-dependent effect. For example, if de novo mutations were, on average, more selectively deleterious than recombination events, then these would be preferentially purged over time and the r/m would increase. In support of this, we have recently shown that the variation imported by recombination is largely synonymous , hence possibly less likely to be deleterious than a de novo non-synonymous mutation that has not passed through a selective filter.
Arguments based on time dependence cannot explain the very high estimates for r/m for the South American sub-group, as this sub-group emerged more recently than the Turkish and Asian sub-groups. An alternative explanation to time-dependent effects is variation in transmission patterns within and between each geographic region, owing to host movement rather than intrinsic differences in transmissibility of the bacteria. Harris et al.  noted evidence for an inter-continental transmission event from South America to Portugal, which is likely to have initiated the wave of ST239 cases in this country (the 'Brazilian clone') in the late 1990s/early 2000s (the green cluster within the South American group in Figure 3). It is possible that this transmission event increased the opportunity for recombination with European ST239 lineages, notably the 'Portuguese clone' of ST239, which caused the prior wave of ST239 infection in Portugal in the early 1990s . However, the admixture analysis does not help to explain the high rates of recombination in the South American sub-group, as this analysis suggests a higher rate of gene flow between Europe and Asia. In contrast, our data suggest both very low rates of transmission of ST239 into or out of Turkey, and low admixture between the Turkish sub-group and other groups. This relative isolation helps to explain why the r/m for the Turkish group is lower than for the Asian sub-group, despite these groups having emerged at approximately the same time. An important caveat is that the Asian sub-group (which includes a small number of isolates from China) will be substantially larger than the Turkish sub-group in terms of potential hosts, and that the definition of the sub-groups depends on the sampling frame, as discussed above.
Finally, it is possible that genomic changes such as the acquisition of a virulence factor may result in an increase in transmissibility and an epidemiological shift that, in turn, impacts on local rates of recombination. The acquisition of sasX by ST239 in Eastern China  is a notable example. However, there is no clear evidence for broad epidemiological distinctions between the sub-groups compared in this analysis. It is also possible that genetic differences, such as the presence of a particular phage variant or restriction modification system, might impact on the intrinsic rate of recombination.
We have shown variation in rates of recombination within a robust phylogeographic framework, and speculated as to the likely forces underpinning these differences. A great deal more theoretical and empirical work is required to fully understand this variation, in particular the respective roles of transmission and the chances that two different lineages will meet, time-dependence and the role of purifying selection, and genomic changes impacting on recombination rate. Although the forces responsible for intra-clonal and intra-species variation in recombination rates remain unclear, here we demonstrate the utility of a closer synthesis of phylogeography and molecular dynamics.
Materials and methods
The data set and genome sequencing
We supplemented a previous MRSA ST239 data set  consisting of isolates taken from global sources (n = 42) and a single hospital in Thailand (n = 20) with 71 clinical isolates from four Turkish hospitals. These isolates were all sampled between 2006 and 2009 and represent a range of sources (for example, blood, catheter, swab; details are provided in Additional file 13); 31 clinical isolates from a global collection were also sequenced, giving a total of 165 isolates for this study. ST239 is a heath-care associated (HA) MRSA lineage and all the isolates used in this study were collected from hospitals. A list of all the strains is provided in Additional file 1. Unique index-tagged libraries for each sample were created, and up to 12 separate libraries were sequenced in each of eight channels in Illumina Genome Analyser GAII cells with 75-base paired-end reads. Data have been deposited in the European Nucleotide Archive under study number ERP000228. The paired-end reads were mapped against the chromosome of S. aureus TW20 (accession number FN433596)  using SMALT  and SNPs were identified as described in . Mobile genetic elements and accessory regions in the TW20 reference chromosome had previously been identified by manual curation . Coding sequences in TW20 were assigned to functional role categories adapted from Riley  by manual curation .
Tests for recombination
We applied the BRATNextGen software on the whole genome alignment with the same default settings as in . Significance of each putative recombinant segment (P-value <0.05) was determined through a bootstrap test with 100 replicate analyses executed in parallel on separate processors. As shown by Marttinen et al. , this approach yielded almost identical results with the analysis of the PMEN1 data in Croucher et al. , but was considerably faster and also directly indicates which recombination events are shared by subsets of isolates. Neighbor net and the PhiTtest were implemented using SplitTree4 .
Maximum likelihood phylogeny
A maximum likelihood phylogenetic analysis was run using PhyML . We used a multiple alignment composed of all the SNPs that have not been affected by recombination. As we were only considering SNPs, the proportion of invariable sites was set to 0. The General Time-Reversible (GTR) model was used, where the base frequencies and the relative substitution rates between them were estimated by maximizing the likelihood of the phylogeny. For estimating the tree topology both nearest neighbor interchange and subtree pruning and regrafting methods were used. To measure the group support across the phylogeny, 100 non-parametric bootstrap replicates were run.
Bayesian Analysis of Population Structure
We conducted the mixture and admixture analyses using BAPS version 5.3 , converting the SNP alignment file into the appropriate format using a PERL script. The 'Clustering of individuals' module was used for the population mixture analysis. In order to evaluate the consistency of the results a vector from 2 to 20 K values (where K is the maximum number of groups) was run. For the population admixture analysis, the 'Admixture based on mixture clustering' was run with a minimum population size of 5 and the following specifications: the iterations to estimate the admixture coefficient for the individuals were 100; the reference individuals from each population were 200; and the number of iterations for estimating the admixture coefficient for reference individuals was 20. The P-values in Table 1 were computed by BAPS during the admixture analysis as described in the BAPS manual  (p. 25): 'The final column gives the p-value for the individual. This value tells the proportion of reference individuals simulated from the population in which the individual was originally clustered having the admixture coefficient to the cluster smaller than or equal to the individual.'
Bayesian Markov Chain Monte Carlo analyses
The dated phylogeny analysis was implemented through BEAST . For this analysis a multiple alignment of all the SNPs not affected by recombination was used. Since we had the date of sampling for the isolates, we could use this information for dating. We used the uncorrelated log-normal relaxed clock with the GTR model and did not use the correction for site rate heterogeneity because the statistical model selection did not show this as a significant parameter. The analysis was run for 90,000,000 steps and at every 9,000 steps samples were taken. We discarded 9,000,000 steps as burn-in. The program TRACER was used to evaluate the convergence of the analysis. This was done by looking at the trace plots of the likelihood scores and by confirming that the effective sample size of the tree likelihood was greater than 200.
Statistical analyses and r/m ratios
All the statistical analyses were carried out via R . The r/m ratios were calculated from the output of BRATNextGen using a PERL script.
This work was funded by the TROCAR consortium (EU FP7-HEALTH #223031) and the UK Clinical Research Collaboration Translational Infection Research Initiative (grant number G1000803). We thank the core sequencing and informatics teams at the Sanger Institute for their assistance and The Wellcome Trust for its support of the Sanger Institute Pathogen Genomics and Biology groups. MTGH, SDB and JP were supported by Wellcome Trust grant 098051. We are truly thankful to the reviewers whose comments have significantly improved our manuscript.
Bayesian Analysis of Population Structure
Bayesian Evolutionary Analysis by Sampling Trees
mobile genetic element
multilocus sequence typing
methicillin-resistant Staphylococcus aureus
- Croucher NJ, Harris SR, Fraser C, Quail MA, Burton J, van der Linden M, McGee L, von Gottberg A, Song JH, Ko KS, Pichon B, Baker S, Parry CM, Lambertsen LM, Shahinas D, Pillai DR, Mitchell TJ, Dougan G, Tomasz A, Klugman KP, Parkhill J, Hanage WP, Bentley SD: Rapid pneumococcal evolution in response to clinical interventions. Science. 2011, 331: 430-434. 10.1126/science.1198545.PubMedPubMed CentralView ArticleGoogle Scholar
- Harris SR, Feil EJ, Holden MT, Quail MA, Nickerson EK, Chantratita N, Gardete S, Tavares A, Day N, Lindsay JA, Edgeworth JD, de Lencastre H, Parkhill J, Peacock SJ, Bentley SD: Evolution of MRSA during hospital transmission and intercontinental spread. Science. 2010, 327: 469-474. 10.1126/science.1182395.PubMedPubMed CentralView ArticleGoogle Scholar
- Mutreja A, Kim DW, Thomson NR, Connor TR, Lee JH, Kariuki S, Croucher NJ, Choi SY, Harris SR, Lebens M, Niyogi SK, Kim EJ, Ramamurthy T, Chun J, Wood JL, Clemens JD, Czerkinsky C, Nair GB, Holmgren J, Parkhill J, Dougan G: Evidence for several waves of global transmission in the seventh cholera pandemic. Nature. 2011, 477: 462-465. 10.1038/nature10392.PubMedPubMed CentralView ArticleGoogle Scholar
- Nubel U, Dordel J, Kurt K, Strommenger B, Westh H, Shukla SK, Zemlickova H, Leblois R, Wirth T, Jombart T, Balloux F, Witte W: A timescale for evolution, population expansion, and spatial spread of an emerging clone of methicillin-resistant Staphylococcus aureus. PLoS Pathog. 2010, 6: e1000855-10.1371/journal.ppat.1000855.PubMedPubMed CentralView ArticleGoogle Scholar
- McAdam PR, Templeton KE, Edwards GF, Holden MT, Feil EJ, Aanensen DM, Bargawi HJ, Spratt BG, Bentley SD, Parkhill J, Enright MC, Holmes A, Girvan EK, Godfrey PA, Feldgarden M, Kearns AM, Rambaut A, Robinson DA, Fitzgerald JR: Molecular tracing of the emergence, adaptation, and transmission of hospital-associated methicillin-resistant Staphylococcus aureus. Proc Natl Acad Sci USA. 2012, 109: 9107-9112. 10.1073/pnas.1202869109.PubMedPubMed CentralView ArticleGoogle Scholar
- Snitkin ES, Zelazny AM, Thomas PJ, Stock F, Henderson DK, Palmore TN, Segre JA: Tracking a hospital outbreak of carbapenem-resistant Klebsiella pneumoniae with whole-genome sequencing. Sci Transl Med. 2012, 4: 148ra116-10.1126/scitranslmed.3004129.PubMedPubMed CentralGoogle Scholar
- Koser CU, Holden MT, Ellington MJ, Cartwright EJ, Brown NM, Ogilvy-Stuart AL, Hsu LY, Chewapreecha C, Croucher NJ, Harris SR, Sanders M, Enright MC, Dougan G, Bentley SD, Parkhill J, Fraser LJ, Betley JR, Schulz-Trieglaff OB, Smith GP, Peacock SJ: Rapid whole-genome sequencing for investigation of a neonatal MRSA outbreak. N Engl J Med. 2012, 366: 2267-2275. 10.1056/NEJMoa1109910.PubMedPubMed CentralView ArticleGoogle Scholar
- Young BC, Golubchik T, Batty EM, Fung R, Larner-Svensson H, Votintseva AA, Miller RR, Godwin H, Knox K, Everitt RG, Iqbal Z, Rimmer AJ, Cule M, Ip CL, Didelot X, Harding RM, Donnelly P, Peto TE, Crook DW, Bowden R, Wilson DJ: Evolutionary dynamics of Staphylococcus aureus during progression from carriage to disease. Proc Natl Acad Sci USA. 2012, 109: 4550-4555. 10.1073/pnas.1113219109.PubMedPubMed CentralView ArticleGoogle Scholar
- Gardy JL, Johnston JC, Ho Sui SJ, Cook VJ, Shah L, Brodkin E, Rempel S, Moore R, Zhao Y, Holt R, Varhol R, Birol I, Lem M, Sharma MK, Elwood K, Jones SJ, Brinkman FS, Brunham RC, Tang P: Whole-genome sequencing and social-network analysis of a tuberculosis outbreak. N Engl J Med. 2011, 364: 730-739. 10.1056/NEJMoa1003176.PubMedView ArticleGoogle Scholar
- Castillo-Ramirez S, Harris SR, Holden MT, He M, Parkhill J, Bentley SD, Feil EJ: The impact of recombination on dN/dS within recently emerged bacterial clones. PLoS Pathog. 2011, 7: e1002129-10.1371/journal.ppat.1002129.PubMedPubMed CentralView ArticleGoogle Scholar
- Li M, Du X, Villaruz AE, Diep BA, Wang D, Song Y, Tian Y, Hu J, Yu F, Lu Y, Otto M: MRSA epidemic linked to a quickly spreading colonization and virulence determinant. Nat Med. 2012, 18: 816-819. 10.1038/nm.2692.PubMedPubMed CentralView ArticleGoogle Scholar
- Alp E, Klaassen CH, Doganay M, Altoparlak U, Aydin K, Engin A, Kuzucu C, Ozakin C, Ozinel MA, Turhan O, Voss A: MRSA genotypes in Turkey: persistence over 10 years of a single clone of ST239. J Infect. 2009, 58: 433-438. 10.1016/j.jinf.2009.04.006.PubMedView ArticleGoogle Scholar
- Holden MT, Lindsay JA, Corton C, Quail MA, Cockfield JD, Pathak S, Batra R, Parkhill J, Bentley SD, Edgeworth JD: Genome sequence of a recently emerged, highly transmissible, multi-antibiotic- and antiseptic-resistant variant of methicillin-resistant Staphylococcus aureus, sequence type 239 (TW). J Bacteriol. 2010, 192: 888-892. 10.1128/JB.01255-09.PubMedPubMed CentralView ArticleGoogle Scholar
- Feil EJ, Cooper JE, Grundmann H, Robinson DA, Enright MC, Berendt T, Peacock SJ, Smith JM, Murphy M, Spratt BG, Moore CE, Day NP: How clonal is Staphylococcus aureus?. J Bacteriol. 2003, 185: 3307-3316. 10.1128/JB.185.11.3307-3316.2003.PubMedPubMed CentralView ArticleGoogle Scholar
- Robinson DA, Enright MC: Evolution of Staphylococcus aureus by large chromosomal replacements. J Bacteriol. 2004, 186: 1060-1064. 10.1128/JB.186.4.1060-1064.2004.PubMedPubMed CentralView ArticleGoogle Scholar
- Vos M, Didelot X: A comparison of homologous recombination rates in bacteria and archaea. ISME J. 2009, 3: 199-208. 10.1038/ismej.2008.93.PubMedView ArticleGoogle Scholar
- Marttinen P, Hanage WP, Croucher NJ, Connor TR, Harris SR, Bentley SD, Corander J: Detection of recombination events in bacterial genomes from large population samples. Nucleic Acids Res. 2012, 40: e6-10.1093/nar/gkr928.PubMedPubMed CentralView ArticleGoogle Scholar
- Huson DH, Bryant D: Application of phylogenetic networks in evolutionary studies. Mol Biol Evol. 2006, 23: 254-267.PubMedView ArticleGoogle Scholar
- Riley M: Functions of the gene products of Escherichia coli. Microbiol Rev. 1993, 57: 862-952.PubMedPubMed CentralGoogle Scholar
- Tang J, Hanage WP, Fraser C, Corander J: Identifying currents in the gene pool for bacterial populations using an integrative approach. PLoS Comput Biol. 2009, 5: e1000455-10.1371/journal.pcbi.1000455.PubMedPubMed CentralView ArticleGoogle Scholar
- Drummond AJ, Rambaut A: BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 2007, 7: 214-10.1186/1471-2148-7-214.PubMedPubMed CentralView ArticleGoogle Scholar
- Perez-Losada M, Browne EB, Madsen A, Wirth T, Viscidi RP, Crandall KA: Population genetics of microbial pathogens estimated from multilocus sequence typing (MLST) data. Infect Genet Evol. 2006, 6: 97-112. 10.1016/j.meegid.2005.02.003.PubMedPubMed CentralView ArticleGoogle Scholar
- Chan CX, Beiko RG, Ragan MA: Lateral transfer of genes and gene fragments in Staphylococcus extends beyond mobile elements. J Bacteriol. 2011, 193: 3964-3977. 10.1128/JB.01524-10.PubMedPubMed CentralView ArticleGoogle Scholar
- Takuno S, Kado T, Sugino RP, Nakhleh L, Innan H: Population genomics in bacteria: a case study of Staphylococcus aureus. Mol Biol Evol. 2012, 29: 797-809. 10.1093/molbev/msr249.PubMedPubMed CentralView ArticleGoogle Scholar
- Maiden MC, Bygraves JA, Feil E, Morelli G, Russell JE, Urwin R, Zhang Q, Zhou J, Zurth K, Caugant DA, Feavers IM, Achtman M, Spratt BG: Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc Natl Acad Sci USA. 1998, 95: 3140-3145. 10.1073/pnas.95.6.3140.PubMedPubMed CentralView ArticleGoogle Scholar
- Didelot X, Maiden MC: Impact of recombination on bacterial evolution. Trends Microbiol. 2010, 18: 315-322. 10.1016/j.tim.2010.04.002.PubMedPubMed CentralView ArticleGoogle Scholar
- Hanage WP, Fraser C, Tang J, Connor TR, Corander J: Hyper-recombination, diversity, and antibiotic resistance in pneumococcus. Science. 2009, 324: 1454-1457. 10.1126/science.1171908.PubMedView ArticleGoogle Scholar
- Morelli G, Malorny B, Muller K, Seiler A, Wang JF, del Valle J, Achtman M: Clonal descent and microevolution of Neisseria meningitidis during 30 years of epidemic spread. Mol Microbiol. 1997, 25: 1047-1064. 10.1046/j.1365-2958.1997.5211882.x.PubMedView ArticleGoogle Scholar
- Achtman M: Population structure of pathogenic bacteria revisited. Int J Med Microbiol. 2004, 294: 67-73. 10.1016/j.ijmm.2004.06.028.PubMedView ArticleGoogle Scholar
- Yamamoto T, Takano T, Higuchi W, Iwao Y, Singur O, Reva I, Otsuka Y, Nakayashiki T, Mori H, Reva G, Kuznetsov V, Potapov V: Comparative genomics and drug resistance of a geographic variant of ST239 methicillin-resistant Staphylococcus aureus emerged in Russia. PLoS One. 2012, 7: e29187-10.1371/journal.pone.0029187.PubMedPubMed CentralView ArticleGoogle Scholar
- Waldron DE, Lindsay JA: Sau1: a novel lineage-specific type I restriction-modification system that blocks horizontal gene transfer into Staphylococcus aureus and between S. aureus isolates of different lineages. J Bacteriol. 2006, 188: 5578-5585. 10.1128/JB.00418-06.PubMedPubMed CentralView ArticleGoogle Scholar
- Turner KM, Feil EJ: The secret life of the multilocus sequence type. Int J Antimicrob Agents. 2007, 29: 129-135. 10.1016/j.ijantimicag.2006.11.002.PubMedView ArticleGoogle Scholar
- Aires-de-Sousa M, Correia B, de Lencastre H: Changing patterns in frequency of recovery of five methicillin-resistant Staphylococcus aureus clones in Portuguese hospitals: surveillance over a 16-year period. J Clin Microbiol. 2008, 46: 2912-2917. 10.1128/JCM.00692-08.PubMedPubMed CentralView ArticleGoogle Scholar
- SMALT. [http://www.sanger.ac.uk/resources/software/smalt/]
- Guindon S, Gascuel O: A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol. 2003, 52: 696-704. 10.1080/10635150390235520.PubMedView ArticleGoogle Scholar
- Corander J: BAPS: Bayesian Analysis of Population Structure Manual v. 5.3. 2009, [http://web.abo.fi/fak/mnf/mate/jc/software/BAPS5manual.pdf]Google Scholar
- The R Project for Statistical Computing. [http://www.r-project.org/]
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.