Genome-wide mapping reveals single-origin chromosome replication in Leishmania, a eukaryotic microbe

Background
DNA replication initiates on defined genome sites, termed origins. Origin usage appears to follow common rules in the eukaryotic organisms examined to date: all chromosomes are replicated from multiple origins, which display variations in firing efficiency and are selected from a larger pool of potential origins. To ask if these features of DNA replication are true of all eukaryotes, we describe genome-wide origin mapping in the parasite Leishmania.

Results
Origin mapping in Leishmania suggests a striking divergence in origin usage relative to characterized eukaryotes, since each chromosome appears to be replicated from a single origin. By comparing two species of Leishmania, we find evidence that such origin singularity is maintained in the face of chromosome fusion or fission events during evolution. Mapping Leishmania origins suggests that all origins fire with equal efficiency, and that the genomic sites occupied by origins differ from related non-origin sites. Finally, we provide evidence that origin location in Leishmania displays striking conservation with Trypanosoma brucei, despite the latter parasite replicating its chromosomes from multiple, variable-strength origins.

Conclusions
The demonstration of chromosome replication from a single origin in Leishmania, a microbial eukaryote, has implications for the evolution of origin multiplicity and associated controls, and may explain the pervasive aneuploidy that characterizes Leishmania chromosome architecture.

Electronic supplementary material
The online version of this article (doi:10.1186/s13059-015-0788-9) contains supplementary material, which is available to authorized users.


Modelling Origin Usage
To simulate the distribution of reads that would result if non-mapped origins were present at potential origin locations (strand-switch regions; SSRs), we used L. major chromosome 36 since, being the largest chromosome in this species, it has the most potential origins. We used a bespoke […] were then merged using samtools to create a BAM file with a single origin at 100% usage, or with 6 additional, non-natural origins each at 17% usage. This merged BAM file was then processed using the MFAseq pipeline, and it was clear that these origins would not have been easily visible in our analysis (data not shown).
To derive a threshold of visible detection of origins, we took a file with a single simulated origin (position LmjF.36: 155190-156279) and merged it multiple times with the original L. major early-S data to dilute the percentage of usage. Each member of this series of 'dilutions' (no dilution, 5:1, 2:1, 1:1, 1:2 and 1:5, plus the original early-S track) was plotted using the MFAseq pipeline, and the results are shown in Figure S10. Using the natural, mapped origin of smallest amplitude (L. major chromosome 4), we then measured the height of the peak with a bespoke script that takes the median ratio for the SSR and the median ratio for the whole chromosome (medians were used so as not to exaggerate heights). The resulting ratio was 1.037 (origin/chromosome), so we set this as a stringent limit of reasonable detection. We then plotted the median for the simulated and diluted origin, showing there was a good trend and fit to the data (Fig.S10), and used this fit to calculate the percentage 'dilution' at which the origin would reach a ratio of 1.037 over background. The region background is 1.038, and the threshold of detection is consequently 1.076 (1.038 × 1.037), which corresponds to a dilution of 25.09%. We therefore declared the stringent threshold of visible detection to be 25%.
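The threshold arithmetic described above can be sketched as follows. This is a minimal illustration and not the authors' script: the function names are hypothetical, the straight-line fit is an assumption (the text states only that there was "a good trend and fit"), and the usage/median values in the example call are invented toy numbers, not data from Figure S10.

```python
from statistics import median


def peak_height(ssr_ratios, chromosome_ratios):
    """Median S/G2 ratio over the SSR bins divided by the chromosome-wide median."""
    return median(ssr_ratios) / median(chromosome_ratios)


def detection_threshold(background, min_peak_height):
    """Scale the local background by the smallest natural peak height."""
    return background * min_peak_height


def usage_at_threshold(usages, medians, threshold):
    """Fit a least-squares line through (usage, median) points and
    return the usage at which the line reaches `threshold`.
    A linear trend is an assumption made for this sketch."""
    n = len(usages)
    mean_x = sum(usages) / n
    mean_y = sum(medians) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(usages, medians))
    sxx = sum((x - mean_x) ** 2 for x in usages)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return (threshold - intercept) / slope


# Values quoted in the text: background 1.038, smallest natural peak 1.037.
threshold = detection_threshold(1.038, 1.037)
print(round(threshold, 3))

# Toy (invented) dilution series, only to show how the fit is inverted.
toy_usage = usage_at_threshold([0.0, 0.5, 1.0], [1.00, 1.10, 1.20], 1.15)
```

With the real medians from the dilution series in place of the toy values, the same inversion would give the usage corresponding to the 1.076 threshold.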
The scripts used in this analysis are freely and publicly available from: http://bitbucket.org/WTCMPCPG/originssimulation/.

Plasmid generation and transformation
The origin-active strand switch region (SSR) on chromosome 30 from L. major Friedlin was […] Table 1). qPCR was performed as described in the main text. To determine the plasmid copy number we normalized to the chromosome reference ($\Delta C_T = C_T^{\mathrm{G418}} - C_T^{\mathrm{gDNA}}$; plasmid copy number $= 2 \times 2^{-\Delta C_T}$).

[…] Highlighted in red are L. mexicana chromosomes 8 and 20, each of which is syntenic with two L. major chromosomes (29 and 8, and 36 and 20, respectively). Early S/G2 DNA sequence depth ratios (L. major blue, L. mexicana green) and CDS organisation are as detailed in Fig.1 and Fig.S2.
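The plasmid copy-number normalisation given in the plasmid generation section (ΔCT between the G418 amplicon and the genomic reference, then 2 × 2^(−ΔCT)) can be written as a short function. This is a minimal sketch with a hypothetical function name; the CT values in the example are illustrative only, and the base of 2 carries the usual assumption that the amplicon doubles every cycle.

```python
def plasmid_copy_number(ct_g418, ct_gdna):
    """Plasmid copy number from qPCR threshold cycles.

    ct_g418: CT of the plasmid-borne G418 resistance amplicon.
    ct_gdna: CT of the genomic (chromosomal) reference amplicon.
    Implements copy number = 2 * 2 ** (-delta_ct), with
    delta_ct = ct_g418 - ct_gdna.
    """
    delta_ct = ct_g418 - ct_gdna
    return 2 * 2 ** (-delta_ct)


# Illustrative values only: the plasmid amplicon crosses threshold
# two cycles earlier than the genomic reference.
example = plasmid_copy_number(18.0, 20.0)
print(example)  # 8.0
```

Equal CT values for the two amplicons give a copy number of 2, which is the baseline this formula assigns to a target present at the same abundance as the chromosomal reference.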

Supplementary figure legends
Acetylated histone H3 (H3ac) localisation is shown for L. major (as in Fig.1 and Fig.S2), but has not been mapped in L. mexicana.

[…] sizes denoted in intervals of 0.25 Mb). Early S/G2 DNA sequence depth ratios (green) are shown relative to late S/G2 ratios (yellow) and to CDS organisation, as detailed in Fig.1 and Fig.S2.

[…] fragmented; in addition, for simplicity, we have not detailed the relative orientation of the synteny blocks: readers are referred to [5] for greater detail. Conservation of replication origins between the two genomes is detailed as follows: blue denotes origins in loci that are conserved in the two genomes; red denotes loci that are conserved between the genomes but only show origin activity in T. brucei; dotted pink denotes loci that are conserved between the genomes but only show origin activity in L. major; burgundy denotes a T. brucei origin at the site of chromosome rearrangement between the genomes, with no detectable origin activity in L. major; grey denotes a T. brucei origin in a region of unclear synteny due to local rearrangements relative to the L. major chromosome; black denotes an origin within a subtelomeric region of T. brucei chromosome 6 that is not found in the L. major genome; boxes denote origins that are coincident with mapped centromeres (chromosomes 1-8).

[…] Mb, and early S/G2 DNA sequence depth ratios (L. major blue, T. brucei orange; y-axes) and CDS organisation are as detailed in Fig.1 and Fig.S2. The approximate location of the mapped centromere (green circle) in T. brucei chromosome 8 is indicated.

Table S1. Minimal and maximal median S/G2 ratios, derived from 2.5 kbp bins after plotting sequence reads from DNA derived from early S and G2 phase cells, for the 36 L. major chromosomes; data are shown either for the whole chromosome, for the reads spanning the origin-active SSR, or for a 30 kbp region around the origin-defining MFAseq peak.
* In chromosome 1 (LmjF.01) there is no clearly defined origin-active SSR, and so the SSR region is arbitrarily defined as the region between the two genes proximal to the right-hand telomere (see […]
Table S1