Chromatin architecture and gene expression in Escherichia coli
© BioMed Central Ltd 2004
Published: 1 December 2004
Two recent genome-scale analyses underscore the importance of DNA topology and chromatin structure in regulating transcription in Escherichia coli.
Expression of a gene is in a sense a bit like purchasing a new home - the value is strongly dependent on location. This value is context-dependent: it depends on who your neighbors are and also on the larger geographical picture. Two recent studies have analyzed DNA topology and chromatin structure on a genome-wide scale in Escherichia coli [1, 2]. Both show that an important factor in determining transcription profiles - when and to what extent a gene is expressed - is the location of the gene within the context of the E. coli K-12 chromosome. Whereas this is old news for those who are interested mainly in eukaryotic chromosomes, it is an important concept that has often been overlooked (in our opinion) in bacterial transcriptomics. In eukaryotes, it is well known that there are two types of chromatin: heterochromatin, which remains condensed for the most part throughout the cell cycle and contains few genes, and euchromatin, which, on the other hand, contains gene-rich regions and in some cases clusters of highly expressed genes.
The results reported by Jeong et al. are slightly different from previous findings by Sousa et al., who looked at the expression of a reporter gene when it was inserted at different positions around the chromosome. Sousa et al. found that gene expression varies along the chromosome in a somewhat linear manner, forming a gradient in which the more highly expressed genes are localized near the replication origin and the region around the replication terminus contains few highly expressed genes. This was thought to be a result of gene dosage associated with the distance to the origin of replication: during the replication of the chromosome, there are more likely to be multiple copies of genes that are close to the replication origin. As can be seen in Figure 1, regions with highly expressed genes are not limited to the area close to the origin but are distributed in clumps throughout the chromosome, although there are few highly expressed regions around the replication terminus. Thus, in contrast to the predictions of Sousa et al., the experimental results of Jeong et al. show that a gene does not necessarily have to be located close to the origin of replication to be highly expressed; rather, its expression level depends on its location within a smaller confined sub-domain.
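The gene-dosage argument can be made concrete with a short calculation. The sketch below is not taken from either study; it uses the standard Cooper-Helmstetter relation for the average number of copies per cell of a gene at relative position m between origin and terminus. The function names and the default values of C, D and tau are illustrative assumptions.

```python
def gene_copies_per_cell(m, C=40.0, D=20.0, tau=30.0):
    """Average copies per cell of a gene at relative position m.

    m   -- fractional distance from origin (0.0) to terminus (1.0)
    C   -- time needed to replicate the chromosome, in minutes (assumed value)
    D   -- time between end of replication and division, in minutes (assumed value)
    tau -- culture doubling time, in minutes (assumed value)
    """
    if not 0.0 <= m <= 1.0:
        raise ValueError("m must lie between 0 (origin) and 1 (terminus)")
    return 2.0 ** ((C * (1.0 - m) + D) / tau)


def origin_to_terminus_ratio(C=40.0, tau=30.0):
    """Ratio of origin-proximal to terminus-proximal gene copies, 2 ** (C / tau)."""
    return 2.0 ** (C / tau)


if __name__ == "__main__":
    for m in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"m = {m:.2f}: {gene_copies_per_cell(m):.2f} copies per cell")
    print(f"origin/terminus ratio: {origin_to_terminus_ratio():.2f}")
```

Under this model the copy number falls smoothly and monotonically from origin to terminus, so dosage alone can produce a gradient but not the clumps of highly expressed genes described above.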
The long-range correlations (several hundred thousand base-pairs) found by Jeong et al. are more interesting than the short-range correlations and also have precedents in eukaryotic systems, where such clustering of highly expressed genes was postulated a very long time ago for the Drosophila polytene chromosomes. More recently, there have been two studies on gene expression in human chromosomes that showed clustering of highly expressed genes [6, 7]. The topic of chromatin structure and gene expression in eukaryotes has generated considerably more interest (and publications) than in bacteria. In fact, at the time of writing, a paper had just been published showing that the 'upstream binding factor' for RNA polymerase I causes the chromatin in mammalian cells to form a more decondensed, open structure, allowing the polymerase access for transcription. Although most animals have on the order of a thousand times as much DNA as bacteria, the level of compaction by chromatin is similar in both (about 7000-fold). But it is likely that the DNA compaction is more dynamic in bacteria, because of the higher coding density of the chromosome. Furthermore, transcription and translation are coupled in bacteria, most likely for topological reasons. The long-range correlations found by Jeong et al. are consistent with a role for chromatin structure in regulating gene expression in bacteria, showing once again that what is true for elephants can also apply to E. coli.
More than 20 years ago, it was postulated that supercoiling could be used to regulate gene expression in E. coli, and about a decade later (before microarray technology was readily available) the influence of supercoiling on the concentration of 88 proteins in E. coli was demonstrated. In the recent article by Peter et al., the influence of DNA supercoiling on transcription was studied using DNA microarrays to systematically probe the expression profiles of all E. coli genes. The authors demonstrated that supercoiling may act as a 'transcription factor' and that it can have either a negative or a positive effect on the transcription of a specific gene. They identified 306 'supercoiling-sensitive genes', and the expression of most of these genes correlates very well with the amount of chromosomal relaxation in each experiment. The fact that most of these supercoiling-sensitive genes were localized in regions of high density 'clumps' that were affected by DNA relaxation agrees well with the findings by Jeong et al. that short-range correlations are dependent on negative supercoiling.
The outermost two circles in Figure 1 are based on the data of Peter et al. and show the locations of supercoiling-sensitive genes (log p values; circle 1) and the correlation with chromosomal relaxation (circle 2). Anti-correlations corresponding to regions where expression decreases upon DNA relaxation were also found. As reported by Peter et al., chromosomal regions with significant numbers of supercoiling-sensitive genes generally overlap with regions that are more correlated or anti-correlated with the level of chromosomal relaxation than regions with no supercoiling-sensitive genes.
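To illustrate what correlation and anti-correlation with chromosomal relaxation mean for a single gene, here is a minimal sketch that computes a Pearson coefficient between a gene's expression changes and the degree of relaxation across experiments. The choice of Pearson correlation is an assumption (the statistic actually used by Peter et al. may differ), and all numbers are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: degree of relaxation in four experiments and the
# log expression ratios of two made-up genes in the same experiments.
relaxation = [0.1, 0.3, 0.5, 0.8]
relaxation_induced_gene = [0.2, 0.5, 1.1, 1.9]
relaxation_repressed_gene = [-0.1, -0.6, -0.9, -1.8]

if __name__ == "__main__":
    print(f"induced gene:   r = {pearson(relaxation, relaxation_induced_gene):.2f}")
    print(f"repressed gene: r = {pearson(relaxation, relaxation_repressed_gene):.2f}")
```

A gene whose expression rises with relaxation gives a coefficient near +1, and one whose expression falls gives a coefficient near -1, which corresponds to the anti-correlated regions mentioned above.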
Some of the chromosomal regions that are mostly correlated with supercoiling overlap with regions showing differential expression patterns among the experimental conditions used by Jeong et al. For example, gyrA and gyrB at 2.33 megabases and 3.88 megabases on the chromosome, respectively, are highly expressed in DNA-relaxed cells (wild-type cells grown with norfloxacin; circle 6 in Figure 1) but hardly expressed in wild-type cells grown in rich (LB; circle 3) or minimal (M9; circle 4) media. Because of the experimental conditions used in the two studies, however, this picture is expected for the gyrase genes. These genes are known to be sensitive to supercoiling and are involved in maintaining a precise level of supercoiling in the cell. Thus, the inhibition of these proteins is very likely to increase their mRNA expression. Surprisingly, a substantial number of additional genes were also affected by gyrase inhibition, indicating that this change in expression has to be due to the effect that gyrase inhibition has on DNA supercoiling - that is, chromosomal relaxation.
Peter et al. also found that supercoiling-sensitive genes whose expression increased upon DNA relaxation were significantly more AT-rich in their upstream and coding regions compared with the corresponding regions of genes not sensitive to supercoiling; the opposite was true for supercoiling-sensitive genes whose expression decreased upon DNA relaxation. This may, however, be due to the fact that AT-rich regions tend to be more curved than AT-poor regions. Supercoiling-sensitive genes may, therefore, be expected to be more AT-rich in upstream regions than genes that are regulated by means other than supercoiling. Nonetheless, these small local variations in upstream regions are not visible on the genome-scale atlas plot (Figure 1, circle 9). Because these supercoiling-sensitive genes are localized to specific regions, one would expect that in some cases a region would appear AT-rich if all of its supercoiling-sensitive genes were significantly AT-rich in their upstream regions.
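As a rough illustration of how AT content can be measured at different scales, the sketch below computes the AT fraction of a sequence and of sliding windows along it. The function names, window sizes and toy sequence are assumptions made for this example, not the procedure used for the atlas plot.

```python
def at_fraction(seq):
    """Fraction of A and T among the unambiguous bases (A, C, G, T) in seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    return sum(b in "AT" for b in bases) / len(bases)


def windowed_at(seq, window, step):
    """Return (start, AT fraction) for each full window along seq."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, at_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence: a short AT-rich stretch embedded in GC-rich flanks.
    toy = "GCGCGGCC" * 4 + "ATTATAAT" + "GCCGGCGC" * 4
    for start, frac in windowed_at(toy, window=8, step=8):
        print(f"window at {start:3d}: AT = {frac:.2f}")
    print(f"whole sequence: AT = {at_fraction(toy):.2f}")
```

In the toy sequence the AT-rich stretch stands out in 8-base windows but is diluted in the whole-sequence average, which is the same effect that hides short upstream regions in a genome-scale plot.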
From a DNA topology perspective, the plectonemic supercoils contain more potential energy, in terms of driving superhelical-dependent transitions (such as melting of the DNA helix). Thus, if there were regions along the chromosome that contained lots of binding sites for proteins involved in chromatin structure, most of the supercoiling would be transiently restrained, and hence less free energy would be available for transcription. In addition, the chromatin proteins can physically block the RNA polymerase from binding to the DNA. Because the E. coli chromatin proteins Ihf and Fis show some sequence specificity, it is possible to predict binding sites throughout the chromosome. On a global scale, there tends to be an anti-correlation between these chromatin-binding sites and regions of highly expressed genes. Finally, on the more local level of a few kilobases (for example, an operon), it is possible to predict regions that tend to exclude chromatin proteins and hence might potentially be highly expressed. In Figure 1, this 'nucleosomal position preference' measure is plotted in circle 8. As expected, regions of low position preference tend to correspond to the regions with highly expressed genes found by Jeong et al. However, the majority of cellular DNA is compacted transiently by chromatin proteins, and there are many regions that are not highly expressed but are nonetheless regulated, with their relative expression levels dependent on supercoiling.
Originally, it was postulated that the chromosome was divided into 12-80 topologically isolated loops, so-called domains, in which chromatin could be relaxed independently of supercoiling in nearby domains. Later this number was estimated more precisely at around 50 domains, corresponding to a domain size of approximately 100 kb. Recently, Postow et al. presented evidence of an even smaller domain size of approximately 10 kb on average, corresponding to as many as 400 distinct topological domains in E. coli. This result corresponds very well with the finding of Jeong et al. that up to 16 genes exhibit apparent coherent transcriptional activity and the idea that genes may be organized into confined supercoiled domains with a size of up to 16 kb.
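The relation between domain number and domain size is simple arithmetic on the chromosome length. The sketch below assumes a round chromosome size of about 4.6 Mb for E. coli K-12, a value not stated in the text; the function names are illustrative.

```python
# Assumed approximate size of the E. coli K-12 chromosome in base pairs.
GENOME_SIZE_BP = 4_600_000


def domain_count(domain_size_bp, genome_size_bp=GENOME_SIZE_BP):
    """Number of equally sized domains of the given size that fit the chromosome."""
    if domain_size_bp <= 0:
        raise ValueError("domain size must be positive")
    return genome_size_bp / domain_size_bp


def mean_domain_size(n_domains, genome_size_bp=GENOME_SIZE_BP):
    """Mean domain size in base pairs if the chromosome is split into n_domains."""
    if n_domains <= 0:
        raise ValueError("number of domains must be positive")
    return genome_size_bp / n_domains


if __name__ == "__main__":
    print(f"50 domains    -> {mean_domain_size(50) / 1000:.0f} kb each")
    print(f"10 kb domains -> {domain_count(10_000):.0f} domains")
    print(f"16 kb domains -> {domain_count(16_000):.0f} domains")
```

With this assumed chromosome size, 50 domains come out at roughly 90 kb each and 10 kb domains number a few hundred, in line with the approximate figures quoted above.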
The fact that the genes identified as sensitive to supercoiling have a variety of functions supports the hypothesis that supercoiling may act as a global transcriptional regulatory mechanism and that the cell may use this mechanism as an environmental sensor because the topology of the chromosome may be affected by the surrounding environment. The chromatin protein H-NS regulates many environmental genes, probably through topological changes to DNA.
One final aspect of this global view of regulation of transcription at the level of chromatin structure is that some of these environmentally regulated and supercoiling-sensitive genes are involved in bacterial pathogenesis. For example, in Salmonella it has been shown that expression of genes involved in invasion is regulated by DNA supercoiling. Thus, the global regulation of gene expression by DNA topology could prove to be an important aspect of understanding the mechanisms of bacterial virulence.
This work was supported by a grant from the Danish Center for Scientific Computing.