Basal core promoters control the equilibrium between negative cofactor 2 and preinitiation complexes in human cells
© Albert et al.; licensee BioMed Central Ltd. 2010
Received: 17 October 2009
Accepted: 15 March 2010
Published: 15 March 2010
The general transcription factor TFIIB and its antagonist negative cofactor 2 (NC2) are hallmarks of RNA polymerase II (RNAPII) transcription. Both factors bind TATA box-binding protein (TBP) at promoters in a mutually exclusive manner. Dissociation of NC2 is thought to be followed by TFIIB association and subsequent preinitiation complex formation. TFIIB dissociates upon RNAPII promoter clearance, thereby providing a specific measure for steady-state preinitiation complex levels. As yet, genome-scale promoter mapping of human TFIIB has not been reported. It thus remains elusive how human core promoters contribute to preinitiation complex formation in vivo.
We compare target genes of TFIIB and NC2 in human B cells and analyze associated core promoter architectures. TFIIB occupancy is positively correlated with gene expression, with the vast majority of promoters being GC-rich and lacking defined core promoter elements. TATA elements, but not the previously in vitro defined TFIIB recognition elements, are enriched in some 4 to 5% of the genes. NC2 binds to a highly related target gene set. Nonetheless, subpopulations show strong variations in factor ratios: whereas high TFIIB/NC2 ratios select for promoters with focused start sites and conserved core elements, high NC2/TFIIB ratios correlate to multiple start-site promoters lacking defined core elements.
TFIIB and NC2 are global players that occupy active genes. Preinitiation complex formation is independent of core elements at the majority of genes. TATA and TATA-like elements dictate TFIIB occupancy at a subset of genes. Biochemical data support a model in which preinitiation complex but not TBP-NC2 complex formation is regulated.
The core region of metazoan promoters shows various architectures and can harbor several distinct motifs, termed TATA box (TATA) , initiator (INR) , downstream promoter element (DPE) , downstream core element , upstream and downstream TFIIB recognition elements (BREu and BREd, respectively) [5, 6] and motif ten element  (reviewed in ). These elements facilitate assembly of the transcription machinery in a cooperative manner and are thought to contribute to accurate initiation at a defined transcription start site (TSS) . In a majority of vertebrate genes core promoter elements are less represented . Instead, they reside in CpG islands and are GC-rich. These promoters assemble general transcription factors (GTFs) in a manner that remains poorly understood.
The general initiation factor TFIIB is absolutely required for transcription initiation by RNA polymerase II (RNAPII) . TFIIB associates with TATA box-binding protein (TBP) and establishes sequence-specific contacts in the major groove upstream and in the minor groove downstream of TATA . The upstream binding site, termed BREu, has been defined via an in vitro selection procedure employing the TATA-containing Adenovirus major late (AdML) promoter . The corresponding high-affinity downstream element, BREd, was characterized via site selection in the context of the TATA-containing Adenovirus E4 (AdE4) promoter . Both elements stabilize the TFIIB-TBP-promoter complex in vitro. BREu and BREd suppressed basal transcription of the AdML core promoter ; however, BREd enhanced activity of the AdE4 promoter . Broadly, these data are in conflict with a general positive role of TFIIB in transcription.
The function of TFIIB has not been investigated in vivo, nor has TFIIB occupancy so far been correlated with gene activity. Prevalence of BREs in active genes remains subject to controversy. A computational study based on statistical analysis of curated promoter sets concluded that up to 25% of human core promoters contain a potential BREu. The motif was found to be enriched in CpG promoters (>30% frequency) but depleted in CpG-less promoters (<10% frequency) . In contrast, a recent large-scale study of CAGE (cap analysis of gene expression) data sets in mammals did not reveal clear evidence of BREu over-representation in these regions . The prevalence of BREd in mammalian promoters has not been investigated by bioinformatic means.
Genome-wide binding studies on general initiation factors have been extensively performed in yeast and include maps of TBP, TFIID and SAGA [16, 17], GTFs , Mediator [19, 20], and Mot1 and negative cofactor 2 (NC2) . However, with few exceptions [22–25] comparable studies in mammalian cells are lacking. Here we conducted a comparative genome-wide analysis on promoter association of human TFIIB and NC2 and correlate it with gene expression and core promoter architecture. Whereas most genes direct preinitiation complexes (PICs) to their promoters in the apparent absence of core promoter elements, a small subset of highly expressed genes with high TFIIB/NC2 ratios direct binding of PICs via core promoters. Biochemical data suggest that TATA and regulatory factors positively control TFIIB but not (or to a lesser extent) NC2 binding, thereby providing a model for binding of GTFs in the absence of core elements and alterations in TFIIB/NC2 ratios inside cells. In addition to defining a library of promoters ranked by steady-state levels of PICs in human B cells, the comparative analyses of TFIIB and NC2 also establish a resource for human basal core promoters.
Genome-wide promoter binding of TFIIB
Table 1. TFIIB peak identification: peaks called in TFIIB replicate 1 and TFIIB replicate 2 at cut-offs of mean + 1.0 s.d., mean + 2.0 s.d. and mean + 2.5 s.d.
TFIIB occupancy correlates positively with steady-state mRNA levels
At single genes TFIIB occupancy matched well with steady-state mRNA levels in LCL721 B cells  (Figure 2b, lower panel). To corroborate this at a genome-wide scale, TFIIB occupancy levels were correlated with mRNA levels for all genes. To this end, the median of the TFIIB ChIP-chip signal on each NimbleGen promoter array probeset was plotted against the normalized mRNA hybridization signal on the corresponding probeset of an Affymetrix gene expression array (Figure 2d). Then, a sliding window was moved over the ChIP-chip data from genes with low TFIIB levels to genes with high TFIIB levels and the average expression for these subgroups was determined. The resulting curve revealed a significant positive correlation between TFIIB occupancy and gene expression (Pearson's correlation r = 0.97). Moreover, a disproportionately high number of the most strongly expressed genes bear high TFIIB levels, as revealed by the skewed distribution of expression quantiles (Figure 2e). Here, 94% of the genes in the upper 10th percentile of TFIIB occupancy are expressed above average (median of all expression array signals), and 37% fall into the top 10% of expressed genes. In contrast, 26% of the genes in the lower 10th percentile of TFIIB occupancy are expressed above average, and only 2% of those are amongst the top 10% of all expressed genes. These outliers may reflect gene expression control at posttranscriptional stages, for example, through stabilization of mRNAs. To statistically evaluate the observed difference, a Kolmogorov-Smirnov test was applied. It confirmed with a significance level of P < 2e-16 that the distribution of expression signals in the upper 10th percentile of TFIIB occupancy is highly dissimilar to the distribution in all genes. Taken together, these analyses indicate that TFIIB-dependent PIC formation provides an excellent measure for gene activity, both at the single gene and the genome-wide level.
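The sliding-window step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the window size, the step, the function names and the toy values are all assumptions, and the wording of the text suggests (but does not state outright) that the correlation coefficient was computed on the window-averaged curve.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def sliding_window_means(occupancy, expression, window, step=1):
    """Sort genes by occupancy, then average both values within each window."""
    pairs = sorted(zip(occupancy, expression))
    occ_means, expr_means = [], []
    for start in range(0, len(pairs) - window + 1, step):
        chunk = pairs[start:start + window]
        occ_means.append(sum(o for o, _ in chunk) / window)
        expr_means.append(sum(e for _, e in chunk) / window)
    return occ_means, expr_means


# Hypothetical toy data: median log2 ChIP/input per probeset and
# normalized expression signal for the matching expression probeset.
occupancy = [-0.8, -0.5, -0.2, 0.0, 0.3, 0.6, 0.9, 1.2, 1.5, 1.8]
expression = [5.1, 6.4, 5.0, 7.2, 6.1, 8.3, 7.0, 9.4, 8.2, 10.1]

occ_means, expr_means = sliding_window_means(occupancy, expression, window=4)
r_raw = pearson_r(occupancy, expression)
r_smoothed = pearson_r(occ_means, expr_means)
```

In this toy example the correlation of the window averages is higher than that of the raw gene-by-gene values, because averaging removes gene-level scatter. A coefficient obtained on a smoothed curve therefore describes the trend across occupancy ranks rather than the predictability of expression for a single gene.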
Human core promoter structure associated with preinitiation complexes
Comparison of genome-wide TFIIB and NC2 promoter occupancy
Intact core promoters select for TFIIB and against NC2
We then asked if we could identify core promoter structures that relate to different TFIIB/NC2 ratios. Here, we focused on active genes, that is, genes that are expressed above average and are bound by both factors, using the 60th percentile for TFIIB and NC2 gene occupancy as well as for steady-state mRNA levels as cut-off. From these, the top 100 genes showing the highest or lowest TFIIB/NC2 ratios were selected for further analysis. Alignment of the promoter regions of the top 100 TFIIB-dominated genes yielded structured core regions with the most frequent bases resembling the INR consensus at positions -2 to +5 (Figure 5b, upper panel). Preferred bases at positions -35 to -25 (CGGCTAAAAAA) matched conserved BREu and TATA residues. Also, a G-rich sequence around +30 (GGGCGT) resembled the DPE motif (RGWYVT)  identified in Drosophila. In contrast, alignment of NC2-dominated genes did not reveal recognizable core elements. Instead, the core regions of these genes were enriched for G and C, which were the most frequent bases at every single position from -50 to +50 (Figure 5b, lower panel).
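The gene selection described above can be sketched as a filter followed by a ranking. This is a minimal illustration under stated assumptions: occupancy values are taken to be log2 ChIP/input ratios, so that a TFIIB/NC2 ratio becomes a difference; the nearest-rank percentile definition, the dictionary keys and the toy values are hypothetical.

```python
import math


def percentile_cutoff(values, pct):
    """Nearest-rank percentile (one of several common definitions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def rank_by_ratio(genes, pct=60, top_n=100):
    """Keep genes at or above the cut-off for TFIIB, NC2 and mRNA,
    then return the top_n TFIIB-dominated and top_n NC2-dominated genes."""
    tfiib_cut = percentile_cutoff([g["tfiib"] for g in genes], pct)
    nc2_cut = percentile_cutoff([g["nc2"] for g in genes], pct)
    mrna_cut = percentile_cutoff([g["mrna"] for g in genes], pct)
    active = [
        g for g in genes
        if g["tfiib"] >= tfiib_cut and g["nc2"] >= nc2_cut and g["mrna"] >= mrna_cut
    ]
    # On a log2 scale the TFIIB/NC2 ratio is the difference of the two values.
    ranked = sorted(active, key=lambda g: g["tfiib"] - g["nc2"], reverse=True)
    return ranked[:top_n], ranked[-top_n:][::-1]


# Hypothetical toy gene set (names and values are made up for illustration).
names = "ABCDEFGHIJ"
tfiib = [0.1, 0.2, 0.3, 0.4, 0.5, 1.0, 1.2, 1.4, 1.6, 1.8]
nc2 = [0.1, 0.3, 0.2, 0.5, 0.4, 1.5, 0.6, 1.0, 0.9, 0.8]
mrna = [5.0, 6.0, 5.5, 6.5, 7.0, 9.0, 8.0, 7.5, 10.0, 11.0]
genes = [
    {"name": n, "tfiib": t, "nc2": c, "mrna": m}
    for n, t, c, m in zip(names, tfiib, nc2, mrna)
]

tfiib_dominated, nc2_dominated = rank_by_ratio(genes, pct=60, top_n=2)
```

With the toy values, genes F to J pass all three cut-offs; J and I have the highest ratios and F and H the lowest. On a real data set the two lists only stay disjoint if the active set holds at least twice `top_n` genes.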
The enrichment of core promoter elements in TFIIB- versus NC2-dominated genes was analyzed further. Enumeration of motif frequencies revealed that in 81% of TFIIB-dominated genes but in only 38% of NC2-dominated genes, at least one core promoter motif was present (Figure 5c). Strikingly, 27% and 11% genes of the former group harbored combinations of two or three motifs, whereas only 4% and zero genes of the latter group contained such binary and ternary motif combinations. Individual motif frequencies are summarized in Figure 5d. Comparing TFIIB- versus NC2-dominated genes, TATA was revealed as the most strongly enriched motif. It was present in 39% of TFIIB-dominated genes but in only 1% of NC2-dominated genes (Figure 5d). Other significantly enriched motifs included DPE (11% versus 1%), BREu (6% versus 1%) and, to a lesser extent, TATA-like (63% versus 16%), BREu-like (35% versus 13%) and INR (20% versus 13%). Again, BREd was not identified above stochastic levels in the TATA downstream region. In aligned TATA consensus promoters of TFIIB-dominated genes, the preferred bases upstream of TATA were consistent with described TFIIB contacts [6, 12] at the BREu (G at position -34 and C at position -32 were found in 42% and 52% of all TATA promoters), whereas the base composition downstream of TATA did not show homology to the BREd consensus. For example, thymine was the least frequent base at position -24, while it is the most frequent base in the in vitro selected BREd consensus sequence RTDKKKK . Base composition rather resembled the upstream region by showing preferential usage of G and C. From these data and the insignificant abundance of BREd, we conclude that BREd does not correlate with PIC formation and TFIIB binding in vivo.
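Consensus motifs such as RGWYVT (DPE) and RTDKKKK (BREd) are written in IUPAC nucleotide codes, so motif enumeration reduces to position-wise matching against sets of allowed bases. The sketch below is illustrative only: the function names and the mismatch parameter are assumptions, and it does not reproduce the positional windows or the background model used for the frequencies reported above.

```python
# Standard IUPAC nucleotide codes and the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matching_positions(consensus, site):
    """Count positions at which the site agrees with the IUPAC consensus."""
    if len(consensus) != len(site):
        raise ValueError("consensus and site must have the same length")
    return sum(base in IUPAC[code] for code, base in zip(consensus, site.upper()))


def find_motif(consensus, sequence, max_mismatch=0):
    """Return 0-based start positions of all sites within max_mismatch."""
    k = len(consensus)
    seq = sequence.upper()
    hits = []
    for start in range(len(seq) - k + 1):
        if k - matching_positions(consensus, seq[start:start + k]) <= max_mismatch:
            hits.append(start)
    return hits


# The G-rich sequence around +30 quoted in the text, scored against the DPE consensus.
dpe_score = matching_positions("RGWYVT", "GGGCGT")

# Hypothetical sequence containing one exact DPE match (AGACGT) at offset 2.
dpe_hits = find_motif("RGWYVT", "CCAGACGTCC")
```

GGGCGT agrees with RGWYVT at five of six positions (the third base is G where W allows only A or T), which is consistent with the text describing it as resembling, rather than matching, the DPE motif.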
The high prevalence of motif combinations in TFIIB-dominated genes is illustrated in Figure 5e. In line with the known synergy between INR and TATA, 94% and 56% of promoters harboring the INR motif also contained a TATA-like or TATA consensus sequence, respectively. A strong linkage was also observed for DPE and TATA: 89% of DPE promoters harbored a TATA-like sequence, and 67% of DPE promoters a TATA consensus motif in the upstream region around -30. This is unexpected, since the DPE was functionally identified in Drosophila promoters as a surrogate core element in TATA-less promoters. Finally, 50% of promoters with a BREu-like motif around position -32 contained an adjacent TATA-like sequence and 32% a downstream TATA consensus, reflecting the above observation of conserved BREu residues in TATA-containing promoters with high TFIIB levels. Taken together, TFIIB strongly selects for TATA as well as for synergistic combinations of TATA with INR or DPE and, to a lesser extent, with BREu-like sequences in human core promoters.
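The linkage percentages quoted above are conditional frequencies: among promoters carrying one motif, the fraction that also carries a second. A minimal sketch, assuming each promoter has already been annotated with a set of motif labels (the labels and the toy annotation are hypothetical):

```python
def cooccurrence(promoters, given, other):
    """Fraction of promoters containing `given` that also contain `other`.
    Returns None when no promoter contains `given`."""
    with_given = [motifs for motifs in promoters if given in motifs]
    if not with_given:
        return None
    return sum(other in motifs for motifs in with_given) / len(with_given)


# Hypothetical motif annotation for five promoters.
promoters = [
    {"INR", "TATA"},
    {"INR", "TATA-like"},
    {"TATA"},
    {"DPE", "TATA"},
    set(),
]

inr_with_tata = cooccurrence(promoters, "INR", "TATA")
dpe_with_tata = cooccurrence(promoters, "DPE", "TATA")
```

The measure is asymmetric: the fraction of INR promoters with TATA generally differs from the fraction of TATA promoters with INR, so the conditioning motif needs to be stated, as it is in the text.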
NC2 is more frequent on genes with multiple start sites lacking defined core promoter elements
Average occupancy profiles of TFIIB and NC2 at promoters of genes with high or low TFIIB/NC2 ratios (1,000 for each group) showed similar factor profiles in the former group, with peak maxima coinciding at position -50 (Figure 6b, left). In contrast, at genes with low TFIIB/NC2 ratios a broader distribution of both factors (ranging from -90 to -290) was observed (Figure 6b, right). Here, NC2 is markedly enriched in upstream regions relative to TFIIB, perhaps indicating a specific role of NC2 on genes with multiple start sites. The difference between the TFIIB and NC2 distributions on these genes was confirmed with high confidence (P < 2.2e-16) by a Wilcoxon-Mann-Whitney test on the positions of TFIIB and NC2.
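A Wilcoxon-Mann-Whitney comparison of two sets of probe positions can be sketched as follows. This is an illustrative implementation, not the R routine used for the analysis: it uses the normal approximation without tie or continuity correction, and the positions are made-up values relative to the TSS.

```python
from math import erfc, sqrt


def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: number of (a, b) pairs with a > b, ties count 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def two_sided_p(u, n_a, n_b):
    """Two-sided P value from the normal approximation (no tie correction)."""
    mean_u = n_a * n_b / 2
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    return erfc(abs(z) / sqrt(2))


# Hypothetical positions (bp relative to the TSS) of high-score probes.
tfiib_positions = [-60, -50, -40, -55, -45]
nc2_positions = [-250, -200, -150, -120, -90]

u_stat = mann_whitney_u(tfiib_positions, nc2_positions)
p_value = two_sided_p(u_stat, len(tfiib_positions), len(nc2_positions))
```

The pairwise loop is quadratic in sample size, which is acceptable for a sketch but would be replaced by a rank-based computation for thousands of probes.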
TFIIB/NC2 ratios are influenced by both activators and core promoter elements
Our analysis establishes the first genome-wide reference data set for steady-state occupancy levels of vertebrate PICs. The comparative analysis of TFIIB and NC2 occupancy with gene expression further provides a framework for future detailed analyses of basal versus gene regulatory mechanisms on individual or groups of human genes. Our data presently suggest that PIC (or TBP-TFIIB) association correlates with TATA or is independent of core elements altogether, whereas NC2 association is largely independent of the underlying core promoter structure.
TFIIB and NC2 act globally and are present at active genes. We report a strong positive correlation of TFIIB with gene expression levels at a genome-wide scale (Figure 2d), which is in line with the factor's original definition as a crucial PIC component. Conflicting reports indicating a negative TFIIB impact through BRE interactions (see Introduction) are not reflected in our genome-wide data, although we cannot exclude such mechanisms at specific genes. At least for highly expressed genes, our data argue for an inhibitory function of NC2. It remains to be proven that NC2 can also act positively on certain genes. Candidates for the latter are genes with multiple start sites that produce high mRNA levels and display high NC2/TFIIB ratios. A possible mechanism is that efficient promoter association of TBP depends on NC2 at such genes.
Our data argue against a positive influence of core elements on NC2 promoter association. For example, NC2-dominated genes with high NC2/TFIIB ratios were enriched for GC but depleted of core promoter elements, in particular TATA, BREu and DPE (Figure 5b-d). Attempts to show direct specificity of TBP-NC2 complexes for GC-rich regions failed (Christine Göbel and MM, unpublished). Enrichment of NC2 on such genes probably reflects low initiation rates from start sites located further upstream of a major TSS. At the majority of genes, however, the occupancy distributions of TFIIB and NC2 are very similar. This indirectly suggests that TBP, the partner of both TFIIB and NC2, dictates the recognition site. However, alternative scenarios in which NC2 binding and PIC formation become coupled can be envisaged. For example, when RNAPII clears the promoter it leaves TBP behind. The latter may subsequently be recognized and stabilized by the abundant NC2 complex.
NC2 occupancy and activity appear in a distinct light when compared with TFIIB. A generally positive correlation of NC2 binding with the presence of TATA turns into a negative correlation relative to the competing GTF TFIIB. Related to this, NC2 occupancy positively correlates with gene expression, yet the correlation of TFIIB with expression is more pronounced. Indeed, TFIIB/NC2 ratios increase especially in the most strongly expressed 5% of the B cell genes (Figure 5a). Our data thus argue for a negative role of NC2 at strongly expressed genes carrying intact core promoters. This is consistent with the original reports by Reinberg and our laboratory [33–36].
TATA, although a rather infrequent motif, is positively correlated with the binding of TFIIB (Figure 3a). Somewhat surprisingly, we found little evidence for a critical role of the previously defined BREs in PIC formation. The BREu consensus is found in approximately 3% of the preferred TFIIB target genes (Figure 3b). In pre-selected TFIIB-dominated genes the BREu frequency increases only moderately to 6% (Figure 5c). BREd is not found above stochastic levels and, hence, is apparently not linked to TFIIB-driven PIC formation. One may object that BREs are more degenerated in sequence and difficult to track, especially in the absence of TATA boxes, where the position of TFIIB-DNA interaction is less predictable. Along this line we note that genes with a high TFIIB/NC2 ratio often carry GC-rich regions that resemble the upstream BREu. In summary, the data imply that conserved BRE motifs with position and sequence fidelity comparable to the TATA consensus do not play a significant role in TFIIB promoter association.
Most genes that bind TFIIB with high efficiency (top 5%) seem not to employ core elements to facilitate or stabilize GTF-core promoter interactions. TATA consensus is found with a frequency below 5%, TATA-like elements reach 29% (Figure 3a). The DPE, downstream core element and motif ten element were not detected above stochastic levels in the top 5% of target genes of either TFIIB or NC2. So far our attempts have failed to select associated structure in core promoters for the few genes where these elements may play a role. We could also not reconstruct an alternative (that is, mammalian) DPE from the information obtained with high-TFIIB or high-NC2 target genes. Generally, core elements were most well represented in a small subset of genes that have high expression levels and at the same time display high TFIIB/NC2 ratios. In this small subset we did identify with a frequency of 11% a positioned DPE-like motif conforming to the Drosophila consensus RGWYVT . In contrast to the situation in Drosophila, DPE presence is strongly linked to TATA in this subset of human promoters (Figure 5d).
We hypothesize that at the majority of genes lacking intact core elements, promoters are accessible in chromatin and/or may ultimately direct GTFs to promoters via interactions with regulatory surfaces, for example, through gene-specific activators. To prove this assumption, individual genes will have to be studied in detail both in vivo and in vitro. While this will undoubtedly uncover different scenarios in directing PIC formation, we have initially taken a reductionist biochemical approach using one model activator together with prototypic (TATA+/-, INR+) promoters (Figure 7). Most importantly, the activator, and to a lesser extent TATA, influence binding of TFIIB, while NC2 is unresponsive to the activator. NC2 also has less affinity for TATA, yet TBP-NC2 complexes retain moderate specificity for TATA . This result suggests that PICs might be directed to promoters by activators, whereas the core promoters contribute to their binding and less to the association of NC2 with promoters. The high prevalence of intact core elements and their combinations in the small subset of TFIIB-dominated genes as well as the positive correlation of high TFIIB/NC2 ratios to gene expression levels (Figure 5a) suggests that core promoter elements contribute to gene activity in this subgroup of genes. The model predicts that binding of GTFs may be largely directed by activators on GC-rich promoters, whereas direct binding of GTFs and, to a lesser extent, regulatory factors contribute to the activity of the small subset of genes carrying multiple intact core elements within promoters.
TFIIB and NC2 are global factors acting at a large fraction of all human genes. TATA was revealed as the most influential element for TFIIB recruitment and PIC formation. Most genes, however, recruit general factors in the absence of known GTF binding sites. We hypothesize that at these genes, TFIIB/NC2 ratios are determined by interactions between regulatory factors and the RNAPII machinery. There is overwhelming evidence for the influence of regulatory factors on PIC formation, but little precedent for direct action of activators on NC2. Our in vitro binding studies using VP16 as a model transactivator gave the same result. On the other hand, core promoter elements are the major determinant for PIC binding in a subgroup of highly expressed genes that are characterized by high TFIIB/NC2 ratios. This subgroup establishes a small pool of human core promoters that may prove useful for future analyses of interactions between GTFs, cofactors and core promoters.
Materials and methods
Anti-TFIIB antibody (sc-225) and non-specific IgG serum (sc-2027) were purchased from Santa Cruz Biotechnology (Santa Cruz, CA, USA). Anti-NC2 alpha (DRAP1) antibody 4G7 has been previously described .
LCL721 cells were grown in RPMI 1640 medium supplemented with 10% (v/v) heat-inactivated fetal bovine serum, 5 mM L-glutamine and 100 units/ml penicillin-streptomycin (all from Invitrogen, Karlsruhe, Germany) in a humidified incubator at 37°C and 5% CO2.
We pelleted 1 × 10⁸ cells (0.4 × 10⁶ cells/ml) by centrifugation (1,200 rpm, 5 minutes) and washed them with PBS. The cell pellet was resuspended in 36 ml of PBS. Cells were fixed by adding 4 ml of a freshly prepared 10% formaldehyde solution (10% (v/v) formaldehyde (Sigma-Aldrich, Taufkirchen, Germany), 140 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 50 mM Hepes-KOH pH 8.0). Cross-linking was done for 9 minutes at room temperature, followed by quenching with 125 mM glycine, with immediate transfer of cells to ice followed by 5 minutes incubation on ice. Cells were washed twice with ice-cold PBS and sequentially lysed by resuspending the cell pellet in 5 ml of ice-cold ChIP lysis buffer 1 (50 mM Hepes-KOH pH 7.4, 140 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 10% (v/v) glycerol, 0.5% (v/v) Igepal CA-630 (Sigma-Aldrich, Taufkirchen, Germany), 0.25% Triton X-100 (Sigma-Aldrich), and freshly added 1× protease inhibitor cocktail (Roche, Mannheim, Germany)) and 10 minutes rotation at 4°C. Cells were collected by centrifugation (4,000 rpm, 10 minutes, 4°C), followed by resuspension in 5 ml of ice-cold ChIP lysis buffer 2 (10 mM Tris-HCl pH 8.0, 200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, freshly added 1× protease inhibitor cocktail) and 10 minutes rotation at 4°C. After centrifugation (4,000 rpm, 10 minutes, 4°C), the pellet was resuspended in 3 ml of ice-cold ChIP lysis buffer 3 (10 mM Tris-HCl pH 8.0, 140 mM NaCl, 1 mM EDTA, 1 mM EGTA, 0.5% N-lauryl sarcosine (Sigma-Aldrich), 0.1% sodium deoxycholate (Sigma-Aldrich), and 1× protease inhibitor cocktail). Acid-washed glass beads (212 to 300 microns; Sigma-Aldrich) were added, and the cross-linked chromatin was sheared to an average size of 300 bp by 6 minutes sonication (40% power output, with pulses set to 30 s ON/10 s OFF) in an ice-water bath using a Branson 250-D sonicator and a microtip. After sonication, Triton X-100 was added to 0.5% as final concentration, and the lysate was centrifuged (5,500 rpm, 5 minutes, 4°C) to remove cell debris.
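The final formaldehyde concentration during cross-linking follows from the volumes given above (4 ml of a 10% solution added to 36 ml of cell suspension). A short check, in which the helper name is hypothetical and volumes are assumed to be additive:

```python
def final_concentration(stock_pct, stock_ml, diluent_ml):
    """Final concentration after mixing stock into diluent (additive volumes assumed)."""
    return stock_pct * stock_ml / (stock_ml + diluent_ml)


# 4 ml of 10% (v/v) formaldehyde solution added to 36 ml of cells in PBS.
formaldehyde_pct = final_concentration(10.0, 4.0, 36.0)
```

Under these assumptions the cells are fixed in 1% (v/v) formaldehyde in a total volume of 40 ml.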
The chromatin extract was pre-cleared with 100 μl blocked (pre-absorbed with PBS/0.5% (w/v) BSA (Sigma-Aldrich)) protein A/G sepharose FF beads (GE Healthcare, Munich, Germany) for 2 h at 4°C, quantified in a UV spectrophotometer and diluted to 1 mg/ml and 0.25% N-lauryl sarcosine. We used 500 μl of the chromatin extract per single ChIP reaction in lubricated tubes in a total volume of 1 ml. The extract was incubated overnight at 4°C with 50 μl blocked protein A/G sepharose beads that had been pre-adsorbed with 10 μg of antibody. Immune complexes were collected by centrifugation (3,000 rpm, 1 minute, 4°C) and washed six times with 1 ml of ice-cold ChIP wash buffer (50 mM Hepes-KOH pH 7.4, 500 mM LiCl, 1 mM EDTA, 1% Igepal Ca-630, 0.7% sodium deoxycholate, and freshly added 0.5× protease inhibitor cocktail) and one time with 1× TE (10 mM Tris-HCl pH 8.0, 1 mM EDTA) containing 50 mM NaCl. The protein-DNA complexes were eluted from the beads by adding 200 μl ChIP elution buffer (50 mM Tris-HCl pH 8.0, 10 mM EDTA, 1% sodium dodecyl sulfate (SDS; Invitrogen)) and incubation at 65°C under constant agitation for 10 minutes. After removal of beads by centrifugation (6,000 rpm, 5 minutes, room temperature) the supernatant was incubated at 65°C overnight to revert the cross-links. Then, the sample was diluted to 400 μl with 1× TE. DNAse-free RNAse A (8 μg; RPA grade; Applied Biosystems, Foster City, CA, USA) was added and the sample was incubated for 1 h at 37°C, followed by addition of proteinase K (PCR grade; Roche) to 250 μg/ml and digestion for 2 h at 55°C. Genomic DNA was isolated from the precipitated material as well as from the sheared chromatin input (1% of the material used for ChIP) by phenol extraction and ethanol precipitation.
ChIP and input DNA were end-polished using T4 DNA polymerase (New England Biolabs, Ipswich, MA, USA) and 200 μM dNTPs for 20 minutes at 12°C. After phenol extraction and ethanol precipitation, blunted DNA was ligated to 100 pmol of annealed linker (composed of oligo-25, 5'-GCGGTGACCCGGGAGATCTGAATTC, and oligo-11, 5'-GAATTCAGATC) using T4 DNA ligase (New England Biolabs) overnight at 16°C. DNA was ethanol precipitated and amplified by ligation-mediated PCR in a total volume of 55 μl containing 250 μM dNTPs, 50 pmol of oligo-25, 5 units of Taq polymerase (New England Biolabs) and 0.025 units of Pfu Turbo polymerase (Stratagene, La Jolla, CA, USA) for one initial cycle consisting of 2 minutes at 55°C (during which polymerase was added), 5 minutes at 72°C and 2 minutes at 94°C, followed by 22 cycles of 0.5 minutes at 94°C, 0.5 minutes at 60°C and 2 minutes at 72°C, and a final 4-minute extension at 72°C. Amplicons were purified with a PCR purification kit (Qiagen, Hilden, Germany). At least 5 μg of amplified ChIP and input DNA were labeled and hybridized to human promoter arrays (NimbleGen HGS17 human 1.5 K promoter chip) containing 24,134 human promoter regions, each represented by a probe set of 15 tiling 50-mer oligonucleotide probes covering 1.5 kb DNA around TSSs. Slides were scanned by NimbleScan and raw data were processed by NimbleGen according to standard procedures. Briefly, for each feature a log2 ratio of the hybridization intensities of the co-hybridized ChIP and input DNA was determined. These ratios were scaled to center the data around zero by robust statistics. Specifically, scaling was performed by subtracting Tukey's bi-weight mean for the log2 ratios of all array features from each individual log2 ratio. The median of the scaled log2 ChIP/input ratio of each probe set provides a measure for promoter occupancy.
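The scaling described above can be sketched in three steps: per-feature log2 ratios, subtraction of a Tukey biweight location estimate, and a per-probe-set median. The sketch uses a one-step biweight with tuning constants c = 5 and eps = 0.0001, which are assumptions borrowed from common microarray practice; the exact NimbleGen procedure may differ, and all function names and intensities are hypothetical.

```python
from math import log2
from statistics import median


def tukey_biweight(values, c=5.0, eps=1e-4):
    """One-step Tukey biweight location estimate (c and eps are assumed constants)."""
    m = median(values)
    mad = median([abs(v - m) for v in values])
    weighted_sum = 0.0
    weight_total = 0.0
    for v in values:
        u = (v - m) / (c * mad + eps)
        # Values far from the median (|u| >= 1) receive zero weight.
        w = (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0
        weighted_sum += w * v
        weight_total += w
    return weighted_sum / weight_total


def scaled_log2_ratios(chip_signals, input_signals):
    """log2(ChIP/input) per feature, centered by the biweight estimate."""
    ratios = [log2(chip / inp) for chip, inp in zip(chip_signals, input_signals)]
    center = tukey_biweight(ratios)
    return [r - center for r in ratios]


def probeset_occupancy(scaled, probeset_ids):
    """Median scaled log2 ratio of all features belonging to each probe set."""
    grouped = {}
    for value, pid in zip(scaled, probeset_ids):
        grouped.setdefault(pid, []).append(value)
    return {pid: median(vals) for pid, vals in grouped.items()}


# Hypothetical intensities for two probe sets of three features each.
chip_signals = [200, 210, 190, 400, 800, 820]
input_signals = [200, 200, 200, 200, 200, 200]
probeset_ids = ["p1", "p1", "p1", "p2", "p2", "p2"]

scaled = scaled_log2_ratios(chip_signals, input_signals)
occupancy = probeset_occupancy(scaled, probeset_ids)
```

The biweight estimate is used in place of the arithmetic mean because strongly enriched features receive little or no weight, so a minority of bound promoters does not shift the zero point of the array.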
ChIP-qPCR analysis was done with Power SYBR green PCR master mix in an ABI StepOne Plus thermocycler (Applied Biosystems, Foster City, CA, USA). Triplicate reactions were carried out in a total volume of 10 μl containing 4 pmol of forward and reverse primers. Reactions containing serially diluted input DNA were used as standard curve to quantify ChIP DNA reactions. Melting curve analysis was used to determine the specificity of all reactions. Primer sequences are available upon request.
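Quantification against a standard curve of serially diluted input DNA can be sketched as a linear fit of threshold cycle (Ct) against log10 of the input amount. The text does not specify the fitting procedure, so this is an assumed, conventional approach; the function names, dilution series and Ct values are hypothetical.

```python
from math import log10


def fit_standard_curve(amounts, cts):
    """Least-squares fit of Ct = slope * log10(amount) + intercept."""
    xs = [log10(a) for a in amounts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(ct, slope, intercept):
    """Convert a Ct value into an amount on the scale of the standards."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical input dilution series (percent of input) and mean Ct of triplicates.
input_percent = [10.0, 1.0, 0.1, 0.01]
input_ct = [20.0, 23.3, 26.6, 29.9]

slope, intercept = fit_standard_curve(input_percent, input_ct)
chip_percent_input = quantify(24.95, slope, intercept)
amplification_efficiency = 10 ** (-1 / slope) - 1
```

Expressing ChIP signal relative to the input dilution series makes reactions with different primer pairs comparable, because each primer pair is calibrated on its own standard curve.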
Computational and statistical analyses
Peak finding was done using the Mpeak program . Peaks were called under different stringency settings, with cut-offs of mean log2 signal ratios (ChIP versus total DNA) plus either 1, 2, or 2.5 standard deviations (Table 1). High-resolution binding profiles were generated by extracting and remapping of relevant array probe sequences to their exact position in the NCBI build 36 of the human genome. A slightly modified version of the RegionMiner software (Genomatix, Munich, Germany) was used to correlate the position of single high-score probes (upper 5th percentile) with annotated TSSs (Figures 1d and 6b). To avoid positional bias, the relative fraction of high-score probes mapping to distinct 10-bp bins around aligned TSSs was calculated with respect to the number of all available probes at this position. Correlation analyses of factor occupancy and gene expression levels were conducted as follows: steady-state mRNA expression levels were derived from polyadenylated RNA of LCL721 cells hybridized to an Affymetrix U133 Plus 2.0 microarray that covers probesets for the analysis of over 47,000 human transcripts (data available as SI Data Set 2 in reference ). Data analysis with GCOS software and default statistical algorithm parameters was performed by Affymetrix service provider KFB (Regensburg, Germany). Log2-scaled ChIP/input enrichment of TFIIB or NC2 on NimbleGen promoter probesets (see above) were matched with corresponding Affymetrix probeset IDs (only using probesets with 'present' calls) to generate the gene expression correlation analysis of TFIIB (Figure 2d) or TFIIB/NC2 ratios (Figure 5a). Degree of correlation between data sets in Figures 1a, 2d and 4a was determined by applying the Pearson correlation function in Microsoft Excel. For other statistical analyses the Bioconductor package was used. 
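Two of the calculations described above can be sketched briefly: the stringency cut-offs (mean plus a multiple of the standard deviation) and the position-corrected fraction of high-score probes per 10-bp bin. This does not reproduce Mpeak or RegionMiner; the function names, the use of the sample standard deviation and the toy values are assumptions.

```python
from statistics import mean, stdev


def signal_cutoff(log2_ratios, k):
    """Cut-off at mean + k standard deviations (sample s.d. assumed)."""
    return mean(log2_ratios) + k * stdev(log2_ratios)


def binned_high_score_fraction(positions, scores, threshold, bin_size=10):
    """For each bin of positions relative to the TSS, the fraction of
    available probes whose score reaches the threshold."""
    total = {}
    high = {}
    for pos, score in zip(positions, scores):
        bin_start = (pos // bin_size) * bin_size
        total[bin_start] = total.get(bin_start, 0) + 1
        if score >= threshold:
            high[bin_start] = high.get(bin_start, 0) + 1
    return {b: high.get(b, 0) / total[b] for b in sorted(total)}


# Hypothetical log2 ratios for the three stringency settings.
toy_ratios = [-1.0, 0.0, 1.0]
cutoffs = {k: signal_cutoff(toy_ratios, k) for k in (1.0, 2.0, 2.5)}

# Hypothetical probe positions (bp relative to the TSS) and scores.
positions = [-55, -52, -48, -45, 5, 8, 12]
scores = [2.0, 0.1, 1.8, 1.9, 0.2, 0.1, 0.3]
fractions = binned_high_score_fraction(positions, scores, threshold=1.5)
```

Dividing by the number of probes available in each bin is what removes the positional bias: a bin covered by many probes would otherwise appear enriched simply because it is sampled more often.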
These included the evaluation of the statistical significance of the difference of Affymetrix probeset signal values for genes that do have a top 10% enrichment score against the distribution of all probeset signal values using a Kolmogorov-Smirnov test (Figure 2e). The hypothesis that both distributions are similar can be denied on a significance level of P < 0.01, thus making it clear that the data for these genes differ from the complete set. To accomplish this we used the ks.boot function from the 'Matching' package (provided by JS Sekhon) , for the R software for statistical computing . Similarly, the relevance of the difference in TFIIB versus NC2 distributions on genes (Figure 6b) was revealed by running a Mann-Whitney-Wilcoxon test on the factor positions using the R package .
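The quantity underlying a two-sample Kolmogorov-Smirnov test is the largest vertical distance between two empirical cumulative distribution functions. The sketch below computes only this statistic; it is not the bootstrap procedure of ks.boot, and the function name and toy signal values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Maximum absolute difference between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical expression signals: all genes versus a high-occupancy subset.
all_genes = [5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 9.0, 10.0, 11.0]
top_occupancy = [8.0, 9.0, 10.0, 11.0]

d_stat = ks_statistic(top_occupancy, all_genes)
```

Because the statistic compares whole distributions rather than means, it is sensitive to the kind of skew described for the expression quantiles, where the high-occupancy subset is shifted towards the upper end of the range.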
Immobilized template assay and in vitro transcription
Immobilized template assays and in vitro transcription were performed as described . The pGL2-MRG5 promoter template was amplified from vector pGL2-MRG5. It contains five Gal4 binding sites immediately upstream of a synthetic HIV/AdML core promoter driving expression of a downstream luciferase cassette. Amplification primers were biotinylated 5'-GCATTCTAGTTGTGGTTTGTCCAA and 5'-ATACGACGATTCTGTGATTTG. Templates were purified on 1% (w/v) agarose gels, recovered using a gel extraction kit (Qiagen) and coupled to paramagnetic streptavidin beads (Promega, Madison, WI, USA) as follows: beads were washed twice in B&W buffer (5 mM Tris-HCl pH 7.5, 1 mM EDTA, 1 M NaCl, 0.003% Igepal CA-630). Subsequently, beads were resuspended in B&W buffer, and 15 ng biotinylated template (in 1× TE pH 8.0 containing 1 M NaCl) was added for each microgram of magnetic beads. After shaking for 45 minutes at room temperature, beads were washed once in B&W buffer containing 0.5 mg/ml BSA (fraction V; Sigma-Aldrich). For blocking, beads were resuspended at a concentration of 1 μg/μl in buffer A (60 mM KCl, 20 mM Hepes-KOH pH 8.2, 5 mM MgCl2, 10 mM dithiothreitol (DTT; Sigma-Aldrich), 0.025% Igepal CA-630, 0.2 mM phenylmethanesulfonyl fluoride (PMSF; Sigma-Aldrich)) containing 5 mg/ml BSA and 5 mg/ml polyvinylpyrrolidone (Sigma-Aldrich) and incubated for 15 minutes at room temperature. Afterwards, beads were washed three times with buffer A. PIC assembly was conducted in a total volume of 200 μl with 1,050 ng pGL2-MRG5 promoter template coupled to 70 μg beads, 2 μg poly(dG:dC) competitor DNA, 100 to 200 μg Jurkat nuclear extract and, if indicated, 200 ng Gal4-VP16 (the carboxy-terminal 147 amino acids of the Saccharomyces cerevisiae Gal4p DNA-binding domain linked to the Herpes simplex virus VP16 activation domain comprising residues 411 to 490). 
PIC assembly buffer was composed of 20 mM Hepes-KOH pH 8.2, 5 mM MgCl2, 10 mM DTT, 0.025% Igepal CA-630, 0.5 mg/ml BSA (Roche), 10% (w/v) glycerol, 0.1 mg/ml PEG 8000 and 0.2 mM PMSF. After 45 minutes incubation at 30°C the template-bound complexes were concentrated with a magnet and washed three times with 200 μl buffer A. PICs were either eluted with Laemmli buffer and analyzed by immunoblot or probed in an in vitro transcription reaction (see below). Immunoblots were scanned and signal intensities quantified using the ImageJ program. To test for the activity of template-associated PICs, in vitro transcription was performed. PICs were formed as above, washed and resuspended in transcription buffer (20 mM Hepes-KOH pH 8.2, 60 mM KCl, 5 mM MgCl2, 10 mM DTT, 0.025% Igepal CA-630, 0.5 mg/ml BSA, 10% (w/v) glycerol, 0.1 mg/ml PEG 8000, 4 units RNAsin (Promega), 0.2 mM PMSF). Transcription was initiated by addition of the NTP mix supplemented with 1 μl alpha-32P UTP (3,000 Ci/mmol). Final NTP concentrations were 100 μM ATP, CTP and GTP each, and 5 μM UTP. Transcription reactions were incubated at 30°C for 30 minutes and stopped by addition of 400 μl transcription stop buffer (7 M urea, 10 mM Tris-HCl pH 7.8, 10 mM EDTA pH 8.0, 300 mM sodium acetate, 0.5% SDS, 100 mM lithium chloride, 0.4 mg/ml yeast tRNA). Reactions were extracted with phenol/chloroform, and RNA was precipitated with isopropanol and analyzed by autoradiography.
The raw and processed ChIP-chip data have been deposited at the Gene Expression Omnibus (GEO) public repository and are accessible as [GEO:GSE19562]. Scaled log2 ChIP/input ratios of NimbleGen probeset signals from the TFIIB and NC2 ChIP-chip experiments are also available as Additional file 2.
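For readers working with the deposited probe-level data, the sketch below shows one way a scaled log2 ChIP/input ratio can be computed. The scaling procedure actually used is not described in this section, so median centering is used here purely as an assumed, commonly applied choice; the function name and the example intensities are hypothetical.

```python
import math
from statistics import median


def scaled_log2_ratios(chip: list[float], inp: list[float]) -> list[float]:
    """Return per-probe log2(ChIP/input) ratios centered on their median.

    Median centering is an assumption for illustration; the original
    scaling method may differ.
    """
    if len(chip) != len(inp):
        raise ValueError("ChIP and input must have the same number of probes")
    if any(c <= 0 for c in chip) or any(i <= 0 for i in inp):
        raise ValueError("intensities must be positive")
    raw = [math.log2(c / i) for c, i in zip(chip, inp)]
    center = median(raw)
    return [r - center for r in raw]


# Hypothetical intensities for three probes.
scaled = scaled_log2_ratios([200.0, 100.0, 400.0], [100.0, 100.0, 100.0])
print(scaled)  # [0.0, -1.0, 1.0]
```

After centering, a value of 0 corresponds to the typical probe on the array, and positive values indicate enrichment in the ChIP sample relative to input.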
AdML: Adenovirus major late
BREd: downstream TFIIB recognition element
BREu: upstream TFIIB recognition element
BSA: bovine serum albumin
CAGE: cap analysis gene expression
ChIP-chip: ChIP with detection by microarrays
DPE: downstream promoter element
GAPDH: glyceraldehyde 3-phosphate dehydrogenase
GTF: general transcription factor
NC2: negative cofactor 2 (DR1/DRAP1)
RNAPII: RNA polymerase II
TBP: TATA box-binding protein
TFIIB: general transcription factor IIB (GTF2B)
TSS: transcription start site.
We thank J-C Andrau for providing PCR primer sequences for human GAPDH, and B Lenhard for comments on the manuscript. This work was supported by grants from the German Ministry for Education and Research (grant 0313030A) and the European Union (EUTRACC, grant LSHG-CT-2007-037445) to MM.
- Lifton RP, Goldberg ML, Karp RW, Hogness DS: The organization of the histone genes in Drosophila melanogaster: functional and evolutionary implications. Cold Spring Harb Symp Quant Biol. 1978, 42: 1047-1051.
- Smale ST, Baltimore D: The "initiator" as a transcription control element. Cell. 1989, 57: 103-113. 10.1016/0092-8674(89)90176-1.
- Burke TW, Kadonaga JT: Drosophila TFIID binds to a conserved downstream basal promoter element that is present in many TATA-box-deficient promoters. Genes Dev. 1996, 10: 711-724. 10.1101/gad.10.6.711.
- Lewis BA, Kim TK, Orkin SH: A downstream element in the human beta-globin promoter: evidence of extended sequence-specific transcription factor IID contacts. Proc Natl Acad Sci USA. 2000, 97: 7172-7177. 10.1073/pnas.120181197.
- Deng W, Roberts SG: A core promoter element downstream of the TATA box that is recognized by TFIIB. Genes Dev. 2005, 19: 2418-2423. 10.1101/gad.342405.
- Lagrange T, Kapanidis AN, Tang H, Reinberg D, Ebright RH: New core promoter element in RNA polymerase II-dependent transcription: sequence-specific DNA binding by transcription factor IIB. Genes Dev. 1998, 12: 34-44. 10.1101/gad.12.1.34.
- Lim CY, Santoso B, Boulay T, Dong E, Ohler U, Kadonaga JT: The MTE, a new core promoter element for transcription by RNA polymerase II. Genes Dev. 2004, 18: 1606-1617. 10.1101/gad.1193404.
- Juven-Gershon T, Hsu JY, Theisen JW, Kadonaga JT: The RNA polymerase II core promoter - the gateway to transcription. Curr Opin Cell Biol. 2008, 20: 253-259. 10.1016/j.ceb.2008.03.003.
- Thomas MC, Chiang CM: The general transcription machinery and general cofactors. Crit Rev Biochem Mol Biol. 2006, 41: 105-178. 10.1080/10409230600648736.
- Saxonov S, Berg P, Brutlag DL: A genome-wide analysis of CpG dinucleotides in the human genome distinguishes two distinct classes of promoters. Proc Natl Acad Sci USA. 2006, 103: 1412-1417. 10.1073/pnas.0510310103.
- Sawadogo M, Roeder RG: Factors involved in specific transcription by human RNA polymerase II: analysis by a rapid and quantitative in vitro assay. Proc Natl Acad Sci USA. 1985, 82: 4394-4398. 10.1073/pnas.82.13.4394.
- Tsai FT, Sigler PB: Structural basis of preinitiation complex assembly on human pol II promoters. EMBO J. 2000, 19: 25-36. 10.1093/emboj/19.1.25.
- Evans R, Fairley JA, Roberts SG: Activator-mediated disruption of sequence-specific DNA contacts by the general transcription factor TFIIB. Genes Dev. 2001, 15: 2945-2949. 10.1101/gad.206901.
- Gershenzon NI, Ioshikhes IP: Synergy of human Pol II core promoter elements revealed by statistical sequence analysis. Bioinformatics. 2005, 21: 1295-1300. 10.1093/bioinformatics/bti172.
- Frith MC, Valen E, Krogh A, Hayashizaki Y, Carninci P, Sandelin A: A code for transcription initiation in mammalian genomes. Genome Res. 2008, 18: 1-12. 10.1101/gr.6831208.
- Huisinga KL, Pugh BF: A TATA binding protein regulatory network that governs transcription complex assembly. Genome Biol. 2007, 8: R46. 10.1186/gb-2007-8-4-r46.
- Zanton SJ, Pugh BF: Changes in genomewide occupancy of core transcriptional regulators during heat stress. Proc Natl Acad Sci USA. 2004, 101: 16843-16848. 10.1073/pnas.0404988101.
- Venters BJ, Pugh BF: A canonical promoter organization of the transcription machinery and its regulators in the Saccharomyces genome. Genome Res. 2009, 19: 360-371. 10.1101/gr.084970.108.
- Andrau JC, Pasch van de L, Lijnzaad P, Bijma T, Koerkamp MG, Peppel van de J, Werner M, Holstege FC: Genome-wide location of the coactivator mediator: Binding without activation and transient Cdk8 interaction on DNA. Mol Cell. 2006, 22: 179-192. 10.1016/j.molcel.2006.03.023.
- Zhu X, Wiren M, Sinha I, Rasmussen NN, Linder T, Holmberg S, Ekwall K, Gustafsson CM: Genome-wide occupancy profile of mediator and the Srb8-11 module reveals interactions with coding regions. Mol Cell. 2006, 22: 169-178. 10.1016/j.molcel.2006.03.032.
- van Werven FJ, van Bakel H, van Teeffelen HA, Altelaar AF, Koerkamp MG, Heck AJ, Holstege FC, Timmers HT: Cooperative action of NC2 and Mot1p to regulate TATA-binding protein function across the genome. Genes Dev. 2008, 22: 2359-2369. 10.1101/gad.1682308.
- Albert TK, Grote K, Boeing S, Stelzer G, Schepers A, Meisterernst M: Global distribution of negative cofactor 2 subunit-alpha on human promoters. Proc Natl Acad Sci USA. 2007, 104: 10000-10005. 10.1073/pnas.0703490104.
- Birney E, Stamatoyannopoulos JA, Dutta A, Guigo R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET, Thurman RE, Kuehn MS, Taylor CM, Neph S, Koch CM, Asthana S, Malhotra A, Adzhubei I, Greenbaum JA, Andrews RM, Flicek P, Boyle PJ, Cao H, Carter NP, Clelland GK, Davis S, Day N, Dhami P, Dillon SC, Dorschner MO, Fiegler H, et al: Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007, 447: 799-816. 10.1038/nature05874.
- Denissov S, van Driel M, Voit R, Hekkelman M, Hulsen T, Hernandez N, Grummt I, Wehrens R, Stunnenberg H: Identification of novel functional TBP-binding sites and general factor repertoires. EMBO J. 2007, 26: 944-954. 10.1038/sj.emboj.7601550.
- Kim TH, Barrera LO, Zheng M, Qu C, Singer MA, Richmond TA, Wu Y, Green RD, Ren B: A high-resolution map of active promoters in the human genome. Nature. 2005, 436: 876-880. 10.1038/nature03877.
- Zheng M, Barrera LO, Ren B, Wu YN: ChIP-chip: data, model, and analysis. Biometrics. 2007, 63: 787-796. 10.1111/j.1541-0420.2007.00768.x.
- Singh BN, Hampsey M: A transcription-independent role for TFIIB in gene looping. Mol Cell. 2007, 27: 806-816. 10.1016/j.molcel.2007.07.013.
- Crooks GE, Hon G, Chandonia JM, Brenner SE: WebLogo: a sequence logo generator. Genome Res. 2004, 14: 1188-1190. 10.1101/gr.849004.
- Bailey TL, Elkan C: Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc Int Conf Intell Syst Mol Biol. 1994, 2: 28-36.
- Gilfillan S, Stelzer G, Piaia E, Hofmann MG, Meisterernst M: Efficient binding of NC2·TATA-binding protein to DNA in the absence of TATA. J Biol Chem. 2005, 280: 6222-6230. 10.1074/jbc.M406343200.
- Kawaji H, Kasukawa T, Fukuda S, Katayama S, Kai C, Kawai J, Carninci P, Hayashizaki Y: CAGE Basic/Analysis Databases: the CAGE resource for comprehensive promoter analysis. Nucleic Acids Res. 2006, 34: D632-636. 10.1093/nar/gkj034.
- Yudkovsky N, Ranish JA, Hahn S: A transcription reinitiation intermediate that is stabilized by activator. Nature. 2000, 408: 225-229. 10.1038/35041603.
- Goppelt A, Stelzer G, Lottspeich F, Meisterernst M: A mechanism for repression of class II gene transcription through specific binding of NC2 to TBP-promoter complexes via heterodimeric histone fold domains. EMBO J. 1996, 15: 3105-3116.
- Inostroza JA, Mermelstein FH, Ha I, Lane WS, Reinberg D: Dr1, a TATA-binding protein-associated phosphoprotein and inhibitor of class II gene transcription. Cell. 1992, 70: 477-489. 10.1016/0092-8674(92)90172-9.
- Schluesche P, Stelzer G, Piaia E, Lamb DC, Meisterernst M: NC2 mobilizes TBP on core promoter TATA boxes. Nat Struct Mol Biol. 2007, 14: 1196-1201. 10.1038/nsmb1328.
- Xie J, Collart M, Lemaire M, Stelzer G, Meisterernst M: A single point mutation in TFIIA suppresses NC2 requirement in vivo. EMBO J. 2000, 19: 672-682. 10.1093/emboj/19.4.672.
- Roche NimbleGen. [http://www.nimblegen.com]
- Multivariate and Propensity Score Matching Software for Causal Inference. [http://sekhon.berkeley.edu/matching/]
- The R Project for Statistical Computing. [http://www.r-project.org]
- Boeing S, Rigault C, Heidemann M, Eick D, Meisterernst M: RNAPII CTD SER-7 phosphorylation is established in a mediator-dependent fashion. J Biol Chem. 2009.
- ImageJ. [http://rsbweb.nih.gov/ij/]
- Gene Expression Omnibus. [http://www.ncbi.nlm.nih.gov/geo/]
- Carninci P, Sandelin A, Lenhard B, Katayama S, Shimokawa K, Ponjavic J, Semple CA, Taylor MS, Engstrom PG, Frith MC, Forrest AR, Alkema WB, Tan SL, Plessy C, Kodzius R, Ravasi T, Kasukawa T, Fukuda S, Kanamori-Katayama M, Kitazume Y, Kawaji H, Kai C, Nakamura M, Konno H, Nakano K, Mottagui-Tabar S, Arner P, Chesi A, Gustincich S, Persichetti F, et al: Genome-wide analysis of mammalian promoter architecture and evolution. Nat Genet. 2006, 38: 626-635. 10.1038/ng1789.
- Cage Basic Viewer. [http://fantom31p.gsc.riken.jp/cage/hg17/]
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.