Glucocorticoid receptor binds half sites as a monomer and regulates specific target genes
© Schiller et al.; licensee BioMed Central 2014
Received: 19 May 2014
Accepted: 17 July 2014
Published: 31 July 2014
Skip to main content
© Schiller et al.; licensee BioMed Central 2014
Received: 19 May 2014
Accepted: 17 July 2014
Published: 31 July 2014
Glucocorticoid receptor (GR) is a hormone-activated, DNA-binding transcriptional regulatory factor that controls inflammation, metabolism, stress responses, and other physiological processes. In vitro, GR binds as an inverted dimer to a motif consisting of two imperfectly palindromic 6 bp half sites separated by 3 bp spacers. In vivo, GR employs different patterns of functional surfaces of GR to regulate different target genes. The relationships between GR genomic binding and functional surface utilization have not been defined.
We find that A477T, a GR mutant that disrupts the dimerization interface, differs from wild-type GRα in binding and regulation of target genes. Genomic regions strongly occupied by A477T are enriched for a novel half site motif. In vitro, GRα binds half sites as a monomer. Through the overlap between GRα- and A477T-bound regions, we identify GRα-bound regions containing only half sites. We further identify GR target genes linked with half sites and not with the full motif.
Genomic regions bound by GR differ in underlying DNA sequence motifs and in the GR functional surfaces employed for regulation. Identification of GR binding regions that selectively utilize particular GR surfaces may discriminate sub-motifs, including the half site motif, that favor those surfaces. This approach may contribute to predictive models for GR activity and therapy.
In vitro, purified GR recognizes specifically a GR binding sequence (GBS) motif composed of imperfect palindromic 6 base pair (bp) ‘half sites’ separated by 3 bp ‘spacers’, binding as an inverted dimer . In vivo, GR occupies specific genomic GR binding regions (GBRs), identified using chromatin immunoprecipitation coupled with quantitative PCR (ChIP) or deep sequencing (ChIP-seq). The genomic locations of GBRs differ strikingly in different cell or physiologic contexts, and not all GBRs contain an identifiable GBS motif. Hence, the relationship between sequences recognized in vitro and genomic regions occupied in vivo is complex. It is reassuring that GBS motifs drive GR-regulated transcription in a simple reporter context , and therefore reasonable to assume that GR occupancy at GBRs that contain GBSs reflects in vivo sequence recognition. Interestingly, different GBSs, tested in isolation in vivo, trigger utilization of different patterns of GR functional surfaces , supporting the notion that GR assumes different conformations at these sequences in vivo. However, rules dictating the relationship between DNA sequence, GR conformation, and utilization of distinct GR functional surfaces remain unknown.
We therefore examined whether sequence motifs, including the canonical GBS, found within GBRs might in fact be an aggregate of specific motifs, each biased toward utilization of particular functional surfaces of GR. In this study, we measured changes in gene expression and genomic occupancy upon glucocorticoid treatment of isogenic cultured human cell lines expressing alleles of GR bearing differences in particular functional surfaces. We further used statistical, computational, and biophysical methods to characterize changes in GR conformation and function, as well as the sequence motifs that are present at GREs, in relation to these functional surfaces.
We then assigned a simple phenotypic classification to each GR regulated gene based on its valence of regulation by GRα (activation or repression) and the difference in regulation by that allele and by GRα (gain, loss, or change of valence). Some phenotypic classifications were less common than others: in particular, genes that change the valence of regulation were relatively rare. Moreover, certain phenotypes appeared to be relatively allele-restricted; for example, many genes gained activation with A477T and GRγ (103 and 36 genes, respectively), whereas few gained activation with 30iiB or E773R (6 and 1 genes, respectively) (Figure 3). Thus, A477T and GRγ may drive activation through one or more mechanisms unavailable to GRα, 30iiB, or E773R.
Next, we performed ChIP-seq to define GBRs genome-wide in U2OS cells expressing GRα or one of the three mutant alleles (30iiB, A477T, and E773R), after treatment with dexamethasone. Defining GBRs enables identification of putative primary regulated genes (those regulated by a proximal GR-occupied presumptive GRE) and allows us to distinguish genes differentially regulated due to changes in occupancy from those with altered GR activity. GRγ has been similarly examined by others . Importantly, we did paired-end sequencing of the ChIP-seq library, which improves the dynamic range of occupancy by allowing use of both ends of the double stranded molecule to determine whether the originating fragment was unique . We identified GBRs (Additional file 2) using MACS2a, a software package for the analysis of ChIP-seq data, and assigned the GBRs to genes (Additional file 3) based on proximity (see Materials and Methods, ‘GBR to gene assignment’) . We also compared occupancy between GRα and each mutant (Additional file 4 Additional file 5 and Additional file 6) using the ‘diffstats’ module of MACS2 (see Materials and Methods, ‘Computational analysis of ChIP-seq data’). To avoid considering the same GBR twice, we considered all GBRs from GRα and those GBRs from a given mutant that did not overlap with GBRs from GRα (see Additional file 7, Additional file 8, Additional file 9, and Additional file 10); additionally, we analyzed only GBRs with signals of at least two reads per million total reads.
Comparing these results with our gene expression findings, we observed substantial enrichment of GRα-selectively occupied GBRs near genes that lost activation in cells expressing A477T (Figure 4E, Additional file 11, and Additional file 12), and of both A477T- and GRα-selectively occupied GBRs near genes that gained activation (Figure 4F, Additional file 12, and Additional file 13). Thus, it appeared that the differentially regulated genes represent a mixture, some with altered strength of occupancy (that is, a direct relationship between binding and regulation) and others with altered GR activity (that is, an indirect relationship). The occupancy of A477T near genes that are activated by A477T is surprising in light of previous reports indicating that mutations in the dimerization interface abrogate ‘transactivation’ [10,11].
We further investigated whether the differentially occupied GBRs near differentially regulated genes were responsible for the observed regulation. As a simple measure of regulatory function, we cloned GBRs - selected on the basis of proximity to differentially regulated genes, strength of occupancy, and isolation from many other GBRs - into luciferase-based reporters and transfected them into U2OS cells expressing either GRα or A477T. We found GRα-selective activation with four GRα-selectively occupied GBRs, A477T-selective activation with seven A477T-selectively occupied GBRs, and non-selective activation with five non-selectively occupied GBRs (Additional file 14). Six of 24 GBRs did not activate transcription in either cell line, but only two drove aberrant activation (opposing selectivity of occupancy and activation). These results imply that certain of the differentially occupied GBRs may indeed function to differentially regulate nearby genes. Additional experiments, using recent methods such as endogenous genome editing [12-15], will be required to assess unequivocally whether a given GBR is actually a GRE, and, if so, whether it confers differential regulation.
De novo motif searches with MEME  revealed that GBRs occupied by only GRα were enriched for the canonical GR ‘full site’ motif, which specifies cooperative binding as a GR dimer (Figure 5). Allowing MEME to determine the optimal length for motifs in GBRs occupied by only GRα revealed a 17 bp GR motif (Additional file 15), which is longer than the previously reported 14 bp  and 15 bp motifs , having additional, asymmetric specificity flanking the more strongly specified half site. In contrast, GBRs occupied by only A477T were enriched for a single half site motif (Figure 5), which may provoke binding of a GR monomer to DNA. These results are consistent with the observations that A477T has an impaired dimerization interface and decreased cooperativity .
GBRs that were occupied by both GRα and A477T were enriched for an intermediate motif that consisted of a closer-to-consensus half site and a very weak second half site, separated by the normal 3 bp spacer (Figure 5). Due to the semi-quantitative nature of ChIP-seq, some GBRs that we characterized as similarly occupied by both GRα and A477T may actually be differentially occupied (and conversely, some GBRs that we characterized as differentially occupied may not be). Therefore, the weak second half site seen at GBRs occupied by both GRα and A477T may be an experimental artifact; moreover, the lack of a second half site motif at A477T-selectively occupied GBRs implies that such GBRs indeed contain a motif distinct from that found at GRα-occupied GBRs. Supporting this view, length-unbiased searches with MEME for motifs at GBRs occupied by only A477T or both A477T and GRα revealed truncated motifs that included only the first two positions corresponding to the second half site (Additional file 15). In Figure 5, we display GR motifs found when forcing MEME to search for motifs that are exactly 17 bp in length, for ease of comparison. CentriMo, which searches for central/local enrichment of motifs in ChIP-seq data , showed that GR motifs were strongly enriched at GBR signal maxima (Additional file 15), implying direct binding by GR at these GBRs.
Discriminatory motif analysis of the three different classes (GRα-selective, A477T-selective, and non-selective) of occupied GBRs with DREME, which exhaustively searches for all 3 to 8 bp regular expression motifs that are enriched relative to di-nucleotide shuffled sequences , revealed motifs similar to that of activation protein 1 (AP1) and Signal Transducers and Activators of Transcription (STAT) families of transcription factors (Figure 5 and Additional file 15). We did not detect any centrally enriched motifs that might indicate a cooperative binding partner (that is, a heterodimer) for DNA-bound GRα or A477T. Nonetheless, previous studies  indicate that there is widespread co-association among transcription factor binding sites, indicating that nearby sites may be functional.
Occurrences of STAT and AP1 motifs found using DREME.
Found at GBRs in cells expressing
GRα and A477T
Of 1,000 with motif (n)
Of 1,000 with motif (n)
Occurrences of STAT and AP1 motifs found among all GBRs.
Motif present at GBR
With GR motif (n)
Without GR motif (n)
GR interacts physically and functionally with both the STAT and AP1 families of transcription factors . Although GR:AP1 interaction is known to be stimulated by lipopolysaccharide (LPS) treatment, GR also interacts with AP1 under normal conditions . The presence of AP1 and STAT motifs at GBRs is consistent with both ‘tethering’ of GR - protein:protein association of GR with DNA-bound AP1 or STAT - and with the existence of composite elements, elements that contain binding sites for GR, and for one or more additional transcriptional regulatory factors. The greater abundances of GBRs that contained STAT3 or AP1 motifs and also GR full sites as compared to GBRs that contained these motifs and lacked GR full sites (1.4- and 3.0-fold, respectively; Table 2) implies that composite elements may be more common than tethering elements.
Motivated by the discovery of a half site motif near genes regulated by A477T, we investigated whether any genes were regulated by GRα recognition of half site motifs. We identified a set of genes whose regulation at 4 h is similar (less than two-fold different) between GRα and A477T at the 95% confidence levelc. GBR occupancy near these ‘non-selective’ genes (Additional file 15) did not reveal significant enrichment of GRα- or A477T-selectively occupied GBRs relative to genes that are selectively regulated, suggesting that GBRs near these genes are bound by both GRα and A477T. Consistent with this explanation, GRα and A477T bind specifically and with equal affinity to half sites in vitro .
Next, we investigated whether there were GRα-regulated genes bearing half site GBRs and apparently lacking full site GBRs (Additional file 7 and Additional file 12). This restrictive criterion should define genes most likely to be regulated by GRα via half sites. We did not restrict our selection with respect to regulation by A477T, as some genes regulated by half sites might not be regulated by A477T due to impaired conformational dynamics and/or co-regulator recruitment. We defined as a full site motif any sequence with an individual information of at least 8.94 bits (the 95th percentile of randomly permuted GBR sequences), and as a half site motif any sequence with an individual information of at least 8.00 bits, with respect to the given motif. By comparison, the full site and half site motifs had 11.9 bits and 9.3 bits of information, respectively, relative to GBR background sequence (Additional file 15). We additionally required that the full site motif’s individual information at candidate GBRs be less than the half site motif’s individual information. In total, there were 1,441 GRα-regulated genes, 607 (42%) of which had associated GBRs within 20 kb of the transcriptional start site. We found that 35 of these 607 genes met our definition, and that this number declined slowly with an increasing window of association (27 within 30 kb, 21 within 40 kb, 17 within 50 kb).
We set out to find whether sequence motifs bound by GR might be decomposed into multiple specific motifs that each relate to the utilization of specific functional surfaces of GR. We found that A477T, a mutation in the dimerization interface of GR leading to reduced DNA-binding cooperativity and increased dissociation rates in vitro , resulted in gains and losses in glucocorticoid-induced gene regulation and in GBR occupancy in vivo. Analysis of A477T-occupied GBRs revealed a hitherto unrecognized half site motif. The half site motif, coupled with a failure to detect non-GR partner motifs, suggested that A477T might bind GBRs as a monomer in vivo. Furthermore, A477T regulated many genes, suggesting that GR monomers are sufficient for gene regulation. The question remains, then, whether GRα monomers bind to DNA in a sequence-specific manner and regulate a set of target genes.
Curiously, we found that some genes were regulated by A477T and not by GRα, and some GBRs were occupied by A477T and not by GRα. We speculate that GRα fails to bind such GBRs because it is titrated away by cooperatively bound, high affinity full sites, composed of ‘degenerate’ or weak half sites; such cooperatively bound sites would be unavailable to A477T due to its reduced cooperativity. Indeed, many GRα-selectively occupied GBRs near GRα-selectively regulated genes contained full sites lacking ‘consensus’ or perfect half sites; rather, these full sites were composed of two degenerate half sites (Figure 6). These results suggest two orthogonal binding modes for GRα: (1) consensus half sites, most likely bound by monomers; and (2) degenerate full sites, most likely bound by dimers.
Alternatively, GRα occupancy may not be observed at A477T-specific GBRs because GRα dissociates from these regions more rapidly than A477T, reducing the signal below detection limits. However, A477T dissociates from DNA more rapidly than GRα in vitro  and in vivo , arguing against this model. Importantly, at any given site, some fraction of binding events may involve monomers and some fraction may involve dimers. We found that a minimum of 35 genes are regulated entirely by direct binding of GRα monomers. While many genes may be regulated entirely by direct, cooperative binding of GRα as a dimer, our results suggest that others may be regulated in part by monomers and in part by dimers.
Additionally, we found that GBRs were enriched within 20 kb from the promoters of regulated genes (relative to the whole genome), suggesting that promoter-proximal GBRs were responsible for many observed changes in gene expression. Similarly, we found that GRα- and A477T-selectively occupied GBRs were enriched near the promoters of genes that lost and gained activation, respectively, with A477T, consistent with these GBRs being responsible for gene regulation. Nonetheless, more than half of GBRs were located distal to genes (more than 50 kb from the transcriptional start site), and only 42% of GRα-regulated genes were associated with promoter-proximal GBRs (±20 kb). Further exploration is needed to understand the function (if any) of distal GBRs.
The ability of a transcription factor to function in gene regulation alternatively as a monomer or a dimer has been previously observed for SRY (sex determining region Y)-box 10 (SOX10) . There are several lines of evidence suggesting that GRα, too, binds DNA alternatively as a monomer or a dimer. GRα and A477T bind specifically and with equal affinity to half sites in vitro , with a Hill coefficient of approximately 1, consistent with binding as a monomer to half sites. Additionally, single-molecule imaging of GR binding in vivo indicates that A477T dissociates from DNA 10-fold faster than GRα , consistent with the observation that A477T has reduced cooperativity when binding full sites in vitro . These results collectively suggest that A477T binds as a monomer in vivo. Furthermore, deep sequencing of DNase I-treated chromatin, in combination with ChIP-seq, reveals that hormone-induced binding of androgen receptor (AR, HUGO symbol NR3C4), a related nuclear receptor, was associated with DNase footprints the size of AR monomers at two-thirds of binding regions . Given the significant homology between AR and GR, these results suggest that GRα is also capable of binding as a monomer in vivo.
DNase footprints measured in aggregate, and even direct imaging of individual binding events, are insufficient to demonstrate that certain GBR binding events promote regulation of target genes. Joint measurements of gene expression and GBR occupancy are the minimum necessary data to infer function of GBRs. Our finding that at least 35 regulated genes contain only GBRs with half sites within 20 kb of their transcriptional start site provides the best evidence to date that GRα binds as a functional monomer in vivo, resulting in regulation of target genes.
If GR and other nuclear receptors bind DNA alternatively as dimers or as monomers, then these two receptor species may represent distinct drug targets . Glucocorticoids, like other steroid hormones, regulate diverse aspects of physiology, making them potent but blunt tools for medicine. Significant interest in ‘selective’ or ‘dissociated’ glucocorticoid receptor agonists [30,31] that separate the beneficial and adverse clinical effects of glucocorticoid treatment has been driven by the success of similar drugs targeting the estrogen receptor, especially in treating breast cancer and osteoporosis . Drugs targeting nuclear receptors commonly bind to a pocket in the ligand binding domain, normally bound by an endogenous hormone (cortisol in the case of human GR). Alternatively targeting the dimerization interface or DNA-binding interface to separate dimer and monomer activity may help produce improved selective modulators for GR and other receptors.
Encouragingly, drugs modulating AR conformation and activity, subsequent to hormone binding, have been previously found . Our data make it difficult to infer the potential effects of such selective modulators on GR activity. However, observations of mice harboring the equivalent mutation to A477T indicate that inhibiting GR dimerization may prevent glucocorticoid-induced decreases in skin and bone collagen synthesis, as well as decreases in insulin sensitivity, while maintaining or even exaggerating glucocorticoid-induced increases in triglyceride synthesis . Moreover, such an inhibitor may cause an increase in lipogenesis, possibly due to the regulation (by monomers) of target genes not normally regulated by GR . A selective modulator that inhibits the regulatory function of GR dimers may have different effects from one that inhibits their formation; both would inhibit regulation of dimerization-dependent genes, but the latter may also drive regulation of new target genes, similar to those observed in the dimerization interface mutant.
Evidence for existence of functional AR  and GR monomers bound to consensus half sites suggests that similar monomeric forms may exist for all nuclear receptor subfamily 3 group C transcription factors (GR, mineralocorticoid receptor, progesterone receptor, and AR), and possibly more distantly related nuclear receptors. These results may have important implications for how hormone-regulated gene networks evolved. Binding of receptor monomers to consensus half sites may present an evolutionary path for establishing robust response elements wherein one perfect half site appears, at random or by insertion or rearrangement, and allows for regulation by monomers, while also promoting selection of a second site with the proper spacing for cooperative binding by dimers; the full site would then be resistant to mutations that might have perturbed binding by monomers. Alternatively, the half site may have preferred function, and may therefore remain stable. Analysis of mutations equivalent to A477T in other nuclear receptors and cell lines may help clarify the functional and evolutionary role of half sites.
Additionally, other GR functional surfaces, including but not limited to AF1, AF2, and the lever arm, may be associated with more narrowly defined motifs that may either be subsets of the full site GBS motif, or may differ from it. Such motifs may be detected by analyzing differentially occupied GBRs near regulated genes selectively affected by perturbations to a functional surface. In cases where binding is not perturbed, discriminatory motif analysis between GBRs proximal to differentially regulated genes and GBRs proximal to non-selectively regulated genes may yield similar answers.
Moreover, structural methods such as X-ray crystallography, electron microscopy, and NMR may define directly the conformational changes induced in GR upon binding to these motifs. These approaches may confirm that GR binding to a suspected ‘sub-motif’ induces changes in the functional surface being studied and drives allosteric changes in remote functional surfaces.
GREs differ in their precise sequence motifs, in their modes of GR binding (monomer or dimer; direct or indirect), and in the functional GR surfaces required for binding or regulation. Structural changes induced in GR upon binding different ‘sub-motifs’ (refinements of the canonical motif) may be important determinants of GR activity at particular response elements. We have shown that GR occupies in vivo many genomic regions containing half sites, and that these regions are likely responsible for regulation of a subset of target genes. Moreover, GR specifically binds to half sites in vitro, without apparent cooperativity, and thus as a monomer.
Half sites are easily overlooked among those sequences that conform to the known GR full site motif, but they have distinct functional properties compared to degenerate full sites, which may be predicted to bind with similar affinity. Identification of GREs with other sub-motifs, the conformations those GREs induce, and the functional surfaces they utilize will be important steps in defining predictive models of GR activity and in developing therapeutic strategies to modulate GR activity.
U2OS cells stably transfected with rat GRα, GRα 30iiB (30iiB; E219K/F220L/W234R), GRα A477T (A477T), GRγ, or GRα E773R (E773R) were grown as previously described . Parent U2OS cells do not manifest any endogenous response to glucocorticoid treatment. DNA sequencing analysis was used to confirm the presence of point mutations in all cell lines. A549 cells were obtained from American Type Culture Collection and maintained in DMEM supplemented with 10% volume/volume (v/v) fetal bovine serum.
Whole cell extracts were collected using 20 mM Tris-Cl pH 8.0, 150 mM NaCl, 1% NonidetP-40 (NP-40), 1% sodium deoxycholate, 0.1% sodium dodecyl sulphate (SDS), 2 mM ethylenediaminetetraacetic acid (EDTA). Immunoblot analysis was performed as previously described . Antibodies to GR (N499) and β-actin (Sigma-Aldrich) were used.
Cells were plated in 6-well plates using DMEM supplemented with 5% v/v fetal bovine serum. Dexamethasone (Sigma) at a final concentration of 100 nM was added to the medium for 2, 4, or 24 h; for the ‘0 h’ time point, cells were treated with vehicle (ethanol) only for 4 h. Cells were lysed and total RNA was isolated using QIAshredder and RNeasy mini columns (Qiagen). The quality of RNA samples was evaluated by A260/A280 ratio which was at least 1.9 and the integrity was analyzed using the Bioanalyzer 2100 (Agilent) with the Experion RNA Stdsens analysis kit (Biorad). For each experimental condition 2 μg of high quality total RNA was submitted to the University of Southern California Epigenome Center. Four biological replicates of each experimental sample from independent experiments that were collected on different days were randomly placed on Illumina Human Ref8 beadchips and processed following standard Illumina procedures.
All coordinates reported are for the UCSC human genome build hg19 (GRCh37).
Bead-level data outputted from Illumina’s BeadScan software were read using the beadarray package  from Bioconductor  with the R programming language . BASH  was used to correct for compact and diffuse spatial artifacts on the arrays. Bead-level data were summarized using beadarray, filtering out beads that were more than three mean absolute deviations from the median. Probe annotations were retrieved using the illuminaHumanv3.db package from Bioconductor; poor quality probes and probes not corresponding to known HUGO gene symbols were removed from the data before further processing. Bead summary data were log2 transformed and quantile normalized. We then used the non-parametric empirical Bayes method described by  to correct for chip-to-chip batch effects. The raw data and batch-corrected, normalized data have been deposited in NCBI’s Gene Expression Omnibus  and are accessible through GEO Series accession number GSE45407 .
The limma package  from Bioconductor was used to determine differential gene expression/regulation: the lmFit function was used to determine relative expression at each condition (time point, expressed allele); the contrasts.fit function was used twice, first to determine fold-change at 2, 4, and 24 h relative to the ‘0 h’ time point (4 h ethanol treatment), and second to determine the difference in fold-change between each expressed allele and GRα; the eBayes function was used to compute statistics in each case. An adjusted  P value cutoff of P <0.05 was used to determine which genes are differentially expressed. We found there was insufficient statistical power to identify many differentially regulated genes at 2 h. This is likely due in part to the limitations of measuring steady state total RNA levels rather than newly generated transcripts, which was not feasible at the time of this study. The ggplot2 package  was used to visualize the results. For a more detailed procedure see (Additional file 17).
U2OS cells stably expressing one of GRα, A477T, 30iiB, or E773R were grown in 3× 75 cm2 dishes and treated with 100 nM dexamethasone for 90 min. The following procedure is adapted from . We added formaldehyde to a final concentration of 1% for 3 min at room temperature and then quenched by adding 2.5 M glycine to a final concentration of 125 mM glycine. Cells were incubated at 4°C for 10 min and then washed with PBS for 5 min. We added 5 mL ice-cold IP lysis buffer (50 mM HEPES-KOH pH 7.4, 1 mM EDTA, 150 mM NaCl, 10% v/v glycerol, 0.5% v/v Triton X-100) with 1× protease inhibitor cocktail added (PIC; Roche Complete) to each flask to lyse cells and scraped cells into 50 mL conical tubes. Lysed cells were nutated for 30 min and pelleted by centrifugation at 4°C for 5 min at 600 times gravity. After removing the supernatant, nuclei were resuspended in 900 μL RIPA buffer (10 mM Tris pH 8.0, 1 mM EDTA, 150 mM NaCl, 5% v/v glycerol, 0.1% sodium deoxycholate, 0.1% SDS, 1% v/v Triton X-100) with 1× PIC added and frozen at −80°C.
The remaining ChIP procedure was performed at 4°C. Protein G Dynabeads (Invitrogen) were prepared for IP as follows: beads were pelleted (using a magnet); liquid was removed; beads were washed twice with RIPA, then resuspended in RIPA with 1× PIC and 0.5 mg/mL BSA at the original bead concentration; 12 μg N499 α-GR antibody were added per 100 μL beads, and the beads were nutated for 2 h.
Nuclei were thawed in ice water. Chromatin from nuclei were fragmented by sonication at 4°C in 300 μL aliquots in 1.5 mL microcentrifuge tubes using the Diagenode Biorupter for 8 cycles of 7 min (0.5 min on / 0.5 min off). Samples were spun at 14,000 revolutions per minute (rpm) and for 15 min to pellet insoluble molecules, and the supernatant was transferred into new microcentrifuge tubes. We saved 10% of each sample to isolate later as input DNA (GRα and A477T only). Protein G Dynabeads with antibody bound were pelleted, washed twice with RIPA, and resuspended at the original bead concentration in RIPA with 4× PIC and 2 mg/mL BSA. Sonicated chromatin (900 μL) from each sample was incubated with 100 μL of antibody-bound beads for 16 h.
After pulldown, beads were washed four times with RIPA with 500 mM NaCl, then four times with LiCl buffer (20 mM Tris, 1 mM EDTA, 250 mM LiCl, 0.5% NP-40, 0.5% sodium deoxycholate), to decrease non-specific binding. To reverse crosslinks, beads were then resuspended in 11 μL RIPA with 500 mM NaCl plus 89 μL Rev-Xlink buffer (0.7% SDS in TE pH 8.0 with 0.2 mg/mL Proteinase K). A total of 10 μL of each input DNA sample was similarly treated by adding 1 μL 5 M NaCl and 89 μL Rev-Xlink buffer. Samples were incubated in a PCR block at 55°C for 3 h, then 16°C for 16 h, then column purified using the Clean & Concentrator-5 kit (Zymogen) at a DNA binding buffer:sample ratio of 5:1.
Fragments were end-repaired for 30 min at 20°C in 25 μL reactions with 1× T4 DNA Ligase Buffer w/ 10 mM ATP (NEB), 100 uM dNTPs (Invitrogen), 3.75 U T4 DNA Polymerase (NEB), 1.25 U DNA Polymerase I, Large (Klenow) Fragment (NEB), and 12.5 U T4 Polynucleotidse Kinase. Samples were column purified using the Clean & Concentrator-5 kit (Zymogen) at a DNA binding buffer:sample ratio of 2:1 following this and subsequent enzymatic reactions. dATP was added to the 3’ end of molecules by incubating for 30 min at 37°C in 25 μL reactions with Buffer 2 (NEB), 200 uM dATP (Invitrogen), and 7.5 U Klenow Fragment (3’- > 5’ exo-). Sample concentrations were measured using PicoGreen (Invitrogen), and then ligated to sequencing adapters at a 2:1 (adapter:sample) molar ratio in 20 μL reactions with 1X T4 DNA ligase buffer w/ 10 mM ATP (NEB) with 6% PEG-8000 and 200 U T4 DNA Ligase (NEB). Libraries were amplified by PCR for 17 cycles using the standard Illumina paired-end primers, and purified by PAGE as described in . Sample concentrations and size distribution were measured using the High Sensitivity DNA kit for the Bioanalyzer 2100 (Agilent) prior to sequencing.
In-house barcodes were used for ChIP samples, which were added to the 3’ ends of the sequencing adapters and determined using the first four bases of the read. Barcodes are TCAT (BC1), GACG (BC2), AGTC (BC3), and CTGA (BC4). The ChIPs for GRα and A477T were each ligated to all barcodes, pooled, and sequenced in their own lanes. Samples for 30iiB and E773R were ligated to BC1-2 and BC3-4, respectively, and pooled. Input samples were ligated to indexed TruSeq adapters using indices 5 and 7 for GRα and A477T inputs, respectively.
ChIP samples were sequenced at the University of California San Francisco Center for Advanced Technology using the Genome Analyzer II (Illumina) with 2× 75 bp paired end reads as per the manufacturer’s instructions. Input DNA samples were sequenced using the HiSeq 2000 (Illumina) with 2× 100 bp paired end reads. Raw sequence reads were deposited in the Sequence Read Archive (SRA) at the National Center for Biotechnology Information (NCBI)  and are accessible through SRA study accession number SRP020242.
We automated the analysis procedure using the Python programming language in order to make our results easily reproducible and to facilitate similar analysis by others. The scripts used - preprocess_reads.py, align2.py, call_peaks.py, and analyze.py - are described here and are available in the software package seriesoftubes on the Python Package Index (PyPI). seriesoftubes depends in part on Biopython . We used preprocess_reads.py (seriesoftubes) to separate the sequencing results by barcode/index, saving only reads with valid barcodes/indices (on both ends in a pair for internal barcodes), and merged barcodes/indices as applicable to consolidate sequence data for each sample. We then used align2.py (seriesoftubes) to align sequences to the genome using bowtie2 , allowing a maximum insert size of 600 bp (based on our observation of an average library size of <150 bp) and saved in BAM format using samtools . Note: for internal barcodes, the first five bases contain the barcode plus an additional ‘T’; these positions were ignored when running bowtie2.
We used call_peaks.py (seriesoftubes), which relies on the MACS2a callpeak module, to find significantly occupied peak regions (‘peaks’) and summit positions using an adjusted P value cutoff of P <0.01 and to generate pileup tracks (signal per million reads). Additionally, the MACS2 diffstats module was used to compute differential occupancy statistics and find (mutant versus GRα) differentially occupied peaks at using an adjusted P value cutoff of P <0.05. For convenience, we have created a track hub formatted for the UCSC genome browser  containing paired-end read alignments, signal pileups, peak region locations, and peak summit locations .
Peaks were assigned to genes using GREAT , with a basal domain of ±20 kb of a gene’s primary transcriptional start site (where assignment is guaranteed) and an extended domain of ±100 kb (where assignment is made only if it does not overlap with another gene’s basal domain); if a peak was assigned to multiple genes, it was ignored in subsequent analyses.
MEME-ChIP  was used for motif discovery as described in the text. For all motif analyses, we used only peak regions (±75 bp from the peak summit) that do not overlap with repeats, found by using bedtools  to intersect peaks regions with the ‘rmsk’ table from the RepeatMasker  track for hg19, extracted using the Table Browser . Peak sequences (±75 bp from the peak summit) were then extracted from the UCSC hg19 build of the human genome using the twobitreader package (available on PyPI). Occurrences of motifs were found using analyze.py (seriesoftubes), which relies on MOODS  for efficiency. JASPAR  motifs MA0144.1 and MA0099.2 were used to find STAT3 and AP1 sites, respectively. Peak sequences were randomly permuted and the 95th percentile of individual information for the best motif in each permuted peak sequence was used as a cutoff for determining presence of GR full site, STAT3, or AP1 motifs (8.94, 8.06, and 9.49 bits, respectively). Data were integrated for further analysis using the R programming language (Additional file 10) and visualized with ggplot2 .
Selected GR binding regions were cloned into the firefly luciferase reporter vector pGL4.10-E4TATAd using conventional methods (that is, PCR and restriction enzymes). A table of reporter names, primers, and genomic coordinates is provided in Additional file 18. We used the Renilla luciferase reporter vector pGL4.70-MLE2e as a normalization control, and the mammalian expression vector p6R as ‘filler’ DNA to reach the recommended concentration of DNA for transfections.
U2OS cells stably expressing either GRα or A477T were grown in three 10 cm dishes each, split, and seeded into two 96-well plates at a final density of about 5 × 104 cells/well. The next day, cells were transfected with 3.33 ng firefly reporter, 150 pg Renilla reporter, and 20 ng p6R per well using PLUS Reagent and Lipofectamine (Invitrogen) according to manufacturer instructions. After 16 h, cells were treated with 100 nM dexamethasone or vehicle only (ethanol) for 3 h, and then luciferase expression was assayed using the Dual-Luciferase Reporter Assay System (Promega) according to manufacturer’s instructions.
aMACS2 will be described in more detail in a forthcoming publication. Briefly, peak calling uses a Poisson test as in the first version of MACS  and differential occupancies are determined using a log-likelihood ratio test, and transformed to P values using Wilks’ theorem . Pre-release versions of MACS2 are available online at .
bGBRs with different locations among two samples were determined as those GBRs that were detectable in one sample and not the other (by overlap) and that did not significantly differ in occupancy among the two samples. This quantification of GBRs with differing locations is only an estimate, as some GBRs that differ in occupancy may appear to have similar occupancy. This effect results from variation in the observed numbers at a given GBR, primarily due to Poisson counting error.
cTo determine whether fold-changes in gene expression between two conditions were less than two-fold different at a significance level of 95%, we constructed 97.5% confidence intervals for the fold-change in gene expression under each condition and required that they do not overlap at all. The 97.5% confidence intervals were computed as the fold-change plus or minus 2.447 - the t-statistic for six degrees of freedom (eight total sample measurements minus two degrees for determining their respective means) - divided by the square root of two (because subtracting two groups of four yields an equivalent n of 2), and multiplied by the posterior standard deviation for a given gene as determined by limma.
dpGL4.10-E4TATA was constructed by ligating the E4TATA minimal promoter 5’-tttttagtcc tatatatact cgctctgtac ttggcccttt ttacactgtg to the promoter-less vector pGL4.10 at BglII and HindIII restriction sites. The E4TATA promoter was selected because it does not respond to dexamethasone.
epGL4.70-MLE2 was constructed by ligating the MLE2 promoter (engineered ML promoter with stronger induction ability) 5’-gaaggggggc tataaaaact cgctctggcg cgttcgtcct cactctcttc cgcatcgct. The MLE2 promoter was chosen because it does not respond to dexamethasone and drives strong expression.
Activation function 1
Activation function 2
Activation protein 1
Chromatin immunoprecipitation sequencing
Glucocorticoid receptor binding region
Glucocorticoid receptor binding sequence
Glucocorticoid receptor response element
Nuclear magnetic resonance spectroscopy
Signal Transducers and Activators of Transcription
Samantha Cooper was involved in the design of the study and optimizing chromatin immunoprecipitation sequencing protocols. Sheng-Hong Chen provided the luciferase reporter vectors pGL4.10-E4TATA and pGL4.70-MLE2. Clement Chu and Jessica Lund at the UCSF Center for Advanced Technology aided in the acquisition of sequencing data. Processing of RNA samples and gene expression microarrays were performed by the University of Southern California Epigenome Center. Joel Mefford suggested the statistical test used to identify non-selective, activated genes. Funding for this study was provided by grants to KRY (R01 CA020535) and to MRS (R01 DK043093) from the National Institute of Health and also by Cancer Center Support Grant P30CA014089 from the National Cancer Institute. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Cancer Institute or the National Institutes of Health.