Key resource table
Reagent or resource | Source | Identifier |
---|---|---|
Antibodies |
Anti-POLR1A | Santa Cruz | Cat # Sc-48385 |
Anti-RNA Pol II-CTD | Abcam | Cat # ab817 |
Anti-RNA Pol II-NTD | CST | Cat # D8L4Y |
Anti-POLR3A | Abcam | Cat # ab96328 |
Anti-GFP | Abcam | Cat # ab290 |
Anti-HA tag | Abcam | Cat # ab9110 |
Anti-β-Actin | Sigma-Aldrich | Cat # A2228 |
Anti-RPAC1 | Santa Cruz | Cat # Sc-374443 |
Anti-SSRP1 | Biolegend | Cat # 609710 |
Anti-SUPT16H | Santa Cruz | Cat # Sc-165987 |
Anti-INTS3 | Proteintech | Cat # 16620-1-AP |
Anti-INTS11 | ABclonal | Cat # A6566 |
Anti-TBP | Proteintech | Cat # 22006-1-AP |
Anti-RNA Pol II S2p | CST | Cat # 13499 |
Anti-H3K36me3 | Abcam | Cat # Ab9050 |
Anti-C-MYC | Proteintech | Cat # 10828-1-AP |
Anti-RPB3 | Proteintech | Cat # 13428-1-AP |
Anti-RPB5 | Proteintech | Cat # 15217-1-AP |
Anti-RPB6 | Proteintech | Cat # 15334-1-AP |
Anti-RPB8 | Proteintech | Cat # 15086-1-AP |
Anti-RPB11 | Proteintech | Cat # 16403-1-AP |
Anti-RPAC1 | Proteintech | Cat # 15923-1-AP |
Anti-Rabbit IgG Secondary Antibody | GE Healthcare | Cat # NA931V |
Anti-Mouse IgG Secondary Antibody | GE Healthcare | Cat # NXA931V |
Chemicals |
Normal Mouse IgG | Merck Millipore | Cat # 12-370 |
Normal Rabbit IgG | Merck Millipore | Cat # 12-370 |
Complete Tablets EDTA-free, EASYpack | Roche | Cat # 04693132001 |
Doxycycline | Sigma-Aldrich | Cat # D9891 |
Indole-3-acetic acid sodium salt | Sigma-Aldrich | Cat # 1I5148 |
Puromycin dihydrochloride | Sigma-Aldrich | Cat # P8833 |
GENETICIN, G418 | Thermo Fisher | Cat # 10131035 |
5-ethynyluridine (EU) | J&K | Cat # 1388360 |
Biotin-PEG3-azide | Aladdin | Cat # B122225 |
THPTA | Sigma | Cat # 762342 |
Sodium ascorbate | Sigma | Cat # A7631 |
Critical commercial assays |
Dynabeads™ Protein G | Thermo Fisher | Cat # 10004D |
Dynabeads™ M-280 Streptavidin | Thermo Fisher | Cat # 11205D |
Pierce™ BCA protein assay kit | Thermo Fisher | Cat # 23227 |
FuGENE® HD Transfection Reagent | Promega | Cat # E2311 |
NEBNext Ultra II DNA Library Prep Kit for Illumina | NEB | Cat # E7645S |
Illumina Nextera DNA Sample Preparation Kit | Illumina | Cat # FC-121-1030 |
KOD FX polymerase | TOYOBO | Cat # KFX-101 |
2×Taq Master Mix (Dye Plus) | Vazyme | Cat # P112-02 |
pEASY-Basic Seamless Cloning and Assembly Kit | TransGen | Cat # CU201-03 |
KAPA HIFI hotstart PCR Kit | Kapa Biosystems | Cat # KK2502 |
SuperScript™ III Reverse Transcriptase | Thermo Fisher | Cat # 18080085 |
P-30 RNase-free spin column | BioRad | Cat # 732-6250 |
Micrococcal Nuclease (MNase) | NEB | Cat # M0247S |
ChamQ Universal SYBR qPCR Master Mix | Vazyme | Cat # Q711-02 |
Megen Gel extraction kit | Megen | Cat # D2111-03 |
HiPure Plasmid EF Mini Kit | Magen | Cat # P1112-02 |
Qubit dsDNA HS kit | Thermo Fisher | Cat # Q32851 |
Mouse ES cell culture
The V6.5 mouse ES (mES) cell line, derived from the inner cell mass (ICM) of C57BL/6 × 129/sv crossed mice, was a gift from R. Young of the Whitehead Institute. The mES cells were cultured as previously described [38] and were tested every 3 months and found to be free of mycoplasma contamination. For experiments, all degron mES cells were pretreated with 1 μg/ml doxycycline for 12 h and then treated with or without 500 μM indole-3-acetic acid (IAA) for the indicated times.
Plasmid construction and gene targeting
Plasmids for degron cell line construction were constructed as described [38]. Briefly, gene-targeting donors containing an mAID-GFP tag flanked by mouse genomic sequence from the targeted loci were constructed with a seamless ligation kit (TransGen Biotech, Cat # CU201-03). For RPA1, RPB1, RPC1, and RPAC1, the tag was fused to the C-terminus of the protein at the respective endogenous locus. For comparison, we inserted an mAID-GFP tag at the N-terminus of RPB1 to construct the Pol II_NTD_degron cell line. For cell transfection, donor and CRISPR sgRNA plasmids were prepared using a HiPure Plasmid EF Mini Kit (Magen, Cat # P1112-02) and transfected into a clonal mouse embryonic stem cell line stably expressing Tir1 using FuGENE HD (Promega, Cat # E2311) following the manufacturer’s protocol. After 2 days, the cells were passaged and grown for 1 week in medium containing 100 μg/ml neomycin. Homozygous clonal lines were selected after genotyping. These clonal lines were assessed for IAA-induced degradation and for expression levels similar to those in wild-type mES cells. The clones with the highest degradation efficiency were chosen for subsequent assays.
Western blotting
mESCs were dissociated, pelleted by centrifugation at 3500 rpm for 2 min, resuspended in lysis buffer (2.5 mM MgCl2, 0.1% NP40, 0.25 M sucrose, 1 mM DTT, 700 mM NaCl, 25 mM HEPES pH 7.9, 1× protease inhibitor cocktail), lysed for 10 min on ice, and centrifuged at 14,000 rpm for 10 min at 4 °C. The protein concentration of the supernatant was measured using the Pierce™ BCA protein assay kit (Thermo, Cat # 23227). Samples were mixed with 2× loading buffer (4% SDS, 20% glycerol, 0.2 M DTT, 0.2% bromophenol blue) and heated for 10 min at 100 °C. Samples were run on 10–12% SDS-polyacrylamide gels and transferred onto PVDF membranes at 300 mA for 2 h. Membranes were blocked with 5% skim milk in PBST for 1 h at room temperature and then incubated overnight at 4 °C with primary antibody diluted in PBST containing 5% bovine serum albumin, following the manufacturer’s recommendation. The next day, the membrane was washed three times for 5 min each in PBS with 0.1% Tween-20 at room temperature, incubated with secondary antibody (1:10,000) in PBS containing 5% bovine serum albumin and 0.1% Tween-20 for 1 h at room temperature, washed three times, and imaged on a GE AI 600 RGB imaging system. Panels were mounted using ImageJ, preserving linearity.
Native chromatin fraction isolation for immunoprecipitation
mESCs were dissociated, pelleted by centrifugation at 3500 rpm for 2 min, and washed once with PBS/1 mM EDTA. The cells were resuspended in CE buffer (10 mM HEPES pH 7.6, 60 mM KCl, 1 mM EDTA, 0.1% NP40, 1 mM DTT, 0.34 M sucrose) on ice for 5 min and centrifuged at 3500g for 15 min at 4 °C; the supernatant (cytoplasm) was discarded, and the pellet was washed once with PBS/1 mM EDTA. The pellet was resuspended in glycerol buffer (20 mM Tris·HCl pH 8.0, 75 mM NaCl, 0.5 mM EDTA, 0.85 mM DTT, 50% (vol/vol) glycerol), an equal volume of nuclei lysis buffer (10 mM Hepes pH 7.6, 1 mM DTT, 7.5 mM MgCl2, 0.2 mM EDTA, 0.3 M NaCl, 1 M urea, 1% NP-40) was added, and the sample was mixed thoroughly, placed on ice for 2 min, and centrifuged at 12,000 rpm for 2 min at 4 °C; the supernatant (nucleoplasm) was discarded. The pellet (chromatin fraction) was washed twice with 1 ml PBS/1 mM EDTA. The chromatin was resuspended in CSKII buffer (10 mM PIPES pH 6.8, 50 mM NaCl, 0.3 M sucrose, 6 mM MgCl2, 1 mM DTT) with 5 μl DNase I (NEB, Cat # M0303) and 5 μl RNase A (Takara, Cat # 2158) and incubated at 37 °C for 30 min with shaking at 700 rpm. An equal volume of CSKII buffer containing 0.5 M (NH4)2SO4 was added, and the mixture was incubated at room temperature for 10 min. The mixture was centrifuged at 1200g for 6 min, and the supernatant was collected. After incubation with antibody and protein G-coupled beads for 12 h at 4 °C, the beads were washed four times with wash buffer (30 mM Hepes pH 7.6, 0.1 M NaCl, 0.5% NP-40, and protease inhibitor) and boiled at 100 °C for 10 min. Supernatants were collected for western blot detection.
ATAC-Seq
ATAC-Seq was performed according to the protocol from [81]. In total, 50,000 viable cells were used for library preparation using Nextera™ DNA Sample Prep Kit (Illumina, Cat # FC-121-1030). PCR-amplified libraries were extracted with Megen gel purification kit (Megen, Cat # D2111-03) without size selection. Library quality and quantity were analyzed with Bioanalyzer and Qubit assays and then sequenced on Illumina HiSeqXten using 150 bp paired-end mode.
ChIP-Seq/qPCR, ChIP-mass spectrometry
The ChIP procedure was modified from a previously published protocol [82]. After cell fixation with 1% formaldehyde for 10 min at room temperature, chromatin fractions were isolated, and 40 U of micrococcal nuclease (NEB, Cat # M0247S) was added to the chromatin fraction and incubated at 37 °C for 15 min; 20 μl of 0.5 M EDTA and 40 μl of 0.5 M EGTA were then added to inactivate the MNase. The spun-down pellets were resuspended in 1 ml of sonication buffer and sonicated with a Bioruptor with the following settings: high energy, 30 s on, 60 s off, 20 cycles. After two rounds of centrifugation at 12,000 rpm for 10 min at 4 °C, antibody enrichment and ChIP DNA/protein collection were conducted as previously described. For ChIP-Seq library construction, all the ChIP material or 20 ng of the input ChIP DNA was used to construct Illumina sequencing libraries using the NEBNext Ultra II DNA Library Prep Kit for Illumina (NEB, Cat # E7645S). PCR-amplified libraries were gel extracted at 200–500 bp and eluted in 30 μl of water. Library quality and quantity were analyzed with Bioanalyzer and Qubit assays, and the library was then sequenced on a HiSeq X Ten in 150 × 150 paired-end mode. For ChIP-qPCR, we used primers with ChamQ Universal SYBR qPCR Master Mix (Vazyme, Cat # Q711-02). For ChIP-WB, the samples were analyzed as described in the “Western blotting” section. For ChIP-MS, the samples were analyzed by high-resolution MS (Thermo Fisher, Orbitrap Fusion Lumos) according to the manufacturer’s instructions.
DRB treatment ChIP-qPCR
Pol III_degron cells were treated with 1 μg/ml doxycycline for 12 h. For the DRB treatment groups, 100 μM DRB was added for 3.5 h. For the DRB release assay, DRB was washed out twice with PBS and replaced with fresh medium for 0, 10, or 20 min, and formaldehyde fixation was performed immediately as described above. For the Pol III degradation groups, 500 μM indole-3-acetic acid (IAA) was added after 2.5 h of DRB treatment, so that cells had received 3.5 h of DRB and 1 h of IAA at harvest; doxycycline and IAA were maintained in the medium during DRB release. ChIP-qPCR was conducted as described above.
Chromatin-associated RNA-Seq
The cells were dissociated and counted, and 2 × 10⁷ cells were used per experiment. After adding 5% Drosophila S2 cells as spike-in, the mixed cell population was pelleted by centrifugation at 1000g for 5 min at 4 °C. Cell pellets were lysed gently with 0.5 ml of ice-cold NP-40 lysis buffer (10 mM Tris·HCl pH 7.5, 150 mM NaCl, 0.05% NP-40) on ice for 5 min. The cell lysate was layered on top of a 1.25 ml sucrose cushion (24% sucrose (wt/vol) in NP-40 lysis buffer) and centrifuged at 12,000 rpm for 10 min at 4 °C to isolate the nuclei pellet (the supernatant represented the cytoplasmic fraction). The nuclei pellet was washed once with 1 ml PBS/1 mM EDTA and centrifuged at 12,000 rpm for 1 min at 4 °C, and the supernatant was discarded. Then, 0.5 ml nuclei lysis buffer (10 mM Hepes pH 7.6, 1 mM DTT, 7.5 mM MgCl2, 0.2 mM EDTA, 0.3 M NaCl, 1 M urea, 1% NP-40) and 0.5 ml glycerol buffer (20 mM Tris·HCl pH 8.0, 75 mM NaCl, 0.5 mM EDTA, 0.85 mM DTT, 50% (vol/vol) glycerol) were mixed, and the nuclei pellet was gently resuspended in the mixture and incubated on ice for 2 min. After centrifugation at 12,000 rpm for 2 min at 4 °C, the supernatant (the nuclear soluble fraction) was discarded. The chromatin pellet was washed twice with 1 ml PBS/1 mM EDTA and centrifuged at 12,000 rpm for 1 min at 4 °C, and the supernatant was discarded. Total RNA was isolated using TRIzol following the manufacturer’s instructions. Sequencing libraries were generated by Novogene Corporation. The libraries were sequenced on an Illumina HiSeq X Ten platform, and 150 bp paired-end reads were generated.
PRO-Seq
PRO-Seq was performed using a modification of a previously published protocol [83]. mES cells were dissociated and counted, and ~10⁷ cells were used per experiment. After adding 5% Drosophila S2 cells as spike-in, the mixed cell population was pelleted by centrifugation at 1000g for 5 min at 4 °C. The cell pellet was washed once in 10 ml of ice-cold PBS, resuspended in ice-cold douncing buffer (1 × 10⁶ cells per ml; 10 mM Tris-HCl pH 7.4, 300 mM sucrose, 3 mM CaCl2, 2 mM MgCl2, 0.1% (vol/vol) Triton X-100, 0.5 mM DTT) for 5 min on ice, and dounced 25 times using a Dounce homogenizer, followed by two washes with douncing buffer. After centrifugation, the pellet was resuspended in storage buffer (5–10 × 10⁶ nuclei per 100 μl of storage buffer; 10 mM Tris-HCl pH 8.0, 25% (vol/vol) glycerol, 5 mM MgCl2, 0.1 mM EDTA, and 5 mM DTT), and the nuclei were either used immediately or flash-frozen in liquid nitrogen and stored at −80 °C. A 100 μl volume of 2× NRO master mix was prepared (10 mM Tris-HCl pH 8.0, 5 mM MgCl2, 1 mM DTT, 300 mM KCl, 0.02 mM biotin-11-CTP, 0.0005 mM CTP, 0.25 mM ATP/GTP/UTP, 1% Sarkosyl, and RNase inhibitor), pipetted thoroughly, and preheated to 37 °C. Using a cut-off P200 pipette tip, 100 μl of nuclei was added gently, the mixture was pipetted thoroughly 15 times, and the reaction was incubated for 3 min, with gentle tapping at the incubation midpoint. Total RNA was then extracted with Trizol LS and dissolved in 20 μl DEPC-H2O. RNA was heat-denatured at 65 °C on a heat block for 40 s and fragmented by base hydrolysis: 5 μl of ice-cold 1 M NaOH was added, and the mixture was incubated on ice for 10 min. After adding 25 μl of 1 M Tris-HCl, pH 6.8, the buffer was exchanged once by running the 50 μl base-hydrolyzed RNA sample through a P-30 RNase-free spin column according to the manufacturer’s instructions (BioRad, Cat # 732-6250). Then, ~50 μl of the RNA sample from the prior step and 50 μl of prewashed M-280 streptavidin beads were mixed and incubated at room temperature on a rotator for 20 min.
After the beads were washed twice each with ice-cold high-salt wash buffer (50 mM Tris-HCl pH 7.4, 2 M NaCl, and 0.5% (vol/vol) Triton X-100), binding buffer (10 mM Tris-HCl pH 7.4, 300 mM NaCl, and 0.1% (vol/vol) Triton X-100), and low-salt wash buffer (5 mM Tris-HCl pH 7.4 and 0.1% (vol/vol) Triton X-100), RNA was extracted twice with Trizol, precipitated with glycogen, and dissolved in 16 μl DEPC-H2O. Reverse transcription was performed with SuperScript™ III Reverse Transcriptase according to the manufacturer’s instructions (Thermo Fisher, Cat # 18080085). After RNA was eliminated by adding 2 μl of 1 M NaOH and incubating for 20 min at 98 °C, the single-stranded DNA was used to construct libraries with the TELP protocol [3]. PCR-amplified libraries were gel extracted at 200–500 bp and eluted in 30 μl of water. Library quality and quantity were analyzed with Bioanalyzer and Qubit assays. The library was sequenced on a HiSeq X Ten in 150 × 150 paired-end mode.
EU-Seq
EU-Seq was performed using a modification of a previously published protocol [84]. Briefly, the cell-permeable uridine analog 5-ethynyluridine (EU) was added to the culture medium at 1 mM for 10 min to label nascent transcripts in vivo. Drosophila S2 cells, at 10% of the total cell number, were treated similarly and used as a spike-in control. After EU labeling, the cells were lysed, and total RNA was extracted. Biotin was conjugated to 10 μg of EU-labeled RNA by a click chemistry reaction in 30 μl of working solution (50 mM HEPPS pH 7.5, 2.5 mM THPTA, 2.5 mM CuSO4, 4 mM Biotin-PEG3-azide, 10 mM sodium ascorbate) for 1 h at room temperature. The reaction was stopped with 450 μl of 5 mM EDTA, and the biotinylated RNA was extracted with 500 μl phenol-chloroform (pH 5.2). The supernatant was collected by centrifugation at 13,000 rpm for 10 min at 4 °C. A 1/10 volume of 3 M NaAc, 1 μl glycogen, and an equal volume of isopropanol were added to the supernatant. The RNA was precipitated by centrifugation at 13,000 rpm for 20 min at 4 °C, washed once with 75% EtOH, and dissolved in 100 μl H2O. Biotin-labeled RNA was hydrolyzed with NaOH and subsequently enriched with streptavidin beads. cDNA synthesis and library construction were conducted as described for PRO-Seq. PCR-amplified libraries were gel extracted at 200–500 bp and eluted in 30 μl H2O. Library quality and quantity were analyzed with Bioanalyzer and Qubit assays. The library was sequenced on a HiSeq X Ten in 150 × 150 paired-end mode.
ChIP-Seq mapping and analysis
ChIP-Seq raw data were processed as described [38]. Adapters were trimmed with cutadapt (v2.10) in paired-end mode with the following parameters: -q 15,15 --minimum-length 18 -a CCCCCCCCCAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC -A AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT. Trimmed reads were first aligned to the mouse reference genome (mm10) using Bowtie2 (v2.3.5.1) with default options [85]. SAMtools (v0.1.19) and sambamba (v0.7.0) were used to filter out unmapped reads, multi-mapped reads, and potential PCR duplicates [4, 5]. Only uniquely aligned reads were retained before calling peaks with MACS2 (v2.2.5) [6]. Publicly available ChIP-Seq data from mESCs (Additional file 2: Table S1) were processed using the same strategy. For visualization, bam files of individual replicates for each condition were combined and then converted to bigwig files, binned (10 bp), and normalized to 1× depth of reads per genome coverage (RPGC) using bamCoverage from the deepTools suite (v3.4.3) with parameters “--normalizeUsing RPGC -bs 10” [86]. Downstream analyses such as TSS plots, metagene plots, and heatmaps were also performed using deepTools with a bin size of 50 bp. Peaks were annotated within mRNA or lncRNA regions according to the GENCODE definitions, repeat element annotations (such as SINE, LINE, or LTR) were taken from Homer (v4.10) [87], and tRNA loci were obtained from the UCSC Table Browser. Each peak was assigned to a single category. Annotation of shared and specific Pol I, Pol II, and Pol III binding peaks was carried out using bedtools combined with the Homer annotatePeaks.pl script and then visualized as pie or bar graphs using the ggplot2 R package. The wild-type RNAP peaks were downloaded from the supplementary data of our previous study.
DiffBind (v2.14.0) was used for differential analysis of the indicated ChIP-Seq signals upon perturbation of each of the other two RNA polymerases, at the wild-type peaks of the profiled polymerase or at active promoters [88] (defined in the “Gene list and promoter definition” section).
ChIP-Seq mapping to rDNA units
Trimmed reads were independently aligned against a single copy of the mouse rDNA repeat sequence (GenBank: BK000964.3) using bowtie2 to study whether Pol II and Pol III disruption exert an active or repressive effect on rDNA transcription level. The bam files from sample replicates were then merged to create a representative genome track over the rRNA gene unit, as shown in Fig. 2A.
Poly(A) RNA-Seq mapping and analysis
Raw data were processed as previously described [38]. After quality control and ribosomal read removal, adapter sequences were removed, and reads were aligned against the concatenated mm10 and dm6 genomes using the STAR aligner (v2.7.5a) with default parameters and then separated into mouse and fly bins [89]. Uniquely mapped reads were counted to estimate transcript abundance over GENCODE-annotated genes (mm10, GRCm38/M23) using featureCounts (v2.0.1) [90]. Differentially expressed genes were detected using DESeq2 with significance cutoffs of FDR < 0.05 and fold change > 2, requiring a minimum read count of 1 in at least one of the two biological replicates of the control sample [91].
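The cutoffs above can be written as a simple per-gene filter. The following is a minimal sketch, not the authors' code: the function name and arguments are hypothetical, the inputs are assumed to come from an exported DESeq2 results table, and "fold change > 2" is interpreted here as an absolute change in either direction, which is an assumption.

```python
import math


def is_differential(log2_fc, padj, control_counts,
                    fdr_cutoff=0.05, fc_cutoff=2.0, min_count=1):
    """Return True if a gene passes the FDR, fold-change, and detection cutoffs.

    log2_fc        -- log2 fold change reported for the gene
    padj           -- FDR-adjusted p value (None or NaN means "not tested")
    control_counts -- raw read counts in the control replicates
    """
    if padj is None or math.isnan(padj):
        return False
    # Require at least `min_count` reads in at least one control replicate.
    detected = any(count >= min_count for count in control_counts)
    # Assumption: the fold-change cutoff applies to up- and downregulation alike.
    return detected and padj < fdr_cutoff and abs(log2_fc) > math.log2(fc_cutoff)
```

Genes with a missing adjusted p value are treated as not significant rather than raising an error.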
ATAC-Seq mapping and analysis
ATAC-Seq data were processed as published. Briefly, sequenced reads were first adapter trimmed with cutadapt and quality checked with FastQC (v0.11.7). Clean reads were then aligned to the mm10 mouse genome using Bowtie2. Uniquely aligned reads were subsequently processed into sorted, indexed BAM files using SAMtools. PCR duplicates and mitochondrial reads were discarded before further analysis. The remaining reads were corrected to account for the 9-bp duplication introduced by the Tn5 transposase by offsetting the 5′ ends by either +4 (for the plus strand) or −5 (for the minus strand) as described previously. RPGC-normalized bigwig tracks representing chromatin accessibility were generated using deepTools bamCoverage with parameters “--normalizeUsing RPGC -bs 10.” Nucleosome positioning and the corresponding signal tracks were calculated from replicate-merged ATAC-Seq data using the nucleoATAC algorithm (v0.3.4) with default parameters [92].
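The Tn5 offset correction can be illustrated on a single aligned read. This is a minimal sketch under stated assumptions: coordinates are 0-based and half-open (BED-style), and the function name `shift_tn5` is hypothetical rather than part of any published pipeline.

```python
def shift_tn5(start, end, strand):
    """Shift the 5' end of an aligned read to account for the Tn5 offset.

    For a plus-strand read the 5' end is `start`, which moves by +4 bp.
    For a minus-strand read the 5' end is `end`, which moves by -5 bp.
    """
    if strand == "+":
        return start + 4, end
    if strand == "-":
        return start, end - 5
    raise ValueError(f"unknown strand: {strand!r}")
```

Only the 5′ end is moved in this sketch; the 3′ end of the read is left unchanged.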
Gene list and promoter definition
All gene-centric analyses in this study were performed using the mouse GENCODE annotation downloaded from gencodegenes.org in GTF format and filtered to retain only “gene” entries. Annotations on chrM and random chromosomes were omitted.
Active genes were defined as those with a promoter-proximal density greater than 0 and a gene body density significantly higher than 0.04 reads/kb, based on the background estimate from our untreated PRO-Seq data, as previously described [93]. A union list of 8845 active mRNA genes (supported by RefSeq annotation) was created by retaining only those detected in all three untreated mESC samples and with a minimum length of 5 kb. The coordinates and annotations of tRNAs were downloaded in BED format from the UCSC Table Browser. Intronless genes were extracted from RefSeq-annotated genes comprising a single isoform with one exon.
Promoter regions were defined as the ±1 kb regions surrounding annotated transcription start sites. Active promoters were promoters of active genes that overlapped an H3K4me3 peak.
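The active-gene criteria can be sketched as a set operation over samples. This is an illustration only: the data layout (one dict per sample mapping a gene to its promoter-proximal and gene body densities) and the function names are hypothetical, and the statistical test behind "significantly higher than 0.04 reads/kb" is replaced here by a plain threshold, which is an assumption.

```python
def is_active(pp_density, gb_density, gb_cutoff=0.04):
    """Promoter-proximal density > 0 and gene body density above the cutoff."""
    # Assumption: a simple threshold stands in for the significance test.
    return pp_density > 0 and gb_density > gb_cutoff


def active_gene_set(samples, lengths, min_length=5000):
    """Return genes that are active in every sample and at least `min_length` bp long.

    samples -- list of dicts: gene -> (promoter-proximal density, gene body density)
    lengths -- dict: gene -> gene length in bp
    """
    shared = set(samples[0])
    for sample in samples[1:]:
        shared &= set(sample)
    return {
        gene for gene in shared
        if lengths[gene] >= min_length
        and all(is_active(*sample[gene]) for sample in samples)
    }
```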
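The promoter and active-promoter definitions translate into a small interval check. This is a minimal sketch assuming 0-based half-open coordinates and peaks already restricted to the same chromosome as the TSS; all function names are hypothetical.

```python
def promoter_window(tss, flank=1000):
    """Return the +/- `flank` bp window around a TSS, clipped at the chromosome start."""
    return max(0, tss - flank), tss + flank


def overlaps(a, b):
    """True if half-open intervals a and b share at least 1 bp."""
    return a[0] < b[1] and b[0] < a[1]


def is_active_promoter(tss, gene_is_active, h3k4me3_peaks, flank=1000):
    """A promoter is active if its gene is active and it overlaps an H3K4me3 peak."""
    window = promoter_window(tss, flank)
    return gene_is_active and any(overlaps(window, peak) for peak in h3k4me3_peaks)
```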
PRO-Seq and chromatin-associated RNA-Seq (ChAR-Seq) mapping and analysis
PRO-Seq and ChAR-Seq data were processed using a custom pipeline that builds on published workflows with minor modifications. Ribosomal reads were first removed by mapping to one copy of the mouse rDNA sequence. After adapter trimming with cutadapt and quality control with RSeQC (v4.0.0), clean reads were aligned to the concatenated mm10+dm6 genome using bowtie2 with default options [94].
Nonuniquely mapped or improperly paired reads were discarded, and PCR duplicates were removed with Sambamba. Reads mapping to mouse and Drosophila chromosomes were separated and counted with SAMtools. Then, to quantitatively compare gene expression and genome enrichment profiles between different conditions or perturbations, the PRO-Seq and ChAR-Seq data were internally calibrated with Drosophila spike-in cells as previously described [38]. featureCounts was used to obtain gene-level read counts from uniquely mapped bam files in a strand-specific manner. This quantification included signal only in the gene body (from +300 bp relative to the TSS to the annotated gene end), and very lowly expressed genes with fewer than five reads in all samples were excluded from subsequent analysis. The resulting gene read count table was then analyzed with DESeq2 for differential expression, and an FDR cutoff of 0.05 was used to identify significantly differential genes.
For visualization, BAM files of biological replicates, which were highly correlated, were pooled before conversion to bigwig signal tracks. The final bigwig files were separated by strand and normalized to spike-in controls using bamCoverage from deepTools with a bin size of 10 bp. The pausing index was calculated as previously reported [93, 95], defined as the read coverage in the gene body (from TSS+300 bp to the gene end) over the promoter-proximal region (from −30 to +300 bp relative to the TSS) for each gene. Only genes with a minimum length of 5 kb were considered in this analysis.
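One common way to calibrate samples against a spike-in is to scale each sample so that its Drosophila read count matches a reference. The sketch below illustrates that idea only; the exact calibration used here is the one described in [38] and may differ, and the function name and the choice of the smallest spike-in count as reference are assumptions.

```python
def spike_in_scale_factors(fly_read_counts):
    """Return per-sample scale factors from Drosophila (dm6) read counts.

    fly_read_counts -- dict: sample name -> number of reads mapped to dm6
    Each factor rescales a sample so its spike-in count equals the smallest one.
    """
    if any(count <= 0 for count in fly_read_counts.values()):
        raise ValueError("every sample needs a positive spike-in read count")
    # Assumption: the sample with the fewest spike-in reads is the reference.
    reference = min(fly_read_counts.values())
    return {sample: reference / count for sample, count in fly_read_counts.items()}
```

A factor computed this way could, for example, be supplied to the `--scaleFactor` option of deepTools bamCoverage when generating tracks.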
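The pausing index as defined above can be sketched for one gene. This follows the orientation stated in the text (gene body over promoter-proximal region); many reports use the reciprocal. Normalizing each read count by the length of its region is an assumption, and the function name is hypothetical.

```python
def pausing_index(promoter_reads, body_reads, gene_length,
                  min_length=5000, pp_up=30, pp_down=300):
    """Gene body density divided by promoter-proximal density for one gene.

    promoter_reads -- reads in the window from -30 to +300 bp around the TSS
    body_reads     -- reads from TSS+300 bp to the annotated gene end
    Returns None for genes shorter than `min_length` or with no promoter signal.
    """
    if gene_length < min_length:
        return None
    # Assumption: both counts are converted to reads per bp before the ratio.
    pp_density = promoter_reads / (pp_up + pp_down)
    body_density = body_reads / (gene_length - pp_down)
    if pp_density == 0:
        return None
    return body_density / pp_density
```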
BETA analysis to combine ChIP-Seq and PRO-Seq results
We associated Pol III binding regions with nearby Pol II genes using Binding and Expression Target Analysis (BETA) (v1.0.7) to predict whether Pol III has an activating or repressive function by combining ChIP-Seq and PRO-Seq results [96]. The analysis was performed as previously published with the following adaptations. To study the direct functions of Pol III on Pol II-transcribed mRNA genes, we propose two types of interaction models: spatial proximity based on H3K27ac HiChIP loops (obtained from the previous study) and local regulation according to nearest mRNA genes. Briefly, each Pol III peak was independently classified according to its overlap profile, following a hierarchical tree. That is, we first assigned the Pol III peak to H3K27ac HiChIP contact anchors (requiring a minimum 1-bp overlap) to find its mRNA partner on the other side, while the associated target genes within ±100 kb of unassigned Pol III peaks were identified by linear proximity using the nearest gene approach. Next, Pol III regulatory score for each mRNA gene was estimated based on the strengths of Pol III binding and their distance from the TSS of the corresponding mRNA. A nonparametric statistical test (Kolmogorov–Smirnov test) was used to compare regulatory scores for up-, downregulated, or non-changed genes on the basis of PRO-Seq or Pol II ChIP-Seq results before and after Pol III depletion.
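The two-step peak assignment and the distance-weighted score can be sketched as follows. This is an illustration under stated assumptions, not the BETA implementation: the exponential distance decay follows the form used by BETA, weighting each peak by its binding strength is an assumption about how "strength" enters the score, and all function names and data layouts are hypothetical.

```python
import math


def assign_target(peak, loops, genes, max_dist=100_000):
    """Assign a peak to a gene: HiChIP loop anchor first, nearest TSS otherwise.

    peak  -- (start, end), half-open
    loops -- list of ((anchor_start, anchor_end), partner_gene)
    genes -- dict: gene -> TSS position on the same chromosome
    """
    for anchor, gene in loops:
        if peak[0] < anchor[1] and anchor[0] < peak[1]:  # at least 1-bp overlap
            return gene, "loop"
    center = (peak[0] + peak[1]) // 2
    nearest = min(genes, key=lambda g: abs(genes[g] - center), default=None)
    if nearest is not None and abs(genes[nearest] - center) <= max_dist:
        return nearest, "nearest"
    return None, "unassigned"


def regulatory_score(peaks, tss, max_dist=100_000):
    """Sum distance-decayed contributions of peaks within `max_dist` of a TSS.

    peaks -- list of (peak_center, binding_strength)
    """
    score = 0.0
    for center, strength in peaks:
        distance = abs(center - tss)
        if distance <= max_dist:
            # Assumption: strength-weighted version of the BETA decay term.
            score += strength * math.exp(-(0.5 + 4 * distance / max_dist))
    return score
```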
Gene Ontology and gene set enrichment analysis (GSEA)
Gene set enrichment analysis was performed using a pre-ranked gene list based on the difference (log2 fold change) detected in PRO-Seq between the untreated and Pol III degron samples by searching against the intronless genes (Fig. 3F) [97]. Normalized enrichment score (NES) and nominal p value were calculated from the result of 1000 permutations.
Gene Ontology analysis was completed using DAVID (v6.8) online tool with default settings to identify enriched terms in Pol III_degron elongation-affected mRNA genes (adjusted p value <0.05 and fold change ≤ −2) in mESCs (Fig. 3E) [98]. GO term categories were restricted to GOTERM_BP (Biological Process), GOTERM_MF (Molecular Function), GOTERM_CC (Cellular Component), and KEGG_PATHWAY.
Metagene profiles, heatmaps, and volcano plots
For Fig. S1B, peak-centered heatmaps were calculated using deepTools computeMatrix with options “reference-point -a 5000 -b 5000”. The output matrix was plotted using the plotHeatmap for the region spanning −5 kb upstream to +5 kb downstream of TSS, ranked by descending ChIP-Seq read density of the corresponding factor.
For Fig. 4F and S2D, TSS-centered metagene profiles were generated using computeMatrix with options 'reference-point -b 5000 -a 5000' for the region spanning −5 kb upstream to +5 kb downstream of TSS followed by plotProfile, or centered at the TSS in a ±1-kb window for Figs. 4E, 5F, and S3A.
For Fig. 7A, the transcription rate ratio was defined as the Pol II ChIP-Seq signal divided by the PRO-Seq signal, where the PRO-Seq signal represents nascent RNA synthesized during the nuclear run-on period and Pol II ChIP-Seq measures Pol II binding on chromatin. Scaled metagene profiles for the ratio of Pol II ChIP-Seq over PRO-Seq or EU-Seq signal were created using bamCompare with options “--binSize 10 --outFileFormat bigwig.” The deepTools commands computeMatrix (with options “scale-regions -b 3000 -a 3000”) and plotProfile were used to produce aggregated metagene plots of this ratio in the given genomic regions.
For Figs. 1C, E, 3B–D, and 4F, scaled metagene profiles of ChIP-Seq, ChAR-Seq, or PRO-Seq signals were produced using the ngs.plot suite by dividing each gene into 100 equally sized bins, with a 3-kb flanking region on each side in bins of 50 bp. Read pairs sharing the same or opposite orientation as the gene strand were assigned as “sense” and “antisense,” respectively. Only the first mate of each read pair was extracted to make strand-specific metaplots, and the extreme 5% of values were removed. The correct bam files were used to calculate read density across those bins, which was subsequently summarized for all protein-coding genes. For metagene plots other than the strand-specific ones mentioned above, default parameters of ngs.plot were used.
For Figs. 2C and 6C, differential ChIP-Seq binding at the indicated peak regions was defined by an FDR-adjusted p value < 0.05 together with an absolute fold change > 2, and MA plots were produced with the DiffBind plotMA function.
For Fig. 5A, differential expression analysis was performed using DESeq2 with raw count data as input. Volcano plots of PRO-Seq were made based on the fold changes and p values derived from the Wald test on spike-in normalized and log2-transformed reads.
For Figs. 4D and 5E, the ATAC-Seq read counts for each sample at each tRNA region or mRNA promoter were obtained with featureCounts. The resulting count matrix was analyzed with DESeq2 to produce differential tRNA or mRNA gene sets before and after individual RNA polymerase depletion. The analysis protocol matched that for PRO-Seq data (see “PRO-Seq and chromatin-associated RNA-Seq (ChAR-Seq) mapping and analysis” section), with equivalent thresholds for differential chromatin accessibility.
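The per-bin ratio behind this profile can be illustrated directly. This is a minimal sketch: the function name is hypothetical, and adding a pseudocount to both signals to avoid division by zero is an assumption, since the text does not state how empty bins were handled.

```python
def bin_ratio(chip_signal, pro_signal, pseudocount=1.0):
    """Return the per-bin ratio of Pol II ChIP-Seq signal over PRO-Seq signal.

    chip_signal, pro_signal -- equal-length lists of binned coverage values
    """
    if len(chip_signal) != len(pro_signal):
        raise ValueError("both tracks must have the same number of bins")
    # Assumption: a pseudocount keeps bins without PRO-Seq reads finite.
    return [(chip + pseudocount) / (pro + pseudocount)
            for chip, pro in zip(chip_signal, pro_signal)]
```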
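Dividing a gene into equally sized bins so that genes of different lengths can be averaged is the core of a scaled metagene profile. The sketch below shows only that binning step and is not the ngs.plot algorithm; the function name is hypothetical, and coordinates are assumed to be 0-based and half-open.

```python
def gene_body_bins(start, end, strand="+", n_bins=100):
    """Split a gene into `n_bins` near-equal bins ordered from TSS to gene end.

    Returns a list of (bin_start, bin_end) tuples. For minus-strand genes the
    list is reversed so that the first bin is always the one at the TSS.
    """
    length = end - start
    if length < n_bins:
        raise ValueError("gene is shorter than the number of bins")
    # Integer arithmetic keeps bin edges exact and contiguous.
    edges = [start + (i * length) // n_bins for i in range(n_bins + 1)]
    bins = list(zip(edges[:-1], edges[1:]))
    return bins[::-1] if strand == "-" else bins
```

Averaging the read density in bin i across all genes then gives the i-th point of the metagene curve.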
Visualization of genome browser tracks
ChIP-Seq and ATAC-Seq signal tracks (bigwig format) were obtained for visualization on merged bam files with the command “bamCoverage --binSize 10 --normalizeUsing RPGC.”
For PRO-Seq and ChAR-Seq, signal tracks were generated using the bamCoverage function in the deepTools package with options “--normalizeUsing RPKM.”
Figures illustrating these continuous signal tracks over selected genomic intervals were created in the Integrative Genomics Viewer (IGV) browser [99].
ChIP-MS data analysis
The ChIP-MS analysis was done as described previously. Briefly, the gel was rehydrated three times in distilled water at room temperature for 10 min with gentle agitation. The protein bands were excised and cut into ca. 1 × 1 mm² pieces, followed by reduction with 10 mM TCEP in 25 mM NH4HCO3 at 25 °C for 30 min, alkylation with 55 mM IAA in 25 mM NH4HCO3 solution at 25 °C in the dark for 30 min, and sequential digestion with trypsin at a concentration of 12.5 ng/mL at 37 °C overnight (1st digestion for 4 h and 2nd digestion for 12 h). Tryptic peptides were then extracted from the gel pieces three times with 50% ACN/2.5% FA, and the peptide solution was dried under vacuum. Dry peptides were purified with Pierce C18 Spin Tips (Thermo Fisher, USA).
For DDA-MS, Biognosys-11 iRT peptides (Biognosys, Schlieren, CH) were spiked into peptide samples at a final concentration of 10% prior to MS injection for RT calibration. Peptides were separated on an Ultimate 3000 nanoLC-MS/MS system (Dionex LC-Packings, Thermo Fisher Scientific™, San Jose, USA) equipped with a 15 cm × 75 μm ID fused silica column packed with 1.9 μm 120 Å C18. After injection, 500 ng of peptides was trapped at 6 μL/min on a 20 mm × 75 μm ID trap column packed with 3 μm 100 Å C18 aqua in 0.1% formic acid, 2% ACN. Peptides were separated along a 60-min 3–28% linear LC gradient (buffer A: 2% ACN, 0.1% formic acid (Fisher Scientific), buffer B: 98% ACN, 0.1% formic acid) at a flow rate of 300 nL/min (108 min inject-to-inject in total). Eluting peptides were ionized at a potential of +1.8 kV into a Q-Exactive HF mass spectrometer (Thermo Fisher Scientific™, San Jose, USA). Intact masses were measured at resolution 60,000 (at m/z 200) in the orbitrap using an AGC target value of 3E6 charges and a maximum ion injection time of 80 ms. The top 20 peptide signals (charge states higher than 2+ and lower than 6+) were submitted to MS/MS in the HCD cell (1.6 amu isolation width, 27% normalized collision energy). MS/MS spectra were acquired at resolution 30,000 (at m/z 200) in the orbitrap using an AGC target value of 1E5 charges and a maximum ion injection time of 100 ms. Dynamic exclusion was applied with a repeat count of 1 and an exclusion time of 30 s.
For DIA-MS, Biognosys-11 iRT peptides (Biognosys, Schlieren, CH) were spiked into peptide samples at a final concentration of 10% prior to MS injection for RT calibration. Peptides were separated at 300 nL/min in a 3–28% linear gradient (buffer A: 2% ACN, 0.1% FA, buffer B: 98% ACN, 0.1% FA) in 60 min (75 min inject-to-inject in total) for all samples. Eluting peptides were ionized at a potential of +1.8 kV into a Q-Exactive HF mass spectrometer (Thermo Fisher Scientific, San Jose, USA). A full MS scan covering 390–1010 m/z was acquired at resolution 60,000 (at m/z 200) in the orbitrap using an AGC target value of 3E6 charges and a maximum IT of 80 ms. After the MS scan, 24 MS/MS scans were acquired, each at a resolution of 30,000 at m/z 200, with an AGC target of 1E6 charges, a normalized collision energy of 27%, the default charge state set to 2, and the maximum IT set to auto. The 24 MS/MS scans used isolation windows of three different widths, with the following window centers (m/z): 410, 430, 450, 470, 490, 510, 530, 550, 570, 590, 610, 630, 650, 670, 690, 710, 730, 750, 770, 790, 820, 860, 910, 970.
To analyze DIA data, a DDA library was built with Spectronaut (version 13.5.190902.43655). Library building followed the standard workflow in Spectronaut (Manual for Spectronaut, available on the Biognosys website). Data were searched against the Swiss-Prot mouse database (September 2018). Differential proteins in Cano MS were identified from peptide-spectrum match (PSM) numbers: the PSM number of each captured protein was normalized to that of POLR2A in the same condition, because POLR2A was the immunoprecipitation target. The ratio of normalized PSMs in the Pol III +IAA 1 h condition to those in the untreated condition was then calculated; proteins with a ratio below 0.8 were defined as decreased and those with a ratio above 1.2 as increased. Differential proteins in DIA-MS were calculated with DESeq2 from protein abundance values of two biological replicates, with a cutoff of 1.2-fold change; protein abundance was calculated as the log2 of the sum of the top three unique peptide numbers. Proteins localized outside the nucleus according to UniProt annotation were removed, and for both MS datasets, proteins whose IgG IP or input signal exceeded the sample signal were removed.
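The PSM-based classification described above amounts to a bait-normalized ratio with two thresholds. The following is a minimal sketch: the function name and the dict-based input layout are hypothetical, and proteins missing from either condition or with zero PSMs in the untreated sample are simply skipped, which is an assumption.

```python
def classify_psm_changes(untreated, treated, bait="POLR2A", low=0.8, high=1.2):
    """Classify proteins by the change in bait-normalized PSM counts.

    untreated, treated -- dicts: protein -> PSM count in that condition
    Returns a dict: protein -> "decreased", "increased", or "unchanged".
    """
    calls = {}
    for protein in untreated.keys() & treated.keys():
        if protein == bait:
            continue
        # Normalize each protein to the immunoprecipitation target (bait).
        baseline = untreated[protein] / untreated[bait]
        if baseline == 0:
            continue  # assumption: ratio undefined, protein left unclassified
        ratio = (treated[protein] / treated[bait]) / baseline
        if ratio < low:
            calls[protein] = "decreased"
        elif ratio > high:
            calls[protein] = "increased"
        else:
            calls[protein] = "unchanged"
    return calls
```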
Statistical analysis
Chromatin-associated RNA-Seq, ChIP-Seq, ChIP-MS, ChIP-qPCR, and RT-PCR experiments were each conducted with two biological replicates. P values and choice of statistical tests are reported in the figure legends, with the resulting numbers of observations indicated in the figure panels. Almost all of the described data processing and analysis steps (statistical tests, clustering, plotting, and so on) were performed in Python (v3.7.4) (www.python.org), the statistical computing environment R (v4.0.2) (www.r-project.org), and Microsoft Excel. Custom code used in this study is available upon request.