Optimizing sgRNA structure to improve CRISPR-Cas9 knockout efficiency
© Dang et al. 2015
Received: 9 October 2015
Accepted: 25 November 2015
Published: 15 December 2015
Single-guide RNA (sgRNA) is one of the two key components of the clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 genome-editing system. The current commonly used sgRNA structure has a shortened duplex compared with the native bacterial CRISPR RNA (crRNA)–transactivating crRNA (tracrRNA) duplex and contains a continuous sequence of thymines, which is the pause signal for RNA polymerase III and thus could potentially reduce transcription efficiency.
Here, we systematically investigate the effect of these two elements on knockout efficiency and show that modifying the sgRNA structure by extending the duplex length and mutating the fourth thymine of the continuous sequence of thymines to cytosine or guanine significantly, and sometimes dramatically, improves knockout efficiency in cells. In addition, the optimized sgRNA structure also significantly increases the efficiency of more challenging genome-editing procedures, such as gene deletion, which is important for inducing a loss of function in non-coding genes.
By a systematic investigation of sgRNA structure we find that extending the duplex by approximately 5 bp combined with mutating the continuous sequence of thymines at position 4 to cytosine or guanine significantly increases gene knockout efficiency in CRISPR-Cas9-based genome editing experiments.
The clustered regularly interspaced short palindromic repeats (CRISPR) system has recently been developed into a powerful genome-editing technology [1–6]. This system is composed of two components: the nuclease Cas9 and the guide RNA. After maturation, the native type-II CRISPR guide RNA is composed of a 42-nucleotide CRISPR RNA (crRNA) and an 89-nucleotide transactivating crRNA (tracrRNA) (Figure S1a in Additional file 1). Jinek et al. systematically studied the minimal sequence requirement of the guide RNA in vitro and linked two minimal sequences together to create the short-version single-guide RNA (sgRNA; +48 nucleotides; Figure S1b in Additional file 1). However, a longer version of the sgRNA (+85 nucleotides), which is 37 nucleotides longer at the 5’ end (Figure S1c in Additional file 1), was shown to be much more efficient [7–9] and is now commonly used. This commonly used sgRNA has a shortened duplex compared with the native guide RNA (Figure S1a, c in Additional file 1). In addition, there is a continuous sequence of Ts, which is the pause signal for RNA polymerase III; this signal could potentially reduce transcription efficiency and knockout efficiency. Hsu et al. showed that changing these two elements did not have a significant effect on knockout efficiency and concluded that the sgRNA (+85 nucleotides) without mutations and duplex extension is the most active sgRNA architecture. However, Chen et al. reported that sgRNAs with a mutated continuous sequence of Ts and extended duplex significantly enhance the imaging efficiency of a dCas9 (a mutated version of Cas9 lacking nuclease activity)–green fluorescent protein (GFP) fusion protein in cells, suggesting that changing these two elements enhances dCas9 binding to target sites and might also increase the knockout efficiency of Cas9.
In this study, we systematically investigated the effect of changing these two elements on knockout efficiency and found that, overall, extending the duplex and mutating the continuous sequence of Ts significantly improved knockout efficiency.
Because the continuous sequence of Ts after the guide sequence is the pause signal for RNA polymerase III, the effect of its disruption in sgRNAs has been previously studied [9, 10]. We suspected that mutating the continuous sequence of Ts might also improve knockout efficiency in cells. Accordingly, we mutated this sequence at different positions and determined the knockout efficiency of the mutants (Fig. 1d; Figure S3 in Additional file 1). The knockout efficiency was increased in all mutants, and the mutation at position 4 had the greatest effect.
We previously tested the effect of mutating T→A on knockout efficiency without extending the duplex (Fig. 1c). Next, we also wanted to test the effect of mutating T→A, C, or G while also extending the duplex. Consistent with previous observations, mutations at position 4 generally had the highest knockout efficiency, although mutating T→C at position 1 had a similar effectiveness. In addition, mutating T→C or G generally had higher knockout efficiency than mutating T→A at various positions (Fig. 2b; Figure S5 in Additional file 1). Thus, mutating T→C or G at position 4 yielded the highest knockout efficiency.
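The position-specific substitution described above can be sketched in a few lines of Python. This is only an illustration: the function name mutate_t_run and the toy input sequence are hypothetical, the real sgRNA scaffold sequence is not reproduced here, and the sketch does not address any compensatory change in the paired strand of the duplex.

```python
import re


def mutate_t_run(seq: str, position: int = 4, new_base: str = "C", min_run: int = 4) -> str:
    """Replace one T inside the first run of at least `min_run` Ts.

    `position` is 1-based and counted from the start of the T run,
    matching the "position 4" wording used in the text.
    """
    if new_base not in ("A", "C", "G"):
        raise ValueError("new_base must be A, C, or G")
    match = re.search("T{%d,}" % min_run, seq)
    if match is None:
        raise ValueError("no run of at least %d Ts found" % min_run)
    run_length = match.end() - match.start()
    if not 1 <= position <= run_length:
        raise ValueError("position falls outside the T run")
    index = match.start() + position - 1
    return seq[:index] + new_base + seq[index + 1:]


# Toy sequence for illustration only (hypothetical, not the real scaffold).
toy = "ACGTTTTAGAGC"
print(mutate_t_run(toy, position=4, new_base="C"))  # ACGTTTCAGAGC
print(mutate_t_run(toy, position=4, new_base="G"))  # ACGTTTGAGAGC
```

Counting positions from the start of the T run keeps the code aligned with the numbering used in the figures, so the same call can be reused for positions 1 to 4 and for A, C, or G substitutions.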
To exclude the possibility that the increase in knockout efficiency using the optimized sgRNA structure is limited to TZM-bl cells or the CCR5 gene, we also tested eight sgRNAs targeting the CD4 gene in Jurkat cells. Consistent with the results observed in TZM-bl cells for the CCR5 gene, the optimized sgRNA design also significantly increased the efficiency of knocking out the CD4 gene in the Jurkat cell line (Fig. 3b; Figure S7 in Additional file 1). Thus, the optimized sgRNA structure appears to generally increase knockout efficiency.
The beneficial effect of extending the duplex generally reached a peak at around 5 bp of added length (Fig. 2a). To test whether extending the duplex by 5 bp is superior to extending it by 4 bp or 6 bp, we extended the duplex by 4 bp or 6 bp and compared the resulting knockout efficiencies for the 16 sgRNAs in Fig. 3a. As shown in Figure S8 in Additional file 1, extending the duplex by 4 bp or 6 bp appeared to yield knockout efficiency similar to that of the 5-bp extension in most cases.
Previously, Chen et al. showed that mutating T→A at position 4 in combination with extending the duplex by 5 bp significantly enhanced the imaging efficiency of the dCas9–GFP fusion protein in cells. Our results showed that extending the duplex by 4–6 bp and mutating T→C or G at position 4 significantly increased knockout efficiency. To compare the effect of the two sgRNA designs on knockout efficiency, we randomly selected ten sgRNAs targeting CCR5 and compared their knockout efficiencies with different mutations. As shown in Fig. 3c, all of the T→C and most (nine out of ten) of the T→G mutations had significantly higher knockout efficiency than the T→A mutation. It is noteworthy that, although in most cases the T→C mutation had a level of knockout efficiency similar to that of the T→G mutation, it had a significantly higher knockout efficiency in sp11 (+11 %, P = 0.006) and sp19 sgRNAs (+6 %, P = 0.026) (Fig. 3c; Figure S9 in Additional file 1), suggesting that the T→C mutation might be the best choice.
With the optimized structure, most sgRNAs showed high knockout efficiency. Out of a total of 24 sgRNAs with an optimized sgRNA structure tested, 18 showed >50 % knockout efficiency. By contrast, only four sgRNAs showed >50 % knockout efficiency using the original sgRNA structure (Fig. 3a, b). This optimized sgRNA template not only reduces concerns that knockout experiments might not work due to low sgRNA functionality, but also significantly increases the efficiency of more challenging genome-editing procedures, such as gene deletion.
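As a quick arithmetic check of the counts quoted above, the proportions can be computed directly. The helper name percent_above_threshold is hypothetical; the counts (18 of 24 and 4 of 24) come from the text.

```python
def percent_above_threshold(n_above: int, n_total: int) -> float:
    """Percentage of sgRNAs whose knockout efficiency exceeded the threshold."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_above / n_total


optimized = percent_above_threshold(18, 24)  # optimized sgRNA structure
original = percent_above_threshold(4, 24)    # original sgRNA structure
print(f"optimized: {optimized:.1f} % of sgRNAs above 50 % knockout")  # 75.0 %
print(f"original:  {original:.1f} % of sgRNAs above 50 % knockout")   # 16.7 %
```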
Previously, Hsu et al. showed that extending the duplex by 10 bp in combination with mutating the continuous sequence of Ts did not increase knockout efficiency. Our results show that extending the duplex can significantly increase knockout efficiency, but after reaching a peak at around 5 bp, the effect declines, which might explain this discrepancy. Our conclusion is supported by Chen et al.’s study, in which they showed that extending the duplex and mutating the continuous sequence of Ts significantly enhances the imaging efficiency of the dCas9–GFP fusion protein in cells. The effects of these two modifications appear to be different. Mutating the continuous sequence of Ts significantly increased sgRNA production (Fig. 5b), which is likely to be the result of increased transcription efficiency due to the disrupted pause signal. The results with in vitro transcribed sgRNAs suggest that extending the duplex by itself also increases Cas9 functionality because of the structural change (Fig. 5d, e), since any effect of the RNA level was excluded in this experiment. When sgRNA is expressed inside cells, both effects contribute to increased functionality. It is possible that the modified sgRNA structure might enhance binding to Cas9 or increase its stability. Further work is needed to determine how exactly sgRNA structure increases functionality.
Extending the duplex by ~5 bp combined with mutating the continuous sequence of Ts at position 4 to C or G significantly increased CRISPR-Cas9 gene knockout efficiency.
The TZM-bl cell line (catalog #8129) was obtained from the NIH AIDS Reagent Program and cultured in Dulbecco’s modified Eagle’s medium (DMEM; Life Technologies) with high glucose. The Jurkat (E6-1) cell line (catalog #177) was also obtained from the NIH AIDS Reagent Program and cultured in RPMI medium (Life Technologies). Both media were supplemented with 10 % fetal bovine serum (Life Technologies) and penicillin/streptomycin/L-glutamine (Life Technologies). All cells were maintained at 37 °C and 5 % CO2 in a humidified incubator.
Anti-CCR5 antibody (APC-conjugated, catalog #550856, clone 3A9) was purchased from BD Biosciences. Anti-CD4 antibody (APC-conjugated, catalog #317416, clone OKT4) was purchased from Biolegend. Anti-CD4 antibody (FITC-conjugated, catalog #35-0049-T100, clone RPA-T4) was purchased from TONBO Bioscience.
spCas9 protein was custom made (Novoprotein Scientific) and stored at a concentration of 1 mg/ml at −80 °C.
sgRNA fragments were inserted into pLB vectors (Addgene plasmid #11619) at the HpaI and XhoI sites. Cloned pLB-sgRNA constructs were sequenced to confirm that the inserted sequence was correct. The oligo sequences are listed in Additional file 3. The sgRNAs started with either A or G, which is the preferred initiation nucleotide for the U6 promoter. Plasmids were purified with the EZNA Endo-free Mini-prep kit (Omega Biotech). pSpCas9(BB) (pX330) (catalog #42230) and lentiCas9-Blast (catalog #52962) were purchased from Addgene. pX261-dU6 was constructed from pX261-U6-DR-hEmx1-DR-Cbh-NLS-hSpCas9-NLS-H1-shorttracr-PGK-puro (Addgene plasmid #42337) by deleting a 398-bp fragment by NdeI digestion, followed by Klenow reaction and blunt-end ligation to delete part of the U6 expression cassette.
Determining knockout efficiency
TZM-bl cells (9 × 10⁴ per well) were seeded into 24-well plates overnight before transfection and washed twice with DPBS, and 300 μl of pre-warmed Opti-MEM I medium was added to each well. pLB-sgRNA plasmids (0.5 μg at a concentration of 0.1 μg/μl) were mixed with 0.5 μg of the Cas9 plasmid pX330 pre-mixed in 100 μl of Opti-MEM I medium. Two microliters of Lipofectamine 2000 transfection agent in 100 μl of Opti-MEM I medium per well were added to the diluted plasmids, followed by a 20-minute incubation. The complex was added to the cells, and the medium was changed to complete medium after a 6-hour incubation at 37 °C in 5 % CO2. Cells were collected for flow cytometry analysis 48 hours after transfection.
Jurkat cells were transfected with 0.5 μg of the pX330 plasmid and 0.5 μg of pLB-sgRNA constructs using the Neon 10-μl transfection kit (Life Technologies), according to the manufacturer’s instructions, and 2 × 10⁵ cells were used per 10-μl tip. Parameters were set to 1325 V, 10 ms, and three pulses. Cells were collected for flow cytometry analysis 72 hours after transfection.
Cells were stained with either anti-CCR5 antibody for TZM-bl cells or anti-CD4 antibody for Jurkat cells, followed by analysis with a FACSCanto II cell analyzer (BD Bioscience). Only GFP-positive cells (GFP is a marker expressed by the pLB vector, serving as a positive control for transfection) were analyzed for knockout efficiency.
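The gating logic described here (restrict to GFP-positive events, then score loss of the surface marker) might be sketched as follows. The function name, the per-event tuple layout, and the example events are hypothetical, and the sketch assumes that knockout efficiency is simply the percentage of marker-negative cells among GFP-positive cells; any normalization against control samples is not specified in this section and is not modeled.

```python
from typing import Iterable, Tuple


def knockout_efficiency(events: Iterable[Tuple[bool, bool]]) -> float:
    """Percentage of marker-negative events among GFP-positive events.

    Each event is a (gfp_positive, marker_positive) pair, where the marker
    is CCR5 or CD4 staining depending on the cell line.
    """
    gated = [marker_positive for gfp_positive, marker_positive in events if gfp_positive]
    if not gated:
        raise ValueError("no GFP-positive events to analyze")
    marker_negative = sum(1 for marker_positive in gated if not marker_positive)
    return 100.0 * marker_negative / len(gated)


# Hypothetical events: four GFP-positive cells, three of which lost the marker.
example_events = [
    (True, False),
    (True, False),
    (True, True),
    (False, True),   # untransfected cell, excluded by the GFP gate
    (True, False),
    (False, False),  # GFP-negative, excluded even though marker-negative
]
print(knockout_efficiency(example_events))  # 75.0
```

Gating on GFP first keeps untransfected cells from diluting the estimate, which is the stated reason for analyzing only GFP-positive cells.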
Determining the sgRNA expression level
TZM-bl cells (2.5 × 10⁵ per well) were seeded into six-well plates overnight before transfection. Cells were transfected with 1.5 μg of pLB-sgRNA plasmids and 1.5 μg of the Cas9 plasmid pX330 with Lipofectamine 2000 (Life Technologies, catalog #11668019), according to the manufacturer’s instructions. Cells were collected 48 hours after transfection. GFP-positive cells were sorted with a FACSAria II cell sorter (BD Bioscience), followed by small RNA extraction with the miRNeasy Mini kit (Qiagen, catalog #217004). One microgram of extracted RNA was reverse transcribed with SuperScript® III Reverse Transcriptase (Life Technology, catalog #18080-051), according to the manufacturer’s instructions. The cDNAs were quantified with SYBR Green qPCR MasterMix (ABI, catalog #4309155) with primers (forward 5’-GTGTTCATCTTTGGTTTTGTGTTT-3’ and reverse 5’-CGGTGCCACTTTTTCAAGTT-3’). U6B was used as the internal control.
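The text states only that U6B served as the internal control and does not say how relative expression was calculated. One common way to normalize qPCR data against an internal control is the 2^(−ΔΔCt) method, sketched below as an assumption rather than as the authors' procedure. The function name and all Ct values are hypothetical, and the calculation assumes roughly 100 % amplification efficiency for both primer sets.

```python
def relative_expression(
    ct_target_sample: float,
    ct_reference_sample: float,
    ct_target_control: float,
    ct_reference_control: float,
) -> float:
    """Fold change of the target in a sample versus a control condition.

    Uses the 2^(-ddCt) method: each target Ct is first normalized to the
    internal reference (here, U6B), then the sample is compared to the control.
    """
    delta_sample = ct_target_sample - ct_reference_sample
    delta_control = ct_target_control - ct_reference_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: a mutated sgRNA versus the original sgRNA.
fold = relative_expression(
    ct_target_sample=22.0,     # sgRNA Ct, mutated structure
    ct_reference_sample=18.0,  # U6B Ct, same sample
    ct_target_control=24.0,    # sgRNA Ct, original structure
    ct_reference_control=18.0, # U6B Ct, same sample
)
print(fold)  # 4.0
```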
Evaluating target site modification at the DNA level by next-generation sequencing
TZM-bl cells were transfected with Lipofectamine 2000 in six-well plates, according to the manufacturer’s instructions. Cells were collected 48 hours after transfection. GFP-positive cells were sorted using a FACSAria II cell sorter (BD Bioscience), followed by genomic DNA extraction with the QIAamp DNA Blood Mini kit. CCR5 gene fragments were amplified with the primers CCR5-DS-F (5’-ACACTCTTTCCCTACACGACGCTCTTCCGATCTTCTACCTGCTCAACCTGGCC-3’) and CCR5-DS-R (5’-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCAAGTCCCACTGGGCGGC-3’). The resulting PCR products were amplified for a second round of PCR with individual index primers. The amplicons were run on a 2.5 % agarose gel and purified with the QIAquick Gel Extraction kit (QIAGEN, catalog #28704). Equal amounts of amplicons were mixed and sequenced with a MiSeq sequencer (Illumina).
Evaluating CCR5 disruption efficiency with lentiviral delivery of sgRNA
Lenti-Cas9-Blast and the Viral Power packaging mix (Life Technology, catalog #K4975-00) were co-transfected into 293T cells with the calcium phosphate transfection protocol. Supernatant was collected and filtered through a 0.45-μm filter before being used for infection of TZM-bl cells and JLTRG-R5 cells (NIH AIDS Reagent Program #11586). Cells (2 × 10⁶) were seeded into a 10-cm dish. After overnight culture, cells were infected with 1 ml viral supernatant with 5 ng/ml polybrene for 3 hours. Forty-eight hours after infection, the cells were treated with 10 μg/ml blasticidin (Life Technology, catalog #R210-01) for 3 days. The surviving cells were labeled as TZM-Cas9 or JLTRG-R5-Cas9 cells.
pLB-sgRNAs were packaged into lentivirus in a manner similar to that used for Lenti-Cas9-Blast. TZM-Cas9 or JLTRG-R5-Cas9 cells (1 × 10⁵) were seeded into 24-well plates and infected at MOI = 0.5. A portion of the cells was collected at different time points and analyzed by FACS to determine the CCR5 disruption rate. The rate of occurrence of GFP-positive cells was ~30 % for TZM-Cas9 cells or ~10 % for JLTRG-R5-Cas9 cells.
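For orientation, the fraction of cells expected to receive at least one viral particle at a given MOI can be estimated with a Poisson model. This is a textbook idealization and an assumption here, not a calculation reported in the study; the function names are hypothetical. Under this model, MOI = 0.5 predicts about 39 % infected cells, which is higher than the GFP-positive rates reported above.

```python
import math


def expected_infected_fraction(moi: float) -> float:
    """Fraction of cells with at least one integration, assuming Poisson statistics."""
    if moi < 0:
        raise ValueError("moi must be non-negative")
    return 1.0 - math.exp(-moi)


def single_copy_fraction_among_infected(moi: float) -> float:
    """Among infected cells, the fraction carrying exactly one copy (Poisson model)."""
    infected = expected_infected_fraction(moi)
    if infected == 0:
        raise ValueError("no infected cells expected at moi = 0")
    exactly_one = moi * math.exp(-moi)
    return exactly_one / infected


print(round(100 * expected_infected_fraction(0.5), 1))           # 39.3
print(round(100 * single_copy_fraction_among_infected(0.5), 1))  # 77.1
```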
Knockout of CD4 in primary CD4+ T cells with Cas9 preloaded with in vitro transcribed sgRNA
CD4+ T cells were isolated from peripheral blood mononuclear cells with StemSep™ Human CD4+ T Cell Enrichment Kit (StemCell Technologies, catalog #14052), and activated with Dynabeads® Human T-Activator CD3/CD28 (Life Technology, catalog #11131D) for 5 days in the presence of 20 U/ml IL-2 (NIH AIDS Reagents Program, catalog #136), 10 % fetal calf serum, and 1× penicillin-streptomycin-glutamine solution (Life Technology, catalog #10378-016).
sgRNAs were transcribed with the HiScribe T7 High Yield RNA Synthesis kit (NEB) according to the manufacturer’s instructions, followed by purification with the RNeasy Mini kit (Qiagen, catalog #217004). Before each use, sgRNAs were heated to 95 °C for 3 minutes in a PCR tube and immediately transferred to a water/ice bath for 2 minutes to obtain pure monomers.
Activated primary CD4+ T cells were electroporated using the Neon transfection system (100 μl tip, Life Technologies, catalog #MPK10096) with 10 μg of spCas9 protein that was preloaded with 300 pmol sgRNA (mixed and incubated at room temperature for 10 minutes). Cells (1 × 10⁶) resuspended in 100 μl R buffer were mixed with the protein:RNA mix, followed by Neon electroporation (1500 V, 10 ms, three pulses), according to the manufacturer’s instructions. After 48 hours, the cells were stained with CD4 antibody and subjected to FACS analysis.
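To put the preloading amounts on a common molar scale, the Cas9 mass can be converted to picomoles. The molecular weight used below (about 160 kDa for spCas9) is an assumed round value that is not given in the text, and the helper name is hypothetical; the actual protein preparation may differ depending on tags.

```python
ASSUMED_SPCAS9_MW_DA = 160_000  # assumption: approximate molecular weight in g/mol


def micrograms_to_pmol(mass_ug: float, molecular_weight_da: float) -> float:
    """Convert a protein mass in micrograms to picomoles.

    1 ug = 1e-6 g and 1 mol = 1e12 pmol, so the combined factor is 1e6.
    """
    if molecular_weight_da <= 0:
        raise ValueError("molecular weight must be positive")
    return mass_ug * 1e6 / molecular_weight_da


cas9_pmol = micrograms_to_pmol(10, ASSUMED_SPCAS9_MW_DA)
sgrna_pmol = 300  # from the text
print(cas9_pmol)                         # 62.5
print(round(sgrna_pmol / cas9_pmol, 1))  # 4.8 (approximate sgRNA:Cas9 molar ratio)
```

Under this assumed molecular weight, the sgRNA is in roughly fivefold molar excess over Cas9.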
TZM-Cas9 cells were electroporated using the Neon transfection system (10 μl tip; Life Technology catalog #MPK1096) with 30 pmol sgRNA. Cells (5 × 10⁴) were resuspended in 10 μl R buffer and mixed with RNA, followed by Neon electroporation (1005 V, 35 ms, two pulses) according to the manufacturer’s instructions. After 48 hours, the cells were stained with CD4 antibody and subjected to FACS analysis.
Gene deletion assay
TZM-bl cells were co-transfected with sgRNA pairs (0.25 μg each) along with 0.5 μg of the Cas9-expressing plasmid pX261-dU6. The sgRNA pairs were as follows: pair 1 was CCR5 sp7 plus sp14; pair 2 was CCR5 sp7 plus sp18; pair 3 was CCR5 sp10 plus sp14; and pair 4 was CCR5 sp10 plus sp18. The sgRNA sequences are provided in Additional file 3. Twenty-four hours after transfection, the cells were treated with 0.8 μg/ml puromycin for 48 hours, followed by recovery in medium without puromycin for 5 days. Genomic DNA was extracted from cells with the GenElute™ Mammalian Genomic DNA Miniprep kit (Sigma-Aldrich, catalog #G1N70). CCR5 gene fragments were amplified from 70 μg of genomic DNA using Premix Ex Taq (Takara, catalog #RR003A) with forward primer 5’-ATGGATTATCAAGTGTCAAGTCCAA-3’ and reverse primer 5’-AGGGAGCCCAGAAGAGAAAATAAAC-3’ for the CCR5 gene. The PCR was stopped at different cycle numbers to check the amount of amplicon and ensure that the amplification was in the exponential phase. PCR amplicons were analyzed on a 1 % agarose gel.
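The four sgRNA pairs are the full set of combinations of two guides (sp7, sp10) with two others (sp14, sp18), which can be enumerated as shown below. The second function illustrates how the expected amplicon length after a deletion could be estimated; the amplicon length and cut coordinates in the example are hypothetical placeholders, since the real positions are in Additional file 3, and the estimate assumes clean excision of the sequence between the two cut sites.

```python
from itertools import product

first_guides = ["sp7", "sp10"]
second_guides = ["sp14", "sp18"]

# Pairs 1 to 4, in the order listed in the text.
pairs = list(product(first_guides, second_guides))
for number, (guide_a, guide_b) in enumerate(pairs, start=1):
    print(f"pair {number}: CCR5 {guide_a} plus {guide_b}")


def deletion_amplicon_length(full_length: int, cut_a: int, cut_b: int) -> int:
    """Expected amplicon length if the segment between two cut sites is removed.

    Coordinates are positions within the undeleted amplicon.
    """
    left, right = sorted((cut_a, cut_b))
    if not 0 <= left <= right <= full_length:
        raise ValueError("cut sites must lie within the amplicon")
    return full_length - (right - left)


# Hypothetical numbers for illustration only.
print(deletion_amplicon_length(full_length=1000, cut_a=300, cut_b=700))  # 600
```

A shorter band of the predicted size on the agarose gel, alongside the full-length band, is the readout this kind of estimate is meant to support.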
Student’s t-test (two-tailed, assuming equal variances for all experimental data sets) was used to compare two groups of independent samples.
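A minimal, standard-library sketch of the two-tailed, equal-variance Student's t-test is given below. It is an illustration rather than the software used in the study: the function names and the example values are hypothetical, and the P value is obtained by numerically integrating the t density, which is adequate for small samples but not intended for very large degrees of freedom.

```python
import math
from statistics import mean, variance


def t_pdf(x: float, df: int) -> float:
    """Probability density of Student's t distribution with `df` degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t: float, df: int, steps: int = 20000) -> float:
    """Two-tailed P value via Simpson integration of the density from 0 to |t|."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


def students_t_test(a, b):
    """Return (t, df, p) for two independent samples, assuming equal variances."""
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    if pooled == 0:
        raise ValueError("pooled variance is zero")
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df, two_tailed_p(t, df)


# Hypothetical knockout efficiencies (%) from triplicate wells.
original_structure = [20.0, 24.0, 22.0]
optimized_structure = [55.0, 60.0, 58.0]
t_stat, dof, p_value = students_t_test(optimized_structure, original_structure)
print(f"t = {t_stat:.2f}, df = {dof}, P = {p_value:.4g}")
```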
The data set supporting the results of Fig. 1b in this article is available in the Gene Expression Omnibus with accession code GSE74766 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE74766).
CRISPR: Clustered regularly interspaced short palindromic repeat
FACS: Fluorescence-activated cell sorting
GFP: Green fluorescent protein
MOI: Multiplicity of infection
PCR: Polymerase chain reaction
We thank Dr. Feng Zhang and Dr. Stephan Kissler for sharing their plasmids. We thank Dr. Manjunath Swamy for reading the manuscript and for suggestions.
This work was supported partially by NIH/NIAID grants 1R56AI114357 and 1R03AI114344 to H.W. and by grant 1R21HL116268 to P.S. and H.W.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Mali P, Yang L, Esvelt KM, Aach J, Guell M, DiCarlo JE, et al. RNA-guided human genome engineering via Cas9. Science. 2013;339:823–6.
- Jinek M, East A, Cheng A, Lin S, Ma E, Doudna J. RNA-programmed genome editing in human cells. Elife. 2013;2:e00471.
- Hwang WY, Fu Y, Reyon D, Maeder ML, Tsai SQ, Sander JD, et al. Efficient genome editing in zebrafish using a CRISPR-Cas system. Nat Biotechnol. 2013;31:227–9.
- Cong L, Ran FA, Cox D, Lin S, Barretto R, Habib N, et al. Multiplex genome engineering using CRISPR/Cas systems. Science. 2013;339:819–23.
- Cho SW, Kim S, Kim JM, Kim JS. Targeted genome engineering in human cells with the Cas9 RNA-guided endonuclease. Nat Biotechnol. 2013;31:230–2.
- Jinek M, Chylinski K, Fonfara I, Hauer M, Doudna JA, Charpentier E. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science. 2012;337:816–21.
- Nishimasu H, Ran FA, Hsu PD, Konermann S, Shehata SI, Dohmae N, et al. Crystal structure of Cas9 in complex with guide RNA and target DNA. Cell. 2014;156:935–49.
- Anders C, Niewoehner O, Duerst A, Jinek M. Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease. Nature. 2014;513:569–73.
- Hsu PD, Scott DA, Weinstein JA, Ran FA, Konermann S, Agarwala V, et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat Biotechnol. 2013;31(9):827–32.
- Chen B, Gilbert LA, Cimini BA, Schnitzbauer J, Zhang W, Li GW, et al. Dynamic imaging of genomic loci in living human cells by an optimized CRISPR/Cas system. Cell. 2013;155:1479–91.
- Nielsen S, Yuzenkova Y, Zenkin N. Mechanism of eukaryotic RNA polymerase III transcription termination. Science. 2013;340:1577–80.
- Ma H, Zhang J, Wu H. Designing Ago2-specific siRNA/shRNA to avoid competition with endogenous miRNAs. Mol Ther Nucleic Acids. 2014;3:e176.
- Ma H, Dang Y, Wu Y, Jia G, Anaya E, Zhang J, et al. A CRISPR-based screen identifies genes essential for West-Nile-virus-induced cell death. Cell Rep. 2015;12:673–83.
- Zhou Y, Zhu S, Cai C, Yuan P, Li C, Huang Y, et al. High-throughput screening of a CRISPR/Cas9 library for functional genomics in human cells. Nature. 2014;509:487–91.
- Wang T, Wei JJ, Sabatini DM, Lander ES. Genetic screens in human cells using the CRISPR-Cas9 system. Science. 2014;343:80–4.
- Shalem O, Sanjana NE, Hartenian E, Shi X, Scott DA, Mikkelsen TS, et al. Genome-scale CRISPR-Cas9 knockout screening in human cells. Science. 2014;343:84–7.
- Sanjana NE, Shalem O, Zhang F. Improved vectors and genome-wide libraries for CRISPR screening. Nat Methods. 2014;11:783–4.
- Konermann S, Brigham MD, Trevino AE, Joung J, Abudayyeh OO, Barcena C, et al. Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex. Nature. 2014;517(7536):583–8.
- Koike-Yusa H, Li Y, Tan EP, Velasco-Herrera Mdel C, Yusa K. Genome-wide recessive genetic screening in mammalian cells with a lentiviral CRISPR-guide RNA library. Nat Biotechnol. 2014;32:267–73.
- Gilbert LA, Horlbeck MA, Adamson B, Villalta JE, Chen Y, Whitehead EH, et al. Genome-scale CRISPR-mediated control of gene repression and activation. Cell. 2014;159:647–61.
- Kissler S, Stern P, Takahashi K, Hunter K, Peterson LB, Wicker LS. In vivo RNA interference demonstrates a role for Nramp1 in modifying susceptibility to type 1 diabetes. Nat Genet. 2006;38:479–83.
- Ma H, Wu Y, Dang Y, Choi JG, Zhang J, Wu H. Pol III promoters to express small RNAs: delineation of transcription initiation. Mol Ther Nucleic Acids. 2014;3:e161.