The GoldenGate/MoClo system was used to construct binary vectors for plant transformation. Level 1 plasmids bearing 35S promoter-driven hptII (selectable marker) and plant codon-optimized cas9-GFP genes were described in a previous study. T7 promoter-driven sgRNA cassettes for in vitro expression were chemically synthesized (Thermo Fisher) and blunt-end cloned into a pJET1.2 vector (Thermo Fisher). The U6 promoter-driven sgRNA1 gene was chemically synthesized and directly cloned into a GoldenGate level 1 vector. Level 1 vectors carrying the hptII, cas9-GFP, and sgRNA1 genes were cloned into a level 2 vector via a one-pot reaction to produce binary vectors containing the hptII and cas9 expression cassettes as well as vectors containing the hptII, cas9, and sgRNA1 expression cassettes.
sgRNAs targeting the ACMV DNA A were designed using two parameters: predicted cleavage efficiency and potential off-targets. Only sgRNAs with at least two seed sequence mismatches against the entire 750-Mb cassava genome were selected. First, high-efficiency sgRNAs were obtained using published software from the Broad Institute, MIT. The mismatch search included screening potential sgRNA targets in the cassava reference genome v6.1 (https://phytozome.jgi.doe.gov) with either the canonical NGG or the non-canonical NAG protospacer adjacent motif (PAM) for SpCas9. This yielded only 10 sgRNA designs (out of a total of 305 possible sgRNAs) meeting the off-target screening criteria, of which we selected the best six based on efficiency scores and target locations. These six sgRNAs (Additional file 1: Figure S1a) were further tested for effectiveness using an in vitro cleavage assay with purified Cas9/sgRNA complexes to cleave full-length ACMV amplicons (Additional file 1: Figure S1b). We selected sgRNA1 for stable expression in transgenic cassava because it had the best predicted efficiency and performed well in the in vitro cleavage assay.
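The off-target criterion described above can be illustrated with a minimal Python sketch. It is not the software used in this study: the function names are hypothetical, the 12-nt PAM-proximal seed length is an assumption (the seed length is not stated here), and the brute-force scan is only practical for short test sequences, not for a 750-Mb genome.

```python
SEED_LEN = 12  # ASSUMPTION: PAM-proximal seed length; not stated in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def iter_sites(seq, pam_suffixes):
    """Yield (strand, start, protospacer) for each 20-nt site followed by
    a 3-nt PAM whose last two bases are in pam_suffixes (e.g. 'GG' for NGG)."""
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for i in range(len(s) - 22):
            if s[i + 21:i + 23] in pam_suffixes:
                yield strand, i, s[i:i + 20]


def seed_mismatches(guide, site):
    """Count mismatches in the PAM-proximal seed of two 20-nt sequences."""
    return sum(a != b for a, b in zip(guide[-SEED_LEN:], site[-SEED_LEN:]))


def passes_off_target_screen(guide, genome, min_seed_mismatches=2):
    """True if every NGG/NAG site in genome differs from the guide by at
    least min_seed_mismatches within the seed region."""
    for _, _, site in iter_sites(genome, ("GG", "AG")):
        if seed_mismatches(guide, site) < min_seed_mismatches:
            return False
    return True
```

Candidate guides on the viral target would be enumerated with `iter_sites(target, ("GG",))`, and each candidate would then be screened against the host genome with `passes_off_target_screen`. A genome-scale search would need an indexed aligner rather than this linear scan.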
In vitro transcription
T7::sgRNA expression cassettes were amplified from their respective pJET1.2 host vectors using primers listed in Additional file 1: Table S3. One microgram of gel-purified linear DNA was used as a template in an in vitro transcription reaction using 100 U of T7 RNA Polymerase (EP0111, Thermo Fisher), 0.1 U inorganic pyrophosphatase (EF0221, Thermo Fisher), 40 U RiboLock RNase Inhibitor (EO0381, Thermo Fisher), and 10 mM NTP mix (R0481, Thermo Fisher) in 1× T7 RNA Polymerase buffer for 16 h at 37 °C. Transcribed sgRNAs were purified by phenol-chloroform extraction.
In vitro Cas9 cleavage assay
The in vitro Cas9 cleavage assay was performed according to a previously published protocol. ACMV templates for cleavage were purified via PCR amplification from total DNA extracts of infected WT plant tissue using primers listed in Additional file 1: Table S3. The purified GFP-tagged Cas9 endonuclease was kindly provided by Prof Martin Jinek (University of Zurich). Two hundred and fifty nanograms of ACMV template DNA was digested with 1 μM each of purified sgRNA and Cas9 protein. Digestion reactions were stopped at 15, 60, and 105 min. The in vitro cleavage assay to test resistance to the Cas9-sgRNA1 complex was similarly performed using a 409-bp synthetic dsDNA template (Additional file 1: Data S2) designed from the ACMV-AC2 H54Q sequence.
Plant transformation and growth conditions
We generated cassava transgenic lines expressing the Cas9 protein together with sgRNA1 (referred to as Cas9+sgRNA1 lines) as well as control lines expressing only the Cas9 protein (referred to as Cas9 lines) using an established Agrobacterium tumefaciens-mediated transformation protocol. Twenty transgenic cassava lines were characterized for T-DNA copy number (Additional file 1: Figure S7) and expression of the full-length, 180-kDa GFP-tagged, plant codon-optimized Cas9 protein was verified via Western blotting (Additional file 1: Figure S8). The selected lines expressed both the Cas9 and sgRNA transgenes at varying levels (Additional file 1: Figure S1c,d).
In vitro transformed cassava plantlets were grown at 28 °C in a 16-h photoperiod and sub-cultured at 4-week intervals in CBM media (1× Murashige-Skoog medium, 2% sucrose, 2 μM copper sulfate, 0.3% gelrite, pH 5.8). Thirty-day-old plantlets were transferred to soil and grown in glasshouse conditions (14 h photoperiod, 60% relative humidity, day/night temperatures, 26 °C/17 °C).
Southern blotting for T-DNA integration analysis
Total DNA was extracted from leaves harvested from 4-week-old in vitro grown plants using a modified CTAB (cetyl trimethylammonium bromide) protocol. The same leaf samples were used for Western blots and reverse transcription-quantitative PCR (RT-qPCR) analysis. Ten micrograms of total DNA was restriction digested with 20 U of HindIII (Thermo Fisher) in an overnight reaction. The digested DNA was separated on a 0.8% agarose-TAE gel and transferred overnight onto a positively charged nylon membrane (Roti-Nylon Plus, Carl Roth). A 700-bp probe against the hptII gene was PCR amplified from the binary vector and labeled with [α-32P] dCTP using the Prime-A-Gene kit (Promega). The nylon membrane was treated with PerfectHyb Plus Hybridization Buffer (Sigma-Aldrich) for 30 min followed by hybridization together with the radio-labeled probe. Blots were developed using a Typhoon FLA 7500 imaging system (GE Healthcare Life Sciences).
Western blotting
Crude protein extracts were prepared by incubating ground leaf tissue in 5% SDS, 125 mM Tris-HCl (pH 6.8), 15% glycerol buffer with 1× EDTA-free Complete Protease Inhibitor (Roche) for 20 min at room temperature. Samples were centrifuged at 4 °C for 10 min to clear debris. Fifty micrograms of total protein extract was electrophoresed (after a 1:1 dilution with a bromophenol blue solution) on a pre-cast Novex 4–20% Tris-Glycine Midi gel (Thermo Fisher) and transferred to a PVDF membrane (Carl Roth) using a TransBlot Cell (Bio-Rad) system according to the manufacturer’s instructions. Membranes were blocked using 5% milk in 1× Tris-buffered saline, 0.1% Tween 20 (1× TBS-T) solution for 1 h at room temperature. The blocked membrane was incubated in a primary blotting solution (1× TBS-T) with SpCas9 monoclonal (mouse) antibody (Diagenode) at a 1:2500 dilution and a polyclonal Actin (rabbit) antibody (Agrisera) at a 1:1000 dilution for 1 h at room temperature. After three 5-min washes with 1× TBS-T, the membrane was incubated with a secondary blotting solution containing IRDye 800CW Goat anti-Mouse IgG and IRDye 680RD Goat anti-Rabbit IgG antibodies at a 1:5000 dilution each. Blots were imaged using the LICOR Odyssey CLx fluorescence imaging system. A PageRuler Prestained protein ladder (Thermo Fisher) was used for size estimation.
Reverse transcription-quantitative PCR
Total RNA extract (1.5 μg) from leaf samples was treated with DNase I and reverse transcribed with random hexamer primers using the RevertAid First Strand cDNA Synthesis Kit (Thermo Fisher) according to the manufacturer’s instructions. Quantitative PCR was carried out with Fast SYBR Green dye for 40 cycles on a LightCycler 480 instrument (Roche). Relative quantitation was performed against the MePP2A reference gene using the primers listed in Additional file 1: Table S3. Results of transgene-expression quantitation using RT-qPCR are presented in Additional file 1: Figure S1c, d.
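The exact quantitation model is not specified here. One common way to express a target relative to a reference gene such as MePP2A is the delta-Ct calculation sketched below. The function name is hypothetical, the Ct values in the usage note are made up for illustration, and the default amplification efficiency of 2.0 (perfect doubling per cycle) is an assumption.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference, efficiency=2.0):
    """Relative abundance of a target versus a reference gene.

    ct_target and ct_reference are lists of Ct values from technical
    replicates of the same sample. ASSUMPTION: both amplicons amplify
    with the same efficiency (2.0 means perfect doubling per cycle).
    """
    delta_ct = mean(ct_target) - mean(ct_reference)
    return efficiency ** (-delta_ct)
```

For example, `relative_expression([24.0, 24.0, 24.0], [22.0, 22.0, 22.0])` gives 0.25, meaning the target is detected two cycles later than the reference and is therefore estimated at one quarter of its abundance.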
Virus inoculation and symptom monitoring in cassava
Agrobacterium tumefaciens (strain LBA4404) cells carrying infectious clones of ACMV-NOg DNA A and DNA B [9, 14] were cultured for 48 h at 28 °C in 5 mL YEB broth (5 g/L tryptone, 1 g/L yeast extract, 5 g/L nutrient broth, 5 g/L sucrose, 2 mM MgSO4) supplemented with antibiotics (25 mg/L rifampicin, 100 mg/L streptomycin, and 50 mg/L kanamycin). Two milliliters of the starter cultures was then individually added to 200 mL YEB medium with antibiotics and incubated overnight at 28 °C to an OD600 of 0.6–1. Cells were pelleted by centrifuging at 5000×g for 10 min, then re-suspended in 5 mL inoculation medium (10 mM MES pH 5.6, 10 mM MgCl2, 0.25 mM acetosyringone) and incubated for 2 h at room temperature with shaking. DNA A and DNA B cultures at an OD600 of 2.0 were mixed in equal proportions prior to inoculation.
For inoculation, all leaves except the top leaf were removed from 6-week-old cassava plantlets (65 plants in total). The stem and axillary buds were pricked, and the plantlets were dipped in Agrobacterium solution for 10 s and then kept covered in a Plexiglas box for 3 days. A minimum of five inoculated plants per line were monitored for symptom incidence and severity over a period of 8 weeks. Symptoms in the top three leaves were scored weekly from 3 to 8 weeks post inoculation (wpi) on a scale of 0–3 as depicted in Additional file 1: Figure S7. The first emerging leaf from each plant was harvested at 3 wpi. The top three leaves were harvested from each plant at 8 wpi.
Virus cloning and inoculation in N. benthamiana
A 582-bp fragment of the ACMV-AC2 H54Q virus flanked by naturally occurring EcoRI and AflII restriction sites was chemically synthesized (Thermo Fisher). The original agroclone of the wild-type virus was digested with EcoRI and AflII, and the newly synthesized and digested fragment was ligated in place, without any scarring. The clone was verified by Sanger sequencing and electroporated into electrocompetent A. tumefaciens (strain LBA4404) cells. Agrobacterium cultures were grown and prepared for inoculation as described above (with 150 mM acetosyringone). Cells were infiltrated into a single fully expanded leaf using a Softject 1-mL syringe. Six N. benthamiana plants per test construct (and three plants per control construct) were infiltrated and monitored over 4 weeks of growth at 22 °C, 12 h light. After 4 weeks, the four top-most fully expanded leaves from each plant (not including the infiltrated leaf) were harvested for DNA extraction and virus sequencing.
Quantitation of virus titres
Virus quantitation was performed by RT-qPCR on 10 ng of total DNA extracts using ACMV DNA A-specific primers and MePP2A genomic DNA reference primers as listed in Additional file 1: Table S2. Symptomatic leaves were harvested in three separate pools for each plant line (in the absence of symptomatic leaves, asymptomatic leaves were used). Three technical replicates were used per pooled sample. Results are shown in Fig. 2b, c.
Single molecule real-time sequencing of viral amplicons
Full-length viral amplicons from selected plant lines were prepared using target-specific primers tailed with universal sequences (Additional file 1: Table S3) according to the protocol provided by Pacific Biosciences Inc. Equal amounts of each amplicon were used as a template in a second PCR using the Barcoded Universal Primers provided by Pacific Biosciences Inc. (barcodes used per sample are listed in Additional file 1: Table S4). The standard SMRTBell library construction protocol was used to prepare a pooled, barcoded sequencing library. Sequencing was performed using a standard MagBead loading protocol on a PacBio RSII instrument. Polymerase reads were processed into barcode-separated subreads by primary analysis on the instrument. The resulting subreads were processed using a standard circular consensus sequencing (CCS) analysis using SMRTPipe with a configuration file specifying a minimum predicted quality of 99.9 and a minimum length of 2600 bp. For the N. benthamiana experiment, amplified viral DNA from each replicate plant was combined in equimolar amounts to prepare two sequencing pools: ACMV-AC2 H54Q+ACMV-WT, and ACMV-WT alone. Each pool was amplified using separate Barcoded Universal Primers and sequenced as described above.
Sequences representing a near full-length region (2692 bp) of the ACMV DNA A genome as well as a 100-bp region surrounding the sgRNA target site were extracted from each read of insert (ROI) in order to maintain identical start and end sequence positions in each viral amplicon sequence read. Each resulting sequence was pairwise aligned (global alignment using NEEDLE parameters) against its corresponding reference ACMV-NOg DNA A sequence (GenBank: AJ427910). Pairwise alignment scores were assigned to each nucleotide as the sequence mismatch percentage of a 10-nucleotide window surrounding it. The resulting per-base score (y-axis) along each sequence read was plotted using the ggplot and ggjoy packages in R to produce Fig. 1a and Additional file 1: Figure S3. Total pairwise identity scores were used to create Additional file 1: Figure S2d,e. Background mismatches resulting from either sequencing errors or viral variants were found in all lines, including controls. We also found no conserved edits specific to viruses infecting Cas9+sgRNA1 lines (that is, absent from control lines), indicating the absence of an off-target site on the virus genome (Additional file 1: Figure S8). This was expected because the seed sequence of sgRNA1 does not have a close match to another site on the viral genome.
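The per-base windowed mismatch score can be sketched in Python as follows. This is an illustration, not the analysis script used here: the function name is hypothetical, the input is assumed to be two already-aligned strings of equal length (gaps written as "-" and counted as mismatches), and the window is assumed to span the five positions before and the four after each base, truncated at the sequence ends.

```python
def windowed_mismatch_percent(aligned_ref, aligned_read, window=10):
    """Per-position mismatch percentage over a sliding window.

    ASSUMPTIONS: inputs are globally aligned strings of equal length;
    a gap character in either string counts as a mismatch; windows are
    truncated at the ends of the alignment.
    """
    if len(aligned_ref) != len(aligned_read):
        raise ValueError("aligned sequences must have equal length")
    mismatches = [int(a != b) for a, b in zip(aligned_ref, aligned_read)]
    half = window // 2
    scores = []
    for i in range(len(mismatches)):
        lo = max(0, i - half)
        hi = min(len(mismatches), i - half + window)
        win = mismatches[lo:hi]
        scores.append(100.0 * sum(win) / len(win))
    return scores
```

With a single substitution in an otherwise identical 20-nt alignment, every position whose window covers the substitution scores 10.0 and all others score 0.0, which is the kind of profile that produces a local peak when plotted along the read.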
For the N. benthamiana experiment, SMRT sequencing raw reads were error-corrected using the CCS pipeline at a threshold of 99.5 predicted accuracy. Resulting full-length virus sequences were searched with the mutant sgRNA target site to count the occurrence of mutant viruses.
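Counting mutant viruses by searching reads for the mutant target site can be sketched as below. The function names and the short sequences in the usage note are hypothetical placeholders, not the actual sgRNA1 target, and exact string matching on both strands is an assumption about how the search was done.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_mutant_reads(reads, mutant_site):
    """Return (reads containing the mutant site, total reads).

    ASSUMPTION: an exact match on either strand marks a read as mutant.
    """
    site = mutant_site.upper()
    site_rc = revcomp(site)
    hits = sum(
        1 for read in reads
        if site in read.upper() or site_rc in read.upper()
    )
    return hits, len(reads)
```

For instance, `count_mutant_reads(["TTTGATTACAGGTTT", "AAACCTGTAATCAAA", "TTTGATTACATGTTT"], "GATTACAGG")` returns `(2, 3)`: the first read carries the site on the forward strand, the second on the reverse strand, and the third differs by one base.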