Designing molecular beacons for SARS-CoV-2 RT-LAMP
Several beacons were tested for detection of SARS-CoV-2 RNA in RT-LAMP reactions (Additional file 2: Table S1). Optimization required identifying sequence designs that performed properly under the conditions of the RT-LAMP reaction, which is typically run at temperatures around 65°C. Function of the beacon requires that it remain mostly folded in the hairpin structure at this temperature, while still opening sufficiently often to allow annealing to the target RT-LAMP cDNA product. The annealed beacon-target cDNA duplex must then be sufficiently stable at 65°C to result in unquenching and an increase in fluorescence. To increase beacon affinity for use at higher temperatures, we substituted multiple nucleotide positions within the target sequence of each beacon with locked nucleic acids [17]. Locked nucleic acids reduce the conformational flexibility of the nucleotides and make the free energy of nucleic acid annealing more favorable [19]. We tested the performance of 28 molecular beacons using five previously reported SARS-CoV-2 amplicons (Fig. 1b) and three human control RT-LAMP amplicons (Additional file 1: Fig. S1, Additional file 2: Table S1).
Testing LAMP-BEAC
An example of a successful beacon design is Penn_LFMB_S1 (Additional file 2: Table S1). The RT-LAMP amplicon targets the orf1ab coding region and was first reported by El-Tholoth and coworkers at the University of Pennsylvania (named “Penn”) [7]. The favored beacon was designed to target sequences within the forward DNA loop generated during LAMP; thus, the beacon is designated Penn loop forward beacon, contracted to Penn_LFMB_S1. Detection can be accomplished with laboratory plate readers or PCR machines (below), and even visually with a simple blue light and orange filter (Fig. 1c).
Figure 2 shows use of the Penn_LFMB_S1 system to detect synthetic SARS-CoV-2 RNA. Tests were carried out with commercial LAMP polymerase and reverse transcriptase preparations. In addition, to avoid possible supply chain problems and allow potential production of reagents in resource limited settings, we produced and purified novel DNA polymerase and reverse transcriptase enzymes, which were assayed in parallel with commercial preparations for some tests (described below).
To compare standard LAMP amplification with LAMP-BEAC, reactions were prepared containing both fluorescent dye (Fig. 2a), which detects bulk DNA by intercalation, and the molecular beacon Penn_LFMB_S1 (Fig. 2b). Reaction products were detected at two wavelengths, allowing separate quantification of the bulk dye and the molecular beacon in single reactions. The non-specific intercalating dye reported bulk DNA production in positive samples earlier than the water controls, but the negative controls did amplify shortly after. This spurious late amplification is commonly seen with RT-LAMP, though the mechanism is unclear. The primers may interact with each other to form products and launch amplification, or perhaps the reaction results from amplification of adventitious environmental DNA. In separate tests, synthesis of DNA products was shown to depend on addition of LAMP primers (data not shown).
Molecular beacon Penn_LFMB_S1 in the same reactions showed more clear-cut discrimination (Fig. 2b). The positive samples showed positive signal, but no signal was detected for the negative water controls. Lack of amplification in negative controls has been reproducible over multiple independent reactions (examples below).
The nature of the products could be assessed using thermal denaturation (Fig. 2c and d). Reactions were first cooled to allow full annealing of complementary DNA strands, then slowly heated while recording fluorescence intensity. The fluorescent signal of the intercalating dye started high but dropped with increasing temperature in all samples (Fig. 2c), consistent with denaturation of the duplex and release of the intercalating dye into solution. In contrast, the beacon’s fluorescent signal in the water controls started low (Fig. 2d), consistent with annealing of the beacon DNA termini to form the hairpin structure (Fig. 1a). At temperatures above 70°C, the fluorescence modestly increased, consistent with opening of the hairpin and reptation of the beacon as a random coil in solution. For reactions containing the RT-LAMP product and Penn_LFMB_S1 beacon, fluorescence values were high at lower temperatures, consistent with formation of the annealed duplex; then, at temperatures sufficient for denaturation, the fluorescence values fell to match those of the random coil (Fig. 2d). Thus, the LAMP-BEAC assay generates strong fluorescence signals during LAMP amplification in the presence of target RNA but not in negative controls, and the thermal melting properties are consistent with formation of the expected products.
Multiplex LAMP-BEAC assays
We next sought to develop additional LAMP-BEAC assays to allow multiplex detection of SARS-CoV-2 RNA and parallel analysis of human RNA controls as a check on sample integrity, and so developed several additional beacons (Additional file 2: Table S1). E1_LBMB_S1 recognizes an amplicon targeting the viral E gene reported in [20], As1e_LBMB_S2 recognizes the As1e amplicon reported in [9] targeting the orf1ab coding region, N2_LBMB_S3 recognizes the N2 amplicon reported in [20] targeting the N coding region, and N-A_LFMB_S2 recognizes the N-A amplicon reported in [21] targeting the N coding region (Fig. 1b). We also developed positive control beacons, STATH_LFMB_S1 and a later, brighter iteration, STATH_LBMB_S2, to detect a LAMP amplicon targeting the human statherin mRNA [22]. Additional control beacons included ACTB to detect beta-actin mRNA [20] and RNaseP to detect ribonuclease P subunit p20 POP7 mRNA or DNA [23] (Additional file 2: Table S1, Additional file 1: Figure S1). We chose to focus on STATH for further testing because it is abundantly expressed in human saliva and spans an exon junction to allow selective detection of RNA and not DNA.
To allow independent detection of each amplicon as a quadruplex assay, each beacon was labeled using fluorophores with different wavelengths of maximum emission. For example, E1_LBMB_S1 was labeled with FAM and detected at 520 nm, STATH_LFMB_S1 was labeled with hexachlorofluorescein (Hex) and detected at 587 nm, As1e_LBMB_S2 was labeled with Tex615 and detected at 623 nm, and Penn_LFMB_S1 was labeled with cyanine-5 (Cy5) and detected at 682 nm.
This quadruplex LAMP-BEAC assay was tested with contrived samples, in which the saliva was doped with synthetic SARS-CoV-2 RNA (Fig. 3). Prior to dilution, the saliva was treated with TCEP and EDTA, followed by heating at 95°C. This treatment inactivates both SARS-CoV-2 and cellular RNases [9] and so is part of our sample processing pipeline. The STATH_LFMB_S1 amplicon detected the human RNA control in all saliva samples (Fig. 3a). The E1_LBMB_S1 and As1e_LBMB_S2 amplicons both consistently detected SARS-CoV-2 RNA down to ~250 copies per reaction (Fig. 3b, c). Samples were called positive if either E1 or As1e showed amplification. Using this scoring method, the combination consistently detected SARS-CoV-2 down to 125 copies, and even detected 2/3 positives at 16 copies per reaction.
The Penn_LFMB_S1 amplicon was least sensitive, detecting SARS-CoV-2 RNA consistently only at ~1000 copies per reaction (Fig. 3d). In the multiplex setting, the Penn amplicon sensitivity was lower than that observed when run in isolation (Fig. 2), likely indicating competition between amplicons during multiplexed reactions. Thus, the use of the Penn_LFMB_S1 assay in the multiplex format selectively reports particularly high RNA copy numbers.
A useful feature of the STATH control amplicon used here is that it amplifies more slowly than the SARS-CoV-2 amplicons. Slower amplification of human controls is desirable to avoid exhaustion of reaction components due to competition, which could prevent viral detection.
Melt curve analysis was also carried out to verify reaction products (Fig. 3e–h). Melt curve profiles were distinctive for each beacon, but the overall pattern included high fluorescence in the positive samples and low values in negative samples at lower temperatures, then convergence of positive and negative samples at high temperatures associated with full melting of the beacon and reptation in solution. The melt curve data for each beacon supported correct function and the expected structures of the amplification products.
Assessing LAMP-BEAC performance on clinical saliva samples
We next tested the LAMP-BEAC assay on a set of 82 saliva samples collected during surveillance for potential SARS-CoV-2 infection. Samples were from a clinical site, where subjects were tested by nasopharyngeal (NP) swabbing and clinical RT-qPCR, and also donated saliva for comparison. Saliva samples were treated with TCEP and EDTA and heated at 95°C for 5 min to inactivate RNase and SARS-CoV-2 [9]. We performed a triplex LAMP-BEAC assay using Penn_LFMB_S1, N2_LBMB_S3, and STATH_LBMB_S2 beacons in a set of five 20 μl reactions. As an additional check, RNA was purified from the same set of saliva samples and RT-qPCR was carried out using the CDC-recommended N1 primer set.
The absence of STATH amplification can indicate potential RNA degradation or inhibitors in the sample, but another reason for lost signal can be competition between amplicons. STATH tends to amplify more slowly than the SARS-CoV-2-targeted amplicons and so can be suppressed by robust amplification of viral amplicons (Fig. 3). In practice, we suggest that a sample with SARS-CoV-2 amplification should be called positive regardless of STATH results, a sample with no SARS-CoV-2 amplification should be called negative if STATH amplification is observed, and a sample with no STATH amplification and no viral amplification should be called indeterminate.
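The suggested calling rule can be written as a small decision function. The following is a minimal sketch; the function name `call_sample` and its boolean inputs are illustrative names chosen here, not part of the published analysis code.

```python
from typing import Literal

Call = Literal["positive", "negative", "indeterminate"]


def call_sample(viral_amplified: bool, stath_amplified: bool) -> Call:
    """Three-way call from viral and human-control (STATH) amplification.

    Hypothetical helper: names and signature are assumptions for illustration.
    """
    if viral_amplified:
        # Viral signal is called positive regardless of the STATH result,
        # because strong viral amplification can suppress the slower control.
        return "positive"
    if stath_amplified:
        # Control amplified and no viral signal: sample was intact.
        return "negative"
    # Neither amplified: degradation or inhibition cannot be excluded.
    return "indeterminate"


if __name__ == "__main__":
    for viral in (True, False):
        for stath in (True, False):
            print(viral, stath, call_sample(viral, stath))
```

Checking the viral signal first mirrors the reasoning in the text: a missing STATH signal is uninformative when a viral amplicon has already amplified.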
In these clinical samples, some degradation was apparent and STATH amplification was detected in only 56 out of 82 samples (Additional file 1: Figure S2, Additional file 3: Table S2). These saliva samples had been stored for months and frozen and thawed multiple times, so some attrition is not surprising. SARS-CoV-2 amplification was observed in 6 of these STATH failures, suggesting potential competition between amplicons or a greater robustness of viral RNA. Where STATH or SARS-CoV-2 amplification was detectable, the LAMP-BEAC assay correlated perfectly with the amplification of SARS-CoV-2 above the limit of detection by laboratory RT-qPCR on the same saliva samples, i.e., a sensitivity and specificity of 1 (Fig. 4). Performance was similar in quadruplex and duplex LAMP-BEAC assays using Penn_LFMB_S1, E1_LBMB_S1, As1e_LBMB_S2, and STATH_LFMB_S1 performed on subsets of the same samples (Additional file 3: Table S2).
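For readers who want to reproduce this kind of comparison, sensitivity and specificity can be computed from paired calls as sketched below. This is an illustrative sketch only: the function name is hypothetical, indeterminate samples are encoded as `None` and excluded (matching the restriction to samples with detectable STATH or viral amplification), and the example values are made up rather than taken from Table S2.

```python
from typing import Optional, Sequence, Tuple


def sensitivity_specificity(
    calls: Sequence[Optional[bool]], reference: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of assay calls against a reference.

    calls:     True = positive, False = negative, None = indeterminate (skipped).
    reference: True = positive by the comparison method.
    """
    if len(calls) != len(reference):
        raise ValueError("calls and reference must have the same length")
    tp = fp = tn = fn = 0
    for call, truth in zip(calls, reference):
        if call is None:
            continue  # indeterminate samples are not scored
        if call and truth:
            tp += 1
        elif call and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one scored positive and one scored negative")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Toy data for illustration only, not the study results.
    toy_calls = [True, True, False, False, None]
    toy_reference = [True, True, False, False, True]
    print(sensitivity_specificity(toy_calls, toy_reference))
```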
Comparison to the results of clinical RT-qPCR testing on NP swabs from the same patients was complicated by disagreements with the laboratory RT-qPCR testing on matched saliva samples. Of the 24 samples scored as positive by clinical testing on NP swabs, only 17 had detectable amplification by laboratory qPCR on the saliva and an additional 9 samples with detectable amplification by laboratory qPCR had been marked negative by clinical testing. All but one disagreement (see below) occurred in samples with concentrations inferred as less than 100 copies per microliter by laboratory qPCR (clinical quantifications were not available). We thus inferred the laboratory qPCR had a practical limit of detection of 100 copies per microliter. The LAMP-BEAC assay did not detect amplification in any of these discrepant samples. For samples with greater than 100 inferred copies per microliter, the clinical test results and LAMP-BEAC agreed perfectly with the exception of a single saliva sample called positive by LAMP-BEAC but negative by clinical NP testing. This sample was also estimated at 200,000 viral RNA copies per microliter by laboratory RT-qPCR and as positive in 23 LAMP-BEAC amplifications in 14 separate reactions across 4 different primer sets (Additional file 3: Table S2). A recent study has documented differences between the loads of SARS-CoV-2 RNA at different body sites [24], including oral and nasal sites, potentially accounting at least in part for the observed differences.
We note that the detection shown in Fig. 4, using end point fluorescence values and not reaction progression curves, offers a simplified read out for reaction results. That is, advanced qPCR machines are not needed for amplification or quantification of product formation using LAMP-BEAC, but rather reactions can be performed using a simple heat block or incubator and reaction end points can be read out using a simpler fluorescent plate reader or even visual/cell phone detection (Fig. 1c). This may help bypass possible supply chain bottlenecks and expenses associated with purchasing qPCR machines for SARS-CoV-2 assays.
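An endpoint readout of this kind reduces to comparing each well's final fluorescence with a cutoff. The sketch below assumes one possible cutoff, the mean of the negative controls plus a multiple of their standard deviation; this rule, the default multiplier, and all names are assumptions for illustration and are not the thresholding used for Fig. 4.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def endpoint_calls(
    sample_rfu: Dict[str, float],
    negative_control_rfu: Sequence[float],
    n_sd: float = 5.0,  # assumed multiplier, not a value from the study
) -> Dict[str, bool]:
    """Call each sample positive if its endpoint fluorescence exceeds the cutoff.

    Requires at least two negative controls so a standard deviation exists.
    """
    cutoff = mean(negative_control_rfu) + n_sd * stdev(negative_control_rfu)
    return {name: rfu > cutoff for name, rfu in sample_rfu.items()}


if __name__ == "__main__":
    # Made-up plate-reader values in relative fluorescence units.
    negatives = [100.0, 110.0, 90.0]
    samples = {"sample_A": 900.0, "sample_B": 120.0}
    print(endpoint_calls(samples, negatives))
```

Because the beacon gives essentially no signal in negative wells, the exact cutoff matters less than it would for an intercalating dye.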
Laboratory-based production of polymerases required for RT-LAMP
Polymerase enzymes are expensive and potentially subject to supply chain disruptions, so we engineered novel reverse transcriptase and DNA polymerase enzymes and devised simple purification protocols, allowing inexpensive local production of the required enzymes. HIV-2 reverse transcriptase and the polA large fragment from Geobacillus stearothermophilus were each engineered to contain several amino acid substitutions expected to stabilize enzyme folding at higher temperatures (RT) or improve strand displacement activity (Bst). Enzymes were purified and tested as described in the methods. Side-by-side assays using lab-purified polymerases and commercial enzyme preparations indicated that our novel polymerase enzymes are at least as efficient as commercial preparations (Additional file 1: Figure S3).
Increased sensitivity through increased reaction volume
The combination of affordable enzyme and low probability of false positives suggests that it could be possible to increase testing sensitivity by increasing reaction volume and sample input. To test this, we quantified sensitivity versus reaction volume using the N2 primer set, comparing detection with nonspecific dye and the N2_LBMB_S3 beacon.
To test the relationship of reaction volume and sensitivity, we first compared the performance of 10 μl reactions with 4 μl of saliva input versus 20 μl reactions with 8 μl of the same saliva input. Samples were contrived using varying concentrations of synthetic SARS-CoV-2 RNA in the inactivated saliva. We observed that the larger 20-μl reaction volume and correspondingly larger saliva input increased the detection rate for the molecular beacon and non-specific dye (Fig. 5a–f). However, interpretation of results by non-specific dye was complicated by the variable distribution of cycle thresholds observed in negative controls, making a clean distinction of positive amplification difficult. In contrast, the LAMP-BEAC molecular beacon detected no sequence-specific amplification in any negative sample. The clear binary threshold provided by molecular beacons simplifies interpretation, enables endpoint detection, and suggests that sensitivity could be heightened by further increasing reaction volume.
We thus ran a series of 200 μl reactions with 80 μl of saliva input (Fig. 5g–i). Amplification detection by non-specific dye performed poorly in these large reaction volumes, with rapid amplification observed in all negative wells. With this strong background amplification, setting a threshold to call as positive was problematic. The fastest time to threshold in a negative well was 17 min while the fastest time to threshold in a positive well was 15 min, leaving little room for discrimination (Fig. 5g). Even using the unrealistically tight threshold of calling any amplification detected in less than 17 min as positive, nonspecific dye did not achieve high sensitivities in reactions with low copy numbers (Fig. 5i).
In contrast, the N2_LBMB_S3 molecular beacon showed almost perfect discrimination in the 200-μl reactions (Fig. 5h). No amplification was detected by molecular beacon in any of the 24 negative reactions (4800 μl of total reaction volume) over the 80-min reaction time. In wells containing synthetic SARS-CoV-2 RNA, the molecular beacon detected 100% of reactions containing 0.5 or 0.25 copies of RNA per μl of saliva input and 23/24 reactions containing 0.1 copy of RNA per μl of saliva (Fig. 5i). Note that even with these low target concentrations, the absence of signal in negative wells means that a real-time quantification is not necessary and simple endpoint read out is just as discriminative.
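The arithmetic behind the volume effect can be made explicit. The sketch below computes the expected number of RNA copies per reaction from the saliva input and, under the added assumption that template copies are Poisson-distributed across wells, the probability that a well receives at least one copy. The Poisson model is our illustration and gives only an upper bound on the achievable detection rate; function names are hypothetical.

```python
import math


def expected_copies(copies_per_ul_saliva: float, saliva_input_ul: float) -> float:
    """Mean number of template copies delivered to one reaction."""
    return copies_per_ul_saliva * saliva_input_ul


def prob_at_least_one_copy(copies_per_ul_saliva: float, saliva_input_ul: float) -> float:
    """P(well contains >= 1 copy), assuming Poisson sampling (an assumption)."""
    lam = expected_copies(copies_per_ul_saliva, saliva_input_ul)
    return 1.0 - math.exp(-lam)


if __name__ == "__main__":
    # Total negative-control volume: 24 reactions of 200 ul each.
    print("negative volume (ul):", 24 * 200)

    # Concentrations from the text; saliva inputs for 20 ul and 200 ul reactions.
    for conc in (0.5, 0.25, 0.1):
        for saliva_ul in (8, 80):
            lam = expected_copies(conc, saliva_ul)
            p = prob_at_least_one_copy(conc, saliva_ul)
            print(f"{conc} copies/ul, {saliva_ul} ul saliva: "
                  f"mean {lam:.1f} copies, P(>=1 copy) = {p:.4f}")
```

At 0.1 copy per μl, an 80 μl input carries about 8 copies on average, whereas an 8 μl input carries about 0.8, so under this model a substantial share of the smaller reactions would contain no template at all.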