Fig. 1 | Genome Biology

Fig. 1

From: NanoVar: accurate characterization of patients’ genomic structural variants using low-depth nanopore sequencing

The NanoVar workflow. a About 2 μg of human genomic DNA is set aside for library preparation and nanopore sequencing to generate third-generation sequencing (3GS) long reads. Long reads from all sequencing runs are combined into a single FASTQ/FASTA file (at least 12 Gb, recommended 24 Gb), which is used as input to NanoVar. b NanoVar SV characterization process. (Left) Long reads are aligned to a reference genome using HS-BLASTN to identify anchor sequences (blue) and divergent sequences or gaps (red) within each read. Next, the alignment information is used to detect and characterize the different SV classes. (Right) For each characterized SV, the read depth at its breakend site(s) is calculated as the number of breakend-supporting and breakend-opposing reads. The breakend read depth, together with other alignment information, is used as features in a neural network model to infer a confidence score for each SV. c NanoVar outputs all characterized SVs in a VCF file and produces an HTML report for QC and result visualization. The following figures can be found in the HTML report. (Top-left) Histogram showing the length distribution of the input sequencing reads, for QC. (Top-middle) Donut chart showing the distribution of SV classes characterized in the dataset (after confidence score filtering). Breakends represent translocation or genomic insertion SVs. (Top-right) Scatter plot displaying the confidence score and breakend read ratio (fraction of breakend-supporting reads at a breakend) of each SV, with the confidence score threshold used for filtering shown as a red line. (Bottom) Table showing the details of all characterized SVs, which can be sorted, filtered, and exported in CSV or MS Excel formats.
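
The breakend read ratio and the confidence score filtering described in panel c can be illustrated with a minimal Python sketch. This is not NanoVar's code: the `SVCall` record, its field names, the example values, and the threshold of 1.0 are all hypothetical. The sketch also assumes that the ratio's denominator is the sum of breakend-supporting and breakend-opposing reads, and that an SV is kept when its score is at or above the threshold; the caption does not specify either detail.

```python
from dataclasses import dataclass


@dataclass
class SVCall:
    """Hypothetical record for one characterized SV (not NanoVar's own format)."""
    sv_id: str
    sv_class: str          # e.g. "DEL", "INS", "BND"
    supporting_reads: int  # breakend-supporting reads at the breakend
    opposing_reads: int    # breakend-opposing reads at the breakend
    confidence: float      # score inferred by the model (values here are made up)


def breakend_read_ratio(call: SVCall) -> float:
    """Fraction of breakend-supporting reads at a breakend.

    Assumption: the denominator is supporting + opposing reads.
    Returns 0.0 when no reads cover the breakend.
    """
    total = call.supporting_reads + call.opposing_reads
    if total == 0:
        return 0.0
    return call.supporting_reads / total


def filter_by_confidence(calls: list[SVCall], threshold: float) -> list[SVCall]:
    """Keep SVs whose confidence score is at or above the threshold (assumed rule)."""
    return [c for c in calls if c.confidence >= threshold]


# Made-up example data, for illustration only.
calls = [
    SVCall("sv1", "DEL", supporting_reads=6, opposing_reads=2, confidence=3.1),
    SVCall("sv2", "INS", supporting_reads=1, opposing_reads=9, confidence=0.4),
    SVCall("sv3", "BND", supporting_reads=0, opposing_reads=0, confidence=1.0),
]

THRESHOLD = 1.0  # hypothetical value, not a NanoVar default
kept = filter_by_confidence(calls, THRESHOLD)
ratios = {c.sv_id: breakend_read_ratio(c) for c in calls}

for c in kept:
    print(c.sv_id, c.sv_class, c.confidence, ratios[c.sv_id])
```

Plotting `confidence` against the ratio for every call, with a line at `THRESHOLD`, would give a scatter plot in the spirit of the top-right panel.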

Back to article page