Fig. 1 | Genome Biology

Fig. 1

From: NanoCaller for accurate detection of SNPs and indels in difficult-to-map regions from long-read sequencing by haplotype-aware deep neural networks

An example of how to construct input features for a SNP candidate site. a Reference sequence and read pileups at candidate site b and at other genomic positions that share the same reads. The columns in gray are genomic positions that will not be used in input features for candidate site b as they do not satisfy the criteria for being highly likely heterozygous SNP sites. Only the columns with colored bases will be used to generate input features for site b and will constitute the set Z as described in the SNP pileups generation section of "Methods". These neighboring likely heterozygous sites can be up to thousands of bases away from candidate site b. b Reference sequence and read pileups for only the candidate site and neighboring highly likely heterozygous SNP sites. c Raw counts of bases at sites in the set Z for each read group split by the nucleotide types at site b. These raw counts are multiplied with negative signs for reference bases. d Flattened pileup image with fifth channel after reference sequence row is added. e Pileup image used as input features for NanoCaller deep convolutional neural network.
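
The counting step in panel c can be illustrated with a minimal Python sketch. It is not NanoCaller's implementation: the function name `signed_counts`, the representation of each read as a dictionary from genomic position to base, and the output layout (read group, site, base) are all assumptions made for illustration. The sketch groups reads by the base they carry at the candidate site, counts bases at each supplied site, and negates the counts of reference bases. It does not cover the reference sequence row or fifth channel of panel d.

```python
BASES = "ACGT"
BASE_INDEX = {base: i for i, base in enumerate(BASES)}


def signed_counts(reads, ref, candidate, sites):
    """Count bases at `sites`, split by each read's base at `candidate`.

    reads:     list of dicts mapping position -> base (hypothetical layout)
    ref:       dict mapping position -> reference base
    candidate: position of the candidate site
    sites:     positions to count at (the caller decides which sites qualify)

    Returns counts[g][j][k]: the number of reads with base BASES[g] at the
    candidate site that show base BASES[k] at sites[j], negated when
    BASES[k] is the reference base at sites[j].
    """
    counts = [[[0] * len(BASES) for _ in sites] for _ in BASES]

    for read in reads:
        group = BASE_INDEX.get(read.get(candidate))
        if group is None:
            continue  # read does not cover the candidate site with A/C/G/T
        for j, pos in enumerate(sites):
            k = BASE_INDEX.get(read.get(pos))
            if k is not None:
                counts[group][j][k] += 1

    # Reference bases get a negative sign, as described for panel c.
    for g in range(len(BASES)):
        for j, pos in enumerate(sites):
            k = BASE_INDEX.get(ref.get(pos))
            if k is not None:
                counts[g][j][k] = -counts[g][j][k]

    return counts
```

For example, with three toy reads covering a candidate site at position 10 and one neighboring site at position 20, `signed_counts(reads, ref, 10, [20])` returns one row of four signed counts per read group.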
