From: NanoSatellite: accurate characterization of expanded tandem repeat length and sequence through whole genome long-read sequencing on PromethION

Tandem repeat analysis methods. To extract TR length and sequence information from PromethION data, we used a base-calling approach (red) and our squiggle-based NanoSatellite approach (blue). Consecutive steps are shown in bold; below each step are the names of the bioinformatics tools used or a squiggle illustration. The “Raw PromethION data” illustration shows part of a PromethION squiggle from a single read spanning the ABCA7 VNTR. The “TR consensus sequence” figure was obtained from De Roeck et al. [22]; the height of each nucleotide corresponds to its frequency at that position [22]. In the “Reference Squiggles” figure, three ABCA7 VNTR units (alternating colors) are shown based on Scrappie current estimation. After dynamic time warping (DTW) of the raw PromethION data against the reference squiggles, the TR is delineated from the flanking sequence and segmented into individual TR units (alternating colors in the “Delineation and segmentation” figure). The final “TR unit clustering” figure depicts the DTW process between two TR units: each current measurement from one TR unit (red or black) is matched (gray lines) to a current measurement from the other TR unit. More detail on the NanoSatellite process is shown in Additional file 1: Figure S6.
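
The matching of current measurements between two TR units can be illustrated with a minimal DTW sketch. This is not the NanoSatellite implementation: the function name `dtw_align`, the absolute-difference local cost, the unconstrained step pattern, and the toy current values are all assumptions chosen for illustration only.

```python
def dtw_align(unit_a, unit_b):
    """Align two sequences of current measurements with textbook DTW.

    Returns the total alignment cost and the warping path as a list of
    (index_in_a, index_in_b) pairs. Hypothetical helper, not NanoSatellite code.
    """
    n, m = len(unit_a), len(unit_b)
    inf = float("inf")

    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(unit_a[i - 1] - unit_b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # advance in both units
                cost[i - 1][j],      # advance in unit_a only
                cost[i][j - 1],      # advance in unit_b only
            )

    # Backtrack from the end to recover which measurements were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Toy example: the second unit dwells longer on the middle current level.
unit_red = [1.0, 2.0, 3.0]
unit_black = [1.0, 2.0, 2.0, 3.0]
distance, matches = dtw_align(unit_red, unit_black)
print(distance)
print(matches)
```

Each pair in `matches` plays the role of one gray line in the “TR unit clustering” figure: every measurement in one unit is linked to at least one measurement in the other, and the summed cost gives a distance that could serve as input to a clustering step.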

