Fig. 1 | Genome Biology

Fig. 1

From: seqQscorer: automated quality control of next-generation sequencing data using machine learning

Workflow and dataset. (a) Workflow. Raw NGS data files were retrieved from the ENCODE data portal. Four quality feature sets were derived using standard bioinformatics tools (e.g., FastQC): raw data (RAW), genome mapping (MAP), genomic localization (LOC), and transcription start sites profile (TSS) feature sets. A grid search was used to derive optimal machine learning models by testing various parameter settings and algorithms. (b) Dataset. Bars show the number of files in each data subset (y-axis). Two files are counted for each paired-end sample.
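
The grid search step in the workflow can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the two stand-in algorithms (`threshold_model`, `knn_model`), the parameter grids, and the use of plain accuracy as the selection score are all assumptions made here for illustration. It only shows the general idea of trying every combination of algorithm and parameter setting and keeping the best-scoring one.

```python
from itertools import product

# Toy data (hypothetical): one quality feature per file, label 1 = low quality.
TRAIN = [(0.1, 0), (0.2, 0), (0.35, 0), (0.6, 1), (0.8, 1), (0.9, 1)]
VALID = [(0.15, 0), (0.3, 0), (0.7, 1), (0.85, 1)]


def threshold_model(train, cutoff):
    """Stand-in algorithm: flag a file when its feature exceeds a fixed cutoff."""
    return lambda x: int(x > cutoff)


def knn_model(train, k):
    """Stand-in algorithm: majority vote among the k nearest training points."""
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        votes = sum(label for _, label in nearest)
        return int(2 * votes > k)
    return predict


# Hypothetical search space: each algorithm paired with its own parameter grid.
SEARCH_SPACE = {
    "threshold": (threshold_model, {"cutoff": [0.05, 0.5, 0.95]}),
    "knn": (knn_model, {"k": [1, 3]}),
}


def accuracy(predict, data):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(predict(x) == y for x, y in data) / len(data)


def grid_search(train, valid, space):
    """Score every (algorithm, parameter setting) pair on the validation data."""
    results = []
    for name, (build, grid) in space.items():
        keys = list(grid)
        # Cartesian product over all parameter values of this algorithm.
        for values in product(*(grid[key] for key in keys)):
            params = dict(zip(keys, values))
            model = build(train, **params)
            results.append((accuracy(model, valid), name, params))
    # max() keeps the first entry among ties.
    best = max(results, key=lambda r: r[0])
    return best, results


best, results = grid_search(TRAIN, VALID, SEARCH_SPACE)
print(best)
```

A real setup would typically use cross-validation rather than a single validation split, and a score suited to the class balance of the data.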
