Fig. 7 | Genome Biology

Fig. 7

From: seqQscorer: automated quality control of next-generation sequencing data using machine learning

Independent validation on Cistrome datasets. We used the optimal generic model to derive low-quality probabilities for human and mouse ATAC-seq, ChIP-seq, and DNase-seq datasets referenced in the independent Cistrome database. Probabilities for 610 samples (from 32 datasets, none of which were used to train the model) are compared with the bad-quality flags derived from Cistrome’s guidelines. The more bad-quality flags a sample has, the lower its expected quality. Although the model was trained on a binary categorization of independent ENCODE samples as low or high quality, its low-quality probabilities correlate positively with the number of bad-quality flags from Cistrome. n, number of samples; r, Pearson’s correlation coefficient.
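The comparison summarized by r can be illustrated with a minimal sketch. It assumes each sample is reduced to a pair of values: its count of bad-quality flags and its predicted low-quality probability. The function name `pearson_r`, the variable names, and the numbers are all hypothetical and made up for illustration; they are not data from the figure and not the authors' code.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative values, NOT taken from the figure.
bad_flag_counts = [0, 0, 1, 2, 3, 4]                        # flags per sample
low_quality_probs = [0.10, 0.20, 0.30, 0.50, 0.60, 0.90]    # model output per sample

n = len(bad_flag_counts)
r = pearson_r(bad_flag_counts, low_quality_probs)
print(f"n = {n}, r = {r:.2f}")
```

A positive r on real data would indicate that samples with more bad-quality flags tend to receive higher low-quality probabilities, which is the relationship the caption describes.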
