Fig. 4 | Genome Biology

Fig. 4

From: seqQscorer: automated quality control of next-generation sequencing data using machine learning


Best-performing algorithms and parameters. a The top three performing algorithms for selected combinations of data subset and feature set (colored boxes). Numbers on the right-hand side show the difference in the area under the receiver operating characteristic curve (auROC) between the best and the third-best algorithm. Feature sets: RAW (raw data), MAP (genome mapping), LOC (genomic localization), TSS (transcription start sites profile), ALL (all features). b Frequency of algorithmic parameter settings in the 90 optimal models selected by the grid search. For each algorithm, an optimal model was created for each possible combination of feature set and data subset. c Frequency of chi-square-based feature selection (chi2) and recursive feature elimination (RFE) settings in the best-performing models. For each possible combination of feature set and data subset, we compared the optimal models derived by the different algorithms and retained the best one together with its algorithm.
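
The number reported next to each box in panel a can be illustrated with a minimal sketch: compute the auROC for each algorithm, rank the algorithms, and subtract the third-best value from the best. The function names `auroc` and `top_three_gap`, the algorithm labels, and the toy labels and scores below are hypothetical and are not taken from the article. The pairwise (Mann-Whitney) formulation of the auROC is used here only because it needs no external library; it is not necessarily how the authors computed it.

```python
from typing import Dict, List, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    Returns the fraction of (positive, negative) pairs in which the positive
    sample has the higher score; tied pairs count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def top_three_gap(aurocs: Dict[str, float]) -> Tuple[List[str], float]:
    """Return the three best algorithm names and auROC(best) - auROC(third)."""
    if len(aurocs) < 3:
        raise ValueError("need at least three algorithms")
    ranked = sorted(aurocs.items(), key=lambda kv: kv[1], reverse=True)
    top = ranked[:3]
    return [name for name, _ in top], top[0][1] - top[2][1]


# Hypothetical toy data: 1 = low-quality file, 0 = high-quality file.
labels = [1, 1, 1, 0, 0, 0]
# Hypothetical prediction scores from four placeholder algorithms.
scores_by_algorithm = {
    "algo_a": [0.9, 0.8, 0.7, 0.3, 0.2, 0.1],
    "algo_b": [0.9, 0.8, 0.2, 0.3, 0.7, 0.1],
    "algo_c": [0.9, 0.4, 0.2, 0.5, 0.3, 0.1],
    "algo_d": [0.2, 0.4, 0.1, 0.5, 0.3, 0.9],
}

aurocs = {name: auroc(labels, s) for name, s in scores_by_algorithm.items()}
top_names, gap = top_three_gap(aurocs)
print(top_names, round(gap, 3))
```

A small gap indicates that the three leading algorithms perform almost equally well on that combination of data subset and feature set, while a large gap indicates that the choice of algorithm matters more.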
