Afann: bias adjustment for alignment-free sequence comparison based on sequencing data using neural network regression

Table 1 Prediction accuracy of k-NN classification on 92 white oak datasets of mixed sequence quantity, based on \(d_{2}^{*}\) before and after bias adjustment, for different query sizes, reference sizes, and numbers of neighbors k

Query size	Reference size	k = 1	k = 2	k = 3	k = 4	k = 5	k = 6	k = 7	k = 8	k = 9	k = 10
Before bias adjustment
1	91	0.97	0.97	0.97	1.00	1.00	1.00	0.98	0.95	0.91	0.91
17	75	0.98	0.98	0.96	0.99	0.96	0.98	0.96	0.95	0.91	0.91
32	60	0.97	0.97	0.94	0.96	0.94	0.95	0.91	0.92	0.88	0.89
47	45	0.95	0.95	0.93	0.94	0.91	0.91	0.88	0.89	0.87	0.88
62	30	0.93	0.93	0.88	0.89	0.85	0.87	0.83	0.84	0.82	0.81
77	15	0.84	0.84	0.77	0.78	0.75	0.74	0.69	0.70	0.67	0.65
After bias adjustment
1	91	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
17	75	1.00	1.00	1.00	1.00	1.00	1.00	0.99	1.00	1.00	1.00
32	60	1.00	1.00	1.00	1.00	0.99	1.00	0.99	1.00	0.99	0.99
47	45	1.00	1.00	0.99	1.00	0.99	0.99	0.98	0.99	0.97	0.98
62	30	0.99	0.99	0.97	0.97	0.94	0.95	0.92	0.93	0.90	0.91
77	15	0.96	0.96	0.92	0.93	0.87	0.87	0.81	0.79	0.74	0.70

For each combination of query size and reference size, the dataset was randomly split 100 times, and the prediction accuracy was averaged over the 100 splits
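The repeated-split evaluation described in the table note can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `knn_predict` and `mean_split_accuracy`, the tie-breaking rule (favoring the closest neighbor among tied labels), the fixed random seed, and the toy data are all assumptions made for the example. In practice, `dist` would hold pairwise dissimilarities between samples, such as those derived from \(d_{2}^{*}\).

```python
import random
from collections import Counter


def knn_predict(dist, labels, query, reference, k):
    """Predict the label of one query sample by majority vote among
    its k nearest reference samples."""
    nearest = sorted(reference, key=lambda r: dist[query][r])[:k]
    votes = Counter(labels[r] for r in nearest)
    top = max(votes.values())
    # Assumed tie-breaking rule: among tied labels, take the one
    # belonging to the closest neighbor.
    for r in nearest:
        if votes[labels[r]] == top:
            return labels[r]


def mean_split_accuracy(dist, labels, query_size, k, n_splits=100, seed=0):
    """Average k-NN accuracy over repeated random query/reference splits."""
    rng = random.Random(seed)
    n = len(labels)
    accuracies = []
    for _ in range(n_splits):
        indices = list(range(n))
        rng.shuffle(indices)
        query, reference = indices[:query_size], indices[query_size:]
        correct = sum(
            knn_predict(dist, labels, q, reference, k) == labels[q]
            for q in query
        )
        accuracies.append(correct / query_size)
    return sum(accuracies) / n_splits


# Toy data (hypothetical): two well-separated groups on a line.
points = [0, 1, 2, 3, 10, 11, 12, 13]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]
dist = [[abs(p - q) for q in points] for p in points]

print(mean_split_accuracy(dist, labels, query_size=2, k=3))
```

With 92 samples, a query size of 17 leaves a reference set of 75, matching the second row of each block in the table. Smaller reference sets leave fewer same-class neighbors available, which is consistent with the decline in accuracy toward the bottom rows and for larger k.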

ISSN: 1474-760X