Table 4 Parameter training and results on ab initio performance

From: Exploiting single-molecule transcript sequencing for eukaryotic gene prediction

Training settings are given in the first two columns; the remaining columns give the parameter evaluation on 542 test genes.

| Training genes | Setting | Exon sensitivity | Exon precision | Transcript sensitivity | Transcript precision | Sum | Rank | UTR base sensitivity | UTR base precision |
|---|---|---|---|---|---|---|---|---|---|
| 288 | Manual only | 0.759 | 0.486 | 0.356 | 0.201 | 1.802 | 9 | 0.484 | 0.359 |
| 800 | - | 0.800 | 0.509 | 0.424 | 0.233 | 1.966 | 7 | 0.509 | 0.350 |
| 1,200 | - | 0.808 | 0.511 | 0.445 | 0.244 | 2.008 | 3 | 0.502 | 0.358 |
| 1,200 | New species | 0.812 | 0.491 | 0.380 | 0.213 | 1.900 | 8 | 0.480 | 0.357 |
| 1,200 | SMRT only | 0.820 | 0.517 | 0.448 | 0.245 | 2.030 | 2 | 0.515 | 0.374 |
| 1,200 | No UTR | 0.808 | 0.511 | 0.445 | 0.244 | 2.008 | 3 | 0.502 | 0.358 |
| 1,200 | 3 opt. rounds | 0.808 | 0.511 | 0.445 | 0.244 | 2.008 | 3 | 0.502 | 0.358 |
| 1,200 | 6 opt. rounds | 0.808 | 0.511 | 0.445 | 0.244 | 2.008 | 3 | 0.502 | 0.358 |
| 2,000 | - | 0.830 | 0.515 | 0.458 | 0.248 | 2.051 | 1 | 0.508 | 0.363 |
| A. thaliana parameters | - | 0.810 | 0.391 | 0.384 | 0.151 | 1.736 | 10 | 0.623 | 0.287 |
Manual only: refers to 400 manually curated genes, of which 288 were used for training and the remainder as a test set.
A. thaliana parameters: refers to the default A. thaliana parameters as provided by Stanke et al.
Rank: calculated from the sum of the sensitivity and the precision for exons, transcripts, and UTR bases.
New species: refers to calculating B. vulgaris parameters from scratch using 'new_species.pl' (part of the AUGUSTUS pipeline).
Opt. rounds: refers to the number of optimization rounds when running optimize_augustus.pl; the default was nine rounds.
SMRT only: refers to training that included only SMRT-validated genes.
No UTR: refers to not setting the '--UTR=on' parameter when running optimize_augustus.pl.
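
As a rough illustration of how the Sum and Rank columns relate, the sketch below recomputes both from the values printed in Table 4. It rests on two assumptions that are not stated by the authors: that the printed Sum is the sum of the four exon-level and transcript-level values (which matches the printed numbers for nine of the ten rows), and that tied rows share the best available rank. The dictionary keys and the helper `competition_ranks` are hypothetical names introduced only for this example.

```python
# Exon sensitivity, exon precision, transcript sensitivity, transcript precision,
# copied from Table 4. The keys are illustrative labels, not names from the paper.
rows = {
    "288 / Manual only":      (0.759, 0.486, 0.356, 0.201),
    "800":                    (0.800, 0.509, 0.424, 0.233),
    "1,200":                  (0.808, 0.511, 0.445, 0.244),
    "1,200 / New species":    (0.812, 0.491, 0.380, 0.213),
    "1,200 / SMRT only":      (0.820, 0.517, 0.448, 0.245),
    "1,200 / No UTR":         (0.808, 0.511, 0.445, 0.244),
    "1,200 / 3 opt. rounds":  (0.808, 0.511, 0.445, 0.244),
    "1,200 / 6 opt. rounds":  (0.808, 0.511, 0.445, 0.244),
    "2,000":                  (0.830, 0.515, 0.458, 0.248),
    "A. thaliana parameters": (0.810, 0.391, 0.384, 0.151),
}


def competition_ranks(scores):
    """Rank scores from highest to lowest; tied scores share the same rank."""
    ordered = sorted(scores.values(), reverse=True)
    # index() returns the first position of a value, so ties get the same rank.
    return {name: ordered.index(value) + 1 for name, value in scores.items()}


# Assumption: Sum = exon (Sn + Pr) + transcript (Sn + Pr), rounded to 3 decimals.
sums = {name: round(sum(values), 3) for name, values in rows.items()}
ranks = competition_ranks(sums)

for name in sorted(rows, key=ranks.get):
    print(f"{ranks[name]:>2}  {sums[name]:.3f}  {name}")
```

Under these assumptions the recomputed ranks agree with the Rank column. The recomputed sum for the "New species" row is 1.896 rather than the printed 1.900, which does not change its rank.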