Table 4 Parameter training and results on ab initio performance

From: Exploiting single-molecule transcript sequencing for eukaryotic gene prediction

The first two columns give the training settings; the remaining columns give the parameter evaluation on 542 test genes.

| Training genes | Setting | Exon level sensitivity | Exon level precision | Transcript level sensitivity | Transcript level precision | Sum | Rank | UTR bases sensitivity | UTR bases precision |
|---|---|---|---|---|---|---|---|---|---|
| 288 | Manual only | 0.759 | 0.486 | 0.356 | 0.201 | 1.802 | 9 | 0.484 | 0.359 |
| 800 | - | 0.800 | 0.509 | 0.424 | 0.233 | 1.966 | 7 | 0.509 | 0.350 |
| 1,200 | - | 0.808 | 0.511 | 0.445 | 0.244 | 2.008 | 3 | 0.502 | 0.358 |
| 1,200 | New species | 0.812 | 0.491 | 0.380 | 0.213 | 1.900 | 8 | 0.480 | 0.357 |
| 1,200 | SMRT only | 0.820 | 0.517 | 0.448 | 0.245 | 2.030 | 2 | 0.515 | 0.374 |
| 1,200 | No UTR | 0.808 | 0.511 | 0.445 | 0.244 | 2.008 | 3 | 0.502 | 0.358 |
| 1,200 | 3 opt. rounds | 0.808 | 0.511 | 0.445 | 0.244 | 2.008 | 3 | 0.502 | 0.358 |
| 1,200 | 6 opt. rounds | 0.808 | 0.511 | 0.445 | 0.244 | 2.008 | 3 | 0.502 | 0.358 |
| 2,000 | - | 0.830 | 0.515 | 0.458 | 0.248 | 2.051 | 1 | 0.508 | 0.363 |
| A. thaliana parameters | - | 0.810 | 0.391 | 0.384 | 0.151 | 1.736 | 10 | 0.623 | 0.287 |

Manual only: refers to 400 manually curated genes, of which 288 were used for training and the remainder as the test set.
A. thaliana parameters: refers to the default A. thaliana parameters as provided by Stanke et al.
Rank: calculated from the sum of the sensitivity and the precision for exons, transcripts, and UTR bases.
New species: refers to calculating B. vulgaris parameters from scratch using ‘new_species.pl’ (part of the AUGUSTUS pipeline).
Opt. rounds: refers to the number of optimization rounds when running optimize_augustus.pl; the default was nine rounds.
SMRT only: refers to training including only SMRT-validated genes.
No UTR: refers to not setting the ‘--UTR=on’ parameter when running optimize_augustus.pl.
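
The Sum and Rank columns can be checked with a few lines of arithmetic. The sketch below is not the authors' code. It rests on two assumptions inferred from the numbers in the table rather than from the footnote: that Sum adds the four exon-level and transcript-level values (UTR bases excluded), and that tied sums share a rank in the usual competition style (1, 2, 3, 3, 3, 3, 7, ...). The names `ROWS`, `recomputed_sum`, `competition_ranks`, and `sum_mismatches` are illustrative.

```python
# Rows transcribed from Table 4:
# (training genes, setting, exon Sn, exon Pr, transcript Sn, transcript Pr,
#  reported Sum, reported Rank)
ROWS = [
    ("288", "Manual only", 0.759, 0.486, 0.356, 0.201, 1.802, 9),
    ("800", "-", 0.800, 0.509, 0.424, 0.233, 1.966, 7),
    ("1,200", "-", 0.808, 0.511, 0.445, 0.244, 2.008, 3),
    ("1,200", "New species", 0.812, 0.491, 0.380, 0.213, 1.900, 8),
    ("1,200", "SMRT only", 0.820, 0.517, 0.448, 0.245, 2.030, 2),
    ("1,200", "No UTR", 0.808, 0.511, 0.445, 0.244, 2.008, 3),
    ("1,200", "3 opt. rounds", 0.808, 0.511, 0.445, 0.244, 2.008, 3),
    ("1,200", "6 opt. rounds", 0.808, 0.511, 0.445, 0.244, 2.008, 3),
    ("2,000", "-", 0.830, 0.515, 0.458, 0.248, 2.051, 1),
    ("A. thaliana parameters", "-", 0.810, 0.391, 0.384, 0.151, 1.736, 10),
]


def recomputed_sum(row):
    """Assumed definition: exon Sn + exon Pr + transcript Sn + transcript Pr."""
    return round(sum(row[2:6]), 3)


def competition_ranks(scores):
    """Rank 1 is the highest score; tied scores share the same rank."""
    return [1 + sum(other > s for other in scores) for s in scores]


def sum_mismatches(rows, tol=0.0005):
    """Rows whose reported Sum differs from the recomputed one by more than tol."""
    return [r for r in rows if abs(recomputed_sum(r) - r[6]) > tol]


if __name__ == "__main__":
    ranks = competition_ranks([r[6] for r in ROWS])
    for row, rank in zip(ROWS, ranks):
        print(row[0], row[1], recomputed_sum(row), row[6], rank, row[7])
    for row in sum_mismatches(ROWS):
        print("Sum differs from reported value:", row[0], row[1])
```

Under these assumptions the recomputed ranks match the Rank column for every row. The only row whose recomputed sum differs from the reported Sum is "New species" (1.896 recomputed against 1.900 reported), and that difference does not change its rank.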