Table 1 Predictor performance

From: ISsaga is an ensemble of web-based methods for high throughput identification and semi-automatic annotation of insertion sequences in prokaryotic genomes

 

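| | GB | -IS | +IS | Manual |
|---|---|---|---|---|
| A. dehalogenans 2CPC (NC_007760) | | | | |
| Total IS ORF | 1 | 4 | 4 | 2 |
| Complete ORF | - | 0 | 0 | 0 |
| Partial ORF | - | 1 | 1 | 1 |
| Pseudogene | 1 | 2 | 2 | 1 |
| Unknown ORF | - | 1 | 1 | 0 |
| Total IS | - | 4 | 4 | 2 |
| Different IS | - | 4 | 4 | 2 |
| Anaeromyxobacter sp. Fw109 5 (NC_009675) | | | | |
| Total IS ORF | 15 | 22 | 24 | 19 |
| Complete ORF | - | 4 | 12 | 12 |
| Partial ORF | - | 1 | 2 | 6 |
| Pseudogene | 1 | 4 | 4 | 1 |
| Unknown ORF | - | 13 | 6 | 0 |
| Total IS | - | 20 | 21 | 16 |
| Different IS | - | 16 | 17 | 12 |
| Anaeromyxobacter sp. K (NC_011145) | | | | |
| Total IS ORF | 14 | 25 | 28 | 27 |
| Complete ORF | - | 12 | 26 | 26 |
| Partial ORF | - | 2 | 0 | 0 |
| Pseudogene | - | 1 | 1 | 1 |
| Unknown ORF | - | 10 | 1 | 0 |
| Total IS | - | 19 | 19 | 18 |
| Different IS | - | 10 | 10 | 9 |
| A. dehalogenans 2CP1 (NC_011891) | | | | |
| Total IS ORF | 15 | 33 | 35 | 35 |
| Complete ORF | - | 18 | 24 | 27 |
| Partial ORF | - | 4 | 2 | 3 |
| Pseudogene | - | 8 | 8 | 5 |
| Unknown ORF | - | 3 | 1 | 0 |
| Total IS | - | 25 | 25 | 23 |
| Different IS | - | 12 | 12 | 14 |
| A. aeolicus VF5 (NC_000918) | | | | |
| Total IS ORF | - | 7 | 7 | 3 |
| Complete ORF | - | 0 | 2 | 2 |
| Partial ORF | - | 1 | 1 | 1 |
| Pseudogene | - | 0 | 0 | 0 |
| Unknown ORF | - | 6 | 4 | 0 |
| Total IS | - | 7 | 7 | 3 |
| Different IS | - | 6 | 6 | 2 |
| C. thermocellum 27405 (NC_009012) | | | | |
| Total IS ORF | 75 | 143 | 144 | 160 |
| Complete ORF | - | 81 | 123 | 125 |
| Partial ORF | - | 43 | 11 | 27 |
| Pseudogene | - | 7 | 7 | 8 |
| Unknown ORF | - | 12 | 3 | 0 |
| Total IS | - | 115 | 115 | 119 |
| Different IS | - | 27 | 27 | 26 |
| S. maltophilia R5513 (NC_011071) | | | | |
| Total IS ORF | 11 | 21 | 22 | 20 |
| Complete ORF | - | 13 | 19 | 19 |
| Partial ORF | - | 7 | 1 | 1 |
| Pseudogene | - | 1 | 1 | 0 |
| Unknown ORF | - | 0 | 1 | 0 |
| Total IS | - | 18 | 19 | 16 |
| Different IS | - | 6 | 7 | 4 |
| S. maltophilia K279a (NC_010943) | | | | |
| Total IS ORF | 49 | 53 | 54 | 57 |
| Complete ORF | - | 18 | 45 | 47 |
| Partial ORF | - | 27 | 5 | 9 |
| Pseudogene | - | 3 | 3 | 1 |
| Unknown ORF | 3 | 5 | 1 | 0 |
| Total IS | - | 38 | 39 | 36 |
| Different IS | - | 18 | 19 | 18 |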

  1. The table compares the IS annotations of eight bacterial genomes contained in the corresponding GenBank files (GB) with those obtained by manual annotation (Manual) and by the ISsaga predictor run against two different IS reference databases. In one database (-IS), the reference ISs contained in the genome under test were removed; in the other (+IS), these ISs were included. The total number of IS-associated ORFs (Total IS ORF) is divided into four categories: Complete ORF, Partial ORF, Pseudogene and Unknown ORF. The 'Unknown ORF' category includes all examples that the predictor cannot classify as complete or partial because the reference database contains too few closely related examples. The categories 'Total IS' and 'Different IS' are based on nucleotide predictions. In these predictions, the number of ORFs carried by each IS is taken into account. For example, an IS that includes two ORFs is counted as two examples in 'Complete ORF' but as a single IS in 'Total IS'.
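The counting rule in the note (two ORFs in one IS give two entries under 'Complete ORF' but one under 'Total IS') can be illustrated with a minimal Python sketch. It is not ISsaga's implementation: the record layout, the IS names (`ISx1`, `ISx2`), the copy identifiers and the `summarize` helper are all hypothetical. It also derives the IS counts from ORF records for simplicity, whereas the table bases them on nucleotide predictions, and it reads 'Different IS' as the number of distinct IS names, which is an assumption.

```python
from collections import Counter

# Hypothetical records, one per IS-associated ORF:
# (IS copy identifier, IS name, ORF category). Not real ISsaga output.
orf_records = [
    ("copy1", "ISx1", "Complete ORF"),  # copy1 carries two ORFs
    ("copy1", "ISx1", "Complete ORF"),
    ("copy2", "ISx1", "Partial ORF"),   # second copy of the same IS
    ("copy3", "ISx2", "Pseudogene"),
]

def summarize(records):
    """Count ORFs per category, IS copies, and distinct IS names."""
    summary = dict(Counter(category for _, _, category in records))
    summary["Total IS ORF"] = len(records)
    # Each IS copy is counted once, however many ORFs it carries.
    summary["Total IS"] = len({copy_id for copy_id, _, _ in records})
    # Assumption: 'Different IS' means distinct IS names.
    summary["Different IS"] = len({name for _, name, _ in records})
    return summary

if __name__ == "__main__":
    for label, count in summarize(orf_records).items():
        print(f"{label}: {count}")
```

Here `copy1` contributes two entries to 'Complete ORF' but only one to 'Total IS', mirroring the example in the note.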