Table 2 Details on the types of mis-assemblies and feature characteristics for the results presented in Table 1

From: Genome assembly forensics: finding the elusive mis-assembly

  

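| Species | Len | Mis-assembly types: Ins | Del | Join | Inv | Mis-assembly signatures: Num | aLen | %Len | Suspicious regions: Num | aLen | %Len |
|---|---|---|---|---|---|---|---|---|---|---|---|
| B. anthracis | 5.2 | 0 | 0 | 2 | 0 | 1,336 | 831 | 21.5 | 127 | 5,546 | 13.6 |
| B. suis | 3.4 | 0 | 0 | 7 | 3 | 1,047 | 1,354 | 42.2 | 158 | 7,575 | 35.6 |
| C. burnetii | 2.0 | 0 | 0 | 13 | 9 | 1,375 | 1,106 | 74.3 | 124 | 11,455 | 69.4 |
| C. caviae | 1.4 | 0 | 0 | 11 | 1 | 625 | 320 | 14.1 | 50 | 3,896 | 13.7 |
| C. jejuni | 1.8 | 1 | 0 | 3 | 1 | 290 | 613 | 10.0 | 61 | 1,981 | 6.8 |
| D. ethenogenes | 1.8 | 0 | 0 | 8 | 4 | 688 | 691 | 26.5 | 88 | 4,116 | 20.2 |
| F. succinogenes | 4.0 | 0 | 1 | 19 | 1 | 1,670 | 1,387 | 57.5 | 266 | 7,396 | 48.8 |
| L. monocytogenes | 2.9 | 0 | 0 | 1 | 0 | 1,381 | 873 | 42.1 | 201 | 5,254 | 36.9 |
| M. capricolum | 1.0 | 3 | 0 | 0 | 0 | 83 | 835 | 6.8 | 16 | 3,005 | 4.7 |
| N. sennetsu | 0.9 | 0 | 0 | 0 | 0 | 91 | 512 | 5.4 | 13 | 2,328 | 3.5 |
| P. intermedia | 2.7 | 0 | 0 | 19 | 2 | 1,655 | 727 | 44.5 | 201 | 6,263 | 46.5 |
| P. syringae | 6.4 | 0 | 1 | 43 | 20 | 2,841 | 782 | 34.4 | 366 | 5,725 | 32.4 |
| S. agalactiae | 2.1 | 0 | 0 | 16 | 5 | 687 | 793 | 25.6 | 112 | 4,082 | 21.5 |
| S. aureus | 2.8 | 1 | 0 | 34 | 6 | 1,850 | 740 | 49.0 | 227 | 5,582 | 45.4 |
| W. pipientis | 3.3 | 0 | 0 | 17 | 14 | 761 | 1,206 | 28.1 | 132 | 6,395 | 25.8 |
| X. oryzae | 5.0 | 1 | 0 | 74 | 76 | 2,569 | 1,551 | 79.0 | 100 | 27,771 | 55.1 |
| Totals | 46.8 | 6 | 2 | 267 | 142 | 18,949 | 895 | 35.1 | 2,242 | 6,773 | 30.0 |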

  1. Phrap mis-assemblies are grouped into tandem insertion (Ins), tandem collapse (Del), mis-join (Join), and inversion (Inv) events in columns 3-6. Columns 7-9 give the total count (Num), average length (aLen), and total length as a percentage of the genome (%Len) for the amosvalidate mis-assembly signatures. Columns 10-12 give the same three statistics for the amosvalidate suspicious regions.
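
The three per-feature statistics defined in the footnote (Num, aLen, %Len) can be sketched as follows. This is a minimal illustration, not the amosvalidate implementation: the function name `summarize`, the example lengths, and the example genome size are hypothetical, and the sketch assumes features do not overlap, so that total length is a plain sum. The last lines cross-check one published row under the further assumption that the Len column is in Mbp, which the table itself does not state.

```python
def summarize(lengths, genome_len):
    """Return (Num, aLen, %Len) for a list of feature lengths in bp.

    Assumes features do not overlap, so total length is a simple sum.
    """
    num = len(lengths)
    total = sum(lengths)
    a_len = total / num if num else 0.0      # average feature length
    pct_len = 100.0 * total / genome_len     # share of the genome covered
    return num, a_len, pct_len


# Hypothetical feature lengths (bp) on a hypothetical 10,000 bp genome.
num, a_len, pct_len = summarize([500, 1000, 1500], 10_000)
print(num, a_len, pct_len)

# Rough cross-check of the B. anthracis signature row (Num=1,336, aLen=831),
# ASSUMING Len = 5.2 means 5.2 Mbp: Num * aLen / Len should approximate %Len.
estimate = 100.0 * 1336 * 831 / 5.2e6
print(round(estimate, 2))
```

Under these assumptions the estimate comes out near the reported 21.5%; the small gap is what one would expect from the rounding of Len and aLen in the table.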