
Table 2 Merqury quality and completeness statistics for Arabidopsis thaliana and human genome assemblies

From: Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies

 

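| Assembly | Continuity: Total (Mb) | Continuity: # Contigs | Continuity: Max (Mb) | Continuity: NG50 (Mb) | QV: Meryl | QV: Mash | QV: Map | k-mer completeness (%): All | k-mer completeness (%): Hap1 | k-mer completeness (%): Hap2 | PPV | BUSCO (%): Comp. | BUSCO (%): Dup. | BUSCO (%): Frag. | BUSCO (%): Mis. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A. thaliana Col x Cvi F1 | | | | | | | | | | | | | | | |
| TrioCanu Col | 124 | 215 | 13.1 | 4.6 | 35.1 | 34.6 | 37.5 | 83.8 | 99.5 | 0.7 | 99.3 | 98.2 | 1.6 | 0.5 | 1.3 |
| TrioCanu Cvi | 122 | 163 | 12.2 | 5.6 | 36.4 | 35.9 | 38.6 | 83.6 | 1.1 | 99.4 | 98.9 | 98.0 | 1.4 | 0.4 | 1.6 |
| TrioCanu Col + Cvi | 246 | 378 | 13.1 | 5.5 | 35.7 | 33.0 | 38.0 | 98.3 | 99.7 | 99.4 | 99.1 | 98.2 | 78.8 | 0.2 | 1.6 |
| FALCON-Unzip pri | 140 | 172 | 13.3 | 8.0 | 34.8 | 33.9 | 36.9 | 87.1 | 65.2 | 59.8 | N/A | 98.1 | 6.2 | 0.3 | 1.6 |
| FALCON-Unzip alt | 105 | 248 | 11.6 | 4.3 | 38.3 | 37.9 | 39.9 | 74.5 | 38.2 | 40.6 | N/A | 93.1 | 2.0 | 0.3 | 6.6 |
| FALCON-Unzip pri + alt | 245 | 420 | 13.3 | 6.9 | 36.0 | 33.5 | 37.9 | 97.8 | 99.1 | 98.2 | N/A | 98.1 | 93.2 | 0.3 | 1.6 |
| Canu | 248 | 2368 | 6.9 | 2.3 | 29.3 | 26.8 | 27.2 | 95.7 | 90.5 | 90.8 | N/A | 97.6 | 61.5 | 0.4 | 2.0 |
| H. sapiens NA12878 | | | | | | | | | | | | | | | |
| TrioCanu mat | 2749 | 7388 | 9.0 | 1.1 | 31.3 | 30.4 | 34.1 | 94.0 | 90.7 | 0.9 | 99.1 | 86.5 | 0.8 | 6.9 | 6.6 |
| TrioCanu pat | 2743 | 7252 | 11.5 | 1.1 | 31.0 | 30.1 | 34.1 | 93.7 | 1.0 | 91.0 | 98.8 | 85.1 | 0.7 | 7.8 | 7.1 |
| TrioCanu mat + pat | 5492 | 14,640 | 11.5 | 1.1 | 31.1 | 27.5 | 34.1 | 98.2 | 90.9 | 91.2 | 99.0 | 90.0 | 47.3 | 5.0 | 5.0 |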
  1. All lengths in the continuity statistics are in Mbp. Haploid genome sizes of 130 Mbp and 3.2 Gbp were used to compute NG50 for the A. thaliana and H. sapiens NA12878 haploid assemblies, respectively, and twice the haploid genome size was used for the combined assemblies. Merqury-specific column headers are in bold, including k-mer-based quality (QV) and completeness estimates. Merqury includes both exact (Meryl) and approximate (Mash) methods for measuring k-mer QV, while completeness uses only the exact k-mer counting method. Consensus QV scores are Phred-scaled, where QV = −10 log10(E) for a probability of error E at each base in the assembly. K-mer completeness is measured as the fraction of all distinct, reliable k-mers (All) and of haplotype-specific k-mers (Hap1, Hap2). Hap1 and Hap2 correspond to Col and Cvi in A. thaliana, and to maternal and paternal in NA12878. BUSCO v3 was run with embryophyta_odb9 (n = 1440) for A. thaliana and mammalia_odb9 (n = 4104) for NA12878.
  2. Map.: mapping; Comp.: complete; Dup.: duplicated; Frag.: fragmented; Mis.: missing.
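
The two definitions used in the footnotes, the Phred-scaled QV and NG50 against an assumed genome size, can be illustrated with a minimal Python sketch. This is not Merqury's code: the function names `error_to_qv`, `qv_to_error`, and `ng50`, and the toy contig lengths, are hypothetical and chosen only for illustration. NG50 is taken here in its usual sense, the length of the contig at which the cumulative length of the contigs, sorted from longest to shortest, first reaches half the genome size.

```python
import math


def error_to_qv(error):
    """Phred-scale a per-base error probability E: QV = -10 * log10(E)."""
    if not 0 < error <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(error)


def qv_to_error(qv):
    """Invert the Phred scale: E = 10 ** (-QV / 10)."""
    return 10 ** (-qv / 10)


def ng50(contig_lengths, genome_size):
    """Length of the contig at which the running total of contigs, sorted
    longest first, reaches half of genome_size.

    Returns None if the assembly covers less than half the genome size.
    """
    half = genome_size / 2
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return None


# Hypothetical toy assembly (lengths in Mbp), NOT data from the table,
# evaluated against the 130 Mbp haploid A. thaliana size from the footnote.
toy_contigs = [50, 40, 30, 20, 10]
print(ng50(toy_contigs, 130))   # 50 + 40 = 90 >= 65, so NG50 is 40
print(qv_to_error(35.1))        # per-base error implied by a QV of 35.1
```

Under this scale, a QV of 30 corresponds to one expected error per 1,000 bases, and the Meryl QV of 35.1 reported for the TrioCanu Col assembly corresponds to an error probability of roughly 3 × 10⁻⁴ per base. For a combined assembly, `genome_size` would be twice the haploid size, as the footnote states.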