
Table 1 Comparison of computational efficiency of Mash, BinDash, and Dashing at k=31 and various sketch sizes

From: Dashing: fast and accurate genomic distances with HyperLogLog

      

                                                           Dashing   Dashing   Dashing
Phase     Measure         k   log2(size)  Mash    BinDash  Original  Ertl-MLE  Ertl-JMLE
Sketch    Wall clock (s)  31  10          1345    1157     273       271       277
                              12          1349    1157     273       274       270
                              14          1356    1159     286       289       278
                              16          1400    1226     359       367       299
          Peak mem (MB)   31  10          17,720  141      12,683    12,721    12,644
                              12          18,296  399      12,723    12,430    12,726
                              14          19,706  1426     12,630    12,877    12,853
                              16          25,127  5542     12,888    12,412    12,933
Distance  Wall clock (s)  31  10          1901    74       80        100       601
                              12          2368    188      286       308       2139
                              14          3446    672      1113      1137      8308
                              16          8777    3603     6172      4251      30,506
          Peak mem (MB)   31  10          1120    409      116       116       116
                              12          1380    673      371       371       372
                              14          2785    1709     1392      1392      1392
                              16          10,776  5816     5476      5476      5476
Both      Wall clock (s)  31  10          3246    1231     345       365       870
                              12          3717    1345     557       579       2407
                              14          4801    1831     1390      1408      8574
                              16          10,177  4829     4394      4453      30,433
          Peak mem (MB)   31  10          17,720  409      12,468    12,950    12,988
                              12          18,296  673      12,958    13,042    13,020
                              14          19,706  1709     13,951    13,782    14,205
                              16          25,127  5816     18,320    18,081    18,011
  1. The log2(size) column reports the base-2 logarithm of the sketch size in bytes. “Both” results were obtained either by using a combined Sketch+Distance mode (for Dashing) or by combining results from separate sketching and distance-calculation invocations (for Mash and BinDash). Dashing was assessed using three estimation methods: Flajolet’s method using the harmonic mean (“Original”) and Ertl’s MLE and JMLE methods. Italicized entries indicate the lowest space or time for a given experiment.
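
The footnote does not say how the separate Mash and BinDash invocations were combined into the "Both" rows. The reported figures are consistent, to within rounding, with adding the two wall-clock times and taking the larger of the two peak-memory values. The sketch below checks that reading against the log2(size) = 10 rows; the `combine` helper and the combination rule itself are assumptions inferred from the numbers, not something the source states.

```python
# Values copied from the table rows for k=31, log2(size)=10.
# Each entry is (wall clock in seconds, peak memory in MB).
sketch = {"Mash": (1345, 17720), "BinDash": (1157, 141)}
distance = {"Mash": (1901, 1120), "BinDash": (74, 409)}


def combine(sketch_run, distance_run):
    """Combine two sequential invocations into one 'Both' figure.

    Assumed rule (inferred from the table, not stated in it):
    wall-clock times add, and peak memory is the larger of the two.
    """
    (t_sketch, m_sketch), (t_dist, m_dist) = sketch_run, distance_run
    return t_sketch + t_dist, max(m_sketch, m_dist)


both = {tool: combine(sketch[tool], distance[tool]) for tool in sketch}

# log2(size) = 10 corresponds to a sketch of 2**10 bytes.
sketch_bytes = 2 ** 10

print(both)
print(sketch_bytes)
```

Under this reading, the Mash and BinDash "Both" entries at log2(size) = 10 (3246 s and 17,720 MB; 1231 s and 409 MB) are reproduced exactly. The Dashing "Both" entries come from a single combined run and are not expected to follow the same rule.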