TY - STD TI - NCBI. SRA database growth. 2019 [cited 2019 August 8]; Available from: https://trace.ncbi.nlm.nih.gov/Traces/sra/. UR - https://trace.ncbi.nlm.nih.gov/Traces/sra/ ID - ref1 ER - TY - JOUR AU - Altschul, S. F. PY - 1990 DA - 1990// TI - Basic local alignment search tool JO - J Mol Biol VL - 215 UR - https://doi.org/10.1016/S0022-2836(05)80360-2 DO - 10.1016/S0022-2836(05)80360-2 ID - Altschul1990 ER - TY - JOUR AU - Ondov, B. D. PY - 2016 DA - 2016// TI - Mash: fast genome and metagenome distance estimation using MinHash JO - Genome Biol VL - 17 UR - https://doi.org/10.1186/s13059-016-0997-x DO - 10.1186/s13059-016-0997-x ID - Ondov2016 ER - TY - JOUR AU - Zhao, X. PY - 2019 DA - 2019// TI - BinDash, software for fast genome distance estimation on a typical personal laptop JO - Bioinformatics VL - 35 UR - https://doi.org/10.1093/bioinformatics/bty651 DO - 10.1093/bioinformatics/bty651 ID - Zhao2019 ER - TY - BOOK AU - Broder, A. Z. PY - 1998 DA - 1998// TI - On the resemblance and containment of documents. Compression and complexity of sequences 1997 - Proceedings ID - Broder1998 ER - TY - JOUR AU - Berlin, K. PY - 2015 DA - 2015// TI - Assembling large genomes with single-molecule sequencing and locality-sensitive hashing JO - Nat Biotechnol VL - 33 UR - https://doi.org/10.1038/nbt.3238 DO - 10.1038/nbt.3238 ID - Berlin2015 ER - TY - JOUR AU - Jain, C. PY - 2018 DA - 2018// TI - A fast adaptive algorithm for computing whole-genome homology maps JO - Bioinformatics VL - 34 UR - https://doi.org/10.1093/bioinformatics/bty597 DO - 10.1093/bioinformatics/bty597 ID - Jain2018 ER - TY - JOUR AU - Li, H. PY - 2016 DA - 2016// TI - Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences JO - Bioinformatics VL - 32 UR - https://doi.org/10.1093/bioinformatics/btw152 DO - 10.1093/bioinformatics/btw152 ID - Li2016 ER - TY - JOUR AU - Ondov, B. D. 
PY - 2019 DA - 2019// TI - Mash Screen: high-throughput sequence containment estimation for genome discovery JO - Genome Biol VL - 20 UR - https://doi.org/10.1186/s13059-019-1841-x DO - 10.1186/s13059-019-1841-x ID - Ondov2019 ER - TY - STD TI - Huiguang Yi, Yanling Lin, Chengqi Lin, Wenfei Jin. Kssd: Sequence dimensionality-reduction by K-mer substring space sampling enables real-time large-scale dataset analysis. GitHub. https://github.com/yhg926/public_kssd. 2021. UR - https://github.com/yhg926/public_kssd ID - ref10 ER - TY - STD TI - Huiguang Yi, Yanling Lin, Chengqi Lin, Wenfei Jin. Kssd: Sequence dimensionality-reduction by K-mer substring space sampling enables real-time large-scale dataset analysis. Zenodo. https://doi.org/10.5281/zenodo.4438337. 2021. UR - https://doi.org/10.5281/zenodo.4438337 ID - ref11 ER - TY - JOUR AU - Shakya, M. PY - 2013 DA - 2013// TI - Comparative metagenomic and rRNA microbial diversity characterization using archaeal and bacterial synthetic communities JO - Environ Microbiol VL - 15 UR - https://doi.org/10.1111/1462-2920.12086 DO - 10.1111/1462-2920.12086 ID - Shakya2013 ER - TY - STD TI - Fan H, et al. An assembly and alignment-free method of phylogeny reconstruction from next-generation sequencing data. BMC Genomics. 2015;16:522. ID - ref13 ER - TY - STD TI - NCBI. RefSeq Growth Statistics. Available from: https://www.ncbi.nlm.nih.gov/refseq/statistics/. Accessed 8 Aug 2019. UR - https://www.ncbi.nlm.nih.gov/refseq/statistics/ ID - ref14 ER - TY - JOUR AU - Jain, C. PY - 2018 DA - 2018// TI - High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries JO - Nat Commun VL - 9 UR - https://doi.org/10.1038/s41467-018-07641-9 DO - 10.1038/s41467-018-07641-9 ID - Jain2018 ER - TY - JOUR AU - Li, H. AU - Durbin, R. 
PY - 2009 DA - 2009// TI - Fast and accurate short read alignment with Burrows-Wheeler transform JO - Bioinformatics VL - 25 UR - https://doi.org/10.1093/bioinformatics/btp324 DO - 10.1093/bioinformatics/btp324 ID - Li2009 ER - TY - JOUR AU - Fort, A. PY - 2017 DA - 2017// TI - MBV: a method to solve sample mislabeling and detect technical bias in large combined genotype and sequencing assay datasets JO - Bioinformatics VL - 33 UR - https://doi.org/10.1093/bioinformatics/btx074 DO - 10.1093/bioinformatics/btx074 ID - Fort2017 ER - TY - JOUR AU - McKenna, A. PY - 2010 DA - 2010// TI - The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data JO - Genome Res VL - 20 UR - https://doi.org/10.1101/gr.107524.110 DO - 10.1101/gr.107524.110 ID - McKenna2010 ER - TY - STD TI - Yu YW, Weber GM. HyperMinHash: MinHash in LogLog space. 2017. https://arxiv.org/abs/1710.08436. Accessed 28 June 2020. UR - https://arxiv.org/abs/1710.08436 ID - ref19 ER - TY - JOUR AU - Baker, D. N. AU - Langmead, B. PY - 2019 DA - 2019// TI - Dashing: fast and accurate genomic distances with HyperLogLog JO - Genome Biol VL - 20 UR - https://doi.org/10.1186/s13059-019-1875-0 DO - 10.1186/s13059-019-1875-0 ID - Baker2019 ER - TY - STD TI - Ertl O. SuperMinHash - A new minwise hashing algorithm for Jaccard similarity estimation. 2017. http://arxiv.org/abs/1706.05698. Accessed 28 June 2020. UR - http://arxiv.org/abs/1706.05698 ID - ref21 ER - TY - STD TI - Ertl O. BagMinHash - minwise hashing algorithm for weighted sets. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2018. ID - ref22 ER - TY - JOUR AU - Pierce, N. T. PY - 2019 DA - 2019// TI - Large-scale sequence comparisons with sourmash JO - F1000Res VL - 8 UR - https://doi.org/10.12688/f1000research.19675.1 DO - 10.12688/f1000research.19675.1 ID - Pierce2019 ER - TY - JOUR AU - Bradley, P. 
PY - 2019 DA - 2019// TI - Ultrafast search of all deposited bacterial and viral genomic data JO - Nat Biotechnol VL - 37 UR - https://doi.org/10.1038/s41587-018-0010-1 DO - 10.1038/s41587-018-0010-1 ID - Bradley2019 ER - TY - STD TI - Fisher RA, Yates F. Statistical tables for biological, agricultural and medical research. Oxford, England: Oliver & Boyd; 1938. p. 90. ID - ref25 ER - TY - JOUR AU - Yi, H. AU - Jin, L. PY - 2013 DA - 2013// TI - Co-phylog: an assembly-free phylogenomic approach for closely related organisms JO - Nucleic Acids Res VL - 41 UR - https://doi.org/10.1093/nar/gkt003 DO - 10.1093/nar/gkt003 ID - Yi2013 ER -