Term | Definition |
---|---|
Bit-pattern observable | The run of 0s in a binary string |
Bit vector | An array data structure that holds bits |
Canonical k-mer | Whichever of a k-mer and its reverse complement has the smaller hash value |
Hash function | A function that takes input data of arbitrary size and maps it to a bit string that is of fixed size and typically smaller than the input |
Jaccard similarity | A similarity measure defined as the size of the intersection of two sets, divided by the size of their union |
K-mer decomposition | The process of extracting all sub-sequences of length k from a sequence |
Minimizer | The smallest hash value in a set |
Multiset | A set that allows for multiple instances of each of its elements (i.e. element frequency) |
Register | A quickly accessible bit vector used to hold information |
Sketch | A compact data structure that approximates a data set |
Stochastic averaging | A process used to reduce the variance of an estimator by splitting the input into several sub-streams and averaging the estimates obtained from each |
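
Several of the terms above can be made concrete with a short Python sketch. It is a minimal illustration under stated assumptions, not an implementation taken from any particular tool: the function names (`reverse_complement`, `hash64`, `canonical_hashes`, `jaccard`, `leading_zeros`) are hypothetical, BLAKE2b truncated to 64 bits is an arbitrary stand-in for whatever hash function a real sketching library uses, and the bit-pattern observable is assumed to be the run of leading 0s, as in HyperLogLog-style estimators.

```python
import hashlib

# Assumption: sequences contain only the upper-case bases A, C, G and T.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of a DNA string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def hash64(kmer):
    """Hash function: map a string of any length to a fixed 64-bit integer.

    BLAKE2b is an illustrative choice only.
    """
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def canonical_hashes(sequence, k):
    """K-mer decomposition, keeping one canonical hash per k-mer.

    For each k-mer, the smaller of its hash and the hash of its
    reverse complement is kept, so both strands give the same set.
    """
    hashes = set()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        hashes.add(min(hash64(kmer), hash64(reverse_complement(kmer))))
    return hashes


def jaccard(a, b):
    """Jaccard similarity: |a & b| / |a | b| (two empty sets count as identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def leading_zeros(value, bits=64):
    """Bit-pattern observable (assumed form): run of leading 0s in a fixed-width value."""
    return bits - value.bit_length()


if __name__ == "__main__":
    seq = "ACGTTGCAAGGCTTAACG"
    forward = canonical_hashes(seq, k=5)
    reverse = canonical_hashes(reverse_complement(seq), k=5)
    print(jaccard(forward, reverse))   # identical sets, so similarity is 1.0
    print(min(forward))                # the minimizer of this hash set
    print(leading_zeros(min(forward)))
```

Because each k-mer is represented by its canonical hash, a sequence and its reverse complement decompose into the same set, which is why their Jaccard similarity is 1.0 in the usage example. Note that this computes the exact Jaccard similarity over full hash sets; a sketch would retain only a small subset of these hashes to approximate it.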