Table 3 Dynamic programming matrix with distributions of the number of adjacent UMIs

From: dropEst: pipeline for accurate estimation of molecular counts in droplet-based single-cell RNA-seq experiments

| \( K \,\backslash\, S_g \) | 1 | 2 | 3 | … | \( S_g \) |
| --- | --- | --- | --- | --- | --- |
| 0 | 1 | \( 1-{p}_{Neighb} \) | \( {\left(1-{p}_{Neighb}\right)}^2 \) | … | \( {\left(1-{p}_{Neighb}\right)}^{S_g-1} \) |
| 1 | 0 | \( {p}_{Neighb} \) | \( \left(1-{p}_{Neighb}\right)\ast {p}_{Neighb}+{p}_{Neighb}\ast \left(1-{p}_{Neighb}\frac{K-1}{K}\right) \) | … | \( {t}_{0,S-1}\ast {p}_{Neighb}+{t}_{1,S-1}\ast \left(1-{p}_{Neighb}\frac{K-1}{K}\right) \) |
| 2 | 0 | 0 | \( {p}_{Neighb}^2\frac{K-1}{K} \) | … | \( {t}_{1,S-1}\ast {p}_{Neighb}\frac{K-1}{K}+{t}_{2,S-1}\ast \left(1-{p}_{Neighb}\frac{K-2}{K}\right) \) |
| … | … | … | … | … | … |
| k | 0 | 0 | 0 | … | \( {t}_{k-1,S-1}\ast {p}_{Neighb}\frac{K-k+1}{K}+{t}_{k,S-1}\ast \left(1-{p}_{Neighb}\frac{K-k}{K}\right) \) |
| … | … | … | … | … | … |
| K | 0 | 0 | 0 | … | \( {t}_{K-1,S-1}\frac{p_{Neighb}}{K}+{t}_{K,S-1} \) |

  1. Here, \( K \) is the maximum number of adjacent UMIs and \( S_g \) is the maximum number of molecules per gene. A cell \( t_{k,s} \) of the matrix contains the probability of observing \( k \) adjacent UMIs for a fixed UMI in a cell of size \( s \).
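
The entries of the table follow one recurrence, \( t_{k,s} = t_{k-1,s-1}\, p_{Neighb}\frac{K-k+1}{K} + t_{k,s-1}\left(1-p_{Neighb}\frac{K-k}{K}\right) \), starting from \( t_{0,1}=1 \). The following is a minimal Python sketch of filling the matrix under that reading of the table. It is not the dropEst implementation; the function name `adjacent_umi_matrix` and its argument names are hypothetical, and it assumes \( K \ge 1 \) and \( S_g \ge 1 \).

```python
def adjacent_umi_matrix(p_neighb, K, S_g):
    """Fill the matrix t, where t[k][s - 1] holds t_{k,s} from the table.

    Hypothetical helper: rows are k = 0..K, columns are s = 1..S_g.
    """
    t = [[0.0] * S_g for _ in range(K + 1)]
    t[0][0] = 1.0  # first column (s = 1): zero adjacent UMIs with certainty

    for col in range(1, S_g):  # col is the 0-based index of size s = col + 1
        for k in range(K + 1):
            # Stay at k adjacent UMIs: factor 1 - p * (K - k) / K.
            stay = t[k][col - 1] * (1.0 - p_neighb * (K - k) / K)
            # Move from k - 1 to k adjacent UMIs: factor p * (K - k + 1) / K.
            move = 0.0
            if k > 0:
                move = t[k - 1][col - 1] * p_neighb * (K - k + 1) / K
            t[k][col] = stay + move
    return t


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    matrix = adjacent_umi_matrix(p_neighb=0.2, K=3, S_g=4)
    for k, row in enumerate(matrix):
        print(k, [round(x, 4) for x in row])
```

Because the two factors leaving any state \( k \) add up to one, each column of the resulting matrix sums to 1 up to floating-point error, which gives a quick sanity check on the fill.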