Figure 1 | Genome Biology

Figure 1

From: Bayesian transcriptome assembly

Outline of the Bayesembler algorithm. (1) A splice graph is first constructed from the RNA-seq data. (2) Transcript candidates are then enumerated by exhaustively searching paths in the graph. (3–5) Gibbs sampling is used to infer the posterior distribution over expressed candidates, their abundances and the assignments of reads to candidates. (3) The sampler is initialised by randomly sampling a candidate assignment for each paired-end read and proceeds as follows. (4a) First, candidate expression values (i.e., binary values indicating whether the candidate has non-zero abundance) are sampled conditioned on the current read assignments. (4b) Next, abundance values are sampled conditioned on the expression values and read assignments. (4c) Finally, read-to-candidate assignments are sampled conditioned on the candidate expression and abundance values and on the conditional probabilities of observing the reads given the candidates. (5a,b) The fraction of sampling iterations in which a candidate transcript is expressed and its mean abundance across those iterations are then used to estimate candidate confidence and abundance, respectively. (6) The final assembly is produced by selecting the transcript candidates with the highest confidence.
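The sampling loop in steps 3 to 6 can be illustrated with a minimal sketch. This is not the Bayesembler implementation: it assumes a simplified model with an independent Bernoulli prior (`pi`) on expression and a symmetric Dirichlet prior (`gamma`) on the abundances of expressed candidates, and it takes the read likelihoods `read_lik` as given. The function names, parameter values, `burn_in` period and confidence `threshold` are all hypothetical. Splice-graph construction and path enumeration (steps 1 and 2) are not covered.

```python
import math
import random


def gibbs_sketch(read_lik, n_iter=500, burn_in=100, pi=0.5, gamma=1.0, seed=0):
    """Toy sampler. read_lik[n][k] = P(read n | candidate k), assumed given.

    pi, gamma and burn_in are illustrative assumptions, not values from the
    paper. Every read needs non-zero likelihood under at least one candidate.
    """
    rng = random.Random(seed)
    n_reads, n_cand = len(read_lik), len(read_lik[0])

    # (3) Initialise each read with a random candidate that can explain it.
    assign = [rng.choice([k for k in range(n_cand) if read_lik[n][k] > 0])
              for n in range(n_reads)]
    z = [1] * n_cand                  # expression indicators
    times_expressed = [0] * n_cand
    abundance_sum = [0.0] * n_cand

    for it in range(n_iter):
        counts = [0] * n_cand
        for k in assign:
            counts[k] += 1

        # (4a) Expression given assignments, with abundances integrated out.
        # A candidate holding reads must be expressed; an empty candidate is
        # toggled using the Dirichlet-multinomial ratio for one extra slot.
        for k in range(n_cand):
            if counts[k] > 0:
                z[k] = 1
                continue
            s0 = sum(z) - z[k]        # other expressed candidates (>= 1)
            log_odds = (math.log(pi) - math.log(1.0 - pi)
                        + math.lgamma(gamma * (s0 + 1))
                        - math.lgamma(gamma * (s0 + 1) + n_reads)
                        - math.lgamma(gamma * s0)
                        + math.lgamma(gamma * s0 + n_reads))
            p_on = 1.0 / (1.0 + math.exp(-log_odds))
            z[k] = 1 if rng.random() < p_on else 0

        # (4b) Abundances given expression and assignments: a Dirichlet draw
        # over expressed candidates, built from normalised gamma variates.
        draws = [rng.gammavariate(gamma + counts[k], 1.0) if z[k] else 0.0
                 for k in range(n_cand)]
        total = sum(draws)
        theta = [d / total for d in draws]

        # (4c) Assignments given abundances and read likelihoods.
        for n in range(n_reads):
            weights = [theta[k] * read_lik[n][k] for k in range(n_cand)]
            assign[n] = rng.choices(range(n_cand), weights=weights)[0]

        # Accumulate statistics after the (assumed) burn-in period.
        if it >= burn_in:
            for k in range(n_cand):
                if z[k]:
                    times_expressed[k] += 1
                    abundance_sum[k] += theta[k]

    kept = n_iter - burn_in
    # (5a) Confidence: fraction of kept iterations with the candidate expressed.
    confidence = [t / kept for t in times_expressed]
    # (5b) Abundance: mean over the iterations in which it was expressed.
    abundance = [abundance_sum[k] / times_expressed[k] if times_expressed[k] else 0.0
                 for k in range(n_cand)]
    return confidence, abundance


def select_candidates(confidence, threshold=0.5):
    # (6) Keep candidates whose confidence reaches a hypothetical cutoff.
    return [k for k, c in enumerate(confidence) if c >= threshold]


# Toy input: 30 reads explained only by candidate 0, 10 only by candidate 1,
# and a third candidate that explains no reads at all.
reads = [[0.9, 0.0, 0.0]] * 30 + [[0.0, 0.8, 0.0]] * 10
conf, abund = gibbs_sketch(reads)
print(select_candidates(conf))
```

In this toy input the third candidate never receives a read, so it is only switched on occasionally through the prior and ends up with low confidence, while the first two candidates are expressed in every kept iteration. The expression step samples with the abundances integrated out before the abundances are redrawn, which is one way to keep the three conditional updates consistent under the assumed priors.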
