Figure 8 | Genome Biology

Figure 8

From: BoCaTFBS: a boosted cascade learner to refine the binding sites suggested by ChIP-chip experiments

A simple example of the alternating decision tree (ADTboost). The alternating decision tree contains splitter nodes (squares, each associated with a test) and prediction nodes (circles, each associated with a value). Each prediction node represents the result of a weak prediction rule, and the number in it is that rule's contribution to the prediction score. In this example, negative contributions are evidence of nonbinding, whereas positive contributions are evidence of binding. The weak prediction rules are built from position and nucleotide features. To evaluate the prediction for a particular DNA sequence, we begin at the top node and follow the arrows down, summing the values of all the prediction nodes reached. This sum is the prediction score, and by default its sign is the prediction. For instance, in the DNA sequence AACGCTAATA, the nucleotide at position 1 is A, the nucleotide at position 3 is C, position 4 is not A, position 5 is not T, position 6 is not G, and position 7 is A. Applying the alternating decision tree in the figure to this sequence, we reach the following prediction nodes: +0.541 (A at position 1); +0.425 and -0.444 (C at position 3, followed by not A at position 4); +0.441 and -0.167 (not T at position 5, followed by not G at position 6); and +0.138 (A at position 7). Position 2 and the remaining positions are not consulted because no rule in this tree tests them. The overall sum of all the nodes is +0.803, a confident score indicating that this sequence is predicted to be a binding site.
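The scoring procedure in the caption can be sketched in a few lines of Python. This is an illustrative sketch only, not the BoCaTFBS implementation: the names `PREDICTION_NODES`, `reached`, and `score` are hypothetical, and the table contains only the six prediction nodes that the caption itemizes. Nodes on the untaken branches, and any other node drawn in the figure but not listed in the text, are omitted because their values are not given here.

```python
# Each entry is one prediction node itemized in the caption:
# (path, value). The path lists the splitter outcomes that must all hold
# for the node to be reached, as (position, nucleotide, expected_match),
# with 1-based positions as in the caption.
# Assumption: only nodes named in the caption text are modeled.
PREDICTION_NODES = [
    ([(1, "A", True)], 0.541),                      # A at position 1
    ([(3, "C", True)], 0.425),                      # C at position 3
    ([(3, "C", True), (4, "A", False)], -0.444),    # ...then not A at position 4
    ([(5, "T", False)], 0.441),                     # not T at position 5
    ([(5, "T", False), (6, "G", False)], -0.167),   # ...then not G at position 6
    ([(7, "A", True)], 0.138),                      # A at position 7
]


def reached(sequence, path):
    """Return True if every splitter outcome on the path holds for the sequence."""
    return all(
        (sequence[position - 1] == nucleotide) == expected
        for position, nucleotide, expected in path
    )


def score(sequence, nodes=PREDICTION_NODES):
    """Sum the values of all modeled prediction nodes reached by the sequence."""
    return sum(value for path, value in nodes if reached(sequence, path))


if __name__ == "__main__":
    total = score("AACGCTAATA")
    label = "binding" if total > 0 else "nonbinding"
    print(f"{total:+.3f} -> {label}")  # prints "+0.934 -> binding"
```

The six itemized contributions sum to +0.934, whereas the caption reports an overall score of +0.803. The difference may come from a prediction node shown in the figure but not itemized in the text, such as a root prediction node; that is an assumption, since the figure itself is not reproduced here. Either way the score is positive, so the predicted class is the same.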
