From: Boosting with stumps for predicting transcription start sites

| | Classifier type | Features |
|---|---|---|
| CpG | P versus U | Log-likelihood ratios from third order Markov chain, log-likelihood ratios from TSS weight matrix |
| | | GC-box score, weighted score of transcription factor NFY, weighted energy score at position +1 |
| | | Weighted score of transcription factor YY1, TATA score, weighted score of transcription factor ELK1 |
| | | MTE score, weighted score of transcription factor CREB |
| | P versus D | Log-likelihood ratios from third order Markov chain, GC-box score |
| | | Weighted score of transcription factor NFY |
| | | Log-likelihood ratios from TSS weight matrix |
| | | Difference between the energy score around positions -25 and +1 and the average from surroundings |
| | | Log-likelihood ratios from transcription factor ELK1, frequency of G+C |
| | | Log-likelihood ratios from transcription factor YY1, TATA score, frequency of G |
| Non-CpG | P versus U | Correlation between vector of energy scores and empirical average energy profile |
| | | Log-likelihood ratios from third order Markov chain, TATA score |
| | | Difference between the energy score around positions -25 and +1 and the average from surroundings |
| | | Weighted energy at position +1 |
| | | Proportion of Inr and GC-box pair within 10 bp of observed distance, Inr score |
| | P versus D | Correlation between vector of energy scores and empirical average energy profile, TATA score |
| | | Log-likelihood ratios from third order Markov chain |
| | | Weighted energy at position +1 |
| | | Correlation between vector of flexibility scores and empirical average flexibility profile, Inr score |
| | | Difference between the flexibility score around position +1 and the average from surroundings, GC-box score |