Table 3 List of selected features for different training sets

From: DDIG-in: discriminating between disease-associated and neutral non-frameshifting micro-indels

| Deletions | Insertions | Indels | Non-redundant indels |
| --- | --- | --- | --- |
| Disorder (min) | Disorder (min) | Disorder (min) | Disorder (min) |
| DNA conservation (max) | DNA conservation (max) | DNA conservation (max) | DNA conservation (max) |
| Deletion length | P(m-i)ᵉ (min) | ΔSᵈ | ΔSᵈ |
| ASAᵃ (min) | ΔSᵈ | Neffᶜ (ave) | Neffᶜ (min) |
| P(m-d)ᵇ (ave) | P(m-i)ᵉ (ave) | Indel length | ASAᵃ (ave) |
| Neffᶜ (min) | Disorder (ave) | Distance to the nearest splicing site (upstream) | Indel length |
| Distance to the nearest splicing site (downstream) | Helical probability (max) | ASAᵃ (max) | ASAᵃ (max) |
| ASAᵃ (max) | P(m-m)ᶠ (ave) | Neffᶜ (min) | P(m-m)ᶠ (max) |
| ΔSᵈ | | DNA conservation (ave) | ASA (ave) |

   
ᵃ ASA: solvent accessible surface area. ᵇ P(m-d): match-to-deletion transition probability. ᶜ Neff: the number of effective homologous sequences aligned to residues. ᵈ ΔS: indel-induced change to alignment score. ᵉ P(m-i): match-to-insertion transition probability. ᶠ P(m-m): match-to-match transition probability.
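
To see which features recur across the four training sets, the table can be transcribed into a small lookup and intersected. The following is only an illustrative sketch: the names `SELECTED_FEATURES` and `shared_features` are hypothetical (they do not come from the paper), the feature lists are copied from the table cell by cell with footnote markers dropped, and nothing here reproduces the authors' feature-selection procedure.

```python
# Feature lists transcribed from Table 3, one list per training set.
# SELECTED_FEATURES and shared_features are illustrative names only.
SELECTED_FEATURES = {
    "Deletions": [
        "Disorder (min)", "DNA conservation (max)", "Deletion length",
        "ASA (min)", "P(m-d) (ave)", "Neff (min)",
        "Distance to the nearest splicing site (downstream)",
        "ASA (max)", "ΔS",
    ],
    "Insertions": [
        "Disorder (min)", "DNA conservation (max)", "P(m-i) (min)", "ΔS",
        "P(m-i) (ave)", "Disorder (ave)", "Helical probability (max)",
        "P(m-m) (ave)",
    ],
    "Indels": [
        "Disorder (min)", "DNA conservation (max)", "ΔS", "Neff (ave)",
        "Indel length", "Distance to the nearest splicing site (upstream)",
        "ASA (max)", "Neff (min)", "DNA conservation (ave)",
    ],
    "Non-redundant indels": [
        "Disorder (min)", "DNA conservation (max)", "ΔS", "Neff (min)",
        "ASA (ave)", "Indel length", "ASA (max)", "P(m-m) (max)",
        "ASA (ave)",
    ],
}


def shared_features(feature_sets):
    """Return the features present in every training set.

    The result keeps the order of the first training set's list.
    """
    lists = list(feature_sets.values())
    common = set(lists[0]).intersection(*lists[1:])
    return [name for name in lists[0] if name in common]


shared = shared_features(SELECTED_FEATURES)
print(shared)
```

Under this transcription, three features appear in all four columns: Disorder (min), DNA conservation (max), and ΔS.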