Fig. 2 | Genome Biology

Fig. 2

From: TADA—a machine learning tool for functional annotation-based prioritisation of pathogenic CNVs

Generalised workflow of the TADA tool. The CNV annotation is based on BED files of TAD boundaries and additional sets of genomic annotations, e.g. gene coordinates. In a first step, the annotation sets are sorted into the corresponding TAD environment based on genomic position. The resulting annotated TAD regions are used as a proxy for the regulatory environment during the CNV annotation (“TAD-aware annotation”). The default feature set for the CNV annotation process consists of features describing the distance to genomic elements such as genes and enhancers in the same TAD environment, as well as metrics describing functional relevance, e.g. conservation scores of affected coding or regulatory elements. Alternatively, the user can provide a set of BED files containing the coordinates of genomic elements, from which a new feature set, i.e. the distance of CNVs to these annotations, is generated. The user can then manually prioritise CNVs based on the distance features. If the default feature set is used, TADA also allows for automated prioritisation using the pathogenicity score computed by our pre-trained random forest model.
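
The TAD-aware distance feature described in the caption can be illustrated with a minimal Python sketch. This is not TADA's actual code or API: the names `Interval`, `sort_into_tads` and `distance_feature` are hypothetical, TADs are assumed to be given as intervals rather than derived from boundary positions, and elements that span a boundary are assumed to belong to every TAD they overlap. The toy coordinates are invented for illustration.

```python
from typing import Dict, List, NamedTuple, Optional


class Interval(NamedTuple):
    """A genomic interval in BED convention (0-based start, exclusive end)."""
    chrom: str
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    """True if both intervals lie on the same chromosome and share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def interval_distance(a: Interval, b: Interval) -> int:
    """Gap in base pairs between two same-chromosome intervals (0 if they overlap or touch)."""
    return max(0, a.start - b.end, b.start - a.end)


def sort_into_tads(
    tads: List[Interval], elements: List[Interval]
) -> Dict[Interval, List[Interval]]:
    """Assign each element to every TAD it overlaps (assumption, see lead-in)."""
    return {tad: [e for e in elements if overlaps(tad, e)] for tad in tads}


def distance_feature(
    cnv: Interval, annotated_tads: Dict[Interval, List[Interval]]
) -> Optional[int]:
    """Distance from the CNV to the nearest element that shares a TAD with it.

    Returns None when no element lies in any TAD overlapped by the CNV.
    """
    distances = [
        interval_distance(cnv, element)
        for tad, tad_elements in annotated_tads.items()
        if overlaps(tad, cnv)
        for element in tad_elements
    ]
    return min(distances) if distances else None


# Toy data (invented): two adjacent TADs, one gene in each.
tads = [Interval("chr1", 0, 1000), Interval("chr1", 1000, 2000)]
genes = [Interval("chr1", 100, 200), Interval("chr1", 1050, 1100)]
cnv = Interval("chr1", 700, 900)

annotated = sort_into_tads(tads, genes)
gene_distance = distance_feature(cnv, annotated)  # 500
```

In this toy case the gene at 1050 to 1100 is linearly closer to the CNV (150 bp) than the gene at 100 to 200 (500 bp), but it sits in the neighbouring TAD and is therefore ignored, which is the point of using the TAD as a proxy for the regulatory environment.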
