Figure 1 | Genome Biology

Figure 1

From: Prioritizing causal disease genes using unbiased genomic features


A decision tree-based approach for causal gene prediction. (A) Mapping of SNPs to neighboring genes using a combination of linkage disequilibrium (LD) information and the location of recombination hotspots. (B) Workflow for applying OPEN to causal gene prediction at GWA loci. GWA loci are represented by horizontal bars, with individual genes represented by vertical bars. The height of each bar represents the probability that the gene is causal for the phenotype of interest. Initially, all probabilities are equal. Probabilities are then preliminarily updated based on physical distance from the index variant or, optionally, on whether prior experimental evidence links a gene to the phenotype of interest. These probabilities are used to sample positive training examples at each locus during the construction of decision trees. After a 'burn-in' phase, only genes meeting a probability threshold are used as positive training examples. The output of the analysis, obtained through cross-validation, is the log-odds of disease association for every gene in the genome. GBM, gradient boosting machine. (C) Representation of a sample decision tree used for partitioning positive and negative training examples. A classifier consists of multiple decision trees combined in an additive manner.
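The probability handling described for panel (B) can be illustrated with a minimal sketch. This is not the OPEN implementation: the function names, the exponential distance weighting, the `scale_bp` parameter, and the example gene names and distances are all hypothetical choices made for illustration, since the caption does not specify how distance updates the probabilities.

```python
import math
import random


def initial_probabilities(n_genes):
    """All genes at a locus start with an equal probability of being causal."""
    return [1.0 / n_genes] * n_genes


def update_by_distance(probs, distances_bp, scale_bp=100_000.0):
    """Reweight probabilities so genes nearer the index variant score higher.

    The exponential decay and scale_bp are assumptions for this sketch,
    not the weighting used by the authors.
    """
    weights = [p * math.exp(-d / scale_bp) for p, d in zip(probs, distances_bp)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_positive(genes, probs, rng, threshold=None):
    """Sample one positive training example at a locus.

    With threshold=None every gene is eligible and is drawn in proportion
    to its probability. With a threshold (the post burn-in case), only genes
    meeting it are eligible; None is returned if no gene qualifies.
    """
    candidates = list(zip(genes, probs))
    if threshold is not None:
        candidates = [(g, p) for g, p in candidates if p >= threshold]
        if not candidates:
            return None
    names = [g for g, _ in candidates]
    weights = [p for _, p in candidates]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical locus with three genes and their distances (bp) from the index variant.
genes = ["geneA", "geneB", "geneC"]
probs = initial_probabilities(len(genes))
probs = update_by_distance(probs, [5_000, 150_000, 400_000])

rng = random.Random(0)
burn_in_pick = sample_positive(genes, probs, rng)
post_burn_in_pick = sample_positive(genes, probs, rng, threshold=0.5)
```

In this sketch, sampling during burn-in can select any gene at the locus, while the thresholded call restricts positives to genes whose probability has risen high enough, mirroring the two phases named in the caption.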
