Figure 1 | Genome Biology

Figure 1

From: Comparative analysis indicates regulatory neofunctionalization of yeast duplicates


Method for comparative analysis of gene expression. Given expression matrices for two species where rows correspond to genes and columns correspond to conditions, we first find one-to-one ortholog matches between the two species and arrange the matrices such that equivalent rows represent the expression patterns of orthologs. Note that after this step the two matrices have the same number of rows, but not necessarily the same number of conditions, and the conditions are not comparable. Next, the Pearson correlation coefficient (PCC) is calculated for each pair of genes over all conditions, generating the correlation matrices $R^{cer}_{g,g}$ and $R^{can}_{g,g}$. Each row in these matrices corresponds to the correlations between one gene and all other genes (with orthologs) from the same genome. Equivalent rows in the two matrices correspond to the correlation vectors of a pair of orthologs with all other orthologs from the respective genomes. The correlation between these vectors of correlations is defined as the initial estimation of expression conservation (EC0). EC scores are then iteratively refined by calculating weighted Pearson correlation coefficients (PCCw) where EC scores from the previous iteration are used as weights and genes with negative weights are excluded from the calculation.
This procedure is repeated until convergence of the EC scores ($EC_k \approx EC_{k-1}$). The iterative procedure can also be initiated from random weights to verify the convergence to a global minimum (see Materials and methods).
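
The procedure in this caption can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (`weighted_pearson`, `expression_conservation`), the convergence tolerance `tol`, the iteration cap `max_iter`, and the choice to drop each gene's self-correlation from its own vector are all assumptions introduced here. Starting from uniform weights makes the first pass an ordinary (unweighted) Pearson correlation, which corresponds to the initial estimate EC0.

```python
import numpy as np


def weighted_pearson(x, y, w):
    """Weighted Pearson correlation of x and y with non-negative weights w."""
    w = w / w.sum()                      # normalize weights to sum to 1
    mx, my = np.sum(w * x), np.sum(w * y)
    cov = np.sum(w * (x - mx) * (y - my))
    vx = np.sum(w * (x - mx) ** 2)
    vy = np.sum(w * (y - my) ** 2)
    return cov / np.sqrt(vx * vy)


def expression_conservation(expr_a, expr_b, init_weights=None,
                            max_iter=100, tol=1e-6):
    """Iteratively estimate EC scores for ortholog pairs.

    expr_a, expr_b: (genes x conditions) arrays whose rows are already
    aligned so that row i in both arrays is an ortholog pair. The number
    of columns may differ between the two arrays.
    """
    # Gene-by-gene correlation matrix within each species.
    r_a = np.corrcoef(expr_a)
    r_b = np.corrcoef(expr_b)
    n = r_a.shape[0]

    # Uniform weights give the unweighted correlation (EC0) on the first pass.
    if init_weights is None:
        ec = np.ones(n)
    else:
        ec = np.asarray(init_weights, dtype=float)

    for _ in range(max_iter):
        keep = ec > 0                    # exclude genes with non-positive weight
        new_ec = np.empty(n)
        for g in range(n):
            mask = keep.copy()
            mask[g] = False              # assumption: skip the self-correlation
            new_ec[g] = weighted_pearson(r_a[g, mask], r_b[g, mask], ec[mask])
        if np.max(np.abs(new_ec - ec)) < tol:   # EC_k ~ EC_{k-1}
            return new_ec
        ec = new_ec
    return ec                            # last estimate if not converged
```

To mirror the random-start check mentioned in the caption, one could pass `init_weights=np.random.default_rng(0).uniform(size=n_genes)` and compare the result with the default run. The sketch does not guard against degenerate cases, such as fewer than two genes keeping a positive weight.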
