Statistical methods for comparing the abundances of metabolic pathways in metagenomics
© Liu and Pop; licensee BioMed Central Ltd. 2010
Published: 11 October 2010
A major goal of metagenomic studies is to identify specific functional adaptations of microbial communities to their habitats. The functional profile and the abundances for a sample can be estimated by mapping metagenomic sequences to the global metabolic network consisting of thousands of molecular reactions. Here we describe our development of statistical methods that can identify differentially abundant subnetworks between metagenomic samples.
First, we introduce a scoring function for an arbitrary subnetwork and find the max-weight subnetwork in the global network by greedy search. Then we compute two p values, p_abund and p_struct, using nonparametric approaches to answer two statistical questions: (i) Is this subnetwork differentially abundant? (ii) What is the probability of finding such good subnetworks by chance? Significant metabolic subnetworks are detected on the basis of these two p values.
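To make the two steps above concrete, the following is a minimal sketch, not the authors' implementation. It assumes a purely additive subnetwork score (the sum of per-reaction scores), a greedy search that grows a connected subnetwork from each seed reaction, and simple permutation tests for the two p values. The function names, the toy network, and the permutation schemes (shuffling reaction scores for p_struct, shuffling sample labels for p_abund) are all illustrative assumptions.

```python
import random
from statistics import mean

def greedy_subnetwork(adj, score, seed):
    """Grow a connected subnetwork from `seed`, repeatedly adding the
    highest-scoring neighbor while it still increases the total.
    Assumption: the subnetwork score is the sum of its node scores."""
    sub, total = {seed}, score[seed]
    while True:
        frontier = {n for v in sub for n in adj[v]} - sub
        if not frontier:
            break
        # sorted() makes tie-breaking deterministic
        best = max(sorted(frontier), key=lambda n: score[n])
        if score[best] <= 0:
            break
        sub.add(best)
        total += score[best]
    return sub, total

def max_weight_subnetwork(adj, score):
    """Run the greedy search from every node and keep the best result."""
    return max((greedy_subnetwork(adj, score, s) for s in sorted(adj)),
               key=lambda r: r[1])

def p_struct(adj, score, observed_total, n_perm=200, rng=None):
    """Assumed permutation scheme: shuffle scores over nodes and ask how
    often the search finds a subnetwork at least as good by chance."""
    rng = rng or random.Random(0)
    nodes = sorted(adj)
    values = [score[n] for n in nodes]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(values)
        _, total = max_weight_subnetwork(adj, dict(zip(nodes, values)))
        hits += total >= observed_total
    return (hits + 1) / (n_perm + 1)

def p_abund(group_a, group_b, n_perm=1000, rng=None):
    """Assumed label-permutation test on the difference in mean
    subnetwork abundance between two groups of samples."""
    rng = rng or random.Random(0)
    observed = abs(mean(group_a) - mean(group_b))
    pooled, k = list(group_a) + list(group_b), len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        hits += abs(mean(pooled[:k]) - mean(pooled[k:])) >= observed
    return (hits + 1) / (n_perm + 1)

# Toy network of four reactions in a chain (hypothetical data).
adj = {"r1": ["r2"], "r2": ["r1", "r3"], "r3": ["r2", "r4"], "r4": ["r3"]}
score = {"r1": 2.0, "r2": 1.5, "r3": -1.0, "r4": 0.5}

sub, total = max_weight_subnetwork(adj, score)
print(sorted(sub), total)  # ['r1', 'r2'] 3.5
print(p_struct(adj, score, total))
print(p_abund([10, 11, 12, 13], [1, 2, 3, 4]))
```

In this sketch a subnetwork would be reported only when both p values fall below chosen thresholds. The add-one correction in each estimate keeps a permutation p value from being exactly zero.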
We have developed statistical methods to find differentially abundant metabolic pathways in metagenomics. These methods perform better than previous approaches. Results from real metagenomic datasets confirm previous observations and also provide several new biological insights.
This article is published under license to BioMed Central Ltd.