MetAMOS: a modular and open source metagenomic assembly and analysis pipeline

Treangen, Todd J; Koren, Sergey; Sommer, Daniel D; Liu, Bo; Astrovskaya, Irina; Ondov, Brian; Darling, Aaron E; Phillippy, Adam M; Pop, Mihai

doi:10.1186/gb-2013-14-1-r2

Table 1 Comparison of assembly statistics


Dataset	Assembler	#ctgs/scfs	Good Ctgs/scfs	Total aln (Mbp)	Slt	Hvy	Ch	Size @ 10 Mbp	#@ 10 Mbp	Max ctg size	Err per Mbp
mockE	SOAPdenovo	63,014	99.3%	51	167	131	1	28,208	195	249,819	5.9
mockE	SOAPdenovo_MA	63,107	99.3%	51	166	131	1	28,208	195	249,819	5.8
mockE	Velvet	12,381	96.0%	41	269	106	2	46,122	128	183,815	9.2
mockE	Velvet_MA	12,830	96.2%	41	256	100	2	42,269	137	179,673	8.7
mockE	MetaVelvet	23,323	96.7%	49	474	160	5	62,131	93	367,458	13.0
mockE	MetaVelvet_MA	22,772	96.8%	49	462	156	4	62,138	91	367,458	12.7
mockE	Meta-IDBA	22,064	95.3%	47	362	151	3	26,141	223	249,069	11.0
mockE	Meta-IDBA_MA	22,032	95.4%	47	362	151	3	26,141	223	249,069	11.0
mockS	SOAPdenovo	45,251	98.8%	28	135	99	0	5,672	626	186,064	8.4
mockS	SOAPdenovo_MA	44,928	98.8%	28	135	98	0	5,672	626	186,064	8.3
mockS	Velvet	20,981	95.6%	28	498	127	1	6,134	770	119,120	22.4
mockS	Velvet_MA	21,050	95.8%	28	485	115	1	6,060	775	119,120	21.5
mockS	MetaVelvet	19,649	94.5%	28	518	158	2	13,028	351	217,330	24.2
mockS	MetaVelvet_MA	20,551	95.3%	28	517	143	3	6,685	622	217,330	20.1
mockS	Meta-IDBA	4,573	92.3%	18	101	83	0	13,150	368	119,604	10.2
mockS	Meta-IDBA_MA	4,559	92.5%	18	101	83	0	13,150	368	119,604	10.2
HMP	SOAPdenovo	39,028	89.9%	11	1,138	2,686	0	9,881	514	116,204	347.6
HMP	SOAPdenovo_MA	35,230	89.1%	11	1,138	2,618	0	11,359	426	238,051	341.5
HMP	Meta-IDBA	25,861	88.9%	7	718	2,102	0	4,215	1,144	59,188	402.8
HMP	Meta-IDBA_MA	25,698	88.7%	7	710	2,087	0	4,215	1,144	59,188	399.6
HMPscf	SOAPdenovo	31,673	99.9%	11	-	-	10	9,906	510	116,181	0.9
HMPscf	SOAPdenovo_MA	27,231	99.9%	11	-	-	10	11,359	426	238,051	0.9
HMPscf	Meta-IDBA	20,352	99.9%	7	-	-	10	4,946	939	59,188	1.4
HMPscf	Meta-IDBA_MA	22,886	99.9%	7	-	-	9	22,304	238	66,401	1.3

Datasets are mockE (mock Even), mockS (mock Staggered), HMP (Tongue dorsum, contig-level analysis), and HMPscf (Tongue dorsum, scaffold-level analysis). All analyses other than HMPscf were done at the contig level. Where necessary, contigs were extracted from scaffolds by splitting at three consecutive Ns. The suffix _MA marks results produced by running MetAMOS on the contigs generated by the corresponding assembler.

#ctgs/scfs: total number of contigs/scaffolds in the assembly.
Good Ctgs/scfs: fraction of contigs/scaffolds that mapped without errors to reference genomes. For the HMP dataset (Tongue dorsum contigs), alignments were made only to a small set of genomes estimated by the HMP project to match the genomes in this sample. For the HMPscf dataset, good scaffolds are those without chimeric errors.
Total aln (Mbp): total amount of sequence that can be aligned to the reference genomes, in Mbp.
Slt: slight mis-assemblies, determined by alignments that cover 80% or more of the aligned contig in a single match.
Hvy: heavy mis-assemblies, determined by alignments that cover less than 80% of the aligned contig in a single match or that have two or more matches to a single reference.
Ch: chimeras, that is, contigs with matches to two distinct reference genomes. Neither heavy mis-assemblies nor chimeras count towards reference coverage.
Size @ 10 Mbp: the size of the largest contig c such that the sum of all contigs larger than c is more than 10 Mbp (similar to the commonly used N50 size).
#@ 10 Mbp: smallest number of contigs whose cumulative size adds up to more than 10 Mbp.
Max ctg size: size of the largest contig in the assembly.
Err per Mbp: average number of errors per Mbp.

Numbers in bold represent the best value for the specific dataset.
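
The column definitions above can be illustrated with a minimal Python sketch. It is not the code used to produce the table; the function names (split_scaffold, size_at_threshold, errors_per_mbp) are hypothetical. Two assumptions are made: the cumulative sum for Size @ 10 Mbp includes contig c itself, as in the usual N50 computation, and Err per Mbp is taken as (Slt + Hvy + Ch) divided by Total aln, which matches, for example, the mockE SOAPdenovo row (299 errors over 51 Mbp, about 5.9) but is not stated explicitly in the legend.

```python
import re


def split_scaffold(seq, min_gap=3):
    """Split a scaffold into contigs at runs of min_gap or more N characters."""
    pieces = re.split(r"[Nn]{%d,}" % min_gap, seq)
    # Leading or trailing N runs produce empty strings; drop them.
    return [p for p in pieces if p]


def size_at_threshold(lengths, threshold=10_000_000):
    """Return (size, count) in the spirit of 'Size @ 10 Mbp' and '#@ 10 Mbp'.

    Contigs are visited from largest to smallest until the cumulative
    length exceeds the threshold. Assumption: the contig that crosses
    the threshold is included in the sum, as for N50.
    """
    total = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        total += length
        if total > threshold:
            return length, count
    # The whole assembly is not larger than the threshold.
    return None, None


def errors_per_mbp(slight, heavy, chimeras, aligned_mbp):
    """Assumed definition: all counted errors divided by aligned Mbp."""
    return round((slight + heavy + chimeras) / aligned_mbp, 1)


if __name__ == "__main__":
    # Toy inputs, not taken from any of the assemblies in the table.
    contigs = split_scaffold("ACGTNNNACGTNNACGT")
    print(contigs)
    print(size_at_threshold([len(c) for c in contigs], threshold=5))
    # Error counts and aligned length from the mockE SOAPdenovo row.
    print(errors_per_mbp(167, 131, 1, 51))
```

With the default threshold of 10,000,000 bp, size_at_threshold applied to the contig lengths of an assembly gives values comparable in meaning to the two "@ 10 Mbp" columns, subject to the inclusion assumption noted above.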


ISSN: 1474-760X

Genome Biology
