Figure 5 | Genome Biology

Figure 5

From: Detection and correction of false segmental duplications caused by genome mis-assembly

Re-estimated fragment size distribution. The distribution of fragment sizes for chimpanzee library G591P4 is plotted under three models. 'Normal TA' is the normal distribution with the mean and standard deviation listed in the NCBI Trace Archive. 'Normal re-estimate' is a normal distribution whose parameters were re-estimated from the placement of mated reads from the library. To lessen the effect of outliers, we made an initial estimate of the parameters, filtered out any mate pair distances more than four standard deviations from the mean, and then estimated the parameters again. 'Nonparametric' is the observed density of mate pair distances after smoothing with a cubic smoothing spline. The actual fragment distribution has a mean of 4,500 rather than the 5,000 listed in the Trace Archive and is far tighter around the mean than either normal model suggests. In particular, the 'Normal TA' model would have given a very inaccurate view of this library, which is one of the largest for chimpanzee, with over 2.3 million reads.
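The two-pass re-estimation described in the caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `reestimate_normal`, the parameter `max_sd`, and the synthetic distances are all hypothetical, and it assumes the four-standard-deviation cutoff is measured from the first-pass mean.

```python
import random
import statistics


def reestimate_normal(distances, max_sd=4.0):
    """Estimate mean and standard deviation in two passes.

    Pass 1 uses all mate pair distances. Pass 2 repeats the estimate
    after dropping distances more than `max_sd` first-pass standard
    deviations from the first-pass mean (assumed interpretation).
    """
    mean0 = statistics.fmean(distances)
    sd0 = statistics.stdev(distances)
    kept = [d for d in distances if abs(d - mean0) <= max_sd * sd0]
    dropped = len(distances) - len(kept)
    return statistics.fmean(kept), statistics.stdev(kept), dropped


# Synthetic example (made-up data, not library G591P4): a tight cluster
# of distances plus a few grossly mis-placed mate pairs.
rng = random.Random(0)
distances = [rng.gauss(4500, 300) for _ in range(10000)]
distances += [40000.0] * 20

mean, sd, dropped = reestimate_normal(distances)
print(f"mean={mean:.0f} sd={sd:.0f} dropped={dropped}")
```

A single filtering pass is enough here because the outliers are far from the bulk of the data. If outliers were numerous enough to inflate the first-pass standard deviation badly, a robust estimator such as the median and median absolute deviation might be a safer starting point.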
