Corset uses expression information to tease apart contigs from different genes. (A) Assembled contigs from a region of the human genome containing the two genes ATP5J and GABPA. Trinity assembles eight contigs (bottom track), which are grouped into a single cluster if the contig ratio test is not applied. Including this test allows Corset to separate the region into four clusters (boxes). Notably, contigs 4 to 6 are false chimeras, caused by the overlapping UTRs of ATP5J and GABPA. These genes are differentially expressed, as shown by the base-level coverage averaged over replicates (top track). (B) When calculating the distance between a pair of contigs during clustering, Corset checks whether the ratio of their expression is equal across conditions. Two pairs are shown here: contig 2 and contig 3 (top), and contig 3 and contig 4 (bottom). For each sample, the ratio of the number of reads aligning to each contig in the pair is plotted (dots). Contig 2 and contig 3 have the same expression ratio in both groups and so are clustered together, whereas the expression ratio of contig 3 and contig 4 differs between conditions and so they are split. This feature helps Corset separate contigs that share sequence but originate from different genes.
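
The ratio check described in panel B can be illustrated with a small sketch. This is not Corset's implementation; it shows one standard way to test whether two contigs keep the same read-count ratio across conditions, a binomial likelihood ratio statistic computed on per-condition read totals. The function names and the example counts are hypothetical, and no particular decision threshold is implied.

```python
import math


def _binom_loglik(a, n, p):
    """Binomial log-likelihood of a successes in n trials, dropping the
    combinatorial term (it cancels in the likelihood ratio)."""
    ll = 0.0
    if a > 0:
        ll += a * math.log(p)
    if n - a > 0:
        ll += (n - a) * math.log(1.0 - p)
    return ll


def ratio_test_statistic(counts_a, counts_b):
    """Likelihood ratio statistic for 'contig A and contig B have the same
    read-count ratio in every condition'.

    counts_a, counts_b: read totals per condition for each contig
    (hypothetical inputs, one entry per condition).
    Larger values mean stronger evidence that the ratio differs.
    """
    totals = [a + b for a, b in zip(counts_a, counts_b)]
    grand_total = sum(totals)
    if grand_total == 0:
        return 0.0

    # Alternative model: each condition has its own proportion of reads on A.
    ll_alt = sum(
        _binom_loglik(a, n, a / n)
        for a, n in zip(counts_a, totals)
        if n > 0
    )

    # Null model: one shared proportion across all conditions.
    p_shared = sum(counts_a) / grand_total
    ll_null = sum(_binom_loglik(a, n, p_shared) for a, n in zip(counts_a, totals))

    return 2.0 * (ll_alt - ll_null)


# Made-up counts: the first pair keeps a 2:1 ratio in both conditions,
# the second pair changes ratio between conditions.
same_ratio = ratio_test_statistic([100, 200], [50, 100])
diff_ratio = ratio_test_statistic([100, 20], [50, 100])
```

In this sketch a pair like contigs 2 and 3 would give a statistic near zero, while a pair like contigs 3 and 4 would give a large value, which a clustering step could use to keep the two contigs apart.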