Fig. 2 | Genome Biology

Fig. 2

From: plyranges: a grammar of genomic data transformation

Illustration of the three overlap join operators. Each join takes two GRanges objects, x and y, as input. A “Hits” object is computed for the join; it consists of two components. The first component contains the indices of the ranges in x that are overlapped by y (the rectangles of x that cross the orange lines). The second component contains the indices of the ranges in y that overlap the ranges in x. In this case, a single range in y overlaps three ranges in x, so its index is repeated three times. The resulting “Hits” object is used to modify x according to where it was “hit” by y and to merge the metadata columns of x and y based on the indices it contains. This procedure is applied generally in the plyranges DSL for both overlap and nearest neighbor operations. The join semantics alter what is returned. a For an inner join, the x ranges that are overlapped by y are returned. The returned ranges also include the metadata from the y range that overlapped the three x ranges. b An intersect join is identical to an inner join, except that the intersection is taken between the overlapped x ranges and the y ranges. c For a left join, all x ranges are returned regardless of whether they are overlapped by y. In this case, the third range of the join (the rectangle marked with an asterisk) would have missing values in the metadata columns that came from y.
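
The procedure in the caption can be sketched in a few lines of plain Python. This is an illustration only, not the plyranges implementation or its R API. The `Range` type, the helper names (`find_hits`, `join_inner`, `join_intersect`, `join_left`), the coordinates, and the metadata columns `id` and `score` are all hypothetical. The sketch assumes closed integer intervals, 0-based indices (R indices are 1-based), and no metadata column names shared between x and y.

```python
from typing import NamedTuple


class Range(NamedTuple):
    start: int   # inclusive start coordinate
    end: int     # inclusive end coordinate
    meta: dict   # metadata columns attached to the range


def find_hits(x, y):
    """Return (x_index, y_index) pairs for every overlapping pair of ranges."""
    return [(i, j)
            for i, xr in enumerate(x)
            for j, yr in enumerate(y)
            if xr.start <= yr.end and yr.start <= xr.end]


def join_inner(x, y):
    """Keep each overlapped x range and attach metadata from the y range that hit it."""
    return [Range(x[i].start, x[i].end, {**x[i].meta, **y[j].meta})
            for i, j in find_hits(x, y)]


def join_intersect(x, y):
    """Like the inner join, but each returned range is the intersection of x and y."""
    return [Range(max(x[i].start, y[j].start),
                  min(x[i].end, y[j].end),
                  {**x[i].meta, **y[j].meta})
            for i, j in find_hits(x, y)]


def join_left(x, y):
    """Keep every x range; y metadata is None where no y range overlaps."""
    hits = find_hits(x, y)
    y_columns = sorted({key for yr in y for key in yr.meta})
    out = []
    for i, xr in enumerate(x):
        matches = [j for h, j in hits if h == i]
        if not matches:
            missing = {key: None for key in y_columns}
            out.append(Range(xr.start, xr.end, {**xr.meta, **missing}))
        for j in matches:
            out.append(Range(xr.start, xr.end, {**xr.meta, **y[j].meta}))
    return out


# Toy data shaped like the figure: one y range overlaps three of four x ranges.
x = [Range(1, 10, {"id": "x1"}), Range(5, 15, {"id": "x2"}),
     Range(30, 40, {"id": "x3"}), Range(12, 20, {"id": "x4"})]
y = [Range(8, 14, {"score": 0.5})]

hits = find_hits(x, y)  # the y index 0 appears once per overlapped x range
```

With this toy data, the third x range is not overlapped, so it is dropped by the inner and intersect joins and kept with `score` set to `None` by the left join.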
