
Table 4 Summary of pangene clusters obtained for datasets ACK2 and rice3 and the corresponding orthogroups in Ensembl Plants. Core clusters contain genes from all analyzed genomes; in rice, shell clusters contain genes from two species. BUSCO completeness percentages for core sets are shown in square brackets. Clusters with multiple copies have several genes from the same species. gDNA segments are shell clusters that bring together a gene model and a matching genomic segment from the underlying WGA. Column ‘Match Compara’ shows the number of pangene clusters that contain the same genes as the corresponding Compara orthogroups. The last column shows the number of pangene clusters that contain sequences that share an InterPro domain (the number in square brackets is for core clusters only).

From: GET_PANGENES: calling pangenes from plant genome alignments confirms presence-absence variation

 

| Clusters | Dataset | Core clusters [%BUSCO] | Multiple copies | Shell clusters | gDNA segments | Match Compara | Share InterPro domains |
|---|---|---|---|---|---|---|---|
| Compara orthogroups | ACK2 | 20,192 [90.6] | 161 | | | | [18,259] |
| minimap2 clusters | ACK2 | 20,647 [94.1] | 731 | | | 18,245 | [18,792] |
| GSAlign clusters | ACK2 | 16,476 [74.9] | 454 | | | 14,181 | [14,817] |
| Compara orthogroups | rice3 | 13,020 [65.6] | 219 | 6,386 | | | 16,766 [11,571] |
| minimap2 clusters | rice3 | 22,880 [85.2] | 3,360 | 7,825 | 6,521 | 18,281 | 23,062 [19,239] |
| GSAlign clusters | rice3 | 20,399 [84.6] | 2,885 | 9,730 | 6,103 | 17,103 | 22,834 [17,135] |