Table 1 Comparison of the P. euphratica unigene collection with other sequence collections from whole genomes or EST projects

From: Gene expression and metabolite profiling of Populus euphratica growing in the Negev desert

| Sequence collection | Matches | Unique |
| --- | --- | --- |
| All | | 7,841 |
| Populus genome | 7,671 | 763 |
| Arabidopsis genome | 5,434 | 2 |
| Rice genome | 1,562 | 0 |
| Populus EST sequence | 5,780 | 5 |
| Rosid EST sequence | 4,597 | 1 |
| Asterid EST sequence | 3,490 | 4 |
| Caryophyllid EST sequence | 2,081 | 0 |
| Monocot sequence | 2,135 | 3 |
| GenBank sequence | 5,495 | 0 |
| Short sequences | 275 | 20 |
| Low protein coding potential | 728 | 28 |
| Remainder | | 54 |

  1. All P. euphratica unigenes were compared against reference sequence collections to investigate sequence overlap and to identify the number of sequences unique to this sequence collection. The reference sequence collections include the draft Populus genome, the Arabidopsis thaliana genome, the rice genomes, and pooled openSputnik EST collections from species taxonomically assigned to the rosid, asterid, caryophyllid and monocot plant groups. Also included in the reference sets are sequences with a match to an annotated protein in the UniProt database, and P. euphratica sequences that are either short (fewer than 100 nucleotides) or have a low protein coding potential (less than 25% protein coding). For each reference sequence collection, the table gives the number of P. euphratica sequences that can be matched to it and the number of sequences that are unique to this sequence collection. All BLAST analyses were performed using an arbitrary expectation value (E-value) cutoff of 1e-10. The remainder (54) is the number of sequences that have no match within any of the challenge datasets and may thus represent P. euphratica-specific genes.
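
The counting described in the footnote can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names `significant_queries` and `summarize` are hypothetical, the toy IDs and E-values are invented for the example, "Unique" is interpreted as sequences matched in one reference collection and in no other, and a hit is assumed to count when its E-value is at or below the 1e-10 cutoff.

```python
from typing import Dict, Iterable, Set, Tuple

E_VALUE_CUTOFF = 1e-10  # cutoff stated in the table footnote


def significant_queries(hits: Iterable[Tuple[str, float]],
                        cutoff: float = E_VALUE_CUTOFF) -> Set[str]:
    """Return IDs of query sequences with at least one hit at or below the cutoff.

    Assumption: hits are (query_id, e_value) pairs, and "at or below" is the
    comparison used; the source does not say whether the bound is inclusive.
    """
    return {query_id for query_id, e_value in hits if e_value <= cutoff}


def summarize(all_ids: Set[str],
              matched: Dict[str, Set[str]]) -> Tuple[Dict[str, Tuple[int, int]], int]:
    """Return {collection: (matches, unique)} and the remainder count.

    Assumption: "unique" means matched in this collection and in no other.
    """
    summary = {}
    for name, ids in matched.items():
        # Union of IDs matched in every other collection.
        others = set().union(*(s for other, s in matched.items() if other != name))
        summary[name] = (len(ids), len(ids - others))
    # Remainder: unigenes with no match in any collection.
    remainder = len(all_ids - set().union(*matched.values()))
    return summary, remainder


# Toy data, invented purely for illustration.
unigenes = {"u1", "u2", "u3", "u4"}
matched = {
    "Populus genome": significant_queries([("u1", 1e-50), ("u2", 1e-12), ("u3", 1e-3)]),
    "Arabidopsis genome": significant_queries([("u1", 1e-30)]),
}
summary, remainder = summarize(unigenes, matched)
print(summary, remainder)
```

In the toy data, `u3` has a hit that fails the cutoff and `u4` has no hit at all, so both fall into the remainder, which is the category the table reports as 54 sequences.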