Table 3 Fields for pseudogene features in the psiDR annotation file

From: The GENCODE pseudogene resource

| Field | Explanation | psiDR value |
| --- | --- | --- |
| Transcript ID | Pseudogene ID from GENCODE annotation. Used for cross-referencing | |
| Parent | Protein ID, Gene ID, chromosome, start, end and strand. Detailed in section 'Parents of pseudogenes' | |
| Sequence similarity | The percentage of pseudogene sequence preserved from parent | |
| Transcription | Evidence for pseudogene transcription and validation results. May be tagged as EST, BodyMap, RT-PCR or None, which represent pseudogene expression evidence from the corresponding data sources. Multiple tags are separated by commas. Detailed in section 'Transcription of pseudogenes' | 1, transcription; 0, otherwise |
| DNaseI hypersensitivity | A categorical result indicating whether the pseudogene has easily accessible chromatin, predicted by a model integrating DNaseI hypersensitivity values within 4 kb genomic regions upstream and downstream of the 5' end of pseudogenes. Detailed in section 'Chromatin signatures of pseudogenes' | 1, has DNaseI hypersensitivity in the upstream region; 0, otherwise |
| Chromatin state | Whether a pseudogene maintains an active chromatin state, as predicted by a model using Segway segmentation. Detailed in section 'Chromatin signatures of pseudogenes' | 1, active chromatin; 0, otherwise |
| Active Pol2* binding | Whether Pol2 binds to the upstream region of a pseudogene. Detailed in section 'Upstream regulatory elements' | 1, active binding site; 0, otherwise |
| Active promoter region | Whether there are active promoter regions upstream of the pseudogene. Detailed in section 'Upstream regulatory elements' | 1, active binding site; 0, otherwise |
| Conservation | Conservation of pseudogenes is derived from the divergence between human, chimp and mouse DNA sequences. Detailed in section 'Evolutionary constraint on pseudogenes' | 1, conserved; 0, otherwise |

*Pol2, RNA polymerase II.
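
To make the field definitions above concrete, the following is a minimal sketch of how one psiDR record might be decoded. The table does not specify the on-disk layout of the psiDR file, so the column names (`transcript_id`, `transcription_evidence`, and the names in `BINARY_FIELDS`), the use of a separate column for the evidence tags, and the example row are all assumptions made for illustration, not the actual file format or real data. The Parent field is omitted because its encoding is not given here.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical column names for the fields whose psiDR value is 1 or 0.
BINARY_FIELDS = (
    "transcription",
    "dnase_hypersensitivity",
    "chromatin_state",
    "active_pol2_binding",
    "active_promoter_region",
    "conservation",
)


@dataclass
class PseudogeneRecord:
    transcript_id: str
    sequence_similarity: float      # percentage of sequence preserved from parent
    transcription_tags: List[str]   # e.g. EST, BodyMap, RT-PCR
    flags: Dict[str, bool]          # decoded 1/0 psiDR values

    def active_features(self) -> List[str]:
        """Return the names of the binary fields set to 1, in table order."""
        return [name for name in BINARY_FIELDS if self.flags[name]]


def parse_transcription_tags(raw: str) -> List[str]:
    """Split comma-separated evidence tags; the tag 'None' means no evidence."""
    tags = [tag.strip() for tag in raw.split(",") if tag.strip()]
    return [tag for tag in tags if tag != "None"]


def parse_record(row: Dict[str, str]) -> PseudogeneRecord:
    """Decode one row, given as a mapping of (assumed) column name to string."""
    flags = {}
    for name in BINARY_FIELDS:
        value = row[name].strip()
        if value not in ("0", "1"):
            raise ValueError(f"{name}: expected '0' or '1', got {value!r}")
        flags[name] = value == "1"
    return PseudogeneRecord(
        transcript_id=row["transcript_id"],
        sequence_similarity=float(row["sequence_similarity"]),
        transcription_tags=parse_transcription_tags(row["transcription_evidence"]),
        flags=flags,
    )


# Illustrative row only; the ID and values are made up.
example_row = {
    "transcript_id": "EXAMPLE_PSEUDOGENE_ID",
    "sequence_similarity": "87.5",
    "transcription_evidence": "EST, RT-PCR",
    "transcription": "1",
    "dnase_hypersensitivity": "1",
    "chromatin_state": "0",
    "active_pol2_binding": "0",
    "active_promoter_region": "1",
    "conservation": "0",
}

record = parse_record(example_row)
print(record.active_features())
```

Rejecting any value other than 0 or 1 is a design choice in this sketch: it surfaces malformed rows early rather than silently treating them as "otherwise".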