From: A standard variation file format for human genome sequences
Tag | Value | Necessity | Description |
---|---|---|---|
ID | String | Mandatory | While the GFF3 specification considers the ID tag to be optional, GVF requires it. As in GFF3, this ID must be unique within the file and is not required to have meaning outside of the file |
 |  |  | ID = chr1:Soap:SNP:12345; |
 |  |  | ID = rs10399749; |
Variant_seq | String | Optional | All sequences found in this individual (or group of individuals) at a variant location are given with the Variant_seq tag. If the sequence is longer than 50 nucleotides, the sequence may be abbreviated as '~'. In the case where the variant represents a deletion of sequence relative to the reference, the Variant_seq is given as '-' |
 |  |  | Variant_seq = A,T; |
Reference_seq | String | Optional | The reference sequence corresponding to the start and end coordinates of this feature |
 |  |  | Reference_seq = G; |
Variant_reads | Integer | Optional | The number of reads supporting each variant at this location |
 |  |  | Variant_reads = 34, 23; |
Total_reads | Integer | Optional | The total number of reads covering a variant |
 |  |  | Total_reads = 57; |
Genotype | String | Optional | The genotype of this variant, either heterozygous, homozygous, or hemizygous |
 |  |  | Genotype = heterozygous; |
Variant_freq | Real number between 0 and 1 | Optional | A real number describing the frequency of the variant in a population. The details of the source of the frequency should be described in an attribute-method pragma as discussed above. The order of the values given must be in the same order that the corresponding sequences occur in the Variant_seq tag |
 |  |  | Variant_freq = 0.05; |
Variant_effect | [1]String: SO term sequence_variant [2]Integer-index [3]String: SO sequence_feature [4]String feature ID | Optional | The effect of a variant on sequence features that overlap it. It is a four-part, space-delimited tag. The sequence_variant describes the effect of the alteration on the sequence features that follow. Both are typed by SO. The 0-based index corresponds to the causative sequence in the Variant_seq tag. The feature ID lists the IDs of affected features. A variant may have more than one variant effect depending on the intersected features |
 |  |  | Variant_effect = sequence_variant 0 mRNA NM_012345, NM_098765; |
Variant_copy_number | Integer | Optional | For regions on the variant genome that exist in multiple copies, this tag represents the copy number of the region as an integer value |
 |  |  | Variant_copy_number = 7; |
Reference_copy_number | Integer | Optional | For regions on the reference genome that exist in multiple copies, this tag represents the copy number of the region as an integer value |
 |  |  | Reference_copy_number = 5; |
Nomenclature | String | Optional | A tag to capture the given nomenclature of the variant, as described by an authority such as the Human Genome Variation Society |
 |  |  | Nomenclature = HGVS: p.Trp26Cys; |
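
The tag-value pairs in the table can be read mechanically. The following is a minimal sketch, not a reference parser: it assumes the attributes arrive as a single semicolon-separated string, that multiple values for one tag are comma-separated, and that a Variant_effect value has the form shown in the table example, with one effect per tag. The function names `parse_gvf_attributes` and `parse_variant_effect` are hypothetical and do not belong to any existing library.

```python
def parse_gvf_attributes(column):
    """Split an attribute string into a {tag: [values]} dictionary."""
    attributes = {}
    for field in column.split(";"):
        field = field.strip()
        if not field:
            continue  # skip the empty piece after a trailing ';'
        tag, _, value = field.partition("=")
        # Whitespace around '=' and ',' is tolerated and stripped.
        attributes[tag.strip()] = [v.strip() for v in value.split(",")]
    return attributes


def parse_variant_effect(values):
    """Interpret a Variant_effect value (assumed: one effect per tag)."""
    head = values[0].split()
    effect, index, feature_type = head[0], int(head[1]), head[2]
    # The first feature ID follows the feature type; the remaining IDs
    # were separated by commas when the attribute was split.
    feature_ids = head[3:] + values[1:]
    return effect, index, feature_type, feature_ids


# Example string assembled from the example values in the table.
example = (
    "ID=chr1:Soap:SNP:12345;Variant_seq=A,T;Reference_seq=G;"
    "Variant_reads=34,23;Total_reads=57;Genotype=heterozygous;"
    "Variant_effect=sequence_variant 0 mRNA NM_012345,NM_098765;"
)
attrs = parse_gvf_attributes(example)
effect = parse_variant_effect(attrs["Variant_effect"])
```

Because the index in Variant_effect is 0-based, `attrs["Variant_seq"][effect[1]]` retrieves the causative sequence. Values such as Variant_reads are kept as strings in this sketch, so convert them with `int` before doing arithmetic.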