Table 2 The pragmas defined by GVF, in addition to those already defined by GFF3 (gff-version, sequence-region, feature-ontology, attribute-ontology, source-ontology, species, genome-build)

From: A standard variation file format for human genome sequences

Pragma Allowed tags Description
file-version Comment This pragma specifies the version of a specific file. What exactly the version means is left undefined, but the tag is provided for the case when an individual's variants are described in GVF and then, at a later date, changes to the data or the software require an update to the file. An increment of the file-version could signify such a change. Any numeric value is allowed for file-version
file-date Comment The file-date pragma is included as a method to describe the date when the file was created. The ISO 8601 standard for dates in the form YYYY-MM-DD is required for the value
individual-id Dbxref, Gender, Population, Comment This pragma provides details about the individual whose variants are described in the file
##individual-id Dbxref = Coriell:NA18507;Gender = male;Ethnicity = Yoruba; Comment = Yoruba from Ibadan
source-method Seqid, Source, Type, Dbxref, Comment This pragma provides details about the algorithms or methodologies used to generate data for a given source in the file. This is used, for example, to document how a particular type of variant was called. A typical use would be to provide a Dbxref link to a journal article describing the software used for calling the variant data with the given source tag
##source-method Seqid = chr1;Source = MAQ;Type = SNV;Dbxref = PMID:18714091;Comment = MAQ SNV calls;
attribute-method Seqid, Source, Type, Attribute, Dbxref, Comment This pragma provides details about algorithms or methodologies for a given attribute tag in the file. This is used to document how a particular type of attribute value (for example, Genotype or Variant_effect) was calculated
##attribute-method Source = SOLiD;Type = SNV;Attribute = Genotype;Comment = Genotype is reported here as determined in the original study
technology-platform Seqid, Source, Type, Read_length, Read_type, Read_pair_span, Platform_class, Platform_name, Average_coverage, Comment, Dbxref This pragma provides details about the technologies (for example, sequencing or DNA microarray) used to generate the primary data
##technology-platform Seqid = chr1;Source = AFFY_SNP_6;Type = SNV;Dbxref = URI:; Platform_class = SNP_Array;Platform_name = Affymetrix Human SNP Array 6.0;
data-source Seqid, Source, Type, Dbxref, Data_type, Comment This pragma provides details about the source data for the variants contained in this file. This could be links to the actual sequence reads in a trace archive, or links to a variant file in another format that has been converted to GVF
##data-source Source = MAQ;Type = SNV;Dbxref = SRA:SRA008175;Data_type = DNA sequence;Comment = NCBI Short Read Archive;
phenotype-description Ontology, Term, Comment This pragma describes the phenotype of the individual. It can contain ontology-constrained terms, a free-text description of the individual's phenotype, or both
##phenotype-description Ontology =;Term = acute myeloid leukemia;Comment = AML relapse;
ploidy Ontology, Term, Comment This pragma defines the ploidy for a given genome. It can contain either ontology-constrained terms or a free-text description of the individual's ploidy. It is suggested that ontology-constrained terms use a subtype of the term PATO:0001374, which includes haploid, diploid, triploid and polyploid, among others
##ploidy chr22 1 49691432 diploid
##ploidy chrY 1 57772954 haploid
  1. The pragmas defined by GVF may refer to the entire file or may limit their scope by use of tag-value pairs. For example, if a pragma only applies to SNVs that were called by Gigabayes on chromosome 13, then the tags Seqid = chr13;Source = Gigabayes;Type = SNV would indicate the scope. The Dbxref tag within a GVF pragma takes values of the form 'DBTAG:ID' and provides a reference for the information given by the pragma, whether that is the location of sequence files or a link to a paper describing a method. Tags beginning with uppercase letters are reserved for future use within the GFF/GVF specification, but applications are free to provide additional tags beginning with lowercase letters.
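
The scoping rule described in the footnote can be illustrated with a minimal Python sketch. It is not part of the GVF specification or of any reference parser: the function names `parse_pragma`, `applies_to` and `split_dbxref` are hypothetical, and the sketch assumes that each tag holds a single value and that values contain no unescaped semicolons. Whitespace around `=`, as printed in the table examples, is tolerated. Pragmas written in positional form, such as the ploidy examples, contain no tag-value pairs and come back with an empty tag dictionary.

```python
# Hypothetical helpers for illustration; names are not defined by GVF.
SCOPE_TAGS = ("Seqid", "Source", "Type")


def parse_pragma(line):
    """Split '##name Tag = value;Tag = value;' into (name, tags)."""
    if not line.startswith("##"):
        raise ValueError("not a pragma line: %r" % line)
    name, _, rest = line[2:].strip().partition(" ")
    tags = {}
    for pair in rest.split(";"):
        if "=" not in pair:
            continue  # skips the empty piece after a trailing ';'
        key, _, value = pair.partition("=")
        tags[key.strip()] = value.strip()
    return name, tags


def applies_to(tags, seqid, source, feature_type):
    """Return True if the pragma's scope covers the given feature.

    A scope tag that is absent places no restriction, so a pragma
    with no Seqid, Source or Type tags applies to the entire file.
    """
    feature = {"Seqid": seqid, "Source": source, "Type": feature_type}
    return all(tags.get(tag) in (None, feature[tag]) for tag in SCOPE_TAGS)


def split_dbxref(value):
    """Split a 'DBTAG:ID' value at the first colon."""
    dbtag, _, identifier = value.partition(":")
    return dbtag, identifier


name, tags = parse_pragma(
    "##source-method Seqid = chr1;Source = MAQ;Type = SNV;"
    "Dbxref = PMID:18714091;Comment = MAQ SNV calls;"
)
print(name, tags["Dbxref"], applies_to(tags, "chr1", "MAQ", "SNV"))
```

Treating a missing scope tag as "no restriction" mirrors the statement that a pragma may refer to the entire file. A fuller implementation would also need to handle multi-valued tags and escaped characters, which this sketch deliberately ignores.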