A standardized framework for representation of ancestry data in genomics studies, with application to the NHGRI-EBI GWAS Catalog

Morales, Joannella; Welter, Danielle; Bowler, Emily H.; Cerezo, Maria; Harris, Laura W.; McMahon, Aoife C.; Hall, Peggy; Junkins, Heather A.; Milano, Annalisa; Hastings, Emma; Malangone, Cinzia; Buniello, Annalisa; Burdett, Tony; Flicek, Paul; Parkinson, Helen; Cunningham, Fiona; Hindorff, Lucia A.; MacArthur, Jacqueline A. L.

doi:10.1186/s13059-018-1396-2

Table 2. Recommendations for authors reporting ancestry data in publications. Expert curators generated these recommendations following a detailed review of the more than 3200 GWAS publications included in the Catalog.

1. Provide detailed information for each distinct group of samples.
   a. Ancestry descriptors should be as granular as possible (e.g., Yoruban instead of Sub-Saharan African, Japanese instead of Asian).
   b. Avoid using country or citizenship as a substitute for ancestry.
   c. Avoid using geographic descriptors that are part of a cohort name as a substitute for ancestry (e.g., TwinsUK cannot be assumed to be European ancestry).
   d. If a population self-identifies using sociocultural descriptors, clearly provide information about the underlying genetics or genealogy (e.g., Old Order Amish individuals of European descent).
   e. If samples were derived from an isolated or founder population with limited genetic heterogeneity, clearly state the genetic ancestry within which this sub-population falls.
   f. Every effort should be made to explicitly note whether the population is admixed and the ancestral backgrounds that contribute to admixture.
   g. If available, genetic genealogy or ancestry of grandparents or parents should be included.
2. Report the method used to determine the ancestry of participants (for example, self-reported, inferred by genomic methods, or a combination of both).
   a. Where possible, use genomic methods to confirm self-reported ancestry or to infer the ancestry of samples.
   b. If inferred, indicate the analytical procedure utilized. See Additional file 1: Box S1 for a description of commonly used methods.
3. Assign an ancestry category for each distinct group of samples. See Table 1 for a list of ancestry categories. Refer to Additional file 3: Table S2 for a list of descriptors in use in the Catalog with their category assignments.
4. Provide the sample size for each distinct group of samples included in the analysis.
5. Provide country of recruitment.
6. If ancestry information is not available due to confidentiality, or any other concerns, note this in the publication.
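
As an illustration only, the recommendations above could be captured as one record per distinct group of samples and checked for gaps before submission. The sketch below is not part of the Catalog or the article: the class name `SampleGroup`, the function `unmet_recommendations`, the field names, and the three method labels are all hypothetical. The category field is left as free text because the permitted values are those listed in Table 1. The handling of recommendation 6 (skipping the ancestry checks when a reason for withholding is given, while still checking sample size and country) is an assumption, and the values in the usage lines are made up.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical labels for recommendation 2; not an official vocabulary.
METHODS = {"self-reported", "genomically inferred", "both"}


@dataclass
class SampleGroup:
    """One distinct group of samples, mirroring recommendations 1-6."""
    descriptor: Optional[str] = None              # rec. 1, e.g. "Yoruban"
    method: Optional[str] = None                  # rec. 2
    inference_procedure: Optional[str] = None     # rec. 2b
    category: Optional[str] = None                # rec. 3, a value from Table 1
    sample_size: Optional[int] = None             # rec. 4
    country_of_recruitment: Optional[str] = None  # rec. 5
    unavailable_reason: Optional[str] = None      # rec. 6


def unmet_recommendations(group: SampleGroup) -> List[str]:
    """Return the recommendation numbers this record does not yet satisfy."""
    # Assumption: when ancestry is withheld (rec. 6), recs. 1-3 are skipped.
    withheld = bool(group.unavailable_reason)
    unmet = []
    if not withheld:
        if not group.descriptor:
            unmet.append("1")
        if group.method not in METHODS:
            unmet.append("2")
        elif group.method != "self-reported" and not group.inference_procedure:
            # Inferred ancestry should name the analytical procedure.
            unmet.append("2b")
        if not group.category:
            unmet.append("3")
    if group.sample_size is None or group.sample_size <= 0:
        unmet.append("4")
    if not group.country_of_recruitment:
        unmet.append("5")
    return unmet


# Usage with made-up values: category and country are still missing.
group = SampleGroup(descriptor="Japanese", method="self-reported", sample_size=1200)
print(unmet_recommendations(group))  # prints ['3', '5']
```

A flat record keeps each recommendation visible as its own field; sub-items such as admixture (1f) or parental ancestry (1g) could be added as further optional fields in the same way.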

ISSN: 1474-760X

Genome Biology