Open Access

The genetic structure of south Asian populations as revealed by 650 000 SNPs

  • Mait Metspalu1,
  • Gyaneshwer Chaubey1,
  • Bayazit Yunusbayev1, 2,
  • Irene Gallego Romero4,
  • Monika Karmin1,
  • Chandana Basu Mallick1,
  • Ene Metspalu1,
  • Sadagopal Shanmugalakshmi6,
  • Karuppiah Balakrishnan6,
  • Kumarasamy Thangaraj3,
  • Lalji Singh3,
  • Ramasamy Pitchappan5,
  • Toomas Kivisild4, 1 and
  • Richard Villems1
Genome Biology 2010, 11(Suppl 1):O8

https://doi.org/10.1186/gb-2010-11-s1-o8

Published: 11 October 2010

The analysis of dense marker sets covering the whole genome has revolutionised the field of (human) population genetics. Driven largely by the needs of biomedical research, these new data are helping to unveil our demographic past, an effort exemplified over the past ~20 years by the study of mtDNA and Y-chromosome variation.

We have analysed over 320 new samples from South and Central Asia and the Caucasus (Illumina 650K SNPs), together with publicly available data (the HGDP panel and our published data set of ~600 Eurasian samples), and illustrate the power of full genome analyses by addressing two specific questions: (i) What is the nature of genetic continuity and discontinuity between South Asia, the Middle East and Central Asia? (ii) What are the genetic origins of the Munda speakers of India? We use principal component and structure-like analyses to reveal the structure in the genome-wide SNP data. The most striking feature of the genetic structure of South Asian populations is the clear separation of the Indus valley and southern Indian populations. The genetic component prevalent in the latter region is marginal in the former and absent outside South Asia. By contrast, the component ubiquitous in the Indus valley is also present (~30-40%) among Indo-European speakers from the Ganges valley and Dravidic speakers in southern India. Furthermore, this component can also be found in Central Asia and the Caucasus, as well as in the Middle East. We explored possibilities for identifying the source region of this genetic component.
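
For readers unfamiliar with principal component analysis of genotype data, the idea can be illustrated with a minimal sketch. Nothing here reproduces the analysis in this abstract, which does not specify the software or settings used. The genotype matrix is invented toy data (rows are individuals, columns are SNPs, entries are allele counts 0/1/2), and the function names `center_genotypes` and `first_pc_scores` are hypothetical. The sketch extracts only the first principal component, by power iteration on the individual-by-individual product matrix, to show how individuals from two differentiated groups end up on opposite sides of that axis.

```python
import math


def center_genotypes(genotypes):
    """Subtract each SNP's mean allele count (rows = individuals, columns = SNPs)."""
    n = len(genotypes)
    m = len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in genotypes]


def first_pc_scores(genotypes, iterations=200):
    """Return each individual's score on the first principal component.

    Power iteration is applied to the matrix X X^T of the centred data X.
    The sign of the returned axis is arbitrary, as in any PCA.
    """
    x = center_genotypes(genotypes)
    n = len(x)
    # Individual-by-individual matrix of inner products between centred genotypes.
    gram = [[sum(a * b for a, b in zip(x[i], x[k])) for k in range(n)]
            for i in range(n)]
    v = [float(i + 1) for i in range(n)]  # deterministic, non-constant start vector
    for _ in range(iterations):
        w = [sum(gram[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            return [0.0] * n  # no variation left after centring
        v = [c / norm for c in w]
    # Rayleigh quotient: the largest eigenvalue of the product matrix.
    gv = [sum(gram[i][k] * v[k] for k in range(n)) for i in range(n)]
    eigenvalue = max(sum(a * b for a, b in zip(v, gv)), 0.0)
    # Scale the unit eigenvector so scores are projections onto the first PC.
    return [c * math.sqrt(eigenvalue) for c in v]


# Invented toy data: three individuals from each of two hypothetical groups.
toy_genotypes = [
    [2, 2, 0, 0, 1],  # group A
    [2, 1, 0, 0, 1],  # group A
    [2, 2, 0, 1, 1],  # group A
    [0, 0, 2, 2, 1],  # group B
    [0, 1, 2, 2, 1],  # group B
    [0, 0, 2, 1, 1],  # group B
]

scores = first_pc_scores(toy_genotypes)
for label, score in zip("AAABBB", scores):
    print(label, round(score, 3))
```

In this toy case the three group A individuals share one sign on the first component and the three group B individuals share the other. Real genome-wide data sets, with hundreds of thousands of SNPs, are normally handled with dedicated population genetics software rather than hand-written code like this.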

Alternative models put the origins of Munda language speakers either in South Asia (the Munda speakers carry exclusively autochthonous South Asian mtDNA variants) or in Southeast Asia, where the other Austro-Asiatic languages have spread. Y-chromosome variation supports the latter model through the sharing of haplogroup (hg) O2a in both regions. We show that, in addition to the dominant ancestry component being shared between the Indian Dravidic and Munda speakers, up to 30% of Munda speakers retain an ancestry component otherwise prevalent in East Asia. There is no widespread sign of a South Asian ancestry component in Southeast Asia. This provides genomic support for the model in which Indian Austro-Asiatic populations derive from a dispersal from Southeast/East Asia, followed by extensive admixture with local Indian populations.

Authors’ Affiliations

(1) Estonian Biocentre and Department of Evolutionary Biology, University of Tartu
(2) Institute of Biochemistry and Genetics, Ufa Research Center, Russian Academy of Sciences
(3) Centre for Cellular and Molecular Biology
(4) Leverhulme Centre of Human Evolutionary Studies, The Henry Wellcome Building, University of Cambridge
(5) Department of Immunology, School of Biological Sciences, Madurai Kamaraj University
(6) School of Biotechnology, Bharathidasan University

Copyright

© Metspalu et al; licensee BioMed Central Ltd. 2010

This article is published under license to BioMed Central Ltd.
