1000 Genomes Project

General Background 

In 2008, the international 1000 Genomes Consortium launched the 1000 Genomes Project to develop a resource on human genetic variation that contains information on most of the genetic variants with frequencies of 1% or higher in the studied set of samples. This resource will support genome-wide association studies and other studies relating genetic variation to health and disease.

The 1000 Genomes Project started with three pilot projects, to provide data that would be used to help design the full-scale project:

  • Pilot 1 sequenced 179 samples at low coverage from the HapMap CEU, YRI, CHB, and JPT populations.
  • Pilot 2 sequenced two trios at high coverage:
    • CEU: NA12878 (daughter), NA12892 (mother), and NA12891 (father).
    • YRI: NA19240 (daughter), NA19238 (mother), and NA19239 (father).
  • Pilot 3 sequenced the exons of 906 genes at high coverage in 697 samples from the CEU, TSI, YRI, LWK, CHB, JPT, and CHD HapMap 3 populations.

For the full-scale project, samples from 26 populations were studied in three phases. In addition to several new populations, these include the original HapMap populations (Yoruba, CEPH, Han Chinese, and Japanese) and several populations from the extended HapMap 3 sample set (Luhya, Toscani, African Ancestry, Mexican Ancestry, and Gujarati).


1000 Genomes Project Sample Data

The NHGRI Repository at Coriell does not house the data generated from the 1000 Genomes Project. All data for the 1000 Genomes Project are freely available to the public through the 1000 Genomes Project website and dbSNP.


Populations in the 1000 Genomes Project

All 1000 Genomes Project population samples are available from the NHGRI Repository at Coriell, except for the CEPH [CEU] population samples, which are available from the NIGMS Repository at Coriell (see table below). The samples have no identifying or phenotype information available. Donors gave broad consent for use of the samples, including for genotyping, sequencing, and cellular phenotype studies.

All cell culture and DNA samples can be purchased individually. The NHGRI Repository also offers standard DNA panels for each population. Each standard panel provides all the samples from unrelated individuals that were used for the 1000 Genomes and HapMap Project populations. Except for the YRI trio included in the Pilot 2 project (with child sample NA19240), child DNA samples are not included in the panels but can be ordered as individual samples. Each sample in a panel contains 2 micrograms of DNA. Offering a standard panel for each population with a smaller amount of DNA per sample lowers the cost: each panel costs $1000 or less, compared with $5500 when the same samples are ordered individually (with 50 micrograms of DNA each).

*Note that for the HapMap populations listed below, the samples in these panels overlap with other plates or panels offered as part of the International HapMap collection.


Population                                     Panel     Number of Individual DNA Samples  Number of Individual Cell Cultures
African Ancestry in SW USA [ASW]               MGP00015  62                                62
African Caribbean in Barbados [ACB]            MGP00016  120                               120
Bengali in Bangladesh [BEB]                    MGP00022  144                               144
British From England and Scotland [GBR]        MGP00003  100                               100
Chinese Dai in Xishuangbanna, China [CDX]      MGP00012  102                               102
Colombian in Medellín, Colombia [CLM]          MGP00005  136                               136
Esan in Nigeria [ESN]                          MGP00023  173                               173
Finnish in Finland [FIN]                       MGP00001  103                               103
Gambian in Western Division – Mandinka [GWD]   MGP00019  179                               179
Gujarati Indians in Houston, Texas, USA [GIH]  MGP00018  109                               109
Han Chinese in Beijing, China [CHB]            MGP00017  120                               120
Han Chinese South [CHS]                        MGP00002  163                               163
Iberian Populations in Spain [IBS]             MGP00010  157                               157
Indian Telugu in the U.K. [ITU]                MGP00025  118                               118
Japanese in Tokyo, Japan [JPT]                 MGP00009  120                               120
Kinh in Ho Chi Minh City, Vietnam [KHV]        MGP00014  124                               124
Luhya in Webuye, Kenya [LWK]                   MGP00008  120                               120
Mende in Sierra Leone [MSL]                    MGP00021  128                               128
Mexican Ancestry in Los Angeles CA USA [MXL]   MGP00006  71                                71
Peruvian in Lima Peru [PEL]                    MGP00011  122                               122
Puerto Rican in Puerto Rico [PUR]              MGP00004  139                               139
Punjabi in Lahore, Pakistan [PJL]              MGP00020  158                               158
Sri Lankan Tamil in the UK [STU]               MGP00024  128                               128
Toscani in Italia [TSI]                        MGP00007  114                               114
Yoruba in Ibadan, Nigeria [YRI]                MGP00013  120                               120

* CEPH Collection [CEU] samples are available from the NIGMS Human Genetic Cell Repository at Coriell.
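
For readers who want to work with the panel table programmatically, the following is a minimal Python sketch that transcribes the rows above into a lookup keyed by population code. The names Panel, PANELS, and panel_for are hypothetical and chosen only for illustration; they are not part of any Coriell or 1000 Genomes tool. The CEU population is omitted because, as noted above, it is distributed through the NIGMS Repository rather than as one of these panels.

```python
from typing import NamedTuple


class Panel(NamedTuple):
    """One row of the panel table (hypothetical structure for illustration)."""
    population: str
    catalog_id: str
    dna_samples: int
    cell_cultures: int


# Rows transcribed from the table above, keyed by population code.
PANELS = {
    "ASW": Panel("African Ancestry in SW USA", "MGP00015", 62, 62),
    "ACB": Panel("African Caribbean in Barbados", "MGP00016", 120, 120),
    "BEB": Panel("Bengali in Bangladesh", "MGP00022", 144, 144),
    "GBR": Panel("British From England and Scotland", "MGP00003", 100, 100),
    "CDX": Panel("Chinese Dai in Xishuangbanna, China", "MGP00012", 102, 102),
    "CLM": Panel("Colombian in Medellín, Colombia", "MGP00005", 136, 136),
    "ESN": Panel("Esan in Nigeria", "MGP00023", 173, 173),
    "FIN": Panel("Finnish in Finland", "MGP00001", 103, 103),
    "GWD": Panel("Gambian in Western Division – Mandinka", "MGP00019", 179, 179),
    "GIH": Panel("Gujarati Indians in Houston, Texas, USA", "MGP00018", 109, 109),
    "CHB": Panel("Han Chinese in Beijing, China", "MGP00017", 120, 120),
    "CHS": Panel("Han Chinese South", "MGP00002", 163, 163),
    "IBS": Panel("Iberian Populations in Spain", "MGP00010", 157, 157),
    "ITU": Panel("Indian Telugu in the U.K.", "MGP00025", 118, 118),
    "JPT": Panel("Japanese in Tokyo, Japan", "MGP00009", 120, 120),
    "KHV": Panel("Kinh in Ho Chi Minh City, Vietnam", "MGP00014", 124, 124),
    "LWK": Panel("Luhya in Webuye, Kenya", "MGP00008", 120, 120),
    "MSL": Panel("Mende in Sierra Leone", "MGP00021", 128, 128),
    "MXL": Panel("Mexican Ancestry in Los Angeles CA USA", "MGP00006", 71, 71),
    "PEL": Panel("Peruvian in Lima Peru", "MGP00011", 122, 122),
    "PUR": Panel("Puerto Rican in Puerto Rico", "MGP00004", 139, 139),
    "PJL": Panel("Punjabi in Lahore, Pakistan", "MGP00020", 158, 158),
    "STU": Panel("Sri Lankan Tamil in the UK", "MGP00024", 128, 128),
    "TSI": Panel("Toscani in Italia", "MGP00007", 114, 114),
    "YRI": Panel("Yoruba in Ibadan, Nigeria", "MGP00013", 120, 120),
}


def panel_for(code: str) -> Panel:
    """Return the panel row for a population code such as 'YRI' (case-insensitive)."""
    return PANELS[code.strip().upper()]


# Total individual DNA samples across the listed panels.
total_dna_samples = sum(p.dna_samples for p in PANELS.values())

if __name__ == "__main__":
    print(panel_for("yri").catalog_id)
    print(len(PANELS), total_dna_samples)
```

Keying the lookup by the bracketed population code keeps it consistent with how the populations are referred to elsewhere on this page (for example, CEU, YRI, CHB, and JPT in the pilot descriptions).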