TY - JOUR
T1 - The Sequences of 150,119 Genomes in the UK Biobank
AU - Halldorsson, Bjarni V.
AU - Eggertsson, Hannes P.
AU - Moore, Kristjan H. S.
AU - Hauswedell, Hannes
AU - Eiriksson, Ogmundur
AU - Ulfarsson, Magnus O.
AU - Palsson, Gunnar
AU - Hardarson, Marteinn T.
AU - Oddsson, Asmundur
AU - Jensson, Brynjar O.
AU - Kristmundsdottir, Snaedis
AU - Sigurpalsdottir, Brynja D.
AU - Stefansson, Olafur A.
AU - Beyter, Doruk
AU - Holley, Guillaume
AU - Tragante, Vinicius
AU - Gylfason, Arnaldur
AU - Olason, Pall I.
AU - Zink, Florian
AU - Asgeirsdottir, Margret
AU - Sverrisson, Sverrir T.
AU - Sigurdsson, Brynjar
AU - Gudjonsson, Sigurjon A.
AU - Sigurdsson, Gunnar T.
AU - Halldorsson, Gisli H.
AU - Sveinbjornsson, Gardar
AU - Norland, Kristjan
AU - Styrkarsdottir, Unnur
AU - Magnusdottir, Droplaug N.
AU - Snorradottir, Steinunn
AU - Kristinsson, Kari
AU - Sobech, Emilia
AU - Jonsson, Helgi
AU - Geirsson, Arni J.
AU - Olafsson, Isleifur
AU - Jonsson, Palmi
AU - Pedersen, Ole Birger
AU - Erikstrup, Christian
AU - Brunak, Søren
AU - Ostrowski, Sisse Rye
AU - Andersen, Steffen
AU - Banasik, Karina
AU - Burgdorf, Kristoffer Sølvsten
AU - Didriksen, Maria
AU - Dinh, Khoa Manh
AU - Folkmann Hansen, Thomas
AU - Hjalgrim, Henrik
AU - Jemec, Gregor Borut Ernst
AU - Jennum, Poul
AU - Johansson, Pär Ingemar
AU - Hørup Larsen, Margit Anita
AU - Mikkelsen, Susan
AU - Nielsen, Kasper René
AU - Nyegaard, Mette
AU - Sækmose, Susanne
AU - Sørensen, Erik
AU - Thorsteinsdottir, Unnur
AU - Bruun, Mie Topholm
AU - Ullum, Henrik
AU - Werge, Thomas
AU - Thorleifsson, Gudmar
AU - Jonsson, Frosti
AU - Melsted, Pall
AU - Jonsdottir, Ingileif
AU - Rafnar, Thorunn
AU - Holm, Hilma
AU - Stefansson, Hreinn
AU - Saemundsdottir, Jona
AU - Gudbjartsson, Daniel F.
AU - Magnusson, Olafur T.
AU - Masson, Gisli
AU - Helgason, Agnar
AU - Jonsson, Hakon
AU - Sulem, Patrick
AU - Stefansson, Kari
AU - DBDS Genetic Consortium
N1 - Published online: 20 Jul 2022.
PY - 2022/7/28
Y1 - 2022/7/28
N2 - Detailed knowledge of how diversity in the sequence of the human genome affects phenotypic diversity depends on a comprehensive and reliable characterization of both sequences and phenotypic variation. Over the past decade, insights into this relationship have been obtained from whole-exome sequencing or whole-genome sequencing of large cohorts with rich phenotypic data. Here we describe the analysis of whole-genome sequencing of 150,119 individuals from the UK Biobank. This constitutes a set of high-quality variants, including 585,040,410 single-nucleotide polymorphisms, representing 7.0% of all possible human single-nucleotide polymorphisms, and 58,707,036 indels. This large set of variants allows us to characterize selection based on sequence variation within a population through a depletion rank score of windows along the genome. Depletion rank analysis shows that coding exons represent a small fraction of regions in the genome subject to strong sequence conservation. We define three cohorts within the UK Biobank: a large British Irish cohort, a smaller African cohort and a South Asian cohort. A haplotype reference panel is provided that allows reliable imputation of most variants carried by three or more sequenced individuals. We identified 895,055 structural variants and 2,536,688 microsatellites, groups of variants typically excluded from large-scale whole-genome sequencing studies. Using this formidable new resource, we provide several examples of trait associations for rare variants with large effects not found previously through studies based on whole-exome sequencing and/or imputation.
KW - DNA sequencing
KW - Genetic markers
KW - Genetic variation
KW - Genetics research
KW - Genome-wide association studies
U2 - 10.1038/s41586-022-04965-x
DO - 10.1038/s41586-022-04965-x
M3 - Journal article
SN - 0028-0836
VL - 607
SP - 732
EP - 740
JO - Nature
JF - Nature
IS - 7920
ER -