Estonian Biobank
The EstBB is a volunteer-based biobank at the Institute of Genomics, University of Tartu [62]. The current EstBB data freeze consists of 212,955 adult (age ≥ 18 years) participants, reflecting the age, sex and geographical distribution of the adult Estonian population, for whom biological samples as well as a variety of health-related and demographic information have been collected. All biobank participants have signed a broad informed consent form, and their blood samples were collected across the country between 2002 and 2021 [62,63]. The activities of EstBB are regulated by the Human Genes Research Act, which was adopted in 2000 specifically for the operations of EstBB. The Nightingale Health NMR platform was used to generate plasma metabolic trait profiles for all individual samples in the biobank. The assay covers 249 metabolic traits ranging from low molecular weight compounds to lipids and lipoproteins. Individual-level data analysis in EstBB was carried out under ethical approval 1.1-12/624 from the Estonian Committee on Bioethics and Human Research (Estonian Ministry of Social Affairs), using data according to release application 6-7/GI/8988 from the EstBB.
UK Biobank
The UKBB is a longitudinal biomedical study of approximately half a million participants aged 38–71 years from the UK [64]. Participant recruitment was conducted on a volunteer basis and took place between 2006 and 2010. Initial data were collected in 22 different assessment centres throughout Scotland, England and Wales. The collected data include extensive genotype, environmental and lifestyle data. Blood samples were drawn at baseline for all participants, with an average of 4 h since the last meal (that is, generally non-fasting). NMR metabolic traits (Nightingale Health, quantification library 2020) were measured from EDTA plasma samples (aliquot 3) from the entire cohort during 2019–2024. Details on the NMR metabolomic measurements in UKBB have been described previously for the first tranche of ~120,000 samples [65]. The UKBB study was approved by the North West Multi-Centre Research Ethics Committee. This research was conducted using the UKBB Resource under application numbers 91233 and 30418.
NMR data QC and normalization
NMR data generation in the EstBB and UKBB has been previously described [66]. During the quality control of the NMR metabolomics data, we detected differences between the distributions of several metabolic traits (notably Ala and His) that were driven primarily by spectrometer and batch effects. We removed this unwanted technical variation from both the EstBB and UKBB data using the R package ukbnmr [67]. We excluded individuals with more than 5 missing metabolic trait measurements from the cohort, confirmed that none of the 249 metabolic traits had a significant number of missing measurements (8,000 for EstBB, 24,000 for UKBB), and applied inverse normal transformation to each metabolic trait to obtain the final dataset.
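The sample filter and the per-trait transformation can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names are hypothetical, missing values are represented as None, and the rank offset of 3/8 (Blom) and the averaging of tied ranks are assumptions, because the text does not specify them.

```python
from statistics import NormalDist

MAX_MISSING = 5      # from the text: more than 5 missing traits excludes a sample
BLOM_OFFSET = 3 / 8  # assumed rank offset; the text does not state one


def keep_individual(traits):
    """Return True if a sample has at most MAX_MISSING missing (None) values."""
    return sum(v is None for v in traits) <= MAX_MISSING


def inverse_normal_transform(values, offset=BLOM_OFFSET):
    """Rank-based inverse normal transform of one trait; None stays None."""
    observed = sorted((v, i) for i, v in enumerate(values) if v is not None)
    n = len(observed)
    out = [None] * len(values)
    std_normal = NormalDist()
    start = 0
    while start < n:
        # Locate the run of tied values and assign them their average rank.
        end = start
        while end + 1 < n and observed[end + 1][0] == observed[start][0]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # ranks are 1-based
        z = std_normal.inv_cdf((avg_rank - offset) / (n - 2 * offset + 1))
        for _, idx in observed[start:end + 1]:
            out[idx] = z
        start = end + 1
    return out
```

The transform is applied to each trait separately, after the sample filter, so that every trait ends up approximately standard normal regardless of its original scale.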
Association testing and meta-analysis
Genotype imputation for the EstBB and UKBB cohorts is described in Supplementary Note 3. We conducted genome-wide association tests for each of the seven genetic ancestry groups separately using regenie v3.1.1 [68], with sex, age, age squared and the top principal components (PCs) of the genotype data used as covariates (PC1–PC10 for EstBB, PC1–PC20 for UKBB). For step 1 (whole-genome model), we used genotype calls for UKBB and genotyping data for EstBB and included variants with a MAF of at least 1%, a minor allele count of at least 20, a Hardy-Weinberg equilibrium exact test P value of at least 10^−15, and maximum per-variant and per-sample missing genotype rates of 0.1. For step 2 (association testing using a linear regression model), we used imputed genotypes and selected variants with a minor allele count of at least 20 and an imputation INFO score of at least 0.6.
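The variant-level inclusion criteria for the two steps can be written as simple predicates. The sketch below is illustrative only: the Variant class and its field names are hypothetical, the per-sample missingness filter is not shown, and the Hardy-Weinberg filter is assumed to follow the usual convention of excluding variants whose exact test P value falls below the threshold.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical per-variant summary used only for this illustration."""
    maf: float           # minor allele frequency
    mac: int             # minor allele count
    hwe_p: float         # Hardy-Weinberg equilibrium exact test P value
    missing_rate: float  # fraction of samples with a missing genotype
    info: float = 1.0    # imputation INFO score (relevant to step 2 only)


def passes_step1(v):
    """Step 1 (whole-genome model) filters on directly genotyped variants."""
    return (
        v.maf >= 0.01
        and v.mac >= 20
        and v.hwe_p >= 1e-15      # assumed direction: drop strong HWE violations
        and v.missing_rate <= 0.1
    )


def passes_step2(v):
    """Step 2 (association testing) filters on imputed variants."""
    return v.mac >= 20 and v.info >= 0.6
```

Step 2 is deliberately more permissive on allele frequency, so rare imputed variants are tested as long as they reach the minor allele count and INFO thresholds.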
We performed two different inverse-variance weighted fixed-effect meta-analyses: meta_EUR, on individuals of predominantly European genetic ancestry (EstBB cohort and EUR genetic ancestry group of UKBB), and meta_ALL, which encompasses all seven genetic ancestry groups from UKBB and EstBB.
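For a single variant and trait, an inverse-variance weighted fixed-effect meta-analysis combines the per-group effect estimates with weights equal to the inverse of their squared standard errors. The following Python sketch shows the standard calculation; the function name is hypothetical, the two-sided P value assumes a normally distributed test statistic, and the software actually used for the meta-analyses is not named in this section.

```python
import math


def ivw_fixed_effect(betas, ses):
    """Combine per-group effect sizes and standard errors for one variant.

    Returns the pooled effect, its standard error, the z score and a
    two-sided P value under a normal approximation.
    """
    weights = [1.0 / se ** 2 for se in ses]  # w_i = 1 / se_i^2
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal tail probability
    return beta, se, z, p


# Example with two hypothetical groups of equal precision.
pooled_beta, pooled_se, z_score, p_value = ivw_fixed_effect([0.10, 0.20], [0.05, 0.05])
```

Because the weights depend only on the standard errors, larger groups with more precise estimates dominate the pooled effect, which is why meta_ALL is driven mostly by the European-ancestry groups when the other groups are small.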
Genetic correlations