Causal modelling of gene effects from regulators to programs to traits

Datasets

GWAS data

We downloaded the publicly available GWAS summary statistics and SNP heritability estimates for traits in the UKB from Ben Neale’s laboratory (see the URL section below). We focused on traits with SNP heritability estimates exceeding 0.04.
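The heritability filter described above can be sketched as follows. This is a minimal illustration, not the authors' code: the column names `trait` and `h2`, the function name `select_heritable_traits` and the toy values are assumptions made for the example, and the real heritability file may use different headers. Only the 0.04 cutoff comes from the text.

```python
import csv
import io

H2_THRESHOLD = 0.04  # cutoff stated in the text


def select_heritable_traits(rows, h2_column="h2", threshold=H2_THRESHOLD):
    """Return trait IDs whose SNP heritability strictly exceeds the threshold.

    `rows` is an iterable of dicts (e.g. from csv.DictReader). The column
    names are hypothetical and chosen for this sketch only.
    """
    selected = []
    for row in rows:
        try:
            h2 = float(row[h2_column])
        except (KeyError, ValueError):
            continue  # skip rows with a missing or non-numeric estimate
        if h2 > threshold:
            selected.append(row["trait"])
    return selected


# Toy table standing in for the downloaded file (all values are made up).
toy_tsv = (
    "trait\th2\n"
    "trait_a\t0.46\n"
    "trait_b\t0.04\n"   # exactly 0.04 does not exceed the cutoff
    "trait_c\t0.012\n"
    "trait_d\tNA\n"     # non-numeric estimate, skipped
)
rows = list(csv.DictReader(io.StringIO(toy_tsv), delimiter="\t"))
selected = select_heritable_traits(rows)
print(selected)
```

The comparison is strict because the text says "exceeding 0.04"; a trait estimated at exactly 0.04 is therefore dropped in this sketch.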

LoF data

We used LoF burden test summary statistics from 454,787 UKB participants, as previously reported (ref. 1). Specifically, we used the gene-level aggregated effect estimates from predicted LoF variants with a minor allele frequency of less than 0.01%. Data were downloaded from the GWAS Catalog (ref. 67).

Perturb-seq data

We utilized the genome-wide Perturb-seq dataset in K562 reported by Replogle et al. (ref. 2). In this dataset, all expressed genes (n = 9,866) were targeted by a multiplexed CRISPRi sgRNA library in K562 cells engineered to express dCas9–KRAB. Single-cell RNA sequencing was performed to read out the sgRNAs together with the transcriptome. Only cells with a single genetic perturbation were used for the analysis, amounting to a median of 166 cells per gene perturbation and 11,499 unique molecular identifiers per cell. We downloaded the raw count data that the authors uploaded to figshare (see the URLs in the Code availability section).
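The single-perturbation filter and the per-perturbation cell count can be illustrated with a small sketch. It is written under stated assumptions and is not the authors' pipeline: the input layout (a mapping from cell barcode to the target genes of its detected sgRNAs), the function names and the toy barcodes are all hypothetical. It also assumes that several guides against the same gene count as one perturbation, which seems natural for a multiplexed library but is not spelled out in the text.

```python
from collections import Counter
from statistics import median


def keep_single_perturbation_cells(cell_targets):
    """Keep cells whose detected sgRNAs all target exactly one gene.

    `cell_targets` maps a cell barcode to a list of target genes, one entry
    per detected guide (hypothetical layout for this sketch). Returns a dict
    mapping each retained barcode to its single perturbed gene.
    """
    kept = {}
    for cell, targets in cell_targets.items():
        unique_targets = set(targets)
        if len(unique_targets) == 1:  # drops unperturbed and multi-gene cells
            kept[cell] = next(iter(unique_targets))
    return kept


def median_cells_per_perturbation(kept):
    """Median number of retained cells across perturbed genes."""
    counts = Counter(kept.values())
    return median(counts.values())


# Toy assignments (made up for illustration).
toy_cells = {
    "cell_1": ["GENE_A", "GENE_A"],  # two guides, same gene: kept
    "cell_2": ["GENE_A"],
    "cell_3": ["GENE_A", "GENE_B"],  # two different genes: dropped
    "cell_4": ["GENE_B"],
    "cell_5": [],                    # no guide detected: dropped
}
kept = keep_single_perturbation_cells(toy_cells)
toy_median = median_cells_per_perturbation(kept)
print(kept, toy_median)
```

On real data the same count, taken over all retained cells, would correspond to the median of 166 cells per gene perturbation quoted above.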

For additional analyses, we utilized Perturb-seq data for essential genes in K562, RPE1 (ref. 2), HepG2 and Jurkat (ref. 57) cell lines. Only cells with a single genetic perturbation were used for the analysis. The number of perturbations and the number of cells per perturbation are summarized in Supplementary Table 4. We downloaded the raw count data uploaded to figshare (see the URLs in the Code availability section) or the Gene Expression Omnibus (GSE264667).

ChIP–seq data

We utilized chromatin immunoprecipitation followed by sequencing (ChIP–seq) data in K562 for annotating gene programs. We downloaded 830 transcription factor ChIP–seq narrow peak files from the ENCODE project website (ref. 48; see the URL in the Code availability section). All coordinates were mapped to hg19 with LiftOver (ref. 68).
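For readers unfamiliar with the narrow peak files mentioned above, the sketch below parses one record of the ENCODE narrowPeak format, a tab-separated BED6+4 layout whose tenth column is the summit offset from the peak start. The `NarrowPeak` type, the function name and the toy record are assumptions for illustration. Coordinate conversion with LiftOver is a separate step and is not shown here.

```python
from typing import NamedTuple


class NarrowPeak(NamedTuple):
    chrom: str
    start: int     # 0-based start, as in BED
    end: int       # exclusive end
    signal: float  # signalValue column
    summit: int    # absolute summit position, or -1 if no summit was called


def parse_narrowpeak_line(line):
    """Parse one tab-separated narrowPeak (BED6+4) record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        raise ValueError(f"expected 10 columns, got {len(fields)}")
    start, end = int(fields[1]), int(fields[2])
    offset = int(fields[9])  # summit offset relative to start; -1 if absent
    summit = start + offset if offset >= 0 else -1
    return NarrowPeak(fields[0], start, end, float(fields[6]), summit)


# Toy record (values made up for illustration).
toy_line = "chr1\t1000\t1200\tpeak_1\t500\t.\t12.5\t-1\t3.2\t80\n"
peak = parse_narrowpeak_line(toy_line)
print(peak)
```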