Bidirectional CRISPR screens decode a GLIS3-dependent fibrotic cell circuit

Generating the integrated IBD scRNA-seq atlas

Previously published scRNA-seq datasets on CD and UC were used for integrative analysis2,9,11,13. The dataset of Smillie et al.2 was generated with colonic tissue biopsies obtained from 12 non-IBD controls and 18 patients with UC (from both inflamed and non-inflamed regions). Libraries were prepared for single-cell profiling by fractionating the tissues into epithelial and lamina propria fractions before downstream processing. The Kong et al.13 dataset was generated from 13 non-IBD controls and 46 patients with CD and consisted of biopsies obtained from inflamed regions of 17 participants and non-inflamed regions of 43 patients with disease. Library preparation for single-cell profiling was mixed: some samples were separated into epithelial and lamina propria fractions and others were not. Biopsies were obtained from three segments of the gastrointestinal tract: small bowel, terminal ileum and colon. The Friedrich et al.11 dataset was generated by fractionation and sorting of EPCAM− and CD45− cells from colonic biopsies obtained from 4 non-IBD controls and 11 patients with UC, comprising 7 inflamed region biopsies and 4 non-inflamed region biopsies. The Martin et al.9 dataset consisted of separated lamina propria fractions of paired involved and non-involved region biopsies from 11 patients with CD. The combined dataset before downstream quality control filtering comprised 1,143,316 cells from 115 patients. Data were analysed using scanpy54.
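As a quick consistency check, the per-study donor counts stated above can be tallied. This is a minimal sketch: the dictionary and function names are illustrative and not part of the original analysis, and the numbers are copied from the paragraph above.

```python
# Donor counts per study, copied from the text above.
# The dictionary layout and function names are illustrative only.
cohorts = {
    "Smillie et al.": {"non-IBD": 12, "UC": 18},
    "Kong et al.": {"non-IBD": 13, "CD": 46},
    "Friedrich et al.": {"non-IBD": 4, "UC": 11},
    "Martin et al.": {"CD": 11},
}


def total_donors(cohorts):
    """Sum donors over all studies and diagnosis groups."""
    return sum(sum(groups.values()) for groups in cohorts.values())


def donors_by_group(cohorts):
    """Sum donors per diagnosis group across studies."""
    totals = {}
    for groups in cohorts.values():
        for group, n in groups.items():
            totals[group] = totals.get(group, 0) + n
    return totals


print(total_donors(cohorts))     # 115
print(donors_by_group(cohorts))  # {'non-IBD': 29, 'UC': 29, 'CD': 57}
```

The total agrees with the 115 donors reported for the combined dataset, which shows that the figure counts non-IBD controls together with patients.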

scRNA-seq analysis and cell type identification

Gene expression normalization was performed on the combined scRNA-seq dataset to account for differences in sequencing depth across cells. Unique molecular identifier (UMI) counts were divided by the total UMI count per cell and scaled so that each cell totalled 10,000 transcripts. After natural-log transformation and scaling of the gene expression matrix, the 2,000 most highly variable genes were selected for a first round of dimensionality reduction. Thereafter, batch correction was performed on each individual patient sample using harmony55, followed by neighbourhood clustering and uniform manifold approximation and projection (UMAP) embedding of the single cells56. On the basis of expression of known markers of epithelial (KRT8, EPCAM), stromal (PDGFRA, PECAM1, ACTA2, S100B, RGS5) and immune (CD79A, MZB1, CD3D, TRAC, C1QA, TPSAB) cell populations, the clusters were then subdivided into these three main compartments for subsequent rounds of clustering and analysis. For compartment-specific analyses, gene expression normalization was performed after excluding highly expressed genes (those with more than 5% of the total UMI count per cell) to minimize their contribution to the normalization.
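The depth normalization described above can be sketched in plain NumPy. This is an illustrative re-implementation, not the authors' code: the function name `normalize_log` and the toy matrix are assumptions. In scanpy the same steps roughly correspond to `sc.pp.normalize_total(adata, target_sum=1e4)` followed by `sc.pp.log1p(adata)`, with `exclude_highly_expressed=True, max_fraction=0.05` for the compartment-specific variant. Whether the authors used exactly these calls is an assumption.

```python
import numpy as np


def normalize_log(counts, target_sum=1e4, max_fraction=None):
    """Depth-normalize a cells x genes UMI matrix, then apply log(1 + x).

    If max_fraction is given, genes that exceed that fraction of a cell's
    total counts in at least one cell are left out of the size factor,
    so the remaining genes sum to target_sum in every cell.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    if max_fraction is not None:
        safe_totals = np.where(totals > 0, totals, 1.0)
        high = (counts / safe_totals > max_fraction).any(axis=0)
        totals = counts[:, ~high].sum(axis=1, keepdims=True)
    size = np.where(totals > 0, totals, 1.0)  # avoid division by zero
    return np.log1p(counts / size * target_sum)


# Toy matrix: 2 cells x 3 genes (hypothetical values).
counts = np.array([[8, 1, 1], [2, 4, 4]])

plain = normalize_log(counts)
# A cutoff of 0.5 is used only because the toy matrix has 3 genes;
# the text above uses 5% (max_fraction=0.05).
excl = normalize_log(counts, max_fraction=0.5)
```

With the plain call, every cell sums to 10,000 before the log transform. With the exclusion, gene 0 (80% of the first cell) is dropped from the size factor, so only the remaining genes sum to 10,000.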

In the stromal compartment dataset, only genes expressed in more than five cells were considered for further integrative analysis. Dimensionality reduction and batch correction were performed, adjusting for the following covariates: 10x Genomics Single Cell Gene Expression Solution chemistry (v.1, v.2 or v.3), patient, study and tissue site (small bowel, ileum or colon). The top 45 adjusted principal components were used for neighbourhood clustering with the Leiden algorithm57 and visualized using UMAP embedding. This was followed by post hoc analysis of the identified clusters to remove poor-quality cells: clusters with low UMI counts or a high mitochondrial gene fraction were removed, and clusters expressing lineage markers of non-stromal cells were removed as doublets. A Wilcoxon rank-sum test was performed to define the markers specific to individual clusters and to annotate each cluster.
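The cluster-level quality filter described above can be illustrated with a small standard-library sketch. The source gives no cutoffs, so the thresholds, the function name `flag_poor_clusters` and the toy cells are hypothetical and serve only to show the logic of removing whole clusters by their summary metrics.

```python
from collections import defaultdict
from statistics import median


def flag_poor_clusters(cells, min_median_umi, max_median_mito):
    """Return sorted labels of clusters whose median QC metrics fail.

    cells: iterable of (cluster_label, total_umi, mito_fraction) tuples.
    The cutoffs are supplied by the caller; the source states none.
    """
    umi = defaultdict(list)
    mito = defaultdict(list)
    for cluster, total_umi, mito_fraction in cells:
        umi[cluster].append(total_umi)
        mito[cluster].append(mito_fraction)
    return sorted(
        c for c in umi
        if median(umi[c]) < min_median_umi or median(mito[c]) > max_median_mito
    )


# Hypothetical cells: (cluster, total UMI count, mitochondrial fraction).
cells = [
    ("0", 5200, 0.04), ("0", 4800, 0.06), ("0", 6100, 0.05),
    ("1", 450, 0.08), ("1", 380, 0.07), ("1", 520, 0.09),     # low UMI counts
    ("2", 3900, 0.31), ("2", 4100, 0.28), ("2", 3500, 0.35),  # high mito fraction
]

# Hypothetical cutoffs, chosen for illustration only.
poor = flag_poor_clusters(cells, min_median_umi=1000, max_median_mito=0.2)
kept = [c for c in cells if c[0] not in poor]
print(poor)  # ['1', '2']
```

Filtering whole clusters rather than individual cells matches the post hoc approach in the text. Doublet clusters would be flagged separately, by checking for non-stromal lineage markers.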

Analyses for the epithelial and immune compartments followed a similar workflow to that described above, except that further rounds of iterative clustering were performed after removal of doublet and low-quality cells.

Generation of a Xenium-based spatial transcriptomics dataset

Human sample collection

Sixteen patients diagnosed with UC, CD or diverticulitis who were recruited into the Prospective Registry in IBD Study at MGH (PRISM) at Massachusetts General Hospital (MGH) participated in this study. Informed consent was obtained from all patients in accordance with the protocol approved by the institutional review board (IRB; 2004P001067). The study protocol complied with all relevant ethical regulations. Samples from patients with diverticulitis were confirmed by a pathologist to be histologically normal. Excess tissues from clinically warranted surgical resections were collected for research purposes. IRB-approved secondary use protocol 2020P001262 allowed use of these tissues in research at the Broad Institute of MIT and Harvard.

Human colon sample processing
