Tumor Immune Single-cell Hub (TISCH) is a scRNA-seq database that aims to characterize the tumor microenvironment at single-cell resolution.
Data collection and processing
We collected tumor-related scRNA-seq studies from human and mouse. In addition to datasets from treatment-naive patients, datasets with treated samples are also included. For each collected dataset, a uniform analysis pipeline, MAESTRO, was used to perform quality control, clustering and cell-type annotation (Fig. 1). After this streamlined processing, we curated the cell-type annotation of all datasets at three levels: malignancy, major-lineage and minor-lineage (Fig. 2). This curation makes gene expression in different cell types comparable across all datasets.
Currently, after quality control, a total of 1,944,551 cells from 76 datasets across 28 cancer types and 101,195 cells from 3 PBMC datasets are retained in TISCH (Fig. 3).
Function of TISCH
Based on the unified data processing, TISCH presents the analysis results in a user-friendly interface for public access, which allows researchers to quickly gain insight into the expression of genes of interest at the single-cell level (Fig. 1).
Starting from a cancer type
If users are interested in one cancer type, they can click the tissue card on the home page to query the related datasets.
On the dataset page, users can further filter the query results according to other criteria. For example, users may be interested in LIHC data from human patients without treatment. The datasets satisfying these conditions will then be displayed.
Users can select multiple datasets and click the Submit button to view the selected datasets side by side. Users can then input genes of interest to compare gene expression across datasets. In addition, users can explore the expression pattern of a gene signature by uploading a line-separated gene list file. The level of cell-type annotation can be switched.
If users are interested in one specific dataset, they can click the left annotated UMAP plot to explore it in detail. The page will be redirected to the single-dataset page.
In the overview tab of the single-dataset page, the clustering and annotation results are displayed at the top. Notably, users can click the right annotated UMAP plot to check the expression of cell-type-specific marker genes. As on the multiple-dataset page, the annotation of cells can be chosen from the three levels of cell-type annotation as well as from the meta information of the original study (if available).
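A line-separated gene list file is simply a plain-text file with one gene symbol per line. The following is a minimal sketch of preparing such a file; the helper names, the file name and the example signature (a handful of T cell exhaustion genes) are illustrative assumptions and are not part of TISCH.

```python
import tempfile
from pathlib import Path


def write_gene_list(genes, path):
    """Write one gene symbol per line, dropping blanks and duplicates."""
    seen = set()
    cleaned = []
    for gene in genes:
        symbol = gene.strip()
        if symbol and symbol not in seen:
            seen.add(symbol)
            cleaned.append(symbol)
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned


def read_gene_list(path):
    """Read the file back, ignoring empty lines."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


# Hypothetical signature used only for illustration.
signature = ["PDCD1", "CTLA4", "HAVCR2", "LAG3", "TIGIT", "PDCD1", ""]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "exhaustion_signature.txt"
    write_gene_list(signature, path)
    print(read_gene_list(path))
```

The resulting text file is the kind of input the upload option expects: no header, no separators other than line breaks.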
The pie and bar plots show the number of cells of each cell type and the cell-type proportions for each patient, respectively. The top differentially expressed genes for each cluster are shown below.
In the gene tab, users can search for genes of interest. Besides the UMAP plots, a violin plot will be returned to show the gene expression in different cell types. As on the multiple-dataset page, users can explore the expression pattern of a gene signature by uploading a line-separated gene list file.
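The two summaries behind the pie and bar plots can be sketched as follows. This is only an illustration of the underlying arithmetic, assuming each cell is represented as a (patient, cell type) pair; the function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def cell_type_counts(cells):
    """Number of cells per cell type (the quantity a pie plot would show)."""
    return Counter(cell_type for _patient, cell_type in cells)


def proportions_by_patient(cells):
    """Fraction of each cell type within each patient (the bar plot)."""
    per_patient = defaultdict(Counter)
    for patient, cell_type in cells:
        per_patient[patient][cell_type] += 1
    result = {}
    for patient, counts in per_patient.items():
        total = sum(counts.values())
        result[patient] = {ct: n / total for ct, n in counts.items()}
    return result


# Toy annotation table: (patient, cell type) for six cells.
cells = [
    ("P1", "CD8T"), ("P1", "CD8T"), ("P1", "Treg"), ("P1", "Mono/Macro"),
    ("P2", "CD8T"), ("P2", "Mono/Macro"),
]

print(cell_type_counts(cells))
print(proportions_by_patient(cells))
```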
For the violin plot, users can choose to group cells by tissue origin or by other available meta information.
TISCH also provides gene set enrichment analysis (GSEA) results for each dataset. In the GSEA tab, pre-calculated GSEA results are available for users to characterize the functional differences between cell types. We collected 16,626 gene sets from MSigDB, covering KEGG, hallmark, GO, immunological signatures, oncogenic signatures, and transcription factor targets. Heatmaps display the enriched up- or down-regulated pathways identified from the differentially expressed genes in each cluster. For datasets with treatment information, TISCH also provides GSEA results comparing functional pathways between treatment conditions or treatment responses for each cell type.
In addition, we integrated Single-Cell Signature Explorer (38) to compute pathway enrichment scores at single-cell resolution. Users can optionally select a hallmark pathway of interest to visualize the enrichment in individual cells.
Users can download the gene expression matrix averaged by cell type and the table of differentially expressed genes for further exploration.
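To give an intuition for a per-cell enrichment score, here is a minimal sketch. It assumes the score of a gene set in a cell is the fraction of that cell's UMI counts that come from genes in the set, which is our reading of how Single-Cell Signature Explorer scores gene sets; how TISCH configures the tool may differ, and the function name and toy counts are hypothetical.

```python
def signature_score(cell_counts, gene_set):
    """Fraction of a cell's UMI counts contributed by genes in gene_set.

    cell_counts maps gene symbol -> UMI count for one cell.
    Assumed definition; see the lead-in for the caveats.
    """
    total = sum(cell_counts.values())
    if total == 0:
        return 0.0  # an empty cell gets a score of zero
    in_set = sum(count for gene, count in cell_counts.items() if gene in gene_set)
    return in_set / total


# Toy cell with ten UMIs, seven of which fall in the gene set.
cell = {"GENE_A": 2, "GENE_B": 3, "GENE_C": 5}
pathway = {"GENE_A", "GENE_C", "GENE_Z"}  # GENE_Z is not detected in this cell
print(signature_score(cell, pathway))
```

Because the score is normalized by the total UMI count of each cell, it can be compared between cells with different sequencing depth.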
Starting with a gene of interest
If users are interested in one gene, they can enter the gene in the search bar and click the Explore button. The page will then be redirected to the gene page.
By default, the expression of the given gene will be visualized across all datasets in which the gene is expressed. Users can select the cancer types of interest to further filter the datasets.
After clicking the Search button, a heatmap and a violin plot will be displayed to reflect the gene expression (logTPM) in different cell types across all the selected datasets.
|AEL||Acute Erythroid Leukemia|
|ALL||Acute Lymphoblastic Leukemia|
|AML||Acute Myeloid Leukemia|
|BCC||Basal Cell Carcinoma|
|BLCA||Bladder Urothelial Carcinoma|
|BRCA||Breast Invasive Carcinoma|
|CLL||Chronic Lymphocytic Leukemia|
|HNSCC||Head and Neck Squamous Cell Carcinoma|
|KIRC||Kidney Renal Clear Cell Carcinoma|
|LIHC||Liver Hepatocellular Carcinoma|
|MCC||Merkel Cell Carcinoma|
|NSCLC||Non-small Cell Lung Cancer|
|OV||Ovarian Serous Cystadenocarcinoma|
|SCC||Squamous Cell Carcinoma|
|SKCM||Skin Cutaneous Melanoma|
|UCEC||Uterine Corpus Endometrial Carcinoma|
|AC-like Malignant||Astrocyte-like Malignant Cells|
|CD4Tconv||Conventional CD4 T Cells|
|CD8T||CD8 T Cells|
|CD8Tex||Exhausted CD8 T Cells|
|EryPro||Erythroid Progenitor Cells|
|Gland mucous||Gland Mucous Cells|
|GMP||Granulocyte-macrophage Progenitor Cells|
|Hepatic progenitor||Hepatic Progenitor Cells|
|HSC||Hematopoietic Stem Cells|
|ILC||Innate Lymphoid Cells|
|Mono/Macro||Monocytes or Macrophages|
|MES-like Malignant||Mesenchymal-like Malignant Cells|
|NB-like Malignant||Neuroblast-like Malignant Cells|
|NK||Natural Killer Cells|
|NPC-like Malignant||Neural-progenitor-cell-like Malignant Cells|
|OC-like Malignant||Oligodendrocyte-like Malignant Cells|
|OPC||Oligodendrocyte Precursor Cells|
|OPC-like Malignant||Oligodendrocyte-precursor-cell-like Malignant Cells|
|Pit mucous||Pit Mucous Cells|
|Secretory glandular||Secretory Glandular Cells|
|SMC||Smooth Muscle Cells|
|Tprolif||Proliferating T Cells|
|Treg||Regulatory T Cells|
1. How to cite TISCH?
Dongqing Sun, Jin Wang, Ya Han, Xin Dong, Jun Ge, Rongbin Zheng, Xiaoying Shi, Binbin Wang, Ziyi Li, Pengfei Ren, Liangdong Sun, Yilv Yan, Peng Zhang, Fan Zhang, Taiwen Li, Chenfei Wang, TISCH: a comprehensive web resource enabling interactive single-cell transcriptome visualization of tumor microenvironment, Nucleic Acids Research, gkaa1020, https://doi.org/10.1093/nar/gkaa1020
1. What are the units of the downloadable single-cell level expression matrices?
The values in the single-cell level expression matrix are normalized. We employed the global-scaling normalization method ('NormalizeData' function) in Seurat to scale the raw counts (UMI) in each cell to 10,000, and then log-transformed the results. The gene expression levels displayed in the UMAP and violin plots on the Dataset page are also quantified by these normalized values.
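The normalization described above can be sketched in a few lines. This is an illustrative re-implementation for a single cell, not the Seurat code itself; it assumes the default log-normalization behaviour of 'NormalizeData', in which counts are divided by the cell total, multiplied by the scale factor, and transformed with the natural logarithm of (1 + x). The function name and toy counts are hypothetical.

```python
import math


def log_normalize(counts, scale_factor=10_000):
    """Scale one cell's UMI counts to scale_factor, then apply log(1 + x).

    counts maps gene symbol -> raw UMI count for a single cell.
    """
    total = sum(counts.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {
        gene: math.log1p(count / total * scale_factor)
        for gene, count in counts.items()
    }


# Toy cell with four UMIs in total.
cell = {"GENE_A": 1, "GENE_B": 3}
normalized = log_normalize(cell)
print(normalized)

# Undoing the log shows that the scaled counts sum to the scale factor.
print(sum(math.expm1(value) for value in normalized.values()))
```

To recover values on the scaled (non-log) scale from a downloaded matrix, the inverse transform under these assumptions is exp(x) - 1.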
2. How to understand the values in the heatmaps and the violin plots of the Gene page?
First, for the Gene page, we converted raw counts or FPKM, depending on the available data, to TPM so that expression levels are roughly comparable between datasets. The expression of a gene in a cell was quantified as log2(TPM/10+1). TPM values were divided by 10 to lower the impact of varying dropout rates between genes. Second, the values in the heatmap are the mean expression values of the gene in each cell type of each dataset. These mean values are computed within each dataset separately, which means we did not perform any normalization across multiple datasets.
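The quantities above can be illustrated with a short sketch. The TPM step assumes a simple rescaling of each cell's values to sum to one million, which is the standard FPKM-to-TPM conversion; whether TISCH applies any further correction to raw counts is not stated here, so treat that step as an assumption. All function names and toy values are hypothetical.

```python
import math
from collections import defaultdict


def to_tpm(values):
    """Rescale one cell's values (FPKM or counts) to sum to one million."""
    total = sum(values.values())
    if total == 0:
        return {gene: 0.0 for gene in values}
    return {gene: value / total * 1_000_000 for gene, value in values.items()}


def log_tpm(tpm):
    """Expression as used on the Gene page: log2(TPM/10 + 1)."""
    return math.log2(tpm / 10 + 1)


def mean_by_cell_type(cells):
    """Mean expression per cell type within one dataset (one heatmap column).

    cells is an iterable of (cell_type, expression) pairs for a single gene.
    """
    sums = defaultdict(float)
    sizes = defaultdict(int)
    for cell_type, expression in cells:
        sums[cell_type] += expression
        sizes[cell_type] += 1
    return {cell_type: sums[cell_type] / sizes[cell_type] for cell_type in sums}


# A TPM of 10 maps to 1, and a TPM of 30 maps to 2.
print(log_tpm(10), log_tpm(30))

# Toy dataset: expression of one gene in three annotated cells.
cells = [("CD8T", log_tpm(10)), ("CD8T", log_tpm(30)), ("Treg", log_tpm(0))]
print(mean_by_cell_type(cells))
```

Since the means are taken within each dataset, differences between datasets in the heatmap can reflect technical as well as biological variation.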
1. How did TISCH annotate the cell types?
The clusters of malignant cells were determined by combining three approaches. First, we took the cell-type annotations provided by the original studies. Second, we checked the expression distribution of the malignant cell markers reported in the original study, such as epithelial markers and EMT genes, if available. Third, we ran InferCNV to predict cell malignancy based on the predicted copy number variation and separated the cells into malignant and non-malignant clusters. For the remaining non-malignant clusters, we automatically annotated the cell clusters with the marker-based annotation method in MAESTRO using the DE genes between clusters, and then manually corrected the results according to the cell-type annotations provided by the original studies. Please see the paper for more details.
1. Is there a way to download all datasets in a batch?
Unfortunately, TISCH does not provide a batch download function because of network bandwidth limitations.
2. How to download high-resolution pictures from TISCH?
On the Dataset page, all pictures can be saved to the local disk by right-clicking the image. On the Gene page, the heatmap can be downloaded by clicking the button at the top right corner. The violin plot on the Gene page can also be downloaded by right-clicking and selecting 'Save link as'.