Instructors : Somi Kim, Eunseo Park, Donggon Cha 2021/07/06
Seurat object should be loaded for automatic cell annotation.
To annotate cells automatically, we can use SingleR r package. Before performing SingleR, SingleCellExperiment (SCE) object is set with raw counts and normalized counts. Also, the Human Primary cell Atlas, represented as a SummarizedExperiment object containing a matrix of log-expression values with sample-level labels.
library(SingleR)
library(celldex)
library(SingleCellExperiment)
library(dplyr)
library(Seurat)
sce <- SingleCellExperiment(list(counts = seurat@assays$originalexp@counts,
logcounts = seurat@assays$originalexp@data))
hpca.se <- HumanPrimaryCellAtlasData()
hpca.se
## class: SummarizedExperiment
## dim: 19363 713
## metadata(0):
## assays(1): logcounts
## rownames(19363): A1BG A1BG-AS1 ... ZZEF1 ZZZ3
## rowData names(0):
## colnames(713): GSM112490 GSM112491 ... GSM92233 GSM92234
## colData names(3): label.main label.fine label.ont
We use hpca.se reference to annotate each cell in SCE object via the SingleR() function. By this function, marker genes are identified from the reference and based on the Spearman correlation across markers, assignment scores are computed for each cell. Each cell is annotated as the cell type label with the highest score.
pred.hesc <- SingleR(test = sce, ref = hpca.se, assay.type.test=1,
labels = hpca.se$label.main)
head(pred.hesc)
## DataFrame with 6 rows and 5 columns
## scores first.labels
## <matrix> <character>
## AAACCTGAGGCGTACA-1 0.183971:0.116038:0.104517:... Tissue_stem_cells
## AAACGGGAGATCGATA-1 0.222493:0.142291:0.137941:... Tissue_stem_cells
## AAAGCAAAGATCGGGT-1 0.275393:0.155992:0.149990:... Fibroblasts
## AAATGCCGTCTCAACA-1 0.220968:0.133970:0.122140:... Tissue_stem_cells
## AACACGTCACGGCTAC-1 0.156163:0.138000:0.115655:... Endothelial_cells
## AACCGCGAGACGCTTT-1 0.289759:0.176489:0.156293:... Tissue_stem_cells
## tuning.scores labels pruned.labels
## <DataFrame> <character> <character>
## AAACCTGAGGCGTACA-1 0.1433232:0.0623682 Tissue_stem_cells Tissue_stem_cells
## AAACGGGAGATCGATA-1 0.1010954:0.0346291 Tissue_stem_cells Tissue_stem_cells
## AAAGCAAAGATCGGGT-1 0.0969523:0.0856622 Smooth_muscle_cells Smooth_muscle_cells
## AAATGCCGTCTCAACA-1 0.2313040:0.1259849 Tissue_stem_cells Tissue_stem_cells
## AACACGTCACGGCTAC-1 0.2704928:0.2038585 Endothelial_cells Endothelial_cells
## AACCGCGAGACGCTTT-1 0.3468541:0.2477355 Tissue_stem_cells Tissue_stem_cells
The result performing SingleR is below.
table(pred.hesc$labels)
##
## B_cell Chondrocytes CMP
## 649 10 1
## DC Endothelial_cells Epithelial_cells
## 48 885 138
## Erythroblast Fibroblasts GMP
## 4 72 5
## Hepatocytes iPS_cells Macrophage
## 480 1 96
## Monocyte MSC Neuroepithelial_cell
## 219 3 1
## NK_cell Osteoblasts Pre-B_cell_CD34-
## 133 3 20
## Pro-B_cell_CD34+ Smooth_muscle_cells T_cells
## 3 135 1330
## Tissue_stem_cells
## 513
table(pred.hesc$labels) %>% sum()
## [1] 4749
identical(colnames(seurat), rownames(pred.hesc))
## [1] TRUE
To compare the automatic cell type annotation result with our manual cell type annotation, we assign the result labels to each cell in our Seurat object, and visualize them in UMAP plot. The results are below.
seurat$singler_labels = pred.hesc$labels
DimPlot(seurat,
reduction = "umap",
group.by = "singler_labels",
label=TRUE)
DimPlot(seurat,
reduction = "umap",
group.by = "celltype",
label=TRUE)
L. Ma, M.O. Hernandez, Y. Zhao, M. Mehta, B. Tran, M. Kelly, Z. Rae, J.M. Hernandez, J.L. Davis, S.P. Martin, D.E. Kleiner, S.M. Hewitt, K. Ylaya, B.J. Wood, T.F. Greten, X.W. Wang. Tumor cell biodiversity drives microenvironmental reprogramming in liver cancer. Canc. Cell, 36 (4): 418-430 (2019)
Aran, D., Looney, A. P., Liu, L., Wu, E., Fong, V., Hsu, A., ... & Bhattacharya, M. (2019). Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nature immunology, 20(2), 163-172.
Mabbott, N.A., Baillie, J.K., Brown, H. et al. An expression atlas of human primary cells: inference of gene function from coexpression networks. BMC Genomics 14, 632 (2013), 14-632.