Elaborate FindMarkers() and AverageExpression() for Seurat v4 #4210
-
Hello, I am using Seurat v4 to integrate two disease samples and find differentially expressed genes between two samples for one particular cell type. I am very confused how Seurat calculates log2FC. Can anyone help me in understanding the basic steps in the example below? I followed the steps from the “Introduction to scRNAseq Integration” Vignette on the Seurat website to find DE genes. In particular, here are the functions that I used: CreateSeuratObject()-> SCTransform()-> ScaleData()-> FindVariableFeatures()-> SelectIntegrationFeatures()-> FindIntegrationAnchors()-> IntegrateData() -> ScaleData() -> RunPCA() -> RunUMAP() -> FindNeighbors() -> FindClusters()-> FindConservedMarkers(). Now, after clustering and finding the cell-type markers for each celltype, I want to find marker genes that are differentially expressed between the two samples for cell type B. I used FindMarkers() like this: The top 2 genes output for this cell type are: I thought that the log2FC of 79 was very high, so I wanted to see the average expression values for these two samples in this cell type. I used AverageExpression() like this: avg.t.cells <- AverageExpression(t.cells,slot='counts',use.counts=TRUE,return.seurat=TRUE) For sample#1 and the B cell type and geneA, the average expression is 2.90027283 For sample#2 and the B cell type and geneA, the average expression is 1.79175947 Usually, to calculate the avg2FC using the average expression, it would be something like this: log2(avg_AC / avg_HC) = log2( 2.90027283 / 1.791775947) = log2 (1.61867) = 0.6948 So, I am confused as to why it is a number like 79.1474718? Please explain how you calculate the avg_log2FC? Also, can you confirm that the steps given above for finding cell type clusters are correct? Best, |
Beta Was this translation helpful? Give feedback.
Replies: 5 comments 4 replies
-
We recommend It might help to paste here the code you are using. |
Beta Was this translation helpful? Give feedback.
-
Hi @saketkc , Thank you for your reply. We tested two different approaches using Seurat v4:
We feel that there is a problem with SCTransform(). Are we doing something wrong?? Should we stick with logNormalize() if we are doing differential expression for integrated samples? Thank you, |
Beta Was this translation helpful? Give feedback.
-
Thanks for your response. Your second approach is correct (so is the first; also see: #4000). I am not able to reproduce the discrepancy in log2FC. Can you share a reproducible example? You can use a subset of your data or any of the public datasets avaialble in SeuratData? Also, the workflow you mentioned in your first comment is different from what we recommend. There is no |
Beta Was this translation helpful? Give feedback.
-
Hi @saketkc , Thank you for your prompt reply. combined_counts=cbind(d1[["RNA"]]@CountS,d2[["RNA"]]@CountS,d3[["RNA"]]@CountS) seurat_obj=CreateSeuratObject(counts= combined_counts, min.cells = 3, project = "d1vsd2vsd3") we used dotplot() to do the cellmapping of the conservedmarker genes to particular cell type.seurat_obj <- RenameIdents(seurat_obj, seurat_obj$celltype.orig.ident <- paste(Idents(seurat_obj), seurat_obj$orig.ident, sep = "")
Thanks, Tulika |
Beta Was this translation helpful? Give feedback.
-
Hi, There are a bunch of things happening in your code which do no look correct. If you have three objects to start off with, you can follow these steps before proceeding with integration: data1 <- Read10X(data.dir = "data1/filtered_feature_bc_matrix")
data2 <- Read10X(data.dir = "data2/filtered_feature_bc_matrix")
data3 <- Read10X(data.dir = "data3/filtered_feature_bc_matrix")
d1 <- CreateSeuratObject(counts = data1, project = "Data1")
d1$disease <- "D1"
d2 <- CreateSeuratObject(counts = data2, project = "Data2")
d2$disease <- "D2"
d3 <- CreateSeuratObject(counts = data3, project = "Data3")
d3$disease <- "D3"
# if you have metadata, you can use `AddMetaData()` instead
dataset.combined <- merge(d1, y = c(d2, d3), add.cell.ids = c("D1", "D2", "D3"), project = "D1D2D3")
object.list <- SplitObject(dataset.combined, split.by = "disease")
object.list <- lapply(X = object.list, FUN = SCTransform) You can then proceed with |
Beta Was this translation helpful? Give feedback.
Hi,
There are a bunch of things happening in your code which do no look correct. If you have three objects to start off with, you can follow these steps before proceeding with integration: