How does Seurat handle sparsity in scRNAseq data #5011
-
Recently, it's been suggested that zero-inflated models are not required for scRNAseq UMI data (Valentine, 2020; Cao et al., 2021). In other words, the zeros in UMI data are sufficiently described by non-zero-inflated models, such as the Poisson or negative binomial. Does this mean that SCTransform, which is based on the negative binomial model, is sufficient to handle scRNAseq data sparsity? In a similar vein, given that something like LogNormalize does not explicitly model zeros, why is it still popular? Thanks for helping me understand!
-
As you point out through the linked references, the negative binomial is sufficient (and necessary) for explaining the number of zeros. LogNormalize is meant to achieve the same objectives (1. correct for sequencing depth via the scaling step, and 2. reduce the impact of outliers via the log step), but it can dampen biological variance at the same time (some slides here). Its popularity probably comes from ease of implementation (no explicit statistical estimation required).
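To make the two steps concrete, here is a minimal sketch of a LogNormalize-style transform (not Seurat's actual implementation, which operates on sparse matrices in R): divide each cell's counts by its total depth, multiply by a scale factor (10,000 is a common default), then apply log1p. Note that zeros pass through unchanged, which is why this approach involves no explicit modeling of sparsity.

```python
import numpy as np

def log_normalize(counts, scale_factor=10_000):
    """counts: genes x cells matrix of raw UMI counts."""
    depth = counts.sum(axis=0, keepdims=True)   # per-cell sequencing depth
    scaled = counts / depth * scale_factor      # step 1: depth correction
    return np.log1p(scaled)                     # step 2: dampen outliers

# toy example: 3 genes x 2 cells
counts = np.array([[10, 0],
                   [0, 5],
                   [90, 45]], dtype=float)
norm = log_normalize(counts)  # zero entries remain exactly zero
```

Because a zero count maps to log1p(0) = 0 regardless of depth, the transform simply passes sparsity along rather than modeling it, which is consistent with the point above that no zero-inflation machinery is needed.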