-
I tried to regress out the cell cycle effect on integrated data (using RunFastMNN for batch correction). First I tried regressing out the cell cycle effect in each batch and then integrating, but cell cycle genes were among the genes used for integration. After reading posts #2148 and #1500, I tried integrating first with RunFastMNN and then regressing out the cell cycle effect with ScaleData(integrated_data). I checked the PCA plot for the cell cycle effect, and it was gone, as expected. However, the batch effect that RunFastMNN had corrected reappeared. I think it reappears because the ScaleData cell cycle regression was not done in the batch-corrected MNN space. I tried to specify integrated_data@reductions$mnn for ScaleData() to regress out the cell cycle, but I got an error. How can I apply ScaleData for cell cycle regression on the MNN-corrected space? Thanks!
Replies: 2 comments 1 reply
-
I would recommend computing the cell cycle scores using the integrated data, and then running ScaleData and regressing out the cell cycle scores from the integrated data. If the cell cycle score in the original data has any batch effect, you would re-introduce that batch effect by regressing it out in the integrated data.
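To make "regressing out" concrete, here is a minimal sketch in plain Python (not Seurat code) of the idea behind it: for each feature, fit a linear model on the cell cycle scores and keep the residuals. The helper names `solve_linear_system` and `regress_out` and the toy numbers are made up for illustration; ScaleData with `vars.to.regress` works along these lines per feature, but its actual implementation differs in the details (for example, it also scales and centers the residuals).

```python
from typing import List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    aug = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariates are collinear; cannot fit the model")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def regress_out(values: Sequence[float], covariates: Sequence[Sequence[float]]) -> List[float]:
    """Return residuals of `values` after a least-squares fit on an intercept
    plus the given covariates (one value per cell in every vector)."""
    # Design matrix: one row per cell, columns = [1, covariate_1, covariate_2, ...].
    design = [[1.0] + [cov[i] for cov in covariates] for i in range(len(values))]
    k = len(design[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * y for row, y in zip(design, values)) for p in range(k)]
    beta = solve_linear_system(xtx, xty)
    # Residual = observed value minus the part explained by the covariates.
    return [y - sum(bj * xj for bj, xj in zip(beta, row)) for row, y in zip(design, values)]


# Toy data (hypothetical): six cells with S and G2M scores, and one feature
# that is driven entirely by those scores.
s_score = [0.1, 0.8, -0.2, 0.5, -0.4, 0.3]
g2m_score = [-0.3, 0.1, 0.6, -0.1, 0.4, -0.5]
feature = [2.0 + 3.0 * s - 1.0 * g for s, g in zip(s_score, g2m_score)]

residuals = regress_out(feature, [s_score, g2m_score])
print([round(r, 6) for r in residuals])
```

The point for this thread: the residuals depend on which scores go in as covariates. If the scores themselves differ between batches, subtracting their fitted contribution shifts cells from different batches by different amounts, which is how a batch effect can come back.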
-
For Q1: I ended up using the integration with SCTransform approach. I used SCTransform to regress out the cell cycle effect on individual runs and took the SCT assays to integrate. The cell cycle regression looks good: cells are not separated by cell cycle state. The batch effect seems corrected: the same cell types from different runs cluster together. But I agree with "If the cell cycle score in the original data has any batch effect, you would re-introduce that batch effect by regressing it out in the integrated data." I will try the same approach but integrate first and then regress out the cell cycle. For Q2: it doesn't really matter if the cell cycle effect is left in (at least for our dataset). Cells still cluster together mainly by cell type across the different sequencing runs.