Match annotation and deconvolved cell types #43
Comments
Hi @xieaq,

Sorry for the delay. I was away on vacation and am just getting back. I am also in a new position and have reduced capacity to respond to issues.

On the difference between using a transcriptional correlation and using GSEA to annotate the deconvolved cell types: if the cell types are very distinct, I would expect the correlation and the GSEA to agree, but for more similar cell types there can be discrepancies. In the mOB, the mitral and outer plexiform layers are transcriptionally similar, so in the GSEA it is likely that cell types 8 and 7 have transcriptional profiles containing marker genes from both the outer plexiform and mitral cell layers.

In your case, if you have references for all of the cell types of interest, I would recommend using the correlation between the reference cell type transcriptional profiles and the deconvolved cell type transcriptional profiles (the beta, not the theta). If a deconvolved cell type correlates about equally well with several reference cell types, I would suggest using the GSEA to see which of those reference cell types has the most marker genes enriched in the deconvolved cell type.

Hope this helps,
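To make the recommendation above concrete, here is a minimal sketch in plain Python (STdeconvolve itself is an R package, and this is not its API). It correlates each deconvolved profile (beta) with each reference profile over their shared genes and flags near-ties, which are the cases where GSEA would be the suggested tie-breaker. The function names, the toy expression values, and the `margin` of 0.1 are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant profile has no defined correlation; report 0.0 in that case.
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def correlate_profiles(beta, reference):
    """Correlate every deconvolved profile with every reference profile.

    Both arguments map a name to a {gene: value} dict. Only genes present
    in every profile are used.
    """
    genes = None
    for profile in list(beta.values()) + list(reference.values()):
        genes = set(profile) if genes is None else genes & set(profile)
    genes = sorted(genes)
    return {
        topic: {
            ref: pearson([tp[g] for g in genes], [rp[g] for g in genes])
            for ref, rp in reference.items()
        }
        for topic, tp in beta.items()
    }


def annotate(cors, margin=0.1):
    """Best reference per topic, plus a flag for near-ties.

    `margin` is a hypothetical cutoff: if the top two correlations are
    closer than this, the match is marked ambiguous.
    """
    result = {}
    for topic, row in cors.items():
        ranked = sorted(row.items(), key=lambda kv: kv[1], reverse=True)
        ambiguous = len(ranked) > 1 and ranked[0][1] - ranked[1][1] < margin
        result[topic] = (ranked[0][0], ambiguous)
    return result


# Toy data (made up): two reference cell types and two deconvolved topics.
reference = {
    "A": {"g1": 10, "g2": 1, "g3": 1, "g4": 2},
    "B": {"g1": 1, "g2": 10, "g3": 2, "g4": 1},
}
beta = {
    "topic1": {"g1": 0.7, "g2": 0.1, "g3": 0.1, "g4": 0.1, "g5": 0.0},
    "topic2": {"g1": 0.1, "g2": 0.6, "g3": 0.2, "g4": 0.1, "g5": 0.0},
}
cors = correlate_profiles(beta, reference)
labels = annotate(cors)
```

Any topic flagged as ambiguous here is a candidate for the GSEA step rather than a final annotation.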
A few details about figure d in the paper: neuronal cell types were a major cell type in this dataset and contained the largest transcriptional variation in the data. We initially correlated against just the major cell types. When we expanded the number of deconvolved cell types, we obtained deconvolved cell types that correlated highly with the microglia and pericytes. This is likely because topics were first assigned to the neuronal cells, and only once that variation was captured were topics assigned to cell types that represented less variation in the data. Pericytes and microglia were present at relatively small proportions, which is consistent with this explanation. You can check out the expanded correlation plot in the supplementary figures of the paper.
Hi @xieaq,

In this case I believe you are using the lsat Hungarian sort algorithm to pair up the ground truth with the deconvolved cell type topics. While this algorithm attempts to pair each ground truth cell type with a deconvolved one, that does not mean the pairs are correct. In fact, the algorithm will create pairs regardless of the data, even if there are no correlations at all. It is useful for visually organizing the data, but I would still defer to the actual correlations to decide which deconvolved cell type corresponds to a given ground truth.

You can see that several deconvolved cell types correlate strongly with inhibitory and excitatory neurons. Here, LDA is restricted to 9 topics, but this dataset contains 135 marker genes that were specifically chosen to identify neurons. Most of the transcriptional variation in the data is therefore across neurons, so LDA keeps splitting the neuronal variation into more topics. With 9 topics, pericytes do not appear to correspond to any of the deconvolved cell types, likely because their variation in the data is too small for LDA to assign one of the 9 topics to them. I bet if you expand the number of topics, some of them will start to correlate with pericytes (and astrocytes, too).

We talk more about this in the supplementary notes of the paper, and you can see in the supplementary figures that we do identify all of the major cell types and most of the neuronal subtypes if we expand the number of topics. This is a good example in which knowledge of the data (i.e., the marker genes chosen for neurons) can help guide the choice of the number of topics. With respect to the gene ranking, we essentially used

Hope this helps,
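The point about the pairing step can be illustrated with a small sketch. It does not call STdeconvolve; it uses a brute-force search over permutations, which finds the same optimum as the Hungarian algorithm on small square matrices. It runs on a matrix of pure noise to show that every ground-truth row still receives a partner. The 0.5 cutoff used to decide whether a pair is actually supported is a hypothetical value chosen for illustration.

```python
import random
from itertools import permutations


def best_pairing(cor):
    """Return the column permutation that maximizes the summed correlation.

    Brute force over all permutations: same optimum as the Hungarian
    algorithm, but only practical for small square matrices.
    """
    n = len(cor)
    return max(
        permutations(range(n)),
        key=lambda p: sum(cor[i][p[i]] for i in range(n)),
    )


random.seed(0)
n = 5

# A "correlation" matrix with no real signal: every entry lies in [-0.2, 0.2].
noise = [[random.uniform(-0.2, 0.2) for _ in range(n)] for _ in range(n)]

# The assignment still pairs every row with a distinct column.
pairing = best_pairing(noise)

# Keep only pairs whose correlation clears a (hypothetical) cutoff.
cutoff = 0.5
supported = [(i, j) for i, j in enumerate(pairing) if noise[i][j] >= cutoff]
```

Here `pairing` always has one column per row, while `supported` is empty because no entry of the noise matrix reaches the cutoff. That is the reason to read the correlation values themselves and not just the pairing.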
Hi Dr. @bmill3r, thank you for your patience and assistance! I have noticed that as the number of topics is expanded, some of them begin to correlate with pericytes. However, I am not sure how to compute the RMSE now. The RMSE is computed for each pixel between the deconvolved and matched ground truth cell-type proportions in the ST dataset. Should I set the number of topics equal to the number of cell types, or is it acceptable to group some topics together when using more topics? Your insights on this would be greatly appreciated!
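For reference, the mechanics of the grouping option asked about above could look like the sketch below. Whether grouping topics is acceptable is not settled in this thread, so this only shows what the computation would be, not a recommended procedure. It assumes each topic has already been assigned to one ground truth cell type, sums the deconvolved proportions (theta) of topics sharing a cell type, and then takes the per-pixel RMSE across cell types. All names and numbers are made up for illustration.

```python
from math import sqrt


def group_topics(theta_row, topic_to_type, cell_types):
    """Sum one pixel's topic proportions by assigned cell type.

    Topics with no assignment in `topic_to_type` are ignored.
    """
    grouped = {ct: 0.0 for ct in cell_types}
    for topic, proportion in theta_row.items():
        cell_type = topic_to_type.get(topic)
        if cell_type is not None:
            grouped[cell_type] += proportion
    return grouped


def pixel_rmse(predicted, truth):
    """RMSE across cell types for a single pixel."""
    cell_types = sorted(truth)
    squared = [(predicted.get(ct, 0.0) - truth[ct]) ** 2 for ct in cell_types]
    return sqrt(sum(squared) / len(cell_types))


# Hypothetical assignment: two topics were both matched to neurons.
topic_to_type = {"t1": "neuron", "t2": "neuron", "t3": "pericyte"}
truth = {"neuron": 0.8, "pericyte": 0.2}

# Two toy pixels with deconvolved topic proportions.
theta = {
    "pixel1": {"t1": 0.5, "t2": 0.3, "t3": 0.2},
    "pixel2": {"t1": 0.4, "t2": 0.2, "t3": 0.4},
}

rmse = {
    pixel: pixel_rmse(group_topics(row, topic_to_type, truth), truth)
    for pixel, row in theta.items()
}
```

In this toy example the grouped proportions for `pixel1` match the ground truth, while `pixel2` is off by 0.2 for each cell type.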
Thanks for the tool, I am truly impressed by the remarkable findings presented in your work.
I have been working on implementing your simulation for MPOA. However, when comparing the deconvolved cell types with the ground truth, I've run into a problem: the heatmap does not show a clearly defined diagonal block, which makes it difficult to match the deconvolved cell types to the ground truth cell types. I am looking for guidance on how to match the deconvolved cell types and how to calculate the Root Mean Square Error (RMSE) with respect to the ground truth for STdeconvolve.
Furthermore, I have been working through the GSEA tutorial to assign deconvolved cell types to their corresponding ground truth cell types. However, I have encountered challenges, as there are instances where the process does not yield the desired results.
Even when following the tutorial's provided examples, it remains a complex task to effectively match the deconvolved cell types to a specific annotation. I am seeking guidance on how to achieve a more reliable and accurate matching process.
Any insights or suggestions you can provide regarding this issue would be immensely valuable to me.
Thank you once again for your time and assistance.