Interspecies integration #19
Thanks for your interest in Cell BLAST and the detailed explanation! Given the above information, I can think of two potential fixes:
model = cb.directi.fit_DIRECTi(
    combined_dataset,
    genes=var_genes_study,
    latent_dim=10, cat_dim=20,
    epoch=200,
    rmbatch_module_kwargs={"lambda_reg": 0.1},  # <- Changed here
    batch_effect=["study", "species"],
    path=model_dir
)
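To compare several `lambda_reg` settings without one run overwriting another, a minimal sketch along these lines may help. It only builds the keyword arguments; `build_sweep` and the `models` base directory are hypothetical names introduced for illustration, and the remaining values are copied from the call above. The actual `fit_DIRECTi` call is left as a comment because it needs the dataset and gene list from this thread.

```python
import os


def build_sweep(base_dir, lambda_regs):
    """Return one keyword-argument dict per lambda_reg value.

    Each dict gets its own model directory so that runs with different
    regularization strengths are stored separately.
    """
    configs = []
    for lam in lambda_regs:
        configs.append({
            "latent_dim": 10,
            "cat_dim": 20,
            "epoch": 200,
            "rmbatch_module_kwargs": {"lambda_reg": lam},
            "batch_effect": ["study", "species"],
            # Hypothetical naming scheme: one subdirectory per setting.
            "path": os.path.join(base_dir, f"lambda_reg_{lam}"),
        })
    return configs


configs = build_sweep("models", [0.1, 1.0])
for cfg in configs:
    print(cfg["path"], cfg["rmbatch_module_kwargs"])
    # With the objects from this thread available, each run would be:
    # model = cb.directi.fit_DIRECTi(combined_dataset, genes=var_genes_study, **cfg)
```

Keeping one directory per setting makes it easier to check that two embeddings really come from two different models.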
Hi @Jeff1995, thanks so much for the quick reply! These are all very helpful tips. I tried setting lambda_reg=0.1 and lambda_reg=1, and I even named the models differently. Do you have an idea what might be happening here? Thanks,
Well, that's weird... I have never seen anything like that. Maybe something in this data triggered a bug in the model. Would you mind sharing the dataset you're using so I can have a closer look?
Hi @Jeff1995, sorry for the delay, I had some issues finding a way to make my data shareable. I think this should work now, but let me know if you have any issues. Thanks again,
Hello,
Thanks again for such a great tool!
I'm currently trying to use Cell BLAST to integrate some pretty diverse datasets: several mouse scRNAseq atlases, a zebrafish dataset, and a fly dataset (all of which are mostly from the central nervous system).
I'm struggling to find a tool that's able to handle this amount of diversity, and have had variable success with Cell BLAST. Here are some steps I've taken:
find_variable_genes() after reducing min_group_frac, since the default 0.5 only returns ~60 genes (which doesn't seem like it would be enough info to integrate the datasets well). I've played around with this parameter and run DIRECTi with anywhere from 60 to 400 to 4,000 to all genes.
visualize_latent() and manually running UMAP).

Is there anything you can see that I might be doing wrong, or do you have any recommendations to improve the integration in this case? I've been finding that most tools have trouble with integrating data from species this divergent, probably in part due to the fact that most genes are 0s for some species. I've also tried using gene intersections, but this only leaves ~400 genes across mouse + zebrafish + fly, which doesn't seem to be enough to differentiate cell-types (and certainly not sub-types).
Thanks so much in advance,
Brian
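The trade-off described in the post, between intersecting genes (few genes left) and keeping the union (many genes that are all zeros for some species), can be quantified with a small sketch like the one below. It assumes gene names have already been mapped to a shared ortholog namespace; `gene_overlap` and the toy gene lists are hypothetical and are not part of Cell BLAST.

```python
def gene_overlap(genes_by_species):
    """Summarize how gene sets overlap across species.

    Returns the shared genes, the union of all genes, and, for each
    species, the fraction of union genes it lacks (these would be
    all-zero columns if the union were used for integration).
    """
    sets = {species: set(genes) for species, genes in genes_by_species.items()}
    shared = set.intersection(*sets.values())
    union = set.union(*sets.values())
    missing_frac = {
        species: 1 - len(genes) / len(union) for species, genes in sets.items()
    }
    return shared, union, missing_frac


# Toy input; real lists would come from each dataset's gene names
# after ortholog mapping.
toy = {
    "mouse": ["g1", "g2", "g3", "g4"],
    "zebrafish": ["g2", "g3", "g5"],
    "fly": ["g3", "g6"],
}
shared, union, missing_frac = gene_overlap(toy)
print(len(shared), len(union), missing_frac)
```

Running this on the real gene lists shows how much each added species shrinks the intersection and how many zero-filled genes the union would introduce for each species.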