Reading Anndata from only parts of h5ad file: Hack solution #1517
Comments
This does seem useful, thank you! As you mentioned, it doesn’t integrate well with our versioned IO yet, so it’s really more of a recipe than a PR for now. This might be a good addition for our “How to” tutorials section. Are you interested in writing a little notebook? |
@flying-sheep Yeah, I can try to make a little notebook. I'm also working on a function to overwrite selected fields, which I can include. Should I just commit it and link you here? |
Perhaps our more general solution that based on |
Interesting! You don’t have to use an internal API for it btw, we’re exporting e.g. |
Yeah, I’ll check whether it’s a candidate for inclusion in our tutorial notebooks with few changes, and we can go from there. |
Wow, I wasn't aware of cap-anndata. That seems like a much more robust solution, and I think it should probably be the example notebook rather than my hack solution!
|
Just stumbled upon this issue, and as it's pretty recent, wanted to give a pointer to yet another experimental approach to only load parts of the data — https://github.com/scverse/shadows. Please give us feedback in case you end up trying it out! |
I love it when you start a conversation with an ad hoc approach and end up with several robust, purpose-built solutions :) |
I came across a previous issue, #436, and couldn't get the dask solution working with my application, so I came up with a somewhat hacky way to read only the desired fields from an h5ad file into an AnnData object (no chunking). It works by building a tree of all the fields in the HDF5 file, searching the tree for fields matching the ones you want to load, then loading the ancestors and descendants of each match (see code below). This is useful if you want to keep all your data together on disk but only need to load some fields into memory.
Basically you run:
read_h5ad_backed_selective(model_path / 'p3_adata.h5ad', mode='r', selected_keys=['spliced', 'S_score', 'batch_name', 'var', 'uns', 'X_antipode'])
and get back:
Just wanted to share in case it is useful to someone!
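For readers who want a feel for the selection step described above (match the requested keys, then keep each match's ancestors and descendants), here is a minimal sketch. It is not the code from this issue: the function name `select_paths` and the example file layout are made up for illustration, and it works on a plain list of HDF5 paths rather than an open file, so it says nothing about how the matched elements are then read into an AnnData object.

```python
from typing import Iterable, Set


def select_paths(all_paths: Iterable[str], selected_keys: Iterable[str]) -> Set[str]:
    """Return every path that matches a selected key, plus its ancestors and descendants.

    Hypothetical helper for illustration only; not part of anndata.
    """
    all_paths = [p.strip("/") for p in all_paths]
    wanted = set(selected_keys)
    # A path matches when its last component is one of the selected keys.
    matches = [p for p in all_paths if p.split("/")[-1] in wanted]
    keep: Set[str] = set()
    for m in matches:
        parts = m.split("/")
        # Ancestors: every prefix of the matched path, including the match itself.
        for i in range(1, len(parts) + 1):
            keep.add("/".join(parts[:i]))
        # Descendants: every path nested under the match.
        keep.update(p for p in all_paths if p.startswith(m + "/"))
    return keep


# Assumed example layout of an h5ad file. With h5py, a list like this could be
# collected with something like `h5py.File(path, "r").visit(paths.append)`.
paths = [
    "X",
    "layers", "layers/spliced", "layers/unspliced",
    "obs", "obs/S_score", "obs/batch_name",
    "obs/batch_name/categories", "obs/batch_name/codes",
    "obsm", "obsm/X_antipode", "obsm/X_pca",
    "var", "var/gene_ids",
    "uns",
]
selected = select_paths(
    paths, ["spliced", "S_score", "batch_name", "var", "uns", "X_antipode"]
)
print(sorted(selected))
```

Matching on the last path component keeps the call site short (you pass `'spliced'` rather than `'layers/spliced'`), at the cost of ambiguity when the same name appears in more than one group.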