[Feature]: Define partial chunk shape for GenericDataChunkIterator #995
Comments
@CodyCBakerPhD is this an issue you could help with, since you are most familiar with
Above is a proposed solution. Obviously this needs tests, but I wanted to run it by the group before moving forward.
To be honest, the functionality you describe to me sounds more like a utility function that would be more broadly useful for DataChunk iterators. I.e., this could be a method (e.g.,
This function could either live on its own in the same module as
Sorry, I didn't see that you made a draft PR; I was referring to the solution you suggested in the issue. Let me take a look at the PR.
What would you like to see added to HDMF?
Right now, for the `GenericDataChunkIterator`, it's possible to define `chunk_mb` or `chunk_shape`. I would like to enable a hybrid approach, where a user could input `chunk_mb=10.0, chunk_shape=(None, 64)`, and the `GenericDataChunkIterator` would identify the remaining dimension that gets you close to the target chunk size.
Is your feature request related to a problem?
It is pretty common for users to have some insight into the likely read patterns of a dataset.
What solution would you like?
I would like `GenericDataChunkIterator` to find the maximum size (product of dims) that is <= the target size. I would also like the chunk to be as cube-like as possible, so I would like to minimize the sum of the dimensions of the array. Previously, we tried building chunks that were scaled-down versions of the data shape, similar to h5py, but experience with Jeremy has shown that this approach is poorly suited for common data reading routines, and I think a better naive assumption would be that (hyper-)cube chunks are a good default.
Do you have any interest in helping implement the feature?
Yes.
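To make the request concrete, here is a rough sketch of how the unspecified (`None`) axes of a partial `chunk_shape` might be filled in. It is not the HDMF implementation or the code from the draft PR: the function names `integer_root` and `fill_partial_chunk_shape` and the `maxshape` and `itemsize` parameters are hypothetical, 1 MB is assumed to mean 1e6 bytes, and the greedy equal-side split is only a heuristic for "as cube-like as possible", so it is not guaranteed to hit the exact maximum product.

```python
import math


def integer_root(value, degree):
    """Return the largest integer r with r ** degree <= value (value >= 1)."""
    # Start from the floating-point estimate, then correct rounding errors.
    root = max(1, round(value ** (1.0 / degree)))
    while root ** degree > value:
        root -= 1
    while (root + 1) ** degree <= value:
        root += 1
    return root


def fill_partial_chunk_shape(maxshape, chunk_shape, chunk_mb, itemsize):
    """Hypothetical helper: replace None entries so the chunk stays <= chunk_mb.

    Assumes 1 MB = 1e6 bytes. Fixed axes are kept as given; free axes get
    roughly equal sides, capped by the full data shape (maxshape).
    """
    chunk_bytes = chunk_mb * 1e6
    fixed_elements = math.prod(c for c in chunk_shape if c is not None)
    # Number of elements the free axes may contribute in total.
    budget = int(chunk_bytes // (itemsize * fixed_elements))
    if budget < 1:
        raise ValueError("The fixed axes alone already exceed chunk_mb.")

    result = list(chunk_shape)
    # Handle the shortest axes first so that any budget they cannot use
    # is passed on to the longer axes.
    free_axes = sorted(
        (axis for axis, c in enumerate(chunk_shape) if c is None),
        key=lambda axis: maxshape[axis],
    )
    for position, axis in enumerate(free_axes):
        remaining_axes = len(free_axes) - position
        side = integer_root(budget, remaining_axes)
        result[axis] = min(maxshape[axis], side)
        budget //= result[axis]
    return tuple(result)


# The example from the request: int16 data (2 bytes), 64 channels per chunk.
shape = fill_partial_chunk_shape(
    maxshape=(1_000_000, 384), chunk_shape=(None, 64), chunk_mb=10.0, itemsize=2
)
print(shape)
```

With these inputs the free axis is sized so that the chunk lands on the 10 MB target; when several axes are left as `None`, the budget is split into near-equal sides instead of being scaled from the data shape.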