We've discussed the motivation for a chunking approach as an alternative to sending massive task graphs to Dask. The main appeal is that chunking can potentially provide a more memory-stable compute at the cost of adding some looping overhead to overall performance, which would give users who hit Dask issues an alternative to Dask troubleshooting as their only path forward.
@wilsonbb and I talked about this in more depth, and we came to the conclusion that the best output here is likely an example in our documentation showing how one would do this on something like the workflow in #42. This is preferable to building a bespoke `chunk` function, since a built-in function would have many limitations on the graphs it can chunk (for example, anything where a global value is computed) and could therefore set bad expectations for users. And building something more general would risk creating an entire Dask streaming interface that directly competes with Dask's own workflow.
The first step is to verify that a chunking approach actually performs well, which @wilsonbb has agreed to explore as part of his work in #42.
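For the documentation example, the shape of the idea might look something like the sketch below: instead of building one huge task graph over all partitions and calling compute once, loop over chunks and compute each independently, bounding peak memory at the cost of looping overhead. All names here (`process_chunk`, `chunked_compute`, `chunk_size`) are illustrative placeholders, not part of any real API, and the per-chunk work stands in for whatever Dask-backed computation the real workflow would run.

```python
def process_chunk(rows):
    # Stand-in for per-chunk work. Note this only works cleanly when each
    # chunk is independent -- a computation needing a global value (e.g. a
    # mean over the full dataset) would not chunk naively like this.
    return [r * 2 for r in rows]

def chunked_compute(data, chunk_size):
    # Loop over the data in fixed-size chunks so that only one chunk's
    # worth of intermediate results is materialized at a time.
    results = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        results.extend(process_chunk(chunk))
    return results

print(chunked_compute(list(range(10)), chunk_size=4))
# → [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

In the real example each `process_chunk` call would presumably build and compute a small Dask graph over one slice of the data, which is exactly the trade-off described above: many small computes instead of one massive graph.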