Lazy load dask.dataframe in datashader.py #6309
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff            @@
##             main    #6309     +/-   ##
=========================================
  Coverage   88.49%   88.50%
=========================================
  Files         323      323
  Lines       68039    68093     +54
=========================================
+ Hits        60210    60263     +53
- Misses       7829     7830      +1
I'm fine with the current implementation but feel free to update it based on the comment.
holoviews/operation/datashader.py
Outdated
@@ -69,6 +69,13 @@
from ..streams import PointerXY
from .resample import LinkableOperation, ResampleOperation2D


def _lazy_dask_dataframe():
A bit of a nit, but I don't think this does what people usually mean by a lazy import (i.e. delaying the import of a module until it is actually used). I just remembered that hvPlot has a few utilities that do this together with the isinstance check, like:
def is_dask(data):
    if not check_library(data, 'dask'):
        return False
    import dask.dataframe as dd
    return isinstance(data, (dd.DataFrame, dd.Series))

def is_xarray_dataarray(data):
    if not check_library(data, 'xarray'):
        return False
    from xarray import DataArray
    return isinstance(data, DataArray)
@@ -2329,3 +2331,30 @@ def flatten(line):
            yield from flatten(element)
        else:
            yield element


def lazy_isinstance(obj, class_or_tuple):
Just tried that in my env and got some weird results.
from holoviews.core.util import lazy_isinstance
import dask.dataframe as dd
ddf = dd.DataFrame({})
print(lazy_isinstance(ddf, 'dask.dataframe:DataFrame'))
print(isinstance(ddf, dd.DataFrame))
False
True
That's because obj.__module__ is dask_expr._collection in this case. Pretty unusual, but oh well!
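The same mismatch can be reproduced without dask: a class that is re-exported from a package's top level still reports the module that defines it. Here is a minimal sketch using the standard library's json.JSONDecodeError as a stand-in. The exact_module_match helper is hypothetical and only illustrates an exact string comparison; it is not the code from this PR.

```python
import json


def exact_module_match(obj, spec):
    """Hypothetical check: compare the defining module with the module in 'module:Class'."""
    mod_name, _, _ = spec.partition(":")
    return type(obj).__module__ == mod_name


err = json.JSONDecodeError("bad", "{}", 0)

# The class is importable as json.JSONDecodeError but is defined in json.decoder,
# so an exact comparison against 'json' fails even though isinstance succeeds.
print(type(err).__module__)
print(exact_module_match(err, "json:JSONDecodeError"))
print(isinstance(err, json.JSONDecodeError))
```

An exact comparison is therefore only reliable when the spec names the defining module, which callers usually neither know nor care about.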
Alternative API with lazy_instance(obj, module, objname), where objname is either a string or a tuple of strings, e.g. lazy_instance(obj, 'dask.dataframe', 'DataFrame') or lazy_instance(obj, 'cudf', ('DataFrame', 'Series')).
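A rough sketch of what that three-argument form could look like. The comment above only proposes the signature, so the body is an assumption. In particular, using sys.modules to decide whether to import at all is my own choice, and it assumes an instance cannot exist unless the top-level package has already been imported.

```python
import importlib
import json
import sys


def lazy_instance(obj, module, objname):
    """Sketch: isinstance check that never triggers a first import of `module`'s package."""
    # Assumption: if the root package was never imported, obj cannot be one of its instances.
    if module.split(".")[0] not in sys.modules:
        return False
    mod = importlib.import_module(module)
    # Accept a single class name or a tuple of class names.
    names = (objname,) if isinstance(objname, str) else tuple(objname)
    classes = tuple(getattr(mod, name) for name in names)
    return isinstance(obj, classes)


# Usage with a standard-library class standing in for dask/cudf objects.
err = json.JSONDecodeError("bad", "{}", 0)
print(lazy_instance(err, "json", "JSONDecodeError"))
print(lazy_instance(1, "json", ("JSONDecoder", "JSONDecodeError")))
print(lazy_instance(1, "package_that_is_not_installed", "Thing"))
```

Because the module is passed explicitly, this form does not depend on what obj.__module__ reports.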
I have changed it to use .startswith. This is what is done in hvPlot.
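For readers following along, a prefix-based check could look roughly like the sketch below. It illustrates the idea under my own assumptions (the name lazy_isinstance_sketch, the 'module:Class' spec format taken from the example earlier in this thread, and a match on the root package only) and is not a copy of the code merged here or of hvPlot's helper.

```python
import functools
import importlib
import json


def lazy_isinstance_sketch(obj, class_or_tuple):
    """Sketch of a prefix-based lazy isinstance; specs look like 'package.module:Class'."""
    if isinstance(class_or_tuple, str):
        class_or_tuple = (class_or_tuple,)
    # Root package of the module that defines the object's class.
    obj_root = type(obj).__module__.split(".")[0]
    for spec in class_or_tuple:
        mod_name, _, attr_name = spec.partition(":")
        # Prefix match on the root package, so a root such as 'dask_expr'
        # is accepted for a 'dask.dataframe:DataFrame' spec.
        if not obj_root.startswith(mod_name.split(".")[0]):
            continue
        # Only now is the module imported and the real isinstance check made.
        mod = importlib.import_module(mod_name)
        cls = functools.reduce(getattr, attr_name.split("."), mod)
        if isinstance(obj, cls):
            return True
    return False


# Usage with a standard-library class standing in for a dask DataFrame.
err = json.JSONDecodeError("bad", "{}", 0)
print(lazy_isinstance_sketch(err, "json:JSONDecodeError"))
print(lazy_isinstance_sketch(1, "not_an_installed_package:Thing"))
```

The prefix test is only a cheap filter: the final answer still comes from a real isinstance call, so a false positive at the filter stage costs an import but not correctness.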
Co-authored-by: Maxime Liquet <[email protected]>
Preparation for holoviz/datashader#1350