Guidance on how to cache Polars LazyFrame #1135

Open
BartSchuurmans opened this issue Aug 23, 2024 · 1 comment

@BartSchuurmans

Link to doc page in question (if any):

https://docs.streamlit.io/develop/concepts/architecture/caching

Name of the Streamlit feature whose docs need improvement:

@st.cache_data / @st.cache_resource

What you think the docs should say:

Polars' LazyFrame fits somewhere between data and a resource, because it represents a query that only produces a DataFrame when collected. I think it would be good if the docs included this type in the large table at the bottom, advising whether a function returning a pl.LazyFrame should be decorated with @st.cache_data, @st.cache_resource, or neither (I don't know the answer).

@sfc-gh-dmatthews
Contributor

Hi @BartSchuurmans. I'll need to do a little testing to confirm, but the initial thoughts I heard back from engineering were this:

Since a LazyFrame is data that hasn't been computed yet, it'd likely be better to cache the collected result with cache_data instead. If there is any good reason to cache a LazyFrame, then it will probably need cache_resource since cache_data might not work.

I'll try to test some things to confirm so I can add an example or something. :)
