Adding caching #96
The staging server seems to reload the page indefinitely... This does not happen in the normal deployment, nor in development on localhost...
The callbacks that take the most time to load for the first time are:
Both
Nevertheless, as we use multiple processes, caching with a global variable (like we do in the CacheRequester) simply won't do. That method only works because we are caching a disk read (and not a complex calculation). With long computations, the result should be cached between workers, and even between runs if possible. All of this is possible with long callbacks, but we could also use Flask-Caching. Flask-Caching doesn't give us background tasks, but it could be better because we would cache the result of our functions, and not the entire callback. For example, we could cache just the expensive computation function. The cache would be invalidated somehow when the datawarehouse is updated. On deployment, we could even use the redis server used for wikichron (but I'd prefer we use a different one, to avoid disrupting it).
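A minimal sketch of that idea with Flask-Caching, assuming a Redis backend; get_platform_data and the Redis URL are placeholders, not the project's actual code:

```python
from flask import Flask
from flask_caching import Cache

app = Flask(__name__)

# Redis backend, so the cached results are shared between worker processes
cache = Cache(config={
    'CACHE_TYPE': 'RedisCache',
    'CACHE_REDIS_URL': 'redis://localhost:6379/1',  # a dedicated DB, not wikichron's
})
cache.init_app(app)

@cache.memoize()
def get_platform_data(platform):
    # hypothetical stand-in for one of our expensive computation functions
    ...

def on_datawarehouse_update():
    # drop every memoized result for this function when the dw is rebuilt
    cache.delete_memoized(get_platform_data)
```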
The main problem with Flask-Caching is that it is an inter-process cache, while the CacheRequester is inter-thread (but intra-process). Every process has its own CacheRequester, and they are not synchronized with each other. If we make the CacheRequester inter-process, we will need to go through the filesystem, which would make the CacheRequester itself useless. So we have a couple of options.
Perhaps we could even implement both options (using Redis only if a Redis backend is specified). Furthermore, what will happen if we have two users trying to run the same computation? I think that, with our current approach (using just Flask-Caching), the second user will not find the result in the cache (it is still being computed) and will start the same computation again. That is, the computation will be done more times than necessary, consuming resources. This could be improved using a task queue, like Celery. Celery integrates so well with Flask that you don't even need a separate integration package.
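A rough sketch of the task-queue idea with Celery, assuming a Redis broker; the task name and arguments are hypothetical:

```python
from celery import Celery

# broker/backend URLs are assumptions for this sketch
celery_app = Celery(
    'dao_analyzer',
    broker='redis://localhost:6379/0',
    backend='redis://localhost:6379/0',
)

@celery_app.task
def compute_platform_data(platform):
    # the long computation; with some deduplication logic, a second request
    # for the same platform could wait on the already-queued task instead
    # of starting the computation again
    ...

# enqueue the computation and block until a worker finishes it
result = compute_platform_data.delay('daostack')
data = result.get(timeout=60)
```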
It seems that Dash background callbacks (with DiskCache as a backend) allow fine-tuning the things we couldn't fine-tune with Flask-Caching, like supplying a function that decides when to invalidate the cache (one that checks the last time the dw changed). Caching entire callbacks won't be as space-efficient as caching just the "computation functions" (the functions which do the long computations and take most of the time), but I think it will be easier to implement and will integrate better. It also adds a task queue, which will fix a lot of problems: if you go to dao-analyzer.science right now, you have to wait a while before any graph appears because all the callbacks start at the same time.
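Something like the following, where dw_last_update is a hypothetical helper we would still have to write:

```python
import diskcache
from dash import Dash, DiskcacheManager

cache = diskcache.Cache('./cache')

def dw_last_update():
    # hypothetical helper: return e.g. the mtime of the datawarehouse,
    # so the cache key changes whenever the dw is rebuilt
    ...

manager = DiskcacheManager(
    cache,
    cache_by=[dw_last_update],  # invalidates cached results after each dw update
    expire=60 * 60 * 24,        # keep entries for at most one day
)

app = Dash(__name__, background_callback_manager=manager)
```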
Furthermore, we could create our own custom wrappers for more fine-tuning. At the moment, the only thing that has been memoized is the PlatformDao; not many hours have gone into the flask-caching approach. The only drawback of using background callbacks instead of flask-caching is that, when two or more callbacks share an intermediate function, the result of that intermediate function won't be cached, which would have been great to have. Nevertheless, we could still create a custom decorator with Dash's cache managers, or even use flask-caching alongside.
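For those shared intermediate functions, diskcache's own memoize decorator could act as that custom wrapper; load_dataframe here is a hypothetical stand-in:

```python
import diskcache

# reuse the same on-disk cache that the background callbacks use
cache = diskcache.Cache('./cache')

@cache.memoize(expire=60 * 60 * 24)
def load_dataframe(platform):
    # hypothetical intermediate function shared by several callbacks;
    # memoizing it caches its result across processes, while the
    # background-callback cache only stores whole callback outputs
    ...
```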
Probably we should have a conversation about it; anyway, some thoughts. Some users have reported very long (indeed unacceptable) loading times in the dropdown menu component while searching for a DAO. This is the kind of problem we have to fix. Another important issue is that, since we update the data every day, in most situations the cached data will go stale within a day, right? Moreover, on many days we won't have two users interested in the same DAO. Accordingly, we should prioritize the DAO platform page, especially the info in the two panels (even pre-loading the cache), and worry less about the per-DAO pages and the tab data. As a guiding principle, even in life: "better strive for the good than for the best". So it probably makes no sense to keep both lines of work alive; let's go with a good solution even if it is not perfect. In this case, I'd try to fix the unacceptable loading times we already know about and the loading time of the DAO platform page. I may be overlooking other issues and technical problems... but as I said at first, we should make this decision in a conversation.
I can't make background callbacks work... I tried making chart_controller use a background callback, but then all the graphs end up the same. This happens because the callback is added to the list of callbacks under a cache key derived from a hash of the function's code.
Because all of these functions have the same code (only the id of the output changes), the hashes are identical across all of them, as sketched below. I think the only possible solution would be to define the callback functions in a way that gives each one distinct code. Nevertheless, I'll ask on the forum to see if anyone else has come up with a solution.
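A minimal illustration of the collision, assuming the cache key is derived from the function's source code; make_chart_callback and build_figure are hypothetical:

```python
import hashlib
import inspect

def make_chart_callback(output_id):
    # every call produces a function with identical source code;
    # only the captured output_id differs
    def update_chart(data):
        return build_figure(output_id, data)  # build_figure is hypothetical
    return update_chart

cb_a = make_chart_callback('chart-a')
cb_b = make_chart_callback('chart-b')

# both functions hash to the same key, so their cached results collide
key_a = hashlib.md5(inspect.getsource(cb_a).encode()).hexdigest()
key_b = hashlib.md5(inspect.getsource(cb_b).encode()).hexdigest()
assert key_a == key_b
```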
Instead of creating multiple functions with the same code (one per component), we could create just one and use pattern-matching callbacks for the output (see the sketch below). The main problem is that each controller (and thus each callback) has a different adapter (and perhaps a different layout), so this is not possible without creating a custom "callback registry"...
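A sketch of the pattern-matching idea with hypothetical ids; the per-controller adapter lookup is exactly the part that would need the custom registry:

```python
from dash import Dash, Input, Output, MATCH, dcc, html

app = Dash(__name__)

CHARTS = ['new-users', 'active-users']  # hypothetical chart names

app.layout = html.Div([
    html.Div([
        dcc.Dropdown(id={'type': 'chart-input', 'index': name}),
        dcc.Graph(id={'type': 'chart', 'index': name}),
    ]) for name in CHARTS
])

# one callback serves every chart; MATCH pairs each graph with its own input
@app.callback(
    Output({'type': 'chart', 'index': MATCH}, 'figure'),
    Input({'type': 'chart-input', 'index': MATCH}, 'value'),
)
def update_chart(value):
    # dash.ctx.outputs_list tells us which chart we are updating; this is
    # where the per-controller adapter would have to be looked up
    ...
```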
I think I'll put this on hold for a while and focus on the dropdown.
I submitted a bug report to Dash.
I managed to use background=True with the functions in main_view_controller. The improvement is not very noticeable in development, but perhaps it will be after deployment.
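For reference, the shape of that change, with placeholder ids rather than the actual main_view_controller code:

```python
from dash import Input, Output, callback

@callback(
    Output('main-view', 'children'),
    Input('platform-dropdown', 'value'),
    background=True,  # runs through the app's background callback manager
)
def update_main_view(platform):
    ...
```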
I ended up with a branch that has both Flask-Caching and Dash background callbacks. I can't test whether it's working, and it has a lot of problems... Moving this to low priority and help wanted.
(CacheRequester.request is called from multiple threads)