improving caching. #11

Open
jeffreycwitt opened this issue May 31, 2017 · 2 comments

Comments

@jeffreycwitt
Collaborator

Here's an idea for improving the caching in lbp print.

Could we use Python to compute a shasum of the raw XML file and a shasum of the converting XSLT file, then use the combination of these two shasums as the file name in the cache? The file would always be freshly retrieved, a shasum would be computed for it and for the converting XSLT file, and we could then check the cache to see whether a file with this name exists.
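A minimal sketch of this naming scheme, assuming the XML and XSLT contents are already available as bytes. The function name `cache_name` and the `.tex` suffix are illustrative and not part of lbp print. SHA-1 stands in for "shasum", and the two digests are concatenated and hashed again rather than added arithmetically.

```python
import hashlib


def cache_name(xml_bytes: bytes, xslt_bytes: bytes, suffix: str = ".tex") -> str:
    """Derive a cache file name from the contents of the XML and XSLT files."""
    xml_sum = hashlib.sha1(xml_bytes).hexdigest()
    xslt_sum = hashlib.sha1(xslt_bytes).hexdigest()
    # Combine both digests into one fixed-length name (assumed combination rule).
    combined = hashlib.sha1((xml_sum + xslt_sum).encode("ascii")).hexdigest()
    return combined + suffix
```

A change to either file yields a different name, so a stale cache entry is simply never looked up again.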

This still wouldn’t detect changes to later steps such as the whitespace clean-up, but it would be close.

Alternatively, since it doesn't take too much time to create the LaTeX source, we could take the shasum of the LaTeX file and name the resulting PDF with this value.

Then we could always run the first step of processing, but skip the conversion to PDF if the cache already contains a file whose name equals the shasum of the LaTeX source file.
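A rough sketch of that second variant, under the assumption that the LaTeX source is held as a string and that the PDF step can be passed in as a callable. `pdf_for_tex` and `compile_pdf` are hypothetical names for illustration, not functions from lbp print.

```python
import hashlib
from pathlib import Path
from typing import Callable


def pdf_for_tex(tex_source: str, cache_dir: Path,
                compile_pdf: Callable[[str, Path], None]) -> Path:
    """Return the cached PDF for this LaTeX source, compiling only on a miss."""
    digest = hashlib.sha1(tex_source.encode("utf-8")).hexdigest()
    pdf_path = cache_dir / f"{digest}.pdf"
    if not pdf_path.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        # Hypothetical compile step: must write the PDF to pdf_path.
        compile_pdf(tex_source, pdf_path)
    return pdf_path
```

Because the name depends on the final LaTeX text, changes in the XML, the XSLT, or the clean-up code all lead to a new PDF, at the cost of always running the conversion to LaTeX.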

In any case, right now, the webapp is not producing new renderings for updated source files because it is simply reverting to the older cached LaTeX and pdf output.

I think this is also an issue for the SCTA: it should record a shasum for the master branch of the source XML file. That way, applications can check the SCTA record before deciding whether they need to retrieve a new version of the file. (I'll work on that)

@stenskjaer
Owner

Okay. I will work on an implementation of this as a solution until we can also check for changes in SCTA. When that is up and running, I also want to add that check, because it saves a remote fetch.

jeffreycwitt changed the title from "improving chaching." to "improving caching." on Jun 1, 2017
@stenskjaer
Owner

stenskjaer commented Jul 23, 2017

I'm looking at this now. A possible procedure would be:

  • Receive id
  • Get XML
  • Is the hash of the XML, keyed with the hash of the XSLT (blake2 hash algorithm), in the tex cache?
  • If yes: Use that pdf
  • If no: Create tex file.
  • Store tex and compile to PDF.
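The lookup part of this procedure could be sketched as follows, assuming the XML and XSLT are available as bytes and that cached PDFs are stored under the key as file name. `tex_cache_key` and `cached_pdf` are illustrative names, and the 20-byte digest size is an arbitrary choice to keep file names short.

```python
import hashlib
from pathlib import Path
from typing import Optional


def tex_cache_key(xml_bytes: bytes, xslt_bytes: bytes) -> str:
    """Hash the XML with blake2b, using the hash of the XSLT as the key."""
    # blake2b keys may be at most 64 bytes; a default blake2b digest is 64 bytes.
    xslt_digest = hashlib.blake2b(xslt_bytes).digest()
    return hashlib.blake2b(xml_bytes, key=xslt_digest, digest_size=20).hexdigest()


def cached_pdf(xml_bytes: bytes, xslt_bytes: bytes,
               cache_dir: Path) -> Optional[Path]:
    """Return the cached PDF for this XML/XSLT pair, or None on a cache miss."""
    pdf_path = cache_dir / (tex_cache_key(xml_bytes, xslt_bytes) + ".pdf")
    return pdf_path if pdf_path.exists() else None
```

On a `None` result the caller would create the tex file, store it, and compile it to PDF under the same key.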

Because of the clean-up functions, the output is not completely determined by the XML and XSLT alone. That could be addressed by adding the git sha of the current HEAD of the app as an additional key to the hashing function. This would mean that any change to the app would cause a rebuild of any (otherwise) cached files. I wonder whether that would be overdoing it and result in superfluous recompiles.

Another possibility would be to add a value in the module that is included in the keyed hash but is only updated when file-altering changes are made (e.g. to the tex_clean function).
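One way to picture the module-level value, as a sketch only: `CACHE_VERSION` and `versioned_cache_key` are hypothetical names, and the version string is mixed into the same keyed blake2b hash as the XML.

```python
import hashlib

# Hypothetical module-level value: bump it only when a change alters the
# generated files (e.g. a change to the clean-up code).
CACHE_VERSION = "1"


def versioned_cache_key(xml_bytes: bytes, xslt_bytes: bytes,
                        version: str = CACHE_VERSION) -> str:
    """Keyed blake2b hash of the XML that also depends on a version string."""
    xslt_digest = hashlib.blake2b(xslt_bytes).digest()
    h = hashlib.blake2b(key=xslt_digest, digest_size=20)
    h.update(version.encode("utf-8"))
    h.update(b"\0")  # separator so the version and the XML cannot run together
    h.update(xml_bytes)
    return h.hexdigest()
```

Passing the output of `git rev-parse HEAD` as `version` would give the stricter variant, where every commit to the app invalidates the cache.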

(edit after untimely submit)

@stenskjaer stenskjaer added this to the 1.0 Rewrite milestone Feb 10, 2019
@stenskjaer stenskjaer self-assigned this Aug 25, 2019