-
-
Notifications
You must be signed in to change notification settings - Fork 134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow excluding cache based on status code #1403
Conversation
Perfect. Great work! Perhaps the only thing I'd add is an integration test, similar to this: lychee/lychee-bin/tests/cli.rs Lines 794 to 858 in eff77d6
Probably it's just a matter of copy-pasting that block and changing the parameter, to test the new flag. |
I'm happy to do it, but it'll have to wait for a few days. I can then do it either in this PR, or in a new one. |
Just tested the changes locally. ❯❯❯ cargo run -- --verbose --cache https://lychee.cli.rs/
Finished dev [unoptimized + debuginfo] target(s) in 0.19s
Running `target/debug/lychee --verbose --cache 'https://lychee.cli.rs/'`
[INFO ] Cache is recent (age: 1m 6s, max age: 1d 0h 0m 0s). Using.
✔ [200] https://lychee.cli.rs/introduction/
✔ [200] https://lychee.cli.rs/#_top
✔ [200] https://lychee.cli.rs/favicon.svg
✔ [200] https://lychee.cli.rs/
✔ [200] https://lychee.cli.rs/_astro/logo.BJx6koUn.svg
✔ [200] https://lychee.cli.rs/sitemap-index.xml
✔ [200] https://lychee.cli.rs/_astro/index.fVW1leCO.css
✔ [200] https://lychee.cli.rs/
✔ [200] https://lychee.cli.rs/
✔ [200] https://lychee.cli.rs/_astro/logo.BJx6koUn_1qY8Fi.svg
✔ [200] https://github.com/lycheeverse/lycheeverse.github.io/edit/master/src/content/docs/index.mdx
✔ [200] https://github.com/lycheeverse/lychee/
🔍 12 Total (in 1s) ✅ 12 OK 🚫 0 Errors
~/C/p/l/lychee ❯❯❯ echo $?
0
~/C/p/l/lychee ❯❯❯ cat .lycheecache
~/C/p/l/lychee ❯❯❯ |
I've added the integration test we discussed, and that one seems to work. Something isn't adding up here. 🤔 |
Could that be related to a cache file already existing? (in which case, the issue could be that in the case of running tests with an existing cache, the file isn't written again?) |
31bc11e
to
8631028
Compare
I'm not sure whether there's still something missing in this PR? |
Sorry for the late response. The feature doesn't work right now.
Nope, tried in a directory without a cache file. I didn't have time to look at it, but I guess it has something to do with the defaults of If you have some time to investigate or test on your side, I would greatly appreciate the effort. :) |
Sorry for the delay. I have been trying this branch explicitly, and I'm not seeing any issue in the behavior. I'm testing a single file which has the following content: [Valid](https://github.com)
[Error](https://example.ai)
[Missing](https://github.com/404missingpage) I run lychee and then inspect the cache:
The 404 URL is properly not cached, while the 200 one is cached. |
No worries and thanks for the update. What I'm seeing is the following:
❯ rm .lycheecache
❯ cargo run -- --cache --cache-exclude-status 404 tmp ✘
Running `target/debug/lychee --cache --cache-exclude-status 404 tmp`
3/3 ━━━━━━━━━━━━━━━━━━━━ Finished extracting links Issues found in 1 input. Find details below.
[tmp]:
[ERROR] https://example.ai/
[404] https://github.com/404missingpage
🔍 3 Total (in 0s) ✅ 1 OK 🚫 2 Errors
[WARN ] There were issues with GitHub URLs. You could try setting a GitHub token and running lychee again.
❯ cat .lycheecache ✘
https://example.ai/,,1728333536
https://github.com/,200,1728333536 It caches the 200 status code from github.com and excludes the 404 Now let us remove the cache and run ❯ rm .lycheecache
❯ cargo run -- --cache tmp
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.60s
Running `target/debug/lychee --cache tmp`
3/3 ━━━━━━━━━━━━━━━━━━━━ Finished extracting links Issues found in 1 input. Find details below.
[tmp]:
[404] https://github.com/404missingpage
[ERROR] https://example.ai/
🔍 3 Total (in 0s) ✅ 1 OK 🚫 2 Errors
[WARN ] There were issues with GitHub URLs. You could try setting a GitHub token and running lychee again.
❯ cat .lycheecache ✘
https://example.ai/,,1728333603
https://github.com/404missingpage,404,1728333603 As we can see, the cache now contains the 404 status code as expected. ❯ rm .lycheecache
❯ lychee --cache tmp
3/3 ━━━━━━━━━━━━━━━━━━━━ Finished extracting links Issues found in 1 input. Find details below.
[tmp]:
✗ [404] https://github.com/404missingpage | Failed: Network error: Not Found
✗ [ERR] https://example.ai/ | Failed: Network error: error sending request for url (https://example.ai/)
🔍 3 Total (in 0s) ✅ 1 OK 🚫 2 Errors
💡 There were issues with GitHub URLs. You could try setting a GitHub token and running lychee again.%
❯ cat .lycheecache ✘
https://github.com/404missingpage,404,1728333682
https://github.com/,200,1728333682
https://example.ai/,,1728333682 I think we want the behavior of the current release, where the cache |
Ah yes, you're right. It seems this is because we reuse the same I've been looking around, but I haven't found a clean way to fix this so far. It seems like the derivative crate could be a solution, but I'd rather not add a dependency on this project for no reason. An alternative would be to somehow have a |
You're right. |
The CI failure seems to be a flake? I'm not reproducing it locally. |
Fixed a couple of typos and split up the tests into individual test-cases. Thanks for the contribution @dmathieu! |
This introduces an option `--cache-exclude-status`, which allows specifying a range of HTTP status codes which will be ignored from the cache. Closes #1400.
Closes #1400
This introduces an option
--cache-exclude-status
, which allows specifying a range of HTTP status codes which will be ignored from the cache.