This repository has been archived by the owner on Jun 28, 2024. It is now read-only.

static-checks: Try multiple user agents #5801

Merged
merged 2 commits into from
Dec 7, 2023


@cmaf cmaf commented Dec 6, 2023

Port of 2 commits from kata-containers/kata-containers#8592.

Note: there is no intention of maintaining two static checks scripts beyond this. This port exists only so that kata-containers/kata-containers#6826 can pass before testing for kata-containers/kata-containers#8595 is complete.

Split the call to `curl` in the URL checker out into a new
`run_url_check_cmd()` function to make `check_url()` slightly clearer.

Fixes kata-containers#5800

Signed-off-by: James O. D. Hunt <[email protected]>
Signed-off-by: Chelsea Mafrica <[email protected]>

Make the URL checker cycle through a list of user agent values until we
hit one the remote server is happy with.

This is required since, unfortunately, we really, really want to check
these URLs, but some sites block clients based on their `User-Agent`
(UA) request header value. And of course, each site is different and can
change its behaviour at any time.

Our strategy therefore is to try various UAs until we find one the
server accepts:

- No explicit UA (use `curl`'s default).
- Explicitly no UA.
- A blank UA.
- Partial UA values for various CLI tools.
- Partial UA values for various console web browsers.
- Partial UA for Emacs's built-in browser.
- The existing UA, used as a "last ditch" attempt, which implies multiple platforms and browsers.
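
The fallback order above could be sketched roughly as follows. This is a minimal illustration, not the code merged by this PR: only the function names `run_url_check_cmd()` and `check_url()` come from the commit message, while the `user_agents` array, the `@default` and `@none` tokens, the UA strings, and the `curl` options are assumptions made for the example.

```shell
#!/usr/bin/env bash

# Illustrative UA candidates, tried in order. The "@default" and "@none"
# tokens and all UA strings are assumptions of this sketch.
user_agents=(
    "@default"   # no explicit UA: curl sends its own default
    "@none"      # explicitly send no User-Agent header
    ""           # a blank User-Agent header
    "Wget"       # partial UA (product only) for a CLI tool
    "Lynx"       # partial UA for a console web browser
    "Emacs"      # placeholder for the Emacs browser UA
    "Mozilla/5.0 (placeholder last-ditch UA)"
)

# Print the HTTP status code curl reports for URL $1 using UA choice $2.
run_url_check_cmd()
{
    local url="$1"
    local ua="$2"
    local -a ua_args=()

    case "$ua" in
        "@default") ;;                          # leave curl's default UA alone
        "@none") ua_args=(-H 'User-Agent:') ;;  # "Header:" removes the header
        "") ua_args=(-H 'User-Agent;') ;;       # "Header;" sends it with no value
        *) ua_args=(-A "$ua") ;;                # explicit UA string
    esac

    curl -sL --max-time 20 -o /dev/null -w '%{http_code}' \
        "${ua_args[@]}" "$url"
}

# Return 0 as soon as one UA yields a 2xx status for URL $1, else 1.
check_url()
{
    local url="$1"
    local ua status

    for ua in "${user_agents[@]}"; do
        # A curl failure (timeout, DNS error, ...) just moves on to the next UA.
        status=$(run_url_check_cmd "$url" "$ua") || continue
        case "$status" in
            2??) return 0 ;;
        esac
    done

    return 1
}
```

With these definitions, `check_url "https://example.com" || echo "invalid"` would try each UA in turn and stop at the first one the server accepts.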

> **Notes:**
>
> - The "partial UA" values specify the UA "product" but not the
>   UA "product version" (we specify `foo` and not `foo/1.2.3`). We do
>   this since most sites tested appear not to care about the version.
>   This is as expected given that the version is strictly optional (see `[*]`).
>
> - We now treat URLs that the server reports as HTTP 401, HTTP 402 or
>   HTTP 403 as *valid*. See the comments in the code.
>
> - We now log all errors and display a summary on error in addition to
>   the simple list of the URLs we believe to be invalid. This should make
>   future debugging simpler.

`[*]` - https://www.rfc-editor.org/rfc/rfc9110#section-10.1.5
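
As a rough sketch of the second and third notes, the status handling and error summary might look like the following. The function names `url_status_is_valid()` and `list_invalid_urls()`, the `<status> <url>` input format, and the log wording are hypothetical and chosen for illustration; the merged script may structure this differently.

```shell
#!/usr/bin/env bash

# Return 0 if HTTP status code $1 should count as "URL is valid".
# 401, 402 and 403 are accepted, as described in the notes above.
url_status_is_valid()
{
    case "$1" in
        2??) return 0 ;;
        401|402|403) return 0 ;;
        *) return 1 ;;
    esac
}

# Read "<status> <url>" lines on stdin. Log each failure to stderr,
# print the invalid URLs to stdout, and return 1 if any were found.
list_invalid_urls()
{
    local status url failed=0

    while read -r status url; do
        if ! url_status_is_valid "$status"; then
            echo "ERROR: HTTP $status: $url" >&2
            echo "$url"
            failed=1
        fi
    done

    return "$failed"
}
```

For example, piping `printf '200 https://a.example\n404 https://b.example\n'` into `list_invalid_urls` would log the 404 to stderr and list only `https://b.example` on stdout.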

Fixes: kata-containers#5800

Signed-off-by: James O. D. Hunt <[email protected]>
Signed-off-by: Chelsea Mafrica <[email protected]>
@katacontainersbot katacontainersbot added the size/large Task of significant size label Dec 6, 2023
Author

cmaf commented Dec 6, 2023

/test

Contributor

@jodh-intel jodh-intel left a comment


Thanks @cmaf.

lgtm

Member

@fidencio fidencio left a comment


lgtm, thanks @cmaf!

@fidencio fidencio merged commit 46da907 into kata-containers:main Dec 7, 2023
12 checks passed
Labels
size/large Task of significant size
4 participants