Enable fail-fast behavior for health checks #933

Merged: 1 commit into main from fail-fast, Aug 2, 2023

@stefanprodan stefanprodan commented Jul 31, 2023

Fail the health check as soon as a resource becomes stalled without waiting for the timeout to expire.

For example, with wait: true and timeout: 20m, if a Deployment exceeds its progress deadline after 5m, the controller does not wait for the remaining 15m; it fails the reconciliation as soon as the Deployment rollout has stalled.

Note that the fail-fast behavior does not currently work with HelmReleases as these don't have a stalled condition. We expect to ship stalled conditions in the HelmRelease API v2beta2.
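To illustrate why a missing stalled condition prevents failing fast, here is a rough sketch of a condition lookup. The `Condition` struct is a simplified stand-in for a Kubernetes status condition, and the condition type name `Stalled`, the helper `isStalled`, and the sample reasons are assumptions made for this example rather than code from this pull request.

```go
package main

import "fmt"

// Condition is a simplified stand-in for a Kubernetes status condition.
type Condition struct {
	Type   string
	Status string // "True", "False" or "Unknown"
	Reason string
}

// isStalled reports whether the conditions contain a Stalled condition set
// to "True", and returns its reason. The type name "Stalled" is assumed here.
func isStalled(conds []Condition) (bool, string) {
	for _, c := range conds {
		if c.Type == "Stalled" && c.Status == "True" {
			return true, c.Reason
		}
	}
	return false, ""
}

func main() {
	// A resource that reports a stalled condition can fail the check early.
	withStalled := []Condition{
		{Type: "Ready", Status: "False", Reason: "RolloutFailed"},
		{Type: "Stalled", Status: "True", Reason: "ProgressDeadlineExceeded"},
	}
	// A resource that only reports Ready=False gives no signal to fail fast
	// on, so the checker can only keep waiting until the timeout.
	withoutStalled := []Condition{
		{Type: "Ready", Status: "False", Reason: "InstallFailed"},
	}

	stalled, reason := isStalled(withStalled)
	fmt.Println(stalled, reason)
	stalled, reason = isStalled(withoutStalled)
	fmt.Println(stalled, reason)
}
```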

Fix: fluxcd/flux2#3980

This behavior can be disabled using the DisableFailFastBehavior feature flag.
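As a sketch of how such an opt-out flag could gate the behavior: the description does not show how the flag is passed to the controller, so the comma-separated `Name=bool` format and the helpers `parseFeatureGates` and `failFastEnabled` below are assumptions for illustration only. Only the flag name `DisableFailFastBehavior` comes from this pull request.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// DisableFailFastBehavior is the feature flag named in this pull request.
const DisableFailFastBehavior = "DisableFailFastBehavior"

// parseFeatureGates parses a comma-separated list of "Name=bool" pairs.
// The input format is an assumption made for this sketch.
func parseFeatureGates(s string) (map[string]bool, error) {
	gates := map[string]bool{}
	if strings.TrimSpace(s) == "" {
		return gates, nil
	}
	for _, pair := range strings.Split(s, ",") {
		name, value, ok := strings.Cut(strings.TrimSpace(pair), "=")
		if !ok {
			return nil, fmt.Errorf("invalid feature gate %q, want Name=bool", pair)
		}
		enabled, err := strconv.ParseBool(value)
		if err != nil {
			return nil, fmt.Errorf("invalid value for %s: %w", name, err)
		}
		gates[name] = enabled
	}
	return gates, nil
}

// failFastEnabled returns true unless the opt-out flag is set to true, so
// failing fast is the default when the flag is absent.
func failFastEnabled(gates map[string]bool) bool {
	return !gates[DisableFailFastBehavior]
}

func main() {
	for _, input := range []string{"", "DisableFailFastBehavior=true"} {
		gates, err := parseFeatureGates(input)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("gates=%q failFast=%v\n", input, failFastEnabled(gates))
	}
}
```

Naming the flag as an opt-out keeps failing fast as the default, which fits a temporary escape hatch that can later be dropped without changing the default behavior.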

Fail the health check as soon as a resource becomes stalled
without waiting for the timeout to expire.
This behavior can be disabled using the `DisableFailFastBehavior` feature flag.

Signed-off-by: Stefan Prodan <[email protected]>
@stefanprodan stefanprodan added the enhancement and area/kstatus labels Jul 31, 2023
@hiddeco hiddeco left a comment

While this pull request by itself looks good to me, I am wondering if there is any thought and/or reasoning around this being a global flag versus a field in the API?

@stefanprodan

@hiddeco I think failing fast is how health checking should behave. The feature flag is temporary; if no issues arise, we'll remove it in a future minor version.

@makkes makkes left a comment

🎉 Would we be able to add a test for this feature?

@stefanprodan

@makkes fail-fast tests were added in the ssa package.

@stefanprodan stefanprodan merged commit 460a165 into main Aug 2, 2023
8 checks passed
@stefanprodan stefanprodan deleted the fail-fast branch August 2, 2023 10:10
Labels: area/kstatus (Health checking related issues and pull requests), enhancement (New feature or request)