[stable/vpa] Probe defaults don't make sense #1519
Labels: bug, good first issue, help wanted
What happened?
The VPA chart sets a `readinessProbe` and a `livenessProbe` for several containers. Here are the values for the `recommender` (charts/stable/vpa/values.yaml, lines 88 to 107 at 1dbb322).
Both are essentially the same and differ only in their `failureThreshold` values. Those values are the problem I see: if 6 failed liveness probes lead to restarting the container, there is no way for the container to ever become unready after 120 failed readiness probes, as the restart happens way earlier.

The behavior has been like this since the probes were added in #399.

In a scenario with a couple of thousand VPA resources, I've seen the recommender being restarted all the time because its liveness probes failed while the container wasn't done with its startup.
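For illustration, the mismatch described above looks roughly like this (a sketch, not the actual chart values; the probe handler and `periodSeconds` are assumptions, only the `failureThreshold` values of 6 and 120 come from the issue):

```yaml
# Sketch of the recommender probes as described above.
livenessProbe:
  httpGet:
    path: /health-check   # handler details assumed
    port: metrics
  periodSeconds: 10
  failureThreshold: 6     # container is restarted after ~60s of failed probes
readinessProbe:
  httpGet:
    path: /health-check   # handler details assumed
    port: metrics
  periodSeconds: 10
  failureThreshold: 120   # effectively unreachable: the liveness restart fires first
```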
What did you expect to happen?
A `failureThreshold` of 120 for the `readinessProbe` seems quite high. I wonder if it's rather meant to be a `startupProbe`. Such a high `failureThreshold` would allow quite some time for the container to come up before the `livenessProbe` takes over.

How can we reproduce this?
Create lots of VPA resources to slow down the startup of the VPA's recommender pod. It then gets restarted due to the failing liveness probe.
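The suggested change above could look like this in the chart values (a sketch under assumptions, not a confirmed fix; handler details and timings are made up, only the probe types and thresholds follow from the issue text):

```yaml
# Sketch: move the generous threshold into a startupProbe so a slow start
# doesn't trip the livenessProbe. Kubernetes disables the liveness and
# readiness probes until the startupProbe has succeeded.
startupProbe:
  httpGet:
    path: /health-check   # handler details assumed
    port: metrics
  periodSeconds: 10
  failureThreshold: 120   # allows up to ~20 minutes for startup
livenessProbe:
  httpGet:
    path: /health-check   # handler details assumed
    port: metrics
  periodSeconds: 10
  failureThreshold: 6     # only takes over once startup has succeeded
readinessProbe:
  httpGet:
    path: /health-check   # handler details assumed
    port: metrics
  periodSeconds: 10
  failureThreshold: 3     # ordinary readiness flapping, no longer a startup budget
```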
Version
4.5.0
Additional context
No response