Graceful Descheduling instead of Eviction #1558

B1F030 · 2024-11-19T06:33:36Z

Is your feature request related to a problem? Please describe.

Is there a graceful alternative to eviction, similar to a rolling update?
For now, descheduler evicts the pods which will cause a service interruption.
I wonder whether I can customize the config so that it restarts the pods instead of evicting them, like this:

kubectl rollout restart deployment/abc

Describe the solution you'd like

Provide an optional config to restart pods instead of evicting them, so that the service is not interrupted.

Describe alternatives you've considered

Alternatively, create new pods before evicting the old ones; once the new pods are ready, the old pods can be deleted.
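For reference, a Deployment can already roll in this create-before-delete order when its rolling update strategy allows one surge pod and no unavailable pods. The following is a minimal sketch, not descheduler functionality: it assumes the Deployment is named abc as in the example above, and the helper name surge_first_restart is hypothetical.

```shell
# Assumption: the Deployment is named "abc", as in the example above.
DEPLOYMENT="abc"

# Surge-first rolling update: start one extra pod, and never remove the
# old pod before its replacement is Ready.
STRATEGY_PATCH='{"spec":{"strategy":{"type":"RollingUpdate","rollingUpdate":{"maxSurge":1,"maxUnavailable":0}}}}'

# Hypothetical helper; nothing touches the cluster until it is called.
surge_first_restart() {
  kubectl patch deployment "$DEPLOYMENT" --type merge -p "$STRATEGY_PATCH" &&
    kubectl rollout restart "deployment/$DEPLOYMENT" &&
    kubectl rollout status "deployment/$DEPLOYMENT"
}
```

Calling surge_first_restart applies the strategy, restarts the rollout, and waits for it to finish. The surge pod needs spare capacity somewhere in the cluster while both pods exist.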

What version of descheduler are you using?

descheduler version: v0.31.0

Additional context


a7i · 2024-11-19T18:41:23Z

For now, descheduler evicts the pods which will cause a service interruption.

Would you please elaborate on why that is, given that it uses the eviction api. Do you define a PodDisruptionBudget?

B1F030 · 2024-11-20T03:30:17Z

Would you please elaborate on why that is, given that it uses the eviction api. Do you define a PodDisruptionBudget?

Sure, I'll provide more details about our scenario:

We have two Kubernetes GPU node pools: A is billed monthly (one node) and B is elastic (zero nodes, but with an autoscaler).
A Deployment with one replica that uses the GPU runs on A and takes up all the resources of that node. When we do a rolling update, it triggers the autoscaler and the new pod is scheduled to B, which leaves A underused.

Since the monthly node is cheaper, we want the pod to be rescheduled back to A so that the elastic pool can scale back down to zero.

In conclusion, this workload takes up almost all of the resources on one node, and there is only one replica, so we can't use a PDB (running multiple replicas would increase cost).
During a rolling update, we expect the new pod to be scheduled to the elastic node. After the rolling update is done, we hope that descheduling is triggered and the workload is evicted back to the monthly node (this depends on preferredDuringSchedulingIgnoredDuringExecution).

We also don't want the service to be interrupted, so I'm looking for a graceful way to reschedule: create the new pod before evicting the old one, just like a rolling update or kubectl rollout restart deployment.
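As a manual stand-in for the requested behavior, the move back to A can be sketched with plain kubectl. This is only a sketch under assumptions: the Deployment is named abc, the monthly pool's nodes carry a hypothetical label pool=monthly, the Deployment creates the new pod before removing the old one (maxSurge 1, maxUnavailable 0), and the helper names are made up for illustration.

```shell
# Assumptions: Deployment "abc"; monthly nodes carry the hypothetical
# label pool=monthly.
DEPLOYMENT="abc"

# Soft preference for the monthly pool. A merge patch replaces any
# existing preferredDuringSchedulingIgnoredDuringExecution list.
AFFINITY_PATCH='{"spec":{"template":{"spec":{"affinity":{"nodeAffinity":{"preferredDuringSchedulingIgnoredDuringExecution":[{"weight":100,"preference":{"matchExpressions":[{"key":"pool","operator":"In","values":["monthly"]}]}}]}}}}}}'

# One-time setup (hypothetical helper). Changing the pod template
# starts a new rollout by itself.
prefer_monthly_pool() {
  kubectl patch deployment "$DEPLOYMENT" --type merge -p "$AFFINITY_PATCH"
}

# Run once node A is free again: the replacement pod is scheduled with
# the preference above, and the old pod on B is removed afterwards.
move_back_to_monthly() {
  kubectl rollout restart "deployment/$DEPLOYMENT" &&
    kubectl rollout status "deployment/$DEPLOYMENT" &&
    kubectl get pods -o wide
}
```

Because the affinity is only a preference, the scheduler may still place the pod on B if A cannot fit it at that moment; the final kubectl get pods -o wide shows which node was chosen.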

B1F030 added the kind/feature label (Categorizes issue or PR as related to a new feature.) Nov 19, 2024

B1F030 changed the title ~~optional Restart instead of Eviction~~ Graceful Descheduling instead of Eviction Nov 19, 2024
