Graceful Descheduling instead of Eviction #1558

Open
B1F030 opened this issue Nov 19, 2024 · 2 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

Member

B1F030 commented Nov 19, 2024

Is your feature request related to a problem? Please describe.

Is there a graceful alternative to eviction, such as a rolling update?
For now, descheduler evicts the pods which will cause a service interruption.
Can I configure the descheduler to restart the pods instead of evicting them?
For example:

kubectl rollout restart deployment/abc
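
For reference, `kubectl rollout restart` works by stamping the `kubectl.kubernetes.io/restartedAt` annotation on the pod template, which makes the Deployment controller roll out new pods under the Deployment's own update strategy. A minimal sketch of an equivalent patch body follows; the function name `restart_patch` is hypothetical and only illustrates the shape of the change, it is not something the descheduler provides.

```python
import json
from datetime import datetime, timezone

# Annotation key that `kubectl rollout restart` sets on the pod template.
RESTARTED_AT = "kubectl.kubernetes.io/restartedAt"


def restart_patch(now: datetime) -> dict:
    """Build a patch body that changes only the pod template annotation.

    Any change to spec.template triggers a new rollout, so the pods are
    replaced according to the Deployment's rolling update settings.
    """
    stamp = now.isoformat(timespec="seconds")
    return {
        "spec": {
            "template": {
                "metadata": {"annotations": {RESTARTED_AT: stamp}}
            }
        }
    }


if __name__ == "__main__":
    # Print the patch as JSON so it can be passed to `kubectl patch ... -p`.
    print(json.dumps(restart_patch(datetime.now(timezone.utc))))
```

Assuming the script is saved as `restart_patch.py` (a hypothetical file name), the output could be applied with `kubectl patch deployment abc -p "$(python restart_patch.py)"`.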

Describe the solution you'd like

Provide an optional setting that restarts pods instead of evicting them, so that the service is not interrupted.

Describe alternatives you've considered

Alternatively, create the new pods before evicting the old ones; once the new pods are ready, the old pods can be deleted.
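
This "create first, delete later" ordering is what a Deployment's rolling update gives when `maxSurge` is at least 1 and `maxUnavailable` is 0. The sketch below only does the arithmetic for those two settings (percentages round up for `maxSurge` and down for `maxUnavailable`); the helper names are hypothetical, and edge cases such as both values resolving to zero are ignored.

```python
def resolve(value, replicas: int, round_up: bool) -> int:
    """Resolve an int or a percentage string such as "25%" against replicas."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = replicas * int(value[:-1])
        # Integer ceiling or floor of scaled / 100, avoiding float rounding.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas: int, max_surge="25%", max_unavailable="25%"):
    """Return (min_available, max_total) pod counts during a rolling update."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return replicas - unavailable, replicas + surge


if __name__ == "__main__":
    # One replica with the default 25%/25% strategy.
    print(rollout_bounds(1))
    # One replica with explicit maxSurge=1, maxUnavailable=0.
    print(rollout_bounds(1, max_surge=1, max_unavailable=0))
```

For a single replica, both calls give a minimum of 1 available pod and a maximum of 2 pods in total, so the old pod is only removed after its replacement is ready.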

What version of descheduler are you using?

descheduler version: v0.31.0

Additional context

@B1F030 B1F030 added the kind/feature label Nov 19, 2024
@B1F030 B1F030 changed the title from "optional Restart instead of Eviction" to "Graceful Descheduling instead of Eviction" Nov 19, 2024
Contributor

a7i commented Nov 19, 2024

For now, descheduler evicts the pods which will cause a service interruption.

Would you please elaborate on why that is, given that it uses the eviction api. Do you define a PodDisruptionBudget?
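
For context on the PodDisruptionBudget question, here is a minimal sketch of how the eviction API takes a PDB into account. It models only the `minAvailable` form, and the function names are hypothetical; the real check is done by the API server against the PDB's status.

```python
from typing import Optional


def disruptions_allowed(current_healthy: int, min_available: int) -> int:
    """Number of pods that may be disrupted without violating the budget."""
    return max(0, current_healthy - min_available)


def can_evict(current_healthy: int, min_available: Optional[int] = None) -> bool:
    """Decide whether an eviction request would be admitted.

    With no PDB covering the pod, the eviction is always admitted.
    With a PDB, it is admitted only while disruptions are still allowed.
    """
    if min_available is None:
        return True
    return disruptions_allowed(current_healthy, min_available) > 0


if __name__ == "__main__":
    print(can_evict(1))                   # single replica, no PDB
    print(can_evict(1, min_available=1))  # single replica, PDB minAvailable=1
    print(can_evict(2, min_available=1))  # two replicas, PDB minAvailable=1
```

With a single replica, the eviction is either admitted (no PDB, so the only pod goes away) or refused (PDB with `minAvailable: 1`); a PDB only keeps the service up during eviction when there is at least one spare healthy replica.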

Member Author

B1F030 commented Nov 20, 2024

Would you please elaborate on why that is, given that it uses the eviction api. Do you define a PodDisruptionBudget?

Sure, I'll provide more details about our scenario:

We have two Kubernetes GPU node pools: A, billed monthly (one node), and B, elastic (zero nodes by default, scaled by an autoscaler).
A Deployment with one replica that uses the GPU runs on A and occupies all of that node's resources. When we perform a rolling update, the new pod does not fit on A, so it triggers the autoscaler and is scheduled onto B, which leaves A underused.

Since the monthly node is cheaper, we want the pod to be rescheduled back to A so that the elastic pool can scale back down to zero.

In short, this workload takes up almost all of the resources on one node and has only one replica, so we can't use a PDB (running multiple replicas would increase cost).
During a rolling update we expect the new pod to be scheduled to the elastic node. Once the rolling update is done, we want a reschedule to be triggered that moves the workload back to the monthly node (this relies on preferredDuringSchedulingIgnoredDuringExecution).
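
A preference like the one described here could be expressed as a preferred node affinity term on the pod template. The sketch below builds that fragment as a Python dict; the label `example.com/pool=monthly` and the helper name `prefer_pool` are assumptions for illustration, since the thread does not say how the node pools are labelled.

```python
import json


def prefer_pool(label_key: str, label_value: str, weight: int = 100) -> dict:
    """Build an affinity fragment that prefers nodes carrying the given label.

    The result belongs under spec.template.spec.affinity of the Deployment.
    Because the term is "preferred", the scheduler may still place the pod
    elsewhere (for example on the elastic pool) when no preferred node fits.
    """
    if not 1 <= weight <= 100:
        raise ValueError("weight must be between 1 and 100")
    return {
        "nodeAffinity": {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {
                    "weight": weight,
                    "preference": {
                        "matchExpressions": [
                            {
                                "key": label_key,
                                "operator": "In",
                                "values": [label_value],
                            }
                        ]
                    },
                }
            ]
        }
    }


if __name__ == "__main__":
    # Hypothetical label identifying the monthly node pool.
    print(json.dumps(prefer_pool("example.com/pool", "monthly"), indent=2))
```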

We also don't want the service to be interrupted, so I'm looking for a graceful way to reschedule: create the new pod before evicting the old one, just like a rolling update or kubectl rollout restart deployment.
