Use actual node resource utilization in the strategy "LowNodeUtilization" #225
Comments
/kind feature |
@zhiyxu it looks like you are not the first person to request this feature. See the discussions in #123, #118, and #7; based on those discussions it looks like this has been considered before but not implemented. @damemi @aveshagarwal @ravisantoshgudimetla, has anything changed recently that would enable the k8s scheduler to use real load metrics during scheduling? For example, could the new scheduler framework somehow enable this feature in the scheduler? Maybe a custom plugin using the scheduler framework could be created to take real load metrics into account? |
@seanmalloy @ravisantoshgudimetla @damemi @aveshagarwal Any update or plans for this feature? |
+1, we need this feature too. Can we make a PR for this? |
@zhiyxu and @kangtiann here are my initial thoughts on what the API spec might look like. Please let me know what you think. There are two options: create a new v1alpha1 strategy, or create a new v1alpha2 API. I'm pretty confident the v1alpha2 option is the way to go, but I believe it would be a good idea to write a proposal for this and have SIG Scheduling review it.
The HPA supports custom metrics. Does the descheduler need to support custom metrics too? Keep in mind that the k8s scheduler does not take actual node utilization into account when scheduling pods, so pods evicted by this strategy could end up being scheduled on the same node again. Maybe this strategy could be paired with a yet-to-be-created out-of-tree scheduler plugin that takes node utilization into account when scheduling pods. See the discussions in #123 and #118.
|
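As a rough illustration of the proposal above, here is a minimal sketch in Go of what a v1alpha2-style strategy block driven by observed utilization could look like. All type and field names (`ActualUtilizationThresholds`, `MetricsSource`, and so on) are hypothetical and are not part of the descheduler's actual API.

```go
// Hypothetical sketch of v1alpha2-style strategy parameters for a
// LowNodeUtilization variant driven by observed (actual) node usage.
// All type and field names are illustrative, not the project's real API.
package v1alpha2

// MetricsSource selects where the observed node usage comes from.
type MetricsSource string

const (
	// Read node usage from the metrics.k8s.io API (metrics-server).
	MetricsSourceMetricsServer MetricsSource = "MetricsServer"
	// Placeholder for a custom-metrics backend, mirroring what the HPA supports.
	MetricsSourceCustom MetricsSource = "Custom"
)

// ActualUtilizationThresholds mirrors the existing request-based thresholds,
// but would be evaluated against observed node utilization instead of pod requests.
type ActualUtilizationThresholds struct {
	// Source of the utilization data.
	Source MetricsSource `json:"source"`
	// Nodes below all of these percentages are considered underutilized.
	Thresholds map[string]int `json:"thresholds"`
	// Nodes above any of these percentages are considered overutilized
	// and become eviction candidates.
	TargetThresholds map[string]int `json:"targetThresholds"`
}
```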
@kangtiann just want to clarify: are you willing to implement this and submit a PR with the required code changes? |
Also, keep in mind that the kubelet will evict pods when a node starts running out of memory or disk, https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/#eviction-signals. |
Would definitely like to get more feedback from the scheduling SIG on the feasibility of this. Getting "actual" pod usage has been a tricky problem I've personally hit while trying to debug flaky e2es, and I'm not totally caught up on the current state of getting that info. However, I like @seanmalloy's proposal. Since we already have this strategy that uses resource requests, which users may desire/prefer/expect, I don't think it would require an entirely new strategy: a simple boolean on the current strategy to flip between spec resources and "actual" resources would be less confusing in code and usage. |
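A minimal sketch of that boolean-flip idea, assuming a hypothetical flag and helper names; none of this is existing descheduler code, it only illustrates the shape of the change:

```go
// Sketch only: chooses between request-based and observed node usage
// depending on a hypothetical strategy flag. Names are illustrative and
// are not part of the descheduler codebase.
package nodeutilization

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// usageFromRequests adds up the CPU and memory requests of the given pods,
// which is roughly how the current LowNodeUtilization strategy measures usage.
func usageFromRequests(pods []*v1.Pod) v1.ResourceList {
	cpu := resource.NewMilliQuantity(0, resource.DecimalSI)
	mem := resource.NewQuantity(0, resource.BinarySI)
	for _, pod := range pods {
		for _, c := range pod.Spec.Containers {
			if q, ok := c.Resources.Requests[v1.ResourceCPU]; ok {
				cpu.Add(q)
			}
			if q, ok := c.Resources.Requests[v1.ResourceMemory]; ok {
				mem.Add(q)
			}
		}
	}
	return v1.ResourceList{v1.ResourceCPU: *cpu, v1.ResourceMemory: *mem}
}

// nodeUsage flips between the request-based calculation and a caller-supplied
// metrics lookup (e.g. backed by metrics-server), falling back to requests
// when metrics are unavailable.
func nodeUsage(pods []*v1.Pod, useActualUtilization bool, actual func() (v1.ResourceList, error)) v1.ResourceList {
	if useActualUtilization {
		if usage, err := actual(); err == nil {
			return usage
		}
	}
	return usageFromRequests(pods)
}
```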
@seanmalloy The proposal is great, and there are some further details to consider:
|
Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale |
/remove-lifecycle stale |
Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale |
There was a recent proposal at the SIG Scheduling meeting to add a scheduler plugin to take real load metrics into account during scheduling. /remove-lifecycle stale |
Here is the KEP document for Real Load Aware Scheduling: https://docs.google.com/document/d/1ffBpzhqELmhqJxdGMzYzIOoigxn3J0zlP1_nie34f9s/edit# |
Updated KEP document for Real Load Aware Scheduling: |
After evaluating Descheduler, we are very hopeful it will help us rebalance our clusters. However, we cannot move forward until this feature is implemented. In short, +1 for this feature request and we'll check back often to see when it is released. Thank you! |
Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale |
/remove-lifecycle stale |
Any updates on this, please? It would be a very useful feature. |
@Stefik95 the linked enhancements around real load aware scheduling are still being worked on (mainly in the scheduler-plugins repo, under the "Trimaran" name).

It was mentioned above, but getting actual pod consumption relies on access to the metrics API. To move forward with this, we should look into what we need to be able to access those metrics from within descheduler (and fallbacks/disabling when those metrics aren't available). Any help with this step is welcome; it would likely follow a similar pattern to Trimaran's metrics collection.

As a side note, there were also metrics recently added to report the scheduler's "observed" usage based on limits/requests for administrators to compare to real usage (kubernetes/enhancements#1916 and kubernetes/kubernetes#94866). This is intended to help admins optimize their requests and limits to better reflect actual values. |
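For reference, a minimal sketch of reading observed node usage from the metrics.k8s.io API with a graceful failure path when metrics-server is absent. The surrounding function is an illustrative assumption, not descheduler code; only the k8s.io/metrics client calls are the real library API.

```go
// Sketch: read observed node usage from the metrics.k8s.io API
// (served by metrics-server) and report an error when it is unavailable,
// so callers can fall back to request-based calculations.
package metricsutil

import (
	"context"
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/rest"
	metricsclient "k8s.io/metrics/pkg/client/clientset/versioned"
)

// actualNodeUsage returns observed CPU/memory usage per node name, or an
// error if the metrics API is not reachable.
func actualNodeUsage(cfg *rest.Config) (map[string]v1.ResourceList, error) {
	client, err := metricsclient.NewForConfig(cfg)
	if err != nil {
		return nil, fmt.Errorf("building metrics client: %w", err)
	}
	nodeMetrics, err := client.MetricsV1beta1().NodeMetricses().List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		return nil, fmt.Errorf("metrics.k8s.io not available: %w", err)
	}
	usage := make(map[string]v1.ResourceList, len(nodeMetrics.Items))
	for _, nm := range nodeMetrics.Items {
		usage[nm.Name] = nm.Usage
	}
	return usage, nil
}
```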
Thanks for the answer! Could you please tell me whether this is true: at the moment LowNodeUtilization works on the requests that were set when the pod was deployed, not on how usage changes over time? |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
/remove-lifecycle stale |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
/remove-lifecycle stale |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
This would be a really useful feature for us. Are there any updates on this? /remove-lifecycle stale |
@robertchgo not at the moment. There have been a few people who have offered to implement it, as discussed above, but no progress so far. With other ongoing work, this is a backlog feature right now. /lifecycle frozen |
Hello everyone! I have a PR (#1087) that tries to solve this problem, and I look forward to everyone's review comments to make it better. I hope it will help. |
This feature would be really useful. Is there any update on when it will land? |
Currently, the pods' resource requests are used to compute node resource utilization in the strategy "LowNodeUtilization". Would it be more rational to use actual node resource utilization as the basis for this judgement?

It is common for a pod's resource limit to be larger than its request, and after scheduling by the default scheduler (which goes by resource requests), the cluster probably looks balanced. But the actual resource usage of a pod can be much larger than its request, which may put some nodes under pressure.

So wouldn't it be more reasonable to integrate with the metrics server and use actual node resource utilization?
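To make the discrepancy concrete, here is a small self-contained Go example with made-up numbers: judged by requests the node looks half used, while observed usage already exceeds a typical target threshold.

```go
// Toy illustration (made-up numbers): a node can look balanced when judged
// by pod requests yet be under pressure when judged by observed usage.
package main

import "fmt"

func main() {
	allocatableCPU := 8000.0 // node allocatable CPU in millicores

	// Sum of the CPU requests of the pods scheduled on the node.
	requestedCPU := 4000.0
	// Observed CPU usage, e.g. as reported by metrics-server; pods with
	// limits well above their requests can burst far beyond what they asked for.
	actualCPU := 7200.0

	fmt.Printf("utilization by requests: %.0f%%\n", 100*requestedCPU/allocatableCPU) // 50%
	fmt.Printf("utilization by usage:    %.0f%%\n", 100*actualCPU/allocatableCPU)    // 90%
}
```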