Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

On Linux, try to use /proc/loadavg to get instaneous load (#2218) #2268

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

entrope
Copy link

@entrope entrope commented Mar 6, 2023

The /proc/loadavg file includes the traditional 1-, 5- and 15-minute loads plus an instantaneous count of processes that are ready to run. This lets us respond faster to new processes being launched by other processes.

@entrope
Copy link
Author

entrope commented Mar 6, 2023

This is draft because I am not sure whether having a static variable is considered bad form. The Windows function CalculateProcessorLoad() already uses static variables, so perhaps it is okay here?

src/util.cc Show resolved Hide resolved
src/util.cc Show resolved Hide resolved
src/util.cc Outdated Show resolved Hide resolved
src/util.cc Outdated Show resolved Hide resolved
…d#2218)

The /proc/loadavg file includes the traditional 1-, 5- and 15-minute
loads plus an instantaneous count of processes that are ready to run.
This lets us respond faster to new processes being launched by other
processes.
@entrope entrope marked this pull request as ready for review March 30, 2023 00:40
@satmandu
Copy link

Is there any chance of this getting merged?

@Flowdalic
Copy link
Contributor

Flowdalic commented Mar 7, 2024

This changes the API of ninja's double GetLoadAverage() on Linux to return the currently running number of kernel-level threads (tasks) instead of the damped 1-minute average.

Frankly, besides me being skeptical that this function should return a different value depending on the operating system, using the instantaneous value also has drawbacks. The major one is that it, essentially being the value returned by Linux's nr_running(), does not account for uninterruptible tasks, i.e., tasks that are likely waiting for an I/O operation to complete, while those are included in the loadaverage (on Linux). You can read more about it in Brendan Gregg's excellent blog post: https://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html

This poses the risk of overloading the system as those I/O-waiting tasks are likely to become runnable again.

And yes, if they become not runnable again soon, then this would mean that no new ninja jobs are started, even though there are spare CPU resources. However, in this case, the system is likely already I/O-bound and starting to compilation jobs would be counterproductive anyways.

In summary, I am not sure if this change is sensible.

@entrope
Copy link
Author

entrope commented Mar 7, 2024

The underlying problem this was meant to address (#2218) is actually the opposite: The current approach takes too long to notice when lots of processes are creating, leading to CPU oversubscription and potentially VM (and thus swap -> IO) oversubscription. Do you propose a different way to solve the same problem?

@Flowdalic
Copy link
Contributor

Do you propose a different way to solve the same problem?

Yes, see #1949, which got recently merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants