The help text for the -l option says "do not start new jobs if the load average is greater than N". However, that is not how the current implementation behaves. The relevant code is:
int load_capacity = config_.max_load_average - GetLoadAverage();
if (load_capacity < capacity)
  capacity = load_capacity;
Suppose we are running with -l5, so that config_.max_load_average == 5.0, and suppose GetLoadAverage() returns 4.1. The help text implies that new jobs should be able to start, since 4.1 is not greater than 5. However, config_.max_load_average - GetLoadAverage() == 0.9, and assigning that to an int truncates the fractional part, so we end up with load_capacity == 0. This stops any new jobs from starting until the load average falls to 4.0 or lower.
I think the code should instead be
int load_capacity = ceil(config_.max_load_average - GetLoadAverage());
so that we round up.
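To make the difference concrete, here is a minimal standalone sketch of the two roundings. The helper names TruncatedLoadCapacity and CeilLoadCapacity are hypothetical and exist only for this illustration; they are not functions in ninja, and they take the load values as plain parameters instead of reading config_ or calling GetLoadAverage().

```cpp
#include <cmath>

// Hypothetical helper: mirrors the current behavior, where the
// double difference is converted to int and the fraction is dropped.
int TruncatedLoadCapacity(double max_load_average, double load_average) {
  return static_cast<int>(max_load_average - load_average);
}

// Hypothetical helper: mirrors the proposed behavior, rounding the
// remaining headroom up before converting to int.
int CeilLoadCapacity(double max_load_average, double load_average) {
  return static_cast<int>(std::ceil(max_load_average - load_average));
}
```

With a limit of 5.0 and a load average of 4.1, TruncatedLoadCapacity returns 0 while CeilLoadCapacity returns 1. At a load average of exactly 4.0 both return 1, so the two only differ when the headroom has a fractional part.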
The code as it stands has other odd effects as well. I have been running ninja with -j4 -l5 on a 4-core system which is otherwise lightly loaded, so I expect that ninja will run 4 jobs at a time, and the load average will stabilize around 4.1. What happens instead is that ninja initially starts 4 jobs, but after a while it runs only 3 jobs at a time, and the load average stabilizes around 3.1, so most of a core goes unused.
Specifically, ninja initially runs 4 jobs at a time, until the load average reaches or slightly exceeds 4.0 (let's say 4.01). Then the next time a job finishes, we have load_capacity == 0, so a new job is not started and only 3 jobs are running. The load average then falls to, say, 3.99, and when the next job finishes, load_capacity == 1. This allows one new job to replace the one that just finished, and so we go on with just 3 jobs. We will continue to have load_capacity == 1 as long as the load average is greater than 3.0, which should remain true since we are running 3 jobs.
Switching to ceil() should fix that as well.
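The -j4 -l5 scenario above can be checked with a simplified model of the capacity calculation. This is a sketch under stated assumptions, not ninja's actual code: JobsToStart is a hypothetical function, the load average is passed in as a number, and the only inputs are the job limit, the number of running jobs, and the load limit.

```cpp
#include <cmath>

// Hypothetical, simplified model of the capacity check described above.
// parallelism: the -j value; running: jobs currently running;
// max_load: the -l value; load: the current load average;
// round_up: false models the current truncation, true models ceil().
int JobsToStart(int parallelism, int running, double max_load, double load,
                bool round_up) {
  int capacity = parallelism - running;
  double headroom = max_load - load;
  int load_capacity = round_up ? static_cast<int>(std::ceil(headroom))
                               : static_cast<int>(headroom);
  if (load_capacity < capacity)
    capacity = load_capacity;
  return capacity < 0 ? 0 : capacity;
}
```

With truncation, JobsToStart(4, 3, 5.0, 4.01, false) is 0, so the finished job is not replaced, and JobsToStart(4, 2, 5.0, 3.99, false) is 1, which only brings the count back to 3. With rounding up, JobsToStart(4, 3, 5.0, 4.01, true) is 1, so the build returns to 4 jobs.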
Code referenced above: ninja/src/build.cc, lines 629 to 631 in 0b4b43a.