
[FEATURE] Enable exclusive scheduling by default #194

Open
cartalla opened this issue Jan 22, 2024 · 0 comments
Is your feature request related to a problem? Please describe.

Currently, users specify core and memory requirements for jobs so that Slurm can pick the best compute node instance type for the job. This works by running a job like:

srun -c 1 --mem 1G toolname

BUT, if I really don't want multiple jobs per instance, what do I do? Here's the issue. Let's say a job requires 2 cores and 500G of memory. It lands on an r7 instance due to the memory requirement. That instance then has idle cores and unused memory, so Slurm schedules a long-running job with 1 core and 2G of memory on it. After the large-memory job finishes, the low-memory job is left running on an expensive instance that is severely underutilized.
If many jobs can be packed onto the instance, utilization may be acceptable, but as the cluster scales down the instance may be left running in an underutilized state.

Some analysis of real workloads seems to indicate that the optimal scheduling algorithm for cost is to not share compute nodes between jobs. So what I effectively want is to specify the job requirements for the purpose of instance type selection, but always use the whole node. Using -N 1 doesn't reserve the whole node, but --exclusive does; in fact it overrides -c and allocates all of the cores on the node to the job. So that handles the scheduling issue. The only remaining limitation is that the job can only use the requested memory when it could technically use all of it. Not a huge issue.
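
As a hedged sketch of what that looks like at submission time today (toolname and the 2-core/500G numbers are just the example from above), the resource flags still drive instance type selection while --exclusive claims the whole node:

srun -c 2 --mem 500G --exclusive toolname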

Describe the solution you'd like
One suggestion is to configure the cluster without memory-based scheduling. In this case, Slurm will use the memory request to pick a compute node with the required memory, but will not treat memory as a consumable resource. Combined with the --exclusive option, this would prevent over-subscription of memory.
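
For concreteness, a minimal sketch of the slurm.conf direction this implies, assuming the cluster uses select/cons_tres; the partition name is purely illustrative, and OverSubscribe=EXCLUSIVE is one possible way to force whole-node allocation without requiring every user to pass --exclusive:

SelectType=select/cons_tres
SelectTypeParameters=CR_Core                                        # cores are consumable; memory is not tracked as a consumable resource
PartitionName=exclusive-example Nodes=ALL OverSubscribe=EXCLUSIVE   # hypothetical partition: every job gets whole nodes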

The remaining issue then would be what happens when a large-memory instance is already running and a low-memory job is submitted. I think Slurm would allocate the job exclusively to the running instance instead of powering up a lower-memory compute node. I don't know if there is an option to get Slurm to pick the best-fitting node regardless of power state.

cartalla self-assigned this Jan 22, 2024