This module is a wrapper around the slurm-controller-hybrid module by SchedMD, which is part of the slurm-gcp GitHub repository. The hybrid module creates the configurations needed to extend an on-premise Slurm cluster with one or more Google Cloud bursting partitions. These partitions create the requested nodes in a GCP project on demand and scale them down after a period of not being used, in the same way as the schedmd-slurm-gcp-v5-controller module auto-scales VMs.
Further documentation on how to use this module when deploying a hybrid Slurm cluster can be found in our docs. There, you can find two tutorials. The first tutorial walks you through deploying a test environment entirely in GCP that is designed to demonstrate the capabilities without needing to make any changes to your local Slurm cluster. The second tutorial goes through the process of deploying the hybrid configuration onto an on-premise Slurm cluster.
NOTE: This is an experimental module and the functionality and documentation will likely be updated in the near future. This module has only been tested in limited capacity with the Cluster Toolkit. On-premise Slurm configurations can vary significantly, so this module should be used as a starting point, not a complete solution.
The slurm-controller-hybrid module is intended to be run on the controller of the on-premise Slurm cluster, meaning that terraform init and terraform apply are executed against the deployment directory on that machine. This allows the module to infer settings such as the slurm user and user ID when setting permissions for the created configurations.
If you are unable to install terraform and other dependencies on the controller directly, it is possible to deploy the hybrid module in a separate build environment and copy the created configurations to the on-premise controller manually. This will require additional configuration and verification of permissions. For more information see the hybrid.md documentation on slurm-gcp.
NOTE: The hybrid module requires the following dependencies to be installed on the system deploying the module:
- terraform
- addict
- httplib2
- pyyaml
- google-api-python-client
- google-cloud-pubsub
- A full list of recommended python packages is available in a requirements.txt file in the slurm-gcp repo.
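As a rough pre-flight check for the dependency list above, the following sketch reports which items cannot be found on the machine that will deploy the module. The mapping from package names to importable module names (for example pyyaml to yaml) is an assumption based on how these packages are commonly imported, and the helper names are hypothetical; the requirements.txt file in the slurm-gcp repo remains the authoritative list.

```python
import importlib.util
import shutil

# Assumed mapping from the package names listed above to the module names
# they are usually imported under; adjust it if your environment differs.
PYTHON_DEPENDENCIES = {
    "addict": "addict",
    "httplib2": "httplib2",
    "pyyaml": "yaml",
    "google-api-python-client": "googleapiclient",
    "google-cloud-pubsub": "google.cloud.pubsub_v1",
}


def module_available(module_name):
    """Return True if module_name can be located on this system."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # A missing parent package (e.g. "google") raises instead of
        # returning None, so treat that as "not available".
        return False


def missing_dependencies(dependencies=PYTHON_DEPENDENCIES, executables=("terraform",)):
    """Return the names of packages and executables that could not be found."""
    missing = [pkg for pkg, mod in dependencies.items() if not module_available(mod)]
    missing += [exe for exe in executables if shutil.which(exe) is None]
    return missing


if __name__ == "__main__":
    missing = missing_dependencies()
    print("Missing: " + ", ".join(missing) if missing else "All listed dependencies found.")
```

Run it with the same Python interpreter and user account that will execute the hybrid scripts, since a package installed for a different interpreter or user would not be visible to them.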
This module does not complete the installation of hybrid partitions on your Slurm cluster. After deploying, you must follow the steps listed in the hybrid.md documentation under manual steps.
The hybrid module can be added to a blueprint as follows:
- id: slurm-controller
  source: ./community/modules/scheduler/schedmd-slurm-gcp-v5-hybrid
  use:
  - debug-partition
  - compute-partition
  - pre-existing-storage
  settings:
    output_dir: ./hybrid
    slurm_bin_dir: /usr/local/bin
    slurm_control_host: static-controller
This defines an HPC module that creates a hybrid configuration with the following attributes:
- 2 partitions defined in previous modules with the IDs of debug-partition and compute-partition. These are the same partition modules used by schedmd-slurm-gcp-v5-controller.
- Network storage to be mounted on the compute nodes when created, defined in pre-existing-storage.
- output_dir set to ./hybrid. This is where the hybrid configurations will be created.
- slurm_bin_dir located at /usr/local/bin. Set this to wherever the Slurm executables are installed on your system.
- slurm_control_host: The name of the on-premise host is provided to the module for configuring NFS mounts and communicating with the controller after VM creation.
Shared directories from the controller: By default, the following directories are NFS mounted from the on premise controller to the created cloud VMs:
- /home
- /opt/apps
- /etc/munge
- /usr/local/slurm/etc
The expectation is that these directories exist on the controller and that all files required by slurmd to be in sync with the controller are in those directories.
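Before deploying, it can help to confirm that the default directories listed above are present on the controller. The sketch below only tests for existence; it does not verify that the directories are exported over NFS or that their contents are complete. The function name and the root parameter are hypothetical and exist only for illustration.

```python
import os

# Default directories listed above that are NFS mounted from the controller.
DEFAULT_SHARED_DIRS = ("/home", "/opt/apps", "/etc/munge", "/usr/local/slurm/etc")


def missing_shared_dirs(directories=DEFAULT_SHARED_DIRS, root="/"):
    """Return the directories that do not exist under root.

    root is a hypothetical convenience so the check can be pointed at a
    staged copy of the controller's file system instead of "/".
    """
    missing = []
    for directory in directories:
        candidate = os.path.join(root, directory.lstrip("/"))
        if not os.path.isdir(candidate):
            missing.append(directory)
    return missing


if __name__ == "__main__":
    for directory in missing_shared_dirs():
        print(f"Not found on this host: {directory}")
```

An empty result means all four paths exist on the host where the script ran. It says nothing about NFS export settings, which still need to be checked separately.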
If this does not match your Slurm cluster, these directories can be overridden with a custom NFS mount using pre-existing-network-storage or by setting the network_storage variable directly in the hybrid module. Any value in network_storage, added directly or with use, will override the default directories above.
Setting the variable disable_default_mounts to true disables these default mounts. Note that, at a minimum, the cloud VMs require /etc/munge and /usr/local/slurm/etc to be mounted from the controller. Those mounts will need to be managed manually if disable_default_mounts is set to true.
Power Saving Logic: The cloud partitions will make use of the power saving logic, and the suspend and resume programs will be set. If any local partitions also make use of these slurm.conf variables, a conflict will likely occur. There is currently no support for partition-level suspend and resume scripts. Either the local partitions must stop using these settings, or the hybrid module will not work.
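One way to spot a likely conflict ahead of time is to look for existing SuspendProgram and ResumeProgram entries in the local slurm.conf. The following is a minimal sketch under some assumptions: it reads a single file's text, ignores anything after a # comment marker, and does not follow Include directives. The function name is hypothetical.

```python
POWER_SAVE_KEYS = ("suspendprogram", "resumeprogram")


def find_power_save_settings(slurm_conf_text):
    """Return {key: value} for SuspendProgram/ResumeProgram found in the text."""
    found = {}
    for raw_line in slurm_conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        # slurm.conf parameter names are case-insensitive.
        if key.strip().lower() in POWER_SAVE_KEYS:
            found[key.strip()] = value.strip()
    return found


if __name__ == "__main__":
    sample = "ClusterName=onprem\nSuspendProgram=/opt/local/suspend.sh\n"
    print(find_power_save_settings(sample))
```

A non-empty result for the local configuration suggests that the local partitions already rely on power saving, which is the situation described above as likely to conflict with the hybrid module.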
Slurm versions: The version of Slurm on the on-premise cluster must match the Slurm version on the cloud VMs created by the hybrid partitions. The version on the cloud VMs is dictated by the version on the disk image, which can be set when defining the partitions using schedmd-slurm-gcp-v5-partition.
If the publicly available images do not suffice, slurm-gcp provides packer templates for creating custom disk images.
SchedMD only supports the current and last major version of Slurm. We therefore strongly advise using only versions 21 or 22 with this module. Attempting to use this module with any version older than 21 may lead to unexpected results.
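The version requirement can be checked with a small comparison like the one below. It assumes that "matching" means sharing the same release, taken here as the first two numeric components of the version string, and that version strings look like "slurm 22.05.8" or "22.05.8". Both are assumptions made for illustration; stricter sites may prefer to compare the full version string.

```python
import re


def slurm_release(version_text):
    """Extract (major, minor) from text such as 'slurm 22.05.8' or '22.05.8'."""
    match = re.search(r"(\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError(f"no Slurm version found in {version_text!r}")
    return int(match.group(1)), int(match.group(2))


def same_release(on_prem_version, image_version):
    """Return True when both versions share the same major.minor release."""
    return slurm_release(on_prem_version) == slurm_release(image_version)


if __name__ == "__main__":
    # Hypothetical version strings for the on-premise cluster and the image.
    print(same_release("slurm 22.05.8", "22.05.3"))
```

The two inputs would come from the on-premise cluster and from the disk image chosen for the partitions. How they are obtained is left to the operator.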
Copyright 2022 Google LLC
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Name | Version |
---|---|
terraform | >= 0.14.0 |
No providers.
Name | Source | Version |
---|---|---|
slurm_controller_instance | github.com/GoogleCloudPlatform/slurm-gcp.git//terraform/slurm_cluster/modules/slurm_controller_hybrid | 5.11.1 |
No resources.
Name | Description | Type | Default | Required |
---|---|---|---|---|
cloud_parameters | cloud.conf options. | object({ ... }) | { ... } | no |
compute_startup_script | Startup script used by the compute VMs. | string | "" | no |
compute_startup_scripts_timeout | The timeout (seconds) applied to the compute_startup_script. If any script exceeds this timeout, then the instance setup process is considered failed and handled accordingly. NOTE: When set to 0, the timeout is considered infinite and thus disabled. | number | 300 | no |
deployment_name | Name of the deployment. | string | n/a | yes |
disable_default_mounts | Disable default global network storage from the controller: /usr/local/etc/slurm, /etc/munge, /home, /apps. If these are disabled, the slurm etc and munge dirs must be added manually, or some other mechanism must be used to synchronize the slurm conf files and the munge key across the cluster. | bool | false | no |
enable_bigquery_load | Enables loading of cluster job usage into BigQuery. NOTE: Requires Google BigQuery API. | bool | false | no |
enable_cleanup_compute | Enables automatic cleanup of compute nodes and resource policies (e.g. placement groups) managed by this module, when cluster is destroyed. NOTE: Requires Python and script dependencies. WARNING: Toggling this may impact the running workload. Deployed compute nodes may be destroyed and their jobs will be requeued. | bool | false | no |
enable_cleanup_subscriptions | Enables automatic cleanup of pub/sub subscriptions managed by this module, when cluster is destroyed. NOTE: Requires Python and script dependencies. WARNING: Toggling this may temporarily impact var.enable_reconfigure behavior. | bool | false | no |
enable_devel | Enables development mode. Not for production use. | bool | false | no |
enable_reconfigure | Enables automatic Slurm reconfigure when Slurm configuration changes (e.g. slurm.conf.tpl, partition details). Compute instances and resource policies (e.g. placement groups) will be destroyed to align with new configuration. NOTE: Requires Python and Google Pub/Sub API. WARNING: Toggling this will impact the running workload. Deployed compute nodes will be destroyed and their jobs will be requeued. | bool | false | no |
enable_slurm_gcp_plugins | Enables calling hooks in scripts/slurm_gcp_plugins during cluster resume and suspend. | bool | false | no |
epilog_scripts | List of scripts to be used for Epilog. Programs for the slurmd to execute on every node when a user's job completes. See https://slurm.schedmd.com/slurm.conf.html#OPT_Epilog. | list(object({ ... })) | [] | no |
google_app_cred_path | Path to Google Application Credentials. | string | null | no |
install_dir | Directory where the hybrid configuration directory will be installed on the on-premise controller. This updates the prefix path for the resume and suspend scripts in the generated cloud.conf file. The value defaults to output_dir if not specified. | string | null | no |
munge_mount | Remote munge mount for compute and login nodes to acquire the munge.key. By default, the munge mount server will be assumed to be the var.slurm_control_host (or var.slurm_control_addr if non-null) when server_ip=null. | object({ ... }) | { ... } | no |
network_storage | An array of network attached storage mounts to be configured on all instances. | list(object({ ... })) | [] | no |
output_dir | Directory where this module will write its files to. These files include: cloud.conf; cloud_gres.conf; config.yaml; resume.py; suspend.py; and util.py. If not specified explicitly, this will also be used as the default value for the install_dir variable. | string | null | no |
partition | Cluster partitions as a list. | list(object({ ... })) | [] | no |
project_id | Project ID to create resources in. | string | n/a | yes |
prolog_scripts | List of scripts to be used for Prolog. Programs for the slurmd to execute whenever it is asked to run a job step from a new job allocation. See https://slurm.schedmd.com/slurm.conf.html#OPT_Prolog. | list(object({ ... })) | [] | no |
slurm_bin_dir | Path to directory of Slurm binary commands (e.g. scontrol, sinfo). If 'null', then it will be assumed that binaries are in $PATH. | string | null | no |
slurm_cluster_name | Cluster name, used for resource naming and slurm accounting. If not provided it will default to the first 8 characters of the deployment name (removing any invalid characters). | string | null | no |
slurm_control_addr | The IP address or a name by which the address can be identified. This value is passed to slurm.conf such that: SlurmctldHost={var.slurm_control_host}({var.slurm_control_addr}). See https://slurm.schedmd.com/slurm.conf.html#OPT_SlurmctldHost | string | null | no |
slurm_control_host | The short, or long, hostname of the machine where Slurm control daemon is executed (i.e. the name returned by the command "hostname -s"). This value is passed to slurm.conf such that: SlurmctldHost={var.slurm_control_host}({var.slurm_control_addr}). See https://slurm.schedmd.com/slurm.conf.html#OPT_SlurmctldHost | string | n/a | yes |
slurm_control_host_port | The port number that the Slurm controller, slurmctld, listens to for work. See https://slurm.schedmd.com/slurm.conf.html#OPT_SlurmctldPort | string | null | no |
slurm_log_dir | Directory where Slurm logs to. | string | "/var/log/slurm" | no |
No outputs.