Add a new Partition to a running CycleCloud SLURM cluster
Overview
Azure CycleCloud (CC) is a user-friendly platform that orchestrates High-Performance Computing (HPC) environments on Azure, enabling admins to set up infrastructure, job schedulers, filesystems and scale resources efficiently at any size. It’s designed for HPC administrators intent on deploying environments with specific schedulers.
SLURM, a widely-used HPC job scheduler, is notable for its open-source, scalable, fault-tolerant design, suitable for Linux clusters of any scale. SLURM manages user resources, workloads, accounting, monitoring, and supports parallel/distributed computing, organizing compute nodes into partitions.
This blog will specifically explain how to integrate a new partition into an active SLURM cluster within CycleCloud, without the need to terminate or restart the entire cluster.
Requirements/Versions:
CycleCloud Server (CC version used is 8.6.2)
CycleCloud CLI initialized on the CycleCloud VM
A Running Slurm Cluster
CycleCloud project used is 3.0.7
Slurm version used is 23.11.7-1
SSH and HTTPS access to CycleCloud VM
High Level Overview
Git clone the CC SLURM repo (not required if you already have a slurm template file)
Edit the Slurm template to add a new partition
Export parameters from the running SLURM cluster
Import the updated template file to the running cluster
Activate the new nodearray(s)
Update the cluster settings (VM size, core count, Image, etc)
Scale the cluster to create the nodes
Step 1: Git clone the CC SLURM repo
SSH into the CC VM and run the following commands:
sudo yum install -y git
git clone https://github.com/Azure/cyclecloud-slurm.git
cd cyclecloud-slurm/templates
ll
Step 2: Edit the SLURM template to add new partition(s)
Make a copy of the “slurm.txt” template file and edit the copy with your editor of choice (e.g. vi, vim, nano, VS Code remote):
cp slurm.txt slurm-part.txt
vim slurm-part.txt
In the template file, a nodearray is the CC configuration unit that maps to a SLURM partition. There are 3 nodearrays defined in the default template:
hpc: tightly coupled MPI workloads with Infiniband (slurm.hpc = true)
htc: massively parallel throughput jobs w/o Infiniband (slurm.hpc = false)
dynamic: enables multiple VM types in the same partition
Choose the nodearray type for the new partition (hpc or htc) and duplicate its [[nodearray …]] config section. For example, to create a new nodearray named “GPU” based on the hpc nodearray (NOTE: the hpc nodearray config is included for reference):
[[nodearray hpc]]
Extends = nodearraybase
MachineType = $HPCMachineType
ImageName = $HPCImageName
MaxCoreCount = $MaxHPCExecuteCoreCount
Azure.MaxScalesetSize = $HPCMaxScalesetSize
AdditionalClusterInitSpecs = $HPCClusterInitSpecs
EnableNodeHealthChecks = $EnableNodeHealthChecks
[[[configuration]]]
slurm.default_partition = true
slurm.hpc = true
slurm.partition = hpc
[[nodearray GPU]]
Extends = nodearraybase
MachineType = $GPUMachineType
ImageName = $GPUImageName
MaxCoreCount = $MaxGPUExecuteCoreCount
Azure.MaxScalesetSize = $HPCMaxScalesetSize
AdditionalClusterInitSpecs = $GPUClusterInitSpecs
EnableNodeHealthChecks = $EnableNodeHealthChecks
[[[configuration]]]
slurm.default_partition = false
slurm.hpc = true
slurm.partition = gpu
slurm.use_pcpu = false
NOTE: there can only be 1 “slurm.default_partition” and by default it is the HPC nodearray. Set the new one to false, or if you set it to true then change the HPC nodearray to false.
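The single-default rule in the note above can be checked before importing. The following is a minimal sketch, assuming the template sets the flag on its own line as in the excerpt; the function names are hypothetical helpers, not part of CycleCloud or the cyclecloud-slurm project.

```shell
# Hypothetical helpers (not shipped with CycleCloud): verify that exactly one
# nodearray in a template sets slurm.default_partition = true.

# Print how many uncommented lines set slurm.default_partition to true.
count_default_partitions() (
    template="$1"
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    grep -Eic '^[[:space:]]*slurm\.default_partition[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$template" || true
)

# Return 0 when the template has exactly one default partition, 1 otherwise.
check_default_partition() (
    n=$(count_default_partitions "$1")
    if [ "$n" -eq 1 ]; then
        echo "OK: exactly one default partition"
    else
        echo "WARNING: found $n default partitions, expected 1" >&2
        exit 1
    fi
)

# Example:
#   check_default_partition slurm-part.txt
```

A warning here means the hpc and GPU nodearrays should be revisited before the template is imported.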
The “variables” in the nodearray config (e.g. $GPUMachineType) are referred to as “Parameters” in CC. Parameters are attributes exposed in the CC GUI to enable per-cluster customization. Further down in the template file, the Parameters configuration starts with the [parameters About] section. We need to add several configuration blocks throughout this section to correspond to the Parameters referenced in the new nodearray (e.g. $GPUMachineType).
Add the GPUMachineType from HPCMachineType:
[[[parameter HPCMachineType]]]
Label = HPC VM Type
Description = The VM type for HPC execute nodes
ParameterType = Cloud.MachineType
DefaultValue = Standard_F2s_v2
[[[parameter GPUMachineType]]]
Label = GPU VM Type
Description = The VM type for GPU execute nodes
ParameterType = Cloud.MachineType
DefaultValue = Standard_F2s_v2
Add the MaxGPUExecuteCoreCount from MaxHPCExecuteCoreCount:
[[[parameter MaxHPCExecuteCoreCount]]]
Label = Max HPC Cores
Description = The total number of HPC execute cores to start
DefaultValue = 100
Config.Plugin = pico.form.NumberTextBox
Config.MinValue = 1
Config.IntegerOnly = true
[[[parameter MaxGPUExecuteCoreCount]]]
Label = Max GPU Cores
Description = The total number of GPU execute cores to start
DefaultValue = 100
Config.Plugin = pico.form.NumberTextBox
Config.MinValue = 1
Config.IntegerOnly = true
Add the GPUImageName from HPCImageName:
[[[parameter HPCImageName]]]
Label = HPC OS
ParameterType = Cloud.Image
Config.OS = linux
DefaultValue = almalinux8
Config.Filter := Package in {"cycle.image.centos7", "cycle.image.ubuntu20", "cycle.image.ubuntu22", "cycle.image.sles15-hpc", "almalinux8"}
[[[parameter GPUImageName]]]
Label = GPU OS
ParameterType = Cloud.Image
Config.OS = linux
DefaultValue = almalinux8
Config.Filter := Package in {"cycle.image.centos7", "cycle.image.ubuntu20", "cycle.image.ubuntu22", "cycle.image.sles15-hpc", "almalinux8"}
Add the GPUClusterInitSpecs from HPCClusterInitSpecs:
[[[parameter HPCClusterInitSpecs]]]
Label = HPC Cluster-Init
DefaultValue = =undefined
Description = Cluster init specs to apply to HPC execute nodes
ParameterType = Cloud.ClusterInitSpecs
[[[parameter GPUClusterInitSpecs]]]
Label = GPU Cluster-Init
DefaultValue = =undefined
Description = Cluster init specs to apply to GPU execute nodes
ParameterType = Cloud.ClusterInitSpecs
NOTE: Keep in mind that you can customize the “DefaultValue” for parameters as per your requirements, or alternatively, you can make changes directly within the CycleCloud graphical user interface.
Save the template file and exit (e.g. :wq for vi/vim).
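Since every $Parameter referenced in a nodearray needs a matching [[[parameter …]]] block, a quick cross-check of the saved template can catch a missed one. This is a rough sketch under stated assumptions: it only recognizes simple $Name references and parameter section headers, and the function name is hypothetical rather than anything provided by CycleCloud.

```shell
# Hypothetical helper: list every $Name referenced in a template that has no
# matching "parameter Name" section. Prints nothing when all are defined.
undefined_parameters() (
    template="$1"
    refs=$(mktemp)
    defs=$(mktemp)
    # Collect referenced names, e.g. "$GPUMachineType" -> "GPUMachineType".
    grep -oE '\$[A-Za-z_][A-Za-z0-9_]*' "$template" | tr -d '$' | sort -u > "$refs"
    # Collect defined names from headers such as "[[[parameter GPUMachineType]]]".
    grep -oE '\[parameter[[:space:]]+[A-Za-z_][A-Za-z0-9_]*\]' "$template" \
        | sed -E 's/^\[parameter[[:space:]]+//; s/\]$//' | sort -u > "$defs"
    # Names present in the first list but not in the second.
    comm -23 "$refs" "$defs"
    rm -f "$refs" "$defs"
)

# Example:
#   undefined_parameters slurm-part.txt
```

Any name it prints (for example GPUImageName) points at a parameter block that still needs to be added to the template.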
Step 3: Export parameters from the running SLURM cluster
You now have an updated SLURM template file that adds a new GPU partition. The template will need to be “imported” into CycleCloud to overwrite the existing cluster definition. Before doing that, however, we need to export all the current cluster GUI parameter configs from the cluster into a local json file to use in the import process. Without this json file, the cluster configs are all reset to the default values specified in the template file, overwriting any customizations applied to the cluster in the GUI.
From the CycleCloud VM run the following command format:
cyclecloud export_parameters cluster_name > file_name.json
For my cluster the specific command is:
cyclecloud export_parameters jm-slurm-test > jm-slurm-test-params.json
cat jm-slurm-test-params.json
{
"UsePublicNetwork" : false,
"configuration_slurm_accounting_storageloc" : null,
"AdditionalNFSMountOptions" : null,
"About shared" : null,
"NFSSchedAddress" : null,
"loginMachineType" : "Standard_D8as_v4",
"DynamicUseLowPrio" : false,
"configuration_slurm_accounting_password" : null,
"Region" : "southcentralus",
"MaxHPCExecuteCoreCount" : 240,
"NumberLoginNodes" : 0,
"HTCImageName" : "cycle.image.ubuntu22",
"MaxHTCExecuteCoreCount" : 10,
"AdditionalNFSExportPath" : "/data",
"DynamicClusterInitSpecs" : null,
"About shared part 2" : null,
"HPCImageName" : "cycle.image.ubuntu22",
"SchedulerClusterInitSpecs" : null,
"SchedulerMachineType" : "Standard_D4as_v4",
"NFSSchedDiskWarning" : null,
…<truncated>
}
If the cyclecloud command does not work, you may need to initialize the CLI tool as described in the docs: https://learn.microsoft.com/en-us/azure/cyclecloud/how-to/install-cyclecloud-cli?view=cyclecloud-8#initialize-cyclecloud-cli
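Because the export writes to a file through a shell redirect, a failed export still leaves an empty file behind. A small sanity check before moving on can help; the sketch below is a hypothetical helper that only confirms the file is non-empty and starts like a JSON object, not that it is fully valid JSON.

```shell
# Hypothetical helper: basic sanity check on an exported parameter file.
params_file_looks_valid() (
    file="$1"
    # The shell creates the redirect target even if the export command fails,
    # so an empty file is the usual sign that the export did not work.
    [ -s "$file" ] || { echo "ERROR: $file is missing or empty" >&2; exit 1; }
    # The first non-blank character of a JSON object is "{".
    first=$(tr -d '[:space:]' < "$file" | cut -c1)
    [ "$first" = "{" ] || { echo "ERROR: $file does not look like a JSON object" >&2; exit 1; }
    echo "OK: $file"
)

# Example:
#   params_file_looks_valid jm-slurm-test-params.json
```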
Step 4: Import the updated template file to the running cluster
To import the updated template to the running cluster in CycleCloud run the following command format:
cyclecloud import_cluster <cluster_name> -c Slurm -f <template file name> -p <parameter file name> --force
For my cluster the specific command is:
cyclecloud import_cluster jm-slurm-test -c Slurm -f slurm-part.txt -p jm-slurm-test-params.json --force
In the CycleCloud GUI we can now see the “gpu” nodearray has been added. Click on the “Arrays” tab in the middle panel to view it.
The gpu nodearray is added to the cluster but it is not yet “Activated,” which means it is not yet available for use.
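Since importing without the parameter file resets the cluster to the template defaults, it may be worth wrapping the import command in a guard. The sketch below is one possible approach: the function and the DRY_RUN variable are hypothetical and are not cyclecloud options; the command it runs is the same one used above.

```shell
# Hypothetical wrapper around the import command shown above. It refuses to
# run when the template or the exported parameter file is missing.
# DRY_RUN=1 (an illustrative variable, not a cyclecloud option) prints the
# command instead of executing it.
import_with_params() (
    cluster="$1"
    template="$2"
    params="$3"
    [ -f "$template" ] || { echo "ERROR: template $template not found" >&2; exit 1; }
    [ -s "$params" ] || { echo "ERROR: parameter file $params missing or empty" >&2; exit 1; }
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "cyclecloud import_cluster $cluster -c Slurm -f $template -p $params --force"
    else
        cyclecloud import_cluster "$cluster" -c Slurm -f "$template" -p "$params" --force
    fi
)

# Example:
#   DRY_RUN=1 import_with_params jm-slurm-test slurm-part.txt jm-slurm-test-params.json
```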
Step 5: Activate the new nodearray(s)
The cyclecloud start_cluster command will now kickstart the new nodearray activation using the following format:
cyclecloud start_cluster <cluster_name>
For my cluster the command is:
cyclecloud start_cluster jm-slurm-test
In the CycleCloud GUI we will see the gpu nodearray status move to “Activation” and finally “Activated”.
Step 6: Update the cluster settings
Edit the cluster settings in the CycleCloud GUI to pick the “GPU VM Type” and “Max GPU Cores” in the “Required Settings” section.
Update the “GPU OS” and “GPU Cluster-Init” as needed in the “Advanced Settings” section.
Step 7: Scale the cluster to create the nodes
To this point we have added the new nodearray to CycleCloud, but SLURM does not yet know about the new GPU partition. We can see this from the scheduler VM with the sinfo command, which does not yet list the gpu partition.
The final step is to “scale” the cluster to “pre-define” the compute nodes as needed by SLURM. The CycleCloud azslurm scale command, run on the scheduler VM, will accomplish this.
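The check-then-scale sequence described above can be sketched as follows. The has_partition function is a hypothetical helper; the sinfo options used in the comments (-h for no header, -o '%P' for the partition name) are standard, and the azslurm invocation assumes no extra options are needed and that it is run with root privileges on the scheduler VM.

```shell
# Hypothetical helper: succeed if the named partition appears in a list of
# partition names read from stdin (one per line).
# sinfo marks the default partition with a trailing "*", which is stripped
# before comparing.
has_partition() {
    sed 's/\*$//' | grep -qxF -- "$1"
}

# Usage on the scheduler VM (assumption: run as root):
#   sinfo -h -o '%P' | has_partition gpu || azslurm scale
#   sinfo    # the gpu partition and its nodes should now be listed
```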
Your cluster is now ready to use the new GPU partition.
SUMMARY
Adding a new partition to SLURM with Azure CycleCloud is a flexible and efficient way to update your cluster and leverage different types of compute nodes. You can follow the steps outlined in this article to create a new nodearray, configure the cluster settings, and scale the cluster to match the SLURM partition. By using CycleCloud and SLURM, you can optimize your cluster performance and resource utilization.
References:
CycleCloud Documentation
CycleCloud-SLURM Github repository
Microsoft Training for SLURM on Azure CycleCloud