helpers.conf(5) | Slurm Configuration File | helpers.conf(5) |
helpers.conf - Slurm configuration file for the helpers plugin.
helpers.conf is an ASCII file which defines parameters used by Slurm's "helpers" node feature plugin. The file will always be located in the same directory as the slurm.conf.
Parameter names are case insensitive. Any text following a "#" in the configuration file is treated as a comment through the end of that line. The size of each line in the file is limited to 1024 characters. Changes to the configuration file take effect upon restart of Slurm daemons, daemon receipt of the SIGHUP signal, or execution of the command "scontrol reconfigure" unless otherwise noted.
Features can be defined per node by creating a unique file for each node. The controller must have a helpers.conf that lists all possible helper features.
A single helpers.conf can be created that defines features for specific nodes by prepending NodeName=<nodelist> to the front of the Feature line. A Feature not prepended with NodeName will apply to all nodes.
# helpers.conf
NodeName=n1_[1-10] Feature=a1,a2 Helper=/path/helper.sh
NodeName=n2_[1-10] Feature=b1,b2 Helper=/path/helper.sh
Feature=c1,c2 Helper=/path/helper.sh
If a feature appears in the helpers.conf but is not assigned to a particular node there, and that node nevertheless lists the feature in the slurm.conf, the controller still treats it as a changeable/rebootable feature on that node. For example, if feature fa is defined on node node1 in the slurm.conf but is only listed on node2 in the helpers.conf, the feature will still trigger a reboot of node1 if it is not active there.
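The fa example above corresponds to a pair of files along the following lines. This is only an illustrative excerpt: the Helper path is a placeholder, and a real slurm.conf node definition would carry additional parameters.

```
# slurm.conf (hypothetical excerpt)
NodeFeaturesPlugins=node_features/helpers
NodeName=node1 Features=fa

# helpers.conf (hypothetical excerpt)
NodeName=node2 Feature=fa Helper=/path/helper.sh
```

Here node1 has no fa entry of its own in the helpers.conf, yet fa is still handled as a changeable feature on node1 because it is a helper feature known to the controller.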
The Helper is an arbitrary program or script that reports and modifies the feature set on a given node. The helpers are site-specific and are not included with Slurm. Features modified by the helpers require a reboot of the node using the RebootProgram. The Helper program/script must be executable by the SlurmdUser. The same program/script can be used to control multiple features. slurmd will execute the Helper in one of two ways:
1. Execute with no arguments to query the status of node features. It must exit with code 0 and print either a superset of the features expected by Slurm or nothing at all. Otherwise, the node will be drained.
2. Execute with a single argument of the feature to be activated on node reboot. In the case of multiple features the script is called multiple times.
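The query contract in item 1 can be checked before a Helper is deployed. The following is a minimal sketch of such a check, not part of Slurm: the function name check_helper_query is hypothetical, and it assumes the Helper prints one feature per line, which the description above does not specify.

```shell
#!/bin/bash
# Hypothetical pre-deployment check for a Helper; not shipped with Slurm.

# check_helper_query HELPER EXPECTED_CSV
# Returns 0 if HELPER, run with no arguments, exits with code 0 and prints
# either nothing or a superset of the comma-separated EXPECTED_CSV features.
check_helper_query() {
    local helper="$1" expected_csv="$2"
    local output feature
    local -a expected

    # The query call takes no arguments and must exit with code 0.
    if ! output="$("$helper")"; then
        echo "query failed: non-zero exit code" >&2
        return 1
    fi

    # Printing nothing is acceptable.
    if [ -z "$output" ]; then
        return 0
    fi

    # Otherwise every expected feature must appear in the output.
    # Assumption: the Helper prints one feature per line.
    IFS=',' read -r -a expected <<< "$expected_csv"
    for feature in "${expected[@]}"; do
        if ! grep -qxF -- "$feature" <<< "$output"; then
            echo "query output is missing feature: $feature" >&2
            return 1
        fi
    done
    return 0
}
```

For instance, check_helper_query /path/helper.sh a1,a2 would succeed only if the script exits 0 and prints nothing or at least a1 and a2. A failure here suggests that slurmd would drain the node.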
# slurm.conf
NodeFeaturesPlugins=node_features/helpers
# helpers.conf
Feature=nps1,nps2,nps4 Helper=/usr/local/bin/nps
Feature=mig=on Helper=/usr/local/bin/mig
Feature=mig=off Helper=/usr/local/bin/mig
MutuallyExclusive=nps1,nps2,nps4
MutuallyExclusive=mig=on,mig=off
ExecTime=60
BootTime=60
AllowUserBoot=user1,user2
#!/bin/bash

if [ "$1" = "nps1" ]; then
    echo "$1" > /etc/slurm/feature
elif [ "$1" = "nps2" ]; then
    echo "$1" > /etc/slurm/feature
elif [ "$1" = "nps4" ]; then
    echo "$1" > /etc/slurm/feature
else
    cat /etc/slurm/feature
fi
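The same helper could also be written with a case statement, which makes the two invocation modes easier to see. The sketch below is one possible variant, not the canonical script: HELPER_STATE_FILE is a hypothetical environment variable (not something Slurm sets) that allows the script to be tried without writing to /etc/slurm, and the function name nps_helper is likewise illustrative. Unlike the script above, it prints nothing rather than failing when no feature has been recorded yet.

```shell
#!/bin/bash
# Variant of the example helper. HELPER_STATE_FILE is a hypothetical
# override for testing; Slurm itself does not set it.
STATE_FILE="${HELPER_STATE_FILE:-/etc/slurm/feature}"

nps_helper() {
    case "${1:-}" in
        nps1|nps2|nps4)
            # Activation mode: record the feature to apply at the next reboot.
            echo "$1" > "$STATE_FILE"
            ;;
        *)
            # Query mode: report the recorded feature, or print nothing
            # if no feature has been recorded yet.
            if [ -r "$STATE_FILE" ]; then
                cat "$STATE_FILE"
            fi
            ;;
    esac
}

nps_helper "$@"
```

Running the script with nps2 as its only argument and then with no arguments should print nps2, mirroring the activate-then-query sequence described earlier.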
Copyright (C) 2021 NVIDIA CORPORATION. All rights reserved.
Copyright (C) 2021 SchedMD LLC.
This file is part of Slurm, a resource management program. For details, see <https://slurm.schedmd.com/>.
Slurm is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
Slurm is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
slurm.conf(5)
Slurm Configuration File | July 2024 |