slurm_submit_workers - submit work_queue_worker to a SLURM cluster.
slurm_submit_workers [options] <servername> <port> <num-workers>
slurm_submit_workers schedules the execution of
work_queue_worker(1) on the SLURM batch system through its job
submission interface, sbatch. The number of work_queue_worker instances
scheduled and run is given by the num-workers argument.
The servername and port arguments specify the
hostname and port number of the manager that each work_queue_worker
connects to. These two arguments become optional when the auto mode option
(-a) is specified for work_queue_worker.
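Conceptually, each submission wraps the work_queue_worker command in a small
batch script and hands that script to sbatch once per requested worker. The
following is a minimal sketch of an equivalent manual submission; the wrapper
file name and the exact worker arguments are illustrative assumptions, not the
script the tool actually generates:
#!/bin/sh
# worker.sh - illustrative wrapper that runs one worker
work_queue_worker -t 900 manager.somewhere.edu 9123
# approximate equivalent of: slurm_submit_workers manager.somewhere.edu 9123 10
for i in $(seq 1 10); do sbatch worker.sh; done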
- -M <name>
- Name of the preferred manager for the worker.
- -c <cores>
- Set the number of cores each worker should use (0=auto). (default=1)
- -C <catalog>
- Set catalog server for work_queue_worker to <catalog>.
<catalog> format: HOSTNAME:PORT.
- -a
- Enable auto mode, in which the worker locates its manager through the
catalog server rather than from the <servername> and <port> arguments.
- -t <seconds>
- Abort work_queue_worker after this amount of idle time
(default=900s).
- -d <subsystem>
- Enable debugging on worker for this subsystem (try -d all to start).
- -w <size>
- Set the TCP window size.
- -i <time>
- Set initial value for backoff interval when worker fails to connect to a
manager. (default=1s)
- -b <time>
- Set maximum value for backoff interval when worker fails to connect to a
manager. (default=60s)
- -z <size>
- Set available disk space threshold (in MB). When exceeded, the worker will
clean up and reconnect. (default=100MB)
- -A <arch>
- Set architecture string for the worker to report to manager instead of the
value in uname.
- -O <os>
- Set operating system string for the worker to report to manager instead of
the value in uname.
- -s <path>
- Set the location for creating the working directory of the worker.
- -p <parameters>
- SLURM sbatch parameters.
- --scratch-dir=<path>
- Set the scratch directory location created on the local machine.
(default=${USER}-workers)
- -h
- Show help message.
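For instance, several of the options above can be combined in a single
submission; the values shown here are illustrative, not defaults:
slurm_submit_workers -c 4 -t 3600 -d all manager.somewhere.edu 9123 5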
On success, returns zero. On failure, returns non-zero.
Submit 10 worker instances to run on SLURM and connect to a
specific manager:
slurm_submit_workers manager.somewhere.edu 9123 10
Submit 10 work_queue_worker instances to run on SLURM in auto mode,
with their preferred manager name set to Project_A and an abort timeout of
3600 seconds:
slurm_submit_workers -a -t 3600 -M Project_A 10
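Submit 10 worker instances while forwarding extra scheduler arguments to
sbatch through -p (the partition and time limit shown are illustrative
placeholders):
slurm_submit_workers -p "--partition=debug --time=01:00:00" manager.somewhere.edu 9123 10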
The Cooperative Computing Tools are Copyright (C) 2022 The
University of Notre Dame. This software is distributed under the GNU General
Public License. See the file COPYING for details.
- Cooperative Computing Tools Documentation
- Work Queue User Manual
- work_queue_worker(1), work_queue_status(1), work_queue_factory(1),
condor_submit_workers(1), uge_submit_workers(1), torque_submit_workers(1)