parsl.providers.SlurmProvider
- class parsl.providers.SlurmProvider(partition: str | None = None, account: str | None = None, qos: str | None = None, constraint: str | None = None, clusters: str | None = None, nodes_per_block: int = 1, cores_per_node: int | None = None, mem_per_node: int | None = None, gpus_per_node: int | None = None, gres: str | None = None, init_blocks: int = 1, min_blocks: int = 0, max_blocks: int = 1, parallelism: float = 1, walltime: str = '00:10:00', scheduler_options: str = '', regex_job_id: str = 'Submitted batch job (?P<id>\\S*)', worker_init: str = '', cmd_timeout: int = 10, status_batch_size: int = 50, exclusive: bool = True, launcher: Launcher = SingleNodeLauncher())[source]
Slurm Execution Provider
This provider uses sbatch to submit jobs, sacct to query their status, and scancel to cancel them. The sbatch script it submits is created from a template file in this same module.
- Parameters:
partition (str) – Slurm partition to request blocks from. If unspecified or None, no partition slurm directive will be specified.
account (str) – Slurm account to which to charge resources used by the job. If unspecified or None, the job will use the user’s default account.
qos (str) – Slurm queue to place the job in. If unspecified or None, no queue slurm directive will be specified.
constraint (str) – Slurm job constraint, often used to choose cpu or gpu type. If unspecified or None, no constraint slurm directive will be added.
clusters (str) – Slurm cluster name, or comma-separated cluster list, used to choose between different clusters in a federated Slurm instance. If unspecified or None, no slurm directive for clusters will be added.
nodes_per_block (int) – Nodes to provision per block.
cores_per_node (int) – Specify the number of cores to provision per node. If set to None, executors will assume all cores on the node are available for computation. Default is None.
mem_per_node (int) – Specify the real memory to provision per node in GB. If set to None, no explicit request to the scheduler will be made. Default is None.
gpus_per_node (int) – Specify the number of GPUs to provision per node. If set to None, executors request no GPUs. Default is None.
gres (str) – Slurm GRES options to request arbitrary resources. Refer to the Slurm documentation for details.
init_blocks (int) – Number of blocks to provision at the start of the run. Default is 1.
min_blocks (int) – Minimum number of blocks to maintain.
max_blocks (int) – Maximum number of blocks to maintain.
parallelism (float) – Ratio of provisioned task slots to active tasks. A parallelism value of 1 represents aggressive scaling where as many resources as possible are used; parallelism close to 0 represents the opposite situation in which as few resources as possible (i.e., min_blocks) are used.
walltime (str) – Walltime requested per block in HH:MM:SS.
scheduler_options (str) – String to prepend to the #SBATCH blocks in the submit script to the scheduler.
regex_job_id (str) – The regular expression used to extract the job ID from the sbatch standard output. The default is r"Submitted batch job (?P<id>\S*)", where id is the regular expression symbolic group for the job ID.
worker_init (str) – Command to be run before starting a worker, such as ‘module load Anaconda; source activate env’.
cmd_timeout (int (Default = 10)) – Number of seconds to wait for slurm commands to finish. For schedulers with many jobs this may need to be increased to wait longer for scheduler information.
status_batch_size (int (Default = 50)) – Number of jobs to batch together in calls to the scheduler status. For schedulers with many jobs this may need to be decreased to get jobs in smaller batches.
exclusive (bool (Default = True)) – Requests nodes which are not shared with other running jobs.
launcher (Launcher) – Launcher for this provider. Possible launchers include SingleNodeLauncher (the default), SrunLauncher, or AprunLauncher.
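As a concrete illustration of the regex_job_id parameter above, the default pattern can be exercised against sample sbatch output using Python’s re module (the stdout line here is illustrative; actual sbatch output may vary between sites and Slurm versions):

```python
import re

# Default pattern used by SlurmProvider to extract the job ID from sbatch stdout.
regex_job_id = r"Submitted batch job (?P<id>\S*)"

# Illustrative sbatch output; real output may differ depending on Slurm configuration.
stdout = "Submitted batch job 123456"

match = re.match(regex_job_id, stdout)
job_id = match.group("id") if match else None
print(job_id)  # 123456
```

If your Slurm installation prints a different submission message, regex_job_id can be overridden with a pattern containing an id symbolic group that matches your site’s output.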
- __init__(partition: str | None = None, account: str | None = None, qos: str | None = None, constraint: str | None = None, clusters: str | None = None, nodes_per_block: int = 1, cores_per_node: int | None = None, mem_per_node: int | None = None, gpus_per_node: int | None = None, gres: str | None = None, init_blocks: int = 1, min_blocks: int = 0, max_blocks: int = 1, parallelism: float = 1, walltime: str = '00:10:00', scheduler_options: str = '', regex_job_id: str = 'Submitted batch job (?P<id>\\S*)', worker_init: str = '', cmd_timeout: int = 10, status_batch_size: int = 50, exclusive: bool = True, launcher: Launcher = SingleNodeLauncher())[source]
Methods
__init__([partition, account, qos, ...])
cancel(job_ids) – Cancels the jobs specified by a list of job ids.
execute_wait(cmd[, timeout])
status(job_ids) – Get the status of a list of jobs identified by the job identifiers returned from the submit request.
submit(command, tasks_per_node[, job_name]) – Submit the command as a slurm job.
Attributes
label – Provides the label for this provider.
status_polling_interval – Returns the interval, in seconds, at which the status method should be called.
- cancel(job_ids)[source]
Cancels the jobs specified by a list of job ids
Args: job_ids : [<job_id> …]
Returns : [True/False…] – If the cancel operation fails, the entire list will be False.
- property status_polling_interval[source]
Returns the interval, in seconds, at which the status method should be called.
- Returns:
the number of seconds to wait between calls to status()
- submit(command: str, tasks_per_node: int, job_name='parsl.slurm') str[source]
Submit the command as a slurm job.
- Parameters:
command (str) – Command to be executed on the remote side.
tasks_per_node (int) – Command invocations to be launched per node
job_name (str) – Name for the job
- Returns:
job id – A string identifier for the job
- Return type:
str
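In typical use the provider is not driven through submit() directly; it is attached to an executor inside a Parsl Config, which manages block submission and scaling. A minimal sketch is below — the partition, account, and worker_init values are placeholders that must be adapted to your cluster:

```python
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider

# Hypothetical site values: partition, account, and worker_init
# are placeholders for your cluster's actual settings.
config = Config(
    executors=[
        HighThroughputExecutor(
            label="slurm_htex",
            provider=SlurmProvider(
                partition="debug",                   # placeholder partition name
                account="myaccount",                 # placeholder account name
                nodes_per_block=2,
                init_blocks=1,
                max_blocks=4,
                walltime="01:00:00",
                worker_init="module load anaconda",  # placeholder environment setup
                launcher=SrunLauncher(),
            ),
        )
    ],
)
```

With SrunLauncher, each block’s workers are started via srun across the provisioned nodes; the default SingleNodeLauncher instead starts workers only on the first node of the block.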