Schedule Schema
SIMBA Scheduler
The schema describes which modules are executed, and how, during each tick of the driver.
https://raw.githubusercontent.com/NSSAC/SIMBA_driver/master/schema/schedule.json |
|||
type |
object |
||
properties |
|||
|
type |
array |
|
items |
allOf |
||
|
Common Data |
||
Common data provided to all modules upon execution. |
|||
type |
object |
||
Integer
An integer
integer |
|
type |
integer |
minimum |
0 |
Integer Tuple
A tuple of 2 integers
integerTuple |
|
type |
array |
items |
|
maxItems |
2 |
minItems |
2 |
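As an illustration of the two definitions above, the following minimal Python sketch checks a value against the Integer Tuple shape: an array of exactly two items. It assumes each item is the Integer type defined above (an integer with minimum 0); the item type is not visible in this listing, and the function name is hypothetical.

```python
def is_integer_tuple(value):
    """Return True if value is a list of exactly two non-negative integers.

    Assumption: the tuple items follow the Integer definition (minimum 0).
    """
    if not isinstance(value, list) or len(value) != 2:
        return False
    # bool is a subclass of int in Python, but JSON true/false are not
    # integers, so booleans are excluded explicitly.
    return all(
        isinstance(item, int) and not isinstance(item, bool) and item >= 0
        for item in value
    )


print(is_integer_tuple([54000, 55000]))  # two non-negative integers
print(is_integer_tuple([54000]))         # too few items
```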
Executor
executor |
||||
type |
object |
|||
properties |
||||
|
||||
const |
HighThroughputExecutor |
|||
|
Label for this executor instance. |
|||
type |
string |
|||
default |
HighThroughputExecutor |
|||
|
||||
|
Command line string to launch the process_worker_pool from the provider. The command line string will be formatted with appropriate values for the following keys: debug, task_url, result_url, cores_per_worker, nodes_per_block, heartbeat_period, heartbeat_threshold, logdir. |
|||
type |
string / null |
|||
default |
null |
|||
|
An address to connect to the main Parsl process which is reachable from the network in which workers will be running. This field expects an IPv4 address (xxx.xxx.xxx.xxx). Most login nodes on clusters have several network interfaces available, only some of which can be reached from the compute nodes. This field can be used to make the executor listen only on a specific interface, limiting connections to the internal network. By default, the executor will attempt to enumerate and connect through all possible addresses. Setting an address here overrides the default behavior. |
|||
type |
string / null |
|||
default |
null |
|||
|
Specify the ports to be used by workers to connect to Parsl. If this option is specified, worker_port_range will not be honored. |
|||
default |
null |
|||
oneOf |
||||
type |
null |
|||
|
Worker ports will be chosen between the two integers provided. |
|||
default |
[54000, 55000] |
|||
oneOf |
||||
type |
null |
|||
|
Port range used by Parsl to communicate with the Interchange. |
|||
default |
[55000, 56000] |
|||
oneOf |
||||
type |
null |
|||
|
||||
default |
null |
|||
oneOf |
type |
array |
||
items |
||||
type |
null |
|||
|
Working dir to be used by the executor. |
|||
type |
string / null |
|||
default |
null |
|||
|
Enables worker debug logging. |
|||
type |
boolean |
|||
default |
False |
|||
|
Cores to be assigned to each worker. Oversubscription is possible by setting cores_per_worker < 1.0. |
|||
type |
number |
|||
default |
1.0 |
|||
|
GB of memory required per worker. If this option is specified, the node manager will check the available memory at startup and limit the number of workers such that there is sufficient memory for each worker. |
|||
type |
number / null |
|||
default |
null |
|||
|
Caps the number of workers launched per node. |
|||
default |
infinity |
|||
oneOf |
type |
number |
||
minimum |
0 |
|||
const |
infinity |
|||
|
Whether or how each worker process sets thread affinity. Options include “none” to forgo any CPU affinity configuration, “block” to assign adjacent cores to workers (ex: assign 0-1 to worker 0, 2-3 to worker 1), and “alternating” to assign cores to workers in round-robin (ex: assign 0,2 to worker 0, 1,3 to worker 1). The “block-reverse” option assigns adjacent cores to workers, but assigns the CPUs with large indices to low index workers (ex: assign 2-3 to worker 1, 0,1 to worker 2). |
|||
type |
string |
|||
default |
none |
|||
|
Accelerators available for workers to use. Each worker will be pinned to exactly one of the provided accelerators, and no more workers will be launched than the number of accelerators. Either provide the list of accelerator names or the number available. If a number is provided, Parsl will create names as integers starting with 0. |
|||
default |
[] |
|||
oneOf |
||||
type |
array |
|||
items |
type |
string |
||
|
Number of tasks that can be prefetched over available worker capacity. When there are few tasks (<100) or when tasks are long-running, this option should be set to 0 for better load balancing. |
|||
default |
0 |
|||
|
Seconds since the last message from the counterpart in the communication pair (interchange, manager) after which the counterpart is assumed to be unavailable. |
|||
default |
120 |
|||
|
Number of seconds after which a heartbeat message indicating liveness is sent to the counterpart (interchange, manager). |
|||
default |
30 |
|||
|
Timeout period to be used by the executor components, in milliseconds. Increasing the poll period trades performance for CPU efficiency. |
|||
default |
10 |
|||
|
Managers attempt connecting over many different addresses to determine a viable address. This option sets a time limit in seconds on the connection attempt. Default of None implies 30s timeout set on worker. |
|||
default |
null |
|||
oneOf |
||||
type |
null |
|||
|
In case of a remote file system, specify the path to where logs will be kept. |
|||
type |
string / null |
|||
default |
null |
|||
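The thread-affinity options described above ("none", "block", "alternating", "block-reverse") can be illustrated with a short Python sketch. This is not Parsl's implementation; the function name and the even split of cores across workers are assumptions made for illustration, and workers are numbered from 0.

```python
def assign_cores(mode, n_cores, n_workers):
    """Return one core list per worker (index = worker number, from 0).

    Illustrative only: assumes cores are split evenly across workers.
    """
    if mode == "none":
        # No CPU affinity is configured for any worker.
        return [None] * n_workers
    per_worker = n_cores // n_workers
    if per_worker < 1:
        raise ValueError("not enough cores for the requested workers")
    if mode == "alternating":
        # Round-robin: worker w gets cores w, w + n_workers, ...
        return [
            list(range(w, per_worker * n_workers, n_workers))
            for w in range(n_workers)
        ]
    # Adjacent cores per worker: worker 0 gets the lowest-numbered block.
    blocks = [
        list(range(w * per_worker, (w + 1) * per_worker))
        for w in range(n_workers)
    ]
    if mode == "block":
        return blocks
    if mode == "block-reverse":
        # Same blocks, but the highest core indices go to the lowest worker.
        return blocks[::-1]
    raise ValueError(f"unknown affinity mode: {mode}")


for mode in ("block", "alternating", "block-reverse"):
    print(mode, assign_cores(mode, n_cores=4, n_workers=2))
```

With 4 cores and 2 workers, "block" yields [[0, 1], [2, 3]], "alternating" yields [[0, 2], [1, 3]], and "block-reverse" yields [[2, 3], [0, 1]].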
Slurm Provider
slurmProvider |
||
type |
object |
|
properties |
||
|
Slurm Provider: https://parsl.readthedocs.io/en/stable/stubs/parsl.providers.SlurmProvider.html#parsl.providers.SlurmProvider |
|
const |
SlurmProvider |
|
|
Slurm partition to request blocks from. If unspecified or None, no partition slurm directive will be specified. |
|
type |
string |
|
|
Slurm account to which to charge resources used by the job. If unspecified or None, the job will use the user’s default account. |
|
type |
string |
|
|
Slurm queue to place job in. If unspecified or None, no queue slurm directive will be specified. |
|
type |
string |
|
|
Slurm job constraint, often used to choose cpu or gpu type. If unspecified or None, no constraint slurm directive will be added. |
|
type |
string |
|
|
Nodes to provision per block. |
|
|
Specify the number of cores to provision per node. If set to None, executors will assume all cores on the node are available for computation. Default is None. |
|
|
Specify the real memory to provision per node in GB. If set to None, no explicit request to the scheduler will be made. Default is None. |
|
|
Initial number of blocks. |
|
default |
1 |
|
|
Minimum number of blocks to maintain. |
|
default |
0 |
|
|
Maximum number of blocks to maintain. |
|
|
Ratio of provisioned task slots to active tasks. A parallelism value of 1 represents aggressive scaling where as many resources as possible are used; parallelism close to 0 represents the opposite situation in which as few resources as possible (i.e., min_blocks) are used. |
|
type |
number |
|
|
Walltime requested per block in HH:MM:SS. |
|
type |
string |
|
default |
00:10:00 |
|
|
String to prepend to the #SBATCH blocks in the submit script to the scheduler. |
|
type |
string |
|
default |
||
|
The regular expression used to extract the job ID from the sbatch standard output. The default is r'Submitted batch job (?P<id>\S*)', where id is the regular expression symbolic group for the job ID. |
|
type |
string |
|
default |
Submitted batch job (?P<id>\S*) |
|
|
Command to be run before starting a worker, such as ‘module load Anaconda; source activate env’. |
|
type |
string |
|
|
Requests nodes which are not shared with other running jobs. |
|
type |
boolean |
|
default |
True |
|
|
Whether files should be moved. By default, Parsl will try to move files. |
|
type |
boolean |
|
default |
True |
|
|
Launcher for this provider. Possible launchers include SingleNodeLauncher (the default), SrunLauncher, or AprunLauncher. |
|
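To show how the default job ID pattern listed above behaves, the following Python sketch applies it to a sample line of sbatch output. The sample output string and the helper name are assumptions for illustration; how Parsl itself applies the pattern is not shown here.

```python
import re

# Default pattern from the schema; "id" is the named group for the job ID.
JOB_ID_PATTERN = r"Submitted batch job (?P<id>\S*)"


def extract_job_id(sbatch_stdout):
    """Return the job ID found in sbatch output, or None if absent."""
    match = re.search(JOB_ID_PATTERN, sbatch_stdout)
    return match.group("id") if match else None


# Hypothetical sbatch output, used only as an example input.
print(extract_job_id("Submitted batch job 4242\n"))
```

A custom pattern supplied through this property needs to keep a named group called id so that the job ID can still be retrieved.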
Local Provider
localProvider |
||
type |
object |
|
properties |
||
|
Local Provider: https://parsl.readthedocs.io/en/stable/stubs/parsl.providers.LocalProvider.html#parsl.providers.LocalProvider |
|
const |
LocalProvider |
|
|
Nodes to provision per block. |
|
|
Initial number of blocks. |
|
default |
1 |
|
|
Minimum number of blocks to maintain. |
|
default |
0 |
|
|
Maximum number of blocks to maintain. |
|
default |
1 |
|
|
Ratio of provisioned task slots to active tasks. A parallelism value of 1 represents aggressive scaling where as many resources as possible are used; parallelism close to 0 represents the opposite situation in which as few resources as possible (i.e., min_blocks) are used. |
|
type |
number |
|
|
Command to be run before starting a worker, such as ‘module load Anaconda; source activate env’. |
|
type |
string |
|
|
Whether files should be moved. By default, Parsl will try to move files. |
|
type |
boolean |
|
default |
True |
|
|
Launcher for this provider. Possible launchers include SingleNodeLauncher. |
|
Provider
Provider: https://parsl.readthedocs.io/en/stable/reference.html#providers
provider |
|
oneOf |
|
Srun Launcher
srunLauncher |
||
type |
object |
|
properties |
||
|
srun Launcher: https://parsl.readthedocs.io/en/stable/stubs/parsl.launchers.SrunLauncher.html#parsl.launchers.SrunLauncher |
|
const |
SrunLauncher |
|
|
type |
boolean |
default |
True |
|
|
This string will be passed to the srun launcher. Default: ‘’ |
|
type |
string |
|
default |
||
Single Node Launcher
singleNodeLauncher |
||
type |
object |
|
properties |
||
|
Single Node Launcher: https://parsl.readthedocs.io/en/stable/stubs/parsl.launchers.SingleNodeLauncher.html#parsl.launchers.SingleNodeLauncher |
|
const |
SingleNodeLauncher |
|
|
type |
boolean |
default |
True |
|
|
type |
boolean |
default |
False |
|
Launcher
Launcher: https://parsl.readthedocs.io/en/stable/reference.html#launchers
launcher |
|
oneOf |
|
Storage Access
storageAccess |
|
type |
object |
Schedule
The execution schedule of a module.
schedule |
|||
type |
object |
||
properties |
|||
|
Priority of Module Execution |
||
Higher priority is executed first. Priorities must be unique. |
|||
anyOf |
type |
number |
|
enum |
-Infinity, Infinity |
||
|
Start Tick |
||
The tick at which the module is executed first (default: -infinity). |
|||
|
End Tick |
||
The targetTick for which the module is executed last (default: infinity). |
|||
|
Tick Increment |
||
The tick increment at which the module is executed (default: 1). |
|||
minimum |
1 |
||
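The scheduling rules above (unique priorities with higher values first, plus start tick, end tick, and tick increment) can be sketched in Python as follows. This is not the SIMBA driver's code: the dictionary keys and function names are hypothetical, and counting the increment from tick 0 when the start tick is -infinity is an assumption that the schema does not state.

```python
import math


def order_by_priority(modules):
    """Sort modules so that the highest priority is executed first."""
    priorities = [m["priority"] for m in modules]
    if len(set(priorities)) != len(priorities):
        raise ValueError("priorities must be unique")
    return sorted(modules, key=lambda m: m["priority"], reverse=True)


def is_due(tick, start=-math.inf, end=math.inf, increment=1):
    """Return True if a module with this schedule runs at the given tick."""
    if not start <= tick <= end:
        return False
    # Assumption: with an unbounded start, increments are counted from 0.
    anchor = 0 if math.isinf(start) else start
    return (tick - anchor) % increment == 0


# Hypothetical modules; "name" and "priority" are illustrative keys.
modules = [
    {"name": "report", "priority": -math.inf},
    {"name": "simulate", "priority": 10},
    {"name": "prepare", "priority": math.inf},
]
print([m["name"] for m in order_by_priority(modules)])
print([t for t in range(0, 12) if is_due(t, start=2, end=10, increment=4)])
```

Under these assumptions the modules run in the order prepare, simulate, report, and a schedule with start 2, end 10, and increment 4 is due at ticks 2, 6, and 10.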
Module
Description of module-specific information.
module |
|||
allOf |
type |
object |
|
properties |
|||
|
type |
string |
|
|
Module Name |
||
The name of the module. |
|||
type |
string |
||
|
Module Command |
||
The command to execute the module. |
|||
type |
string |
||
|
Update Common Data |
||
Specifies whether this module may update common data. |
|||
type |
boolean |
||
default |
False |
||
|
Module Specific Data |
||
Module specific data provided to the module upon execution. |
|||
type |
object |
||
oneOf |
|||
Local Module
This describes the module details required to execute a module locally (without Parsl).
localModule |
||
type |
object |
|
properties |
||
|
const |
local |
Parsl Module
This describes module details required to execute a module through parsl (https://parsl.readthedocs.io/en/stable/index.html)
parslModule |
||
type |
object |
|
properties |
||
|
const |
parsl |
|
||
|
||
|
||
|
||