Configuration

Torc provides these mechanisms to configure a workflow of jobs:

  • Workflow specification

  • Commands file

  • CLI commands

  • API calls

Refer to Jobs for complete information about how to define jobs.

Workflow Specification

Create a workflow specification in a JSON file. The JSON document fully defines the workflow and the relationships between its objects. Users can upload the workflow to the database with a CLI command. This is the recommended process because the JSON file defines everything about the workflow.

Refer to OpenAPI Client Models for a definition of the data model.

For a specific example, refer to this file.

Note

In that example, torc determines the order of job execution from the job/file input/output relationships.

You can create an empty version of this file with the command below. Save the output to a file and customize as you wish.

$ torc workflows template
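To make the structure concrete, here is a hypothetical sketch of what a small specification might look like. The field names (name, description, files, jobs, input_files, output_files) are illustrative assumptions, not the authoritative schema; run torc workflows template and consult the OpenAPI Client Models for the real structure.

```json
{
  "name": "my-workflow",
  "description": "My workflow",
  "files": [
    {"name": "input1", "path": "input1.json"},
    {"name": "output1", "path": "output1.json"}
  ],
  "jobs": [
    {
      "name": "job1",
      "command": "bash my_script.sh -i input1.json -o output1.json",
      "input_files": ["input1"],
      "output_files": ["output1"]
    }
  ]
}
```

Because job1 declares input1 as an input and output1 as an output, any job that consumes output1 would be ordered after job1, per the dependency behavior noted above.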

Advanced Configuration Options

You can specify these options in the config section of the workflow specification.

compute_node_wait_for_new_jobs_seconds (int): Inform all compute nodes to wait for new jobs for this time period before exiting. Does not apply if the workflow is complete. Defaults to 0.

compute_node_ignore_workflow_completion (bool): Inform all compute nodes to ignore workflow completions and hold onto allocations indefinitely. Useful for debugging failed jobs and possibly dynamic workflows where jobs get added after starting. Defaults to false.

compute_node_expiration_buffer_seconds (int): Inform all compute nodes to shut down this number of seconds before the expiration time. This allows torc to send SIGTERM to all job processes and set all statuses to terminated. Increase the time in cases where the job processes handle SIGTERM and need more time to gracefully shut down. Set the value to 0 to maximize the time given to jobs. Defaults to 60 seconds.

compute_node_wait_for_healthy_database_minutes (int): Inform all compute nodes to wait this number of minutes if the database becomes unresponsive. Defaults to 20 minutes.

prepare_jobs_sort_method (str): Inform all compute nodes to use this sort method when requesting jobs. Options are gpus_runtime_memory (default), gpus_memory_runtime, and none. The default behavior is to sort jobs by GPUs, runtime, and then memory. There may be cases where you want to guarantee that bigger-memory jobs are sorted first; in those cases choose gpus_memory_runtime. Choose none if you have a large number of jobs (tens or hundreds of thousands) and sorting isn’t important. There is a performance cost in the database for sorting high job counts.
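For illustration, the options above could appear together in the config section of a workflow specification as follows. The values are examples only; each option is optional and falls back to the default stated above.

```json
{
  "config": {
    "compute_node_wait_for_new_jobs_seconds": 300,
    "compute_node_ignore_workflow_completion": false,
    "compute_node_expiration_buffer_seconds": 120,
    "compute_node_wait_for_healthy_database_minutes": 20,
    "prepare_jobs_sort_method": "gpus_memory_runtime"
  }
}
```

Here the expiration buffer is raised to 120 seconds, which would suit job processes that trap SIGTERM and need extra time to shut down gracefully.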

Upload

Here’s how to upload the workflow to the database:

$ torc workflows create-from-json-file examples/diamond_workflow.json
2023-03-28 16:36:35,149 - INFO [torc.cli.workflows workflows.py:156] : Created a workflow from examples/diamond_workflow.json with key=92238688

Commands File

Define jobs in a plain-text file, one CLI command (with its options and arguments) per line. The torc CLI tool creates an empty workflow, converts each command into a job, and adds the job. Users can add dependencies and other resources with torc CLI tools. This process is convenient for simple workflows.

This example creates a workflow, with a name and description, from three commands.

$ cat commands.txt
bash my_script.sh -i input1.json -o output1.json
bash my_script.sh -i input2.json -o output2.json
bash my_script.sh -i input3.json -o output3.json
$ torc workflows create-from-commands-file -n my-workflow -d "My workflow" commands.txt

You can also specify the resource requirements for each job in the commands file. For example, this command creates a workflow from the same three commands, each requiring 8 CPUs, 10 GB of memory, and a maximum runtime of 1 hour. The -r option specifies the maximum runtime as an ISO 8601 duration.

$ torc workflows create-from-commands-file -n my-workflow -d "My workflow" -c 8 -m 10g -r P0DT1H commands.txt

Finally, you can add jobs to an existing workflow with the add-jobs-from-commands-file command. This is useful if different jobs have different resource requirements.

$ torc workflows create-from-commands-file -n my-workflow -d "My workflow" -c 1 -m 1g -r P0DT4H commands.txt
$ torc workflows -k 392742 add-jobs-from-commands-file -c 4 -m 10g -r P0DT4H commands.txt
$ torc workflows -k 392742 add-jobs-from-commands-file -c 8 -m 50g -r P0DT8H commands.txt
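When the command list follows a pattern, it can be easier to generate the commands file than to write it by hand. The sketch below is a hypothetical helper (my_script.sh and the inputN.json/outputN.json names are placeholders matching the example above), not part of torc itself.

```python
# Generate a commands file for `torc workflows create-from-commands-file`:
# one CLI command per line, as described above.
from pathlib import Path


def write_commands_file(path, n_inputs):
    """Write one `bash my_script.sh ...` command per input index and
    return the lines that were written."""
    lines = [
        f"bash my_script.sh -i input{i}.json -o output{i}.json"
        for i in range(1, n_inputs + 1)
    ]
    Path(path).write_text("\n".join(lines) + "\n")
    return lines


if __name__ == "__main__":
    for line in write_commands_file("commands.txt", 3):
        print(line)
```

The resulting commands.txt can then be passed to create-from-commands-file or add-jobs-from-commands-file as shown above.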

CLI commands

Build a workflow incrementally with torc CLI commands like the example below. This process may be required if your workflow exceeds the size that can be transferred in one HTTP POST request.

$ torc workflows create -n my-workflow -d "My workflow"
2023-03-28 16:17:36,736 - INFO [torc.cli.workflows workflows.py:78] : Created workflow with key=92237770
$ torc -k 92237770 jobs add -n job1 -c "bash my_script.sh -i input1.json -o output1.json"
2023-03-28 18:19:17,330 - INFO [torc.cli.jobs jobs.py:80] : Added job with key=92237922
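For many similar jobs, the incremental CLI calls above can be scripted. This is a sketch that only assembles the `torc jobs add` invocations shown in the transcript; the workflow key, job names, and my_script.sh are placeholders, and actually running the commands requires a working torc installation.

```python
# Build `torc jobs add` argument lists for a batch of input/output pairs,
# mirroring the incremental CLI example above.
import shlex


def build_add_job_commands(workflow_key, inputs_outputs):
    """Return one argv list per job, suitable for subprocess.run."""
    commands = []
    for i, (inp, out) in enumerate(inputs_outputs, start=1):
        job_cmd = f"bash my_script.sh -i {inp} -o {out}"
        commands.append(
            ["torc", "-k", str(workflow_key), "jobs", "add",
             "-n", f"job{i}", "-c", job_cmd]
        )
    return commands


if __name__ == "__main__":
    import subprocess  # noqa: F401  (only needed if you uncomment below)
    pairs = [("input1.json", "output1.json"), ("input2.json", "output2.json")]
    for argv in build_add_job_commands(92237770, pairs):
        print(shlex.join(argv))
        # subprocess.run(argv, check=True)  # requires torc and a database
```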

API calls

Make your own API calls directly to the torc database service. Here is an example script.
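As a rough illustration of the idea, the sketch below assembles a minimal workflow document and POSTs it to the service. The endpoint path (/workflows), the payload field names, and the base URL are all assumptions for illustration; consult the OpenAPI Client Models and the linked example script for the real schema and endpoints.

```python
# Hypothetical sketch of a direct API call to the torc database service.
import json
import urllib.request


def build_workflow_payload(name, description, commands):
    """Assemble a minimal workflow document (field names are illustrative)."""
    return {
        "name": name,
        "description": description,
        "jobs": [
            {"name": f"job{i}", "command": cmd}
            for i, cmd in enumerate(commands, start=1)
        ],
    }


def create_workflow(base_url, payload):
    """POST the workflow to an assumed /workflows endpoint."""
    req = urllib.request.Request(
        f"{base_url}/workflows",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    payload = build_workflow_payload(
        "my-workflow", "My workflow",
        ["bash my_script.sh -i input1.json -o output1.json"],
    )
    print(json.dumps(payload, indent=2))
    # create_workflow("http://localhost:8080", payload)  # needs a live service
```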

Dependency graphs

You may want to inspect your workflow graphs for proper dependency definitions. Refer to Plot Graphs for instructions on how to create visualizations.