reV batch

Execute an analysis pipeline over a parametric set of inputs.

The general structure for calling this CLI command is given below (add --help to print help info to the terminal).

reV batch [OPTIONS]

Options

-c, --config_file <config_file>

Required. Path to the batch configuration file. Below is a sample template config:

{
    "logging": {
        "log_file": null,
        "log_level": "INFO"
    },
    "pipeline_config": "[REQUIRED]",
    "sets": [
        {
            "args": "[REQUIRED]",
            "files": "[REQUIRED]",
            "set_tag": "set1"
        },
        {
            "args": "[REQUIRED]",
            "files": "[REQUIRED]",
            "set_tag": "set2"
        }
    ]
}

Parameters

logging : dict, optional

Dictionary containing keyword-argument pairs to pass to init_logger. This initializes logging for the batch command. Note that each pipeline job submitted via batch has its own logging key that initializes pipeline step logging. You therefore only need this input if you want logging information about the batching portion of the execution.

pipeline_config : str

Path to the pipeline configuration defining the commands to run for every parametric set.

sets : list of dicts

A list of dictionaries, where each dictionary defines a “set” of parametric runs. Each dictionary should have the following keys:

args : dict

A dictionary defining the arguments to parameterize across all input configuration files. Each argument to be parameterized should be a key in this dictionary, and its value should be a list of the parameter values to run for that argument (single-item lists are allowed and can be used to vary a parameter value across sets).

"args": {
    "input_constant_1": [
        18.02,
        19.04
    ],
    "path_to_a_file": [
        "/first/path.h5",
        "/second/path.h5",
        "/third/path.h5"
    ]
}

This example would run a total of six pipelines (two values of input_constant_1 times three values of path_to_a_file), one for each of the following argument combinations:

input_constant_1=18.02, path_to_a_file="/first/path.h5"
input_constant_1=18.02, path_to_a_file="/second/path.h5"
input_constant_1=18.02, path_to_a_file="/third/path.h5"
input_constant_1=19.04, path_to_a_file="/first/path.h5"
input_constant_1=19.04, path_to_a_file="/second/path.h5"
input_constant_1=19.04, path_to_a_file="/third/path.h5"

Remember that the keys in the args dictionary should be part of (at least) one of your other configuration files.
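The expansion described above behaves like a Cartesian product over the value lists in args. The following is a minimal Python sketch of that idea, not the reV implementation: the helper name expand_args is hypothetical, and the ordering of the combinations is an assumption made for illustration.

```python
from itertools import product

# The "args" block from the example above.
args = {
    "input_constant_1": [18.02, 19.04],
    "path_to_a_file": ["/first/path.h5", "/second/path.h5", "/third/path.h5"],
}


def expand_args(args):
    """Return one dict per combination of the listed parameter values.

    Hypothetical helper for illustration; not part of the reV API.
    """
    keys = list(args)
    value_lists = [args[key] for key in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


combos = expand_args(args)
for combo in combos:
    print(combo)
```

Under this sketch, a set with a single-item list contributes a factor of one to the product, which is why single-item lists are a convenient way to pin a value within one set and change it in another.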

files : list

A list of paths to the configuration files that contain the arguments to be updated for every parametric run. Arguments can be spread out over multiple files. For example:

"files": [
    "./config_run.yaml",
    "./config_analyze.json"
]

set_tag : str, optional

Optional string defining a set tag that will prefix each job tag for this set. This tag does not need to include an underscore, as that is provided during concatenation.
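To tie the parameters together, here is a hedged Python sketch that assembles a batch config with the keys described above and writes it to disk as JSON. The file names (config_pipeline.json, config_batch.json), the argument names, and the helper write_batch_config are placeholders chosen for illustration. The key checks mirror only the "[REQUIRED]" markers in the template and are not reV's own validation.

```python
import json
import tempfile
from pathlib import Path

# All paths and argument names below are placeholders for illustration.
batch_config = {
    "logging": {"log_file": None, "log_level": "INFO"},
    "pipeline_config": "./config_pipeline.json",
    "sets": [
        {
            "args": {"input_constant_1": [18.02, 19.04]},
            "files": ["./config_run.yaml"],
            "set_tag": "set1",
        },
        {
            "args": {"path_to_a_file": ["/first/path.h5"]},
            "files": ["./config_analyze.json"],
            "set_tag": "set2",
        },
    ],
}


def write_batch_config(config, out_dir):
    """Check the keys marked [REQUIRED] in the template, then write JSON.

    Hypothetical helper; not part of the reV API.
    """
    missing = {"pipeline_config", "sets"} - config.keys()
    if missing:
        raise ValueError(f"Missing required keys: {sorted(missing)}")
    for index, param_set in enumerate(config["sets"]):
        for key in ("args", "files"):
            if key not in param_set:
                raise ValueError(f"Set {index} is missing required key {key!r}")
    path = Path(out_dir) / "config_batch.json"
    path.write_text(json.dumps(config, indent=4))
    return path


out_path = write_batch_config(batch_config, tempfile.mkdtemp())
print(out_path.read_text())
```

Python's None is serialized as JSON null, which matches the "log_file": null entry in the template.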

--dry

Flag to do a dry run (make batch dirs and update files without running the pipeline).

--cancel

Flag to cancel all jobs associated with the batch_jobs.csv file in the current batch config directory.

--delete

Flag to delete all batch job subdirectories associated with the batch_jobs.csv file in the current batch config directory.

--monitor-background

Flag to monitor all batch pipelines continuously in the background. Note that the stdout/stderr will not be captured, but you can set a pipeline "log_file" to capture logs.
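When scripting around this command, it can help to build the command line programmatically. The sketch below is a small Python helper that assembles the argument list from the options documented above. The helper name batch_command and the file name config_batch.json are hypothetical, and the sketch only builds the command without executing it.

```python
import shlex

# Flags documented above; none of them takes a value.
KNOWN_FLAGS = ("--dry", "--cancel", "--delete", "--monitor-background")


def batch_command(config_file, *flags):
    """Build the argument list for a `reV batch` call.

    Hypothetical helper for illustration; not part of the reV API.
    """
    for flag in flags:
        if flag not in KNOWN_FLAGS:
            raise ValueError(f"Unknown flag: {flag}")
    return ["reV", "batch", "-c", str(config_file), *flags]


cmd = batch_command("config_batch.json", "--dry")
print(shlex.join(cmd))  # prints: reV batch -c config_batch.json --dry
```

The resulting list can be passed to subprocess.run(cmd, check=True) if reV is installed. Starting with --dry is a cautious first step, since it creates the batch directories and updates the files without running the pipeline.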