reV bespoke

Execute the bespoke step from a config file.

Much like generation, reV bespoke analysis runs SAM simulations by piping in renewable energy resource data (usually from the WTK), loading the SAM config, and then executing the PySAM.Windpower.Windpower compute module.
However, unlike reV generation, bespoke analysis is performed on the supply-curve grid resolution, and the plant layout is optimized for every supply-curve point based on an optimization objective specified by the user. See the NREL publication on the bespoke methodology for more information.

See the documentation for the reV SAM class (e.g. reV.SAM.generation.WindPower, reV.SAM.generation.PvWattsv8, reV.SAM.generation.Geothermal, etc.) for info on the allowed and/or required SAM config file inputs.
The general structure for calling this CLI command is given below (add --help to print help info to the terminal).
reV bespoke [OPTIONS]
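For example, assuming the configuration is saved to a file named config_bespoke.json (a hypothetical filename), the run would be launched with:

    reV bespoke -c config_bespoke.json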
Options
- -c, --config_file <config_file>
Required. Path to the bespoke configuration file. Below is a sample template config (shown in JSON; equivalent YAML and TOML configs are also accepted):

    {
        "execution_control": {
            "option": "local",
            "allocation": "[REQUIRED IF ON HPC]",
            "walltime": "[REQUIRED IF ON HPC]",
            "qos": "normal",
            "memory": null,
            "nodes": 1,
            "queue": null,
            "feature": null,
            "conda_env": null,
            "module": null,
            "sh_script": null,
            "num_test_nodes": null,
            "max_workers": null
        },
        "log_directory": "./logs",
        "log_level": "INFO",
        "excl_fpath": "[REQUIRED]",
        "res_fpath": "[REQUIRED]",
        "tm_dset": "[REQUIRED]",
        "objective_function": "[REQUIRED]",
        "capital_cost_function": "[REQUIRED]",
        "fixed_operating_cost_function": "[REQUIRED]",
        "variable_operating_cost_function": "[REQUIRED]",
        "balance_of_system_cost_function": "[REQUIRED]",
        "project_points": "[REQUIRED]",
        "sam_files": "[REQUIRED]",
        "min_spacing": "5x",
        "wake_loss_multiplier": 1,
        "ga_kwargs": null,
        "output_request": ["system_capacity", "cf_mean"],
        "ws_bins": [0.0, 20.0, 5.0],
        "wd_bins": [0.0, 360.0, 45.0],
        "excl_dict": null,
        "area_filter_kernel": "queen",
        "min_area": null,
        "resolution": 64,
        "excl_area": null,
        "data_layers": null,
        "pre_extract_inclusions": false,
        "eos_mult_baseline_cap_mw": 200,
        "prior_run": null,
        "gid_map": null,
        "bias_correct": null,
        "pre_load_data": false
    }
Parameters
- execution_control : dict
Dictionary containing execution control arguments. Allowed arguments are:
- option:
({‘local’, ‘kestrel’, ‘eagle’, ‘awspc’, ‘slurm’, ‘peregrine’}) Hardware run option. Determines the type of job scheduler to use as well as the base AU cost. The “slurm” option is a catchall for HPC systems that use the SLURM scheduler and should only be used if the desired hardware is not listed above. If “local”, no other HPC-specific keys are required in execution_control (they are ignored if provided).
- allocation:
(str) HPC project (allocation) handle.
- walltime:
(int) Node walltime request in hours.
- qos:
(str, optional) Quality-of-service specifier. For Kestrel users: This should be one of {‘standby’, ‘normal’, ‘high’}. Note that ‘high’ priority doubles the AU cost. By default, "normal".
- memory:
(int, optional) Node memory max limit (in GB). By default, None, which uses the scheduler’s default memory limit. For Kestrel users: If you would like to use the full node memory, leave this argument unspecified (or set to None) if you are running on standard nodes. However, if you would like to use the bigmem nodes, you must specify the full upper limit of memory you would like for your job, otherwise you will be limited to the standard node memory size (250GB).
- nodes:
(int, optional) Number of nodes to split the project points across. Note that the total number of requested nodes for a job may be larger than this value if the command splits across other inputs. Default is 1.
- max_workers:
(int, optional) Number of local workers to run on. If None, uses all available cores (typically 36). By default, None.
- queue:
(str, optional; PBS ONLY) HPC queue to submit job to. Examples include: ‘debug’, ‘short’, ‘batch’, ‘batch-h’, ‘long’, etc. By default, None, which uses “test_queue”.
- feature:
(str, optional) Additional flags for SLURM job (e.g. “-p debug”). By default, None, which does not specify any additional flags.
- conda_env:
(str, optional) Name of conda environment to activate. By default, None, which does not load any environments.
- module:
(str, optional) Module to load. By default, None, which does not load any modules.
- sh_script:
(str, optional) Extra shell script to run before command call. By default, None, which does not run any scripts.
- num_test_nodes:
(str, optional) Number of nodes to submit before terminating the submission process. This can be used to test a new submission configuration without submitting all nodes (i.e. only running a handful to ensure the inputs are specified correctly and the outputs look reasonable). By default, None, which submits all node jobs.
Only the option key is required for local execution. For execution on the HPC, the allocation and walltime keys are also required. All other options are populated with default values, as seen above.
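As a point of reference, a minimal sketch of an execution_control block for an HPC run might look like the following (the allocation handle, walltime, and node count are placeholders, not recommendations):

    execution_control = {
        "option": "kestrel",        # run on the Kestrel HPC
        "allocation": "myproject",  # hypothetical allocation handle
        "walltime": 4,              # hours
        "qos": "normal",
        "nodes": 10,                # split the project points across 10 nodes
        "max_workers": None,        # use all available cores on each node
    }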
- log_directory : str
Path to directory where logs should be written. Path can be relative and does not have to exist on disk (it will be created if missing). By default, "./logs".
- log_level : {“DEBUG”, “INFO”, “WARNING”, “ERROR”}
String representation of desired logger verbosity. Suitable options are DEBUG (most verbose), INFO (moderately verbose), WARNING (only log warnings and errors), and ERROR (only log errors). By default, "INFO".
- excl_fpath : str | list | tuple
Filepath to exclusions data HDF5 file. The exclusions HDF5 file should contain the layers specified in excl_dict and data_layers. These layers may also be spread out across multiple HDF5 files, in which case this input should be a list or tuple of filepaths pointing to the files containing the layers. Note that each data layer must be uniquely defined (i.e. only appear once and in a single input file).
- res_fpath : str
Unix shell style path to wind resource HDF5 file in NREL WTK format. Can also be a path including a wildcard input like /h5_dir/prefix*suffix to run bespoke on multiple years of resource data. Can also be an explicit list of resource HDF5 file paths, which themselves can contain wildcards. If multiple files are specified in this way, they must have the same coordinates but can have different time indices (i.e. different years). This input must be readable by rex.multi_year_resource.MultiYearWindResource (i.e. the resource data conform to the rex data format). This means the data file(s) must contain a 1D time_index dataset indicating the UTC time of observation, a 1D meta dataset represented by a DataFrame with site-specific columns, and 2D resource datasets that match the dimensions of (time_index, meta). The time index must start at 00:00 of January 1st of the year under consideration, and its shape must be a multiple of 8760.
- tm_dset : str
Dataset name in the excl_fpath file containing the techmap (exclusions-to-resource mapping data). This data layer links the supply curve GIDs to the generation GIDs that are used to evaluate the performance metrics of each wind plant. By default, the generation GIDs are assumed to match the resource GIDs, but this mapping can be customized via the gid_map input (see the documentation for gid_map for more details).
Important
This dataset uniquely couples the (typically high-resolution) exclusion layers to the (typically lower-resolution) resource data. Therefore, a separate techmap must be used for every unique combination of resource and exclusion coordinates.
- objective_function : str
The objective function of the optimization written out as a string. This expression should compute the objective to be minimized during layout optimization. Variables available for computation are:
  - n_turbines: the number of turbines
  - system_capacity: wind plant capacity
  - aep: annual energy production
  - avg_sl_dist_to_center_m: Average straight-line distance to the supply curve point center from all turbine locations (in m). Useful for computing plant BOS costs.
  - avg_sl_dist_to_medoid_m: Average straight-line distance to the medoid of all turbine locations (in m). Useful for computing plant BOS costs.
  - nn_conn_dist_m: Total BOS connection distance using nearest-neighbor connections. This variable is only available for the balance_of_system_cost_function equation.
  - fixed_charge_rate: user input fixed_charge_rate if included as part of the SAM system config.
  - capital_cost: plant capital cost as evaluated by capital_cost_function
  - fixed_operating_cost: plant fixed annual operating cost as evaluated by fixed_operating_cost_function
  - variable_operating_cost: plant variable annual operating cost as evaluated by variable_operating_cost_function
  - balance_of_system_cost: plant balance of system cost as evaluated by balance_of_system_cost_function
  - self.wind_plant: the SAM wind plant object, through which all SAM variables can be accessed
- capital_cost_function : str
The plant capital cost function written out as a string. This expression must return the total plant capital cost in $. This expression has access to the same variables as the objective_function argument above.
- fixed_operating_cost_function : str
The plant annual fixed operating cost function written out as a string. This expression must return the fixed operating cost in $/year. This expression has access to the same variables as the objective_function argument above.
- variable_operating_cost_function : str
The plant annual variable operating cost function written out as a string. This expression must return the variable operating cost in $/kWh. This expression has access to the same variables as the objective_function argument above. You can set this to “0” to effectively ignore variable operating costs.
- balance_of_system_cost_function : str
The plant balance-of-system cost function written out as a string. This expression must return the balance-of-system cost in $. It has access to the same variables as the objective_function argument above. You can set this to “0” to effectively ignore balance-of-system costs. An illustrative sketch of these expressions is given below.
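To make the relationship between these expressions concrete, here is an illustrative (not prescriptive) sketch of values one might supply for the objective and cost function keys described above. The numeric coefficients are placeholders rather than NREL cost assumptions, system_capacity is assumed here to be in kW, the dict grouping and its name are purely for illustration (these are top-level config keys), and only variables from the objective_function list are used:

    # Hypothetical expressions for demonstration only -- the coefficients are
    # placeholders, not recommended cost values.
    cost_and_objective_inputs = {
        "capital_cost_function": "1100 * system_capacity",
        "fixed_operating_cost_function": "26 * system_capacity",
        "variable_operating_cost_function": "0",
        "balance_of_system_cost_function": "75 * nn_conn_dist_m",
        # Minimize an LCOE-like ratio of annualized costs to annual energy:
        "objective_function": (
            "(fixed_charge_rate * (capital_cost + balance_of_system_cost)"
            " + fixed_operating_cost) / aep + variable_operating_cost"
        ),
    }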
- project_points : int | list | tuple | str | dict | pd.DataFrame | slice
Input specifying which sites to process. A single integer representing the supply curve GID of a site may be specified to evaluate reV at a supply curve point. A list or tuple of integers (or slice) representing the supply curve GIDs of multiple sites can be specified to evaluate reV at multiple specific locations. A string pointing to a project points CSV file may also be specified. Typically, the CSV contains the following columns:
  - gid: Integer specifying the supply curve GID of each site.
  - config: Key in the sam_files input dictionary (see below) corresponding to the SAM configuration to use for each particular site. This value can also be None (or left out completely) if you specify only a single SAM configuration file as the sam_files input.
The CSV file may also contain site-specific inputs by including a column named after a config keyword (e.g. a column called capital_cost may be included to specify a site-specific capital cost value for each location). Columns that do not correspond to a config key may also be included, but they will be ignored. The CSV file input can also have these extra, optional columns:
  - capital_cost_multiplier
  - fixed_operating_cost_multiplier
  - variable_operating_cost_multiplier
  - balance_of_system_cost_multiplier
These particular inputs are treated as multipliers to be applied to the respective cost curves (capital_cost_function, fixed_operating_cost_function, variable_operating_cost_function, and balance_of_system_cost_function) both during and after the optimization. A DataFrame following the same guidelines as the CSV input (or a dictionary that can be used to initialize such a DataFrame) may be used for this input as well. If you would like to obtain all available reV supply curve points to run, you can use the reV.supply_curve.extent.SupplyCurveExtent class like so:

    import pandas as pd
    from reV.supply_curve.extent import SupplyCurveExtent

    excl_fpath = "..."
    resolution = ...
    tm_dset = "..."
    with SupplyCurveExtent(excl_fpath, resolution) as sc:
        points = sc.valid_sc_points(tm_dset).tolist()
        points = pd.DataFrame({"gid": points})
        points["config"] = "default"  # or a list of config choices

    # Use the points directly or save them to csv for CLI usage
    points.to_csv("project_points.csv", index=False)
- sam_files : dict | str
A dictionary mapping SAM input configuration ID(s) to SAM configuration(s). Keys are the SAM config ID(s) which correspond to the config column in the project points CSV. Values for each key are either a path to a corresponding SAM config file or a full dictionary of SAM config inputs. For example:

    sam_files = {
        "default": "/path/to/default/sam.json",
        "onshore": "/path/to/onshore/sam_config.yaml",
        "offshore": {
            "sam_key_1": "sam_value_1",
            "sam_key_2": "sam_value_2",
            ...
        },
        ...
    }
This input can also be a string pointing to a single SAM config file. In this case, the config column of the CSV points input should be set to None or left out completely. See the documentation for the reV SAM class (e.g. reV.SAM.generation.WindPower, reV.SAM.generation.PvWattsv8, reV.SAM.generation.Geothermal, etc.) for info on the allowed and/or required SAM config file inputs.
- min_spacing : float | int | str, optional
Minimum spacing between turbines (in meters). This input can also be a string like "5x", which is interpreted as 5 times the turbine rotor diameter. By default, "5x".
- wake_loss_multiplier : float, optional
A multiplier used to scale the annual energy lost due to wake losses.
Warning
This multiplier will ONLY be applied during the optimization process and will NOT come through in output values such as the hourly profiles, aep, any of the cost functions, or even the output objective.
By default, 1.
- ga_kwargs : dict, optional
Dictionary of keyword arguments to pass to GA initialization. If None, default initialization values are used. See GeneticAlgorithm for a description of the allowed keyword arguments. By default, None.
- output_request : list | tuple, optional
Outputs requested from the SAM windpower simulation after the bespoke plant layout optimization. Can be any of the parameters in the “Outputs” group of the PySAM module PySAM.Windpower.Windpower.Outputs. This list can also include a select number of SAM config/resource parameters to include in the output: any key in any of the output attribute JSON files may be requested. Time-series profiles requested via this input are output in UTC. This input can also be used to request resource means like "ws_mean", "windspeed_mean", "temperature_mean", and "pressure_mean". By default, ('system_capacity', 'cf_mean').
- ws_bins : tuple, optional
A 3-entry tuple with (start, stop, step) for the windspeed binning of the wind joint probability distribution. The stop value is inclusive, so ws_bins=(0, 20, 5) would result in four bins with bin edges (0, 5, 10, 15, 20). By default, (0.0, 20.0, 5.0).
- wd_bins : tuple, optional
A 3-entry tuple with (start, stop, step) for the wind direction binning of the wind joint probability distribution. The stop value is inclusive, so wd_bins=(0, 360, 90) would result in four bins with bin edges (0, 90, 180, 270, 360). By default, (0.0, 360.0, 45.0).
- excl_dict : dict, optional
Dictionary of exclusion keyword arguments of the format {layer_dset_name: {kwarg: value}}, where layer_dset_name is a dataset in the exclusion h5 file and the kwarg: value pair is a keyword argument to the reV.supply_curve.exclusions.LayerMask class. For example:

    excl_dict = {
        "typical_exclusion": {
            "exclude_values": 255,
        },
        "another_exclusion": {
            "exclude_values": [2, 3],
            "weight": 0.5
        },
        "exclusion_with_nodata": {
            "exclude_range": [10, 100],
            "exclude_nodata": True,
            "nodata_value": -1
        },
        "partial_setback": {
            "use_as_weights": True
        },
        "height_limit": {
            "exclude_range": [0, 200]
        },
        "slope": {
            "include_range": [0, 20]
        },
        "developable_land": {
            "force_include_values": 42
        },
        "more_developable_land": {
            "force_include_range": [5, 10]
        },
        ...
    }

Note that all the keys given in this dictionary should be datasets of the excl_fpath file. If None or empty dictionary, no exclusions are applied. By default, None.
- area_filter_kernel : {“queen”, “rook”}, optional
Contiguous area filter method to use on final exclusions mask. The filters are defined as:
    # Queen:      # Rook:
    [[1,1,1],     [[0,1,0],
     [1,1,1],      [1,1,1],
     [1,1,1]]      [0,1,0]]
These filters define how neighboring pixels are “connected”. Once pixels in the final exclusion layer are connected, the area of each resulting cluster is computed and compared against the min_area input. Any cluster with an area less than min_area is excluded from the final mask. This argument has no effect if min_area is None. By default, "queen".
- min_area : float, optional
Minimum area (in km2) required to keep an isolated cluster of (included) land within the resulting exclusions mask. Any clusters of land with areas less than this value will be marked as exclusions. See the documentation for area_filter_kernel for an explanation of how the area of each land cluster is computed. If None, no area filtering is performed. By default, None.
- resolution : int, optional
Supply Curve resolution. This value defines how many pixels are in a single side of a supply curve cell. For example, a value of 64 would generate a supply curve where the side of each supply curve cell is 64x64 exclusion pixels. By default, 64.
- excl_area : float, optional
Area of a single exclusion mask pixel (in km2). If None, this value will be inferred from the profile transform attribute in excl_fpath. By default, None.
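For a sense of scale, here is a small arithmetic sketch of how resolution and excl_area relate to the size of a supply curve cell (the 90 m pixel size is an assumption typical of CONUS exclusion layers, not a requirement):

    # Assuming 90 m exclusion pixels (an assumption, not a requirement):
    excl_area = 0.09 * 0.09          # km2 per exclusion pixel = 0.0081
    resolution = 64                  # pixels along one side of a supply curve cell
    cell_area = excl_area * resolution ** 2
    print(f"{cell_area:.1f} km2 per supply curve cell")  # ~33.2 km2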
- data_layers : dict, optional
Dictionary of aggregation data layers of the format:

    data_layers = {
        "output_layer_name": {
            "dset": "layer_name",
            "method": "mean",
            "fpath": "/path/to/data.h5"
        },
        "another_output_layer_name": {
            "dset": "input_layer_name",
            "method": "mode",
            # optional "fpath" key omitted
        },
        ...
    }

The "output_layer_name" is the column name under which the aggregated data will appear in the meta DataFrame of the output file. The "output_layer_name" does not have to match the dset input value. The latter should match the layer name in the HDF5 from which the data to aggregate should be pulled. The method should be one of {"mode", "mean", "min", "max", "sum", "category"}, describing how the high-resolution data should be aggregated for each supply curve point. fpath is an optional key that can point to an HDF5 file containing the layer data. If left out, the data is assumed to exist in the file(s) specified by the excl_fpath input. If None, no data layer aggregation is performed. By default, None.
- pre_extract_inclusions : bool, optional
Optional flag to pre-extract/compute the inclusion mask from the excl_dict input. It is typically faster to compute the inclusion mask on the fly with parallel workers. By default, False.
- eos_mult_baseline_cap_mw : int | float, optional
Baseline plant capacity (MW) used to calculate the economies of scale (EOS) multiplier from the capital_cost_function. The EOS multiplier is calculated as the $-per-kW of the wind plant divided by the $-per-kW of a plant with this baseline capacity. By default, 200 (MW), which aligns the baseline with ATB assumptions. See here: https://tinyurl.com/y85hnu6h.
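As a worked illustration of that definition, the sketch below computes the EOS multiplier for a hypothetical nonlinear capital cost function (the cost function and capacities are made up, and plant capacity is assumed here to be expressed in kW):

    # Hypothetical capital_cost_function: "1100 * system_capacity ** 0.95"
    def cost_per_kw(capacity_kw):
        return 1100 * capacity_kw ** 0.95 / capacity_kw  # implied $/kW

    baseline_kw = 200 * 1000   # eos_mult_baseline_cap_mw = 200 MW
    plant_kw = 120 * 1000      # optimized plant capacity for one supply curve point
    eos_mult = cost_per_kw(plant_kw) / cost_per_kw(baseline_kw)
    print(round(eos_mult, 3))  # ~1.026 -> the smaller plant is ~2.6% pricier per kW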
- prior_run : str, optional
Optional filepath to a bespoke output HDF5 file belonging to a prior run. If specified, this module will only run the timeseries power generation step and assume that all of the wind plant layouts are fixed from the prior run. The meta data of this file must contain the following columns (automatically satisfied if the HDF5 file was generated by reV bespoke):
  - capacity: Capacity of the plant, in MW.
  - turbine_x_coords: A string representation of a python list containing the X coordinates (in m; origin of cell at bottom left) of the turbines within the plant (supply curve cell).
  - turbine_y_coords: A string representation of a python list containing the Y coordinates (in m; origin of cell at bottom left) of the turbines within the plant (supply curve cell).
If None, no previous run data is considered. By default, None.
- gid_map : str | dict, optional
Mapping of unique integer generation gids (keys) to single integer resource gids (values). This enables unique generation gids in the project points to map to non-unique resource gids, which can be useful when evaluating multiple resource datasets in reV (e.g., forecasted ECMWF resource data to complement historical WTK meteorology). This input can be a pre-extracted dictionary or a path to a JSON or CSV file. If this input points to a CSV file, the file must have the columns gid (which matches the project points) and gid_map (gids to extract from the resource input). If None, the GID values in the project points are assumed to match the resource GID values. By default, None.
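A minimal sketch of the dictionary form of this input (the GID values are made up for illustration):

    # Generation gids (keys) 0-2 all pull resource data from gids 154/155 (values):
    gid_map = {0: 154, 1: 154, 2: 155}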
- bias_correct : str | pd.DataFrame, optional
Optional DataFrame or CSV filepath to a wind or solar resource bias correction table. This has columns:
  - gid: GID of site (can be index name of dataframe)
  - method: function name from rex.bias_correction module
The gid field should match the true resource gid regardless of the optional gid_map input. Only windspeed or GHI + DNI + DHI are corrected, depending on the technology (wind for the former, PV or CSP for the latter). See the functions in the rex.bias_correction module for available inputs for method. Any additional kwargs required for the requested method can be input as additional columns in the bias_correct table, e.g., for linear bias correction functions you can include scalar and adder inputs as columns in the bias_correct table on a site-by-site basis. If None, no corrections are applied. By default, None.
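As a rough sketch, a linear windspeed bias correction table might be built like the example below. The method name lin_ws is assumed to be the linear windspeed correction function in rex.bias_correction (verify the available function names in that module), and the scalar/adder values are placeholders:

    import pandas as pd

    # One row per resource gid; assumed correction form: ws = scalar * ws + adder
    bias_correct = pd.DataFrame({
        "gid": [154, 155, 156],
        "method": ["lin_ws", "lin_ws", "lin_ws"],
        "scalar": [1.02, 0.98, 1.00],
        "adder": [0.1, -0.2, 0.0],
    })
    bias_correct.to_csv("bias_correct.csv", index=False)  # point the config at this file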
- pre_load_data : bool, optional
Option to pre-load resource data. This step can be time-consuming up front, but it drastically reduces the number of parallel reads to the res_fpath HDF5 file(s), and can have a significant overall speedup on systems with slow parallel I/O capabilities. Pre-loaded data can use a significant amount of RAM, so be sure to split execution across many nodes (e.g. 100 nodes, 36 workers each for CONUS) or request large amounts of memory for a smaller number of nodes. By default, False.
Note that you may remove any keys with a null value if you do not intend to update them yourself.