reV supply-curve-aggregation
Execute the supply-curve-aggregation step from a config file.
`reV` supply curve aggregation combines a high-resolution (e.g. 90m) exclusion dataset with a (typically) lower-resolution (e.g. 2km) generation dataset by mapping all data onto the high-resolution grid and aggregating it by a large factor (e.g. 64 or 128). The result is coarsely-gridded data that summarizes capacity and generation potential as well as associated economics under a particular land access scenario. This module can also summarize extra data layers during the aggregation process, allowing for complementary land characterization analysis.
The general structure for calling this CLI command is given below (add `--help` to print help info to the terminal).

```
reV supply-curve-aggregation [OPTIONS]
```
Options
- `-c, --config_file <config_file>`

  Required. Path to the supply-curve-aggregation configuration file. Below is a sample template config in JSON format:

```json
{
    "execution_control": {
        "option": "local",
        "allocation": "[REQUIRED IF ON HPC]",
        "walltime": "[REQUIRED IF ON HPC]",
        "qos": "normal",
        "memory": null,
        "queue": null,
        "feature": null,
        "conda_env": null,
        "module": null,
        "sh_script": null,
        "num_test_nodes": null,
        "max_workers": null,
        "sites_per_worker": 100
    },
    "log_directory": "./logs",
    "log_level": "INFO",
    "excl_fpath": "[REQUIRED]",
    "tm_dset": "[REQUIRED]",
    "econ_fpath": null,
    "excl_dict": null,
    "area_filter_kernel": "queen",
    "min_area": null,
    "resolution": 64,
    "excl_area": null,
    "res_fpath": null,
    "gids": null,
    "pre_extract_inclusions": false,
    "res_class_dset": null,
    "res_class_bins": null,
    "cf_dset": "cf_mean-means",
    "lcoe_dset": "lcoe_fcr-means",
    "h5_dsets": null,
    "data_layers": null,
    "power_density": null,
    "friction_fpath": null,
    "friction_dset": null,
    "cap_cost_scale": null,
    "recalc_lcoe": true,
    "gen_fpath": null,
    "args": null
}
```
The same sample template in YAML format:

```yaml
execution_control:
    option: local
    allocation: '[REQUIRED IF ON HPC]'
    walltime: '[REQUIRED IF ON HPC]'
    qos: normal
    memory: null
    queue: null
    feature: null
    conda_env: null
    module: null
    sh_script: null
    num_test_nodes: null
    max_workers: null
    sites_per_worker: 100
log_directory: ./logs
log_level: INFO
excl_fpath: '[REQUIRED]'
tm_dset: '[REQUIRED]'
econ_fpath: null
excl_dict: null
area_filter_kernel: queen
min_area: null
resolution: 64
excl_area: null
res_fpath: null
gids: null
pre_extract_inclusions: false
res_class_dset: null
res_class_bins: null
cf_dset: cf_mean-means
lcoe_dset: lcoe_fcr-means
h5_dsets: null
data_layers: null
power_density: null
friction_fpath: null
friction_dset: null
cap_cost_scale: null
recalc_lcoe: true
gen_fpath: null
args: null
```
The same sample template in TOML format (keys with a null value are omitted, since TOML has no null type):

```toml
log_directory = "./logs"
log_level = "INFO"
excl_fpath = "[REQUIRED]"
tm_dset = "[REQUIRED]"
area_filter_kernel = "queen"
resolution = 64
pre_extract_inclusions = false
cf_dset = "cf_mean-means"
lcoe_dset = "lcoe_fcr-means"
recalc_lcoe = true

[execution_control]
option = "local"
allocation = "[REQUIRED IF ON HPC]"
walltime = "[REQUIRED IF ON HPC]"
qos = "normal"
sites_per_worker = 100
```
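Once the config keys are filled in, the step is executed by pointing the CLI at the file. For example, assuming the template above was saved as `config_agg.json` (a hypothetical filename; the YAML and TOML variants work the same way):

```
reV supply-curve-aggregation -c config_agg.json
```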
Parameters
- execution_control : dict

  Dictionary containing execution control arguments. Allowed arguments are:
  - option:
    ({‘local’, ‘kestrel’, ‘eagle’, ‘awspc’, ‘slurm’, ‘peregrine’}) Hardware run option. Determines the type of job scheduler to use as well as the base AU cost. The “slurm” option is a catchall for HPC systems that use the SLURM scheduler and should only be used if the desired hardware is not listed above. If “local”, no other HPC-specific keys are required in execution_control (they are ignored if provided).
  - allocation:
    (str) HPC project (allocation) handle.
  - walltime:
    (int) Node walltime request in hours.
  - qos:
    (str, optional) Quality-of-service specifier. For Kestrel users: This should be one of {‘standby’, ‘normal’, ‘high’}. Note that ‘high’ priority doubles the AU cost. By default, `"normal"`.
  - memory:
    (int, optional) Node memory max limit (in GB). By default, `None`, which uses the scheduler’s default memory limit. For Kestrel users: If you would like to use the full node memory, leave this argument unspecified (or set to `None`) if you are running on standard nodes. However, if you would like to use the bigmem nodes, you must specify the full upper limit of memory you would like for your job, otherwise you will be limited to the standard node memory size (250GB).
  - max_workers:
    (int, optional) Number of cores to run summary on. `None` is all available CPUs. By default, `None`.
  - sites_per_worker:
    (int, optional) Number of sc_points to summarize on each worker. By default, `100`.
  - queue:
    (str, optional; PBS ONLY) HPC queue to submit job to. Examples include: ‘debug’, ‘short’, ‘batch’, ‘batch-h’, ‘long’, etc. By default, `None`, which uses “test_queue”.
  - feature:
    (str, optional) Additional flags for SLURM job (e.g. “-p debug”). By default, `None`, which does not specify any additional flags.
  - conda_env:
    (str, optional) Name of conda environment to activate. By default, `None`, which does not load any environments.
  - module:
    (str, optional) Module to load. By default, `None`, which does not load any modules.
  - sh_script:
    (str, optional) Extra shell script to run before command call. By default, `None`, which does not run any scripts.
  - num_test_nodes:
    (str, optional) Number of nodes to submit before terminating the submission process. This can be used to test a new submission configuration without submitting all nodes (i.e. only running a handful to ensure the inputs are specified correctly and the outputs look reasonable). By default, `None`, which submits all node jobs.
Only the option key is required for local execution. For execution on the HPC, the allocation and walltime keys are also required. All other options are populated with default values, as seen above.
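For example, a minimal `execution_control` block for an HPC run might look like the following sketch (the allocation handle and walltime are hypothetical placeholders):

```json
"execution_control": {
    "option": "kestrel",
    "allocation": "myproject",
    "walltime": 4
}
```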
- log_directory : str

  Path to directory where logs should be written. Path can be relative and does not have to exist on disk (it will be created if missing). By default, `"./logs"`.

- log_level : {"DEBUG", "INFO", "WARNING", "ERROR"}

  String representation of desired logger verbosity. Suitable options are `DEBUG` (most verbose), `INFO` (moderately verbose), `WARNING` (only log warnings and errors), and `ERROR` (only log errors). By default, `"INFO"`.

- excl_fpath : str | list | tuple

  Filepath to exclusions data HDF5 file. The exclusions HDF5 file should contain the layers specified in excl_dict and data_layers. These layers may also be spread out across multiple HDF5 files, in which case this input should be a list or tuple of filepaths pointing to the files containing the layers. Note that each data layer must be uniquely defined (i.e. only appear once and in a single input file).
- tm_dset : str

  Dataset name in the excl_fpath file containing the techmap (exclusions-to-resource mapping data). This data layer links the supply curve GIDs to the generation GIDs that are used to evaluate performance metrics such as `mean_cf`.

  Important: This dataset uniquely couples the (typically high-resolution) exclusion layers to the (typically lower-resolution) resource data. Therefore, a separate techmap must be used for every unique combination of resource and exclusion coordinates.

  Note: If executing `reV` from the command line, you can specify a name that is not in the exclusions HDF5 file, and `reV` will calculate the techmap for you. Note, however, that computing the techmap and writing it to the exclusion HDF5 file is a blocking operation, so you may only run a single `reV` aggregation step at a time this way.

- econ_fpath : str, optional

  Filepath to HDF5 file with `reV` econ output results containing an lcoe_dset dataset. If `None`, lcoe_dset should be a dataset in the gen_fpath HDF5 file that aggregation is executed on.

  Note: If executing `reV` from the command line, this input can be set to `"PIPELINE"` to parse the output from one of these preceding pipeline steps: `multi-year`, `collect`, or `generation`. However, note that duplicate executions of any of these commands within the pipeline may invalidate this parsing, meaning the econ_fpath input will have to be specified manually.

  By default, `None`.
- excl_dict : dict | None

  Dictionary of exclusion keyword arguments of the format `{layer_dset_name: {kwarg: value}}`, where `layer_dset_name` is a dataset in the exclusion h5 file and the `kwarg: value` pair is a keyword argument to the `reV.supply_curve.exclusions.LayerMask` class. For example:

```python
excl_dict = {
    "typical_exclusion": {
        "exclude_values": 255,
    },
    "another_exclusion": {
        "exclude_values": [2, 3],
        "weight": 0.5
    },
    "exclusion_with_nodata": {
        "exclude_range": [10, 100],
        "exclude_nodata": True,
        "nodata_value": -1
    },
    "partial_setback": {
        "use_as_weights": True
    },
    "height_limit": {
        "exclude_range": [0, 200]
    },
    "slope": {
        "include_range": [0, 20]
    },
    "developable_land": {
        "force_include_values": 42
    },
    "more_developable_land": {
        "force_include_range": [5, 10]
    },
    "viewsheds": {
        "exclude_values": 1,
        "extent": {
            "layer": "federal_parks",
            "include_range": [1, 5]
        }
    }
    ...
}
```

  Note that all the keys given in this dictionary should be datasets of the excl_fpath file. If `None` or an empty dictionary, no exclusions are applied. By default, `None`.
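In a JSON config file, the same structure is written in JSON syntax (lowercase booleans, no trailing `...`). A minimal sketch, where "urban_areas" is a hypothetical layer name:

```json
"excl_dict": {
    "slope": {"include_range": [0, 20]},
    "urban_areas": {"exclude_values": 1}
}
```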
- area_filter_kernel : {“queen”, “rook”}, optional

  Contiguous area filter method to use on final exclusions mask. The filters are defined as:

```
# Queen:     # Rook:
[[1,1,1],    [[0,1,0],
 [1,1,1],     [1,1,1],
 [1,1,1]]     [0,1,0]]
```

  These filters define how neighboring pixels are “connected”. Once pixels in the final exclusion layer are connected, the area of each resulting cluster is computed and compared against the min_area input. Any cluster with an area less than min_area is excluded from the final mask. This argument has no effect if min_area is `None`. By default, `"queen"`.
- min_area : float, optional

  Minimum area (in km²) required to keep an isolated cluster of (included) land within the resulting exclusions mask. Any clusters of land with areas less than this value will be marked as exclusions. See the documentation for area_filter_kernel for an explanation of how the area of each land cluster is computed. If `None`, no area filtering is performed. By default, `None`.
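For instance, to connect pixels with the rook filter and drop included clusters smaller than half a square kilometer, the config could contain (values are hypothetical):

```json
"area_filter_kernel": "rook",
"min_area": 0.5
```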
- resolution : int, optional

  Supply Curve resolution. This value defines how many pixels are in a single side of a supply curve cell. For example, a value of `64` would generate a supply curve where the side of each supply curve cell is `64x64` exclusion pixels. With 90m exclusion pixels, that works out to supply curve cells roughly 5.76 km on a side. By default, `64`.
- excl_area : float, optional

  Area of a single exclusion mask pixel (in km²). If `None`, this value will be inferred from the profile transform attribute in excl_fpath. By default, `None`.
- res_fpath : str, optional

  Filepath to HDF5 resource file (e.g. WTK or NSRDB). This input is required if the techmap dset is to be created or if the `gen_fpath` input to the `summarize` or `run` methods is `None`. By default, `None`.
- gids : list, optional

  List of supply curve point gids to get summary for. If you would like to obtain all available `reV` supply curve points to run, you can use the `reV.supply_curve.extent.SupplyCurveExtent` class like so:

```python
import pandas as pd
from reV.supply_curve.extent import SupplyCurveExtent

excl_fpath = "..."
resolution = ...
tm_dset = "..."
with SupplyCurveExtent(excl_fpath, resolution) as sc:
    gids = sc.valid_sc_points(tm_dset).tolist()
...
```

  If `None`, supply curve aggregation is computed for all gids in the supply curve extent. By default, `None`.
- pre_extract_inclusions : bool, optional

  Optional flag to pre-extract/compute the inclusion mask from the excl_dict input. It is typically faster to compute the inclusion mask on the fly with parallel workers. By default, `False`.
- res_class_dset : str, optional

  Name of dataset in the `reV` generation HDF5 output file containing resource data. If `None`, no aggregated resource classification is performed (i.e. no `mean_res` output), and the res_class_bins input is ignored. By default, `None`.
- res_class_bins : list, optional

  Optional input to perform separate aggregations for various resource data ranges. If `None`, only a single aggregation per supply curve point is performed. Otherwise, this input should be a list of floats or ints representing the resource bin boundaries. One aggregation per resource value range is computed, and only pixels within the given resource range are aggregated. By default, `None`.
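As a sketch, binning wind sites by mean wind speed might look like the following (the dataset name and bin edges are hypothetical):

```json
"res_class_dset": "ws_mean-means",
"res_class_bins": [0, 6, 7, 8, 20]
```

This would produce four separate aggregations per supply curve point, one for each wind speed range.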
- cf_dset : str, optional

  Dataset name from the `reV` generation HDF5 output file containing a 1D dataset of mean capacity factor values. This dataset will be mapped onto the high-resolution grid and used to compute the mean capacity factor for non-excluded area. By default, `"cf_mean-means"`.
- lcoe_dset : str, optional

  Dataset name from the `reV` generation HDF5 output file containing a 1D dataset of mean LCOE values. This dataset will be mapped onto the high-resolution grid and used to compute the mean LCOE for non-excluded area, but only if the LCOE is not re-computed during processing (see the recalc_lcoe input for more info). By default, `"lcoe_fcr-means"`.
- h5_dsets : list, optional

  Optional list of additional datasets from the `reV` generation/econ HDF5 output file to aggregate. If `None`, no extra datasets are aggregated.

  Warning: This input is meant for passing through 1D datasets. If you specify a 2D or higher-dimensional dataset, you may run into memory errors. If you wish to aggregate 2D datasets, see the rep-profiles module.

  By default, `None`.
- data_layers : dict, optional

  Dictionary of aggregation data layers of the format:

```python
data_layers = {
    "output_layer_name": {
        "dset": "layer_name",
        "method": "mean",
        "fpath": "/path/to/data.h5"
    },
    "another_output_layer_name": {
        "dset": "input_layer_name",
        "method": "mode",
        # optional "fpath" key omitted
    },
    ...
}
```

  The `"output_layer_name"` is the column name under which the aggregated data will appear in the output CSV file. The `"output_layer_name"` does not have to match the `dset` input value; the latter should match the layer name in the HDF5 file from which the data to aggregate should be pulled. The `method` should be one of `{"mode", "mean", "min", "max", "sum", "category"}`, describing how the high-resolution data should be aggregated for each supply curve point. `fpath` is an optional key that can point to an HDF5 file containing the layer data. If left out, the data is assumed to exist in the file(s) specified by the excl_fpath input. If `None`, no data layer aggregation is performed. By default, `None`.
- power_density : float | str, optional

  Power density value (in MW/km²) or filepath to a variable power density CSV file containing the following columns:

  - `gid` : resource gid (typically wtk or nsrdb gid)
  - `power_density` : power density value (in MW/km²)

  If `None`, a constant power density is inferred from the generation meta data technology. By default, `None`.
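A variable power density file is just a two-column CSV; a minimal sketch with hypothetical gids and values:

```
gid,power_density
10234,3.0
10235,2.7
```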
- friction_fpath : str, optional

  Filepath to friction surface data (cost-based exclusions). Must be paired with the friction_dset input below. The friction data must be the same shape as the exclusions. Friction input creates a new output column `"mean_lcoe_friction"`, which is the nominal LCOE multiplied by the friction data. If `None`, no friction data is aggregated. By default, `None`.
- friction_dset : str, optional

  Dataset name in friction_fpath for the friction surface data. Must be paired with the friction_fpath input above. If `None`, no friction data is aggregated. By default, `None`.
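The two friction inputs always travel together; a sketch with hypothetical file path and dataset names:

```json
"friction_fpath": "/path/to/friction.h5",
"friction_dset": "friction_surface"
```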
- cap_cost_scale : str, optional

  Optional LCOE scaling equation to implement “economies of scale”. Equations must be in python string format and must return a scalar value to multiply the capital cost by. Independent variables in the equation should match the names of the columns in the `reV` supply curve aggregation output table (see the documentation of `SupplyCurveAggregation` for details on available outputs). If `None`, no economies of scale are applied. By default, `None`.
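For example, an equation that discounts capital cost as plant capacity grows might look like the following sketch (the functional form and coefficients are hypothetical; `capacity` is assumed here to be a column in the aggregation output table):

```json
"cap_cost_scale": "2 * capacity ** -0.3"
```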
- recalc_lcoe : bool, optional

  Flag to re-calculate the LCOE from the multi-year mean capacity factor and annual energy production data. This requires several datasets to be aggregated in the h5_dsets input:

  - `system_capacity`
  - `fixed_charge_rate`
  - `capital_cost`
  - `fixed_operating_cost`
  - `variable_operating_cost`

  If any of these datasets are missing from the `reV` generation HDF5 output, or if recalc_lcoe is set to `False`, the mean LCOE will be computed from the data stored under the lcoe_dset instead. By default, `True`.
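To enable the recalculation, the required datasets can be passed through the h5_dsets input described above; a sketch:

```json
"h5_dsets": [
    "system_capacity",
    "fixed_charge_rate",
    "capital_cost",
    "fixed_operating_cost",
    "variable_operating_cost"
],
"recalc_lcoe": true
```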
- gen_fpath : str, optional

  Filepath to HDF5 file with `reV` generation output results. If `None`, a simple aggregation without any generation, resource, or cost data is performed.

  Note: If executing `reV` from the command line, this input can be set to `"PIPELINE"` to parse the output from one of these preceding pipeline steps: `multi-year`, `collect`, or `econ`. However, note that duplicate executions of any of these commands within the pipeline may invalidate this parsing, meaning the gen_fpath input will have to be specified manually.

  By default, `None`.
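When running as part of a pipeline, the typical pattern noted above is simply:

```json
"gen_fpath": "PIPELINE"
```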
- args : tuple | list, optional

  List of columns to include in the summary output table. `None` defaults to all available args defined in the `SupplyCurveAggregation` documentation. By default, `None`.
Note that you may remove any keys with a `null` value if you do not intend to update them yourself.