reV.supply_curve.sc_aggregation.SupplyCurveAggregation
- class SupplyCurveAggregation(excl_fpath, tm_dset, econ_fpath=None, excl_dict=None, area_filter_kernel='queen', min_area=None, resolution=64, excl_area=None, res_fpath=None, gids=None, pre_extract_inclusions=False, res_class_dset=None, res_class_bins=None, cf_dset='cf_mean-means', lcoe_dset='lcoe_fcr-means', h5_dsets=None, data_layers=None, power_density=None, friction_fpath=None, friction_dset=None, cap_cost_scale=None, recalc_lcoe=True)
Bases: BaseAggregation
ReV supply curve points aggregation framework.
reV supply curve aggregation combines a high-resolution (e.g. 90m) exclusion dataset with a (typically) lower resolution (e.g. 2km) generation dataset by mapping all data onto the high-resolution grid and aggregating it by a large factor (e.g. 64 or 128). The result is coarsely-gridded data that summarizes capacity and generation potential as well as associated economics under a particular land access scenario. This module can also summarize extra data layers during the aggregation process, allowing for complementary land characterization analysis.
- Parameters:
excl_fpath (str | list | tuple) – Filepath to exclusions data HDF5 file. The exclusions HDF5 file should contain the layers specified in excl_dict and data_layers. These layers may also be spread out across multiple HDF5 files, in which case this input should be a list or tuple of filepaths pointing to the files containing the layers. Note that each data layer must be uniquely defined (i.e. only appear once and in a single input file).
tm_dset (str) – Dataset name in the excl_fpath file containing the techmap (exclusions-to-resource mapping data). This data layer links the supply curve GIDs to the generation GIDs that are used to evaluate performance metrics such as mean_cf.
Important
This dataset uniquely couples the (typically high-resolution) exclusion layers to the (typically lower-resolution) resource data. Therefore, a separate techmap must be used for every unique combination of resource and exclusion coordinates.
Note
If executing reV from the command line, you can specify a name that is not in the exclusions HDF5 file, and reV will calculate the techmap for you. Note however that computing the techmap and writing it to the exclusion HDF5 file is a blocking operation, so you may only run a single reV aggregation step at a time this way.
econ_fpath (str, optional) – Filepath to HDF5 file with reV econ output results containing an lcoe_dset dataset. If None, lcoe_dset should be a dataset in the gen_fpath HDF5 file that aggregation is executed on.
Note
If executing reV from the command line, this input can be set to "PIPELINE" to parse the output from one of these preceding pipeline steps: multi-year, collect, or generation. However, note that duplicate executions of any of these commands within the pipeline may invalidate this parsing, meaning the econ_fpath input will have to be specified manually.
By default, None.
excl_dict (dict | None) – Dictionary of exclusion keyword arguments of the format
{layer_dset_name: {kwarg: value}}, where layer_dset_name is a dataset in the exclusion h5 file and the kwarg: value pair is a keyword argument to the reV.supply_curve.exclusions.LayerMask class. For example:
    excl_dict = {
        "typical_exclusion": {
            "exclude_values": 255,
        },
        "another_exclusion": {
            "exclude_values": [2, 3],
            "weight": 0.5
        },
        "exclusion_with_nodata": {
            "exclude_range": [10, 100],
            "exclude_nodata": True,
            "nodata_value": -1
        },
        "partial_setback": {
            "use_as_weights": True
        },
        "height_limit": {
            "exclude_range": [0, 200]
        },
        "slope": {
            "include_range": [0, 20]
        },
        "developable_land": {
            "force_include_values": 42
        },
        "more_developable_land": {
            "force_include_range": [5, 10]
        },
        "viewsheds": {
            "exclude_values": 1,
            "extent": {
                "layer": "federal_parks",
                "include_range": [1, 5]
            }
        }
        ...
    }
Note that all the keys given in this dictionary should be datasets of the excl_fpath file. If None or an empty dictionary, no exclusions are applied. By default, None.
area_filter_kernel ({"queen", "rook"}, optional) – Contiguous area filter method to use on final exclusions mask. The filters are defined as:
    # Queen:     # Rook:
    [[1,1,1],    [[0,1,0],
     [1,1,1],     [1,1,1],
     [1,1,1]]     [0,1,0]]
These filters define how neighboring pixels are "connected". Once pixels in the final exclusion layer are connected, the area of each resulting cluster is computed and compared against the min_area input. Any cluster with an area less than min_area is excluded from the final mask. This argument has no effect if min_area is None. By default, "queen".
min_area (float, optional) – Minimum area (in km2) required to keep an isolated cluster of (included) land within the resulting exclusions mask. Any clusters of land with areas less than this value will be marked as exclusions. See the documentation for area_filter_kernel for an explanation of how the area of each land cluster is computed. If None, no area filtering is performed. By default, None.
resolution (int, optional) – Supply Curve resolution. This value defines how many pixels are in a single side of a supply curve cell. For example, a value of 64 would generate a supply curve where each supply curve cell is 64x64 exclusion pixels. By default, 64.
excl_area (float, optional) – Area of a single exclusion mask pixel (in km2). If None, this value will be inferred from the profile transform attribute in excl_fpath. By default, None.
res_fpath (str, optional) – Filepath to HDF5 resource file (e.g. WTK or NSRDB). This input is required if techmap dset is to be created or if the gen_fpath input to the summarize or run methods is None. By default, None.
gids (list, optional) – List of supply curve point gids to get summary for. If you would like to obtain all available reV supply curve points to run, you can use the reV.supply_curve.extent.SupplyCurveExtent class like so:
    import pandas as pd
    from reV.supply_curve.extent import SupplyCurveExtent

    excl_fpath = "..."
    resolution = ...
    tm_dset = "..."
    with SupplyCurveExtent(excl_fpath, resolution) as sc:
        gids = sc.valid_sc_points(tm_dset).tolist()
    ...
If None, supply curve aggregation is computed for all gids in the supply curve extent. By default, None.
pre_extract_inclusions (bool, optional) – Optional flag to pre-extract/compute the inclusion mask from the excl_dict input. It is typically faster to compute the inclusion mask on the fly with parallel workers. By default, False.
res_class_dset (str, optional) – Name of dataset in the reV generation HDF5 output file containing resource data. If None, no aggregated resource classification is performed (i.e. no mean_res output), and the res_class_bins is ignored. By default, None.
res_class_bins (list, optional) – Optional input to perform separate aggregations for various resource data ranges. If None, only a single aggregation per supply curve point is performed. Otherwise, this input should be a list of floats or ints representing the resource bin boundaries. One aggregation per resource value range is computed, and only pixels within the given resource range are aggregated. By default, None.
cf_dset (str, optional) – Dataset name from the reV generation HDF5 output file containing a 1D dataset of mean capacity factor values. This dataset will be mapped onto the high-resolution grid and used to compute the mean capacity factor for non-excluded area. By default, "cf_mean-means".
lcoe_dset (str, optional) – Dataset name from the reV generation HDF5 output file containing a 1D dataset of mean LCOE values. This dataset will be mapped onto the high-resolution grid and used to compute the mean LCOE for non-excluded area, but only if the LCOE is not re-computed during processing (see the recalc_lcoe input for more info). By default, "lcoe_fcr-means".
h5_dsets (list, optional) – Optional list of additional datasets from the reV generation/econ HDF5 output file to aggregate. If None, no extra datasets are aggregated.
Warning
This input is meant for passing through 1D datasets. If you specify a 2D or higher-dimensional dataset, you may run into memory errors. If you wish to aggregate 2D datasets, see the rep-profiles module.
By default, None.
data_layers (dict, optional) – Dictionary of aggregation data layers of the format:
    data_layers = {
        "output_layer_name": {
            "dset": "layer_name",
            "method": "mean",
            "fpath": "/path/to/data.h5"
        },
        "another_output_layer_name": {
            "dset": "input_layer_name",
            "method": "mode",
            # optional "fpath" key omitted
        },
        ...
    }
The "output_layer_name" is the column name under which the aggregated data will appear in the output CSV file. The "output_layer_name" does not have to match the dset input value. The latter should match the layer name in the HDF5 from which the data to aggregate should be pulled. The method should be one of {"mode", "mean", "min", "max", "sum", "category"}, describing how the high-resolution data should be aggregated for each supply curve point. fpath is an optional key that can point to an HDF5 file containing the layer data. If left out, the data is assumed to exist in the file(s) specified by the excl_fpath input. If None, no data layer aggregation is performed. By default, None.
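A data_layers dictionary can be sanity-checked before a run. The sketch below is a hypothetical helper, not part of reV: check_data_layers and the example layer names are assumptions made for illustration, and the only rules it encodes are the ones stated above (a dset key and a method drawn from the listed set, with fpath optional).

```python
# Hypothetical helper (not part of reV): sanity-check a data_layers dict
# against the rules described in the documentation above.
VALID_METHODS = {"mode", "mean", "min", "max", "sum", "category"}


def check_data_layers(data_layers):
    """Return a list of human-readable problems found in ``data_layers``."""
    problems = []
    for out_name, spec in data_layers.items():
        if "dset" not in spec:
            problems.append(f"{out_name}: missing required 'dset' key")
        method = spec.get("method")
        if method not in VALID_METHODS:
            problems.append(f"{out_name}: unknown method {method!r}")
        # "fpath" is optional, so its absence is not reported.
    return problems


# Example layer names are made up for illustration.
data_layers = {
    "mean_slope": {"dset": "slope", "method": "mean"},
    "land_cover": {"dset": "nlcd", "method": "median"},  # not a listed method
}
problems = check_data_layers(data_layers)
print(problems)
```

Running a check like this up front may surface a misspelled method name before any HDF5 file is opened.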
power_density (float | str, optional) – Power density value (in MW/km2) or filepath to variable power density CSV file containing the following columns:
- gid: resource gid (typically wtk or nsrdb gid)
- power_density: power density value (in MW/km2)
If None, a constant power density is inferred from the generation meta data technology. By default, None.
friction_fpath (str, optional) – Filepath to friction surface data (cost based exclusions). Must be paired with the friction_dset input below. The friction data must be the same shape as the exclusions. Friction input creates a new output column "mean_lcoe_friction" which is the nominal LCOE multiplied by the friction data. If None, no friction data is aggregated. By default, None.
friction_dset (str, optional) – Dataset name in friction_fpath for the friction surface data. Must be paired with the friction_fpath above. If None, no friction data is aggregated. By default, None.
cap_cost_scale (str, optional) – Optional LCOE scaling equation to implement "economies of scale". Equations must be in python string format and must return a scalar value to multiply the capital cost by. Independent variables in the equation should match the names of the columns in the reV supply curve aggregation output table (see the documentation of SupplyCurveAggregation for details on available outputs). If None, no economies of scale are applied. By default, None.
recalc_lcoe (bool, optional) – Flag to re-calculate the LCOE from the multi-year mean capacity factor and annual energy production data. This requires several datasets to be aggregated in the h5_dsets input:
- system_capacity
- fixed_charge_rate
- capital_cost
- fixed_operating_cost
- variable_operating_cost
If any of these datasets are missing from the reV generation HDF5 output, or if recalc_lcoe is set to False, the mean LCOE will be computed from the data stored under the lcoe_dset instead. By default, True.
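The contiguous-area filter controlled by the area_filter_kernel and min_area inputs above can be illustrated with a small pure-Python sketch. This is not the reV implementation: the area_filter function is a hypothetical stand-in, and it assumes a cluster's area is simply its pixel count times excl_area. It only shows how "queen" (8-neighbor) and "rook" (4-neighbor) connectivity can change which pixels survive.

```python
# Illustrative only: a pure-Python stand-in for a contiguous-area filter.
# Assumption: cluster area = number of included pixels * excl_area.
QUEEN = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
ROOK = [(-1, 0), (0, -1), (0, 1), (1, 0)]


def area_filter(mask, min_area, excl_area, kernel="queen"):
    """Zero out clusters of included pixels smaller than ``min_area``."""
    offsets = QUEEN if kernel == "queen" else ROOK
    n_rows, n_cols = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    seen = set()
    for r in range(n_rows):
        for c in range(n_cols):
            if mask[r][c] == 0 or (r, c) in seen:
                continue
            # Flood-fill one cluster of connected, included pixels.
            cluster, stack = [], [(r, c)]
            seen.add((r, c))
            while stack:
                i, j = stack.pop()
                cluster.append((i, j))
                for di, dj in offsets:
                    ni, nj = i + di, j + dj
                    inside = 0 <= ni < n_rows and 0 <= nj < n_cols
                    if inside and mask[ni][nj] != 0 and (ni, nj) not in seen:
                        seen.add((ni, nj))
                        stack.append((ni, nj))
            # Exclude the whole cluster if it is too small.
            if len(cluster) * excl_area < min_area:
                for i, j in cluster:
                    out[i][j] = 0
    return out


mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
queen = area_filter(mask, min_area=1.5, excl_area=1.0, kernel="queen")
rook = area_filter(mask, min_area=1.5, excl_area=1.0, kernel="rook")
```

In this toy mask the two diagonal pixels in the lower right form one two-pixel cluster under "queen" connectivity and are kept, while under "rook" connectivity they are two isolated single pixels and are both removed.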
Examples
Standard outputs:
- sc_gid : int
Unique supply curve gid. This is the enumerated supply curve points, which can have overlapping geographic locations due to different resource bins at the same geographic SC point.
- res_gids : list
Stringified list of resource gids (e.g. original WTK or NSRDB resource GIDs) corresponding to each SC point.
- gen_gids : list
Stringified list of generation gids (e.g. GID in the reV generation output, which corresponds to the reV project points and not necessarily the resource GIDs).
- gid_counts : list
Stringified list of the sum of inclusion scalar values corresponding to each gen_gid and res_gid, where 1 is included, 0 is excluded, and 0.7 is included with 70 percent of available land. Each entry in this list is associated with the corresponding entry in the gen_gids and res_gids lists.
- n_gids : int
Total number of included pixels. This is a boolean sum and considers partial inclusions to be included (e.g. 1).
- mean_cf : float
Mean capacity factor of each supply curve point (the arithmetic mean is weighted by the inclusion layer) (unitless).
- mean_lcoe : float
Mean LCOE of each supply curve point (the arithmetic mean is weighted by the inclusion layer). Units match the reV econ output ($/MWh). By default, the LCOE is re-calculated using the multi-year mean capacity factor and annual energy production. This requires several datasets to be aggregated in the h5_dsets input: fixed_charge_rate, capital_cost, fixed_operating_cost, annual_energy_production, and variable_operating_cost. This recalc behavior can be disabled by setting recalc_lcoe=False.
- mean_res : float
Mean resource. The resource dataset to average is provided by the user in res_class_dset. The arithmetic mean is weighted by the inclusion layer.
- capacity : float
Total capacity of each supply curve point (MW). Units are contingent on the power_density input units of MW/km2.
- area_sq_km : float
Total included area for each supply curve point in km2. This is based on the nominal area of each exclusion pixel which by default is calculated from the exclusion profile attributes. The NREL reV default is 0.0081 km2 pixels (90m x 90m). The area sum considers partial inclusions.
- latitude : float
Supply curve point centroid latitude coordinate, in degrees (does not consider exclusions).
- longitude : float
Supply curve point centroid longitude coordinate, in degrees (does not consider exclusions).
- country : str
Country of the supply curve point based on the most common country of the associated resource meta data. Does not consider exclusions.
- state : str
State of the supply curve point based on the most common state of the associated resource meta data. Does not consider exclusions.
- county : str
County of the supply curve point based on the most common county of the associated resource meta data. Does not consider exclusions.
- elevation : float
Mean elevation of the supply curve point based on the mean elevation of the associated resource meta data. Does not consider exclusions.
- timezone : int
UTC offset of local timezone based on the most common timezone of the associated resource meta data. Does not consider exclusions.
- sc_point_gid : int
Spatially deterministic supply curve point gid. Duplicate sc_point_gid values can exist due to resource binning.
- sc_row_ind : int
Row index of the supply curve point in the aggregated exclusion grid.
- sc_col_ind : int
Column index of the supply curve point in the aggregated exclusion grid.
- res_class : int
Resource class for the supply curve gid. Each geographic supply curve point (sc_point_gid) can have multiple resource classes associated with it, resulting in multiple supply curve gids (sc_gid) associated with the same spatially deterministic supply curve point.
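The LCOE re-calculation mentioned for the mean_lcoe output can be pictured with the standard fixed-charge-rate formula. The sketch below is an assumption-laden illustration rather than the reV code: the function name lcoe_fcr, the unit choices (costs in $, energy in MWh per year, variable cost in $/MWh), the 8760-hour year, and all input values are made up for the example.

```python
# Illustrative fixed-charge-rate LCOE calculation (not the reV source).
# Assumed units: costs in $, energy in MWh/yr, variable cost in $/MWh.
def lcoe_fcr(fixed_charge_rate, capital_cost, fixed_operating_cost,
             annual_energy_mwh, variable_operating_cost):
    """Return the LCOE in $/MWh."""
    annualized = fixed_charge_rate * capital_cost + fixed_operating_cost
    return annualized / annual_energy_mwh + variable_operating_cost


# Made-up inputs for a single supply curve point.
system_capacity_mw = 100.0
mean_cf = 0.4
# Annual energy from capacity and mean capacity factor (8760 hours/year).
aep_mwh = system_capacity_mw * mean_cf * 8760

lcoe = lcoe_fcr(
    fixed_charge_rate=0.08,
    capital_cost=150_000_000.0,
    fixed_operating_cost=4_000_000.0,
    annual_energy_mwh=aep_mwh,
    variable_operating_cost=0.0,
)
print(round(lcoe, 2))
```

Because the energy term comes from the aggregated mean capacity factor, a point whose exclusions remove its best pixels would see a higher re-calculated LCOE than the value stored under lcoe_dset.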
Optional outputs:
- mean_friction : float
Mean of the friction data provided in ‘friction_fpath’ and ‘friction_dset’. The arithmetic mean is weighted by boolean inclusions and considers partial inclusions to be included.
- mean_lcoe_friction : float
Mean of the nominal LCOE multiplied by mean_friction value.
- mean_{dset} : float
Mean input h5 dataset(s) provided by the user in ‘h5_dsets’. These mean calculations are weighted by the partial inclusion layer.
- data_layers : float | int | str | dict
Requested data layer aggregations. Each data layer must be the same shape as the exclusion layers.
- mode : int | str
Most common value of a given data layer after applying the boolean inclusion mask.
- mean : float
Arithmetic mean value of a given data layer weighted by the scalar inclusion mask (considers partial inclusions).
- min : float | int
Minimum value of a given data layer after applying the boolean inclusion mask.
- max : float | int
Maximum value of a given data layer after applying the boolean inclusion mask.
- sum : float
Sum of a given data layer weighted by the scalar inclusion mask (considers partial inclusions).
- category : dict
Dictionary mapping the unique values in the data_layer to the sum of inclusion scalar values associated with all pixels with that unique value.
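The difference between the boolean and scalar inclusion masks in the data layer methods above can be sketched in a few lines. This is a simplified illustration, not the reV implementation: aggregate_layer is a hypothetical function operating on flat Python lists for a single supply curve point, and it assumes any inclusion weight above zero counts as included for the boolean-mask methods.

```python
from collections import Counter, defaultdict


# Hypothetical helper (not part of reV) mirroring the method descriptions.
def aggregate_layer(values, inclusion, method):
    """Aggregate flat ``values`` for one supply curve point.

    ``inclusion`` holds scalar weights in [0, 1]; any weight > 0 counts as
    included for the boolean-mask methods (mode, min, max).
    """
    kept = [(v, w) for v, w in zip(values, inclusion) if w > 0]
    if not kept:
        return None
    if method == "mode":
        return Counter(v for v, _ in kept).most_common(1)[0][0]
    if method == "min":
        return min(v for v, _ in kept)
    if method == "max":
        return max(v for v, _ in kept)
    if method == "sum":
        return sum(v * w for v, w in kept)
    if method == "mean":
        return sum(v * w for v, w in kept) / sum(w for _, w in kept)
    if method == "category":
        totals = defaultdict(float)
        for v, w in kept:
            totals[v] += w
        return dict(totals)
    raise ValueError(f"Unknown method: {method!r}")


values = [10, 10, 20, 30]
inclusion = [1.0, 0.5, 0.5, 0.0]  # last pixel is fully excluded
results = {m: aggregate_layer(values, inclusion, m)
           for m in ("mode", "mean", "min", "max", "sum", "category")}
print(results)
```

Here the fully excluded pixel (value 30) never contributes, and the half-included pixels count fully toward mode, min, and max but only by half toward mean, sum, and category.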
Methods
run(out_fpath[, gen_fpath, args, ...]) – Run a supply curve aggregation.
run_parallel(gen_fpath[, args, max_workers, ...]) – Get the supply curve points aggregation summary using futures.
run_serial(excl_fpath, gen_fpath, tm_dset, ...) – Standalone method to create agg summary - can be parallelized.
summarize(gen_fpath[, args, max_workers, ...]) – Get the supply curve points aggregation summary.
Attributes
gids – 1D array of supply curve point gids to aggregate.
shape – Get the shape of the full exclusions raster.
- classmethod run_serial(excl_fpath, gen_fpath, tm_dset, gen_index, econ_fpath=None, excl_dict=None, inclusion_mask=None, area_filter_kernel='queen', min_area=None, resolution=64, gids=None, args=None, res_class_dset=None, res_class_bins=None, cf_dset='cf_mean-means', lcoe_dset='lcoe_fcr-means', h5_dsets=None, data_layers=None, power_density=None, friction_fpath=None, friction_dset=None, excl_area=None, cap_cost_scale=None, recalc_lcoe=True)
Standalone method to create agg summary - can be parallelized.
- Parameters:
excl_fpath (str | list | tuple) – Filepath to exclusions h5 with techmap dataset (can be one or more filepaths).
gen_fpath (str) – Filepath to .h5 reV generation output results.
tm_dset (str) – Dataset name in the exclusions file containing the exclusions-to-resource mapping data.
gen_index (np.ndarray) – Array of generation gids with array index equal to resource gid. Array value is -1 if the resource index was not used in the generation run.
econ_fpath (str | None) – Filepath to .h5 reV econ output results. This is optional and only used if the lcoe_dset is not present in the gen_fpath file.
excl_dict (dict | None) – Dictionary of exclusion keyword arguments of the format {layer_dset_name: {kwarg: value}} where layer_dset_name is a dataset in the exclusion h5 file and kwarg is a keyword argument to the reV.supply_curve.exclusions.LayerMask class.
inclusion_mask (np.ndarray | dict, optional) – 2D array pre-extracted inclusion mask where 1 is included and 0 is excluded. This must either match the full exclusion shape or be a dict lookup of single-sc-point exclusion masks corresponding to the gids input and keyed by gids. By default None, which will calculate exclusions on the fly for each sc point.
area_filter_kernel (str) – Contiguous area filter method to use on final exclusions mask
min_area (float | None) – Minimum required contiguous area filter in sq-km
resolution (int | None) – SC resolution, must be input in combination with gid. Preferred option is to use the row/col slices to define the SC point instead.
gids (list | None) – List of supply curve point gids to get summary for (can use to subset if running in parallel), or None for all gids in the SC extent, by default None
args (list | None) – List of positional args for sc_point_method
res_class_dset (str | None) – Dataset in the generation file dictating resource classes. None if no resource classes.
res_class_bins (list | None) – List of two-entry lists dictating the resource class bins. None if no resource classes.
cf_dset (str) – Dataset name from the generation file (gen_fpath) containing capacity factor mean values.
lcoe_dset (str) – Dataset name from the generation file (gen_fpath) containing LCOE mean values.
h5_dsets (list | None) – Optional list of additional datasets from the source h5 gen/econ files to aggregate.
data_layers (None | dict) – Aggregation data layers. Must be a dictionary keyed by data label name. Each value must be another dictionary with “dset”, “method”, and “fpath”.
power_density (float | str | None) – Power density in MW/km2 or filepath to variable power density file. None will attempt to infer a constant power density from the generation meta data technology. Variable power density csvs must have “gid” and “power_density” columns where gid is the resource gid (typically wtk or nsrdb gid) and the power_density column is in MW/km2.
friction_fpath (str | None) – Filepath to friction surface data (cost based exclusions). Must be paired with friction_dset. The friction data must be the same shape as the exclusions. Friction input creates a new output “mean_lcoe_friction” which is the nominal LCOE multiplied by the friction data.
friction_dset (str | None) – Dataset name in friction_fpath for the friction surface data. Must be paired with friction_fpath. Must be same shape as exclusions.
excl_area (float | None, optional) – Area of an exclusion pixel in km2. None will try to infer the area from the profile transform attribute in excl_fpath, by default None
cap_cost_scale (str | None) – Optional LCOE scaling equation to implement “economies of scale”. Equations must be in python string format and return a scalar value to multiply the capital cost by. Independent variables in the equation should match the names of the columns in the reV supply curve aggregation table.
recalc_lcoe (bool) – Flag to re-calculate the LCOE from the multi-year mean capacity factor and annual energy production data. This requires several datasets to be aggregated in the h5_dsets input: system_capacity, fixed_charge_rate, capital_cost, fixed_operating_cost, and variable_operating_cost.
- Returns:
summary (list) – List of dictionaries, each being an SC point summary.
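Two of the run_serial inputs above are easier to picture with a small example: gen_index (generation gids indexed by resource gid, with -1 for unused resource gids) and res_class_bins (two-entry lists). The sketch below uses plain Python lists instead of the np.ndarray named in the documentation to avoid dependencies. Both helper functions are hypothetical, and the boundary-to-pairs conversion is only one plausible way to relate the constructor's list of bin boundaries to the two-entry form.

```python
# Hypothetical helpers (not part of reV) illustrating two run_serial inputs.
def build_gen_index(gen_res_gids, n_resource_gids):
    """Map resource gid -> generation gid, with -1 for unused resource gids.

    ``gen_res_gids[i]`` is the resource gid used by generation gid ``i``.
    """
    gen_index = [-1] * n_resource_gids
    for gen_gid, res_gid in enumerate(gen_res_gids):
        gen_index[res_gid] = gen_gid
    return gen_index


def bins_from_boundaries(boundaries):
    """Turn sorted bin boundaries into a list of two-entry [low, high] bins."""
    edges = sorted(boundaries)
    return [[low, high] for low, high in zip(edges[:-1], edges[1:])]


# Generation was run on resource gids 3, 5 and 6 out of 8 resource sites.
gen_index = build_gen_index([3, 5, 6], n_resource_gids=8)
bins = bins_from_boundaries([0, 6, 8, 20])
print(gen_index)
print(bins)
```

With this layout, looking up gen_index[res_gid] gives the row of the generation output to read for a resource pixel, and a value of -1 signals that the pixel has no generation data.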
- run_parallel(gen_fpath, args=None, max_workers=None, sites_per_worker=100)
Get the supply curve points aggregation summary using futures.
- Parameters:
gen_fpath (str) – Filepath to .h5 reV generation output results.
args (tuple | list | None) – List of summary arguments to include. None defaults to all available args defined in the class attr.
max_workers (int | None, optional) – Number of cores to run summary on. None is all available cpus, by default None
sites_per_worker (int) – Number of sc_points to summarize on each worker, by default 100
- Returns:
summary (list) – List of dictionaries, each being an SC point summary.
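The role of sites_per_worker can be sketched as splitting the supply curve gids into chunks and handing each chunk to a worker. The example below is only an illustration of that pattern: chunk_gids and summarize_chunk are hypothetical names, summarize_chunk is a placeholder for the real per-chunk work, and a thread pool is used here purely to keep the example self-contained.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical helpers (not part of reV) showing the chunk-and-collect pattern.
def chunk_gids(gids, sites_per_worker=100):
    """Split ``gids`` into consecutive chunks of at most ``sites_per_worker``."""
    return [gids[i:i + sites_per_worker]
            for i in range(0, len(gids), sites_per_worker)]


def summarize_chunk(chunk):
    """Placeholder for the per-chunk work (run_serial in the docs above)."""
    return [{"sc_point_gid": gid} for gid in chunk]


gids = list(range(250))
chunks = chunk_gids(gids, sites_per_worker=100)

# Each worker returns a list of per-point dictionaries; flatten them in order.
with ThreadPoolExecutor(max_workers=2) as pool:
    summary = [row for rows in pool.map(summarize_chunk, chunks)
               for row in rows]

print(len(chunks), len(summary))
```

Smaller chunks spread work more evenly across workers at the cost of more task overhead, which is the trade-off the sites_per_worker input exposes.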
- summarize(gen_fpath, args=None, max_workers=None, sites_per_worker=100)
Get the supply curve points aggregation summary
- Parameters:
gen_fpath (str) – Filepath to .h5 reV generation output results.
args (tuple | list | None) – List of summary arguments to include. None defaults to all available args defined in the class attr.
max_workers (int | None, optional) – Number of cores to run summary on. None is all available cpus, by default None
sites_per_worker (int) – Number of sc_points to summarize on each worker, by default 100
- Returns:
summary (list) – List of dictionaries, each being an SC point summary.
- run(out_fpath, gen_fpath=None, args=None, max_workers=None, sites_per_worker=100)
Run a supply curve aggregation.
- Parameters:
gen_fpath (str, optional) – Filepath to HDF5 file with reV generation output results. If None, a simple aggregation without any generation, resource, or cost data is performed.
Note
If executing reV from the command line, this input can be set to "PIPELINE" to parse the output from one of these preceding pipeline steps: multi-year, collect, or econ. However, note that duplicate executions of any of these commands within the pipeline may invalidate this parsing, meaning the gen_fpath input will have to be specified manually.
By default, None.
args (tuple | list, optional) – List of columns to include in summary output table. None defaults to all available args defined in the SupplyCurveAggregation documentation. By default, None.
max_workers (int, optional) – Number of cores to run summary on. None is all available CPUs. By default, None.
sites_per_worker (int, optional) – Number of sc_points to summarize on each worker. By default, 100.
- Returns:
str – Path to output CSV file containing supply curve aggregation.
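Putting the constructor and run signatures above together, a call might look like the following sketch. The file paths, the techmap dataset name "techmap_wtk", and the helper functions build_sc_agg_kwargs and run_aggregation are placeholders invented for illustration. The reV import is deferred into the function body so the snippet can be loaded without reV installed, and actually running it requires real input files.

```python
# Usage sketch only; paths and dataset names below are placeholders.
def build_sc_agg_kwargs(excl_fpath, tm_dset):
    """Collect a few of the constructor keyword arguments documented above."""
    return {
        "excl_fpath": excl_fpath,
        "tm_dset": tm_dset,
        "excl_dict": {"slope": {"include_range": [0, 20]}},
        "resolution": 64,
        "min_area": None,
    }


def run_aggregation(excl_fpath, tm_dset, gen_fpath, out_fpath):
    """Run the aggregation; requires reV and real input files."""
    # Imported lazily so this module can be loaded without reV installed.
    from reV.supply_curve.sc_aggregation import SupplyCurveAggregation

    sca = SupplyCurveAggregation(**build_sc_agg_kwargs(excl_fpath, tm_dset))
    # Per the documentation above, run returns the path to the output CSV.
    return sca.run(out_fpath, gen_fpath=gen_fpath, max_workers=1)


kwargs = build_sc_agg_kwargs("/path/to/exclusions.h5", "techmap_wtk")
print(sorted(kwargs))
```

Keeping the keyword arguments in a plain dictionary makes it straightforward to log or version the exact land access scenario that produced a given supply curve table.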
- property gids
1D array of supply curve point gids to aggregate
- Returns:
ndarray
- property shape
Get the shape of the full exclusions raster.
- Returns:
tuple