reV.bespoke.bespoke.BespokeWindPlants
- class BespokeWindPlants(excl_fpath, res_fpath, tm_dset, objective_function, capital_cost_function, fixed_operating_cost_function, variable_operating_cost_function, balance_of_system_cost_function, project_points, sam_files, min_spacing='5x', wake_loss_multiplier=1, ga_kwargs=None, output_request=('system_capacity', 'cf_mean'), ws_bins=(0.0, 20.0, 5.0), wd_bins=(0.0, 360.0, 45.0), excl_dict=None, area_filter_kernel='queen', min_area=None, resolution=64, excl_area=None, data_layers=None, pre_extract_inclusions=False, eos_mult_baseline_cap_mw=200, prior_run=None, gid_map=None, bias_correct=None, pre_load_data=False)[source]
Bases: BaseAggregation
reV bespoke analysis class.

Much like generation, reV bespoke analysis runs SAM simulations by piping in renewable energy resource data (usually from the WTK), loading the SAM config, and then executing the PySAM.Windpower.Windpower compute module. However, unlike reV generation, bespoke analysis is performed on the supply-curve grid resolution, and the plant layout is optimized for every supply-curve point based on an optimization objective specified by the user. See the NREL publication on the bespoke methodology for more information.

See the documentation for the reV SAM class (e.g. reV.SAM.generation.WindPower, reV.SAM.generation.PvWattsv8, reV.SAM.generation.Geothermal, etc.) for info on the allowed and/or required SAM config file inputs.

- Parameters:
excl_fpath (str | list | tuple) – Filepath to exclusions data HDF5 file. The exclusions HDF5 file should contain the layers specified in excl_dict and data_layers. These layers may also be spread out across multiple HDF5 files, in which case this input should be a list or tuple of filepaths pointing to the files containing the layers. Note that each data layer must be uniquely defined (i.e., only appear once and in a single input file).
res_fpath (str) – Unix shell style path to wind resource HDF5 file in NREL WTK format. Can also be a path including a wildcard input like /h5_dir/prefix*suffix to run bespoke on multiple years of resource data. Can also be an explicit list of resource HDF5 file paths, which themselves can contain wildcards. If multiple files are specified in this way, they must have the same coordinates but can have different time indices (i.e. different years). This input must be readable by rex.multi_year_resource.MultiYearWindResource (i.e. the resource data must conform to the rex data format). This means the data file(s) must contain a 1D time_index dataset indicating the UTC time of observation, a 1D meta dataset represented by a DataFrame with site-specific columns, and 2D resource datasets that match the dimensions of (time_index, meta). The time index must start at 00:00 of January 1st of the year under consideration, and its shape must be a multiple of 8760.

tm_dset (str) – Dataset name in the excl_fpath file containing the techmap (exclusions-to-resource mapping data). This data layer links the supply curve GIDs to the generation GIDs that are used to evaluate the performance metrics of each wind plant. By default, the generation GIDs are assumed to match the resource GIDs, but this mapping can be customized via the gid_map input (see the documentation for gid_map for more details).
Important
This dataset uniquely couples the (typically high-resolution) exclusion layers to the (typically lower-resolution) resource data. Therefore, a separate techmap must be used for every unique combination of resource and exclusion coordinates.
objective_function (str) – The objective function of the optimization written out as a string. This expression should compute the objective to be minimized during layout optimization. Variables available for computation are:

- n_turbines: the number of turbines
- system_capacity: wind plant capacity
- aep: annual energy production
- avg_sl_dist_to_center_m: average straight-line distance to the supply curve point center from all turbine locations (in m). Useful for computing plant BOS costs.
- avg_sl_dist_to_medoid_m: average straight-line distance to the medoid of all turbine locations (in m). Useful for computing plant BOS costs.
- nn_conn_dist_m: total BOS connection distance using nearest-neighbor connections. This variable is only available for the balance_of_system_cost_function equation.
- fixed_charge_rate: user-input fixed_charge_rate if included as part of the SAM system config.
- capital_cost: plant capital cost as evaluated by capital_cost_function
- fixed_operating_cost: plant fixed annual operating cost as evaluated by fixed_operating_cost_function
- variable_operating_cost: plant variable annual operating cost as evaluated by variable_operating_cost_function
- balance_of_system_cost: plant balance of system cost as evaluated by balance_of_system_cost_function
- self.wind_plant: the SAM wind plant object, through which all SAM variables can be accessed
capital_cost_function (str) – The plant capital cost function written out as a string. This expression must return the total plant capital cost in $. This expression has access to the same variables as the objective_function argument above.
fixed_operating_cost_function (str) – The plant annual fixed operating cost function written out as a string. This expression must return the fixed operating cost in $/year. This expression has access to the same variables as the objective_function argument above.
variable_operating_cost_function (str) – The plant annual variable operating cost function written out as a string. This expression must return the variable operating cost in $/kWh. This expression has access to the same variables as the objective_function argument above. You can set this to “0” to effectively ignore variable operating costs.
balance_of_system_cost_function (str) – The plant balance-of-system cost function written out as a string. This expression must return the balance-of-system cost in $. This expression has access to the same variables as the objective_function argument above. You can set this to "0" to effectively ignore balance-of-system costs.
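For illustration, these string inputs might look like the following. This is a minimal sketch with placeholder numbers, not a recommended configuration; the variable names are the ones listed under objective_function above, and the units of system_capacity depend on your SAM config.

```python
# Hypothetical cost expressions (placeholder values only).
capital_cost_function = "1100 * system_capacity"          # total capital cost in $
fixed_operating_cost_function = "26 * system_capacity"    # fixed operating cost in $/year
variable_operating_cost_function = "0"                    # ignore variable operating costs
balance_of_system_cost_function = "0"                     # ignore balance-of-system costs

# Minimize an LCOE-like ratio built from the variables listed above.
objective_function = (
    "(fixed_charge_rate * capital_cost + fixed_operating_cost) / aep"
    " + variable_operating_cost"
)
```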
project_points (int | list | tuple | str | dict | pd.DataFrame | slice) – Input specifying which sites to process. A single integer representing the supply curve GID of a site may be specified to evaluate reV at a supply curve point. A list or tuple of integers (or slice) representing the supply curve GIDs of multiple sites can be specified to evaluate reV at multiple specific locations. A string pointing to a project points CSV file may also be specified. Typically, the CSV contains the following columns:

- gid: Integer specifying the supply curve GID of each site.
- config: Key in the sam_files input dictionary (see below) corresponding to the SAM configuration to use for each particular site. This value can also be None (or left out completely) if you specify only a single SAM configuration file as the sam_files input.

The CSV file may also contain site-specific inputs by including a column named after a config keyword (e.g. a column called capital_cost may be included to specify a site-specific capital cost value for each location). Columns that do not correspond to a config key may also be included, but they will be ignored. The CSV file input can also have these extra, optional columns:

- capital_cost_multiplier
- fixed_operating_cost_multiplier
- variable_operating_cost_multiplier
- balance_of_system_cost_multiplier

These particular inputs are treated as multipliers to be applied to the respective cost curves (capital_cost_function, fixed_operating_cost_function, variable_operating_cost_function, and balance_of_system_cost_function) both during and after the optimization. A DataFrame following the same guidelines as the CSV input (or a dictionary that can be used to initialize such a DataFrame) may be used for this input as well. If you would like to obtain all available reV supply curve points to run, you can use the reV.supply_curve.extent.SupplyCurveExtent class like so:

```python
import pandas as pd
from reV.supply_curve.extent import SupplyCurveExtent

excl_fpath = "..."
resolution = ...
with SupplyCurveExtent(excl_fpath, resolution) as sc:
    points = sc.valid_sc_points(tm_dset).tolist()
    points = pd.DataFrame({"gid": points})
    points["config"] = "default"  # or a list of config choices

# Use the points directly or save them to csv for CLI usage
points.to_csv("project_points.csv", index=False)
```
sam_files (dict | str) – A dictionary mapping SAM input configuration ID(s) to SAM configuration(s). Keys are the SAM config ID(s) which correspond to the config column in the project points CSV. Values for each key are either a path to a corresponding SAM config file or a full dictionary of SAM config inputs. For example:

```python
sam_files = {
    "default": "/path/to/default/sam.json",
    "onshore": "/path/to/onshore/sam_config.yaml",
    "offshore": {
        "sam_key_1": "sam_value_1",
        "sam_key_2": "sam_value_2",
        ...
    },
    ...
}
```

This input can also be a string pointing to a single SAM config file. In this case, the config column of the CSV points input should be set to None or left out completely. See the documentation for the reV SAM class (e.g. reV.SAM.generation.WindPower, reV.SAM.generation.PvWattsv8, reV.SAM.generation.Geothermal, etc.) for info on the allowed and/or required SAM config file inputs.

min_spacing (float | int | str, optional) – Minimum spacing between turbines (in meters). This input can also be a string like "5x", which is interpreted as 5 times the turbine rotor diameter. By default, "5x".

wake_loss_multiplier (float, optional) – A multiplier used to scale the annual energy lost due to wake losses.

Warning
This multiplier will ONLY be applied during the optimization process and will NOT come through in output values such as the hourly profiles, aep, any of the cost functions, or even the output objective.

By default, 1.

ga_kwargs (dict, optional) – Dictionary of keyword arguments to pass to GA initialization. If None, default initialization values are used. See GeneticAlgorithm for a description of the allowed keyword arguments. By default, None.

output_request (list | tuple, optional) – Outputs requested from the SAM windpower simulation after the bespoke plant layout optimization. Can be any of the parameters in the "Outputs" group of the PySAM module PySAM.Windpower.Windpower.Outputs. This list can also include a select number of SAM config/resource parameters to include in the output: any key in any of the output attribute JSON files may be requested. Time-series profiles requested via this input are output in UTC. This input can also be used to request resource means like "ws_mean", "windspeed_mean", "temperature_mean", and "pressure_mean". By default, ('system_capacity', 'cf_mean').

ws_bins (tuple, optional) – A 3-entry tuple with (start, stop, step) for the windspeed binning of the wind joint probability distribution. The stop value is inclusive, so ws_bins=(0, 20, 5) would result in four bins with bin edges (0, 5, 10, 15, 20). By default, (0.0, 20.0, 5.0).

wd_bins (tuple, optional) – A 3-entry tuple with (start, stop, step) for the wind direction binning of the wind joint probability distribution. The stop value is inclusive, so wd_bins=(0, 360, 90) would result in four bins with bin edges (0, 90, 180, 270, 360). By default, (0.0, 360.0, 45.0).
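As a quick illustration of the inclusive-stop convention (a sketch only, not reV's internal binning code):

```python
import numpy as np

def bin_edges(start, stop, step):
    # Inclusive stop: (0, 20, 5) -> edges [0, 5, 10, 15, 20], i.e. four bins.
    return np.arange(start, stop + step, step)

print(bin_edges(0.0, 20.0, 5.0))    # windspeed bin edges
print(bin_edges(0.0, 360.0, 45.0))  # wind direction bin edges (eight bins)
```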
excl_dict (dict, optional) – Dictionary of exclusion keyword arguments of the format {layer_dset_name: {kwarg: value}}, where layer_dset_name is a dataset in the exclusion h5 file and the kwarg: value pair is a keyword argument to the reV.supply_curve.exclusions.LayerMask class. For example:

```python
excl_dict = {
    "typical_exclusion": {
        "exclude_values": 255,
    },
    "another_exclusion": {
        "exclude_values": [2, 3],
        "weight": 0.5
    },
    "exclusion_with_nodata": {
        "exclude_range": [10, 100],
        "exclude_nodata": True,
        "nodata_value": -1
    },
    "partial_setback": {
        "use_as_weights": True
    },
    "height_limit": {
        "exclude_range": [0, 200]
    },
    "slope": {
        "include_range": [0, 20]
    },
    "developable_land": {
        "force_include_values": 42
    },
    "more_developable_land": {
        "force_include_range": [5, 10]
    },
    ...
}
```

Note that all the keys given in this dictionary should be datasets of the excl_fpath file. If None or an empty dictionary, no exclusions are applied. By default, None.

area_filter_kernel ({"queen", "rook"}, optional) – Contiguous area filter method to use on final exclusions mask. The filters are defined as:

```python
# Queen:     # Rook:
[[1,1,1],    [[0,1,0],
 [1,1,1],     [1,1,1],
 [1,1,1]]     [0,1,0]]
```

These filters define how neighboring pixels are "connected". Once pixels in the final exclusion layer are connected, the area of each resulting cluster is computed and compared against the min_area input. Any cluster with an area less than min_area is excluded from the final mask. This argument has no effect if min_area is None. By default, "queen".

min_area (float, optional) – Minimum area (in km2) required to keep an isolated cluster of (included) land within the resulting exclusions mask. Any clusters of land with areas less than this value will be marked as exclusions. See the documentation for area_filter_kernel for an explanation of how the area of each land cluster is computed. If None, no area filtering is performed. By default, None.
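The sketch below illustrates the contiguous-area filtering concept with scipy.ndimage; it is not reV's implementation, and the pixel area and threshold are made-up values:

```python
import numpy as np
from scipy import ndimage

# Connectivity structures matching the "queen" and "rook" kernels above.
queen = np.ones((3, 3), dtype=int)
rook = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]])

# Toy inclusion mask (True = included land).
include = np.array([[1, 1, 0, 0],
                    [1, 0, 0, 1],
                    [0, 0, 0, 0],
                    [0, 1, 1, 1]], dtype=bool)

pixel_area_km2 = 0.0081   # e.g. a 90 m x 90 m exclusion pixel
min_area_km2 = 0.02       # hypothetical min_area threshold

labels, n_clusters = ndimage.label(include, structure=queen)
for cluster_id in range(1, n_clusters + 1):
    cluster = labels == cluster_id
    if cluster.sum() * pixel_area_km2 < min_area_km2:
        include[cluster] = False  # drop clusters smaller than min_area
```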
resolution (int, optional) – Supply Curve resolution. This value defines how many pixels are in a single side of a supply curve cell. For example, a value of 64 would generate a supply curve where the side of each supply curve cell is 64x64 exclusion pixels. By default, 64.

excl_area (float, optional) – Area of a single exclusion mask pixel (in km2). If None, this value will be inferred from the profile transform attribute in excl_fpath. By default, None.
data_layers (dict, optional) – Dictionary of aggregation data layers of the format:

```python
data_layers = {
    "output_layer_name": {
        "dset": "layer_name",
        "method": "mean",
        "fpath": "/path/to/data.h5"
    },
    "another_output_layer_name": {
        "dset": "input_layer_name",
        "method": "mode"
        # optional "fpath" key omitted
    },
    ...
}
```

The "output_layer_name" is the column name under which the aggregated data will appear in the meta DataFrame of the output file. The "output_layer_name" does not have to match the dset input value. The latter should match the layer name in the HDF5 from which the data to aggregate should be pulled. The method should be one of {"mode", "mean", "min", "max", "sum", "category"}, describing how the high-resolution data should be aggregated for each supply curve point. fpath is an optional key that can point to an HDF5 file containing the layer data. If left out, the data is assumed to exist in the file(s) specified by the excl_fpath input. If None, no data layer aggregation is performed. By default, None.

pre_extract_inclusions (bool, optional) – Optional flag to pre-extract/compute the inclusion mask from the excl_dict input. It is typically faster to compute the inclusion mask on the fly with parallel workers. By default, False.

eos_mult_baseline_cap_mw (int | float, optional) – Baseline plant capacity (MW) used to calculate the economies of scale (EOS) multiplier from the capital_cost_function. The EOS multiplier is calculated as the $-per-kW of the wind plant divided by the $-per-kW of a plant with this baseline capacity. By default, 200 (MW), which aligns the baseline with ATB assumptions. See here: https://tinyurl.com/y85hnu6h.
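A quick worked example of that ratio (the dollar figures are hypothetical and this is not reV's internal code):

```python
# capital_cost_function evaluated for the optimized plant and for a plant at
# the baseline capacity (hypothetical numbers).
plant_cap_kw, plant_capex = 400_000, 400e6         # 400 MW plant, $400M
baseline_cap_kw, baseline_capex = 200_000, 230e6   # 200 MW baseline, $230M

eos_mult = (plant_capex / plant_cap_kw) / (baseline_capex / baseline_cap_kw)
print(round(eos_mult, 3))  # ~0.87: the larger plant is cheaper per kW
```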
prior_run (str, optional) – Optional filepath to a bespoke output HDF5 file belonging to a prior run. If specified, this module will only run the timeseries power generation step and assume that all of the wind plant layouts are fixed from the prior run. The meta data of this file must contain the following columns (automatically satisfied if the HDF5 file was generated by reV bespoke):

- capacity: Capacity of the plant, in MW.
- turbine_x_coords: A string representation of a python list containing the X coordinates (in m; origin of cell at bottom left) of the turbines within the plant (supply curve cell).
- turbine_y_coords: A string representation of a python list containing the Y coordinates (in m; origin of cell at bottom left) of the turbines within the plant (supply curve cell).

If None, no previous run data is considered. By default, None.
gid_map (str | dict, optional) – Mapping of unique integer generation gids (keys) to single integer resource gids (values). This enables unique generation gids in the project points to map to non-unique resource gids, which can be useful when evaluating multiple resource datasets in reV (e.g., forecasted ECMWF resource data to complement historical WTK meteorology). This input can be a pre-extracted dictionary or a path to a JSON or CSV file. If this input points to a CSV file, the file must have the columns gid (which matches the project points) and gid_map (gids to extract from the resource input). If None, the GID values in the project points are assumed to match the resource GID values. By default, None.
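For example, a pre-extracted mapping might look like the following (hypothetical gids; a CSV with gid and gid_map columns would carry the same information):

```python
# Hypothetical mapping of generation gid (key) -> resource gid (value).
gid_map = {
    0: 154,
    1: 154,  # two generation gids may share one resource gid
    2: 871,
}
```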
bias_correct (str | pd.DataFrame, optional) – Optional DataFrame or CSV filepath to a wind or solar resource bias correction table. This has columns:

- gid: GID of site (can be index name of dataframe)
- method: function name from the rex.bias_correction module

The gid field should match the true resource gid regardless of the optional gid_map input. Only windspeed or GHI + DNI + DHI are corrected, depending on the technology (wind for the former, PV or CSP for the latter). See the functions in the rex.bias_correction module for available inputs for method. Any additional kwargs required for the requested method can be input as additional columns in the bias_correct table, e.g., for linear bias correction functions you can include scalar and adder inputs as columns in the bias_correct table on a site-by-site basis. If None, no corrections are applied. By default, None.
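A hypothetical linear-correction table might look like the following; the method name and values are placeholders, so check the rex.bias_correction module for the functions that actually exist:

```python
import pandas as pd

# Placeholder table: "method" must name a function in rex.bias_correction;
# "scalar" and "adder" are the extra kwargs a linear correction would expect.
bias_correct = pd.DataFrame(
    {
        "gid": [10, 11, 12],
        "method": ["lin_ws", "lin_ws", "lin_ws"],
        "scalar": [1.02, 0.98, 1.00],
        "adder": [0.1, -0.2, 0.0],
    }
).set_index("gid")
```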
pre_load_data (bool, optional) – Option to pre-load resource data. This step can be time-consuming up front, but it drastically reduces the number of parallel reads to the res_fpath HDF5 file(s), and can have a significant overall speedup on systems with slow parallel I/O capabilities. Pre-loaded data can use a significant amount of RAM, so be sure to split execution across many nodes (e.g. 100 nodes, 36 workers each for CONUS) or request large amounts of memory for a smaller number of nodes. By default, False.
Methods

run([out_fpath, max_workers]) – Run the bespoke wind plant optimization in serial or parallel.

run_parallel([max_workers]) – Run the bespoke optimization for many supply curve points in parallel.

run_serial(excl_fpath, res_fpath, tm_dset, ...) – Standalone serial method to run bespoke optimization.

sam_sys_inputs_with_site_data(gid) – Update the sam_sys_inputs with site data for the given GID.

save_outputs(out_fpath) – Save Bespoke Wind Plant optimization outputs to disk.

Attributes

completed_gids – Get a sorted list of completed BespokeSinglePlant gids.

gids – 1D array of supply curve point gids to aggregate.

meta – Meta data for all completed BespokeSinglePlant objects.

outputs – Saved outputs for the multi wind plant bespoke optimization.

shape – Get the shape of the full exclusions raster.

slice_lookup – Lookup mapping sc_point_gid to exclusion slice.
- property outputs
Saved outputs for the multi wind plant bespoke optimization. Keys are reV supply curve gids and values are BespokeSinglePlant.outputs dictionaries.
- Returns:
dict
- property completed_gids
Get a sorted list of completed BespokeSinglePlant gids
- Returns:
list
- property meta
Meta data for all completed BespokeSinglePlant objects.
- Returns:
pd.DataFrame
- property slice_lookup
Lookup mapping sc_point_gid to exclusion slice.
- Type:
Dict | None
- sam_sys_inputs_with_site_data(gid)[source]
Update the sam_sys_inputs with site data for the given GID.
Site data is extracted from the project points DataFrame. Every column in the project DataFrame becomes a key in the site_data output dictionary.
- Parameters:
gid (int) – SC point gid for site to pull site data for.
- Returns:
dictionary (dict) – SAM system config with extra keys from the project points DataFrame.
- save_outputs(out_fpath)[source]
Save Bespoke Wind Plant optimization outputs to disk.
- Parameters:
out_fpath (str) – Full filepath to an output .h5 file to save Bespoke data to. The parent directories will be created if they do not already exist.
- Returns:
out_fpath (str) – Full filepath to the output .h5 file; the .h5 extension is added if it was not already present.
- classmethod run_serial(excl_fpath, res_fpath, tm_dset, sam_sys_inputs, objective_function, capital_cost_function, fixed_operating_cost_function, variable_operating_cost_function, balance_of_system_cost_function, min_spacing='5x', wake_loss_multiplier=1, ga_kwargs=None, output_request=('system_capacity', 'cf_mean'), ws_bins=(0.0, 20.0, 5.0), wd_bins=(0.0, 360.0, 45.0), excl_dict=None, inclusion_mask=None, area_filter_kernel='queen', min_area=None, resolution=64, excl_area=0.0081, data_layers=None, gids=None, exclusion_shape=None, slice_lookup=None, eos_mult_baseline_cap_mw=200, prior_meta=None, gid_map=None, bias_correct=None, pre_loaded_data=None)[source]
Standalone serial method to run bespoke optimization. See BespokeWindPlants docstring for parameter description.
This method can only take a single sam_sys_inputs… For a spatially variant gid-to-config mapping, see the BespokeWindPlants class methods.
- Returns:
out (dict) – Bespoke outputs keyed by sc point gid
- run_parallel(max_workers=None)[source]
Run the bespoke optimization for many supply curve points in parallel.
- Parameters:
max_workers (int | None, optional) – Number of cores to run the optimization on. If None, all available CPUs are used. By default, None.
- Returns:
out (dict) – Bespoke outputs keyed by sc point gid
- run(out_fpath=None, max_workers=None)[source]
Run the bespoke wind plant optimization in serial or parallel.
- Parameters:
out_fpath (str, optional) – Path to output file. If None, no output file will be written. If the filepath is specified but the module name (bespoke) is not included, the module name will get added to the output file name. By default, None.

max_workers (int, optional) – Number of local workers to run on. If None, uses all available cores (typically 36). By default, None.
- Returns:
str | None – Path to output HDF5 file, or None if results were not written to disk.
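A minimal usage sketch, assuming placeholder file paths, dataset names, and cost expressions (none of these values are recommendations):

```python
from reV.bespoke.bespoke import BespokeWindPlants

bsp = BespokeWindPlants(
    excl_fpath="/path/to/exclusions.h5",           # placeholder paths
    res_fpath="/path/to/wtk_*.h5",
    tm_dset="techmap_wtk",                         # assumed techmap dataset name
    objective_function=(
        "(fixed_charge_rate * capital_cost + fixed_operating_cost) / aep"
    ),
    capital_cost_function="1100 * system_capacity",       # placeholder expressions
    fixed_operating_cost_function="26 * system_capacity",
    variable_operating_cost_function="0",
    balance_of_system_cost_function="0",
    project_points="/path/to/project_points.csv",
    sam_files="/path/to/sam_windpower.json",
)
out_fpath = bsp.run(out_fpath="bespoke_out.h5", max_workers=4)
```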
- property gids
1D array of supply curve point gids to aggregate
- Returns:
ndarray
- property shape
Get the shape of the full exclusions raster.
- Returns:
tuple