reV.supply_curve.sc_aggregation.SupplyCurveAggregation

class SupplyCurveAggregation(excl_fpath, tm_dset, econ_fpath=None, excl_dict=None, area_filter_kernel='queen', min_area=None, resolution=64, excl_area=None, res_fpath=None, gids=None, pre_extract_inclusions=False, res_class_dset=None, res_class_bins=None, cf_dset='cf_mean-means', lcoe_dset='lcoe_fcr-means', h5_dsets=None, data_layers=None, power_density=None, friction_fpath=None, friction_dset=None, cap_cost_scale=None, recalc_lcoe=True)

Bases: BaseAggregation

ReV supply curve points aggregation framework.

reV supply curve aggregation combines a high-resolution (e.g. 90m) exclusion dataset with a (typically) lower-resolution (e.g. 2km) generation dataset by mapping all data onto the high-resolution grid and aggregating it by a large factor (e.g. 64 or 128). The result is coarsely-gridded data that summarizes capacity and generation potential as well as associated economics under a particular land access scenario. This module can also summarize extra data layers during the aggregation process, allowing for complementary land characterization analysis.

Parameters:
  • excl_fpath (str | list | tuple) – Filepath to exclusions data HDF5 file. The exclusions HDF5 file should contain the layers specified in excl_dict and data_layers. These layers may also be spread out across multiple HDF5 files, in which case this input should be a list or tuple of filepaths pointing to the files containing the layers. Note that each data layer must be uniquely defined (i.e. only appear once and in a single input file).

  • tm_dset (str) – Dataset name in the excl_fpath file containing the techmap (exclusions-to-resource mapping data). This data layer links the supply curve GIDs to the generation GIDs that are used to evaluate performance metrics such as mean_cf.

    Important

    This dataset uniquely couples the (typically high-resolution) exclusion layers to the (typically lower-resolution) resource data. Therefore, a separate techmap must be used for every unique combination of resource and exclusion coordinates.

    Note

    If executing reV from the command line, you can specify a name that is not in the exclusions HDF5 file, and reV will calculate the techmap for you. Note however that computing the techmap and writing it to the exclusion HDF5 file is a blocking operation, so you may only run a single reV aggregation step at a time this way.

  • econ_fpath (str, optional) – Filepath to HDF5 file with reV econ output results containing an lcoe_dset dataset. If None, lcoe_dset should be a dataset in the gen_fpath HDF5 file that aggregation is executed on.

    Note

    If executing reV from the command line, this input can be set to "PIPELINE" to parse the output from one of these preceding pipeline steps: multi-year, collect, or generation. However, note that duplicate executions of any of these commands within the pipeline may invalidate this parsing, meaning the econ_fpath input will have to be specified manually.

    By default, None.

  • excl_dict (dict | None) – Dictionary of exclusion keyword arguments of the format {layer_dset_name: {kwarg: value}}, where layer_dset_name is a dataset in the exclusion h5 file and the kwarg: value pair is a keyword argument to the reV.supply_curve.exclusions.LayerMask class. For example:

    excl_dict = {
        "typical_exclusion": {
            "exclude_values": 255,
        },
        "another_exclusion": {
            "exclude_values": [2, 3],
            "weight": 0.5
        },
        "exclusion_with_nodata": {
            "exclude_range": [10, 100],
            "exclude_nodata": True,
            "nodata_value": -1
        },
        "partial_setback": {
            "use_as_weights": True
        },
        "height_limit": {
            "exclude_range": [0, 200]
        },
        "slope": {
            "include_range": [0, 20]
        },
        "developable_land": {
            "force_include_values": 42
        },
        "more_developable_land": {
            "force_include_range": [5, 10]
        },
        "viewsheds": {
            "exclude_values": 1,
            "extent": {
                "layer": "federal_parks",
                "include_range": [1, 5]
            }
        }
        # ...
    }
    

    Note that all the keys given in this dictionary should be datasets of the excl_fpath file. If None or empty dictionary, no exclusions are applied. By default, None.

  • area_filter_kernel ({“queen”, “rook”}, optional) – Contiguous area filter method to use on final exclusions mask. The filters are defined as:

    # Queen:     # Rook:
    [[1,1,1],    [[0,1,0],
     [1,1,1],     [1,1,1],
     [1,1,1]]     [0,1,0]]
    

    These filters define how neighboring pixels are “connected”. Once pixels in the final exclusion layer are connected, the area of each resulting cluster is computed and compared against the min_area input. Any cluster with an area less than min_area is excluded from the final mask. This argument has no effect if min_area is None. By default, "queen".

  • min_area (float, optional) – Minimum area (in km2) required to keep an isolated cluster of (included) land within the resulting exclusions mask. Any clusters of land with areas less than this value will be marked as exclusions. See the documentation for area_filter_kernel for an explanation of how the area of each land cluster is computed. If None, no area filtering is performed. By default, None.

  • resolution (int, optional) – Supply curve resolution. This value defines how many exclusion pixels lie along a single side of a supply curve cell. For example, a value of 64 would generate a supply curve where each cell is 64x64 exclusion pixels. By default, 64.

  • excl_area (float, optional) – Area of a single exclusion mask pixel (in km2). If None, this value will be inferred from the profile transform attribute in excl_fpath. By default, None.

  • res_fpath (str, optional) – Filepath to HDF5 resource file (e.g. WTK or NSRDB). This input is required if techmap dset is to be created or if the gen_fpath input to the summarize or run methods is None. By default, None.

  • gids (list, optional) – List of supply curve point gids to get summary for. If you would like to obtain all available reV supply curve points to run, you can use the reV.supply_curve.extent.SupplyCurveExtent class like so:

    import pandas as pd
    from reV.supply_curve.extent import SupplyCurveExtent
    
    excl_fpath = "..."
    resolution = ...
    tm_dset = "..."
    with SupplyCurveExtent(excl_fpath, resolution) as sc:
        gids = sc.valid_sc_points(tm_dset).tolist()
    ...
    

    If None, supply curve aggregation is computed for all gids in the supply curve extent. By default, None.

  • pre_extract_inclusions (bool, optional) – Optional flag to pre-extract/compute the inclusion mask from the excl_dict input. It is typically faster to compute the inclusion mask on the fly with parallel workers. By default, False.

  • res_class_dset (str, optional) – Name of dataset in the reV generation HDF5 output file containing resource data. If None, no aggregated resource classification is performed (i.e. no mean_res output), and the res_class_bins is ignored. By default, None.

  • res_class_bins (list, optional) – Optional input to perform separate aggregations for various resource data ranges. If None, only a single aggregation per supply curve point is performed. Otherwise, this input should be a list of floats or ints representing the resource bin boundaries. One aggregation per resource value range is computed, and only pixels within the given resource range are aggregated. By default, None.

  • cf_dset (str, optional) – Dataset name from the reV generation HDF5 output file containing a 1D dataset of mean capacity factor values. This dataset will be mapped onto the high-resolution grid and used to compute the mean capacity factor for non-excluded area. By default, "cf_mean-means".

  • lcoe_dset (str, optional) – Dataset name from the reV generation HDF5 output file containing a 1D dataset of mean LCOE values. This dataset will be mapped onto the high-resolution grid and used to compute the mean LCOE for non-excluded area, but only if the LCOE is not re-computed during processing (see the recalc_lcoe input for more info). By default, "lcoe_fcr-means".

  • h5_dsets (list, optional) – Optional list of additional datasets from the reV generation/econ HDF5 output file to aggregate. If None, no extra datasets are aggregated.

    Warning

    This input is meant for passing through 1D datasets. If you specify a 2D or higher-dimensional dataset, you may run into memory errors. If you wish to aggregate 2D datasets, see the rep-profiles module.

    By default, None.

  • data_layers (dict, optional) –

    Dictionary of aggregation data layers of the format:

    data_layers = {
        "output_layer_name": {
            "dset": "layer_name",
            "method": "mean",
            "fpath": "/path/to/data.h5"
        },
        "another_output_layer_name": {
            "dset": "input_layer_name",
            "method": "mode",
            # optional "fpath" key omitted
        },
        # ...
    }
    

    The "output_layer_name" is the column name under which the aggregated data will appear in the output CSV file. The "output_layer_name" does not have to match the dset input value. The latter should match the layer name in the HDF5 from which the data to aggregate should be pulled. The method should be one of {"mode", "mean", "min", "max", "sum", "category"}, describing how the high-resolution data should be aggregated for each supply curve point. fpath is an optional key that can point to an HDF5 file containing the layer data. If left out, the data is assumed to exist in the file(s) specified by the excl_fpath input. If None, no data layer aggregation is performed. By default, None.

  • power_density (float | str, optional) – Power density value (in MW/km2) or filepath to variable power density CSV file containing the following columns:

    • gid : resource gid (typically wtk or nsrdb gid)

    • power_density : power density value (in MW/km2)

    If None, a constant power density is inferred from the generation meta data technology. By default, None.

  • friction_fpath (str, optional) – Filepath to friction surface data (cost based exclusions). Must be paired with the friction_dset input below. The friction data must be the same shape as the exclusions. Friction input creates a new output column "mean_lcoe_friction" which is the nominal LCOE multiplied by the friction data. If None, no friction data is aggregated. By default, None.

  • friction_dset (str, optional) – Dataset name in friction_fpath for the friction surface data. Must be paired with the friction_fpath above. If None, no friction data is aggregated. By default, None.

  • cap_cost_scale (str, optional) – Optional LCOE scaling equation to implement “economies of scale”. Equations must be in python string format and must return a scalar value to multiply the capital cost by. Independent variables in the equation should match the names of the columns in the reV supply curve aggregation output table (see the documentation of SupplyCurveAggregation for details on available outputs). If None, no economies of scale are applied. By default, None.

  • recalc_lcoe (bool, optional) – Flag to re-calculate the LCOE from the multi-year mean capacity factor and annual energy production data. This requires several datasets to be aggregated in the h5_dsets input:

    • system_capacity

    • fixed_charge_rate

    • capital_cost

    • fixed_operating_cost

    • variable_operating_cost

    If any of these datasets are missing from the reV generation HDF5 output, or if recalc_lcoe is set to False, the mean LCOE will be computed from the data stored under the lcoe_dset instead. By default, True.
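
The recalculation described above can be pictured with the common fixed-charge-rate LCOE formula. The following is a minimal sketch under stated assumptions, not reV's internal code: the function name `lcoe_fcr_sketch`, the `cap_cost_multiplier` argument (standing in for the scalar returned by a `cap_cost_scale` equation), the units (costs in $, energy in MWh/year, `system_capacity` in kW), and all numeric values are hypothetical.

```python
def lcoe_fcr_sketch(fixed_charge_rate, capital_cost, fixed_operating_cost,
                    annual_energy_production, variable_operating_cost,
                    cap_cost_multiplier=1.0):
    """Fixed-charge-rate LCOE in $/MWh (illustrative only).

    Assumes costs in $, annual_energy_production in MWh/year and
    variable_operating_cost in $/MWh.
    """
    # Annualize the (optionally scaled) capital cost.
    annualized_capital = fixed_charge_rate * capital_cost * cap_cost_multiplier
    return ((annualized_capital + fixed_operating_cost)
            / annual_energy_production
            + variable_operating_cost)


# Hypothetical aggregated values for a single supply curve point.
system_capacity_kw = 20_000
mean_cf = 0.40
aep_mwh = system_capacity_kw / 1000 * mean_cf * 8760  # MWh/year

lcoe = lcoe_fcr_sketch(0.07, 30_000_000, 800_000, aep_mwh, 0.0)
# Same point with a hypothetical 10 % capital cost reduction.
scaled = lcoe_fcr_sketch(0.07, 30_000_000, 800_000, aep_mwh, 0.0,
                         cap_cost_multiplier=0.9)
print(round(lcoe, 2), round(scaled, 2))
```

The sketch shows why the listed datasets are needed: each one feeds a term of the formula, and the capital cost multiplier only affects the annualized capital term.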

Examples

Standard outputs:

sc_gid: int

Unique supply curve gid. These are the enumerated supply curve points, which can have overlapping geographic locations due to different resource bins at the same geographic SC point.

res_gids: list

Stringified list of resource gids (e.g. original WTK or NSRDB resource GIDs) corresponding to each SC point.

gen_gids: list

Stringified list of generation gids (e.g. GID in the reV generation output, which corresponds to the reV project points and not necessarily the resource GIDs).

gid_counts: list

Stringified list of the sum of inclusion scalar values corresponding to each gen_gid and res_gid, where 1 is included, 0 is excluded, and 0.7 is included with 70 percent of available land. Each entry in this list is associated with the corresponding entry in the gen_gids and res_gids lists.

n_gids: int

Total number of included pixels. This is a boolean sum and considers partial inclusions to be included (i.e. each counts as 1).

mean_cf: float

Mean capacity factor of each supply curve point (the arithmetic mean is weighted by the inclusion layer) (unitless).

mean_lcoe: float

Mean LCOE of each supply curve point (the arithmetic mean is weighted by the inclusion layer). Units match the reV econ output ($/MWh). By default, the LCOE is re-calculated using the multi-year mean capacity factor and annual energy production. This requires several datasets to be aggregated in the h5_dsets input: fixed_charge_rate, capital_cost, fixed_operating_cost, annual_energy_production, and variable_operating_cost. This recalc behavior can be disabled by setting recalc_lcoe=False.

mean_res: float

Mean resource. The resource dataset to average is provided by the user in res_class_dset. The arithmetic mean is weighted by the inclusion layer.

capacity: float

Total capacity of each supply curve point (MW). Units are contingent on the power_density input units of MW/km2.

area_sq_km: float

Total included area for each supply curve point in km2. This is based on the nominal area of each exclusion pixel which by default is calculated from the exclusion profile attributes. The NREL reV default is 0.0081 km2 pixels (90m x 90m). The area sum considers partial inclusions.

latitude: float

Supply curve point centroid latitude coordinate, in degrees (does not consider exclusions).

longitude: float

Supply curve point centroid longitude coordinate, in degrees (does not consider exclusions).

country: str

Country of the supply curve point based on the most common country of the associated resource meta data. Does not consider exclusions.

state: str

State of the supply curve point based on the most common state of the associated resource meta data. Does not consider exclusions.

county: str

County of the supply curve point based on the most common county of the associated resource meta data. Does not consider exclusions.

elevation: float

Mean elevation of the supply curve point based on the mean elevation of the associated resource meta data. Does not consider exclusions.

timezone: int

UTC offset of local timezone based on the most common timezone of the associated resource meta data. Does not consider exclusions.

sc_point_gid: int

Spatially deterministic supply curve point gid. Duplicate sc_point_gid values can exist due to resource binning.

sc_row_ind: int

Row index of the supply curve point in the aggregated exclusion grid.

sc_col_ind: int

Column index of the supply curve point in the aggregated exclusion grid.

res_class: int

Resource class for the supply curve gid. Each geographic supply curve point (sc_point_gid) can have multiple resource classes associated with it, resulting in multiple supply curve gids (sc_gid) associated with the same spatially deterministic supply curve point.
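
To make the difference between the boolean and the inclusion-weighted outputs concrete, here is a small sketch for a single supply curve point. It is an illustration under stated assumptions rather than the library's implementation: the pixel values, the constant power density of 3 MW/km2, and the variable names are hypothetical, and the pixel area uses the 90m x 90m default mentioned above.

```python
# Hypothetical flattened pixels belonging to one supply curve point.
inclusion = [1.0, 0.7, 0.0, 1.0, 0.5, 0.0]       # scalar inclusion per pixel
pixel_cf = [0.40, 0.42, 0.38, 0.35, 0.45, 0.41]  # capacity factor mapped to each pixel
excl_area = 0.0081     # km2 per exclusion pixel (90 m x 90 m)
power_density = 3.0    # MW/km2, assumed constant for this sketch

# n_gids: boolean count, so a partially included pixel counts as 1.
n_gids = sum(1 for weight in inclusion if weight > 0)

# area_sq_km: partial inclusions contribute only their included fraction.
area_sq_km = sum(inclusion) * excl_area

# capacity: included area multiplied by the power density (MW).
capacity = area_sq_km * power_density

# mean_cf: arithmetic mean weighted by the inclusion layer.
mean_cf = sum(w * cf for w, cf in zip(inclusion, pixel_cf)) / sum(inclusion)

print(n_gids, area_sq_km, capacity, mean_cf)
```

Excluded pixels (weight 0) drop out of every quantity, while the pixel with weight 0.7 counts fully toward n_gids but only 70 percent toward area, capacity, and the capacity factor average.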

Optional outputs:

mean_friction: float

Mean of the friction data provided in ‘friction_fpath’ and ‘friction_dset’. The arithmetic mean is weighted by boolean inclusions and considers partial inclusions to be included.

mean_lcoe_friction: float

Mean of the nominal LCOE multiplied by mean_friction value.

mean_{dset}: float

Mean input h5 dataset(s) provided by the user in ‘h5_dsets’. These mean calculations are weighted by the partial inclusion layer.

data_layers: float | int | str | dict

Requested data layer aggregations. Each data layer must be the same shape as the exclusion layers.

  • mode: int | str

    Most common value of a given data layer after applying the boolean inclusion mask.

  • mean: float

    Arithmetic mean value of a given data layer weighted by the scalar inclusion mask (considers partial inclusions).

  • min: float | int

    Minimum value of a given data layer after applying the boolean inclusion mask.

  • max: float | int

    Maximum value of a given data layer after applying the boolean inclusion mask.

  • sum: float

    Sum of a given data layer weighted by the scalar inclusion mask (considers partial inclusions).

  • category: dict

    Dictionary mapping the unique values in the data_layer to the sum of inclusion scalar values associated with all pixels with that unique value.
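
The six aggregation methods listed above can be sketched in plain Python as follows. This is a hedged illustration of the described behavior, not the library's code: the helper name `aggregate_layer` and the sample values are hypothetical, and the layer is treated as a flat list of pixel values paired with scalar inclusion weights.

```python
from collections import Counter, defaultdict


def aggregate_layer(values, inclusion, method):
    """Aggregate one data layer for one supply curve point (illustrative)."""
    # Boolean inclusion mask: keep every pixel with a non-zero weight.
    included = [(v, w) for v, w in zip(values, inclusion) if w > 0]
    if not included:
        return None
    vals = [v for v, _ in included]
    if method == "mode":
        return Counter(vals).most_common(1)[0][0]
    if method == "min":
        return min(vals)
    if method == "max":
        return max(vals)
    if method == "sum":
        # Weighted by the scalar inclusion mask.
        return sum(v * w for v, w in included)
    if method == "mean":
        return sum(v * w for v, w in included) / sum(w for _, w in included)
    if method == "category":
        # Sum of inclusion weights per unique layer value.
        totals = defaultdict(float)
        for v, w in included:
            totals[v] += w
        return dict(totals)
    raise ValueError(f"Unknown aggregation method: {method!r}")


# Hypothetical land-cover codes and inclusion weights for six pixels.
values = [11, 42, 42, 42, 11, 7]
inclusion = [1.0, 0.5, 0.25, 1.0, 0.0, 0.0]

for method in ("mode", "mean", "min", "max", "sum", "category"):
    print(method, aggregate_layer(values, inclusion, method))
```

Note how "mode", "min", and "max" only use the boolean mask, whereas "mean", "sum", and "category" use the fractional weights.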

Methods

run(out_fpath[, gen_fpath, args, ...])

Run a supply curve aggregation.

run_parallel(gen_fpath[, args, max_workers, ...])

Get the supply curve points aggregation summary using futures.

run_serial(excl_fpath, gen_fpath, tm_dset, ...)

Standalone method to create agg summary - can be parallelized.

summarize(gen_fpath[, args, max_workers, ...])

Get the supply curve points aggregation summary

Attributes

gids

1D array of supply curve point gids to aggregate

shape

Get the shape of the full exclusions raster.

classmethod run_serial(excl_fpath, gen_fpath, tm_dset, gen_index, econ_fpath=None, excl_dict=None, inclusion_mask=None, area_filter_kernel='queen', min_area=None, resolution=64, gids=None, args=None, res_class_dset=None, res_class_bins=None, cf_dset='cf_mean-means', lcoe_dset='lcoe_fcr-means', h5_dsets=None, data_layers=None, power_density=None, friction_fpath=None, friction_dset=None, excl_area=None, cap_cost_scale=None, recalc_lcoe=True)

Standalone method to create agg summary - can be parallelized.

Parameters:
  • excl_fpath (str | list | tuple) – Filepath to exclusions h5 with techmap dataset (can be one or more filepaths).

  • gen_fpath (str) – Filepath to .h5 reV generation output results.

  • tm_dset (str) – Dataset name in the exclusions file containing the exclusions-to-resource mapping data.

  • gen_index (np.ndarray) – Array of generation gids with array index equal to resource gid. Array value is -1 if the resource index was not used in the generation run.

  • econ_fpath (str | None) – Filepath to .h5 reV econ output results. This is optional and only used if the lcoe_dset is not present in the gen_fpath file.

  • excl_dict (dict | None) – Dictionary of exclusion keyword arguments of the format {layer_dset_name: {kwarg: value}} where layer_dset_name is a dataset in the exclusion h5 file and kwarg is a keyword argument to the reV.supply_curve.exclusions.LayerMask class.

  • inclusion_mask (np.ndarray | dict, optional) – 2D array pre-extracted inclusion mask where 1 is included and 0 is excluded. This must either match the full exclusion shape or be a dict lookup of single-sc-point exclusion masks corresponding to the gids input and keyed by gids. By default None, which will calculate exclusions on the fly for each sc point.

  • area_filter_kernel (str) – Contiguous area filter method to use on final exclusions mask

  • min_area (float | None) – Minimum required contiguous area filter in sq-km

  • resolution (int | None) – SC resolution, must be input in combination with gid. Preferred option is to use the row/col slices to define the SC point instead.

  • gids (list | None) – List of supply curve point gids to get summary for (can use to subset if running in parallel), or None for all gids in the SC extent, by default None

  • args (list | None) – List of positional args for sc_point_method

  • res_class_dset (str | None) – Dataset in the generation file dictating resource classes. None if no resource classes.

  • res_class_bins (list | None) – List of two-entry lists dictating the resource class bins. None if no resource classes.

  • cf_dset (str) – Dataset name from f_gen containing capacity factor mean values.

  • lcoe_dset (str) – Dataset name from f_gen containing LCOE mean values.

  • h5_dsets (list | None) – Optional list of additional datasets from the source h5 gen/econ files to aggregate.

  • data_layers (None | dict) – Aggregation data layers. Must be a dictionary keyed by data label name. Each value must be another dictionary with “dset”, “method”, and “fpath”.

  • power_density (float | str | None) – Power density in MW/km2 or filepath to variable power density file. None will attempt to infer a constant power density from the generation meta data technology. Variable power density csvs must have “gid” and “power_density” columns where gid is the resource gid (typically wtk or nsrdb gid) and the power_density column is in MW/km2.

  • friction_fpath (str | None) – Filepath to friction surface data (cost based exclusions). Must be paired with friction_dset. The friction data must be the same shape as the exclusions. Friction input creates a new output “mean_lcoe_friction” which is the nominal LCOE multiplied by the friction data.

  • friction_dset (str | None) – Dataset name in friction_fpath for the friction surface data. Must be paired with friction_fpath. Must be same shape as exclusions.

  • excl_area (float | None, optional) – Area of an exclusion pixel in km2. None will try to infer the area from the profile transform attribute in excl_fpath, by default None

  • cap_cost_scale (str | None) – Optional LCOE scaling equation to implement “economies of scale”. Equations must be in python string format and return a scalar value to multiply the capital cost by. Independent variables in the equation should match the names of the columns in the reV supply curve aggregation table.

  • recalc_lcoe (bool) – Flag to re-calculate the LCOE from the multi-year mean capacity factor and annual energy production data. This requires several datasets to be aggregated in the h5_dsets input: system_capacity, fixed_charge_rate, capital_cost, fixed_operating_cost, and variable_operating_cost.
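
The contiguous area filter controlled by area_filter_kernel and min_area in the parameters above can be illustrated with a simple connected-component pass. This is a rough sketch assuming a binary inclusion mask stored as nested lists; the function name `area_filter`, the flood-fill approach, and the sample mask are hypothetical and are not taken from the reV source.

```python
# Neighbor offsets implied by the two kernels (center pixel omitted).
QUEEN = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
ROOK = [(-1, 0), (0, -1), (0, 1), (1, 0)]


def area_filter(mask, min_area, excl_area, kernel="queen"):
    """Zero out clusters of included pixels smaller than min_area (km2)."""
    offsets = QUEEN if kernel == "queen" else ROOK
    n_rows, n_cols = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    seen = [[False] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood fill to collect one connected cluster.
            stack, cluster = [(r, c)], []
            seen[r][c] = True
            while stack:
                i, j = stack.pop()
                cluster.append((i, j))
                for di, dj in offsets:
                    ni, nj = i + di, j + dj
                    if (0 <= ni < n_rows and 0 <= nj < n_cols
                            and mask[ni][nj] and not seen[ni][nj]):
                        seen[ni][nj] = True
                        stack.append((ni, nj))
            # Cluster area = pixel count times the area of one pixel.
            if len(cluster) * excl_area < min_area:
                for i, j in cluster:
                    out[i][j] = 0
    return out


# Hypothetical 4x4 inclusion mask with a diagonal pair in the lower right.
mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
queen_result = area_filter(mask, min_area=0.015, excl_area=0.0081, kernel="queen")
rook_result = area_filter(mask, min_area=0.015, excl_area=0.0081, kernel="rook")
print(queen_result)
print(rook_result)
```

With the queen kernel the diagonal pair forms one two-pixel cluster and survives the threshold; with the rook kernel the two pixels are isolated single-pixel clusters and are removed.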

Returns:

summary (list) – List of dictionaries, each being an SC point summary.

run_parallel(gen_fpath, args=None, max_workers=None, sites_per_worker=100)

Get the supply curve points aggregation summary using futures.

Parameters:
  • gen_fpath (str) – Filepath to .h5 reV generation output results.

  • args (tuple | list | None) – List of summary arguments to include. None defaults to all available args defined in the class attr.

  • max_workers (int | None, optional) – Number of cores to run summary on. None is all available cpus, by default None

  • sites_per_worker (int) – Number of sc_points to summarize on each worker, by default 100

Returns:

summary (list) – List of dictionaries, each being an SC point summary.

summarize(gen_fpath, args=None, max_workers=None, sites_per_worker=100)

Get the supply curve points aggregation summary

Parameters:
  • gen_fpath (str) – Filepath to .h5 reV generation output results.

  • args (tuple | list | None) – List of summary arguments to include. None defaults to all available args defined in the class attr.

  • max_workers (int | None, optional) – Number of cores to run summary on. None is all available cpus, by default None

  • sites_per_worker (int) – Number of sc_points to summarize on each worker, by default 100

Returns:

summary (list) – List of dictionaries, each being an SC point summary.

run(out_fpath, gen_fpath=None, args=None, max_workers=None, sites_per_worker=100)

Run a supply curve aggregation.

Parameters:
  • gen_fpath (str, optional) – Filepath to HDF5 file with reV generation output results. If None, a simple aggregation without any generation, resource, or cost data is performed.

    Note

    If executing reV from the command line, this input can be set to "PIPELINE" to parse the output from one of these preceding pipeline steps: multi-year, collect, or econ. However, note that duplicate executions of any of these commands within the pipeline may invalidate this parsing, meaning the gen_fpath input will have to be specified manually.

    By default, None.

  • args (tuple | list, optional) – List of columns to include in summary output table. None defaults to all available args defined in the SupplyCurveAggregation documentation. By default, None.

  • max_workers (int, optional) – Number of cores to run summary on. None is all available CPUs. By default, None.

  • sites_per_worker (int, optional) – Number of sc_points to summarize on each worker. By default, 100.

Returns:

str – Path to output CSV file containing supply curve aggregation.
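
A typical call sequence might look like the sketch below. It only uses arguments that appear in the signatures documented on this page; the wrapper name `run_aggregation`, the exclusion layer name, and the choice of `max_workers=1` are hypothetical, and all file paths and the techmap dataset name must be supplied by the caller.

```python
def run_aggregation(excl_fpath, tm_dset, gen_fpath, out_fpath):
    """Build the aggregator and write the supply curve CSV (illustrative)."""
    # Imported lazily so this sketch can be loaded without reV installed.
    from reV.supply_curve.sc_aggregation import SupplyCurveAggregation

    # Hypothetical exclusion layer; it must exist as a dataset in excl_fpath.
    excl_dict = {"typical_exclusion": {"exclude_values": 255}}

    agg = SupplyCurveAggregation(
        excl_fpath,
        tm_dset,
        excl_dict=excl_dict,
        resolution=64,
    )
    # Returns the path to the output CSV file.
    return agg.run(out_fpath, gen_fpath=gen_fpath, max_workers=1)
```

Keeping the import inside the function is only a convenience for this sketch; in a real script the import would normally sit at module level.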

property gids

1D array of supply curve point gids to aggregate

Returns:

ndarray

property shape

Get the shape of the full exclusions raster.

Returns:

tuple