reV.generation.generation.Gen
- class Gen(technology, project_points, sam_files, resource_file, low_res_resource_file=None, output_request=('cf_mean',), site_data=None, curtailment=None, gid_map=None, drop_leap=False, sites_per_worker=None, memory_utilization_limit=0.4, scale_outputs=True, write_mapped_gids=False, bias_correct=None)
Bases: BaseGen
ReV generation analysis class.
reV generation analysis runs SAM simulations by piping in renewable energy resource data (usually from the NSRDB or WTK), loading the SAM config, and then executing the PySAM compute module for a given technology. See the documentation for the reV SAM class (e.g. reV.SAM.generation.WindPower, reV.SAM.generation.PvWattsv8, reV.SAM.generation.Geothermal, etc.) for info on the allowed and/or required SAM config file inputs. If economic parameters are supplied in the SAM config, then you can bundle a “follow-on” econ calculation by just adding the desired econ output keys to the output_request. You can request reV to run the analysis for one or more “sites”, which correspond to the meta indices in the resource data (also commonly called the gids).
Examples
The following is an example of the simplest way to run reV generation. Note that TESTDATADIR refers to the local cloned repository and will need to be replaced with a valid path if you installed reV via a simple pip install.
>>> import os
>>> from reV import Gen, TESTDATADIR
>>>
>>> sam_tech = 'pvwattsv8'
>>> sites = 0
>>> fp_sam = os.path.join(TESTDATADIR, 'SAM/naris_pv_1axis_inv13.json')
>>> fp_res = os.path.join(TESTDATADIR, 'nsrdb/ri_100_nsrdb_2013.h5')
>>>
>>> gen = Gen(sam_tech, sites, fp_sam, fp_res)
>>> gen.run()
>>>
>>> gen.out
{'cf_mean': array([0.16966143], dtype=float32)}
>>>
>>> sites = [3, 4, 7, 9]
>>> req = ('cf_mean', 'lcoe_fcr')
>>> gen = Gen(sam_tech, sites, fp_sam, fp_res, output_request=req)
>>> gen.run()
>>>
>>> gen.out
{'fixed_charge_rate': array([0.096, 0.096, 0.096, 0.096]),
 'base_capital_cost': array([39767200, 39767200, 39767200, 39767200]),
 'base_variable_operating_cost': array([0, 0, 0, 0]),
 'base_fixed_operating_cost': array([260000, 260000, 260000, 260000]),
 'capital_cost': array([39767200, 39767200, 39767200, 39767200]),
 'fixed_operating_cost': array([260000, 260000, 260000, 260000]),
 'variable_operating_cost': array([0, 0, 0, 0]),
 'capital_cost_multiplier': array([1, 1, 1, 1]),
 'cf_mean': array([0.17859147, 0.17869979, 0.1834818 , 0.18646291]),
 'lcoe_fcr': array([130.32126, 130.24226, 126.84782, 124.81981])}
- Parameters:
technology (str) – String indicating which SAM technology to analyze. Must be one of the keys of OPTIONS. The string should be lower-cased with spaces and underscores removed.
project_points (int | list | tuple | str | dict | pd.DataFrame | slice) – Input specifying which sites to process. A single integer representing the generation GID of a site may be specified to evaluate reV at a single location. A list or tuple of integers (or slice) representing the generation GIDs of multiple sites can be specified to evaluate reV at multiple specific locations. A string pointing to a project points CSV file may also be specified. Typically, the CSV contains the following columns:
gid: Integer specifying the generation GID of each site.
config: Key in the sam_files input dictionary (see below) corresponding to the SAM configuration to use for each particular site. This value can also be None (or left out completely) if you specify only a single SAM configuration file as the sam_files input.
capital_cost_multiplier: This is an optional multiplier input that, if included, will be used to regionally scale the capital_cost input in the SAM config. If you include this column in your CSV, you do not need to specify capital_cost, unless you would like that value to vary regionally and independently of the multiplier (i.e. the multiplier will still be applied on top of the capital_cost input).
The CSV file may also contain other site-specific inputs by including a column named after a config keyword (e.g. a column called wind_turbine_rotor_diameter may be included to specify a site-specific turbine diameter for each location). Columns that do not correspond to a config key may also be included, but they will be ignored. A DataFrame following the same guidelines as the CSV input (or a dictionary that can be used to initialize such a DataFrame) may be used for this input as well.
Note
By default, the generation GID of each site is assumed to match the resource GID to be evaluated for that site. However, unique generation GIDs can be mapped to non-unique resource GIDs via the gid_map input (see the documentation for gid_map for more details).
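As an illustration of the project points CSV layout, the following sketch writes a small table with Python's standard csv module. The gid values, the config IDs ("onshore", "offshore"), and the multiplier values are hypothetical placeholders; only the column names come from the description of project_points.

```python
import csv
import io

# Hypothetical sites: the gid values, config IDs, and multipliers are
# placeholders chosen for illustration only.
rows = [
    {"gid": 0, "config": "onshore", "capital_cost_multiplier": 1.0},
    {"gid": 1, "config": "onshore", "capital_cost_multiplier": 1.05},
    {"gid": 7, "config": "offshore", "capital_cost_multiplier": 1.2},
]


def write_project_points(rows, handle):
    """Write project points rows to an open text handle as CSV."""
    fieldnames = ["gid", "config", "capital_cost_multiplier"]
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Write to an in-memory buffer; pass an open file handle to write to disk.
buffer = io.StringIO()
write_project_points(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Once saved to disk, the path to such a file can be passed as the project_points input, with each config value matching a key of the sam_files dictionary.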
sam_files (dict | str) – A dictionary mapping SAM input configuration ID(s) to SAM configuration(s). Keys are the SAM config ID(s) which correspond to the config column in the project points CSV. Values for each key are either a path to a corresponding SAM config file or a full dictionary of SAM config inputs. For example:
sam_files = {
    "default": "/path/to/default/sam.json",
    "onshore": "/path/to/onshore/sam_config.yaml",
    "offshore": {
        "sam_key_1": "sam_value_1",
        "sam_key_2": "sam_value_2",
        ...
    },
    ...
}
This input can also be a string pointing to a single SAM config file. In this case, the config column of the CSV points input should be set to None or left out completely. See the documentation for the reV SAM class (e.g. reV.SAM.generation.WindPower, reV.SAM.generation.PvWattsv8, reV.SAM.generation.Geothermal, etc.) for info on the allowed and/or required SAM config file inputs.
resource_file (str) – Filepath to resource data. This input can be a path to a single resource HDF5 file or a path including a wildcard input like /h5_dir/prefix*suffix (i.e. if your datasets for a single year are spread out over multiple files). In all cases, the resource data must be readable by rex.resource.Resource or rex.multi_file_resource.MultiFileResource (i.e. the resource data conform to the rex data format). This means the data file(s) must contain a 1D time_index dataset indicating the UTC time of observation, a 1D meta dataset represented by a DataFrame with site-specific columns, and 2D resource datasets that match the dimensions of (time_index, meta). The time index must start at 00:00 of January 1st of the year under consideration, and its shape must be a multiple of 8760.
Note
If executing reV from the command line, this input string can contain brackets {} that will be filled in by the analysis_years input. Alternatively, this input can be a list of explicit files to process. In this case, the length of the list must match the length of the analysis_years input exactly, and the paths are assumed to align with the analysis_years (i.e. the first path corresponds to the first analysis year, the second path corresponds to the second analysis year, and so on).
Important
If you are using custom resource data (i.e. not NSRDB/WTK/Sup3rCC, etc.), ensure the following:
The data conforms to the rex data format.
The meta DataFrame is organized such that every row is a pixel and at least the columns latitude, longitude, timezone, and elevation are given for each location.
The time index and associated temporal data is in UTC.
The latitude is between -90 and 90 and longitude is between -180 and 180.
For solar data, ensure the DNI/DHI are not zero. You can calculate one of these inputs from the other using the relationship
\[GHI = DNI \cdot \cos(SZA) + DHI\]
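The GHI/DNI/DHI relationship can be rearranged to fill in a missing irradiance component. The sketch below is a minimal illustration, assuming irradiance in W/m2 and the solar zenith angle (SZA) in degrees. The function names and the near-horizon cutoff are hypothetical and are not part of reV or rex.

```python
import math


def dhi_from_ghi_dni(ghi, dni, sza_deg):
    """Solve GHI = DNI * cos(SZA) + DHI for DHI."""
    return ghi - dni * math.cos(math.radians(sza_deg))


def dni_from_ghi_dhi(ghi, dhi, sza_deg, min_cos_sza=1e-3):
    """Solve GHI = DNI * cos(SZA) + DHI for DNI."""
    cos_sza = math.cos(math.radians(sza_deg))
    if cos_sza < min_cos_sza:
        # Assumption: treat the beam component as zero when the sun is
        # below or very near the horizon, where the division is unstable.
        return 0.0
    return (ghi - dhi) / cos_sza


# Sample values (hypothetical): SZA of 60 degrees, so cos(SZA) is about 0.5.
dhi = dhi_from_ghi_dni(ghi=600.0, dni=800.0, sza_deg=60.0)
dni = dni_from_ghi_dhi(ghi=600.0, dhi=200.0, sza_deg=60.0)
print(round(dhi, 6), round(dni, 6))
```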
low_res_resource_file (str, optional) – Optional low resolution resource file that will be dynamically mapped and interpolated to the nominal-resolution resource_file. This needs to be of the same format as resource_file: both files need to be handled by the same rex Resource handler (e.g. WindResource). All of the requirements from the resource_file apply to this input as well. If None, no dynamic mapping to higher resolutions is performed. By default, None.
output_request (list | tuple, optional) – List of output variables requested from SAM. Can be any of the parameters in the “Outputs” group of the PySAM module (e.g. PySAM.Windpower.Windpower.Outputs, PySAM.Pvwattsv8.Pvwattsv8.Outputs, PySAM.Geothermal.Geothermal.Outputs, etc.) being executed. This list can also include a select number of SAM config/resource parameters to include in the output: any key in any of the output attribute JSON files may be requested. If cf_mean is not included in this list, it will automatically be added. Time-series profiles requested via this input are output in UTC.
Note
If you are performing reV solar runs using PVWatts and would like reV to include AC capacity values in your aggregation/supply curves, then you must include the "dc_ac_ratio" time series as an output in output_request when running reV generation. The AC capacity outputs will automatically be added during the aggregation/supply curve step if the "dc_ac_ratio" dataset is detected in the generation file.
By default, ('cf_mean',).
site_data (str | pd.DataFrame, optional) – Site-specific input data for SAM calculation. If this input is a string, it should be a path that points to a CSV file. Otherwise, this input should be a DataFrame with pre-extracted site data. Rows in this table should match the input sites via a gid column. The rest of the columns should match configuration input keys that will take site-specific values. Note that some or all site-specific inputs can be specified via the project_points input table instead. If None, no site-specific data is considered.
Note
This input is often used to provide site-based regional capital cost multipliers. reV does not ingest multipliers directly; instead, this file is expected to have a capital_cost column that gives the multiplier-adjusted capital cost value for each location. Therefore, you must re-create this input file every time you change your base capital cost assumption.
By default, None.
curtailment (dict | str, optional) – Inputs for curtailment parameters, which can be:
Explicit namespace of curtailment variables (dict)
Pointer to curtailment config file with path (str)
The allowed key-value input pairs in the curtailment configuration are documented as properties of the reV.config.curtailment.Curtailment class. If None, no curtailment is modeled. By default, None.
gid_map (dict | str, optional) – Mapping of unique integer generation gids (keys) to single integer resource gids (values). This enables unique generation gids in the project points to map to non-unique resource gids, which can be useful when evaluating multiple resource datasets in reV (e.g., forecasted ECMWF resource data to complement historical WTK meteorology). This input can be a pre-extracted dictionary or a path to a JSON or CSV file. If this input points to a CSV file, the file must have the columns gid (which matches the project points) and gid_map (gids to extract from the resource input). If None, the GID values in the project points are assumed to match the resource GID values. By default, None.
drop_leap (bool, optional) – Drop leap day instead of final day of year when handling leap years. By default, False.
sites_per_worker (int, optional) – Number of sites to run in series on a worker. None defaults to the resource file chunk size. By default, None.
memory_utilization_limit (float, optional) – Memory utilization limit (fractional). Must be a value between 0 and 1. This input sets how many site results will be stored in-memory at any given time before flushing to disk. By default, 0.4.
scale_outputs (bool, optional) – Flag to scale outputs in-place immediately upon Gen returning data. By default, True.
write_mapped_gids (bool, optional) – Option to write mapped gids to output meta instead of resource gids. By default, False.
bias_correct (str | pd.DataFrame, optional) – Optional DataFrame or CSV filepath to a wind or solar resource bias correction table. This has columns:
gid: GID of site (can be index name of dataframe)
method: function name from rex.bias_correction module
The gid field should match the true resource gid regardless of the optional gid_map input. Only windspeed or GHI + DNI + DHI are corrected, depending on the technology (wind for the former, PV or CSP for the latter). See the functions in the rex.bias_correction module for available inputs for method. Any additional kwargs required for the requested method can be input as additional columns in the bias_correct table, e.g., for linear bias correction functions you can include scalar and adder inputs as columns in the bias_correct table on a site-by-site basis. If None, no corrections are applied. By default, None.
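To illustrate the gid_map input, the following sketch builds a mapping in which several generation gids share one resource gid, then serializes it both in the CSV layout (columns gid and gid_map) and as JSON. All gid values are hypothetical, and the sketch only prepares the input; it does not call reV.

```python
import csv
import io
import json

# Hypothetical mapping: six generation gids (keys) share three resource
# gids (values).
gid_map = {0: 100, 1: 100, 2: 101, 3: 101, 4: 102, 5: 102}

# JSON form. Note that json.dumps converts the integer keys to strings.
json_text = json.dumps(gid_map)

# CSV form with the two required columns, "gid" and "gid_map".
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["gid", "gid_map"])
writer.writerows(sorted(gid_map.items()))
csv_text = buffer.getvalue()

# The unique resource gids that would need to be read from the resource file.
resource_gids = sorted(set(gid_map.values()))
print(csv_text)
print(resource_gids)
```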
Methods
add_site_data_to_pp(site_data) – Add the site df (site-specific inputs) to project points dataframe.
flush() – Flush the output data in self.out attribute to disk in .h5 format.
get_pc(points, points_range, sam_configs, tech) – Get a PointsControl instance.
get_sites_per_worker(res_file[, default]) – Get the nominal sites per worker (x-chunk size) for a given file.
handle_leap_ti(ti[, drop_leap]) – Handle a time index for a leap year by dropping a day.
handle_lifetime_index(ti) – Adjust the time index if modeling full system lifetime.
run([out_fpath, max_workers, timeout, pool_size]) – Execute a parallel reV generation run with smart data flushing.
site_index(site_gid[, out_index]) – Get the index corresponding to the site gid.
unpack_futures(futures) – Combine list of futures results into their native dict format/type.
unpack_output(site_gid, site_output) – Unpack a SAM SiteOutput object to the output attribute.
Attributes
ECON_ATTRS
LCOE_ARGS
OPTIONS – reV technology options.
OUT_ATTRS
lr_res_file – Get the (optional) low-resolution resource filename and path.
meta – Get resource meta for all sites in project points.
out – Get the reV gen or econ output results.
out_chunk – Get the current output chunk index range (INCLUSIVE).
output_request – Get the output variables requested from the user.
points_control – Get project points controller.
project_points – Get project points.
res_file – Get the resource filename and path.
run_attrs – Run time attributes (__init__ args and kwargs).
sam_configs – Get the sam config dictionary.
sam_metas – SAM configurations including runtime module.
sam_module – Get the SAM module class to be used for SAM simulations.
site_data – Get the site-specific inputs in dataframe format.
site_limit – Get the number of site results that can be stored in memory at once.
site_mem – Get the memory (MB) required to store all results for a single site.
tech – Get the reV technology string.
time_index – Get the generation resource time index data.
year – Get the resource year.
- OPTIONS = {'geothermal': <class 'reV.SAM.generation.Geothermal'>, 'lineardirectsteam': <class 'reV.SAM.generation.LinearDirectSteam'>, 'mhkwave': <class 'reV.SAM.generation.MhkWave'>, 'pvsamv1': <class 'reV.SAM.generation.PvSamv1'>, 'pvwattsv5': <class 'reV.SAM.generation.PvWattsv5'>, 'pvwattsv7': <class 'reV.SAM.generation.PvWattsv7'>, 'pvwattsv8': <class 'reV.SAM.generation.PvWattsv8'>, 'solarwaterheat': <class 'reV.SAM.generation.SolarWaterHeat'>, 'tcsmoltensalt': <class 'reV.SAM.generation.TcsMoltenSalt'>, 'troughphysicalheat': <class 'reV.SAM.generation.TroughPhysicalHeat'>, 'windpower': <class 'reV.SAM.generation.WindPower'>}
reV technology options.
- property res_file
Get the resource filename and path.
- Returns:
res_file (str) – Filepath to single resource file, multi-h5 directory, or /h5_dir/prefix*suffix
- property lr_res_file
Get the (optional) low-resolution resource filename and path.
- Returns:
str | None
- property meta
Get resource meta for all sites in project points.
- Returns:
meta (pd.DataFrame) – Meta data df for sites in project points. Column names are meta data variables, rows are different sites. The row index does not indicate the site number if the project points are non-sequential or do not start from 0, so a SiteDataField.GID column is added.
- property time_index
Get the generation resource time index data.
- Returns:
_time_index (pandas.DatetimeIndex) – Time-series datetime index
- handle_lifetime_index(ti)
Adjust the time index if modeling full system lifetime.
- Parameters:
ti (pandas.DatetimeIndex) – Time-series datetime index with leap days.
- Returns:
ti (pandas.DatetimeIndex) – Time-series datetime index.
- add_site_data_to_pp(site_data)
Add the site df (site-specific inputs) to project points dataframe.
This ensures that only the relevant site’s data will be passed through to parallel workers when points_control is iterated and split.
- Parameters:
site_data (pd.DataFrame) – Site-specific data for econ calculation. Rows correspond to sites, columns are variables.
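Conceptually, this step joins site-specific inputs onto the project points by gid. The sketch below shows that idea with plain dictionaries. It is not reV's implementation (which operates on DataFrames), and the helper name, column values, and sample sites are hypothetical.

```python
def merge_site_data(points, site_data):
    """Attach site-specific columns to each project point, matched on gid.

    Both inputs are lists of row dictionaries that contain a "gid" key.
    Points without a matching site_data row are returned unchanged.
    """
    by_gid = {row["gid"]: row for row in site_data}
    merged = []
    for point in points:
        extra = by_gid.get(point["gid"], {})
        merged.append({**point, **extra})
    return merged


# Hypothetical project points and site-specific capital costs.
points = [{"gid": 3, "config": "default"}, {"gid": 4, "config": "default"}]
site_data = [{"gid": 3, "capital_cost": 41000000}]

merged = merge_site_data(points, site_data)
print(merged)
```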
- flush()
Flush the output data in self.out attribute to disk in .h5 format.
The data to be flushed is accessed from the instance attribute self.out. The disk target is based on the instance attribute self._out_fpath. Data is not flushed if _out_fpath is None or if .out is empty.
- classmethod get_pc(points, points_range, sam_configs, tech, sites_per_worker=None, res_file=None, curtailment=None)
Get a PointsControl instance.
- Parameters:
points (int | slice | list | str | pandas.DataFrame | PointsControl) – Single site integer, or slice or list specifying project points, or string pointing to a project points csv, or a pre-loaded project points DataFrame, or a fully instantiated PointsControl object.
points_range (list | None) – Optional two-entry list specifying the index range of the sites to analyze. To be taken from the reV.config.PointsControl.split_range property.
sam_configs (dict | str | SAMConfig) – SAM input configuration ID(s) and file path(s). Keys are the SAM config ID(s) which map to the config column in the project points CSV. Values are either a JSON SAM config file or dictionary of SAM config inputs. Can also be a single config file path or a pre-loaded SAMConfig object.
tech (str) – SAM technology to analyze (pvwattsv7, windpower, tcsmoltensalt, solarwaterheat, troughphysicalheat, lineardirectsteam). The string should be lower-cased with spaces and underscores removed.
sites_per_worker (int) – Number of sites to run in series on a worker. None defaults to the resource file chunk size.
res_file (str) – Filepath to single resource file, multi-h5 directory, or /h5_dir/prefix*suffix
curtailment (NoneType | dict | str | config.curtailment.Curtailment) – Inputs for curtailment parameters. If not None, curtailment inputs are expected. Can be:
Explicit namespace of curtailment variables (dict)
Pointer to curtailment config json file with path (str)
Instance of curtailment config object (config.curtailment.Curtailment)
- Returns:
pc (reV.config.project_points.PointsControl) – PointsControl object instance.
- static get_sites_per_worker(res_file, default=100)
Get the nominal sites per worker (x-chunk size) for a given file.
This is based on the concept that it is most efficient for one core to perform one read on one chunk of resource data, such that chunks will not have to be read into memory twice and no sites will be read redundantly.
- Parameters:
res_file (str) – Filepath to single resource file, multi-h5 directory, or /h5_dir/prefix*suffix
default (int) – Sites to be analyzed on a single core if the chunk size cannot be determined from res_file.
- Returns:
sites_per_worker (int) – Nominal sites to be analyzed per worker. This is set to the x-axis chunk size for windspeed and dni datasets for the WTK and NSRDB data, respectively.
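The idea of one worker handling one chunk-sized batch of sites can be sketched as follows. This is an illustration only: the helper name is hypothetical, and it assumes the project sites are simply split in order into batches of sites_per_worker, which may differ from how reV partitions work internally.

```python
def split_sites(gids, sites_per_worker=100):
    """Split a list of site gids into consecutive batches, one per worker."""
    if sites_per_worker < 1:
        raise ValueError("sites_per_worker must be a positive integer")
    return [
        gids[start:start + sites_per_worker]
        for start in range(0, len(gids), sites_per_worker)
    ]


# Hypothetical run: 250 sites with a chunk size of 100 sites.
batches = split_sites(list(range(250)), sites_per_worker=100)
print([len(batch) for batch in batches])
```

When the batch size matches the x-axis chunk size of the resource file, each batch corresponds to one chunk, so no chunk needs to be read by more than one worker.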
- static handle_leap_ti(ti, drop_leap=False)
Handle a time index for a leap year by dropping a day.
- Parameters:
ti (pandas.DatetimeIndex) – Time-series datetime index with or without leap days.
drop_leap (bool) – Option to drop leap days (if True) or drop the last day of each leap year (if False).
- Returns:
ti (pandas.DatetimeIndex) – Time-series datetime index with length a multiple of 365.
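The two drop_leap options can be sketched with the standard library alone. This is a simplified illustration, not reV's implementation: it uses a plain list of hourly datetime objects for a single year rather than a pandas.DatetimeIndex, and the helper names are hypothetical.

```python
import calendar
from datetime import datetime, timedelta


def hourly_index(year):
    """Build an hourly list of timestamps covering one calendar year."""
    n_days = 366 if calendar.isleap(year) else 365
    start = datetime(year, 1, 1)
    return [start + timedelta(hours=hour) for hour in range(n_days * 24)]


def handle_leap(ti, drop_leap=False):
    """Drop one day from a single-year leap-year index.

    drop_leap=True removes February 29; drop_leap=False removes December 31.
    Non-leap years are returned unchanged.
    """
    if not calendar.isleap(ti[0].year):
        return list(ti)
    if drop_leap:
        return [t for t in ti if not (t.month == 2 and t.day == 29)]
    return [t for t in ti if not (t.month == 12 and t.day == 31)]


ti_2012 = hourly_index(2012)
no_leap_day = handle_leap(ti_2012, drop_leap=True)
no_last_day = handle_leap(ti_2012, drop_leap=False)
print(len(ti_2012), len(no_leap_day), len(no_last_day))
```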
- property out
Get the reV gen or econ output results.
- Returns:
out (dict) – Dictionary of gen or econ results from SAM.
- property out_chunk
Get the current output chunk index range (INCLUSIVE).
- Returns:
_out_chunk (tuple) – Two entry tuple (start, end) indices (inclusive) for where the current data in-memory belongs in the final output.
- property output_request
Get the output variables requested from the user.
- Returns:
output_request (list) – Output variables requested from SAM.
- property points_control
Get project points controller.
- Returns:
points_control (reV.config.project_points.PointsControl) – Project points control instance for site and SAM config spec.
- property project_points
Get project points
- Returns:
project_points (reV.config.project_points.ProjectPoints) – Project points from the points control instance.
- run(out_fpath=None, max_workers=1, timeout=1800, pool_size=None)
Execute a parallel reV generation run with smart data flushing.
- Parameters:
out_fpath (str, optional) – Path to output file. If None, no output file will be written. If the filepath is specified but the module name (generation) and/or resource data year is not included, the module name and/or resource data year will get added to the output file name. By default, None.
max_workers (int, optional) – Number of local workers to run on. If None, or if running from the command line and omitting this argument from your config file completely, this input is set to os.cpu_count(). Otherwise, the default is 1.
timeout (int, optional) – Number of seconds to wait for parallel run iteration to complete before returning zeros. By default, 1800 seconds.
pool_size (int, optional) – Number of futures to submit to a single process pool for parallel futures. If None, the pool size is set to os.cpu_count() * 2. By default, None.
- Returns:
str | None – Path to output HDF5 file, or None if results were not written to disk.
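The defaulting rules for max_workers and pool_size can be written out as a small helper. This is a sketch of the documented behavior only, with a hypothetical function name; it does not reproduce how reV resolves these values internally.

```python
import os


def resolve_parallel_settings(max_workers=1, pool_size=None):
    """Resolve None values using the CPU count, per the documented defaults."""
    # os.cpu_count() can return None on some platforms; fall back to 1
    # (this fallback is an assumption of the sketch).
    cpu_count = os.cpu_count() or 1
    if max_workers is None:
        max_workers = cpu_count
    if pool_size is None:
        pool_size = cpu_count * 2
    return max_workers, pool_size


serial = resolve_parallel_settings()
all_cores = resolve_parallel_settings(max_workers=None)
print(serial, all_cores)
```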
- property run_attrs
Run time attributes (__init__ args and kwargs)
- Returns:
run_attrs (dict) – Dictionary of runtime args and kwargs
- property sam_configs
Get the sam config dictionary.
- Returns:
sam_configs (dict) – SAM config from the project points instance.
- property sam_metas
SAM configurations including runtime module
- Returns:
sam_metas (dict) – Nested dictionary of SAM configuration files with module used at runtime
- property sam_module
Get the SAM module class to be used for SAM simulations.
- Returns:
sam_module (object) – SAM object like PySAM.Pvwattsv7 or PySAM.Lcoefcr
- property site_data
Get the site-specific inputs in dataframe format.
- Returns:
_site_data (pd.DataFrame) – Site-specific input data for gen or econ calculation. Rows match sites, columns are variables.
- site_index(site_gid, out_index=False)
Get the index corresponding to the site gid.
- Parameters:
site_gid (int) – Resource-native site index (gid).
out_index (bool) – Option to get output index (if true) which is the column index in the current in-memory output array, or (if false) the global site index from the project points site list.
- Returns:
index (int) – Global site index if out_index=False, otherwise column index in the current in-memory output array.
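The difference between the global site index and the in-memory output index can be sketched as follows, assuming the in-memory output covers the inclusive (start, end) range given by out_chunk. The standalone function signature and sample values are hypothetical; the actual method reads these from the Gen instance.

```python
def site_index(site_gid, project_gids, out_chunk, out_index=False):
    """Return the global site index, or the column index within the chunk."""
    # Position of the gid in the full project points site list.
    global_index = project_gids.index(site_gid)
    if not out_index:
        return global_index
    start, end = out_chunk  # inclusive bounds of the in-memory output
    if not start <= global_index <= end:
        raise ValueError(
            f"Site gid {site_gid} is outside the current output chunk "
            f"{out_chunk}."
        )
    return global_index - start


# Hypothetical state: four project sites, with the last two held in memory.
project_gids = [3, 4, 7, 9]
out_chunk = (2, 3)

global_idx = site_index(7, project_gids, out_chunk)
column_idx = site_index(7, project_gids, out_chunk, out_index=True)
print(global_idx, column_idx)
```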
- property site_limit
Get the number of site results that can be stored in memory at once.
- Returns:
_site_limit (int) – Number of site result sets that can be stored in memory at once without violating memory limits.
- property site_mem
Get the memory (MB) required to store all results for a single site.
- Returns:
_site_mem (float) – Memory (MB) required to store all results requested in output_request for a single site.
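One way to picture how site_mem, site_limit, and memory_utilization_limit relate is the sketch below. It assumes the site limit is the memory budget (the utilization fraction times total system memory) divided by the per-site memory; the function name and the explicit total_mem_mb argument are hypothetical, and reV's actual calculation may differ.

```python
import math


def estimate_site_limit(total_mem_mb, site_mem_mb, memory_utilization_limit=0.4):
    """Estimate how many site result sets fit within the memory budget."""
    if not 0 < memory_utilization_limit <= 1:
        raise ValueError("memory_utilization_limit must be between 0 and 1")
    if site_mem_mb <= 0:
        raise ValueError("site_mem_mb must be positive")
    budget_mb = memory_utilization_limit * total_mem_mb
    return math.floor(budget_mb / site_mem_mb)


# Hypothetical machine with 16000 MB of memory and 2 MB of results per site.
limit = estimate_site_limit(16000, 2.0, memory_utilization_limit=0.5)
print(limit)
```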
- property tech
Get the reV technology string.
- Returns:
tech (str) – SAM technology to analyze (pvwattsv7, windpower, tcsmoltensalt, solarwaterheat, troughphysicalheat, lineardirectsteam, econ). The string should be lower-cased with spaces and underscores removed.
- static unpack_futures(futures)
Combine list of futures results into their native dict format/type.
- Parameters:
futures (list) – List of dictionary futures results.
- Returns:
out (dict) – Compiled results of the native future results type (dict).
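Combining per-worker results can be pictured as merging a list of dictionaries into one. The sketch below assumes each worker returns a dictionary keyed by site gid; the sample gids and values are hypothetical, and this is an illustration rather than reV's source.

```python
def unpack_futures(futures):
    """Merge a list of per-worker result dictionaries into a single dict."""
    out = {}
    for result in futures:
        # Later results overwrite earlier ones if a key repeats.
        out.update(result)
    return out


# Hypothetical results from two workers, keyed by site gid.
futures = [
    {3: {"cf_mean": 0.178}, 4: {"cf_mean": 0.179}},
    {7: {"cf_mean": 0.183}, 9: {"cf_mean": 0.186}},
]

combined = unpack_futures(futures)
print(sorted(combined))
```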
- unpack_output(site_gid, site_output)
Unpack a SAM SiteOutput object to the output attribute.
- Parameters:
site_gid (int) – Resource-native site gid (index).
site_output (dict) – SAM site output object.
- property year
Get the resource year.
- Returns:
_year (int) – Year of the time-series datetime index.