reV.handlers.multi_year.MultiYear

class MultiYear(h5_file, group=None, unscale=True, mode='r', str_decode=True)

Bases: Outputs

Class to handle multiple years of data and:

- collect datasets from multiple years
- compute multi-year means
- compute multi-year standard deviations
- compute multi-year coefficients of variation

Parameters:

h5_file (str) – Path to .h5 resource file
group (str) – Group to collect datasets into
unscale (bool) – Boolean flag to automatically unscale variables on extraction
mode (str) – Mode to instantiate h5py.File instance
str_decode (bool) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read.

Methods

`CV`(dset)	Extract or compute multi-year coefficient of variation for given source dset
`add_dataset`(h5_file, dset_name, dset_data, dtype)	Add dataset to h5_file
`close`()	Close h5 instance
`collect`(source_files, dset[, profiles, ...])	Collect dataset dset from given list of h5 files
`collect_means`(my_file, source_files, dset[, ...])	Collect and compute multi-year means for given dataset
`collect_profiles`(my_file, source_files, dset)	Collect multi-year profiles associated with given dataset
`df_str_decode`(df)	Decode a dataframe with byte string columns into ordinary str cols.
`get_SAM_df`(site)	Placeholder for get_SAM_df method that is resource specific
`get_attrs`([dset])	Get h5 attributes either from file or dataset
`get_config`(config_name)	Get SAM config
`get_dset_properties`(dset)	Get dataset properties (shape, dtype, chunks)
`get_meta_arr`(rec_name[, rows])	Get a meta array by name (faster than DataFrame extraction).
`get_scale_factor`(dset)	Get dataset scale factor
`get_units`(dset)	Get dataset units
`init_h5`(h5_file, dsets, shapes, attrs, ...)	Init a full output file with the final intended shape without data.
`is_hsds_file`(file_path)	Parse one or more filepaths to determine if they are hsds
`is_profile`(source_files, dset)	Check dataset in source files to see if it is a profile.
`is_s3_file`(file_path)	Parse one or more filepaths to determine if they are s3
`means`(dset)	Extract or compute multi-year means for given source dset
`open_dataset`(ds_name)	Open resource dataset
`open_file`(file_path[, mode, hsds, hsds_kwargs])	Open a filepath to an h5, s3, or hsds nrel resource file with the appropriate python object.
`parse_source_files_pattern`(source_files)	Parse a source_files pattern that can be either an explicit list of source files or a unix-style /filepath/pattern*.h5 and either way return a list of explicit filepaths.
`pass_through`(my_file, source_files, dset[, ...])	Pass through a dataset that is identical in all source files to a dataset of the same name in the output multi-year file.
`preload_SAM`(h5_file, sites, tech[, unscale, ...])	Pre-load project_points for SAM
`set_configs`(SAM_configs)	Set SAM configuration JSONs as attributes of 'meta'
`set_version_attr`()	Set the version attribute to the h5 file.
`stdev`(dset)	Extract or compute multi-year standard deviation for given source dset
`update_dset`(dset, dset_array[, dset_slice])	Check to see if dset needs to be updated on disk. If so, write dset_array to disk
`write_dataset`(dset_name, data, dtype[, ...])	Write dataset to disk.
`write_means`(h5_file, meta, dset_name, means, ...)	Write means array to disk
`write_profiles`(h5_file, meta, time_index, ...)	Write profiles to disk

Attributes

`ADD_ATTR`
`SAM_configs`	SAM configuration JSONs used to create CF profiles
`SCALE_ATTR`
`UNIT_ATTR`
`adders`	Dictionary of all dataset add offset factors
`attrs`	Dictionary of all dataset attributes
`chunks`	Dictionary of all dataset chunk sizes
`coordinates`	(lat, lon) pairs
`data_version`	Get the version attribute of the data.
`datasets`	Datasets available
`dsets`	Datasets available
`dtypes`	Dictionary of all dataset dtypes
`full_version_record`	Get record of versions for dependencies
`global_attrs`	Global (file) attributes
`groups`	Groups available
`h5`	Open h5py File instance.
`lat_lon`	Extract (latitude, longitude) pairs
`meta`	Resource meta data DataFrame
`package`	Package used to create file
`res_dsets`	Available resource datasets
`resource_datasets`	Available resource datasets
`run_attrs`	Runtime attributes stored at the global (file) level
`scale_factors`	Dictionary of all dataset scale factors
`shape`	Variable array shape from time_index and meta
`shapes`	Dictionary of all dataset shapes
`source`	Package and version used to create file
`time_index`	Resource DatetimeIndex
`units`	Dictionary of all dataset units
`version`	Version of package used to create file
`writable`	Check to see if h5py.File instance is writable

static parse_source_files_pattern(source_files)

Parse a source_files pattern that can be either an explicit list of source files or a unix-style /filepath/pattern*.h5 and either way return a list of explicit filepaths.

Parameters:: source_files (list | str) – List of .h5 files to collect datasets from. This can also be a unix-style /filepath/pattern*.h5 to find .h5 files to collect, however all resulting files must be .h5 otherwise an exception will be raised. NOTE: .h5 file names must indicate the year the data pertains to
Returns:: source_files (list) – List of .h5 filepaths.
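The pattern handling described above can be illustrated with a small standalone sketch. This is not reV's implementation: `parse_source_files` and `year_from_name` are hypothetical helpers, the use of `glob` and the `ValueError` type are assumptions, and the four-digit year regex is only one plausible reading of the note that file names must indicate the year.

```python
import glob
import os
import re


def parse_source_files(source_files):
    """Return an explicit list of .h5 paths from a list or a unix-style pattern.

    Hypothetical helper, not part of reV.
    """
    if isinstance(source_files, str):
        # A string is treated as a pattern such as /filepath/pattern*.h5
        source_files = sorted(glob.glob(source_files))
    else:
        source_files = list(source_files)

    # Every resulting file must be an .h5 file, otherwise raise
    bad = [f for f in source_files if not f.endswith(".h5")]
    if bad:
        raise ValueError("All source files must be .h5 files, got: {}".format(bad))

    return source_files


def year_from_name(fpath):
    """Pull a four-digit year out of a file name (assumed naming rule)."""
    match = re.search(r"(19|20)\d{2}", os.path.basename(fpath))
    if match is None:
        raise ValueError("No year found in file name: {}".format(fpath))
    return int(match.group(0))
```

Sorting the glob result keeps the file order deterministic, which is convenient when the years are later paired with datasets.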

collect(source_files, dset, profiles=False, pass_through=False)

Collect dataset dset from given list of h5 files

Parameters:

source_files (list | str) – List of .h5 files to collect datasets from. This can also be a unix-style /filepath/pattern*.h5 to find .h5 files to collect, however all resulting files must be .h5 otherwise an exception will be raised. NOTE: .h5 file names must indicate the year the data pertains to
dset (str) – Dataset to collect
profiles (bool) – Boolean flag to indicate if profiles are being collected. If True, also collect time_index
pass_through (bool) – Flag to just pass through dataset without name modifications (no differences between years, no means or stdevs)

means(dset)

Extract or compute multi-year means for given source dset

Parameters:: dset (str) – Dataset of interest
Returns:: my_means (ndarray) – Array of multi-year means for dataset of interest

stdev(dset)

Extract or compute multi-year standard deviation for given source dset

Parameters:: dset (str) – Dataset of interest
Returns:: my_stdev (ndarray) – Array of multi-year standard deviation for dataset of interest

CV(dset)

Extract or compute multi-year coefficient of variation for given source dset

Parameters:: dset (str) – Dataset of interest
Returns:: my_cv (ndarray) – Array of multi-year coefficient of variation for dataset of interest
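The relationship between the three statistics returned by means, stdev and CV can be sketched in plain Python. This is an illustration only, under stated assumptions: `multi_year_stats` is a hypothetical helper, the population standard deviation is assumed (the page does not say which form reV uses), and the coefficient of variation is taken as the standard deviation divided by the mean.

```python
from statistics import fmean, pstdev


def multi_year_stats(annual_values):
    """Compute per-site multi-year mean, stdev and CV.

    annual_values: dict mapping year -> list of per-site values
    (hypothetical input layout, one value per site per year).
    """
    years = sorted(annual_values)
    n_sites = len(annual_values[years[0]])

    means, stdevs, cvs = [], [], []
    for i in range(n_sites):
        # Gather the values of one site across all years
        site = [annual_values[y][i] for y in years]
        mu = fmean(site)
        sigma = pstdev(site)  # assumption: population standard deviation
        means.append(mu)
        stdevs.append(sigma)
        # CV is undefined for a zero mean, so report NaN in that case
        cvs.append(sigma / mu if mu != 0 else float("nan"))

    return means, stdevs, cvs
```

A site whose annual values never change has a standard deviation and CV of zero, which makes CV a convenient measure of inter-annual variability relative to the mean.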

classmethod is_profile(source_files, dset)

Check dataset in source files to see if it is a profile.

Parameters:

source_files (list | str) – List of .h5 files to collect datasets from. This can also be a unix-style /filepath/pattern*.h5 to find .h5 files to collect, however all resulting files must be .h5 otherwise an exception will be raised. NOTE: .h5 file names must indicate the year the data pertains to
dset (str) – Dataset to collect

Returns:

is_profile (bool) – True if profile, False if not.

classmethod pass_through(my_file, source_files, dset, group=None)

Pass through a dataset that is identical in all source files to a dataset of the same name in the output multi-year file.

Parameters:

my_file (str) – Path to multi-year .h5 file
source_files (list | str) – List of .h5 files to collect datasets from. This can also be a unix-style /filepath/pattern*.h5 to find .h5 files to collect, however all resulting files must be .h5 otherwise an exception will be raised. NOTE: .h5 file names must indicate the year the data pertains to
dset (str) – Dataset to pass through (will also be the name of the output dataset in my_file)
group (str) – Group to collect datasets into

classmethod collect_means(my_file, source_files, dset, group=None)

Collect and compute multi-year means for given dataset

Parameters:

my_file (str) – Path to multi-year .h5 file
source_files (list | str) – List of .h5 files to collect datasets from. This can also be a unix-style /filepath/pattern*.h5 to find .h5 files to collect, however all resulting files must be .h5 otherwise an exception will be raised. NOTE: .h5 file names must indicate the year the data pertains to
dset (str) – Dataset to collect
group (str) – Group to collect datasets into

classmethod collect_profiles(my_file, source_files, dset, group=None)

Collect multi-year profiles associated with given dataset

Parameters:

my_file (str) – Path to multi-year .h5 file
source_files (list | str) – List of .h5 files to collect datasets from. This can also be a unix-style /filepath/pattern*.h5 to find .h5 files to collect, however all resulting files must be .h5 otherwise an exception will be raised. NOTE: .h5 file names must indicate the year the data pertains to
dset (str) – Profiles dataset to collect
group (str) – Group to collect datasets into
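One way the is_profile, collect_profiles and collect_means classmethods might be combined is sketched below. `collect_dataset` and its `handler` argument are hypothetical and exist only for this illustration; the calls use the signatures listed on this page, and the routing logic is an assumption about typical use rather than a statement of how reV drives these methods internally.

```python
def collect_dataset(my_file, source_files, dset, group=None, handler=None):
    """Route dset to collect_profiles or collect_means based on is_profile.

    Hypothetical wrapper; handler defaults to reV's MultiYear class.
    """
    if handler is None:
        # Imported lazily so the sketch can be read without reV installed
        from reV.handlers.multi_year import MultiYear as handler

    if handler.is_profile(source_files, dset):
        # Profile datasets are collected together with their time_index
        handler.collect_profiles(my_file, source_files, dset, group=group)
        return "profiles"

    # Non-profile datasets get multi-year means computed
    handler.collect_means(my_file, source_files, dset, group=group)
    return "means"
```

Passing the handler explicitly keeps the wrapper easy to exercise with a stand-in object; in normal use the argument would be left as None.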

property SAM_configs

SAM configuration JSONs used to create CF profiles

Returns:: configs (dict) – Dictionary of SAM configuration JSONs

classmethod add_dataset(h5_file, dset_name, dset_data, dtype, attrs=None, chunks=None, unscale=True, mode='a', str_decode=True, group=None)

Add dataset to h5_file

Parameters:

h5_file (str) – Path to .h5 resource file
dset_name (str) – Name of dataset to be added to h5 file
dset_data (ndarray) – Data to be added to h5 file
dtype (str) – Intended dataset datatype after scaling.
attrs (dict, optional) – Attributes to be set. May include ‘scale_factor’, by default None
unscale (bool, optional) – Boolean flag to automatically unscale variables on extraction, by default True
mode (str, optional) – Mode to instantiate h5py.File instance, by default ‘a’
str_decode (bool, optional) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read, by default True
group (str, optional) – Group within .h5 resource file to open, by default None

property adders

Dictionary of all dataset add offset factors

Returns:: adders (dict)

property attrs

Dictionary of all dataset attributes

Returns:: attrs (dict)

property chunks

Dictionary of all dataset chunk sizes

Returns:: chunks (dict)

close(): Close h5 instance

property coordinates

(lat, lon) pairs

Returns:: lat_lon (ndarray)
Type:: Coordinates

property data_version

Get the version attribute of the data. None if not available.

Returns:: version (str | None)

property datasets

Datasets available

Returns:: list

static df_str_decode(df)

Decode a dataframe with byte string columns into ordinary str cols.

Parameters:: df (pd.DataFrame) – Dataframe with some columns being byte strings.
Returns:: df (pd.DataFrame) – DataFrame with str columns instead of byte str columns.

property dsets

Datasets available

Returns:: list

property dtypes

Dictionary of all dataset dtypes

Returns:: dtypes (dict)

property full_version_record

Get record of versions for dependencies

Returns:: dict – Dictionary of package versions for dependencies

get_SAM_df(site)

Placeholder for get_SAM_df method that is resource specific

Parameters:: site (int) – Site to extract SAM DataFrame for

get_attrs(dset=None)

Get h5 attributes either from file or dataset

Parameters:: dset (str) – Dataset to get attributes for, if None get file (global) attributes
Returns:: attrs (dict) – Dataset or file attributes

get_config(config_name)

Get SAM config

Parameters:: config_name (str) – Name of config
Returns:: config (dict) – SAM config JSON as a dictionary

get_dset_properties(dset)

Get dataset properties (shape, dtype, chunks)

Parameters:

dset (str) – Dataset to get properties for

Returns:

shape (tuple) – Dataset array shape
dtype (str) – Dataset array dtype
chunks (tuple) – Dataset chunk size

get_meta_arr(rec_name, rows=slice(None, None, None))

Get a meta array by name (faster than DataFrame extraction).

Parameters:

rec_name (str) – Named record from the meta data to retrieve.
rows (slice) – Rows of the record to extract.

Returns:

meta_arr (np.ndarray) – Extracted array from the meta data record name.

get_scale_factor(dset)

Get dataset scale factor

Parameters:: dset (str) – Dataset to get scale factor for
Returns:: float – Dataset scale factor, used to unscale int values to floats
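The role of the scale factor can be shown with a short round trip. This is a sketch under assumptions: `scale_for_storage` and `unscale` are hypothetical helpers, and the convention of multiplying by the scale factor on write and dividing on read is assumed here, since the page only states that the factor is used to unscale int values to floats. Any add offset (see `adders`) is ignored.

```python
def scale_for_storage(values, scale_factor):
    """Convert floats to integers for compact storage (assumed convention)."""
    return [int(round(v * scale_factor)) for v in values]


def unscale(stored, scale_factor):
    """Recover approximate floats from stored integers (assumed convention)."""
    return [s / scale_factor for s in stored]


# Example round trip with a scale factor of 1000
stored = scale_for_storage([0.4213, 0.25], 1000)
restored = unscale(stored, 1000)
```

Precision beyond what the scale factor preserves is lost on write, so the restored values match the originals only to roughly 1 / scale_factor.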

get_units(dset)

Get dataset units

Parameters:: dset (str) – Dataset to get units for
Returns:: str – Dataset units, None if not defined

property global_attrs

Global (file) attributes

Returns:: global_attrs (dict)

property groups

Groups available

Returns:: groups (list) – List of groups

property h5

Open h5py File instance. If _group is not None return open Group

Returns:: h5 (h5py.File | h5py.Group)

classmethod init_h5(h5_file, dsets, shapes, attrs, chunks, dtypes, meta, time_index=None, configs=None, unscale=True, mode='w', str_decode=True, group=None, run_attrs=None)

Init a full output file with the final intended shape without data.

Parameters:

h5_file (str) – Full h5 output filepath.
dsets (list) – List of strings of dataset names to initialize (does not include meta or time_index).
shapes (dict) – Dictionary of dataset shapes (keys correspond to dsets).
attrs (dict) – Dictionary of dataset attributes (keys correspond to dsets).
chunks (dict) – Dictionary of chunk tuples (keys correspond to dsets).
dtypes (dict) – Dictionary of numpy datatypes (keys correspond to dsets).
meta (pd.DataFrame) – Full meta data.
time_index (pd.DatetimeIndex | None) – Full pandas datetime index. None implies that only 1D results (no site profiles) are being written.
configs (dict | None) – Optional input configs to set as attr on meta.
unscale (bool) – Boolean flag to automatically unscale variables on extraction
mode (str) – Mode to instantiate h5py.File instance
str_decode (bool) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read.
group (str) – Group within .h5 resource file to open
run_attrs (dict | NoneType) – Runtime attributes (args, kwargs) to add as global (file) attributes

static is_hsds_file(file_path)

Parse one or more filepaths to determine if they are hsds

Parameters:: file_path (str | list) – One or more file paths (only the first is parsed if multiple)
Returns:: is_hsds_file (bool) – True if hsds

static is_s3_file(file_path)

Parse one or more filepaths to determine if they are s3

Parameters:: file_path (str | list) – One or more file paths (only the first is parsed if multiple)
Returns:: is_s3_file (bool) – True if s3
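The two checks above can be approximated with simple prefix tests. The sketch below is not the library's code: the helper names are hypothetical, and the `/nrel/` and `s3://` prefixes are taken from the open_file parameter description on this page. As documented, only the first path is inspected when several are given.

```python
def first_path(file_path):
    """Return the first path when a list or tuple of paths is given."""
    if isinstance(file_path, (list, tuple)):
        return file_path[0]
    return file_path


def looks_like_hsds(file_path):
    """True if the (first) path starts with the /nrel/ HSDS prefix."""
    return str(first_path(file_path)).startswith("/nrel/")


def looks_like_s3(file_path):
    """True if the (first) path starts with the s3:// prefix."""
    return str(first_path(file_path)).startswith("s3://")
```

Because only the first entry is parsed, a mixed list of local and remote paths is classified by its first element alone.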

property lat_lon

Extract (latitude, longitude) pairs

Returns:: lat_lon (ndarray)

property meta

Resource meta data DataFrame

Returns:: meta (pandas.DataFrame)

open_dataset(ds_name)

Open resource dataset

Parameters:: ds_name (str) – Dataset name to open
Returns:: ds (ResourceDataset) – Handler for the opened resource dataset

classmethod open_file(file_path, mode='r', hsds=False, hsds_kwargs=None)

Open a filepath to an h5, s3, or hsds nrel resource file with the appropriate python object.

Parameters:

file_path (str) – String filepath to .h5 file to extract resource from. Can also be a path to an HSDS file (starts with /nrel/) or S3 file (starts with s3://)
mode (str, optional) – Mode to instantiate h5py.File instance, by default ‘r’
hsds (bool, optional) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS, by default False. This is now redundant; file paths starting with /nrel/ will be treated as hsds=True by default
hsds_kwargs (dict, optional) – Dictionary of optional kwargs for h5pyd, e.g., bucket, username, password, by default None

Returns:

file (h5py.File | h5pyd.File) – H5 file handler either opening the local file using h5py, or the file on s3 using h5py and fsspec, or the file on HSDS using h5pyd.

property package

Package used to create file

Returns:: str

classmethod preload_SAM(h5_file, sites, tech, unscale=True, str_decode=True, group=None, hsds=False, hsds_kwargs=None, time_index_step=None, means=False)

Pre-load project_points for SAM

Parameters:

h5_file (str) – String filepath to .h5 file to extract resource from. Can also be a path to an HSDS file (starts with /nrel/) or S3 file (starts with s3://)
sites (list) – List of sites to be provided to SAM (sites is synonymous with gids aka spatial indices)
tech (str) – Technology to be run by SAM
unscale (bool) – Boolean flag to automatically unscale variables on extraction
str_decode (bool) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read.
group (str) – Group within .h5 resource file to open
hsds (bool, optional) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS, by default False. This is now redundant; file paths starting with /nrel/ will be treated as hsds=True by default
hsds_kwargs (dict, optional) – Dictionary of optional kwargs for h5pyd, e.g., bucket, username, password, by default None
time_index_step (int, optional) – Step size for time_index, used to reduce temporal resolution, by default None
means (bool, optional) – Boolean flag to compute mean resource when res_array is set, by default False

Returns:

SAM_res (SAMResource) – Instance of SAMResource pre-loaded with Solar resource for sites in project_points

property res_dsets

Available resource datasets

Returns:: list

property resource_datasets

Available resource datasets

Returns:: list

property run_attrs

Runtime attributes stored at the global (file) level

Returns:: global_attrs (dict)

property scale_factors

Dictionary of all dataset scale factors

Returns:: scale_factors (dict)

set_configs(SAM_configs)

Set SAM configuration JSONs as attributes of ‘meta’

Parameters:: SAM_configs (dict) – Dictionary of SAM configuration JSONs

set_version_attr(): Set the version attribute to the h5 file.

property shape

Variable array shape from time_index and meta

Returns:: tuple – shape of variables arrays == (time, locations)

property shapes

Dictionary of all dataset shapes

Returns:: shapes (dict)

property source

Package and version used to create file

Returns:: str

property time_index

Resource DatetimeIndex

Returns:: time_index (pandas.DatetimeIndex)

property units

Dictionary of all dataset units

Returns:: units (dict)

update_dset(dset, dset_array, dset_slice=None)

Check to see if dset needs to be updated on disk. If so, write dset_array to disk

Parameters:

dset (str) – dataset to update
dset_array (ndarray) – dataset array
dset_slice (tuple) – slice of dataset to update, if None update all

property version

Version of package used to create file

Returns:: str

property writable

Check to see if h5py.File instance is writable

Returns:: is_writable (bool) – Flag if mode is writable

write_dataset(dset_name, data, dtype, chunks=None, attrs=None)

Write dataset to disk. Dataset is created in .h5 file and data is scaled if needed.

Parameters:

dset_name (str) – Name of dataset to be added to h5 file.
data (ndarray) – Data to be added to h5 file.
dtype (str) – Intended dataset datatype after scaling.
chunks (tuple) – Chunk size for capacity factor means dataset.
attrs (dict) – Attributes to be set. May include ‘scale_factor’.

classmethod write_means(h5_file, meta, dset_name, means, dtype, attrs=None, SAM_configs=None, chunks=None, unscale=True, mode='w-', str_decode=True, group=None)

Write means array to disk

Parameters:

h5_file (str) – Path to .h5 resource file
meta (pandas.DataFrame) – Locational meta data
dset_name (str) – Name of the target dataset (should identify the means).
means (ndarray) – output means array.
dtype (str) – Intended dataset datatype after scaling.
attrs (dict, optional) – Attributes to be set. May include ‘scale_factor’, by default None
SAM_configs (dict, optional) – Dictionary of SAM configuration JSONs used to compute cf means, by default None
chunks (tuple, optional) – Chunk size for capacity factor means dataset, by default None
unscale (bool, optional) – Boolean flag to automatically unscale variables on extraction, by default True
mode (str, optional) – Mode to instantiate h5py.File instance, by default ‘w-’
str_decode (bool, optional) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read, by default True
group (str, optional) – Group within .h5 resource file to open, by default None

classmethod write_profiles(h5_file, meta, time_index, dset_name, profiles, dtype, attrs=None, SAM_configs=None, chunks=(None, 100), unscale=True, mode='w-', str_decode=True, group=None)

Write profiles to disk

Parameters:

h5_file (str) – Path to .h5 resource file
meta (pandas.DataFrame) – Locational meta data
time_index (pandas.DatetimeIndex) – Temporal timesteps
dset_name (str) – Name of the target dataset (should identify the profiles).
profiles (ndarray) – output result timeseries profiles
dtype (str) – Intended dataset datatype after scaling.
attrs (dict, optional) – Attributes to be set. May include ‘scale_factor’, by default None
SAM_configs (dict, optional) – Dictionary of SAM configuration JSONs used to compute cf means, by default None
chunks (tuple, optional) – Chunk size for profiles dataset, by default (None, 100)
unscale (bool, optional) – Boolean flag to automatically unscale variables on extraction, by default True
mode (str, optional) – Mode to instantiate h5py.File instance, by default ‘w-’
str_decode (bool, optional) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read, by default True
group (str, optional) – Group within .h5 resource file to open, by default None