reV.handlers.multi_year.MultiYear
- class MultiYear(h5_file, group=None, unscale=True, mode='r', str_decode=True)
Bases: Outputs
Class to handle multiple years of data: collects datasets from multiple years and computes multi-year means, standard deviations, and coefficients of variation.
- Parameters:
h5_file (str) – Path to .h5 resource file
group (str) – Group to collect datasets into
unscale (bool) – Boolean flag to automatically unscale variables on extraction
mode (str) – Mode to instantiate h5py.File instance
str_decode (bool) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read.
Methods
CV(dset) – Extract or compute multi-year coefficient of variation for given source dset
add_dataset(h5_file, dset_name, dset_data, dtype) – Add dataset to h5_file
close() – Close h5 instance
collect(source_files, dset[, profiles, ...]) – Collect dataset dset from given list of h5 files
collect_means(my_file, source_files, dset[, ...]) – Collect and compute multi-year means for given dataset
collect_profiles(my_file, source_files, dset) – Collect multi-year profiles associated with given dataset
df_str_decode(df) – Decode a dataframe with byte string columns into ordinary str cols.
get_SAM_df(site) – Placeholder for get_SAM_df method that is resource specific
get_attrs([dset]) – Get h5 attributes either from file or dataset
get_config(config_name) – Get SAM config
get_dset_properties(dset) – Get dataset properties (shape, dtype, chunks)
get_meta_arr(rec_name[, rows]) – Get a meta array by name (faster than DataFrame extraction).
get_scale_factor(dset) – Get dataset scale factor
get_units(dset) – Get dataset units
init_h5(h5_file, dsets, shapes, attrs, ...) – Init a full output file with the final intended shape without data.
is_profile(source_files, dset) – Check dataset in source files to see if it is a profile.
means(dset) – Extract or compute multi-year means for given source dset
open_dataset(ds_name) – Open resource dataset
parse_source_files_pattern(source_files) – Parse a source_files pattern that can be either an explicit list of source files or a unix-style /filepath/pattern*.h5 and either way return a list of explicit filepaths.
pass_through(my_file, source_files, dset[, ...]) – Pass through a dataset that is identical in all source files to a dataset of the same name in the output multi-year file.
preload_SAM(h5_file, sites, tech[, unscale, ...]) – Pre-load project_points for SAM
set_configs(SAM_configs) – Set SAM configuration JSONs as attributes of 'meta'
set_version_attr() – Set the version attribute to the h5 file.
stdev(dset) – Extract or compute multi-year standard deviation for given source dset
update_dset(dset, dset_array[, dset_slice]) – Check to see if dset needs to be updated on disk. If so, write dset_array to disk.
write_dataset(dset_name, data, dtype[, ...]) – Write dataset to disk.
write_means(h5_file, meta, dset_name, means, ...) – Write means array to disk
write_profiles(h5_file, meta, time_index, ...) – Write profiles to disk
Attributes
ADD_ATTR
SAM_configs – SAM configuration JSONs used to create CF profiles
SCALE_ATTR
UNIT_ATTR
adders – Dictionary of all dataset add offset factors
attrs – Dictionary of all dataset attributes
chunks – Dictionary of all dataset chunk sizes
coordinates – (lat, lon) pairs
data_version – Get the version attribute of the data.
datasets – Datasets available
dsets – Datasets available
dtypes – Dictionary of all dataset dtypes
full_version_record – Get record of versions for dependencies
global_attrs – Global (file) attributes
groups – Groups available
h5 – Open h5py File instance.
lat_lon – Extract (latitude, longitude) pairs
meta – Resource meta data DataFrame
package – Package used to create file
res_dsets – Available resource datasets
resource_datasets – Available resource datasets
run_attrs – Runtime attributes stored at the global (file) level
scale_factors – Dictionary of all dataset scale factors
shape – Variable array shape from time_index and meta
shapes – Dictionary of all dataset shapes
source – Package and version used to create file
time_index – Resource DatetimeIndex
units – Dictionary of all dataset units
version – Version of package used to create file
writable – Check to see if h5py.File instance is writable
- static parse_source_files_pattern(source_files)
Parse a source_files pattern that can be either an explicit list of source files or a unix-style /filepath/pattern*.h5 and either way return a list of explicit filepaths.
- Parameters:
source_files (list | str) – List of .h5 files to collect datasets from. This can also be a unix-style /filepath/pattern*.h5 used to find the .h5 files to collect; all resulting files must be .h5 files, otherwise an exception is raised. NOTE: .h5 file names must indicate the year the data pertains to.
- Returns:
source_files (list) – List of .h5 filepaths.
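The behavior described above can be approximated with the standard library. The following is a minimal sketch, not the reV implementation: `resolve_source_files` is a hypothetical helper name, the choice of `ValueError` is an assumption, and the results are sorted only to make the output deterministic.

```python
import glob
import os
import tempfile


def resolve_source_files(source_files):
    """Return an explicit list of .h5 filepaths.

    Hypothetical stand-in for illustration; not the reV implementation.
    """
    if isinstance(source_files, str):
        # A string is treated as a unix-style pattern and expanded.
        paths = sorted(glob.glob(source_files))
    else:
        # Anything else is treated as an explicit list of filepaths.
        paths = list(source_files)

    # Every resulting file must be an .h5 file (exception type is assumed).
    bad = [p for p in paths if not p.endswith(".h5")]
    if bad:
        raise ValueError("Source files must be .h5 files: {}".format(bad))

    return paths


# Usage with throwaway files in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    for year in (2012, 2013):
        open(os.path.join(tmp, "gen_{}.h5".format(year)), "w").close()

    found = resolve_source_files(os.path.join(tmp, "gen_*.h5"))
    found_names = [os.path.basename(p) for p in found]
```

Both input forms end up as the same kind of list, so downstream code only has to handle explicit filepaths.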
- collect(source_files, dset, profiles=False, pass_through=False)
Collect dataset dset from given list of h5 files
- Parameters:
source_files (list | str) – List of .h5 files to collect datasets from. This can also be a unix-style /filepath/pattern*.h5 used to find the .h5 files to collect; all resulting files must be .h5 files, otherwise an exception is raised. NOTE: .h5 file names must indicate the year the data pertains to.
dset (str) – Dataset to collect
profiles (bool) – Boolean flag to indicate if profiles are being collected. If True, also collect time_index.
pass_through (bool) – Flag to just pass through dataset without name modifications (no differences between years, no means or stdevs)
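Because the file names must indicate the year, a collector needs some way to recover that year and to tell the per-year copies of a dataset apart. The sketch below shows one way this could work. Everything in it is an assumption for illustration: the helper names, the four-digit-year regular expression, the `<dset>-<year>` naming format, and the `cf_mean` dataset name are not taken from the reV source.

```python
import os
import re


def parse_year(h5_file):
    """Pull a four-digit year out of a file name (assumed naming scheme)."""
    name = os.path.basename(h5_file)
    match = re.search(r"(?<!\d)((?:19|20)\d{2})(?!\d)", name)
    if match is None:
        raise ValueError("No year found in file name: {}".format(h5_file))
    return int(match.group(1))


def yearly_dset_name(dset, h5_file, pass_through=False):
    """Name for the collected dataset; '<dset>-<year>' is an assumed format."""
    if pass_through:
        # Pass-through datasets keep their name unmodified.
        return dset
    return "{}-{}".format(dset, parse_year(h5_file))


files = ("/data/gen_2012.h5", "/data/gen_2013.h5")
names = [yearly_dset_name("cf_mean", f) for f in files]
same_name = yearly_dset_name("cf_mean", files[0], pass_through=True)
```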
- means(dset)
Extract or compute multi-year means for given source dset
- Parameters:
dset (str) – Dataset of interest
- Returns:
my_means (ndarray) – Array of multi-year means for dataset of interest
- stdev(dset)
Extract or compute multi-year standard deviation for given source dset
- Parameters:
dset (str) – Dataset of interest
- Returns:
my_stdev (ndarray) – Array of multi-year standard deviation for dataset of interest
- CV(dset)
Extract or compute multi-year coefficient of variation for given source dset
- Parameters:
dset (str) – Dataset of interest
- Returns:
my_cv (ndarray) – Array of multi-year coefficient of variation for dataset of interest
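The three statistics returned by means, stdev, and CV are related: the coefficient of variation is the standard deviation divided by the mean, computed per site across years. The sketch below works through that arithmetic with the standard library on made-up values. The data are hypothetical, and the use of the population standard deviation is an assumption, since this page does not say which form is used.

```python
from statistics import mean, pstdev

# Hypothetical annual values: one list per year, one entry per site.
annual = {
    2012: [0.30, 0.40],
    2013: [0.50, 0.40],
    2014: [0.40, 0.40],
}

# Regroup so each tuple holds one site's values across all years.
per_site = list(zip(*annual.values()))

my_means = [mean(vals) for vals in per_site]
# Population standard deviation is an assumption made for this sketch.
my_stdev = [pstdev(vals) for vals in per_site]
# Coefficient of variation: standard deviation relative to the mean.
my_cv = [s / m for s, m in zip(my_stdev, my_means)]
```

The second site has identical values in every year, so its standard deviation and coefficient of variation are both zero.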
- classmethod is_profile(source_files, dset)
Check dataset in source files to see if it is a profile.
- Parameters:
source_files (list | str) – List of .h5 files to collect datasets from. This can also be a unix-style /filepath/pattern*.h5 used to find the .h5 files to collect; all resulting files must be .h5 files, otherwise an exception is raised. NOTE: .h5 file names must indicate the year the data pertains to.
dset (str) – Dataset to collect
- Returns:
is_profile (bool) – True if profile, False if not.
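Elsewhere on this page, variable arrays are described as having shape (time, locations), while results without site profiles are 1D. Under that reading, a profile check could be as simple as looking at the number of dimensions. This is a sketch under that assumption; `looks_like_profile` is a hypothetical name and the actual check in reV may differ.

```python
def looks_like_profile(shape):
    """True for a 2D (time, sites) shape, False for a 1D (sites,) shape."""
    if len(shape) not in (1, 2):
        raise ValueError("Unexpected dataset shape: {}".format(shape))
    return len(shape) == 2


# Hypothetical shapes: hourly profiles for 100 sites vs. one value per site.
profile_flag = looks_like_profile((8760, 100))
scalar_flag = looks_like_profile((100,))
```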
- classmethod pass_through(my_file, source_files, dset, group=None)
Pass through a dataset that is identical in all source files to a dataset of the same name in the output multi-year file.
- Parameters:
my_file (str) – Path to multi-year .h5 file
source_files (list | str) – List of .h5 files to collect datasets from. This can also be a unix-style /filepath/pattern*.h5 used to find the .h5 files to collect; all resulting files must be .h5 files, otherwise an exception is raised. NOTE: .h5 file names must indicate the year the data pertains to.
dset (str) – Dataset to pass through (will also be the name of the output dataset in my_file)
group (str) – Group to collect datasets into
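A pass-through only makes sense when every source file holds the same values for the dataset. The sketch below illustrates that precondition with plain Python lists. The helper name, the file names, the values, and the decision to raise `ValueError` on a mismatch are all assumptions for illustration, not the reV behavior.

```python
def pass_through_values(per_file_values):
    """Return the shared values if every source file agrees, else raise."""
    files = sorted(per_file_values)
    reference = per_file_values[files[0]]
    for name in files[1:]:
        if per_file_values[name] != reference:
            raise ValueError(
                "Dataset differs between {} and {}".format(files[0], name)
            )
    return reference


# Hypothetical per-file contents of a dataset that does not vary by year.
sources = {
    "gen_2012.h5": [10.0, 20.0, 30.0],
    "gen_2013.h5": [10.0, 20.0, 30.0],
}
shared = pass_through_values(sources)
```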
- classmethod collect_means(my_file, source_files, dset, group=None)
Collect and compute multi-year means for given dataset
- Parameters:
my_file (str) – Path to multi-year .h5 file
source_files (list | str) – List of .h5 files to collect datasets from. This can also be a unix-style /filepath/pattern*.h5 used to find the .h5 files to collect; all resulting files must be .h5 files, otherwise an exception is raised. NOTE: .h5 file names must indicate the year the data pertains to.
dset (str) – Dataset to collect
group (str) – Group to collect datasets into
- classmethod collect_profiles(my_file, source_files, dset, group=None)
Collect multi-year profiles associated with given dataset
- Parameters:
my_file (str) – Path to multi-year .h5 file
source_files (list | str) – List of .h5 files to collect datasets from. This can also be a unix-style /filepath/pattern*.h5 used to find the .h5 files to collect; all resulting files must be .h5 files, otherwise an exception is raised. NOTE: .h5 file names must indicate the year the data pertains to.
dset (str) – Profiles dataset to collect
group (str) – Group to collect datasets into
- property SAM_configs
SAM configuration JSONs used to create CF profiles
- Returns:
configs (dict) – Dictionary of SAM configuration JSONs
- classmethod add_dataset(h5_file, dset_name, dset_data, dtype, attrs=None, chunks=None, unscale=True, mode='a', str_decode=True, group=None)
Add dataset to h5_file
- Parameters:
h5_file (str) – Path to .h5 resource file
dset_name (str) – Name of dataset to be added to h5 file
dset_data (ndarray) – Data to be added to h5 file
dtype (str) – Intended dataset datatype after scaling.
attrs (dict, optional) – Attributes to be set. May include ‘scale_factor’, by default None
unscale (bool, optional) – Boolean flag to automatically unscale variables on extraction, by default True
mode (str, optional) – Mode to instantiate h5py.File instance, by default ‘a’
str_decode (bool, optional) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read, by default True
group (str, optional) – Group within .h5 resource file to open, by default None
- property adders
Dictionary of all dataset add offset factors
- Returns:
adders (dict)
- property attrs
Dictionary of all dataset attributes
- Returns:
attrs (dict)
- property chunks
Dictionary of all dataset chunk sizes
- Returns:
chunks (dict)
- close()
Close h5 instance
- property coordinates
Coordinates as (lat, lon) pairs
- Returns:
lat_lon (ndarray)
- property data_version
Get the version attribute of the data. None if not available.
- Returns:
version (str | None)
- property datasets
Datasets available
- Returns:
list
- static df_str_decode(df)
Decode a dataframe with byte string columns into ordinary str cols.
- Parameters:
df (pd.DataFrame) – Dataframe with some columns being byte strings.
- Returns:
df (pd.DataFrame) – DataFrame with str columns instead of byte str columns.
- property dsets
Datasets available
- Returns:
list
- property dtypes
Dictionary of all dataset dtypes
- Returns:
dtypes (dict)
- property full_version_record
Get record of versions for dependencies
- Returns:
dict – Dictionary of package versions for dependencies
- get_SAM_df(site)
Placeholder for get_SAM_df method that is resource specific
- Parameters:
site (int) – Site to extract SAM DataFrame for
- get_attrs(dset=None)
Get h5 attributes either from file or dataset
- Parameters:
dset (str) – Dataset to get attributes for; if None, get file (global) attributes
- Returns:
attrs (dict) – Dataset or file attributes
- get_config(config_name)
Get SAM config
- Parameters:
config_name (str) – Name of config
- Returns:
config (dict) – SAM config JSON as a dictionary
- get_dset_properties(dset)
Get dataset properties (shape, dtype, chunks)
- Parameters:
dset (str) – Dataset to get properties for
- Returns:
shape (tuple) – Dataset array shape
dtype (str) – Dataset array dtype
chunks (tuple) – Dataset chunk size
- get_meta_arr(rec_name, rows=slice(None, None, None))
Get a meta array by name (faster than DataFrame extraction).
- Parameters:
rec_name (str) – Named record from the meta data to retrieve.
rows (slice) – Rows of the record to extract.
- Returns:
meta_arr (np.ndarray) – Extracted array from the meta data record name.
- get_scale_factor(dset)
Get dataset scale factor
- Parameters:
dset (str) – Dataset to get scale factor for
- Returns:
float – Dataset scale factor, used to unscale int values to floats
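The scale factor lets floating point values be stored as integers and recovered on read. The sketch below shows the round trip, assuming the common convention that values are multiplied by the scale factor and rounded on write, then divided by it on read. That convention, the helper names, and the sample values are assumptions for illustration.

```python
def scale(values, scale_factor):
    """Convert floats to stored integers (assumed: multiply, then round)."""
    return [int(round(v * scale_factor)) for v in values]


def unscale(stored, scale_factor):
    """Convert stored integers back to floats (assumed: divide)."""
    return [s / scale_factor for s in stored]


# Hypothetical capacity factors stored with a scale factor of 1000.
stored = scale([0.123, 0.456], 1000)
restored = unscale(stored, 1000)
```

With a scale factor of 1000, values keep three decimal places of precision after the round trip.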
- get_units(dset)
Get dataset units
- Parameters:
dset (str) – Dataset to get units for
- Returns:
str – Dataset units, None if not defined
- property global_attrs
Global (file) attributes
- Returns:
global_attrs (dict)
- property groups
Groups available
- Returns:
groups (list) – List of groups
- property h5
Open h5py File instance. If _group is not None, return the open Group.
- Returns:
h5 (h5py.File | h5py.Group)
- classmethod init_h5(h5_file, dsets, shapes, attrs, chunks, dtypes, meta, time_index=None, configs=None, unscale=True, mode='w', str_decode=True, group=None, run_attrs=None)
Init a full output file with the final intended shape without data.
- Parameters:
h5_file (str) – Full h5 output filepath.
dsets (list) – List of strings of dataset names to initialize (does not include meta or time_index).
shapes (dict) – Dictionary of dataset shapes (keys correspond to dsets).
attrs (dict) – Dictionary of dataset attributes (keys correspond to dsets).
chunks (dict) – Dictionary of chunk tuples (keys correspond to dsets).
dtypes (dict) – Dictionary of numpy datatypes (keys correspond to dsets).
meta (pd.DataFrame) – Full meta data.
time_index (pd.DatetimeIndex | None) – Full pandas datetime index. None implies that only 1D results (no site profiles) are being written.
configs (dict | None) – Optional input configs to set as attr on meta.
unscale (bool) – Boolean flag to automatically unscale variables on extraction
mode (str) – Mode to instantiate h5py.File instance
str_decode (bool) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read.
group (str) – Group within .h5 resource file to open
run_attrs (dict | NoneType) – Runtime attributes (args, kwargs) to add as global (file) attributes
- property lat_lon
Extract (latitude, longitude) pairs
- Returns:
lat_lon (ndarray)
- property meta
Resource meta data DataFrame
- Returns:
meta (pandas.DataFrame)
- open_dataset(ds_name)
Open resource dataset
- Parameters:
ds_name (str) – Dataset name to open
- Returns:
ds (ResourceDataset) – Open resource dataset instance
- property package
Package used to create file
- Returns:
str
- classmethod preload_SAM(h5_file, sites, tech, unscale=True, str_decode=True, group=None, hsds=False, hsds_kwargs=None, time_index_step=None, means=False)
Pre-load project_points for SAM
- Parameters:
h5_file (str) – h5_file to extract resource from
sites (list) – List of sites to be provided to SAM (sites is synonymous with gids aka spatial indices)
tech (str) – Technology to be run by SAM
unscale (bool) – Boolean flag to automatically unscale variables on extraction
str_decode (bool) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read.
group (str) – Group within .h5 resource file to open
hsds (bool, optional) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS, by default False
hsds_kwargs (dict, optional) – Dictionary of optional kwargs for h5pyd, e.g., bucket, username, password, by default None
time_index_step (int, optional) – Step size for time_index, used to reduce temporal resolution, by default None
means (bool, optional) – Boolean flag to compute mean resource when res_array is set, by default False
- Returns:
SAM_res (SAMResource) – Instance of SAMResource pre-loaded with Solar resource for sites in project_points
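The time_index_step parameter reduces temporal resolution by keeping only every n-th timestep. One plausible reading is a simple slice stride, sketched below with standard library datetimes. Treating the step as a slice stride is an assumption, and the half-hourly index is made-up data.

```python
from datetime import datetime, timedelta

# One day of half-hourly timestamps standing in for a resource time_index.
start = datetime(2012, 1, 1)
time_index = [start + timedelta(minutes=30 * i) for i in range(48)]

# Keep every second timestep, turning half-hourly data into hourly data.
time_index_step = 2
reduced = time_index[::time_index_step]
```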
- property res_dsets
Available resource datasets
- Returns:
list
- property resource_datasets
Available resource datasets
- Returns:
list
- property run_attrs
Runtime attributes stored at the global (file) level
- Returns:
global_attrs (dict)
- property scale_factors
Dictionary of all dataset scale factors
- Returns:
scale_factors (dict)
- set_configs(SAM_configs)
Set SAM configuration JSONs as attributes of ‘meta’
- Parameters:
SAM_configs (dict) – Dictionary of SAM configuration JSONs
- set_version_attr()
Set the version attribute to the h5 file.
- property shape
Variable array shape from time_index and meta
- Returns:
tuple – Shape of variable arrays: (time, locations)
- property shapes
Dictionary of all dataset shapes
- Returns:
shapes (dict)
- property source
Package and version used to create file
- Returns:
str
- property time_index
Resource DatetimeIndex
- Returns:
time_index (pandas.DatetimeIndex)
- property units
Dictionary of all dataset units
- Returns:
units (dict)
- update_dset(dset, dset_array, dset_slice=None)
Check to see if dset needs to be updated on disk. If so, write dset_array to disk.
- Parameters:
dset (str) – Dataset to update
dset_array (ndarray) – Dataset array
dset_slice (tuple) – Slice of dataset to update; if None, update all
- property version
Version of package used to create file
- Returns:
str
- property writable
Check to see if h5py.File instance is writable
- Returns:
is_writable (bool) – Flag if mode is writable
- write_dataset(dset_name, data, dtype, chunks=None, attrs=None)
Write dataset to disk. Dataset is created in .h5 file and data is scaled if needed.
- Parameters:
dset_name (str) – Name of dataset to be added to h5 file.
data (ndarray) – Data to be added to h5 file.
dtype (str) – Intended dataset datatype after scaling.
chunks (tuple) – Chunk size for capacity factor means dataset.
attrs (dict) – Attributes to be set. May include ‘scale_factor’.
- classmethod write_means(h5_file, meta, dset_name, means, dtype, attrs=None, SAM_configs=None, chunks=None, unscale=True, mode='w-', str_decode=True, group=None)
Write means array to disk
- Parameters:
h5_file (str) – Path to .h5 resource file
meta (pandas.DataFrame) – Locational meta data
dset_name (str) – Name of the target dataset (should identify the means).
means (ndarray) – Output means array.
dtype (str) – Intended dataset datatype after scaling.
attrs (dict, optional) – Attributes to be set. May include ‘scale_factor’, by default None
SAM_configs (dict, optional) – Dictionary of SAM configuration JSONs used to compute cf means, by default None
chunks (tuple, optional) – Chunk size for capacity factor means dataset, by default None
unscale (bool, optional) – Boolean flag to automatically unscale variables on extraction, by default True
mode (str, optional) – Mode to instantiate h5py.File instance, by default ‘w-’
str_decode (bool, optional) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read, by default True
group (str, optional) – Group within .h5 resource file to open, by default None
- classmethod write_profiles(h5_file, meta, time_index, dset_name, profiles, dtype, attrs=None, SAM_configs=None, chunks=(None, 100), unscale=True, mode='w-', str_decode=True, group=None)
Write profiles to disk
- Parameters:
h5_file (str) – Path to .h5 resource file
meta (pandas.DataFrame) – Locational meta data
time_index (pandas.DatetimeIndex) – Temporal timesteps
dset_name (str) – Name of the target dataset (should identify the profiles).
profiles (ndarray) – Output result timeseries profiles.
dtype (str) – Intended dataset datatype after scaling.
attrs (dict, optional) – Attributes to be set. May include ‘scale_factor’, by default None
SAM_configs (dict, optional) – Dictionary of SAM configuration JSONs used to compute cf profiles, by default None
chunks (tuple, optional) – Chunk size for the profiles dataset, by default (None, 100)
unscale (bool, optional) – Boolean flag to automatically unscale variables on extraction, by default True
mode (str, optional) – Mode to instantiate h5py.File instance, by default ‘w-’
str_decode (bool, optional) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read, by default True
group (str, optional) – Group within .h5 resource file to open, by default None