nsrdb.file_handlers.outputs.Outputs

class Outputs(h5_file, mode='r', unscale=True, str_decode=True, group=None)

Bases: Outputs

Base class to handle NSRDB output data in .h5 format

Parameters:
  • h5_file (str) – Path to .h5 resource file

  • mode (str, optional) – Mode to instantiate h5py.File instance, by default ‘r’

  • unscale (bool, optional) – Boolean flag to automatically unscale variables on extraction, by default True

  • str_decode (bool, optional) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read, by default True

  • group (str, optional) – Group within .h5 resource file to open, by default None

Methods

add_dataset(h5_file, dset_name, dset_data, dtype)

Add dataset to h5_file

close()

Close h5 instance

df_str_decode(df)

Decode a dataframe with byte string columns into ordinary str cols.

get_SAM_df(site)

Placeholder for get_SAM_df method that is resource specific

get_attrs([dset])

Get h5 attributes either from file or dataset

get_config(config_name)

Get SAM config

get_dset_properties(dset)

Get dataset properties (shape, dtype, chunks)

get_meta_arr(rec_name[, rows])

Get a meta array by name (faster than DataFrame extraction).

get_scale_factor(dset)

Get dataset scale factor

get_units(dset)

Get dataset units

init_h5(fout, dsets, attrs, chunks, dtypes, ...)

Initialize a full h5 output file with the final intended shape.

is_hsds_file(file_path)

Parse one or more file paths to determine whether the first is an HSDS path

is_s3_file(file_path)

Parse one or more file paths to determine whether the first is an S3 path

open_dataset(ds_name)

Open resource dataset

open_file(file_path[, mode, hsds, hsds_kwargs])

Open a filepath to an h5, s3, or hsds NREL resource file with the appropriate Python object.

preload_SAM(h5_file, sites, tech[, unscale, ...])

Pre-load project_points for SAM

set_configs(SAM_configs)

Set SAM configuration JSONs as attributes of 'meta'

set_version_attr()

Set the version attribute to the h5 file.

update_dset(dset, dset_array[, dset_slice])

Check whether dset needs to be updated on disk. If so, write dset_array to disk.

write_dataset(dset_name, data, dtype[, ...])

Write dataset to disk.

write_means(h5_file, meta, dset_name, means, ...)

Write means array to disk

write_profiles(h5_file, meta, time_index, ...)

Write profiles to disk

Attributes

ADD_ATTR

SAM_configs

SAM configuration JSONs used to create CF profiles

SCALE_ATTR

UNIT_ATTR

adders

Dictionary of all dataset add offset factors

attrs

Dictionary of all dataset attributes

chunks

Dictionary of all dataset chunk sizes

coordinates

(lat, lon) pairs

data_version

Get the version attribute of the data.

datasets

Datasets available

dsets

Datasets available

dtypes

Dictionary of all dataset dtypes

full_version_record

Get record of versions for dependencies

global_attrs

Global (file) attributes

groups

Groups available

h5

Open h5py File instance.

lat_lon

Extract (latitude, longitude) pairs

meta

Resource meta data DataFrame

package

Package used to create file

res_dsets

Available resource datasets

resource_datasets

Available resource datasets

run_attrs

Runtime attributes stored at the global (file) level

scale_factors

Dictionary of all dataset scale factors

shape

Variable array shape from time_index and meta

shapes

Dictionary of all dataset shapes

source

Package and version used to create file

time_index

Resource DatetimeIndex

units

Dictionary of all dataset units

version

Version of package used to create file

writable

Check to see if h5py.File instance is writable

set_version_attr()

Set the version attribute to the h5 file.

classmethod init_h5(fout, dsets, attrs, chunks, dtypes, time_index, meta, add_coords=False, mode='w-')

Initialize a full h5 output file with the final intended shape.

Parameters:
  • fout (str) – Full output filepath.

  • dsets (list) – List of dataset name strings.

  • attrs (dict) – Dictionary of dataset attributes.

  • chunks (dict) – Dictionary of chunk tuples corresponding to each dataset.

  • dtypes (dict) – Dictionary of numpy datatypes corresponding to each dataset.

  • time_index (pd.DatetimeIndex) – Full pandas datetime index.

  • meta (pd.DataFrame) – Full meta data.

  • mode (str) – Outputs write mode. ‘w-’ raises an error if fout exists; ‘w’ overwrites the file.

  • add_coords (bool) – Option to include coordinates in output
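The arguments above have to agree with one another: each name in dsets needs an entry in attrs, chunks, and dtypes, and the final array shape follows from time_index and meta. The following is a minimal sketch of that bookkeeping using only the standard library. The dataset names, attribute values, and the helper check_init_args are hypothetical, and keying attrs by dataset name is an assumption, not something this page confirms.

```python
from datetime import datetime, timedelta

# Hypothetical inputs; names and values are illustrative only.
dsets = ["ghi", "dni"]
attrs = {
    "ghi": {"units": "W/m2", "scale_factor": 1},
    "dni": {"units": "W/m2", "scale_factor": 1},
}
chunks = {"ghi": (None, 100), "dni": (None, 100)}
dtypes = {"ghi": "uint16", "dni": "uint16"}

# Stand-ins for a pandas DatetimeIndex (48 half-hourly steps)
# and a meta DataFrame (two locations).
time_index = [datetime(2018, 1, 1) + timedelta(minutes=30 * i) for i in range(48)]
meta = [
    {"latitude": 39.74, "longitude": -105.17},
    {"latitude": 40.01, "longitude": -105.25},
]


def check_init_args(dsets, attrs, chunks, dtypes):
    """Return dataset names missing from any of the per-dataset dicts."""
    missing = []
    for name in dsets:
        if name not in attrs or name not in chunks or name not in dtypes:
            missing.append(name)
    return missing


# Final intended shape of every dataset: (time, locations).
shape = (len(time_index), len(meta))
```

If check_init_args returns an empty list, the same four containers could plausibly be passed on to init_h5 together with a real DatetimeIndex and meta DataFrame.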

property SAM_configs

SAM configuration JSONs used to create CF profiles

Returns:

configs (dict) – Dictionary of SAM configuration JSONs

classmethod add_dataset(h5_file, dset_name, dset_data, dtype, attrs=None, chunks=None, unscale=True, mode='a', str_decode=True, group=None)

Add dataset to h5_file

Parameters:
  • h5_file (str) – Path to .h5 resource file

  • dset_name (str) – Name of dataset to be added to h5 file

  • dset_data (ndarray) – Data to be added to h5 file

  • dtype (str) – Intended dataset datatype after scaling.

  • attrs (dict, optional) – Attributes to be set. May include ‘scale_factor’, by default None

  • unscale (bool, optional) – Boolean flag to automatically unscale variables on extraction, by default True

  • mode (str, optional) – Mode to instantiate h5py.File instance, by default ‘a’

  • str_decode (bool, optional) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read, by default True

  • group (str, optional) – Group within .h5 resource file to open, by default None

property adders

Dictionary of all dataset add offset factors

Returns:

adders (dict)

property attrs

Dictionary of all dataset attributes

Returns:

attrs (dict)

property chunks

Dictionary of all dataset chunk sizes

Returns:

chunks (dict)

close()

Close h5 instance

property coordinates

Coordinates: (lat, lon) pairs

Returns:

lat_lon (ndarray)

property data_version

Get the version attribute of the data. None if not available.

Returns:

version (str | None)

property datasets

Datasets available

Returns:

list

static df_str_decode(df)

Decode a dataframe with byte string columns into ordinary str cols.

Parameters:

df (pd.DataFrame) – Dataframe with some columns being byte strings.

Returns:

df (pd.DataFrame) – DataFrame with str columns instead of byte str columns.
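To make the decoding step concrete, here is a minimal sketch that uses a plain dict of column lists as a stand-in for a DataFrame. The helper decode_columns and the sample values are hypothetical, and UTF-8 is an assumed encoding; this is not the library's implementation.

```python
def decode_columns(table, encoding="utf-8"):
    """Return a copy of table with bytes values decoded to str.

    table maps column names to lists of values. Non-bytes values
    (numbers, existing str) are passed through unchanged.
    """
    decoded = {}
    for col, values in table.items():
        decoded[col] = [
            v.decode(encoding) if isinstance(v, bytes) else v for v in values
        ]
    return decoded


# Hypothetical meta-like table with one byte-string column.
raw = {"state": [b"CO", b"UT"], "elevation": [1600, 1300]}
clean = decode_columns(raw)
```

Skipping this pass is what str_decode=False trades for speed: the byte strings are returned as stored.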

property dsets

Datasets available

Returns:

list

property dtypes

Dictionary of all dataset dtypes

Returns:

dtypes (dict)

property full_version_record

Get record of versions for dependencies

Returns:

dict – Dictionary of package versions for dependencies

get_SAM_df(site)

Placeholder for get_SAM_df method that is resource specific

Parameters:

site (int) – Site to extract SAM DataFrame for

get_attrs(dset=None)

Get h5 attributes either from file or dataset

Parameters:

dset (str, optional) – Dataset to get attributes for; if None, get file (global) attributes

Returns:

attrs (dict) – Dataset or file attributes

get_config(config_name)

Get SAM config

Parameters:

config_name (str) – Name of config

Returns:

config (dict) – SAM config JSON as a dictionary

get_dset_properties(dset)

Get dataset properties (shape, dtype, chunks)

Parameters:

dset (str) – Dataset to get properties for

Returns:

  • shape (tuple) – Dataset array shape

  • dtype (str) – Dataset array dtype

  • chunks (tuple) – Dataset chunk size

get_meta_arr(rec_name, rows=slice(None, None, None))

Get a meta array by name (faster than DataFrame extraction).

Parameters:
  • rec_name (str) – Named record from the meta data to retrieve.

  • rows (slice) – Rows of the record to extract.

Returns:

meta_arr (np.ndarray) – Extracted array from the meta data record name.

get_scale_factor(dset)

Get dataset scale factor

Parameters:

dset (str) – Dataset to get scale factor for

Returns:

float – Dataset scale factor, used to unscale int values to floats
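The role of the scale factor can be illustrated with a small round trip. This sketch assumes the convention that values are multiplied by the scale factor and rounded before being stored as integers, then divided by it on extraction; the helper names are hypothetical and any add offset is ignored.

```python
def scale(values, scale_factor):
    """Convert float values to the integers that would be stored on disk."""
    return [int(round(v * scale_factor)) for v in values]


def unscale(stored, scale_factor):
    """Recover float values from stored integers."""
    return [s / scale_factor for s in stored]


# With a scale factor of 100, two decimal places survive the round trip.
stored = scale([0.25, 0.5, 1.0], 100)
restored = unscale(stored, 100)
```

Under this assumption a larger scale factor keeps more decimal precision but needs a wider integer dtype to hold the scaled values.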

get_units(dset)

Get dataset units

Parameters:

dset (str) – Dataset to get units for

Returns:

str | None – Dataset units; None if not defined

property global_attrs

Global (file) attributes

Returns:

global_attrs (dict)

property groups

Groups available

Returns:

groups (list) – List of groups

property h5

Open h5py File instance. If _group is not None, return the open Group.

Returns:

h5 (h5py.File | h5py.Group)

static is_hsds_file(file_path)

Parse one or more file paths to determine whether the first is an HSDS path

Parameters:

file_path (str | list) – One or more file paths (only the first is parsed if multiple)

Returns:

is_hsds_file (bool) – True if hsds

static is_s3_file(file_path)

Parse one or more file paths to determine whether the first is an S3 path

Parameters:

file_path (str | list) – One or more file paths (only the first is parsed if multiple)

Returns:

is_s3_file (bool) – True if s3
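The two checks above can be approximated from the path prefixes this page gives for open_file: HSDS paths start with /nrel/ and S3 paths start with s3://. The sketch below is an illustration under that assumption, not the library's implementation; the helper names and example paths are hypothetical.

```python
def first_path(file_path):
    """Return the first entry of a list or tuple, else the path itself."""
    if isinstance(file_path, (list, tuple)):
        return file_path[0]
    return file_path


def looks_like_hsds(file_path):
    """True if the first path starts with the HSDS prefix '/nrel/'."""
    return str(first_path(file_path)).startswith("/nrel/")


def looks_like_s3(file_path):
    """True if the first path starts with the S3 prefix 's3://'."""
    return str(first_path(file_path)).startswith("s3://")


# Hypothetical paths, for illustration only.
hsds_path = "/nrel/example/data_2018.h5"
s3_paths = ["s3://example-bucket/data_2018.h5", "/local/data_2019.h5"]
local_path = "/scratch/data_2018.h5"
```

Only the first element of s3_paths is inspected, which mirrors the note that only the first path is parsed when several are given.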

property lat_lon

Extract (latitude, longitude) pairs

Returns:

lat_lon (ndarray)

property meta

Resource meta data DataFrame

Returns:

meta (pandas.DataFrame)

open_dataset(ds_name)

Open resource dataset

Parameters:

ds_name (str) – Dataset name to open

Returns:

ds (ResourceDataset) – Handler for the open resource dataset

classmethod open_file(file_path, mode='r', hsds=False, hsds_kwargs=None)

Open a filepath to an h5, s3, or hsds NREL resource file with the appropriate Python object.

Parameters:
  • file_path (str) – String filepath to .h5 file to extract resource from. Can also be a path to an HSDS file (starts with /nrel/) or S3 file (starts with s3://)

  • mode (str, optional) – Mode to instantiate h5py.File instance, by default ‘r’

  • hsds (bool, optional) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS, by default False. This is now redundant; file paths starting with /nrel/ will be treated as hsds=True by default

  • hsds_kwargs (dict, optional) – Dictionary of optional kwargs for h5pyd, e.g., bucket, username, password, by default None

Returns:

file (h5py.File | h5pyd.File) – H5 file handler either opening the local file using h5py, or the file on s3 using h5py and fsspec, or the file on HSDS using h5pyd.

property package

Package used to create file

Returns:

str

classmethod preload_SAM(h5_file, sites, tech, unscale=True, str_decode=True, group=None, hsds=False, hsds_kwargs=None, time_index_step=None, means=False)

Pre-load project_points for SAM

Parameters:
  • h5_file (str) – String filepath to .h5 file to extract resource from. Can also be a path to an HSDS file (starts with /nrel/) or S3 file (starts with s3://)

  • sites (list) – List of sites to be provided to SAM (sites is synonymous with gids aka spatial indices)

  • tech (str) – Technology to be run by SAM

  • unscale (bool) – Boolean flag to automatically unscale variables on extraction

  • str_decode (bool) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read.

  • group (str) – Group within .h5 resource file to open

  • hsds (bool, optional) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS, by default False. This is now redundant; file paths starting with /nrel/ will be treated as hsds=True by default

  • hsds_kwargs (dict, optional) – Dictionary of optional kwargs for h5pyd, e.g., bucket, username, password, by default None

  • time_index_step (int, optional) – Step size for time_index, used to reduce temporal resolution, by default None

  • means (bool, optional) – Boolean flag to compute mean resource when res_array is set, by default False

Returns:

SAM_res (SAMResource) – Instance of SAMResource pre-loaded with solar resource for sites in project_points

property res_dsets

Available resource datasets

Returns:

list

property resource_datasets

Available resource datasets

Returns:

list

property run_attrs

Runtime attributes stored at the global (file) level

Returns:

global_attrs (dict)

property scale_factors

Dictionary of all dataset scale factors

Returns:

scale_factors (dict)

set_configs(SAM_configs)

Set SAM configuration JSONs as attributes of ‘meta’

Parameters:

SAM_configs (dict) – Dictionary of SAM configuration JSONs

property shape

Variable array shape from time_index and meta

Returns:

tuple – Shape of variable arrays: (time, locations)

property shapes

Dictionary of all dataset shapes

Returns:

shapes (dict)

property source

Package and version used to create file

Returns:

str

property time_index

Resource DatetimeIndex

Returns:

time_index (pandas.DatetimeIndex)

property units

Dictionary of all dataset units

Returns:

units (dict)

update_dset(dset, dset_array, dset_slice=None)

Check whether dset needs to be updated on disk. If so, write dset_array to disk.

Parameters:
  • dset (str) – Dataset to update

  • dset_array (ndarray) – Dataset array

  • dset_slice (tuple) – Slice of dataset to update; if None, update all
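The check-then-write behaviour described for update_dset can be sketched with an in-memory stand-in for the file. Here a dict of lists plays the role of the datasets on disk and a single slice object replaces the tuple of slices; the helper update_if_changed is hypothetical and only illustrates the idea of skipping writes when nothing changed.

```python
def update_if_changed(store, name, new_values, dset_slice=None):
    """Write new_values into store[name] only when they differ.

    Returns True if a write happened and False if the stored values
    already matched. dset_slice=None means the whole dataset.
    """
    sl = slice(None) if dset_slice is None else dset_slice
    new_values = list(new_values)
    if store[name][sl] == new_values:
        return False
    store[name][sl] = new_values
    return True


store = {"ghi": [0, 1, 2, 3]}
unchanged = update_if_changed(store, "ghi", [1, 2], slice(1, 3))
changed = update_if_changed(store, "ghi", [9, 9], slice(1, 3))
```

The first call leaves the store untouched because the slice already holds the same values; the second call overwrites only the sliced region.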

property version

Version of package used to create file

Returns:

str

property writable

Check to see if h5py.File instance is writable

Returns:

is_writable (bool) – Flag if mode is writable

write_dataset(dset_name, data, dtype, chunks=None, attrs=None)

Write dataset to disk. The dataset is created in the .h5 file and the data is scaled if needed.

Parameters:
  • dset_name (str) – Name of dataset to be added to h5 file.

  • data (ndarray) – Data to be added to h5 file.

  • dtype (str) – Intended dataset datatype after scaling.

  • chunks (tuple) – Chunk size for the dataset.

  • attrs (dict) – Attributes to be set. May include ‘scale_factor’.

classmethod write_means(h5_file, meta, dset_name, means, dtype, attrs=None, SAM_configs=None, chunks=None, unscale=True, mode='w-', str_decode=True, group=None)

Write means array to disk

Parameters:
  • h5_file (str) – Path to .h5 resource file

  • meta (pandas.DataFrame) – Locational meta data

  • dset_name (str) – Name of the target dataset (should identify the means).

  • means (ndarray) – Output means array.

  • dtype (str) – Intended dataset datatype after scaling.

  • attrs (dict, optional) – Attributes to be set. May include ‘scale_factor’, by default None

  • SAM_configs (dict, optional) – Dictionary of SAM configuration JSONs used to compute cf means, by default None

  • chunks (tuple, optional) – Chunk size for capacity factor means dataset, by default None

  • unscale (bool, optional) – Boolean flag to automatically unscale variables on extraction, by default True

  • mode (str, optional) – Mode to instantiate h5py.File instance, by default ‘w-’

  • str_decode (bool, optional) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read, by default True

  • group (str, optional) – Group within .h5 resource file to open, by default None

classmethod write_profiles(h5_file, meta, time_index, dset_name, profiles, dtype, attrs=None, SAM_configs=None, chunks=(None, 100), unscale=True, mode='w-', str_decode=True, group=None)

Write profiles to disk

Parameters:
  • h5_file (str) – Path to .h5 resource file

  • meta (pandas.DataFrame) – Locational meta data

  • time_index (pandas.DatetimeIndex) – Temporal timesteps

  • dset_name (str) – Name of the target dataset (should identify the profiles).

  • profiles (ndarray) – Output result time series profiles

  • dtype (str) – Intended dataset datatype after scaling.

  • attrs (dict, optional) – Attributes to be set. May include ‘scale_factor’, by default None

  • SAM_configs (dict, optional) – Dictionary of SAM configuration JSONs used to compute cf profiles, by default None

  • chunks (tuple, optional) – Chunk size for the profiles dataset, by default (None, 100)

  • unscale (bool, optional) – Boolean flag to automatically unscale variables on extraction, by default True

  • mode (str, optional) – Mode to instantiate h5py.File instance, by default ‘w-’

  • str_decode (bool, optional) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read, by default True

  • group (str, optional) – Group within .h5 resource file to open, by default None