nsrdb.file_handlers.resource.MultiFileResource

class MultiFileResource(h5_source, unscale=True, str_decode=True, check_files=False, use_lapse_rate=True)

Bases: MultiFileResource, Resource

Multi-file resource handler that also handles legacy NSRDB scale factors

Parameters:
  • h5_source (str | list) – Unix shell style pattern path with * wildcards to multi-file resource file sets. Files must have the same time index and coordinates but can have different datasets. Can also be an explicit list of complete filepaths.

  • unscale (bool) – Boolean flag to automatically unscale variables on extraction

  • str_decode (bool) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read.

  • check_files (bool) – Check to ensure files have the same coordinates and time_index

  • use_lapse_rate (bool) – If a dataset is only available at a single hub-height and this flag value is set to True, pressure / temperature values will be calculated using linear lapse rate adjustment from the available hub height to the requested one. If the flag value is set to False, the value of these variables at the single available hub-height will be returned for all requested heights. This option has no effect if data is available at multiple hub-heights.
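To illustrate how a * wildcard in h5_source selects a file set, the sketch below applies a Unix shell-style pattern to a list of file names with Python's standard fnmatch module. The file names and the pattern are hypothetical, and the handler's own pattern resolution may differ in detail.

```python
import fnmatch

# Hypothetical file names; each file would hold different datasets
# for the same time index and coordinates.
available = [
    "nsrdb_irradiance_2018.h5",
    "nsrdb_clouds_2018.h5",
    "nsrdb_irradiance_2019.h5",
    "readme.txt",
]

# Unix shell-style pattern: "*" matches any run of characters.
pattern = "nsrdb_*_2018.h5"

# Keep only the names that match the pattern, in a stable order.
matched = sorted(fnmatch.filter(available, pattern))
print(matched)
```

An explicit list of complete file paths, which h5_source also accepts, avoids any ambiguity about which files a pattern picks up.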

Methods

close()

Close h5 instance

df_str_decode(df)

Decode a dataframe with byte string columns into ordinary str cols.

get_SAM_df(site)

Placeholder for get_SAM_df method that is resource specific

get_attrs([dset])

Get h5 attributes either from file or dataset

get_dset_properties(dset)

Get dataset properties (shape, dtype, chunks)

get_meta_arr(rec_name[, rows])

Get a meta array by name (faster than DataFrame extraction).

get_scale_factor(dset)

Get dataset scale factor

get_units(dset)

Get dataset units

open_dataset(ds_name)

Open resource dataset

preload_SAM(h5_file, sites, tech[, unscale, ...])

Pre-load project_points for SAM

Attributes

ADD_ATTR

INTERPOLABLE_DSETS

LAPSE_RATES

Air temperature and pressure lapse rates, in C/km and Pa/km respectively

OLD_ADD_ATTR

OLD_SCALE_ATTR

OLD_UNIT_ATTR

SCALE_ATTR

UNIT_ATTR

VARIABLE_NAME

VARIABLE_UNIT

adders

Dictionary of all dataset add offset factors

attrs

Dictionary of all dataset attributes

chunks

Dictionary of all dataset chunk sizes

coordinates

(lat, lon) pairs

data_version

Get the version attribute of the data.

datasets

Datasets available

dsets

Datasets available

dtypes

Dictionary of all dataset dtypes

global_attrs

Global (file) attributes

groups

Groups available

h5

Open h5py File instance.

lat_lon

Extract (latitude, longitude) pairs

meta

Resource meta data DataFrame

res_dsets

Available resource datasets

resource_datasets

Available resource datasets

scale_factors

Dictionary of all dataset scale factors

shape

Resource shape (timesteps, sites), i.e. shape = (len(time_index), len(meta))

shapes

Dictionary of all dataset shapes

time_index

Resource DatetimeIndex

units

Dictionary of all dataset units

LAPSE_RATES = {'pressure': 11109, 'temperature': 6.56}

Air temperature and pressure lapse rates, in C/km and Pa/km respectively
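A minimal sketch of a linear lapse rate adjustment using the documented LAPSE_RATES values. The helper name lapse_adjust and the sign convention (values decrease as height increases) are assumptions made for illustration, not the library's implementation.

```python
# Lapse rates as documented: Pa/km for pressure, C/km for temperature.
LAPSE_RATES = {"pressure": 11109, "temperature": 6.56}


def lapse_adjust(value, dset, from_height_m, to_height_m):
    """Linearly adjust a value between two heights given in metres.

    Assumption: the quantity decreases with height at a constant rate.
    """
    delta_km = (to_height_m - from_height_m) / 1000.0
    return value - LAPSE_RATES[dset] * delta_km


# Hypothetical values available at 100 m, requested at 200 m.
temp_200m = lapse_adjust(15.0, "temperature", 100, 200)
pres_200m = lapse_adjust(100000.0, "pressure", 100, 200)
print(temp_200m, pres_200m)
```

With use_lapse_rate=False, the value at the single available height would be returned unchanged for every requested height.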

property adders

Dictionary of all dataset add offset factors

Returns:

adders (dict)

property attrs

Dictionary of all dataset attributes

Returns:

attrs (dict)

property chunks

Dictionary of all dataset chunk sizes

Returns:

chunks (dict)

close()

Close h5 instance

property coordinates

(lat, lon) pairs

Returns:

lat_lon (ndarray)


property data_version

Get the version attribute of the data. None if not available.

Returns:

version (str | None)

property datasets

Datasets available

Returns:

list

static df_str_decode(df)

Decode a dataframe with byte string columns into ordinary str cols.

Parameters:

df (pd.DataFrame) – Dataframe with some columns being byte strings.

Returns:

df (pd.DataFrame) – DataFrame with str columns instead of byte str columns.

property dsets

Datasets available

Returns:

list

property dtypes

Dictionary of all dataset dtypes

Returns:

dtypes (dict)

get_SAM_df(site)

Placeholder for get_SAM_df method that is resource specific

Parameters:

site (int) – Site to extract SAM DataFrame for

get_attrs(dset=None)

Get h5 attributes either from file or dataset

Parameters:

dset (str) – Dataset to get attributes for; if None, get file (global) attributes

Returns:

attrs (dict) – Dataset or file attributes

get_dset_properties(dset)

Get dataset properties (shape, dtype, chunks)

Parameters:

dset (str) – Dataset to get properties for

Returns:

  • shape (tuple) – Dataset array shape

  • dtype (str) – Dataset array dtype

  • chunks (tuple) – Dataset chunk size

get_meta_arr(rec_name, rows=slice(None, None, None))

Get a meta array by name (faster than DataFrame extraction).

Parameters:
  • rec_name (str) – Named record from the meta data to retrieve.

  • rows (slice) – Rows of the record to extract.

Returns:

meta_arr (np.ndarray) – Extracted array from the meta data record name.

get_scale_factor(dset)

Get dataset scale factor

Parameters:

dset (str) – Dataset to get scale factor for

Returns:

float – Dataset scale factor, used to unscale int values to floats
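The sketch below shows what unscaling integer values to floats can look like. It assumes the convention physical = stored / scale_factor + adder; the function name unscale and the sample values are hypothetical, and the exact arithmetic used by the handler, particularly for legacy files, is not stated here.

```python
def unscale(stored_values, scale_factor, adder=0.0):
    """Convert stored integer values to floats.

    Assumption: physical = stored / scale_factor + adder.
    """
    return [v / scale_factor + adder for v in stored_values]


# Hypothetical temperatures stored as integers with a scale factor of 100.
stored = [2345, 1500, 0]
temps = unscale(stored, scale_factor=100)
print(temps)
```

Storing scaled integers keeps files compact; unscale=True on the handler applies the conversion automatically on extraction.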

get_units(dset)

Get dataset units

Parameters:

dset (str) – Dataset to get units for

Returns:

str – Dataset units, None if not defined

property global_attrs

Global (file) attributes

Returns:

global_attrs (dict)

property groups

Groups available

Returns:

groups (list) – List of groups

property h5

Open h5py File instance. If _group is not None, return the open Group instead.

Returns:

h5 (h5py.File | h5py.Group)

property lat_lon

Extract (latitude, longitude) pairs

Returns:

lat_lon (ndarray)

property meta

Resource meta data DataFrame

Returns:

meta (pandas.DataFrame)

open_dataset(ds_name)

Open resource dataset

Parameters:

ds_name (str) – Dataset name to open

Returns:

ds (ResourceDataset) – Handler for the opened resource dataset

classmethod preload_SAM(h5_file, sites, tech, unscale=True, str_decode=True, group=None, hsds=False, hsds_kwargs=None, time_index_step=None, means=False)

Pre-load project_points for SAM

Parameters:
  • h5_file (str) – h5_file to extract resource from

  • sites (list) – List of sites to be provided to SAM (sites is synonymous with gids aka spatial indices)

  • tech (str) – Technology to be run by SAM

  • unscale (bool) – Boolean flag to automatically unscale variables on extraction

  • str_decode (bool) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read.

  • group (str) – Group within .h5 resource file to open

  • hsds (bool, optional) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS, by default False

  • hsds_kwargs (dict, optional) – Dictionary of optional kwargs for h5pyd, e.g., bucket, username, password, by default None

  • time_index_step (int, optional) – Step size for time_index, used to reduce temporal resolution, by default None

  • means (bool, optional) – Boolean flag to compute mean resource when res_array is set, by default False

Returns:

SAM_res (SAMResource) – Instance of SAMResource pre-loaded with Solar resource for sites in project_points
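As a rough illustration of how time_index_step reduces temporal resolution, the sketch below takes every n-th entry of a time index. Treating the step as a simple slice stride is an assumption, and the half-hourly timestamps are invented for the example.

```python
from datetime import datetime, timedelta

# Hypothetical half-hourly time index covering three hours.
start = datetime(2018, 1, 1)
time_index = [start + timedelta(minutes=30 * i) for i in range(6)]

# Assumption: the step acts as a stride, keeping every second timestep.
time_index_step = 2
hourly = time_index[::time_index_step]
print(hourly)
```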

property res_dsets

Available resource datasets

Returns:

list

property resource_datasets

Available resource datasets

Returns:

list

property scale_factors

Dictionary of all dataset scale factors

Returns:

scale_factors (dict)

property shape

Resource shape (timesteps, sites), i.e. shape = (len(time_index), len(meta))

Returns:

shape (tuple)
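The relationship between shape, time_index and meta can be checked with a small helper. The sketch below is duck-typed and uses a stand-in object rather than a real handler; shape_is_consistent and fake_res are hypothetical names.

```python
from types import SimpleNamespace


def shape_is_consistent(res):
    """Return True if res.shape equals (len(res.time_index), len(res.meta))."""
    return tuple(res.shape) == (len(res.time_index), len(res.meta))


# Stand-in object, not a real handler: 3 timesteps and 2 sites.
fake_res = SimpleNamespace(
    time_index=["2018-01-01 00:00", "2018-01-01 00:30", "2018-01-01 01:00"],
    meta=[
        {"latitude": 39.7, "longitude": -105.2},
        {"latitude": 40.0, "longitude": -105.3},
    ],
    shape=(3, 2),
)
print(shape_is_consistent(fake_res))
```

The same helper should work on any object exposing these three attributes, since len() of a pandas DataFrame or DatetimeIndex returns its number of rows or entries.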

property shapes

Dictionary of all dataset shapes

Returns:

shapes (dict)

property time_index

Resource DatetimeIndex

Returns:

time_index (pandas.DatetimeIndex)

property units

Dictionary of all dataset units

Returns:

units (dict)