rex.external.rexarray.RexStore

class RexStore(manager, group=None, mode=None, hsds=False, lock=<SerializableLock: f2743537-79c6-4d5e-b67d-9767723c9c98>)[source]

Bases: AbstractDataStore

Store for reading NREL-rex style data via h5py

Parameters:
  • manager (FileManager) – A FileManager instance that can track whether files are locked for reading or not.

  • group (str, optional) – Name of subgroup in HDF5 file to open. By default, None.

  • mode (str, default="r") – Mode to open file in. Note that cloud-based files (e.g., S3 or HSDS) can only be opened in read mode. By default, "r".

  • hsds (bool, optional) – Boolean flag indicating whether h5pyd is being used to access the data. By default, False.

  • lock (SerializableLock, optional) – Resource lock to use when reading data from disk. Only relevant when using dask or another form of parallelism. By default, None, which chooses the appropriate locks to safely read and write files with the currently active dask scheduler.
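In practice, a RexStore is usually created with the open() classmethod documented below rather than by constructing a FileManager by hand. A minimal usage sketch, assuming a local NREL-rex style HDF5 file at a hypothetical path:

from rex.external.rexarray import RexStore

# Hypothetical local NREL-rex style HDF5 file
store = RexStore.open("./wtk_2013.h5", mode="r")
try:
    print(store.ds_shape)  # (len(time_index), len(meta))
finally:
    store.close()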

Methods

close(**kwargs)

Close the store

get_attrs()

Get Dataset attribute dictionary

get_coord_names()

Set of variable names that represent coordinate datasets

get_dimensions()

Get Dataset dimensions

get_encoding()

get_variables()

Mapping of variables in the store

load()

This loads the variables and attributes simultaneously.

open(filename[, mode, group, lock, ...])

Open a RexStore instance

open_store_variable(name, var[, meta_index, ...])

Initialize a Variable instance from the store

Attributes

manager

mode

is_remote

lock

hsds

ds

File object that can be used to access the data

ds_shape

Shape of the dataset, i.e. (time_index, meta).

classmethod open(filename, mode='r', group=None, lock=None, h5_driver=None, h5_driver_kwds=None, hsds=False, hsds_kwargs=None)[source]

Open a RexStore instance

Parameters:
  • filename (path-like) – Path to file to open.

  • mode (str, default="r") – Mode to open file in. Note that cloud-based files (e.g., S3 or HSDS) can only be opened in read mode. By default, "r".

  • group (str, optional) – Name of subgroup in HDF5 file to open. By default, None.

  • lock (SerializableLock, optional) – Resource lock to use when reading data from disk. Only relevant when using dask or another form of parallelism. By default, None, which chooses the appropriate locks to safely read and write files with the currently active dask scheduler.

  • h5_driver (str, optional) – HDF5 driver to use. See the h5py file drivers documentation (https://docs.h5py.org/en/latest/high/file.html#file-drivers) for more details. By default, None.

  • h5_driver_kwds (dict, optional) – HDF5 driver keyword-argument pairs. See the h5py file drivers documentation (https://docs.h5py.org/en/latest/high/file.html#file-drivers) for more details. By default, None.

  • hsds (bool, optional) – Boolean flag to use h5pyd to handle HDF5 ‘files’ hosted on AWS behind HSDS. Note that file paths starting with “/nrel/” will be treated as hsds=True regardless of this input. By default, False.

  • hsds_kwargs (dict, optional) – Dictionary of optional kwargs for h5pyd, (e.g., bucket, username, password, etc.). By default, None.

Returns:

RexStore – Initialized RexStore instance.

Raises:

ValueError – If filename is a bytes object or if the file does not start with a valid HDF5 magic number.
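Two illustrative calls (a sketch; the file paths, group name, and hsds_kwargs values are hypothetical, while the "/nrel/" shortcut follows the hsds description above):

# Open a subgroup of a local file (hypothetical path and group name)
store = RexStore.open("./nsrdb_2020.h5", mode="r", group="monthly")

# Open an HSDS-hosted file; a path starting with "/nrel/" is treated as
# hsds=True regardless of the flag (hypothetical path and bucket)
hsds_store = RexStore.open(
    "/nrel/wtk/conus/wtk_conus_2013.h5",
    hsds_kwargs={"bucket": "nrel-pds-hsds"},
)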

property ds

File object that can be used to access the data

Type:

obj

property ds_shape

Shape of the dataset, i.e. (time_index, meta)

Type:

tuple

open_store_variable(name, var, meta_index=-1, coord_index=-1)[source]

Initialize a Variable instance from the store

Parameters:
  • name (str) – Name of variable.

  • var (obj) – Handle that can be used to pull variable metadata. Typically this is an h5py.Dataset, but it can also be a custom wrapper as long as it has the correct attributes to compile a variable meta dictionary. RexMetaVar and RexCoordVar satisfy the latter requirement.

  • meta_index (int, default=-1) – Index value specifying whether the variable came from the meta. If this value is non-negative, the variable is assumed to originate from the meta, and the value should represent the index in the meta records array corresponding to the variable. If negative, this input is ignored. By default, -1.

  • coord_index (int, default=-1) – Index value specifying whether the variable came from the coordinates dataset. If this value is non-negative, the variable is assumed to originate from coordinates, and the value should represent the last index in the coordinates array corresponding to the variable (typically 0 for latitude, 1 for longitude). If negative, this input is ignored. By default, -1.

Returns:

Variable – Initialized Variable instance.
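A sketch of wrapping one of the store's HDF5 datasets as an xarray Variable; the dataset name is hypothetical, the store is assumed to be open, and ds is assumed to behave like an h5py.File:

# "windspeed_100m" is a hypothetical dataset name in the open file
var = store.open_store_variable("windspeed_100m", store.ds["windspeed_100m"])
print(var.dims, var.shape, var.dtype)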

get_variables()[source]

Mapping of variables in the store

Returns:

FrozenDict – Dictionary mapping variable name to xr.Variable instance.
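For example, to list the variables exposed by an open store (output depends entirely on the file):

for name, variable in store.get_variables().items():
    print(name, variable.dims, variable.dtype)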

get_coord_names()[source]

Set of variable names that represent coordinate datasets

Most of these come from the meta, but some are based on datasets like time_index or coordinates.

Returns:

set – Set of variable names that should be treated as coordinates.
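This can be combined with get_variables() to separate coordinate variables from data variables, as in this sketch:

coord_names = store.get_coord_names()
data_var_names = set(store.get_variables()) - coord_names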

get_attrs()[source]

Get Dataset attribute dictionary

Returns:

dict – Immutable dictionary of attributes for the dataset.

get_dimensions()[source]

Get Dataset dimensions

Returns:

dict – Immutable mapping of dataset dimension names to their shape.
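A short sketch printing both immutable mappings for an open store (attribute and dimension names depend on the file):

print(dict(store.get_attrs()))       # global file attributes
print(dict(store.get_dimensions()))  # dimension name -> size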

close(**kwargs)[source]

Close the store

load()

This loads the variables and attributes simultaneously. A centralized loading function makes it easier to create data stores that do automatic encoding/decoding.

For example:

class SuffixAppendingDataStore(AbstractDataStore):

    def load(self):
        variables, attributes = AbstractDataStore.load(self)
        variables = {'%s_suffix' % k: v
                     for k, v in variables.items()}
        attributes = {'%s_suffix' % k: v
                      for k, v in attributes.items()}
        return variables, attributes

This function will be called any time variables or attributes are requested, so care should be taken to make sure it's fast.
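A usage sketch for an already-open store, following the unpacking order shown above:

variables, attributes = store.load()
print(sorted(variables))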