rex.multi_time_resource.MultiTimeResource

class MultiTimeResource(h5_path, unscale=True, str_decode=True, res_cls=<class 'rex.resource.Resource'>, hsds=False, hsds_kwargs=None)

Bases: BaseDatasetIterable

Class to handle resource data stored temporally across multiple .h5 files

Examples

Extracting the resource’s Datetime Index

>>> from rex.multi_time_resource import MultiTimeResource
>>> path = '$TESTDATADIR/nsrdb/ri_100_nsrdb_*.h5'
>>> with MultiTimeResource(path) as res:
...     ti = res.time_index
...
>>> ti
DatetimeIndex(['2012-01-01 00:00:00', '2012-01-01 00:30:00',
               '2012-01-01 01:00:00', '2012-01-01 01:30:00',
               '2012-01-01 02:00:00', '2012-01-01 02:30:00',
               '2012-01-01 03:00:00', '2012-01-01 03:30:00',
               '2012-01-01 04:00:00', '2012-01-01 04:30:00',
               ...
               '2013-12-31 19:00:00', '2013-12-31 19:30:00',
               '2013-12-31 20:00:00', '2013-12-31 20:30:00',
               '2013-12-31 21:00:00', '2013-12-31 21:30:00',
               '2013-12-31 22:00:00', '2013-12-31 22:30:00',
               '2013-12-31 23:00:00', '2013-12-31 23:30:00'],
              dtype='datetime64[ns]', length=35088, freq=None)

NOTE: time_index covers data from 2012 and 2013
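The length of 35088 reported above can be checked by hand: 2012 is a leap year, so the two years contribute different numbers of half-hourly timesteps. A minimal sketch using only the standard library (the helper name count_timesteps is illustrative and not part of rex):

```python
from datetime import datetime, timedelta


def count_timesteps(first_year, last_year, step=timedelta(minutes=30)):
    """Count steps of size `step` from Jan 1 of first_year through Dec 31 of last_year."""
    span = datetime(last_year + 1, 1, 1) - datetime(first_year, 1, 1)
    return span // step


n_2012 = count_timesteps(2012, 2012)  # leap year: 366 days * 48 steps
n_2013 = count_timesteps(2013, 2013)  # 365 days * 48 steps
total = count_timesteps(2012, 2013)
print(n_2012, n_2013, total)  # 17568 17520 35088
```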

>>> with MultiTimeResource(path) as res:
...     print(res.h5_files)
['/Users/mrossol/Git_Repos/rex/tests/data/nsrdb/ri_100_nsrdb_2012.h5',
 '/Users/mrossol/Git_Repos/rex/tests/data/nsrdb/ri_100_nsrdb_2013.h5']

Data slicing works the same as with "Resource", except that axis 0 now covers both 2012 and 2013

>>> with MultiTimeResource(path) as res:
...     temperature = res['air_temperature']
...
>>> temperature
[[ 4.  5.  5. ...  4.  3.  4.]
 [ 4.  4.  5. ...  4.  3.  4.]
 [ 4.  4.  5. ...  4.  3.  4.]
 ...
 [-1. -1.  0. ... -2. -3. -2.]
 [-1. -1.  0. ... -2. -3. -2.]
 [-1. -1.  0. ... -2. -3. -2.]]
>>> temperature.shape
(35088, 100)
>>> with MultiTimeResource(path) as res:
...     temperature = res['air_temperature', ::100]  # every 100th timestep
...
>>> temperature
[[ 4.  5.  5. ...  4.  3.  4.]
 [ 1.  1.  2. ...  0.  0.  1.]
 [-2. -1. -1. ... -2. -4. -2.]
 ...
 [-3. -2. -2. ... -3. -4. -3.]
 [ 0.  0.  1. ...  0. -1.  0.]
 [ 3.  3.  3. ...  2.  2.  3.]]
>>> temperature.shape
(351, 100)
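
The first-axis length of 351 follows from ordinary Python slice semantics: a step of 100 over 35088 timesteps keeps indices 0, 100, ..., 35000. A quick check with a plain range, independent of rex:

```python
n_timesteps = 35088  # length of the combined 2012-2013 time index

# Slicing a range yields the indices that a [::100] slice would select
selected = range(n_timesteps)[::100]

print(len(selected))  # 351
print(selected[0], selected[-1])  # 0 35000
```
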
Parameters:
  • h5_path (str | list) – Unix shell style pattern path with * wildcards to multi-file resource file sets. Files must have the same coordinates but can have different datasets or time indexes. Can also be an explicit list of multi time files, which themselves can contain * wildcards.

  • unscale (bool) – Boolean flag to automatically unscale variables on extraction

  • str_decode (bool) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read.

  • res_cls (obj) – Resource handler to use to open individual .h5 files

  • hsds (bool, optional) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS, by default False

  • hsds_kwargs (dict, optional) – Dictionary of optional kwargs for h5pyd, e.g., bucket, username, password, by default None
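
To see which files a wildcard pattern such as the one accepted by h5_path would pick up, the standard-library glob module can be used on its own. This is only a sketch of shell-style matching; whether rex resolves patterns in exactly this way is an assumption, and the file names below are made-up placeholders created in a temporary directory:

```python
import glob
import os
import tempfile

# Hypothetical file names; the third one should NOT match the pattern
names = ("ri_100_nsrdb_2012.h5", "ri_100_nsrdb_2013.h5", "ri_100_wtk_2012.h5")

with tempfile.TemporaryDirectory() as tmp:
    for name in names:
        # Create empty placeholder files so the pattern has something to match
        open(os.path.join(tmp, name), "w").close()

    pattern = os.path.join(tmp, "ri_100_nsrdb_*.h5")
    matched = sorted(os.path.basename(p) for p in glob.glob(pattern))

print(matched)  # ['ri_100_nsrdb_2012.h5', 'ri_100_nsrdb_2013.h5']
```

Sorting the matches keeps the files in chronological order when the year is the only part of the name that varies.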

Methods

close()

Close h5 instance

get_attrs([dset])

Get h5 attributes either from file or dataset

get_dset_properties(dset)

Get dataset properties (shape, dtype, chunks)

get_meta_arr(rec_name[, rows])

Get a meta array by name (faster than DataFrame extraction).

get_scale_factor(dset)

Get dataset scale factor

get_units(dset)

Get dataset units

Attributes

attrs

Dictionary of all dataset attributes

chunks

Dictionary of all dataset chunk sizes

coordinates

Coordinates as (lat, lon) pairs

datasets

Datasets available

dsets

Datasets available

dtypes

Dictionary of all dataset dtypes

global_attrs

Global (file) attributes

h5

Open class instance that handles all .h5 files that data is to be extracted from

lat_lon

Extract (latitude, longitude) pairs

meta

Resource meta data DataFrame

res_dsets

Available resource datasets

resource_datasets

Available resource datasets

scale_factors

Dictionary of all dataset scale factors

shape

Resource shape (timesteps, sites), i.e. shape = (len(time_index), len(meta))

shapes

Dictionary of all dataset shapes

time_index

Resource DatetimeIndex

units

Dictionary of all dataset units

property h5

Open class instance that handles all .h5 files that data is to be extracted from

Returns:

h5 (MultiTimeH5 | MultiYearH5)

property datasets

Datasets available

Returns:

list

property dsets

Datasets available

Returns:

list

property resource_datasets

Available resource datasets

Returns:

list

property res_dsets

Available resource datasets

Returns:

list

property shape

Resource shape (timesteps, sites), i.e. shape = (len(time_index), len(meta))

Returns:

shape (tuple)

property meta

Resource meta data DataFrame

Returns:

meta (pandas.DataFrame)

property time_index

Resource DatetimeIndex

Returns:

time_index (pandas.DatetimeIndex)

property lat_lon

Extract (latitude, longitude) pairs

Returns:

lat_lon (ndarray)

property coordinates

Coordinates as (lat, lon) pairs

Returns:

lat_lon (ndarray)

property global_attrs

Global (file) attributes

Returns:

global_attrs (dict)

property attrs

Dictionary of all dataset attributes

Returns:

attrs (dict)

property shapes

Dictionary of all dataset shapes

Returns:

shapes (dict)

property dtypes

Dictionary of all dataset dtypes

Returns:

dtypes (dict)

property chunks

Dictionary of all dataset chunk sizes

Returns:

chunks (dict)

property scale_factors

Dictionary of all dataset scale factors

Returns:

scale_factors (dict)

property units

Dictionary of all dataset units

Returns:

units (dict)

get_attrs(dset=None)

Get h5 attributes either from file or dataset

Parameters:

dset (str) – Dataset to get attributes for; if None, get file (global) attributes

Returns:

attrs (dict) – Dataset or file attributes

get_dset_properties(dset)

Get dataset properties (shape, dtype, chunks)

Parameters:

dset (str) – Dataset to get properties for

Returns:

  • shape (tuple) – Dataset array shape

  • dtype (str) – Dataset array dtype

  • chunks (tuple) – Dataset chunk size

get_scale_factor(dset)

Get dataset scale factor

Parameters:

dset (str) – Dataset to get scale factor for

Returns:

float – Dataset scale factor, used to unscale int values to floats
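
As an illustration of what a scale factor is for, the sketch below converts stored integers to floats. It assumes the convention that the physical value equals the stored value divided by the scale factor; the function name and sample values are hypothetical, and the actual convention should be confirmed against the dataset's attributes:

```python
def unscale(raw_values, scale_factor):
    """Convert stored integers to floats, ASSUMING value = stored / scale_factor."""
    return [value / scale_factor for value in raw_values]


# Made-up stored integers with an assumed scale factor of 10
unscaled = unscale([40, 45, -15], 10)
print(unscaled)  # [4.0, 4.5, -1.5]
```

Storing scaled integers instead of floats is a common way to keep files small, which is why the unscale flag exists to undo it on extraction.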

get_units(dset)

Get dataset units

Parameters:

dset (str) – Dataset to get units for

Returns:

str – Dataset units, None if not defined
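
The per-dataset getters documented above can be combined into a single summary. The helper below is a sketch, not part of rex: describe_dataset is a hypothetical name, it assumes get_dset_properties returns its three values as one tuple, and the stub class with its made-up values exists only so the example runs without any .h5 files. In real use, res would be the object bound by "with MultiTimeResource(path) as res:".

```python
def describe_dataset(res, dset):
    """Collect basic facts about one dataset from an open resource handler."""
    # Assumes the three documented return values arrive as a single tuple
    shape, dtype, chunks = res.get_dset_properties(dset)
    return {
        "shape": shape,
        "dtype": dtype,
        "chunks": chunks,
        "scale_factor": res.get_scale_factor(dset),
        "units": res.get_units(dset),
    }


class _StubResource:
    """Hypothetical stand-in with made-up values; not a rex class."""

    def get_dset_properties(self, dset):
        return (35088, 100), "int16", (2688, 100)

    def get_scale_factor(self, dset):
        return 10

    def get_units(self, dset):
        return "C"


info = describe_dataset(_StubResource(), "air_temperature")
print(info["shape"], info["units"])
```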

get_meta_arr(rec_name, rows=slice(None, None, None))

Get a meta array by name (faster than DataFrame extraction).

Parameters:
  • rec_name (str) – Named record from the meta data to retrieve.

  • rows (slice) – Rows of the record to extract.

Returns:

meta_arr (np.ndarray) – Extracted array from the meta data record name.
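
The default rows=slice(None, None, None) is the slice-object equivalent of [:], so it selects every row. The sketch below illustrates this with a plain list standing in for a meta record; the values are made up:

```python
elevation = [12, 30, 45, 7, 19]  # hypothetical meta record values

all_rows = slice(None, None, None)  # same as [:]
first_two = slice(0, 2)             # same as [0:2]
every_other = slice(None, None, 2)  # same as [::2]

print(elevation[all_rows])     # [12, 30, 45, 7, 19]
print(elevation[first_two])    # [12, 30]
print(elevation[every_other])  # [12, 45, 19]
```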

close()

Close h5 instance