rex.multi_time_resource.MultiTimeResource

class MultiTimeResource(h5_path, unscale=True, str_decode=True, res_cls=<class 'rex.resource.Resource'>, hsds=False, hsds_kwargs=None)[source]

Bases: object

Class to handle resource data stored temporally across multiple .h5 files

Examples

Extracting the resource’s Datetime Index

>>> path = '$TESTDATADIR/nsrdb/ri_100_nsrdb_*.h5'
>>> with MultiTimeResource(path) as res:
...     ti = res.time_index
...
>>> ti
DatetimeIndex(['2012-01-01 00:00:00', '2012-01-01 00:30:00',
               '2012-01-01 01:00:00', '2012-01-01 01:30:00',
               '2012-01-01 02:00:00', '2012-01-01 02:30:00',
               '2012-01-01 03:00:00', '2012-01-01 03:30:00',
               '2012-01-01 04:00:00', '2012-01-01 04:30:00',
               ...
               '2013-12-31 19:00:00', '2013-12-31 19:30:00',
               '2013-12-31 20:00:00', '2013-12-31 20:30:00',
               '2013-12-31 21:00:00', '2013-12-31 21:30:00',
               '2013-12-31 22:00:00', '2013-12-31 22:30:00',
               '2013-12-31 23:00:00', '2013-12-31 23:30:00'],
              dtype='datetime64[ns]', length=35088, freq=None)

NOTE: time_index covers data from 2012 and 2013
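
The length of 35088 reported above is consistent with two full years of half-hourly data, one of which (2012) is a leap year. A quick check using only the standard library, independent of rex and of the test files:

```python
from datetime import datetime, timedelta

# Half-hourly steps from the start of 2012 up to, but excluding, the start of 2014
start = datetime(2012, 1, 1)
end = datetime(2014, 1, 1)
step = timedelta(minutes=30)

# (366 + 365) days * 48 half-hour steps per day
n_steps = (end - start) // step
print(n_steps)  # 35088
```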

>>> with MultiTimeResource(path) as res:
...     print(res.h5_files)
['/Users/mrossol/Git_Repos/rex/tests/data/nsrdb/ri_100_nsrdb_2012.h5',
 '/Users/mrossol/Git_Repos/rex/tests/data/nsrdb/ri_100_nsrdb_2013.h5']

Data slicing works the same as with “Resource”, except that axis 0 now spans both 2012 and 2013

>>> with MultiTimeResource(path) as res:
...     temperature = res['air_temperature']
...
>>> temperature
[[ 4.  5.  5. ...  4.  3.  4.]
 [ 4.  4.  5. ...  4.  3.  4.]
 [ 4.  4.  5. ...  4.  3.  4.]
 ...
 [-1. -1.  0. ... -2. -3. -2.]
 [-1. -1.  0. ... -2. -3. -2.]
 [-1. -1.  0. ... -2. -3. -2.]]
>>> temperature.shape
(35088, 100)
>>> with MultiTimeResource(path) as res:
...     temperature = res['air_temperature', ::100]  # every 100th timestep
...
>>> temperature
[[ 4.  5.  5. ...  4.  3.  4.]
 [ 1.  1.  2. ...  0.  0.  1.]
 [-2. -1. -1. ... -2. -4. -2.]
 ...
 [-3. -2. -2. ... -3. -4. -3.]
 [ 0.  0.  1. ...  0. -1.  0.]
 [ 3.  3.  3. ...  2.  2.  3.]]
>>> temperature.shape
(351, 100)
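
The first axis of 351 follows from ordinary Python slice semantics applied to the 35088 timesteps. A small standard-library check, which does not read any resource file:

```python
n_timesteps = 35088  # length of the combined 2012-2013 time index shown above

# A step-100 slice keeps indices 0, 100, 200, ..., 35000
kept = range(n_timesteps)[::100]
print(len(kept))  # 351
print(kept[-1])   # 35000
```
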
Parameters
  • h5_path (str) – Unix shell style pattern path with * wildcards to multi-file resource file sets. Files must have the same coordinates but can have different datasets or time indexes.

  • unscale (bool) – Boolean flag to automatically unscale variables on extraction

  • str_decode (bool) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read.

  • res_cls (obj) – Resource handler to use to open individual .h5 files

  • hsds (bool, optional) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS, by default False

  • hsds_kwargs (dict, optional) – Dictionary of optional kwargs for h5pyd, e.g., bucket, username, password, by default None
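
To illustrate what a Unix shell style pattern such as the one passed to h5_path selects, here is a minimal sketch using the standard library's fnmatch module. The file names are hypothetical, and this is not necessarily how rex resolves the pattern internally.

```python
from fnmatch import fnmatch

# Hypothetical directory listing, for illustration only
candidates = [
    'ri_100_nsrdb_2012.h5',
    'ri_100_nsrdb_2013.h5',
    'ri_100_wtk_2012.h5',
    'ri_100_nsrdb_2012.csv',
]

# The * wildcard matches any run of characters, here the year
pattern = 'ri_100_nsrdb_*.h5'
matched = sorted(name for name in candidates if fnmatch(name, pattern))
print(matched)
# ['ri_100_nsrdb_2012.h5', 'ri_100_nsrdb_2013.h5']
```

Only the two NSRDB .h5 files match; the file from a different dataset and the .csv file are excluded.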

Methods

close()

Close h5 instance

get_attrs([dset])

Get h5 attributes either from file or dataset

get_dset_properties(dset)

Get dataset properties (shape, dtype, chunks)

get_meta_arr(rec_name[, rows])

Get a meta array by name (faster than DataFrame extraction).

get_scale_factor(dset)

Get dataset scale factor

get_units(dset)

Get dataset units

Attributes

attrs

Dictionary of all dataset attributes

chunks

Dictionary of all dataset chunk sizes

coordinates

(lat, lon) pairs

datasets

Datasets available

dsets

Datasets available

dtypes

Dictionary of all dataset dtypes

global_attrs

Global (file) attributes

h5

Open class instance that handles all .h5 files that data is to be extracted from

lat_lon

Extract (latitude, longitude) pairs

meta

Resource meta data DataFrame

res_dsets

Available resource datasets

resource_datasets

Available resource datasets

scale_factors

Dictionary of all dataset scale factors

shape

Resource shape (timesteps, sites), i.e. shape = (len(time_index), len(meta))

shapes

Dictionary of all dataset shapes

time_index

Resource DatetimeIndex

units

Dictionary of all dataset units

property h5

Open class instance that handles all .h5 files that data is to be extracted from

Returns

h5 (MultiTimeH5 | MultiYearH5)

property datasets

Datasets available

Returns

list

property dsets

Datasets available

Returns

list

property resource_datasets

Available resource datasets

Returns

list

property res_dsets

Available resource datasets

Returns

list

property shape

Resource shape (timesteps, sites), i.e. shape = (len(time_index), len(meta))

Returns

shape (tuple)

property meta

Resource meta data DataFrame

Returns

meta (pandas.DataFrame)

property time_index

Resource DatetimeIndex

Returns

time_index (pandas.DatetimeIndex)

property lat_lon

Extract (latitude, longitude) pairs

Returns

lat_lon (ndarray)

property coordinates

Coordinates as (lat, lon) pairs

Returns

lat_lon (ndarray)

property global_attrs

Global (file) attributes

Returns

global_attrs (dict)

property attrs

Dictionary of all dataset attributes

Returns

attrs (dict)

property shapes

Dictionary of all dataset shapes

Returns

shapes (dict)

property dtypes

Dictionary of all dataset dtypes

Returns

dtypes (dict)

property chunks

Dictionary of all dataset chunk sizes

Returns

chunks (dict)

property scale_factors

Dictionary of all dataset scale factors

Returns

scale_factors (dict)

property units

Dictionary of all dataset units

Returns

units (dict)

get_attrs(dset=None)[source]

Get h5 attributes either from file or dataset

Parameters

dset (str) – Dataset to get attributes for; if None, get file (global) attributes

Returns

attrs (dict) – Dataset or file attributes

get_dset_properties(dset)[source]

Get dataset properties (shape, dtype, chunks)

Parameters

dset (str) – Dataset to get properties for

Returns

  • shape (tuple) – Dataset array shape

  • dtype (str) – Dataset array dtype

  • chunks (tuple) – Dataset chunk size

get_scale_factor(dset)[source]

Get dataset scale factor

Parameters

dset (str) – Dataset to get scale factor for

Returns

float – Dataset scale factor, used to unscale int values to floats
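
As a rough sketch of what unscaling means, assume the convention that stored integers are divided by the scale factor to recover physical values. The helper name `unscale` and all numbers below are hypothetical; they are not read from a real file or taken from rex.

```python
def unscale(stored_values, scale_factor):
    """Convert stored integer values to floats by dividing by scale_factor."""
    return [value / scale_factor for value in stored_values]


stored = [40, 45, -15]  # hypothetical integers as stored on disk
scale_factor = 10.0     # hypothetical value, as might be returned by get_scale_factor

print(unscale(stored, scale_factor))  # [4.0, 4.5, -1.5]
```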

get_units(dset)[source]

Get dataset units

Parameters

dset (str) – Dataset to get units for

Returns

str – Dataset units, None if not defined

get_meta_arr(rec_name, rows=slice(None, None, None))[source]

Get a meta array by name (faster than DataFrame extraction).

Parameters
  • rec_name (str) – Named record from the meta data to retrieve.

  • rows (slice) – Rows of the record to extract.

Returns

meta_arr (np.ndarray) – Extracted array from the meta data record name.

close()[source]

Close h5 instance