rex.multi_time_resource.MultiTimeResource
- class MultiTimeResource(h5_path, unscale=True, str_decode=True, res_cls=<class 'rex.resource.Resource'>, hsds=False, hsds_kwargs=None)
Bases: BaseDatasetIterable
Class to handle resource data stored temporally across multiple .h5 files
Examples
Extracting the resource’s Datetime Index
>>> from rex.multi_time_resource import MultiTimeResource
>>> path = '$TESTDATADIR/nsrdb/ri_100_nsrdb_*.h5'
>>> with MultiTimeResource(path) as res:
...     ti = res.time_index
...
>>> ti
DatetimeIndex(['2012-01-01 00:00:00', '2012-01-01 00:30:00',
               '2012-01-01 01:00:00', '2012-01-01 01:30:00',
               '2012-01-01 02:00:00', '2012-01-01 02:30:00',
               '2012-01-01 03:00:00', '2012-01-01 03:30:00',
               '2012-01-01 04:00:00', '2012-01-01 04:30:00',
               ...
               '2013-12-31 19:00:00', '2013-12-31 19:30:00',
               '2013-12-31 20:00:00', '2013-12-31 20:30:00',
               '2013-12-31 21:00:00', '2013-12-31 21:30:00',
               '2013-12-31 22:00:00', '2013-12-31 22:30:00',
               '2013-12-31 23:00:00', '2013-12-31 23:30:00'],
              dtype='datetime64[ns]', length=35088, freq=None)
NOTE: time_index covers data from 2012 and 2013
>>> with MultiTimeResource(path) as res:
...     print(res.h5_files)
...
['/Users/mrossol/Git_Repos/rex/tests/data/nsrdb/ri_100_nsrdb_2012.h5',
 '/Users/mrossol/Git_Repos/rex/tests/data/nsrdb/ri_100_nsrdb_2013.h5']
Data slicing works the same as with “Resource” except axis 0 now covers 2012 and 2013
>>> with MultiTimeResource(path) as res:
...     temperature = res['air_temperature']
...
>>> temperature
[[ 4.  5.  5. ...  4.  3.  4.]
 [ 4.  4.  5. ...  4.  3.  4.]
 [ 4.  4.  5. ...  4.  3.  4.]
 ...
 [-1. -1.  0. ... -2. -3. -2.]
 [-1. -1.  0. ... -2. -3. -2.]
 [-1. -1.  0. ... -2. -3. -2.]]
>>> temperature.shape
(35088, 100)
>>> with MultiTimeResource(path) as res:
...     temperature = res['air_temperature', ::100]  # every 100th timestep
...
>>> temperature
[[ 4.  5.  5. ...  4.  3.  4.]
 [ 1.  1.  2. ...  0.  0.  1.]
 [-2. -1. -1. ... -2. -4. -2.]
 ...
 [-3. -2. -2. ... -3. -4. -3.]
 [ 0.  0.  1. ...  0. -1.  0.]
 [ 3.  3.  3. ...  2.  2.  3.]]
>>> temperature.shape
(351, 100)
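The axis 0 lengths in the examples above can be checked with simple arithmetic. The sketch below uses only the standard library (not rex) and assumes a regular 30-minute timestep, as suggested by the time_index output; the helper name half_hourly_steps is hypothetical.

```python
from datetime import datetime, timedelta


def half_hourly_steps(year):
    """Count 30-minute timesteps in a calendar year (hypothetical helper)."""
    start = datetime(year, 1, 1)
    end = datetime(year + 1, 1, 1)
    return int((end - start) / timedelta(minutes=30))


# 2012 is a leap year, so it contributes one extra day of timesteps.
steps_2012 = half_hourly_steps(2012)
steps_2013 = half_hourly_steps(2013)
total_steps = steps_2012 + steps_2013

# Length of axis 0 after slicing with ::100 (every 100th timestep).
sliced_steps = len(range(0, total_steps, 100))

print(steps_2012, steps_2013, total_steps, sliced_steps)
```

Under that assumption the totals agree with the shapes shown: 35088 timesteps for the two years combined, and 351 rows after taking every 100th timestep.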
- Parameters:
h5_path (str | list) – Unix shell style pattern path with * wildcards to multi-file resource file sets. Files must have the same coordinates but can have different datasets or time indexes. Can also be an explicit list of multi time files, which themselves can contain * wildcards.
unscale (bool) – Boolean flag to automatically unscale variables on extraction
str_decode (bool) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read.
res_cls (obj) – Resource handler to use to open individual .h5 files
hsds (bool, optional) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS, by default False
hsds_kwargs (dict, optional) – Dictionary of optional kwargs for h5pyd, e.g., bucket, username, password, by default None
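To illustrate what a Unix shell style pattern for h5_path selects, here is a minimal standard-library sketch. It only demonstrates wildcard semantics and is not how rex resolves paths internally; the file listing and the match_files helper are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical directory listing; only two entries are NSRDB .h5 files.
available = [
    "ri_100_nsrdb_2012.h5",
    "ri_100_nsrdb_2013.h5",
    "ri_100_wtk_2012.h5",
    "notes.txt",
]


def match_files(patterns, names):
    """Return sorted names matching a pattern or a list of patterns."""
    if isinstance(patterns, str):
        patterns = [patterns]
    matched = {n for n in names for p in patterns if fnmatchcase(n, p)}
    return sorted(matched)


# A single wildcard pattern, as in the class examples.
single = match_files("ri_100_nsrdb_*.h5", available)

# An explicit list whose entries may themselves contain wildcards.
explicit = match_files(["ri_100_nsrdb_2012.h5", "ri_100_*_2013.h5"], available)

print(single)
print(explicit)
```

Both calls select the same two yearly files here, which mirrors the two ways h5_path can be given: a single pattern or an explicit list.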
Methods
close() – Close h5 instance
get_attrs([dset]) – Get h5 attributes either from file or dataset
get_dset_properties(dset) – Get dataset properties (shape, dtype, chunks)
get_meta_arr(rec_name[, rows]) – Get a meta array by name (faster than DataFrame extraction).
get_scale_factor(dset) – Get dataset scale factor
get_units(dset) – Get dataset units
Attributes
attrs – Dictionary of all dataset attributes
chunks – Dictionary of all dataset chunk sizes
coordinates – (lat, lon) pairs
datasets – Datasets available
dsets – Datasets available
dtypes – Dictionary of all dataset dtypes
global_attrs – Global (file) attributes
h5 – Open class instance that handles all .h5 files that data is to be extracted from
lat_lon – Extract (latitude, longitude) pairs
meta – Resource meta data DataFrame
res_dsets – Available resource datasets
resource_datasets – Available resource datasets
scale_factors – Dictionary of all dataset scale factors
shape – Resource shape (timesteps, sites), i.e. shape = (len(time_index), len(meta))
shapes – Dictionary of all dataset shapes
time_index – Resource DatetimeIndex
units – Dictionary of all dataset units
- property h5
Open class instance that handles all .h5 files that data is to be extracted from
- Returns:
h5 (MultiTimeH5 | MultiYearH5)
- property datasets
Datasets available
- Returns:
list
- property dsets
Datasets available
- Returns:
list
- property resource_datasets
Available resource datasets
- Returns:
list
- property res_dsets
Available resource datasets
- Returns:
list
- property shape
Resource shape (timesteps, sites), i.e. shape = (len(time_index), len(meta))
- Returns:
shape (tuple)
- property meta
Resource meta data DataFrame
- Returns:
meta (pandas.DataFrame)
- property time_index
Resource DatetimeIndex
- Returns:
time_index (pandas.DatetimeIndex)
- property lat_lon
Extract (latitude, longitude) pairs
- Returns:
lat_lon (ndarray)
- property coordinates
(lat, lon) pairs
- Returns:
lat_lon (ndarray)
- Type:
Coordinates
- property global_attrs
Global (file) attributes
- Returns:
global_attrs (dict)
- property attrs
Dictionary of all dataset attributes
- Returns:
attrs (dict)
- property shapes
Dictionary of all dataset shapes
- Returns:
shapes (dict)
- property dtypes
Dictionary of all dataset dtypes
- Returns:
dtypes (dict)
- property chunks
Dictionary of all dataset chunk sizes
- Returns:
chunks (dict)
- property scale_factors
Dictionary of all dataset scale factors
- Returns:
scale_factors (dict)
- property units
Dictionary of all dataset units
- Returns:
units (dict)
- get_attrs(dset=None)
Get h5 attributes either from file or dataset
- Parameters:
dset (str) – Dataset to get attributes for, if None get file (global) attributes
- Returns:
attrs (dict) – Dataset or file attributes
- get_dset_properties(dset)
Get dataset properties (shape, dtype, chunks)
- Parameters:
dset (str) – Dataset to get properties for
- Returns:
shape (tuple) – Dataset array shape
dtype (str) – Dataset array dtype
chunks (tuple) – Dataset chunk size
- get_scale_factor(dset)
Get dataset scale factor
- Parameters:
dset (str) – Dataset to get scale factor for
- Returns:
float – Dataset scale factor, used to unscale int values to floats
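A minimal sketch of what unscaling integers to floats can look like, assuming the convention that stored integers are divided by the scale factor. The unscale_values helper and the sample numbers are hypothetical, and this is not rex's implementation; check the dataset attributes for the convention actually used (including any offset).

```python
def unscale_values(stored, scale_factor):
    """Convert stored integers to floats by dividing by scale_factor.

    Assumes the divide convention; hypothetical helper, not part of rex.
    """
    return [value / scale_factor for value in stored]


# Hypothetical temperatures stored as integers with a scale factor of 10.
stored = [40, 45, -15]
unscaled = unscale_values(stored, 10.0)
print(unscaled)
```

With unscale=True the class is documented to apply this kind of conversion automatically on extraction, so the sketch is only relevant when reading raw values.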
- get_units(dset)
Get dataset units
- Parameters:
dset (str) – Dataset to get units for
- Returns:
str – Dataset units, None if not defined
- get_meta_arr(rec_name, rows=slice(None, None, None))
Get a meta array by name (faster than DataFrame extraction).
- Parameters:
rec_name (str) – Named record from the meta data to retrieve.
rows (slice) – Rows of the record to extract.
- Returns:
meta_arr (np.ndarray) – Extracted array from the meta data record name.