rex.resource.Resource

class Resource(h5_file, unscale=True, str_decode=True, group=None, hsds=False, hsds_kwargs=None)

Bases: BaseResource

Base class to handle resource .h5 files

Examples

Extracting the resource’s Datetime Index

>>> file = '$TESTDATADIR/nsrdb/ri_100_nsrdb_2012.h5'
>>> with Resource(file) as res:
>>>     ti = res.time_index
>>>
>>> ti
DatetimeIndex(['2012-01-01 00:00:00', '2012-01-01 00:30:00',
               '2012-01-01 01:00:00', '2012-01-01 01:30:00',
               '2012-01-01 02:00:00', '2012-01-01 02:30:00',
               '2012-01-01 03:00:00', '2012-01-01 03:30:00',
               '2012-01-01 04:00:00', '2012-01-01 04:30:00',
               ...
               '2012-12-31 19:00:00', '2012-12-31 19:30:00',
               '2012-12-31 20:00:00', '2012-12-31 20:30:00',
               '2012-12-31 21:00:00', '2012-12-31 21:30:00',
               '2012-12-31 22:00:00', '2012-12-31 22:30:00',
               '2012-12-31 23:00:00', '2012-12-31 23:30:00'],
              dtype='datetime64[ns]', length=17568, freq=None)

Efficient slicing of the Datetime Index

>>> with Resource(file) as res:
>>>     ti = res['time_index', 1]
>>>
>>> ti
2012-01-01 00:30:00

>>> with Resource(file) as res:
>>>     ti = res['time_index', :10]
>>>
>>> ti
DatetimeIndex(['2012-01-01 00:00:00', '2012-01-01 00:30:00',
               '2012-01-01 01:00:00', '2012-01-01 01:30:00',
               '2012-01-01 02:00:00', '2012-01-01 02:30:00',
               '2012-01-01 03:00:00', '2012-01-01 03:30:00',
               '2012-01-01 04:00:00', '2012-01-01 04:30:00'],
              dtype='datetime64[ns]', freq=None)

>>> with Resource(file) as res:
>>>     ti = res['time_index', [1, 2, 4, 8, 9]]
>>>
>>> ti
DatetimeIndex(['2012-01-01 00:30:00', '2012-01-01 01:00:00',
               '2012-01-01 02:00:00', '2012-01-01 04:00:00',
               '2012-01-01 04:30:00'],
              dtype='datetime64[ns]', freq=None)

Extracting resource’s site metadata

>>> with Resource(file) as res:
>>>     meta = res.meta
>>>
>>> meta
        latitude  longitude   elevation  timezone    country ...
0      41.29     -71.86    0.000000        -5           None ...
1      41.29     -71.82    0.000000        -5           None ...
2      41.25     -71.82    0.000000        -5           None ...
3      41.33     -71.82   15.263158        -5  United States ...
4      41.37     -71.82   25.360000        -5  United States ...
..       ...        ...         ...       ...            ... ...
95     41.25     -71.66    0.000000        -5           None ...
96     41.89     -71.66  153.720000        -5  United States ...
97     41.45     -71.66   35.440000        -5  United States ...
98     41.61     -71.66  140.200000        -5  United States ...
99     41.41     -71.66   35.160000        -5  United States ...
[100 rows x 10 columns]

Efficient slicing of the metadata

>>> with Resource(file) as res:
>>>     meta = res['meta', 1]
>>>
>>> meta
   latitude  longitude  elevation  timezone country state county urban ...
1     41.29     -71.82        0.0        -5    None  None   None  None ...

>>> with Resource(file) as res:
>>>     meta = res['meta', :5]
>>>
>>> meta
   latitude  longitude  elevation  timezone        country ...
0     41.29     -71.86   0.000000        -5           None ...
1     41.29     -71.82   0.000000        -5           None ...
2     41.25     -71.82   0.000000        -5           None ...
3     41.33     -71.82  15.263158        -5  United States ...
4     41.37     -71.82  25.360000        -5  United States ...

>>> with Resource(file) as res:
>>>     tz = res['meta', :, 'timezone']
>>>
>>> tz
0    -5
1    -5
2    -5
3    -5
4    -5
     ..
95   -5
96   -5
97   -5
98   -5
99   -5
Name: timezone, Length: 100, dtype: int64

>>> with Resource(file) as res:
>>>     lat_lon = res['meta', :, ['latitude', 'longitude']]
>>>
>>> lat_lon
    latitude  longitude
0      41.29     -71.86
1      41.29     -71.82
2      41.25     -71.82
3      41.33     -71.82
4      41.37     -71.82
..       ...        ...
95     41.25     -71.66
96     41.89     -71.66
97     41.45     -71.66
98     41.61     -71.66
99     41.41     -71.66
[100 rows x 2 columns]

Extracting resource variables (datasets)

>>> with Resource(file) as res:
>>>     wspd = res['wind_speed']
>>>
>>> wspd
[[12. 12. 12. ... 12. 12. 12.]
 [12. 12. 12. ... 12. 12. 12.]
 [12. 12. 12. ... 12. 12. 12.]
 ...
 [14. 14. 14. ... 14. 14. 14.]
 [15. 15. 15. ... 15. 15. 15.]
 [15. 15. 15. ... 15. 15. 15.]]

Efficient slicing of variables

>>> with Resource(file) as res:
>>>     wspd = res['wind_speed', :2]
>>>
>>> wspd
[[12. 12. 12. 12. 12. 12. 53. 53. 53. 53. 53. 12. 53.  1.  1. 12. 12. 12.
   1.  1. 12. 53. 53. 53. 12. 12. 12. 12. 12.  1. 12. 12.  1. 12. 12. 53.
  12. 53.  1. 12.  1. 53. 53. 12. 12. 12. 12.  1.  1.  1. 12. 12.  1.  1.
  12. 12. 53. 53. 53. 12. 12. 53. 53. 12. 12. 12. 12. 12. 12.  1. 53.  1.
  53. 12. 12. 53. 53.  1.  1.  1. 53. 12.  1.  1. 53. 53. 53. 12. 12. 12.
  12. 12. 12. 12.  1. 12.  1. 12. 12. 12.]
 [12. 12. 12. 12. 12. 12. 53. 53. 53. 53. 53. 12. 53.  1.  1. 12. 12. 12.
   1.  1. 12. 53. 53. 53. 12. 12. 12. 12. 12.  1. 12. 12.  1. 12. 12. 53.
  12. 53.  1. 12.  1. 53. 53. 12. 12. 12. 12.  1.  1.  1. 12. 12.  1.  1.
  12. 12. 53. 53. 53. 12. 12. 53. 53. 12. 12. 12. 12. 12. 12.  1. 53.  1.
  53. 12. 12. 53. 53.  1.  1.  1. 53. 12.  1.  1. 53. 53. 53. 12. 12. 12.
  12. 12. 12. 12.  1. 12.  1. 12. 12. 12.]]

>>> with Resource(file) as res:
>>>     wspd = res['wind_speed', :, [2, 3]]
>>>
>>> wspd
[[12. 12.]
 [12. 12.]
 [12. 12.]
 ...
 [14. 14.]
 [15. 15.]
 [15. 15.]]

Parameters:

h5_file (str) – Path to .h5 resource file
unscale (bool, optional) – Boolean flag to automatically unscale variables on extraction, by default True
str_decode (bool, optional) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read, by default True
group (str, optional) – Group within .h5 resource file to open, by default None
hsds (bool, optional) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS, by default False
hsds_kwargs (dict, optional) – Dictionary of optional kwargs for h5pyd, e.g., bucket, username, password, by default None
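
The flags above can be combined when the raw, undecoded contents of a file are wanted. The following is a minimal sketch rather than an excerpt from the library: `read_raw` and its `resource_cls` argument are hypothetical names introduced here so the helper can be exercised without a real .h5 file, and the import path is assumed from the module name at the top of this page.

```python
def read_raw(h5_file, dset, resource_cls=None):
    """Read the first two timesteps of `dset` without unscaling.

    `read_raw` and `resource_cls` are hypothetical names used only for
    illustration; by default the handler documented on this page is used.
    """
    if resource_cls is None:
        # Imported lazily so the sketch can be loaded without rex installed.
        from rex.resource import Resource as resource_cls

    # unscale=False returns the stored values as-is;
    # str_decode=False skips byte-string decoding of the meta data.
    with resource_cls(h5_file, unscale=False, str_decode=False) as res:
        return res[dset, :2]
```

Calling `read_raw(file, 'wind_speed')` with the example file used above would be expected to return the stored (still scaled) values for the first two timesteps.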

Methods

`close`()	Close h5 instance
`df_str_decode`(df)	Decode a dataframe with byte string columns into ordinary str cols.
`get_SAM_df`(site)	Placeholder for get_SAM_df method that is resource specific
`get_attrs`([dset])	Get h5 attributes either from file or dataset
`get_dset_properties`(dset)	Get dataset properties (shape, dtype, chunks)
`get_meta_arr`(rec_name[, rows])	Get a meta array by name (faster than DataFrame extraction).
`get_scale_factor`(dset)	Get dataset scale factor
`get_units`(dset)	Get dataset units
`open_dataset`(ds_name)	Open resource dataset
`preload_SAM`(h5_file, sites, tech[, unscale, ...])	Pre-load project_points for SAM

Attributes

`ADD_ATTR`
`SCALE_ATTR`
`UNIT_ATTR`
`adders`	Dictionary of all dataset add offset factors
`attrs`	Dictionary of all dataset attributes
`chunks`	Dictionary of all dataset chunk sizes
`coordinates`	(lat, lon) pairs
`data_version`	Get the version attribute of the data.
`datasets`	Datasets available
`dsets`	Datasets available
`dtypes`	Dictionary of all dataset dtypes
`global_attrs`	Global (file) attributes
`groups`	Groups available
`h5`	Open h5py File instance.
`lat_lon`	Extract (latitude, longitude) pairs
`meta`	Resource meta data DataFrame
`res_dsets`	Available resource datasets
`resource_datasets`	Available resource datasets
`scale_factors`	Dictionary of all dataset scale factors
`shape`	Resource shape (timesteps, sites) shape = (len(time_index), len(meta))
`shapes`	Dictionary of all dataset shapes
`time_index`	Resource DatetimeIndex
`units`	Dictionary of all dataset units

property adders

Dictionary of all dataset add offset factors

Returns: adders (dict)

property attrs

Dictionary of all dataset attributes

Returns: attrs (dict)

property chunks

Dictionary of all dataset chunk sizes

Returns: chunks (dict)

close(): Close h5 instance

property coordinates

(lat, lon) pairs

Returns: lat_lon (ndarray)
Type: Coordinates

property data_version

Get the version attribute of the data. None if not available.

Returns: version (str | None)

property datasets

Datasets available

Returns: list

static df_str_decode(df)

Decode a dataframe with byte string columns into ordinary str cols.

Parameters: df (pd.DataFrame) – DataFrame with some columns being byte strings.
Returns: df (pd.DataFrame) – DataFrame with str columns instead of byte str columns.
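
As a rough illustration of what decoding byte-string columns means, the sketch below applies the same idea to a plain dict of columns rather than a pandas DataFrame. It is not the library's implementation: `decode_columns` and the sample values are hypothetical, and UTF-8 is assumed as the encoding.

```python
def decode_columns(table, encoding="utf-8"):
    """Return a copy of `table` with bytes values decoded to str.

    `table` maps column names to lists of values; non-bytes values are
    passed through unchanged. Hypothetical helper, for illustration only.
    """
    decoded = {}
    for name, values in table.items():
        decoded[name] = [
            v.decode(encoding) if isinstance(v, bytes) else v
            for v in values
        ]
    return decoded


# Made-up sample resembling meta data read with str_decode=False.
meta = {"country": [b"None", b"United States"], "timezone": [-5, -5]}
decoded = decode_columns(meta)
```

Only the byte-string column changes; numeric columns such as `timezone` are left untouched.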

property dsets

Datasets available

Returns: list

property dtypes

Dictionary of all dataset dtypes

Returns: dtypes (dict)

get_SAM_df(site)

Placeholder for get_SAM_df method that is resource specific

Parameters: site (int) – Site to extract SAM DataFrame for

get_attrs(dset=None)

Get h5 attributes either from file or dataset

Parameters: dset (str) – Dataset to get attributes for; if None, get file (global) attributes
Returns: attrs (dict) – Dataset or file attributes

get_dset_properties(dset)

Get dataset properties (shape, dtype, chunks)

Parameters:

dset (str) – Dataset to get properties for

Returns:

shape (tuple) – Dataset array shape
dtype (str) – Dataset array dtype
chunks (tuple) – Dataset chunk size

get_meta_arr(rec_name, rows=slice(None, None, None))

Get a meta array by name (faster than DataFrame extraction).

Parameters:

rec_name (str) – Named record from the meta data to retrieve.
rows (slice) – Rows of the record to extract.

Returns:

meta_arr (np.ndarray) – Extracted array from the meta data record name.

get_scale_factor(dset)

Get dataset scale factor

Parameters: dset (str) – Dataset to get scale factor for
Returns: float – Dataset scale factor, used to unscale int values to floats
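
To make the role of the scale factor concrete, here is a minimal sketch of the arithmetic, assuming the stored integers are divided by the scale factor to recover floats. The `unscale_values` helper and the sample numbers are hypothetical, and any add offset (see `adders`) is deliberately ignored.

```python
def unscale_values(raw, scale_factor):
    """Convert stored integer values to floats.

    Assumes unscaling is a plain division by `scale_factor`; this is a
    simplification for illustration, not the library's code.
    """
    return [value / scale_factor for value in raw]


# Made-up stored integers with an assumed scale factor of 100.
raw_wind_speed = [1200, 1250, 530]
wind_speed = unscale_values(raw_wind_speed, scale_factor=100)
```

With `unscale=True` (the default) this conversion is handled on extraction, so a helper like this would only matter when reading with `unscale=False`.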

get_units(dset)

Get dataset units

Parameters: dset (str) – Dataset to get units for
Returns: str – Dataset units, None if not defined

property global_attrs

Global (file) attributes

Returns: global_attrs (dict)

property groups

Groups available

Returns: groups (list) – List of groups

property h5

Open h5py File instance. If _group is not None return open Group

Returns: h5 (h5py.File | h5py.Group)

property lat_lon

Extract (latitude, longitude) pairs

Returns: lat_lon (ndarray)

property meta

Resource meta data DataFrame

Returns: meta (pandas.DataFrame)

open_dataset(ds_name)

Open resource dataset

Parameters: ds_name (str) – Dataset name to open
Returns: ds (ResourceDataset) – ResourceDataset instance for the opened dataset

classmethod preload_SAM(h5_file, sites, tech, unscale=True, str_decode=True, group=None, hsds=False, hsds_kwargs=None, time_index_step=None, means=False)

Pre-load project_points for SAM

Parameters:

h5_file (str) – h5_file to extract resource from
sites (list) – List of sites to be provided to SAM (sites is synonymous with gids aka spatial indices)
tech (str) – Technology to be run by SAM
unscale (bool) – Boolean flag to automatically unscale variables on extraction
str_decode (bool) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read.
group (str) – Group within .h5 resource file to open
hsds (bool, optional) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS, by default False
hsds_kwargs (dict, optional) – Dictionary of optional kwargs for h5pyd, e.g., bucket, username, password, by default None
time_index_step (int, optional) – Step size for time_index, used to reduce temporal resolution, by default None
means (bool, optional) – Boolean flag to compute mean resource when res_array is set, by default False

Returns:

SAM_res (SAMResource) – Instance of SAMResource pre-loaded with Solar resource for sites in project_points
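
The effect of `time_index_step` can be pictured as keeping every n-th timestep. The sketch below assumes that interpretation (it is not confirmed by the text above) and uses only the standard library; the short half-hourly index is made up to mirror the example file's resolution.

```python
from datetime import datetime, timedelta

# Made-up half-hourly index: 00:00, 00:30, ..., 02:30 on 2012-01-01.
start = datetime(2012, 1, 1)
half_hourly = [start + i * timedelta(minutes=30) for i in range(6)]

# Assumed behaviour: a step of 2 keeps every second entry,
# turning 30-minute data into hourly data.
time_index_step = 2
hourly = half_hourly[::time_index_step]
```

Under this assumption, a step of 2 on the half-hourly example file would halve the number of timesteps passed on to SAM.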

property res_dsets

Available resource datasets

Returns: list

property resource_datasets

Available resource datasets

Returns: list

property scale_factors

Dictionary of all dataset scale factors

Returns: scale_factors (dict)

property shape

Resource shape (timesteps, sites) shape = (len(time_index), len(meta))

Returns: shape (tuple)
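
As a consistency check on the example file used above, a half-hourly index covering all of 2012 (a leap year) should have 366 × 48 = 17568 entries, which matches the `length=17568` shown for `time_index`; with the 100 meta rows this would give a shape of (17568, 100). The sketch below only reproduces that arithmetic with the standard library and does not open any file.

```python
from datetime import datetime, timedelta

# Half-hourly timesteps from 2012-01-01 00:00 up to (not including) 2013-01-01.
start = datetime(2012, 1, 1)
end = datetime(2013, 1, 1)
step = timedelta(minutes=30)
n_timesteps = (end - start) // step

# Number of sites, taken from the "[100 rows x 10 columns]" meta example.
n_sites = 100

# shape = (len(time_index), len(meta))
shape = (n_timesteps, n_sites)
```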

property shapes

Dictionary of all dataset shapes

Returns: shapes (dict)

property time_index

Resource DatetimeIndex

Returns: time_index (pandas.DatetimeIndex)

property units

Dictionary of all dataset units

Returns: units (dict)