rex.temporal_stats.temporal_stats.TemporalStats

class TemporalStats(res_h5, statistics='mean', res_cls=<class 'rex.resource.Resource'>, hsds=False)

Bases: object

Temporal Statistics from Resource Data

Parameters:

res_h5 (str) – Path to resource h5 file(s)
statistics (str | tuple | dict, optional) – Statistics to extract, either a key or tuple of keys in cls.STATS, or a dictionary of the form {‘stat_name’: {‘func’: <function>, ‘kwargs’: {<keyword arguments>}}}, by default ‘mean’
res_cls (Class, optional) – Resource class to use to access res_h5, by default Resource
hsds (bool, optional) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS, by default False
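The dictionary form of `statistics` can be sketched as follows. This is a minimal illustration of the documented shape only: the helper `apply_stats` is hypothetical, and the standard-library functions used here operate on a plain list. Which callables the class itself accepts, and what extra arguments it passes to them, is not specified in this reference.

```python
import statistics as st

# Each entry maps a statistic name to a callable and its keyword arguments,
# mirroring the documented {'stat_name': {'func': ..., 'kwargs': {...}}} shape.
custom_stats = {
    'mean': {'func': st.fmean, 'kwargs': {}},
    'quartiles': {'func': st.quantiles, 'kwargs': {'n': 4}},
}


def apply_stats(values, stats):
    """Hypothetical helper: call every configured function on the values."""
    return {
        name: spec['func'](values, **spec.get('kwargs', {}))
        for name, spec in stats.items()
    }


result = apply_stats([1.0, 2.0, 3.0, 4.0], custom_stats)
print(result['mean'])
```

Keeping the callable and its keyword arguments side by side lets one configuration describe several statistics without writing a wrapper function for each.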

Methods

`all`(res_h5, dataset[, sites, statistics, ...])	Compute annual, monthly, monthly-diurnal, and diurnal stats
`all_stats`(dataset[, sites, max_workers, ...])	Compute annual, monthly, monthly-diurnal, and diurnal stats
`compute_statistics`(dataset[, sites, ...])	Compute statistics
`diurnal`(res_h5, dataset[, sites, ...])	Compute diurnal stats
`diurnal_stats`(dataset[, sites, max_workers, ...])	Compute diurnal stats
`full_stats`(dataset[, sites, max_workers, ...])	Compute stats for entire temporal extent of file
`monthly`(res_h5, dataset[, sites, ...])	Compute monthly stats
`monthly_diurnal`(res_h5, dataset[, sites, ...])	Compute monthly-diurnal stats
`monthly_diurnal_stats`(dataset[, sites, ...])	Compute monthly-diurnal stats
`monthly_stats`(dataset[, sites, max_workers, ...])	Compute monthly stats
`run`(res_h5, dataset[, sites, statistics, ...])	Compute temporal stats, by default full temporal extent stats
`save_stats`(res_stats, out_path)	Save statistics to disk

Attributes

`STATS`
`lat_lon`	Resource (lat, lon) coordinates
`meta`	Resource meta-data table
`res_cls`	Resource class to use to access res_h5
`res_h5`	Path to resource h5 file(s)
`statistics`	Dictionary of statistic functions/kwargs to run
`time_index`	Resource Datetimes

property res_h5

Path to resource h5 file(s)

Returns: str

property statistics

Dictionary of statistic functions/kwargs to run

Returns: dict

property res_cls

Resource class to use to access res_h5

Returns: Class

property time_index

Resource Datetimes

Returns: pandas.DatetimeIndex

property meta

Resource meta-data table

Returns: pandas.DataFrame

property lat_lon

Resource (lat, lon) coordinates

Returns: pandas.DataFrame

compute_statistics(dataset, sites=None, diurnal=False, month=False, combinations=False, max_workers=None, chunks_per_worker=5, lat_lon_only=True, mask_zeros=False)

Compute statistics

Parameters:

dataset (str) – Dataset to extract stats for
sites (list | slice, optional) – Subset of sites to extract, by default None or all sites (sites is synonymous with gids aka spatial indices)
diurnal (bool, optional) – Extract diurnal stats, by default False
month (bool, optional) – Extract monthly stats, by default False
combinations (bool, optional) – Extract all combinations of temporal stats, by default False
max_workers (None | int, optional) – Number of workers to use, if 1 run in serial, if None use all available cores, by default None
chunks_per_worker (int, optional) – Number of chunks to extract on each worker, by default 5
lat_lon_only (bool, optional) – Only append lat, lon coordinates to stats, by default True
mask_zeros (bool, optional) – Flag to only calculate stats when all data is > 0 (useful for global horizontal irradiance), by default False

Returns:

res_stats (pandas.DataFrame) – DataFrame of desired statistics at desired time intervals
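The effect of `mask_zeros` can be illustrated with a small standalone sketch. It assumes one plausible reading of the flag, namely that non-positive values are excluded before the statistic is computed; the function `masked_mean` and the sample values are hypothetical and are not the library's implementation.

```python
import statistics as st


def masked_mean(values, mask_zeros=False):
    """Mean of values; optionally ignore entries that are not > 0 (assumed behavior)."""
    if mask_zeros:
        values = [v for v in values if v > 0]
    # With nothing left to average, report NaN rather than raising
    return st.fmean(values) if values else float('nan')


# Hypothetical irradiance samples: zeros are night-time readings
ghi = [0, 0, 100, 300, 200, 0]

print(masked_mean(ghi))                   # night-time zeros pull the mean down
print(masked_mean(ghi, mask_zeros=True))  # daytime values only
```

For a quantity such as global horizontal irradiance, which is zero every night, the masked statistic describes daylight conditions instead of the full day.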

full_stats(dataset, sites=None, max_workers=None, chunks_per_worker=5, lat_lon_only=True, mask_zeros=False)

Compute stats for entire temporal extent of file

Parameters:

dataset (str) – Dataset to extract stats for
sites (list | slice, optional) – Subset of sites to extract, by default None or all sites (sites is synonymous with gids aka spatial indices)
max_workers (None | int, optional) – Number of workers to use, if 1 run in serial, if None use all available cores, by default None
chunks_per_worker (int, optional) – Number of chunks to extract on each worker, by default 5
lat_lon_only (bool, optional) – Only append lat, lon coordinates to stats, by default True
mask_zeros (bool, optional) – Flag to only calculate stats when all data is > 0 (useful for global horizontal irradiance), by default False

Returns:

full_stats (pandas.DataFrame) – DataFrame of statistics for the entire temporal extent of file
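How `max_workers` and `chunks_per_worker` might interact can be sketched as below. This is an assumption-laden illustration, not the library's code: it assumes the sites are split into `max_workers * chunks_per_worker` pieces, uses threads for portability, and stores the data in a hypothetical dict keyed by site index. The helpers `split_sites` and `site_means` are invented for the example.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_sites(sites, n_chunks):
    """Split a list of site indices into at most n_chunks contiguous pieces."""
    n_chunks = max(1, min(n_chunks, len(sites)))
    size, extra = divmod(len(sites), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional site
        stop = start + size + (1 if i < extra else 0)
        chunks.append(sites[start:stop])
        start = stop
    return chunks


def site_means(data, sites, max_workers=None, chunks_per_worker=5):
    """Mean per site; `data` maps site index -> list of values (hypothetical layout)."""
    # None means "use all available cores", as in the documented parameter
    workers = max_workers or os.cpu_count() or 1

    def work(chunk):
        return {s: sum(data[s]) / len(data[s]) for s in chunk}

    if workers == 1:
        # Serial path: no executor at all
        return work(sites)

    out = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(work, split_sites(sites, workers * chunks_per_worker)):
            out.update(part)
    return out


data = {i: [i, i + 2] for i in range(7)}
serial = site_means(data, list(range(7)), max_workers=1)
parallel = site_means(data, list(range(7)), max_workers=2, chunks_per_worker=2)
```

Handing each worker several small chunks rather than one large one tends to balance the load when some sites take longer to read than others.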

monthly_stats(dataset, sites=None, max_workers=None, chunks_per_worker=5, lat_lon_only=True, mask_zeros=False)

Compute monthly stats

Parameters:

dataset (str) – Dataset to extract stats for
sites (list | slice, optional) – Subset of sites to extract, by default None or all sites (sites is synonymous with gids aka spatial indices)
max_workers (None | int, optional) – Number of workers to use, if 1 run in serial, if None use all available cores, by default None
chunks_per_worker (int, optional) – Number of chunks to extract on each worker, by default 5
lat_lon_only (bool, optional) – Only append lat, lon coordinates to stats, by default True
mask_zeros (bool, optional) – Flag to only calculate stats when all data is > 0 (useful for global horizontal irradiance), by default False

Returns:

monthly_stats (pandas.DataFrame) – DataFrame of monthly statistics

diurnal_stats(dataset, sites=None, max_workers=None, chunks_per_worker=5, lat_lon_only=True, mask_zeros=False)

Compute diurnal stats

Parameters:

dataset (str) – Dataset to extract stats for
sites (list | slice, optional) – Subset of sites to extract, by default None or all sites (sites is synonymous with gids aka spatial indices)
max_workers (None | int, optional) – Number of workers to use, if 1 run in serial, if None use all available cores, by default None
chunks_per_worker (int, optional) – Number of chunks to extract on each worker, by default 5
lat_lon_only (bool, optional) – Only append lat, lon coordinates to stats, by default True
mask_zeros (bool, optional) – Flag to only calculate stats when all data is > 0 (useful for global horizontal irradiance), by default False

Returns:

diurnal_stats (pandas.DataFrame) – DataFrame of diurnal statistics

monthly_diurnal_stats(dataset, sites=None, max_workers=None, chunks_per_worker=5, lat_lon_only=True, mask_zeros=False)

Compute monthly-diurnal stats

Parameters:

dataset (str) – Dataset to extract stats for
sites (list | slice, optional) – Subset of sites to extract, by default None or all sites (sites is synonymous with gids aka spatial indices)
max_workers (None | int, optional) – Number of workers to use, if 1 run in serial, if None use all available cores, by default None
chunks_per_worker (int, optional) – Number of chunks to extract on each worker, by default 5
lat_lon_only (bool, optional) – Only append lat, lon coordinates to stats, by default True
mask_zeros (bool, optional) – Flag to only calculate stats when all data is > 0 (useful for global horizontal irradiance), by default False

Returns:

monthly_diurnal_stats (pandas.DataFrame) – DataFrame of monthly-diurnal statistics

all_stats(dataset, sites=None, max_workers=None, chunks_per_worker=5, lat_lon_only=True, mask_zeros=False)

Compute annual, monthly, monthly-diurnal, and diurnal stats

Parameters:

dataset (str) – Dataset to extract stats for
sites (list | slice, optional) – Subset of sites to extract, by default None or all sites (sites is synonymous with gids aka spatial indices)
max_workers (None | int, optional) – Number of workers to use, if 1 run in serial, if None use all available cores, by default None
chunks_per_worker (int, optional) – Number of chunks to extract on each worker, by default 5
lat_lon_only (bool, optional) – Only append lat, lon coordinates to stats, by default True
mask_zeros (bool, optional) – Flag to only calculate stats when all data is > 0 (useful for global horizontal irradiance), by default False

Returns:

all_diurnal_stats (pandas.DataFrame) – DataFrame of temporal statistics
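The four temporal groupings named here (full extent, monthly, diurnal, and monthly-diurnal) can be pictured with a conceptual sketch. It is not the library's implementation: the helper `grouped_mean` and the synthetic hourly series are hypothetical, and the grouping keys (calendar month, hour of day, and the pair of both) are assumed from the method names.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def grouped_mean(times, values, key):
    """Average values within groups defined by key(timestamp)."""
    groups = defaultdict(list)
    for t, v in zip(times, values):
        groups[key(t)].append(v)
    return {k: sum(vs) / len(vs) for k, vs in sorted(groups.items())}


# Two days of hourly data spanning a month boundary; each value is its hour
start = datetime(2020, 1, 31)
times = [start + timedelta(hours=h) for h in range(48)]
values = [float(t.hour) for t in times]

full = sum(values) / len(values)                                  # one value
monthly = grouped_mean(times, values, lambda t: t.month)          # one per month
diurnal = grouped_mean(times, values, lambda t: t.hour)           # one per hour of day
monthly_diurnal = grouped_mean(times, values, lambda t: (t.month, t.hour))

print(full, monthly, len(diurnal), len(monthly_diurnal))
```

Only the grouping key changes between the four results, which is why they can be produced by the same routine.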

save_stats(res_stats, out_path)

Save statistics to disk

Parameters:

res_stats (pandas.DataFrame) – Table of statistics to save
out_path (str) – Directory, .csv, or .json path to save statistics to
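A possible instance-based workflow, using only the constructor and method signatures documented on this page, might look like the sketch below. The wrapper name `save_all_stats` is hypothetical, as are any file paths and dataset names passed to it. The import is done inside the function so the snippet can be loaded without rex installed.

```python
def save_all_stats(res_h5, dataset, out_path):
    """Compute all temporal stats for one dataset and write them to out_path."""
    # Lazy import: rex is only needed when the function is actually called
    from rex.temporal_stats.temporal_stats import TemporalStats

    ts = TemporalStats(res_h5, statistics='mean')
    # max_workers=1 requests a serial run, per the parameter description
    stats = ts.all_stats(dataset, max_workers=1)
    ts.save_stats(stats, out_path)
    return stats
```

A call such as `save_all_stats('/path/to/resource.h5', 'some_dataset', 'stats.csv')` would then need a real resource file and a dataset name that exists in it; both values here are placeholders.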

classmethod run(res_h5, dataset, sites=None, statistics='mean', diurnal=False, month=False, combinations=False, res_cls=<class 'rex.resource.Resource'>, hsds=False, max_workers=None, chunks_per_worker=5, lat_lon_only=True, mask_zeros=False, out_path=None)

Compute temporal stats, by default full temporal extent stats

Parameters:

res_h5 (str) – Path to resource h5 file(s)
dataset (str) – Dataset to extract stats for
sites (list | slice, optional) – Subset of sites to extract, by default None or all sites (sites is synonymous with gids aka spatial indices)
statistics (str | tuple | dict, optional) – Statistics to extract, either a key or tuple of keys in cls.STATS, or a dictionary of the form {‘stat_name’: {‘func’: <function>, ‘kwargs’: {<keyword arguments>}}}, by default ‘mean’
diurnal (bool, optional) – Extract diurnal stats, by default False
month (bool, optional) – Extract monthly stats, by default False
combinations (bool, optional) – Extract all combinations of temporal stats, by default False
res_cls (Class, optional) – Resource class to use to access res_h5, by default Resource
hsds (bool, optional) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS, by default False
max_workers (None | int, optional) – Number of workers to use, if 1 run in serial, if None use all available cores, by default None
chunks_per_worker (int, optional) – Number of chunks to extract on each worker, by default 5
lat_lon_only (bool, optional) – Only append lat, lon coordinates to stats, by default True
mask_zeros (bool, optional) – Flag to only calculate stats when all data is > 0 (useful for global horizontal irradiance), by default False
out_path (str, optional) – Directory, .csv, or .json path to save statistics to, by default None

Returns:

out_stats (pandas.DataFrame) – DataFrame of resource statistics
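The classmethod can be called without constructing an instance first. The sketch below wraps such a call using only keyword arguments from the signature above; the wrapper name `run_temporal_stats` and the site slice are hypothetical. Exactly which groupings result from setting both `month` and `diurnal` is not spelled out in this reference, so the returned DataFrame should be inspected before relying on its layout.

```python
def run_temporal_stats(res_h5, dataset, out_path=None):
    """Run TemporalStats.run for the first 100 sites with month and diurnal flags set."""
    # Lazy import: rex is only needed when the function is actually called
    from rex.temporal_stats.temporal_stats import TemporalStats

    return TemporalStats.run(
        res_h5,
        dataset,
        sites=slice(0, 100),   # sites accepts a list or a slice of gids
        statistics='mean',
        diurnal=True,
        month=True,
        max_workers=1,         # serial execution
        out_path=out_path,     # None means nothing is written to disk
    )
```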

classmethod monthly(res_h5, dataset, sites=None, statistics='mean', res_cls=<class 'rex.resource.Resource'>, hsds=False, max_workers=None, chunks_per_worker=5, lat_lon_only=True, mask_zeros=False, out_path=None)

Compute monthly stats

Parameters:

res_h5 (str) – Path to resource h5 file(s)
dataset (str) – Dataset to extract stats for
sites (list | slice, optional) – Subset of sites to extract, by default None or all sites (sites is synonymous with gids aka spatial indices)
statistics (str | tuple | dict, optional) – Statistics to extract, either a key or tuple of keys in cls.STATS, or a dictionary of the form {‘stat_name’: {‘func’: <function>, ‘kwargs’: {<keyword arguments>}}}, by default ‘mean’
res_cls (Class, optional) – Resource class to use to access res_h5, by default Resource
hsds (bool, optional) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS, by default False
max_workers (None | int, optional) – Number of workers to use, if 1 run in serial, if None use all available cores, by default None
chunks_per_worker (int, optional) – Number of chunks to extract on each worker, by default 5
lat_lon_only (bool, optional) – Only append lat, lon coordinates to stats, by default True
mask_zeros (bool, optional) – Flag to only calculate stats when all data is > 0 (useful for global horizontal irradiance), by default False
out_path (str, optional) – Directory, .csv, or .json path to save statistics to, by default None

Returns:

monthly_stats (pandas.DataFrame) – DataFrame of monthly statistics

classmethod diurnal(res_h5, dataset, sites=None, statistics='mean', res_cls=<class 'rex.resource.Resource'>, hsds=False, max_workers=None, chunks_per_worker=5, lat_lon_only=True, mask_zeros=False, out_path=None)

Compute diurnal stats

Parameters:

res_h5 (str) – Path to resource h5 file(s)
dataset (str) – Dataset to extract stats for
sites (list | slice, optional) – Subset of sites to extract, by default None or all sites (sites is synonymous with gids aka spatial indices)
statistics (str | tuple | dict, optional) – Statistics to extract, either a key or tuple of keys in cls.STATS, or a dictionary of the form {‘stat_name’: {‘func’: <function>, ‘kwargs’: {<keyword arguments>}}}, by default ‘mean’
res_cls (Class, optional) – Resource class to use to access res_h5, by default Resource
hsds (bool, optional) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS, by default False
max_workers (None | int, optional) – Number of workers to use, if 1 run in serial, if None use all available cores, by default None
chunks_per_worker (int, optional) – Number of chunks to extract on each worker, by default 5
lat_lon_only (bool, optional) – Only append lat, lon coordinates to stats, by default True
mask_zeros (bool, optional) – Flag to only calculate stats when all data is > 0 (useful for global horizontal irradiance), by default False
out_path (str, optional) – Directory, .csv, or .json path to save statistics to, by default None

Returns:

diurnal_stats (pandas.DataFrame) – DataFrame of diurnal statistics

classmethod monthly_diurnal(res_h5, dataset, sites=None, statistics='mean', res_cls=<class 'rex.resource.Resource'>, hsds=False, max_workers=None, chunks_per_worker=5, lat_lon_only=True, mask_zeros=False, out_path=None)

Compute monthly-diurnal stats

Parameters:

res_h5 (str) – Path to resource h5 file(s)
dataset (str) – Dataset to extract stats for
sites (list | slice, optional) – Subset of sites to extract, by default None or all sites (sites is synonymous with gids aka spatial indices)
statistics (str | tuple | dict, optional) – Statistics to extract, either a key or tuple of keys in cls.STATS, or a dictionary of the form {‘stat_name’: {‘func’: <function>, ‘kwargs’: {<keyword arguments>}}}, by default ‘mean’
res_cls (Class, optional) – Resource class to use to access res_h5, by default Resource
hsds (bool, optional) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS, by default False
max_workers (None | int, optional) – Number of workers to use, if 1 run in serial, if None use all available cores, by default None
chunks_per_worker (int, optional) – Number of chunks to extract on each worker, by default 5
lat_lon_only (bool, optional) – Only append lat, lon coordinates to stats, by default True
mask_zeros (bool, optional) – Flag to only calculate stats when all data is > 0 (useful for global horizontal irradiance), by default False
out_path (str, optional) – Directory, .csv, or .json path to save statistics to, by default None

Returns:

monthly_diurnal_stats (pandas.DataFrame) – DataFrame of monthly-diurnal statistics

classmethod all(res_h5, dataset, sites=None, statistics='mean', res_cls=<class 'rex.resource.Resource'>, hsds=False, max_workers=None, chunks_per_worker=5, lat_lon_only=True, mask_zeros=False, out_path=None)

Compute annual, monthly, monthly-diurnal, and diurnal stats

Parameters:

res_h5 (str) – Path to resource h5 file(s)
dataset (str) – Dataset to extract stats for
sites (list | slice, optional) – Subset of sites to extract, by default None or all sites (sites is synonymous with gids aka spatial indices)
statistics (str | tuple | dict, optional) – Statistics to extract, either a key or tuple of keys in cls.STATS, or a dictionary of the form {‘stat_name’: {‘func’: <function>, ‘kwargs’: {<keyword arguments>}}}, by default ‘mean’
res_cls (Class, optional) – Resource class to use to access res_h5, by default Resource
hsds (bool, optional) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS, by default False
max_workers (None | int, optional) – Number of workers to use, if 1 run in serial, if None use all available cores, by default None
chunks_per_worker (int, optional) – Number of chunks to extract on each worker, by default 5
lat_lon_only (bool, optional) – Only append lat, lon coordinates to stats, by default True
mask_zeros (bool, optional) – Flag to only calculate stats when all data is > 0 (useful for global horizontal irradiance), by default False
out_path (str, optional) – Directory, .csv, or .json path to save statistics to, by default None

Returns:

all_stats (pandas.DataFrame) – DataFrame of temporal statistics