nsrdb.aggregation.aggregation.Aggregation
- class Aggregation(var, data_fpath, nn, w, final_ti)[source]
Bases: object
Framework for performing spatiotemporal aggregation.
- Parameters:
var (str) – Variable (dataset) name being aggregated.
data_fpath (str) – Filepath to h5 file containing source var data.
nn (np.ndarray) – 1D array of site (column) indices in data_fpath to aggregate.
w (int) – Window size for temporal aggregation.
final_ti (pd.DatetimeIndex) – Final datetime index (used to ensure the aggregated profile has the correct length).
Methods
cloud_property(var, data_fpath, nn, w, ...) – Run cloud property aggregation, returning the mean cloud property only for timesteps that match the most common (mode) cloud type.
cloud_property_avg(cprop_source, ...) – Run cloud property aggregation based on output cloud type.
cloud_type(var, data_fpath, nn, w, final_ti) – Run cloud type aggregation, returning the most common cloud type.
cloud_type_mode(data, w) – Get the mode of a 2D cloud type array using a rolling time window.
dhi(var, i, fout) – Calculate the aggregated DHI from an aggregated output file.
fill_flag(var, data_fpath, nn, w, final_ti) – Run fill flag aggregation, returning the percentage of timesteps that were filled.
format_out_arr(arr) – Format the output array (round and flatten).
mean(var, data_fpath, nn, w, final_ti) – Run agg using a spatial average and temporal moving window average.
point(var, data_fpath, nn, w, final_ti) – Run agg by selecting just the closest site and timestep.
reduce_timeseries(arr) – Reduce a high res timeseries to a coarse timeseries.
spatial_avg(data) – Average the source data across the spatial extent.
spatial_sum(data) – Sum the source data across the spatial extent.
time_avg(inp) – Calculate the rolling time average for an input array or df.
time_sum(inp) – Calculate the rolling sum for an input array or df.
Attributes
data – Get the timeseries data for the specified var and sites.
source_time_index – Get the time index of the source data.
- property source_time_index
Get the time index of the source data.
- Returns:
time_index (pd.DatetimeIndex) – DatetimeIndex of the source dataset.
- property data
Get the timeseries data for the specified var and sites.
- Returns:
_data (np.ndarray) – Unscaled float data array with shape (ti, nn) where ti is the native time index length and nn is the number of neighbors in the self.nn attr.
- static spatial_avg(data)[source]
Average the source data across the spatial extent.
- Returns:
data (np.ndarray) – Unscaled float data array with shape (ti, ), where ti is the native time index length. The data is averaged across all nn neighbors.
- static spatial_sum(data)[source]
Sum the source data across the spatial extent.
- Returns:
data (np.ndarray) – Unscaled float data array with shape (ti, ), where ti is the native time index length. The data is summed across all nn neighbors.
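The two spatial reductions above collapse a (ti, nn) array to shape (ti, ). The following is a minimal sketch of that idea in plain NumPy, not the library's own code; the function names `spatial_avg_sketch` and `spatial_sum_sketch` and the sample array are hypothetical.

```python
import numpy as np

# Hypothetical source data: 4 timesteps (rows) x 3 neighbor sites (columns).
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0],
                 [7.0, 8.0, 9.0],
                 [0.0, 0.0, 3.0]])


def spatial_avg_sketch(data):
    """Average across the site axis, leaving one value per timestep."""
    return data.mean(axis=1)


def spatial_sum_sketch(data):
    """Sum across the site axis, leaving one value per timestep."""
    return data.sum(axis=1)


avg = spatial_avg_sketch(data)    # shape (4,)
total = spatial_sum_sketch(data)  # shape (4,)
```

Reducing along axis 1 keeps the time axis intact, which is what lets a temporal window be applied afterwards.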
- time_avg(inp)[source]
Calculate the rolling time average for an input array or df.
- Parameters:
inp (np.ndarray | pd.DataFrame) – Input array/df with data to average.
- Returns:
out (np.ndarray | pd.DataFrame) – Array or dataframe with the same size as the input, where each value is a moving average.
- time_sum(inp)[source]
Calculate the rolling sum for an input array or df.
- Parameters:
inp (np.ndarray | pd.DataFrame) – Input array/df with data to sum.
- Returns:
out (np.ndarray | pd.DataFrame) – Array or dataframe with the same size as the input, where each value is a moving sum.
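A rolling average or sum that preserves the input size can be sketched with pandas as below. This is an illustration only: the helper `rolling_sketch`, the centered window, and the `min_periods=1` edge handling are assumptions, since this page does not state how the window is aligned or how the edges are treated.

```python
import numpy as np
import pandas as pd


def rolling_sketch(inp, w, how="mean"):
    """Rolling mean or sum with output the same size as the input.

    Assumes a centered window of w timesteps; edge windows are shortened.
    """
    df = pd.DataFrame(inp)
    roller = df.rolling(w, center=True, min_periods=1)
    out = roller.mean() if how == "mean" else roller.sum()
    if isinstance(inp, pd.DataFrame):
        return out
    # Restore the original array shape (pd.DataFrame makes 1D input 2D).
    return out.to_numpy().reshape(np.shape(inp))


arr = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
moving_avg = rolling_sketch(arr, w=3, how="mean")
moving_sum = rolling_sketch(arr, w=3, how="sum")
```

With these assumptions the first and last values are computed from two timesteps instead of three.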
- static cloud_type_mode(data, w)[source]
Get the mode of a 2D cloud type array using a rolling time window.
- Parameters:
data (np.ndarray) – 2D array of integer cloud types.
w (int) – Temporal window over which to take the mode.
- Returns:
data (np.ndarray) – Mode of cloud type.
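One way to picture a rolling mode is sketched below. It rests on assumptions not confirmed by this page: all sites inside the window are pooled into one count, the window is centered, cloud types are non-negative integers, and ties go to the smallest cloud type. The name `rolling_mode_sketch` is hypothetical.

```python
import numpy as np


def rolling_mode_sketch(data, w):
    """Most common cloud type per timestep over a centered window.

    data is a 2D (time, sites) array of non-negative integers.
    """
    n_t = data.shape[0]
    half = w // 2
    out = np.empty(n_t, dtype=data.dtype)
    for t in range(n_t):
        lo, hi = max(0, t - half), min(n_t, t + half + 1)
        # Count occurrences of every cloud type in the window.
        counts = np.bincount(data[lo:hi].ravel())
        # argmax returns the first maximum, so ties go to the lowest type.
        out[t] = counts.argmax()
    return out


ctype = np.array([[0, 0],
                  [0, 3],
                  [3, 3],
                  [3, 3],
                  [7, 3]])
mode = rolling_mode_sketch(ctype, w=3)
```

The explicit loop is kept for readability; a mode is not a linear operation, so it cannot be expressed as a simple rolling mean.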
- reduce_timeseries(arr)[source]
Reduce a high res timeseries to a coarse timeseries.
- Parameters:
arr (np.ndarray) – 2D numpy array
- Returns:
arr (np.ndarray) – Shortened 2D numpy array with length equal to that of final_ti.
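Reducing a high-resolution series to the length of final_ti can be sketched as a strided selection. This assumes the source length is an integer multiple of the final length and that the first timesteps align; the page does not confirm how the method selects timesteps, and `reduce_sketch` is a hypothetical name.

```python
import numpy as np
import pandas as pd

# Hypothetical indices: 5-minute source data reduced to 15-minute output.
source_ti = pd.date_range("2020-01-01", periods=12, freq="5min")
final_ti = pd.date_range("2020-01-01", periods=4, freq="15min")

# Hypothetical 2D array already smoothed in time: (time, sites).
arr = np.arange(24, dtype=float).reshape(12, 2)


def reduce_sketch(arr, final_ti):
    """Keep every step-th row so the output length matches final_ti."""
    step = len(arr) // len(final_ti)
    return arr[::step][:len(final_ti)]


reduced = reduce_sketch(arr, final_ti)
```

Because the rolling window is applied before this step, each retained row already summarizes its neighboring timesteps.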
- classmethod point(var, data_fpath, nn, w, final_ti)[source]
Run agg by selecting just the closest site and timestep.
- Parameters:
var (str) – Variable (dataset) name being aggregated.
data_fpath (str) – Filepath to h5 file containing source var data.
nn (np.ndarray) – 1D array of site (column) indices in data_fpath to aggregate.
w (int) – Window size for temporal aggregation.
final_ti (pd.DatetimeIndex) – Final datetime index (used to ensure the aggregated profile has the correct length).
- Returns:
data (np.ndarray) – Array of shape (n, ) containing unscaled, rounded data from the nn sites, with a time series matching final_ti.
- classmethod dhi(var, i, fout)[source]
Calculate the aggregated DHI from an aggregated output file.
- Parameters:
var (str) – Variable name, either “dhi” or “clearsky_dhi”.
i (int) – Site index in fout.
fout (str) – Filepath to the output file containing aggregated GHI, DNI, and SZA to calculate aggregated DHI.
- Returns:
dhi (np.ndarray) – DHI calculated from vars in fout.
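DHI can be derived from GHI, DNI, and SZA through the standard closure relation GHI = DNI · cos(SZA) + DHI. The sketch below applies that relation to in-memory arrays; it does not read fout, and whether the library clips negative values or handles night-time differently is not stated on this page. The name `dhi_sketch` and the sample values are hypothetical.

```python
import numpy as np

# Hypothetical aggregated inputs for three timesteps.
ghi = np.array([0.0, 300.0, 800.0])   # W/m2
dni = np.array([0.0, 400.0, 900.0])   # W/m2
sza = np.array([95.0, 60.0, 30.0])    # solar zenith angle, degrees


def dhi_sketch(ghi, dni, sza_deg):
    """Solve GHI = DNI * cos(SZA) + DHI for DHI."""
    return ghi - dni * np.cos(np.radians(sza_deg))


dhi_vals = dhi_sketch(ghi, dni, sza)
```

Computing DHI from already aggregated GHI and DNI keeps the three irradiance components consistent with each other in the output file.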
- classmethod fill_flag(var, data_fpath, nn, w, final_ti)[source]
Run fill flag aggregation, returning the percentage of timesteps that were filled.
- Parameters:
var (str) – Variable (dataset) name being aggregated (fill_flag).
data_fpath (str) – Filepath to h5 file containing source var data.
nn (np.ndarray) – 1D array of site (column) indices in data_fpath to aggregate.
w (int) – Window size for temporal aggregation.
final_ti (pd.DatetimeIndex) – Final datetime index (used to ensure the aggregated profile has the correct length).
- Returns:
data (np.ndarray) – Array of shape (n, ) containing unscaled, rounded data from the nn sites, with a time series matching final_ti.
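The "percentage of timesteps that were filled" can be illustrated as follows. The sketch assumes that a flag value of 0 means "not filled" and any other value means "filled", and that the percentage pools all sites in a centered time window; neither assumption is confirmed by this page, and `fill_percentage_sketch` is a hypothetical name.

```python
import numpy as np
import pandas as pd


def fill_percentage_sketch(flags, w):
    """Percent of filled values per timestep over sites and a time window."""
    # Fraction of sites filled at each timestep (assumes 0 == not filled).
    filled = (flags != 0).mean(axis=1)
    # Average that fraction over a centered window of w timesteps.
    frac = pd.Series(filled).rolling(w, center=True, min_periods=1).mean()
    return 100.0 * frac.to_numpy()


# Hypothetical flags: 4 timesteps x 2 sites.
flags = np.array([[0, 0],
                  [1, 0],
                  [2, 2],
                  [0, 0]])
pct = fill_percentage_sketch(flags, w=3)
```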
- classmethod cloud_type(var, data_fpath, nn, w, final_ti)[source]
Run cloud type aggregation, returning the most common cloud type.
- Parameters:
var (str) – Variable (dataset) name being aggregated (cloud_type).
data_fpath (str) – Filepath to h5 file containing source var data.
nn (np.ndarray) – 1D array of site (column) indices in data_fpath to aggregate.
w (int) – Window size for temporal aggregation.
final_ti (pd.DatetimeIndex) – Final datetime index (used to ensure the aggregated profile has the correct length).
- Returns:
data (np.ndarray) – Array of shape (n, ) containing unscaled, rounded data from the nn sites, with a time series matching final_ti.
- static cloud_property_avg(cprop_source, ctype_source, ctype_out_full, w)[source]
Run cloud property aggregation based on output cloud type.
- Parameters:
cprop_source (np.ndarray) – Source (full resolution) cloud property data.
ctype_source (np.ndarray) – Source (full resolution) cloud type data.
ctype_out_full (np.ndarray) – Output (reduced resolution) cloud type data, interpolated to the same length as the source resolution.
w (int) – Window size.
- Returns:
cprop_out (np.ndarray) – Average cloud property data in the window surrounding each timestep masked by cloud type output == cloud type source. Shape is same as ctype_out_full.
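The masking described above (average only where the source cloud type equals the output cloud type) can be sketched on 1D arrays as below. The 1D shapes, the centered window, the `min_periods=1` edge handling, and the name `masked_cprop_avg_sketch` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def masked_cprop_avg_sketch(cprop_source, ctype_source, ctype_out_full, w):
    """Windowed mean of a cloud property, ignoring mismatched cloud types."""
    # Drop (set to NaN) timesteps whose source type differs from the output.
    masked = pd.Series(cprop_source).where(ctype_source == ctype_out_full)
    # NaN values are skipped by the rolling mean.
    return masked.rolling(w, center=True, min_periods=1).mean().to_numpy()


cprop = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
ctype_src = np.array([3, 3, 0, 3, 3])   # timestep 2 is clear in the source
ctype_out = np.array([3, 3, 3, 3, 3])   # aggregated output says cloud type 3
cprop_out = masked_cprop_avg_sketch(cprop, ctype_src, ctype_out, w=3)
```

Masking first prevents a property value from a mismatched cloud type (30.0 here) from contaminating the average reported for the output cloud type.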
- classmethod cloud_property(var, data_fpath, nn, w, final_ti, gid, fout)[source]
Run cloud property aggregation, returning the mean cloud property only for timesteps that match the most common (mode) cloud type.
- Parameters:
var (str) – Variable (dataset) name being aggregated (a cloud property dataset).
data_fpath (str) – Filepath to h5 file containing source var data.
nn (np.ndarray) – 1D array of site (column) indices in data_fpath to aggregate.
w (int) – Window size for temporal aggregation.
final_ti (pd.DatetimeIndex) – Final datetime index (used to ensure the aggregated profile has the correct length).
gid (int) – Site index in fout.
fout (str) – Filepath to the output file containing aggregated cloud type.
- Returns:
data (np.ndarray) – Average cloud property data in the window surrounding each timestep, masked by cloud type output == cloud type source. Array of shape (n, ) containing unscaled, rounded data from the nn sites, with a time series matching final_ti.
- classmethod mean(var, data_fpath, nn, w, final_ti)[source]
Run agg using a spatial average and temporal moving window average.
- Parameters:
var (str) – Variable (dataset) name being aggregated.
data_fpath (str) – Filepath to h5 file containing source var data.
nn (np.ndarray) – 1D array of site (column) indices in data_fpath to aggregate.
w (int) – Window size for temporal aggregation.
final_ti (pd.DatetimeIndex) – Final datetime index (used to ensure the aggregated profile has the correct length).
- Returns:
data (np.ndarray) – Array of shape (n, ) containing unscaled, rounded data from the nn sites, with a time series matching final_ti.
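The combination of a spatial average and a temporal moving window average can be sketched end to end as below. This works on an in-memory array rather than an h5 file, and it assumes a centered window, shortened edge windows, and strided selection to reach the final length; `mean_agg_sketch` and `n_out` are hypothetical names.

```python
import numpy as np
import pandas as pd


def mean_agg_sketch(data, w, n_out):
    """Spatial mean, then rolling time mean, then reduce to n_out steps."""
    # 1. Average across sites: (ti, nn) -> (ti,)
    spatial = data.mean(axis=1)
    # 2. Moving average over w timesteps (centered, edges shortened).
    smoothed = (pd.Series(spatial)
                .rolling(w, center=True, min_periods=1)
                .mean()
                .to_numpy())
    # 3. Keep every step-th value so the length matches the final index.
    step = len(smoothed) // n_out
    return smoothed[::step][:n_out]


# Hypothetical source data: 6 timesteps x 2 sites.
data = np.array([[1.0, 3.0],
                 [2.0, 4.0],
                 [3.0, 5.0],
                 [4.0, 6.0],
                 [5.0, 7.0],
                 [6.0, 8.0]])
agg = mean_agg_sketch(data, w=3, n_out=3)
```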