sup3r.preprocessing.data_handlers.nc_cc.DataHandlerNCforCC
- class DataHandlerNCforCC(file_paths, *, features='all', nsrdb_source_fp=None, nsrdb_agg=1, nsrdb_smoothing=0, res_kwargs=None, chunks='auto', target=None, shape=None, time_slice=slice(None, None, None), threshold=None, time_roll=0, time_shift=None, hr_spatial_coarsen=1, nan_method_kwargs=None, interp_kwargs=None, cache_kwargs=None, **kwargs)
Bases: FactoryDataHandler
Extended NETCDF data handler. This implements a rasterizer hook to add “clearsky_ghi” to the rasterized data if “clearsky_ghi” is requested.
- Parameters:
file_paths (str | list | pathlib.Path) – file_paths input to LoaderClass
features (list | str) – Features to load and / or derive. If ‘all’ then all available raw features will be loaded. Specify explicit feature names for derivations.
nsrdb_source_fp (str | None) – Optional NSRDB source h5 file to retrieve clearsky_ghi from to calculate CC clearsky_ratio along with rsds (ghi) from the CC netcdf file.
nsrdb_agg (int) – Optional number of NSRDB source pixels to aggregate clearsky_ghi from to a single climate change netcdf pixel. This can be used if the CC.nc data is at a much coarser resolution than the source nsrdb data.
nsrdb_smoothing (float) – Optional gaussian filter smoothing factor to smooth out clearsky_ghi from high-resolution nsrdb source data. This is typically done because spatially aggregated nsrdb data is still usually rougher than CC irradiance data.
kwargs (dict) – Dictionary of additional keyword args for Rasterizer, used specifically for rasterizing flattened data.
res_kwargs (dict) – Additional keyword arguments passed through to the BaseLoader. BaseLoader is usually xr.open_mfdataset for NETCDF files and MultiFileResourceX for H5 files.
chunks (dict | str) – Dictionary of chunk sizes to pass through to dask.array.from_array() or xr.Dataset().chunk(). Will be converted to a tuple when used in from_array(). These are the methods for H5 and NETCDF data, respectively. This argument can be “auto” or None in addition to a dictionary. None will not do any chunking and load data into memory as np.array.
target (tuple) – (lat, lon) lower left corner of raster. Either need target+shape or raster_file.
shape (tuple) – (rows, cols) grid size. Either need target+shape or raster_file.
time_slice (slice | list) – Slice specifying extent and step of temporal extraction. e.g. slice(start, stop, step). If equal to slice(None, None, 1) the full time dimension is selected. Can also be a list [start, stop, step].
threshold (float) – Nearest neighbor euclidean distance threshold. If the coordinates are more than this value away from the target lat/lon, an error is raised.
time_roll (int) – Number of steps to roll along the time axis. Passed to xr.Dataset.roll()
time_shift (int | None) – Number of minutes to shift time axis. This can be used, for example, to shift the time index for daily data so that the time stamp for a given day starts at the zeroth minute instead of at noon, as is the case for most GCM data.
hr_spatial_coarsen (int) – Spatial coarsening factor. Passed to xr.Dataset.coarsen()
nan_method_kwargs (str | dict | None) – Keyword arguments for nan handling. If ‘mask’, time steps with nans will be dropped. Otherwise this should be a dict of kwargs which will be passed to sup3r.preprocessing.accessor.Sup3rX.interpolate_na().
interp_kwargs (dict | None) – Dictionary of kwargs for level interpolation. Can include “method” and “run_level_check” keys. Method specifies how to perform height interpolation, e.g. deriving u_20m from u_10m and u_100m. Options are “linear” and “log”. See sup3r.preprocessing.derivers.Deriver.do_level_interpolation().
cache_kwargs (dict | None) – Dictionary with kwargs for caching wrangled data. This should at minimum include a cache_pattern key, value pair. This pattern must have a {feature} format key and either an h5 or nc file extension, based on desired output type. See Cacher for a description of more arguments.
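As a rough illustration of how these arguments fit together, the sketch below assembles a keyword dictionary for the handler and checks the cache_pattern rule described above. The file paths, the numeric values, and the helper check_cache_pattern are hypothetical and not part of sup3r. The constructor call is left commented out because it needs real input files.

```python
import os


def check_cache_pattern(pattern):
    """Hypothetical helper: check the cache_pattern rule stated in the docs.

    The pattern needs a {feature} format key and an h5 or nc extension.
    """
    has_key = "{feature}" in pattern
    ext = os.path.splitext(pattern)[1]
    return has_key and ext in (".h5", ".nc")


# All paths and values below are placeholders for illustration only.
handler_kwargs = {
    "features": ["clearsky_ratio"],
    "nsrdb_source_fp": "/path/to/nsrdb.h5",
    "nsrdb_agg": 4,
    "target": (39.0, -105.0),
    "shape": (10, 10),
    "time_slice": slice(None, None, 1),
    "cache_kwargs": {"cache_pattern": "/path/to/cache/{feature}.nc"},
}

pattern_ok = check_cache_pattern(handler_kwargs["cache_kwargs"]["cache_pattern"])

# from sup3r.preprocessing.data_handlers.nc_cc import DataHandlerNCforCC
# handler = DataHandlerNCforCC("/path/to/cc_data.nc", **handler_kwargs)
```

Keeping the arguments in a plain dictionary makes it easy to validate them before any files are opened.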
Methods
check_registry(feature) – Get compute method from the registry if available.
derive(feature) – Routine to derive requested features.
do_level_interpolation(feature[, interp_kwargs]) – Interpolate over height or pressure to derive the given feature.
get_clearsky_ghi() – Get clearsky ghi from an exogenous NSRDB source h5 file at the target CC meta data and time index.
get_inputs(feature) – Get inputs for the given feature and inputs for those inputs.
get_multi_level_data(feature) – Get data stored in multi-level arrays, like u stored on pressure levels.
get_single_level_data(feature) – When doing level interpolation we should include the single level data available.
get_time_slice(ti_nsrdb) – Get nsrdb data time slice consistent with self.time_index.
has_interp_variables(feature) – Check if the given feature can be interpolated from values at nearby heights or from pressure level data.
map_new_name(feature, pattern) – If the search for a derivation method first finds an alternative name for the feature we want to derive, by matching a wildcard pattern, we need to replace the wildcard with the specific height or pressure we want and continue the search for a derivation method with this new name.
no_overlap(feature) – Check if any of the nested inputs for 'feature' contain 'feature'.
post_init_log([args_dict]) – Log additional arguments after initialization.
run_input_checks() – Run checks on the files provided for extracting clearsky_ghi.
run_wrap_checks(cs_ghi) – Run check on rasterized data from clearsky_ghi source.
wrap(data) – Return a Sup3rDataset object or tuple of such.
Attributes
- run_input_checks()
Run checks on the files provided for extracting clearsky_ghi. Make sure the loaded data is daily data and the step size is one day.
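The daily-step requirement can be pictured with a small check on a list of time stamps. This is only a sketch using the standard library, not the check sup3r performs internally. The function name is_daily and the sample time indices are made up for illustration.

```python
from datetime import datetime, timedelta


def is_daily(time_index):
    """Return True if consecutive time stamps are exactly one day apart."""
    steps = [b - a for a, b in zip(time_index, time_index[1:])]
    return len(steps) > 0 and all(s == timedelta(days=1) for s in steps)


# Daily data stamped at noon, as is common for GCM output.
daily = [datetime(2020, 1, 1, 12) + timedelta(days=i) for i in range(5)]
# Hourly data, which would fail the daily-step check.
hourly = [datetime(2020, 1, 1) + timedelta(hours=i) for i in range(5)]

daily_ok = is_daily(daily)
hourly_ok = is_daily(hourly)
```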
- get_clearsky_ghi()
Get clearsky ghi from an exogenous NSRDB source h5 file at the target CC meta data and time index.
TODO: Replace some of this with call to Regridder? Perform daily means with self.loader.coarsen?
- Returns:
cs_ghi (Union[np.ndarray, da.core.Array]) – Clearsky ghi (W/m2) from the nsrdb_source_fp h5 source file. Data shape is (lat, lon, time) where time is daily average values.
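To make the daily averaging concrete, the sketch below averages a sub-daily clearsky GHI series for one pixel into daily values and then divides daily rsds by it. All numbers are invented, the helper daily_mean is hypothetical, and the plain ratio rsds / clearsky_ghi is an assumption about how a clearsky ratio would be formed from these two inputs rather than a statement of the library's exact computation.

```python
def daily_mean(values, steps_per_day):
    """Average a 1D series over consecutive blocks of one day each."""
    n_days = len(values) // steps_per_day
    return [
        sum(values[d * steps_per_day:(d + 1) * steps_per_day]) / steps_per_day
        for d in range(n_days)
    ]


# Two days of made-up clearsky GHI (W/m2) at 6-hour resolution for one pixel.
cs_ghi_6h = [0.0, 400.0, 800.0, 0.0, 0.0, 500.0, 700.0, 0.0]
cs_ghi_daily = daily_mean(cs_ghi_6h, steps_per_day=4)

# Made-up daily rsds (W/m2) for the same pixel from the CC file.
rsds_daily = [150.0, 240.0]

# Assumed definition: ratio of daily rsds to daily clearsky GHI.
clearsky_ratio = [r / c for r, c in zip(rsds_daily, cs_ghi_daily)]
```

In the real handler the data has shape (lat, lon, time), so the same averaging applies along the time axis of an array rather than a list.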
- check_registry(feature) → ndarray | Array | str | None
Get compute method from the registry if available. Will check for a pattern match between the feature and the feature registry, e.g. u_100m matches a feature registry entry of u_(.*)m.
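The pattern match mentioned above can be sketched with the standard re module. The registry contents and the lookup function below are hypothetical stand-ins. The actual registry maps patterns to compute methods or alternative names, and its matching rules may differ.

```python
import re

# Hypothetical registry: keys are regex patterns, values stand in for
# compute methods.
registry = {
    "u_(.*)m": "compute_u",
    "windspeed_(.*)m": "compute_windspeed",
}


def lookup(feature, registry):
    """Return the first registry value whose pattern matches the feature."""
    for pattern, method in registry.items():
        if re.fullmatch(pattern, feature.lower()):
            return method
    return None


method_for_u = lookup("u_100m", registry)
method_for_missing = lookup("temperature_2m", registry)
```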
- property data
Return underlying data.
- derive(feature) → ndarray | Array
Routine to derive requested features. Employs a little recursion to locate differently named features with a name map in the feature registry, e.g. if FEATURE_REGISTRY contains a key, value pair like “windspeed”: “wind_speed” then requesting “windspeed” will ultimately return a compute method (or fetch from raw data) for “wind_speed”.
Note
Features are all saved as lower case names and __contains__ checks will use feature.lower()
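The alias recursion described above can be sketched as follows. This is a simplified stand-in, not the library's implementation: the registry, the raw data dictionary, and the lambda used as a compute method are all invented for illustration.

```python
def derive(feature, registry, raw_data):
    """Follow string aliases in the registry until data or a method is found."""
    feature = feature.lower()
    if feature in raw_data:
        # Feature is available directly in the raw data.
        return raw_data[feature]
    entry = registry.get(feature)
    if isinstance(entry, str):
        # Entry is an alternative name: recurse with the new name.
        return derive(entry, registry, raw_data)
    if callable(entry):
        # Entry is a compute method: apply it to the raw data.
        return entry(raw_data)
    raise KeyError(f"Cannot derive {feature}")


raw_data = {"u_10m": 3.0, "v_10m": 4.0}
registry = {
    "windspeed": "wind_speed",
    "wind_speed": lambda d: (d["u_10m"] ** 2 + d["v_10m"] ** 2) ** 0.5,
}

result = derive("WindSpeed", registry, raw_data)
```

Requesting “WindSpeed” is lowered to “windspeed”, mapped to “wind_speed”, and only then resolved to a computation.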
- do_level_interpolation(feature, interp_kwargs=None) → DataArray
Interpolate over height or pressure to derive the given feature.
- get_inputs(feature)
Get inputs for the given feature and inputs for those inputs.
- get_multi_level_data(feature)
Get data stored in multi-level arrays, like u stored on pressure levels.
- get_single_level_data(feature)
When doing level interpolation we should include the single level data available. e.g. If we have u_100m already and want to interpolate u_40m from multi-level data U we should add u_100m at height 100m before doing interpolation, since 100 could be a closer level to 40m than those available in U.
- has_interp_variables(feature)
Check if the given feature can be interpolated from values at nearby heights or from pressure level data. e.g. If u_10m and u_50m exist then u_30m can be interpolated from these. If a pressure level array u is available this can also be used, in conjunction with height data.
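For the u_30m example, a two-point interpolation between bracketing heights might look like the sketch below. The function interp_level and the wind values are hypothetical. The “log” branch here simply interpolates linearly in log(height), which is one common formulation and may not match what the library's “log” method does.

```python
import math


def interp_level(z, z_lo, u_lo, z_hi, u_hi, method="linear"):
    """Interpolate a value at height z from values at two bracketing heights."""
    if method == "log":
        # Interpolate linearly in log(height) rather than in height.
        z, z_lo, z_hi = math.log(z), math.log(z_lo), math.log(z_hi)
    elif method != "linear":
        raise ValueError(f"Unknown method: {method}")
    weight = (z - z_lo) / (z_hi - z_lo)
    return u_lo + weight * (u_hi - u_lo)


# Made-up wind components at 10 m and 50 m, interpolated to 30 m.
u_30m_linear = interp_level(30, 10, 4.0, 50, 8.0)
u_30m_log = interp_level(30, 10, 4.0, 50, 8.0, method="log")
```

The log variant weights the upper level more heavily at 30 m, which reflects how wind speed typically increases quickly near the surface.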
- map_new_name(feature, pattern)
If the search for a derivation method first finds an alternative name for the feature we want to derive, by matching a wildcard pattern, we need to replace the wildcard with the specific height or pressure we want and continue the search for a derivation method with this new name.
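The wildcard substitution can be sketched with a small string operation. The function fill_wildcard below is a hypothetical illustration and not the method's actual code. It assumes level suffixes of the form _<number>m or _<number>pa and a pattern that contains the literal wildcard (.*).

```python
import re


def fill_wildcard(feature, pattern):
    """Fill the wildcard in `pattern` with the level found in `feature`."""
    match = re.search(r"_(\d+)(m|pa)$", feature.lower())
    if match is None:
        # No height or pressure suffix: use the pattern as the new name.
        return pattern
    level = match.group(1)
    return pattern.replace("(.*)", level)


new_height_name = fill_wildcard("windspeed_100m", "ws_(.*)m")
new_pressure_name = fill_wildcard("u_500pa", "U_(.*)pa")
```

The search for a derivation method would then continue with the returned name in place of the original feature.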
- no_overlap(feature)
Check if any of the nested inputs for ‘feature’ contain ‘feature’
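The nested-input lookup and the overlap check can be pictured together. In this sketch the inputs map and the functions nested_inputs and has_no_overlap are hypothetical. They assume each feature lists its direct inputs by name, and that the check should return True when a feature does not depend on itself.

```python
def nested_inputs(feature, inputs_map):
    """Collect direct inputs of `feature` and, recursively, their inputs."""
    found = []
    stack = list(inputs_map.get(feature, []))
    while stack:
        name = stack.pop()
        if name in found:
            # Already visited, which also guards against circular maps.
            continue
        found.append(name)
        stack.extend(inputs_map.get(name, []))
    return found


def has_no_overlap(feature, inputs_map):
    """Return True if `feature` never appears among its own nested inputs."""
    return feature not in nested_inputs(feature, inputs_map)


# Made-up dependency map; "u" depends on "windspeed" to create a cycle.
inputs_map = {
    "clearsky_ratio": ["rsds", "clearsky_ghi"],
    "windspeed": ["u", "v"],
    "u": ["windspeed", "winddirection"],
}

ratio_ok = has_no_overlap("clearsky_ratio", inputs_map)
windspeed_ok = has_no_overlap("windspeed", inputs_map)
```

A check like this prevents a derivation from recursing forever when two features are each defined in terms of the other.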
- post_init_log(args_dict=None)
Log additional arguments after initialization.
- property shape
Get shape of underlying data.
- wrap(data)
Return a Sup3rDataset object or tuple of such. This is a tuple when the .data attribute belongs to a Collection object like BatchHandler. Otherwise this is a Sup3rDataset object, which is either a wrapped 2-tuple or 1-tuple (e.g. len(data) == 2 or len(data) == 1). This is a 2-tuple when .data belongs to a dual container object like DualSampler and a 1-tuple otherwise.