sup3r.preprocessing.data_handlers.base.DataHandlerH5WindCC

sup3r.preprocessing.data_handlers.base.DataHandlerH5WindCC#

class DataHandlerH5WindCC(file_paths, *, features='all', load_features='all', res_kwargs=None, chunks='auto', target=None, shape=None, time_slice=slice(None, None, None), threshold=None, time_roll=0, time_shift=None, hr_spatial_coarsen=1, nan_method_kwargs=None, feature_aliases=None, BaseLoader=None, FeatureRegistry=None, interp_kwargs=None, cache_kwargs=None, **kwargs)[source]#

Bases: DailyDataHandler

Extended DailyDataHandler specifically for handling H5 data for WindCC applications

Parameters:

file_paths (str | list | pathlib.Path) – file_paths input to LoaderClass
features (list | str) – Features to derive. If ‘all’ then all available raw features will just be loaded. Specify explicit feature names for derivations.
load_features (list | str) – Features to load and make available for derivations. If ‘all’ then all available raw features will be loaded and made available for derivations. This can be used to restrict features used for derivations. For example, to derive ‘temperature_100m’ from only temperature isobars, from data that includes single level values as well (like temperature_2m), don’t include ‘temperature_2m’ in the load_features list.
res_kwargs (dict) – Additional keyword arguments passed through to the BaseLoader. BaseLoader is usually xr.open_mfdataset for NETCDF files and MultiFileResourceX for H5 files.
chunks (dict | str) – Dictionary of chunk sizes to pass through to dask.array.from_array() or xr.Dataset().chunk(). Will be converted to a tuple when used in from_array(). These are the methods for H5 and NETCDF data, respectively. This argument can be “auto” or None in addition to a dictionary. None will not do any chunking and load data into memory as np.array
target (tuple) – (lat, lon) lower left corner of raster. Either need target+shape or raster_file.
shape (tuple) – (rows, cols) grid size. Either need target+shape or raster_file.
time_slice (slice | list) – Slice specifying extent and step of temporal extraction. e.g. slice(start, stop, step). If equal to slice(None, None, 1) the full time dimension is selected. Can be also be a list [start, stop, step]
threshold (float) – Nearest neighbor euclidean distance threshold. If the coordinates are more than this value away from the target lat/lon, an error is raised.
time_roll (int) – Number of steps to roll along the time axis. Passed to xr.Dataset.roll()
time_shift (int | None) – Number of minutes to shift time axis. This can be used, for example, to shift the time index for daily data so that the time stamp for a given day starts at the zeroth minute instead of at noon, as is the case for most GCM data.
hr_spatial_coarsen (int) – Spatial coarsening factor. Passed to xr.Dataset.coarsen()
nan_method_kwargs (str | dict | None) – Keyword arguments for nan handling. If ‘mask’, time steps with nans will be dropped. Otherwise this should be a dict of kwargs which will be passed to sup3r.preprocessing.accessor.Sup3rX.interpolate_na(). e.g. {‘method’: ‘linear’, ‘dim’: ‘time’}
feature_aliases (dict) – Optional dictionary of feature aliases to use when loading data. This is useful for renaming features to expected sup3r names. For example, {‘sp’: ‘pressure_0m’, ‘u10’: u_10m’}.
BaseLoader (Callable) – Base level file loader wrapped by Loader. This is usually xr.open_mfdataset for NETCDF files and MultiFileResourceX for H5 files.
FeatureRegistry (dict) – Dictionary of DerivedFeature objects used for derivations
interp_kwargs (dict | None) – Dictionary of kwargs for level interpolation. Can include “method” and “run_level_check” keys. Method specifies how to perform height interpolation. e.g. Deriving u_20m from u_10m and u_100m. Options are “linear” and “log”. See sup3r.preprocessing.derivers.Deriver.do_level_interpolation()
cache_kwargs (dict | None) – Dictionary with kwargs for caching wrangled data. This should at minimum include a cache_pattern key, value. This pattern must have a {feature} format key and either a h5 or nc file extension, based on desired output type. See class:Cacher for description of more arguments.
kwargs (dict) – Dictionary of additional keyword args for Rasterizer, used specifically for rasterizing flattened data

Methods

`check_registry`(feature)	Get compute method from the registry if available.
`collect_input_attrs`(feature[, inputs])	Collect attributes from the input features for the given feature.
`derive`(feature)	Routine to derive requested features.
`do_level_interpolation`(feature[, interp_kwargs])	Interpolate over height or pressure to derive the given feature.
`get_inputs`(feature)	Get inputs for the given feature and inputs for those inputs.
`get_multi_level_data`(feature)	Get data stored in multi-level arrays, like u stored on pressure levels.
`get_single_level_data`(feature)	When doing level interpolation we should include the single level data available.
`has_interp_variables`(feature)	Check if the given feature can be interpolated from values at nearby heights or from pressure level data.
`map_new_name`(feature, pattern)	If the search for a derivation method first finds an alternative name for the feature we want to derive, by matching a wildcard pattern, we need to replace the wildcard with the specific height or pressure we want and continue the search for a derivation method with this new name.
`no_overlap`(feature)	Check if any of the nested inputs for 'feature' contain 'feature'
`post_init_log`([args_dict])	Log additional arguments after initialization.
`wrap`(data)	Return a `Sup3rDataset` object or tuple of such.

Attributes

`FEATURE_REGISTRY`
`data`	Return underlying data.
`shape`	Get shape of underlying data.

check_registry(feature) → ndarray | Array | str | None#: Get compute method from the registry if available. Will check for pattern feature match in feature registry. e.g. if u_100m matches a feature registry entry of u_(.*)m

collect_input_attrs(feature, inputs=None)#: Collect attributes from the input features for the given feature.

property data#

Return underlying data.

Returns:: Sup3rDataset

sup3r.preprocessing.data_handlers.base.DataHandlerH5WindCC

Contents

sup3r.preprocessing.data_handlers.base.DataHandlerH5WindCC#