sup3r.preprocessing.loaders.Loader

class Loader(file_paths, *, features='all', res_kwargs=None, chunks='auto', feature_aliases=None, BaseLoader=None)

Bases: BaseLoader

Base loader. “Loads” files so that a .data attribute provides access to the data in the files as a dask array with shape (lats, lons, time, features). This object provides a __getitem__ method that can be used by Sampler objects to build batches or by Rasterizer objects to derive or extract specific features, regions, or time periods.
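To make the (lats, lons, time, features) layout concrete, the following is a minimal sketch that builds an index tuple in that axis order. The names `DIMS` and `build_index` are hypothetical and introduced only for illustration; they are not part of sup3r. A tuple like this could be applied to any NumPy or dask array laid out in the same order.

```python
# Axis order stated in the Loader description: (lats, lons, time, features).
DIMS = ("lats", "lons", "time", "features")


def build_index(**selection):
    """Return an index tuple ordered like DIMS (hypothetical helper).

    Axes that are not mentioned are selected in full with slice(None).
    """
    unknown = set(selection) - set(DIMS)
    if unknown:
        raise KeyError(f"unknown axes: {sorted(unknown)}")
    return tuple(selection.get(dim, slice(None)) for dim in DIMS)


# First 24 time steps of the first feature, over the full spatial domain.
idx = build_index(time=slice(0, 24), features=0)
print(idx)
```

Keeping the axis order in one constant avoids scattering positional assumptions through downstream code.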

Parameters:
  • file_paths (str | pathlib.Path | list) – Location(s) of files to load

  • features (list | str) – Features to return in loaded dataset. If ‘all’ then all available features will be returned.

  • res_kwargs (dict) – Additional keyword arguments passed through to the BaseLoader. BaseLoader is usually xr.open_mfdataset for NETCDF files and MultiFileResourceX for H5 files.

  • chunks (dict | str | None) – Dictionary of chunk sizes to pass through to dask.array.from_array() or xr.Dataset().chunk(), the methods used for H5 and NETCDF data, respectively. The dictionary is converted to a tuple when used in from_array(). This argument can also be “auto” instead of a dictionary. If this is None then the data will not be chunked and will instead be loaded directly into memory.

  • feature_aliases (dict) – Optional dictionary of feature aliases to use when loading data. This is useful for renaming features to expected sup3r names. For example, {‘sp’: ‘pressure_0m’, ‘u10’: ‘u_10m’}.

  • BaseLoader (Callable) – Optional callable that overrides the default base loader. The default for H5 files is MultiFileResourceX and for NETCDF is xarray.open_mfdataset.
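The parameters above can be put together as in the following sketch. The file names and the `t2m` variable are hypothetical placeholders, and `open_loader` and `apply_aliases` are illustrative helpers rather than sup3r functions. The alias direction (file variable name mapped to sup3r name) is an assumption read from the example in the feature_aliases description.

```python
from pathlib import Path

# Hypothetical input files; substitute real NETCDF or H5 paths.
file_paths = [Path("data_2020_01.nc"), Path("data_2020_02.nc")]

# Keyword arguments taken from the documented Loader signature.
loader_kwargs = {
    "features": ["pressure_0m", "u_10m"],
    "chunks": "auto",
    "feature_aliases": {"sp": "pressure_0m", "u10": "u_10m"},
}


def open_loader(paths, **kwargs):
    """Build a Loader; sup3r is imported lazily so the sketch reads standalone."""
    from sup3r.preprocessing.loaders import Loader

    return Loader(paths, **kwargs)


def apply_aliases(names, aliases):
    """Illustrate renaming: map file names to sup3r names (assumed direction)."""
    return [aliases.get(name, name) for name in names]


# Names without an alias pass through unchanged.
renamed = apply_aliases(["sp", "u10", "t2m"], loader_kwargs["feature_aliases"])
print(renamed)
```

With sup3r installed and real files in place, `open_loader(file_paths, **loader_kwargs)` would construct the loader; the data would then be available through its .data attribute.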

Methods

BASE_LOADER(**kwargs)

Wrapper for xr.open_mfdataset with default opening options.

post_init_log([args_dict])

Log additional arguments after initialization.

wrap(data)

Return a Sup3rDataset object or tuple of such.

Attributes

TypeSpecificClasses

data

Return underlying data.

file_paths

Get file paths for input data.

shape

Get shape of underlying data.

BASE_LOADER(**kwargs)

Wrapper for xr.open_mfdataset with default opening options.

property data

Return underlying data.

Returns:

Sup3rDataset

See also

wrap()

property file_paths

Get file paths for input data.

post_init_log(args_dict=None)

Log additional arguments after initialization.

property shape

Get shape of underlying data.

wrap(data)

Return a Sup3rDataset object or a tuple of such. This is a tuple when the .data attribute belongs to a Collection object like BatchHandler. Otherwise this is a Sup3rDataset object, which wraps a 3-tuple, 2-tuple, or 1-tuple (i.e. len(data) == 3, len(data) == 2, or len(data) == 1). It is a 3-tuple when .data belongs to a container object like DualSamplerWithObs, a 2-tuple when .data belongs to a dual container object like DualSampler, and a 1-tuple otherwise.
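The member-count rule described above can be summarized with a small lookup. This is only an illustration of the 1-, 2-, and 3-member cases, not sup3r's implementation; `MEMBER_KINDS` and `describe_members` are hypothetical names, and the string members stand in for real datasets.

```python
# Member counts and their meaning, as described for wrap().
MEMBER_KINDS = {
    1: "single container",
    2: "dual container (e.g. DualSampler)",
    3: "container with observations (e.g. DualSamplerWithObs)",
}


def describe_members(data):
    """Return a label for a 1-, 2-, or 3-member tuple (hypothetical helper)."""
    try:
        return MEMBER_KINDS[len(data)]
    except KeyError:
        raise ValueError(f"expected 1 to 3 members, got {len(data)}") from None


# Placeholder strings stand in for the wrapped datasets.
label = describe_members(("low_res", "high_res"))
print(label)
```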