sup3r.preprocessing.rasterizers.dual.DualRasterizer

class DualRasterizer(data: Sup3rDataset | Tuple[Dataset, Dataset], regrid_workers=1, regrid_lr=True, s_enhance=1, t_enhance=1, lr_cache_kwargs=None, hr_cache_kwargs=None)

Bases: Container

Object containing xr.Dataset instances for low-res and high-res data (usually ERA5 and WTK, respectively). It essentially just regrids the low-res data to the coarsened high-res grid. This is useful for preparing and caching data, which can then go directly to a DualSampler or DualBatchQueue.
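
As a rough illustration of what "coarsened high-res grid" means, the sketch below block-averages a small high-res coordinate grid by s_enhance to obtain the target low-res grid. It assumes coarsening is a plain block mean and that both dimensions are divisible by s_enhance. The helper coarsen_grid and the sample values are hypothetical and not part of sup3r.

```python
def coarsen_grid(grid, s_enhance):
    """Block-average a 2D grid (list of rows) by s_enhance in both directions.

    Assumes both dimensions are divisible by s_enhance.
    """
    n_rows = len(grid) // s_enhance
    n_cols = len(grid[0]) // s_enhance
    coarse = []
    for i in range(n_rows):
        row = []
        for j in range(n_cols):
            # Collect the s_enhance x s_enhance block of high-res cells
            block = [
                grid[i * s_enhance + di][j * s_enhance + dj]
                for di in range(s_enhance)
                for dj in range(s_enhance)
            ]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse


# Hypothetical 4x4 high-res latitude grid (degrees), coarsened with s_enhance=2
hr_lat = [
    [40.0, 40.0, 40.0, 40.0],
    [39.0, 39.0, 39.0, 39.0],
    [38.0, 38.0, 38.0, 38.0],
    [37.0, 37.0, 37.0, 37.0],
]
lr_target_lat = coarsen_grid(hr_lat, s_enhance=2)
print(lr_target_lat)  # [[39.5, 39.5], [37.5, 37.5]]
```

The low-res data would then be interpolated onto a grid like lr_target_lat (and the matching longitude grid), so that every low-res cell lines up with an s_enhance x s_enhance block of high-res cells.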

Note

When first extracting the low_res data, make sure to extract a region that completely covers the high_res region. It is easiest to load the full low_res domain and let DualRasterizer select the appropriate region through regridding.
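
One quick way to sanity-check the coverage requirement before building the rasterizer is to compare coordinate bounds. The helper below is a hypothetical sketch, not a sup3r function. It assumes both datasets use the same longitude convention (for example, both in -180 to 180) and does not handle domains that wrap across the antimeridian.

```python
def lr_covers_hr(lr_lats, lr_lons, hr_lats, hr_lons):
    """Return True if the low-res bounding box contains the high-res one.

    Each argument is a flat iterable of coordinate values in degrees.
    Assumes a shared longitude convention and no antimeridian wrap-around.
    """
    lat_ok = min(lr_lats) <= min(hr_lats) and max(lr_lats) >= max(hr_lats)
    lon_ok = min(lr_lons) <= min(hr_lons) and max(lr_lons) >= max(hr_lons)
    return lat_ok and lon_ok


# Hypothetical domain corners (degrees)
full_lr = lr_covers_hr([30.0, 45.0], [-110.0, -95.0], [36.0, 41.0], [-106.0, -101.0])
cropped_lr = lr_covers_hr([37.0, 45.0], [-110.0, -95.0], [36.0, 41.0], [-106.0, -101.0])
print(full_lr)     # True
print(cropped_lr)  # False: the low-res region starts north of the high-res southern edge
```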

Initialize the data container with lr and hr Data instances. Typically lr = ERA5 data and hr = WTK data.

Parameters:
  • data (Sup3rDataset | Tuple[xr.Dataset, xr.Dataset]) – A Sup3rDataset or a tuple of xr.Dataset instances. The first member must be the low-res data and the second must be the high-res data.

  • regrid_workers (int | None) – Number of workers to use for the regridding routine.

  • regrid_lr (bool) – Flag to regrid the low-res data to the high-res grid. This will take care of any minor inconsistencies in different projections. Disable this if the grids are known to be the same.

  • s_enhance (int) – Spatial enhancement factor.

  • t_enhance (int) – Temporal enhancement factor.

  • lr_cache_kwargs (dict | None) – Cache kwargs for the call to lr_data.cache_data(cache_kwargs). If not None, this must include a 'cache_pattern' key, and can also include a dictionary of chunk tuples with feature keys.

  • hr_cache_kwargs (dict | None) – Cache kwargs for the call to hr_data.cache_data(cache_kwargs). If not None, this must include a 'cache_pattern' key, and can also include a dictionary of chunk tuples with feature keys.

Methods

check_regridded_lr_data()

Check for NaNs after regridding and do NN fill if needed.

get_regridder()

Get regridder object.

post_init_log([args_dict])

Log additional arguments after initialization.

update_hr_data()

Set the high resolution data attribute and check if hr_data.shape is divisible by s_enhance.

update_lr_data()

Regrid low_res data for all requested noncached features.

wrap(data)

Return a Sup3rDataset object or tuple of such.

Attributes

data

Return underlying data.

shape

Get shape of underlying data.

update_hr_data()

Set the high resolution data attribute and check if hr_data.shape is divisible by s_enhance. If not, use the largest shape that is divisible.
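
The shape arithmetic can be sketched as follows. This is an illustration rather than the library's implementation. It assumes a (south_north, west_east, time) dimension order and, as a further assumption not stated above, that the time dimension is trimmed to a multiple of t_enhance in the same way. The function name required_shapes is hypothetical.

```python
def required_shapes(hr_shape, s_enhance, t_enhance):
    """Return (lr_shape, trimmed_hr_shape) for a (rows, cols, time) hr shape.

    Spatial dims are floored to a multiple of s_enhance. Flooring the time
    dim to a multiple of t_enhance is an assumption made for this sketch.
    """
    rows, cols, times = hr_shape
    lr_shape = (rows // s_enhance, cols // s_enhance, times // t_enhance)
    hr_required = (
        lr_shape[0] * s_enhance,
        lr_shape[1] * s_enhance,
        lr_shape[2] * t_enhance,
    )
    return lr_shape, hr_required


# Hypothetical high-res shape that is not divisible by the enhancement factors
lr_shape, hr_required = required_shapes((103, 58, 25), s_enhance=5, t_enhance=4)
print(lr_shape)     # (20, 11, 6)
print(hr_required)  # (100, 55, 24)
```

Slicing the high-res arrays with `tuple(slice(n) for n in hr_required)` would then drop the trailing rows, columns, and time steps that do not fit.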

get_regridder()

Get regridder object.

update_lr_data()

Regrid low_res data for all requested noncached features. Load cached features if they are available and overwrite=False.

check_regridded_lr_data()

Check for NaNs after regridding and do a nearest-neighbor (NN) fill if needed.
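
To make the idea of a nearest-neighbor fill concrete, here is a minimal brute-force sketch on a plain 2D list. It is not the sup3r implementation, which operates on the regridded arrays. The function nn_fill is hypothetical, and distance is measured in grid-index space as an assumption.

```python
import math


def nn_fill(grid):
    """Replace each NaN with the value of the nearest non-NaN cell.

    Distance is squared Euclidean distance in index space. Ties go to the
    first valid cell in row-major order, because min() returns the first
    minimal item.
    """
    valid = [
        (i, j, value)
        for i, row in enumerate(grid)
        for j, value in enumerate(row)
        if not math.isnan(value)
    ]
    if not valid:
        raise ValueError("grid contains no valid values to fill from")

    filled = [list(row) for row in grid]  # copy so the input is untouched
    for i, row in enumerate(grid):
        for j, value in enumerate(row):
            if math.isnan(value):
                nearest = min(
                    valid, key=lambda c: (c[0] - i) ** 2 + (c[1] - j) ** 2
                )
                filled[i][j] = nearest[2]
    return filled


nan = float("nan")
regridded = [
    [1.0, nan, 3.0],
    [4.0, 5.0, nan],
]
has_nans = any(math.isnan(v) for row in regridded for v in row)
filled = nn_fill(regridded) if has_nans else regridded
print(filled)  # [[1.0, 1.0, 3.0], [4.0, 5.0, 3.0]]
```

The brute-force search is quadratic in the number of cells, so for real grids a spatial index or an array-based distance transform would be a more practical choice.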

property data

Return underlying data.

Returns:

Sup3rDataset

See also

wrap()

post_init_log(args_dict=None)

Log additional arguments after initialization.

property shape

Get shape of underlying data.

wrap(data)

Return a Sup3rDataset object or a tuple of such objects. The result is a tuple when the .data attribute belongs to a Collection object like BatchHandler. Otherwise it is a single Sup3rDataset object, which is either a wrapped 2-tuple or a wrapped 1-tuple (i.e. len(data) == 2 or len(data) == 1). It is a 2-tuple when .data belongs to a dual container object like DualSampler, and a 1-tuple otherwise.