sup3r.preprocessing.rasterizers.exo.BaseExoRasterizer

class BaseExoRasterizer(file_paths: str | None = None, source_file: str | None = None, feature: str | None = None, s_enhance: int = 1, t_enhance: int = 1, input_handler_name: str | None = None, input_handler_kwargs: dict | None = None, cache_dir: str = './exo_cache/', chunks: str | dict | None = 'auto', distance_upper_bound: int | None = None, max_workers: int = 1, verbose: bool = False)

Bases: ABC

Abstract base class for extracting high-res (4km+) data rasters for new spatially-enhanced datasets (e.g. GCM files after spatial enhancement), using nearest neighbor mapping and aggregation from NREL datasets (e.g. WTK or NSRDB).

Parameters:
  • file_paths (str | list) – A single source h5 file to extract raster data from, or a list of netcdf files with an identical grid. The string can be a unix-style file path, which will be passed through glob.glob. These are typically low-res WRF output or GCM netcdf data files, i.e. the source low-resolution data intended to be sup3r resolved.

  • source_file (str) – Filepath to the source data file to get hi-res exogenous data from, which will be mapped to the enhanced grid of the file_paths input. Pixels from this source_file will be mapped to their nearest low-res pixel in the file_paths input. Accordingly, source_file should have a significantly higher resolution than file_paths. Warnings will be raised if the low-resolution pixels in file_paths do not have unique nearest pixels from source_file. The file format can be .h5 for ExoRasterizerH5 or .nc for ExoRasterizerNC.

  • feature (str) – Name of exogenous feature to rasterize.

  • s_enhance (int) – Factor by which the Sup3rGan model will enhance the spatial dimensions of low resolution data from the file_paths input. For example, if getting topography data, file_paths has 100km data, and s_enhance is 4, this class will output a topography raster corresponding to the file_paths grid enhanced 4x to ~25km.

  • t_enhance (int) – Factor by which the Sup3rGan model will enhance the temporal dimension of low resolution data from the file_paths input. For example, if getting “sza” data, file_paths has hourly data, and t_enhance is 4, this class will output an “sza” raster corresponding to file_paths, temporally enhanced 4x to 15 min.

  • input_handler_name (str) – Data handler class to use for input data. Provide a string name to match a Rasterizer. If None, the correct handler will be guessed based on file type and time series properties.

  • input_handler_kwargs (dict | None) – Any kwargs for initializing the input_handler_name class.

  • cache_dir (str) – Directory to use for caching rasterized data. Defaults to ‘./exo_cache/’.

  • chunks (str | dict) – Dictionary of dimension chunk sizes for the returned exo data, e.g. {‘time’: 100, ‘south_north’: 100, ‘west_east’: 100}. This can also just be “auto”. It is passed to .chunk() before the exo data is returned through the .data attribute.

  • distance_upper_bound (float | None) – Maximum distance to map high-resolution data from source_file to the low-resolution file_paths input. None (default) will calculate this based on the median distance between points in source_file.

  • max_workers (int) – Number of workers used for writing data to cache files. Gets passed to Cacher.write_netcdf.

  • verbose (bool) – Whether to log output as each chunk is written to cache file.

Methods

get_data()

Get a raster of source values corresponding to the high-resolution grid (the file_paths input grid * s_enhance * t_enhance).

get_distance_upper_bound()

Maximum distance (float) to map high-resolution data from source_file to the low-resolution file_paths input.

Attributes

cache_dir

cache_file

Get cache file name

chunks

coords

Get coords dictionary for initializing xr.Dataset.

data

Get a raster of source values corresponding to the high-resolution grid (the file_paths input grid * s_enhance * t_enhance).

distance_upper_bound

feature

file_paths

hr_lat_lon

Lat/lon grid for the data, as an array of shape (spatial_1, spatial_2, 2) with (lat, lon) ordering in the last dimension.

hr_shape

Get the high-resolution spatiotemporal shape

hr_time_index

Get the full time index for aggregated source data

input_handler_kwargs

input_handler_name

lr_shape

Get the low-resolution spatiotemporal shape

max_workers

nn

Get the nearest neighbor indices.

s_enhance

source_data

Get the 1D array of source data from the source_file

source_file

source_handler

Get the Loader object that handles the exogenous data file.

source_lat_lon

Get the 2D array (n, 2) of lat, lon data from the source_file

t_enhance

tree

Get the KDTree built on the target lat lon data from the file_paths input with s_enhance

verbose

abstract property source_data

Get the 1D array of source data from the source_file

property source_handler

Get the Loader object that handles the exogenous data file.

property cache_file

Get cache file name

Returns:

cache_fp (str) – Name of cache file. This is a netcdf file which will be saved with Cacher and loaded with Loader

property coords

Get coords dictionary for initializing xr.Dataset.

property source_lat_lon

Get the 2D array (n, 2) of lat, lon data from the source_file

property lr_shape

Get the low-resolution spatiotemporal shape

property hr_shape

Get the high-resolution spatiotemporal shape
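
As a rough illustration of how the low- and high-resolution shapes relate, the sketch below scales a shape by the two enhancement factors. It assumes a (lats, lons, time) ordering, and `enhanced_shape` is a hypothetical helper written for this example, not part of sup3r.

```python
def enhanced_shape(lr_shape, s_enhance=1, t_enhance=1):
    """Scale a (lats, lons, time) shape by the enhancement factors.

    Hypothetical helper for illustration only; not part of sup3r.
    The (lats, lons, time) ordering is an assumption.
    """
    lats, lons, time = lr_shape
    return (lats * s_enhance, lons * s_enhance, time * t_enhance)


# 10 x 20 low-res grid with 24 hourly steps, enhanced 4x in space and time
lr_shape = (10, 20, 24)
hr_shape = enhanced_shape(lr_shape, s_enhance=4, t_enhance=4)
print(hr_shape)  # (40, 80, 96)
```

With t_enhance=4, 24 hourly steps become 96 steps at 15 minute spacing, matching the t_enhance example in the parameter description.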

property hr_lat_lon

Lat/lon grid for the data, as an array of shape (spatial_1, spatial_2, 2) with (lat, lon) ordering in the last dimension. This corresponds to the meta data of the file_paths input grid, spatially enhanced by s_enhance.

Returns:

ndarray

property hr_time_index

Get the full time index for aggregated source data

get_distance_upper_bound()

Maximum distance (float) to map high-resolution data from source_file to the low-resolution file_paths input.
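
The parameter description says the default bound is based on the median distance between points in source_file. One plausible reading of that, sketched below, is the median of each source point's distance to its nearest other point. The helper name, the brute-force search, and the use of plain degree distances are all assumptions made for this example; the actual sup3r calculation may differ.

```python
import math
import statistics


def median_neighbor_distance(points):
    """Median distance from each point to its nearest other point.

    Hypothetical helper for illustration; `points` is a list of
    (lat, lon) pairs and distances are in degrees. Brute force
    O(n^2), which is fine for a small example.
    """
    nearest = []
    for i, (lat_i, lon_i) in enumerate(points):
        dists = [
            math.hypot(lat_i - lat_j, lon_i - lon_j)
            for j, (lat_j, lon_j) in enumerate(points)
            if j != i
        ]
        nearest.append(min(dists))
    return statistics.median(nearest)


# Regular 3 x 3 source grid with 0.5 degree spacing
source_points = [
    (40 + 0.5 * i, -105 + 0.5 * j) for i in range(3) for j in range(3)
]
bound = median_neighbor_distance(source_points)
print(bound)  # 0.5
```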

property tree

Get the KDTree built on the target lat lon data from the file_paths input with s_enhance

property nn

Get the nearest neighbor indices. This uses a single neighbor by default.
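
The nearest neighbor mapping and aggregation described for this class can be sketched in plain Python as follows. Each high-resolution source point is assigned to its single nearest target pixel, points beyond an upper bound are dropped, and the values landing on each target are averaged. The function names are hypothetical, the brute-force search stands in for the KDTree, and averaging is an assumed choice of aggregation.

```python
import math
from collections import defaultdict


def nearest_target_index(point, targets):
    """Return (index, distance) of the target pixel closest to `point`."""
    dists = [math.hypot(point[0] - t[0], point[1] - t[1]) for t in targets]
    idx = min(range(len(targets)), key=dists.__getitem__)
    return idx, dists[idx]


def aggregate_to_targets(source_points, source_values, targets, upper_bound):
    """Average source values over the target pixel each one maps to.

    Source points farther than `upper_bound` from every target are
    dropped. Targets that receive no source points get None.
    """
    groups = defaultdict(list)
    for point, value in zip(source_points, source_values):
        idx, dist = nearest_target_index(point, targets)
        if dist <= upper_bound:
            groups[idx].append(value)
    return [
        sum(groups[i]) / len(groups[i]) if groups[i] else None
        for i in range(len(targets))
    ]


# Two target pixels and four source points; the last one is out of range
targets = [(0.0, 0.0), (0.0, 1.0)]
source_points = [(0.0, 0.1), (0.1, 0.0), (0.0, 0.9), (5.0, 5.0)]
source_values = [100.0, 200.0, 300.0, 999.0]

result = aggregate_to_targets(
    source_points, source_values, targets, upper_bound=0.5
)
print(result)  # [150.0, 300.0]
```

A target that ends up with None here corresponds to the situation the source_file description warns about, where a target pixel has no unique nearest source pixels.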

property data

Get a raster of source values corresponding to the high-resolution grid (the file_paths input grid * s_enhance * t_enhance). The shape is (lats, lons, temporal, 1).

get_data()

Get a raster of source values corresponding to the high-resolution grid (the file_paths input grid * s_enhance * t_enhance). The shape is (lats, lons, 1).