sup3r.preprocessing.rasterizers.exo.BaseExoRasterizer
- class BaseExoRasterizer(file_paths: str | None = None, source_file: str | None = None, feature: str | None = None, s_enhance: int = 1, t_enhance: int = 1, input_handler_name: str | None = None, input_handler_kwargs: dict | None = None, cache_dir: str = './exo_cache/', chunks: str | dict | None = 'auto', distance_upper_bound: int | None = None, max_workers: int = 1, verbose: bool = False)
Bases: ABC
Class to extract high-res (4km+) data rasters for new spatially-enhanced datasets (e.g. GCM files after spatial enhancement) using nearest neighbor mapping and aggregation from NREL datasets (e.g. WTK or NSRDB)
- Parameters:
file_paths (str | list) – A single source h5 file to extract raster data from, or a list of netcdf files with identical grids. The string can be a unix-style file path, which will be passed through glob.glob. These are typically low-res WRF output or GCM netcdf data files that provide the low-resolution source data intended to be sup3r resolved.
source_file (str) – Filepath to the source data file to get hi-res exogenous data from, which will be mapped to the enhanced grid of the file_paths input. Pixels from this source_file will be mapped to their nearest low-res pixel in the file_paths input. Accordingly, source_file should have a significantly higher resolution than file_paths. Warnings will be raised if the low-resolution pixels in file_paths do not have unique nearest pixels from source_file. File format can be .h5 for ExoRasterizerH5 or .nc for ExoRasterizerNC.
feature (str) – Name of exogenous feature to rasterize.
s_enhance (int) – Factor by which the Sup3rGan model will enhance the spatial dimensions of low resolution data from file_paths input. For example, if getting topography data, file_paths has 100km data, and s_enhance is 4, this class will output a topography raster corresponding to the file_paths grid enhanced 4x to ~25km
t_enhance (int) – Factor by which the Sup3rGan model will enhance the temporal dimension of low resolution data from file_paths input. For example, if getting “sza” data, file_paths has hourly data, and t_enhance is 4, this class will output an “sza” raster corresponding to file_paths, temporally enhanced 4x to 15 min.
input_handler_name (str) – Data handler class to use for input data. Provide a string name to match a Rasterizer. If None, the correct handler will be guessed based on file type and time series properties.
input_handler_kwargs (dict | None) – Any kwargs for initializing the input_handler_name class.
cache_dir (str | ‘./exo_cache’) – Directory to use for caching rasterized data.
chunks (str | dict) – Dictionary of dimension chunk sizes for returned exo data, e.g. {‘time’: 100, ‘south_north’: 100, ‘west_east’: 100}. This can also just be “auto”. This is passed to .chunk() before returning exo data through the .data attribute.
distance_upper_bound (float | None) – Maximum distance to map high-resolution data from source_file to the low-resolution file_paths input. None (default) will calculate this based on the median distance between points in source_file.
max_workers (int) – Number of workers used for writing data to cache files. Gets passed to Cacher.write_netcdf.
verbose (bool) – Whether to log output as each chunk is written to cache file.
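The s_enhance and t_enhance examples above reduce to simple division. A minimal sketch of that arithmetic follows; the helper name enhanced_resolution is hypothetical and is not part of sup3r.

```python
def enhanced_resolution(lr_resolution, enhance_factor):
    """Return the resolution after enhancing a low-res axis by an integer factor.

    Hypothetical helper for illustration only; not a sup3r function.
    """
    if enhance_factor < 1:
        raise ValueError("enhance_factor must be >= 1")
    return lr_resolution / enhance_factor


# 100 km low-res grid with s_enhance=4 gives ~25 km pixels.
spatial_km = enhanced_resolution(100, 4)

# Hourly (60 min) low-res data with t_enhance=4 gives 15 min steps.
temporal_min = enhanced_resolution(60, 4)

print(spatial_km, temporal_min)
```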
Methods
get_data() – Get a raster of source values corresponding to the high-resolution grid (the file_paths input grid * s_enhance * t_enhance).
get_distance_upper_bound() – Maximum distance (float) to map high-resolution data from source_file to the low-resolution file_paths input.
Attributes
cache_dir
cache_file – Get cache file name
chunks
coords – Get coords dictionary for initializing xr.Dataset.
data – Get a raster of source values corresponding to the high-resolution grid (the file_paths input grid * s_enhance * t_enhance).
distance_upper_bound
feature
file_paths
hr_lat_lon – Lat lon grid for data in format (spatial_1, spatial_2, 2) Lat/Lon array with same ordering in last dimension.
hr_shape – Get the high-resolution spatiotemporal shape
hr_time_index – Get the full time index for aggregated source data
input_handler_kwargs
input_handler_name
lr_shape – Get the low-resolution spatiotemporal shape
max_workers
nn – Get the nearest neighbor indices.
s_enhance
source_data – Get the 1D array of source data from the source_file_h5
source_file
source_handler – Get the Loader object that handles the exogenous data file.
source_lat_lon – Get the 2D array (n, 2) of lat, lon data from the source_file_h5
t_enhance
tree – Get the KDTree built on the target lat lon data from the file_paths input with s_enhance
verbose
- abstract property source_data
Get the 1D array of source data from the source_file_h5
- property source_handler
Get the Loader object that handles the exogenous data file.
- property cache_file
Get cache file name
- Returns:
cache_fp (str) – Name of cache file. This is a netcdf file which will be saved with Cacher and loaded with Loader.
- property coords
Get coords dictionary for initializing xr.Dataset.
- property source_lat_lon
Get the 2D array (n, 2) of lat, lon data from the source_file_h5
- property lr_shape
Get the low-resolution spatiotemporal shape
- property hr_shape
Get the high-resolution spatiotemporal shape
- property hr_lat_lon
Lat lon grid for data in format (spatial_1, spatial_2, 2) Lat/Lon array with same ordering in last dimension. This corresponds to the enhanced meta data from the file_paths input * s_enhance.
- Returns:
ndarray
- property hr_time_index
Get the full time index for aggregated source data
- get_distance_upper_bound()
Maximum distance (float) to map high-resolution data from source_file to the low-resolution file_paths input.
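The parameter description says the default bound is calculated from the median distance between points in source_file, but it does not give the exact formula. The sketch below shows one plausible reading, the median of each point's distance to its nearest other point, using plain Euclidean distance in lat/lon space. The function name and the sample coordinates are hypothetical, and the calculation sup3r actually performs may differ.

```python
import math
import statistics


def median_nearest_distance(points):
    """Median over all points of the distance to the nearest other point.

    Hypothetical helper: brute-force O(n^2), Euclidean distance in degrees.
    Requires at least two points.
    """
    nearest = []
    for i, (lat_i, lon_i) in enumerate(points):
        dists = [
            math.hypot(lat_i - lat_j, lon_i - lon_j)
            for j, (lat_j, lon_j) in enumerate(points)
            if j != i
        ]
        nearest.append(min(dists))
    return statistics.median(nearest)


# Assumed sample: four source pixels on a regular 1-degree grid.
source_lat_lon = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
bound = median_nearest_distance(source_lat_lon)
print(bound)
```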
- property tree
Get the KDTree built on the target lat lon data from the file_paths input with s_enhance
- property nn
Get the nearest neighbor indices. This uses a single neighbor by default
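The tree and nn properties together describe a nearest neighbor mapping followed by aggregation. The following is a minimal sketch of that general technique using scipy.spatial.KDTree and a mean aggregation. The coordinates, values, the 0.5 bound, and the choice of mean are all assumptions for illustration; this is not the sup3r implementation.

```python
import numpy as np
from scipy.spatial import KDTree

# Assumed 2x2 target (enhanced) grid of (lat, lon), flattened to (4, 2).
target = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])

# Assumed high-res source pixels (lat, lon) and their values.
source = np.array(
    [[0.1, 0.1], [-0.1, 0.0], [0.9, 1.1], [1.0, 0.1], [5.0, 5.0]]
)
values = np.array([10.0, 20.0, 30.0, 40.0, 99.0])

# Build the tree on the target grid, then find the single nearest target
# pixel for every source pixel, ignoring anything farther than the bound.
tree = KDTree(target)
dist, nn = tree.query(source, k=1, distance_upper_bound=0.5)

# scipy marks "no neighbor within the bound" with index == number of points.
valid = nn < len(target)

# Aggregate (mean) the source values that landed on each target pixel.
sums = np.bincount(nn[valid], weights=values[valid], minlength=len(target))
counts = np.bincount(nn[valid], minlength=len(target))
raster = np.full(len(target), np.nan)
has_data = counts > 0
raster[has_data] = sums[has_data] / counts[has_data]
raster = raster.reshape(2, 2)
print(raster)
```

In this sketch the target pixel at (0, 1) receives no source pixel and stays NaN, which is the kind of situation the source_file description says triggers a warning.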
- property data
Get a raster of source values corresponding to the high-resolution grid (the file_paths input grid * s_enhance * t_enhance). The shape is (lats, lons, temporal, 1)
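As a rough illustration of how the stated output shape relates to the low-resolution shape, the sketch below assumes the spatial dimensions scale by s_enhance and the temporal dimension by t_enhance, with a trailing feature axis of length 1. The helper name enhanced_shape and the example shape are hypothetical.

```python
def enhanced_shape(lr_shape, s_enhance=1, t_enhance=1):
    """Scale a (lats, lons, temporal) shape by the enhancement factors.

    Hypothetical helper for illustration only; not a sup3r function.
    """
    lats, lons, times = lr_shape
    return (lats * s_enhance, lons * s_enhance, times * t_enhance)


# Assumed low-res input: 10 x 20 grid with 24 time steps.
lr_shape = (10, 20, 24)
hr_shape = enhanced_shape(lr_shape, s_enhance=4, t_enhance=4)

# The data raster adds a trailing axis of length 1: (lats, lons, temporal, 1).
data_shape = hr_shape + (1,)
print(hr_shape, data_shape)
```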