sup3r.qa.qa.Sup3rQa

class Sup3rQa(source_file_paths, out_file_path, s_enhance, t_enhance, temporal_coarsening_method, features=None, source_features=None, output_names=None, input_handler_name=None, input_handler_kwargs=None, qa_fp=None, bias_correct_method=None, bias_correct_kwargs=None, save_sources=True)

Bases: object

Class for doing QA on sup3r forward pass outputs.

Note

This only works if the sup3r forward pass output can be reshaped into a 2D raster dataset (e.g. no sparsifying of the meta data).

Parameters:
  • source_file_paths (list | str) – A list of low-resolution source files to extract raster data from. Each file must have the same number of timesteps. A string containing a unix-style file pattern is also accepted and will be expanded with glob.glob.

  • out_file_path (str) – A single sup3r-resolved output file (either .nc or .h5) with high-resolution data corresponding to the data in source_file_paths, enhanced spatially by s_enhance and temporally by t_enhance.

  • s_enhance (int) – Factor by which the Sup3rGan model will enhance the spatial dimensions of low resolution data

  • t_enhance (int) – Factor by which the Sup3rGan model will enhance temporal dimension of low resolution data

  • temporal_coarsening_method (str | list) – One of subsample, average, total, min, or max. subsample takes every t_enhance-th time step, average averages over t_enhance time steps, and total sums over t_enhance time steps; min and max follow the same pattern, taking the minimum or maximum over t_enhance time steps. This can also be a list of method names corresponding to the list of features.

  • features (str | list | None) – Explicit list of features to validate. Can be a single feature str, list of string feature names, or None for all features found in the out_file_path.

  • source_features (str | list | None) – Optional feature names to retrieve from the source dataset if the source feature names are not the same as the sup3r output feature names. These will be used to derive the features to be validated. For example, if the model output is temperature_2m and it was derived from temperature_min_2m and temperature_max_2m, then source_features should be temperature_min_2m and temperature_max_2m, while the model output temperature_2m is aggregated using min/max in the temporal_coarsening_method. Another example is features="ghi", source_features="rsds", which is a simple alternative name lookup.

  • output_names (str | list) – Optional output file dataset names corresponding to the features list input

  • input_handler_name (str | None) – Data handler class to use for input data. Provide a string name to match a class in sup3r.preprocessing.data_handlers. If None, the correct handler is guessed from the file type.

  • input_handler_kwargs (dict) – Keyword arguments for input_handler. See Rasterizer class for argument details.

  • qa_fp (str | None) – Optional filepath to output QA file when you call Sup3rQa.run() (only .h5 is supported)

  • bias_correct_method (str | None) – Optional bias correction function name that can be imported from the sup3r.bias.bias_transforms module. This will transform the source data according to some predefined bias correction transformation along with the bias_correct_kwargs. The method must accept a numpy array of the data to be bias corrected as its first argument.

  • bias_correct_kwargs (dict | None) – Optional namespace of kwargs to provide to bias_correct_method. If this is provided, it must be a dictionary where each key is a feature name and each value is a dictionary of kwargs to correct that feature. You can bias correct only certain input features by only including those feature names in this dict.

  • save_sources (bool) – Flag to save re-coarsened synthetic data and true low-res data to qa_fp in addition to the error dataset
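As a rough illustration of how the parameters above fit together, the sketch below assembles a keyword dictionary that mirrors the documented signature and checks the shape relationship implied by s_enhance and t_enhance. All file paths and feature names are placeholders, and expected_output_shape is a hypothetical helper that is not part of sup3r. The actual Sup3rQa call is left commented out because it needs real input files.

```python
def expected_output_shape(source_shape, s_enhance, t_enhance):
    """Hypothetical helper (not part of sup3r): high-res shape implied by
    a low-res (spatial_1, spatial_2, temporal) shape and the two factors."""
    lat, lon, time = source_shape
    return (lat * s_enhance, lon * s_enhance, time * t_enhance)


# Keys follow the documented signature; every value is a placeholder.
qa_kwargs = {
    "source_file_paths": "./low_res/source_*.nc",  # assumed glob pattern
    "out_file_path": "./fwp_out/sup3r_output.h5",  # assumed path
    "s_enhance": 3,
    "t_enhance": 4,
    # one coarsening method per feature, in the same order as "features"
    "temporal_coarsening_method": ["subsample", "average"],
    "features": ["windspeed_100m", "temperature_2m"],  # assumed names
    "qa_fp": "./qa/qa_errors.h5",  # assumed path, .h5 only
    "save_sources": True,
}

# With real files, usage would presumably look like:
# from sup3r.qa.qa import Sup3rQa
# qa = Sup3rQa(**qa_kwargs)
# errors = qa.run()
# qa.close()

high_res_shape = expected_output_shape(
    (10, 10, 24), qa_kwargs["s_enhance"], qa_kwargs["t_enhance"]
)
print(high_res_shape)
```

A check like this can be useful before running QA, since the output file is expected to match the source data scaled by both enhancement factors.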

Methods

bias_correct_input_handler(input_handler)

Apply bias correction to all source features which have bias correction data and return Deriver instance to use for derivations of features to match output features.

close()

Close any open file handlers

coarsen_data(idf, feature, data)

Re-coarsen a high-resolution synthetic output dataset

export(qa_fp, data, dset_name[, dset_suffix])

Export error dictionary to h5 file.

get_dset_out(name)

Get an output dataset from the forward pass output file.

get_node_cmd(config)

Get a CLI call to initialize Sup3rQa and execute the Sup3rQa.run() method based on an input config

run()

Go through all datasets and get the error for the re-coarsened synthetic minus the true low-res source data.

Attributes

features

Get a list of feature names from the output file, excluding meta and time index datasets

output_names

Get a list of output dataset names corresponding to the features list

output_type

Get output data type

source_features

Get a list of source dataset names corresponding to the input source data

close()

Close any open file handlers

property features

Get a list of feature names from the output file, excluding meta and time index datasets

Returns:

list

property output_names

Get a list of output dataset names corresponding to the features list

property source_features

Get a list of source dataset names corresponding to the input source data

property output_type

Get output data type

Returns:

output_type – e.g. 'nc' or 'h5'

bias_correct_input_handler(input_handler)

Apply bias correction to all source features which have bias correction data and return Deriver instance to use for derivations of features to match output features.

(1) Check if we need to derive any features included in the bias_correct_kwargs.
(2) Derive these features using the input_handler.derive method, and update the stored data.
(3) Apply bias correction to all the features in the bias_correct_kwargs.
(4) Derive the features required for validation from the bias corrected data and update the stored data.
(5) Return the updated input_handler, now a Deriver object.
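The per-feature application in the bias correction step can be sketched with plain numpy. This is a minimal illustration, not the sup3r implementation: toy_linear_bc is a hypothetical stand-in for a function from sup3r.bias.bias_transforms, the feature names are made up, and the derivation steps are omitted. It only shows how a bias_correct_kwargs dictionary keyed by feature name selects which features get corrected.

```python
import numpy as np


def toy_linear_bc(data, scalar=1.0, adder=0.0):
    """Hypothetical bias correction: takes the data array as its first
    argument, as the documented bias_correct_method requires."""
    return data * scalar + adder


# Made-up low-res source data with shape (spatial_1, spatial_2, temporal)
source = {
    "windspeed_100m": np.full((2, 2, 3), 5.0),
    "temperature_2m": np.full((2, 2, 3), 280.0),
}

# Only features listed here are corrected; each value is that feature's kwargs
bias_correct_kwargs = {"windspeed_100m": {"scalar": 2.0, "adder": 1.0}}

corrected = {}
for name, arr in source.items():
    if name in bias_correct_kwargs:
        corrected[name] = toy_linear_bc(arr, **bias_correct_kwargs[name])
    else:
        corrected[name] = arr  # left untouched

print(corrected["windspeed_100m"][0, 0, 0])
```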

get_dset_out(name)

Get an output dataset from the forward pass output file.

TODO: Make this dim order agnostic. If we didn't have the h5 condition we could just do transpose('south_north', 'west_east', 'time')

Parameters:

name (str) – Name of the output dataset to retrieve. Must be found in the features property and the forward pass output file.

Returns:

out (np.ndarray) – A copy of the high-resolution output data as a numpy array of shape (spatial_1, spatial_2, temporal)
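To make the returned dimension order concrete, the sketch below rearranges an array into (spatial_1, spatial_2, temporal) with numpy. The time-first storage order used here is an assumption for illustration only; the actual on-disk layout depends on the output file type.

```python
import numpy as np

# Assumed storage order for this illustration: (time, south_north, west_east)
stored = np.arange(4 * 2 * 3).reshape(4, 2, 3)

# Move time to the last axis and copy, giving (spatial_1, spatial_2, temporal)
out = np.transpose(stored, axes=(1, 2, 0)).copy()

print(out.shape)
```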

coarsen_data(idf, feature, data)

Re-coarsen a high-resolution synthetic output dataset

Parameters:
  • idf (int) – Feature index

  • feature (str) – Feature name

  • data (Union[np.ndarray, da.core.Array]) – A copy of the high-resolution output data as a numpy array of shape (spatial_1, spatial_2, temporal)

Returns:

data (Union[np.ndarray, da.core.Array]) – A spatiotemporally coarsened copy of the input dataset, still with shape (spatial_1, spatial_2, temporal)
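A minimal numpy sketch of spatiotemporal re-coarsening follows. It is not the sup3r implementation: the function name coarsen is hypothetical, and averaging over s_enhance by s_enhance blocks for the spatial step is an assumption. The temporal options follow the method names documented for temporal_coarsening_method.

```python
import numpy as np


def coarsen(data, s_enhance, t_enhance, method="average"):
    """Hypothetical re-coarsening of a (spatial_1, spatial_2, temporal) array.

    Assumes each dimension is an exact multiple of its enhancement factor.
    """
    lat, lon, time = data.shape

    # Spatial step (assumed): average each s_enhance x s_enhance block
    data = data.reshape(
        lat // s_enhance, s_enhance, lon // s_enhance, s_enhance, time
    ).mean(axis=(1, 3))

    # Temporal step: subsample keeps every t_enhance-th time step
    if method == "subsample":
        return data[..., ::t_enhance]

    # Other methods reduce over consecutive windows of t_enhance steps
    windows = data.reshape(
        data.shape[0], data.shape[1], time // t_enhance, t_enhance
    )
    reducers = {"average": np.mean, "total": np.sum, "min": np.min, "max": np.max}
    return reducers[method](windows, axis=-1)


high_res = np.arange(4 * 4 * 6, dtype=float).reshape(4, 4, 6)
low_res = coarsen(high_res, s_enhance=2, t_enhance=3, method="average")
print(low_res.shape)  # (2, 2, 2)
```

The dimension order is unchanged by coarsening; only the size of each axis shrinks by the corresponding factor.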

classmethod get_node_cmd(config)

Get a CLI call to initialize Sup3rQa and execute the Sup3rQa.run() method based on an input config

Parameters:

config (dict) – sup3r QA config with all necessary args and kwargs to initialize Sup3rQa and execute Sup3rQa.run()

export(qa_fp, data, dset_name, dset_suffix='')

Export error dictionary to h5 file.

Parameters:
  • qa_fp (str | None) – Optional filepath to output QA file (only .h5 is supported)

  • data (Union[np.ndarray, da.core.Array]) – An array with shape (space1, space2, time) that represents the re-coarsened synthetic data minus the source true low-res data, or another dataset of the same shape to be written to disk

  • dset_name (str) – Base dataset name to save data to

  • dset_suffix (str) – Optional suffix to append to dset_name with an underscore before saving.
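The naming rule for dset_suffix can be sketched as below. The helper qa_dset_name is hypothetical, the suffix values are made up, and the behavior for an empty suffix (no trailing underscore) is an assumption based on the default of dset_suffix=''.

```python
def qa_dset_name(dset_name, dset_suffix=""):
    """Hypothetical helper: join the base name and suffix with an underscore,
    leaving the base name unchanged when no suffix is given (assumed)."""
    return f"{dset_name}_{dset_suffix}" if dset_suffix else dset_name


print(qa_dset_name("windspeed_100m", "error"))
print(qa_dset_name("windspeed_100m"))
```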

run()

Go through all datasets and get the error for the re-coarsened synthetic minus the true low-res source data.

Returns:

errors (dict) – Dictionary of errors, where keys are the feature names, and each value is an array with shape (space1, space2, time) that represents the re-coarsened synthetic data minus the source true low-res data
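The error definition (re-coarsened synthetic minus true low-res) can be illustrated with synthetic numpy data. This sketch does not call sup3r: the feature name is made up, the high-resolution data is faked by repeating the source and adding a constant offset of 0.5, and recoarsen uses block averaging in space and time as an assumed coarsening method.

```python
import numpy as np

s_enhance, t_enhance = 2, 2
rng = np.random.default_rng(0)

# Made-up true low-res source data, shape (space1, space2, time)
source = {"windspeed_100m": rng.random((3, 3, 4))}


def fake_forward_pass(arr):
    """Stand-in for model output: repeat along each axis, then add 0.5."""
    arr = np.repeat(arr, s_enhance, axis=0)
    arr = np.repeat(arr, s_enhance, axis=1)
    arr = np.repeat(arr, t_enhance, axis=2)
    return arr + 0.5


def recoarsen(data):
    """Assumed coarsening: average over spatial and temporal blocks."""
    lat, lon, time = data.shape
    blocks = data.reshape(
        lat // s_enhance, s_enhance,
        lon // s_enhance, s_enhance,
        time // t_enhance, t_enhance,
    )
    return blocks.mean(axis=(1, 3, 5))


synthetic = {name: fake_forward_pass(arr) for name, arr in source.items()}

# Keys are feature names; values have the low-res (space1, space2, time) shape
errors = {name: recoarsen(synthetic[name]) - source[name] for name in source}

print(errors["windspeed_100m"].shape)
```

Because the fake output differs from the source only by the constant offset, every error value here is approximately 0.5, which makes the sign convention easy to check.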