sup3r.bias.qdm.QuantileDeltaMappingCorrection

class QuantileDeltaMappingCorrection(base_fps, bias_fps, bias_fut_fps, base_dset, bias_feature, distance_upper_bound=None, target=None, shape=None, base_handler='Resource', bias_handler='DataHandlerNCforCC', base_handler_kwargs=None, bias_handler_kwargs=None, bias_fut_handler_kwargs=None, decimals=None, match_zero_rate=False, n_quantiles=101, dist='empirical', relative=True, sampling='linear', log_base=10, n_time_steps=24, window_size=120, pre_load=True)

Bases: AbstractBiasCorrection, FillAndSmoothMixin, DataRetrievalBase

Estimate probability distributions required by Quantile Delta Mapping

The main purpose of this class is to estimate the probability distributions required by the Quantile Delta Mapping (QDM) technique ([Cannon2015]). The name ‘Correction’ can therefore be misleading, since this class does not perform the correction itself; the name was kept for consistency within this module.

The QDM technique corrects bias and trend by comparing the data distributions of three datasets: a historical reference, a biased reference, and a biased target to correct (called historical observed, historical modeled, and future modeled, respectively, in Cannon et al. (2015)). The three probability distributions estimated here can then be used, for instance, by local_qdm_bc() to actually correct a dataset.
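To make the role of the three distributions concrete, the following is a minimal sketch of an empirical QDM correction as described in Cannon et al. (2015), written with plain NumPy. The function name `qdm_correct` and its arguments are hypothetical and used only for illustration; this is not the implementation used by local_qdm_bc() or rex, and it assumes 1D data without NaN values.

```python
import numpy as np


def qdm_correct(x_fut, base_hist, bias_hist, bias_fut,
                n_quantiles=101, relative=True):
    """Hypothetical helper: empirical Quantile Delta Mapping of x_fut."""
    q = np.linspace(0, 1, n_quantiles)
    # Quantile cut points approximating the three distributions.
    base_q = np.quantile(base_hist, q)  # observed historical
    bias_q = np.quantile(bias_hist, q)  # modeled historical
    fut_q = np.quantile(bias_fut, q)    # modeled future

    # Non-exceedance probability of each value in the modeled future CDF.
    tau = np.interp(x_fut, fut_q, q)
    # The same probability mapped through both historical inverse CDFs.
    base_val = np.interp(tau, q, base_q)
    bias_val = np.interp(tau, q, bias_q)

    if relative:
        # Multiplicative form; assumes bias_val is never zero.
        return x_fut * base_val / bias_val
    # Additive form.
    return x_fut + base_val - bias_val
```

In the additive form, a model with a constant offset with respect to the observations has that offset removed, while the change between the modeled historical and modeled future periods is preserved.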

Parameters:

base_fps (list | str) – One or more baseline .h5 filepaths representing non-biased data to use to correct the biased dataset (observed historical in Cannon et al. (2015)). This is typically several years of WTK or NSRDB files.
bias_fps (list | str) – One or more biased .nc or .h5 filepaths representing the biased data to be compared with the baseline data (modeled historical in Cannon et al. (2015)). This is typically several years of GCM .nc files.
bias_fut_fps (list | str) – Data consistent with bias_fps but for a different time period (modeled future in Cannon et al. (2015)). This is the dataset that would be corrected, while bias_fps is used to provide a transformation map with the baseline data.
base_dset (str) – A single dataset from the base_fps to retrieve. In the case of wind components, this can be U_100m or V_100m which will retrieve windspeed and winddirection and derive the U/V component.
bias_feature (str) – This is the biased feature from bias_fps to retrieve. This should be a single feature name corresponding to base_dset
distance_upper_bound (float) – Upper bound on the nearest neighbor distance in decimal degrees. This should be the approximate resolution of the low-resolution bias data. None (default) will calculate this based on the median distance between points in bias_fps
target (tuple) – (lat, lon) lower left corner of raster to retrieve from bias_fps. If None then the lower left corner of the full domain will be used.
shape (tuple) – (rows, cols) grid size to retrieve from bias_fps. If None then the full domain shape will be used.
base_handler (str) – Name of rex resource handler or sup3r.preprocessing.data_handlers class to be retrieved from the rex/sup3r library. If a sup3r.preprocessing.data_handlers class is used, all data will be loaded in this class’ initialization and the subsequent bias calculation will be done in serial
bias_handler (str) – Name of the bias data handler class to be retrieved from the sup3r.preprocessing.data_handlers library.
base_handler_kwargs (dict | None) – Optional kwargs to send to the initialization of the base_handler class
bias_handler_kwargs (dict | None) – Optional kwargs to send to the initialization of the bias_handler class with the bias_fps
bias_fut_handler_kwargs (dict | None) – Optional kwargs to send to the initialization of the bias_handler class with the bias_fut_fps
decimals (int | None) – Option to round bias and base data to this number of decimals, this gets passed to np.around(). If decimals is negative, it specifies the number of positions to the left of the decimal point.
match_zero_rate (bool) – Option to fix the frequency of zero values in the biased data. The lowest percentile of values in the biased data will be set to zero to match the percentile of zeros in the base data. If SkillAssessment is being run and this is True, the distributions will not be mean-centered. This helps resolve the issue where global climate models produce too many days with small precipitation totals, known as the “drizzle problem” [Polade2014].
dist (str, default=“empirical”) – Define the type of distribution, which can be “empirical” or any parametric distribution defined in “scipy”.
n_quantiles (int, default=101) – Defines the number of quantiles (between 0 and 1) for an empirical distribution.
sampling (str, default=“linear”) – Defines how the quantiles are sampled. For instance, ‘linear’ will result in linearly spaced quantiles. Other options are ‘log’ and ‘invlog’.
log_base (int or float, default=10) – Log base value if sampling is “log” or “invlog”.
n_time_steps (int) – Number of times to calculate QDM parameters equally distributed along a year. For instance, n_time_steps=1 results in a single set of parameters while n_time_steps=12 is approximately every month.
window_size (int) – Total time window period in days to be considered for each time QDM is calculated. For instance, window_size=30 with n_time_steps=12 would result in approximately monthly estimates.
pre_load (bool) – Flag to preload all data needed for bias correction. This is currently recommended to improve performance with the new sup3r data handler access patterns
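
The effect of match_zero_rate described above can be sketched as follows. This is only an illustration under stated assumptions: the helper name `match_zero_rate_sketch` is hypothetical, the inputs are assumed to be 1D arrays without NaN values, and the library's own implementation may select the values to zero out differently.

```python
import numpy as np


def match_zero_rate_sketch(bias_data, base_data):
    """Hypothetical helper: zero out the smallest bias values so that the
    fraction of zeros matches the fraction of zeros in the base data."""
    base_data = np.asarray(base_data)
    out = np.array(bias_data, dtype=float)  # work on a copy

    # Fraction of exact zeros in the base (reference) data.
    zero_rate = np.count_nonzero(base_data == 0) / base_data.size
    # Number of bias values that must be zero to reach the same rate.
    n_zero = int(round(zero_rate * out.size))

    # Indices of the n_zero smallest values in the biased data.
    idx = np.argsort(out, kind="stable")[:n_zero]
    out[idx] = 0.0
    return out
```

For precipitation, this turns the smallest modeled totals (the "drizzle" days) into dry days until the dry-day frequency agrees with the reference data.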

See also

sup3r.bias.bias_transforms.local_qdm_bc: Bias correction using QDM.
sup3r.preprocessing.data_handlers.DataHandler: Bias correction using QDM directly from a derived handler.
rex.utilities.bc_utils.QuantileDeltaMapping: Quantile Delta Mapping method and support functions. Since rex.utilities.bc_utils is used here, the arguments dist, n_quantiles, sampling, and log_base must be consistent with that package/module.

Notes

One way of using this class is to save the distribution definitions obtained here with the method write_outputs() and then use that file with local_qdm_bc() or through a derived DataHandler. ATTENTION: handle that parameter file with care. There is no checking process, so a correction estimated for one dataset could be misused on the wrong dataset.

References

[Cannon2015]

Cannon, A. J., Sobie, S. R., & Murdock, T. Q. (2015). Bias correction of GCM precipitation by quantile mapping: how well do methods preserve changes in quantiles and extremes? Journal of Climate, 28(17), 6938-6959.

Methods

`compare_dists`(base_data, bias_data[, adder, ...])	Compare two distributions using the two-sample Kolmogorov-Smirnov.
`fill_and_smooth`(out[, fill_extend, ...])	For a given set of parameters, fill and extend missing positions
`get_base_data`(base_fps, base_dset, base_gid, ...)	Get data from the baseline data source, possibly for many high-res base gids corresponding to a single coarse low-res bias gid.
`get_base_gid`(bias_gid)	Get one or more base gid(s) corresponding to a bias gid.
`get_bias_data`(bias_gid[, bias_dh])	Get data from the biased data source for a single gid
`get_bias_gid`(coord)	Get the bias gid from a coordinate.
`get_data_pair`(coord[, daily_reduction])	Get base and bias data observations based on a single bias gid.
`get_node_cmd`(config)	Get a CLI call to call cls.run() on a single node based on an input config.
`get_qdm_params`(bias_data, bias_fut_data, ...)	Get quantiles' cut point for given datasets
`pre_load`()	Preload all data needed for bias correction.
`run`([fp_out, max_workers, daily_reduction, ...])	Estimate the statistical distributions for each location
`window_mask`(doy, d0, window_size)	An index of elements within a given time window
`write_outputs`(fp_out[, out])	Write outputs to an .h5 file.
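
As an illustration of how n_time_steps and window_size interact with a time window such as the one selected by window_mask, the sketch below builds a day-of-year mask that wraps around the end of the year. The helper `window_mask_sketch`, the 365-day year, and the placement of the window centers are assumptions made for this example and may differ from what the class does internally.

```python
import numpy as np


def window_mask_sketch(doy, d0, window_size, days_in_year=365):
    """Hypothetical helper: True where the day of year `doy` lies within a
    window of `window_size` days centered on day `d0`, wrapping around the
    end of the year."""
    doy = np.asarray(doy)
    # Circular distance in days between each day and the window center.
    delta = np.abs(doy - d0)
    delta = np.minimum(delta, days_in_year - delta)
    return delta <= window_size / 2


# Assumed layout: window centers equally spaced along the year.
n_time_steps = 12
centers = (np.arange(n_time_steps) + 0.5) * 365 / n_time_steps

# One mask per time step, each selecting about 30 days of observations.
doy = np.arange(1, 366)
masks = [window_mask_sketch(doy, d0, window_size=30) for d0 in centers]
```

With window_size larger than the spacing between centers, consecutive windows overlap, which smooths the estimated parameters across the year at the cost of less temporal detail.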

Attributes

`distance_upper_bound`	Maximum distance (float) to map high-resolution data from exo_source to the low-resolution file_paths input.
`meta`	Get a meta data dictionary on how these bias factors were calculated

pre_load()

Preload all data needed for bias correction. This is currently recommended to improve performance with the new sup3r data handler access patterns.

static get_qdm_params(bias_data, bias_fut_data, base_data, bias_feature, base_dset, sampling, n_samples, log_base)

Get the quantiles’ cut points for the given datasets

Estimate the quantiles’ cut points for each of the three given datasets. When no good analytical approximation is available, such as one of the parametric distributions, these quantiles can be used to approximate the statistical distribution of those datasets.

Parameters:

bias_data (np.ndarray) – 1D array of biased data observations.
bias_fut_data (np.ndarray) – 1D array of biased data observations for the future (target) period.
base_data (np.ndarray) – 1D array of base data observations.
bias_feature (str) – This is the biased feature from bias_fps to retrieve. This should be a single feature name corresponding to base_dset.
base_dset (str) – A single dataset from the base_fps to retrieve. In the case of wind components, this can be U_100m or V_100m which will retrieve windspeed and winddirection and derive the U/V component.
sampling (str) – Defines how the quantiles are sampled. For instance, ‘linear’ will result in linearly spaced quantiles. Other options are ‘log’ and ‘invlog’.
n_samples (int) – Number of points to sample between 0 and 1, i.e. number of quantiles.
log_base (int | float) – Log base value.

Returns:

out (dict) – Dictionary of the quantiles’ cut points. Note that to make sense of those cut point values, one needs to know the given arguments such as log_base. For instance, if sampling was linear, the sequence [-1, 0, 2] gives the minimum, median, and maximum values, respectively. The expected keys are “bias_{bias_feature}_params”, “bias_fut_{bias_feature}_params”, and “base_{base_dset}_params”.
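
The structure of the returned dictionary can be illustrated with a small NumPy sketch. The helpers `sample_quantiles` and `qdm_params_sketch` are hypothetical, and the way the ‘log’ and ‘invlog’ grids are built here is only one plausible choice; the actual sampling is defined by rex.utilities.bc_utils and may differ in detail.

```python
import numpy as np


def sample_quantiles(sampling, n_samples, log_base=10):
    """Hypothetical helper: probabilities in [0, 1] where cuts are taken."""
    if sampling == "linear":
        return np.linspace(0, 1, n_samples)
    # Log-spaced points rescaled to [0, 1], denser near 0.
    q = (np.logspace(0, 1, n_samples, base=log_base) - 1) / (log_base - 1)
    q = np.clip(q, 0, 1)
    if sampling == "log":
        return q
    if sampling == "invlog":
        # Mirror image of the log grid, denser near 1.
        return np.sort(1 - q)
    raise ValueError(f"Unknown sampling: {sampling}")


def qdm_params_sketch(bias_data, bias_fut_data, base_data, bias_feature,
                      base_dset, sampling="linear", n_samples=101,
                      log_base=10):
    """Hypothetical helper: quantile cut points for the three datasets."""
    q = sample_quantiles(sampling, n_samples, log_base)
    return {
        f"bias_{bias_feature}_params": np.quantile(bias_data, q),
        f"bias_fut_{bias_feature}_params": np.quantile(bias_fut_data, q),
        f"base_{base_dset}_params": np.quantile(base_data, q),
    }
```

The same cut point values correspond to different probabilities under each sampling scheme, which is why the arguments used to estimate them need to be kept together with the parameters.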
