sup3r.preprocessing.batch_handling.BatchHandlerSpatialDC

class BatchHandlerSpatialDC(*args, **kwargs)

Bases: BatchHandler

Data-centric batch handler

Parameters:
  • *args (list) – Same positional args as BatchHandler

  • **kwargs (dict) – Same keyword args as BatchHandler

Methods

cache_stats()

Save stdevs and means to cache files if files are not None

check_cached_stats()

Get standard deviations and means for all data features from cache files if available.

get_handler_index()

Get random handler index based on handler weights

get_rand_handler()

Get random handler based on handler weights

get_stats()

Get standard deviations and means for all data features

load_handler_data()

Load data handler data in parallel or serial

normalize([means, stds])

Compute means and stds for each feature across all datasets and normalize each data handler dataset.

update_training_sample_record()

Keep track of number of observations from each temporal bin

Attributes

feature_mem

Get memory used by each feature in data handlers

features

Get the ordered list of feature names held in this object's data handlers

handler_weights

Get weights used to sample from different data handlers based on relative sizes

hr_exo_features

Get a list of high-resolution features that are only used for training, e.g., mid-network high-res topo injection.

hr_features_ind

Get the high-resolution feature channel indices that should be included for training.

hr_out_features

Get a list of high-resolution features that are intended to be output by the GAN.

load_workers

Get max workers for loading data handler based on memory usage

lr_features

Get a list of low-resolution features.

norm_workers

Get max workers used for calculation and normalization across features

shape

Shape of full dataset across all handlers

stats_workers

Get max workers for calculating stats based on memory usage

VAL_CLASS

alias of ValidationDataSpatialDC

BATCH_CLASS

alias of Batch

DATA_HANDLER_CLASS

alias of DataHandlerDCforH5

cache_stats()

Save stdevs and means to cache files if files are not None

check_cached_stats()

Get standard deviations and means for all data features from cache files if available.

Returns:

  • means (dict | None) – Dictionary of means for all features, with feature names as keys and mean values as values. If None, this will be calculated. If norm is True, these will be used for data normalization.

  • stds (dict | None) – Dictionary of standard deviations for all features, with feature names as keys and standard deviations as values. If None, this will be calculated. If norm is True, these will be used for data normalization.

property feature_mem

Get memory used by each feature in data handlers

property features

Get the ordered list of feature names held in this object’s data handlers

get_handler_index()

Get random handler index based on handler weights

get_rand_handler()

Get random handler based on handler weights

get_stats()

Get standard deviations and means for all data features

property handler_weights

Get weights used to sample from different data handlers based on relative sizes
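
As an illustration of size-based weighting and the weighted draw described for get_handler_index(), the following is a minimal sketch using only the Python standard library. It is not the sup3r implementation: the function names, the example sizes, and the choice of weights exactly proportional to size are assumptions made for this example.

```python
import random


def relative_size_weights(sizes):
    """Return sampling weights proportional to each handler's size."""
    total = sum(sizes)
    return [size / total for size in sizes]


def sample_handler_index(weights, rng):
    """Draw one handler index with probability given by its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


# Hypothetical sizes (e.g. number of array elements) for three handlers
sizes = [100, 300, 600]
weights = relative_size_weights(sizes)

# Seeded generator so the draw is reproducible
rng = random.Random(0)
index = sample_handler_index(weights, rng)
```

With weights of this form, a handler holding 60% of the data is drawn roughly 60% of the time.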

property hr_exo_features

Get a list of high-resolution features that are only used for training, e.g., mid-network high-res topo injection.

property hr_features_ind

Get the high-resolution feature channel indices that should be included for training. Any high-resolution features that are only included in the data handler to be coarsened for the low-res input are removed.
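
The index selection described above can be sketched as follows. This is an assumption about how such indices might be derived from the features, hr_out_features, and hr_exo_features lists, not the library's own code; the helper name and the feature names are hypothetical.

```python
def high_res_feature_indices(features, hr_out_features, hr_exo_features):
    """Return channel indices of features kept in the high-res training data.

    Features that appear in neither list are assumed to exist only so they
    can be coarsened into the low-res input, and are therefore dropped.
    """
    keep = set(hr_out_features) | set(hr_exo_features)
    return [i for i, name in enumerate(features) if name in keep]


# Hypothetical feature names, for illustration only
features = ["U_100m", "V_100m", "BVF2_200m", "topography"]
hr_out = ["U_100m", "V_100m"]
hr_exo = ["topography"]

indices = high_res_feature_indices(features, hr_out, hr_exo)
```

Here "BVF2_200m" (index 2) is excluded because it is neither an output feature nor a high-res exogenous feature.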

property hr_out_features

Get a list of high-resolution features that are intended to be output by the GAN.

load_handler_data()

Load data handler data in parallel or serial

property load_workers

Get max workers for loading data handler based on memory usage

property lr_features

Get a list of low-resolution features. All low-resolution features are used for training.

property norm_workers

Get max workers used for calculation and normalization across features

normalize(means=None, stds=None)

Compute means and stds for each feature across all datasets and normalize each data handler dataset. Checks whether the input means and stds differ from the stored means and stds, and renormalizes if they do.

Parameters:
  • means (dict | None) – Dictionary of means for all features, with feature names as keys and mean values as values. If None, this will be calculated. If norm is True, these will be used for data normalization.

  • stds (dict | None) – Dictionary of standard deviations for all features, with feature names as keys and standard deviations as values. If None, this will be calculated. If norm is True, these will be used for data normalization.

  • features (list | None) – Optional list of features used to index data array during normalization. If this is None self.features will be used.
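
To make the means and stds dictionary format concrete, here is a minimal standard-library sketch of per-feature standardization. It does not call sup3r: the helper names, the example feature names, the row-based data layout, and the use of the population standard deviation are all assumptions for illustration.

```python
import statistics


def compute_stats(rows, features):
    """Compute per-feature means and (population) standard deviations.

    rows is a list of samples; each sample lists one value per feature,
    in the same order as features.
    """
    columns = list(zip(*rows))
    means = {name: statistics.fmean(col) for name, col in zip(features, columns)}
    stds = {name: statistics.pstdev(col) for name, col in zip(features, columns)}
    return means, stds


def normalize_rows(rows, features, means=None, stds=None):
    """Standardize each feature as (value - mean) / std.

    Statistics are calculated from the data when means or stds is None.
    """
    if means is None or stds is None:
        means, stds = compute_stats(rows, features)
    return [
        [(value - means[name]) / stds[name] for name, value in zip(features, row)]
        for row in rows
    ]


# Hypothetical feature names and values
features = ["windspeed", "temperature"]
rows = [[4.0, 280.0], [6.0, 290.0]]

means, stds = compute_stats(rows, features)
normalized = normalize_rows(rows, features, means, stds)
```

Passing precomputed dictionaries, as in the last line, mirrors the case where cached statistics are reused instead of being recalculated.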

property shape

Shape of full dataset across all handlers

Returns:

shape (tuple) – (spatial_1, spatial_2, temporal, features), with spatiotemporal extent equal to the sum across all data handler dimensions.

property stats_workers

Get max workers for calculating stats based on memory usage

update_training_sample_record()

Keep track of number of observations from each temporal bin
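
A per-bin record of this kind can be sketched as a simple list of counters. The function below is hypothetical and only illustrates the bookkeeping; how sup3r identifies the temporal bin of each observation, and what it does with the record, is not shown here.

```python
def update_sample_record(record, bin_index, n_observations=1):
    """Add n_observations to the running count for one temporal bin."""
    record[bin_index] += n_observations
    return record


# Hypothetical setup: four temporal bins, all starting at zero
record = [0] * 4

# Hypothetical sequence of bins that training observations were drawn from
for bin_index in [0, 2, 2, 3]:
    update_sample_record(record, bin_index)
```

After these four updates the record holds one observation for bins 0 and 3, two for bin 2, and none for bin 1.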