nsrdb.data_model.maiac_aod.MaiacVar

class MaiacVar(name, var_meta, date, **kwargs)[source]

Bases: AncillaryVarHandler

Framework for MAIAC AOD source data extraction.

Parameters:
  • name (str) – NSRDB var name.

  • var_meta (str | pd.DataFrame | None) – CSV file or dataframe containing meta data for all NSRDB variables. Defaults to the NSRDB var meta csv in git repo.

  • date (datetime.date) – Single day to extract data for.

Methods

pre_flight()

Perform pre-flight checks - source file check.

scale_data(array)

Perform a safe data scaling operation on a source data array.

unscale_data(array)

Perform a safe data unscaling operation on a source data array.

Attributes

DEFAULT_DIR

NN_METHOD

attrs

Return a dictionary of dataset attributes for HDF5 dataset attrs.

cache_file

Get the nearest neighbor result cache csv file for this var.

chunks

Get the variable's intended storage chunk shape.

data_source

Get the data source.

date

Get the date for this handler.

description

Long variable description.

doy

Get the day of year for daily MAIAC AOD data.

doy_index

Get the zero-indexed day of year, which is one less than doy.

dset_name

Get the source dataset name for the NSRDB variable.

dtype

Get the data type attribute.

elevation_correct

Get the elevation correction preference.

file

Get the file paths for the target NSRDB variable name based on the glob self.pattern.

file_set

Get the source file set name for the NSRDB variable.

files

Get multiple MAIAC AOD filepaths based on source_dir and year.

final_dtype

Get the variable's intended storage datatype.

grid

Return the MAIAC AOD source coordinates.

mask

Get a boolean mask to locate the current variable in the meta data.

name

Get the NSRDB variable name.

next_date

Get the date after the date for this handler.

next_file

Get the file path for the next date for the target NSRDB variable name based on the glob self.next_pattern.

next_file_exists

Check if the file for the next date exists.

next_pattern

Get the next date source file pattern which is sent to glob().

pattern

Get the source file pattern which is sent to glob().

physical_max

Get the variable's physical maximum value.

physical_min

Get the variable's physical minimum value.

scale_factor

Get the variable's intended storage scale factor.

source_data

Get a flat (1, n) array of data for a single day of MAIAC AOD.

source_dir

Get the source directory containing the variable data files.

spatial_method

Get the spatial interpolation method.

temporal_method

Get the temporal interpolation method.

time_index

Get the AOD native time index.

units

Get the units attribute.

var_meta

Return the meta data for NSRDB variables.

property doy_index

Get the zero-indexed day of year, which is one less than doy.

Returns:

index (int) – Zero-indexed doy (0-364 or 0-365)

property doy

Get the day of year for daily MAIAC AOD data.

Returns:

doy (int) – Day of year integer (1-365 or 1-366).
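
The relationship between doy and doy_index can be illustrated with the standard library. This is a minimal sketch, not the handler's own code: the helper names day_of_year and day_of_year_index are hypothetical and only mirror the documented return ranges.

```python
import datetime


def day_of_year(date):
    """Return the 1-based day of year (1-365, or 1-366 in a leap year)."""
    return date.timetuple().tm_yday


def day_of_year_index(date):
    """Return the zero-indexed day of year, one less than day_of_year()."""
    return day_of_year(date) - 1


# 2020 is a leap year, so December 31 is day 366 and index 365.
leap_end = datetime.date(2020, 12, 31)
print(day_of_year(leap_end), day_of_year_index(leap_end))
```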

property time_index

Get the AOD native time index.

Returns:

ti (pd.DatetimeIndex) – Pandas datetime index for the current day at the MAIAC resolution (1-day).

property pattern

Get the source file pattern which is sent to glob().

Returns:

str

property file

Get the file paths for the target NSRDB variable name based on the glob self.pattern.

Returns:

list

property files

Get multiple MAIAC AOD filepaths based on source_dir and year.

Returns:

list

pre_flight()[source]

Perform pre-flight checks - source file check.

Returns:

missing (str) – String identifying the source file if it is not found. If nothing is missing, an empty string is returned.
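
The return convention (empty string when nothing is missing) could be sketched as follows. This is an illustration only: the helper find_missing, the choice to return the glob pattern as the "missing" string, and the file names are all assumptions, not the handler's actual implementation.

```python
import glob
import os
import tempfile


def find_missing(pattern):
    """Return '' if glob finds at least one match, else the pattern itself.

    Hypothetical helper: returning the pattern as the "missing" string is an
    assumption made for this sketch.
    """
    return "" if glob.glob(pattern) else pattern


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical source file name, created only for the demonstration.
    open(os.path.join(tmp, "aod_2020001.h5"), "w").close()
    found = find_missing(os.path.join(tmp, "aod_2020001*"))
    not_found = find_missing(os.path.join(tmp, "aod_2020002*"))
    print(repr(found))      # empty string: nothing is missing
    print(bool(not_found))  # non-empty string: a source file is missing
```

A caller can then treat the result as a truth value, for example raising an error only when the returned string is non-empty.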

property source_data

Get a flat (1, n) array of data for a single day of MAIAC AOD.

Returns:

data (np.ndarray) – 2D numpy array (1, n) of MAIAC data for the specified var for a given day.
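
The (1, n) shape means one row for the single day and one column per source pixel. A minimal sketch of producing that shape, assuming the raw source is a 2D gridded array (an assumption for illustration; flatten_day is a hypothetical helper and the real pixel ordering must match the grid property):

```python
import numpy as np


def flatten_day(grid_data):
    """Reshape gridded data for one day into a flat (1, n) array."""
    return np.asarray(grid_data).reshape(1, -1)


# A made-up 2 x 3 grid becomes a single row of 6 pixels.
raw = np.arange(6, dtype=np.float32).reshape(2, 3)
flat = flatten_day(raw)
print(flat.shape)
```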

property grid

Return the MAIAC AOD source coordinates.

Returns:

self._grid (pd.DataFrame) – MAIAC source coordinates (latitude, longitude) without elevation

property attrs

Return a dictionary of dataset attributes for HDF5 dataset attrs.

Returns:

attrs (dict) – Namespace of attributes to define the dataset.

property cache_file

Get the nearest neighbor result cache csv file for this var.

Returns:

_cache_file (False | str) – False for no caching, or a string filename (no path).

property chunks

Get the variable’s intended storage chunk shape.

Returns:

chunks (tuple) – Data storage chunk shape (row_chunk, col_chunk).

property data_source

Get the data source.

Returns:

data_source (str) – Data source.

property date

Get the date for this handler.

Returns:

datetime.date

property description

Long variable description.

Returns:

description (str) – Description of the variable to provide more info than the sometimes opaque dset names.

property dset_name

Get the source dataset name for the NSRDB variable. This is typically the netCDF or HDF5 source dataset name for the variable, such as T2M or TOTANGSTR (for MERRA temperature and alpha).

Returns:

str

property dtype

Get the data type attribute.

Returns:

dtype (str) – Intended NSRDB disk data type.

property elevation_correct

Get the elevation correction preference.

Returns:

elevation_correct (bool) – Whether or not to use elevation correction for the current var.

property file_set

Get the source file set name for the NSRDB variable. This is typically used for MERRA source file sets such as tavg1_2d_aer_Nx or tavg1_2d_slv_Nx.

Returns:

str

property final_dtype

Get the variable’s intended storage datatype.

Returns:

dtype (str) – Data type for the current variable.

property mask

Get a boolean mask to locate the current variable in the meta data.

property name

Get the NSRDB variable name.

property next_date

Get the date after the date for this handler. This is used to get the data for the next date for temporal interpolation.

Returns:

datetime.date
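
For daily data, "the date after" can be computed with a one-day timedelta. This is a sketch under the assumption that the next date is simply one calendar day later; next_day is a hypothetical helper, not the property itself.

```python
import datetime


def next_day(date):
    """Return the calendar day following the given datetime.date."""
    return date + datetime.timedelta(days=1)


# The rollover across a year boundary is handled by datetime itself.
print(next_day(datetime.date(2020, 12, 31)))  # 2021-01-01
```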

property next_file

Get the file path for the next date for the target NSRDB variable name based on the glob self.next_pattern. The file is used to get the data for the next date for temporal interpolation.

Returns:

str

property next_file_exists

Check if the file for the next date exists.

property next_pattern

Get the next date source file pattern which is sent to glob().

Returns:

str | None

property physical_max

Get the variable’s physical maximum value.

Returns:

physical_max (float) – Physical maximum value for the variable. Variable range can be truncated at this value. Must be consistent with the final dtype and scale factor.

property physical_min

Get the variable’s physical minimum value.

Returns:

physical_min (float) – Physical minimum value for the variable. Variable range can be truncated at this value. Must be consistent with the final dtype and scale factor.

scale_data(array)

Perform a safe data scaling operation on a source data array.

Steps:
  1. Enforce physical range limits

  2. Apply scale factor (multiply)

  3. Round if integer

  4. Enforce dtype bit range limits

  5. Perform dtype conversion

  6. Return manipulated array

Parameters:

array (np.ndarray) – Source data array with full precision (likely float32).

Returns:

array (np.ndarray) – Source data array with final datatype.
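
The six steps above can be sketched with NumPy. This is an illustrative stand-alone function, not the method's actual implementation: scale_data_sketch is a hypothetical name, and its extra arguments stand in for the handler's scale_factor, final_dtype, physical_min, and physical_max properties. The sketch also assumes the bit range check only applies to integer dtypes.

```python
import numpy as np


def scale_data_sketch(array, scale_factor, final_dtype,
                      physical_min, physical_max):
    """Clip, scale, round, and cast an array following the listed steps."""
    final_dtype = np.dtype(final_dtype)

    # 1. Enforce physical range limits.
    array = np.clip(array, physical_min, physical_max)

    # 2. Apply scale factor (multiply).
    array = array * scale_factor

    if np.issubdtype(final_dtype, np.integer):
        # 3. Round if integer.
        array = np.round(array)
        # 4. Enforce dtype bit range limits so the cast cannot wrap around.
        info = np.iinfo(final_dtype)
        array = np.clip(array, info.min, info.max)

    # 5. Perform dtype conversion, and 6. return the manipulated array.
    return array.astype(final_dtype)


raw = np.array([[-0.5, 0.125, 7.0]], dtype=np.float32)
scaled = scale_data_sketch(raw, scale_factor=1000, final_dtype="uint16",
                           physical_min=0, physical_max=5)
print(scaled.tolist())  # [[0, 125, 5000]]
```

Clipping before the cast matters because converting an out-of-range float to an integer dtype does not saturate reliably; it can wrap or produce undefined values.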

property scale_factor

Get the variable’s intended storage scale factor.

Returns:

scale_factor (float) – Scale factor for the current variable. Data is multiplied by this scale factor before being stored.

property source_dir

Get the source directory containing the variable data files.

Returns:

source_dir (str) – Directory containing source data files (with possible sub folders).

property spatial_method

Get the spatial interpolation method.

Returns:

spatial_method (str) – NN or IDW

property temporal_method

Get the temporal interpolation method.

Returns:

temporal_method (str) – linear or nearest

property units

Get the units attribute.

Returns:

units (str) – NSRDB variable units.

unscale_data(array)

Perform a safe data unscaling operation on a source data array.

Parameters:

array (np.ndarray) – Scaled source data array with integer precision.

Returns:

array (np.ndarray) – Unscaled source data array with float32 precision.
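
Since stored data is multiplied by the scale factor, unscaling is a division after casting to float32. A minimal sketch under that assumption; unscale_data_sketch is a hypothetical stand-alone function and scale_factor is passed in explicitly rather than read from the handler.

```python
import numpy as np


def unscale_data_sketch(array, scale_factor):
    """Convert scaled integer data back to float32 physical values."""
    # Cast first so the division is done in floating point, then make sure
    # the result stays float32 regardless of the scale factor's type.
    return (array.astype(np.float32) / scale_factor).astype(np.float32)


stored = np.array([[0, 125, 5000]], dtype=np.uint16)
unscaled = unscale_data_sketch(stored, scale_factor=1000)
print(unscaled.dtype, unscaled.tolist())  # float32 [[0.0, 0.125, 5.0]]
```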

property var_meta

Return the meta data for NSRDB variables.

Returns:

_var_meta (pd.DataFrame) – Meta data for NSRDB variables.