nsrdb.data_model.clouds.CloudVarSingleNC

class CloudVarSingleNC(fpath, pre_proc_flag=True, index=None, dsets=('cloud_type', 'cld_opd_dcomp', 'cld_reff_dcomp', 'cld_press_acha'), parallax_correct=True, solar_shading=True, remap_pc=True)

Bases: CloudVarSingle

Framework for .nc single-file/single-timestep cloud data extraction.

Parameters:

fpath (str) – Full filepath for the cloud data at a single timestep.
pre_proc_flag (bool) – Flag to pre-process and sparsify data.
index (np.ndarray) – Nearest neighbor results array to extract a subset of the data.
dsets (tuple | list) – Source datasets to extract.
parallax_correct (bool) – Flag to adjust cloud coordinates so that clouds sit directly above their coordinates rather than at the apparent location seen from the sensor.
solar_shading (bool) – Flag to adjust cloud coordinates so that clouds are assigned to the coordinates they shade.
remap_pc (bool) – Flag to remap the parallax-corrected and solar-shading-corrected data back onto the original semi-regular GOES coordinates.

Methods

`clean_attrs`()	Try to clean unnecessary object attributes to reduce memory usage
`correct_coordinates`(fpath, grid, sparse_mask)	Adjust grid lat/lon values based on solar position
`get_dset`(dset)	Get a single dataset from the source cloud data file.
`pre_process`(dset, data[, fill_value, ...])	Pre-process cloud data by filling missing values and unscaling.
`remap_pc_coords`()	Remap the parallax/shading corrected coordinates back onto the original "raw" coordinate system and set internal variables to do the same for the cloud data when processed through get_dset() and self.source_data
`remap_pc_data`(data)	Perform remapping of parallax/shading corrected data onto the raw/original cloud coordinate system including overlaying cloud shadow data over clear data.

Attributes

`GRID_LABELS`
`dsets`	Get a list of the available datasets in the cloud file.
`fpath`	Get the full file path for this cloud data timestep.
`grid`	Return the cloud data grid for the current timestep.
`source_data`	Get multiple-variable data dictionary from the cloud data file.
`tree`	Get the KDTree for the cloud data coordinates.

property dsets: Get a list of the available datasets in the cloud file.

static correct_coordinates(fpath, grid, sparse_mask, parallax_correct=True, solar_shading=True)

Adjust grid lat/lon values based on solar position

Parameters:

fpath (str) – Filepath to the cloud .nc file containing the datasets required for the solar position (solpo) coordinate adjustment.
grid (dict) – Dictionary with latitude and longitude keys and corresponding numpy array values.
sparse_mask (np.ndarray) – Boolean array to mask the native dataset shapes from fpath.
parallax_correct (bool) – Flag to adjust cloud coordinates so that clouds sit directly above their coordinates rather than at the apparent location seen from the sensor.
solar_shading (bool) – Flag to adjust cloud coordinates so that clouds are assigned to the coordinates they shade.

Returns:

grid (dict) – Dictionary with latitude and longitude keys and corresponding numpy array values. Coordinates are adjusted for solar position so that clouds are linked to the coordinate that they are shading.
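The solar-shading idea can be pictured with simple flat-terrain geometry. The sketch below is not the nsrdb implementation: the function name `shade_shift_sketch`, its inputs (cloud height, solar zenith and azimuth angles), and the spherical-Earth radius are all assumptions chosen for illustration. It only shows how a cloud's coordinate could be moved to the point its shadow falls on.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed mean spherical radius


def shade_shift_sketch(lat, lon, cloud_height_km, solar_zenith_deg,
                       solar_azimuth_deg):
    """Hypothetical helper: shift cloud lat/lon to the shaded ground point.

    Azimuth is assumed to be measured clockwise from north.
    """
    zen = np.radians(solar_zenith_deg)
    azi = np.radians(solar_azimuth_deg)

    # Horizontal distance between the sub-cloud point and its shadow
    dist_km = cloud_height_km * np.tan(zen)

    # The shadow falls on the side opposite the sun
    north_km = -dist_km * np.cos(azi)
    east_km = -dist_km * np.sin(azi)

    # Convert the km offsets to degrees (small-displacement approximation)
    dlat = np.degrees(north_km / EARTH_RADIUS_KM)
    dlon = np.degrees(east_km / (EARTH_RADIUS_KM * np.cos(np.radians(lat))))

    return lat + dlat, lon + dlon


# Sun due south at 45 degrees zenith: the shadow lands north of the cloud
shaded_lat, shaded_lon = shade_shift_sketch(0.0, 0.0, 10.0, 45.0, 180.0)
```

The same function accepts NumPy arrays for every argument, so a whole grid could be shifted in one call. A real correction would also need to handle large zenith angles, where `tan` grows without bound.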

static pre_process(dset, data, fill_value=None, sparse_mask=None, index=None)

Pre-process cloud data by filling missing values and unscaling.

Pre-processing steps (different for .nc vs .h5):

sparsify
flatten (ravel)
convert to float32 (unless dset == cloud_type)
convert filled values to NaN (unless dset == cloud_type)
extract only data at index

Parameters:

dset (str) – Dataset name.
data (np.ndarray) – Raw data extracted from the dataset in the cloud data source file. For the .nc files, this data is already unscaled.
fill_value (NoneType | int | float) – Value that was assigned if the data was missing. These entries in data will be converted to NaN if possible.
sparse_mask (NoneType | pd.Series) – Optional boolean mask to apply to the data to sparsify. For the .nc files, this is taken from the masked coordinate arrays.
index (np.ndarray) – Nearest neighbor results array to extract a subset of the data.

Returns:

data (np.ndarray) – Pre-processed data.
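The pre-processing steps listed above can be sketched in plain NumPy. This is an illustrative re-creation, not the nsrdb source: the name `pre_process_sketch` is hypothetical, and it assumes `sparse_mask` has the same shape as `data` and that `index` holds integer positions into the sparsified, flattened array.

```python
import numpy as np


def pre_process_sketch(dset, data, fill_value=None, sparse_mask=None,
                       index=None):
    """Hypothetical stand-in that follows the documented step order."""
    data = np.asarray(data)

    # Sparsify: keep only the pixels flagged True in the mask
    if sparse_mask is not None:
        data = data[np.asarray(sparse_mask, dtype=bool)]

    # Flatten to 1D (a no-op if the mask already produced a 1D array)
    data = data.ravel()

    # cloud_type is categorical, so it keeps its dtype and fill values
    if dset != "cloud_type":
        data = data.astype(np.float32)
        if fill_value is not None:
            data[data == fill_value] = np.nan

    # Extract only the entries selected by the nearest neighbor index
    if index is not None:
        data = data[index]

    return data


raw = np.array([[1.0, -999.0], [3.0, 4.0]])
mask = np.array([[True, True], [False, True]])
out = pre_process_sketch("cld_opd_dcomp", raw, fill_value=-999.0,
                         sparse_mask=mask, index=np.array([2, 0]))
```

Here the mask drops one pixel, the fill value becomes NaN, and `index` then picks positions 2 and 0 of the remaining three values.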

get_dset(dset)

Get a single dataset from the source cloud data file.

Parameters: dset (str) – Variable dataset name to retrieve from the cloud file.
Returns: dset (np.ndarray) – 1D array of flattened data that should match the self.grid meta data.

clean_attrs(): Try to clean unnecessary object attributes to reduce memory usage

property fpath: Get the full file path for this cloud data timestep.

property grid

Return the cloud data grid for the current timestep.

Returns: self._grid (pd.DataFrame | None) – GOES source coordinates (labels: ['latitude', 'longitude']). None if the dataset is bad.

remap_pc_coords(): Remap the parallax/shading corrected coordinates back onto the original “raw” coordinate system and set internal variables to do the same for the cloud data when processed through get_dset() and self.source_data

remap_pc_data(data)

Perform remapping of parallax/shading corrected data onto the raw/original cloud coordinate system including overlaying cloud shadow data over clear data.

Parameters: data (np.ndarray) – 1D array of flattened data based on the original coordinate system ordering from the cloud file, possibly sparsified by the pre-processing of NaN data/coordinates.
Returns: data (np.ndarray) – 1D array of flattened data that corresponds to the original coordinate system with no parallax/shading corrections, but re-arranged so that it reflects these coordinate adjustments.

property source_data

Get multiple-variable data dictionary from the cloud data file.

Returns: data (dict) – Dictionary of multiple cloud datasets. Keys are the cloud dataset names. Values are 1D (flattened/raveled) arrays of data.

property tree

Get the KDTree for the cloud data coordinates, e.g. cKDTree(self.grid[['latitude', 'longitude']])

Returns: cKDTree
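
A tree like this is the usual way to produce a nearest neighbor `index` array such as the one accepted by the constructor and by pre_process. The sketch below uses `scipy.spatial.cKDTree` on a tiny made-up coordinate set; the coordinate values, the target points, and the variable names are assumptions for illustration and do not come from nsrdb. Distances are Euclidean in degrees, which is only an approximation of true ground distance.

```python
import numpy as np
from scipy.spatial import cKDTree

# Made-up cloud pixel coordinates as (latitude, longitude) rows
cloud_coords = np.array([[39.0, -105.0],
                         [40.0, -105.0],
                         [39.0, -104.0]])
tree = cKDTree(cloud_coords)

# Hypothetical target sites that need cloud data
targets = np.array([[39.9, -105.1],
                    [39.1, -104.2]])

# For each target, find the position of the closest cloud pixel
dist, index = tree.query(targets, k=1)

# Use the index to pull a subset from a flattened dataset
values = np.array([10.0, 20.0, 30.0])
subset = values[index]
```

With `k=1` the query returns one distance and one integer position per target, so `index` can be used directly to slice any 1D array ordered like the grid.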