nsrdb.utilities.sky_class.SkyClass
- class SkyClass(fp_surf, fp_nsrdb, nsrdb_gid, clearsky_ratio=0.9, clear_time_frac=0.8, cloudy_time_frac=0.2, window_minutes=61, min_irradiance=0, sza_lim=89)[source]
Bases: object
Utility class for retrieving SURFRAD validation data alongside NSRDB data, determining the sky class by comparison to predicted clearsky irradiance, and providing data in a ready-to-validate dataframe.
- Parameters:
fp_surf (str) – Filepath to SURFRAD h5 file.
fp_nsrdb (str) – Filepath to NSRDB file. Can be a MultiFileResource path of the form /dir/prefix*suffix.h5
nsrdb_gid (int) – GID (meta data index) for the site of interest in the fp_nsrdb file that matches the fp_surf file.
clearsky_ratio (float) – Clearsky ratio (ground measurement / clearsky irradiance) above which a timestep is considered clear.
clear_time_frac (float) – Fraction of clear timesteps in an averaging window above which the whole window is considered clear. Windows with a clear fraction between cloudy_time_frac and clear_time_frac are considered broken clouds.
cloudy_time_frac (float) – Fraction of clear timesteps in an averaging window below which the whole window is considered cloudy. Windows with a clear fraction between cloudy_time_frac and clear_time_frac are considered broken clouds.
window_minutes (int) – Width in minutes of the moving-average window used for the sky classification. The window is sized taking into account the source time resolution of the SURFRAD measurements.
min_irradiance (float | int) – Minimum irradiance value. Timesteps with either ground-measured or NSRDB irradiance below this value are classified as missing.
sza_lim (int | float) – Maximum solar zenith angle. Timesteps with sza > sza_lim are classified as missing.
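The min_irradiance and sza_lim rules above can be sketched for a single timestep as follows. This is a minimal illustration in plain Python, not the library's code: the function name is_missing is hypothetical, and treating NaN or None inputs as missing is an assumption.

```python
import math


def is_missing(ghi_ground, ghi_nsrdb, sza, min_irradiance=0, sza_lim=89):
    """Return True if a timestep should be excluded from validation.

    Hypothetical helper (not part of nsrdb) that mirrors the documented
    min_irradiance and sza_lim rules for one timestep.
    """
    values = (ghi_ground, ghi_nsrdb, sza)
    # Assumption: any absent or NaN input makes the timestep missing.
    if any(v is None or math.isnan(v) for v in values):
        return True
    # Either irradiance source below the minimum -> missing.
    if ghi_ground < min_irradiance or ghi_nsrdb < min_irradiance:
        return True
    # Sun too close to (or below) the horizon -> missing.
    return sza > sza_lim
```

With the default sza_lim=89, a timestep at a zenith angle of exactly 89 degrees is kept, because the documented condition is sza > sza_lim.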
Methods
Add NSRDB and SURFRAD ghi and dni data to a DataFrame.
calculate_sky_class(df) – Calculate the sky class (clear, cloudy, broken, missing) from the comparison df.
get_comparison_df() – Get a timeseries dataframe comparing the ground-measured GHI vs. the clearsky (REST2) GHI.
get_rest_inputs() – Get a dataframe of NSRDB variables required to run REST2.
run(fp_surf, fp_nsrdb, nsrdb_gid[, ...])
run_rest(rest_inputs) – Run REST2 using a dataframe of input data and return clearsky GHI.
Attributes
ALIASES
REST_VARS
nsrdb – Get the initialized Resource or MultiFileResource handler
nsrdb_time_index – Get the datetimeindex from the nsrdb h5 file
surf_ghi – Get the surfrad ghi data with negative values as NaN
surf_time_index – Get the datetimeindex from the surfrad h5 file
surfrad – Get the initialized Surfrad handler
- property surfrad
Get the initialized Surfrad handler
- property surf_time_index
Get the datetimeindex from the surfrad h5 file
- property nsrdb
Get the initialized Resource or MultiFileResource handler
- property nsrdb_time_index
Get the datetimeindex from the nsrdb h5 file
- property surf_ghi
Get the surfrad ghi data with negative values as NaN
- get_rest_inputs()[source]
Get a dataframe of NSRDB variables required to run REST2.
- Returns:
rest_inputs (pd.DataFrame) – Timeseries data with time_index from the surfrad data (might be missing time steps) and data columns for each input variable required by REST and some extras (e.g. air_temperature).
- run_rest(rest_inputs)[source]
Run REST2 using a dataframe of input data and return clearsky GHI.
- Parameters:
rest_inputs (pd.DataFrame) – Timeseries data with time_index from the surfrad data (might be missing time steps) and data columns for each input variable required by REST and some extras (e.g. air_temperature).
- Returns:
ghi (np.ndarray) – 2D (time, 1) array of clearsky GHI values calculated by REST2.
- get_comparison_df()[source]
Get a timeseries dataframe comparing the ground-measured GHI vs. the clearsky (REST2) GHI at the ground-measurement time_index.
- Returns:
df (pd.DataFrame) – Timeseries data with time_index from the surfrad data (might be missing time steps) and data columns for ghi_rest (clearsky), ghi_ground (surfrad), and “clear”, where clear is a boolean (1 for clear) with float dtype so it can have NaN values where ground measurements are missing.
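As an illustration of the "clear" column described above, the following sketch computes a float flag from one pair of ground and clearsky GHI values. The function name clear_flag is hypothetical, and both the strict comparison at the threshold and the NaN result for a non-positive clearsky GHI are assumptions, not documented behavior.

```python
import math


def clear_flag(ghi_ground, ghi_rest, clearsky_ratio=0.9):
    """Return 1.0 (clear), 0.0 (not clear) or NaN (no usable measurement).

    Hypothetical helper illustrating a float-typed boolean that can hold NaN.
    """
    # Missing ground measurement -> flag stays NaN.
    if math.isnan(ghi_ground) or math.isnan(ghi_rest):
        return float("nan")
    # Assumption: the ratio is undefined when the clearsky GHI is not positive.
    if ghi_rest <= 0:
        return float("nan")
    ratio = ghi_ground / ghi_rest
    # Assumption: "above" is read as a strict comparison.
    return 1.0 if ratio > clearsky_ratio else 0.0
```

Storing the flag as a float rather than a bool is what allows missing ground measurements to propagate as NaN into later window averages.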
- calculate_sky_class(df)[source]
Calculate the sky class (clear, cloudy, broken, missing) from the comparison df.
- Parameters:
df (pd.DataFrame) – Timeseries data with time_index from the surfrad data (might be missing time steps) and data columns for ghi_rest (clearsky), ghi_ground (surfrad), and “clear”, where clear is a boolean (1 for clear) with float dtype so it can have NaN values where ground measurements are missing.
- Returns:
df (pd.DataFrame) – Same as input but with a new column “sky_class” with values (clear, cloudy, broken, missing) calculated from the clear_time_frac and cloudy_time_frac inputs over a time window determined by the window_minutes input. Note that sky_class == missing means that it is night or the ground measurement data is missing, and validation should not be performed with those timesteps.
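The windowed classification can be sketched in plain Python as below. This is an illustrative approximation rather than the library's implementation: the helper names, the centered window, the floor division used to convert minutes to timesteps, and the strict threshold comparisons are all assumptions.

```python
import math


def steps_in_window(window_minutes, resolution_minutes):
    """Convert a window width in minutes to a number of timesteps.

    Hypothetical helper; floor division is an assumed rounding rule.
    """
    return max(1, window_minutes // resolution_minutes)


def sky_class_series(clear, window_steps,
                     clear_time_frac=0.8, cloudy_time_frac=0.2):
    """Classify each timestep from the clear fraction in a centered window.

    `clear` is a list of float flags (1.0, 0.0 or NaN), as in the "clear"
    column of the comparison df.
    """
    half = window_steps // 2
    classes = []
    for i, flag in enumerate(clear):
        # A timestep without a usable flag is always "missing".
        if math.isnan(flag):
            classes.append("missing")
            continue
        window = clear[max(0, i - half): i + half + 1]
        # Ignore NaN flags; the current flag is valid, so `valid` is non-empty.
        valid = [c for c in window if not math.isnan(c)]
        frac = sum(valid) / len(valid)
        if frac > clear_time_frac:
            classes.append("clear")
        elif frac < cloudy_time_frac:
            classes.append("cloudy")
        else:
            classes.append("broken")
    return classes
```

Under these assumptions, the default window_minutes=61 spans 61 timesteps for 1-minute data and 12 timesteps for 5-minute data.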
- classmethod run(fp_surf, fp_nsrdb, nsrdb_gid, clearsky_ratio=0.9, clear_time_frac=0.8, cloudy_time_frac=0.2, window_minutes=61, min_irradiance=0, sza_lim=89)[source]
- Parameters:
fp_surf (str) – Filepath to SURFRAD h5 file.
fp_nsrdb (str) – Filepath to NSRDB file. Can be a MultiFileResource path of the form /dir/prefix*suffix.h5
nsrdb_gid (int) – GID (meta data index) for the site of interest in the fp_nsrdb file that matches the fp_surf file.
clearsky_ratio (float) – Clearsky ratio (ground measurement / clearsky irradiance) above which a timestep is considered clear.
clear_time_frac (float) – Fraction of clear timesteps in an averaging window above which the whole window is considered clear. Windows with a clear fraction between cloudy_time_frac and clear_time_frac are considered broken clouds.
cloudy_time_frac (float) – Fraction of clear timesteps in an averaging window below which the whole window is considered cloudy. Windows with a clear fraction between cloudy_time_frac and clear_time_frac are considered broken clouds.
window_minutes (int) – Width in minutes of the moving-average window used for the sky classification. The window is sized taking into account the source time resolution of the SURFRAD measurements.
min_irradiance (float | int) – Minimum irradiance value. Timesteps with either ground-measured or NSRDB irradiance below this value are classified as missing.
sza_lim (int | float) – Maximum solar zenith angle. Timesteps with sza > sza_lim are classified as missing.
- Returns:
df (pd.DataFrame) – Timeseries of validation data from fp_nsrdb and fp_surf including sky classification strings (clear, cloudy, broken, missing) with the same datetimeindex as the NSRDB file. Note that sky_class == missing means that it is night or the ground measurement data is missing, and validation should not be performed with those timesteps.
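Because timesteps with sky_class == missing should not be validated, a downstream summary would typically skip them. The sketch below shows one way to do that on plain dictionaries standing in for dataframe rows. The function name and the ghi_nsrdb key are hypothetical; the actual column names in the returned dataframe are not listed here.

```python
from collections import defaultdict


def mean_bias_by_sky_class(rows):
    """Return the mean (NSRDB - ground) GHI error for each sky class.

    `rows` is an iterable of dicts with keys "sky_class", "ghi_ground" and
    "ghi_nsrdb" (the last key name is assumed for illustration).
    """
    errors = defaultdict(list)
    for row in rows:
        # Night or missing ground data: excluded from validation.
        if row["sky_class"] == "missing":
            continue
        errors[row["sky_class"]].append(row["ghi_nsrdb"] - row["ghi_ground"])
    return {name: sum(vals) / len(vals) for name, vals in errors.items()}
```

Grouping by sky class keeps clear-sky errors from being averaged together with cloudy or broken-cloud errors.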