sup3r.solar.solar.Solar

class Solar(sup3r_fps, nsrdb_fp, t_slice=slice(None, None, None), tz=-6, agg_factor=1, nn_threshold=0.5, cloud_threshold=0.99)

Bases: object

Custom sup3r solar module. It primarily converts the GAN output clearsky ratio to GHI, DNI, and DHI using NSRDB data and utility models such as DISC.

Parameters:
  • sup3r_fps (str | list) – Full filepath(s) to one or more sup3r GAN output .h5 chunk files containing clearsky_ratio, time_index, and meta. These files must share the same meta data and may be sequential, ordered temporal chunks. The data in these files has been rolled by tz to local time (the GAN is assumed to have been trained on local-time solar data) and will be converted back to UTC, so it is wise to include some temporal padding in the sup3r_fps file list.

  • nsrdb_fp (str) – Filepath to NSRDB .h5 file containing clearsky_ghi, clearsky_dni, clearsky_dhi data.

  • t_slice (slice) – Slicing argument to slice the temporal axis of the sup3r_fps source data after doing the tz roll to UTC but before returning the irradiance variables. This can be used to effectively pad the solar irradiance calculation in UTC time. For example, if sup3r_fps is 3 files each with 24 hours of data, t_slice can be slice(24, 48) to only output the middle day of irradiance data, but padded by the other two days for the UTC output.

  • tz (int) – The timezone offset for the data in sup3r_fps. It is assumed that the GAN is trained on data in local time and therefore the output in sup3r_fps should be treated as local time. For example, -6 is CST which is default for CONUS training data.

  • agg_factor (int) – Spatial aggregation factor for the NSRDB-to-GAN meta mapping, i.e. the number of NSRDB spatial pixels to average for a single sup3r GAN output site.

  • nn_threshold (float) – The KDTree nearest neighbor threshold that determines how far the sup3r GAN output data has to be from the NSRDB source data to get irradiance=0. Note that this value is in decimal degrees, which is only a very approximate measure of real distance.

  • cloud_threshold (float) – Clearsky ratio threshold below which the data is considered cloudy and DNI is calculated using DISC.
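
The interaction between tz, the padding in sup3r_fps, and t_slice can be illustrated with a small NumPy sketch. It is not the sup3r implementation: it assumes hourly data, a single site, and that the roll to UTC is equivalent to np.roll with a shift of -tz. The array values are stand-ins chosen so that each value equals its local hour.

```python
import numpy as np

tz = -6  # CST offset, the documented default
n_hours = 72  # three hypothetical 24-hour chunks, concatenated

# Stand-in for clearsky_ratio at one site: value i means "local hour i"
local = np.arange(n_hours, dtype=float)

# Assumed convention: local hour i corresponds to UTC hour i - tz,
# so the data shifts forward by 6 steps and the last 6 values wrap around
utc = np.roll(local, -tz)

# Keep only the middle day; the wrapped values sit in the discarded padding
t_slice = slice(24, 48)
middle_day = utc[t_slice]
```

In this sketch the first six UTC values are wrapped from the end of the array, which is why the outer days serve as padding and only the middle day is kept.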

Methods

close()

Close all internal file handlers

get_node_cmd(config)

Get a CLI call to run Solar.run_temporal_chunks() on a single node based on an input config.

get_nsrdb_data(dset)

Get an NSRDB dataset with spatial index corresponding to the sup3r GAN output data, averaged across the agg_factor.

get_sup3r_fps(fp_pattern[, ignore])

Get a list of file chunks to run in parallel based on a file pattern

preflight()

Run preflight checks on source data to make sure everything will work together.

run_temporal_chunks(fp_pattern, nsrdb_fp[, ...])

Run the solar module on all spatial chunks for each temporal chunk corresponding to the fp_pattern and the given list of temporal_ids.

write(fp_out[, features])

Write irradiance datasets (ghi, dni, dhi) to output h5 file.

Attributes

clearsky_ratio

Get the clearsky ghi ratio data from the GAN output, rolled from tz to UTC.

cloud_mask

Get the cloud mask (True if cloudy) based on the GAN output clearsky ratio in UTC.

dhi

Get the dhi (W/m2) which is calculated based on the simple relationship between GHI, DNI and solar zenith angle.

dist

Get the nearest neighbor distances from the sup3r GAN data sites to the NSRDB nearest neighbors.

dni

Get the dni (W/m2), which is the clearsky dni (from NSRDB) when the GAN output is clear (clearsky_ratio > cloud_threshold) and is calculated from the DISC model when cloudy.

ghi

Get the ghi (W/m2) in UTC, computed from the GAN output clearsky ratio and the NSRDB clearsky_ghi.

idnn

Get the nearest neighbor meta data indices from the NSRDB data that correspond to the sup3r GAN data

nsrdb_tslice

Get the time slice of the NSRDB data corresponding to the sup3r GAN output.

out_of_bounds

Get a boolean mask for the sup3r data that is out of bounds (too far from the NSRDB data).

solar_zenith_angle

Get the solar zenith angle (degrees)

time_index

Time index for the sup3r GAN output data but sliced by t_slice

preflight()

Run preflight checks on source data to make sure everything will work together.

close()

Close all internal file handlers

property idnn

Get the nearest neighbor meta data indices from the NSRDB data that correspond to the sup3r GAN data

Returns:

idnn (Union[np.ndarray, da.core.Array]) – 2D array of shape (n_sup3r_sites, agg_factor) where the values are meta data indices from the NSRDB.

property dist

Get the nearest neighbor distances from the sup3r GAN data sites to the NSRDB nearest neighbors.

Returns:

dist (Union[np.ndarray, da.core.Array]) – 2D array of shape (n_sup3r_sites, agg_factor) where the values are decimal degree distances from the sup3r sites to the NSRDB nearest neighbors.

property time_index

Time index for the sup3r GAN output data but sliced by t_slice

Returns:

pd.DatetimeIndex

property out_of_bounds

Get a boolean mask for the sup3r data that is out of bounds (too far from the NSRDB data).

Returns:

out_of_bounds (Union[np.ndarray, da.core.Array]) – 1D boolean array with length == number of sup3r GAN sites. True if the site is too far from the NSRDB.
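
A rough sketch of how idnn, dist, and out_of_bounds relate is given here. It swaps the KDTree query for a brute-force NumPy distance computation to stay dependency-free, and it assumes that a site is flagged when any of its agg_factor neighbors lies farther away than nn_threshold. The coordinates are hypothetical.

```python
import numpy as np

nn_threshold = 0.5  # decimal degrees, the documented default
agg_factor = 1

# Hypothetical (latitude, longitude) pairs
nsrdb_coords = np.array([[40.0, -105.0], [40.0, -104.9], [40.1, -105.0]])
sup3r_coords = np.array([[40.02, -105.01], [45.0, -100.0]])

# Pairwise Euclidean distance in degree space
# (brute-force stand-in for a KDTree query)
diff = sup3r_coords[:, None, :] - nsrdb_coords[None, :, :]
pairwise = np.sqrt((diff ** 2).sum(axis=-1))

# Indices and distances of the agg_factor nearest NSRDB sites,
# both with shape (n_sup3r_sites, agg_factor)
idnn = np.argsort(pairwise, axis=1)[:, :agg_factor]
dist = np.take_along_axis(pairwise, idnn, axis=1)

# Assumed rule: out of bounds if any neighbor exceeds the threshold
out_of_bounds = (dist > nn_threshold).any(axis=1)
```

Because the distance is measured in degree space, the same threshold corresponds to different ground distances at different latitudes, which is the approximation the nn_threshold description warns about.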

property nsrdb_tslice

Get the time slice of the NSRDB data corresponding to the sup3r GAN output.

property clearsky_ratio

Get the clearsky ghi ratio data from the GAN output, rolled from tz to UTC.

Returns:

clearsky_ratio (Union[np.ndarray, da.core.Array]) – 2D array with shape (time, sites) in UTC.

property solar_zenith_angle

Get the solar zenith angle (degrees)

Returns:

solar_zenith_angle (Union[np.ndarray, da.core.Array]) – 2D array with shape (time, sites) in UTC.

property ghi

Get the ghi (W/m2) in UTC, computed from the GAN output clearsky ratio and the NSRDB clearsky_ghi.

Returns:

ghi (Union[np.ndarray, da.core.Array]) – 2D array with shape (time, sites) in UTC.
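
Since a clearsky ratio is by definition GHI divided by clearsky GHI, recovering GHI amounts to an elementwise product. The sketch below uses made-up (time, sites) arrays and leaves out any handling of out-of-bounds sites.

```python
import numpy as np

# Hypothetical (time, sites) arrays
clearsky_ratio = np.array([[1.0, 0.5],
                           [0.8, 0.0]])
clearsky_ghi = np.array([[600.0, 600.0],
                         [800.0, 800.0]])  # W/m2

# GHI is the clearsky GHI scaled by the clearsky ratio
ghi = clearsky_ratio * clearsky_ghi
```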

property dni

Get the dni (W/m2), which is the clearsky dni (from NSRDB) when the GAN output is clear (clearsky_ratio > cloud_threshold) and is calculated from the DISC model when cloudy.

Returns:

dni (Union[np.ndarray, da.core.Array]) – 2D array with shape (time, sites) in UTC.
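
The clear/cloudy switch can be sketched with np.where. The DISC output here is a placeholder array rather than a real DISC calculation, and treating clearsky_ratio equal to cloud_threshold as clear is an assumption, since the description only covers the strict cases.

```python
import numpy as np

cloud_threshold = 0.99  # the documented default

# Hypothetical (time, sites) arrays
clearsky_ratio = np.array([[1.0, 0.4],
                           [0.995, 0.9]])
clearsky_dni = np.array([[900.0, 900.0],
                         [850.0, 850.0]])  # W/m2, from NSRDB

# Placeholder for DNI from the DISC model; a real run would derive this
# from GHI, solar zenith angle, and related inputs
disc_dni = np.array([[880.0, 150.0],
                     [840.0, 700.0]])

# True where the GAN output is considered cloudy
cloud_mask = clearsky_ratio < cloud_threshold

# Cloudy cells take the DISC value, clear cells take the clearsky value
dni = np.where(cloud_mask, disc_dni, clearsky_dni)
```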

property dhi

Get the dhi (W/m2) which is calculated based on the simple relationship between GHI, DNI and solar zenith angle.

Returns:

dhi (Union[np.ndarray, da.core.Array]) – 2D array with shape (time, sites) in UTC.
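
The standard closure relationship between the three irradiance components is GHI = DNI * cos(zenith) + DHI. Assuming that is the relationship meant here, DHI follows by rearranging, as in this sketch with made-up values.

```python
import numpy as np

# Hypothetical (time, sites) arrays
ghi = np.array([[500.0, 200.0]])  # W/m2
dni = np.array([[600.0, 0.0]])  # W/m2
solar_zenith_angle = np.array([[60.0, 60.0]])  # degrees

# Rearranged closure equation: DHI = GHI - DNI * cos(zenith)
dhi = ghi - dni * np.cos(np.radians(solar_zenith_angle))
```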

property cloud_mask

Get the cloud mask (True if cloudy) based on the GAN output clearsky ratio in UTC.

Returns:

cloud_mask (Union[np.ndarray, da.core.Array]) – 2D array with shape (time, sites) in UTC.

get_nsrdb_data(dset)

Get an NSRDB dataset with spatial index corresponding to the sup3r GAN output data, averaged across the agg_factor.

Parameters:

dset (str) – Name of dataset to retrieve from NSRDB source file

Returns:

out (Union[np.ndarray, da.core.Array]) – Dataset of shape (time, sites), matching the shape of the sup3r GAN output data. If agg_factor > 1, each site is an average across multiple NSRDB sites.
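
The averaging across agg_factor can be pictured as fancy indexing with an index array of shape (n_sup3r_sites, agg_factor) followed by a mean over the last axis. This is a sketch with hypothetical data, not the library's own code path.

```python
import numpy as np

# Hypothetical NSRDB dataset with shape (time, nsrdb_sites)
nsrdb = np.array([[100.0, 200.0, 300.0, 400.0],
                  [110.0, 210.0, 310.0, 410.0]])

# Nearest-neighbor indices with shape (n_sup3r_sites, agg_factor);
# here agg_factor = 2
idnn = np.array([[0, 1],
                 [2, 3]])

# Indexing gives shape (time, n_sup3r_sites, agg_factor);
# averaging the last axis gives (time, n_sup3r_sites)
out = nsrdb[:, idnn].mean(axis=-1)
```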

static get_sup3r_fps(fp_pattern, ignore=None)

Get a list of file chunks to run in parallel based on a file pattern

Note

It’s assumed that all source files have the pattern sup3r_file_TTTTTT_SSSSSS.h5 where TTTTTT is the zero-padded temporal chunk index and SSSSSS is the zero-padded spatial chunk index.

Parameters:
  • fp_pattern (str | list) – Unix-style glob pattern (e.g. file*pattern) that matches a set of spatiotemporally chunked sup3r forward pass output files.

  • ignore (str | None) – Ignore all files that have this string in their filenames.

Returns:

  • fp_sets (list) – List of file sets where each file set is 3 temporally sequential files over the same spatial chunk. Each file set overlaps with its neighbor such that fp_sets[0][-1] == fp_sets[1][0] (this is so Solar can do temporal padding when rolling from GAN local time output to UTC).

  • t_slices (list) – List of t_slice arguments corresponding to fp_sets to pass to Solar class initialization that will slice and reduce the overlapping time axis when Solar outputs irradiance data.

  • temporal_ids (list) – List of temporal id strings TTTTTT corresponding to the fp_sets

  • spatial_ids (list) – List of spatial id strings SSSSSS corresponding to the fp_sets

  • target_fps (list) – List of actual target files corresponding to fp_sets, so for example the file set fp_sets[10] sliced by t_slices[10] is designed to process target_fps[10]
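
The filename convention in the note above can be parsed with a regular expression. The sketch below only extracts the temporal and spatial ids and groups files by spatial chunk. It does not reproduce how get_sup3r_fps builds its overlapping file sets or t_slices, and the helper names and file paths are hypothetical.

```python
import os
import re
from collections import defaultdict

# Matches the trailing _TTTTTT_SSSSSS.h5 part of a chunk filename
CHUNK_PATTERN = re.compile(r"_(\d{6})_(\d{6})\.h5$")


def parse_chunk_ids(fp):
    """Return (temporal_id, spatial_id) strings parsed from a filename."""
    match = CHUNK_PATTERN.search(os.path.basename(fp))
    if match is None:
        raise ValueError(f"Filename does not match the chunk pattern: {fp}")
    return match.group(1), match.group(2)


def group_by_spatial_chunk(fps, ignore=None):
    """Map each spatial id to its files, ordered by temporal id."""
    groups = defaultdict(list)
    for fp in fps:
        # Skip files whose name contains the ignore string
        if ignore is not None and ignore in os.path.basename(fp):
            continue
        t_id, s_id = parse_chunk_ids(fp)
        groups[s_id].append((t_id, fp))
    return {s_id: [fp for _, fp in sorted(items)]
            for s_id, items in groups.items()}


fps = [
    "out/sup3r_file_000001_000000.h5",
    "out/sup3r_file_000000_000000.h5",
    "out/sup3r_file_000000_000001.h5",
    "out/sup3r_file_000000_000000_irradiance.h5",
]
groups = group_by_spatial_chunk(fps, ignore="irradiance")
```

Passing ignore="irradiance" here mimics skipping previously written output files so they are not picked up as inputs on a rerun.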

classmethod get_node_cmd(config)

Get a CLI call to run Solar.run_temporal_chunks() on a single node based on an input config.

Parameters:

config (dict) – sup3r solar config with all necessary args and kwargs to run Solar.run_temporal_chunks() on a single node.
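
As an illustration only, a config dict might mirror the documented arguments of run_temporal_chunks. The file paths below are hypothetical, and a real sup3r config may need additional keys (for example job or logging settings) that this page does not describe.

```python
# Keys follow the documented run_temporal_chunks signature;
# both file paths are hypothetical placeholders
config = {
    "fp_pattern": "./gan_out/sup3r_file_*.h5",
    "nsrdb_fp": "./nsrdb/nsrdb_clearsky.h5",
    "fp_out_suffix": "irradiance",
    "tz": -6,
    "agg_factor": 1,
    "nn_threshold": 0.5,
    "cloud_threshold": 0.99,
    "features": ["ghi", "dni", "dhi"],
    "temporal_ids": None,  # None runs all temporal indices
}
```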

write(fp_out, features=('ghi', 'dni', 'dhi'))

Write irradiance datasets (ghi, dni, dhi) to output h5 file.

Parameters:
  • fp_out (str) – Filepath to an output h5 file to write irradiance variables to. Parent directory will be created if it does not exist.

  • features (list | tuple) – List of features to write to disk. These have to be attributes of the Solar class (ghi, dni, dhi).

classmethod run_temporal_chunks(fp_pattern, nsrdb_fp, fp_out_suffix='irradiance', tz=-6, agg_factor=1, nn_threshold=0.5, cloud_threshold=0.99, features=('ghi', 'dni', 'dhi'), temporal_ids=None)

Run the solar module on all spatial chunks for each temporal chunk corresponding to the fp_pattern and the given list of temporal_ids. This typically gets run from the CLI.

Parameters:
  • fp_pattern (str) – Unix-style glob pattern (e.g. file*pattern) that matches a set of spatiotemporally chunked sup3r forward pass output files.

  • nsrdb_fp (str) – Filepath to NSRDB .h5 file containing clearsky_ghi, clearsky_dni, clearsky_dhi data.

  • fp_out_suffix (str) – Suffix to add to the input sup3r source files when writing the processed solar irradiance data to new data files.

  • tz (int) – The timezone offset for the data in sup3r_fps. It is assumed that the GAN is trained on data in local time and therefore the output in sup3r_fps should be treated as local time. For example, -6 is CST which is default for CONUS training data.

  • agg_factor (int) – Spatial aggregation factor for the NSRDB-to-GAN meta mapping, i.e. the number of NSRDB spatial pixels to average for a single sup3r GAN output site.

  • nn_threshold (float) – The KDTree nearest neighbor threshold that determines how far the sup3r GAN output data has to be from the NSRDB source data to get irradiance=0. Note that this value is in decimal degrees, which is only a very approximate measure of real distance.

  • cloud_threshold (float) – Clearsky ratio threshold below which the data is considered cloudy and DNI is calculated using DISC.

  • features (list | tuple) – List of features to write to disk. These have to be attributes of the Solar class (ghi, dni, dhi).

  • temporal_ids (list | None) – List of zero-padded temporal ids from the file chunks that match fp_pattern. This input typically gets set from the CLI. If None, this will run all temporal indices.