reVX.rpm.rpm_manager.RPMClusterManager

class RPMClusterManager(cf_fpath, rpm_meta, rpm_region_col=None, max_workers=None)[source]

Bases: object

RPM Cluster Manager:

Extracts gids for all RPM regions

Runs RPMClusters in parallel for all regions

Saves results to disk

Parameters:

cf_fpath (str) – Path to reV .h5 file containing desired capacity factor profiles
rpm_meta (pandas.DataFrame | str) –

DataFrame or path to .csv or .json containing the RPM meta data:
- Categorical regions of interest with column label “region”
- # of clusters per region with column label “clusters”
- A column that maps the RPM regions to the cf_fpath meta data: “res_gid” (prioritized) or “gen_gid”. This can be omitted if the rpm_region_col kwarg input is found in the cf_fpath meta data.
rpm_region_col (str | None) – If not None, the meta data field to map RPM regions to.
max_workers (int, optional) – Number of parallel workers. 1 runs in serial; None uses all available workers. By default None.
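The rpm_meta layout described above can be sketched as a small .csv written with the standard library only. This is a minimal illustration, not taken from the reVX documentation: it assumes one row per resource site, and the region names, cluster counts, gid values, and the file name rpm_meta.csv are all made up.

```python
import csv
import os
import tempfile

# Hypothetical rows, one per resource site (an assumption about the layout).
# "region" and "clusters" are the column labels named in the parameter
# description; "res_gid" maps each row to the cf_fpath meta data.
ROWS = [
    {"region": "north", "clusters": 2, "res_gid": 0},
    {"region": "north", "clusters": 2, "res_gid": 1},
    {"region": "north", "clusters": 2, "res_gid": 2},
    {"region": "south", "clusters": 1, "res_gid": 3},
    {"region": "south", "clusters": 1, "res_gid": 4},
]


def write_rpm_meta(fpath, rows):
    """Write the RPM meta rows to a .csv file and return its path."""
    fieldnames = ["region", "clusters", "res_gid"]
    with open(fpath, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return fpath


def clusters_per_region(rows):
    """Return {region: clusters}, raising if a region has mixed counts."""
    counts = {}
    for row in rows:
        expected = counts.setdefault(row["region"], row["clusters"])
        if expected != row["clusters"]:
            raise ValueError(
                "Inconsistent cluster count for region {!r}".format(row["region"])
            )
    return counts


# Write the file to a temporary directory so the sketch leaves no clutter.
meta_fpath = write_rpm_meta(
    os.path.join(tempfile.mkdtemp(), "rpm_meta.csv"), ROWS
)
```

The resulting path (or an equivalent pandas.DataFrame) is the kind of value the rpm_meta parameter accepts. The consistency check is a convenience of this sketch, not something the class is documented to require.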

Methods

`run_clusters`(cf_fpath, rpm_meta, out_dir[, ...])	RPM Cluster Manager:
`run_clusters_and_profiles`(cf_fpath, ...[, ...])	RPM Cluster Manager:

classmethod run_clusters(cf_fpath, rpm_meta, out_dir, job_tag=None, rpm_region_col=None, max_workers=True, **cluster_kwargs)[source]

RPM Cluster Manager:

Extracts gen_gids for all RPM regions

Runs RPMClusters in parallel for all regions

Saves results to disk

Parameters:

cf_fpath (str) – Path to reV .h5 file containing desired capacity factor profiles
rpm_meta (pandas.DataFrame | str) –

DataFrame or path to .csv or .json containing the RPM meta data:
- Categorical regions of interest with column label “region”
- # of clusters per region with column label “clusters”
- A column that maps the RPM regions to the cf_fpath meta data: “res_gid” (prioritized) or “gen_gid”. This can be omitted if the rpm_region_col kwarg input is found in the cf_fpath meta data.
out_dir (str) – Directory to dump output files.
job_tag (str | None) – Optional name tag to add to the output files. Format is “rpm_cluster_output_{tag}.csv”.
rpm_region_col (str | None) – If not None, the meta data field to map RPM regions to.
max_workers (int, optional) – Number of parallel workers. 1 runs in serial; None uses all available workers. By default None.
output_kwargs (dict | None) – Kwargs for the RPM outputs manager.
**cluster_kwargs (dict) – RPMClusters kwargs
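A call to run_clusters might look like the sketch below. The wrapper name run_rpm_clusters and the helper expected_output_name are hypothetical; the import path and keyword names follow the signature shown above. The returned path assumes the tagged file is written directly into out_dir, which this page does not confirm.

```python
import os


def expected_output_name(job_tag):
    """Build the tagged output file name from the documented format."""
    return "rpm_cluster_output_{}.csv".format(job_tag)


def run_rpm_clusters(cf_fpath, rpm_meta, out_dir, job_tag, max_workers=1):
    """Run RPM clustering for all regions (requires reVX to be installed).

    max_workers=1 requests a serial run, which is easier to debug.
    """
    # Imported lazily so this sketch can be loaded without reVX.
    from reVX.rpm.rpm_manager import RPMClusterManager

    os.makedirs(out_dir, exist_ok=True)
    RPMClusterManager.run_clusters(
        cf_fpath,
        rpm_meta,
        out_dir,
        job_tag=job_tag,
        rpm_region_col=None,
        max_workers=max_workers,
    )
    # Assumption: the output lands in out_dir under the documented name.
    return os.path.join(out_dir, expected_output_name(job_tag))
```

For example, run_rpm_clusters("gen.h5", "rpm_meta.csv", "./rpm_out", "test") would be expected to produce a file named rpm_cluster_output_test.csv, where the input file names are placeholders.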

classmethod run_clusters_and_profiles(cf_fpath, rpm_meta, excl_fpath, excl_dict, techmap_dset, out_dir, job_tag=None, rpm_region_col=None, max_workers=True, pre_extract_inclusions=False, output_kwargs=None, **cluster_kwargs)[source]

RPM Cluster Manager:

Extracts gen_gids for all RPM regions

Runs RPMClusters in parallel for all regions

Saves results to disk

Parameters:

cf_fpath (str) – Path to reV .h5 file containing desired capacity factor profiles
rpm_meta (pandas.DataFrame | str) –

DataFrame or path to .csv or .json containing the RPM meta data:
- Categorical regions of interest with column label “region”
- # of clusters per region with column label “clusters”
- A column that maps the RPM regions to the cf_fpath meta data: “res_gid” (prioritized) or “gen_gid”. This can be omitted if the rpm_region_col kwarg input is found in the cf_fpath meta data.
excl_fpath (str | None) – Filepath to exclusions data (must match the techmap grid). None will not apply exclusions.
excl_dict (dict | None) – Dictionary of exclusion LayerMask arguments {layer: {kwarg: value}}
techmap_dset (str) – Dataset name in the exclusions file containing the exclusions-to-resource mapping data.
out_dir (str) – Directory to dump output files.
job_tag (str | None) – Optional name tag to add to the output files. Format is “rpm_cluster_output_{tag}.csv”.
rpm_region_col (str | None) – If not None, the meta data field to map RPM regions to.
max_workers (int, optional) – Number of parallel workers. 1 runs in serial; None uses all available workers. By default None.
pre_extract_inclusions (bool) – Flag to pre-extract the inclusion mask using excl_fpath and excl_dict. This is advantageous if the excl_dict is highly complex and if you’re processing a lot of points. Default is False.
output_kwargs (dict | None) – Kwargs for the RPM outputs manager.
**cluster_kwargs (dict) – RPMClusters kwargs
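The {layer: {kwarg: value}} shape of excl_dict can be illustrated as follows. This is a sketch under stated assumptions: the layer names are hypothetical, and the kwarg names exclude_values and include_values are assumed to be accepted by LayerMask in the installed reV version. The helper validate_excl_dict and the wrapper run_rpm_clusters_and_profiles are illustrative names, not part of reVX.

```python
# Hypothetical layer names, assumed to name datasets in the exclusions file.
# The kwarg names are an assumption about LayerMask; check the installed reV.
EXCL_DICT = {
    "urban_areas": {"exclude_values": [1]},
    "land_cover": {"include_values": [3, 4, 5]},
}


def validate_excl_dict(excl_dict):
    """Check the {layer: {kwarg: value}} shape and return sorted layer names."""
    if excl_dict is None:
        # None means no exclusion layers are configured.
        return []
    for layer, kwargs in excl_dict.items():
        if not isinstance(layer, str):
            raise TypeError("Layer names must be strings: {!r}".format(layer))
        if not isinstance(kwargs, dict):
            raise TypeError("Layer {!r} must map to a dict".format(layer))
    return sorted(excl_dict)


def run_rpm_clusters_and_profiles(cf_fpath, rpm_meta, excl_fpath,
                                  techmap_dset, out_dir, excl_dict=None):
    """Run clustering and profiles (requires reVX to be installed)."""
    # Imported lazily so this sketch can be loaded without reVX.
    from reVX.rpm.rpm_manager import RPMClusterManager

    validate_excl_dict(excl_dict)
    RPMClusterManager.run_clusters_and_profiles(
        cf_fpath,
        rpm_meta,
        excl_fpath,
        excl_dict,
        techmap_dset,
        out_dir,
        job_tag=None,
        rpm_region_col=None,
        max_workers=1,  # serial run
        pre_extract_inclusions=True,  # compute the inclusion mask up front
    )
```

Setting pre_extract_inclusions=True here follows the guidance above that pre-extraction helps when excl_dict is complex or many points are processed; for a two-layer dictionary like this one the default of False is likely sufficient.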