reVX.rpm.rpm_clusters.RPMClusters

class RPMClusters(cf_fpath, gen_gids, n_clusters, region=None)

Bases: object

Base class for RPM clusters

Examples

>>> from reV import Resource
>>> from reVX.rpm.rpm_clusters import RPMClusters
>>>
>>> fname = '$TESTDATADIR/reV_gen/gen_pv_2012.h5'
>>> with Resource(fname) as res:
...     gen_gids = res.meta.index.values
>>>
>>> clusters = RPMClusters(fname, gen_gids, n_clusters=6)
>>> clusters._cluster(**kwargs)
>>> clusters.meta
    gen_gid   latitude  longitude  cluster_id                    geometry
0         0  41.290001 -71.860001           0  POINT (-71.86000 41.29000)
1         1  41.290001 -71.820000           0  POINT (-71.82000 41.29000)
2         2  41.250000 -71.820000           4  POINT (-71.82000 41.25000)
3         3  41.330002 -71.820000           0  POINT (-71.82000 41.33000)
4         4  41.369999 -71.820000           0  POINT (-71.82000 41.37000)
..      ...        ...        ...         ...                         ...
95       95  41.250000 -71.660004           4  POINT (-71.66000 41.25000)
96       96  41.889999 -71.660004           5  POINT (-71.66000 41.89000)
97       97  41.450001 -71.660004           3  POINT (-71.66000 41.45000)
98       98  41.610001 -71.660004           1  POINT (-71.66000 41.61000)
99       99  41.410000 -71.660004           3  POINT (-71.66000 41.41000)

Generate a shapefile of the clusters

>>> RPMClusters.generate_shapefile(clusters.meta, fpath='./test.shp')

Parameters:
  • cf_fpath (str) – Path to reV .h5 files containing desired capacity factor profiles

  • gen_gids (list | ndarray) – List or vector of gen_gids to cluster on

  • n_clusters (int) – Number of clusters to identify

  • region (str | None) – Optional identifier for the region being clustered; used to aid debugging.

Methods

cluster(cf_h5_path, region_gen_gids, n_clusters)

Entry point for RPMClusters: get clusters for a given region, defined as a list or array of gen_gids

generate_shapefile(meta, fpath[, beautify, ...])

Generate cluster polygons and save to shapefile

Attributes

cluster_coefficients

Representative coefficients for each cluster (ndarray)

cluster_coordinates

lon, lat coordinates of the centroid of each cluster (ndarray)

cluster_ids

Cluster ID for each gen_gid (ndarray)

coefficients

Array of wavelet coefficients for each gen_gid (ndarray)

coordinates

lon, lat coordinates for each gen_gid (ndarray)

meta

DataFrame of meta data: gen_gid, latitude, longitude, cluster_id, rank (pandas.DataFrame)

n_clusters

Number of clusters (int)

property coefficients
Returns:

_coefficients (ndarray) – Array of wavelet coefficients for each gen_gid

property meta
Returns:

_meta (pandas.DataFrame) – DataFrame of meta data:

  • gen_gid

  • latitude

  • longitude

  • cluster_id

  • rank
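
As a rough illustration of working with a frame shaped like `meta`, the sketch below counts sites per cluster and pulls out the gen_gids of one cluster. The frame is built by hand from a few rows of the example output shown earlier, so it only stands in for a real `clusters.meta`; the variable names are illustrative and only standard pandas calls are used.

```python
import pandas as pd

# Hand-built stand-in for clusters.meta (values copied from the example
# output on this page; a real frame would come from RPMClusters).
meta = pd.DataFrame({
    "gen_gid": [0, 1, 2, 3, 4],
    "latitude": [41.29, 41.29, 41.25, 41.33, 41.37],
    "longitude": [-71.86, -71.82, -71.82, -71.82, -71.82],
    "cluster_id": [0, 0, 4, 0, 0],
})

# Number of generation sites assigned to each cluster
sites_per_cluster = meta.groupby("cluster_id")["gen_gid"].count()

# gen_gids that belong to one cluster of interest
cluster_0_gids = meta.loc[meta["cluster_id"] == 0, "gen_gid"].tolist()
```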

property n_clusters
Returns:

_n_clusters (int) – Number of clusters

property cluster_coefficients
Returns:

cluster_coeffs (ndarray) – Representative coefficients for each cluster

property cluster_ids
Returns:

cluster_ids (ndarray) – Cluster ID for each gen_gid

property cluster_coordinates
Returns:

cluster_coords (ndarray) – lon, lat coordinates of the centroid of each cluster

property coordinates
Returns:

coords (ndarray) – lon, lat coordinates for each gen_gid
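
To make the relationship between `coordinates`, `cluster_ids`, and `cluster_coordinates` more concrete, here is a minimal NumPy sketch that derives one centroid per cluster. It assumes a centroid is the plain mean of the member (lon, lat) pairs, which this page does not confirm, and it uses hand-written arrays in place of the real properties.

```python
import numpy as np

# Stand-ins for the `coordinates` and `cluster_ids` properties
# (lon, lat pairs taken from the example output; not real library output).
coordinates = np.array([
    [-71.86, 41.29],
    [-71.82, 41.29],
    [-71.82, 41.25],
    [-71.82, 41.33],
    [-71.66, 41.25],
])
cluster_ids = np.array([0, 0, 4, 0, 4])

# ASSUMPTION: a centroid is the plain mean of its members' coordinates.
unique_ids = np.unique(cluster_ids)
centroids = np.array([
    coordinates[cluster_ids == cid].mean(axis=0) for cid in unique_ids
])
# centroids[i] is the (lon, lat) centroid of cluster unique_ids[i]
```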

classmethod generate_shapefile(meta, fpath, beautify=True, source_crs='EPSG:4326', target_crs=None)

Generate cluster polygons and save to shapefile

classmethod cluster(cf_h5_path, region_gen_gids, n_clusters, method='kmeans', method_kwargs=None, dist_rank_filter=True, dist_rmse_kwargs=None, contiguous_filter=True, contiguous_kwargs=None, region=None)

Entry point for RPMClusters: get clusters for a given region, defined as a list or array of gen_gids

Parameters:
  • cf_h5_path (str) – Path to reV .h5 files containing desired capacity factor profiles

  • region_gen_gids (list | ndarray) – List or vector of gen_gids to cluster on

  • n_clusters (int) – Number of clusters to identify

  • method (str) – Method to use to cluster coefficients

  • method_kwargs (dict) – Kwargs for running _cluster_coefficients

  • dist_rank_filter (bool) – Re-cluster data by minimizing the sum of the distances between each point and each cluster centroid

  • dist_rmse_kwargs (dict) – Kwargs for running _dist_rank_optimization

  • contiguous_filter (bool) – Re-classify clusters by making contiguous cluster polygons

  • contiguous_kwargs (dict) – Kwargs for _contiguous_filter

  • region (str | None) – Optional identifier for the region being clustered; used to aid debugging.

Returns:

out (pandas.DataFrame) – Cluster results: (gen_gid, lon, lat, cluster_id, rank)

Examples

>>> from reV import Resource
>>> from reVX.rpm.rpm_clusters import RPMClusters
>>>
>>> fname = '$TESTDATADIR/reV_gen/gen_pv_2012.h5'
>>> with Resource(fname) as res:
...     gen_gids = res.meta.index.values
>>>
>>> RPMClusters.cluster(fname, gen_gids, n_clusters=6)
    gen_gid   latitude  longitude  cluster_id                    geometry
0         0  41.290001 -71.860001           0  POINT (-71.86000 41.29000)
1         1  41.290001 -71.820000           0  POINT (-71.82000 41.29000)
2         2  41.250000 -71.820000           4  POINT (-71.82000 41.25000)
3         3  41.330002 -71.820000           0  POINT (-71.82000 41.33000)
4         4  41.369999 -71.820000           0  POINT (-71.82000 41.37000)
..      ...        ...        ...         ...                         ...
95       95  41.250000 -71.660004           4  POINT (-71.66000 41.25000)
96       96  41.889999 -71.660004           5  POINT (-71.66000 41.89000)
97       97  41.450001 -71.660004           3  POINT (-71.66000 41.45000)
98       98  41.610001 -71.660004           1  POINT (-71.66000 41.61000)
99       99  41.410000 -71.660004           3  POINT (-71.66000 41.41000)
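
The default `method='kmeans'` clusters the per-gen_gid coefficient vectors, but this page does not show how that is done internally. The following is a from-scratch sketch of plain k-means (Lloyd's algorithm) on a small array of made-up coefficient vectors, included only to illustrate the idea. It is not reVX's implementation, and the `kmeans` helper and the `coeffs` values are hypothetical.

```python
import numpy as np


def kmeans(points, n_clusters, n_iter=20):
    """Plain Lloyd's k-means; returns (labels, centers)."""
    points = np.asarray(points, dtype=float)
    # Deterministic start: use the first n_clusters points as centers
    centers = points[:n_clusters].copy()
    for _ in range(n_iter):
        # Distance of every point to every center: (n_points, n_clusters)
        dists = np.linalg.norm(
            points[:, None, :] - centers[None, :, :], axis=2
        )
        # Assign each point to its nearest center
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its assigned points
        for k in range(n_clusters):
            members = points[labels == k]
            if len(members):  # keep the old center if a cluster empties
                centers[k] = members.mean(axis=0)
    return labels, centers


# Hypothetical coefficient vectors, one row per gen_gid
coeffs = np.array([[0.1, 0.2], [0.9, 1.0], [0.0, 0.1], [1.0, 0.9]])
labels, centers = kmeans(coeffs, n_clusters=2)
```

In this toy case the rows fall into two well-separated groups, so `labels` plays the role of `cluster_ids` and `centers` the role of `cluster_coefficients`; real coefficient arrays are wavelet coefficients with many more columns.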