reVX.rpm.rpm_clusters.RPMClusters
- class RPMClusters(cf_fpath, gen_gids, n_clusters, region=None)
Bases: object
Base class for RPM clusters
Examples
>>> from reV import Resource
>>>
>>> fname = '$TESTDATADIR/reV_gen/gen_pv_2012.h5'
>>> with Resource(fname) as res:
...     gen_gids = res.meta.index.values
>>>
>>> clusters = RPMClusters(fname, gen_gids, n_clusters=6)
>>> clusters._cluster(**kwargs)
>>> clusters.meta
    gen_gid   latitude  longitude  cluster_id                    geometry
0         0  41.290001 -71.860001           0  POINT (-71.86000 41.29000)
1         1  41.290001 -71.820000           0  POINT (-71.82000 41.29000)
2         2  41.250000 -71.820000           4  POINT (-71.82000 41.25000)
3         3  41.330002 -71.820000           0  POINT (-71.82000 41.33000)
4         4  41.369999 -71.820000           0  POINT (-71.82000 41.37000)
..      ...        ...        ...         ...                         ...
95       95  41.250000 -71.660004           4  POINT (-71.66000 41.25000)
96       96  41.889999 -71.660004           5  POINT (-71.66000 41.89000)
97       97  41.450001 -71.660004           3  POINT (-71.66000 41.45000)
98       98  41.610001 -71.660004           1  POINT (-71.66000 41.61000)
99       99  41.410000 -71.660004           3  POINT (-71.66000 41.41000)
Generate Shape File of Cluster
>>> RPMClusters.generate_shapefile(clusters.meta, fpath='./test.shp')
- Parameters:
cf_fpath (str) – Path to reV .h5 files containing desired capacity factor profiles
gen_gids (list | ndarray) – List or vector of gen_gids to cluster on
n_clusters (int) – Number of clusters to identify
region (str | None) – Optional identifier of the region being clustered, used to make debugging output easier to trace.
Methods
cluster(cf_h5_path, region_gen_gids, n_clusters)
Entry point for RPMClusters to get clusters for a given region, defined as a list or array of gen_gids
generate_shapefile(meta, fpath[, beautify, ...])
Generate cluster polygons and save to shapefile
Attributes
cluster_coefficients
(ndarray) Representative coefficients for each cluster
cluster_coordinates
(ndarray) lon, lat coordinates of the centroid of each cluster
cluster_ids
(ndarray) cluster_id assigned to each gen_gid
coefficients
(ndarray) Array of wavelet coefficients for each gen_gid
coordinates
(ndarray) lon, lat coordinates for each gen_gid
meta
(pandas.DataFrame) DataFrame of meta data
n_clusters
(int) Number of clusters
- property coefficients
- Returns:
_coefficients (ndarray) – Array of wavelet coefficients for each gen_gid
- property meta
- Returns:
_meta (pandas.DataFrame) – DataFrame of meta data:
- gen_gid
- latitude
- longitude
- cluster_id
- rank
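As an illustration of working with the meta table, the following minimal sketch counts how many gen_gids fall in each cluster. It assumes a DataFrame carrying the documented gen_gid and cluster_id columns. The `meta` object here is a hand-built stand-in whose rows are copied from the first rows of the class example output, not the result of a real clustering run, and `cluster_sizes` is a name introduced only for this sketch.

```python
import pandas as pd

# Hand-built stand-in for RPMClusters.meta (assumption: only a few of the
# documented columns are needed for this summary).
meta = pd.DataFrame({
    "gen_gid": [0, 1, 2, 3, 4],
    "latitude": [41.290001, 41.290001, 41.250000, 41.330002, 41.369999],
    "longitude": [-71.860001, -71.820000, -71.820000, -71.820000, -71.820000],
    "cluster_id": [0, 0, 4, 0, 0],
})

# Number of generation sites assigned to each cluster
cluster_sizes = meta.groupby("cluster_id")["gen_gid"].count()
print(cluster_sizes)
```

The same groupby pattern can be reused to aggregate any other per-site column by cluster.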
- property n_clusters
- Returns:
_n_clusters (int) – Number of clusters
- property cluster_coefficients
- Returns:
cluster_coeffs (ndarray) – Representative coefficients for each cluster
- property cluster_ids
- Returns:
cluster_ids (ndarray) – cluster_id assigned to each gen_gid
- property cluster_coordinates
- Returns:
cluster_coords (ndarray) – lon, lat coordinates of the centroid of each cluster
- property coordinates
- Returns:
coords (ndarray) – lon, lat coordinates for each gen_gid
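To make the relationship between the per-site coordinates, the cluster ids, and the cluster centroids concrete, here is a minimal sketch. It assumes that a centroid is the plain arithmetic mean of its members' lon, lat values, which may differ from how reVX actually computes cluster_coordinates. The arrays `coords` and `cluster_ids` are small hypothetical stand-ins for the properties of the same meaning.

```python
import numpy as np

# Hypothetical stand-ins for the `coordinates` and `cluster_ids` properties
coords = np.array([
    [-71.86, 41.29],
    [-71.82, 41.29],
    [-71.82, 41.25],
    [-71.66, 41.25],
])  # one (lon, lat) row per gen_gid
cluster_ids = np.array([0, 0, 4, 4])

# Assumption: a cluster centroid is the mean lon/lat of its member sites
centroids = {
    int(cid): coords[cluster_ids == cid].mean(axis=0)
    for cid in np.unique(cluster_ids)
}

for cid, centroid in centroids.items():
    print(cid, centroid)
```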
- classmethod generate_shapefile(meta, fpath, beautify=True, source_crs='EPSG:4326', target_crs=None)
Generate cluster polygons and save to shapefile
- classmethod cluster(cf_h5_path, region_gen_gids, n_clusters, method='kmeans', method_kwargs=None, dist_rank_filter=True, dist_rmse_kwargs=None, contiguous_filter=True, contiguous_kwargs=None, region=None)
Entry point for RPMClusters to get clusters for a given region, defined as a list or array of gen_gids
- Parameters:
cf_h5_path (str) – Path to reV .h5 files containing desired capacity factor profiles
region_gen_gids (list | ndarray) – List or vector of gen_gids to cluster on
n_clusters (int) – Number of clusters to identify
method (str) – Method to use to cluster coefficients
method_kwargs (dict) – Kwargs for running _cluster_coefficients
dist_rank_filter (bool) – Re-cluster data by minimizing the sum of the:
- distance between each point and each cluster centroid
dist_rmse_kwargs (dict) – Kwargs for running _dist_rank_optimization
contiguous_filter (bool) – Re-classify clusters by making contiguous cluster polygons
contiguous_kwargs (dict) – Kwargs for _contiguous_filter
region (str | None) – Optional identifier of the region being clustered, used to make debugging output easier to trace.
- Returns:
out (pandas.DataFrame) – Cluster results: (gen_gid, lon, lat, cluster_id, rank)
Examples
>>> from reV import Resource
>>>
>>> fname = '$TESTDATADIR/reV_gen/gen_pv_2012.h5'
>>> with Resource(fname) as res:
...     gen_gids = res.meta.index.values
>>>
>>> RPMClusters.cluster(fname, gen_gids, n_clusters=6)
    gen_gid   latitude  longitude  cluster_id                    geometry
0         0  41.290001 -71.860001           0  POINT (-71.86000 41.29000)
1         1  41.290001 -71.820000           0  POINT (-71.82000 41.29000)
2         2  41.250000 -71.820000           4  POINT (-71.82000 41.25000)
3         3  41.330002 -71.820000           0  POINT (-71.82000 41.33000)
4         4  41.369999 -71.820000           0  POINT (-71.82000 41.37000)
..      ...        ...        ...         ...                         ...
95       95  41.250000 -71.660004           4  POINT (-71.66000 41.25000)
96       96  41.889999 -71.660004           5  POINT (-71.66000 41.89000)
97       97  41.450001 -71.660004           3  POINT (-71.66000 41.45000)
98       98  41.610001 -71.660004           1  POINT (-71.66000 41.61000)
99       99  41.410000 -71.660004           3  POINT (-71.66000 41.41000)
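The dist_rank_filter parameter is described in terms of the distance between each point and each cluster centroid. The sketch below shows only that distance idea in isolation, by assigning every point to its nearest centroid. It is not the reVX _dist_rank_optimization routine: the function name `assign_to_nearest_centroid`, the toy arrays, and the use of plain Euclidean distance in coordinate space are all assumptions made for illustration.

```python
import numpy as np


def assign_to_nearest_centroid(points, centroids):
    """Return, for each point, the index of the closest centroid.

    Hypothetical helper: Euclidean distance is an assumption of this sketch.
    """
    # Pairwise differences via broadcasting, shape (n_points, n_centroids, 2)
    diffs = points[:, np.newaxis, :] - centroids[np.newaxis, :, :]
    # Euclidean distances, shape (n_points, n_centroids)
    dists = np.linalg.norm(diffs, axis=2)
    return dists.argmin(axis=1)


# Toy data: two points near each of two centroids
points = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [4.8, 5.1]])
centroids = np.array([[0.0, 0.0], [5.0, 5.0]])

labels = assign_to_nearest_centroid(points, centroids)
print(labels)
```

Broadcasting keeps the distance computation vectorized, at the cost of an intermediate array of size n_points by n_centroids, which is fine for region-sized inputs like the 100-site example.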