gaps.project_points.ProjectPoints

class ProjectPoints(points, **kwargs)

Bases: object

Class to manage site and SAM input configuration requests.

Initialize ProjectPoints.

Parameters:
  • points (int | slice | list | tuple | str | pd.DataFrame | dict) – Slice specifying project points, string pointing to a project points csv, or a DataFrame containing the effective csv contents. Can also be a single integer site value.

  • **kwargs – Keyword-argument pairs to add to project points DataFrame. The key should be the column name, and the value should be the value to add under the column name. Values must either be a scalar or match the length of the DataFrame resulting from the points input.
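The rule for **kwargs values (a scalar, or a sequence as long as the points table) matches ordinary pandas column assignment. The sketch below uses plain pandas as a stand-in and is not the ProjectPoints implementation; the helper add_columns and the column names capacity and tilt are hypothetical.

```python
import pandas as pd


def add_columns(df, **kwargs):
    """Hypothetical helper: add each keyword as a column (not the gaps code)."""
    out = df.copy()
    for column, value in kwargs.items():
        # A scalar is broadcast to every row; a sequence must match len(df),
        # otherwise pandas raises ValueError.
        out[column] = value
    return out


# Stand-in for the DataFrame produced by a points input such as slice(0, 3).
points_df = pd.DataFrame({"gid": [0, 1, 2]})
with_extras = add_columns(points_df, capacity=100, tilt=[10, 20, 30])

try:
    add_columns(points_df, tilt=[10, 20])  # wrong length for 3 rows
    length_error = False
except ValueError:
    length_error = True
```

Here capacity is repeated on every row, tilt is assigned row by row, and the two-element list is rejected because it does not match the three rows.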

Methods

  • from_range(split_range, points, **kwargs) – Create a ProjectPoints instance from a range of indices.

  • get_sites_from_key(key, value) – Get a site list for which the key equals the value.

  • index(gid) – Index location (iloc not loc) for a resource gid.

  • join_df(df2[, key]) – Join df2 to the _df attribute using _df's gid as the join key.

  • split([sites_per_split]) – Split the project points into sub-groups by number of sites.

Attributes

  • df – Project points DataFrame of site info.

  • gids – Gids (resource file index values) of sites.

  • sites_as_slice – Sites in slice format or list if non-sequential.

property df

Project points DataFrame of site info.

Type:

pd.DataFrame

property gids

Gids (resource file index values) of sites.

Type:

list

property sites_as_slice

Sites in slice format or list if non-sequential.

Type:

list | slice
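One way to read "slice format or list if non-sequential" is sketched below. This is an illustration only: the helper as_slice_or_list is hypothetical, and it assumes "sequential" means ascending gids with a step of exactly 1, which the documentation does not spell out.

```python
def as_slice_or_list(gids):
    """Hypothetical helper: slice for consecutive gids, otherwise a list."""
    gids = list(gids)
    # Assumption: "sequential" means ascending with a step of exactly 1.
    if gids and gids == list(range(gids[0], gids[-1] + 1)):
        return slice(gids[0], gids[-1] + 1)  # slice end is exclusive
    return gids


sequential = as_slice_or_list([3, 4, 5, 6])
scattered = as_slice_or_list([3, 7, 8])
```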

index(gid)

Index location (iloc not loc) for a resource gid.

Parameters:

gid (int) – Resource GID found in the project points gid column.

Returns:

ind (int) – Row index of gid in the project points DataFrame.
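The iloc/loc distinction matters whenever the DataFrame index labels differ from row positions. The sketch below shows the idea with plain pandas; row_position is a hypothetical helper, not the gaps method, and the index labels are made up for illustration.

```python
import pandas as pd

# Stand-in project points table; the index labels are deliberately not 0..n-1.
df = pd.DataFrame({"gid": [10, 20, 30]}, index=[100, 101, 102])


def row_position(df, gid):
    """Hypothetical helper: positional (iloc) row of a gid, not its label."""
    # list.index raises ValueError when the gid is absent.
    return df["gid"].tolist().index(gid)


pos = row_position(df, 20)
label = df.index[pos]
```

The returned position is valid for df.iloc, while the label at that position would be needed for df.loc.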

join_df(df2, key='gid')

Join df2 to the _df attribute using _df’s gid as the join key.

This can be used to add site-specific data to the project_points, taking advantage of the ProjectPoints iterator/split functions such that only the relevant site data is passed to the analysis functions.

Parameters:
  • df2 (pd.DataFrame) – DataFrame to be joined to the df attribute (this instance of project points DataFrame). This likely contains site-specific inputs that are to be passed to parallel workers.

  • key (str) – Primary key of df2 to be joined to the df attribute (this instance of the project points DataFrame). Primary key of the self._df attribute is fixed as the gid column.
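The effect of joining site-specific data on the gid column can be sketched with a plain pandas merge. This is not the join_df implementation: the left join is an assumption, and the column names site_id, config, and losses are made up for the example.

```python
import pandas as pd

points_df = pd.DataFrame({"gid": [0, 1, 2], "config": ["a", "a", "b"]})
# Site-specific inputs keyed by a differently named column ("site_id" is made up).
site_data = pd.DataFrame({"site_id": [2, 0, 1], "losses": [0.3, 0.1, 0.2]})

# Assumption: a left join, so every project point is kept in its original order.
joined = points_df.merge(
    site_data, how="left", left_on="gid", right_on="site_id"
).drop(columns="site_id")
```

Each row of the result carries the losses value for its own gid, regardless of the row order in site_data.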

get_sites_from_key(key, value)

Get a site list for which the key equals the value.

Parameters:
  • key (str) – Name of key (column) in project points DataFrame.

  • value (int | float | str | obj) – Value to look for under the key column.

Returns:

sites (list of ints) – List of sites (GID values) associated with the requested key and value. If the key or value does not exist, an empty list is returned.
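The lookup described above, including the empty-list fallback, can be approximated with a boolean mask in plain pandas. The helper sites_where and the config column are hypothetical and stand in for the real method.

```python
import pandas as pd

df = pd.DataFrame({"gid": [0, 1, 2, 3], "config": ["a", "b", "a", "b"]})


def sites_where(df, key, value):
    """Hypothetical helper: gids whose `key` column equals `value`."""
    if key not in df.columns:
        return []  # unknown key: empty list rather than an error
    return df.loc[df[key] == value, "gid"].tolist()


sites_a = sites_where(df, "config", "a")
no_value = sites_where(df, "config", "z")
no_key = sites_where(df, "missing", "a")
```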

split(sites_per_split=100)

Split the project points into sub-groups by number of sites.

Parameters:

sites_per_split (int, optional) – Number of points in each sub-group. By default, 100.

Yields:

ProjectPoints – A new ProjectPoints instance containing at most sites_per_split points.
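The chunking arithmetic behind a split like this can be sketched in pure Python. The generator split_ranges is hypothetical; it only computes the positional ranges, whereas the real method yields ProjectPoints instances.

```python
def split_ranges(n_sites, sites_per_split=100):
    """Yield (start, end) positional ranges covering n_sites rows; end is exclusive."""
    for start in range(0, n_sites, sites_per_split):
        # The final chunk is shorter when n_sites is not a multiple of the size.
        yield start, min(start + sites_per_split, n_sites)


chunks = list(split_ranges(250))
```

With 250 sites and the default size, the last chunk holds the remaining 50 sites.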

classmethod from_range(split_range, points, **kwargs)

Create a ProjectPoints instance from a range of indices.

Parameters:
  • split_range (2-tuple) – Tuple containing the start and end index (iloc, not loc). Last index is not included.

  • points (int | slice | list | tuple | str | pd.DataFrame | dict) – Slice specifying project points, string pointing to a project points csv, or a DataFrame containing the effective csv contents. Can also be a single integer site value.

  • **kwargs – Keyword-argument pairs to add to project points DataFrame. The key should be the column name, and the value should be the value to add under the column name. Values must either be a scalar or match the length of the DataFrame resulting from the points input.

Returns:

ProjectPoints – A new ProjectPoints instance with a range sampled from the points input according to split_range.
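Because split_range is positional (iloc) with an exclusive end, it selects rows by position rather than by gid value. A minimal pandas sketch of that selection, using a stand-in table rather than ProjectPoints itself:

```python
import pandas as pd

# Gids that do not start at zero, to separate row position from gid value.
df = pd.DataFrame({"gid": [10, 11, 12, 13, 14]})

split_range = (1, 3)  # positions 1 and 2; position 3 is excluded
subset = df.iloc[split_range[0]:split_range[1]]
subset_gids = subset["gid"].tolist()
```

The range (1, 3) picks the second and third rows, so the selected gids are 11 and 12, not 1 and 2.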