flasc.utilities.energy_ratio_utilities

Utility functions for calculating energy ratios.

Functions

add_bin_weights

Add weights to DataFrame bins.

add_power_ref

Add the pow_ref column to a dataframe, given which columns to average over.

add_power_test

Add the pow_test column to a dataframe, given which columns to average over.

add_reflected_rows

Add reflected rows to a dataframe.

add_wd

Add the wd column to a dataframe, given which columns to average over.

add_wd_bin

Add the wd_bin column to a dataframe.

add_ws

Add the ws column to a dataframe, given which columns to average over.

add_ws_bin

Add the ws_bin column to a dataframe.

bin_and_group_dataframe

Bin and aggregate a DataFrame based on wind direction and wind speed parameters.

bin_column

Bins the values in the specified column of a Polars DataFrame according to the given edges.

check_compute_energy_ratio_inputs

Check the inputs to compute_energy_ratio.

cut

Bins the values in the specified column according to the given edges.

filter_all_nulls

Filter dataframe for ALL nulls.

filter_any_nulls

Filter dataframe for ANY nulls.

flasc.utilities.energy_ratio_utilities.cut(col_name, edges)

Bins the values in the specified column according to the given edges.

Return type:

Expr

Parameters:
  • col_name (str) -- The name of the column to bin.

  • edges (array-like) -- The edges of the bins. Values will be placed into the bin whose left edge is the largest edge less than or equal to the value, and whose right edge is the smallest edge greater than the value.

Returns:

An expression object that can be used to bin the column.
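The bin rule described for edges can be illustrated without Polars. The following is a minimal pure-Python sketch, not the FLASC implementation: find_bin is a hypothetical helper, and returning None for values outside the outermost edges is an assumption, since the behaviour there is not documented above.

```python
from bisect import bisect_right
from typing import Optional, Sequence, Tuple


def find_bin(value: float, edges: Sequence[float]) -> Optional[Tuple[float, float]]:
    """Return the (left, right) edges of the bin that holds value."""
    # Index of the largest edge that is less than or equal to value.
    i = bisect_right(edges, value) - 1
    # Assumption: values below the first edge, or at or above the last
    # edge, belong to no bin.
    if i < 0 or i >= len(edges) - 1:
        return None
    # The right edge is the smallest edge greater than value.
    return edges[i], edges[i + 1]


edges = [0.0, 2.0, 4.0, 6.0]
bins = [find_bin(v, edges) for v in [0.0, 1.9, 2.0, 5.5, 6.0, -1.0]]
```

Note that a value equal to an interior edge, such as 2.0, lands in the bin to the right of that edge, because the left edge is inclusive and the right edge is exclusive.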

flasc.utilities.energy_ratio_utilities.bin_column(df_, col_name, bin_col_name, edges)

Bins the values in the specified column of a Polars DataFrame according to the given edges.

Return type:

DataFrame

Parameters:
  • df_ (pl.DataFrame) -- The Polars DataFrame containing the column to bin.

  • col_name (str) -- The name of the column to bin.

  • bin_col_name (str) -- The name to give the new column containing the bin labels.

  • edges (array-like) -- The edges of the bins. Values will be placed into the bin whose left edge is the largest edge less than or equal to the value, and whose right edge is the smallest edge greater than the value.

Returns:

A new Polars DataFrame with an additional column containing the bin labels.

Return type:

pl.DataFrame

flasc.utilities.energy_ratio_utilities.add_ws(df_, ws_cols, remove_all_nulls=False)

Add the ws column to a dataframe, given which columns to average over.

Return type:

DataFrame

Parameters:
  • df_ (pl.DataFrame) -- The Polars DataFrame containing the columns to average.

  • ws_cols (list(str)) -- The names of the columns to average across.

  • remove_all_nulls (bool) -- If True, keep a row only when all of ws_cols are non-null; if False, keep it when any of them is non-null. Defaults to False.

Returns:

A new Polars DataFrame with an additional ws column

Return type:

pl.DataFrame
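The averaging and null handling can be sketched in plain Python. This is an illustration under stated assumptions rather than the FLASC code: add_mean_column is a hypothetical helper, rows are modelled as dictionaries with None standing in for null, and the reading of remove_all_nulls (True keeps only rows where every column is valid) follows the description given for bin_and_group_dataframe.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def add_mean_column(rows: List[Row], cols: List[str], out_col: str,
                    remove_all_nulls: bool = False) -> List[Row]:
    """Add the row-wise mean of cols as out_col, dropping rows with too many nulls."""
    kept = []
    for row in rows:
        # Only non-null values contribute to the mean.
        valid = [row[c] for c in cols if row[c] is not None]
        if remove_all_nulls:
            # Keep the row only if every column holds a value.
            keep = bool(valid) and len(valid) == len(cols)
        else:
            # Keep the row if at least one column holds a value.
            keep = bool(valid)
        if keep:
            kept.append({**row, out_col: sum(valid) / len(valid)})
    return kept


rows = [
    {"ws_000": 8.0, "ws_001": 10.0},
    {"ws_000": 6.0, "ws_001": None},
    {"ws_000": None, "ws_001": None},
]
any_valid = add_mean_column(rows, ["ws_000", "ws_001"], "ws")
all_valid = add_mean_column(rows, ["ws_000", "ws_001"], "ws", remove_all_nulls=True)
```

With the default setting the second row survives with a mean taken over its single valid value; with remove_all_nulls=True only the first row remains.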

flasc.utilities.energy_ratio_utilities.add_ws_bin(df_, ws_cols, ws_step=1.0, ws_min=-0.5, ws_max=50.0, edges=None, remove_all_nulls=False)

Add the ws_bin column to a dataframe.

The column is derived from the columns to average over and the step size to use.

Return type:

DataFrame

Parameters:
  • df_ (pl.DataFrame) -- The Polars DataFrame containing the columns to average and bin.

  • ws_cols (list(str)) -- The names of the columns to average across.

  • ws_step (float) -- Step size for binning

  • ws_min (float) -- Minimum wind speed

  • ws_max (float) -- Maximum wind speed

  • edges (array-like) -- The edges of the bins. Values will be placed into the bin whose left edge is the largest edge less than or equal to the value, and whose right edge is the smallest edge greater than the value. Defaults to None, in which case the edges are generated using ws_step, ws_min, and ws_max.

  • remove_all_nulls (bool) -- If True, keep a row only when all of ws_cols are non-null; if False, keep it when any of them is non-null. Defaults to False.

Returns:

A new Polars DataFrame with an additional ws_bin column

Return type:

pl.DataFrame
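How the edges are generated from ws_step, ws_min, and ws_max is not spelled out above. The sketch below shows one plausible scheme and should be read as an assumption, not as the FLASC behaviour: make_edges is a hypothetical helper that starts at the minimum and adds evenly spaced edges until the maximum is covered.

```python
import math
from typing import List


def make_edges(step: float, vmin: float, vmax: float) -> List[float]:
    """Evenly spaced bin edges from vmin that extend far enough to cover vmax."""
    if step <= 0:
        raise ValueError("step must be positive")
    # Number of bins needed so that the last edge is at or beyond vmax.
    # The small tolerance guards against floating-point round-off.
    n_bins = max(1, math.ceil((vmax - vmin) / step - 1e-9))
    return [vmin + i * step for i in range(n_bins + 1)]


# Default wind speed arguments of add_ws_bin.
ws_edges = make_edges(1.0, -0.5, 50.0)
# Default wind direction arguments of add_wd_bin.
wd_edges = make_edges(2.0, 0.0, 360.0)
```

Under this assumption the default wind speed arguments produce bins centered on whole numbers, since the first bin runs from -0.5 to 0.5.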

flasc.utilities.energy_ratio_utilities.add_wd(df_, wd_cols, remove_all_nulls=False)

Add the wd column to a dataframe, given which columns to average over.

Return type:

DataFrame

Parameters:
  • df_ (pl.DataFrame) -- The Polars DataFrame containing the columns to average.

  • wd_cols (list(str)) -- The names of the columns to average across.

  • remove_all_nulls (bool) -- If True, keep a row only when all of wd_cols are non-null; if False, keep it when any of them is non-null. Defaults to False.

Returns:

A new Polars DataFrame with an additional wd column

Return type:

pl.DataFrame

flasc.utilities.energy_ratio_utilities.add_wd_bin(df_, wd_cols, wd_step=2.0, wd_min=0.0, wd_max=360.0, edges=None, remove_all_nulls=False)

Add the wd_bin column to a dataframe.

The column is derived from the columns to average over and the step size to use.

Parameters:
  • df_ (pl.DataFrame) -- The Polars DataFrame containing the columns to average and bin.

  • wd_cols (list(str)) -- The names of the columns to average across.

  • wd_step (float) -- Step size for binning

  • wd_min (float) -- Minimum wind direction

  • wd_max (float) -- Maximum wind direction

  • edges (array-like) -- The edges of the bins. Values will be placed into the bin whose left edge is the largest edge less than or equal to the value, and whose right edge is the smallest edge greater than the value. Defaults to None, in which case the edges are generated using wd_step, wd_min, and wd_max.

  • remove_all_nulls (bool) -- If True, keep a row only when all of wd_cols are non-null; if False, keep it when any of them is non-null. Defaults to False.

Returns:

A new Polars DataFrame with an additional wd_bin column

Return type:

pl.DataFrame

flasc.utilities.energy_ratio_utilities.add_power_test(df_, test_cols)

Add the pow_test column to a dataframe, given which columns to average over.

Return type:

DataFrame

Parameters:
  • df_ (pl.DataFrame) -- The Polars DataFrame containing the columns to average.

  • test_cols (list(str)) -- The names of the columns to average across.

Returns:

A new Polars DataFrame with an additional pow_test column

Return type:

pl.DataFrame

flasc.utilities.energy_ratio_utilities.add_power_ref(df_, ref_cols)

Add the pow_ref column to a dataframe, given which columns to average over.

Parameters:
  • df_ (pl.DataFrame) -- The Polars DataFrame containing the columns to average.

  • ref_cols (list(str)) -- The names of the columns to average across.

Returns:

A new Polars DataFrame with an additional pow_ref column

Return type:

pl.DataFrame

flasc.utilities.energy_ratio_utilities.add_reflected_rows(df_, edges, overlap_distance)

Add reflected rows to a dataframe.

Adds rows to a dataframe in which the wind direction is reflected around the nearest edge, for rows that lie within overlap_distance of that edge.

Given a wind direction DataFrame df_, this function appends a reflected copy of every row whose wind direction lies within overlap_distance of its nearest edge in edges. The reflected wind direction is the mirror image of the original about that edge: the signed difference between the wind direction and the edge is subtracted twice from the wind direction, which gives 2 * edge - wd. The result is then wrapped to the range [0, 360) degrees. The function returns a new DataFrame with the original rows and the added reflected rows.

This function enables overlapping bins in the energy ratio functions.

Parameters:
  • df_ (polars.DataFrame) -- The DataFrame to add reflected rows to.

  • edges (ndarray | list) -- An array of wind direction edges to use for reflection. (Should be the same as used in the energy ratio.)

  • overlap_distance (float) -- The maximum distance between a wind direction and an edge for the wind direction to be considered overlapping.

Returns:

polars.DataFrame

A new DataFrame with the original rows and the added reflected rows.
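The reflection step can be sketched on a plain list of wind directions. This is a simplified illustration, not the FLASC implementation: reflect_rows is a hypothetical helper, it carries only the wind direction rather than whole rows, and treating the overlap test as a strict inequality is an assumption.

```python
from typing import List, Sequence


def reflect_rows(wds: Sequence[float], edges: Sequence[float],
                 overlap_distance: float) -> List[float]:
    """Return the original wind directions followed by the reflected ones."""
    reflected = []
    for wd in wds:
        # Nearest bin edge to this wind direction.
        nearest = min(edges, key=lambda e: abs(wd - e))
        # Assumption: only directions strictly closer than
        # overlap_distance to an edge are reflected.
        if abs(wd - nearest) < overlap_distance:
            # Mirror about the edge, then wrap to [0, 360).
            reflected.append((2.0 * nearest - wd) % 360.0)
    return list(wds) + reflected


edges = [0.0, 90.0, 180.0, 270.0, 360.0]
result = reflect_rows([88.0, 45.0, 359.0, 181.5], edges, overlap_distance=5.0)
```

Here 88.0 is mirrored about 90.0 to 92.0, so the same sample also contributes to the neighbouring bin; 45.0 is too far from any edge and gains no reflected copy; 359.0 is mirrored about 360.0 to 361.0 and wrapped to 1.0.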

flasc.utilities.energy_ratio_utilities.filter_all_nulls(df_, ref_cols, test_cols, ws_cols, wd_cols)

Filter dataframe for ALL nulls.

Filter data by requiring ALL values of ref, test, ws, and wd to be valid numbers.

Parameters:
  • df_ (pl.DataFrame) -- Polars dataframe possibly containing null values

  • ref_cols (list[str]) -- A list of columns to use as the reference turbines

  • test_cols (list[str]) -- A list of columns to use as the test turbines

  • wd_cols (list[str]) -- A list of columns to derive the wind directions from

  • ws_cols (list[str]) -- A list of columns to derive the wind speeds from

Returns:

A dataframe containing only the rows in which all of the ref, test, ws, and wd columns hold valid values.

Return type:

pl.DataFrame

flasc.utilities.energy_ratio_utilities.filter_any_nulls(df_, ref_cols, test_cols, ws_cols, wd_cols)

Filter dataframe for ANY nulls.

Filter data by requiring ANY of ref, ANY of test, ANY of ws, and ANY of wd to be a valid number.

Parameters:
  • df_ (pl.DataFrame) -- Polars dataframe possibly containing null values

  • ref_cols (list[str]) -- A list of columns to use as the reference turbines

  • test_cols (list[str]) -- A list of columns to use as the test turbines

  • wd_cols (list[str]) -- A list of columns to derive the wind directions from

  • ws_cols (list[str]) -- A list of columns to derive the wind speeds from

Returns:

A dataframe containing only the rows in which at least one ref, one test, one ws, and one wd column holds a valid value.

Return type:

pl.DataFrame
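The difference between the ALL and ANY filters can be shown with a small pure-Python sketch. It is an illustration only: keep_row is a hypothetical helper, rows are dictionaries with None standing in for null, and the column names are made up for the example.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def keep_row(row: Row, groups: List[List[str]], require_all: bool) -> bool:
    """Decide whether a row passes the ALL filter or the ANY filter."""
    # Within each group (ref, test, ws, wd) require either every column
    # or at least one column to hold a value.
    check = all if require_all else any
    # Every group has to pass for the row to be kept.
    return all(check(row[c] is not None for c in cols) for cols in groups)


# Hypothetical column groups: ref, test, ws, wd.
groups = [["pow_000", "pow_001"], ["pow_002"], ["ws_000"], ["wd_000"]]
rows = [
    {"pow_000": 1.0, "pow_001": 2.0, "pow_002": 3.0, "ws_000": 8.0, "wd_000": 270.0},
    {"pow_000": None, "pow_001": 2.0, "pow_002": 3.0, "ws_000": 8.0, "wd_000": 270.0},
    {"pow_000": None, "pow_001": None, "pow_002": 3.0, "ws_000": 8.0, "wd_000": 270.0},
]
kept_all = [i for i, r in enumerate(rows) if keep_row(r, groups, require_all=True)]
kept_any = [i for i, r in enumerate(rows) if keep_row(r, groups, require_all=False)]
```

The second row, with one of two reference columns missing, passes the ANY filter but not the ALL filter. The third row has no valid reference column and fails both.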

flasc.utilities.energy_ratio_utilities.check_compute_energy_ratio_inputs(df_, ref_turbines, test_turbines, wd_turbines, ws_turbines, use_predefined_ref, use_predefined_wd, use_predefined_ws, wd_step, wd_min, wd_max, ws_step, ws_min, ws_max, bin_cols_in, weight_by, df_freq, wd_bin_overlap_radius, uplift_pairs, uplift_names, uplift_absolute, N, percentiles, remove_all_nulls)

Check the inputs to compute_energy_ratio.

The inputs mirror those of compute_energy_ratio, with the exception of df_, which is passed directly instead of er_in.

All inputs of compute_energy_ratio are accepted, but not every one of them is checked for validity.

Parameters:
  • df_ (pl.DataFrame) -- The Polars DataFrame

  • ref_turbines (list) -- A list of the reference turbine columns

  • test_turbines (list) -- A list of the test turbine columns

  • wd_turbines (list) -- A list of the wind direction columns

  • ws_turbines (list) -- A list of the wind speed columns

  • use_predefined_ref (bool) -- Whether to use predefined reference turbines

  • use_predefined_wd (bool) -- Whether to use predefined wind direction turbines

  • use_predefined_ws (bool) -- Whether to use predefined wind speed turbines

  • wd_step (float) -- Step size for binning wind direction

  • wd_min (float) -- Minimum wind direction

  • wd_max (float) -- Maximum wind direction

  • ws_step (float) -- Step size for binning wind speed

  • ws_min (float) -- Minimum wind speed

  • ws_max (float) -- Maximum wind speed

  • bin_cols_in (list) -- A list of columns to bin

  • weight_by (str) -- A string indicating how to weight the bins

  • df_freq (pl.DataFrame) -- A DataFrame containing frequency data

  • wd_bin_overlap_radius (float) -- The radius for overlapping wind direction bins

  • uplift_pairs (list) -- A list of uplift pairs

  • uplift_names (list) -- A list of uplift names

  • uplift_absolute (bool) -- Whether to use absolute uplift

  • N (int) -- Number of bootstrapping iterations

  • percentiles (list) -- A list of percentiles to calculate from bootstrap

  • remove_all_nulls (bool) -- Whether to remove all nulls

flasc.utilities.energy_ratio_utilities.bin_and_group_dataframe(df_, ref_cols, test_cols, wd_cols, ws_cols, wd_step=2.0, wd_min=0.0, wd_max=360.0, ws_step=1.0, ws_min=0.0, ws_max=50.0, wd_bin_overlap_radius=0.0, remove_all_nulls=False, bin_cols_without_df_name=None, num_df=0)

Bin and aggregate a DataFrame based on wind direction and wind speed parameters.

This function takes a Polars DataFrame (df_) and performs binning and aggregation operations based on wind direction (wd) and wind speed (ws). It allows for optional handling of reflected rows and grouping by specific columns. The resulting DataFrame contains aggregated statistics for reference and test power columns within specified bins.

Parameters:
  • df_ (DataFrame) -- The input Polars DataFrame to be processed.

  • ref_cols (List[str]) -- List of columns containing reference power data.

  • test_cols (List[str]) -- List of columns containing test power data.

  • wd_cols (List[str]) -- List of columns containing wind direction data.

  • ws_cols (List[str]) -- List of columns containing wind speed data.

  • wd_step (float, optional) -- Step size for wind direction binning. Defaults to 2.0.

  • wd_min (float, optional) -- Minimum wind direction value. Defaults to 0.0.

  • wd_max (float, optional) -- Maximum wind direction value. Defaults to 360.0.

  • ws_step (float, optional) -- Step size for wind speed binning. Defaults to 1.0.

  • ws_min (float, optional) -- Minimum wind speed value. Defaults to 0.0.

  • ws_max (float, optional) -- Maximum wind speed value. Defaults to 50.0.

  • wd_bin_overlap_radius (float, optional) -- Radius for overlapping wind direction bins. Defaults to 0.0.

  • remove_all_nulls (bool, optional) -- If True, keep a row only when all of its values are valid, rather than when any is valid. Defaults to False.

  • bin_cols_without_df_name (List[str], optional) -- List of columns used for grouping without 'df_name'. Defaults to None.

  • num_df (int, optional) -- Number of dataframes required for each bin combination. Defaults to 0.

Returns:

The resulting Polars DataFrame with aggregated statistics.

Return type:

DataFrame
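The grouping step can be pictured with a short pure-Python sketch. It assumes rows that already carry wd_bin, ws_bin, pow_ref, and pow_test values, and it assumes that the aggregated statistics are the per-bin means plus a sample count; the documentation above does not list the statistics, so treat that choice and the helper name group_by_bins as illustrative.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_bins(rows: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Aggregate pow_ref and pow_test per (wd_bin, ws_bin) combination."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["wd_bin"], row["ws_bin"])].append(row)

    out = []
    for (wd_bin, ws_bin), members in sorted(groups.items()):
        n = len(members)
        out.append({
            "wd_bin": wd_bin,
            "ws_bin": ws_bin,
            # Assumed statistics: mean power per bin and the sample count.
            "pow_ref": sum(m["pow_ref"] for m in members) / n,
            "pow_test": sum(m["pow_test"] for m in members) / n,
            "count": n,
        })
    return out


rows = [
    {"wd_bin": 271.0, "ws_bin": 8.0, "pow_ref": 100.0, "pow_test": 90.0},
    {"wd_bin": 271.0, "ws_bin": 8.0, "pow_ref": 120.0, "pow_test": 100.0},
    {"wd_bin": 273.0, "ws_bin": 8.0, "pow_ref": 200.0, "pow_test": 150.0},
]
grouped = group_by_bins(rows)
```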

flasc.utilities.energy_ratio_utilities.add_bin_weights(df_, df_freq_pl=None, bin_cols_without_df_name=None, weight_by='min')

Add weights to DataFrame bins.

Add weights to DataFrame bins based on either frequency counts or the provided frequency table df_freq_pl.

This function assigns weights to DataFrame bins. If 'df_freq_pl' is provided, these weights are used directly. If 'df_freq_pl' is not provided, the function calculates the weights from the input DataFrame 'df_'. Weights can be determined as either the minimum ('min') or the sum ('sum') of counts.

Parameters:
  • df_ (DataFrame) -- The input Polars DataFrame containing bins and frequency counts.

  • df_freq_pl (DataFrame, optional) -- A Polars DataFrame containing frequency counts for bins. If not provided, the function will calculate these counts from 'df_'.

  • bin_cols_without_df_name (List, optional) -- List of columns used for grouping bins without 'df_name'.

  • weight_by (str, optional) -- Weight calculation method, either 'min' (minimum count) or 'sum' (sum of counts). Defaults to 'min'.

Returns:

A tuple containing the modified DataFrame 'df_' with added weights and the DataFrame 'df_freq_pl' with the calculated frequency counts.

Return type:

Tuple[pl.DataFrame, pl.DataFrame]

Raises:
  • RuntimeError -- If none of the ws/wd bins in data appear in df_freq.

  • UserWarning -- If some bins in data are not in df_freq and will receive a weight of 0.
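The 'min' and 'sum' options can be illustrated with a pure-Python sketch. It assumes that the counts being combined are the per-dataframe sample counts within each bin, which is suggested by the grouping on bin columns without 'df_name' but not stated outright. The helper bin_weights and the dataframe names are hypothetical, and any normalisation of the weights is left out.

```python
from collections import defaultdict
from typing import Dict, Tuple

BinKey = Tuple[float, float]  # (wd_bin, ws_bin)


def bin_weights(counts: Dict[Tuple[BinKey, str], int],
                weight_by: str = "min") -> Dict[BinKey, int]:
    """Combine per-dataframe counts into a single weight per bin."""
    if weight_by not in ("min", "sum"):
        raise ValueError("weight_by must be 'min' or 'sum'")
    per_bin = defaultdict(list)
    for (bin_key, _df_name), n in counts.items():
        per_bin[bin_key].append(n)
    # Assumption: the weight is the minimum or the sum of the counts
    # that the individual dataframes contribute to the bin.
    combine = min if weight_by == "min" else sum
    return {bin_key: combine(ns) for bin_key, ns in per_bin.items()}


counts = {
    ((271.0, 8.0), "baseline"): 40,
    ((271.0, 8.0), "controlled"): 25,
    ((273.0, 8.0), "baseline"): 10,
    ((273.0, 8.0), "controlled"): 30,
}
w_min = bin_weights(counts, "min")
w_sum = bin_weights(counts, "sum")
```

With 'min', a bin counts only as much as its least-populated dataframe allows; with 'sum', every sample in the bin contributes.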
