flasc.utilities.energy_ratio_utilities
Utility functions for calculating energy ratios.
Functions
add_bin_weights | Add weights to DataFrame bins.
add_power_ref | Add the pow_ref column to a dataframe, given which columns to average over.
add_power_test | Add the pow_test column to a dataframe, given which columns to average over.
add_reflected_rows | Add reflected rows to a dataframe.
add_wd | Add the wd column to a dataframe, given which columns to average over.
add_wd_bin | Add the wd_bin column to a dataframe.
add_ws | Add the ws column to a dataframe, given which columns to average over.
add_ws_bin | Add the ws_bin column to a dataframe.
bin_and_group_dataframe | Bin and aggregate a DataFrame based on wind direction and wind speed parameters.
bin_column | Bins the values in the specified column of a Polars DataFrame according to the given edges.
check_compute_analysis_inputs | Check the inputs to compute_energy_ratio.
cut | Bins the values in the specified column according to the given edges.
filter_all_nulls | Filter dataframe for ALL nulls.
filter_any_nulls | Filter dataframe for ANY nulls.
- flasc.utilities.energy_ratio_utilities.cut(col_name: str, edges: ndarray | list) -> Expr
Bins the values in the specified column according to the given edges.
- Parameters:
col_name (str) -- The name of the column to bin.
edges (array-like) -- The edges of the bins. Values will be placed into the bin whose left edge is the largest edge less than or equal to the value, and whose right edge is the smallest edge greater than the value.
- Returns:
An expression that can be used to bin the column.
- Return type:
Expr
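The binning rule described above amounts to half-open intervals of the form [left edge, right edge). The following plain-Python sketch illustrates that rule only; it is not the library's implementation. The helper name `find_bin` is hypothetical, `edges` is assumed to be sorted in ascending order, and treating values outside the edges as belonging to no bin is an assumption.

```python
from bisect import bisect_right


def find_bin(value, edges):
    """Return the (left, right) edges of the bin holding value, or None."""
    # bisect_right counts how many edges are <= value.
    i = bisect_right(edges, value)
    # i == 0: value is below the first edge.
    # i == len(edges): value is at or above the last edge.
    # Both cases are treated as "no bin" (an assumption of this sketch).
    if i == 0 or i == len(edges):
        return None
    return edges[i - 1], edges[i]


edges = [0.0, 2.0, 4.0]
print(find_bin(1.9, edges))  # (0.0, 2.0)
print(find_bin(2.0, edges))  # (2.0, 4.0): a value on an edge goes to the bin on its right
print(find_bin(4.0, edges))  # None
```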
- flasc.utilities.energy_ratio_utilities.bin_column(df_: DataFrame, col_name: str, bin_col_name: str, edges: ndarray | list) -> DataFrame
Bins the values in the specified column of a Polars DataFrame according to the given edges.
- Parameters:
df_ (pl.DataFrame) -- The Polars DataFrame containing the column to bin.
col_name (str) -- The name of the column to bin.
bin_col_name (str) -- The name to give the new column containing the bin labels.
edges (array-like) -- The edges of the bins. Values will be placed into the bin whose left edge is the largest edge less than or equal to the value, and whose right edge is the smallest edge greater than the value.
- Returns:
A new Polars DataFrame with an additional column containing the bin labels.
- Return type:
pl.DataFrame
- flasc.utilities.energy_ratio_utilities.add_ws(df_: DataFrame, ws_cols: List[str], remove_all_nulls: bool = False) -> DataFrame
Add the ws column to a dataframe, given which columns to average over.
- Parameters:
df_ (pl.DataFrame) -- The Polars DataFrame containing the wind speed columns.
ws_cols (list(str)) -- The names of the columns to average across.
remove_all_nulls (bool) -- If True, remove a row unless all values in ws_cols are valid, rather than unless any of them is. Defaults to False.
- Returns:
A new Polars DataFrame with an additional ws column
- Return type:
pl.DataFrame
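The interaction between row-wise averaging and remove_all_nulls can be sketched in plain Python. This is an illustration, not the library's code: the helper `add_mean_column` and the column names `ws_000` and `ws_001` are hypothetical, and averaging over only the valid values in a row is an assumption.

```python
def add_mean_column(rows, cols, out_name, remove_all_nulls=False):
    """Average cols row by row, skipping rows without enough valid values."""
    result = []
    for row in rows:
        valid = [row[c] for c in cols if row[c] is not None]
        # True: every column must be valid; False: one valid column is enough.
        required = len(cols) if remove_all_nulls else 1
        if not valid or len(valid) < required:
            continue
        # Assumption: the mean is taken over the valid values only.
        result.append({**row, out_name: sum(valid) / len(valid)})
    return result


rows = [
    {"ws_000": 8.0, "ws_001": 10.0},
    {"ws_000": 6.0, "ws_001": None},
    {"ws_000": None, "ws_001": None},
]
cols = ["ws_000", "ws_001"]
any_valid = add_mean_column(rows, cols, "ws")
all_valid = add_mean_column(rows, cols, "ws", remove_all_nulls=True)
print([r["ws"] for r in any_valid])  # [9.0, 6.0]
print([r["ws"] for r in all_valid])  # [9.0]
```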
- flasc.utilities.energy_ratio_utilities.add_ws_bin(df_: DataFrame, ws_cols: List[str], ws_step: float = 1.0, ws_min: float = -0.5, ws_max: float = 50.0, edges: ndarray | list | None = None, remove_all_nulls: bool = False) -> DataFrame
Add the ws_bin column to a dataframe.
The wind speed is averaged over ws_cols and binned using the given step size or edges.
- Parameters:
df_ (pl.DataFrame) -- The Polars DataFrame containing the wind speed columns.
ws_cols (list(str)) -- The names of the columns to average across.
ws_step (float) -- Step size for binning. Defaults to 1.0.
ws_min (float) -- Minimum wind speed. Defaults to -0.5.
ws_max (float) -- Maximum wind speed. Defaults to 50.0.
edges (array-like) -- The edges of the bins. Values will be placed into the bin whose left edge is the largest edge less than or equal to the value, and whose right edge is the smallest edge greater than the value. Defaults to None, in which case the edges are generated using ws_step, ws_min, and ws_max.
remove_all_nulls (bool) -- If True, remove a row unless all values in ws_cols are valid, rather than unless any of them is. Defaults to False.
- Returns:
A new Polars DataFrame with an additional ws_bin column
- Return type:
pl.DataFrame
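When edges is None, the bin edges come from the step, minimum, and maximum, but the exact construction is not spelled out here. The sketch below shows one plausible scheme under stated assumptions: evenly spaced edges that start at the minimum and continue until the maximum is covered. The helper `make_edges` is hypothetical.

```python
import math


def make_edges(start, stop, step):
    """Evenly spaced bin edges from start, extended to reach at least stop."""
    # Number of bins needed to cover [start, stop]; the small tolerance keeps
    # floating-point noise from adding a spurious extra bin.
    n_bins = math.ceil((stop - start) / step - 1e-9)
    return [start + i * step for i in range(n_bins + 1)]


# Using the default wind speed arguments from the signature above.
ws_edges = make_edges(-0.5, 50.0, 1.0)
print(ws_edges[:3], ws_edges[-1])  # [-0.5, 0.5, 1.5] 50.5
```

With a minimum of -0.5 and a step of 1.0, each bin is centred on an integer wind speed, which may be the motivation for the default ws_min.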
- flasc.utilities.energy_ratio_utilities.add_wd(df_: DataFrame, wd_cols: List[str], remove_all_nulls: bool = False) -> DataFrame
Add the wd column to a dataframe, given which columns to average over.
- Parameters:
df_ (pl.DataFrame) -- The Polars DataFrame containing the wind direction columns.
wd_cols (list(str)) -- The names of the columns to average across.
remove_all_nulls (bool) -- If True, remove a row unless all values in wd_cols are valid, rather than unless any of them is. Defaults to False.
- Returns:
A new Polars DataFrame with an additional wd column
- Return type:
pl.DataFrame
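Wind directions wrap at 360 degrees, so an arithmetic mean can mislead: the arithmetic mean of 350 and 10 is 180, although both directions point almost due north. The text above does not state which kind of average add_wd applies. The sketch below shows a circular mean as one common way to average directions; the helper `circular_mean_deg` is hypothetical and is not taken from the library.

```python
import math


def circular_mean_deg(angles):
    """Mean direction of angles given in degrees, wrapped to 0-360 degrees."""
    # Average the unit vectors rather than the raw angles.
    sin_sum = sum(math.sin(math.radians(a)) for a in angles)
    cos_sum = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0


m = circular_mean_deg([350.0, 10.0])
# m lies within floating-point error of north (0 or, equivalently, 360 degrees),
# whereas the arithmetic mean of the same two values is 180.
print(min(m, 360.0 - m) < 1e-9)  # True
```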
- flasc.utilities.energy_ratio_utilities.add_wd_bin(df_: DataFrame, wd_cols: List[str], wd_step: float = 2.0, wd_min: float = 0.0, wd_max: float = 360.0, edges: ndarray | list | None = None, remove_all_nulls: bool = False)
Add the wd_bin column to a dataframe.
The wind direction is averaged over wd_cols and binned using the given step size or edges.
- Parameters:
df_ (pl.DataFrame) -- The Polars DataFrame containing the wind direction columns.
wd_cols (list(str)) -- The names of the columns to average across.
wd_step (float) -- Step size for binning. Defaults to 2.0.
wd_min (float) -- Minimum wind direction. Defaults to 0.0.
wd_max (float) -- Maximum wind direction. Defaults to 360.0.
edges (array-like) -- The edges of the bins. Values will be placed into the bin whose left edge is the largest edge less than or equal to the value, and whose right edge is the smallest edge greater than the value. Defaults to None, in which case the edges are generated using wd_step, wd_min, and wd_max.
remove_all_nulls (bool) -- If True, remove a row unless all values in wd_cols are valid, rather than unless any of them is. Defaults to False.
- Returns:
A new Polars DataFrame with an additional wd_bin column
- Return type:
pl.DataFrame
- flasc.utilities.energy_ratio_utilities.add_power_test(df_: DataFrame, test_cols: List[str]) -> DataFrame
Add the pow_test column to a dataframe, given which columns to average over.
- Parameters:
df_ (pl.DataFrame) -- The Polars DataFrame containing the test power columns.
test_cols (list(str)) -- The names of the columns to average across.
- Returns:
A new Polars DataFrame with an additional pow_test column
- Return type:
pl.DataFrame
- flasc.utilities.energy_ratio_utilities.add_power_ref(df_: DataFrame, ref_cols: List[str])
Add the pow_ref column to a dataframe, given which columns to average over.
- Parameters:
df_ (pl.DataFrame) -- The Polars DataFrame containing the reference power columns.
ref_cols (list(str)) -- The names of the columns to average across.
- Returns:
A new Polars DataFrame with an additional pow_ref column
- Return type:
pl.DataFrame
- flasc.utilities.energy_ratio_utilities.add_reflected_rows(df_: DataFrame, edges: ndarray | list, overlap_distance: float)
Add reflected rows to a dataframe.
Adds rows to a dataframe in which the wind direction is reflected around the nearest edge, if it lies within overlap_distance of that edge.
Given a wind direction DataFrame df_, this function adds a reflected copy of each row whose wind direction lies within overlap_distance of its nearest edge in edges. The reflected wind direction is the mirror image of the original about that edge: the signed difference between the wind direction and the edge is subtracted twice from the original wind direction, which gives 2 * edge - wd. The result is then wrapped to the range [0, 360) degrees. The function returns a new DataFrame with the original rows and the added reflected rows.
This function enables overlapping bins in the energy ratio functions.
- Parameters:
df_ (pl.DataFrame) -- The DataFrame to add reflected rows to.
edges (ndarray | list) -- The wind direction edges to use for reflection. These should be the same as those used in the energy ratio calculation.
overlap_distance (float) -- The maximum distance between a wind direction and an edge for the wind direction to be considered overlapping.
- Returns:
A new DataFrame with the original rows and the added reflected rows.
- Return type:
pl.DataFrame
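The reflection step can be sketched on a plain list of wind directions. This illustrates the idea rather than the library's implementation: the helper `add_reflections` is hypothetical, and the strict comparison against overlap_distance is an assumption.

```python
def add_reflections(wds, edges, overlap_distance):
    """Return wds plus mirror images of the values lying close to an edge."""
    reflected = []
    for wd in wds:
        # Nearest edge to this wind direction.
        nearest = min(edges, key=lambda e: abs(wd - e))
        # Strict "<" is an assumption of this sketch.
        if abs(wd - nearest) < overlap_distance:
            # Mirror wd about the edge, then wrap to the 0-360 degree range.
            reflected.append((2.0 * nearest - wd) % 360.0)
    return list(wds) + reflected


edges = [0.0, 2.0, 4.0]
print(add_reflections([1.0, 1.75, 0.25], edges, overlap_distance=0.5))
# [1.0, 1.75, 0.25, 2.25, 359.75]
```

Here 1.0 is too far from any edge to be reflected, 1.75 is mirrored about the edge at 2.0, and 0.25 is mirrored about the edge at 0.0 and wraps around to 359.75.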
- flasc.utilities.energy_ratio_utilities.filter_all_nulls(df_: DataFrame, ref_cols: List[str], test_cols: List[str], ws_cols: List[str], wd_cols: List[str])
Filter dataframe for ALL nulls.
Filter data by requiring ALL values of ref, test, ws, and wd to be valid numbers.
- Parameters:
df_ (pl.DataFrame) -- Polars dataframe possibly containing Null values
ref_cols (list[str]) -- A list of columns to use as the reference turbines
test_cols (list[str]) -- A list of columns to use as the test turbines
ws_cols (list[str]) -- A list of columns to derive the wind speeds from
wd_cols (list[str]) -- A list of columns to derive the wind directions from
- Returns:
The filtered dataframe, containing only the rows that satisfy the requirement above.
- Return type:
pl.DataFrame
- flasc.utilities.energy_ratio_utilities.filter_any_nulls(df_: DataFrame, ref_cols: List[str], test_cols: List[str], ws_cols: List[str], wd_cols: List[str])
Filter dataframe for ANY nulls.
Filter data by requiring ANY of ref, ANY of test, ANY of ws, and ANY of wd to be a valid number.
- Parameters:
df_ (pl.DataFrame) -- Polars dataframe possibly containing Null values
ref_cols (list[str]) -- A list of columns to use as the reference turbines
test_cols (list[str]) -- A list of columns to use as the test turbines
ws_cols (list[str]) -- A list of columns to derive the wind speeds from
wd_cols (list[str]) -- A list of columns to derive the wind directions from
- Returns:
The filtered dataframe, containing only the rows that satisfy the requirement above.
- Return type:
pl.DataFrame
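The difference between the ALL and ANY filters can be sketched for a single row. This is an illustration only: the helper `keep_row` and the column names are hypothetical, and None stands in for a null value.

```python
def keep_row(row, col_groups, require_all):
    """Decide whether a row survives the null filter."""
    # all(): every column in a group must be valid; any(): one is enough.
    check = all if require_all else any
    # Every group (ref, test, ws, wd) has to pass its own check.
    return all(check(row[c] is not None for c in cols) for cols in col_groups)


# Column groups in the order ref, test, ws, wd.
groups = [["pow_000", "pow_001"], ["pow_002"], ["ws_000"], ["wd_000"]]
row = {
    "pow_000": 1500.0,
    "pow_001": None,
    "pow_002": 1400.0,
    "ws_000": 8.0,
    "wd_000": 270.0,
}
print(keep_row(row, groups, require_all=True))   # False
print(keep_row(row, groups, require_all=False))  # True
```

The row has one null among its reference columns, so it fails the ALL requirement but passes the ANY requirement.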
- flasc.utilities.energy_ratio_utilities.check_compute_analysis_inputs(df_, ref_turbines, test_turbines, wd_turbines, ws_turbines, use_predefined_ref, use_predefined_wd, use_predefined_ws, wd_step, wd_min, wd_max, ws_step, ws_min, ws_max, bin_cols_in, weight_by, df_freq, wd_bin_overlap_radius, uplift_pairs, uplift_names, uplift_absolute, N, percentiles, remove_all_nulls)
Check the inputs to compute_energy_ratio.
The inputs mirror those of compute_energy_ratio, with the exception of df_, which is passed directly instead of a_in.
All inputs to compute_energy_ratio are accepted, but this function does not check every one of them for validity.
- Parameters:
df_ (pl.DataFrame) -- The Polars DataFrame
ref_turbines (list) -- A list of the reference turbine columns
test_turbines (list) -- A list of the test turbine columns
wd_turbines (list) -- A list of the wind direction columns
ws_turbines (list) -- A list of the wind speed columns
use_predefined_ref (bool) -- Whether to use predefined reference turbines
use_predefined_wd (bool) -- Whether to use predefined wind direction turbines
use_predefined_ws (bool) -- Whether to use predefined wind speed turbines
wd_step (float) -- Step size for binning wind direction
wd_min (float) -- Minimum wind direction
wd_max (float) -- Maximum wind direction
ws_step (float) -- Step size for binning wind speed
ws_min (float) -- Minimum wind speed
ws_max (float) -- Maximum wind speed
bin_cols_in (list) -- A list of columns to bin
weight_by (str) -- A string indicating how to weight the bins
df_freq (pl.DataFrame) -- A DataFrame containing frequency data
wd_bin_overlap_radius (float) -- The radius for overlapping wind direction bins
uplift_pairs (list) -- A list of uplift pairs
uplift_names (list) -- A list of uplift names
uplift_absolute (bool) -- Whether to use absolute uplift
N (int) -- Number of bootstrapping iterations
percentiles (list) -- A list of percentiles to calculate from bootstrap
remove_all_nulls (bool) -- Whether to remove all nulls
- flasc.utilities.energy_ratio_utilities.bin_and_group_dataframe(df_: DataFrame, ref_cols: List, test_cols: List, wd_cols: List, ws_cols: List, wd_step: float = 2.0, wd_min: float = 0.0, wd_max: float = 360.0, ws_step: float = 1.0, ws_min: float = 0.0, ws_max: float = 50.0, wd_bin_overlap_radius: float = 0.0, remove_all_nulls: bool = False, bin_cols_without_df_name: List | None = None, num_df: int = 0)
Bin and aggregate a DataFrame based on wind direction and wind speed parameters.
This function takes a Polars DataFrame (df_) and performs binning and aggregation operations based on wind direction (wd) and wind speed (ws). It allows for optional handling of reflected rows and grouping by specific columns. The resulting DataFrame contains aggregated statistics for reference and test power columns within specified bins.
- Parameters:
df_ (DataFrame) -- The input Polars DataFrame to be processed.
ref_cols (List[str]) -- List of columns containing reference power data.
test_cols (List[str]) -- List of columns containing test power data.
wd_cols (List[str]) -- List of columns containing wind direction data.
ws_cols (List[str]) -- List of columns containing wind speed data.
wd_step (float, optional) -- Step size for wind direction binning. Defaults to 2.0.
wd_min (float, optional) -- Minimum wind direction value. Defaults to 0.0.
wd_max (float, optional) -- Maximum wind direction value. Defaults to 360.0.
ws_step (float, optional) -- Step size for wind speed binning. Defaults to 1.0.
ws_min (float, optional) -- Minimum wind speed value. Defaults to 0.0.
ws_max (float, optional) -- Maximum wind speed value. Defaults to 50.0.
wd_bin_overlap_radius (float, optional) -- Radius for overlapping wind direction bins. Defaults to 0.0.
remove_all_nulls (bool, optional) -- If True, remove a row unless all of its values are valid, rather than unless any of them is. Defaults to False.
bin_cols_without_df_name (List[str], optional) -- List of columns used for grouping without 'df_name'. Defaults to None.
num_df (int, optional) -- Number of dataframes required for each bin combination. Defaults to 0.
- Returns:
The resulting Polars DataFrame with aggregated statistics.
- Return type:
DataFrame
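The grouping step can be sketched on rows that already carry wd_bin, ws_bin, pow_ref, and pow_test values. This is a simplified illustration, not the library's implementation: the helper `group_by_bins` is hypothetical, and the choice of mean and count as the aggregated statistics is an assumption, since the description above does not name them.

```python
from collections import defaultdict


def group_by_bins(rows):
    """Average pow_ref and pow_test within each (wd_bin, ws_bin) pair."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["wd_bin"], row["ws_bin"])].append(row)
    summary = {}
    for key, members in groups.items():
        n = len(members)
        summary[key] = {
            "pow_ref": sum(r["pow_ref"] for r in members) / n,
            "pow_test": sum(r["pow_test"] for r in members) / n,
            "count": n,
        }
    return summary


rows = [
    {"wd_bin": 271.0, "ws_bin": 8.0, "pow_ref": 1000.0, "pow_test": 900.0},
    {"wd_bin": 271.0, "ws_bin": 8.0, "pow_ref": 1200.0, "pow_test": 1000.0},
    {"wd_bin": 273.0, "ws_bin": 8.0, "pow_ref": 1100.0, "pow_test": 1100.0},
]
summary = group_by_bins(rows)
print(summary[(271.0, 8.0)])
# {'pow_ref': 1100.0, 'pow_test': 950.0, 'count': 2}
```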
- flasc.utilities.energy_ratio_utilities.add_bin_weights(df_: DataFrame, df_freq_pl: DataFrame | None = None, bin_cols_without_df_name: List | None = None, weight_by: str = 'min')
Add weights to DataFrame bins.
Add weights to DataFrame bins based on either frequency counts or the provided frequency table df_freq_pl.
This function assigns weights to DataFrame bins. If 'df_freq_pl' is provided, these weights are used directly. If 'df_freq_pl' is not provided, the function calculates the weights from the input DataFrame 'df_'. Weights can be determined as either the minimum ('min') or the sum ('sum') of counts.
- Parameters:
df_ (DataFrame) -- The input Polars DataFrame containing bins and frequency counts.
df_freq_pl (DataFrame, optional) -- A Polars DataFrame containing frequency counts for bins. If not provided, the function will calculate these counts from 'df_'. Defaults to None.
bin_cols_without_df_name (List, optional) -- List of columns used for grouping bins without 'df_name'. Defaults to None.
weight_by (str, optional) -- Weight calculation method, either 'min' (minimum count) or 'sum' (sum of counts). Defaults to 'min'.
- Returns:
A tuple containing the modified DataFrame 'df_' with added weights and the DataFrame 'df_freq_pl' with the calculated frequency counts.
- Return type:
Tuple[pl.DataFrame, pl.DataFrame]
- Raises:
RuntimeError -- If none of the ws/wd bins in data appear in df_freq.
UserWarning -- If some bins in data are not in df_freq and will receive a weight of 0.
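The 'min' and 'sum' options can be illustrated with per-dataframe counts for each bin. The sketch below is not the library's implementation: the helper `bin_weights`, the dataframe names `baseline` and `controlled`, and the restriction to bins present in every dataframe are all assumptions made for the example.

```python
def bin_weights(counts_by_df, weight_by="min"):
    """Combine per-dataframe bin counts into a single weight per bin."""
    if weight_by not in ("min", "sum"):
        raise ValueError("weight_by must be 'min' or 'sum'")
    combine = min if weight_by == "min" else sum
    # Only bins present in every dataframe are considered (an assumption).
    shared = set.intersection(*(set(c) for c in counts_by_df.values()))
    return {b: combine(c[b] for c in counts_by_df.values()) for b in shared}


# Keys are (wd_bin, ws_bin) pairs; values are row counts in that bin.
counts = {
    "baseline": {(271.0, 8.0): 12, (273.0, 8.0): 5},
    "controlled": {(271.0, 8.0): 9, (273.0, 8.0): 7},
}
print(bin_weights(counts, "min") == {(271.0, 8.0): 9, (273.0, 8.0): 5})    # True
print(bin_weights(counts, "sum") == {(271.0, 8.0): 21, (273.0, 8.0): 12})  # True
```

Weighting by the minimum count is the more conservative choice, since a bin that is well populated in one dataframe but sparse in another receives a low weight.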