Calculations

SLiDE.impute_mean (Function)
impute_mean(df::DataFrame, col::Symbol)

This function fills missing values in df with the mean taken over the index given in col. For a parameter $z$ imputed over an index $x$, the unweighted average $\bar{z}$ is calculated as follows, where $N$ is the number of values of $z$ in the sum:

\[\bar{z} = \dfrac{\sum_x z}{N}\]

If a weight $w$ is given, the weighted average is calculated instead:

\[\bar{z} = \dfrac{\sum_x z \cdot w}{\sum_x w}\]

This process of filling missing values is called "mean imputation".
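
The two averages above can be checked by hand on a small vector. The following is a minimal sketch in plain Julia, not the SLiDE implementation: the data are hypothetical, and it assumes that $N$ counts only the non-missing values and that the weights of missing entries are left out of the weighted sum.

```julia
using Statistics  # standard library; provides mean

# Hypothetical data: z has one missing entry, w holds the weights.
z = [2.0, 4.0, missing, 6.0]
w = [1.0, 1.0, 1.0, 2.0]

# Mask of the entries that are known (assumption: only these enter the average).
known = .!ismissing.(z)

# Unweighted mean: (2 + 4 + 6) / 3
z_bar = mean(skipmissing(z))

# Weighted mean: (2*1 + 4*1 + 6*2) / (1 + 1 + 2)
z_bar_w = sum(z[known] .* w[known]) / sum(w[known])

# Mean imputation: replace each missing entry with the chosen average.
z_filled   = coalesce.(z, z_bar)
z_filled_w = coalesce.(z, z_bar_w)
```

Here the heavier weight on the last entry pulls the weighted average above the unweighted one, so the two imputed values differ.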

Arguments

  • df::DataFrame with missing values
  • col::Symbol over which to average.

Keyword Arguments

  • weight::DataFrame=DataFrame() to use when weighting.
  • condition::DataFrame=DataFrame() specifying which indices to keep in the output df_avg. If no condition is given, all NaN values in the input df will be replaced.

Returns

  • df_avg::DataFrame of mean values.
  • df::DataFrame of unchanged values.

SLiDE.indexjoin (Function)
indexjoin(df::DataFrame...; kwargs)
indexjoin(df::Array{DataFrame,1}; kwargs)

This function joins the input DataFrames on their index columns, that is, the columns whose element type is neither AbstractFloat nor Bool.

Arguments

  • df::DataFrame... to join.
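
The idea of joining on index columns can be sketched with DataFrames.jl alone. This is an illustration, not the SLiDE implementation: the helpers is_index and index_cols, the column names, and the choice of an outer join are all assumptions made for the example.

```julia
using DataFrames

# Hypothetical inputs: :yr and :r are index columns, the Float64 columns hold values.
df1 = DataFrame(yr = [2016, 2016, 2017], r = ["co", "wi", "co"], x = [1.0, 2.0, 3.0])
df2 = DataFrame(yr = [2016, 2017], r = ["co", "co"], y = [10.0, 30.0])

# Hypothetical helpers: a column is an index column if its element type
# (ignoring Missing) is neither AbstractFloat nor Bool.
is_index(df, c) = !(nonmissingtype(eltype(df[!, c])) <: Union{AbstractFloat, Bool})
index_cols(df) = [c for c in propertynames(df) if is_index(df, c)]

# Join on the index columns that both DataFrames share.
# Assumption: an outer join, so index combinations found in only one input are kept.
idx = intersect(index_cols(df1), index_cols(df2))
joined = outerjoin(df1, df2; on = idx)
```

With an outer join, the (2016, "wi") row has no match in df2, so its y entry is missing in the result.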

SLiDE.combine_over (Function)
combine_over(df::DataFrame, col::Array{Symbol,1}; operation::Function = sum)
combine_over(df::DataFrame, col::Symbol; operation::Function = sum)

This function applies combine to the input DataFrame df, aggregating over the input column(s) col with operation so that col no longer appears in the result.

Arguments

  • df::DataFrame: DataFrame on which to operate.
  • col::Symbol or col::Array{Symbol,1}: column(s) over which to operate.

Keyword Arguments

  • operation::Function = sum: Operation to perform over the DataFrame columns. By default, the function returns a summation. Other standard summary functions include: prod, minimum, maximum, mean, var, std, first, last, and length.

Returns

  • df::DataFrame WITHOUT the specified column(s). The resulting DataFrame will generally have fewer rows than the input DataFrame.
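
The behavior described above resembles a grouped combine in DataFrames.jl. The sketch below is a hypothetical stand-in, not the SLiDE implementation: the helper name combine_over_sketch, the sample data, and the assumption that values live in a single column named :value are all illustrative.

```julia
using DataFrames

# Hypothetical data: :yr and :r are index columns, :value holds the numbers.
df = DataFrame(yr = [2016, 2016, 2017, 2017],
               r = ["co", "wi", "co", "wi"],
               value = [1.0, 2.0, 3.0, 4.0])

# Hypothetical helper: group by every column except `col` and :value,
# then reduce :value within each group using `operation`.
function combine_over_sketch(df::DataFrame, col::Symbol; operation::Function = sum)
    by = setdiff(propertynames(df), [col, :value])
    return combine(groupby(df, by), :value => operation => :value)
end

df_sum = combine_over_sketch(df, :r)                       # sum over :r for each :yr
df_max = combine_over_sketch(df, :r; operation = maximum)  # largest value for each :yr
```

Both results have one row per :yr and no :r column, which matches the "shorter" output described in the Returns entry.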

SLiDE.transform_over (Function)
transform_over(df::DataFrame, col::Array{Symbol,1}; operation::Function = sum)
transform_over(df::DataFrame, col::Symbol; operation::Function = sum)

This function applies transform to the input DataFrame df over the input column(s) col: operation is evaluated across col and the result is assigned to every row, so col is retained.

Arguments

  • df::DataFrame: DataFrame on which to operate.
  • col::Symbol or col::Array{Symbol,1}: column(s) over which to operate.

Keyword Arguments

  • operation::Function = sum: Operation to perform over the DataFrame columns. By default, the function returns a summation. Other standard summary functions include: prod, minimum, maximum, mean, var, std, first, last, and length.

Returns

  • df::DataFrame WITH the specified column(s). The resulting DataFrame will have the same number of rows as the input DataFrame.
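
A grouped transform in DataFrames.jl shows the same shape-preserving behavior. As before, this is a hypothetical sketch rather than the SLiDE implementation: transform_over_sketch, the sample data, and the single :value column are assumptions for illustration.

```julia
using DataFrames

# Hypothetical data: :yr and :r are index columns, :value holds the numbers.
df = DataFrame(yr = [2016, 2016, 2017, 2017],
               r = ["co", "wi", "co", "wi"],
               value = [1.0, 2.0, 3.0, 4.0])

# Hypothetical helper: group by every column except `col` and :value,
# apply `operation` to :value in each group, and write the result to every row.
function transform_over_sketch(df::DataFrame, col::Symbol; operation::Function = sum)
    by = setdiff(propertynames(df), [col, :value])
    return transform(groupby(df, by), :value => operation => :value)
end

df_total = transform_over_sketch(df, :r)  # each row now holds its :yr total
```

Unlike the combine case, every input row survives and :r is still present; only the :value column is overwritten with the per-group result.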