Calculations
SLiDE.impute_mean
— Function
impute_mean(df::DataFrame, col::Symbol)
This function fills missing values in df with the average over the index given in col. By default, the standard mean is used. For a parameter $z$ with $N$ values over an index $x$, the average $\bar{z}$ would be calculated:
\[\bar{z} = \dfrac{\sum_x z}{N}\]
If a weight $w$ is given, the weighted average would be calculated:
\[\bar{z} = \dfrac{\sum_x z \cdot w}{\sum_x w}\]
This process of filling missing values is called "mean imputation".
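The two formulas above can be checked on a small example. This is a minimal sketch in plain Julia, not the SLiDE implementation: the vectors z and w are hypothetical, and excluding the weight of the missing entry from the weighted sum is an assumption.

```julia
using Statistics  # standard library; provides mean

# Hypothetical data: values z over an index x, with one missing entry.
z = [2.0, 4.0, missing, 6.0]
w = [1.0, 1.0, 5.0, 2.0]   # hypothetical weights, one per entry

# Only entries with an observed z contribute to either average (assumption).
known = .!ismissing.(z)

# Standard mean: sum of z over x divided by N.
z_mean = mean(z[known])

# Weighted mean: sum of z * w divided by the sum of w.
z_wmean = sum(z[known] .* w[known]) / sum(w[known])

# Mean imputation: replace the missing entry with the chosen average.
z_filled = coalesce.(z, z_mean)
z_filled_weighted = coalesce.(z, z_wmean)
```

Here the standard mean is (2 + 4 + 6) / 3 and the weighted mean is (2·1 + 4·1 + 6·2) / (1 + 1 + 2), so the two imputed values differ whenever the weights are not uniform.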
Arguments
df::DataFrame with missing values.
col::Symbol over which to average.
Keyword Arguments
weight::DataFrame=DataFrame() to use when weighting.
condition::DataFrame=DataFrame() specifying which indices to keep in the output df_avg. If no condition is given, all NaN values in the input df will be replaced.
Returns
df_avg::DataFrame of mean values.
df::DataFrame of unchanged values.
SLiDE.indexjoin
— Function
indexjoin(df::DataFrame...; kwargs)
indexjoin(df::Array{DataFrame,1}; kwargs)
This function joins input DataFrames on their index columns (ones that are not filled with AbstractFloat or Bool DataTypes).
Argument
df::DataFrame... to join.
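The idea of joining on index columns can be sketched with DataFrames.jl alone. This is an illustration under stated assumptions, not the SLiDE implementation: the helper names is_index and index_columns, the sample data, and the choice of an outer join are all hypothetical.

```julia
using DataFrames

# Hypothetical inputs sharing the index columns :yr and :r.
df_a = DataFrame(yr = [2015, 2016], r = ["co", "co"], value = [1.0, 2.0])
df_b = DataFrame(yr = [2016, 2017], r = ["co", "co"], value = [10.0, 20.0])

# A column counts as an index if its element type is neither AbstractFloat nor Bool.
is_index(df, n) = !(nonmissingtype(eltype(df[!, n])) <: Union{AbstractFloat,Bool})
index_columns(df) = filter(n -> is_index(df, n), propertynames(df))

# Join on the index columns the two inputs have in common.
idx = intersect(index_columns(df_a), index_columns(df_b))

# The join type is an assumption; makeunique renames the second value column.
joined = outerjoin(df_a, df_b; on = idx, makeunique = true)
```

With these inputs, idx contains :yr and :r, and the joined DataFrame has one row per distinct (yr, r) pair, with missing entries where one input has no matching row.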
SLiDE.combine_over
— Function
combine_over(df::DataFrame, col::Array{Symbol,1}; operation::Function = sum)
combine_over(df::DataFrame, col::Symbol; operation::Function = sum)
This function applies combine to the input DataFrame df over the input column(s) col.
Arguments
df::DataFrame: DataFrame on which to operate.
col::Symbol or col::Array{Symbol,1}: column(s) over which to operate.
Keywords
operation::Function = sum: Operation to perform over the DataFrame columns. By default, the function will return a summation. Standard summary functions include: sum, prod, minimum, maximum, mean, var, std, first, last, and length.
Returns
df::DataFrame WITHOUT the specified column(s) argument. The resulting DataFrame will be 'shorter' than the input DataFrame.
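The effect described above can be approximated directly with DataFrames.jl. This is a rough sketch, not the SLiDE implementation: the sample data, the column names, and the assumption that values live in a single :value column are hypothetical.

```julia
using DataFrames

# Hypothetical data indexed by year (:yr), region (:r), and good (:g).
df = DataFrame(
    yr = [2016, 2016, 2016, 2016],
    r = ["co", "co", "wi", "wi"],
    g = ["agr", "oil", "agr", "oil"],
    value = [1.0, 2.0, 3.0, 4.0],
)

col = :g                                          # column to combine over
keep = setdiff(propertynames(df), [col, :value])  # remaining index columns

# Group by the remaining indices and reduce the values within each group.
df_sum = combine(groupby(df, keep), :value => sum => :value)
```

The result drops :g and has one row per (yr, r) pair, which is why the output is shorter than the input. Swapping sum for another reducer such as maximum changes only the operation applied within each group.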
SLiDE.transform_over
— Function
transform_over(df::DataFrame, col::Array{Symbol,1}; operation::Function = sum)
transform_over(df::DataFrame, col::Symbol; operation::Function = sum)
This function applies transform to the input DataFrame df over the input column(s) col.
Arguments
df::DataFrame: DataFrame on which to operate.
col::Symbol or col::Array{Symbol,1}: column(s) over which to operate.
Keywords
operation::Function = sum: Operation to perform over the DataFrame columns. By default, the function will return a summation. Standard summary functions include: sum, prod, minimum, maximum, mean, var, std, first, last, and length.
Returns
df::DataFrame WITH the specified column(s) argument. The resulting DataFrame will be the same length as the input DataFrame.
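A comparable result can be sketched with the DataFrames.jl transform function. This is an illustration only, not the SLiDE implementation: the sample data and column names are hypothetical, and writing the result to a new :value_sum column (rather than overwriting :value) is an assumption made so both can be compared side by side.

```julia
using DataFrames

# Hypothetical data indexed by year (:yr), region (:r), and good (:g).
df = DataFrame(
    yr = [2016, 2016, 2016, 2016],
    r = ["co", "co", "wi", "wi"],
    g = ["agr", "oil", "agr", "oil"],
    value = [1.0, 2.0, 3.0, 4.0],
)

col = :g                                          # column to operate over
keep = setdiff(propertynames(df), [col, :value])  # remaining index columns

# The group-level sum is repeated on every row of its group, so no rows are dropped.
df_t = transform(groupby(df, keep), :value => sum => :value_sum)
```

Unlike the combine-based approach, every input row and the :g column are preserved, so the output has the same length as the input.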
SLiDE.operate_over
— Function