phygnn.model_interfaces.random_forest_model.RandomForestModel
- class RandomForestModel(model, feature_names=None, label_name=None, norm_params=None, normalize=True, one_hot_categories=None)
Bases: ModelBase
scikit-learn Random Forest Regression model interface
- Parameters:
model (sklearn.ensemble.RandomForestRegressor) – Sklearn Random Forest Model
feature_names (list) – Ordered list of feature names.
label_name (str) – label (output) variable name.
norm_params (dict, optional) – Dictionary mapping feature and label names (keys) to normalization parameters (mean, stdev), by default None
normalize (bool | tuple, optional) – Boolean flag(s) controlling whether features and labels are normalized. Possible values: True (normalize both), False (normalize neither), or a tuple of flags (normalize_feature, normalize_label). By default True.
one_hot_categories (dict, optional) – Features to one-hot encode using given categories, if None do not run one-hot encoding, by default None
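The bool-or-tuple form of normalize can be read as a pair of flags, one for features and one for labels. A minimal sketch of that interpretation follows; split_normalize_flag is a hypothetical helper written for illustration and is not part of phygnn.

```python
def split_normalize_flag(normalize):
    """Return (normalize_features, normalize_labels) from a bool or 2-tuple.

    Hypothetical helper for illustration only; not a phygnn function.
    """
    if isinstance(normalize, bool):
        # A single flag applies to both features and labels.
        return normalize, normalize
    # Otherwise expect a pair: (normalize_feature, normalize_label).
    normalize_features, normalize_labels = normalize
    return bool(normalize_features), bool(normalize_labels)


print(split_normalize_flag(True))           # (True, True)
print(split_normalize_flag((True, False)))  # (True, False)
```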
Methods
build_trained(features, label[, normalize, ...]) – Build Random Forest Model with given kwargs and then train with given features, labels, and kwargs
compile_model(**kwargs) – Build sklearn random forest model
dict_json_convert(inp) – Recursively convert numeric values in dict to work with json dump
get_mean(name) – Get feature | label mean
get_norm_params(names) – Get means and stdevs for given feature/label names
get_stdev(name) – Get feature | label stdev
load(path) – Load model from model path.
make_one_hot_feature_names(feature_names, ...) – Update feature_names after one-hot encoding
normalize(data[, names]) – Normalize given data
parse_features(features[, names]) – Parse features - preprocessing of feature data before training or prediction.
parse_labels(label[, name]) – Parse labels and normalize if desired
predict(features[, table, parse_kwargs, ...]) – Use model to predict label from given features
save_model(path) – Save Random Forest Model to path.
seed([s]) – Set the random seed for reproducible results.
train_model(features, label[, shuffle, ...]) – Train the model with the provided features and label
unnormalize(data[, names]) – Un-normalize given data
unnormalize_prediction(prediction) – Unnormalize prediction if needed
Attributes
feature_dims – Number of features
feature_means – Feature means, used for (un)normalization
feature_names – List of the feature variable names.
feature_stdevs – Feature stdevs, used for (un)normalization
input_feature_names – Input feature names
label_dims – Number of labels
label_means – label means, used for (un)normalization
label_names – label variable names
label_stdevs – label stdevs, used for (un)normalization
means – Mapping feature/label names to the mean values for (un)normalization
model – Trained model
model_summary – Tensorflow model summary
normalization_parameters – Features and label (un)normalization parameters
normalize_features – Flag to normalize features
normalize_labels – Flag to normalize labels
one_hot_categories – categories to use for one-hot encoding
one_hot_feature_names – One-hot encoded feature names
one_hot_input_feature_names – Input feature names to be one-hot encoded
stdevs – Mapping feature/label names to the stdev values for (un)normalization
version_record – A record of important versions that this model was built with.
- static compile_model(**kwargs)
Build sklearn random forest model
- Parameters:
kwargs (dict) – kwargs for sklearn.ensemble.RandomForestRegressor
- Returns:
sklearn.ensemble.RandomForestRegressor – sklearn random forest model
- unnormalize_prediction(prediction)
Unnormalize prediction if needed
- Parameters:
prediction (ndarray) – Model prediction
- Returns:
prediction (ndarray) – Native prediction
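To make the un-normalization step concrete, the sketch below inverts a z-score transform. It assumes the label was normalized as (value - mean) / stdev, which is consistent with the (mean, stdev) normalization parameters described above but is not confirmed by this page; unnormalize_values is a hypothetical name.

```python
def unnormalize_values(values, mean, stdev):
    """Invert z-score normalization: native = normalized * stdev + mean.

    Assumption: the forward transform was (value - mean) / stdev.
    """
    return [value * stdev + mean for value in values]


normalized_prediction = [-1.0, 0.0, 1.5]
native_prediction = unnormalize_values(normalized_prediction, mean=10.0, stdev=2.0)
print(native_prediction)  # [8.0, 10.0, 13.0]
```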
- parse_labels(label, name=None)
Parse labels and normalize if desired
- Parameters:
label (pandas.DataFrame | dict | ndarray) – Label data to parse
name (list, optional) – List of label names, by default None
- Returns:
label (ndarray) – Parsed labels array, normalized if desired
- train_model(features, label, shuffle=True, parse_kwargs=None, fit_kwargs=None)
Train the model with the provided features and label
- Parameters:
features (dict | pandas.DataFrame) – Input features to train on
label (dict | pandas.DataFrame) – label to train on
shuffle (bool) – Flag to randomly subset the validation data and batch selection from features and labels.
parse_kwargs (dict) – kwargs for cls.parse_features
fit_kwargs (dict) – kwargs for sklearn.ensemble.RandomForestRegressor.fit
- save_model(path)
Save Random Forest Model to path.
- Parameters:
path (str) – Path to save model to
- classmethod build_trained(features, label, normalize=True, one_hot_categories=None, shuffle=True, save_path=None, compile_kwargs=None, parse_kwargs=None, fit_kwargs=None)
Build Random Forest Model with given kwargs and then train with given features, labels, and kwargs
- Parameters:
features (pandas.DataFrame) – Model features
label (pandas.DataFrame) – label to train on
normalize (bool | tuple, optional) – Boolean flag(s) controlling whether features and labels are normalized. Possible values: True (normalize both), False (normalize neither), or a tuple of flags (normalize_feature, normalize_label). By default True.
one_hot_categories (dict, optional) – Features to one-hot encode using given categories, if None do not run one-hot encoding, by default None
shuffle (bool) – Flag to randomly subset the validation data and batch selection from features and labels.
save_path (str) – Directory path to save model to. The RandomForest Model will be saved to the directory while the framework parameters will be saved in json.
compile_kwargs (dict) – kwargs for sklearn.ensemble.RandomForestRegressor
parse_kwargs (dict) – kwargs for cls.parse_features
fit_kwargs (dict) – kwargs for sklearn.ensemble.RandomForestRegressor.fit
- Returns:
model (RandomForestModel) – Initialized and trained RandomForestModel obj
- classmethod load(path)
Load model from model path.
- Parameters:
path (str) – Directory path to load the pickled RandomForestModel from.
- Returns:
model (RandomForestModel) – Loaded RandomForestModel from disk.
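The save and load descriptions suggest a directory that holds a pickled model alongside JSON-encoded framework parameters. The sketch below shows one way such a round trip could look using only the standard library. The file names model.pkl and params.json, the helper names, and the stand-in model object are all assumptions made for illustration; they are not phygnn's actual layout.

```python
import json
import os
import pickle
import tempfile


def save_to_dir(model, params, path):
    """Pickle the model and write its parameters as JSON inside `path`."""
    os.makedirs(path, exist_ok=True)
    # File names are hypothetical, chosen only for this sketch.
    with open(os.path.join(path, "model.pkl"), "wb") as f:
        pickle.dump(model, f)
    with open(os.path.join(path, "params.json"), "w") as f:
        json.dump(params, f, indent=2)


def load_from_dir(path):
    """Read back the pickled model and the JSON parameters from `path`."""
    with open(os.path.join(path, "model.pkl"), "rb") as f:
        model = pickle.load(f)
    with open(os.path.join(path, "params.json")) as f:
        params = json.load(f)
    return model, params


with tempfile.TemporaryDirectory() as tmp_dir:
    # A plain dict stands in for a fitted estimator.
    stand_in_model = {"n_estimators": 100}
    params = {"feature_names": ["temperature"], "label_name": "power"}
    save_to_dir(stand_in_model, params, tmp_dir)
    loaded_model, loaded_params = load_from_dir(tmp_dir)
```

Keeping the parameters in JSON rather than in the pickle leaves them readable without importing the model class.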
- static dict_json_convert(inp)
Recursively convert numeric values in dict to work with json dump
- Parameters:
inp (dict) – Dictionary to convert.
- Returns:
out (dict) – Copy of dict input with all nested numeric values converted to base python int or float and all arrays converted to lists.
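The conversion described above can be sketched as a recursive walk over the dictionary. This is an illustrative stand-in, not phygnn's implementation: to_jsonable is a hypothetical name, and array-like values are detected by the presence of a tolist method (which array.array and numpy arrays both provide).

```python
import json
import numbers
from array import array
from fractions import Fraction


def to_jsonable(value):
    """Recursively convert nested values to JSON-friendly Python types."""
    if isinstance(value, dict):
        return {key: to_jsonable(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(item) for item in value]
    if hasattr(value, "tolist"):
        # Array-like objects become plain lists.
        return to_jsonable(value.tolist())
    if isinstance(value, bool):
        # bool is an Integral subtype; keep it as a bool.
        return value
    if isinstance(value, numbers.Integral):
        return int(value)
    if isinstance(value, numbers.Real):
        return float(value)
    return value


raw = {"means": array("d", [1.5, 2.5]),
       "nested": {"ratio": Fraction(1, 4), "count": 3}}
converted = to_jsonable(raw)
print(json.dumps(converted))
```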
- property feature_dims
Number of features
- Returns:
int
- property feature_means
Feature means, used for (un)normalization
- Returns:
list
- property feature_names
List of the feature variable names.
- Returns:
list
- property feature_stdevs
Feature stdevs, used for (un)normalization
- Returns:
list
- get_mean(name)
Get feature | label mean
- Parameters:
name (str) – feature | label name
- Returns:
mean (float) – Mean value used for normalization
- get_norm_params(names)
Get means and stdevs for given feature/label names
- Parameters:
names (list) – list of feature/label names to get normalization params for
- Returns:
means (list) – List of means to use for (un)normalization
stdevs (list) – List of stdevs to use for (un)normalization
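As an illustration of the lookup, the sketch below pulls means and stdevs, in the requested order, from a dictionary shaped like norm_params. The nested {"mean": ..., "stdev": ...} layout, the helper name lookup_norm_params, and the example feature names are assumptions for this sketch.

```python
def lookup_norm_params(norm_params, names):
    """Return (means, stdevs) lists ordered like `names`.

    Assumes norm_params maps name -> {"mean": float, "stdev": float}.
    """
    means = [norm_params[name]["mean"] for name in names]
    stdevs = [norm_params[name]["stdev"] for name in names]
    return means, stdevs


norm_params = {
    "wind_speed": {"mean": 7.5, "stdev": 2.0},
    "temperature": {"mean": 15.0, "stdev": 8.0},
    "power": {"mean": 300.0, "stdev": 120.0},
}
means, stdevs = lookup_norm_params(norm_params, ["temperature", "wind_speed"])
print(means)   # [15.0, 7.5]
print(stdevs)  # [8.0, 2.0]
```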
- get_stdev(name)
Get feature | label stdev
- Parameters:
name (str) – feature | label name
- Returns:
stdev (float) – Stdev value used for normalization
- property input_feature_names
Input feature names
- Returns:
list
- property label_dims
Number of labels
- Returns:
int
- property label_means
label means, used for (un)normalization
- Returns:
list
- property label_names
label variable names
- Returns:
list
- property label_stdevs
label stdevs, used for (un)normalization
- Returns:
list
- static make_one_hot_feature_names(feature_names, one_hot_categories)
Update feature_names after one-hot encoding
- Parameters:
feature_names (list) – Input feature names
one_hot_categories (dict) – Features to one-hot encode using given categories
- Returns:
one_hot_feature_names (list) – Updated list of feature names with one_hot categories
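A small sketch may help show how one-hot encoding changes the feature name list: each encoded input feature is dropped and its category names take its place. The ordering used here (remaining features first, category names appended afterwards) is an assumption, as are the helper name and the example features.

```python
def expand_one_hot_feature_names(feature_names, one_hot_categories):
    """Swap each one-hot encoded feature name for its category names.

    Assumption: category names are appended after the remaining features.
    """
    expanded = [name for name in feature_names
                if name not in one_hot_categories]
    for name, categories in one_hot_categories.items():
        if name in feature_names:
            expanded.extend(categories)
    return expanded


feature_names = ["temperature", "season", "wind_speed"]
one_hot_categories = {"season": ["winter", "spring", "summer", "fall"]}
one_hot_feature_names = expand_one_hot_feature_names(
    feature_names, one_hot_categories)
print(one_hot_feature_names)
# ['temperature', 'wind_speed', 'winter', 'spring', 'summer', 'fall']
```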
- property means
Mapping feature/label names to the mean values for (un)normalization
- Returns:
dict
- property model
Trained model
- Returns:
tensorflow.keras.models
- property model_summary
Tensorflow model summary
- Returns:
str
- property normalization_parameters
Features and label (un)normalization parameters
- Returns:
dict
- normalize(data, names=None)
Normalize given data
- Parameters:
data (dict | pandas.DataFrame | ndarray) – Data to normalize
names (list, optional) – List of data item names, needed to normalize ndarrays, by default None
- Returns:
data (dict | pandas.DataFrame | ndarray) – Normalized data in same format as input
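For intuition, normalization by mean and stdev can be sketched on a dictionary of columns, returning data in the same dict format as the input. The sketch assumes a z-score transform with the population standard deviation and a non-constant column (stdev > 0); normalize_columns and the column name are hypothetical.

```python
from statistics import mean, pstdev


def normalize_columns(data, norm_params=None):
    """z-score normalize each column of a dict of name -> list of floats.

    Missing normalization parameters are computed from the data itself.
    Returns the normalized dict and the parameters that were used.
    """
    norm_params = dict(norm_params or {})
    normalized = {}
    for name, values in data.items():
        if name not in norm_params:
            norm_params[name] = {"mean": mean(values),
                                 "stdev": pstdev(values)}
        mu = norm_params[name]["mean"]
        sigma = norm_params[name]["stdev"]  # assumed to be non-zero
        normalized[name] = [(value - mu) / sigma for value in values]
    return normalized, norm_params


normalized, used_params = normalize_columns({"temperature": [1.0, 3.0]})
print(normalized)  # {'temperature': [-1.0, 1.0]}
```

Returning the parameters alongside the data lets the same values be reused later to un-normalize predictions.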
- property normalize_features
Flag to normalize features
- Returns:
bool
- property normalize_labels
Flag to normalize labels
- Returns:
bool
- property one_hot_categories
categories to use for one-hot encoding
- Returns:
dict
- property one_hot_feature_names
One-hot encoded feature names
- Returns:
list
- property one_hot_input_feature_names
Input feature names to be one-hot encoded
- Returns:
list
- parse_features(features, names=None, **kwargs)
Parse features - preprocessing of feature data before training or prediction. This will do one-hot encoding based on self.one_hot_categories, and feature normalization based on self.normalize_features
- Parameters:
features (pandas.DataFrame | dict | ndarray) – Features to train on or predict from
names (list, optional) – List of feature names, by default None
kwargs (dict, optional) – kwargs for PreProcess.one_hot
- Returns:
features (ndarray) – Parsed features array normalized and with str columns converted to one hot vectors if desired
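The two preprocessing steps named above, one-hot encoding and feature normalization, can be sketched together on a small column dictionary. This is an illustrative stand-in and not phygnn's PreProcess logic: the helper names, the column order of the output (normalized numeric columns first, one-hot columns after), and the example data are assumptions.

```python
def one_hot_encode(values, categories):
    """Encode each value as a 0/1 row aligned with `categories`."""
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0.0] * len(categories)
        row[index[value]] = 1.0  # raises KeyError for an unknown category
        rows.append(row)
    return rows


def parse_feature_rows(features, one_hot_categories, norm_params):
    """Build a row-major feature matrix from a dict of columns."""
    n_rows = len(next(iter(features.values())))
    columns = []
    # Numeric columns: z-score normalize with the supplied parameters.
    for name, values in features.items():
        if name in one_hot_categories:
            continue
        mu = norm_params[name]["mean"]
        sigma = norm_params[name]["stdev"]
        columns.append([(value - mu) / sigma for value in values])
    # Categorical columns: one output column per category.
    for name, categories in one_hot_categories.items():
        encoded = one_hot_encode(features[name], categories)
        for j in range(len(categories)):
            columns.append([row[j] for row in encoded])
    # Transpose the column list into rows.
    return [[column[i] for column in columns] for i in range(n_rows)]


features = {"temperature": [10.0, 30.0], "season": ["winter", "summer"]}
rows = parse_feature_rows(
    features,
    one_hot_categories={"season": ["winter", "summer"]},
    norm_params={"temperature": {"mean": 20.0, "stdev": 10.0}},
)
print(rows)  # [[-1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
```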
- predict(features, table=True, parse_kwargs=None, predict_kwargs=None)
Use model to predict label from given features
- Parameters:
features (dict | pandas.DataFrame) – features to predict from
table (bool, optional) – Return pandas DataFrame
parse_kwargs (dict) – kwargs for cls.parse_features
predict_kwargs (dict) – kwargs for tensorflow.*.predict
- Returns:
prediction (ndarray | pandas.DataFrame) – label prediction
- static seed(s=0)
Set the random seed for reproducible results.
- Parameters:
s (int) – Random number generator seed
- property stdevs
Mapping feature/label names to the stdev values for (un)normalization
- Returns:
dict
- unnormalize(data, names=None)
Un-normalize given data
- Parameters:
data (dict | pandas.DataFrame | ndarray) – Data to un-normalize
names (list, optional) – List of data item names, needed to un-normalize ndarrays, by default None
- Returns:
data (dict | pandas.DataFrame | ndarray) – Native data in same format as input
- property version_record
A record of important versions that this model was built with.
- Returns:
dict