phygnn.model_interfaces.phygnn_model.PhygnnModel

class PhygnnModel(model, feature_names=None, label_names=None, norm_params=None, normalize=(True, False), one_hot_categories=None)[source]

Bases: ModelBase

Phygnn Model interface

Parameters:
  • model (PhysicsGuidedNeuralNetwork) – PhysicsGuidedNeuralNetwork Model instance

  • feature_names (list) – Ordered list of feature names.

  • label_names (list) – Ordered list of label (output) names.

  • norm_params (dict, optional) – Dictionary mapping feature and label names (keys) to normalization parameters (mean, stdev), by default None

  • normalize (bool | tuple, optional) – Boolean flag(s) controlling whether features and labels are normalized. True normalizes both, False normalizes neither, and a tuple of flags (normalize_feature, normalize_label) sets each independently. By default (True, False), i.e. normalize features but not labels.

  • one_hot_categories (dict, optional) – Features to one-hot encode using the given categories; if None, one-hot encoding is not performed. By default None.
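
Example (hedged sketch): the names, statistics, and categories below are illustrative assumptions, and the per-name layout of norm_params ({'mean': ..., 'stdev': ...}) is assumed for illustration. In practice a PhygnnModel is usually created through the build or build_trained classmethods rather than by calling the constructor directly.

    from phygnn.model_interfaces.phygnn_model import PhygnnModel

    # pgnn is assumed to be an existing, trained PhysicsGuidedNeuralNetwork instance
    norm_params = {
        'temperature': {'mean': 15.2, 'stdev': 8.4},
        'irradiance': {'mean': 450.0, 'stdev': 310.0},
        'power': {'mean': 2.1, 'stdev': 1.3},
    }
    one_hot_categories = {'season': ['winter', 'spring', 'summer', 'fall']}

    # normalize=(True, False): normalize features but leave labels in native units
    model = PhygnnModel(pgnn,
                        feature_names=['temperature', 'irradiance', 'season'],
                        label_names=['power'],
                        norm_params=norm_params,
                        normalize=(True, False),
                        one_hot_categories=one_hot_categories)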

Methods

build(p_fun, feature_names, label_names[, ...])

Build phygnn model from given features, layers and kwargs

build_trained(p_fun, features, labels, p[, ...])

Build phygnn model from given features, layers and kwargs and then train with given labels and kwargs

dict_json_convert(inp)

Recursively convert numeric values in dict to work with json dump

get_mean(name)

Get feature | label mean

get_norm_params(names)

Get means and stdevs for given feature/label names

get_stdev(name)

Get feature | label stdev

load(path)

Load model from model path.

make_one_hot_feature_names(feature_names, ...)

Update feature_names after one-hot encoding

normalize(data[, names])

Normalize given data

parse_features(features[, names])

Parse features - preprocessing of feature data before training or prediction.

parse_labels(labels[, names])

Parse labels and normalize if desired

predict(features[, table, parse_kwargs, ...])

Use model to predict label from given features

save_model(path)

Save phygnn model to path.

seed([s])

Set the random seed for reproducible results.

set_loss_weights(loss_weights)

Set new loss weights

train_model(features, labels, p[, n_batch, ...])

Train the model with the provided features and labels

unnormalize(data[, names])

Un-normalize given data

Attributes

bias_weights

Get a list of the NN bias weights (tensors)

feature_dims

Number of features

feature_means

Feature means, used for (un)normalization

feature_names

List of the feature variable names.

feature_stdevs

Feature stdevs, used for (un)normalization

history

Model training history DataFrame (None if not yet trained)

input_feature_names

Input feature names

kernel_weights

Get a list of the NN kernel weights (tensors)

label_dims

Number of labels

label_means

Label means, used for (un)normalization

label_names

Label variable names

label_stdevs

Label stdevs, used for (un)normalization

layers

Model layers

means

Mapping feature/label names to the mean values for (un)normalization

model

Trained model

model_summary

Tensorflow model summary

normalization_parameters

Features and label (un)normalization parameters

normalize_features

Flag to normalize features

normalize_labels

Flag to normalize labels

one_hot_categories

Categories to use for one-hot encoding

one_hot_feature_names

One-hot encoded feature names

one_hot_input_feature_names

Input feature names to be one-hot encoded

stdevs

Mapping feature/label names to the stdev values for (un)normalization

version_record

A record of important versions that this model was built with.

weights

Get a list of layer weights for gradient calculations.

MODEL_CLASS

alias of PhysicsGuidedNeuralNetwork

property layers

Model layers

Returns:

list

property weights

Get a list of layer weights for gradient calculations.

Returns:

list

property kernel_weights

Get a list of the NN kernel weights (tensors)

(can be used for kernel regularization).

Does not include input layer or dropout layers. Does include the output layer.

Returns:

list

property bias_weights

Get a list of the NN bias weights (tensors)

(can be used for bias regularization).

Does not include input layer or dropout layers. Does include the output layer.

Returns:

list

property history

Model training history DataFrame (None if not yet trained)

Returns:

pandas.DataFrame | None

property version_record

A record of important versions that this model was built with.

Returns:

dict

train_model(features, labels, p, n_batch=16, batch_size=None, n_epoch=10, shuffle=True, validation_split=0.2, run_preflight=True, return_diagnostics=False, p_kwargs=None, parse_kwargs=None)[source]

Train the model with the provided features and labels

Parameters:
  • features (np.ndarray | pd.DataFrame) – Feature data in a >=2D array or DataFrame. If this is a DataFrame, the index is ignored, the columns are used with self.feature_names, and the df is converted into a numpy array for batching and passing to the training algorithm. A 2D input should have the shape: (n_observations, n_features). A 3D input should have the shape: (n_observations, n_timesteps, n_features). 4D inputs have not been tested and should be used with caution.

  • labels (np.ndarray | pd.DataFrame) – Known output data in a 2D array or DataFrame. Same dimension rules as features.

  • p (np.ndarray | pd.DataFrame) – Supplemental feature data for the physics loss function in 2D array or DataFrame. Same dimension rules as features.

  • n_batch (int) – Number of times to update the NN weights per epoch (number of mini-batches). The training data will be split into this many mini-batches and the NN will train on each mini-batch, update weights, then move onto the next mini-batch.

  • batch_size (int | None) – Number of training samples per batch. This input is redundant to n_batch and will not be used if n_batch is not None.

  • n_epoch (int) – Number of times to iterate on the training data.

  • shuffle (bool) – Flag to randomly subset the validation data and batch selection from features, labels, and p.

  • validation_split (float) – Fraction of features and labels to use for validation.

  • p_kwargs (None | dict) – Optional kwargs for the physical loss function p_fun.

  • run_preflight (bool) – Flag to run preflight checks.

  • return_diagnostics (bool) – Flag to return training diagnostics dictionary.

  • parse_kwargs (dict) – kwargs for cls.parse_features

Returns:

diagnostics (dict, optional) – Namespace of training parameters that can be used for diagnostics.
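
A hedged usage sketch: the feature, label, and physics arrays below are random illustrative data, and model is an existing PhygnnModel instance (e.g. from the constructor sketch above or the build classmethod below).

    import numpy as np

    n_obs = 1000
    x = np.random.rand(n_obs, 3)   # (n_observations, n_features)
    y = np.random.rand(n_obs, 1)   # (n_observations, n_labels)
    p = x.copy()                   # supplemental data passed through to p_fun

    # 16 mini-batches per epoch, 20 epochs, 20% of the data held out for validation;
    # with return_diagnostics=True the training diagnostics dict is returned
    diagnostics = model.train_model(x, y, p,
                                    n_batch=16,
                                    n_epoch=20,
                                    validation_split=0.2,
                                    return_diagnostics=True)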

save_model(path)[source]

Save phygnn model to path.

Parameters:

path (str) – Target model save path. Can be a target .json or .pkl file, or a directory that will be created and populated with a pkl model file and a json parameters file.

set_loss_weights(loss_weights)[source]

Set new loss weights

Parameters:

loss_weights (tuple) – Loss weights for the neural network y_true vs y_predicted and for the p_fun loss, respectively. For example, loss_weights=(0.0, 1.0) would simplify the phygnn loss function to just the p_fun output.
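
For example, the following hedged one-liner (weights are illustrative) drops the data loss term entirely and trains on the physics loss alone:

    # (0.0, 1.0): ignore the y_true vs y_predicted term, keep only the p_fun loss
    model.set_loss_weights((0.0, 1.0))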

classmethod build(p_fun, feature_names, label_names, normalize=(True, False), one_hot_categories=None, loss_weights=(0.5, 0.5), hidden_layers=None, input_layer=None, output_layer=None, layers_obj=None, metric='mae', optimizer=None, learning_rate=0.01, history=None, kernel_reg_rate=0.0, kernel_reg_power=1, bias_reg_rate=0.0, bias_reg_power=1, name=None)[source]

Build phygnn model from given features, layers and kwargs

Parameters:
  • p_fun (function) – Physics function to guide the neural network loss function. This function must take (phygnn, y_true, y_predicted, p, **p_kwargs) as arguments with datatypes (PhysicsGuidedNeuralNetwork, tf.Tensor, np.ndarray, np.ndarray). The function must return a tf.Tensor object with a single numeric loss value (output.ndim == 0).

  • feature_names (list) – Ordered list of feature names.

  • label_names (list) – Ordered list of label (output) names.

  • normalize (bool | tuple, optional) – Boolean flag(s) controlling whether features and labels are normalized. True normalizes both, False normalizes neither, and a tuple of flags (normalize_feature, normalize_label) sets each independently. By default (True, False), i.e. normalize features but not labels.

  • one_hot_categories (dict, optional) – Features to one-hot encode using the given categories; if None, one-hot encoding is not performed. By default None.

  • loss_weights (tuple, optional) – Loss weights for the neural network y_true vs y_predicted and for the p_fun loss, respectively. For example, loss_weights=(0.0, 1.0) would simplify the phygnn loss function to just the p_fun output.

  • hidden_layers (list, optional) – List of dictionaries of keyword arguments for each hidden layer in the NN. Dense linear layers can be input with their activations or separately for more explicit control over the layer ordering. For example, this is a valid input for hidden_layers that will yield 8 hidden layers (10 layers including input+output):

    [{'units': 64, 'activation': 'relu', 'dropout': 0.01},
     {'units': 64},
     {'batch_normalization': {'axis': -1}},
     {'activation': 'relu'},
     {'dropout': 0.01},
     {'class': 'Flatten'}]

  • input_layer (None | bool | dict) – Input layer specification. Can be a dictionary similar to hidden_layers specifying a dense / conv / lstm layer. Will default to a keras InputLayer with input shape = n_features. Can be False if the input layer will be included in the hidden_layers input.

  • output_layer (None | bool | list | dict) – Output layer specification. Can be a list/dict similar to the hidden_layers input specifying a dense layer with activation. For example, for a classification problem with a single output, output_layer should be [{'units': 1}, {'activation': 'sigmoid'}]. This defaults to a single dense layer with no activation (best for regression problems). Can be False if the output layer will be included in the hidden_layers input.

  • layers_obj (None | phygnn.utilities.tf_layers.Layers) – Optional initialized Layers object to set as the model layers including pre-set weights. This option will override the hidden_layers, input_layer, and output_layer arguments.

  • metric (str, optional) – Loss metric option for the NN loss function (not the physical loss function). Must be a valid key in phygnn.loss_metrics.METRICS

  • optimizer (tensorflow.keras.optimizers | dict | None) – Instantiated tf.keras.optimizers object or a dict optimizer config from tf.keras.optimizers.get_config(). None defaults to Adam.

  • learning_rate (float, optional) – Optimizer learning rate. Not used if optimizer input arg is a pre-initialized object or if optimizer input arg is a config dict.

  • history (None | pd.DataFrame, optional) – Learning history if continuing a training session.

  • kernel_reg_rate (float, optional) – Kernel regularization rate. Increasing this value above zero will add a structural loss term to the loss function that disincentivizes large hidden layer weights and should reduce model complexity. Setting this to 0.0 will disable kernel regularization.

  • kernel_reg_power (int, optional) – Kernel regularization power. kernel_reg_power=1 is L1 regularization (lasso regression), and kernel_reg_power=2 is L2 regularization (ridge regression).

  • bias_reg_rate (float, optional) – Bias regularization rate. Increasing this value above zero will add a structural loss term to the loss function that disincentivizes large hidden layer biases and should reduce model complexity. Setting this to 0.0 will disable bias regularization.

  • bias_reg_power (int, optional) – Bias regularization power. bias_reg_power=1 is L1 regularization (lasso regression), and bias_reg_power=2 is L2 regularization (ridge regression).

  • name (None | str) – Optional model name for debugging.

Returns:

model (PhygnnModel) – Initialized PhygnnModel instance
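
A hedged sketch of a build call. The physics function below is a toy placeholder that penalizes negative predictions, and the feature/label names and layer sizes are illustrative assumptions, not defaults.

    import tensorflow as tf
    from phygnn.model_interfaces.phygnn_model import PhygnnModel

    def p_fun(phygnn, y_true, y_predicted, p):
        """Toy physics loss (illustrative): penalize negative predictions."""
        return tf.reduce_mean(tf.nn.relu(-y_predicted))

    model = PhygnnModel.build(
        p_fun,
        feature_names=['temperature', 'irradiance'],
        label_names=['power'],
        hidden_layers=[{'units': 64, 'activation': 'relu', 'dropout': 0.01},
                       {'units': 64, 'activation': 'relu', 'dropout': 0.01}],
        loss_weights=(0.5, 0.5),
        metric='mae',
        learning_rate=0.001)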

classmethod build_trained(p_fun, features, labels, p, normalize=(True, False), one_hot_categories=None, loss_weights=(0.5, 0.5), hidden_layers=None, input_layer=None, output_layer=None, layers_obj=None, metric='mae', optimizer=None, learning_rate=0.01, history=None, kernel_reg_rate=0.0, kernel_reg_power=1, bias_reg_rate=0.0, bias_reg_power=1, n_batch=16, batch_size=None, n_epoch=10, shuffle=True, validation_split=0.2, run_preflight=True, return_diagnostics=False, p_kwargs=None, parse_kwargs=None, save_path=None, name=None)[source]

Build phygnn model from given features, layers and kwargs and then train with given labels and kwargs

Parameters:
  • p_fun (function) – Physics function to guide the neural network loss function. This function must take (phygnn, y_true, y_predicted, p, **p_kwargs) as arguments with datatypes (PhysicsGuidedNeuralNetwork, tf.Tensor, np.ndarray, np.ndarray). The function must return a tf.Tensor object with a single numeric loss value (output.ndim == 0).

  • features (np.ndarray | pd.DataFrame) – Feature data in a >=2D array or DataFrame. If this is a DataFrame, the index is ignored, the columns are used with self.feature_names, and the df is converted into a numpy array for batching and passing to the training algorithm. A 2D input should have the shape: (n_observations, n_features). A 3D input should have the shape: (n_observations, n_timesteps, n_features). 4D inputs have not been tested and should be used with caution.

  • labels (np.ndarray | pd.DataFrame) – Known output data in a 2D array or DataFrame. Same dimension rules as features.

  • p (np.ndarray | pd.DataFrame) – Supplemental feature data for the physics loss function in 2D array or DataFrame. Same dimension rules as features.

  • normalize (bool | tuple, optional) – Boolean flag(s) controlling whether features and labels are normalized. True normalizes both, False normalizes neither, and a tuple of flags (normalize_feature, normalize_label) sets each independently. By default (True, False), i.e. normalize features but not labels.

  • one_hot_categories (dict, optional) – Features to one-hot encode using the given categories; if None, one-hot encoding is not performed. By default None.

  • loss_weights (tuple, optional) – Loss weights for the neural network y_true vs y_predicted and for the p_fun loss, respectively. For example, loss_weights=(0.0, 1.0) would simplify the phygnn loss function to just the p_fun output.

  • hidden_layers (list, optional) – List of dictionaries of keyword arguments for each hidden layer in the NN. Dense linear layers can be input with their activations or separately for more explicit control over the layer ordering. For example, this is a valid input for hidden_layers that will yield 8 hidden layers (10 layers including input+output):

    [{'units': 64, 'activation': 'relu', 'dropout': 0.01},
     {'units': 64},
     {'batch_normalization': {'axis': -1}},
     {'activation': 'relu'},
     {'dropout': 0.01},
     {'class': 'Flatten'}]

  • input_layer (None | bool | dict) – Input layer specification. Can be a dictionary similar to hidden_layers specifying a dense / conv / lstm layer. Will default to a keras InputLayer with input shape = n_features. Can be False if the input layer will be included in the hidden_layers input.

  • output_layer (None | bool | list | dict) – Output layer specification. Can be a list/dict similar to the hidden_layers input specifying a dense layer with activation. For example, for a classification problem with a single output, output_layer should be [{'units': 1}, {'activation': 'sigmoid'}]. This defaults to a single dense layer with no activation (best for regression problems). Can be False if the output layer will be included in the hidden_layers input.

  • layers_obj (None | phygnn.utilities.tf_layers.Layers) – Optional initialized Layers object to set as the model layers including pre-set weights. This option will override the hidden_layers, input_layer, and output_layer arguments.

  • metric (str, optional) – Loss metric option for the NN loss function (not the physical loss function). Must be a valid key in phygnn.loss_metrics.METRICS

  • optimizer (tensorflow.keras.optimizers | dict | None) – Instantiated tf.keras.optimizers object or a dict optimizer config from tf.keras.optimizers.get_config(). None defaults to Adam.

  • learning_rate (float, optional) – Optimizer learning rate. Not used if optimizer input arg is a pre-initialized object or if optimizer input arg is a config dict.

  • history (None | pd.DataFrame, optional) – Learning history if continuing a training session.

  • kernel_reg_rate (float, optional) – Kernel regularization rate. Increasing this value above zero will add a structural loss term to the loss function that disincentivizes large hidden layer weights and should reduce model complexity. Setting this to 0.0 will disable kernel regularization.

  • kernel_reg_power (int, optional) – Kernel regularization power. kernel_reg_power=1 is L1 regularization (lasso regression), and kernel_reg_power=2 is L2 regularization (ridge regression).

  • bias_reg_rate (float, optional) – Bias regularization rate. Increasing this value above zero will add a structural loss term to the loss function that disincentivizes large hidden layer biases and should reduce model complexity. Setting this to 0.0 will disable bias regularization.

  • bias_reg_power (int, optional) – Bias regularization power. bias_reg_power=1 is L1 regularization (lasso regression), and bias_reg_power=2 is L2 regularization (ridge regression).

  • n_batch (int) – Number of times to update the NN weights per epoch (number of mini-batches). The training data will be split into this many mini-batches and the NN will train on each mini-batch, update weights, then move onto the next mini-batch.

  • batch_size (int | None) – Number of training samples per batch. This input is redundant to n_batch and will not be used if n_batch is not None.

  • n_epoch (int) – Number of times to iterate on the training data.

  • shuffle (bool) – Flag to randomly subset the validation data and batch selection from features, labels, and p.

  • validation_split (float) – Fraction of features and labels to use for validation.

  • run_preflight (bool) – Flag to run preflight checks.

  • return_diagnostics (bool) – Flag to return training diagnostics dictionary.

  • p_kwargs (None | dict) – Optional kwargs for the physical loss function p_fun.

  • parse_kwargs (dict) – kwargs for cls.parse_features

  • save_path (str, optional) – Path to save the model to (see save_model): a target .json or .pkl file, or a directory that will be created and populated with a pkl model file and a json parameters file. By default None (the model is not saved).

  • name (None | str) – Optional model name for debugging.

Returns:

  • model (PhygnnModel) – Initialized and trained PhygnnModel instance

  • diagnostics (dict, optional) – Namespace of training parameters that can be used for diagnostics.
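
A compact hedged sketch that builds and trains in one call, reusing the illustrative p_fun, x, y, and p from the sketches above; save_path is an assumed writable directory.

    model = PhygnnModel.build_trained(
        p_fun, x, y, p,
        hidden_layers=[{'units': 64, 'activation': 'relu'}],
        n_batch=16,
        n_epoch=20,
        save_path='./my_phygnn_model')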

classmethod load(path)[source]

Load model from model path.

Parameters:

path (str) – Directory path for PhygnnModel to load model from. There should be a saved model directory with json and pickle files for the PhygnnModel framework.

Returns:

model (PhygnnModel) – Loaded PhygnnModel from disk.
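
A hedged save/load round trip, assuming an existing model instance and a writable directory path:

    model.save_model('./my_phygnn_model')           # writes pkl model + json parameters
    loaded = PhygnnModel.load('./my_phygnn_model')  # restore the framework from the same path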

static dict_json_convert(inp)

Recursively convert numeric values in dict to work with json dump

Parameters:

inp (dict) – Dictionary to convert.

Returns:

out (dict) – Copy of dict input with all nested numeric values converted to base python int or float and all arrays converted to lists.
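
A small hedged sketch with illustrative numpy values:

    import json
    import numpy as np

    raw = {'mean': np.float64(15.2), 'counts': np.array([1, 2, 3])}
    clean = PhygnnModel.dict_json_convert(raw)  # numpy scalars -> python floats, arrays -> lists
    json.dumps(clean)                           # now serializable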

property feature_dims

Number of features

Returns:

int

property feature_means

Feature means, used for (un)normalization

Returns:

list

property feature_names

List of the feature variable names.

Returns:

list

property feature_stdevs

Feature stdevs, used for (un)normalization

Returns:

list

get_mean(name)

Get feature | label mean

Parameters:

name (str) – feature | label name

Returns:

mean (float) – Mean value used for normalization

get_norm_params(names)

Get means and stdevs for given feature/label names

Parameters:

names (list) – list of feature/label names to get normalization params for

Returns:

  • means (list) – List of means to use for (un)normalization

  • stdevs (list) – List of stdevs to use for (un)normalization

get_stdev(name)

Get feature | label stdev

Parameters:

name (str) – feature | label name

Returns:

stdev (float) – Stdev value used for normalization

property input_feature_names

Input feature names

Returns:

list

property label_dims

Number of labels

Returns:

int

property label_means

Label means, used for (un)normalization

Returns:

list

property label_names

Label variable names

Returns:

list

property label_stdevs

Label stdevs, used for (un)normalization

Returns:

list

static make_one_hot_feature_names(feature_names, one_hot_categories)

Update feature_names after one-hot encoding

Parameters:
  • feature_names (list) – Input feature names

  • one_hot_categories (dict) – Features to one-hot encode using given categories

Returns:

one_hot_feature_names (list) – Updated list of feature names with one_hot categories
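
A hedged sketch of the call; the exact ordering of the expanded names is an implementation detail of phygnn and is not asserted here:

    names = PhygnnModel.make_one_hot_feature_names(
        ['temperature', 'season'],
        {'season': ['winter', 'spring', 'summer', 'fall']})
    # 'season' is expanded into its category columns in the returned list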

property means

Mapping feature/label names to the mean values for (un)normalization

Returns:

dict

property model

Trained model

Returns:

tensorflow.keras.models

property model_summary

Tensorflow model summary

Returns:

str

property normalization_parameters

Features and label (un)normalization parameters

Returns:

dict

normalize(data, names=None)

Normalize given data

Parameters:
  • data (dict | pandas.DataFrame | ndarray) – Data to normalize

  • names (list, optional) – List of data item names, needed to normalize ndarrays, by default None

Returns:

data (dict | pandas.DataFrame | ndarray) – Normalized data in same format as input
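
A hedged ndarray sketch; the names are illustrative and are assumed to match entries in the stored normalization parameters:

    import numpy as np

    arr = np.random.rand(10, 2)
    out = model.normalize(arr, names=['temperature', 'irradiance'])  # ndarray in, ndarray out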

property normalize_features

Flag to normalize features

Returns:

bool

property normalize_labels

Flag to normalize labels

Returns:

bool

property one_hot_categories

Categories to use for one-hot encoding

Returns:

dict

property one_hot_feature_names

One-hot encoded feature names

Returns:

list

property one_hot_input_feature_names

Input feature names to be one-hot encoded

Returns:

list

parse_features(features, names=None, **kwargs)

Parse features - preprocessing of feature data before training or prediction. This will do one-hot encoding based on self.one_hot_categories, and feature normalization based on self.normalize_features

Parameters:
  • features (pandas.DataFrame | dict | ndarray) – Features to train on or predict from

  • names (list, optional) – List of feature names, by default None

  • kwargs (dict, optional) – kwargs for PreProcess.one_hot

Returns:

features (ndarray) – Parsed feature array, normalized and with string columns converted to one-hot vectors if desired

parse_labels(labels, names=None)

Parse labels and normalize if desired

Parameters:
  • labels (pandas.DataFrame | dict | ndarray) – Label data to parse and normalize if desired

  • names (list, optional) – List of label names, by default None

Returns:

labels (ndarray) – Parsed labels array, normalized if desired

predict(features, table=True, parse_kwargs=None, predict_kwargs=None)

Use model to predict label from given features

Parameters:
  • features (dict | pandas.DataFrame) – features to predict from

  • table (bool, optional) – Flag to return the prediction as a pandas DataFrame, by default True

  • parse_kwargs (dict) – kwargs for cls.parse_features

  • predict_kwargs (dict) – kwargs for tensorflow.*.predict

Returns:

prediction (ndarray | pandas.DataFrame) – label prediction
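
A hedged prediction sketch with a DataFrame input; the column names are illustrative and must match the model's input feature names:

    import pandas as pd

    new_data = pd.DataFrame({'temperature': [10.0, 20.0],
                             'irradiance': [300.0, 600.0],
                             'season': ['winter', 'summer']})
    pred = model.predict(new_data, table=True)  # DataFrame of label predictions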

static seed(s=0)

Set the random seed for reproducible results.

Parameters:

s (int) – Random number generator seed

property stdevs

Mapping feature/label names to the stdev values for (un)normalization

Returns:

dict

unnormalize(data, names=None)

Un-normalize given data

Parameters:
  • data (dict | pandas.DataFrame | ndarray) – Data to un-normalize

  • names (list, optional) – List of data item names, needed to un-normalize ndarrays, by default None

Returns:

data (dict | pandas.DataFrame | ndarray) – Un-normalized (native-units) data in the same format as the input