phygnn.base.CustomNetwork
- class CustomNetwork(n_features=None, n_labels=None, hidden_layers=None, input_layer=False, output_layer=False, layers_obj=None, feature_names=None, output_names=None, name=None)[source]
Bases: ABC
Custom infrastructure for feed forward neural networks.
Note that the phygnn model requires TensorFlow 2.x
- Parameters:
n_features (int, optional) – Number of input features. This should match the last dimension of the feature training data.
n_labels (int, optional) – Number of output labels. This should match the last dimension of the label training data.
hidden_layers (list, optional) – List of dictionaries of keyword arguments for each hidden layer in the NN. Dense linear layers can be input with their activations or separately for more explicit control over the layer ordering. For example, this is a valid input for hidden_layers that will yield 8 hidden layers (10 layers including input+output); see also the sketch after this parameter list:
[{'units': 64, 'activation': 'relu', 'dropout': 0.01}, {'units': 64}, {'batch_normalization': {'axis': -1}}, {'activation': 'relu'}, {'dropout': 0.01}, {'class': 'Flatten'}]
input_layer (None | bool | dict) – Input layer specification. Can be a dictionary similar to hidden_layers specifying a dense / conv / lstm layer. Defaults to False so the input layer will be included in the hidden_layers input.
output_layer (None | bool | list | dict) – Output layer specification. Can be a list/dict similar to the hidden_layers input specifying a dense layer with activation. For example, for a classification problem with a single output, output_layer should be [{'units': 1}, {'activation': 'sigmoid'}]. Default is False so the output layer will be included in the hidden_layers input.
layers_obj (None | phygnn.utilities.tf_layers.Layers) – Optional initialized Layers object to set as the model layers including pre-set weights. This option will override the hidden_layers, input_layer, and output_layer arguments.
feature_names (list | tuple | None, optional) – Training feature names (strings). Mostly a convenience so that a loaded-from-disk model will have declared feature names, making it easier to feed in features for prediction. This will also get set if phygnn is trained on a DataFrame.
output_names (list | tuple | None, optional) – Prediction output names (strings). Mostly a convenience so that a loaded-from-disk model will have declared output names, making it easier to understand prediction output. This will also get set if phygnn is trained on a DataFrame.
name (None | str) – Optional model name for debugging.
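For illustration, the hidden_layers and output_layer examples from the parameter descriptions above can be written out as plain Python lists of keyword-argument dictionaries (a sketch only; no phygnn calls are made here):

    # Eight hidden layers: dense + relu + dropout, dense, batch norm,
    # relu, dropout, flatten (10 layers including input + output).
    hidden_layers = [
        {'units': 64, 'activation': 'relu', 'dropout': 0.01},
        {'units': 64},
        {'batch_normalization': {'axis': -1}},
        {'activation': 'relu'},
        {'dropout': 0.01},
        {'class': 'Flatten'},
    ]

    # Single-output classification head: dense layer followed by a sigmoid activation.
    output_layer = [{'units': 1}, {'activation': 'sigmoid'}]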
Methods
get_val_split(*args[, shuffle, validation_split]) – Get a validation split and remove it from the training data.
load(fpath) – Load a phygnn model that has been saved to a pickle file.
make_batches(*args[, n_batch, batch_size, ...]) – Make lists of unique data batches by splitting x and y along the 1st data dimension.
predict(x[, to_numpy, training, training_layers]) – Run a prediction on input features.
preflight_features(x) – Run preflight checks and data conversions on feature data.
save(fpath) – Save a phygnn model to a pickle file.
seed([s]) – Set the random seed for reproducible results.
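A minimal save/load round trip, assuming `model` is a trained instance of a concrete CustomNetwork subclass and the file path is hypothetical:

    # Persist the trained model to a pickle file, then restore it later.
    model.save('./my_model.pkl')
    restored = type(model).load('./my_model.pkl')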
Attributes
bias_weights – Get a list of the NN bias weights (tensors).
kernel_weights – Get a list of the NN kernel weights (tensors).
layers – Ordered list of TensorFlow keras layers that make up this model, including input and output layers.
layers_obj – phygnn layers handler object.
model_params – Model parameters, used to save the model to disk.
version_record – A record of important versions that this model was built with.
weights – Get a list of layer weights and bias terms for gradient calculations.
- property version_record
A record of important versions that this model was built with.
- Returns:
dict
- property layers
Ordered list of TensorFlow keras layers that make up this model, including input and output layers.
- Returns:
list
- property layers_obj
phygnn layers handler object
- Returns:
phygnn.utilities.tf_layers.Layers
- property weights
Get a list of layer weights and bias terms for gradient calculations.
- Returns:
list
- property kernel_weights
Get a list of the NN kernel weights (tensors); can be used for kernel regularization. Does not include the input layer or dropout layers, but does include the output layer.
- Returns:
list
- property bias_weights
Get a list of the NN bias weights (tensors); can be used for bias regularization. Does not include the input layer or dropout layers, but does include the output layer.
- Returns:
list
- property model_params
Model parameters, used to save the model to disk.
- Returns:
dict
- static seed(s=0)[source]
Set the random seed for reproducible results.
- Parameters:
s (int) – Random seed
- classmethod get_val_split(*args, shuffle=True, validation_split=0.2)[source]
Get a validation split and remove it from the training data. This applies the split along the 1st data dimension.
- Parameters:
args (np.ndarray) – This is one or more positional arguments that are numpy arrays to be split. They must have the same length.
shuffle (bool) – Flag to randomly subset the validation data from x and y. shuffle=False will take the first entries in x and y.
validation_split (float) – Fraction of x and y to put in the validation set.
- Returns:
out (list) – List with the same length as the number of positional input arguments. Each entry is itself a two-entry list of the format [training split, validation split], corresponding to the respective positional input argument.
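A usage sketch of the documented return structure (array shapes are illustrative):

    import numpy as np
    from phygnn.base import CustomNetwork

    x = np.random.rand(100, 3)  # (observations, features)
    y = np.random.rand(100, 1)  # (observations, labels)

    # One [training split, validation split] pair is returned per positional input.
    (x_train, x_val), (y_train, y_val) = CustomNetwork.get_val_split(
        x, y, shuffle=True, validation_split=0.2)
    # With validation_split=0.2, about 80 rows go to training and 20 to validation.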
- static make_batches(*args, n_batch=16, batch_size=None, shuffle=True)[source]
Make lists of unique data batches by splitting x and y along the 1st data dimension.
- Parameters:
args (np.ndarray) – This is one or more positional arguments that are numpy arrays to be batched. They must have the same length.
n_batch (int | None) – Number of times to update the NN weights per epoch. The training data will be split into this many batches and the NN will train on each batch, update weights, then move onto the next batch.
batch_size (int | None) – Number of training samples per batch. This input is redundant to n_batch and will not be used if n_batch is not None.
shuffle (bool) – Flag to randomly shuffle the data along the 1st dimension before batching.
- Returns:
batches (GeneratorType) – Generator of batches; each iteration of the generator yields as many entries as there are positional input arguments. Each entry is an ND array with the same dimensions as the corresponding input, but containing only a batch subset along the 0 axis.
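For example, the generator can be consumed directly in a training loop (a sketch; the weight-update step is left abstract):

    import numpy as np
    from phygnn.base import CustomNetwork

    x = np.random.rand(64, 12, 4)  # e.g. (observations, time, features)
    y = np.random.rand(64, 1)

    # Each iteration yields one batch per positional input, subset along axis 0.
    for x_batch, y_batch in CustomNetwork.make_batches(x, y, n_batch=4, shuffle=True):
        pass  # run one weight update on (x_batch, y_batch)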
- preflight_features(x)[source]
Run preflight checks and data conversions on feature data.
- Parameters:
x (np.ndarray | pd.DataFrame) – Feature data in a >=2D array or DataFrame. If this is a DataFrame, the index is ignored, the columns are used with self.feature_names, and the df is converted into a numpy array for batching and passing to the training algorithm. Generally speaking, the data should always have the number of observations in the first axis and the number of features/channels in the last axis. Spatial and temporal dimensions can be used in intermediate axes.
- Returns:
x (np.ndarray) – Feature data in a >=2D array
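For instance, a DataFrame whose columns match the model's feature_names can be passed directly (the model instance and column names here are hypothetical):

    import pandas as pd

    # Assumes the model was built with feature_names=['temperature', 'pressure'].
    df = pd.DataFrame({'temperature': [280.0, 290.0], 'pressure': [1.0, 0.9]})
    x_arr = model.preflight_features(df)  # columns checked against feature_names,
                                          # returned as a numpy array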
- predict(x, to_numpy=True, training=False, training_layers=(BatchNormalization, Dropout, LSTM))[source]
Run a prediction on input features.
- Parameters:
x (np.ndarray | pd.DataFrame) – Feature data in a >=2D array or DataFrame. If this is a DataFrame, the index is ignored, the columns are used with self.feature_names, and the df is converted into a numpy array for batching and passing to the training algorithm. Generally speaking, the data should always have the number of observations in the first axis and the number of features/channels in the last axis. Spatial and temporal dimensions can be used in intermediate axes.
to_numpy (bool) – Flag to convert output from tensor to numpy array
training (bool) – Flag for predict() used in the training routine. This is used to freeze the BatchNormalization and Dropout layers.
training_layers (list | tuple) – List of tensorflow.keras.layers classes that training=bool should be passed to. By default this is (BatchNormalization, Dropout, LSTM)
- Returns:
y (tf.Tensor | np.ndarray) – Predicted output data.
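A usage sketch, assuming `model` is a trained CustomNetwork subclass instance with three input features:

    import numpy as np

    x_new = np.random.rand(10, 3)                # (observations, features)
    y_np = model.predict(x_new)                  # numpy array (to_numpy=True by default)
    y_tf = model.predict(x_new, to_numpy=False)  # raw tf.Tensor output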