PT-MELT package
PT-MELT is composed of a set of modules that form the basis for building a
variety of machine learning models. The package is structured to encourage
modularity and code reuse. The main modules are:
Blocks Module: Each block is a self-contained PyTorch module that can be combined with other blocks to form more complex models. The blocks are designed to be easy to compose and to extend.
Layers Module: Custom reimplementations of PyTorch layers used in some of the blocks. They are designed as drop-in replacements for the default PyTorch layers, with additional functionality.
Losses Module: Custom loss functions. Some are designed for use with a specific model, while others can be used with any model.
Models Module: The main machine learning models, built from the blocks and losses. Each model serves a dual purpose: it is a standalone model and also a template for building more complex models with the PT-MELT blocks.
NN Utils Module: The nn_utils module contains utility functions used by the main modules, such as retrieving activation functions and initializers by their string names.
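As a quick orientation, the following is a minimal usage sketch, assuming the models map inputs of shape (batch, num_features) to outputs of shape (batch, num_outputs); the individual keyword arguments are documented in the reference below.

    import torch
    from ptmelt.models import ArtificialNeuralNetwork

    # Small fully connected network built from the PT-MELT blocks.
    model = ArtificialNeuralNetwork(
        num_features=8,   # input dimension
        num_outputs=1,    # output dimension
        width=32,         # nodes per hidden layer
        depth=2,          # number of hidden layers
        act_fun="relu",
    )

    x = torch.randn(16, 8)  # batch of 16 samples
    y = model(x)            # assumed output shape: (16, 1)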
The following is a detailed description of the modules and subpackages in PT-MELT.
Blocks Module
- class ptmelt.blocks.BayesianBlock(num_points, perturbation_type='multiplicative', seed: int | None = None, **kwargs: Any)[source]
Bases: MELTBlock
Bayesian block for the MELT architecture using custom Bayesian layers.
- Parameters:
num_points (int) – Number of Monte Carlo sample points.
perturbation_type (str, optional) – Type of perturbation applied to the weights. Defaults to “multiplicative”.
seed (int, optional) – Random seed. Defaults to None.
**kwargs – Additional keyword arguments.
- class ptmelt.blocks.DefaultOutput(input_features: int, output_features: int, activation: str | None = 'linear', initializer: str | None = 'glorot_uniform', do_bayesian: bool | None = False, seed: int | None = None, **kwargs: Any)[source]
Bases: Module
Default output layer with a single dense layer and an optional activation function.
- Parameters:
input_features (int) – Number of input features.
output_features (int) – Number of output features.
activation (str, optional) – Activation function. Defaults to “linear”.
initializer (str, optional) – Weight initializer. Defaults to “glorot_uniform”.
do_bayesian (bool, optional) – Whether to use a Bayesian output layer. Defaults to False.
seed (int, optional) – Random seed. Defaults to None.
**kwargs – Additional keyword arguments.
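For illustration, a minimal construction sketch; the (batch, input_features) to (batch, output_features) mapping is an assumption based on the description above.

    import torch
    from ptmelt.blocks import DefaultOutput

    # Single dense output layer with the default linear activation.
    out_layer = DefaultOutput(input_features=32, output_features=1)

    h = torch.randn(16, 32)
    y = out_layer(h)  # assumed shape: (16, 1)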
- class ptmelt.blocks.DenseBlock(**kwargs: Any)[source]
Bases: MELTBlock
Dense block for the MELT architecture. The dense block consists of dense layers with optional activation, dropout, and batch normalization layers.
- Parameters:
**kwargs – Additional keyword arguments.
- class ptmelt.blocks.MELTBlock(input_features: int, node_list: List[int], activation: str | None = 'relu', dropout: float | None = 0.0, batch_norm: bool | None = False, batch_norm_type: str | None = 'ema', use_batch_renorm: bool | None = False, initializer: str | None = 'glorot_uniform', seed: int | None = None, **kwargs: Any)[source]
Bases: Module
Base class for a MELT block. Provides the building blocks for the MELT architecture and defines the common parameters for the MELT blocks, with optional activation, dropout, batch normalization, and batch renormalization layers.
- Parameters:
input_features (int) – Number of input features.
node_list (List[int]) – List of number of nodes in each layer.
activation (str, optional) – Activation function. Defaults to “relu”.
dropout (float, optional) – Dropout rate. Defaults to 0.0.
batch_norm (bool, optional) – Whether to use batch normalization. Defaults to False.
batch_norm_type (str, optional) – Type of batch normalization. Defaults to “ema”.
use_batch_renorm (bool, optional) – Whether to use batch renormalization. Defaults to False.
initializer (str, optional) – Weight initializer. Defaults to “glorot_uniform”.
seed (int, optional) – Random seed. Defaults to None.
**kwargs – Additional keyword arguments.
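Because DenseBlock forwards its keyword arguments to MELTBlock, a block can be configured entirely through the parameters above; a sketch (the output width matching the last node_list entry is an assumption).

    import torch
    from ptmelt.blocks import DenseBlock

    # Three dense layers of 64, 64, and 32 nodes with ReLU, dropout,
    # and batch normalization.
    block = DenseBlock(
        input_features=8,
        node_list=[64, 64, 32],
        activation="relu",
        dropout=0.1,
        batch_norm=True,
    )

    x = torch.randn(16, 8)
    h = block(x)  # assumed shape: (16, 32)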
- class ptmelt.blocks.MixtureDensityOutput(input_features: int, num_mixtures: int, num_outputs: int, activation: str | None = 'linear', initializer: str | None = 'glorot_uniform', seed: int | None = None, **kwargs: Any)[source]
Bases: Module
Output layer for mixture density networks. The output layer consists of three dense layers for the mixture coefficients, mean, and log variance of the output distribution.
- Parameters:
input_features (int) – Number of input features.
num_mixtures (int) – Number of mixture components.
num_outputs (int) – Number of output dimensions.
activation (str, optional) – Activation function. Defaults to “linear”.
initializer (str, optional) – Weight initializer. Defaults to “glorot_uniform”.
seed (int, optional) – Random seed. Defaults to None.
**kwargs – Additional keyword arguments.
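A construction sketch; the layer holds separate heads for the mixture coefficients, means, and log variances, and the exact layout of the returned parameters is not specified here.

    import torch
    from ptmelt.blocks import MixtureDensityOutput

    # Output head for a 5-component mixture over 2 output dimensions.
    mdn_head = MixtureDensityOutput(
        input_features=32,
        num_mixtures=5,
        num_outputs=2,
    )

    h = torch.randn(16, 32)
    params = mdn_head(h)  # mixture coefficients, means, log variances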
- class ptmelt.blocks.ResidualBlock(layers_per_block: int | None = 2, pre_activation: bool | None = False, post_add_activation: bool | None = False, **kwargs: Any)[source]
Bases: MELTBlock
Residual block for the MELT architecture. The residual block consists of residual connections between dense layers with optional activation, dropout, and batch normalization layers. Residual connections are added after every layers_per_block layers.
- Parameters:
layers_per_block (int, optional) – Number of layers per residual block. Defaults to 2.
pre_activation (bool, optional) – Whether to use pre-activation residual blocks. Defaults to False.
post_add_activation (bool, optional) – Whether to use post-addition activation. Defaults to False.
**kwargs – Additional keyword arguments.
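ResidualBlock inherits the MELTBlock parameters, so a sketch looks like the dense case with the residual options added; keeping a constant layer width avoids any dimension mismatch at the skip connections.

    import torch
    from ptmelt.blocks import ResidualBlock

    # Four dense layers with a skip connection added every 2 layers.
    block = ResidualBlock(
        input_features=32,
        node_list=[32, 32, 32, 32],
        activation="relu",
        layers_per_block=2,
    )

    x = torch.randn(16, 32)
    h = block(x)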
Layers Module
- class ptmelt.layers.AttentionPool(hidden_size)[source]
Bases: Module
Attention Pooling Layer.
- Parameters:
hidden_size (int) – Size of the hidden state from the RNN.
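A pooling sketch; the (batch, seq_len, hidden_size) input layout is an assumption based on the description.

    import torch
    from ptmelt.layers import AttentionPool

    pool = AttentionPool(hidden_size=64)

    # Hidden states from an RNN, assumed layout (batch, seq_len, hidden).
    states = torch.randn(16, 20, 64)
    pooled = pool(states)  # assumed shape: (16, 64)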
- class ptmelt.layers.MELTBatchNorm(num_features: int, eps: float | None = 1e-05, momentum: float | None = 0.1, affine: bool | None = True, track_running_stats: bool | None = True, average_type: str | None = 'ema')[source]
Bases: Module
Custom Batch Normalization Layer for PT-MELT.
Supports implementation of different types of moving averages for the batch norm statistics.
- Parameters:
num_features (int) – Number of features in the input tensor.
eps (float, optional) – Small value to avoid division by zero. Defaults to 1e-5.
momentum (float, optional) – Momentum for moving average. Defaults to 0.1.
affine (bool, optional) – Apply affine transformation. Defaults to True.
track_running_stats (bool, optional) – Track running statistics. Defaults to True.
average_type (str, optional) – Type of moving average. Defaults to “ema”.
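MELTBatchNorm mirrors the torch.nn.BatchNorm1d arguments and adds average_type; a drop-in sketch:

    import torch
    from ptmelt.layers import MELTBatchNorm

    # Batch normalization over 64 features using an exponential moving
    # average ("ema") for the running statistics.
    bn = MELTBatchNorm(num_features=64, momentum=0.1, average_type="ema")

    x = torch.randn(16, 64)
    y = bn(x)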
- class ptmelt.layers.MELTBatchRenorm(num_features: int, eps: float | None = 1e-05, momentum: float | None = 0.1, affine: bool | None = True, track_running_stats: bool | None = True, average_type: str | None = 'ema', rmax: float | None = 1.0, dmax: float | None = 0.0)[source]
Bases: MELTBatchNorm
Custom Batch Renormalization Layer for PT-MELT.
Supports implementation of different types of moving averages for the batch norm statistics.
- Parameters:
num_features (int) – Number of features in the input tensor.
eps (float, optional) – Small value to avoid division by zero. Defaults to 1e-5.
momentum (float, optional) – Momentum for moving average. Defaults to 0.1.
affine (bool, optional) – Apply affine transformation. Defaults to True.
track_running_stats (bool, optional) – Track running statistics. Defaults to True.
average_type (str, optional) – Type of moving average. Defaults to “ema”.
rmax (float, optional) – Maximum value for r. Defaults to 1.0.
dmax (float, optional) – Maximum value for d. Defaults to 0.0.
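Batch renormalization clamps its correction terms r and d; a construction sketch with loosened clipping (the [1/rmax, rmax] and [-dmax, dmax] clipping convention is assumed from the batch renormalization paper).

    from ptmelt.layers import MELTBatchRenorm

    # Renormalization over 64 features; r clipped to [1/3, 3] and
    # |d| <= 5 under the assumed convention.
    brn = MELTBatchRenorm(num_features=64, rmax=3.0, dmax=5.0)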
Losses Module
- class ptmelt.losses.MixtureDensityLoss(num_mixtures, num_outputs, mse_weight=1.0, reduction='mean')[source]
Bases: Module
Custom loss function for Mixture Density Network (MDN).
- Parameters:
num_mixtures (int) – Number of mixture components.
num_outputs (int) – Number of output dimensions.
mse_weight (float, optional) – Weight of the MSE component of the loss. Defaults to 1.0.
reduction (str, optional) – Reduction method for the loss. Defaults to “mean”.
- forward(y_pred, y_true)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
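A usage sketch pairing the loss with a model configured for the same number of mixtures; the layout of y_pred is whatever the MDN output head produces, so no structure is assumed here.

    from ptmelt.losses import MixtureDensityLoss

    # Loss for a 5-component MDN over 2 outputs.
    criterion = MixtureDensityLoss(num_mixtures=5, num_outputs=2,
                                   mse_weight=1.0)

    # Inside a training step:
    #   y_pred = model(x)            # MDN parameters from the model
    #   loss = criterion(y_pred, y)  # y: (batch, 2) targets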
Models Module
- class ptmelt.models.ArtificialNeuralNetwork(**kwargs)[source]
Bases: MELTModel
Artificial Neural Network (ANN) model.
- Parameters:
**kwargs – Additional keyword arguments.
- class ptmelt.models.BayesianNeuralNetwork(num_points: int | None = 1, do_aleatoric: bool | None = False, do_bayesian_output: bool | None = True, aleatoric_scale_factor: float | None = 0.05, scale_epsilon: float | None = 0.001, bayesian_mask: List[bool] | None = None, **kwargs)[source]
Bases: MELTModel
Bayesian Neural Network (BNN) model.
- Parameters:
num_points (int, optional) – Number of Monte Carlo samples. Defaults to 1.
do_aleatoric (bool, optional) – Flag to perform aleatoric output. Defaults to False.
do_bayesian_output (bool, optional) – Flag to perform Bayesian output. Defaults to True.
aleatoric_scale_factor (float, optional) – Scale factor for aleatoric uncertainty. Defaults to 5e-2.
scale_epsilon (float, optional) – Epsilon value for the scale of the aleatoric uncertainty. Defaults to 1e-3.
bayesian_mask (list, optional) – List of booleans to determine which layers are Bayesian and which are Dense. Defaults to None.
**kwargs – Additional keyword arguments.
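A construction sketch using only the documented arguments (num_features and num_outputs are forwarded to the MELTModel base class):

    from ptmelt.models import BayesianNeuralNetwork

    # BNN with 30 Monte Carlo samples and aleatoric uncertainty enabled.
    model = BayesianNeuralNetwork(
        num_features=8,
        num_outputs=1,
        num_points=30,
        do_aleatoric=True,
    )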
- class ptmelt.models.MELTModel(num_features: int, num_outputs: int, width: int | None = 32, depth: int | None = 2, act_fun: str | None = 'relu', dropout: float | None = 0.0, input_dropout: float | None = 0.0, batch_norm: bool | None = False, batch_norm_type: str | None = 'ema', use_batch_renorm: bool | None = False, output_activation: str | None = None, initializer: str | None = 'glorot_uniform', l1_reg: float | None = 0.0, l2_reg: float | None = 0.0, num_mixtures: int | None = 0, node_list: list | None = None, seed: int | None = None, **kwargs)[source]
Bases: Module
PT-MELT base model.
- Parameters:
num_features (int) – The number of input features.
num_outputs (int) – The number of output units.
width (int, optional) – The width of the hidden layers. Defaults to 32.
depth (int, optional) – The number of hidden layers. Defaults to 2.
act_fun (str, optional) – The activation function to use. Defaults to ‘relu’.
dropout (float, optional) – The dropout rate. Defaults to 0.0.
input_dropout (float, optional) – The input dropout rate. Defaults to 0.0.
batch_norm (bool, optional) – Whether to use batch normalization. Defaults to False.
batch_norm_type (str, optional) – The type of batch normalization to use. Defaults to ‘ema’.
use_batch_renorm (bool, optional) – Whether to use batch renormalization. Defaults to False.
output_activation (str, optional) – The activation function for the output layer. Defaults to None.
initializer (str, optional) – The weight initializer to use. Defaults to ‘glorot_uniform’.
l1_reg (float, optional) – The L1 regularization strength. Defaults to 0.0.
l2_reg (float, optional) – The L2 regularization strength. Defaults to 0.0.
num_mixtures (int, optional) – The number of mixture components for MDN. Defaults to 0.
node_list (list, optional) – List of nodes per layer; an alternative way to define the hidden layers in place of width and depth. Defaults to None.
seed (int, optional) – Random seed. Defaults to None.
**kwargs – Additional keyword arguments.
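These parameters are shared by every model that subclasses MELTModel; for example, an ANN defined with an explicit node_list instead of width and depth (a sketch):

    from ptmelt.models import ArtificialNeuralNetwork

    # Hidden layers of 64, 32, and 16 nodes, with light L2 regularization.
    model = ArtificialNeuralNetwork(
        num_features=8,
        num_outputs=1,
        node_list=[64, 32, 16],
        act_fun="relu",
        l2_reg=1e-4,
    )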
- fit(train_dl, val_dl, optimizer, criterion, num_epochs: int | None = 100, device: str | None = 'cpu', scheduler: _LRScheduler | None = None, stopping: bool | None = True, verbose=False)[source]
Perform the model training loop.
- Parameters:
train_dl (DataLoader) – The training data loader.
val_dl (DataLoader) – The validation data loader.
optimizer (Optimizer) – The optimizer to use.
criterion (Loss) – The loss function to use.
num_epochs (int, optional) – The number of epochs to train the model. Defaults to 100.
device (str, optional) – The device to use for training. Defaults to ‘cpu’.
scheduler (_LRScheduler, optional) – Learning rate scheduler to step during training. Defaults to None.
stopping (bool, optional) – Whether to use early stopping. Defaults to True.
verbose (bool, optional) – Whether to print training statistics. Defaults to False.
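A training-loop sketch using the helper methods documented below; the random tensors stand in for a real dataset, and the optimizer/scheduler names (‘adam’, ‘step’) and the forwarding of their keyword arguments are assumptions.

    import torch
    from torch.utils.data import DataLoader, TensorDataset
    from ptmelt.models import ArtificialNeuralNetwork

    model = ArtificialNeuralNetwork(num_features=8, num_outputs=1)

    # Illustrative random data in place of a real dataset.
    x, y = torch.randn(256, 8), torch.randn(256, 1)
    train_dl = DataLoader(TensorDataset(x[:200], y[:200]), batch_size=32)
    val_dl = DataLoader(TensorDataset(x[200:], y[200:]), batch_size=32)

    optimizer = model.get_optimizer("adam", lr=1e-3)  # name/kwargs assumed
    criterion = model.get_loss_fn(loss="mse")
    scheduler = model.get_scheduler("step", optimizer,  # name/kwargs assumed
                                    step_size=10, gamma=0.5)

    model.fit(train_dl, val_dl, optimizer, criterion, num_epochs=50,
              device="cpu", scheduler=scheduler, verbose=True)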
- get_loss_fn(loss: str | None = 'mse', reduction: str | None = 'mean', mse_weight: float | None = None)[source]
Get the loss function for the model. Used in the training loop.
- Parameters:
loss (str, optional) – The loss function to use. Defaults to ‘mse’.
reduction (str, optional) – The reduction method for the loss. Defaults to ‘mean’.
mse_weight (float, optional) – Weight of the MSE component when using the MDN loss. Defaults to None.
- get_optimizer(optimizer_name: str, **kwargs)[source]
Get the optimizer for the model. Used in the training loop.
- Parameters:
optimizer_name (str) – The name of the optimizer to use.
- get_scheduler(scheduler_name: str, optimizer, **kwargs)[source]
Get the learning rate scheduler for the model. Used in the training loop.
- Parameters:
scheduler_name (str) – The name of the scheduler to use.
optimizer – The optimizer to attach the scheduler to.
- l1_regularization(lambda_l1: float)[source]
Compute the L1 regularization term for use in the loss function.
- Parameters:
lambda_l1 (float) – The L1 regularization strength.
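The returned penalty can be added to the data loss before backpropagation; a sketch:

    import torch
    from ptmelt.models import ArtificialNeuralNetwork

    model = ArtificialNeuralNetwork(num_features=8, num_outputs=1)
    x, y = torch.randn(4, 8), torch.randn(4, 1)

    # Add the L1 penalty to the data loss, then backpropagate.
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss = loss + model.l1_regularization(lambda_l1=1e-5)
    loss.backward()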
- class ptmelt.models.RecurrentNeuralNetwork(rnn_type: str | None = 'lstm', return_sequences: bool | None = False, head_type: str = 'last', **kwargs)[source]
Bases: MELTModel
Recurrent Neural Network (RNN) model.
Bidirectional RNNs are not supported in this implementation, as it is intended for forecasting tasks.
- Parameters:
rnn_type (str, optional) – The type of RNN to use (‘rnn’, ‘lstm’, ‘gru’).
return_sequences (bool, optional) – Whether to return the full sequence or just the last output. Defaults to False.
head_type (str, optional) – Pooling strategy for the output head (e.g., ‘last’). Defaults to ‘last’.
**kwargs – Additional keyword arguments.
- create_output_layer()[source]
Override to use recurrent_out_dim as input_features instead of layer_width.
- forward(inputs: Tensor, lengths: Tensor | None = None)[source]
Perform the forward pass of the RNN. If lengths are provided, the input sequences will be packed and unpacked to handle variable-length sequences. Performs optional pooling based on head_type setting.
- Parameters:
inputs (torch.Tensor) – The input data.
lengths (torch.Tensor, optional) – The lengths of the sequences in the batch.
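A sketch of variable-length input handling; the (batch, seq_len, num_features) layout is an assumption, and num_features/num_outputs are forwarded to the MELTModel base class.

    import torch
    from ptmelt.models import RecurrentNeuralNetwork

    model = RecurrentNeuralNetwork(
        rnn_type="lstm",
        head_type="last",
        num_features=8,
        num_outputs=1,
    )

    # 16 sequences padded to length 20, with their true lengths.
    inputs = torch.randn(16, 20, 8)  # assumed (batch, seq, features)
    lengths = torch.randint(1, 21, (16,))
    y = model(inputs, lengths=lengths)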
- class ptmelt.models.ResidualNeuralNetwork(layers_per_block: int | None = 2, pre_activation: bool | None = True, post_add_activation: bool | None = False, **kwargs)[source]
Bases: MELTModel
Residual Neural Network (ResNet) model.
- Parameters:
layers_per_block (int, optional) – The number of layers per residual block. Defaults to 2.
pre_activation (bool, optional) – Whether to use pre-activation. Defaults to True.
post_add_activation (bool, optional) – Whether to use activation after addition. Defaults to False.
**kwargs – Additional keyword arguments.
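A construction sketch using only the documented arguments:

    from ptmelt.models import ResidualNeuralNetwork

    # Pre-activation ResNet with two layers per skip connection.
    model = ResidualNeuralNetwork(
        num_features=8,
        num_outputs=1,
        width=32,
        depth=4,
        layers_per_block=2,
        pre_activation=True,
    )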
NN Utils Module
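The nn_utils module provides helpers for retrieving activation functions and weight initializers by their string names (e.g., ‘relu’, ‘glorot_uniform’). The actual function names are not reproduced here; as an illustration of the string-to-callable pattern the module implements, a generic sketch (get_act is a hypothetical name, not necessarily the real API):

    import torch.nn as nn

    # Hypothetical illustration of name-based lookup, mirroring what
    # nn_utils provides for activations and initializers.
    _ACTIVATIONS = {"relu": nn.ReLU, "tanh": nn.Tanh, "sigmoid": nn.Sigmoid}

    def get_act(name: str) -> nn.Module:
        return _ACTIVATIONS[name.lower()]()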
Subpackages
In addition to the main modules, there are subpackages containing utility functions used throughout the package. The subpackages are:
- PT-MELT Utilities: Contains routines for data processing, model evaluation, visualization, and other general-purpose functions.