buildings_bench.models
Available models:
- Encoder-decoder time series transformer
- Persistence Ensemble (AveragePersistence)
- Previous Day Persistence (CopyLastDayPersistence)
- Previous Week Persistence (CopyLastWeekPersistence)
- Linear regression
- DLinear
The main entry point for loading a BuildingsBench model is model_factory().
model_factory
buildings_bench.models.model_factory(model_name, model_args)
Instantiates and returns a model for the benchmark.
Returns the model itself, the loss function to use, and the predict function.
The predict function should return a tuple of two tensors: (point predictions, prediction distribution parameters) where the distribution parameters may be, e.g., logits, or mean and variance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_name | str | Name of the model. | required |
model_args | Dict | The keyword arguments for the model. | required |
Returns:
model (torch.nn.Module): the instantiated model
loss (Callable): loss function
predict (Callable): predict function
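A minimal usage sketch (the model name and arguments here are illustrative, not canonical; the benchmark's configs define the real values):

```python
from buildings_bench.models import model_factory

# 'LinearRegression' and its args are placeholders; consult the benchmark
# configs for canonical model names and keyword arguments.
model, loss_fn, predict_fn = model_factory(
    'LinearRegression',
    {'context_len': 168, 'pred_len': 24, 'continuous_loads': True},
)
# predict_fn(x) -> (point predictions, distribution parameters)
```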
BaseModel
buildings_bench.models.base_model.BaseModel
Bases: Module, PyTorchModelHubMixin
Base class for all models.
__init__(context_len, pred_len, continuous_loads)
Init method for BaseModel.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
context_len | int | length of context window | required |
pred_len | int | length of prediction window | required |
continuous_loads | bool | whether to use continuous load values | required |
forward(x)
abstractmethod
Forward pass.
Expected keys in x:
- 'load': torch.Tensor of shape (batch_size, seq_len, 1)
- 'building_type': torch.LongTensor of shape (batch_size, seq_len, 1)
- 'day_of_year': torch.FloatTensor of shape (batch_size, seq_len, 1)
- 'hour_of_day': torch.FloatTensor of shape (batch_size, seq_len, 1)
- 'day_of_week': torch.FloatTensor of shape (batch_size, seq_len, 1)
- 'latitude': torch.FloatTensor of shape (batch_size, seq_len, 1)
- 'longitude': torch.FloatTensor of shape (batch_size, seq_len, 1)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Dict | dictionary of input tensors | required |
Returns: predictions, distribution parameters (Tuple[torch.Tensor, torch.Tensor]): outputs
load_from_checkpoint(checkpoint_path)
abstractmethod
Describes how to load the model from checkpoint_path.
loss(x, y)
abstractmethod
A function for computing the loss.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Tensor | preds of shape (batch_size, seq_len, 1) | required |
y | Tensor | targets of shape (batch_size, seq_len, 1) | required |
Returns: loss (torch.Tensor): scalar loss
predict(x)
abstractmethod
A function for making a forecast on x with the model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Dict | dictionary of input tensors | required |
Returns:
predictions (torch.Tensor): of shape (batch_size, pred_len, 1)
distribution_parameters (torch.Tensor): of shape (batch_size, pred_len, -1)
unfreeze_and_get_parameters_for_finetuning()
abstractmethod
For transfer learning.
- Set requires_grad=True for parameters being fine-tuned (if necessary)
- Return the parameters that should be fine-tuned.
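Putting the interface together, a minimal conforming subclass might look like the sketch below. It is illustrative only (not part of the library) and assumes BaseModel stores context_len and pred_len as attributes, as its init parameters suggest:

```python
import torch.nn.functional as F
from buildings_bench.models.base_model import BaseModel

class MeanForecast(BaseModel):
    """Illustrative only: forecasts the mean of the load context window."""

    def __init__(self, context_len=168, pred_len=24, continuous_loads=True):
        super().__init__(context_len, pred_len, continuous_loads)

    def forward(self, x):
        context = x['load'][:, :self.context_len, :]   # [batch, context_len, 1]
        mean = context.mean(dim=1, keepdim=True)       # [batch, 1, 1]
        preds = mean.expand(-1, self.pred_len, -1)     # [batch, pred_len, 1]
        return preds, preds                            # point preds, distribution params

    def loss(self, x, y):
        return F.mse_loss(x, y)                        # x: preds, y: targets

    def predict(self, x):
        return self.forward(x)

    def load_from_checkpoint(self, checkpoint_path):
        pass  # stateless baseline; nothing to load

    def unfreeze_and_get_parameters_for_finetuning(self):
        return self.parameters()
```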
LoadForecastingTransformer
buildings_bench.models.transformers.LoadForecastingTransformer
Bases: BaseModel
An encoder-decoder time series Transformer. Based on PyTorch nn.Transformer.
- Uses masking in the decoder to prevent the model from peeking into the future
- Uses N(0, 0.02) for weight initialization
- Trains with teacher forcing (i.e. the target is used as the input to the decoder)
- continuous_loads: if True, predicts continuous target values directly; if False, predicts a categorical distribution over quantized load values
__init__(context_len=168, pred_len=24, vocab_size=2274, num_encoder_layers=3, num_decoder_layers=3, d_model=256, nhead=8, dim_feedforward=256, dropout=0.0, activation='gelu', continuous_loads=False, continuous_head='mse', ignore_spatial=False, weather_inputs=None)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
context_len | int | length of the input sequence. | 168 |
pred_len | int | length of the output sequence. | 24 |
vocab_size | int | number of quantized load values in the entire vocabulary. | 2274 |
num_encoder_layers | int | number of encoder layers. | 3 |
num_decoder_layers | int | number of decoder layers. | 3 |
d_model | int | number of expected features in the encoder/decoder inputs. | 256 |
nhead | int | number of heads in the multi-head attention models. | 8 |
dim_feedforward | int | dimension of the feedforward network model. | 256 |
dropout | float | dropout value. | 0.0 |
activation | str | the activation function of encoder/decoder intermediate layer, relu or gelu. | 'gelu' |
continuous_loads | bool | whether inputs are continuous/to train the model to predict continuous values. | False |
continuous_head | str | 'mse' or 'gaussian_nll'. | 'mse' |
ignore_spatial | bool | whether to ignore the spatial features. | False |
weather_inputs | List[str] | list of weather features to use. | None |
forward(x)
Forward pass of the time series transformer.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Dict | dictionary of input tensors. | required |
Returns:
logits (torch.Tensor): [batch_size, pred_len, vocab_size] if not continuous_loads; [batch_size, pred_len, 1] if continuous_loads and continuous_head == 'mse'; [batch_size, pred_len, 2] if continuous_loads and continuous_head == 'gaussian_nll'.
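A minimal forward-pass sketch with dummy inputs. Assumptions: the dict spans the full context-plus-prediction window for teacher forcing, and the time features are scaled to [-1, +1] as TimeSeriesSinusoidalPeriodicEmbedding below expects; the scaling of the remaining features is illustrative:

```python
import torch
from buildings_bench.models.transformers import LoadForecastingTransformer

model = LoadForecastingTransformer(continuous_loads=True, continuous_head='gaussian_nll')

batch_size, seq_len = 4, 168 + 24  # context_len + pred_len

def time_feature():
    return torch.rand(batch_size, seq_len, 1) * 2 - 1  # scaled to [-1, +1]

x = {
    'load': torch.rand(batch_size, seq_len, 1),
    'building_type': torch.zeros(batch_size, seq_len, 1, dtype=torch.long),
    'day_of_year': time_feature(),
    'hour_of_day': time_feature(),
    'day_of_week': time_feature(),
    'latitude': torch.rand(batch_size, seq_len, 1) * 2 - 1,
    'longitude': torch.rand(batch_size, seq_len, 1) * 2 - 1,
}
out = model(x)  # [batch_size, pred_len, 2]: mean/variance under the 'gaussian_nll' head
```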
generate_sample(x, temperature=1.0, greedy=False, num_samples=1)
Sample from the conditional distribution.
Uses the output of the decoder at each prediction step as the input to the next decoder step. Implements greedy decoding and random temperature-controlled sampling.
Top-k sampling and nucleus sampling are deprecated.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Dict | dictionary of input tensors | required |
temperature | float | temperature for sampling | 1.0 |
greedy | bool | whether to use greedy decoding | False |
num_samples | int | number of samples to generate | 1 |
Returns:
Name | Type | Description |
---|---|---|
predictions | Tensor | of shape [batch_size, pred_len, 1], or [batch_size, num_samples, pred_len] if num_samples > 1. |
distribution_parameters | Tensor | of shape [batch_size, pred_len, 1]. Not returned if sampling. |
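A usage sketch following the shapes above. How the return value is packed when sampling is inferred from the "Not returned if sampling" note, so treat the unpacking as an assumption:

```python
# Greedy decoding: deterministic, returns point predictions and distribution parameters.
preds, dist_params = model.generate_sample(x, greedy=True)  # preds: [batch_size, pred_len, 1]

# Temperature-controlled sampling: per the note above, only predictions are returned.
samples = model.generate_sample(x, temperature=0.9, num_samples=10)  # [batch_size, 10, pred_len]
```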
TokenEmbedding
buildings_bench.models.transformers.TokenEmbedding
Bases: Module
Helper module that converts a tensor of input indices into the corresponding tensor of token embeddings.
__init__(vocab_size, emb_size)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
vocab_size | int | number of quantized load values in the entire vocabulary. | required |
emb_size | int | embedding size. | required |
PositionalEncoding
buildings_bench.models.transformers.PositionalEncoding
Bases: Module
Helper Module that adds positional encoding to the token embedding to introduce a notion of order within a time-series.
__init__(emb_size, dropout, maxlen=500)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
emb_size | int | embedding size. | required |
dropout | float | dropout rate. | required |
maxlen | int | maximum possible length of the incoming time series. | 500 |
TimeSeriesSinusoidalPeriodicEmbedding
buildings_bench.models.transformers.TimeSeriesSinusoidalPeriodicEmbedding
Bases: Module
This module produces a sinusoidal periodic embedding for a sequence of values in [-1, +1].
__init__(embedding_dim)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
embedding_dim | int | embedding size. | required |
forward(x)
x is expected to be [batch_size, seqlen, 1].
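One common way such a periodic embedding is realized, sketched here for intuition. This is an assumption, not necessarily this module's exact implementation, and it assumes an even embedding_dim:

```python
import torch
import torch.nn as nn

class SinusoidalPeriodicEmbeddingSketch(nn.Module):
    """Illustrative: maps scalars in [-1, +1] to paired sin/cos features."""

    def __init__(self, embedding_dim: int):
        super().__init__()
        self.linear = nn.Linear(1, embedding_dim // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch_size, seqlen, 1], e.g. hour-of-day scaled to [-1, +1]
        z = self.linear(x * torch.pi)
        return torch.cat([torch.sin(z), torch.cos(z)], dim=-1)  # [batch, seqlen, embedding_dim]
```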
ZeroEmbedding
buildings_bench.models.transformers.ZeroEmbedding
Bases: Module
Outputs zeros of the desired output dim.
__init__(embedding_dim)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
embedding_dim | int | embedding size. | required |
forward(x)
x is expected to be [batch_size, seqlen, 1].
Persistence Ensemble
buildings_bench.models.persistence.AveragePersistence
Previous Day Persistence
buildings_bench.models.persistence.CopyLastDayPersistence
Previous Week Persistence
buildings_bench.models.persistence.CopyLastWeekPersistence
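The persistence baselines require no training. Conceptually, the previous-day variant simply repeats the most recent 24 hours of the load context as the forecast (an illustrative sketch of the idea, not the library's code):

```python
import torch

def copy_last_day_sketch(load_context: torch.Tensor) -> torch.Tensor:
    """load_context: [batch_size, context_len, 1] hourly loads; returns a 24-hour forecast."""
    return load_context[:, -24:, :]  # yesterday's profile becomes today's prediction
```

The previous-week variant does the same with an offset of 168 hours, and the ensemble variant presumably aggregates over several past days in the context window; see the classes themselves for details.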
Linear Regression
buildings_bench.models.linear_regression.LinearRegression
Bases: BaseModel
Linear regression model that does direct forecasting.
It has one weight W and one bias b. The output is computed as y = Wx + b, where W is a matrix of shape [pred_len, context_len].
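A quick dimensional check of y = Wx + b (a sketch of the mapping described above, not the class itself, which wraps it in the BaseModel interface):

```python
import torch

context_len, pred_len, batch_size = 168, 24, 4
W = torch.randn(pred_len, context_len)    # one direct map from context to every horizon
b = torch.randn(pred_len)
x = torch.randn(batch_size, context_len)  # flattened load context
y = x @ W.T + b                           # [batch_size, pred_len]: all horizons at once
```

Because the full prediction window is produced in one shot, direct forecasting avoids the error accumulation of autoregressive decoding.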