buildings_bench.models

Available models:

  • Encoder-decoder time series transformer
  • Persistence Ensemble (AveragePersistence)
  • Previous Day Persistence (CopyLastDayPersistence)
  • Previous Week Persistence (CopyLastWeekPersistence)
  • Linear regression
  • DLinear

The main entry point for loading a BuildingsBench model is model_factory().


model_factory

buildings_bench.models.model_factory(model_name, model_args)

Instantiates and returns a model for the benchmark.

Returns the model itself, the loss function to use, and the predict function.

The predict function should return a tuple of two tensors: (point predictions, prediction distribution parameters), where the distribution parameters may be, e.g., logits, or a mean and variance.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `model_name` | `str` | Name of the model. | *required* |
| `model_args` | `Dict` | The keyword arguments for the model. | *required* |

Returns:

- model (torch.nn.Module): the instantiated model
- loss (Callable): the loss function
- predict (Callable): the predict function
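
For example, a model can be loaded as follows. This is a minimal sketch: the model name string and the exact keys in model_args are assumptions, so consult the benchmark's configuration files for the registered names and arguments.

```python
from buildings_bench.models import model_factory

# Hypothetical call: the model name and the model_args keys are assumptions;
# the registered names and arguments live in the benchmark's config files.
model, loss_fn, predict_fn = model_factory(
    'AveragePersistence',
    {'context_len': 168, 'pred_len': 24, 'continuous_loads': True},
)
# preds, dist_params = predict_fn(x)  # x: dictionary of input tensors (see BaseModel)
```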


BaseModel

buildings_bench.models.base_model.BaseModel

Bases: Module, PyTorchModelHubMixin

Base class for all models.

__init__(context_len, pred_len, continuous_loads)

Init method for BaseModel.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `context_len` | `int` | Length of the context window. | *required* |
| `pred_len` | `int` | Length of the prediction window. | *required* |
| `continuous_loads` | `bool` | Whether to use continuous load values. | *required* |

forward(x) abstractmethod

Forward pass.

Expected keys in x:

- 'load': torch.Tensor of shape (batch_size, seq_len, 1)
- 'building_type': torch.LongTensor of shape (batch_size, seq_len, 1)
- 'day_of_year': torch.FloatTensor of shape (batch_size, seq_len, 1)
- 'hour_of_day': torch.FloatTensor of shape (batch_size, seq_len, 1)
- 'day_of_week': torch.FloatTensor of shape (batch_size, seq_len, 1)
- 'latitude': torch.FloatTensor of shape (batch_size, seq_len, 1)
- 'longitude': torch.FloatTensor of shape (batch_size, seq_len, 1)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Dict` | Dictionary of input tensors. | *required* |

Returns: predictions and distribution parameters (Tuple[torch.Tensor, torch.Tensor]).
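
A minimal sketch of the expected input dictionary, with random placeholder values; the [-1, +1] scaling of the time features is an assumption based on the documented input range of TimeSeriesSinusoidalPeriodicEmbedding below.

```python
import torch

batch_size, seq_len = 32, 192  # e.g., 168 context hours + 24 prediction hours
x = {
    'load':          torch.randn(batch_size, seq_len, 1),
    'building_type': torch.zeros(batch_size, seq_len, 1, dtype=torch.long),
    'day_of_year':   torch.rand(batch_size, seq_len, 1) * 2 - 1,  # assumed [-1, 1]
    'hour_of_day':   torch.rand(batch_size, seq_len, 1) * 2 - 1,
    'day_of_week':   torch.rand(batch_size, seq_len, 1) * 2 - 1,
    'latitude':      torch.randn(batch_size, seq_len, 1),
    'longitude':     torch.randn(batch_size, seq_len, 1),
}
# preds, dist_params = model(x)  # any concrete BaseModel subclass
```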

load_from_checkpoint(checkpoint_path) abstractmethod

Describes how to load the model from checkpoint_path.

loss(x, y) abstractmethod

A function for computing the loss.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Predictions of shape (batch_size, seq_len, 1). | *required* |
| `y` | `Tensor` | Targets of shape (batch_size, seq_len, 1). | *required* |

Returns: loss (torch.Tensor): scalar loss

predict(x) abstractmethod

A function for making a forecast on x with the model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Dict` | Dictionary of input tensors. | *required* |

Returns:

- predictions (torch.Tensor): of shape (batch_size, pred_len, 1)
- distribution_parameters (torch.Tensor): of shape (batch_size, pred_len, -1)

unfreeze_and_get_parameters_for_finetuning() abstractmethod

For transfer learning.

  • Set requires_grad=True for parameters being fine-tuned (if necessary)
  • Return the parameters that should be fine-tuned.
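
Putting the interface together, here is a minimal sketch of a custom model; the internals are placeholders rather than a recommended architecture.

```python
import torch
from buildings_bench.models.base_model import BaseModel

class SimpleDirectModel(BaseModel):
    """Minimal sketch of the abstract interface; internals are placeholders."""

    def __init__(self, context_len: int = 168, pred_len: int = 24,
                 continuous_loads: bool = True):
        super().__init__(context_len, pred_len, continuous_loads)
        self.head = torch.nn.Linear(context_len, pred_len)

    def forward(self, x):
        context = x['load'][:, :self.head.in_features, 0]  # [batch, context_len]
        preds = self.head(context).unsqueeze(-1)           # [batch, pred_len, 1]
        return preds, preds  # point preds stand in for distribution params

    def loss(self, x, y):
        return torch.nn.functional.mse_loss(x, y)

    def predict(self, x):
        return self.forward(x)

    def load_from_checkpoint(self, checkpoint_path):
        self.load_state_dict(torch.load(checkpoint_path, map_location='cpu'))

    def unfreeze_and_get_parameters_for_finetuning(self):
        for p in self.parameters():
            p.requires_grad = True
        return list(self.parameters())
```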

LoadForecastingTransformer

buildings_bench.models.transformers.LoadForecastingTransformer

Bases: BaseModel

An encoder-decoder time series Transformer. Based on PyTorch nn.Transformer.

  • Uses masking in the decoder to prevent the model from peeking into the future
  • Uses N(0, 0.02) for weight initialization
  • Trains with teacher forcing (i.e. the target is used as the input to the decoder)
  • continuous_loads: if True, the model directly predicts continuous target values; if False, it predicts a categorical distribution over quantized load values
__init__(context_len=168, pred_len=24, vocab_size=2274, num_encoder_layers=3, num_decoder_layers=3, d_model=256, nhead=8, dim_feedforward=256, dropout=0.0, activation='gelu', continuous_loads=False, continuous_head='mse', ignore_spatial=False, weather_inputs=None)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `context_len` | `int` | Length of the input sequence. | `168` |
| `pred_len` | `int` | Length of the output sequence. | `24` |
| `vocab_size` | `int` | Number of quantized load values in the entire vocabulary. | `2274` |
| `num_encoder_layers` | `int` | Number of encoder layers. | `3` |
| `num_decoder_layers` | `int` | Number of decoder layers. | `3` |
| `d_model` | `int` | Number of expected features in the encoder/decoder inputs. | `256` |
| `nhead` | `int` | Number of heads in the multi-head attention models. | `8` |
| `dim_feedforward` | `int` | Dimension of the feedforward network model. | `256` |
| `dropout` | `float` | Dropout value. | `0.0` |
| `activation` | `str` | Activation function of the encoder/decoder intermediate layer, `relu` or `gelu`. | `'gelu'` |
| `continuous_loads` | `bool` | Whether inputs are continuous and the model is trained to predict continuous values. | `False` |
| `continuous_head` | `str` | `'mse'` or `'gaussian_nll'`. | `'mse'` |
| `ignore_spatial` | `bool` | Whether to ignore the spatial features. | `False` |
| `weather_inputs` | `List[str]` | List of weather features to use. | `None` |

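A minimal instantiation sketch using the defaults above:

```python
from buildings_bench.models.transformers import LoadForecastingTransformer

# Default token-based variant: categorical prediction over quantized loads.
model = LoadForecastingTransformer(context_len=168, pred_len=24, vocab_size=2274)

# Continuous variant with a Gaussian head: predicts a mean and variance.
gaussian_model = LoadForecastingTransformer(
    continuous_loads=True, continuous_head='gaussian_nll')
```
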
forward(x)

Forward pass of the time series transformer.

Parameters:

Name Type Description Default
x Dict

dictionary of input tensors.

required

Returns: logits (torch.Tensor):

  • [batch_size, pred_len, vocab_size] if not continuous_loads
  • [batch_size, pred_len, 1] if continuous_loads and continuous_head == 'mse'
  • [batch_size, pred_len, 2] if continuous_loads and continuous_head == 'gaussian_nll'

generate_sample(x, temperature=1.0, greedy=False, num_samples=1)

Sample from the conditional distribution.

Use output of decoder at each prediction step as input to the next decoder step. Implements greedy decoding and random temperature-controlled sampling.

Top-k sampling and nucleus sampling are deprecated.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Dict` | Dictionary of input tensors. | *required* |
| `temperature` | `float` | Temperature for sampling. | `1.0` |
| `greedy` | `bool` | Whether to use greedy decoding. | `False` |
| `num_samples` | `int` | Number of samples to generate. | `1` |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| `predictions` | `Tensor` | Of shape [batch_size, pred_len, 1], or [batch_size, num_samples, pred_len] if num_samples > 1. |
| `distribution_parameters` | `Tensor` | Of shape [batch_size, pred_len, 1]. Not returned when sampling. |
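
A usage sketch, where x is a dictionary of input tensors as described above:

```python
# Deterministic greedy decoding.
greedy_out = model.generate_sample(x, greedy=True)

# Temperature-controlled sampling; per the Returns note, only the
# predictions tensor is returned when sampling.
samples = model.generate_sample(x, temperature=0.9, num_samples=10)
# samples: [batch_size, 10, pred_len]
```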

TokenEmbedding

buildings_bench.models.transformers.TokenEmbedding

Bases: Module

Helper Module that converts a tensor of input indices into the corresponding tensor of token embeddings.

__init__(vocab_size, emb_size)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `vocab_size` | `int` | Number of quantized load values in the entire vocabulary. | *required* |
| `emb_size` | `int` | Embedding size. | *required* |
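
An illustrative re-implementation is shown below; the sqrt(emb_size) scaling is an assumption borrowed from the standard PyTorch nn.Transformer tutorial, and the actual module may differ.

```python
import math
import torch
from torch import nn

class TokenEmbeddingSketch(nn.Module):
    """Illustrative only; the real TokenEmbedding may differ in details."""
    def __init__(self, vocab_size: int, emb_size: int):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_size)
        self.emb_size = emb_size

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: [batch_size, seq_len] of quantized load indices
        return self.embedding(tokens.long()) * math.sqrt(self.emb_size)
```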

PositionalEncoding

buildings_bench.models.transformers.PositionalEncoding

Bases: Module

Helper Module that adds positional encoding to the token embedding to introduce a notion of order within a time series.

__init__(emb_size, dropout, maxlen=500)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `emb_size` | `int` | Embedding size. | *required* |
| `dropout` | `float` | Dropout rate. | *required* |
| `maxlen` | `int` | Maximum possible length of the incoming time series. | `500` |
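
The standard sinusoidal scheme the class name suggests looks like this (an illustrative sketch assuming an even emb_size and batch-first layout; the actual implementation may differ):

```python
import math
import torch
from torch import nn

class PositionalEncodingSketch(nn.Module):
    """Illustrative sinusoidal positional encoding."""
    def __init__(self, emb_size: int, dropout: float, maxlen: int = 500):
        super().__init__()
        pos = torch.arange(maxlen).unsqueeze(1)
        den = torch.exp(-math.log(10000.0) * torch.arange(0, emb_size, 2) / emb_size)
        pe = torch.zeros(maxlen, emb_size)
        pe[:, 0::2] = torch.sin(pos * den)
        pe[:, 1::2] = torch.cos(pos * den)
        self.register_buffer('pe', pe.unsqueeze(0))  # [1, maxlen, emb_size]
        self.dropout = nn.Dropout(dropout)

    def forward(self, token_embedding: torch.Tensor) -> torch.Tensor:
        # token_embedding: [batch_size, seq_len, emb_size]
        return self.dropout(token_embedding + self.pe[:, :token_embedding.size(1)])
```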

TimeSeriesSinusoidalPeriodicEmbedding

buildings_bench.models.transformers.TimeSeriesSinusoidalPeriodicEmbedding

Bases: Module

This module produces a sinusoidal periodic embedding for a sequence of values in [-1, +1].

__init__(embedding_dim)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `embedding_dim` | `int` | Embedding size. | *required* |

forward(x)

x is expected to have shape [batch_size, seq_len, 1], with values in [-1, +1].
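
A usage sketch; the exact preprocessing that maps raw time features into [-1, +1] is an assumption.

```python
import torch
from buildings_bench.models.transformers import TimeSeriesSinusoidalPeriodicEmbedding

emb = TimeSeriesSinusoidalPeriodicEmbedding(embedding_dim=32)

hours = torch.arange(24).float()     # hour-of-day 0..23
scaled = 2 * hours / 23 - 1          # hypothetical scaling into [-1, 1]
out = emb(scaled.view(1, 24, 1))     # expected shape: [1, 24, 32]
```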

ZeroEmbedding

buildings_bench.models.transformers.ZeroEmbedding

Bases: Module

Outputs zeros of the desired output dimension.

__init__(embedding_dim)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `embedding_dim` | `int` | Embedding size. | *required* |

forward(x)

x is expected to have shape [batch_size, seq_len, 1].


Persistence Ensemble

buildings_bench.models.persistence.AveragePersistence

Bases: BaseModel

Predict each hour as the average of the same hour over each previous day in the context window.

Previous Day Persistence

buildings_bench.models.persistence.CopyLastDayPersistence

Bases: BaseModel

Predict each hour as the same hour from the previous day.

Previous Week Persistence

buildings_bench.models.persistence.CopyLastWeekPersistence

Bases: BaseModel

Predict each hour as the same hour from the previous week.
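
For intuition, last-day persistence amounts to copying the trailing 24 hours of the context forward; a minimal sketch assuming hourly data and pred_len=24:

```python
import torch

def copy_last_day_persistence(load: torch.Tensor) -> torch.Tensor:
    """Forecast each of the next 24 hours as the value at the same hour
    of the previous day, i.e. the last 24 hours of the context window.

    load: [batch_size, context_len, 1] -> [batch_size, 24, 1]
    """
    return load[:, -24:, :]
```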


Linear Regression

buildings_bench.models.linear_regression.LinearRegression

Bases: BaseModel

Linear regression model that does direct forecasting: all pred_len hours are predicted at once from the context window, rather than autoregressively.

It has a single weight matrix W of shape [pred_len, context_len] and a bias b. The output is computed as y = Wx + b.
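
This is equivalent to a single fully connected layer over the load history, as in the sketch below:

```python
import torch
from torch import nn

context_len, pred_len = 168, 24
direct = nn.Linear(context_len, pred_len)  # W: [pred_len, context_len], bias b

context = torch.randn(8, context_len)      # one row of load history per building
forecast = direct(context)                 # y = Wx + b: all 24 hours at once
```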


DLinear

buildings_bench.models.dlinear_regression.DLinearRegression

Bases: BaseModel

Decomposition-Linear (DLinear) model. The input series is decomposed into a trend component, extracted with a moving average, and a seasonal remainder; a separate linear layer maps each component from the context window to the prediction window, and the two outputs are summed to form the direct forecast.
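
A minimal sketch of the DLinear idea, with moving-average decomposition and per-component linear heads; the actual DLinearRegression implementation may differ in padding and details.

```python
import torch
from torch import nn

class DLinearSketch(nn.Module):
    """Illustrative DLinear-style model (Zeng et al., 2023)."""

    def __init__(self, context_len: int = 168, pred_len: int = 24,
                 kernel_size: int = 25):
        super().__init__()
        # Moving average extracts the trend; odd kernel + padding keeps length.
        self.moving_avg = nn.AvgPool1d(kernel_size, stride=1,
                                       padding=kernel_size // 2,
                                       count_include_pad=False)
        self.trend = nn.Linear(context_len, pred_len)
        self.seasonal = nn.Linear(context_len, pred_len)

    def forward(self, load: torch.Tensor) -> torch.Tensor:
        x = load.squeeze(-1)                                # [batch, context_len]
        trend = self.moving_avg(x.unsqueeze(1)).squeeze(1)  # [batch, context_len]
        seasonal = x - trend
        out = self.trend(trend) + self.seasonal(seasonal)   # [batch, pred_len]
        return out.unsqueeze(-1)                            # [batch, pred_len, 1]
```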