buildings_bench.models

Available models:

  • Encoder-decoder time series transformer
  • Persistence Ensemble (AveragePersistence)
  • Previous Day Persistence (CopyLastDayPersistence)
  • Previous Week Persistence (CopyLastWeekPersistence)
  • Linear regression
  • DLinear

The main entry point for loading a BuildingsBench model is model_factory().


model_factory

buildings_bench.models.model_factory(model_name: str, model_args: Dict) -> Tuple[torch.nn.Module, Callable, Callable]

Instantiates and returns a model for the benchmark.

Returns the model itself, the loss function to use, and the predict function.

The predict function should return a tuple of two tensors: (point predictions, prediction distribution parameters), where the distribution parameters may be, e.g., logits or a mean and variance.

Parameters:

  • model_name (str): Name of the model. Required.
  • model_args (Dict): The keyword arguments for the model. Required.

Returns:

  • model (torch.nn.Module): the instantiated model.
  • loss (Callable): the loss function.
  • predict (Callable): the predict function.
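
A minimal usage sketch follows. The registry name 'LoadForecastingTransformer' and the argument keys shown are illustrative assumptions; they must match a model actually registered with the benchmark.

```python
from buildings_bench.models import model_factory

# Hedged sketch: the model name and args below are assumptions and must
# match an entry in the benchmark's model registry / config files.
model, loss_fn, predict_fn = model_factory(
    'LoadForecastingTransformer',
    {'context_len': 168, 'pred_len': 24, 'continuous_loads': True},
)
```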


BaseModel

buildings_bench.models.base_model.BaseModel

Bases: nn.Module, PyTorchModelHubMixin

Base class for all models.

__init__(context_len, pred_len, continuous_loads)

Init method for BaseModel.

Parameters:

  • context_len (int): Length of the context window. Required.
  • pred_len (int): Length of the prediction window. Required.
  • continuous_loads (bool): Whether to use continuous load values. Required.
forward(x: Dict) -> Tuple[torch.Tensor, torch.Tensor] (abstract method)

Forward pass.

Expected keys in x
  • 'load': torch.Tensor of shape (batch_size, seq_len, 1)
  • 'building_type': torch.LongTensor of shape (batch_size, seq_len, 1)
  • 'day_of_year': torch.FloatTensor of shape (batch_size, seq_len, 1)
  • 'hour_of_day': torch.FloatTensor of shape (batch_size, seq_len, 1)
  • 'day_of_week': torch.FloatTensor of shape (batch_size, seq_len, 1)
  • 'latitude': torch.FloatTensor of shape (batch_size, seq_len, 1)
  • 'longitude': torch.FloatTensor of shape (batch_size, seq_len, 1)

Parameters:

  • x (Dict): Dictionary of input tensors. Required.

Returns:

  • Tuple[torch.Tensor, torch.Tensor]: (predictions, distribution parameters).
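
A sketch of a conforming input dictionary, using the documented keys and shapes (the value ranges are illustrative assumptions; the sinusoidal time features are scaled to [-1, 1] to match TimeSeriesSinusoidalPeriodicEmbedding below):

```python
import torch

batch_size, seq_len = 8, 192  # e.g., 168 context hours + 24 target hours

x = {
    'load':          torch.randn(batch_size, seq_len, 1),
    'building_type': torch.zeros(batch_size, seq_len, 1, dtype=torch.long),
    'day_of_year':   torch.rand(batch_size, seq_len, 1) * 2 - 1,
    'hour_of_day':   torch.rand(batch_size, seq_len, 1) * 2 - 1,
    'day_of_week':   torch.rand(batch_size, seq_len, 1) * 2 - 1,
    'latitude':      torch.randn(batch_size, seq_len, 1),
    'longitude':     torch.randn(batch_size, seq_len, 1),
}
# Then, for any BaseModel subclass `model`:
# predictions, distribution_params = model(x)
```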

loss(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor (abstract method)

A function for computing the loss.

Parameters:

  • x (torch.Tensor): predictions of shape (batch_size, seq_len, 1). Required.
  • y (torch.Tensor): targets of shape (batch_size, seq_len, 1). Required.

Returns:

  • loss (torch.Tensor): scalar loss.

predict(x: Dict) -> Tuple[torch.Tensor, torch.Tensor] (abstract method)

A function for making a forecast on x with the model.

Parameters:

  • x (Dict): Dictionary of input tensors. Required.

Returns:

  • predictions (torch.Tensor): of shape (batch_size, pred_len, 1).
  • distribution_parameters (torch.Tensor): of shape (batch_size, pred_len, -1).

unfreeze_and_get_parameters_for_finetuning() (abstract method)

For transfer learning.

  • Set requires_grad=True for parameters being fine-tuned (if necessary)
  • Return the parameters that should be fine-tuned.
load_from_checkpoint(checkpoint_path: Union[str, Path]) (abstract method)

Describes how to load the model from checkpoint_path.
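
Putting the abstract interface together, here is a hedged sketch of a custom subclass (a toy direct linear forecaster, not a model from the library):

```python
import torch
from pathlib import Path
from typing import Dict, Tuple, Union
from buildings_bench.models.base_model import BaseModel

class MyLinearForecaster(BaseModel):
    """Toy BaseModel subclass that fills in every abstract method."""

    def __init__(self, context_len: int = 168, pred_len: int = 24,
                 continuous_loads: bool = True):
        super().__init__(context_len, pred_len, continuous_loads)
        self.linear = torch.nn.Linear(context_len, pred_len)

    def forward(self, x: Dict) -> Tuple[torch.Tensor, torch.Tensor]:
        # Forecast from the context portion of the load channel.
        context = x['load'][:, :self.context_len, 0]  # (B, context_len)
        preds = self.linear(context).unsqueeze(-1)    # (B, pred_len, 1)
        return preds, preds  # point preds double as "distribution params" here

    def loss(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        return torch.nn.functional.mse_loss(x, y)

    def predict(self, x: Dict) -> Tuple[torch.Tensor, torch.Tensor]:
        return self.forward(x)

    def unfreeze_and_get_parameters_for_finetuning(self):
        for p in self.parameters():
            p.requires_grad = True
        return self.parameters()

    def load_from_checkpoint(self, checkpoint_path: Union[str, Path]):
        self.load_state_dict(torch.load(checkpoint_path, map_location='cpu'))
```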


LoadForecastingTransformer

buildings_bench.models.transformers.LoadForecastingTransformer

Bases: BaseModel

An encoder-decoder time series Transformer. Based on PyTorch nn.Transformer.

  • Uses masking in the decoder to prevent the model from peeking into the future
  • Uses N(0, 0.02) for weight initialization
  • Trains with teacher forcing (i.e. the target is used as the input to the decoder)
  • If continuous_loads is True, predicts continuous target values; if False, predicts a categorical distribution over quantized load values
__init__(context_len: int = 168, pred_len: int = 24, vocab_size: int = 2274, num_encoder_layers: int = 3, num_decoder_layers: int = 3, d_model: int = 256, nhead: int = 8, dim_feedforward: int = 256, dropout: float = 0.0, activation: str = 'gelu', continuous_loads: bool = False, continuous_head: str = 'mse', ignore_spatial: bool = False, weather_inputs: List[str] = None)

Parameters:

  • context_len (int): Length of the input sequence. Default: 168.
  • pred_len (int): Length of the output sequence. Default: 24.
  • vocab_size (int): Number of quantized load values in the entire vocabulary. Default: 2274.
  • num_encoder_layers (int): Number of encoder layers. Default: 3.
  • num_decoder_layers (int): Number of decoder layers. Default: 3.
  • d_model (int): Number of expected features in the encoder/decoder inputs. Default: 256.
  • nhead (int): Number of heads in the multi-head attention models. Default: 8.
  • dim_feedforward (int): Dimension of the feedforward network model. Default: 256.
  • dropout (float): Dropout value. Default: 0.0.
  • activation (str): Activation function of the encoder/decoder intermediate layer, 'relu' or 'gelu'. Default: 'gelu'.
  • continuous_loads (bool): Whether inputs are continuous, i.e., whether to train the model to predict continuous values. Default: False.
  • continuous_head (str): 'mse' or 'gaussian_nll'. Default: 'mse'.
  • ignore_spatial (bool): Whether to ignore the spatial features. Default: False.
  • weather_inputs (List[str]): List of weather features to use. Default: None.
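
For orientation, a hedged instantiation sketch; the values shown are the documented defaults except where noted:

```python
from buildings_bench.models.transformers import LoadForecastingTransformer

model = LoadForecastingTransformer(
    context_len=168,                 # one week of hourly context
    pred_len=24,                     # day-ahead forecast
    continuous_loads=True,           # regression instead of tokenized loads
    continuous_head='gaussian_nll',  # predict mean and variance
)
```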
forward(x)

Forward pass of the time series transformer.

Parameters:

  • x (Dict): Dictionary of input tensors. Required.

Returns:

  • logits (torch.Tensor): [batch_size, pred_len, vocab_size] if not continuous_loads; [batch_size, pred_len, 1] if continuous_loads and continuous_head == 'mse'; [batch_size, pred_len, 2] if continuous_loads and continuous_head == 'gaussian_nll'.

generate_sample(x, temperature=1.0, greedy=False, num_samples=1)

Sample from the conditional distribution.

Uses the output of the decoder at each prediction step as the input to the next decoder step. Implements greedy decoding and random temperature-controlled sampling.

Top-k sampling and nucleus sampling are deprecated.

Parameters:

  • x (Dict): Dictionary of input tensors. Required.
  • temperature (float): Temperature for sampling. Default: 1.0.
  • greedy (bool): Whether to use greedy decoding. Default: False.
  • num_samples (int): Number of samples to generate. Default: 1.

Returns:

  • predictions (torch.Tensor): of shape [batch_size, pred_len, 1], or [batch_size, num_samples, pred_len] if num_samples > 1.
  • distribution_parameters (torch.Tensor): of shape [batch_size, pred_len, 1]. Not returned if sampling.
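
A hedged usage sketch, assuming `model` is a LoadForecastingTransformer trained on quantized loads and `x` is an input dictionary as described for BaseModel.forward:

```python
import torch

def sample_forecasts(model, x, n=100):
    """Sketch: draw day-ahead trajectories from the decoder."""
    model.eval()
    with torch.no_grad():
        # Greedy decoding: deterministic predictions plus distribution params.
        preds, dist_params = model.generate_sample(x, greedy=True)
        # Temperature sampling with num_samples > 1 returns only the samples,
        # shaped [batch_size, num_samples, pred_len] per the docs above.
        samples = model.generate_sample(x, temperature=0.9, num_samples=n)
    return preds, samples
```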

TokenEmbedding

buildings_bench.models.transformers.TokenEmbedding

Bases: nn.Module

Helper module that converts a tensor of input indices into the corresponding tensor of token embeddings.

__init__(vocab_size: int, emb_size: int)

Parameters:

  • vocab_size (int): Number of quantized load values in the entire vocabulary. Required.
  • emb_size (int): Embedding size. Required.
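
For intuition, a sketch of a typical token-embedding module; the sqrt(emb_size) scaling is borrowed from the standard Transformer recipe and is an assumption here, not confirmed library code:

```python
import math
import torch
from torch import nn

class TokenEmbeddingSketch(nn.Module):
    def __init__(self, vocab_size: int, emb_size: int):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_size)
        self.emb_size = emb_size

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # Look up embeddings and rescale (assumed, per the usual recipe).
        return self.embedding(tokens.long()) * math.sqrt(self.emb_size)
```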

PositionalEncoding

buildings_bench.models.transformers.PositionalEncoding

Bases: nn.Module

Helper module that adds a positional encoding to the token embeddings to introduce a notion of order within a time series.

__init__(emb_size: int, dropout: float, maxlen: int = 500)

Parameters:

  • emb_size (int): Embedding size. Required.
  • dropout (float): Dropout rate. Required.
  • maxlen (int): Maximum possible length of the incoming time series. Default: 500.
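
A sketch of the standard sinusoidal positional encoding this module presumably implements (batch-first layout [batch_size, seq_len, emb_size] assumed):

```python
import math
import torch
from torch import nn

class PositionalEncodingSketch(nn.Module):
    def __init__(self, emb_size: int, dropout: float, maxlen: int = 500):
        super().__init__()
        # Precompute sin/cos table over all positions up to maxlen.
        den = torch.exp(-torch.arange(0, emb_size, 2, dtype=torch.float)
                        * math.log(10000) / emb_size)
        pos = torch.arange(0, maxlen, dtype=torch.float).unsqueeze(1)
        pe = torch.zeros(maxlen, emb_size)
        pe[:, 0::2] = torch.sin(pos * den)
        pe[:, 1::2] = torch.cos(pos * den)
        self.dropout = nn.Dropout(dropout)
        self.register_buffer('pe', pe.unsqueeze(0))  # [1, maxlen, emb_size]

    def forward(self, token_embedding: torch.Tensor) -> torch.Tensor:
        seq_len = token_embedding.size(1)
        return self.dropout(token_embedding + self.pe[:, :seq_len])
```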

TimeSeriesSinusoidalPeriodicEmbedding

buildings_bench.models.transformers.TimeSeriesSinusoidalPeriodicEmbedding

Bases: nn.Module

This module produces a sinusoidal periodic embedding for a sequence of values in [-1, +1].

__init__(embedding_dim: int) -> None

Parameters:

  • embedding_dim (int): Embedding size. Required.
forward(x: torch.Tensor) -> torch.Tensor

x is expected to be [batch_size, seqlen, 1].
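
One plausible construction, offered as an assumption rather than the library's confirmed code: map each value v in [-1, 1] to (sin(pi v), cos(pi v)) on the unit circle, so that -1 and +1 embed identically, then project to the embedding dimension:

```python
import math
import torch
from torch import nn

class PeriodicEmbeddingSketch(nn.Module):
    def __init__(self, embedding_dim: int):
        super().__init__()
        self.proj = nn.Linear(2, embedding_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch_size, seq_len, 1] with values in [-1, 1]
        periodic = torch.cat([torch.sin(math.pi * x),
                              torch.cos(math.pi * x)], dim=-1)
        return self.proj(periodic)  # [batch_size, seq_len, embedding_dim]
```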

ZeroEmbedding

buildings_bench.models.transformers.ZeroEmbedding

Bases: nn.Module

Outputs zeros of the desired output dim.

__init__(embedding_dim: int)

Parameters:

  • embedding_dim (int): Embedding size. Required.
forward(x: torch.Tensor) -> torch.Tensor

x is expected to be [batch_size, seqlen, 1].


Persistence Ensemble

buildings_bench.models.persistence.AveragePersistence

Bases: BaseModel

Predict each hour as the average of that hour over each previous day in the context.

Previous Day Persistence

buildings_bench.models.persistence.CopyLastDayPersistence

Bases: BaseModel

Predict each hour as the same hour from the previous day.

Previous Week Persistence

buildings_bench.models.persistence.CopyLastWeekPersistence

Bases: BaseModel

Predict each hour as the same hour from the previous week.
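
A hedged illustration of these three baselines; the index arithmetic assumes hourly data, a context of at least 168 hours (whole days), and a 24-hour horizon:

```python
import torch

def copy_last_day(load: torch.Tensor) -> torch.Tensor:
    # load: [batch_size, context_len, 1]; repeat the final 24 context hours.
    return load[:, -24:, :]

def copy_last_week(load: torch.Tensor) -> torch.Tensor:
    # Same 24 hours, taken from one week (168 hours) back in the context.
    return load[:, -168:-144, :]

def average_persistence(load: torch.Tensor) -> torch.Tensor:
    # Average the same hour of day over each full day in the context.
    days = load[:, :, 0].unfold(1, 24, 24)  # [batch_size, n_days, 24]
    return days.mean(dim=1).unsqueeze(-1)   # [batch_size, 24, 1]
```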


Linear Regression

buildings_bench.models.linear_regression.LinearRegression

Bases: BaseModel

Linear regression model that does direct forecasting.

It has a single weight matrix W and a bias b. The output is computed as y = Wx + b, where W has shape [pred_len, context_len].
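
In PyTorch terms this is a single nn.Linear from the context to the full horizon (direct, non-autoregressive forecasting); a minimal sketch:

```python
import torch
from torch import nn

context_len, pred_len = 168, 24
linear = nn.Linear(context_len, pred_len)  # W: [pred_len, context_len], b: [pred_len]

context = torch.randn(8, context_len)      # [batch_size, context_len]
forecast = linear(context)                 # y = Wx + b -> [batch_size, pred_len]
```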


DLinear

buildings_bench.models.dlinear_regression.DLinearRegression

Bases: BaseModel

Decomposition-Linear (DLinear) model: decomposes the input series into a moving-average trend component and a seasonal remainder, forecasts each with its own linear layer, and sums the two forecasts.
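
A hedged sketch of the DLinear idea (kernel size and padding are illustrative choices, not the library's confirmed values):

```python
import torch
from torch import nn

class DLinearSketch(nn.Module):
    def __init__(self, context_len: int = 168, pred_len: int = 24, kernel: int = 25):
        super().__init__()
        # Moving average extracts the trend; stride 1 + padding keeps length.
        self.avg = nn.AvgPool1d(kernel_size=kernel, stride=1, padding=kernel // 2)
        self.trend_head = nn.Linear(context_len, pred_len)
        self.seasonal_head = nn.Linear(context_len, pred_len)

    def forward(self, load: torch.Tensor) -> torch.Tensor:
        # load: [batch_size, context_len]
        trend = self.avg(load.unsqueeze(1)).squeeze(1)  # moving-average trend
        seasonal = load - trend                         # remainder
        return self.trend_head(trend) + self.seasonal_head(seasonal)
```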