buildings_bench.tokenizer

Tokenizer Quick Start

Instantiate a LoadQuantizer

import os
from pathlib import Path

from buildings_bench.tokenizer import LoadQuantizer

# Raises KeyError if the BUILDINGS_BENCH environment variable is not set
transform_path = Path(os.environ['BUILDINGS_BENCH']) / 'metadata' / 'transforms'

load_transform = LoadQuantizer(
    with_merge=True,  # Default vocabulary has merged KMeans centroids
    num_centroids=2274,  # Default vocabulary has 2,274 tokens
    # args.device is the device string supplied by the calling script
    device='cuda:0' if 'cuda' in args.device else 'cpu')

# Load the saved faiss KMeans state from disk
load_transform.load(transform_path)

Quantize a load time series

batch['load'] = load_transform.transform(batch['load'])

Dequantize transformer predictions

# predictions are a Tensor of shape [batch_size, pred_len, 1] of quantized values
# distribution_params is a Tensor of shape [batch_size, pred_len, num_centroids] of logits
predictions, distribution_params = model.predict(batch)

# Dequantize the predictions
predictions = load_transform.undo_transform(predictions)

Extract the categorical distribution

# First, apply softmax to the logits to normalize them into a categorical distribution
distribution_params = torch.softmax(distribution_params, dim=-1)

# The merged centroid values are the load values corresponding
# to each token. Note that the merged centroids are already sorted
# in increasing order.

# If the quantizer was created with with_merge=True:
load_values = load_transform.merged_centroids
# Otherwise: load_values = load_transform.kmeans.centroids.squeeze()

# Now distribution_params[..., i] is the probability
# assigned to load_values[i].
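
Once the probabilities and `load_values` are aligned, point forecasts and quantiles follow directly. The sketch below is a dependency-free illustration for a single forecast step. The four-token vocabulary, the logits, and the helper names `softmax` and `quantile` are hypothetical and not part of `buildings_bench`. It relies on the load values being sorted in increasing order, as noted above.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 4-token vocabulary: load values in kWh, sorted ascending.
load_values = [0.0, 0.5, 1.0, 2.0]
# Hypothetical logits for one forecast step.
logits = [0.0, 1.0, 2.0, 0.0]

probs = softmax(logits)

# Mean of the categorical distribution: sum_i p_i * load_values[i].
expected_load = sum(p * v for p, v in zip(probs, load_values))

# Mode of the distribution: the load value of the most probable token.
most_likely_load = load_values[max(range(len(probs)), key=probs.__getitem__)]

def quantile(q):
    """Smallest load value whose cumulative probability reaches q."""
    cumulative = 0.0
    for p, v in zip(probs, load_values):
        cumulative += p
        if cumulative >= q:
            return v
    return load_values[-1]

median_load = quantile(0.5)
```

With real model output the same arithmetic applies along the last dimension of `distribution_params`, for example as a weighted sum against `load_values` in torch.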

LoadQuantizer

`buildings_bench.tokenizer.LoadQuantizer`

Quantize load time series with KMeans. Optionally merge centroids that are within a threshold of each other.

`__init__(seed: int = 1, num_centroids: int = 2274, with_merge: bool = False, merge_threshold: float = 0.01, device: str = 'cpu')`

Parameters:

Name	Type	Description	Default
`seed`	`int`	Random seed.	`1`
`num_centroids`	`int`	Number of KMeans centroids.	`2274`
`with_merge`	`bool`	Whether to merge centroids that are within `merge_threshold` of each other.	`False`
`merge_threshold`	`float`	Threshold for merging centroids, in kWh.	`0.01`
`device`	`str`	Device to run on: cpu or cuda (for example `'cuda:0'`).	`'cpu'`

`train(sample: np.ndarray) -> None`

Fit KMeans to a subset of the data.

Optionally, merge centroids that are within a threshold.

Parameters:

Name	Type	Description	Default
`sample`	`np.ndarray`	shape [num_samples, 1]	required
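
The optional merge step can be pictured with a small one-dimensional example. This is a minimal sketch under an assumed merge rule: sort the centroids, group consecutive values that lie within the threshold of the first value in the group, and replace each group with its mean. The function name `merge_close_centroids` is hypothetical, and the rule used inside `LoadQuantizer` may differ in detail.

```python
def merge_close_centroids(centroids, threshold=0.01):
    """Greedy 1-D merge of sorted centroids (assumed rule, for illustration).

    Consecutive sorted values stay in the same group while they are within
    `threshold` of the group's first value. Each group becomes its mean.
    """
    merged = []
    group = []
    for c in sorted(centroids):
        if group and c - group[0] > threshold:
            # Close the current group and start a new one.
            merged.append(sum(group) / len(group))
            group = []
        group.append(c)
    if group:
        merged.append(sum(group) / len(group))
    return merged

# Hypothetical unsorted KMeans centroids (kWh).
centroids = [0.50, 0.000, 0.004, 1.20, 0.505, 0.009]
merged = merge_close_centroids(centroids, threshold=0.01)
# Six centroids collapse into three, and the result is sorted ascending.
```

Merging shrinks the vocabulary where KMeans placed several centroids almost on top of each other, which is why the merged vocabulary can be smaller than the number of centroids KMeans was fit with.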

`transform(sample: Union[np.ndarray, torch.Tensor]) -> Union[np.ndarray, torch.Tensor]`

Quantize a sample of load values into a sequence of indices.

Parameters:

Name	Type	Description	Default
`sample`	`Union[np.ndarray, torch.Tensor]`	Load values of shape (n, 1) or (b, n, 1). A numpy array if device is cpu, or a torch Tensor if device is cuda.	required

Returns:

Name	Type	Description
`sample`	`Union[np.ndarray, torch.Tensor]`	Centroid indices of shape (n, 1) or (b, n, 1).

`undo_transform(sample: Union[np.ndarray, torch.Tensor]) -> Union[np.ndarray, torch.Tensor]`

Dequantize a sample of integer indices into a sequence of load values.

Parameters:

Name	Type	Description	Default
`sample`	`Union[np.ndarray, torch.Tensor]`	Integer indices of shape (n, 1) or (b, n, 1). A numpy array if device is cpu, or a torch Tensor if device is cuda.	required

Returns:

Name	Type	Description
`sample`	`Union[np.ndarray, torch.Tensor]`	Load values of shape (n, 1) or (b, n, 1).
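
Conceptually, `transform` maps each load value to the index of its nearest centroid, and `undo_transform` looks each index up in the centroid table. The sketch below illustrates that round trip with the standard library only. The helper names `quantize` and `dequantize` and the four-value vocabulary are hypothetical, and the binary search stands in for the faiss nearest-centroid search that the library loads from disk.

```python
from bisect import bisect_left

def quantize(values, centroids):
    """Map each value to the index of its nearest centroid.

    `centroids` must be sorted in increasing order.
    """
    indices = []
    for v in values:
        j = bisect_left(centroids, v)
        if j == 0:
            indices.append(0)                   # below the smallest centroid
        elif j == len(centroids):
            indices.append(len(centroids) - 1)  # above the largest centroid
        else:
            left, right = centroids[j - 1], centroids[j]
            # Pick the closer neighbour; ties go to the lower index.
            indices.append(j - 1 if v - left <= right - v else j)
    return indices

def dequantize(indices, centroids):
    """Replace each index with the load value of its centroid."""
    return [centroids[i] for i in indices]

# Hypothetical vocabulary and load values (kWh).
centroids = [0.0, 0.5, 1.0, 2.0]
loads = [0.1, 0.8, 1.4, 5.0]

tokens = quantize(loads, centroids)       # [0, 2, 2, 3]
restored = dequantize(tokens, centroids)  # [0.0, 1.0, 1.0, 2.0]
```

The round trip is lossy: each restored value is the nearest centroid, so values beyond the largest centroid (5.0 here) are clipped to it.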
