buildings_bench.transforms
BoxCoxTransform
buildings_bench.transforms.BoxCoxTransform
A class that computes and applies the Box-Cox transform to data.
__init__(max_datapoints=1000000)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
max_datapoints | int | If the number of datapoints is greater than this, subsample. | 1000000 |
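The subsampling rule for `max_datapoints` can be illustrated with a minimal NumPy sketch. The helper name `subsample`, the `seed` argument, and the choice of uniform sampling without replacement are assumptions made for illustration; the library's actual strategy may differ.

```python
import numpy as np

def subsample(data: np.ndarray, max_datapoints: int = 1_000_000, seed: int = 0) -> np.ndarray:
    """Flatten (n, 1) or (b, n, 1) data to (N, 1) and keep at most max_datapoints rows.

    Hypothetical helper: uniform sampling without replacement is an assumption.
    """
    flat = data.reshape(-1, 1)
    if flat.shape[0] > max_datapoints:
        rng = np.random.default_rng(seed)
        idx = rng.choice(flat.shape[0], size=max_datapoints, replace=False)
        flat = flat[idx]
    return flat

batch = np.arange(24, dtype=float).reshape(2, 12, 1)  # shape (b, n, 1)
small = subsample(batch, max_datapoints=10)           # 24 points > 10, so subsample
full = subsample(batch, max_datapoints=100)           # 24 points <= 100, keep all
```

Subsampling before fitting keeps the cost of estimating the transform parameters bounded on very large datasets.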
load(saved_path)
Load the Box-Cox transform
save(output_path)
Save the Box-Cox transform
train(data)
Train the Box-Cox transform on the data with sklearn.preprocessing.PowerTransformer.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | array | of shape (n, 1) or (b,n,1) | required |
transform(sample)
Transform a sample via Box-Cox. This does not run on the GPU, so the input and output are NumPy arrays.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample | ndarray | of shape (n, 1) or (b,n,1) | required |
Returns:
Name | Type | Description |
---|---|---|
transformed_sample | ndarray | of shape (n, 1) or (b,n,1) |
undo_transform(sample)
Undo the transformation of a sample via Box-Cox
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample | Union[ndarray, LongTensor] | of shape (n, 1) or (b,n,1). numpy if device is cpu or torch Tensor if device is cuda. | required |
Returns:
Name | Type | Description |
---|---|---|
unscaled_sample | Union[ndarray, LongTensor] | of shape (n, 1) or (b,n,1). |
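To make the transform/undo pair concrete, here is a minimal NumPy sketch of the textbook Box-Cox formula and its inverse for a fixed exponent. The function names and the value of `lmbda` are illustrative assumptions. The class itself fits its parameters with `sklearn.preprocessing.PowerTransformer`, which may also apply additional standardization that this sketch does not reproduce.

```python
import numpy as np

def box_cox(x, lmbda):
    """Textbook Box-Cox: (x**lmbda - 1) / lmbda, or log(x) when lmbda == 0."""
    x = np.asarray(x, dtype=float)
    if lmbda == 0:
        return np.log(x)
    return (np.power(x, lmbda) - 1.0) / lmbda

def inv_box_cox(y, lmbda):
    """Inverse of box_cox for the same lmbda."""
    y = np.asarray(y, dtype=float)
    if lmbda == 0:
        return np.exp(y)
    return np.power(lmbda * y + 1.0, 1.0 / lmbda)

loads = np.array([[0.5], [1.0], [2.0], [8.0]])  # shape (n, 1), strictly positive
lmbda = 0.25  # illustrative value; a fitted transform would estimate this from data
transformed = box_cox(loads, lmbda)
restored = inv_box_cox(transformed, lmbda)      # recovers `loads` up to rounding
```

Box-Cox is only defined for strictly positive inputs, so zero or negative values would need to be handled before applying this formula.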
StandardScalerTransform
buildings_bench.transforms.StandardScalerTransform
A class that standardizes data by removing the mean and scaling to unit variance.
__init__(max_datapoints=1000000, device='cpu')
Parameters:
Name | Type | Description | Default |
---|---|---|---|
max_datapoints | int | If the number of datapoints is greater than this, subsample. | 1000000 |
device | str | 'cpu' or 'cuda' | 'cpu' |
load(saved_path)
Load the StandardScaler transform
save(output_path)
Save the StandardScaler transform
train(data)
Train the StandardScaler transform on the data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | array | of shape (n, 1) or (b,n,1) | required |
transform(sample)
Transform a sample via StandardScaler
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample | Union[ndarray, LongTensor] | of shape (n, 1) or (b,n,1) | required |
Returns:
Name | Type | Description |
---|---|---|
transformed_samples | Tensor | of shape (n, 1) or (b,n,1) |
undo_transform(sample)
Undo the transformation of a sample via StandardScaler
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample | ndarray | of shape (n, 1) or (b,n,1); a torch.Tensor of the same shape is also accepted | required |
Returns:
Name | Type | Description |
---|---|---|
unscaled_sample | Tensor | of shape (n, 1) or (b,n,1) |
undo_transform_std(scaled_std)
Undo transform for standard deviation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
scaled_std | Tensor | of shape (n, 1) or (b,n,1) | required |
Returns:
Name | Type | Description |
---|---|---|
unscaled_std | Tensor | of shape (n, 1) or (b,n,1) |
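The relationship between `transform`, `undo_transform`, and `undo_transform_std` can be sketched with plain NumPy. This is a minimal illustration, not the class's implementation: it uses NumPy arrays instead of torch Tensors, and it assumes that undoing the transform for a standard deviation means multiplying by the training standard deviation only, since a spread is not affected by the mean shift.

```python
import numpy as np

# Statistics that a trained scaler would hold (computed here from toy data).
train = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])  # shape (n, 1)
mean, std = train.mean(), train.std()

def transform(x):
    """Standardize: remove the mean and scale to unit variance."""
    return (x - mean) / std

def undo_transform(z):
    """Map standardized values back to the original scale."""
    return z * std + mean

def undo_transform_std(scaled_std):
    """Rescale a standard deviation: multiply by std, do not add the mean (assumption)."""
    return scaled_std * std

scaled = transform(train)
restored = undo_transform(scaled)
unit_std_in_original_units = undo_transform_std(np.array([[1.0]]))
```

The separate method for standard deviations matters when a model predicts a mean and a standard deviation in scaled space: the mean needs the full inverse, while the standard deviation only needs the scale factor.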
LatLonTransform
buildings_bench.transforms.LatLonTransform
Pre-processes lat/lon data by standard normalization, using statistics from the Buildings-900K training set.
transform(puma_id)
Look up a PUMA ID's normalized Lat/Lon centroid.
This is used in the Buildings-900K Dataset to look up a lat/lon for each building's PUMA.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
puma_id | str | PUMA ID | required |
Returns:
Name | Type | Description |
---|---|---|
centroid | ndarray | of shape (1,2) |
transform_latlon(latlon)
Transform a raw Lat/Lon sample into a normalized Lat/Lon sample
Parameters:
Name | Type | Description | Default |
---|---|---|---|
latlon | ndarray | of shape (2,). | required |
Returns:
Name | Type | Description |
---|---|---|
transformed_latlon | ndarray | of shape (2,). |
undo_transform(normalized_latlon)
Undo the transformation of a sample
Parameters:
Name | Type | Description | Default |
---|---|---|---|
normalized_latlon | ndarray | of shape (n, 2) or (b,n,2). | required |
Returns:
Name | Type | Description |
---|---|---|
unnormalized_latlon | ndarray | of shape (n, 2) or (b,n,2). |
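The shapes accepted by `transform_latlon` and `undo_transform` follow naturally from NumPy broadcasting, as the sketch below shows. The mean and standard deviation values are made-up placeholders, not the Buildings-900K statistics, and the functions are standalone stand-ins rather than the class's methods.

```python
import numpy as np

# Hypothetical statistics for illustration only; the real values come from
# the Buildings-900K training set and are not reproduced here.
LATLON_MEAN = np.array([38.0, -95.0])
LATLON_STD = np.array([5.0, 15.0])

def transform_latlon(latlon):
    """Normalize a raw (lat, lon) pair of shape (2,)."""
    return (latlon - LATLON_MEAN) / LATLON_STD

def undo_transform(normalized_latlon):
    """Invert the normalization; broadcasts over (n, 2) or (b, n, 2)."""
    return normalized_latlon * LATLON_STD + LATLON_MEAN

raw = np.array([39.74, -104.99])        # shape (2,)
normalized = transform_latlon(raw)      # shape (2,)
batch = np.tile(normalized, (2, 3, 1))  # shape (b, n, 2) = (2, 3, 2)
restored = undo_transform(batch)        # every row equals `raw`
```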
TimestampTransform
buildings_bench.transforms.TimestampTransform
Extract timestamp features from a Pandas timestamp Series.
__init__(is_leap_year=False)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
is_leap_year | bool | Whether the year of the building data is a leap year or not. | False |
transform(timestamp_series)
Extract timestamp features from a Pandas timestamp Series.
- Day of week (0-6)
- Day of year (0-364)
- Hour of day (0-23)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
timestamp_series | DataFrame | of shape (n,) or (b,n) | required |
Returns:
Name | Type | Description |
---|---|---|
time_features | ndarray | of shape (n,3) or (b,n,3) |
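The three features listed above can be extracted as in the following sketch. It assumes plain `datetime` objects rather than a Pandas Series to stay dependency-light, the helper name `extract_time_features` is hypothetical, and the column order simply follows the list above. Any normalization applied by the class is not shown.

```python
from datetime import datetime, timedelta
import numpy as np

def extract_time_features(timestamps):
    """Return an (n, 3) integer array: day of week, day of year, hour of day."""
    rows = [
        (
            ts.weekday(),                # 0 = Monday ... 6 = Sunday
            ts.timetuple().tm_yday - 1,  # shift 1-based day of year to 0-based
            ts.hour,                     # 0-23
        )
        for ts in timestamps
    ]
    return np.array(rows, dtype=np.int64)

start = datetime(2018, 1, 1)  # a Monday
hourly = [start + timedelta(hours=h) for h in range(48)]
features = extract_time_features(hourly)  # shape (48, 3)
```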
undo_transform(time_features)
Convert normalized time features back to original time features
Parameters:
Name | Type | Description | Default |
---|---|---|---|
time_features | ndarray | of shape (n, 3) or (b,n,3) | required |
Returns:
Name | Type | Description |
---|---|---|
unnormalized_time_features | ndarray | of shape (n, 3) or (b,n,3) |
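A round trip between raw and normalized time features might look like the sketch below. The scaling to the range [-1, 1], the column order (day of week, day of year, hour of day), and the function names are all assumptions made for illustration; the documentation above does not state the exact normalization the class uses.

```python
import numpy as np

def _max_values(is_leap_year=False):
    # Largest raw value per column: day of week, day of year, hour of day.
    return np.array([6.0, 365.0 if is_leap_year else 364.0, 23.0])

def normalize_time_features(raw, is_leap_year=False):
    """Scale each column linearly to [-1, 1] (assumed normalization)."""
    return 2.0 * raw / _max_values(is_leap_year) - 1.0

def undo_time_features(normalized, is_leap_year=False):
    """Invert normalize_time_features; broadcasts over (n, 3) or (b, n, 3)."""
    return (normalized + 1.0) / 2.0 * _max_values(is_leap_year)

raw = np.array([[0.0, 0.0, 0.0],
                [3.0, 182.0, 12.0],
                [6.0, 364.0, 23.0]])          # shape (n, 3)
normalized = normalize_time_features(raw)
restored = undo_time_features(normalized)     # recovers `raw` up to rounding
```

The `is_leap_year` flag only changes the divisor for the day-of-year column, so the same flag must be used for both directions of the round trip.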