Data

Given the complexity of the ComStock software workflow, and the big-data skill set and computing hardware it requires, the recommended pathway for professionals and researchers is to work with the pre-created results rather than run the ComStock modeling tool. This section describes how to access ComStock data and lists the published datasets.


Published Datasets

These datasets describe the timeseries energy consumption of the U.S. commercial building stock at the end-use level. For details on how they were created and validated, please see the project’s final report. See the Data Access Platforms, Structure and Contents section for more details about the data.

Each End Use Savings Shape dataset release is introduced with a webinar presentation. Access the webinar recordings and slides below:

ComStock dataset releases are summarized in the following table with links for accessing the aggregate results.

| | ComStock End Use Savings Shape 2023 Release 2 - 2018 Weather | ComStock End Use Savings Shape 2023 Release 1 - 2018 Weather | ComStock End Use Load Profiles - 2018 Weather | ComStock End Use Load Profiles - Typical Weather |
| --- | --- | --- | --- | --- |
| OEDI Name | 2023/comstock_amy2018_release_2 | 2023/comstock_amy2018_release_1 | 2021/comstock_amy2018_release_1 | 2021/comstock_tmy3_release_1 |
| Data Viewer Links (Annual and Timeseries Energy) | by_state | by_state | by_state, by_puma_northeast, by_puma_midwest, by_puma_south, by_puma_west | by_state, by_puma_northeast, by_puma_midwest, by_puma_south, by_puma_west |
| Data Table with Characteristics and Annual Energy Use | metadata | metadata | metadata | metadata |
| OpenEI Data Lake | suppl_data_dict | suppl_data_dict | suppl_data_dict | suppl_data_dict |
| Publication Date | Sept-23 | March-23 | Oct-21 | Oct-21 |
| Release # | 2023_2 | 2023_1 | 2021_1 | 2021_1 |
| Building Stock Represented | U.S. commercial sector circa 2018 | U.S. commercial sector circa 2018 | U.S. commercial sector circa 2018 | U.S. commercial sector circa 2018 |
| Upgrades Available | 1. HP RTU, Electric Backup; 2. HP RTU, Original Heating Fuel Backup; 3. VRF with DOAS; 4. DOAS HP Minisplits; 5. HP Boiler, Electric Backup; 6. HP Boiler, Gas Backup; 8. Demand Control Ventilation; 9. Energy Recovery; 10. LED Lighting; 11. Wall Insulation; 12. Roof Insulation; 13. Secondary Windows; 14. Window Film; 15. New Windows; 16. Package 1, Wall & Roof Insulation + New Windows; 17. Package 2, LED Lighting + HP RTU or HP Boilers; 18. Package 3, Package 1 + Package 2 | 1. HP-RTU; 2. DOAS HP Minisplits; 3. Heat Pump Boiler; 5. LED Lighting; 6. Exterior Wall Insulation; 7. Roof Insulation; 8. Secondary Windows; 9. Window Film; 10. New Windows | None | None |
| Weather Year | amy2018 | amy2018 | amy2018 | tmy3 |

Data Access Platforms, Structure and Contents

At the most fundamental level, the ComStock dataset is a collection of end-use load profiles of approximately 350,000 building energy models. The output of each building energy model is 1 year of energy consumption in 15-minute intervals, separated into end-use categories.
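To give a sense of scale, the figures above can be turned into a rough row count. The short Python sketch below assumes a 365-day year (2018 is not a leap year) and uses the approximate model count of 350,000 quoted above, so the total is an order-of-magnitude estimate rather than an exact size of any published release.

```python
# Rough size estimate for the full timeseries dataset.
MODELS = 350_000          # approximate model count stated above
INTERVALS_PER_HOUR = 4    # 15-minute intervals
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365       # assumption: non-leap year such as 2018

intervals_per_model = INTERVALS_PER_HOUR * HOURS_PER_DAY * DAYS_PER_YEAR
total_rows = MODELS * intervals_per_model

print(f"{intervals_per_model:,} intervals per model")   # 35,040
print(f"{total_rows:,} timeseries rows in total")       # 12,264,000,000
```

Each of those rows is further split into end-use categories, which is why the aggregate products described below are the practical entry point for most users.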

Working with the full set of national ComStock building load profiles requires big-data skills, which puts the full dataset out of reach for most users. To support many use cases, aggregate load profiles are published for each ComStock release at the following geographic resolutions:

  • 16 ASHRAE/International Energy Conservation Code climate zones
  • 5 U.S. Department of Energy Building America climate zones
  • 8 electric system independent system operator (ISO) and regional transmission organization (RTO) regions
  • 2,400+ U.S. Census Public Use Microdata Areas
  • 3,000+ U.S. counties.

Data Access Platforms

The following table summarizes the various ways to access and use ComStock data.

The dataset has been formatted to be accessible in four main ways to meet the needs of many different users and use cases.

Metadata: Files of individual model characteristics together with annual results, commonly referred to as the “metadata” file

Load Profiles: Timeseries load profiles (individual building and pre-aggregated) in downloadable spreadsheets

Data Viewer: A web-based data viewer with customizable time scales and aggregations

Full Database: A detailed format that can be queried with big data tools

Aggregate ComStock datasets can be accessed via the Open Energy Initiative (OpenEI) Data Lake and the ComStock data viewer. There are two versions of the datasets published with each release: one with actual weather data (AMY), and another with typical weather data (TMY3). Note: The TMY3 15-minute energy data should not be used for larger geographies because weather events are not regionally aligned.

For information on how to query the full ComStock dataset, please refer to this documentation.

Please note that separate public datasets are available for the residential and commercial building stocks.

Open Energy Data Initiative (OEDI) Data Lake

OpenEI is an energy information portal developed and maintained by the National Renewable Energy Laboratory, with funding and support from the U.S. Department of Energy and a network of international partners and sponsors. The OpenEI data lake contains comprehensive aggregate data for ComStock releases. This includes metadata and timeseries energy consumption results (baseline and upgrades, if applicable), individual building energy models, weather files, geographic information, and data dictionaries.

The ComStock release directory structure of the data lake is summarized in the table below. For more detailed information about the contents of the ComStock OpenEI data lake, visit the README.

ComStock Data Viewer

The ComStock data viewer lets users quickly filter, slice, combine, visualize, and download the results in custom ways. This platform is available at comstock.nrel.gov. The data viewer offers multiple geographic views of the datasets: by state, and by PUMA within each Census region.

OEDI Directory Structure and Contents

| Name | Contents |
| --- | --- |
| building_energy_models | Building energy models, in OpenStudio format, that were run to create the dataset. |
| geographic_information | Information on various geographies used in the dataset, provided for convenience. Includes map files showing the shapes of the geographies (states, PUMAs) used for partitioning and a lookup table mapping between census tracts and various other geographies. |
| metadata | Building characteristics (age, area, HVAC system type, etc.) for each of the building energy models run to create the timeseries data and annual energy results. Descriptions of the characteristics are included in data_dictionary.tsv, enumeration_dictionary.tsv, and upgrade_dictionary.tsv. |
| timeseries_aggregates | Aggregate end-use load profiles by building type and geography that can be opened and analyzed in Excel, Python, or other common data analysis tools. |
| timeseries_aggregates_metadata | Building characteristics for timeseries_aggregates building energy models. Follows the same format as metadata. |
| timeseries_individual_buildings | The raw individual building timeseries data. This is a large number of individual files! |
| weather | Key weather data used as an input to run the building energy models to create the dataset. |
| citation.txt | Citation to use when referencing this work. |
| data_dictionary.tsv | Describes the column names found in the metadata and timeseries data files. |
| enumeration_dictionary.tsv | Expands the definitions of the enumerations used in the metadata files. |
| upgrade_dictionary.tsv | Expands the definitions of the upgrades. |
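As an illustration of how the directory listing above might be used, the following Python sketch checks a local copy of a release against the entry names in the table. The function name `missing_entries` and the idea of a local download directory are assumptions made for this example; whether every release contains every entry is not stated here, so a non-empty result is a prompt to check the README rather than a sign of a broken download.

```python
from pathlib import Path
from typing import List

# Entry names taken from the directory table above.
EXPECTED_DIRS = [
    "building_energy_models",
    "geographic_information",
    "metadata",
    "timeseries_aggregates",
    "timeseries_aggregates_metadata",
    "timeseries_individual_buildings",
    "weather",
]
EXPECTED_FILES = [
    "citation.txt",
    "data_dictionary.tsv",
    "enumeration_dictionary.tsv",
    "upgrade_dictionary.tsv",
]


def missing_entries(release_dir: Path) -> List[str]:
    """Return the expected directories and files absent from release_dir.

    release_dir is a hypothetical local copy of one release.
    """
    missing = [d for d in EXPECTED_DIRS if not (release_dir / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (release_dir / f).is_file()]
    return missing
```

For example, `missing_entries(Path("downloads/my_release"))` (a made-up path) returns an empty list when all eleven entries are present.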

Dataset Naming Convention

ComStock releases on the OpenEI data lake and the data viewer use the following naming convention.

         <dataset type>_<weather data>_<year of publication>_release_<release number>
 example:   comstock   _   amy2018    _         2021        _release_       1
  result:   comstock_amy2018_2021_release_1
  • dataset type
    • resstock = residential building stock
    • comstock = commercial building stock
  • weather data
    • amy2018 = actual meteorological year 2018 (2018 weather data from NOAA ISD, NSRDB, and MesoWest)
    • tmy3 = typical weather from 1991-2005 (see this publication for details)
  • year of publication
    • 2021 = dataset was published in 2021
    • 2022 = dataset was published in 2022
    • etc.
  • release
    • release_1 = first release of the dataset during the year of publication
    • release_2 = second release of the dataset during the year of publication
    • etc.
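
The convention above can be sketched as a pair of small Python helpers that build and parse dataset names. The helper names (`build_name`, `parse_name`, `DatasetName`) are hypothetical and not part of any ComStock tooling. The sketch handles only the underscore-separated form shown here; the OEDI names in the Published Datasets table place the year as a leading path segment (for example, 2021/comstock_amy2018_release_1) and are deliberately rejected.

```python
import re
from typing import NamedTuple


class DatasetName(NamedTuple):
    dataset_type: str   # "resstock" or "comstock"
    weather_data: str   # e.g. "amy2018" or "tmy3"
    year: int           # year of publication
    release: int        # release number within that year


# <dataset type>_<weather data>_<year of publication>_release_<release number>
_NAME_PATTERN = re.compile(
    r"^(?P<dataset_type>resstock|comstock)"
    r"_(?P<weather_data>[a-z0-9]+)"
    r"_(?P<year>\d{4})"
    r"_release_(?P<release>\d+)$"
)


def build_name(dataset_type: str, weather_data: str, year: int, release: int) -> str:
    """Assemble a dataset name following the convention above."""
    return f"{dataset_type}_{weather_data}_{year}_release_{release}"


def parse_name(name: str) -> DatasetName:
    """Split a dataset name into its parts; raise ValueError if it does not match."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognized dataset name: {name!r}")
    return DatasetName(
        match["dataset_type"],
        match["weather_data"],
        int(match["year"]),
        int(match["release"]),
    )


print(build_name("comstock", "amy2018", 2021, 1))  # comstock_amy2018_2021_release_1
print(parse_name("comstock_tmy3_2021_release_1").weather_data)  # tmy3
```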

Field Naming Convention

The field naming convention is fairly simple. At the highest level, fields are prefixed with “in.” for inputs, “out.” for outputs, or “calc.” for calculated fields. A handful of additional columns provide simulation information.

For the “out.” prefix there is a second level that covers fuel type, emissions, model parameter and statistic fields, and site energy. The “in.” prefix does not have a second level.

The third level of “out.” is where you’ll find the end uses.

Finally, units are denoted by “..” followed by the unit.
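
To make the convention concrete, the following Python sketch splits a field name into its prefix, its dot-separated levels, and its unit. The field names used in the example are illustrative strings written to follow the pattern described above; they are not quoted from the data dictionary, and `parse_field` is a hypothetical helper, so check data_dictionary.tsv for the actual column names in a given release.

```python
from typing import NamedTuple, Optional, Tuple


class FieldName(NamedTuple):
    prefix: str               # "in", "out", "calc", or another top-level name
    levels: Tuple[str, ...]   # remaining dot-separated levels, in order
    unit: Optional[str]       # text after "..", or None when no unit is given


def parse_field(name: str) -> FieldName:
    """Split a field name on the ".." unit marker, then on single dots."""
    body, marker, unit = name.partition("..")
    prefix, *levels = body.split(".")
    return FieldName(prefix, tuple(levels), unit if marker else None)


# Illustrative names only (assumed, not taken from a published release).
print(parse_field("out.electricity.cooling.energy_consumption..kwh"))
print(parse_field("in.building_type"))
```

For the first example the prefix is `out`, the second level is the fuel type (`electricity`), the third level is the end use (`cooling`), and the unit is `kwh`. The second example has a single level after `in` and no unit.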


ComStock, Copyright (c) 2019-2023, Alliance for Sustainable Energy, LLC, and other contributors. All rights reserved.