Frequently Asked Questions

Expand the sections below for answers to frequently asked questions. If you have additional questions, please email us at ComStock@nrel.gov.

ComStock Essentials

  • The profiles are simulated using the ResStock and ComStock modeling tools, which have been informed by the best available data and validated against an array of empirical datasets. ResStock and ComStock use the EnergyPlus simulation engine. The validation results and uncertainty for quantities of interest are presented in the End-Use Load Profiles final report.

    ResStock generally simulates 550,000 individual building energy models, and ComStock simulates 150,000 building energy models.

  • ComStock models 14 commercial building types. Compared to the Commercial Building Energy Consumption Survey (CBECS) estimation, ComStock datasets account for 64% of the energy use and 62% of the floor area of commercial buildings in the United States. The ComStock development team is actively working on adding more building types to the model. See the explanation titled "Building Types Not Included in ComStock" for more detail.

  • The ComStock and ResStock datasets represent, as closely as possible, the 2018 U.S. commercial and residential building stock characteristics. The energy consumption results depend on the weather data used in the simulations. When modeled with AMY2018 weather, the datasets represent energy use for the year 2018. When TMY3 weather is used, they represent typical or average energy consumption under typical climate conditions.

    Emissions and utility bills in the ComStock and ResStock datasets use input data from several years, depending on the dataset release. See the ComStock reference documentation or ResStock reference documentation for more details.

  • Yes. The models underwent extensive calibration as part of the End-Use Load Profiles (EULP) project, in which we compared model load profiles to AMI data from around the country and updated baseline model schedules and power densities, among other inputs, using various data sources. See the final report for more details. The EULP project concluded in 2021.

    For every baseline update and upgrade measure since EULP, ComStock compares energy consumption and EUI to available data sources, such as CBECS and EIA. These comparisons are available on the OEDI Data Lake for each dataset. You can find links to OEDI in the Published Datasets section of the Data page.

    For details about how to determine whether the models are appropriate for a specific analysis, reference the explanation titled "Considerations for ComStock Calibration, Validation, and Uncertainty."

  • ComStock publishes datasets on a regular basis, and we recommend using the latest release. See the Data page for a list of available datasets and access links.

    New datasets include any improvements made to the baseline model, as well as new upgrade measures and all measures from previous releases. Information about upgrade measures included in dataset releases can be found on the Upgrade Measures page. Baseline model improvements are captured in the release change log on our public GitHub repository. Note that we re-sample our input characteristic distributions for every release and as a result, the building IDs between releases will not match.

  • Weights in ComStock represent the number of real buildings in the U.S. building stock that a ComStock model represents. Weights are determined using national floor area by building type from CBECS. Use the weights by multiplying the energy consumption column by the weight for the model. Some results columns already have the weight applied. These have the word “weighted” in the name. See the explanation titled "Sampling and Weighting in ComStock" for more information.
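
    The weighting step described above can be sketched as follows. This is only an illustration: the rows are made-up values, and the keys `weight` and `energy_kwh` are hypothetical stand-ins for the weight column and an unweighted energy consumption column. Check the data dictionary for the actual column names in your dataset release.

```python
# Hypothetical rows: each dict stands in for one ComStock model (values are made up).
models = [
    {"bldg_id": 1, "weight": 2.5, "energy_kwh": 100_000.0},
    {"bldg_id": 2, "weight": 4.0, "energy_kwh": 50_000.0},
]


def weighted_total(rows, value_key="energy_kwh", weight_key="weight"):
    """Sum value * weight over all rows to estimate the stock-level total."""
    return sum(row[value_key] * row[weight_key] for row in rows)


stock_total_kwh = weighted_total(models)
print(stock_total_kwh)  # 450000.0
```

    Do not apply this to columns that already have "weighted" in the name, since the weight would then be counted twice.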

  • The minimum sample count required for a given geography in ComStock is a function of the number of commercial buildings present in that area, as well as the quality of available input data for the ComStock model. To ensure statistical robustness in your analysis using ComStock, you may need additional building models depending on the specificity of your segmentation. A good rule of thumb is to include at least six models per segment (e.g., building type, sub-type, size, vintage, or operation hours). For example, if you’re analyzing small office buildings open more than 18 hours a day, make sure you have at least six such models.

    Also, cross-check ComStock’s building representation with external sources (like Google Maps or local datasets) to ensure the dataset reflects your target geography. For more detail, see the explanation titled "Sample Size Considerations."

    Queries in sparsely populated areas or with filters applied may have relatively few samples available. In these cases, samples from nearby locations can be grouped to increase the sample size. See the tutorial titled "Perform an analysis by blending ComStock and local data" for an example of incorporating local floor area estimates to improve representation of ComStock data at specific geographic resolutions.

    Users should estimate standard error for metrics of interest using the standard deviation divided by the square root of the number of samples (i.e., profiles or models). See Section 5.1.3 in the End-Use Load Profiles methodology report for a discussion on uncertainty calculations.
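
    As a minimal sketch of the standard error estimate described above, the snippet below divides the standard deviation by the square root of the sample count. It assumes the sample (n - 1) standard deviation and ignores model weights; both are simplifying assumptions, and the EUI values are invented for illustration. See the methodology report for the full treatment.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to estimate a standard error")
    return statistics.stdev(values) / math.sqrt(n)


# Hypothetical EUI values for the six models in one segment (made-up numbers).
eui_samples = [40.0, 45.0, 50.0, 55.0, 60.0, 50.0]
se = standard_error(eui_samples)
print(round(se, 2))  # 2.89
```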

  • ComStock and ResStock can be cited according to the suggestions here for ComStock and here for ResStock.

Datasets and Data Access

  • Several platforms are available for accessing ComStock and ResStock datasets. See the ComStock Data page and ResStock Data page for more detail about dataset access and links to the public datasets.

  • Descriptions of each of the building characteristics and the end-use categories can be found in the “data_dictionary.tsv” file. Descriptions of the values used in those filters can be found in the “enumeration_dictionary.tsv”. Both files can be downloaded from the OEDI Data Lake and are unique to each dataset release. Use the correct data dictionary for the relevant dataset. They can be opened with Excel or a text editor.

    Links to the OEDI Data Lake for each dataset release can be found on the ComStock Data page and ResStock Data page.

  • ComStock and ResStock data have multiple units. For annual results data downloaded from the Open Energy Data Initiative (OEDI) data lake, units can be found in the "data_dictionary.tsv" file. Some fields will also have the units in the column header at the end of the name (e.g., "out.electricity.total.jan.energy_consumption..kwh"). Timeseries energy consumption data on OEDI are provided in kWh. Natural gas, fuel oil, and propane are also output in kWh; this is intentional, though unconventional.

    The Data Viewer provides energy data in metric units, visible in the y-axis label. Depending on the scale of energy being shown, the metric prefix will automatically adjust (T for tera, G for giga, M for mega, etc.).

    For Tableau dashboards, use the relevant column headers or the graph axis to see the units.
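
    Because fuel consumption is reported in kWh, you may want to convert it to more conventional units. The helper functions below are a small sketch using standard conversion factors (1 kWh = 3,412.142 Btu; 1 therm = 100,000 Btu); the function names are illustrative and not part of any ComStock or ResStock tooling.

```python
KBTU_PER_KWH = 3.412142  # 1 kWh = 3,412.142 Btu
KWH_PER_THERM = 29.3071  # 1 therm = 100,000 Btu, expressed in kWh


def kwh_to_kbtu(kwh):
    """Convert energy in kWh to kBtu."""
    return kwh * KBTU_PER_KWH


def kwh_to_therms(kwh):
    """Convert energy in kWh to therms (commonly used for natural gas)."""
    return kwh / KWH_PER_THERM


# Example: a natural gas value of 2,930.71 kWh is about 100 therms.
print(round(kwh_to_therms(2930.71), 2))
print(round(kwh_to_kbtu(1000.0), 1))
```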

  • The timestamps of all load profiles have been converted to Eastern Standard Time, to prevent issues when aggregating across time zones.

    The underlying modeling was conducted using local standard time for each location, with occupant schedules adjusted for daylight saving time as applicable. All EnergyPlus timeseries outputs were converted from local standard time to Eastern Standard Time for publication in the web Data Viewer, Data Viewer exports, timeseries aggregates, and individual timeseries parquet files. In converting from local standard time to Eastern Standard Time, the last few hours of each dataset were moved to the beginning of the timeseries where necessary. For example, the first two hours of data from Colorado in Eastern Standard Time (Jan 1, midnight to 2 AM) were originally modeled as the last two hours of the year in Mountain Standard Time (Dec 31, 10 PM to midnight) using the corresponding weather.
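
    If you need to map published timestamps back to the local standard time that was modeled, the shift can be sketched as below. This is an illustration under stated assumptions, not the project's conversion code: the offsets cover only four contiguous U.S. time zones, the simulation year is assumed to be 2018, timestamps are assumed to mark the end of each interval, and the function name is hypothetical.

```python
from datetime import datetime, timedelta

# Hours to add to an Eastern Standard Time timestamp to get each local standard time.
HOURS_FROM_EST = {"EST": 0, "CST": -1, "MST": -2, "PST": -3}


def est_to_local_standard(ts_est, zone, sim_year=2018):
    """Map a published EST timestamp to the local standard time that was modeled."""
    local = ts_est + timedelta(hours=HOURS_FROM_EST[zone])
    # Interval-ending timestamps at or before Jan 1 00:00 local time belong to the
    # hours that were moved from the end of the modeled year to the beginning.
    if local <= datetime(sim_year, 1, 1):
        local = local.replace(year=local.year + 1)
    return local


# 1 AM EST on Jan 1 in Colorado was modeled as 11 PM MST on Dec 31.
print(est_to_local_standard(datetime(2018, 1, 1, 1, 0), "MST"))
# A midsummer timestamp simply shifts by two hours.
print(est_to_local_standard(datetime(2018, 7, 1, 12, 0), "MST"))
```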

  • The timestamp indicates the end of each 15-minute interval. So "12:15" represents the energy use between 12:00 and 12:15.
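
    The interval-ending convention matters when grouping data by day or hour. The short sketch below (with a hypothetical helper name) recovers the start of the interval from its label and shows that a midnight label belongs to the previous day.

```python
from datetime import datetime, timedelta

INTERVAL = timedelta(minutes=15)


def interval_bounds(end_timestamp):
    """Return (start, end) of the 15-minute interval labeled by its end time."""
    return end_timestamp - INTERVAL, end_timestamp


start, end = interval_bounds(datetime(2018, 1, 1, 12, 15))
print(start.time(), end.time())  # 12:00:00 12:15:00

# A label of midnight on Jan 2 covers 11:45 PM to midnight on Jan 1.
midnight_start, _ = interval_bounds(datetime(2018, 1, 2, 0, 0))
print(midnight_start.date())  # 2018-01-01
```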

  • Yes. The aggregates represent the total relevant building stock with all relevant weights applied (e.g., all small office buildings in the state of Colorado), not just the sum of the model results.

  • ComStock includes commercial buildings in California. However, as of ComStock 2024 Release 2, California climate zones are not available as a characteristic in ComStock public datasets. This characteristic will be made available in an upcoming dataset release.

    There are a few known issues with California models in ComStock. Please see the "California Models Known Issues" explanation for more information.

  • ComStock and ResStock use the National Historical GIS (NHGIS) GISJOIN standard codes for county, census PUMA, and census tract, which are based on Federal Information Processing Standards (FIPS) codes. The datasets use the 2010 version of the GISJOIN codes; the 2020 codes are not available at this time. For more information about the geospatial fields available in the datasets, see this explanation for ComStock, and this explanation for ResStock.

    In most ComStock and ResStock datasets, county name is available in addition to the GISJOIN county code. For both tools, the column in the metadata_and_annual_results files on OEDI is called "in.county_name."
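
    To join the datasets with sources keyed by FIPS code, a county GISJOIN code can be converted as sketched below. The sketch assumes the NHGIS county layout of "G", a two-digit state FIPS code, a "0", a three-digit county FIPS code, and a trailing "0"; the function name is hypothetical. Verify the layout against the values in your dataset before relying on it.

```python
def county_gisjoin_to_fips(gisjoin):
    """Convert an NHGIS county GISJOIN code (e.g., 'G0800310') to a 5-digit FIPS code."""
    if len(gisjoin) != 8 or not gisjoin.startswith("G"):
        raise ValueError(f"unexpected county GISJOIN format: {gisjoin!r}")
    state_fips = gisjoin[1:3]   # two-digit state FIPS code
    county_fips = gisjoin[4:7]  # three-digit county FIPS code
    return state_fips + county_fips


# 'G0800310' corresponds to FIPS 08031 (Denver County, Colorado).
print(county_gisjoin_to_fips("G0800310"))  # 08031
```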

  • See the Upgrade Measures page for a complete list of available upgrade measures and packages in ComStock datasets, including a link to their documentation, and in which dataset release the measure was first included.

  • Weather data used for the modeling have been provided in .csv format for regression modeling, forecasting, or other analyses. The TMY3 weather files in EnergyPlus input format (EPW) can be downloaded from the NREL Data Catalog, with filenames that correspond to county IDs in the ResStock and ComStock metadata. EPW format weather files for 2018 or other actual meteorological years (AMY) have not been publicly released. These files can be purchased from private sector vendors. See here for a list of providers.

  • OpenStudio model input files (.osm) are available in the dataset on the OEDI data lake in the "building_energy_models" directory. Files are named by the building ID ("bldg_id"). The EnergyPlus model input files are not available.

  • Currently, there is no API. However, we have posted a tutorial example showing how to load the datasets into cloud services such as Amazon Web Services (AWS) so the data can be queried by analytic tools like Athena.

    Example notebooks and SQL queries are also available on the "Access ComStock datasets programmatically" page, and more will be added as we develop them. The queries and example notebooks are a good starting point for accessing ResStock programmatically, too.

  • Parquet files can be read using programming languages such as Python, using the pyarrow package. For other options, see here. There are a few third-party graphical tools for viewing parquet files, but we have not tested them and the third-party support is limited.

    See below for example Python code that converts a parquet file to CSV.

    
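            import os

            import pandas as pd

            # Folder containing the downloaded parquet file (example path; adjust for your system)
            folder_path = 'C:/Users/username/Documents/EUSS/Results'
            # File name without the extension
            file_name = '813-2'

            # Read the parquet file (requires pyarrow or another parquet engine)
            df = pd.read_parquet(os.path.join(folder_path, file_name + '.parquet'))
            # Write the same table to a CSV file in the same folder, without the index column
            df.to_csv(os.path.join(folder_path, file_name + '.csv'), index=False)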
            

  • The building IDs and exact building characteristics between releases will not match because we re-sample our input characteristic distributions for every release. However, you can filter the building models using building characteristics to identify similar samples between releases. For instance, you can use building type, size, location, and wall construction type to identify similar models. The fields with the prefix “in.” show the available model inputs that you can use to do the comparison. You can see a complete list and description of available fields in the “data_dictionary.tsv” file on the OEDI Data Lake. Links to the datasets on OEDI are in the "Published Datasets" section of the ComStock Data page and ResStock Data page.

Data Viewer

  • The Data Viewer is a web-based visualization platform that allows users to easily filter, aggregate, view, and download ComStock end-use energy data in a web browser.

    Links to Data Viewer visualizations for each dataset release are on the Data page.

    For Data Viewer trainings, visit NREL’s Building Stock Analysis YouTube channel.

  • The "sum" aggregation is the total energy consumption for all buildings that meet the filter criteria across all the occurrences of the given time step within the selected month(s). For example, in a day timeseries range for a specific state for the month of July, the 7:00-7:15 AM time step shows the sum of all energy consumption statewide between 7:00 and 7:15 AM in July, from buildings that meet the filter criteria. The "sum" view has fewer uses than the "average" view.

    The "average" aggregation is the total energy consumption for all buildings that meet the filter criteria, averaged across all the occurrences of the given time step within the selected month(s). For example, in a day timeseries range for a specific state for the month of July, the 7:00-7:15 AM time step shows the average statewide energy consumption between 7:00 and 7:15 AM in July, from buildings that meet the filter criteria. The "average" aggregation provides a view of the average day of total energy consumption in the state. This is the more logical view for most use cases. Note that while each time step within a day or a year has the same number of occurrences within each dataset, each time step for a week does not: some days of the week occur more times than others in each year or month range (except for February).
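
    The difference between the two aggregations can be illustrated with a toy calculation. The readings below are invented statewide totals for two time steps on three July days; this sketch only mirrors the definitions above and is not the Data Viewer's query logic.

```python
from collections import defaultdict

# Hypothetical statewide consumption (kWh) as (day, interval-ending time step, value).
readings = [
    ("2018-07-01", "07:15", 100.0),
    ("2018-07-02", "07:15", 120.0),
    ("2018-07-03", "07:15", 140.0),
    ("2018-07-01", "07:30", 110.0),
    ("2018-07-02", "07:30", 130.0),
    ("2018-07-03", "07:30", 150.0),
]


def aggregate_by_time_step(rows):
    """Group values by time step and compute the sum and the average over its occurrences."""
    grouped = defaultdict(list)
    for _day, time_step, kwh in rows:
        grouped[time_step].append(kwh)
    return {
        time_step: {"sum": sum(values), "average": sum(values) / len(values)}
        for time_step, values in grouped.items()
    }


day_profile = aggregate_by_time_step(readings)
print(day_profile["07:15"])  # {'sum': 360.0, 'average': 120.0}
```

    The "sum" grows with the number of days in the selected months, while the "average" stays on the scale of a single day, which is why the average is usually easier to interpret.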

  • The peak day is the day with the highest single-hour (peak) energy consumption within the selected months.

    The min peak day is the day with the lowest single-hour energy consumption within the selected months.

  • We query data in real time to produce the time series graphs you see on the webpage, and this can involve scanning terabytes (TB) of data. Running a baseline-only query for California, Texas, New York, or Illinois takes around a minute, while running a query for a state like Colorado or Massachusetts takes about 10-20 seconds. However, if the graphs have previously been generated we have the data cached and can typically load the data in a few seconds. That's why the load time varies.

  • The “Explore Timeseries” option is available once a specific geography (e.g., state or PUMA region) is selected.

  • Clicking on the end uses in the legend will highlight the end use in the visualization.

  • The viewer allows aggregations of up to six locations (states or PUMAs, depending on the dataset). When viewing a single location, choose the “+ More Locations” option, add up to five additional locations, and choose “Update Search”.

    Additionally, sums of more than six locations can be created manually by downloading sums of up to six locations and summing further on your local computer.

    TMY3 weather is not aligned between locations. This does not affect our recommendations for working with annual data. However, if your application requires timeseries data and therefore would benefit from aligned weather, we recommend either using an AMY dataset, or filtering by weather station and summing only within a single weather station’s PUMAs.

  • The "+ Filter" button enables users to filter the data by characteristics, such as vintage, floor area, and building type. This feature also enables aggregations of locations, including by PUMA and county.

  • The building characteristics are available on the Open Energy Data Initiative (OEDI) data lake. Visit the Data page for links to the OEDI pages for each dataset. In the "metadata_and_annual_results_aggregate" directory on OEDI, navigate to the national file: metadata_and_annual_results_aggregates > national > full > csv > baseline_agg.csv.gz. Download the file, unzip it, and open it in Microsoft Excel. Use the filters applied on the Data Viewer to filter the spreadsheet.

    Note that the national file is an “aggregate,” meaning that the data in the file is consolidated by merging duplicate building models within a geography (in this case state), so each building ID appears only once with a combined weight. Columns that cannot be meaningfully aggregated from the tract level—such as Cambium grid region and CEJST designation—are excluded from the resulting low-resolution, “aggregate” files. For more information about the updated OEDI file structure as a result of the new sampling method, please see the "New ComStock Sampling Method" explanation.
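
    Conceptually, the consolidation described above resembles grouping rows by geography and building ID and summing their weights. The snippet below is a simplified sketch of that idea with invented rows and hypothetical key names; it is not the code used to produce the published aggregate files.

```python
from collections import defaultdict

# Hypothetical tract-level rows: the same building model can appear in several tracts.
tract_rows = [
    {"state": "CO", "bldg_id": 101, "weight": 1.5},
    {"state": "CO", "bldg_id": 101, "weight": 2.0},
    {"state": "CO", "bldg_id": 202, "weight": 3.0},
    {"state": "WY", "bldg_id": 101, "weight": 0.5},
]


def aggregate_weights(rows):
    """Merge duplicate building models within a state, summing their weights."""
    combined = defaultdict(float)
    for row in rows:
        combined[(row["state"], row["bldg_id"])] += row["weight"]
    return dict(combined)


state_weights = aggregate_weights(tract_rows)
print(state_weights[("CO", 101)])  # 3.5
```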

Analysis

  • The code required to run ComStock and ResStock is available on the ComStock and ResStock public GitHub repositories. Other related code repositories are provided on the "For Developers" page for ComStock and ResStock.

    While these resources are available, ComStock and ResStock are complex modeling tools. There is no documentation for running the models other than what exists in the codebase, and we are not able to support running the models at this time. We generally do not recommend running the models unless you have a deep understanding of the methodology and objectives. Please email us at ComStock@nrel.gov or ResStock@nrel.gov if you have suggestions for improvements or specific needs.

  • Our general guidance is to NOT combine measure results. There are interactions between most upgrade measures that affect the amount of savings and make results of multiple measures together misleading.

    For an explanation and examples on this topic, see the linked ComStock and ResStock resources.

    If you have questions about combining specific measures, please email us at ComStock@nrel.gov or ResStock@nrel.gov.

Modeling Methods, Assumptions and Documentation

  • ComStock reference documentation is available in the References section of the Resources page. We publish an updated version with every dataset release that includes changes to the ComStock model.

  • ComStock calculates the greenhouse gas emissions from the building stock and savings from measures using both historical and projected emissions data. Historical electricity emissions use the CO2-equivalent total output emission rate from EPA’s Emissions and Generation Resource Integrated Database (eGRID). Projected electricity emissions use data from NREL’s Cambium dataset. Projected emissions consider both the average emissions rate (AER) and the long-run marginal emission rate (LRMER).

    Natural gas, propane, and fuel oil emissions use the emission factors defined in ANSI/RESNET/ICC 301-2022 Addendum B-2022, Standard for the Calculation and Labeling of the Energy Performance of Dwelling and Sleeping Units using an Energy Rating Index.

    Emissions input data are updated as they become available and do not always match the ComStock dataset release simulation year (typically 2018). Please see the ComStock reference documentation for more detail and specifics about which emissions input data years were used for a given dataset release.

  • As of the 2024 Release 1, ComStock includes utility cost data using current electricity rates from the Utility Rate Database (URDB), matched by utility ID, demand, and usage. Annual utility bills are reported as the min, max, mean, and median of all applicable rates for each model. Natural gas, propane, and fuel oil prices are based on volumetric pricing due to limited rate data, using EIA price and heat content data. See the ComStock reference documentation for details.

    ComStock does not calculate first costs (i.e., upgrade or measure costs). However, many ComStock output variables can be used to estimate first cost. See the explanation titled "Using ComStock to Analyze Cost" for more detail about cost assessments, including a discussion of output variables that can be used to estimate first costs.

  • ComStock does not currently model rooftop solar PV, though this capability is coming soon. We recommend using PVWatts or REopt to evaluate commercial solar PV opportunities.

  • No, ComStock does not currently model EV charging in the dataset. For modeling aggregate EV load profiles for a city or state, we suggest using EVI-Pro Lite. Measured charging profile data for individual homes can be found in the NEEA HEMS data and Pecan Street Dataport. Email us at ComStock@nrel.gov if you have suggestions for other EV charging data sources.

  • We have not published a service water heating measure because of limitations in the current water draw profiles in our baseline models. The energy consumed by heat pump water heaters (HPWHs), especially, is sensitive to how quickly the water in the tank is consumed. More specifically, designing and sizing an HPWH system relies heavily on realistic water draw profiles to correctly capture when the heat pump and, especially, the backup heating elements are triggered.

    If you are aware of water draw profile data, please email us at ComStock@nrel.gov! We are in search of 15-minute to hourly water draw profiles for commercial buildings of various types and square footage.

  • We do not currently model data centers as a building type in ComStock. Some large office buildings include data center loads, but these do not capture the characteristics, performance, HVAC system, etc. of standalone data centers and we do not recommend extrapolating results from these models.

  • To date, ComStock public dataset releases include AMY2018 and TMY3 weather years, neither of which is a leap year.

    If ComStock were to simulate a leap year, the workflow would be as follows. The default simulation setting is a one-year, 8,760-hour simulation, starting on January 1 and ending on December 31. If the calendar year of simulation is a leap year, the end of the simulation period will be input as December 30 instead of December 31 to ensure 8,760 hours of simulation results. In years with February 29, December 31 will not be included in the simulation.

    For more detail, please see the ComStock reference documentation.
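
    The leap-year rule described above can be checked with a few lines of arithmetic. This sketch only restates the rule (end on December 30 in leap years, December 31 otherwise) and confirms that both cases give 8,760 hours; the function names are hypothetical and this is not ComStock workflow code.

```python
import calendar
from datetime import date


def simulation_end_date(year):
    """Last simulated day: December 30 in leap years, December 31 otherwise."""
    return date(year, 12, 30 if calendar.isleap(year) else 31)


def simulated_hours(year):
    """Hours simulated from January 1 through the end date, inclusive."""
    days = (simulation_end_date(year) - date(year, 1, 1)).days + 1
    return days * 24


print(simulation_end_date(2018), simulated_hours(2018))  # 2018-12-31 8760
print(simulation_end_date(2020), simulated_hours(2020))  # 2020-12-30 8760
```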