On this page, we describe two ways to get started with training RL agents in OCHRE™ Gym.

1. Quickest: Using ochre_gym.load

We provide a helper function to quickly instantiate an OCHRE™ Gym environment for one of the provided buildings (e.g., basic-v0) via ochre_gym.load:

import ochre_gym

# Returns an OchreEnv object for the provided basic-v0 building
env = ochre_gym.load('basic-v0')

The ochre_gym.load function will handle creating the OCHRE™ building simulator instance using the properties, schedule, and weather files located in ochre_gym/buildings/basic-v0. Keyword arguments passed to load can be used to override the defaults in the ochre_gym/buildings/defaults.toml config file. For example, override the default observation space (the full OCHRE™ control result with time stamp and energy price information added) by setting override_ochre_observations_with_keys to a list of desired keys:

import ochre_gym

env = ochre_gym.load(
    'basic-v0',
    override_ochre_observations_with_keys = [
        'Energy Price ($)',
        'Temperature - Indoor (C)',
        'Total Electric Power (kW)'
    ]
)
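
The override behavior described above can be pictured as a dictionary merge. The sketch below is illustrative only: the `defaults` values and the `resolve_config` helper are hypothetical stand-ins, not OCHRE™ Gym's actual implementation or the real contents of defaults.toml.

```python
# Hypothetical defaults, standing in for values read from defaults.toml.
defaults = {
    "vectorize_observations": True,
    "use_all_ochre_observations": True,
    "override_ochre_observations_with_keys": None,
}


def resolve_config(defaults, **overrides):
    """Return a new config in which keyword overrides replace defaults."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        # Reject misspelled keys instead of silently ignoring them.
        raise KeyError(f"Unknown config keys: {sorted(unknown)}")
    return {**defaults, **overrides}


config = resolve_config(
    defaults,
    override_ochre_observations_with_keys=[
        "Energy Price ($)",
        "Temperature - Indoor (C)",
        "Total Electric Power (kW)",
    ],
)
```

Only the key named in the call changes; every other setting keeps its default, and the `defaults` dictionary itself is left unmodified.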

2. OchreEnv

You can also directly instantiate an OchreEnv object. This may be desirable if you want to subclass OchreEnv or customize the environment's observation space by passing a user-defined OchreObservationSpaceBaseConfig to the observation_space_config keyword argument.

This requires passing in the OCHRE™ building simulator's dwelling_args, which is a dictionary with keys: start_time, time_res, duration, initialization_time, hpxml_file, schedule_input_file, and weather_file.
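
A minimal sketch of such a dictionary follows. Only the key names come from the text above; every value is an illustrative assumption, including the placeholder file paths and the use of `datetime` and `timedelta` objects for the time-related entries.

```python
import datetime as dt

# All values are illustrative assumptions; replace them with your own
# simulation window and input files.
dwelling_args = {
    "start_time": dt.datetime(2018, 6, 1, 0, 0),   # simulation start
    "time_res": dt.timedelta(minutes=15),          # length of one time step
    "duration": dt.timedelta(days=3),              # total simulated time
    "initialization_time": dt.timedelta(days=1),   # warm-up before start
    "hpxml_file": "path/to/properties.xml",        # placeholder path
    "schedule_input_file": "path/to/schedule.csv", # placeholder path
    "weather_file": "path/to/weather.epw",         # placeholder path
}

# Check that every key listed in the text is present.
REQUIRED_KEYS = {
    "start_time", "time_res", "duration", "initialization_time",
    "hpxml_file", "schedule_input_file", "weather_file",
}
missing = REQUIRED_KEYS - dwelling_args.keys()

# Number of time steps implied by the duration and the resolution.
n_steps = dwelling_args["duration"] // dwelling_args["time_res"]
```

With these assumed values, three days at a 15-minute resolution gives 288 steps.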

For example:

from ochre_gym import OchreEnv

# dwelling_args (the dictionary described above) and logger are assumed to be defined already
env = OchreEnv('basic-v0',
                dwelling_args = dwelling_args,
                actions = {
                    'HVAC Cooling': ['Setpoint'],
                    'HVAC Heating': ['Setpoint']
                },
                vectorize_actions = True,
                lookahead = '00:30',
                reward_args = {
                        'dr_type': 'RTP',
                },
                disable_uncontrollable_loads = False,
                vectorize_observations = True,
                use_all_ochre_observations = True,
                override_ochre_observations_with_keys = None,
                observation_space_config = None,  # use default TimeAndEnergyPriceObservationSpaceConfig
                logger = logger)

The basic OCHRE™ Gym RL loop

for step in range(1000):

    # Sample an action from the action space
    action = env.action_space.sample()

    # Step the environment with the sampled action
    obs, rew, terminated, truncated, info = env.step(action)

    # Check if the episode is done
    if terminated:
        print("Episode finished after {} timesteps".format(step+1))
        break
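
The same loop pattern can be tried without OCHRE™ installed. The sketch below uses a hypothetical `StubEnv` class that only mimics the five-value `step` return used above; it is not part of OCHRE™ Gym. It also assumes the usual Gymnasium convention of calling `reset` before the first step, and it stops on either `terminated` or `truncated`.

```python
import random


class StubEnv:
    """Hypothetical stand-in that mimics a Gymnasium-style reset/step interface."""

    def __init__(self, episode_length=96, seed=0):
        self.episode_length = episode_length
        self._rng = random.Random(seed)
        self._t = 0

    def reset(self):
        self._t = 0
        return [0.0], {}  # (observation, info)

    def sample_action(self):
        # Random thermostat setpoint in degrees C (illustrative range).
        return self._rng.uniform(18.0, 26.0)

    def step(self, action):
        self._t += 1
        reward = -abs(action - 22.0)  # never positive, like the rewards above
        terminated = self._t >= self.episode_length
        truncated = False
        return [action], reward, terminated, truncated, {}


env = StubEnv()
obs, info = env.reset()

episode_return = 0.0
steps = 0
for step in range(1000):
    action = env.sample_action()
    obs, rew, terminated, truncated, info = env.step(action)
    episode_return += rew  # accumulate the (non-positive) rewards
    steps = step + 1
    if terminated or truncated:
        break
```

Tracking `episode_return` gives a single number per episode that can be compared across policies.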

The rewards are never positive: the maximum reward is 0, which corresponds to no energy cost and no discomfort penalty.
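
One way to see why 0 is the upper bound is to treat the reward as the negated sum of two non-negative terms. The function below is a sketch under that assumption; the unweighted additive form and the `reward` helper are hypothetical, not OCHRE™ Gym's actual reward formula.

```python
def reward(energy_cost, discomfort_penalty):
    """Negated sum of two non-negative cost terms; 0 is the best possible value."""
    if energy_cost < 0 or discomfort_penalty < 0:
        raise ValueError("cost terms are assumed to be non-negative")
    return -(energy_cost + discomfort_penalty)


best = reward(0.0, 0.0)     # no cost and no discomfort
typical = reward(1.5, 0.5)  # some cost and some discomfort
```

Any positive energy cost or discomfort penalty pushes the reward below 0.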


Diagnostic info generated by OCHRE™ Gym is optionally logged to the console and/or to a file (adjustable via keyword arguments in ochre_gym.load). The default log file is ~./ochre_gym.log, but this can be changed by setting the log_output_filepath keyword argument to the full path to the log file when loading the environment.