# Basics
On this page, we describe two ways to get started with training RL agents in OCHRE™ Gym.
## 1. Quickest: Using `ochre_gym.load`
We provide a helper function, `ochre_gym.load`, to quickly instantiate an OCHRE™ Gym environment for one of the provided buildings (e.g., `basic-v0`). The `ochre_gym.load` function handles creating the OCHRE™ building simulator instance using the properties, schedule, and weather files located in `ochre_gym/buildings/basic-v0`.
Keyword arguments passed to `load` can be used to override the defaults in the `ochre_gym/buildings/defaults.toml` config file. For example, override the default observation space (the full OCHRE™ control result, with time stamp and energy price information added) by setting `override_ochre_observations_with_keys` to a list of desired keys:
```python
import ochre_gym

env = ochre_gym.load(
    env_name="basic-v0",
    override_ochre_observations_with_keys=[
        'Energy Price ($)',
        'Temperature - Indoor (C)',
        'Total Electric Power (kW)'
    ]
)
```
## 2. `OchreEnv`
You can also directly instantiate an `OchreEnv` object. This may be desirable if you want to subclass `OchreEnv` or customize the environment's observation space by passing a user-defined `OchreObservationSpaceBaseConfig` to the `observation_space_config` keyword argument.
This requires passing in the OCHRE™ building simulator's `dwelling_args`, a dictionary with the keys `start_time`, `time_res`, `duration`, `initialization_time`, `hpxml_file`, `schedule_input_file`, and `weather_file`.
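As a rough sketch, a `dwelling_args` dictionary might look like the following. The specific values and file paths here are hypothetical placeholders, and the use of `datetime`/`timedelta` objects for the time-related keys is an assumption; consult the OCHRE™ documentation for the exact expected types.

```python
import datetime

# Hypothetical dwelling_args; adjust values and file paths for your building.
dwelling_args = {
    'start_time': datetime.datetime(2018, 5, 1, 0, 0),   # simulation start
    'time_res': datetime.timedelta(minutes=15),          # control time step
    'duration': datetime.timedelta(days=7),              # episode length
    'initialization_time': datetime.timedelta(days=1),   # warm-up period
    'hpxml_file': 'path/to/building.xml',
    'schedule_input_file': 'path/to/schedule.csv',
    'weather_file': 'path/to/weather.epw',
}
```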
For example:
```python
import logging

from ochre_gym import OchreEnv

logger = logging.getLogger(__name__)

env = OchreEnv('basic-v0',
               dwelling_args,  # dict of OCHRE dwelling arguments, as described above
               actions={
                   'HVAC Cooling': ['Setpoint'],
                   'HVAC Heating': ['Setpoint']
               },
               vectorize_actions=True,
               lookahead='00:30',
               reward_args={
                   'dr_type': 'RTP',
               },
               disable_uncontrollable_loads=False,
               vectorize_observations=True,
               use_all_ochre_observations=True,
               override_ochre_observations_with_keys=None,
               observation_space_config=None,  # use default TimeAndEnergyPriceObservationSpaceConfig
               logger=logger)
```
## The basic OCHRE™ Gym RL loop
```python
for step in range(1000):
    # Sample an action from the action space
    action = env.action_space.sample()
    # Step the environment with the sampled action
    obs, rew, terminated, truncated, info = env.step(action)
    # Check if the episode is done
    if terminated:
        print("Episode finished after {} timesteps".format(step + 1))
        break
```
Rewards are non-positive: the maximum reward is 0, attained when there is no energy cost and no discomfort penalty.
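To make the sign convention concrete, here is a toy illustration. The function and argument names are made up for this sketch; OCHRE™ Gym computes the actual costs internally from the energy price and thermal comfort.

```python
def toy_reward(energy_cost, discomfort_penalty):
    """Illustrative only: reward is the negated sum of two non-negative costs."""
    assert energy_cost >= 0 and discomfort_penalty >= 0
    return -(energy_cost + discomfort_penalty)

# The best achievable reward is 0, when both costs are zero.
```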
## Logs
Diagnostic info generated by OCHRE™ Gym is optionally logged to the console and/or to a file (adjustable via keyword arguments in `ochre_gym.load`). The default log file is `~./ochre_gym.log`, but this can be changed by setting the `log_output_filepath` keyword argument to the full path of the desired log file when loading the environment.
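If you construct the logger yourself (e.g., when instantiating `OchreEnv` directly, as above), a standard Python `logging` setup works. This sketch assumes OCHRE™ Gym accepts any ordinary `logging.Logger`; the logger name and log file location are arbitrary choices.

```python
import logging
import os
import tempfile

# Hypothetical setup: send diagnostics to a custom log file.
log_path = os.path.join(tempfile.gettempdir(), "my_ochre_run.log")
logger = logging.getLogger("ochre_gym")
logger.setLevel(logging.INFO)
handler = logging.FileHandler(log_path)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)

logger.info("Environment created")
```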