graphenv.examples.hallway.hallway_model.HallwayQModel

class HallwayQModel(*args, hidden_dim=1, **kwargs)[source]

Bases: graphenv.examples.hallway.hallway_model.BaseHallwayModel, ray.rllib.algorithms.dqn.distributional_q_tf_model.DistributionalQTFModel

Initialize variables of this model.

Extra model kwargs:
q_hiddens (List[int]): List of layer sizes after(!) the Advantages(A)/Value(V)-split. Hence, each of the A- and V-branches will have this structure of Dense layers. To define the NN before this A/V-split, use - as always - config["model"]["fcnet_hiddens"].

dueling: Whether to build the advantage(A)/value(V) heads for DDQN. If True, Q-values are calculated as: Q = (A - mean[A]) + V. If False, raw NN output is interpreted as Q-values.

num_atoms: If >1, enables distributional DQN.

use_noisy: Use noisy nets.

v_min: Min value support for distributional DQN.

v_max: Max value support for distributional DQN.

sigma0 (float): Initial value of noisy layers.

add_layer_norm: Enable layer norm (for param noise).
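The dueling formula above, Q = (A - mean[A]) + V, can be illustrated with a minimal pure-Python sketch. This is not the model's TensorFlow code; the function name dueling_q_values and the sample numbers are hypothetical and chosen only to show the arithmetic for a single state.

```python
from typing import List


def dueling_q_values(advantages: List[float], state_value: float) -> List[float]:
    """Combine per-action advantages A and a scalar state value V into Q-values.

    Hypothetical helper for illustration only; it mirrors the formula
    Q = (A - mean[A]) + V from the docstring, not any library function.
    """
    mean_advantage = sum(advantages) / len(advantages)
    # Centering the advantages makes the mean of the Q-values equal to V.
    return [(a - mean_advantage) + state_value for a in advantages]


q_values = dueling_q_values([1.0, 2.0, 3.0], state_value=10.0)
print(q_values)  # [9.0, 10.0, 11.0]
```

Because the advantages are centered, the ranking of actions comes entirely from the A branch, while the V branch shifts all Q-values by the same amount.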

Note that the core layers for forward() are not defined here; this only defines the layers for the Q head. The layers for forward() should be defined in subclasses of DistributionalQModel.
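To make the num_atoms, v_min, and v_max options listed above more concrete, the following sketch shows one common way a distributional Q head is interpreted: the support is a set of evenly spaced return values between v_min and v_max, and the Q-value is the expectation of those values under a softmax over the head's logits. This is an assumption about the general technique, not a copy of this model's code; support_atoms and expected_q are hypothetical names.

```python
import math
from typing import List


def support_atoms(v_min: float, v_max: float, num_atoms: int) -> List[float]:
    """Evenly spaced return values spanning [v_min, v_max] (hypothetical helper)."""
    if num_atoms < 2:
        raise ValueError("a distributional head needs num_atoms > 1")
    step = (v_max - v_min) / (num_atoms - 1)
    return [v_min + i * step for i in range(num_atoms)]


def expected_q(logits: List[float], atoms: List[float]) -> float:
    """Softmax the logits into atom probabilities, then take the expectation."""
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return sum((w / total) * z for w, z in zip(weights, atoms))


atoms = support_atoms(v_min=-10.0, v_max=10.0, num_atoms=5)
print(atoms)  # [-10.0, -5.0, 0.0, 5.0, 10.0]

# Uniform logits put equal mass on every atom, so the expectation is the
# midpoint of the support.
print(expected_q([0.0] * 5, atoms))
```

With num_atoms set to 1 there is no distribution to average over, which matches the docstring's statement that distributional DQN is only enabled when num_atoms is greater than 1.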

Methods

context

Returns a contextmanager for the current TF graph.

custom_loss

Override to customize the loss function used to optimize this model.

forward

Tensorflow/Keras style forward method.

forward_vertex

Forward function computing the evaluation of vertex observations.

from_batch

get_initial_state

Get the initial recurrent state values for the model.

get_q_value_distributions

Returns distributional values for Q(s, a) given a state embedding.

get_state_value

Returns the state value prediction for the given state embedding.

import_from_h5

Imports weights from an h5 file.

is_time_major

If True, data for calling this ModelV2 must be in time-major format.

last_output

Returns the last output returned from calling the model.

metrics

Override to return custom metrics from your model.

register_variables

Register the given list of variables with this model.

trainable_variables

Returns the list of trainable variables for this model.

update_ops

Return the list of update ops for this model.

value_function

Returns a tensor of current state values.

variables

Returns the list (or a dict) of variables for this model.

Parameters

hidden_dim (int) –