graphenv.examples.hallway.hallway_model.HallwayQModel

class HallwayQModel(*args, hidden_dim=1, **kwargs)

Bases: graphenv.examples.hallway.hallway_model.BaseHallwayModel, ray.rllib.algorithms.dqn.distributional_q_tf_model.DistributionalQTFModel

Initialize variables of this model.

Extra model kwargs:

q_hiddens (List[int]): List of layer sizes after the advantage (A)/value (V) split. Each of the A and V branches will have this structure of Dense layers. To define the network before the A/V split, use config["model"]["fcnet_hiddens"] as usual.
dueling: Whether to build the advantage (A)/value (V) heads for dueling DQN. If True, Q-values are calculated as Q = (A - mean[A]) + V. If False, the raw NN output is interpreted as Q-values.
num_atoms: If >1, enables distributional DQN.
use_noisy: Use noisy nets.
v_min: Min value support for distributional DQN.
v_max: Max value support for distributional DQN.
sigma0 (float): Initial value of noisy layers.
add_layer_norm: Enable layer norm (for param noise).
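The dueling formula and the distributional parameters above can be illustrated with a small, framework-free sketch. It is not the RLlib implementation: the helper names `dueling_q_values`, `atom_support`, and `expected_q` are hypothetical, and the evenly spaced support between v_min and v_max is the usual distributional DQN convention, assumed here rather than taken from this page.

```python
from typing import List


def dueling_q_values(advantages: List[float], value: float) -> List[float]:
    """Combine per-action advantages A and a state value V as Q = (A - mean[A]) + V."""
    mean_adv = sum(advantages) / len(advantages)
    return [(a - mean_adv) + value for a in advantages]


def atom_support(num_atoms: int, v_min: float, v_max: float) -> List[float]:
    """Return num_atoms evenly spaced return values from v_min to v_max.

    Assumption: evenly spaced atoms, as is conventional for distributional DQN.
    """
    if num_atoms < 2:
        raise ValueError("a distributional support needs at least 2 atoms")
    step = (v_max - v_min) / (num_atoms - 1)
    return [v_min + i * step for i in range(num_atoms)]


def expected_q(probs: List[float], support: List[float]) -> float:
    """Collapse one action's atom probabilities into a scalar Q-value."""
    return sum(p * z for p, z in zip(probs, support))


# Dueling: three actions with advantages 1, 2, 3 and a state value of 10.
q_values = dueling_q_values([1.0, 2.0, 3.0], value=10.0)

# Distributional: 5 atoms spanning [-10, 10]; all mass on the last atom.
support = atom_support(num_atoms=5, v_min=-10.0, v_max=10.0)
q_from_distribution = expected_q([0.0, 0.0, 0.0, 0.0, 1.0], support)
```

Subtracting the mean advantage keeps the A and V branches identifiable: adding a constant to every advantage leaves the resulting Q-values unchanged.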

Note that the core layers for forward() are not defined here; this class only defines the layers for the Q head. The layers for forward() should be defined in subclasses of DistributionalQModel.

Methods

`context`	Returns a contextmanager for the current TF graph.
`custom_loss`	Override to customize the loss function used to optimize this model.
`forward`	Tensorflow/Keras style forward method.
`forward_vertex`	Forward function computing the evaluation of vertex observations.
`from_batch`
`get_initial_state`	Get the initial recurrent state values for the model.
`get_q_value_distributions`	Returns distributional values for Q(s, a) given a state embedding.
`get_state_value`	Returns the state value prediction for the given state embedding.
`import_from_h5`	Imports weights from an h5 file.
`is_time_major`	If True, data for calling this ModelV2 must be in time-major format.
`last_output`	Returns the last output returned from calling the model.
`metrics`	Override to return custom metrics from your model.
`register_variables`	Register the given list of variables with this model.
`trainable_variables`	Returns the list of trainable variables for this model.
`update_ops`	Return the list of update ops for this model.
`value_function`	Returns a tensor of current state values.
`variables`	Returns the list (or a dict) of variables for this model.

Parameters: hidden_dim (int) –