graphenv.examples.hallway.hallway_model.HallwayQModel
- class HallwayQModel(*args, hidden_dim=1, **kwargs)
Bases: graphenv.examples.hallway.hallway_model.BaseHallwayModel, ray.rllib.algorithms.dqn.distributional_q_tf_model.DistributionalQTFModel
Initialize variables of this model.
- Extra model kwargs:
- q_hiddens (List[int]): List of layer sizes after(!) the Advantages(A)/Value(V) split. Hence, each of the A and V branches will have this structure of Dense layers. To define the NN before this A/V split, use, as always, config["model"]["fcnet_hiddens"].
- dueling: Whether to build the advantage(A)/value(V) heads for DDQN. If True, Q-values are calculated as: Q = (A - mean[A]) + V. If False, raw NN output is interpreted as Q-values.
- num_atoms: If >1, enables distributional DQN.
- use_noisy: Use noisy nets.
- v_min: Min value support for distributional DQN.
- v_max: Max value support for distributional DQN.
- sigma0 (float): Initial value of noisy layers.
- add_layer_norm: Enable layer norm (for param noise).
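As a rough illustration of the dueling option described above, the aggregation Q = (A - mean[A]) + V can be sketched in plain Python. The helper name `dueling_q_values` and the sample numbers are hypothetical, chosen only for this example; the model itself computes this on TensorFlow tensors inside the Q head.

```python
def dueling_q_values(advantages, state_value):
    """Combine advantage and value head outputs: Q = (A - mean[A]) + V.

    `advantages` holds one advantage estimate per action and
    `state_value` is the scalar output of the value head.
    This helper is illustrative and not part of graphenv or RLlib.
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [(a - mean_advantage) + state_value for a in advantages]


# Three actions with made-up advantage estimates and a made-up state value.
advantages = [1.0, 2.0, 3.0]
state_value = 10.0
q_values = dueling_q_values(advantages, state_value)
print(q_values)  # [9.0, 10.0, 11.0]
```

Subtracting the mean advantage centers the A branch, so the V branch alone carries the overall level of the Q-values.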
Note that the core layers for forward() are not defined here; this class only defines the layers for the Q head. The layers for forward() should be defined in subclasses of DistributionalQModel.
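To make the num_atoms, v_min, and v_max kwargs more concrete, here is a minimal sketch of how a distributional Q head is commonly interpreted: the value range is split into evenly spaced atoms, and the Q-value is the expectation of the predicted distribution over those atoms. The even spacing, the helper names `support_atoms` and `expected_q`, and the sample probabilities are assumptions made for illustration, not code taken from this class.

```python
def support_atoms(v_min, v_max, num_atoms):
    """Return num_atoms evenly spaced support values in [v_min, v_max].

    Assumes num_atoms > 1, which is the case that enables
    distributional DQN.
    """
    step = (v_max - v_min) / (num_atoms - 1)
    return [v_min + i * step for i in range(num_atoms)]


def expected_q(probabilities, atoms):
    """Collapse a per-atom probability distribution to a scalar Q-value."""
    return sum(p * z for p, z in zip(probabilities, atoms))


# Hypothetical settings: 5 atoms spanning [-10, 10].
atoms = support_atoms(-10.0, 10.0, 5)
# Made-up distribution for one action; the probabilities sum to 1.
probs = [0.0, 0.0, 0.5, 0.25, 0.25]
q = expected_q(probs, atoms)
print(atoms)  # [-10.0, -5.0, 0.0, 5.0, 10.0]
print(q)      # 3.75
```

With num_atoms equal to 1 there is no distribution to average over, and the head's output is read directly as the Q-value.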
Methods
context
Returns a contextmanager for the current TF graph.
custom_loss
Override to customize the loss function used to optimize this model.
forward
TensorFlow/Keras-style forward method.
forward_vertex
Forward function computing the evaluation of vertex observations.
from_batch
get_initial_state
Get the initial recurrent state values for the model.
get_q_value_distributions
Returns distributional values for Q(s, a) given a state embedding.
get_state_value
Returns the state value prediction for the given state embedding.
import_from_h5
Imports weights from an h5 file.
is_time_major
If True, data for calling this ModelV2 must be in time-major format.
last_output
Returns the last output returned from calling the model.
metrics
Override to return custom metrics from your model.
register_variables
Register the given list of variables with this model.
trainable_variables
Returns the list of trainable variables for this model.
update_ops
Return the list of update ops for this model.
value_function
Returns a tensor of current state values.
variables
Returns the list (or a dict) of variables for this model.
- Parameters
hidden_dim (int) –