graphenv.examples.hallway.hallway_model_torch.TorchHallwayQModel

class TorchHallwayQModel(*args, hidden_dim=1, **kwargs)[source]

Bases: graphenv.examples.hallway.hallway_model_torch.TorchHallwayModel, ray.rllib.algorithms.dqn.dqn_torch_model.DQNTorchModel

Initialize variables of this model.

Extra model kwargs:

q_hiddens (Sequence[int]): List of layer-sizes after(!) the: Advantages(A)/Value(V)-split. Hence, each of the A- and V- branches will have this structure of Dense layers. To define the NN before this A/V-split, use - as always - config[“model”][“fcnet_hiddens”].
dueling: Whether to build the advantage(A)/value(V) heads: for DDQN. If True, Q-values are calculated as: Q = (A - mean[A]) + V. If False, raw NN output is interpreted as Q-values.
dueling_activation: The activation to use for all dueling: layers (A- and V-branch). One of “relu”, “tanh”, “linear”.

num_atoms: If >1, enables distributional DQN. use_noisy: Use noisy layers. v_min: Min value support for distributional DQN. v_max: Max value support for distributional DQN. sigma0 (float): Initial value of noisy layers. add_layer_norm: Enable layer norm (for param noise).

Methods

`add_module`	Adds a child module to the current module.
`apply`	Applies `fn` recursively to every submodule (as returned by `.children()`) as well as self.
`bfloat16`	Casts all floating point parameters and buffers to `bfloat16` datatype.
`buffers`	Returns an iterator over module buffers.
`children`	Returns an iterator over immediate children modules.
`context`	Returns a contextmanager for the current forward pass.
`cpu`	Moves all model parameters and buffers to the CPU.
`cuda`	Moves all model parameters and buffers to the GPU.
`custom_loss`	Override to customize the loss function used to optimize this model.
`double`	Casts all floating point parameters and buffers to `double` datatype.
`eval`	Sets the module in evaluation mode.
`extra_repr`	Set the extra representation of the module
`float`	Casts all floating point parameters and buffers to `float` datatype.
`forward`	Tensorflow/Keras style forward method.
`forward_vertex`	Forward function returning a value and weight tensor for the vertices observed via input_dict (a dict of tensors for each vertex property)
`from_batch`
`get_buffer`	Returns the buffer given by `target` if it exists, otherwise throws an error.
`get_extra_state`	Returns any extra state to include in the module's state_dict.
`get_initial_state`	Get the initial recurrent state values for the model.
`get_parameter`	Returns the parameter given by `target` if it exists, otherwise throws an error.
`get_q_value_distributions`	Returns distributional values for Q(s, a) given a state embedding.
`get_state_value`	Returns the state value prediction for the given state embedding.
`get_submodule`	Returns the submodule given by `target` if it exists, otherwise throws an error.
`half`	Casts all floating point parameters and buffers to `half` datatype.
`import_from_h5`	Imports weights from an h5 file.
`ipu`	Moves all model parameters and buffers to the IPU.
`is_time_major`	If True, data for calling this ModelV2 must be in time-major format.
`last_output`	Returns the last output returned from calling the model.
`load_state_dict`	Copies parameters and buffers from `state_dict` into this module and its descendants.
`metrics`	Override to return custom metrics from your model.
`modules`	Returns an iterator over all modules in the network.
`named_buffers`	Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
`named_children`	Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
`named_modules`	Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
`named_parameters`	Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
`parameters`	Returns an iterator over module parameters.
`register_backward_hook`	Registers a backward hook on the module.
`register_buffer`	Adds a buffer to the module.
`register_forward_hook`	Registers a forward hook on the module.
`register_forward_pre_hook`	Registers a forward pre-hook on the module.
`register_full_backward_hook`	Registers a backward hook on the module.
`register_full_backward_pre_hook`	Registers a backward pre-hook on the module.
`register_load_state_dict_post_hook`	Registers a post hook to be run after module's `load_state_dict` is called.
`register_module`	Alias for `add_module()`.
`register_parameter`	Adds a parameter to the module.
`register_state_dict_pre_hook`	These hooks will be called with arguments: `self`, `prefix`, and `keep_vars` before calling `state_dict` on `self`.
`requires_grad_`	Change if autograd should record operations on parameters in this module.
`set_extra_state`	This function is called from `load_state_dict()` to handle any extra state found within the state_dict.
`share_memory`	See `torch.Tensor.share_memory_()`
`state_dict`	Returns a dictionary containing references to the whole state of the module.
`to`	Moves and/or casts the parameters and buffers.
`to_empty`	Moves the parameters and buffers to the specified device without copying storage.
`train`	Sets the module in training mode.
`trainable_variables`	Returns the list of trainable variables for this model.
`type`	Casts all parameters and buffers to `dst_type`.
`value_function`	returns A tensor of current state values.
`variables`	Returns the list (or a dict) of variables for this model.
`xpu`	Moves all model parameters and buffers to the XPU.
`zero_grad`	Sets gradients of all model parameters to zero.

Attributes

`T_destination`	alias of TypeVar('T_destination', bound=`Dict`[`str`, `Any`])
`call_super_init`
`dump_patches`

Parameters: hidden_dim (int) –