graphenv.examples.hallway.hallway_model_torch.TorchHallwayQModel
- class TorchHallwayQModel(*args, hidden_dim=1, **kwargs)
Bases: graphenv.examples.hallway.hallway_model_torch.TorchHallwayModel, ray.rllib.algorithms.dqn.dqn_torch_model.DQNTorchModel
Initialize variables of this model.
- Extra model kwargs:
- q_hiddens (Sequence[int]): List of layer sizes after(!) the advantages (A)/value (V) split. Each of the A and V branches will therefore have this structure of Dense layers. To define the NN before this A/V split, use, as always, config["model"]["fcnet_hiddens"].
- dueling: Whether to build the advantage (A)/value (V) heads for DDQN. If True, Q-values are calculated as: Q = (A - mean[A]) + V. If False, raw NN output is interpreted as Q-values.
- dueling_activation: The activation to use for all dueling layers (A and V branch). One of "relu", "tanh", "linear".
- num_atoms: If >1, enables distributional DQN.
- use_noisy: Use noisy layers.
- v_min: Min value support for distributional DQN.
- v_max: Max value support for distributional DQN.
- sigma0 (float): Initial value of noisy layers.
- add_layer_norm: Enable layer norm (for param noise).
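The arithmetic behind the dueling and distributional kwargs can be illustrated with a small sketch. It is written in plain Python rather than torch so that it runs without dependencies, and it is not the RLlib implementation. The helper names (dueling_q_values, support_atoms, expected_q) and the sample numbers are hypothetical, and the evenly spaced support between v_min and v_max assumes the usual categorical formulation of distributional DQN.

```python
from typing import List, Sequence


def dueling_q_values(advantages: Sequence[float], value: float) -> List[float]:
    """Combine one state's advantage and value outputs: Q = (A - mean[A]) + V."""
    mean_advantage = sum(advantages) / len(advantages)
    return [(a - mean_advantage) + value for a in advantages]


def support_atoms(v_min: float, v_max: float, num_atoms: int) -> List[float]:
    """Evenly spaced return values from v_min to v_max (requires num_atoms > 1)."""
    step = (v_max - v_min) / (num_atoms - 1)
    return [v_min + i * step for i in range(num_atoms)]


def expected_q(probabilities: Sequence[float], atoms: Sequence[float]) -> float:
    """Expected Q-value of one action under a categorical return distribution."""
    return sum(p * z for p, z in zip(probabilities, atoms))


# Hypothetical numbers for a single state with three actions.
q_values = dueling_q_values(advantages=[1.0, 2.0, 3.0], value=10.0)
print(q_values)  # [9.0, 10.0, 11.0]

# Hypothetical support with num_atoms=5 between v_min=-10 and v_max=10.
atoms = support_atoms(v_min=-10.0, v_max=10.0, num_atoms=5)
print(atoms)  # [-10.0, -5.0, 0.0, 5.0, 10.0]
print(expected_q([0.0, 0.0, 0.5, 0.5, 0.0], atoms))  # 2.5
```

Subtracting the mean advantage leaves the ranking of actions unchanged and makes the split between A and V identifiable, which is why the value head alone shifts every Q-value by the same amount.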
Methods
add_module
Adds a child module to the current module.
apply
Applies fn recursively to every submodule (as returned by .children()) as well as self.
bfloat16
Casts all floating point parameters and buffers to bfloat16 datatype.
buffers
Returns an iterator over module buffers.
children
Returns an iterator over immediate children modules.
context
Returns a contextmanager for the current forward pass.
cpu
Moves all model parameters and buffers to the CPU.
cuda
Moves all model parameters and buffers to the GPU.
custom_loss
Override to customize the loss function used to optimize this model.
double
Casts all floating point parameters and buffers to double datatype.
eval
Sets the module in evaluation mode.
extra_repr
Sets the extra representation of the module.
float
Casts all floating point parameters and buffers to float datatype.
forward
Tensorflow/Keras style forward method.
forward_vertex
Forward function returning a value and weight tensor for the vertices observed via input_dict (a dict of tensors for each vertex property).
from_batch
get_buffer
Returns the buffer given by target if it exists, otherwise throws an error.
get_extra_state
Returns any extra state to include in the module's state_dict.
get_initial_state
Get the initial recurrent state values for the model.
get_parameter
Returns the parameter given by target if it exists, otherwise throws an error.
get_q_value_distributions
Returns distributional values for Q(s, a) given a state embedding.
get_state_value
Returns the state value prediction for the given state embedding.
get_submodule
Returns the submodule given by target if it exists, otherwise throws an error.
half
Casts all floating point parameters and buffers to half datatype.
import_from_h5
Imports weights from an h5 file.
ipu
Moves all model parameters and buffers to the IPU.
is_time_major
If True, data for calling this ModelV2 must be in time-major format.
last_output
Returns the last output returned from calling the model.
load_state_dict
Copies parameters and buffers from state_dict into this module and its descendants.
metrics
Override to return custom metrics from your model.
modules
Returns an iterator over all modules in the network.
named_buffers
Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
named_children
Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
named_modules
Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
named_parameters
Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
parameters
Returns an iterator over module parameters.
register_backward_hook
Registers a backward hook on the module.
register_buffer
Adds a buffer to the module.
register_forward_hook
Registers a forward hook on the module.
register_forward_pre_hook
Registers a forward pre-hook on the module.
register_full_backward_hook
Registers a backward hook on the module.
register_full_backward_pre_hook
Registers a backward pre-hook on the module.
register_load_state_dict_post_hook
Registers a post hook to be run after module's load_state_dict is called.
register_module
Alias for add_module().
register_parameter
Adds a parameter to the module.
register_state_dict_pre_hook
These hooks will be called with arguments: self, prefix, and keep_vars before calling state_dict on self.
requires_grad_
Change if autograd should record operations on parameters in this module.
set_extra_state
This function is called from load_state_dict() to handle any extra state found within the state_dict.
share_memory
See torch.Tensor.share_memory_().
state_dict
Returns a dictionary containing references to the whole state of the module.
to
Moves and/or casts the parameters and buffers.
to_empty
Moves the parameters and buffers to the specified device without copying storage.
train
Sets the module in training mode.
trainable_variables
Returns the list of trainable variables for this model.
type
Casts all parameters and buffers to dst_type.
value_function
Returns a tensor of current state values.
variables
Returns the list (or a dict) of variables for this model.
xpu
Moves all model parameters and buffers to the XPU.
zero_grad
Sets gradients of all model parameters to zero.
Attributes
T_destination
alias of TypeVar('T_destination', bound=Dict[str, Any])
call_super_init
dump_patches
- Parameters
hidden_dim (int) –