graphenv.examples.tsp.tsp_model.TSPQModel

class TSPQModel(*args, num_nodes, hidden_dim=32, embed_dim=32, **kwargs)

Bases: graphenv.examples.tsp.tsp_model.BaseTSPModel, ray.rllib.algorithms.dqn.distributional_q_tf_model.DistributionalQTFModel

Initialize variables of this model.

Extra model kwargs:
  • q_hiddens (List[int]) – List of layer sizes after(!) the Advantages(A)/Value(V) split. Hence, each of the A and V branches will have this structure of Dense layers. To define the NN before this A/V split, use, as always, config["model"]["fcnet_hiddens"].

  • dueling – Whether to build the advantage(A)/value(V) heads for DDQN. If True, Q-values are calculated as: Q = (A - mean[A]) + V. If False, raw NN output is interpreted as Q-values.

  • num_atoms – If >1, enables distributional DQN.

  • use_noisy – Use noisy nets.

  • v_min – Min value support for distributional DQN.

  • v_max – Max value support for distributional DQN.

  • sigma0 (float) – Initial value of noisy layers.

  • add_layer_norm – Enable layer norm (for param noise).
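
The dueling combination Q = (A - mean[A]) + V described above can be sketched in plain Python. This is only an illustration of the arithmetic for a single state, not the model's actual TensorFlow code; the function name `dueling_q_values` and the sample numbers are hypothetical.

```python
from statistics import fmean


def dueling_q_values(advantages, state_value):
    """Combine per-action advantages A and a scalar state value V.

    Implements Q = (A - mean[A]) + V for one state.
    (Hypothetical helper, for illustration only.)
    """
    mean_advantage = fmean(advantages)
    # Centering the advantages makes the A/V decomposition identifiable:
    # the mean of the resulting Q-values equals V.
    return [(a - mean_advantage) + state_value for a in advantages]


# Three candidate actions with advantages 1, 2, 3 and a state value of 10.
q_values = dueling_q_values([1.0, 2.0, 3.0], 10.0)
print(q_values)  # [9.0, 10.0, 11.0]
```

Because the advantages are mean-centered, the average of the returned Q-values equals the state value V.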

Note that the core layers for forward() are not defined here; this class only defines the layers for the Q head. The layers for forward() should be defined in subclasses of DistributionalQModel.
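
To make the roles of num_atoms, v_min, and v_max more concrete, the following sketch assumes the usual categorical (C51-style) setup: the value support is num_atoms evenly spaced points between v_min and v_max, and a Q-value is the expectation of those points under the predicted probabilities. The helper names `support_atoms` and `expected_q` are hypothetical and are not part of this class.

```python
def support_atoms(num_atoms, v_min, v_max):
    """Return num_atoms evenly spaced support values in [v_min, v_max].

    Assumes the standard categorical-DQN layout; hypothetical helper.
    """
    if num_atoms < 2:
        raise ValueError("a distributional support needs at least 2 atoms")
    step = (v_max - v_min) / (num_atoms - 1)
    return [v_min + i * step for i in range(num_atoms)]


def expected_q(probabilities, atoms):
    """Collapse a distribution over atoms into a scalar Q-value."""
    return sum(p * z for p, z in zip(probabilities, atoms))


atoms = support_atoms(num_atoms=5, v_min=-10.0, v_max=10.0)
print(atoms)  # [-10.0, -5.0, 0.0, 5.0, 10.0]

# Probability mass split evenly between the atoms at 0 and 5.
q = expected_q([0.0, 0.0, 0.5, 0.5, 0.0], atoms)
print(q)  # 2.5
```

Choosing v_min and v_max to bracket the range of achievable returns matters in this setup, since probability mass cannot be placed outside the support.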

Methods

context

Returns a contextmanager for the current TF graph.

custom_loss

Override to customize the loss function used to optimize this model.

forward

Tensorflow/Keras style forward method.

forward_vertex

Forward function returning a value tensor and a weight tensor for the vertices observed via input_dict (a dict of tensors, one per vertex property).

from_batch

get_initial_state

Get the initial recurrent state values for the model.

get_q_value_distributions

Returns distributional values for Q(s, a) given a state embedding.

get_state_value

Returns the state value prediction for the given state embedding.

import_from_h5

Imports weights from an h5 file.

is_time_major

If True, data for calling this ModelV2 must be in time-major format.

last_output

Returns the last output returned from calling the model.

metrics

Override to return custom metrics from your model.

register_variables

Register the given list of variables with this model.

trainable_variables

Returns the list of trainable variables for this model.

update_ops

Return the list of update ops for this model.

value_function

Returns a tensor of current state values.

variables

Returns the list (or a dict) of variables for this model.

Parameters
  • num_nodes (int) –

  • hidden_dim (int) –

  • embed_dim (int) –
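
As a rough sketch of how these parameters might be supplied, the fragment below builds an RLlib-style model config in which the keyword arguments are passed through `custom_model_config`. It assumes the model has been registered under the name "TSPQModel" (a hypothetical registration name), and the value 10 for num_nodes is an arbitrary example; check both against your own setup.

```python
# Hypothetical model config fragment; values are illustrative only.
tsp_model_config = {
    # Assumed name under which the model class was registered with RLlib.
    "custom_model": "TSPQModel",
    # Keyword arguments forwarded to the model constructor.
    "custom_model_config": {
        "num_nodes": 10,   # number of TSP nodes (example value)
        "hidden_dim": 32,  # default shown in the class signature
        "embed_dim": 32,   # default shown in the class signature
    },
}

print(sorted(tsp_model_config["custom_model_config"]))
```

Only num_nodes has no default in the signature, so it is the one argument that must always be provided.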