Deep Q Network Learning

Classes and Functions

class dqn.Agent(state_size, action_size, seed)

Bases: object

Interacts with and learns from the environment.
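The typical interaction loop pairs `act` and `step`: the agent picks an action, the environment returns a transition, and the transition is handed back to the agent. Below is a hedged, self-contained sketch of that loop; `_ToyEnv` and `_RandomAgent` are hypothetical stand-ins for a real Gym-style environment and `dqn.Agent`, not part of this module.

```python
import random

class _ToyEnv:
    """Hypothetical stand-in environment: episode ends after 5 steps."""
    def reset(self):
        self.t = 0
        return [0.0]

    def step(self, action):
        self.t += 1
        # (next_state, reward, done, info) -- the Gym-style convention
        return [float(self.t)], 1.0, self.t >= 5, {}

class _RandomAgent:
    """Hypothetical stand-in for dqn.Agent: acts randomly, records transitions."""
    def __init__(self, action_size):
        self.action_size = action_size
        self.transitions = []

    def act(self, state, eps=0.0):
        return random.randrange(self.action_size)

    def step(self, state, action, reward, next_state, done):
        self.transitions.append((state, action, reward, next_state, done))

env, agent = _ToyEnv(), _RandomAgent(action_size=2)
state = env.reset()
done = False
while not done:
    action = agent.act(state, eps=0.1)
    next_state, reward, done, _ = env.step(action)
    agent.step(state, action, reward, next_state, done)
    state = next_state
```

With the real `dqn.Agent`, `step` would additionally store the transition in the replay buffer and periodically trigger `learn`.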

act(state, eps=0)

Returns an action for the given state according to the current policy.

Params
======

state (array_like): current state
eps (float): epsilon, for epsilon-greedy action selection
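Epsilon-greedy selection means: with probability `1 - eps` take the action with the highest estimated value, otherwise take a uniformly random action. A minimal stdlib sketch of that rule (the function name `epsilon_greedy` is illustrative, not part of the module):

```python
import random

def epsilon_greedy(q_values, eps):
    """Pick the greedy action with probability 1 - eps, else explore."""
    if random.random() > eps:
        # exploit: index of the largest action value
        return max(range(len(q_values)), key=lambda a: q_values[a])
    # explore: uniform random action
    return random.randrange(len(q_values))

action = epsilon_greedy([0.1, 0.9, 0.3], eps=0.0)  # greedy -> action 1
```

Setting `eps=0` makes `act` fully greedy, which is why it is the sensible default for evaluation but not for training.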

learn(experiences, gamma)

Update value parameters using the given batch of experience tuples.

Params
======

experiences (Tuple[torch.Variable]): tuple of (s, a, r, s', done) tuples
gamma (float): discount factor
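The core of a DQN `learn` step is the one-step TD target, `r + gamma * max_a' Q_target(s', a')`, with the bootstrap term zeroed out on terminal transitions. A scalar sketch of that target (the batched implementation applies the same formula element-wise over tensors):

```python
def td_target(reward, gamma, max_next_q, done):
    """One-step TD target: r + gamma * max_a' Q_target(s', a') * (1 - done)."""
    return reward + gamma * max_next_q * (1 - done)

td_target(1.0, 0.99, 2.0, done=0)  # 2.98
td_target(1.0, 0.99, 2.0, done=1)  # 1.0 (terminal state: no bootstrap)
```

The local network's prediction `Q_local(s, a)` is then regressed toward this target, typically with an MSE loss.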

soft_update(local_model, target_model, tau)

Soft update model parameters:

    θ_target = τ*θ_local + (1 - τ)*θ_target

Params
======

local_model (PyTorch model): weights will be copied from
target_model (PyTorch model): weights will be copied to
tau (float): interpolation parameter
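The update is a plain element-wise interpolation between the two parameter sets. A stdlib sketch over lists of floats (the real method iterates over the models' `parameters()` and writes into the target tensors in place):

```python
def soft_update(local_params, target_params, tau):
    """theta_target <- tau * theta_local + (1 - tau) * theta_target, element-wise."""
    return [tau * l + (1 - tau) * t for l, t in zip(local_params, target_params)]

soft_update([1.0, 2.0], [0.0, 0.0], tau=0.1)  # [0.1, 0.2]
```

A small `tau` makes the target network trail the local network slowly, which stabilizes the bootstrapped targets used in `learn`.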

step(state, action, reward, next_step, done)
class dqn.QNetwork(state_size, action_size, seed, fc1_unit=64, fc2_unit=64)

Bases: torch.nn.modules.module.Module

Actor (Policy) Model.

forward(x)

Build a network that maps state -> action values.

training: bool
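Given the constructor signature, the network is plausibly a two-hidden-layer MLP: state -> fc1 (ReLU) -> fc2 (ReLU) -> one output per action. A pure-Python sketch of that forward pass, with hand-rolled `linear` and `relu` standing in for `torch.nn.Linear` and `torch.relu` (the parameter layout here is illustrative):

```python
def relu(v):
    """Element-wise max(0, x)."""
    return [max(0.0, x) for x in v]

def linear(x, W, b):
    """y = W x + b, where W is a list of weight rows."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def forward(state, params):
    """state -> fc1 (ReLU) -> fc2 (ReLU) -> action values."""
    h1 = relu(linear(state, *params["fc1"]))
    h2 = relu(linear(h1, *params["fc2"]))
    return linear(h2, *params["out"])

# Identity weights so the data flow is easy to follow.
identity = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
params = {"fc1": identity, "fc2": identity, "out": identity}
forward([1.0, -2.0], params)  # [1.0, 0.0] -- the -2.0 is clipped by ReLU
```

Note the output layer has no activation: Q-values are unbounded regression targets, so the final `linear` is returned raw.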
class dqn.ReplayBuffer(action_size, buffer_size, batch_size, seed)

Bases: object

Fixed-size buffer to store experience tuples.

add(state, action, reward, next_state, done)

Add a new experience to memory.

sample()

Randomly sample a batch of experiences from memory.
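A fixed-size buffer with these methods maps naturally onto `collections.deque` with `maxlen` (oldest experiences are evicted automatically) plus `random.sample` for uniform sampling. A minimal stdlib sketch of the same interface, without the tensor conversion the real class would do before handing a batch to `learn`:

```python
import random
from collections import deque, namedtuple

Experience = namedtuple("Experience",
                        ["state", "action", "reward", "next_state", "done"])

class ReplayBuffer:
    """Fixed-size buffer: old experiences are evicted once capacity is reached."""

    def __init__(self, buffer_size, batch_size, seed=0):
        self.memory = deque(maxlen=buffer_size)  # eviction handled by deque
        self.batch_size = batch_size
        random.seed(seed)

    def add(self, state, action, reward, next_state, done):
        """Add a new experience to memory."""
        self.memory.append(Experience(state, action, reward, next_state, done))

    def sample(self):
        """Randomly sample a batch of experiences, uniformly without replacement."""
        return random.sample(self.memory, k=self.batch_size)

    def __len__(self):
        return len(self.memory)
```

Sampling uniformly from a large buffer breaks the temporal correlation between consecutive transitions, which is the main reason DQN uses a replay buffer at all.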