# Deep value-based RL and DQN

#### Motivation

In the introduction, we saw value-based RL algorithms (specifically Q-learning) in the tabular setting, where we keep a separate Q-value for each $(s, a)$ pair. To scale to large or continuous state spaces, we need to generalize across states using a function approximator, such as a neural network. This week we will see how Q-learning can be modified to support function approximation and read the influential paper from DeepMind introducing the deep Q-network (DQN) algorithm.

#### Topics

• Q-learning with function approximation

• Experience replay

• $\varepsilon$-greedy exploration

#### Questions

• What are the potential problems with Q-learning when we introduce function approximation?

• Why might experience replay improve the performance of DQN?

• Is the DQN algorithm more similar to Q-learning or value iteration? Why?

• Download and run the PyTorch DQN tutorial linked in the optional reading list to get an intuition for how the algorithm works.