Intro to Reinforcement Learning

30 Jan 2020

A brief introduction to the reinforcement learning problem, including Markov decision processes (MDPs), policy search, and value estimation. Covered in the intro lecture. Lecture notes

Topics

Sequential decision making under uncertainty
Markov decision processes (MDPs)
Challenges: credit assignment, exploration
Policy search:
- Ways to represent policies
- Ways to search for policies
- Policy gradients
Value estimation:
- Value function and Q function
- Bellman equation and Bellman optimality equation
- Dynamic programming and value iteration
- TD learning and Q-learning
- Policy iteration
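
As a rough illustration of the value estimation topics above, here is a minimal sketch of value iteration, which repeatedly applies the Bellman optimality backup V(s) <- max_a [R(s, a) + gamma * sum_s' P(s' | s, a) V(s')] until the values stop changing. The two-state MDP, the array layout (`P[a, s, s2]`, `R[s, a]`), and the function name `value_iteration` are assumptions made up for this example; they are not taken from the lecture.

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP (all numbers are made up for illustration).
# P[a, s, s2] = probability of moving from state s to state s2 under action a.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay in the current state
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch to the other state
])
# R[s, a] = expected immediate reward for taking action a in state s.
R = np.array([
    [0.0, 0.0],
    [1.0, 0.0],  # staying in state 1 pays a reward of 1
])
gamma = 0.9  # discount factor


def value_iteration(P, R, gamma, tol=1e-8, max_iters=10_000):
    """Return approximate optimal state values and a greedy policy."""
    V = np.zeros(R.shape[0])
    for _ in range(max_iters):
        # Q[s, a] = R[s, a] + gamma * sum over s2 of P[a, s, s2] * V[s2]
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)  # Bellman optimality backup
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Greedy policy with respect to the final value estimate.
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    return V, Q.argmax(axis=1)


V, policy = value_iteration(P, R, gamma)
print("V:", V)
print("policy:", policy)
```

In this toy problem the greedy policy switches out of state 0 and then stays in state 1, so the values approach 9 and 10 respectively. Value iteration needs the transition model P and reward function R; TD learning and Q-learning estimate the same quantities from sampled transitions instead.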

Useful resources

The Sutton and Barto textbook
GitHub repo from Denny Britz with notebooks covering examples from the book
Courses from Emma Brunskill (for more theory), Sergey Levine (for more deep RL/control), and David Silver (for somewhere in between)
Lilian Weng’s peek into reinforcement learning blog post
Gridworld visualization from Andrej Karpathy