Introduction: Reinforcement Learning
Some RL success stories
Welcome to Reinforcement Learning
- An example: OpenAI Gym
- Using a different policy
- Your turn
Markov Decision Processes
- Markov chains
- Markov Reward Process
- Markov Decision Processes
- Solving MDPs: Value and Policy Iteration
Chapter 3: Monte Carlo Methods
- Monte Carlo Learning
- Off-policy MC control
Chapter 4: Temporal Difference Learning
- Code sample: SARSA
Chapter 5: Function Approximation
- Gradient descent
- Feature vectors
- Function backups
- Code sample: SARSA with linear function approximation
Chapter 6: Experience Replay
- Improvements since the original DQN
- Code sample: Q-Learning with experience replay (Linear Approximator)
- Code sample: Q-Learning with experience replay (Neural Network Approximator)
Chapter 7: Policy Gradients & Policy Optimisation
- Derivative Free Methods
- Code sample: NES for FrozenLake
- Code sample: Cross entropy method for CartPole