Reinforcement Learning Lecture - WS 2024/25

This lecture is part of the Machine Learning Masters program at the University of Tübingen. The course is run by the Autonomous Learning Group.

Course description:

The course will provide you with theoretical and practical knowledge of reinforcement learning, a field of machine learning concerned with decision-making and interaction with dynamical systems, such as robots. We start with a brief overview of supervised learning and then spend most of the course on reinforcement learning. The exercises will help you get hands-on experience with the methods and deepen your understanding.
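To make the interaction setting concrete: an agent repeatedly observes the state of a dynamical system, chooses an action, and receives a reward. Below is a minimal sketch of this agent-environment loop. It assumes the Gymnasium API and the CartPole task purely for illustration; whether the course exercises use this exact library is an assumption, not something stated on this page.

    # Agent-environment interaction loop; the Gymnasium library and the
    # CartPole task are illustrative assumptions, not course specifics.
    import gymnasium as gym

    env = gym.make("CartPole-v1")
    obs, info = env.reset(seed=0)
    episode_return, done = 0.0, False
    while not done:
        action = env.action_space.sample()  # placeholder for a learned policy
        obs, reward, terminated, truncated, info = env.step(action)
        episode_return += reward
        done = terminated or truncated
    env.close()
    print("episode return:", episode_return)

A learned policy would replace the random action; most of the course is about how to learn such a policy from the reward signal alone.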

Qualification Goals:

Students gain an understanding of reinforcement learning formulations, problems, and algorithms on a theoretical and practical level. After this course, students should be able to implement and apply deep reinforcement learning algorithms to new problems. 

Course materials:

Both slides and exercises are available on ILIAS.

Lectures

  1. Lecture 1: Introduction to the course, Reinforcement Learning (RL) history, and the RL setup; Background reading: Sutton and Barto, Reinforcement Learning (used for the next few lectures; for this lecture, parts of Chapter 3)
  2. Lecture 2: MDPs; Background reading: Sutton and Barto, Reinforcement Learning, Chapter 4
  3. Lecture 3: Model-free Prediction; Background reading: Sutton and Barto, Reinforcement Learning, first parts of Chapters 5, 6, 7, and 12
  4. Lecture 4: Model-free Control; Background reading: Sutton and Barto, Reinforcement Learning, Sections 5.2, 5.3, 5.5, 6.4, 6.5, and 12.7 (a minimal Q-learning sketch follows after this list)
  5. Lecture 5: Bandits and Exploration; Background reading: Lattimore and Szepesvari, Bandit Algorithms (2020), Chapters 4, 6, and 7
     (https://tor-lattimore.com/downloads/book/book.pdf)
  6. Lecture 6: Value Function Approximation; Background reading: Sutton and Barto, Reinforcement Learning, Sections 9.1-9.8, 10.1, 10.2, and 11.1-11.3. Supplementary: DQN paper 1, DQN paper 2, NFQ paper
  7. Lecture 7: Policy Gradient; Background reading: Sutton and Barto, Reinforcement Learning, Chapter 13
  8. Lecture 8: Policy Gradient and Actor-Critic; Background reading: Natural Actor-Critic paper, TRPO paper, PPO paper
  9. Lecture 9: Q-learning-style Actor-Critic; Background reading: DPG paper, DDPG paper, TD3 paper, SAC paper
  10. Lecture 10: Exploration and Tricks to Improve Deep RL (with recent work from my group); Background reading: ICM paper, RND paper, Pink-Noise paper, HER paper, CrossQ paper, Stop Regressing paper
  11. Lecture 11: Model-based Methods: Dyna-Q, MBPO; Background reading: Sutton and Barto, Reinforcement Learning, Chapter 8, and the MBPO paper
  12. Lecture 12: Model-based Methods II: Online Planning (with recent work from my group): CEM, PETS, iCEM, CEE-US; Background reading: PETS paper, iCEM paper (video), CEE-US paper (videos)
  13. Lecture 13: AlphaGo and AlphaZero, Dreamer; Background reading: AlphaGo paper (also on ILIAS, since it is behind a paywall), AlphaZero paper, and Dreamer paper
  14. Lecture 14: Offline RL; Background reading: CQL paper, CRR paper, Benchmarking paper
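As referenced in the Lecture 4 item above, here is a minimal sketch of tabular Q-learning, the classic model-free control algorithm from Sutton and Barto, Section 6.5. It is an illustration only: the environment (FrozenLake from the Gymnasium library) and all hyperparameters are assumptions, not taken from the actual course exercises.

    # Tabular Q-learning sketch; FrozenLake and all hyperparameters are
    # illustrative assumptions, not the course's actual exercise setup.
    import numpy as np
    import gymnasium as gym

    env = gym.make("FrozenLake-v1", is_slippery=False)
    Q = np.zeros((env.observation_space.n, env.action_space.n))
    alpha, gamma, epsilon = 0.1, 0.99, 0.1  # step size, discount, exploration rate

    for episode in range(2000):
        s, _ = env.reset()
        done = False
        while not done:
            # epsilon-greedy action selection
            if np.random.rand() < epsilon:
                a = env.action_space.sample()
            else:
                a = int(np.argmax(Q[s]))
            s_next, r, terminated, truncated, _ = env.step(a)
            # Q-learning update: bootstrap from the greedy value of the
            # next state; no bootstrap term on true terminal transitions
            target = r + gamma * np.max(Q[s_next]) * (not terminated)
            Q[s, a] += alpha * (target - Q[s, a])
            s = s_next
            done = terminated or truncated

The update uses the max over next-state action values rather than the action actually taken, which is what makes Q-learning off-policy; the later deep RL lectures (DQN onwards) replace the table Q with a neural network.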

Further Readings

  • Sutton & Barto, Reinforcement Learning: An Introduction
  • Bertsekas, Dynamic Programming and Optimal Control, Vol. 1
  • Bishop, Pattern Recognition and Machine Learning
