Deep Reinforcement Learning – Summer 2024/25

In recent years, reinforcement learning has been combined with deep neural networks, giving rise to game agents with super-human performance (for example in Go, chess, or StarCraft II, capable of being trained solely by self-play), datacenter cooling algorithms that are 50% more efficient than trained human operators, and faster code for sorting or matrix multiplication. The goal of the course is to introduce reinforcement learning employing deep neural networks, focusing both on the theory and on practical implementations.

Python programming skills and basic PyTorch/TensorFlow skills are required (the latter can be obtained in the Deep Learning course). No previous knowledge of reinforcement learning is necessary.

About

SIS code: NPFL139
Semester: summer
E-credits: 8
Examination: 3/4 C+Ex
Guarantor: Milan Straka

Timespace Coordinates

These coordinates are still preliminary.

lecture: the lecture is held on Wednesday 9:00 in S5; first lecture is on Feb 19
practicals: the practicals take place on Thursday 14:00 in S5; first practicals are on Feb 20
consultations: entirely optional consultations take place on Wednesday 14:00 in S9; first consultations are on Feb 26

All lectures and practicals will be recorded and available on this website.

Lectures

1. Introduction to Reinforcement Learning Slides PDF Slides

License

Unless otherwise stated, teaching materials for this course are available under CC BY-SA 4.0.

The lecture content, including references to study materials.

The main study material is Reinforcement Learning: An Introduction, second edition, by Richard S. Sutton and Andrew G. Barto (referred to as RLB). It is available online and also as a hardcopy.

References to study materials cover all theory required at the exam, and sometimes even more – the references in italics cover topics not required for the exam.

1. Introduction to Reinforcement Learning

Feb 19 Slides PDF Slides

Introduction to Reinforcement Learning

Requirements

To pass the practicals, you need to obtain at least 80 points, excluding the bonus points. Note that all surplus points (both bonus and non-bonus) will be transferred to the exam. In total, assignments for at least 120 points (not including the bonus points) will be available, and if you solve all the assignments (any non-zero amount of points counts as solved), you automatically pass the exam with grade 1.
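The point arithmetic can be sketched as follows. This is only one reading of the rule: it assumes that the surplus is everything above the 80-point threshold (regular and bonus points together) and that nothing is transferred unless the practicals are passed. The function name is hypothetical.

```python
PASS_THRESHOLD = 80  # regular (non-bonus) points needed to pass the practicals


def practicals_result(regular_points: int, bonus_points: int) -> tuple[bool, int]:
    """Return (passed, surplus points carried over to the exam)."""
    passed = regular_points >= PASS_THRESHOLD
    # Assumption: the surplus counts both regular and bonus points above the
    # threshold, and no points are transferred without passing.
    surplus = regular_points + bonus_points - PASS_THRESHOLD if passed else 0
    return passed, surplus


print(practicals_result(95, 7))   # (True, 22)
print(practicals_result(70, 12))  # (False, 0)
```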

Environment

The tasks are evaluated automatically using the ReCodEx Code Examiner.

The evaluation is performed using Python 3.11, Gymnasium 1.0.0 and PyTorch 2.6.0. You should install the exact versions of these packages yourselves.
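A local installation can be compared against these versions with a short script. The sketch below assumes that the PyPI distribution names are gymnasium and torch, and it treats a local build tag such as +cu118 as matching; the function name is hypothetical.

```python
from importlib import metadata

# Versions quoted above; the distribution names are an assumption.
EXPECTED = {"gymnasium": "1.0.0", "torch": "2.6.0"}


def find_mismatches(expected, get_version=metadata.version):
    """Return {package: installed_version_or_None} for packages that differ."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None
        # A local build tag such as "2.6.0+cu118" still counts as a match.
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[package] = installed
    return mismatches


for package, installed in find_mismatches(EXPECTED).items():
    print(f"{package}: expected {EXPECTED[package]}, found {installed}")
```

The script prints nothing when both packages match.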

Teamwork

Solving assignments in teams (of size at most 3) is encouraged, but everyone has to participate (it is forbidden not to work on an assignment and then submit a solution created by other team members). All members of the team must submit in ReCodEx individually, but can have exactly the same sources/models/results. Each such solution must explicitly list all members of the team to allow plagiarism detection using this template.

No Cheating

Cheating is strictly prohibited and any student found cheating will be punished. The punishment can involve failing the whole course, or, in grave cases, being expelled from the faculty. While discussing assignments with any classmate is fine, each team must complete the assignments themselves, without using code they did not write (unless explicitly allowed). Of course, inside a team you are allowed to share code and submit identical solutions. Note that all students involved in cheating will be punished, so if you share your source code with a friend, both you and your friend will be punished. That also means that you should never publish your solutions.

Submitting to ReCodEx

When submitting a competition solution to ReCodEx, you should submit a trained agent and a Python source capable of running it.

Furthermore, please also include the Python source and hyperparameters you used to train the submitted model. Note, however, that there must still be exactly one Python source with a line starting with def main(.
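Before submitting, the rule about def main( can be checked locally. The helper below is a hypothetical sketch, not part of ReCodEx: it only looks at .py files directly inside one directory and reports those containing a line that starts with def main(.

```python
from pathlib import Path


def files_defining_main(directory: str) -> list[str]:
    """List .py files in `directory` having a line that starts with 'def main('."""
    found = []
    for path in sorted(Path(directory).glob("*.py")):
        lines = path.read_text(encoding="utf-8").splitlines()
        # Indented or commented occurrences do not start the line, so they are ignored.
        if any(line.startswith("def main(") for line in lines):
            found.append(path.name)
    return found
```

A submission directory is fine when the returned list has exactly one element; training scripts can keep their logic in differently named functions.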

Do not forget about the maximum allowed model size and time and memory limits.

Competition Evaluation

- Before the deadline, ReCodEx prints the exact performance of your agent, but only if it is worse than the baseline. If you surpass the baseline, the assignment is marked as solved in ReCodEx and you immediately get regular points for the assignment. However, ReCodEx does not print the reached performance.
- After the competition deadline, the latest submission of every user surpassing the required baseline participates in a competition. Additional bonus points are then awarded according to the ordering of the performance of the participating submissions.
- After the competition results announcement, ReCodEx starts to show the exact performance for all the already submitted solutions and also for the solutions submitted later.
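The selection of participating submissions can be illustrated with a small sketch. It is not the actual ReCodEx logic: the Submission class is hypothetical, higher performance is assumed to be better, and "latest submission surpassing the baseline" is read as the most recent submission among those above the baseline.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    user: str
    submitted_at: int   # any sortable timestamp
    performance: float  # assumption: higher is better


def competition_ranking(submissions, baseline):
    """Rank the latest baseline-surpassing submission of every user."""
    latest = {}
    for sub in sorted(submissions, key=lambda s: s.submitted_at):
        # Assumption: a later passing submission replaces an earlier one.
        if sub.performance > baseline:
            latest[sub.user] = sub
    return sorted(latest.values(), key=lambda s: s.performance, reverse=True)


subs = [
    Submission("alice", 1, 480.0),
    Submission("alice", 2, 455.0),
    Submission("bob", 1, 470.0),
    Submission("carol", 1, 390.0),
]
print([s.user for s in competition_ranking(subs, baseline=450.0)])  # ['bob', 'alice']
```

Under this reading, a later but weaker submission replaces an earlier stronger one, so it is worth checking which submission is the latest before the deadline.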

What Is Allowed

Unless stated otherwise, you can use any algorithm to solve the competition task at hand, but the implementation must be created by you and you must understand it fully. You can of course take inspiration from any paper or existing implementation, but please reference it in that case.
PyTorch, TensorFlow, and JAX are available in ReCodEx (but there are no GPUs).

Install

Installing to the central user packages repository

You can install all required packages to the central user packages repository using python3 -m pip install --user --no-cache-dir --extra-index-url=https://download.pytorch.org/whl/cu118 npfl139.

On Linux and Windows, the above command installs the CUDA 11.8 PyTorch build, but you can change cu118 to:
- cpu to get CPU-only (smaller) version,
- cu124 to get CUDA 12.4 build,
- rocm6.2 to get AMD ROCm 6.2 build (Linux only).
On macOS, the --extra-index-url has no effect and the Metal support is installed in any case.
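The variants differ only in the last component of the index URL. The sketch below assembles the command for a chosen build; the function name is hypothetical and the set of builds is just the one listed above.

```python
PYTORCH_BUILDS = {"cpu", "cu118", "cu124", "rocm6.2"}  # builds listed above


def pip_install_command(build: str = "cu118") -> str:
    """Return the user-level install command for the chosen PyTorch build."""
    if build not in PYTORCH_BUILDS:
        raise ValueError(f"unknown PyTorch build: {build}")
    index_url = f"https://download.pytorch.org/whl/{build}"
    return (
        "python3 -m pip install --user --no-cache-dir "
        f"--extra-index-url={index_url} npfl139"
    )


print(pip_install_command("cpu"))
# python3 -m pip install --user --no-cache-dir --extra-index-url=https://download.pytorch.org/whl/cpu npfl139
```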

Installing to a virtual environment

Python supports virtual environments, which are directories containing independent sets of installed packages. You can create a virtual environment by running python3 -m venv VENV_DIR followed by VENV_DIR/bin/pip install --no-cache-dir --extra-index-url=https://download.pytorch.org/whl/cu118 npfl139 (on Windows, use VENV_DIR/Scripts/pip instead).

Again, apart from the CUDA 11.8 build, you can change cu118 on Linux and Windows to:
- cpu to get CPU-only (smaller) version,
- cu124 to get CUDA 12.4 build,
- rocm6.2 to get AMD ROCm 6.2 build (Linux only).

Windows installation
- On Windows, it can happen that python3 is not in PATH, while py command is – in that case you can use py -m venv VENV_DIR, which uses the newest Python available, or for example py -3.11 -m venv VENV_DIR, which uses Python version 3.11.
- If you encounter a problem creating the logs in the args.logdir directory, a possible cause is that the path is longer than 260 characters, which is the default maximum length of a complete path on Windows. However, you can increase this limit on Windows 10, version 1607 or later, by following the instructions.
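Whether a log path is likely to hit this limit can be estimated before training starts. The check below is a sketch with a hypothetical function name. It is deliberately conservative, because the 260-character limit on Windows also has to accommodate the terminating null character.

```python
import os

WINDOWS_MAX_PATH = 260  # default limit quoted above


def exceeds_windows_limit(logdir: str, filename: str) -> bool:
    """Check whether the absolute path of a log file reaches the default limit."""
    full_path = os.path.abspath(os.path.join(logdir, filename))
    # Conservative comparison: a path of exactly 260 characters is rejected too.
    return len(full_path) >= WINDOWS_MAX_PATH
```

If the check fails, shortening the experiment name inside args.logdir is usually simpler than changing the system setting.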

GPU support on Linux and Windows

PyTorch supports NVIDIA and AMD GPUs out of the box; you just need to select the appropriate --extra-index-url when installing the packages.

If you encounter problems loading CUDA or cuDNN libraries, make sure your LD_LIBRARY_PATH does not contain paths to older CUDA/cuDNN libraries.
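A quick way to inspect the variable is to list the entries that mention CUDA or cuDNN. The helper below is a hypothetical sketch that matches on directory names only, so it can miss libraries stored under unrelated names.

```python
import os


def cuda_library_dirs(ld_library_path: str) -> list[str]:
    """Return LD_LIBRARY_PATH entries whose name mentions CUDA or cuDNN."""
    entries = [entry for entry in ld_library_path.split(":") if entry]
    return [e for e in entries if "cuda" in e.lower() or "cudnn" in e.lower()]


for directory in cuda_library_dirs(os.environ.get("LD_LIBRARY_PATH", "")):
    print("check this entry:", directory)
```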

MetaCentrum

How to apply for MetaCentrum account?

After reading the Terms and conditions, you can apply for an account here.

After your account is created, please make sure that the directories containing your solutions are always private.

How to activate Python 3.10 on MetaCentrum?

On MetaCentrum, the newest available Python is currently 3.10, which you need to activate in every session by running the following command:
```
module add python/python-3.10.4-intel-19.0.4-sc7snnf
```

How to install the required virtual environment on MetaCentrum?

To create a virtual environment, you first need to decide where it will reside. You can either use permanent storage where you have a large enough quota, or use the scratch storage of a submitted job.

TL;DR:
- Run an interactive CPU job, asking for 16GB scratch space:
```
qsub -l select=1:ncpus=1:mem=8gb:scratch_local=16gb -I
```
- In the job, use the allocated scratch space as the temporary directory:
```
export TMPDIR=$SCRATCHDIR
```
- You should clear the scratch space before you exit using the clean_scratch command. You can instruct the shell to call it automatically by running:
```
trap 'clean_scratch' TERM EXIT
```
- Finally, create the virtual environment and install PyTorch in it:
```
module add python/python-3.10.4-intel-19.0.4-sc7snnf
python3 -m venv CHOSEN_VENV_DIR
CHOSEN_VENV_DIR/bin/pip install --no-cache-dir --upgrade pip setuptools
CHOSEN_VENV_DIR/bin/pip install --no-cache-dir --extra-index-url=https://download.pytorch.org/whl/cu118 npfl139
```

How to run a GPU computation on MetaCentrum?

First, read the official MetaCentrum documentation: Basic terms, Run simple job, GPU computing, GPU clusters.

TL;DR: To run an interactive GPU job with 1 CPU, 1 GPU, 8GB RAM, and 16GB scratch space, run:
```
qsub -q gpu -l select=1:ncpus=1:ngpus=1:mem=8gb:scratch_local=16gb -I
```
To run a script in a non-interactive way, replace the -I option with the script to be executed.

If you want to run a CPU-only computation, remove the -q gpu and ngpus=1: from the above commands.
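The GPU and CPU variants, interactive or not, can be generated from the same pieces. The sketch below only reassembles the options shown above; the function name is hypothetical and no other qsub flags are assumed.

```python
from typing import Optional


def qsub_command(gpu: bool = True, script: Optional[str] = None) -> str:
    """Build the qsub call described above; `script` replaces the -I option."""
    resources = ["select=1", "ncpus=1"]
    if gpu:
        resources.append("ngpus=1")
    resources += ["mem=8gb", "scratch_local=16gb"]
    parts = ["qsub"]
    if gpu:
        parts += ["-q", "gpu"]
    parts += ["-l", ":".join(resources)]
    parts.append(script if script is not None else "-I")
    return " ".join(parts)


print(qsub_command())           # qsub -q gpu -l select=1:ncpus=1:ngpus=1:mem=8gb:scratch_local=16gb -I
print(qsub_command(gpu=False))  # qsub -l select=1:ncpus=1:mem=8gb:scratch_local=16gb -I
```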

AIC

How to install required packages on AIC?

Python 3.11.7 is available at /opt/python/3.11.7/bin/python3, so you should start by creating a virtual environment using
```
/opt/python/3.11.7/bin/python3 -m venv VENV_DIR
```
and then install the required packages in it using
```
VENV_DIR/bin/pip install --no-cache-dir --extra-index-url=https://download.pytorch.org/whl/cu118 npfl139
```

How to run a GPU computation on AIC?

First, read the official AIC documentation: Submitting CPU Jobs, Submitting GPU Jobs.

TL;DR: To run an interactive GPU job with 1 CPU, 1 GPU, and 16GB RAM, run:
```
srun -p gpu -c1 -G1 --mem=16G --pty bash
```
To run a shell script requiring a GPU in a non-interactive way, use
```
sbatch -p gpu -c1 -G1 --mem=16G SCRIPT_PATH
```
If you want to run a CPU-only computation, remove the -p gpu and -G1 from the above commands.
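The four combinations (interactive or batch, GPU or CPU-only) follow one pattern, sketched below with a hypothetical function name. It uses only the options quoted above.

```python
from typing import Optional


def slurm_command(gpu: bool = True, script: Optional[str] = None) -> str:
    """Build the srun (interactive) or sbatch (script) call described above."""
    parts = ["sbatch" if script is not None else "srun"]
    if gpu:
        parts += ["-p", "gpu"]
    parts.append("-c1")
    if gpu:
        parts.append("-G1")
    parts.append("--mem=16G")
    # An interactive job starts a shell; a batch job runs the given script.
    parts += [script] if script is not None else ["--pty", "bash"]
    return " ".join(parts)


print(slurm_command())           # srun -p gpu -c1 -G1 --mem=16G --pty bash
print(slurm_command(gpu=False))  # srun -c1 --mem=16G --pty bash
```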

Git

Is it possible to keep the solutions in a Git repository?

Definitely. Keeping the solutions in a branch of your repository, where you merge them with the course repository, is probably a good idea. However, please keep the cloned repository with your solutions private.

On GitHub, do not create a public fork with your solutions

If you keep your solutions in a GitHub repository, please do not create a clone of the repository by using the Fork button – this way, the cloned repository would be public.

Of course, if you just want to create a pull request, GitHub requires a public fork and that is fine – just do not store your solutions in it.

How to clone the course repository?

To clone the course repository, run
```
git clone https://github.com/ufal/npfl139
```
This creates the repository in the npfl139 subdirectory; if you want a different name, add it as the last parameter.

To update the repository, run git pull inside the repository directory.

How to keep the course repository as a branch in your repository?

If you want to store the course repository just in a local branch of your existing repository, you can run the following command while in it:
```
git remote add upstream https://github.com/ufal/npfl139
git fetch upstream
git checkout -t upstream/master
```
This creates a branch master; if you want a different name, add -b BRANCH_NAME to the last command.

In both cases, you can update your checkout by running git pull while in it.

How to merge the course repository with your modifications?

If you want to store your solutions in a branch merged with the course repository, you should start by
```
git remote add upstream https://github.com/ufal/npfl139
git pull upstream master
```
which creates a branch master; if you want a different name, change the last argument to master:BRANCH_NAME.

You can then commit to this branch and push it to your repository.

To merge the current course repository with your branch, run
```
git pull upstream master
```
while in your branch. Of course, it might be necessary to resolve conflicts if both you and I modified the same place in the templates.

ReCodEx

What files can be submitted to ReCodEx?

You can submit multiple files of any type to ReCodEx. There is a limit of 20 files per submission, with a total size of 20MB.
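Both limits can be checked locally before uploading. The sketch below uses a hypothetical function name and assumes that 20MB means 20 MiB; if ReCodEx counts decimal megabytes, the real limit is slightly lower.

```python
from pathlib import Path

MAX_FILES = 20
MAX_TOTAL_BYTES = 20 * 1024 * 1024  # assumption: "20MB" means 20 MiB


def submission_problems(paths):
    """Return a list of human-readable limit violations (empty if none)."""
    files = [Path(p) for p in paths]
    problems = []
    if len(files) > MAX_FILES:
        problems.append(f"{len(files)} files exceed the limit of {MAX_FILES}")
    total = sum(f.stat().st_size for f in files)
    if total > MAX_TOTAL_BYTES:
        problems.append(f"{total} bytes exceed the limit of {MAX_TOTAL_BYTES}")
    return problems
```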

What file does ReCodEx execute and what arguments does it use?

Exactly one file with py suffix must contain a line starting with def main(. Such a file is imported by ReCodEx and the main method is executed (during the import, __name__ == "__recodex__").

The file must also export an argument parser called parser. ReCodEx uses its arguments and default values, but it overwrites some of the arguments depending on the test being executed – the template should always indicate which arguments are set by ReCodEx and which are left intact.
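A minimal file satisfying both conditions might look as follows. This is a sketch, not an official template: the two arguments and the main(args) signature are assumptions chosen for illustration, and the assignment templates define the real ones.

```python
import argparse

# Module-level parser, so that it can be imported together with main.
parser = argparse.ArgumentParser()
# Hypothetical arguments; the real templates indicate which ones ReCodEx overrides.
parser.add_argument("--recodex", default=False, action="store_true", help="Running in ReCodEx")
parser.add_argument("--seed", default=None, type=int, help="Random seed")


def main(args: argparse.Namespace) -> None:
    # A real solution would load the pre-trained agent and evaluate it here.
    print(f"recodex={args.recodex} seed={args.seed}")


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported.
    main(parser.parse_args())
```

Because of the guard at the end, importing the file (as ReCodEx does) defines parser and main without running anything.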

What are the time and memory limits?

The memory limit during evaluation is 1.5GB. The time limit varies, but it should be at least 10 seconds and at least twice the running time of my solution.
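The lower bound on the time limit can be written as a one-line rule. The function name is hypothetical, and the running time of the reference solution is not published here, so the argument is only a placeholder.

```python
MEMORY_LIMIT_GB = 1.5  # memory limit quoted above


def minimum_time_limit(reference_runtime_s: float) -> float:
    """Lower bound on the time limit: at least 10 s and at least twice the reference."""
    return max(10.0, 2.0 * reference_runtime_s)


print(minimum_time_limit(3.0))   # 10.0
print(minimum_time_limit(12.5))  # 25.0
```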

Do agents need to be trained directly in ReCodEx?

No, you can pre-train your agent locally (unless specified otherwise in the task description).

Requirements

To pass the exam, you need to obtain at least 60, 75, or 90 points out of a 100-point exam to receive grade 3, 2, or 1, respectively. The exam consists of questions worth 100 points in total, drawn from the list below (the questions are randomly generated, but in such a way that there is at least one question from every lecture but the last). In addition, you can get surplus points from the practicals and at most 10 points for community work (i.e., fixing slides or reporting issues), but only the points you already have at the time of the exam count. You can take the exam without passing the practicals first.
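The grading thresholds can be sketched as a small function. It assumes that exam points, surplus points from the practicals, and community points (capped at 10) are simply added together; the function name is hypothetical.

```python
from typing import Optional


def exam_grade(exam_points: float, surplus_points: float = 0,
               community_points: float = 0) -> Optional[int]:
    """Return grade 1, 2, or 3, or None when the total is below 60 points."""
    # Assumption: all three sources add up; community work counts for at most 10.
    total = exam_points + surplus_points + min(community_points, 10)
    for threshold, grade in ((90, 1), (75, 2), (60, 3)):
        if total >= threshold:
            return grade
    return None


print(exam_grade(58, surplus_points=5))     # 3
print(exam_grade(70, community_points=15))  # 2
print(exam_grade(50))                       # None
```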

Exam Questions

Related Courses

Deep Learning

Course introducing deep neural networks, from the basics to the latest advances, focusing both on theory as well as on practical aspects.

Machine Learning for Greenhorns

Introductory course to machine learning, focusing both on theoretical foundations as well as on practical applications in Python.

Institute of Formal and Applied Linguistics

Charles University, Czech Republic
Faculty of Mathematics and Physics
