This is the new course for the '24/25 Fall semester. You can find slides from last year on the archived old page.
This course presents advanced problems and current state-of-the-art in the field of dialogue systems, chatbots, and voice assistants. After a brief introduction into the topic, the course will focus mainly on the application of machine learning to the task – especially deep learning/neural networks – in the individual components of the traditional dialogue system architecture as well as in end-to-end approaches: chatbots represented by a single neural network, including large language models (LLMs).
This course is a follow-up to the course NPFL123 Dialogue Systems, but can be taken independently – important basics will be repeated. All required deep learning concepts will be explained, but only briefly, so some machine learning background is recommended.
The course is taught in English, but we're happy to explain in Czech, too.
Lectures and labs take place in the room S4 (Malá Strana, 3rd floor).
The labs will likely be shorter than 45 minutes, as they mainly consist of explaining homework assignments and discussing related questions.
In addition, we will stream both lectures and lab instruction over Zoom and make the recordings available on YouTube (under a private link, available on request). We'll do our best to provide a useful experience, just note that the quality may not be ideal.
If you can't access Zoom, email us or text us on Slack.
There's also a Slack workspace you can use to discuss assignments and get news about the course. Please contact us by email if you want to join and haven't got an invite yet.
To pass this course, you will need to take an exam and do lab homeworks, which will involve working with neural dialogue systems. See more details here. Note that the assignments will be the most challenging part of the course, and will take some time to complete.
PDFs with lecture slides will appear here shortly before each lecture (more details on each lecture are on a separate tab). You can also check out last year's lecture slides.
1. Introduction Slides Questions
2. Data & Evaluation Slides Dataset Exploration Questions
3. Neural Nets Basics Slides Questions
4. Training Neural Nets Slides MultiWOZ 2.2 Loader Questions
5. Natural Language Understanding Slides Questions
6. Dialogue Management (1) Slides Prompting Llama for responses on MultiWOZ Questions
7. Dialogue Management (2) Slides Questions
8. Language Generation Slides LoRA Finetuning Questions
9. End-to-end Models Slides State tracking & database Questions
10. Open-domain Dialogue Slides Questions
11. Multimodal systems Slides Evaluation Bonus 1: Full MultiWOZ Bonus 2: Report Questions
12. Linguistics & Ethics Slides Questions
A list of recommended literature is on a separate tab.
10 October Slides Dataset Exploration Questions
24 October Slides MultiWOZ 2.2 Loader Questions
7 November Slides Prompting Llama for responses on MultiWOZ Questions
21 November Slides LoRA Finetuning Questions
5 December Slides State tracking & database Questions
19 December Slides Evaluation Bonus 1: Full MultiWOZ Bonus 2: Report Questions
There will be 6 homework assignments + 2(?) bonuses, each for a maximum of 10 points. Please see details on grading and deadlines on a separate tab.
Assignments should be submitted via Git – see instructions on a separate tab.
All deadlines are 23:59:59 CET/CEST.
Note: If you don't have a faculty Gitlab account yet, please create one as soon as possible (see the instructions). Don't wait until the deadline! It takes 5 minutes, and if you don't do it, you won't have any way of submitting.
1. Dataset Exploration
Presented: 10 October, Deadline: 29 October
Your task is to select one dialogue dataset, download and explore it.
Here you can use the dataset description/paper that came out with the data. The papers are linked from the dataset webpages or from here. If you can't find a paper, ask us and we'll try to help. If you can't find some of the information in the paper, mention it in your report.
Here you should use your own programming skills. If your dataset has a train/dev/test split, use the training set. If there's no clear separation between a user and a system (e.g. human-human chitchat data, or NLU-only data), provide just the overall numbers.
Include the following:
- hw1/README.md
- hw1/analysis.py or hw1/analysis.ipynb

See the submission instructions here (clone your Gitlab repo and add a new merge request).
2. MultiWOZ 2.2 Loader
Presented: 24 October, Deadline: 26 November
NOTE: The code & assignment have been updated significantly. The deadline was extended by 2 weeks. Before you start any implementation, make sure you update from upstream!
Your task is to create a component that will load the task-oriented MultiWOZ 2.2 dataset and process the data so it is prepared for model training. The component will consist of two Python classes -- one to hold the data, and one to prepare the training batches.
In later assignments, you will build an LLM based model (a prompted one similar to this one and a finetuned one similar to SOLOIST) using the data provided by this loader. Note that this means that the next assignments depend on this one.
We prepared some code templates for you to guide your implementation. You should not need to modify the code already present in the templates. If you need to, you can do so, but please comment on your code changes in the MR.
The bits that are waiting for your implementation are highlighted with # TODO: in diallama/mw_loader.py, diallama/generate.py and hw2/test.py.
Note that to use the provided code, you'll need to install the dependencies provided in requirements.txt. They can be installed easily via pip install -r requirements.txt.
MultiWOZ 2.2 is a task-oriented conversational dataset labeled with dialogue acts. It contains around 10k conversations between the user and a Cambridge town info centre (system). The dialogues are about certain topics: restaurants, hotels, trains, taxi, tourist attractions, hospital, and police. You can find more details in the dataset repository.
You can write your own dataset loader from the original format (see the dataset) but we recommend using the Huggingface Datasets library version.
This is what the data looks like if you load it using Huggingface Datasets: each entry in the dataset represents one dialogue. The information we are interested in is contained in the field turns, which is a dictionary with the following important keys:
- speaker: Role associated with the speaker. It's either 0 (user) or 1 (system).
- utterance: String representation of the dialogue utterances.
- dialogue_acts: Structured parse of the system utterances into dialogue acts (only in system utterances). It contains slot names and the corresponding span_info (location of the slot in the utterance, which will come in handy later).
- frames: Present only in user utterances. Structured representation of the user's belief state.

Each of these keys is mapped to a list with labels for the corresponding turns, i.e. turns['speaker'][0] contains information for the speaker of the first turn and turns['speaker'][-1] for the last one.
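For illustration, here is a minimal sketch of how you might peek at this structure, assuming the multi_woz_v22 dataset id on the Huggingface hub (check the dataset repository for the exact id and loading options):

# Sketch: load MultiWOZ 2.2 via Huggingface Datasets and print one dialogue.
from datasets import load_dataset

data = load_dataset('multi_woz_v22', split='train')
turns = data[0]['turns']                      # one entry = one dialogue
for i in range(len(turns['speaker'])):
    role = 'user' if turns['speaker'][i] == 0 else 'system'
    print(f"{role}: {turns['utterance'][i]}")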
The dataset contains the train, validation and test splits. Please respect them!
Note that MultiWOZ also contains a database (and you need database queries for your system to work correctly), but we'll address that later.
You need to implement the following properties for the Dataset class inside diallama/mw_loader.py. Each data example should have this structure:
{
'context': list[str], # list of utterances preceding the current system utterance/response
'utterance': str, # the string with the current system utterance/response
'delex_utterance': str, # the string with the current response which is delexicalized, i.e. slot values are
# replaced by corresponding slot names in the text.
}
- A dialogue with n turns will yield n // 2 examples, each with progressively longer context (starting from a context of length 1, up to n-1 turns of context).
- The context is truncated to the k last utterances, where k is a parameter of the class.
- For delexicalization, use dialogue_acts and its fields span_end, span_start for localizing the parts suitable for delexicalization. Replace those parts with the corresponding slot names from act_slot_name enclosed into brackets, e.g., [name] or [pricerange].
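For illustration, here is a hedged sketch of the delexicalization idea (the span triples here are hand-made; in your code they come from the dialogue_acts annotation):

# Sketch: replace annotated value spans with [slot_name] placeholders.
def delexicalize(utterance, spans):
    # spans: list of (span_start, span_end, act_slot_name); replace from the end so earlier offsets stay valid
    for start, end, slot_name in sorted(spans, reverse=True):
        utterance = utterance[:start] + f'[{slot_name}]' + utterance[end:]
    return utterance

print(delexicalize('The Acorn Guest House is in the north.',
                   [(4, 21, 'name'), (32, 37, 'area')]))
# -> 'The [name] is in the [area].'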
Machine learning models usually work with numbers and matrices. That is why we also need to convert strings in our batches to integer IDs.
Since this will be the same for both prompting and finetuning, we moved the corresponding code to the class that will deal with generation in either case, i.e. GenerationWrapper in diallama/generate.py.
This means that you'll also need to implement a collate function (collate_fn) inside GenerationWrapper that has the following properties:
- It takes batches coming from torch.utils.data.DataLoader (lists of examples).
- It outputs a dictionary (output) of the following structure:

output = {
    'response': list[list[int]],        # tokenized utterances (list of subword ids from the current dialogue turn)
                                        # for all batch examples
    'delex_response': list[list[int]],  # tokenized and delexicalized utterances (list of subword ids
                                        # from the current dialogue turn) for all batch examples
}

where {k: output[k][i] for k in output} should correspond to the i-th example of the original input batch.

The test script is in hw2/test.py in your repo.
Here, we simply test out the code -- load the dataset and print out a few things. Your task here is to load the model and tokenizer.
We want to use the meta-llama/Llama-3.2-1B-Instruct model, which is just about the right combination of capable enough and small enough to fit onto small GPUs.
Load the model and its tokenizer using Huggingface's auto methods, i.e., AutoModelForCausalLM.from_pretrained and AutoTokenizer.from_pretrained.
Note that meta-llama/Llama-3.2-1B-Instruct is a gated model. This means you need to set up a Huggingface account and agree to Llama 3.2's terms and conditions on the model page. You can then get an access token under your Huggingface settings. Just click on Create a new token, then tick Read access to contents of all public gated repos you can access, give it some name (anything) and click Create token. Make sure you save the token. Whenever you create a model or tokenizer, make sure you pass token=<your access token> in your code.
If you don't feel comfortable agreeing to the Llama terms and conditions, feel free to use Qwen/Qwen2.5-0.5B-Instruct instead, or ask Ondrej and he'll lend you his access token.
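A minimal sketch of the loading code (storing the token in an environment variable is just one option):

# Sketch: load the gated model and its tokenizer with your HF access token.
import os
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = 'meta-llama/Llama-3.2-1B-Instruct'    # or 'Qwen/Qwen2.5-0.5B-Instruct'
HF_TOKEN = os.environ.get('HF_TOKEN')              # your access token

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, token=HF_TOKEN)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, token=HF_TOKEN)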
Include the following:
- diallama/mw_loader.py
- diallama/generate.py
- hw2/test.py
- the output of hw2/test.py run on your data (test set is used by default), as hw2/output.txt

Have a look at what the script is doing, that'll help you with your implementation.

3. Prompting Llama for responses on MultiWOZ
Presented: 7 November, Deadline: 12 December
Before you start any implementation, make sure you update from upstream! The code is completely reworked for this assignment.
In this assignment, you'll need to prompt your model (i.e. Llama-3.2-1B-Instruct in most cases) and ask it to provide replies for various queries relating to hotels. We'll ignore state tracking and database for now, that will come later on. For now, it suffices that the model will give you some reasonable answer, it doesn't necessarily have to be true :-).
To do that, you'll need to implement a bunch of things:
- finishing the prompt formatting in collate_fn,
- generating responses using the model's generate() function.

Most of your implementation will be in the GenerationWrapper class, which deals with all things related to decoding. Finetuning the model will come later.
First of all, you need to finish the GenerationWrapper.collate_fn method you started in HW2, to correctly format the whole input prompt for the model. LLMs, especially the instruction-tuned ones, use a very specific prompt formatting. You should use your tokenizer's apply_chat_template function for this. It'll handle all the formatting for you. Note that GenerationWrapper has the tokenizer passed in upon creation (see its __init__ method).
The LLM input prompt (for an instruction-tuned LLM which you chat with, like you would with ChatGPT) has multiple parts: a system prompt with general instructions for the model, and a response prompt (the user message) relating to the current sentence the system wants to deal with, including the dialogue context. So the whole input prompt will include the system prompt + the response prompt, in the chat template format.
Let your collate_fn method return the whole tokenized prompt. It's a good idea to print it out so you see what it looks like (you might need to do it for debugging anyway).
Since the system prompt and the response prompt are parameters passed to the GenerationWrapper class upon creation, you don't need to care about their exact values here (but you will do that in a second).
Note: You may also want to store the attention mask for the prompt, and pass it to the decoding (see below). It's not strictly necessary now (you'll get some annoying warnings but it'll work), but we'll use this in HW4. For now, the attention mask is trivial anyway: it'll just be a vector of 1's with the same shape as the vector for the prompt.
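A hedged sketch of the chat-template formatting (the prompt wording, the {context} placeholder and the variable names are illustrative, not the reference solution; tokenizer is the one loaded in HW2):

# Sketch: format one example with the tokenizer's chat template.
system_prompt = "You are a helpful Cambridge tourist information assistant."                 # placeholder wording
response_prompt = "Given the dialogue so far:\n{context}\nWrite the next system response."   # placeholder wording
context = ["I need a cheap hotel in the north.",
           "Sure, do you have any other requirements?",
           "It should have free parking."]

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": response_prompt.format(context="\n".join(context))},
]
prompt_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,   # append the assistant header so the model starts its reply
    tokenize=True,
)
attention_mask = [1] * len(prompt_ids)    # no padding yet, so all ones
print(tokenizer.decode(prompt_ids))       # useful for checking the exact formatting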
The second thing you need to implement is the output generation inside the GenerationWrapper.generate_response method. Here, you need to:
- prepare the input using the GenerationWrapper.collate_fn method you just finished,
- call self.model.generate() and use the GenerationConfig parameter passed into GenerationWrapper.generate_response.

Now it's time to try out what you coded, using hw3/test.py, where we try out multiple prompts & multiple generation settings. This script needs the actual prompts and generation settings values, so you'll need to fill it in.
- Fill in your system prompt (SYSTEM_PROMPT).
- Fill in a few response prompts (RESPONSE_PROMPTS). You can try out lexicalized vs. delexicalized responses, or different levels of detail for the instructions (specifying the response format, what the model can and cannot do, etc.).
- Put a few different GenerationConfig options into GENERATION_CONFIGS. You can have a look at this tutorial, to find out what settings there are and how they differ. In any case, make sure to use max_new_tokens and set it below 100, so your code doesn't take forever (you don't need longer replies anyway). Note that the way the code is set up, only the first prompt will be used with all settings, the following prompts will only use the first setting.
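For illustration, two contrasting settings might look like this (a sketch; adapt the structure to whatever GENERATION_CONFIGS looks like in the template):

# Sketch: greedy decoding vs. nucleus sampling, both capped below 100 new tokens.
from transformers import GenerationConfig

GENERATION_CONFIGS = {
    "greedy": GenerationConfig(max_new_tokens=80, do_sample=False),
    "nucleus_sampling": GenerationConfig(max_new_tokens=80, do_sample=True, top_p=0.9, temperature=0.8),
}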
and set it below 100, so your code doesn't take forever (you don't need longer replies anyway). Note that the way the code is set up, only the first prompt will be used with all settings, the following prompts will only use the first setting.Finally, store the outputs of your script as hw3/output.txt
. Look how the individual settings differ, and write up a short paragraph in Markdown or plain text in hw3/report.md
(5-10 sentences).
We'll look at more rigorous evaluation on the actual dataset later, but this is how you would play with prompting with LLMs anyway.
Note: Running this on your computer's CPU will most likely be annoyingly slow, so you'll want to use a GPU. You can use Google Colab, which provides GPUs for free for limited time spans, typically enough to run this. You can also get an account on our in-house AIC student computing cluster (Ondrej will get your accounts created and distribute passwords soon). Before you work on AIC, make sure you read the instructions! You can prepare and debug your code even without a GPU, then only run the actual generation once you have access to a GPU.
Include the following:
- diallama/generate.py
- hw3/test.py
- hw3/output.txt
- hw3/report.md

4. LoRA Finetuning
Presented: 21 November, Deadline: 17 January (extended)
NOTE: I had a subtle bug in my own code template for HW4 (see fix here). If you submitted and your submission doesn't work because of this, you won't be penalized. If you fixed the bug, you'll get a bonus point. I've extended the deadline for HW4 because of this.
This assignment puts together HW2 and HW3 and depends on them. Your task is to finetune your LLM using LoRA (see Lecture 4), on the MultiWOZ data we loaded, using your favourite prompts. We'll still ignore the database for now.
Before you start your implementation, make sure you update from upstream. There are newly added instructions in the code.
You'll need to essentially finish the data collation, so you can feed full examples to the model:
You will work with diallama/generate.py and modify the GenerationWrapper.collate_fn() method in the following way:
- Concatenate the prompt with the tokenized response into concatenated, adding the <|eot_id|> and <|end_of_text|> tokens after the response (see explainers here).
- Build the attention_mask: 0 for padding and 1 for any valid tokens.
- Build the response_mask: 1 for the response/utterance tokens only (i.e. 0 for the prompt and 0 for padding).

For the model training, we have prepared the script hw4/train.py that uses the class Trainer from trainer.py.
Your task will be to fill the TODOs in both places to implement the training loop and validation step.
You will also need to create an optimizer and scheduler.
In hw4/train.py, load your model in the same way as you did for HW3. However, you'll also need to initialize LoRA (see the comments in the code), an optimizer and a scheduler. There are links to useful LoRA settings in the TODO notes in the code. A good optimizer & scheduler choice might be the ones preset by Huggingface (AdamW, linear schedule with warmup).
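A hedged sketch of what this might look like, assuming the peft library for LoRA (the concrete hyperparameter values are placeholders -- follow the links in the TODO notes):

# Sketch: wrap the loaded model with LoRA adapters, then set up AdamW + linear warmup.
import torch
from peft import LoraConfig, get_peft_model
from transformers import get_linear_schedule_with_warmup

lora_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type='CAUSAL_LM')
model = get_peft_model(model, lora_config)          # model = the base LLM loaded as in HW3

optimizer = torch.optim.AdamW((p for p in model.parameters() if p.requires_grad), lr=2e-4)
scheduler = get_linear_schedule_with_warmup(optimizer,
                                            num_warmup_steps=100,            # placeholder value
                                            num_training_steps=total_steps)  # total optimizer steps you plan to run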
In diallama/trainer.py, implement a training step. Your objective is to minimize cross-entropy / negative log-likelihood (NLL) of the training data (responses only) with respect to your model. Among a couple of other things, you need to use the model's forward() method (by simply calling model() as usual in PyTorch/HF) and feed in the proper parameters.
- Feed the whole concatenated tensors into the model as input_ids, including the context.
- Only train the model to generate the response, not the context, by setting the model's target labels properly. Make use of the response_mask to produce the correct labels input.
- Don't forget to use the attention_mask, so you avoid performing attention over padding.
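A rough sketch of the core of such a training step (the batch fields follow the collation described above; model, optimizer and scheduler come from hw4/train.py; treat this as an outline, not the reference implementation):

# Sketch: NLL on the response tokens only, using response_mask to build the labels.
input_ids = batch['concatenated']            # prompt + response token ids, padded
attention_mask = batch['attention_mask']     # 1 for real tokens, 0 for padding
response_mask = batch['response_mask']       # 1 for response tokens only

labels = input_ids.clone()
labels[response_mask == 0] = -100            # -100 is ignored by the cross-entropy loss in HF models

outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
outputs.loss.backward()                      # mean NLL over the response tokens
optimizer.step()
scheduler.step()
optimizer.zero_grad()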
Note: You need to fix your random seeds so your results are repeatable and you can tell if you actually changed something (this must be done separately for Python, Numpy and PyTorch/Tensorflow!). This is actually already done for you in the hw4/train.py code, just be aware it's there and it needs to be there.
Note: You may see a lot of use of the HuggingFace default Trainer class. We're not doing that, and we're building our own training loop, for two reasons: (1) we need the “feed context + only train to generate responses” function, which is kinda easier to do low-level, (2) we want you to see what the high-level libraries are doing.
Note: The note on GPU training from HW3 is even more important here.
You want to check performance on the development data once in a while. Have a look into Trainer.eval() in diallama/trainer.py and complete the code there.
Besides the usual loss, we want you to report the following measures on the test set:
- token accuracy (take the argmax on the predicted raw logits and compare the result with the ground-truth token ids)

Now it's time to run the training. There are just a few considerations.
In hw4/train.py, feel free to experiment with hyperparameters (e.g., optimizer/scheduler settings, number of training epochs, learning rate...).
Use the largest batch size you can (the largest where your GPU doesn't run out of memory). It might actually be very small (1-4).
Monitor the training and validation loss and use it to determine the hyperparameters.
First start debugging with very small data, just a few batches (test if the model learns something by checking outputs on the training data).
Redirect the training script output to a file for your final run (e.g. using > hw4/training.log 2>&1 to make sure you get both standard and error output). If you run it as a batch job on the cluster, outputs are redirected to a file by default, so you can just take that one.
Include the following:
- your data collation code (diallama/generate.py)
- your training code (diallama/trainer.py, hw4/train.py)
- a training log (hw4/training.log) containing the outputs of your training script
- an output file (hw4/outputs.txt) containing the outputs of your trained model applied using hw4/test.py

5. State tracking & database
Presented: 5th December, Deadline: 24 January (extended due to HW4 extension)
This time, your model will be enhanced with a belief tracking component and database access using 2-stage decoding, so that it doesn't make stuff up but rather tries to produce actually correct answers. This is essentially an instance of retrieval-augmented generation (RAG).
We'll work with the prompted-only version of your model by default. We want to call the LLM twice: first, to get some kind of representation of the dialogue state (user preferences). Second, to generate the response as we did previously, but now using database results which are based on querying the database with the dialogue state.
Finetuning the model will most likely improve the results quite a bit, but we go with the prompted model for simplicity. Finetuning is optional and can get you bonus points, see below.
The file hw5/test.py is very similar to the script from HW3 -- we want to run the model on a couple of examples. Same as there, you'll load the model (base non-finetuned by default) and fill in your prompts. You don't need to demonstrate the use of multiple options this time (just one that works at least somehow).
Building on your HW3 prompts, there are two things you need to do:
- add a new prompt for dialogue state tracking (DST), which asks the model to produce a representation of the current dialogue state (the user's preferences) based on the context,
- update your response prompt so that it also has a spot for the dialogue state and the database results, in addition to the context.
Note: The representation of the dialogue state is up to you. It can be represented as plain text talking about the state (The user wants the north area, 3 stars etc.), JSON or other structure (comma-separated values etc.) or some kind of SQL/Python code, whatever you prefer and whatever you find to work. As long as the representation makes some sense, we won't judge :-).
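For example, with a simple 'slot=value' string representation (just one possible choice), the conversion in both directions might look like this:

# Sketch: serialize a slot-value dict for the DST "response", and parse the model output back.
def state_to_str(state):
    # {'area': 'north', 'stars': '3'} -> 'area=north, stars=3'
    return ', '.join(f'{slot}={value}' for slot, value in state.items())

def str_to_state(text):
    state = {}
    for part in text.split(','):
        if '=' in part:
            slot, _, value = part.partition('=')
            state[slot.strip()] = value.strip()
    return state

print(str_to_state(state_to_str({'area': 'north', 'stars': '3'})))
# -> {'area': 'north', 'stars': '3'}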
Here, you will implement the 2-stage decoding.
Extend your implementation of the GenerationWrapper in the following way:
- Store the MultiWOZDatabase object (diallama/database.py) as the object's property called self.database in the constructor. You'll need it to query the database.
- Add a dst_prompt parameter to the GenerationWrapper class constructor and save its value as the object's property. You'll use it to feed model input for dialogue state tracking (DST).
- Add a new parameter stage to collate_fn, with three possible values: dst, response, mixed.
  - For stage=='dst':
    - Use self.dst_prompt to produce the prompt (you'll need to use context as you did before; see prompt update above).
    - Take the dialogue_state value in batch, which is a slot-value dict (e.g. {'area': 'north', 'stars': '3'}), and use your string representation of this dict (see above) as the response (and delex_response, these should be identical).
    - Build concatenated, attention_mask and response_mask the same way you did before (skip this if batch is empty or non-existent, e.g. during generation).
  - For stage=='response':
    - Take the db_results value in batch -- we recommend just using a count, but you can also include a full hotel entry (or entries).
    - Also use the dialogue_state in batch (it still may be empty, e.g. at the beginning of the dialogue).
    - Use self.response_prompt as you did before, but now it should also include a spot for the dialogue state and DB results, in addition to the context (see prompt update above).
  - For the mixed mode, pick one of the above options at random for each example in the batch.
- Extend the generate_response method, so it can do the two-stage generation:
  - First call collate_fn in the dst mode, then call your LLM with the prepared input to get the "response", i.e., the current dialogue state.
  - Parse the generated dialogue state and query the database: call self.database.query() with the domain set to hotel and the constraints using a slot-value dict as shown above.
  - Prepare the second input via collate_fn, this time in response mode, and pass it to the LLM to generate the final response.
  - Return the generated response along with the dialogue state and the results of self.database.query, i.e. the matching database entries.

Run hw5/test.py with your new prompts and your finished implementation of GenerationWrapper and store the results as hw5/output.txt.
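Putting it together, a rough sketch of the two-stage decoding inside generate_response might look like this (helper names such as parse_state and the exact batch keys are placeholders; check diallama/database.py for the real query() signature):

# Sketch only: DST -> database query -> response generation.
def generate_response(self, context, generation_config):
    def _generate_text(prepared):
        # run generate() and decode only the newly produced tokens
        out = self.model.generate(prepared['concatenated'],
                                  attention_mask=prepared['attention_mask'],
                                  generation_config=generation_config)
        new_tokens = out[0][prepared['concatenated'].shape[1]:]
        return self.tokenizer.decode(new_tokens, skip_special_tokens=True)

    # 1) dialogue state tracking: ask the LLM for the current state
    dst_input = self.collate_fn([{'context': context}], stage='dst')
    dialogue_state = parse_state(_generate_text(dst_input))   # parse_state = your own parsing of the state string

    # 2) query the database with the predicted constraints
    db_results = self.database.query(domain='hotel', constraints=dialogue_state)

    # 3) response generation, conditioned on the state + DB results
    resp_input = self.collate_fn([{'context': context,
                                   'dialogue_state': dialogue_state,
                                   'db_results': len(db_results)}], stage='response')
    response = _generate_text(resp_input)
    return response, dialogue_state, db_results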
Modify hw4/train.py and diallama/trainer.py.
Finetune your model so that you pick a random prompt for each training example -- for 1/2 of the examples, you'll do state tracking, for the rest, you'll do response generation. You'll finetune the model to do both at the same time. This is where the mixed setting for collate_fn comes in handy.
To finetune your model, you'll need to extend your data loader in diallama/mw_loader.py -- to get the dialogue states from the annotation and the corresponding database results into the dialogue_state and db_results entries.
Measure the same metrics as you did with HW4 and attach your training log.
Include the following:
- diallama/generate.py
- hw5/test.py
- hw5/output.txt

6. Evaluation
Presented: 19th December, Deadline: 31 January
In this assignment, you will work with the model(s) built in HW5 and perform some more detailed evaluation, using basic standard metrics.
Specifically, we want you to report:
- BLEU (comparing the generated responses to the ground truth),
- dialogue success,
- the number of distinct tokens,
- conditional bigram entropy.
To be able to compute the metrics, you will need to generate predictions from your model and save them in a machine-readable format, e.g. json. Use a subset of the test set (all hotel-related dialogues) for generating the predictions.
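For instance, a simple dump of the predictions might look like this (a sketch; hotel_dialogues, wrapper and generation_config are placeholders for your own test subset and your HW5 objects, and the exact JSON structure is up to you -- check what the evaluation script expects):

# Sketch: collect per-turn predictions on the hotel subset and dump them as JSON.
import json

predictions = {}                                   # dialogue_id -> list of per-turn outputs
for dialogue_id, contexts in hotel_dialogues:      # placeholder: iterable over hotel-related test dialogues
    predictions[dialogue_id] = []
    for context in contexts:                       # context = list of utterances up to the current turn
        response, state, db_results = wrapper.generate_response(context, generation_config)
        predictions[dialogue_id].append({'state': state, 'response': response})

with open('hw6/outputs.json', 'w') as f:
    json.dump(predictions, f, indent=2)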
For the computation of the scores itself, you are free to use any implementation you like. However, the easiest way is to use this evaluation script. It can be easily installed via pip and allows you to measure all the required metrics (and some more). The script is included in the requirements file in the repository, so if you used the recommended installation, you already have it installed. For usage instructions, see its GitHub page.
If you implemented finetuning in HW5, you can measure the scores for both the prompted and the finetuned version of the model, and get 3 bonus points.
Include the following:
- In hw6/test.py, add code that produces outputs for all turns in all hotel-related dialogues from the test set, saves them to a file, and measures scores.
- hw6/outputs.json containing your generated test set belief states + responses (on the given subset).
- hw6/scores.txt containing the metric scores described above (BLEU, success, distinct tokens, conditional bigram entropy).

Bonus 1: Full MultiWOZ
Presented: 19th December, Deadline: 15 September
The first bonus assignment is to expand your system to support all MultiWOZ 2.2 domains, not just hotels.
This means that you'll have three LLM prompt types -- in addition to the DST prompt and the response prompt, you'll need to add a domain detection prompt. Moreover, since your DST and response prompts are domain-dependent, you'll need multiple prompts of these types, one for each domain.
The operation of your system then should be the following: first detect the current domain using the domain detection prompt, then run dialogue state tracking and the database query for that domain, and finally generate the response using the corresponding domain-specific prompts.
Make any adjustments in the code you need to achieve this.
You can get additional 5 points if you apply finetuning, same as for HW5.
Name your branch hw7
for this submission. Include the following:
diallama/mw_loader.py
and diallama/generate.py
(and if you use finetuning, also diallama/trainer.py
)hw7/test.py
, which can be based on HW5 but must include at least two example dialogue for each of the MultiWOZ main domains: restaurants, trains, attractions, hotels, taxihw7/output.txt
Presented: 19th December, Deadline: 15th September
The basic idea of the second bonus assignment is that you write a ca. 3-page report (1500 words), detailing your model and the experiments, so it all looks like an academic paper. The purpose of this is to give you some writing training, which might come in handy for your master's thesis or other projects.
Have a look at Ondrej's tips for writing reports here before you start writing!
The prescribed format for your report is LaTeX, with the ACL Rolling Review templates. You can get the templates directly on Overleaf or download them for offline use.
Name your branch hw8
for this submission. Include this:
hw8/report.pdf
)hw8/*.*
)hw8/error_analysis/*.*
-- best as either plain text or JSON)All homework assignments will be submitted using a Git repository on MFF GitLab.
We provide an easy recipe to set up your repository below:
git remote show origin
You should see these two lines:
* remote origin
Fetch URL: git@gitlab.mff.cuni.cz:teaching/NPFL099/2024/your_username.git
Push URL: git@gitlab.mff.cuni.cz:teaching/NPFL099/2024/your_username.git
Add the base repository as a new remote called upstream:
git remote add upstream https://gitlab.mff.cuni.cz/teaching/NPFL099/base.git
Switch to the master branch and create a new branch for the assignment:
git checkout master
git checkout -b hwX
Solve the assignment :)
Add new files (if applicable) and commit your changes:
git add hwX/solution.py
git commit -am "commit message"
git push origin hwX
Create a Merge request in the web interface. Make sure you create the merge request into the master branch in your own forked repository (not into the upstream).
Merge requests -> New merge request
You'll probably need to update from the upstream base repository every once in a while (most probably before you start implementing each assignment). We'll let you know when we make changes to the base repo.
To upgrade from upstream, do the following:
git checkout master
git fetch upstream master
git merge upstream/master master
You can run some basic sanity checks for homework assignments -- they are included in your repository (make sure to upgrade from upstream first).
Note that the tests require stuff from requirements.txt to be installed in your Python environment. The tests check files in the current directory and assume you have the correct branches set up.
For instance, to check hw1, run:
./run_tests.py hw1
By default, this will just check your local files. If you want to check whether you have your branches set up correctly, use the --check-git parameter.
Note that this will run git checkout hw1 and git pull, so be sure to save any local changes beforehand!
Always update from upstream before running tests, we're adding checks for new assignments as we go. Some may only be available at the last minute, we're sorry for that!
This is just a short primer for the AIC wiki – better read that one, too. But definitely read at least this text before you start working with AIC.
Use the command
ssh LOGIN@aic.ufal.mff.cuni.cz
where LOGIN is your SIS username.
When you log on to AIC, you're at the cluster head node. Do not compute here – this is just for launching computation jobs, copying files and such. All of your computation jobs will run on one of the CPU/GPU nodes. (You can run a terminal multiplexing program on the head node.)
There are two ways to compute on the cluster:
You should use a batch script for running longer computations. The interactive shell is useful for debugging.
Use the sbatch command to submit your jobs (i.e. shell scripts) into a queue. For running a python command, simply create a shell script that has one line – your command with all the parameters you need. You can either specify the parameters in the script or on the command line.
Here are two equivalent ways of specifying a GPU job with 2 CPU cores, 1 GPU and 16G system RAM (all GPUs have 11G memory):
1. Specify the parameters inside the job script job_script.sh:

#!/bin/bash
#SBATCH -J hello_world # name of job
#SBATCH -p gpu # name of partition or queue (if not specified default partition is used)
#SBATCH --cpus-per-task=2 # number of cores/threads per task (default 1)
#SBATCH --gpus=1 # number of GPUs to request (default 0)
#SBATCH --mem=16G # request 16 gigabytes memory (per node, default depends on node)
# here start the actual commands
sleep 5
echo "Hello I am running on cluster!"
and submit it with:

sbatch job_script.sh

2. Specify the parameters on the command line, with a minimal job_script.sh:

#!/bin/bash
sleep 5
echo "Hello I am running on cluster!"
and submit it with:

sbatch -J hello_world -p gpu -c2 -G1 --mem 16G job_script.sh
Have a look at the AIC wiki or man sbatch for all the command-line parameters. (Note: long / short flags can be used interchangeably for both approaches.)
You can get an interactive console using srun. The following command will run bash with the same resources as in the previous example:
srun -J hello_world -p gpu -c2 -G1 --mem=16G --pty bash
- Make sure you exit the console after use – you're blocking the GPU and whatever you reserve as long as the console is open!
- Use sinfo to list the available queues.
- Use squeue --me or squeue -u LOGIN (where LOGIN is your username) to check your jobs.
- Use squeue to see every job currently running on the cluster.
- Use scancel JOB_ID to cancel a job.
- To copy files to/from the cluster, you can use sftp://LOGIN@aic.ufal.mff.cuni.cz.
The exam will have 10 questions from the pool below. Each question counts for 10 points. We reserve the right to make slight alterations or use variants of the same questions. Note that all of them are covered by the lectures, and they cover most of the lecture content. In general, none of them requires you to memorize formulas, but you should know the main ideas and principles. See the Grading tab for details on grading.
To pass this course, you will need to:
- take the exam,
- complete the homework assignments.
The final grade for the course will be a combination of your exam score and your homework assignment score, weighted 3:1 (i.e. the exam accounts for 75% of the grade, the assignments for 25%).
Grading:
In any case, you need >50% of points from the test and 40+ points (i.e. 66%) from the homeworks to pass. If you get less than the minimum from either, even if you get more than 60% overall, you will not pass.
You should be able to pass the course just by following the lectures, but here are some hints on further reading. There's nothing ideal on the topic as this is a very active research area, but some of these should give you a broader overview.
Recommended, though slightly outdated:
Recommended, but might be a bit too brief:
Further reading: