1. Introductory notes and discussion on large language models
Feb 22
Slides
Instructor: Jindřich Helcl
Covered topics: aims of the course, passing requirements. We discussed
what (large) language models are, what they are for, and what their benefits
and downsides are. We concluded with a rough analysis of ChatGPT performance in
different languages.
2. The Transformer Architecture
Mar 7
Lecture notes
Slides
Instructor: Jindřich Libovický
After the class, you should be able to:
- Explain the building blocks of the Transformer architecture to a non-technical person
- Describe the Transformer architecture using equations, especially the self-attention block
- Implement the Transformer architecture (in PyTorch or another framework with automatic differentiation); a minimal self-attention sketch is included under Additional materials below
Class outline:
Additional materials:
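A minimal, illustrative sketch of the scaled dot-product self-attention block, Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, in PyTorch (single attention head, no masking; not the reference implementation from the class):

    import math
    import torch
    import torch.nn as nn

    class SelfAttention(nn.Module):
        def __init__(self, d_model):
            super().__init__()
            # Learned projections of the input into queries, keys, and values
            self.q = nn.Linear(d_model, d_model)
            self.k = nn.Linear(d_model, d_model)
            self.v = nn.Linear(d_model, d_model)

        def forward(self, x):  # x: (batch, seq_len, d_model)
            q, k, v = self.q(x), self.k(x), self.v(x)
            # Attention weights over sequence positions, scaled by sqrt(d_k)
            scores = q @ k.transpose(-2, -1) / math.sqrt(x.size(-1))
            return torch.softmax(scores, dim=-1) @ v

    attn = SelfAttention(64)
    print(attn(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])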
3. LLM Training
Mar 14
Slides
Recording
Instructor: Ondřej Dušek
After the class, you should be able to:
- Give a high-level description of how neural networks are trained
- Read and understand the documentation of a neural network training library
- Explain the differences between various training techniques used in LLMs today
Class outline:
- Rest of the discussion on Transformers, see above
- General introduction to neural network & Transformer model training, pretrained models, RLHF, and DPO (a minimal training-loop sketch is included under Additional materials below)
Additional materials:
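A minimal sketch of the generic training loop underlying neural network and language model training (plain PyTorch with placeholder data, not the class code): forward pass, cross-entropy loss, backpropagation, optimizer step.

    import torch
    import torch.nn as nn

    model = nn.Linear(16, 100)                  # stand-in for a tiny LM head
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    for step in range(100):
        x = torch.randn(32, 16)                 # placeholder "hidden states"
        y = torch.randint(0, 100, (32,))        # placeholder "next-token" labels
        loss = loss_fn(model(x), y)             # forward pass + loss
        optimizer.zero_grad()
        loss.backward()                         # gradients via backpropagation
        optimizer.step()                        # parameter update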
4. LLM Inference
Mar 21
Slides
Code
Recording
Instructor: Zdeněk Kasner
After the class, you should be able to:
- Give a high-level description of how a transformer predicts a probability distribution for the next token in the sequence
- Select the appropriate decoding algorithm for your use-case and understand its parameters
- Write a Python code snippet for generating text with an open language model using the transformers library
Class outline:
- Discussion, LLM zoo
- 3D visualization of transformer inference
- Decoding algorithms - exact inference (MAP), greedy search, beam search, top-k, top-p, Mirostat, locally typical sampling
- Hands-on demonstration of text generation with the transformers library (a small generation sketch is included under Additional materials below)
- Bonus: non-autoregressive decoding, reverse-engineering decoding algorithms
Additional materials:
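A minimal sketch of generating text with the transformers library, contrasting greedy decoding with top-p (nucleus) sampling; the model name and prompt are only examples.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"                         # any small open causal LM works here
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    inputs = tokenizer("The weather today is", return_tensors="pt")

    # Greedy decoding: always pick the single most probable next token.
    greedy_ids = model.generate(**inputs, max_new_tokens=30, do_sample=False)

    # Top-p (nucleus) sampling: sample from the smallest set of tokens whose
    # cumulative probability exceeds p.
    sampled_ids = model.generate(**inputs, max_new_tokens=30, do_sample=True,
                                 top_p=0.9, temperature=0.8)

    print(tokenizer.decode(greedy_ids[0], skip_special_tokens=True))
    print(tokenizer.decode(sampled_ids[0], skip_special_tokens=True))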
5. Generating Weather Reports
Mar 28
Assignment
Assignment #1
After the class, you should be able to:
- Write basic Python code querying an LLM through an OpenAI-like API (a minimal sketch is included under Additional materials below).
- Set up a suitable prompt and parameters to get the expected output.
- Describe the opportunities and limits of recent open LLMs.
Class outline:
- Introduction
- Working on the assignment
Additional materials:
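A minimal sketch (not the assignment solution) of querying an LLM served behind an OpenAI-compatible API, e.g. a locally hosted open model; the base_url, API key, and model name are placeholders.

    from openai import OpenAI

    # Placeholder endpoint and model name; substitute whatever server you use.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

    response = client.chat.completions.create(
        model="my-local-model",
        messages=[
            {"role": "system", "content": "You write short weather reports."},
            {"role": "user", "content": "Temperature 12 °C, light rain, wind 5 m/s."},
        ],
        temperature=0.7,
        max_tokens=120,
    )
    print(response.choices[0].message.content)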
6. Data and Evaluation
Apr 4
Lecture notes
Instructor: Jindřich Helcl
After the class, you should be able to:
- Find a dataset for a specified NLP task (provided the task is reasonably common)
- Roughly assess the usefulness of the dataset based on its statistics
- Pick an evaluation method that suits the task
- Have a sense of what a "reasonable" score in that task might look like
Class outline:
- Data for language modeling
- NLP tasks and data (introduction + team work)
- Evaluation (introduction + team work); a small metric-computation sketch is included under Additional materials below
Additional materials:
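A minimal sketch of computing an automatic evaluation metric; sacrebleu and the toy sentence pair are only an example for a translation task, not a recommendation from the class.

    import sacrebleu

    hypotheses = ["The cat sat on the mat."]
    references = [["The cat is sitting on the mat."]]   # one reference per hypothesis

    # Corpus-level BLEU over the whole test set (here, a single sentence).
    bleu = sacrebleu.corpus_bleu(hypotheses, references)
    print(bleu.score)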
7. Evaluation, Working with the Models
Apr 11
MCQA Evaluation
Speech Translation
LLMs for Machine Translation
Chain-of-thought Prompting; RAG
Generation; Evaluation; Web navigation
Experience with LLMs
Recording
Class outline:
- Remarks on LLM evaluation on the multiple-choice question answering task
- Speech translation challenges
- Using LLMs for machine translation
- Chain-of-thought prompting, retrieval-augmented generation
- Generation, evaluation and Web navigation using LLMs
- Experience with using LLMs within the EDU-AI project, Task-oriented Dialogue
8. LLM Efficiency
Apr 18
Assignment review
Efficiency
Recording
Instructor: Tomasz Limisiewicz
After the class, you should be able to:
- Identify technical bottlenecks constraining inference and training with LLMs
- Know methods that enable using LLMs under computational restrictions:
- parameter-efficient fine-tuning (a LoRA sketch follows the class outline below),
- quantization,
- picking the right model scale for your data.
Class outline:
- Assignment 1 review
- Time and space requirements of LLMs
- Low-rank adaptation
- Quantization
- Scaling
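A minimal sketch of the low-rank adaptation (LoRA) idea in plain PyTorch (not the class code): the pretrained weight matrix stays frozen, and only a low-rank update with far fewer parameters is trained.

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        def __init__(self, base, r=8, alpha=16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():    # freeze the pretrained weights
                p.requires_grad = False
            # Low-rank factors: only these are trained.
            self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, r))
            self.scaling = alpha / r

        def forward(self, x):
            # Original output plus the scaled low-rank update.
            return self.base(x) + (x @ self.A.T @ self.B.T) * self.scaling

    layer = LoRALinear(nn.Linear(768, 768))
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    print(trainable)                            # far fewer than 768 * 768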
9. Multilinguality
Apr 25
Slides
Recording
Assignment
Instructor: Tomasz Limisiewicz
Assignment #2
After the class, you should be able to:
- Name benefits of multilingual language models and cross-lingual transfer.
- Pick a multilingual model suitable for a specific language based on its training data, coverage of similar languages, and tokenizer properties (a small tokenization sketch follows the class outline below).
Class outline:
- Guided discussions: why do we train multilingual LMs? How to train multilingual LMs?
- Availability of data across languages and their resource levels.
- Variability of languages: typology and writing systems
- Multilingual tokenization
- Application of LLMs for machine translation
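A minimal sketch of inspecting multilingual tokenization: counting how many subword tokens the same short sentence needs in different languages is a rough proxy for how well a tokenizer (and the model behind it) covers a language. The model name and sentences are only examples.

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

    sentences = {
        "en": "The weather is nice today.",
        "cs": "Dnes je hezké počasí.",
        "sw": "Hali ya hewa ni nzuri leo.",
    }
    for lang, text in sentences.items():
        tokens = tokenizer.tokenize(text)
        # More tokens per sentence usually means weaker coverage of that language.
        print(lang, len(tokens), tokens)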
10. LLMs for Speech-to-Text
May 2
Slides
Recording
Instructors: Peter Polák, Dominik Macháček
After the class, you should know:
- Motivation for speech in LLMs
- The basics of speech-to-text methods, with examples
- Real-time methods
Class outline:
- Speech NLP tasks (ASR, translation, emotion recognition, …)
- Speech in NNs (sound representation, MFCC, raw audio) and in LLMs (Wav2vec, HuBERT, Whisper); an MFCC extraction sketch follows the outline below
- Simultaneous methods: re-translation vs. incremental
- Streaming policies: wait-k and LocalAgreement
- Whisper-Streaming and ELITR demo
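A minimal sketch of extracting MFCC features from raw audio, one of the sound representations mentioned above; librosa and its bundled example recording are an assumption here, not part of the class materials.

    import librosa

    # Load a bundled example recording, resampled to 16 kHz.
    audio, sr = librosa.load(librosa.example("trumpet"), sr=16000)

    # 13 MFCC coefficients per analysis frame.
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)
    print(mfcc.shape)   # (13, number_of_frames)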
11. Reading Research Papers
May 9
Slides
Recording
Instructors: Jindřich Libovický and Jindřich Helcl
After the class, you should be able to:
- Find metadata on research articles and use it to judge paper quality
- Identify strengths and weaknesses of research papers
Class outline:
- Summary of assignment no. 2
- Presentation of the assigned reading in groups
- Basmov, Victoria, Yoav Goldberg, and Reut Tsarfaty. "Simple Linguistic Inferences of Large Language Models (LLMs): Blind Spots and Blinds." arXiv preprint arXiv:2305.14785 (2023).
- Schick, Timo, et al. "Toolformer: Language models can teach themselves to use tools." Advances in Neural Information Processing Systems 36 (2024).
- Simone Balloccu, Patrícia Schmidtová, Mateusz Lango, and Ondřej Dušek. 2024. "Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs." Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2024).
- Jiang, Albert Q., et al. "Mistral 7B." arXiv preprint arXiv:2310.06825 (2023).
- Yoon, Dongkeun, et al. "LangBridge: Multilingual Reasoning Without Multilingual Supervision." arXiv preprint arXiv:2401.10695 (2024).
- Discussion of strengths and weaknesses of the papers
- Indicators of paper quality
12. Understanding and Meaning in LLMs
May 16
Slides
Recording
Instructor: Tomáš Musil
After the class, you should be able to:
- Understand that meaning, understanding and language are not singleton concepts
- Describe the relevant thought experiments: Chinese room, Blockhead, Octopus
- Describe the experiments that show LLMs learning at least some extent of meaning from form only
Class outline:
- Meaning, understanding, language
- Why it may be impossible for LLMs to learn meaning and what suggests it might be possible
- Ethical questions surrounding the training and use of LLMs
Additional materials:
Active participation
There will be two or three tasks during the semester; we will work on them mainly
during classes, but they might turn into (small) homework assignments.
Reading assignments
You will be asked at least once to read a paper before the class.
Final written test
You need to take part in a final written test that will not be graded.