Large Language Models


Goals of the course:

  1. Explain how the models work
  2. Teach basic usage of the models
  3. Help students critically assess what they read about them
  4. Encourage thinking about the broader context of using the models

Syllabus from SIS:

  • Basics of neural networks for language modeling
  • Language model typology
  • Data acquisition and curation, downstream tasks
  • Training (self-supervised learning, reinforcement learning with human feedback)
  • Finetuning & Inference
  • Multilinguality and cross-lingual transfer
  • Large Language Model Applications (e.g., conversational systems, robotics, code generation)
  • Multimodality (CLIP, diffusion models)
  • Societal impacts
  • Interpretability

About

SIS code: NPFL140
Semester: summer
E-credits: 3
Examination: 0/2 C
Guarantors: Jindřich Helcl, Jindřich Libovický

Timespace Coordinates

The course is held on Thursdays at 15:40 in S3. The first lecture took place on 22 February.

Lectures

1. Introductory notes and discussion on large language models Slides

2. The Transformer Architecture Lecture notes Slides

3. LLM Training Slides Recording

4. LLM Inference Slides Code Recording

5. Generating Weather Reports Assignment

6. Data and Evaluation Lecture notes

7. Evaluation, Working with the Models MCQA Evaluation · Speech Translation · LLMs for Machine Translation · Chain-of-thought Prompting; RAG · Generation; Evaluation; Web navigation · Experience with LLMs · Recording

8. LLM Efficiency Assignment review Efficiency Recording

9. Multilinguality Slides Recording Assignment

10. LLMs for Speech-to-Text Slides Recording

11. Reading Research Papers Slides Recording

12. Understanding and Meaning in LLMs Slides Recording

License

Unless otherwise stated, teaching materials for this course are available under CC BY-SA 4.0.

1. Introductory notes and discussion on large language models

 Feb 22 Slides

Instructor: Jindřich Helcl

Covered topics: aims of the course, passing requirements. We discussed what (large) language models are, what they are for, and what their benefits and downsides are. We concluded with a rough analysis of ChatGPT performance in different languages.

2. The Transformer Architecture

 Mar 7 Lecture notes Slides

Instructor: Jindřich Libovický

After the class, you should be able to:

  • Explain the building blocks of the Transformer architecture to a non-technical person
  • Describe the Transformer architecture using equations, especially the self-attention block
  • Implement the Transformer architecture (in PyTorch or another framework with automated differentiation)
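
A minimal sketch of a single scaled dot-product self-attention head in PyTorch is given below; the dimensions are arbitrary, and masking, multiple heads, layer normalization, and dropout are left out, so treat it as a starting point rather than the full architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttentionHead(nn.Module):
    """A single scaled dot-product self-attention head (no masking, no dropout)."""

    def __init__(self, d_model: int, d_head: int):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_head)
        self.k_proj = nn.Linear(d_model, d_head)
        self.v_proj = nn.Linear(d_model, d_head)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        # Attention weights: softmax over QK^T / sqrt(d_head), taken along the key axis.
        scores = q @ k.transpose(-2, -1) / (k.size(-1) ** 0.5)
        weights = F.softmax(scores, dim=-1)
        return weights @ v  # (batch, seq_len, d_head)

# Toy usage: a batch of 2 "sentences", each with 5 token embeddings of size 16.
head = SelfAttentionHead(d_model=16, d_head=8)
print(head(torch.randn(2, 5, 16)).shape)  # torch.Size([2, 5, 8])
```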

Class outline:

Additional materials:

3. LLM Training

 Mar 14 Slides Recording

Instructor: Ondřej Dušek

After the class, you should be able to:

  • Give a high-level description of how neural networks are trained
  • Read and understand the documentation of a neural-network training library
  • Explain the differences between various training techniques used in LLMs today

Class outline:

  • Rest of the discussion on Transformers, see above
  • General introduction to neural network & transformer model training, pretrained models, RLHF, DPO
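
To make the high-level picture concrete, here is a minimal sketch of a self-supervised (next-token prediction) training loop in PyTorch; the toy model and the random token ids are placeholders rather than the setup discussed in the lecture.

```python
import torch
import torch.nn as nn

# Toy "language model": an embedding followed by a linear layer predicting the next token.
vocab_size, d_model, seq_len, batch_size = 100, 32, 16, 8
model = nn.Sequential(nn.Embedding(vocab_size, d_model), nn.Linear(d_model, vocab_size))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    # Random token ids stand in for real training text.
    tokens = torch.randint(0, vocab_size, (batch_size, seq_len + 1))
    inputs, targets = tokens[:, :-1], tokens[:, 1:]  # shift by one: predict the next token

    logits = model(inputs)  # (batch, seq_len, vocab_size)
    loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))

    optimizer.zero_grad()
    loss.backward()   # backpropagation computes the gradients
    optimizer.step()  # gradient-based update of the parameters
```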

Additional materials:

4. LLM Inference

 Mar 21 Slides Code Recording

Instructor: Zdeněk Kasner

After the class, you should be able to:

  • Give a high-level description of how a transformer predicts a probability distribution for the next token in the sequence
  • Select the appropriate decoding algorithm for your use-case and understand its parameters
  • Write a Python code snippet for generating text with an open language model using the transformers library
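
For the last point, a minimal sketch of text generation with the transformers library is shown below; the model name and the sampling parameters are illustrative choices, not the ones used in class.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any open causal LM works here; "gpt2" is just a small illustrative choice.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The weather in Prague today is", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=30,
    do_sample=True,   # sample instead of greedy search
    top_k=50,         # keep only the 50 most likely tokens
    top_p=0.9,        # nucleus (top-p) sampling
    temperature=0.8,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```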

Class outline:

  • Discussion, LLM zoo
  • 3D visualization of transformer inference
  • Decoding algorithms: exact inference (MAP), greedy search, beam search, top-k, top-p, Mirostat, locally typical sampling
  • Hands-on demonstration of text generation with the transformers library
  • Bonus: non-autoregressive decoding, reverse-engineering decoding algorithms

Additional materials:

5. Generating Weather Reports

 Mar 28 Assignment

Assignment #1

After the class, you should be able to:

  • Write basic Python code for querying an LLM through an OpenAI-like API (a minimal sketch follows after this list).
  • Set up a suitable prompt and parameters to get the expected output.
  • Describe the opportunities and limits of recent open LLMs.
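
A minimal sketch of such a query with the openai Python client follows; the base URL, API key, and model name are placeholders for whatever endpoint the assignment uses.

```python
from openai import OpenAI

# Placeholder endpoint and credentials -- substitute those given in the assignment.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="some-open-model",  # placeholder model name
    messages=[
        {"role": "system", "content": "You are a weather reporter."},
        {"role": "user", "content": "Write a short weather report for Prague: 12 °C, light rain."},
    ],
    temperature=0.7,
    max_tokens=150,
)
print(response.choices[0].message.content)
```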

Class outline:

  • Introduction
  • Working on the assignment

Additional materials:

6. Data and Evaluation

 Apr 4 Lecture notes

Instructor: Jindřich Helcl

After the class, you should be able to:

  • Find a dataset for a specified NLP task (provided the task is reasonably common)
  • Roughly assess the usefulness of the dataset based on its statistics
  • Pick an evaluation method that suits the task
  • Have a sense of what a "reasonable" score in that task might look like

Class outline:

  • Data for language modeling
  • NLP tasks and data (introduction + team work)
  • Evaluation (introduction + team work)
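
As one concrete illustration of this workflow (not taken from the class materials), the sketch below loads a standard machine-translation test set with the datasets library and scores dummy outputs with sacreBLEU via the evaluate library; both the dataset and the metric are illustrative choices.

```python
from datasets import load_dataset
import evaluate

# Illustrative choices: a standard WMT test set and the sacreBLEU metric.
dataset = load_dataset("wmt16", "de-en", split="test")
print(dataset)                    # size and fields give a first sense of usefulness
print(dataset[0]["translation"])  # {'de': '...', 'en': '...'}

metric = evaluate.load("sacrebleu")
# Dummy "system outputs": copying the references shows what a perfect score looks like.
references = [[ex["translation"]["en"]] for ex in dataset.select(range(10))]
predictions = [ref[0] for ref in references]
print(metric.compute(predictions=predictions, references=references)["score"])  # 100.0
```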

Additional materials:

7. Evaluation, Working with the Models

 Apr 11 MCQA Evaluation · Speech Translation · LLMs for Machine Translation · Chain-of-thought Prompting; RAG · Generation; Evaluation; Web navigation · Experience with LLMs · Recording

Class outline:

  • Remarks on LLM evaluation on multiple-choice question answering task
  • Speech translation challenges
  • Using LLMs for machine translation
  • Chain-of-thought prompting, retrieval-augmented generation (see the prompt-assembly sketch below)
  • Generation, evaluation and Web navigation using LLMs
  • Experience with using LLMs within the EDU-AI project, Task-oriented Dialogue
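
Below is a toy sketch of how a retrieval-augmented, chain-of-thought prompt can be assembled; the word-overlap retriever and the hard-coded documents are stand-ins for a real retriever (e.g., dense embeddings) and a real document collection, not the setup presented in class.

```python
# Toy retrieval-augmented generation (RAG) with a chain-of-thought instruction.
documents = [
    "Prague is the capital of the Czech Republic and has about 1.3 million inhabitants.",
    "The Vltava is the longest river within the Czech Republic.",
    "Brno is the second largest city of the Czech Republic.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the query and return the top k."""
    query_words = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(query_words & set(d.lower().split())), reverse=True)
    return ranked[:k]

question = "How many people live in the capital of the Czech Republic?"
context = "\n".join(retrieve(question, documents))

# Retrieved context plus an explicit chain-of-thought instruction.
prompt = (
    f"Context:\n{context}\n\n"
    f"Question: {question}\n"
    "Let's think step by step before giving the final answer."
)
print(prompt)  # this string would then be sent to an LLM, e.g. via the API from lecture 5
```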

8. LLM Efficiency

 Apr 18 Assignment review Efficiency Recording

Instructor: Tomasz Limisiewicz

After the class, you should be able to:

  • Identify technical bottlenecks constraining inference and training with LLMs
  • Know methods that enable using LLMs under computational constraints (sketched below):
    • parameter-efficient fine-tuning,
    • quantization,
    • picking the right model scale for your data.
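
A minimal sketch combining two of these methods with the transformers and peft libraries is shown below; the checkpoint is an arbitrary small model picked for illustration, and the 4-bit loading assumes a GPU with bitsandbytes installed.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Illustrative small checkpoint; any causal LM with q_proj/v_proj attention modules works.
model_name = "facebook/opt-125m"

# Quantization: load the base weights in 4-bit precision (needs a GPU and bitsandbytes).
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb_config)

# Parameter-efficient fine-tuning: train small low-rank adapters instead of the full model.
lora_config = LoraConfig(
    r=8,                                  # rank of the adapter matrices
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```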

Class outline:

  • Assignment 1 review
  • Time and space requirements of LLMs
  • Low-rank adaptation
  • Quantization
  • Scaling

9. Multilinguality

 Apr 25 Slides Recording Assignment

Instructor: Tomasz Limisiewicz

Assignment #2

After the class, you should be able to:

  • Name the benefits of multilingual language models and cross-lingual transfer.
  • Pick a multilingual model suitable for a specific language based on its training data, coverage of similar languages, and tokenizer properties.
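
One way to inspect tokenizer properties is to compare how many tokens a multilingual tokenizer needs for sentences of roughly the same meaning in different languages; the sketch below does this with an illustrative checkpoint and made-up example sentences.

```python
from transformers import AutoTokenizer

# Illustrative multilingual tokenizer; any multilingual checkpoint can be compared this way.
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

sentences = {
    "English": "The weather is nice today.",
    "Czech": "Dnes je hezké počasí.",
    "Japanese": "今日はいい天気です。",
}

# Fewer tokens for a sentence of the same meaning usually indicates better tokenizer
# coverage of that language (lower "fertility").
for language, sentence in sentences.items():
    tokens = tokenizer.tokenize(sentence)
    print(f"{language:>8}: {len(tokens):2d} tokens  {tokens}")
```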

Class outline:

  • Guided discussions: why do we train multilingual LMs? How do we train them?
  • Availability of data across languages, resource levels
  • Variability of languages: typology and writing systems
  • Multilingual tokenization
  • Application of LLMs for machine translation

10. LLMs for Speech-to-Text

 May 2 Slides Recording

Instructors: Peter Polák, Dominik Macháček

After the class, you should know:

  • Motivation for speech in LLMs
  • Basic speech-to-text methods and representative examples (see the sketch after this list)
  • Real-time methods
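
As a pointer to how such models are used in practice, below is a minimal sketch of speech recognition and speech translation with Whisper through the transformers pipeline; the model size and the audio file name are placeholders.

```python
from transformers import pipeline

# Whisper via the ASR pipeline; "whisper-small" and the audio file are placeholders.
asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")

# Transcription: the pipeline accepts a path to a local audio file.
result = asr("lecture_clip.wav")
print(result["text"])

# The same model can also translate the speech into English text.
translated = asr("lecture_clip.wav", generate_kwargs={"task": "translate"})
print(translated["text"])
```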

Class outline:

  • Speech NLP tasks (ASR, translation, emotion recognition, …)
  • Speech in NNs (sound representation, MFCC, raw audio) and in LLMs (Wav2vec, HuBERT, Whisper)
  • Simultaneous methods: re-translation vs. incremental
  • Streaming policies: wait-k and LocalAgreement
  • Whisper-Streaming and ELITR demo

11. Reading Research Papers

 May 9 Slides Recording

Instructors: Jindřich Libovický and Jindřich Helcl

After the class, you should be able to:

  • Find metadata on research articles and use it to judge paper quality
  • Identify strengths and weaknesses of research papers

Class outline:

12. Understanding and Meaning in LLMs

 May 16 Slides Recording

Instructor: Tomáš Musil

After the class, you should be able to:

  • Understand that meaning, understanding and language are not singleton concepts
  • Describe the relevant thought experiments: Chinese room, Blockhead, Octopus
  • Describe the experiments that show LLMs learning at least some extent of meaning from form only

Class outline:

  • Meaning, understanding, language
  • Why it may be impossible for LLMs to learn meaning, and what suggests it might be possible
  • Ethical questions surrounding the training and use of LLMs

Additional materials:

Active participation

There will be two or three tasks during the semester; we will work on them mainly during classes, but they may turn into (small) homework assignments.

Reading assignments

You will be asked at least once to read a paper before the class.

Final written test

You need to take part in a final written test that will not be graded.