Large Language Models

There are no elephants in this picture

Goals of the course:

  1. Explain how the models work
  2. Teach basic usage of the models
  3. Help students critically assess what you read about them
  4. Encourage thinking about the broader context of using the models

Syllabus from SIS:

  • Basics of neural networks for language modeling
  • Language model typology
  • Data acquisition and curation, downstream tasks
  • Training (self-supervised learning, reinforcement learning with human feedback)
  • Finetuning & Inference
  • Multilinguality and cross-lingual transfer
  • Large Language Model Applications (e.g., conversational systems, robotics, code generation)
  • Multimodality (CLIP, diffusion models)
  • Societal impacts
  • Interpretability

The course is part of the inter-university programme prg.ai Minor.

About

SIS code: NPFL140
Semester: summer
E-credits: 3
Examination: 0/2 C
Guarantors: Jindřich Helcl, Jindřich Libovický

Timespace Coordinates

The course is held on Mondays at 12:20 in S9.

Lectures

1. Introductory notes and discussion on large language models (Slides)

2. The Transformer architecture (Slides, Notes, Recording)

3. Data and Evaluation, Project Proposals (Project Proposals, Notes, Team Coordination, Team Registration)

License

Unless otherwise stated, teaching materials for this course are available under CC BY-SA 4.0.

1. Introductory notes and discussion on large language models

Feb 17 (Slides)

Instructor: Jindřich Helcl

Covered topics: aims of the course and passing requirements. We discussed what (large) language models are, what they are used for, and what their benefits and downsides are. We briefly talked about the Transformer architecture and concluded with a rough analysis of ChatGPT's performance in different languages.

2. The Transformer architecture

Feb 24 (Slides, Notes, Recording)

Instructor: Jindřich Libovický

Learning objectives. After the lecture you should be able to...

  • Explain the building blocks of the Transformer architecture to a non-technical person;

  • Describe the Transformer architecture using equations, especially the self-attention block;

  • Implement the Transformer architecture (in PyTorch or another framework that does automatic differentiation); see the minimal self-attention sketch below.
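
The core of the self-attention block can be written as Attention(Q, K, V) = softmax(QKᵀ / √d) V. Below is a minimal single-head sketch in PyTorch, intended only as an illustration of that equation: the class name and dimensions are made up for the example, and multi-head attention, masking, and the output projection are deliberately left out.

    import math
    import torch
    import torch.nn as nn

    class SelfAttention(nn.Module):
        """Single-head scaled dot-product self-attention (illustrative sketch)."""

        def __init__(self, d_model: int):
            super().__init__()
            # Learned projections of the input into queries, keys, and values.
            self.q_proj = nn.Linear(d_model, d_model)
            self.k_proj = nn.Linear(d_model, d_model)
            self.v_proj = nn.Linear(d_model, d_model)
            self.d_model = d_model

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x has shape (batch, seq_len, d_model).
            q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
            # Attention(Q, K, V) = softmax(Q K^T / sqrt(d)) V
            scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_model)
            weights = torch.softmax(scores, dim=-1)
            return weights @ v

    # Example: a batch of 2 sequences, 5 tokens each, 16-dimensional embeddings.
    attention = SelfAttention(d_model=16)
    output = attention(torch.randn(2, 5, 16))
    print(output.shape)  # torch.Size([2, 5, 16])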

Additional materials.

3. Data and Evaluation, Project Proposals

Mar 3 (Project Proposals, Notes, Team Coordination, Team Registration)

Instructor: Jindřich Helcl

Learning objectives. After the class you should be able to...

  • identify the different NLP tasks that can be solved with LLMs,

  • look for suitable datasets applicable to the task (both for fine-tuning and evaluation),

  • describe (and be aware of) the different characteristics of the relevant data (size, structure, origin, etc.),

  • design experiments by selecting the right baseline model, data, and metric (and have some idea of what the result should be); see the sketch after this list.
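
As one hypothetical example of such an experiment design (the dataset, baseline, and metric below are illustrative choices, not course requirements), the following sketch uses the Hugging Face datasets and evaluate libraries to score a trivial majority-class baseline on SST-2 with accuracy:

    from datasets import load_dataset
    import evaluate

    # Data: the SST-2 sentiment classification validation split (illustrative choice).
    data = load_dataset("glue", "sst2", split="validation")

    # Baseline: predict the majority class for every example,
    # the weakest sensible point of comparison for a fine-tuned or prompted LLM.
    labels = data["label"]
    majority_label = max(set(labels), key=labels.count)
    predictions = [majority_label] * len(labels)

    # Metric: accuracy, matching how the task is usually evaluated.
    accuracy = evaluate.load("accuracy")
    print(accuracy.compute(predictions=predictions, references=labels))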

Team coordination document. We suggest that teams looking for members, as well as individuals looking for a team, use the following shared document for coordination. You can both look for a team and advertise your team to others.

Team registration form. The registration form for teams is available here.

Individual registration. If you do not succeed in forming or finding a team, we will open an individual registration form later and try to assign you to an existing team.

Project work

You will work on a team project during the semester.

Reading assignments

You will be asked at least once to read a paper before the class.

Final written test

You need to take a final written test; it will not be graded.