Deep Learning Seminar, Summer 2018/19
In recent years, deep neural networks have been used to solve complex machine-learning problems and have achieved state-of-the-art results in many areas. The field of deep learning is developing rapidly, with new methods and techniques emerging steadily.
The goal of the seminar is to follow the newest advancements in the deep learning field. The course takes the form of a reading group: in each lecture, one of the students presents a paper. The paper is announced in advance, so all participants can read it beforehand and take part in the discussion.
If you want to receive announcements about the chosen papers, sign up to our mailing list ufal-rg@googlegroups.com.
About
SIS code: NPFL117
Semester: winter + summer
E-credits: 3
Examination: 0/2 C
Guarantor: Milan Straka
Timespace Coordinates
The Deep Learning Seminar takes place on Tuesdays at 10:40 in S8. The first meeting is on Tuesday, Mar 05.
Requirements
To pass the course, you need to present a research paper and attend the presentations regularly.
License
Unless otherwise stated, teaching materials for this course are available under CC BY-SA 4.0.
To add your name to a paper in the table below, edit the source code on GitHub and send a PR.
Date | Who | Topic | Paper(s) |
---|---|---|---|
05 Mar 2019 | Milan Straka | Optimization | Noam Shazeer, Mitchell Stern: Adafactor: Adaptive Learning Rates with Sublinear Memory Cost<br>Ilya Loshchilov, Frank Hutter: Decoupled Weight Decay Regularization<br>Sashank J. Reddi, Satyen Kale, Sanjiv Kumar: On the Convergence of Adam and Beyond<br>Liangchen Luo, Yuanhao Xiong, Yan Liu, Xu Sun: Adaptive Gradient Methods with Dynamic Bound of Learning Rate<br>(A minimal sketch of decoupled weight decay appears below the table.) |
12 Mar 2019 | No DL Seminar | | |
19 Mar 2019 | Milan Straka | Optimization | James Martens, Roger Grosse: Optimizing Neural Networks with Kronecker-factored Approximate Curvature<br>Roger Grosse, James Martens: A Kronecker-factored approximate Fisher matrix for convolution layers<br>Jimmy Ba, Roger Grosse, James Martens: Distributed Second-Order Optimization using Kronecker-Factored Approximations<br>James Martens, Jimmy Ba: Kronecker-Factored Curvature Approximations for Recurrent Neural Networks<br>Thomas George, César Laurent, Xavier Bouthillier, Nicolas Ballas, Pascal Vincent: Fast Approximate Natural Gradient Descent in a Kronecker-factored Eigenbasis |
26 Mar 2019 | Milan Straka | AutoML | Hieu Pham, Melody Y. Guan, Barret Zoph, Quoc V. Le, Jeff Dean: Efficient Neural Architecture Search via Parameter Sharing<br>Hanxiao Liu, Karen Simonyan, Yiming Yang: DARTS: Differentiable Architecture Search<br>Han Cai, Ligeng Zhu, Song Han: ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware<br>David R. So, Chen Liang, Quoc V. Le: The Evolved Transformer |
02 Apr 2019 | Martin Víta | NLP | Alexis Conneau, Douwe Kiela, Holger Schwenk, Loic Barrault, Antoine Bordes: Supervised Learning of Universal Sentence Representations from Natural Language Inference Data<br>Adam Poliak, Jason Naradowsky, Aparajita Haldar, Rachel Rudinger, Benjamin Van Durme: Hypothesis Only Baselines in Natural Language Inference<br>Amit Gajbhiye, Sardar Jaf, Noura Al Moubayed, A. Stephen McGough, Steven Bradley: An Exploration of Dropout with RNNs for Natural Language Inference |
09 Apr 2019 | Tomas Soucek | GANs | Zhiming Zhou, Yuxuan Song, Lantao Yu, Hongwei Wang, Jiadong Liang, Weinan Zhang, Zhihua Zhang, Yong Yu: Understanding the Effectiveness of Lipschitz-Continuity in Generative Adversarial Nets<br>Takeru Miyato, Toshiki Kataoka, Masanori Koyama, Yuichi Yoshida: Spectral Normalization for Generative Adversarial Networks |
16 Apr 2019 | Jakub Arnold | Glow | Laurent Dinh, David Krueger, Yoshua Bengio: NICE: Non-linear Independent Components Estimation<br>Laurent Dinh, Jascha Sohl-Dickstein, Samy Bengio: Density estimation using Real NVP<br>Diederik P. Kingma, Prafulla Dhariwal: Glow: Generative Flow with Invertible 1x1 Convolutions |
23 Apr 2019 | Tomáš Gavenčiak | Value learning | Joel Lehman et al.: The Surprising Creativity of Digital Evolution<br>P. Abbeel, A. Y. Ng: Apprenticeship learning via inverse reinforcement learning<br>Paul Christiano et al.: Deep reinforcement learning from human preferences<br>Possibly other IRLs (MaxEnt, Bayesian), notes on Cooperative IRL, Corrupt Reward MDP, Inverse Game Theory. |
30 Apr 2019 | Štěpán Hojdar | Computer vision | Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár: Focal Loss for Dense Object Detection<br>Xuebo Liu, Ding Liang, Shi Yan, Dagui Chen, Yu Qiao, Junjie Yan: FOTS: Fast Oriented Text Spotting with a Unified Network |
07 May 2019 | Petra Doubravová | RL as planning | Danijar Hafner, Timothy Lillicrap, Ian Fischer, Ruben Villegas, David Ha, Honglak Lee, James Davidson: Learning Latent Dynamics for Planning from Pixels (https://planetrl.github.io/) |
07 May 2019 | Felipe Vianna | RL credit assignment | Jose A. Arjona-Medina, Michael Gillhofer, Michael Widrich, Thomas Unterthiner, Johannes Brandstetter, Sepp Hochreiter: RUDDER: Return Decomposition for Delayed Rewards |
14 May 2019 | No DL Seminar | Rector's Day | |
21 May 2019 | Surya Prakash | AutoRL | Hao-Tien Lewis Chiang, Aleksandra Faust, Marek Fiser, Anthony Francis: Learning Navigation Behaviors End-to-End with AutoRL<br>Anthony Francis, Aleksandra Faust, Hao-Tien Lewis Chiang, Jasmine Hsu, J. Chase Kew, Marek Fiser, Tsang-Wei Edward Lee: Long-Range Indoor Navigation with PRM-RL |
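The decoupled weight decay idea from the 05 Mar optimization session (Loshchilov & Hutter) is simple enough to show in a few lines. Below is a minimal NumPy sketch of one AdamW-style update, assuming the caller keeps the per-parameter moment buffers; the function name and hyperparameter defaults are illustrative, not the authors' implementation:

```python
import numpy as np

def adamw_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One AdamW-style update (illustrative sketch, not the paper's code)."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad**2     # second-moment estimate
    m_hat = m / (1 - beta1**t)                # bias corrections for step t
    v_hat = v / (1 - beta2**t)
    # Decoupled decay: the weight_decay * w term is applied directly to the
    # weights, instead of being added to grad (which would be plain L2/Adam).
    w = w - lr * (m_hat / (np.sqrt(v_hat) + eps) + weight_decay * w)
    return w, m, v
```

The only difference from Adam with L2 regularization is the last update line: the decay term never enters the moment estimates, which is the point of the paper.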
AI Safety and Inverse Reinforcement Learning Materials
Even though the talk by Tomáš Gavenčiak was cancelled, you can at least study the following materials, which he kindly sent us:
Intro and motivation:
- General intro video (5m) from Stuart Russell (author of AIMA)
- A nice video example of reward hacking in RL (OpenAI blog)
- A paper with more value mis-specification examples (PDF with some pictures).
Why it is hard, and why the hard parts are important:
- Really nice talk (90m) from Yudkowsky on the AI alignment problem, with concrete math and simple models. One part (here in the video) deals with counterexamples to even simple problem specifications (e.g. giving the AI an off-switch and making it not want to interfere with it).
Inverse reinforcement learning and some variants:
- Nice short video introduction with a basic SGD-based algorithm (3m).
- Complete lecture from CVPR18, covering both an overview and the maths. Has parts on Maximum Entropy IRL (here) and also GAIL and other advanced techniques later. A sketch of the MaxEnt IRL gradient step follows this list.
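To make the "basic SGD-based algorithm" above concrete, here is a hedged NumPy sketch of one ascent step of sample-based Maximum Entropy IRL with a linear reward R(tau) = theta . f(tau); `sample_trajectory_features` is a hypothetical helper standing in for trajectory sampling from a planner or the current policy:

```python
import numpy as np

def maxent_irl_step(theta, demo_features, sample_trajectory_features,
                    n_samples=64, lr=0.1):
    """One ascent step on the MaxEnt IRL log-likelihood (crude sketch).

    demo_features: (n_demos, d) per-trajectory feature counts of expert demos.
    sample_trajectory_features: hypothetical helper returning an
        (n_samples, d) array of feature counts of sampled trajectories.
    """
    samples = sample_trajectory_features(n_samples)
    scores = samples @ theta                      # reward of each sampled trajectory
    weights = np.exp(scores - scores.max())       # unnormalized p_theta weights
    weights /= weights.sum()                      # self-normalized importance weights
    expected = weights @ samples                  # model feature expectations
    grad = demo_features.mean(axis=0) - expected  # demo minus model feature counts
    return theta + lr * grad
```

The gradient is exactly the demonstrated feature expectations minus the feature expectations under the current reward, which is the defining property of the MaxEnt formulation.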
Example of success with a simpler model:
- Learning from human preferences (OpenAI blog with videos): teaching a "Hopper" figure to do a backflip, learning the reward function solely from human comparisons of pairs of short video clips (fewer than 1,000 comparisons). A minimal sketch of the preference loss follows.
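The mechanism behind the backflip result is compact enough to sketch: a reward model is trained with a Bradley-Terry-style cross-entropy loss on human choices between two clips. A minimal NumPy sketch, assuming per-step reward predictions for each clip are already computed (names are illustrative, not the OpenAI code):

```python
import numpy as np

def preference_loss(r_hat_a, r_hat_b, human_prefers_a):
    """Cross-entropy loss on one labeled pair of clips (illustrative sketch).

    r_hat_a, r_hat_b: arrays of predicted per-step rewards for clips A and B.
    human_prefers_a: 1.0 if the human chose clip A, 0.0 if clip B.
    """
    total_a, total_b = r_hat_a.sum(), r_hat_b.sum()
    # Bradley-Terry model: P(A preferred) = sigmoid of the reward difference.
    p_a = 1.0 / (1.0 + np.exp(total_b - total_a))
    p_a = np.clip(p_a, 1e-7, 1 - 1e-7)  # avoid log(0)
    return -(human_prefers_a * np.log(p_a)
             + (1 - human_prefers_a) * np.log(1 - p_a))
```

Minimizing this loss over many labeled pairs yields a reward model that an RL agent can then optimize, which is how fewer than a thousand comparisons can suffice.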