Deep Learning Seminar, Winter 2019/20
In recent years, deep neural networks have been used to solve complex machine-learning problems, achieving state-of-the-art results in many areas. The field of deep learning has been developing rapidly, with new methods and techniques emerging steadily.
The goal of the seminar is to follow the newest advancements in the deep learning field. The course takes the form of a reading group – each week, one of the students presents a paper. The paper is announced in advance, so all participants can read it beforehand and take part in the discussion.
If you want to receive announcements about the chosen papers, sign up for our mailing list ufal-rg@googlegroups.com.
About
SIS code: NPFL117
Semester: winter + summer
E-credits: 3
Examination: 0/2 C
Guarantor: Milan Straka
Timespace Coordinates
The Deep Learning Seminar takes place on Mondays at 12:20 in S10. The first meeting is on Monday, Oct 07.
Requirements
To pass the course, you need to present a research paper and attend the presentations regularly.
License
Unless otherwise stated, teaching materials for this course are available under CC BY-SA 4.0.
To add your name to a paper in the table below, edit the source code on GitHub and send a PR.
Date | Who | Topic | Paper(s) |
---|---|---|---|
07 Oct 2019 | Milan Straka | Transformer (see the attention sketch below the table) | Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin: Attention Is All You Need<br>Peter Shaw, Jakob Uszkoreit, Ashish Vaswani: Self-Attention with Relative Position Representations<br>Cheng-Zhi Anna Huang, Ashish Vaswani, Jakob Uszkoreit, Noam Shazeer, Ian Simon, Curtis Hawthorne, Andrew M. Dai, Matthew D. Hoffman, Monica Dinculescu, Douglas Eck: Music Transformer<br>Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov: Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context |
14 Oct 2019 | Ondřej Měkota | Transformer | Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova: BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding<br>Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le: XLNet: Generalized Autoregressive Pretraining for Language Understanding |
21 Oct 2019 | Tomas Soucek | 3D Pointclouds | Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space<br>Christopher Choy, JunYoung Gwak, Silvio Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks<br>Christopher Choy, Jaesik Park, Vladlen Koltun: Fully Convolutional Geometric Features |
28 Oct 2019 | No DL seminar | Czech Independence Day | |
04 Nov 2019 | Zdeněk Kasner | Neural LMs | Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever: Language Models are Unsupervised Multitask Learners (OpenAI blog post)<br>Sandeep Subramanian, Raymond Li, Jonathan Pilault, Christopher Pal: On Extractive and Abstractive Neural Document Summarization with Transformer Language Models |
11 Nov 2019 (45 min) | Viktor Vašátko | Adversarial Examples | Xiaoyong Yuan, Pan He, Qile Zhu, Xiaolin Li: Adversarial Examples: Attacks and Defenses for Deep Learning |
11 Nov 2019 (45 min) | Jan Vainer | Normalizing flows, Real NVP, Glow | Laurent Dinh, Jascha Sohl-Dickstein, Samy Bengio: Density estimation using Real NVP<br>Diederik P. Kingma, Prafulla Dhariwal: Glow: Generative Flow with Invertible 1x1 Convolutions |
18 Nov 2019 | Memduh Gokirmak, Abhishek Agrawal | NN Interpretation | John Hewitt, Christopher D. Manning: A Structural Probe for Finding Syntax in Word Representations<br>Ning Mei, Usman Sheikh, Roberto Santana, David Soto: How the brain encodes meaning: Comparing word embedding and computer vision models to predict fMRI data during visual word recognition |
25 Nov 2019 | Erdi Düzel | Image Segmentation | Xiaomei Zhao, Yihong Wu, Guidong Song, Zhenye Li, Yazhuo Zhang, Yong Fan: A deep learning model integrating FCNNs and CRFs for brain tumor segmentation |
02 Dec 2019 | Milan Straka | Optimization | Yang You, Jing Li, Sashank Reddi, Jonathan Hseu, Sanjiv Kumar, Srinadh Bhojanapalli, Xiaodan Song, James Demmel, Kurt Keutzer, Cho-Jui Hsieh: Large Batch Optimization for Deep Learning: Training BERT in 76 minutes<br>Michael R. Zhang, James Lucas, Geoffrey Hinton, Jimmy Ba: Lookahead Optimizer: k steps forward, 1 step back<br>Liyuan Liu, Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, Jiawei Han: On the Variance of the Adaptive Learning Rate and Beyond |
09 Dec 2019 | Václav Volhejn | Overfitting and generalization | Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals: Understanding deep learning requires rethinking generalization<br>Mikhail Belkin, Daniel Hsu, Siyuan Ma, Soumik Mandal: Reconciling modern machine learning practice and the bias-variance trade-off<br>Hartmut Maennel, Olivier Bousquet, Sylvain Gelly: Gradient Descent Quantizes ReLU Network Features |
16 Dec 2019 | David Samuel | Neural ODEs | Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, David Duvenaud: Neural Ordinary Differential Equations<br>Emilien Dupont, Arnaud Doucet, Yee Whye Teh: Augmented Neural ODEs<br>Yulia Rubanova, Ricky T. Q. Chen, David Duvenaud: Latent ODEs for Irregularly-Sampled Time Series |
23 Dec 2019 | No DL seminar | Christmas Holiday | |
30 Dec 2019 | No DL seminar | Christmas Holiday | |
06 Jan 2020 | David Kubeša | Entity Linking | Nikolaos Kolitsas, Octavian-Eugen Ganea, Thomas Hofmann: End-to-End Neural Entity Linking<br>Possibly also Samuel Broscheit: Investigating Entity Knowledge in BERT with Simple Neural End-To-End Entity Linking |
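The first two seminars revolve around the Transformer's scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)·V. As a quick refresher before the reading, here is a minimal NumPy sketch of that formula (an illustration of the mechanism only; the shapes and the toy example are our own, not code from the papers):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, as in Vaswani et al."""
    d_k = Q.shape[-1]
    # Similarity of every query to every key, scaled to keep softmax gradients stable.
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax: each query gets a distribution over the keys.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # The output is a convex combination of the value vectors.
    return weights @ V

# Toy example: 3 queries attending over 4 key/value pairs of dimension 8.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 8)
```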
You can choose any paper you find interesting, but if you would like some inspiration, you can look at the following list. The papers are grouped; each group is expected to be presented in one seminar.
Natural Language Processing
-
  - Tomas Mikolov, Edouard Grave, Piotr Bojanowski, Christian Puhrsch, Armand Joulin: Advances in Pre-Training Distributed Word Representations
  - Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer: Deep contextualized word representations
  - Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  - Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le: XLNet: Generalized Autoregressive Pretraining for Language Understanding
Generative Modeling
-
- Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio: Generative Adversarial Networks
- Martin Arjovsky, Soumith Chintala, Léon Bottou: Wasserstein GAN
-
- Tero Karras, Timo Aila, Samuli Laine, Jaakko Lehtinen: Progressive Growing of GANs for Improved Quality, Stability, and Variation
- Andrew Brock, Jeff Donahue, Karen Simonyan: Large Scale GAN Training for High Fidelity Natural Image Synthesis
- Tero Karras, Samuli Laine, Timo Aila: A Style-Based Generator Architecture for Generative Adversarial Networks
-
  - Laurent Dinh, David Krueger, Yoshua Bengio: NICE: Non-linear Independent Components Estimation
  - Laurent Dinh, Jascha Sohl-Dickstein, Samy Bengio: Density estimation using Real NVP
  - Diederik P. Kingma, Prafulla Dhariwal: Glow: Generative Flow with Invertible 1x1 Convolutions (a coupling-layer sketch follows this list)
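The flow papers in the last group all build on the change-of-variables identity log p(x) = log p(z) + log |det ∂z/∂x|, made tractable by invertible transforms with triangular Jacobians. Below is a minimal sketch of a Real NVP-style affine coupling layer (a simplified illustration; the half-and-half split, the lambda "conditioner networks", and all names are our assumptions, not the papers' code):

```python
import numpy as np

def affine_coupling_forward(x, scale_net, shift_net):
    """Real NVP-style coupling: keep x1 fixed, affinely transform x2 given x1.

    Returns z and log|det Jacobian|, which is just the sum of the log-scales
    because the Jacobian is triangular.
    """
    d = x.shape[-1] // 2
    x1, x2 = x[..., :d], x[..., d:]
    log_s, t = scale_net(x1), shift_net(x1)  # conditioner networks
    z2 = x2 * np.exp(log_s) + t
    return np.concatenate([x1, z2], axis=-1), log_s.sum(axis=-1)

def affine_coupling_inverse(z, scale_net, shift_net):
    """Exact inverse: recompute the conditioner from the untouched half."""
    d = z.shape[-1] // 2
    z1, z2 = z[..., :d], z[..., d:]
    log_s, t = scale_net(z1), shift_net(z1)
    x2 = (z2 - t) * np.exp(-log_s)
    return np.concatenate([z1, x2], axis=-1)

# Toy conditioners standing in for small neural networks.
scale_net = lambda h: np.tanh(h)   # bounded log-scale for numerical stability
shift_net = lambda h: 0.5 * h

x = np.random.default_rng(1).normal(size=(5, 6))
z, logdet = affine_coupling_forward(x, scale_net, shift_net)
assert np.allclose(affine_coupling_inverse(z, scale_net, shift_net), x)
```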
Neural Architecture Search (AutoML)
-
  - Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc V. Le: Learning Transferable Architectures for Scalable Image Recognition
  - Esteban Real, Alok Aggarwal, Yanping Huang, Quoc V Le: Regularized Evolution for Image Classifier Architecture Search
  - Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, Kevin Murphy: Progressive Neural Architecture Search
-
  - Hieu Pham, Melody Y. Guan, Barret Zoph, Quoc V. Le, Jeff Dean: Efficient Neural Architecture Search via Parameter Sharing
  - Hanxiao Liu, Karen Simonyan, Yiming Yang: DARTS: Differentiable Architecture Search
Networks with External Memory
-
  - Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, Timothy Lillicrap: One-shot Learning with Memory-Augmented Neural Networks
-
  - Mark Collier, Joeran Beel: Memory-Augmented Neural Networks for Machine Translation
Optimization
-
  - Michael R. Zhang, James Lucas, Geoffrey Hinton, Jimmy Ba: Lookahead Optimizer: k steps forward, 1 step back (a Lookahead sketch follows this list)
  - Liyuan Liu, Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, Jiawei Han: On the Variance of the Adaptive Learning Rate and Beyond
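The Lookahead title is a one-line description of the algorithm: run an inner optimizer for k fast steps, then move the slow weights a fraction α toward the result and restart the fast weights from there. Here is a minimal NumPy sketch with plain SGD as the inner optimizer (the toy quadratic loss and the hyperparameter values are our choices for illustration, not from the paper):

```python
import numpy as np

def lookahead_sgd(grad, w0, lr=0.1, k=5, alpha=0.5, outer_steps=20):
    """Lookahead (Zhang et al.): k fast steps forward, one interpolation step back."""
    slow = w0.copy()
    for _ in range(outer_steps):
        fast = slow.copy()
        for _ in range(k):                 # k steps forward with the inner optimizer
            fast -= lr * grad(fast)
        slow += alpha * (fast - slow)      # 1 step back: interpolate toward fast weights
    return slow

# Toy quadratic loss 0.5 * ||w||^2, whose gradient is w; the minimum is at 0.
w = lookahead_sgd(grad=lambda w: w, w0=np.ones(3))
print(w)  # close to [0, 0, 0]
```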
Adversarial Examples
-
  - Xiaoyong Yuan, Pan He, Qile Zhu, Xiaolin Li: Adversarial Examples: Attacks and Defenses for Deep Learning
  - Jiliang Zhang, Chen Li: Adversarial Examples: Opportunities and Challenges