Topics course Mathematics of Deep Learning, NYU, Spring 2020. CSCI-GA 3033.
Thursdays from 5:10pm-7pm in CIWW 201 on Zoom: https://nyu.zoom.us/j/528868904
(optional) Parallel Curriculum sessions: Fridays 11am-12:15pm in CIWW 102 on Zoom: https://nyu.zoom.us/j/724449246
Campuswire (forum): sign up here and use code 5548
Lecture Instructor: Joan Bruna (bruna@cims.nyu.edu)
Tutors (Parallel Curriculum): Will Whitney (wwhitney@cs.nyu.edu) and David Brandfonbrener (david.brandfonbrener@nyu.edu)
This graduate-level course explores some of the open mathematical challenges arising in the study of Deep Learning. In particular, we will focus on three main topics: (i) signal processing principles underpinning convolutional neural networks and their generalisations to non-Euclidean geometries, (ii) non-convex optimization challenges arising in high-dimensional (or overparametrised) learning problems, and (iii) the geometry of Markov Decision Processes and its role in Reinforcement Learning guarantees.
Besides the lectures, we will also run an optional parallel curriculum during the recitation section on two topics in reinforcement learning (RL). This parallel curriculum builds from the basics of RL to reach seminal modern papers in the field. We will have two segments. The first will lead to Unifying Count-Based Exploration and Intrinsic Motivation, covering bandits, deep Q-learning, and exploration along the way. The second will lead to Guided Policy Search and cover continuous control, trajectory optimization, imitation learning, and constrained optimization.
Background on high-dimensional probability, statistics, and/or harmonic analysis. Familiarity with general Machine Learning topics and basic notions of optimization.
The course will involve weekly required readings before class and a final paper, which will be an in-depth survey of a topic related to the syllabus. A detailed abstract of the final paper will be due in the middle of the term. The abstract and final paper should be submitted on NYU Classes.
The course will be graded in three components: paper abstract, final paper, and participation.
The paper will be written in groups of 3-4, which should be roughly evenly divided among 12 topics. You can sign up with your group on this spreadsheet. There should be no more than one paper per topic. Here is an example of a high-quality paper from a previous year.
The proposal should be two pages. The first page should include a description of the topic area and an outline of the major directions you will cover. The second page should consist of a list of at least 10 papers which you will include in your review. It is due by 10 PM on 3/5 by submission to NYU Classes.
The final survey paper should be around 10-15 pages long. The format is fairly loose, but we recommend either using LaTeX in the JMLR style or creating a web page in the Distill format. Distill is particularly nice if you would like to embed interactive visualizations. It is due by 10 PM on 4/30 by submission to NYU Classes.
Good papers will form a coherent story and make connections to unify the work on a certain topic; poor papers will list the work on a topic without a unifying view or central thesis. Covering a smaller area beautifully will be more successful than a superficial view of a larger topic. We encourage you to explain and clarify the work in your subject area, but you do not need to do original theory or experiments (unless you find them pedagogically convenient). There is no need to explain background techniques which are not the subject of your paper, but make sure to precisely define notation and cite appropriately. Please refer back to the example paper for guidance on style and the level of depth that we’re looking for!
Week | Lecture Date | Topic | Section Date | Parallel topic |
---|---|---|---|---|
1 | 1/30 | Parallel takeover: Introduction to Reinforcement Learning (Lecture notes) | 1/31 | No session |
2 | 2/6 | Lecture 1: The Curse of Dimensionality | 2/7 | Bandits and the Upper confidence bound algorithm |
3 | 2/13 | Lecture 2: Symmetries and Geometric Stability | 2/14 | Deep value-based RL and DQN |
4 | 2/20 | Lecture 3: The Scattering Transform. Topic due. | 2/21 | UCB in tabular RL |
5 | 2/27 | Lecture 4: From Euclidean to Non-Euclidean Stability | 2/28 | Room change: 60FA room 110. Deep RL with principled exploration |
6 | 3/5 | Lecture 5: Convex Optimization. Abstract due. | 3/6 | New directions in exploration for deep RL |
7 | 3/12 | Lecture 6: Discrete vs Continuous Time, Mirror Descent. | 3/11 | Control and linear trajectory optimization |
8 | 3/19 | Spring Break | 3/20 | Spring Break |
9 | 3/26 | Lecture 7: Stochastic Optimization | 3/27 | Iterative linear quadratic regulation |
10 | 4/2 | Lecture 8: Topics in Non-Convex Optimization | 4/3 | Imitation learning |
11 | 4/9 | Lecture 9: Approximation in high-dimensional spaces. | 4/10 | Constrained optimization and ADMM |
12 | 4/16 | Lecture 10: Reproducing Kernel Hilbert Spaces. Measure Spaces. | 4/17 | Guided Policy Search |
13 | 4/23 | Lecture 11: Overparametrised Neural Networks. Lazy and Active regimes | 4/24 | No session |
14 | 4/30 | Lecture 12: Mean-Field Limit of overparametrised neural networks. Paper due. | 5/1 | No session |
15 | 5/7 | Lecture 13: Depth Separation. Open Problems | 5/8 | No session |
The following are some recommended topics for the final paper. If your group doesn’t want to do one of these topics, you can propose something else to us instead.