IFT 6085: Theoretical principles for deep learning (new class)


Research in deep learning produces state-of-the-art results on a number of machine learning tasks. Most of those advances are driven by intuition and massive exploration through trial and error. As a result, theory is currently lagging behind practice. The ML community does not fully understand why the best methods work.

A symptom of this lack of understanding is that deep learning methods largely lack guarantees and interpretability, two properties that are necessary for mission-critical applications. More importantly, a solid theoretical foundation can aid the design of a new generation of efficient methods, without the need for blind trial-and-error exploration.

In this class we will go over a number of recent publications that attempt to shed light on these open questions. Before discussing the new results in each paper, we will first introduce the necessary fundamental tools from optimization, statistics, information theory and statistical mechanics. The purpose of this class is to get students engaged with new research in the area. To that end, the majority of the credit will be given for a class project report and presentation on a relevant topic.

Prerequisites: This is meant to be an advanced graduate class for students who want to engage in theory-driven deep learning research. We will introduce the necessary theoretical tools, but we start with the assumption that students are comfortable with basic probability and linear algebra.


Lecturer: Ioannis Mitliagkas, Office: 3359, André-Aisenstadt

Class info

Winter 2018 semester:

Note the earlier starting time on Thursday. This will allow students to also attend the Advanced RL class at McGill.


Homeworks and midterm: a small percentage of the grade will be based on homeworks, and possibly a midterm exam, testing knowledge of the basic tools we introduce.

Project: a written report and an in-class presentation in April (date TBA).

Tentative topics (to be updated as we go along)