Advanced Optimization (2024 Fall)

Course Information

This course provides an overview of modern optimization theory for machine learning, focusing on gradient-based first-order optimization methods (in both offline and online settings), which serve as the foundational optimization tools for modern large-scale learning tasks.

Announcement

[New! 2025.02.10] The lecture notes will be updated on this website: https://www.pengzhao-ml.com/course/AOptLectureNote

Homework

Homework 2

Homework 1

Course Agenda

Week 1 (09.19): Introduction; Mathematical Background
  Slides: Lecture 1
  Readings: note on matrix norm

Week 2 (09.26): Convex Optimization Basics; Function Properties
  Slides: Lecture 2
  Readings: Chapter 3.2 of Boyd and Vandenberghe's book (on operations that preserve convexity); Chapter 3 of Amir Beck's book (on subgradients); Chapter 5 of Amir Beck's book (on smoothness and strong convexity)

10.03: no lecture (due to the National Holiday)

Week 3 (10.10): GD Methods I: GD method, Lipschitz optimization
  Slides: Lecture 3
  Readings: Chapter 8.2 of Amir Beck's book (on GD methods for convex and strongly convex functions)

Week 4 (10.17): GD Methods II: GD method, smooth optimization, Polyak's momentum, Nesterov's AGD, composite optimization
  Slides: Lecture 4
  Readings: Chapter 3.2 of Bubeck's book (on GD methods for smooth functions); Chapters 14 & 15 of Ashok Cutkosky's lecture notes (on momentum and acceleration); Chapter 10 of Amir Beck's book (on composite optimization and proximal gradient)

Week 5 (10.31): Online Convex Optimization: online gradient descent, online Newton step, exp-concave functions; online-to-batch conversion, SGD
  Slides: Lecture 5
  Readings: Chapter 3 of Hazan's book (on OGD for convex and strongly convex functions); Chapter 4 of Hazan's book (on ONS for exp-concave functions)

Week 6 (11.01): Online Mirror Descent: expert problem, Hedge, mirror descent, Bregman divergence, FTRL, dual averaging, mirror map
  Slides: Lecture 6
  Readings: Lecture Note 2 of Luo's course (on the PEA problem); Chapter 6 of Orabona's note (on OMD); Chapter 7 of Orabona's note (on FTRL); Chapter 4 of Bubeck's book (on MD and dual averaging)

Week 7 (11.07): Adaptive Online Convex Optimization: small-loss bound, self-confident tuning
  Slides: Lecture 7
  Readings: Lecture Note 4 of Luo's course (on small-loss PEA); Chapter 4.2 of Orabona's note (on small-loss OCO)

Week 8 (11.21): Optimistic Online Mirror Descent: predictable sequence, small-loss bound, gradient-variance bound, gradient-variation bound
  Slides: Lecture 8
  Readings: Chapter 7.12 of Orabona's note (on variance/variation bounds for OCO)

Week 9 (11.28): Optimism for Fast Rates: online repeated games, fast convergence, accelerated methods, UnixGrad
  Slides: Lecture 9
  Readings: Lecture 4 of Luo's course (on online games); Chapter 12 of Orabona's note (on games and saddle points)

Week 10 (12.05): Adversarial Bandits: MAB, IW loss estimator, bandit convex optimization, gradient estimator, self-concordant barrier
  Slides: Lecture 10
  Readings: Lecture 6 of Luo's course (on adversarial MAB); Lecture 9 of Luo's course (on self-concordant barriers for adversarial bandits)

Week 11 (12.19): Guest Lecture by Prof. Haipeng Luo (USC): Adversarial Bandits: Theory and Algorithms

Week 12 (12.20): Stochastic Bandits: MAB, ETC, UCB, linear bandits, self-normalized concentration, generalized linear bandits
  Slides: Lecture 12
  Readings: Lecture Note 14 of Luo's course (on stochastic MAB); Lecture Note 15 of Luo's course (on stochastic linear bandits); Chapters 6 & 7 of Lattimore and Szepesvári's book (on ETC and UCB)

Week 13 (12.26): Advanced Topic: non-stationary online learning, universal online learning, online ensemble, base algorithm, meta algorithm
  Slides: Lecture 13
  Readings: see the references in the slides
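As a concrete companion to the gradient-descent and online convex optimization lectures above (Weeks 3-5), here is a minimal sketch of projected online gradient descent together with the online-to-batch conversion. It is illustrative only and not taken from the course materials; the function names, the unit-ball domain, and the step size eta_t = 1/sqrt(t) are assumptions made for this example.

import numpy as np

def project_l2_ball(x, radius=1.0):
    # Euclidean projection onto the l2 ball of the given radius
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def online_gradient_descent(gradients, x0, radius=1.0):
    # gradients: list of callables g_t(x) returning the (sub)gradient of the
    # round-t loss at x; returns all iterates and their average, which is the
    # online-to-batch output for convex losses
    x = x0.copy()
    iterates = []
    for t, grad in enumerate(gradients, start=1):
        iterates.append(x.copy())
        eta = 1.0 / np.sqrt(t)                      # illustrative step size
        x = project_l2_ball(x - eta * grad(x), radius)
    x_bar = np.mean(iterates, axis=0)               # averaged iterate
    return iterates, x_bar

# toy usage: T rounds of a fixed quadratic loss with noisy gradients,
# mimicking SGD on a single objective via the online-to-batch viewpoint
rng = np.random.default_rng(0)
x_star = np.array([0.5, -0.3])
grads = [lambda x, x_star=x_star: (x - x_star) + 0.01 * rng.normal(size=2)
         for _ in range(200)]
_, x_bar = online_gradient_descent(grads, x0=np.zeros(2))
print("averaged iterate:", x_bar)

For convex Lipschitz losses and an appropriately scaled step size, the averaged iterate attains the standard O(1/sqrt(T)) guarantee underlying the online-to-batch conversion and SGD discussion in Lecture 5.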

Prerequisites

Familiarity with calculus, probability, and linear algebra, and basic knowledge of convex optimization and machine learning.

Reading

There is no designated textbook for this course. In addition to the course slides and lecture notes (to be written up as time permits), the following books are good materials for further reading.

Some related courses:



Last modified: 2025-02-10 by Peng Zhao