This course provides an overview of modern optimization theory for machine learning, focusing on gradient-based first-order optimization methods (in both offline and online settings), which serve as the foundational optimization tools for modern large-scale learning tasks.
Instructor: Peng Zhao ([email protected])
Location: Room 105, Yifu Building B, Xianlin Campus (逸B-105)
Office hours: appointment by email
[New! 2025.02.10] The lecture notes will be updated on this website: https://www.pengzhao-ml.com/course/AOptLectureNote
[2024.12.22] The thirteenth (and last) lecture will take place on 12.26 (14:00-16:45) onsite.
[2024.12.19] We will add an additional lecture on 12.20. The twelfth lecture will take place on 12.20 (14:00-16:45) onsite (Note the location change: at 逸B-304!).
[2024.12.17] The eleventh lecture will take place on 12.19 (14:00-16:30) onsite, and will be given by Prof. Haipeng Luo (USC), a top researcher on bandits and online learning who has won best paper awards at top ML conferences including ICML/NeurIPS/COLT.
[2024.12.05] The tenth lecture will take place on 12.05 (14:00-16:45) onsite.
[2024.11.27] The ninth lecture will take place on 11.28 (14:00-16:45) onsite.
The eighth lecture will take place on 11.21 (14:00-16:45) onsite.
[2024.11.05] The seventh lecture will take place on 11.07 (14:00-16:45) onsite.
[2024.11.01] The sixth lecture will take place on 11.01 (14:00-16:45) onsite (Note the location change: today at 逸B-101!).
The fifth lecture will take place on 10.31 (14:00-16:45) onsite.
Due to a trip out of Nanjing, the lecture originally scheduled for 10.24 will be postponed to 11.01. Details will be announced via QQ.
[2024.10.17] The fourth lecture will take place on 10.17 (14:00-16:45) onsite.
[2024.10.09] The third lecture will take place on 10.10 (14:00-16:45) onsite.
The second lecture will take place on 09.26 (14:00-16:45) onsite.
[2024.09.03] Since this course is designed for both first-year graduate students and returning undergraduate students, we will start in the third week; i.e., the first lecture will take place on 09.19 (14:00-16:45).
[2024.09.03] We now have a course website!
The hw2 DDL is fast approaching; please submit it on time!
hw2 has been released!
[2024.11.26] The hw1 DDL is fast approaching; please submit it on time!
We have two homework assignments, each with a four-week completion window; hw1 has been released!
Homework2 PDF file: AOpt24-HW-2.pdf
Homework2 tex file download: homework2_file.zip [update history, last update@]
Instructions: submission rules.html
List of IDs that have already submitted: list.txt
DDL: 12.25 (Wednesday) 23:59 Beijing time
Homework1 PDF file: AOpt24-HW-1.pdf
Homework1 tex file download: homework1_file.zip [update history, last update@]
Instructions: submission rules.html
List of IDs that have already submitted: list.txt
DDL: 11.27 (Wednesday) 23:59 Beijing time
Week | Date | Topic | Slides | Lecture Notes / Readings |
---|---|---|---|---|
1 | 09.19 | Introduction; Mathematical Background | Lecture 1 | a note on matrix norms |
2 | 09.26 | Convex Optimization Basics; Function Properties | Lecture 2 | Chapter 3.2 of Boyd and Vandenberghe's book (on operations that preserve convexity); Chapter 3 of Amir Beck's book (on subgradients); Chapter 5 of Amir Beck's book (on smoothness and strong convexity) |
 | 10.03 | no lecture (due to the National Day holiday) | | |
3 | 10.10 | GD Methods I: GD method, Lipschitz optimization | Lecture 3 | Chapter 8.2 of Amir Beck's book (on GD methods for convex and strongly convex functions) |
4 | 10.17 | GD Methods II: GD method, smooth optimization, Polyak's momentum, Nesterov's AGD, composite optimization | Lecture 4 | Chapter 3.2 of Bubeck's book (on GD methods for smooth functions); Chapters 14 & 15 of Ashok Cutkosky's lecture notes (on momentum and acceleration); Chapter 10 of Amir Beck's book (on composite optimization and proximal gradient) |
5 | 10.31 | Online Convex Optimization: online gradient descent (sketched below the table), online Newton step, exp-concave functions; online-to-batch conversion, SGD | Lecture 5 | Chapter 3 of Hazan's book (on OGD for convex and strongly convex functions); Chapter 4 of Hazan's book (on ONS for exp-concave functions) |
6 | 11.01 | Online Mirror Descent: expert problem, Hedge (sketched below the table), mirror descent, Bregman divergence, FTRL, dual averaging, mirror map | Lecture 6 | Lecture Note 2 of Luo's course (on the PEA problem); Chapter 6 of Orabona's note (on OMD); Chapter 7 of Orabona's note (on FTRL); Chapter 4 of Bubeck's book (on MD and dual averaging) |
7 | 11.07 | Adaptive Online Convex Optimization: small-loss bound, self-confident tuning | Lecture 7 | Lecture Note 4 of Luo's course (on small-loss PEA); Chapter 4.2 of Orabona's note (on small-loss OCO) |
8 | 11.21 | Optimistic Online Mirror Descent: predictable sequence, small-loss bound, gradient-variance bound, gradient-variation bound | Lecture 8 | Chapter 7.12 of Orabona's note (on variance/variation bounds for OCO) |
9 | 11.28 | Optimism for Fast Rates: online repeated games, fast convergence, accelerated methods, UniXGrad | Lecture 9 | Lecture 4 of Luo's course (on online games); Chapter 12 of Orabona's note (on games and saddle points) |
10 | 12.05 | Adversarial Bandits: MAB, importance-weighted (IW) loss estimator, bandit convex optimization, gradient estimators, self-concordant barriers | Lecture 10 | Lecture 6 of Luo's course (on adversarial MAB); Lecture 9 of Luo's course (on self-concordant barriers for adversarial bandits) |
11 | 12.19 | Guest Lecture by Prof. Haipeng Luo (USC) | Adversarial Bandits: Theory and Algorithms | |
12 | 12.20 | Stochastic Bandits: MAB, ETC, UCB, linear bandits, self-normalized concentration, generalized linear bandits | Lecture 12 | Lecture Note 14 of Luo's course (on stochastic MAB); Lecture Note 15 of Luo's course (on stochastic linear bandits); Chapters 6 & 7 of Lattimore and Szepesvári's book (on ETC and UCB) |
13 | 12.26 | Advanced Topics: non-stationary online learning, universal online learning, online ensemble, base algorithm, meta algorithm | Lecture 13 | see the references in the slides |
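To make two of the core algorithms in the schedule concrete, here is a minimal sketch of projected online gradient descent from Week 5. It is illustrative only, not part of the course materials: the Euclidean-ball feasible set, the quadratic losses, and all names below are hypothetical choices; see Chapter 3 of Hazan's book for the precise algorithm and its regret analysis.

```python
import numpy as np

def project_l2_ball(x, radius=1.0):
    """Euclidean projection onto the l2 ball of the given radius."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def ogd(grad_fns, d, radius=1.0, G=1.0):
    """Projected online gradient descent (OGD) over T rounds.

    grad_fns[t](x) returns a (sub)gradient of the round-t loss at x.
    With step size eta_t = D / (G * sqrt(t)), where D is the diameter of
    the feasible set and G bounds the gradient norms, OGD guarantees
    O(D * G * sqrt(T)) regret for convex losses (Hazan, Chapter 3).
    """
    x = np.zeros(d)
    D = 2.0 * radius  # diameter of the l2 ball
    iterates = []
    for t, grad_fn in enumerate(grad_fns, start=1):
        iterates.append(x.copy())
        eta = D / (G * np.sqrt(t))  # decaying step size
        x = project_l2_ball(x - eta * grad_fn(x), radius)
    return iterates

# Hypothetical losses f_t(x) = ||x - z_t||^2 / 2, so grad f_t(x) = x - z_t.
rng = np.random.default_rng(0)
zs = [rng.normal(size=3) for _ in range(100)]
iterates = ogd([lambda x, z=z: x - z for z in zs], d=3)
```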
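Similarly, here is a minimal sketch of Hedge for the expert (PEA) problem from Week 6, again with hypothetical data; the learning rate below follows the usual eta on the order of sqrt(ln(N)/T) tuning, which yields O(sqrt(T ln N)) regret for losses in [0, 1] (see Lecture Note 2 of Luo's course).

```python
import numpy as np

def hedge_regret(loss_matrix, eta=None):
    """Run Hedge (exponential weights) and return its regret.

    loss_matrix: T x N array; loss_matrix[t, i] is the loss of expert i
    at round t, assumed to lie in [0, 1].
    """
    T, N = loss_matrix.shape
    if eta is None:
        eta = np.sqrt(np.log(N) / T)  # standard tuning, up to constants
    weights = np.ones(N)
    total_loss = 0.0
    for t in range(T):
        p = weights / weights.sum()               # play normalized weights
        total_loss += p @ loss_matrix[t]          # expected loss this round
        weights *= np.exp(-eta * loss_matrix[t])  # multiplicative update
    return total_loss - loss_matrix.sum(axis=0).min()  # regret vs. best expert

# Hypothetical example: 10 experts, 1000 rounds of i.i.d. uniform losses.
rng = np.random.default_rng(1)
print(hedge_regret(rng.uniform(size=(1000, 10))))
```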
Familiarity with calculus, probability, and linear algebra, as well as basic knowledge of convex optimization and machine learning.
Unfortunately, we don't have a specific textbook for this course. In addition to the course slides and lecture notes (to be written if time permits), the following books are very good materials for further reading.
Amir Beck. First-Order Methods in Optimization. MOS-SIAM Series on Optimization, 2017.
Sébastien Bubeck. Convex Optimization: Algorithms and Complexity. Foundations and Trends in Machine Learning, 2015.
Elad Hazan. Introduction to Online Convex Optimization (second edition). MIT Press, 2022.
Francesco Orabona. A Modern Introduction to Online Learning. Lecture Notes, 2022.
Tor Lattimore and Csaba Szepesvári. Bandit Algorithms. Cambridge University Press, 2020.
Some related courses:
CSCI 659: Introduction to Online Optimization/Learning, Fall 2022. University of Southern California, Haipeng Luo.
EC525: Optimization for Machine Learning, Fall 2022. Boston University, Ashok Cutkosky.
Last modified: 2025-02-10 by Peng Zhao