## Course Info

**Lecture Info**

*Time:*Tuesday/Thursday 4:30-5:45pm*Location*: Klaus 2443 (*note:*classroom has changed from CCB!)

**Course Staff**

Role | Name | `____@gatech.edu` |
Office Hours |
---|---|---|---|

Prof | Jacob Abernethy | `prof` |
Tuesdays 10am-11am in Klaus 2134 |

GTA | Benjamin Bray | `benrbray` |
Wednesdays 2-3pm between Klaus 2116 and 2124 |

TA | Shyamal Patel | `patels` |
Fridays 1-3pm between Klaus 2116 and 2124 |

TA | Rafael Hanashiro | `rhanashiro3` |
Mondays 8-10am between Klaus 2116 and 2124 |

**Communication**

We will be using Piazza for managing questions about lecture material, homeworks, and course logistics. Please only send us email in the event of a personal emergency or other sensitive matter. If you email us about anything else, we will likely request that you open a Piazza post. (Note that you can make private posts on Piazza to avoid revealing homework solutions publicly.)

## Course Description

This is an advanced course on algorithms. That is quite a broad topic, and in particular this semester’s course will focus heavily on **algorithms for machine learning**. We will be especially interested in diving into the following topics:

- Numerical Analysis
- Convex Geometry & Optimization
- Probability & Statistical Inference

While students should have a strong back background in core algorithmic concepts, linear algebra, calculus, and probability, we will review many of these topics early in the course. Students will be required to write code in Python, and we will present much of the material in the course using Jupyter Notebooks.

## Hands-on Format

In most courses, students learn about material in class through lecture, and then they practice problem solving on their own by doing homework. In this course we will do the opposite! Students will be required to read material before each class period, and then arrive in class ready to dive into problem-solving. This way, students can familiarize themselves with basic definitions and examples at home, and benefit from one-on-one interaction with the course staff during lecture while working through more challenging aspects of the material. Lecture notes for each day will be posted online at least one week prior to each lecture (with the first week as an exception).

Why do it like this? The lecture format is an outdated way to teach mathematical material, especially for topics such as algorithms and machine learning where it is so easy to play with code and implement ideas. The lecture format also limits the professor’s ability to interact directly with students.

Each class period will have the following structure:

*(0mins)*Students arrive and organize themselves by sitting with their group*(5mins)*Students take a very short and simple quiz on assigned reading material*(70mins)*Problem time! Instructor presents a sequence of problems, and students are given 5-10 minutes per problem to try to solve each one with their group. Instructors will move around the classroom to engage with students and answer questions.

## Grading

### Homeworks (35%)

Students are allowed, and indeed encouraged, to work on homework with other students in the course. But when solutions are written up, this should be done alone and without the help of other students. Students are required to specify on their writeups which students that collaborated with. If we find solutions that appear even remotely to have been copied, these will be given a zero and the students will be notified.

### Attendance+Quizzes (15%)

In-class quizzes will be graded generously, and 50% of the credit will be given simply for showing up. Quizzes will be entered electronically, via a web form, so make sure you have a phone, laptop, or tablet with you in class! If you don’t have access to any of these, please let us know and we will make accomodations. The grading scheme for quizzes will be:

- 2 points for a correct answer
- 1 point for an incorrect answer
- 0 points for not taking the quiz

Your **five lowest quiz scores will be dropped**, which should be enough to account for quizzes missed due to planned or unplanned absences.

### Midterm Exam (20%)

The date of the midterm is TBD. More details about the topics and format will be released at a later date.

### Final Exam (30%)

The final exam will take place on **Thursday, December 5, 2019** from **2:40-5:30pm**. The final exam date is fixed by the university, and we are not at liberty to change it.

## Reading

Readings will be assigned for you to complete *before each lecture*. All required reading will either be linked to here or posted to canvas. You are not required to purchase a textbook for this course, but you may find the following books helpful.

- Boyd & Vandengerbhe,
*Convex Optimization*(Free PDF) - Trefethen & Bau,
*Numerical Linear Algebra*

## Important Dates

Tuesday, 8/20/19 | First day of class |

Friday, 8/23/19 | Schedule change deadline |

Monday, 9/2/19 | Labor Day |

Mon 10/14/19 - Tues, 10/15/19 | Fall Recess (no class) |

Monday, 10/26/19 @ 4pm | Withdrawal Deadline |

Wed, 11/27/19 - Sun, 11/31/19 | Thanksgiving Break |

Tuesday, 12/3/19 | Last day of class |

Thursday, 12/5/19 @ 2:40-5:30pm | Final Exam (determined by final exam matrix) |

# Homework

**Homework #1**Due date Sunday, September 8**Homework #2**Due date Sunday, September 22

# Calendar

# Schedule

### Tentative Schedule (*read ahead at your own risk!*)

(Tu 8/20/19) **Lecture #1: Introduction & Perceptron**
(Lecture Slides)

No Required Reading

*Additional Resources*

- CMU 15-859(B), Lecture #4, The Perceptron Algorithm by Avrim Blum
- Raul Rojas, Neural Networks: A Systematic Introduction

(Th 8/22/19) **Lecture #2: Review of Linear Algebra & Intro to Numpy**
(Lecture Slides)

*Required Preparation before Class*

- Notes, CS 4540: Python Basics
- Brush up on linear algebra!

*Additional Resources*

- 3blue1brown, Essence of Linear Algebra video series
- UNSW 2501: Linear Algebra Notes on Canvas
- MIT OCW 18.06 Linear Algebra Lecture Videos

(Tu 8/27/19) **Lecture #3: Convex Geometry**
(Lecture Slides)

*Required Preparation before Class*

- Read Boyd & Vandenberghe, Convex Optimization
- §2.1 Affine and Convex Sets
- §2.2 Important Examples (of Affine and Convex Sets)
- §2.5 Separating & Supporting Hyperplanes

*Additional Resources*

- Stanford EE364a, Lecture #2: Convex Sets (Slides, Video)
- Jeffe, UIUC Computational Geometry, “Lecture 1: Convex Hulls”
- Lauritzen,
*Lectures on Convex Sets*, Ch 3: Separation (nice proof of Farkas lemma) - ETH Zürich, Approximate Methods in Geometry, “Chapter 1: Some Basic Geometry”
- David L. Finn, Rose-Hullman MA 323, “Barycentric Coordinates & de Casteljau’s Algorithm”

(Th 8/29/19) **Lecture #4: Review of Multivariable Calculus**
(Lecture Slides)

*Required Preparation before Class*

- Brush up on single and multivariable calculus!
- Sequences, series, and limits
- Chain rule, product rule, quotient rule
- Mean value theorem, intermediate value theorem
- Taylor series

*Additional Resources*

- Parr & Howard 2018, “The Matrix Calculus You Need for Deep Learning”
- Petersen & Pedersen 2012, “The Matrix Cookbook”
- 3blue1brown YouTube Channel
- Essence of Calculus #4, “Visualizing the chain rule and product rule”
- Essence of Calculus #6, “Implicit Differentiation, what’s going on here?”
- “What they won’t teach you in Calculus”

(Tu 9/3/19) **Lecture #5: Convex Functions & Intro to Optimization**
(Lecture Slides)

*Required Preparation before Class*

- Boyd & Vandenberghe, Convex Optimization
- §3.1 Basic Properties & Examples of Convex Functions
- Skip §3.1.2 Extended-Value Extensions

- §3.1 Basic Properties & Examples of Convex Functions

*Additional Resources*

- Stanford EE364a, Lecture #3: Convex Functions (Slides, Video)
- Boyd & Vandenberghe, Convex Optimization
- §2.3 Operations that Preserve Convex Sets
- §3.2 Operations that Preserve Convex Functions
- §2.5 Separating & Supporting Hyperplanes

(Th 9/5/19) **Lecture #6: More Calculus & Positive Definite Matrices & Convex Functions** (Lecture Slides)

*Required Preparation before Class*

- Randal J. Barnes, “Matrix Differentiation”

(Tu 9/10/19) **Lecture #7: Gradient Descent for Convex Functions** (Lecture Slides)

*Required Preparation before Class*

- Sham Kakade, “Optimization 1: Gradient Descent”
- Read starting from Section 3, the most important thing is to understand the convergence rate of gradient descent under different assumptions (convex and lipschitz, convex and smooth, strongly convex and smooth)

*Additional Resources*

- Sebastian Bubeck 2015, “Convex Optimization: Algorithms and Complexity”
- Sections 3.0, 3.1, 3.2, and 3.4

- Jonathan Shewchuk 1994, “Painless Conjugate Gradient” (Pages 1-17)
- We (probably) won’t cover Conjugate-Gradient, but these notes are a great intro gradient descent.
- We’ll cover the Jacobi method in more detail later, so don’t worry too much about §5.2

- Definition of strong convexity in Boyd & Vandenberghe, “Convex Optimization” §9.1.2
- Moritz Hardt, UC Berkeley EE227C
- Elad Hazan, “Introduction to Online Convex Optimization”, Chapters 2 & 3

(Tu 9/17/19) **Lecture #8: Intro to Supervised ML: Least Squares, Logistic Regression, Support Vector Machines**
(Lecture Slides)

*Required Preparation before Class*

- Andrew Ng, CS229 Lecture Notes 1. You need only read:
- Pages 1-12, intro to least squares regression
- Pages 14-19, intro to logistic regression, and Newton’s method

- Pedro Felzenszwalb CS142 Lectures Notes 10
- Focus on the
*unconstrained*formulation of the SVM in equation (1)!

- Focus on the

(Th 9/19/19) **Lecture #9: Intro to Online Learning**
(Lecture Slides)

*Required Preparation before Class*

- Elad Hazan, Online Convex Optimization book
- Chapter 1, pages 3-15,
*learning with expert advice*

- Chapter 1, pages 3-15,

(Tu 9/24/19) **Lecture #10: Online Convex Optimization and Online Gradient Descent** Lecture Slides

*Required Preparation before Class*

- Elad Hazan, Online Convex Optimization book
- Chapter 3, pages 39-45,
*learning with expert advice*

- Chapter 3, pages 39-45,

(Th 9/26/19) **Lecture #11: Matrix Decompositions & SVD**

*Required Preparation before Class*

- Trefethen & Bau, Numerical Linear Algebra (posted to Canvas)
- Lecture 4: The Singular Value Decomposition (Pages 25-31)
- Lecture 5: More on the SVD (Pages 32-37)

(Tu 10/1/19) **Lecture #12: OGD, SGD, and rates of optimization** Lecture Slides

*Requires Preparation before Class*

- Elad Hazan, Online Convex Optimization book
- Chapter 3, pages 46-48,
*learning with expert advice*

- Chapter 3, pages 46-48,

(Th 10/3/19) **Lecture #13: Linear Programming Introduction** (Lecture Slides)

*Required Preparation before Class*

- Tim Roughgarden, Stanford CS261, Lecture 7: Linear Programming (Pages 1-5)
- Try to get used to the matrix notation for linear programs! Think geometrically!

(Th 10/17/19) **Lecture #14: Linear Programming Duality** (Lecture Slides)

*Required Preparation before Class*

- Tim Roughgarden, Stanford CS261, Lecture 8: Linear Programming Duality I (Pages 1-6)

*Additional Resources*

- Tim Roughgarden, Stanford CS261, Lecture 9: Linear Programming Duality II
- Jim Burke, University of Washington MATH 407

(Tu 10/22/19) **Lecture #15: Zero-sum Games and the Minimax Theorem** Lecture Slides

*Required Preparation before Class*

- Thomas S. Ferguson, Game Theory
- Sections 1-1.4, pages II-4 to II-7
- Sections 2-2.2, pages II-9 to II-11
- Sections 4-4.2, pages II-35 to (beginning of) II-38

*Additional Resources*

- Eric Prisner, Game Theory through Examples
- Ran Canetti, Notes on Game Theory for Cryptograph Course
- Section 1

(Th 10/29/18) **Lecture #17: Boosting** (Lecture Slides)

*Required Preparation before Class*

- The Boosting Approach to Machine Learning
- You only need to read pages 1-5

*Additional Reading*

(Th 10/29/18) **Lecture #18: Online Learning for Solving Games and LPs** (Lecture Slides)

*Required Preparation before Class*

- Elad Hazan, Online Convex Optimization book
- Chapter 8.0, 8.1, 8.3, 8.4 (you can skip 8.2)

(Tu 11/5/19) **Lecture #19: Second-order Methods & Fixed Point Iteration** ([Lecture Slides]https://nbviewer.jupyter.org/github/cs4540-f19/cs4540-f19.github.io/blob/master/lectures/cs4540-f19-lecture19_fixed_points_newtons_method.ipynb))

*Required Preparation before Class*

- Sauer,
*Numerical Analysis*(posted to Canvas)- §1.1 The Bisection Method (Pages 25-29)
- §1.2 Fixed Point Iteration (Pages 30-40)
- §1.4 Newton’s Method (Pages 51-58)

- Review the following topics from multivariable calculus:
- Multivariate Taylor’s Theorem
- Mean / Intermediate Value Theorems

*Additional Resources*

- Newton’s Method
- Boyd & Vandenberghe, Convex Optimization, §9.5 Newton’s Method
- Chris Hauser, “Multivariate Newton’s Method and Quasi-Newton”
- Wei-Ta Chu 2014, “Multivariate Newton’s Method” (slides)
- Ryan Tibshirani 2015, “Newton’s Method” (slides)
- Sean Harrington, “Solving Logistic Regression with Newton’s Method” (blog post; we’ll cover logistic regression later)

- Newton Fractals
- Simon Tatham, “Fractals Derived from Newton-Raphson Iteration”
- Daniel Dreibelbis, “Newton Fractals”
- Paul Bourke, “An Introduction to Fractals”

- Geoffrey Hinton, Coursera NNML, “A Brief Overview of Hessian-Free Optimization”
- Nykamp DQ,
*Math Insight*, “Introduction to Taylor’s Theorem for Multivariable Functions” - Sauer,
*Numerical Analysis*§1.3 briefly covers of conditioning / sensitivity, but we won’t focus on these topics in class. For a slightly more advanced treatment, see Trefethen & Bau,*Numerical Linear Algebra*§13-14.

(Th 11/7/19) **Lecture #20: Numerical Methods for Linear Systems** (Lecture Slides)

*Required Preparation before Class*

- Understand §5.1-5.3 of Shewchuk, “Painless Conjugate Gradient”
- Understand the Gershgorin Circle Theorem (Wikipedia’s proof isn’t too bad!)

*Additional Resources*

- Golub & van Loan,
*Matrix Computations*, “§11.2: The Classical Iterations” (posted to Canvas) - Volker John, “Ch 3: Classical Iterative Schemes”
- Schonlieb, “Numerical Analysis” Lectures 16, 17, 18
- Gilbert Strang,
*Mathematical Methods for Engineers*, §6.2: Iterative Methods

(Tu 11/12/19) **Lecture #21: Numerical Methods for Computing Eigenvalues** (Lecture Slides)

*Required Preparation before Class*

- Trefethen & Bau,
*Numerical Linear Algebra*(posted to Canvas)- Lecture 27: Rayleigh Quotient, Inverse Iteration (Pages 202-210)

*Additional Resources*

- Trefethen & Bau,
*Numerical Linear Algebra*(posted to Canvas)- Lecture 25: Overview of Eigenvalue Algorithms (read if you want some more context about computing eigenvalues)
- Lecture 28: QR Algorithm without Shifts (we won’t cover this, but the algorithm is interesting!)

(Th 11/14/19) **Lecture #22: Clustering** (F18 Lecture Slides )

*Required Preparation before Class*

- Lior Rokach and Oded Maimon Clustering Methods
- Read Sections 1-4, and Section 5.2

(Tu 11/19/19) **Lecture #23: Random Sampling** (CANCELLED!)

*Required Preparation before Class*

(Th 11/21/19) **Lecture #24: Markov Chains and Sampling** (Lecture Slides)

*Required Preparation before Class*

- Tim Roughgarden & Gregory Valiant Markov Chain Monte Carlo
- All of it!

(Tu 11/26/19) **Lecture #25: Stationary Distributions, and Mixing Times, of Markov Chains**

*Required Preparation before Class*

- None! This lecture will have some content, but will mostly be fun

*Additional Resources*

- If you are interested in the concept of Markov Chains, and want to learn more about their connection to algorithms and complexity, I suggest this book:
- Levin, Peres, Wilmer Markov Chains and Mixing Time

(Tu 12/3/19) **Final Exam Review Session by Ben Bray**

Come with questions!