Some links related to the topics covered in the lecture:

An overview of gradient descent optimization algorithms.

Rectifier (neural networks).

Backpropagation.

Escaping From Saddle Points – Online Stochastic Gradient for Tensor Decomposition (Ge et al.)