- From Gradient Descent to Stochastic Gradient Descent (SGD)
- RMSProp and Adam
- Learning to learn by gradient descent by gradient descent
- Learning to optimize
Learning to learn and to optimize
References
[And16L] Learning to learn by gradient descent by gradient descent,