Provable convergence of Nesterov's accelerated gradient method for over-parameterized neural networks
Momentum methods, such as the heavy ball method (HB) and Nesterov's accelerated gradient method (NAG), have been widely used in training neural networks by incorporating the history of gradients into the current update step. In practice, they often provide improved performance over (stochastic...
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier B.V., 2022 |