Why Momentum Works (2017)

Published on: 2025-08-05 14:01:10

Why Momentum Really Works

[Interactive figure: Step-size α = 0.02, Momentum β = 0.99]

We often think of Momentum as a means of dampening oscillations and speeding up the iterations, leading to faster convergence. But it has other interesting behavior. It allows a larger range of step-sizes to be used, and ...
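The claim about a larger range of usable step-sizes can be illustrated with a minimal sketch. It assumes a toy one-dimensional quadratic f(w) = 0.5 * lam * w**2 and the common heavy-ball form of the update; the function name `momentum_descent` and its parameters are hypothetical, chosen only for illustration. On this toy problem, plain gradient descent (beta = 0) converges only when alpha * lam < 2, while momentum with 0 <= beta < 1 converges when alpha * lam < 2 + 2 * beta.

```python
def momentum_descent(lam, alpha, beta, steps=200, w0=1.0):
    """Minimize f(w) = 0.5 * lam * w**2 with heavy-ball momentum.

    Illustrative sketch only; all names here are hypothetical.
    Update rule:
        z <- beta * z + grad f(w)
        w <- w - alpha * z
    Setting beta = 0 recovers plain gradient descent.
    """
    w, z = w0, 0.0
    for _ in range(steps):
        grad = lam * w          # gradient of the toy quadratic
        z = beta * z + grad     # accumulate a decaying sum of past gradients
        w = w - alpha * z       # step along the accumulated direction
    return w


if __name__ == "__main__":
    lam, alpha = 1.0, 2.5  # alpha * lam = 2.5, beyond the limit of 2 for plain descent

    # Plain gradient descent: each step multiplies w by (1 - alpha * lam) = -1.5,
    # so the iterates grow without bound.
    print("beta = 0.0:", abs(momentum_descent(lam, alpha, beta=0.0)))

    # With beta = 0.5 the limit becomes 2 + 2 * beta = 3, so the same step-size converges.
    print("beta = 0.5:", abs(momentum_descent(lam, alpha, beta=0.5)))
```

The same step-size that makes plain descent diverge is stable once momentum is added, which is one concrete reading of the "larger range of step-sizes" remark on this toy problem.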