Add momentum operator (!4571) · Merge request · PaddlePaddle / Paddle

Add momentum operator !4571

Created by: sidgoyal78

This PR adds an implementation of the momentum operator.

In summary, the operator maintains a velocity vector and, at each step, first updates the velocity and then uses the new velocity to update the parameters:

 velocity = mu * velocity + grad
 param = param - learning_rate * velocity

(where mu is the momentum coefficient).
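
For illustration only, the update rule above can be sketched in plain Python. This is not the operator code from this PR; the function name `momentum_step` and the list-based representation of tensors are assumptions made for the example.

```python
def momentum_step(param, grad, velocity, learning_rate, mu):
    """Apply one momentum update and return (new_param, new_velocity).

    Hypothetical helper for illustration; inputs are equal-length lists of floats.
    """
    # velocity = mu * velocity + grad
    new_velocity = [mu * v + g for v, g in zip(velocity, grad)]
    # param = param - learning_rate * velocity (using the updated velocity)
    new_param = [p - learning_rate * v for p, v in zip(param, new_velocity)]
    return new_param, new_velocity


# Two consecutive steps with a constant gradient, starting from zero velocity.
param = [1.0, 2.0]
velocity = [0.0, 0.0]
grad = [0.5, -1.0]
for _ in range(2):
    param, velocity = momentum_step(param, grad, velocity, learning_rate=0.1, mu=0.9)
```

With a constant gradient, the velocity grows from step to step (here its magnitude goes from `grad` to `1.9 * grad`), so the second step moves the parameters further than the first. That accumulation is what the `mu * velocity` term contributes.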