For classification task fluid version trains slower than v2 version with same learning rate. (#14830) · Issue · PaddlePaddle / Paddle

For classification tasks, the fluid version trains slower than the v2 version with the same learning rate.

Created by: TheodoreG

We found that with the same input, the same network structure, the same Momentum optimizer, and all the same hyperparameters, the fluid classification version reaches 1% to 2% lower accuracy than the v2 version and trains more slowly. When we set the learning rate 10x larger for the fluid version, its loss curve resembles v2's curve more closely. Is there any explanation for this observation?
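One way to reason about the 10x observation: under plain momentum SGD, multiplying every gradient by a constant factor is exactly equivalent to multiplying the learning rate by that factor. A curve that matches only after a 10x learning-rate change would therefore be consistent with a constant scale difference in the gradients (for example, a differently normalized loss). That this is what differs between fluid and v2 is an assumption, not something confirmed here. The sketch below uses a toy quadratic loss and a hypothetical helper `momentum_sgd`; it does not call any Paddle API.

```python
def momentum_sgd(grad_fn, w0, lr, mu, steps, grad_scale=1.0):
    """Plain momentum SGD on a scalar: v = mu * v + g, then w = w - lr * v.

    grad_scale is a hypothetical knob standing in for any constant factor
    applied to the gradients (an assumption made for illustration only).
    """
    w, v = w0, 0.0
    trajectory = []
    for _ in range(steps):
        g = grad_scale * grad_fn(w)
        v = mu * v + g
        w = w - lr * v
        trajectory.append(w)
    return trajectory


def grad(w):
    # Gradient of the toy loss 0.5 * (w - 3)^2.
    return w - 3.0


# Reference run.
base = momentum_sgd(grad, w0=0.0, lr=0.1, mu=0.9, steps=50)

# Same learning rate, but gradients are 10x smaller: progress is slower.
scaled_down = momentum_sgd(grad, w0=0.0, lr=0.1, mu=0.9, steps=50, grad_scale=0.1)

# Gradients 10x smaller, learning rate 10x larger: matches the reference run.
compensated = momentum_sgd(grad, w0=0.0, lr=1.0, mu=0.9, steps=50, grad_scale=0.1)

max_gap = max(abs(a - b) for a, b in zip(base, compensated))
print("largest gap between base and compensated runs:", max_gap)
```

If the two real training setups behave like `base` and `scaled_down`, comparing gradient magnitudes for one identical batch in both versions would show whether such a constant factor exists.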
