Fix learning rate scaling bug
This bug is quite peculiar and hard to track down. When the learning rate for a parameter is scaled via `param_attr` and a learning rate scheduler is used at the same time, `append_optimizer_op` errors out complaining that the `LearningRate` input is null.

It turns out the per-parameter learning rate scaling is done in `_create_param_lr`, which essentially appends a scale op. The problem is that this op is appended to `orig_prog` (since the `global_learning_rate()` variable lives there), so the resulting scaled learning rate variable cannot be found in `train_prog`.

The reason this worked previously without learning rate scaling is that `clone()` creates a variable with the same name as the `global_learning_rate()` variable, and that variable is what `append_optimizer_op` ends up using.
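A minimal repro sketch of the triggering combination, assuming the old `fluid` API (the network, variable names, and scheduler settings below are illustrative, not from this PR): a parameter whose learning rate is scaled via `ParamAttr` plus an optimizer whose global learning rate comes from a scheduler, so the learning rate is a program variable rather than a float.

```python
import paddle.fluid as fluid

x = fluid.layers.data(name='x', shape=[13], dtype='float32')
y = fluid.layers.data(name='y', shape=[1], dtype='float32')

# Per-parameter scaling: this weight uses 0.5x the global learning rate,
# which is what _create_param_lr turns into a scale op.
y_pred = fluid.layers.fc(
    input=x, size=1,
    param_attr=fluid.ParamAttr(name='fc_w', learning_rate=0.5))

loss = fluid.layers.mean(
    fluid.layers.square_error_cost(input=y_pred, label=y))

# Global learning rate from a scheduler, so global_learning_rate() is a
# variable in the program instead of a constant.
sgd = fluid.optimizer.SGD(
    learning_rate=fluid.layers.exponential_decay(
        learning_rate=0.1, decay_steps=100, decay_rate=0.9))

# append_optimizer_op runs during minimize(); with the bug, the scale op
# lands in orig_prog and the scaled LearningRate input is missing in train_prog.
sgd.minimize(loss)
```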