Created by: willthefrog
This bug is quite peculiar and hard to track down: when the learning rate for a parameter is set via `param_attr` and a learning rate scheduler is used, `append_optimizer_op` will fail.
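
For context, a minimal sketch of the setup that triggers the problem (names and values here are illustrative, not taken from the original report). It combines a per-parameter lr multiplier in `param_attr` with a scheduled global learning rate; the failure itself only surfaces in flows where the optimizer is applied against a separately cloned `train_prog`, which this standalone sketch does not reproduce end to end:

```python
import paddle.fluid as fluid

x = fluid.layers.data(name='x', shape=[13], dtype='float32')
y = fluid.layers.data(name='y', shape=[1], dtype='float32')

# per-parameter lr multiplier set via param_attr (this is what triggers the scale op)
pred = fluid.layers.fc(input=x, size=1,
                       param_attr=fluid.ParamAttr(learning_rate=2.0))
loss = fluid.layers.mean(fluid.layers.square_error_cost(input=pred, label=y))

# learning rate scheduler: the global lr is itself a Variable produced by ops
lr = fluid.layers.exponential_decay(learning_rate=0.01,
                                    decay_steps=100, decay_rate=0.9)
sgd = fluid.optimizer.SGD(learning_rate=lr)
# works here; in the reported flow with a cloned train_prog,
# append_optimizer_op fails at this step
sgd.minimize(loss)
```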
It turns out that the per-parameter learning rate scaling is done in `_create_param_lr`, which basically adds a scale op. However, that scale op is appended to the program that owns the `global_learning_rate()` variable, which is still `orig_prog`, so the resulting scaled learning rate cannot be found in `train_prog`.
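
The scaling logic in question looks roughly like this (a paraphrase from memory, not a verbatim copy of the Paddle source):

```python
# Rough paraphrase of Optimizer._create_param_lr (not verbatim):
def _create_param_lr(self, param_and_grad):
    param_lr = param_and_grad[0].optimize_attr['learning_rate']
    if param_lr == 1.0:
        # no multiplier: just hand back the global lr variable
        return self._global_learning_rate()
    # the multiplication below appends a scale op (and its output variable) to the
    # program that owns the global lr variable, i.e. orig_prog, not train_prog
    return self._global_learning_rate() * param_lr
```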
The reason it previously worked without lr scaling is this: `clone()` creates a variable with the same name as the `global_learning_rate()` variable, and that same-named copy is what `append_optimizer_op` ends up using.
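
A tiny self-contained illustration of that name-based lookup (the variable name `learning_rate_0` and the standalone programs here are assumptions for demonstration, not the actual optimizer code):

```python
import paddle.fluid as fluid

orig_prog = fluid.Program()
with fluid.program_guard(orig_prog):
    # stand-in for the global_learning_rate() variable created by the scheduler
    lr = fluid.layers.create_global_var(shape=[1], value=0.01, dtype='float32',
                                        persistable=True, name='learning_rate_0')

train_prog = orig_prog.clone()
train_prog.global_block().var(lr.name)        # found: clone() copied the same-named var

scaled_lr = lr * 2.0                          # scale op/output go into lr's own block (orig_prog)
train_prog.global_block().var(scaled_lr.name) # raises ValueError: not found in train_prog
```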