Created by: chenwhql
In dygraph multi-card mode, to speed up execution we concatenate the grad variables and execute only one allreduce collective, so afterwards we also need to split the fused buffer and reshape each grad back to its original shape in place.
Because reshape inplace is disabled in dygraph mode, the grad variables cannot be restored to their original shapes, which causes an infershape error in the adam or momentum op.
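The fuse/split pattern described above can be sketched as follows. This is a simplified NumPy illustration, not Paddle's actual C++ implementation; the function names `coalesce` and `split_and_reshape` are hypothetical:

```python
import numpy as np

def coalesce(grads):
    # Flatten and concat all grads so a single allreduce can cover them.
    shapes = [g.shape for g in grads]
    flat = np.concatenate([g.ravel() for g in grads])
    return flat, shapes

def split_and_reshape(flat, shapes):
    # Split the fused buffer back and restore each grad's original shape.
    # This mirrors the reshape-inplace step that must succeed for the
    # optimizer op's infershape check to pass.
    out, offset = [], 0
    for shape in shapes:
        size = int(np.prod(shape))
        out.append(flat[offset:offset + size].reshape(shape))
        offset += size
    return out

grads = [np.ones((20, 1, 5, 5)), np.ones((3, 4))]
flat, shapes = coalesce(grads)
# ... allreduce(flat) would run here across cards ...
restored = split_and_reshape(flat, shapes)
```

If the reshape step is skipped, the optimizer receives a flat grad of shape `(500,)` against a param of shape `(20, 1, 5, 5)`, which is exactly the mismatch reported in the error below.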
```
----------------------
Error Message Summary:
----------------------
Error: Param and Grad input of AdamOp should have same dimension
[Hint: Expected param_dims == ctx->GetInputDim("Grad"), but received param_dims:20, 1, 5, 5 != ctx->GetInputDim("Grad"):500.] at (/work/paddle/paddle/fluid/operators/optimizers/adam_op.cc:65)
```
This PR does the following:
- add a Param and Grad dimension check for the sgd op
- add an internal reshape_inplace interface for dygraph parallel
- make the parallel unittests' optimizer strategy consistent with the scripts in paddle/models/dygraph
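For the first item, the added check follows the same shape-equality pattern the AdamOp error above comes from. A hedged Python sketch of the logic (the actual check lives in the op's C++ InferShape; `check_param_grad_dims` is a hypothetical name):

```python
def check_param_grad_dims(param_shape, grad_shape):
    # Param and Grad of the sgd op must have identical dimensions,
    # analogous to the existing AdamOp InferShape check.
    if tuple(param_shape) != tuple(grad_shape):
        raise ValueError(
            "Param and Grad input of SgdOp should have same dimension: "
            "%s != %s" % (param_shape, grad_shape))

check_param_grad_dims((20, 1, 5, 5), (20, 1, 5, 5))  # ok
```

With this check in place, a grad left in its fused flat shape fails fast with a clear message instead of producing a confusing downstream error.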