Add dygraph double grad implementation (!22939) · Merge Request · PaddlePaddle / Paddle

Add dygraph double grad implementation !22939

Created by: sneaxiy

This PR adds a double grad (second-order gradient) implementation for dygraph mode. The implementation is built on paddle.fluid.dygraph.grad, which computes the gradients of y with respect to x, where x and y can be any variables in the network.

StarGAN models with gradient penalty are tested in this PR.
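
To illustrate why a gradient penalty needs double grad, here is a minimal pure-Python sketch of reverse-mode differentiation on scalars. It is not Paddle's implementation: the `Var` class and this `grad` function are hypothetical stand-ins, written only to show the idea that the gradient is itself built from differentiable operations, so it can be differentiated again.

```python
class Var:
    """Scalar graph node (hypothetical; unrelated to Paddle's VariableWrapper)."""

    def __init__(self, value, parents=()):
        self.value = float(value)
        # Each entry is (parent, fn), where fn maps the upstream gradient
        # (a Var) to this parent's gradient contribution (also a Var).
        self.parents = parents

    def __add__(self, other):
        other = _wrap(other)
        return Var(self.value + other.value,
                   ((self, lambda g: g), (other, lambda g: g)))

    def __mul__(self, other):
        other = _wrap(other)
        return Var(self.value * other.value,
                   ((self, lambda g: g * other), (other, lambda g: g * self)))

    __radd__ = __add__
    __rmul__ = __mul__


def _wrap(x):
    return x if isinstance(x, Var) else Var(x)


def grad(y, x):
    """Return dy/dx as a Var, so the result can be differentiated again."""
    order, seen = [], set()

    def visit(node):  # depth-first post-order gives a topological order
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(y)
    grads = {id(y): Var(1.0)}
    for node in reversed(order):
        g = grads.get(id(node))
        if g is None:
            continue
        for parent, fn in node.parents:
            contrib = fn(g)
            key = id(parent)
            grads[key] = grads[key] + contrib if key in grads else contrib
    return grads.get(id(x), Var(0.0))


# Second derivative of y = x^3 at x = 3.
x = Var(3.0)
y = x * x * x
dy_dx = grad(y, x)        # 3 * x^2 = 27
d2y_dx2 = grad(dy_dx, x)  # 6 * x = 18

# Toy gradient penalty: d(x) = w * x^2, penalty = (dd/dx - 1)^2.
w = Var(0.5)
d = w * x * x
g = grad(d, x)                        # 2 * w * x = 3
penalty = (g + (-1.0)) * (g + (-1.0))  # (3 - 1)^2 = 4
dpenalty_dw = grad(penalty, w)        # 2 * (g - 1) * 2 * x = 24
```

The last line is the step that requires double grad: the penalty depends on a gradient, so training the weight `w` against it means differentiating through the first gradient computation.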

This PR also rewrites some existing dygraph code, such as VariableWrapper, SavedVariableWrapperList, GradOpNode, etc. In a test on the PTB model, one epoch on a V100 machine with CUDA 9 takes about 82-84s after this revision, which is the same as the original develop code.

PaddlePaddle / Paddle: synced successfully about 2 years ago
