Forked from PaddlePaddle / Paddle
* Get three grad lists in C++ to avoid GPU idle time
* Support legacy mode