Forked from PaddlePaddle / Paddle
Use DeviceContext, not Place, to get the stream
* With unit tests
* Also complete `memcpy`