PaddlePaddle / Paddle

Unify the gpu implementation of stack and unstack to reuse the optimization. (#49748) · 3586e856

Committed by Yiqun Liu on Jan 31, 2023

* Unify the gpu implementation of stack and unstack to reuse the optimization.

* Optimize the cuda implementation of unstack.

* Use GpuMemcpyAsync instead of memory::Copy.

* Fix error of calculating the index.

* Use FastDivMod to further improve the performance of unstack.
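
For readers unfamiliar with the index arithmetic these bullets refer to, here is a minimal CPU sketch of unstack, assuming the input is viewed as a contiguous `[pre, n, post]` array split along the middle axis. The function name `UnstackReference` and its parameters are hypothetical and are not taken from the Paddle source; the actual change is a CUDA kernel. Each element needs two divisions and two modulo operations to locate its destination, which is the kind of per-element cost a fast division helper such as FastDivMod is meant to reduce.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for unstack; NOT the Paddle GPU kernel.
// The input is treated as a contiguous [pre, n, post] array, and
// output k receives the slice x[:, k, :] with shape [pre, post].
std::vector<std::vector<float>> UnstackReference(const std::vector<float>& x,
                                                 std::size_t pre,
                                                 std::size_t n,
                                                 std::size_t post) {
  assert(x.size() == pre * n * post);
  std::vector<std::vector<float>> outs(n, std::vector<float>(pre * post));
  for (std::size_t idx = 0; idx < x.size(); ++idx) {
    // Decompose the flat index into (outer, k, inner) coordinates.
    std::size_t inner = idx % post;
    std::size_t tmp = idx / post;
    std::size_t k = tmp % n;      // which output this element belongs to
    std::size_t outer = tmp / n;  // row inside that output
    outs[k][outer * post + inner] = x[idx];
  }
  return outs;
}
```

With `pre = 2`, `n = 3`, `post = 2` and an input holding the values 0 through 11 in order, the first output holds `{0, 1, 6, 7}`.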


unstack_grad_kernel.h 956 bytes
