- 13 May, 2019 1 commit
Committed by Yiqun Liu
* Optimize the elementwise op with CUDA kernels. test=develop
* Support setting attrs in the op config file. test=develop
* Add support for setting dtype and initializer in the config. test=develop
* Save workspace.
* Add initializer "zeros". test=develop
* Fix compiling error.
* Support using an existing file to initialize tensors in op_tester.
* Use Eigen to optimize elementwise_add/mul for the case where x and y have the same dims. test=develop
- 07 Mar, 2019 2 commits
- 26 Feb, 2019 1 commit
Committed by Yiqun Liu
Optimize the CUDA implementation of the sequence_expand op by reducing the number of times lod data is copied from CPU to GPU. (#15493)
* Optimize the CUDA implementation of the sequence_expand op by reducing the number of times lod data is copied from CPU to GPU. test=develop
* Refine the op benchmark to support setting lod in config. test=develop
- 22 Feb, 2019 1 commit
Committed by Yiqun Liu
* Initialize the benchmark tester for operators. test=develop
* Rearrange the code. test=develop