- 02 Sep, 2018 1 commit
-
-
Committed by dzhwinter
-
- 25 Aug, 2018 1 commit
-
-
Committed by dzhwinter
-
- 03 Jul, 2018 1 commit
-
-
Committed by yuyang18
It was previously used by NetOp.
-
- 02 Jul, 2018 3 commits
- 07 Jun, 2018 2 commits
-
-
Committed by dzhwinter
* "split into multiple .ccl"
* "refine file structure"
* "refine files"
* "remove the cmakelist"
* "fix typo"
* "fix typo"
* fix ci
-
Committed by mozga-intel
* Add MKLDNN layout support in Paddle

  Add an MKLDNN layout to Paddle so that an MKLDNN-friendly memory layout can be used in MKLDNN-enabled OP kernels. Before this commit, NCHW was hardcoded in all MKLDNN op kernels. As a result, a non-optimized execution path was selected in the MKLDNN primitive, which degraded performance. Besides the framework change, three MKLDNN OP kernels were updated to use the new MKLDNN layout: conv, pool2d, and batch_norm. Other MKLDNN OP kernels need to be updated in a similar way to achieve the best performance.
* Add MKLDNN layout support in activation OP
* Don't populate layout from input to output when kMKLDNN in
* Refine pool mkldnn op kernel
* MKLDNN layout
* Remove the inheritance from tensor file
* MKLDNN layout: refactoring
* Remove additional #define to register new operator
* Prepare mkldnn tests to work with layout
-
- 18 Apr, 2018 1 commit
-
-
Committed by Yang Yang
-
- 17 Apr, 2018 1 commit
-
-
Committed by Yang Yang
-
- 12 Feb, 2018 1 commit
-
-
Committed by qingqing01
-
- 10 Feb, 2018 2 commits
- 09 Feb, 2018 1 commit
-
-
Committed by emailweixu
-
- 17 Jan, 2018 1 commit
-
-
Committed by Qiao Longfei
-
- 03 Jan, 2018 1 commit
-
-
Committed by Luo Tao
-
- 27 Dec, 2017 1 commit
-
-
Committed by dzhwinter
* "refine kernel registrar"
* "refine registrar with multikey"
* "fix register"
* "refine multikernel register"
* "fix CI"
* "fix CI"
* "fix registry"
* "switch GPU to CUDA"
* "add register macro test case"
* "fix CI"
-
- 25 Dec, 2017 1 commit
-
-
Committed by dzhwinter
-
- 24 Dec, 2017 1 commit
-
-
Committed by qiaolongfei
-
- 22 Dec, 2017 1 commit
-
-
Committed by xuwei06
For an input argument that is a list of variables, drop_empty_grad is not allowed, because it makes the correspondence between a variable and its gradient ambiguous. Use REGISTER_OP_EX to register the op, or call InputGrad(?, false) in GradOpDescMaker.
-
- 21 Dec, 2017 1 commit
-
-
Committed by Yu Yang
* Rename XXDescBind --> XXDesc
* Fix compile
-
- 20 Dec, 2017 1 commit
-
-
Committed by Yu Yang
* Move framework.proto to proto namespace
* Fix compile
* Fix compile
* Fix compile
-
- 12 Dec, 2017 1 commit
-
-
Committed by QI JUN
The main fixes are:
- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove the `eigen_device` interface in base class `DeviceContext`
- remove the `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove the unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`
-
- 08 Nov, 2017 1 commit
-
-
Committed by Yu Yang
* Change `IndicateDataType` to `GetKernelType` to make it easier to understand.
* Change `OpKernelKey` to `OpKernelType`
* Let operator developers customize which kernel the operator uses at runtime.
-
- 01 Nov, 2017 1 commit
-
-
Committed by Yu Yang
* Init commit
* Make executor use ProgramDescBind
* Change Attribute from BlockDesc to BlockDescBind
* Since we need to get the program desc in RNN, BlockDesc alone is not enough.
-
- 29 Oct, 2017 2 commits
- 24 Oct, 2017 1 commit
-
-
Committed by Dong Zhihong
-
- 19 Oct, 2017 2 commits
- 18 Oct, 2017 1 commit
-
-
Committed by Yu Yang
-
- 17 Oct, 2017 1 commit
-
-
Committed by qijun
-
- 13 Oct, 2017 1 commit
-
-
Committed by Yu Yang
* Add no_grad_vars for grad_op_maker
* Add unittest
* Fix unittest
* Fix unittest
* Follow comment
-
- 10 Oct, 2017 1 commit
-
-
Committed by Yu Yang
-
- 06 Oct, 2017 2 commits
-
-
Committed by qiaolongfei
-
Committed by qiaolongfei
-
- 05 Oct, 2017 4 commits
-
-
Committed by Yu Yang
-
Committed by Yu Yang
-
Committed by qiaolongfei
-
Committed by Yi Wang
-