- 13 Nov, 2018 1 commit
-
-
Committed by tensor-tang
test=develop
-
- 07 Jun, 2018 1 commit
-
-
Committed by mozga-intel
* Add MKLDNN layout support in Paddle
  Add an MKLDNN layout in Paddle so that an MKLDNN-friendly memory layout can be used in MKLDNN-enabled OP kernels. Before this commit, NCHW was hardcoded in all MKLDNN OP kernels. As a result, a non-optimized execution path was selected in the MKLDNN primitive, which gave worse performance. Besides the framework change, three MKLDNN OP kernels were updated to use the new MKLDNN layout: conv, pool2d, and batch_norm. Other MKLDNN OP kernels need to be updated in a similar way to achieve the best performance.
* Add MKLDNN layout support in activation OP
* Don't populate layout from input to output when kMKLDNN in
* Refine pool mkldnn op kernel
* MKLDNN layout
* Remove the inheritance from tensor file
* MKLDNN layout: refactoring
* Remove additional #define to register new operator
* Prepare mkldnn tests to work with layout
-
- 08 May, 2018 1 commit
-
-
Committed by Yu Yang
Do not use ctor
* Reduce lines of code.
* We can use a virtual function for Maker now.
* The implementation does not care what the maker holds, so it is easier to refactor later.
-
- 19 Apr, 2018 1 commit
-
-
Committed by Yang Yang(Tony)
* script to add semicolon
* fix typo
-
- 17 Apr, 2018 1 commit
-
-
Committed by Yang Yang
-
- 12 Apr, 2018 1 commit
-
-
Committed by Siddharth Goyal
* Fix cpplint errors, round 2
* Fix pointer issue
-
- 30 Mar, 2018 1 commit
-
-
Committed by Tomasz Patejko
-
- 22 Mar, 2018 1 commit
-
-
Committed by Tomasz Patejko
-
- 21 Mar, 2018 1 commit
-
-
Committed by Tomasz Patejko
-
- 19 Mar, 2018 3 commits
-
-
Committed by Tomasz Patejko
-
Committed by Tomasz Patejko
-
Committed by Tomasz Patejko
-
- 15 Mar, 2018 1 commit
-
-
Committed by qingqing01
-
- 12 Feb, 2018 1 commit
-
-
Committed by qingqing01
-
- 10 Feb, 2018 2 commits
- 26 Dec, 2017 1 commit
-
-
Committed by Luo Tao
-
- 20 Dec, 2017 1 commit
-
-
Committed by Yu Yang
* Move framework.proto to proto namespace
* Fix compile
* Fix compile
* Fix compile
-
- 12 Dec, 2017 1 commit
-
-
Committed by QI JUN
The main changes are:
  - take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
  - remove the `eigen_device` interface in base class `DeviceContext`
  - remove the `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
  - remove unused `platform::EigenDeviceConverter`
  - rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
  - rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`
-
- 06 Dec, 2017 1 commit
-
-
Committed by gongweibao
Add an efficient GPU implementation of LRN
-
- 04 Nov, 2017 1 commit
-
-
Committed by kexinzhao
-
- 26 Oct, 2017 1 commit
-
-
Committed by gongweibao
Add local response normalization
-