- 07 Jun, 2018 1 commit
-
-
Committed by mozga-intel
* Add MKLDNN layout support in Paddle
  Add MKLDNN layout in Paddle so that an MKLDNN-friendly memory layout can be used in MKLDNN-enabled OP kernels. Before this commit, NCHW was hardcoded in all MKLDNN OP kernels. As a result, a non-optimized execution path was selected in the MKLDNN primitive, which gave worse performance. Besides the framework change, three MKLDNN OP kernels were updated to use the new MKLDNN layout: conv, pool2d, and batch_norm. Other MKLDNN OP kernels also need to be updated in a similar way to achieve the best performance.
* Add MKLDNN layout support in activation OP
* Don't populate layout from input to output when kMKLDNN in
* Refine pool mkldnn op kernel
* MKLDNN layout
* Remove the inheritance from tensor file
* MKLDNN layout: refactoring
* Remove additional #define to register new operator
* Prepare mkldnn tests to work with layout
-
- 08 May, 2018 1 commit
-
-
Committed by Yu Yang
Do not use ctor
* Reduce lines of code.
* We can use a virtual function for Maker now.
* The implementation does not care what the maker holds, so it is easier to refactor later.
-
- 03 May, 2018 1 commit
-
-
Committed by Tomasz Patejko
* Initial implementation of forward pass for MKLDNN batch norm
* Added attributes for MKLDNN batch norm
* MKLDNN batch norm forward pass passes unittest. Started working on backward
* Backward pass for MKLDNN batch norm added
* MKLDNN batch norm: scoring added to forward pass
* MKLDNN batch norm: bias as input added; handling AnyLayout when kernel is looked up
* MKLDNN batch norm: python unit tests added; mkldnn tests removed
* MKLDNN batch norm: changes required by cpplint
* MKLDNN batch norm: refactoring the operator
* MKLDNN batch norm: saved variance inverted in backward pass for correct execution of MKLDNN unit tests
* MKLDNN batch norm: refactoring, function for static/const cast to void* added
* MKLDNN batch norm: remove AnyLayout from batch norm
* MKLDNN batch norm: only NCHW format is supported. Unittests refactored
* MKLDNN batch norm: use_mkldnn added to attributes
* MKLDNN batch norm: AnyLayout removed from unittest
* MKLDNN batch norm: added CUDNN defines to batch norm
* MKLDNN batch norm: undefined data_format variable corrected
* MKLDNN batch norm: use_cudnn added, use of setUp method for configuring attributes
* MKLDNN batch norm: added use_cudnn attribute to batch norm operator
* MKLDNN batch norm: correcting batch norm unit tests for MKLDNN
* MKLDNN batch norm: MKLDNN tests moved to another file; reverting changes for saved variance not being inverted
* Change default layout to NCHW
* MKLDNN batch norm: init_kernel_type method added to unit tests
* MKLDNN batch norm: style changes
* MKLDNN batch norm: unit tests refactored
* MKLDNN batch norm: added use_mkldnn attribute to batch norm python interface
-
- 02 May, 2018 1 commit
-
-
Committed by dzhwinter
* "fix double type error"
* "fix ci"
-
- 11 Apr, 2018 1 commit
-
-
Committed by Siddharth Goyal
-
- 21 Mar, 2018 1 commit
-
-
Committed by Yu Yang
-
- 19 Mar, 2018 2 commits
-
-
Committed by Kexin Zhao
-
Committed by Kexin Zhao
-
- 12 Feb, 2018 1 commit
-
-
Committed by qingqing01
-
- 10 Feb, 2018 2 commits
- 08 Jan, 2018 1 commit
-
-
Committed by Qiao Longfei
* add rename guard
* add device_data_transform
* add device_data_transform_test
* modify GetExpectedKernelType
* update operator.run
* support test test_label_semantic_roles
* optimize code
* optimize code
* rename GetActualKernelType to GetExpectedKernelType
* fix chunk_eval_op and device_data_transform_test
* add is_same_place to place
* optimize code, refine rename_guard
* refine rename guard, add GetKernelTypeForVar
* optimize code
* add some log
* rename guard
* use sub scope to create var
* fix compile
* add IsInitialized for Tensor
* add VarIsTensor
* fix op_registry_test
* test
* tmp disable priority
* restore switch_kernel.md
* code clean
-
- 04 Jan, 2018 1 commit
-
-
Committed by Yang Yu
-
- 26 Dec, 2017 1 commit
-
-
Committed by chengduoZH
-
- 25 Dec, 2017 1 commit
-
-
Committed by Qiao Longfei
* init kernel hint
* fix typo
* rm unused code
* add include in op_kernel.h
* restore op_kernel since it will be moved to op_kernel_type
* change force_cpu to use_cpu
* fix compilation
-
- 22 Dec, 2017 1 commit
-
-
Committed by QI JUN
* add data layout
* fix ci
-
- 20 Dec, 2017 1 commit
-
-
Committed by Yu Yang
* Move framework.proto to proto namespace
* Fix compile
* Fix compile
* Fix Compile
-
- 12 Dec, 2017 1 commit
-
-
Committed by QI JUN
The main fixes are as follows:
- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`
-
- 28 Nov, 2017 1 commit
-
-
Committed by Qiao Longfei
* batch norm support matrix input
* update gpu code
* format code
-
- 08 Nov, 2017 2 commits
-
-
Committed by dangqingqing
-
Committed by Yu Yang
* Change `IndicateDataType` to `GetKernelType` to make it easier to understand.
* Change `OpKernelKey` to `OpKernelType`
* Allow operator developers to customize which kernel the operator uses at runtime.
-
- 05 Nov, 2017 1 commit
-
-
Committed by kavyasrinet
* Adding the doc format for AdaDelta
* Updating the documentation for Adagrad, Adam and Adamax
* Updating the auc op
* Fix review comments
* Updating doc for Batch Norm
* Updating the cast op
* Updating the clip op
* Fixing review comment
* Fixing review comment
* Small change to restart PR_CI
-
- 04 Nov, 2017 1 commit
-
-
Committed by Qiao Longfei
* add acc layer
* memory log level change from 3 to 10
* use gaussian random to init conv parameters
* use initializer
* fix import
* batch_norm use helper to create persistable var
* refine code
* train only 2 batches for test
* use g_program and g_init_program
* use XavierInitializer to init fc parameter
-
- 30 Oct, 2017 1 commit
-
-
Committed by Qiao Longfei
* add batch_norm_layer
* add img_conv_group layer and test
* add check to Tensor.type()
* forward can run
* with backward
* change label data type from int32 to int64
* refine code
* follow comment
-
- 25 Oct, 2017 1 commit
-
-
Committed by Qiao Longfei
* init batch norm op
* prepare input output
* compute mean_out var_out save_mean save_var on CPU
* active is test
* use eigen to do computation
* complete batch norm forward
* set default momentum to 0.9
* add batch norm grad op in CPU
* add tensor_format and NHWC support, add python test
* add test training
* add batch norm gradient test
* improve comment, fix forward Python UnitTest
* add gradient test
* fix eigen warning
* follow name style
* fix a bug
* change float to T
* add simple forward test
* test with different place
* add backward test
* refine python test
* remove old python test code
* code clean
* follow code style
* update comment
-