- 25 7月, 2019 1 次提交
-
-
由 lidanqing 提交于
* change INT8 to template so that checking dst_dt with if-else could be removed. CI will be enabled after fixing reviews * reverse user_residual_memory_p and user_bias_memory_p declaration scope test=develop
-
- 23 7月, 2019 1 次提交
-
-
由 Jacek Czaja 提交于
test=develop - compileation fix - Yet another compilation fix - Even yet another compilation fix - Surprise! Again compilation fix - lint fixes test=develop - Fix to workspace acquire of LRN test=develop - Fix to hash of BWD LRN test=develop - fix to lrn BWD PD acquire test=develop - Fixing LRN PD creation test=develop - cosmetic fix in comment test=develop - Fixes after review test=develop
-
- 19 7月, 2019 1 次提交
-
-
由 Jacek Czaja 提交于
test=develop
-
- 16 7月, 2019 1 次提交
-
-
由 Jacek Czaja 提交于
* - Added partial draft of pooling acquire - Workspace support - compilation fix - Added draft of pooling backward reimplementation - Segfault fix - reverted 'any' for diff_dst crewation in pooling - Lint fixes test=develop - lint fixes test=develop - Further lint fixes test=develop * - Fixes after review test=develop * - Lint fixes test=develop * - Even more lint fixes test=develop
-
- 10 7月, 2019 1 次提交
-
-
由 Jacek Czaja 提交于
-
- 09 7月, 2019 1 次提交
-
-
由 Jiabin Yang 提交于
* test=develop, fix docker with paddle nccl problem * test=develop, fix/gcc_4.8_ubt_link_error * test=develop, fix code format
-
- 02 7月, 2019 1 次提交
-
-
由 Leo Zhao 提交于
* rename mkldnn set/get_cur_thread_id() to set/get_cur_mkldnn_session_id() test=develop * update session id definition and adjust logic for default behavior test=develop * reset logic in mkldnn reuse as most of cases work in default. test=develop
-
- 01 7月, 2019 1 次提交
-
-
由 Brian Liu 提交于
* Fix bug in quantize kernel which cause crash in vgg16/19 model test=develop * refine the code to reduce verbose code; test=develop * remove useless code; test=develop
-
- 28 6月, 2019 1 次提交
-
-
由 Leo Zhao 提交于
1. some key generation method is not aligned with PR#17965 2. enlarge ptr lifetime to avoid memory release if SetBlob fails otherwise it will get core dump. test=develop
-
- 27 6月, 2019 1 次提交
-
-
由 Jacek Czaja 提交于
* - Reusing of reuder used in elementwise_add_mkldnn - Added MKL-DNN sum prim reusing test=develop - Compilation fixes test=develop - Yet another compilation fix test=develop - Yet another compilation fix test=develo - Yet another linking fix test=develop - Final compilation fix test=develop - lint fixes test=develop - Lint fixes test=develop * - Fixes after review test=develop
-
- 11 6月, 2019 1 次提交
-
-
由 Jacek Czaja 提交于
* - removed is_reusing_ * - Added TID to keys for reusing apart from softmax PD * - compilation fix * - Yet another compilation fix * - Batch Norm and Conv adapted * - Fix to softmax MT * - Fixes to MT code of MKL-DNN * - Lint fixes test=develop
-
- 10 6月, 2019 1 次提交
-
-
由 Zeng Jinle 提交于
* remove attribute in Allocator::Allocate, test=develop * fix travis ci error, test=develop
-
- 04 6月, 2019 1 次提交
-
-
由 Leo Zhao 提交于
test=develop
-
- 22 5月, 2019 1 次提交
-
-
由 guomingz 提交于
* Relu6 is the bottleneck op for Mobilenet-v2. As the mkldnn supports the conv/relu6 fusion, we implement it fusion via cpass way. Due to the int8 enabling for this fusion will be supported in MKLDNN v0.20, so this PR is focused on the fp32 optimization. Below table shows the benchmark(FPS) which measured on skx-8180(28 cores) Batch size | with fusion | without fusion -- | -- | -- 1 | 214.7 | 53.4 50 | 1219.727 | 137.280 test=develop * Fix the format issue test=develop * Add the missing nolint comments. test=develop * Fix the typos. test=develop * Register the conv_brelu_mkldnn_fuse_pass for the MKLDNN engine. test=develop * Adjust the indentation. test=develop * Add the test_conv_brelu_mkldnn_fuse_pass case. test=develop * Slightly update the code per Baidu comments. Let the parameter definition embedded into the code. That's will make the code easy to understand. test=develop
-
- 16 4月, 2019 1 次提交
-
-
由 Jacek Czaja 提交于
* - Reuse of conv PD - conv transpose pd reused - Added PD reusing of softmax and Batch Norm - Refactoring and removal of not needed routines of mkl-dnn ops test=develop - Fix to reusing conv test=develop - Lint fixes test=develop - Further lint fixes test=develop - Lint fixes test=develop - lint fixes test=develop - Lint workaround test=develop * - Fix after review on including boost as third party header test=develop * - Fix after review. Name change to something more descriptive test=develop
-
- 28 3月, 2019 1 次提交
-
-
由 Jacek Czaja 提交于
* Revert "[MKL-DNN] Fix to crash of Transformer when mkldnn is to be used (#16233)" This reverts commit 13816dd4. Apart from enabling transformer for MKL-DNN * Revert "- MKL-DNN pooling updated to set_prim_desc" This reverts commit c63f6b20. Conflicts: paddle/fluid/operators/mkldnn/concat_mkldnn_op.cc * Revert "[MKL-DNN] MKL-DNN specific Tensor modification (#15429)" test=develop This reverts commit dec9cf53. * - concat compilation fix - lint test=develop - Lint fixes test=develop - Lint fixes test=develop - Fix Transpose MKLDNN op test=develop
-
- 27 2月, 2019 1 次提交
-
-
由 xiaolil1 提交于
* Optimize key creation of INT8 pool kernel to improve the peformance of ResNet-50 and MobileNet, especially for latency. test=develop * Optimize key creation of pool fp32 grad. test=develop
-
- 25 2月, 2019 1 次提交
-
-
由 Jacek Czaja 提交于
* - Implemented draft of primitive desc keeping in Tensor test=develop - TransposeMKLDNNHandler::AcquireSrcMemory was reimplemented - Added nchw and nc formats setting for sake of compatiblity Fixed unit tests - Worakaround to problem with 5D data in conv - Added 3D and 1D MKL-DNN formats for name handles for tensor test=develop - Fix to UTs test=develop - Conv fp32 op was updated Cosmetic fixes test=develop - tensor mkldnn cosmetics test=develop - Moved most of mkl-dnn specific code from Tensor to mkl-dnn utils * - Lint fixes test=develop * - setting prim dec in Tensor , sets also layout to kMKLDNN test=develop * - Moved creation of prim desc totally out of Tensor test=develop * - Cosmetic fixes adter review test=develop
-
- 22 2月, 2019 1 次提交
-
-
由 Sylwester Fraczek 提交于
reason: dereferencing smart pointer is the same as the underlying pointer test=develop
-
- 23 1月, 2019 1 次提交
-
-
由 tangwei12 提交于
checkpoint for distributed training.
-
- 10 1月, 2019 1 次提交
-
-
由 xiaolil1 提交于
* Enable basic MKL-DNN INT8 Conv OP test=develop * Modify test case test=develop * Clean unittest code test=develop * Fix test test=develop * Modify test test=develop * Enable MKL-DNN INT8 Conv with Relu Fusion OP test=develop * Enable INT8 Conv with residual fusion OP test=develop * Modify code. test=develop * Modify basic INT8 Conv test=develop * Modify Conv. test=develop * fix style test=develop * Fix style test=develop * Fix test test=develop * Modify code. test=develop * Fix test test=develop
-
- 07 1月, 2019 1 次提交
-
-
由 xiaolil1 提交于
* Enable basic MKL-DNN INT8 Conv OP test=develop * Modify test case test=develop * Clean unittest code test=develop * Fix test test=develop * Modify test test=develop * Enable MKL-DNN INT8 Conv with Relu Fusion OP test=develop * Modify basic INT8 Conv test=develop * fix type test=develop * Modify test test=develop
-
- 04 1月, 2019 1 次提交
-
-
由 xiaolil1 提交于
* Enable basic MKL-DNN INT8 Conv OP test=develop * Modify test case test=develop * Clean unittest code test=develop * Fix test test=develop * Modify test test=develop * Modify basic INT8 Conv test=develop
-
- 24 12月, 2018 1 次提交
-
-
由 xiaoli.liu@intel.com 提交于
test=develop
-
- 19 12月, 2018 1 次提交
-
-
由 Jacek Czaja 提交于
test=develop
-
- 27 11月, 2018 1 次提交
-
-
由 Jacek Czaja 提交于
test=develop - Added new header for MKLDNN reuse functionality - Extended conv2d_transpose GetExpectedKernelType for MKL-DNN supporrt - Buildable conv transpose mkldnn and conv mkldnn using conv template - Conv2d transpose roughlt implemented and buildable - Added modifications conv2d transpose MKLDNN unit tests - Fix to UT of conv2d transpose mkldnn op - Wrong type of MKLDNN primitive was chosen for conv2d transpose - HAcks for conv2d transpose - UT enalbed - Replaced copying loop with memcpy - Draft of passing lambda into AcquireMemory - Made reorder (IOHW->OIHW) to be called only once
-