- 07 Jun, 2019 1 commit
-
-
Committed by Zeng Jinle
* fix cuda/cudnn version detection error, test=develop
* fix again, test=develop
-
- 05 Jun, 2019 1 commit
-
-
Committed by chengduo
test=develop
-
- 04 Jun, 2019 1 commit
-
-
Committed by Leo Zhao
test=develop
-
- 03 Jun, 2019 2 commits
-
-
Committed by wangchaochaohu
* revise conv layer cudnn algo choose test=develop
* update for code style test=develop
* update for code style test=develop
-
Committed by chengduo
test=develop
-
- 29 May, 2019 1 commit
-
-
Committed by gongweibao
-
- 27 May, 2019 1 commit
-
-
Committed by gongweibao
-
- 24 May, 2019 2 commits
-
-
Committed by wopeizl
* add __str__ method for tensor and lodtensor to support print test=develop
-
Committed by mozga-intel
* Enable assign operator for a ngraph, test=develop
* Cross_entropy operators need to be updated
-
- 23 May, 2019 2 commits
-
-
Committed by Zeng Jinle
* Revert "Revert "Fix allocator bug"" This reverts commit 174d0d0b.
* Revert "fix travis ci" This reverts commit 5656fa9f. test=develop
* add inlined_vector.h, test=develop
* add inlined_vector_test,test=develop
-
Committed by mozga-intel
-
- 22 May, 2019 1 commit
-
-
Committed by guomingz
* Relu6 is the bottleneck op for Mobilenet-v2. As mkldnn supports the conv/relu6 fusion, we implement this fusion via the cpass way. Because int8 enabling for this fusion will be supported in MKLDNN v0.20, this PR focuses on the fp32 optimization. The table below shows the benchmark (FPS) measured on skx-8180 (28 cores).

  Batch size | with fusion | without fusion
  -- | -- | --
  1 | 214.7 | 53.4
  50 | 1219.727 | 137.280

  test=develop
* Fix the format issue test=develop
* Add the missing nolint comments. test=develop
* Fix the typos. test=develop
* Register the conv_brelu_mkldnn_fuse_pass for the MKLDNN engine. test=develop
* Adjust the indentation. test=develop
* Add the test_conv_brelu_mkldnn_fuse_pass case. test=develop
* Slightly update the code per Baidu comments. Let the parameter definition be embedded into the code. That will make the code easy to understand. test=develop
-
- 20 May, 2019 1 commit
-
-
Committed by qingqing01
test=develop
-
- 15 May, 2019 1 commit
-
-
Committed by Zeng Jinle
-
- 10 May, 2019 1 commit
-
-
Committed by qingqing01
* Add conv2d_grad_grad_op
* Extract the cuDNN conv algo searching code in conv_cudnn_helper.h.
  - Now use it in conv2d_grad_grad.
  - Will simplify the searching code in conv2d and conv2d_grad in next PR.
* Enhance and fix bug in unit testing of gradient_checker.
* Support fetching empty variables, returning None in Python.
-
- 08 May, 2019 3 commits
-
-
Committed by zhaoyuchen2018
* Refine elementwise kernel. Add a simple cuda kernel if grad x and y both exist. Use 2D block cuda kernel to do broadcast. test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
* refine code. test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
* refine code. test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
-
Committed by chengduo
test=develop
-
Committed by baojun
* added lrn op test=develop
* Added CreateConstant method test=develop
* avoid duplicates test=develop
-
- 07 May, 2019 1 commit
-
-
Committed by Tao Luo
* remove unused FLAGS_warpctc_dir test=develop
* remove FLAGS_warpctc_dir test=develop
-
- 30 Apr, 2019 1 commit
-
-
Committed by Huihuang Zheng
test=develop
-
- 28 Apr, 2019 1 commit
-
-
Committed by Huihuang Zheng
1. Use CudnnWorkspaceHandle in exhaustive search of conv_cudnn.
2. For Ops using CudnnWorkspaceHandle in exhaustive search, release their GPU memory after exhaustive search. test=develop
-
- 23 Apr, 2019 1 commit
-
-
Committed by Zeng Jinle
* make_conv_cudnn_ws_size_configurable, test=develop
* change std::max to std::min test=develop
-
- 21 Apr, 2019 1 commit
-
-
Committed by Zeng Jinle
* speedup gc and inplace softmax_with_cross_entropy_grad test=develop
* refine models gpu mem Merge skip vars and warning messages of mem opt remove relu mem opt test=develop
* follow comments test=develop
-
- 18 Apr, 2019 1 commit
-
-
Committed by gongweibao
-
- 16 Apr, 2019 2 commits
-
-
Committed by xuezhong
test=develop
-
Committed by Jacek Czaja
* - Reuse of conv PD
  - conv transpose pd reused
  - Added PD reusing of softmax and Batch Norm
  - Refactoring and removal of not needed routines of mkl-dnn ops test=develop
  - Fix to reusing conv test=develop
  - Lint fixes test=develop
  - Further lint fixes test=develop
  - Lint fixes test=develop
  - lint fixes test=develop
  - Lint workaround test=develop
* - Fix after review on including boost as third party header test=develop
* - Fix after review. Name change to something more descriptive test=develop
-
- 11 Apr, 2019 1 commit
-
-
Committed by dongdaxiang
test=develop
-
- 03 Apr, 2019 1 commit
-
-
Committed by Chen Weihang
test=develop This reverts commit c38c7c56.
-
- 02 Apr, 2019 1 commit
-
-
Committed by Chen Weihang
* link the libwbaes.so into paddle
* polish detail, test=develop
* try fix mac_pr_ci error, test=develop
* add compile option, test=develop
* fix ci error, test=develop
* ignore failed to find mac lib, test=develop
* change cdn to bj, cdn can't get the latest version
* trigger ci, test=develop
* temporary delete win32 lib linking, test=develop
* change https to http, test=develop
* turn compile option on to off
* turn compile option off to on, test=develop
* try lib compiled by gcc4.8, test=develop
* update lib version, test=develop
* link other lib, test=develop
* add setup config
* delete false, test=develop
* delete no_soname, test=develop
* recover so name set
* fix, test=develop
* adjust make config, test=develop
* remove link to wbaes, test=develop
* remove useless define, test=develop
-
- 30 Mar, 2019 1 commit
-
-
Committed by gongweibao
* fix compiled test=develop
* follow comments test=develop
-
- 29 Mar, 2019 10 commits
-
-
Committed by dongdaxiang
test=develop
-
Committed by dongdaxiang
test=develop
-
Committed by dongdaxiang
test=develop
-
Committed by dongdaxiang
test=develop
-
Committed by dongdaxiang
test=develop
-
Committed by dongdaxiang
test=develop
-
Committed by dongdaxiang
test=develop
-
Committed by dongdaxiang
-
Committed by dongdaxiang
support win32 flag in io.cc shell.cc, fix code style problem in fleet_wrapper, fix lodtensor_printer_test problem test=develop
-
Committed by dongdaxiang
test=develop
-