- 22 May, 2019 2 commits
-
-
Committed by guomingz
* Relu6 is the bottleneck op for Mobilenet-v2. As MKLDNN supports conv/relu6 fusion, we implement this fusion via the cpass way. Because int8 support for this fusion will arrive in MKLDNN v0.20, this PR focuses on the fp32 optimization. The table below shows the benchmark (FPS) measured on skx-8180 (28 cores):

  Batch size | with fusion | without fusion
  -- | -- | --
  1 | 214.7 | 53.4
  50 | 1219.727 | 137.280

  test=develop
* Fix the format issue test=develop
* Add the missing nolint comments. test=develop
* Fix the typos. test=develop
* Register the conv_brelu_mkldnn_fuse_pass for the MKLDNN engine. test=develop
* Adjust the indentation. test=develop
* Add the test_conv_brelu_mkldnn_fuse_pass case. test=develop
* Slightly update the code per Baidu comments. Embed the parameter definitions in the code, which makes the code easier to understand. test=develop
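To illustrate what a conv/relu6 fusion means in the commit above, here is a minimal sketch in plain Python. It is not the MKLDNN pass itself: the 1-D convolution and all function names (`relu6`, `conv1d`, `conv1d_then_relu6`, `fused_conv1d_relu6`) are hypothetical and chosen only for illustration. The sketch assumes relu6 is the bounded ReLU `min(max(x, 0), 6)` and shows that applying the clamp inside the convolution loop gives the same result as running it as a separate pass over the output.

```python
def relu6(x, threshold=6.0):
    """Bounded ReLU: clamp x to the range [0, threshold]."""
    return min(max(x, 0.0), threshold)


def conv1d(signal, kernel):
    """Plain 'valid' 1-D convolution (no kernel flip, i.e. cross-correlation)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def conv1d_then_relu6(signal, kernel):
    """Unfused version: materialize the conv output, then make a second pass."""
    return [relu6(v) for v in conv1d(signal, kernel)]


def fused_conv1d_relu6(signal, kernel):
    """Fused version: clamp each accumulator before it is written out,
    so the intermediate conv output is never stored or re-read."""
    k = len(kernel)
    out = []
    for i in range(len(signal) - k + 1):
        acc = 0.0
        for j in range(k):
            acc += signal[i + j] * kernel[j]
        out.append(relu6(acc))
    return out


signal = [1.0, -2.0, 3.0, 4.0, 5.0]
kernel = [1.0, 2.0]
print(fused_conv1d_relu6(signal, kernel))
```

The fused loop touches each output element once, which is the general reason fusion helps when the activation is memory-bound; the FPS numbers in the commit come from the real MKLDNN kernels, not from anything like this sketch.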
-
Committed by Zhen Wang
* add quant_dequant_pass, test=develop
* Add quant_dequant before some ops, such as the elementwise_add op. This is required by TensorRT. test=develop
-
- 21 May, 2019 10 commits
-
-
Committed by Yibing Liu
* Add LAMB optimizer
* Expose LAMB Optimizer's APIs test=develop, test=document_preview
* Cleanup code & doc test=develop, test=document_preview
* Update lamb optimizer's formula test=develop
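For readers unfamiliar with LAMB, the following is a minimal sketch of one update step in plain Python. It follows the commonly published form of the algorithm without bias correction; the exact formula adopted in this commit is not shown in the log and may differ. The function name `lamb_step` and all default hyperparameters are assumptions made for illustration, not the Paddle API.

```python
import math


def lamb_step(w, g, m, v, lr=0.001, beta1=0.9, beta2=0.999,
              eps=1e-6, weight_decay=0.01):
    """One LAMB step for a single layer (lists of floats).

    Hypothetical helper: Adam-style moments, plus a layer-wise
    trust ratio ||w|| / ||update|| that rescales the step.
    """
    # Exponential moving averages of the gradient and its square.
    m = [beta1 * mi + (1.0 - beta1) * gi for mi, gi in zip(m, g)]
    v = [beta2 * vi + (1.0 - beta2) * gi * gi for vi, gi in zip(v, g)]

    # Adam-style direction plus decoupled weight decay.
    update = [
        mi / (math.sqrt(vi) + eps) + weight_decay * wi
        for mi, vi, wi in zip(m, v, w)
    ]

    # Layer-wise trust ratio; fall back to 1.0 if either norm is zero.
    w_norm = math.sqrt(sum(x * x for x in w))
    u_norm = math.sqrt(sum(x * x for x in update))
    trust = w_norm / u_norm if w_norm > 0.0 and u_norm > 0.0 else 1.0

    new_w = [wi - lr * trust * ui for wi, ui in zip(w, update)]
    return new_w, m, v


w0 = [3.0, 4.0]
new_w, m1, v1 = lamb_step(w0, g=[0.1, -0.2], m=[0.0, 0.0], v=[0.0, 0.0])
step_norm = math.sqrt(sum((a - b) ** 2 for a, b in zip(w0, new_w)))
print(step_norm)
```

A consequence of the trust ratio is that the step length for a layer is `lr * ||w||` regardless of the gradient's magnitude, which is what makes the method tolerant of very large batch sizes.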
-
Committed by mozga-intel
-
Committed by Tao Luo
test=develop
-
Committed by zhaoyuchen2018
* Add API doc code examples: add or fix topk, squeeze, stack, StaticRNN, StaticRNN memory in doc test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
* Add squeeze md5. test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
* Add import package test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
-
Committed by Jiabin Yang
-
Committed by mozga-intel
-
Committed by lidanqing
* Add 6 models tests support in CMake
* enabling resnet101, vgg16, vgg19 INT8v2 model tests test=develop
* remove SERIAL test=develop
-
Committed by liuwei1031
http://newicafe.baidu.com:80/issue/PaddleSec-33/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-28/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-25/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-24/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-21/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-20/show?from=page
test=develop
-
Committed by Zhaolong Xing
* add quant_dequant_moving_avg_max_abs op test=develop
* add more note for quantdequant op test=develop
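As a rough illustration of what a quantize-dequantize op with a moving-average abs-max scale computes, here is a sketch in plain Python. It is written from the op's name and the usual fake-quantization recipe, not from the commit's source; the helper names, the `rate` default, and the accumulator/state update rule are all assumptions.

```python
def moving_avg_abs_max(x, accum, state, rate=0.9):
    """Update a moving average of max(|x|) and return the new scale.

    Assumed rule: state <- rate*state + 1, accum <- rate*accum + max|x|,
    scale = accum / state. Requires a non-empty x with a non-zero max.
    """
    state = rate * state + 1.0
    accum = rate * accum + max(abs(v) for v in x)
    return accum / state, accum, state


def quant_dequant(x, scale, bit_length=8):
    """Simulate integer quantization: snap to the int grid, map back to float."""
    bin_cnt = 2 ** (bit_length - 1) - 1  # 127 for 8 bits
    out = []
    for v in x:
        clipped = max(-scale, min(scale, v))     # saturate to [-scale, scale]
        q = round(clipped / scale * bin_cnt)     # integer level in [-bin_cnt, bin_cnt]
        out.append(q * scale / bin_cnt)          # dequantize back to float
    return out


x = [0.5, -1.0, 0.25]
scale, accum, state = moving_avg_abs_max(x, accum=0.0, state=0.0)
y = quant_dequant(x, scale)
print(scale, y)
```

The output stays in floating point but only takes values representable on the 8-bit grid, so the rest of the graph sees the rounding error that a real int8 kernel would introduce.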
-
Committed by Hongyu Liu
-
- 20 May, 2019 9 commits
-
-
Committed by Qiao Longfei
* optimize communicator flag
* change flags in init py test=develop
-
Committed by Zeng Jinle
-
Committed by liuwei1031
-
Committed by liuwei1031
* improve the doc of paddle.fluid.memory_optimize, test=develop
* fix typo, test=develop
-
Committed by Tao Luo
test=develop
-
Committed by Zeng Jinle
-
Committed by wopeizl
* fix the random compilation failure on windows
-
Committed by lvmengsi
* double backward, elementwise_div
* fix dx empty. test=develop
* bug fix (#17392) fix security bug
* Enable stack operator for nGraph, test=develop (#17406)
* fix sqrt_grad_grad unittest. test=develop (#17410)
* fix sqrt_grad_grad unittest. test=develop
* disable sqrt_grad_grad unittest. test=develop
* test=develop, fix unittest
* test=develop, fix unittest
* test=develop, fix unittest
* test=develop, fix bug
* fix unittest. test=develop
* fix unittest dx. test=develop
* tmp fix! for test... test=develop
* reduce tmp, test=develop
* test=develop, reduce tmp
* fix broadcast unittest. test=develop
* fix format. test=develop
* refine code. test=develop
* refine code. test=develop
* refine GetDoubleGradSafeTensor. test=develop
* fix format. test=develop
-
Committed by qingqing01
test=develop
-
- 19 May, 2019 2 commits
-
-
Committed by Zeng Jinle
-
Committed by Kaipeng Deng
-
- 18 May, 2019 3 commits
-
-
Committed by lvmengsi
add elementwise_sub_grad_grad op for backward of backward calculation
-
Committed by jiaqi
* fix data_feed_desc.py example run error test=develop test=document_preview
* fix data_feed_desc.py example display error test=develop test=document_preview
* update API.spec for DataFeedDesc test=develop test=document_preview
-
Committed by jiaqi
* examples use code-block in dataset.py test=develop test=document_preview
* add QueueDataset example test=develop test=document_preview
-
- 17 May, 2019 7 commits
-
-
Committed by chengduo
* add record_event test=develop
* remove csp test=develop
-
Committed by jiaqi
test=develop
-
Committed by Yan Xu
* add var grad hook test=develop
-
Committed by Jiabin Yang
* test=develop, add gradient sort backward strategy
* test=develop, fix test by adding FLAGS_cudnn_deterministic on new tests
* test=develop, fix memory leak in dygraph mode
* test=develop, fix memory leak in dygraph mode
* test=develop, polish code
* test=develop, polish code
* test=develop, polish code
-
Committed by Qiao Longfei
* add cache_update_mutex_ for operator
-
Committed by Jiabin Yang
-
Committed by Bai Yifan
-
- 16 May, 2019 7 commits
-
-
Committed by zhaoyuchen2018
* improve gru unit performance. refine code test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
* Add conditional compile for gru opt. Do not enable gru opt if compute capability < 700 test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
* refine code. test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
-
Committed by liuwei1031
* improve the API Sample of DataFeeder, memory_optimize and release_memory, test=develop
* update API.spec, test=develop, test=document_preview
* tweak the code format of feed API, test=develop
* update API.spec, test=develop
* improve doc for DataFeeder and default_main_program, test=develop
-
Committed by guru4elephant
add inductive shape index
-
Committed by Zeng Jinle
-
Committed by chengduo
Refine the Executor when the num_thread=1
-
Committed by Jie Fang
* init auto loss scaling test=develop
* change API.spec
* change ifelse to switch and use reduce_sum to optimize checking isfinite test=develop
* Remove redundant code test=develop
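The auto loss scaling commit above can be pictured with the following plain-Python sketch of dynamic loss scaling for mixed-precision training. It is a generic version of the technique, not the code from this commit: the class name `DynamicLossScaler`, its parameters, and its defaults are hypothetical. It does reflect the commit's hint that finiteness is checked on a reduced sum rather than per element.

```python
import math


class DynamicLossScaler:
    """Hypothetical helper: grow the loss scale after a run of finite
    steps, shrink it and skip the step when gradients overflow."""

    def __init__(self, init_scale=2.0 ** 15, incr_every_n_steps=1000,
                 incr_ratio=2.0, decr_ratio=0.5):
        self.scale = init_scale
        self.incr_every_n_steps = incr_every_n_steps
        self.incr_ratio = incr_ratio
        self.decr_ratio = decr_ratio
        self.good_steps = 0

    def update(self, scaled_grads):
        """Return unscaled gradients, or None if the step must be skipped."""
        # One isfinite check on the sum: any inf or nan element propagates
        # into the total. This is conservative, since a sum of very large
        # finite values can itself overflow.
        total = sum(scaled_grads)
        if not math.isfinite(total):
            self.scale = max(self.scale * self.decr_ratio, 1.0)
            self.good_steps = 0
            return None

        # Unscale with the scale that was used for this step.
        unscaled = [g / self.scale for g in scaled_grads]
        self.good_steps += 1
        if self.good_steps >= self.incr_every_n_steps:
            self.scale *= self.incr_ratio
            self.good_steps = 0
        return unscaled


scaler = DynamicLossScaler(init_scale=8.0, incr_every_n_steps=2)
first = scaler.update([8.0, 16.0])               # finite: gradients are unscaled
skipped = scaler.update([float("inf"), 1.0])     # overflow: step skipped, scale halved
scale_after_overflow = scaler.scale
scaler.update([4.0])
scaler.update([4.0])                             # two finite steps in a row: scale doubles
print(first, skipped, scale_after_overflow, scaler.scale)
```

In a training loop the loss would be multiplied by `scaler.scale` before the backward pass, and the optimizer step would be applied only when `update` returns gradients.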
-