Commits · 2281ebf0f3c50a3ba5398632a3e3bc344ca634f2 · PaddlePaddle / Paddle

22 May, 2019 1 commit

Enable the convolution/relu6(bounded_relu) fusion for FP32 on Intel platform. (#17130) · 2281ebf0

Committed by guomingz on May 22, 2019

* Relu6 is the bottleneck op for Mobilenet-v2. Since mkldnn supports the conv/relu6 fusion, we implement this fusion via a cpass. Because int8 support for this fusion will only arrive in MKLDNN v0.20, this PR focuses on the fp32 optimization.

The table below shows the benchmark (FPS) measured on skx-8180 (28 cores):
Batch size | with fusion | without fusion
-- | -- | --
1 | 214.7 | 53.4
50 | 1219.727 | 137.280

test=develop

* Fix the format issue

test=develop

* Add the missing nolint comments.

test=develop

* Fix the typos.

test=develop

* Register the conv_brelu_mkldnn_fuse_pass for the MKLDNN engine.

test=develop

* Adjust the indentation.

test=develop

* Add the test_conv_brelu_mkldnn_fuse_pass case.

test=develop

* Slightly update the code per Baidu comments.
Embed the parameter definitions in the code.
That will make the code easier to understand.

test=develop

2281ebf0
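
For reference, relu6 (bounded relu with an upper bound of 6) clamps each activation to the range [0, 6]. The minimal Python sketch below shows that definition and derives the speedup implied by the benchmark table in the commit message above. The names `relu6` and `speedup` are illustrative only and are not taken from the PR.

```python
def relu6(x):
    """Bounded ReLU with an upper bound of 6: min(max(x, 0), 6)."""
    return min(max(x, 0.0), 6.0)


def speedup(with_fusion_fps, without_fusion_fps):
    """Ratio of throughput with fusion to throughput without it."""
    return with_fusion_fps / without_fusion_fps


# FPS figures copied from the benchmark table: batch size -> (with, without).
benchmark = {1: (214.7, 53.4), 50: (1219.727, 137.280)}
speedups = {bs: speedup(w, wo) for bs, (w, wo) in benchmark.items()}

for bs, ratio in speedups.items():
    print(f"batch size {bs}: {ratio:.2f}x")
```

By these figures the fusion is roughly 4x faster at batch size 1 and close to 9x faster at batch size 50.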

21 May, 2019 6 commits

Add LAMB Optimizer support (#17489) · f9796b12

Committed by Yibing Liu on May 21, 2019

* Add LAMB optimizer

* Expose LAMB Optimizer's APIs

test=develop, test=document_preview

* Cleanup code & doc

test=develop, test=document_preview

* Update lamb optimizer's formula

test=develop

f9796b12
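
As background, a LAMB step combines an Adam-style update with a layer-wise trust ratio. The sketch below follows the general form of the algorithm and is not necessarily the exact formula used in this commit: bias correction is omitted, and the function name `lamb_step` and its default hyperparameters are assumptions chosen for illustration.

```python
import math


def lamb_step(w, g, m, v, lr=0.001, beta1=0.9, beta2=0.999,
              eps=1e-6, weight_decay=0.01):
    """One LAMB update for a single layer; all vectors are lists of floats."""
    # Adam-style first and second moment estimates (no bias correction here).
    m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]
    v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, g)]
    # Adam direction plus weight decay applied to the weights.
    update = [mi / (math.sqrt(vi) + eps) + weight_decay * wi
              for mi, vi, wi in zip(m, v, w)]
    # Layer-wise trust ratio ||w|| / ||update||; fall back to 1 if either is 0.
    w_norm = math.sqrt(sum(wi * wi for wi in w))
    u_norm = math.sqrt(sum(ui * ui for ui in update))
    trust = w_norm / u_norm if w_norm > 0 and u_norm > 0 else 1.0
    new_w = [wi - lr * trust * ui for wi, ui in zip(w, update)]
    return new_w, m, v


# Hypothetical two-element layer, one step from zero-initialized moments.
w, m, v = lamb_step([3.0, 4.0], [1.0, 1.0], [0.0, 0.0], [0.0, 0.0],
                    lr=0.1, weight_decay=0.0)
```

The trust ratio rescales the step so that its size is proportional to the norm of the layer's weights, which is the feature that distinguishes LAMB from plain Adam.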

Enabled ngraph elementwise max operator (#17517) · 99ab5712
Committed by mozga-intel on May 21, 2019

99ab5712
remove unused SERIAL compiler option (#17500) · 3d19f44a
Committed by Tao Luo on May 21, 2019
```
test=develop
```
3d19f44a

Enable abs operator for a ngraph test=develop (#17436) · 1eb15175
Committed by mozga-intel on May 20, 2019

1eb15175

fix security bugs : (#17464) · ba70cc49

Committed by liuwei1031 on May 21, 2019

http://newicafe.baidu.com:80/issue/PaddleSec-33/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-28/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-25/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-24/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-21/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-20/show?from=page

test=develop

ba70cc49

add quant_dequant_moving_avg_max_abs op (#17480) · ff7f911b
Committed by Zhaolong Xing on May 21, 2019
```
* add quant_dequant_moving_avg_max_abs op
test=develop

* add more note for quantdequant op
test=develop
```
ff7f911b
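
To illustrate what a quantize-dequantize op with a moving-average abs-max scale does, here is a minimal pure-Python sketch. The accumulator/state update rule, the default `moving_rate`, and the function names are assumptions made for illustration and are not taken from the commit. It also assumes the scale is strictly positive.

```python
def update_moving_avg_scale(x, accum, state, moving_rate=0.9):
    """Update the running abs-max estimate and return (scale, accum, state)."""
    cur = max(abs(xi) for xi in x)           # abs-max of the current batch
    state = moving_rate * state + 1.0        # decayed count of batches seen
    accum = moving_rate * accum + cur        # decayed sum of abs-max values
    return accum / state, accum, state


def quant_dequant(x, scale, bit_length=8):
    """Quantize x to signed integers, then map straight back to floats."""
    bin_cnt = (1 << (bit_length - 1)) - 1    # 127 for 8 bits
    out = []
    for xi in x:
        clipped = max(-scale, min(scale, xi))    # clip to [-scale, scale]
        q = round(clipped / scale * bin_cnt)     # integer quantization level
        out.append(q * scale / bin_cnt)          # dequantized value
    return out


x = [0.5, -1.0, 0.25]
scale, accum, state = update_moving_avg_scale(x, accum=0.0, state=0.0)
out = quant_dequant(x, scale)
```

The output keeps the float type of the input but only takes values on the quantization grid, which lets training observe quantization error without changing the data type.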

20 May, 2019 2 commits

Optimize communicator flags (#17494) · 287de41c
Committed by Qiao Longfei on May 20, 2019
```
* optimize communicator flag

* change flags in init py test=develop
```
287de41c

Double backward elementwise div (#17416) · 10b23a72

Committed by lvmengsi on May 20, 2019

* double backward, elementwise_div

* fix dx empty. test=develop

* bug fix (#17392)

fix secure bug

* Enable stack operator for a Ngraph, test=develop (#17406)

* fix sqrt_grad_grad unittest. test=develop (#17410)

* fix sqrt_grad_grad unittest. test=develop

* disable sqrt_grad_grad unittest. test=develop

* test=develop, fix unittest

* test=develop, fix unittest

* test=develop, fix unittest

* test=develop, fix bug

* fix unittest. test=develop

* fix unittest dx. test=develop

* tmp fix! for test... test=develop

* reduce tmp, test=develop

* test=develop, reduce tmp

* fix broadcast unittest. test=develop

* fix format. test=develop

* refine code. test=develop

* refine code. test=develop

* refine GetDoubleGradSafeTensor. test=develop

* fix format. test=develop

10b23a72

19 May, 2019 1 commit
- fix recurrent fwd bug when no backward and scope clear (#17460) · 3d4e8268
  Committed by Zeng Jinle on May 19, 2019
  
  3d4e8268
18 May, 2019 1 commit
- support elementwise_sub double backward (#17476) · 977e9fcb
  Committed by lvmengsi on May 18, 2019
```
add elementwise_sub_grad_grad op for backward of backward calculation
```
  977e9fcb
17 May, 2019 3 commits
- Add record event And remove CSP (#17447) · 5a6ab380
  Committed by chengduo on May 17, 2019
```
* add record_event
test=develop

* remove csp
test=develop
```
  5a6ab380
- polish parallel dygraph code (#17164) · 02175555
  Committed by Yan Xu on May 17, 2019
```
* add var grad hook test=develop
```
  02175555
- fix assert,test=develop (#17445) · 3a9ae28d
  Committed by Bai Yifan on May 17, 2019
  
  3a9ae28d
16 May, 2019 2 commits

Add conditional compile for gru opt (#17368) · b02f2aff

Committed by zhaoyuchen2018 on May 16, 2019

* improve gru unit performance.
refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Add conditional compile for gru opt

Do not enable the gru opt if the compute capability is < 700

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* refine code.

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

b02f2aff

fix recurrent_op,test=develop (#17433) · 712bfb17
Committed by Zeng Jinle on May 16, 2019

712bfb17

15 May, 2019 6 commits
- Enable stack operator for a Ngraph, test=develop (#17406) · 6ee6700f
  Committed by mozga-intel on May 15, 2019
  
  6ee6700f
- Optimize the sequence padding op (#17403) · 0823a7bc
  Committed by Krzysztof Binias on May 15, 2019
```
test=develop
```
  0823a7bc
- NGraph Added fill_zeros_like op test=develop (#17295) · 1ce7b45b
  Committed by baojun on May 14, 2019
  
  1ce7b45b
- NGraph Added dropout and dropout_grad to ngraph test=develop (#17320) · 91019652
  Committed by baojun on May 14, 2019
  
  91019652
- Ngraph Enable gather operator test=develop (#17296) · b1894807
  Committed by mozga-intel on May 14, 2019
  
  b1894807
- Double backward sqrt (#17387) · 4ef63101
  Committed by lvmengsi on May 15, 2019
```
* double backward sqrt

* refine unittest. test=develop

* refine test. test=develop

* remove alpha in unittest. test=develop
```
  4ef63101
14 May, 2019 6 commits

Double backward reduce mean (#17372) · 5d1ac41b

Committed by lvmengsi on May 14, 2019

* test=develop, double backward reduce_mean

* add comment. test=develop

* fix format. test=develop

* rename GradGrad -> DoubleGrad. test=develop

* fix op_use_default_grad_op_maker.spec. test=develop

5d1ac41b

enhance generate mask labels, test=develop (#17380) · 0cae5a36
Committed by jerrywgz on May 14, 2019

0cae5a36
add elementwise_add_grad_grad op (#17366) · bd9bef5a
Committed by Kaipeng Deng on May 14, 2019
```
* add elementwise_add_grad_grad op. test=develop

* use defined GradMaker. test=develop
```
bd9bef5a
add collect fpn proposals op,test=develop (#16074) · 1c6d0646
Committed by jerrywgz on May 14, 2019
```
* add collect fpn proposals op,test=develop
```
1c6d0646

support fc_op double grad (#17317) · 60be66e2

Committed by Kaipeng Deng on May 14, 2019

* add double grad for mul_op. test=develop

* fix format. test=develop

* fix format. test=develop

* fix format. test=develop

* refine code. test=develop

* remove setzero. test=develop

* fix dx/dy init bug. test=develop

* fix format. test=develop

60be66e2

Fix the uninitialized gru_value.output_value. (#17197) · 08635993
Committed by liuwei1031 on May 14, 2019
```
test=develop
```
08635993

13 May, 2019 4 commits

Optimize the computing kernel of sequence_reverse operator (#17349) · 218d8d8f

Committed by Yihua Xu on May 13, 2019

* Optimize the computing kernel of sequence_reverse operator.

test=develop

* Clean code

test=develop

* Fix for cpplint syntax checking.

test=develop

* Fix the compile warning issue.

test=develop

218d8d8f

Optimize the elementwise op using eigen (#15494) · dcda2023

Committed by Yiqun Liu on May 13, 2019

* Optimize the elementwise op with CUDA kernels.
test=develop

* Support setting of attr in op config file.
test=develop

* Add support for setting dtype and initializer in config.
test=develop

* Save workspace.

* Add initializer "zeros".
test=develop

* Fix compiling error.

* Support using an existing file to initialize tensors in op_tester.

* Use eigen to optimize the elementwise_add/mul for the case that x and y have the same dims.
test=develop

dcda2023

add double grad for elementwise_mul op (#17255) · 8bae8590

Committed by Kaipeng Deng on May 13, 2019

* add double grad for elementwise_mul. test=develop

* remove comment. test=develop

* fix grad sum. test=develop

* fix for axis expand. test=develop

* add test for axis expand. test=develop

8bae8590

add double grad for square op (#17173) · 11d3a38f

Committed by Kaipeng Deng on May 13, 2019

* add double grad for square. test=develop

* format code. test=develop

* fix for grad sum. test=develop

* refine shape. test=develop

* refine extract. test=develop

11d3a38f
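
As a worked illustration of what "double grad" means for the square op: for out = x * x the first backward computes dx = 2 * x * dout, and the double backward differentiates that expression again with respect to both x and dout. The sketch below checks the resulting formulas against central finite differences. All function names and the sample values are illustrative and do not come from the Paddle source.

```python
def square_grad(x, dout):
    """First-order backward of out = x * x: dx = 2 * x * dout."""
    return 2.0 * x * dout


def square_double_grad(x, dout, ddx):
    """Backward of square_grad, where ddx is the gradient arriving at dx.

    Returns (gradient w.r.t. x, gradient w.r.t. dout).
    """
    return 2.0 * dout * ddx, 2.0 * x * ddx


def numeric_check(x, dout, ddx, h=1e-6):
    """Central finite differences of ddx * square_grad(x, dout)."""
    def f(a, b):
        return ddx * square_grad(a, b)

    d_x = (f(x + h, dout) - f(x - h, dout)) / (2 * h)
    d_dout = (f(x, dout + h) - f(x, dout - h)) / (2 * h)
    return d_x, d_dout


# Arbitrary sample point chosen for the check.
analytic = square_double_grad(1.5, -0.7, 0.3)
numeric = numeric_check(1.5, -0.7, 0.3)
```

The same pattern (write the first backward as a function, then differentiate it with respect to each of its inputs) applies to the other double-backward commits in this log.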

10 May, 2019 4 commits

Add Where Op(#16793) · d4b67e16
Committed by zhoukunsheng on May 10, 2019

d4b67e16
Add Diag Op(#17027) · 1bfff020
Committed by zhoukunsheng on May 10, 2019

1bfff020

improve gru unit performance. (#16338) · 8a2caacd

Committed by zhaoyuchen2018 on May 10, 2019

refine code

fuse cublas calls and kernels into one cuda kernel.

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

8a2caacd

Double backward of conv2d. (#17211) · e32c9888

Committed by qingqing01 on May 10, 2019

* Add conv2d_grad_grad_op
* Extract the cuDNN conv algo searching code into conv_cudnn_helper.h.
    - Now use it in conv2d_grad_grad.
    - Will simplify the searching code in conv2d and conv2d_grad in the next PR.
* Enhance and fix a bug in the unit testing of gradient_checker.
* Support fetching empty variables, which return None in Python.

e32c9888

09 May, 2019 2 commits
- follow comments,test=develop (#17273) · fff270ea
  Committed by Zeng Jinle on May 09, 2019
  
  fff270ea
- Mod floordiv (#17251) · 4292bd86
  Committed by zhoukunsheng on May 09, 2019
```
* test=develop
add elementwise_mod and elementwise_floordiv, fix equation problem in elementwise_mod
```
  4292bd86
08 May, 2019 2 commits

modified formula for Lrn (#17281) · 9ed4aaad
Committed by xiaoting on May 08, 2019
```
* modified formula for lrn

test=develop

* modified api.spec

test=develop
```
9ed4aaad

Refine elementwise kernel. (#16952) · 792443ef

Committed by zhaoyuchen2018 on May 08, 2019

* Refine elementwise kernel.

Add a simple cuda kernel if grad x and y both exist
Use 2D block cuda kernel to do broadcast.

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* refine code.

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* refine code.

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

792443ef

PaddlePaddle / Paddle synced successfully about 1 year ago