Commits · 3df38f5cdd0866c1e78f1c2674d3d6cf3166d35f · 机器未来 / Paddle

10 Jan, 2020 · 2 commits

[cherry-pick] Add FC padding, ernie test unit and layernorm parallel (#22198) · 3df38f5c

Committed by GaoWei8 on Jan 10, 2020

* Optimize the kernel implementation of layernorm with openmp (#20895)

* Add ernie c++ inference test (#21015)

* Add ernie unit test
test=develop

* Add ernie unit test
test=develop

* Add ernie unit test
test=develop

* remove ngraph

* optimize gpu test
test=develop

* optimize codes
test=develop

* fix cmake fails on inference_download_and_uncompress (#21185)

* solve cmake fails on inference_download_and_uncompress
test=develop

* solve cmake fails on inference_download_and_uncompress
test=develop

* Add fc padding to improve mkl GEMM's performance when N and K are multiple of 128. (#20972)

* Add fc padding to solve mkl performance
test=develop

* fix gpu pass and error information
test=develop

* fix fc_fuse_pass_test
test=develop

* fix error information
test=develop

* fix error information
test=develop

* fix name and add fc op padding test
test=develop

* fix attributes
test=develop

* optimize fc padding
test=develop

* fix test
test=develop

* Polish the codes of fc when needs padding (#21378)

test=develop

* Add ernie large c++ inference test (#21365)

* add ernie-large test
test=develop

* add ernie large c++ inference test
test=develop

* Modify padding strategy: remove weight copy in fc padding (#21650)

test=develop

* optimize fc jit (#21878)

test=develop
Co-authored-by: Yihua Xu <yihuaxu@hotmail.com>

3df38f5c
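
The FC padding item above gives only the trigger condition (N and K both multiples of 128), so the following is a minimal sketch of the general idea rather than Paddle's implementation: the matrices are stored with a slightly larger row stride (leading dimension) so that rows no longer start at addresses spaced by a multiple of 128 elements, while the GEMM result stays the same. The pad width `PAD = 4` and the helper names `needs_padding`, `pad_rows`, `gemm` and `fc_matmul` are assumptions made for illustration, and the later change that removes the weight copy is not modeled.

```python
PAD = 4  # hypothetical padding width, chosen only for illustration


def needs_padding(n, k):
    """Condition named in the commit title: N and K are both multiples of 128."""
    return n % 128 == 0 and k % 128 == 0


def pad_rows(flat, rows, cols, ld):
    """Copy a dense row-major rows x cols matrix into storage with row stride ld >= cols."""
    out = [0.0] * (rows * ld)
    for r in range(rows):
        out[r * ld:r * ld + cols] = flat[r * cols:(r + 1) * cols]
    return out


def gemm(m, n, k, a, lda, b, ldb):
    """C = A @ B for row-major A (m x k, stride lda) and B (k x n, stride ldb)."""
    c = [0.0] * (m * n)
    for i in range(m):
        for p in range(k):
            a_ip = a[i * lda + p]
            for j in range(n):
                c[i * n + j] += a_ip * b[p * ldb + j]
    return c


def fc_matmul(m, n, k, x, w):
    """X (m x k) times W (k x n); use padded strides only when the condition holds."""
    if not needs_padding(n, k):
        return gemm(m, n, k, x, k, w, n)
    x_pad = pad_rows(x, m, k, k + PAD)
    w_pad = pad_rows(w, k, n, n + PAD)
    return gemm(m, n, k, x_pad, k + PAD, w_pad, n + PAD)
```

The padded and unpadded paths read the same logical elements in the same order, so they return identical results; only the memory layout seen by the GEMM routine changes.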

fix multi-thread error of fc_gru_fuse_pass.cc, test=develop (#21841) (#22185) · e8e12499

Committed by 石晓伟 on Jan 10, 2020

* fix multi-thread error of fc_gru_fuse_pass.cc, test=develop

* export FLAGS and GLOG symbols, test=develop

e8e12499

08 Jan, 2020 · 1 commit

Fix multi-threads memory out of bounds error for passes (#21920) (#22132) · 835201bf

Committed by liu zhengxi on Jan 08, 2020

* fix seqconv_eltadd_relu pass during multi-threads predictor, test=develop

* fix attention_lstm_fuse_pass during multi-threads inference, test=develop

* fix embedding_fc_lstm_fuse_pass during multi-threads inference, test=develop

* fix fc_lstm_fuse_pass during multi-threads inference, test=develop

* fix seq_concat_fc_fuse_pass during multi-threads inference, test=develop

835201bf

07 Jan, 2020 · 1 commit

fix trt calib not working bug, test=develop (#21934) (#22110) · 5a611afd

Committed by Pei Yang on Jan 07, 2020

5a611afd

09 Dec, 2019 · 1 commit

Revert "CHERRY_PICK: TRT int8: refine trt int8 for dynamic range set (#21112) (#21449)" (#21619) · f7c629d9

Committed by Zhaolong Xing on Dec 09, 2019

This reverts commit 0473cdb8.

f7c629d9

02 Dec, 2019 · 1 commit

CHERRY_PICK: TRT int8: refine trt int8 for dynamic range set (#21112) (#21449) · 0473cdb8

Committed by Zhaolong Xing on Dec 02, 2019

0473cdb8

26 Nov, 2019 · 1 commit

[Cherry-pick 1.6] Fix dgc buffer illegal & reuse velocity & fix fuse (#21281) · 93c7f058

Committed by WangXi on Nov 26, 2019

93c7f058

25 Nov, 2019 · 1 commit

Add pre-condition check for fuse optimizer op pass (#21005) (#21305) · 9f004548

Committed by Chen Weihang on Nov 25, 2019

* add pre condition check for fuse optimizer op pass, test=develop

* add log & set init to zero, test=develop

* fix test_fuse_all_reduce_pass failed, test=develop

* polish details, test=develop

* refine PADDLE_ENFORCE & remove needless VLOG, test=develop

* refactor op check method, test=develop

9f004548

07 Nov, 2019 · 1 commit

[cherry-pick] fix squared_mat_sub_fuse_pass bug when elementwise_op input is... · e6ed6379

Committed by Wilber on Nov 07, 2019

[cherry-pick] fix squared_mat_sub_fuse_pass bug when elementwise_op input is persistable param test=develop test=release/1.6 (#21044)

fix squared_mat_sub_fuse_pass bug when elementwise_op input is persistable param

e6ed6379

30 Oct, 2019 · 1 commit

[cherry-pick] Add support to gcc8, add docker env (#20892) · 6fb04e8a

Committed by liu zhengxi on Oct 30, 2019

* add support to gcc8, add docker env
* remove the warning issue

6fb04e8a

21 Oct, 2019 · 1 commit

[Cherry-pick 1.6] Fix DGC test and DGC nan bug (#20708) · 2378aa8a

Committed by WangXi on Oct 21, 2019

2378aa8a

20 Oct, 2019 · 1 commit

CHERRY_PICK 20720: fix ts_sort's bug, test=develop (#20726) · a7d0d888

Committed by Zhaolong Xing on Oct 20, 2019

test=release/1.6

a7d0d888

14 Oct, 2019 · 2 commits

add DisableGlogInfo() to AnalysisConfig, test=develop (#20581) (#20600) · fed1263c

Committed by Pei Yang on Oct 14, 2019

fed1263c

[cherry-pick] Add multihead matmul fuse pass(#20167) (#20592) · cefbcf77

Committed by zhaoyuchen2018 on Oct 13, 2019

* Add Multihead matmul fuse pass (#20167)

* Add multihead fuse pass for ernie opt

* Refine softmax

test=develop

* Refine cuda kernel

* Refine cuda version

* Refine cmake

test=develop

* refine header file

* refine test case and pass
* refine comments

* Delete useless code.

test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>

cefbcf77

28 Sep, 2019 · 1 commit

Follow comment of Merged QAT PR 18970 (#19979) · 9de67725

Committed by bingyanghuang on Sep 28, 2019

* Follow Wangzhen's comment in PR 18970, test=develop

* Review comments, test=develop

* Leave fake quantization around mul

test=develop

* Replace Fake with Real Quantized Mul

test=develop

* Fix bug in quantize placement pass

Nodes in the graph are now checked by type instead of by node name when they are to be marked for quantization. test=develop

9de67725

27 Sep, 2019 · 2 commits

Disable conv requant squash (#20041) · f5221ac1

Committed by joanna.wozna.intel on Sep 27, 2019

* Fix conv2d+dequantize squash for residual fusion

test=develop

* Disable conv-requant squash

test=develop

f5221ac1

codegen code for reconstruction (#19728) · c9ea317b

Committed by wangchaochaohu on Sep 27, 2019

* codegen code for reconstruction test=develop

* fix the cmake test=develop

* fix review advice test=develop

c9ea317b

26 Sep, 2019 · 1 commit

Add dtype for coalesce_tensor_op (#20016) · 101a2b61

Committed by chengduo on Sep 26, 2019

Add dtype for coalesce_tensor_op

101a2b61

19 Sep, 2019 · 2 commits

Fix conv2d+dequantize squash for residual fusion (#19545) · 3f1d0234

Committed by joanna.wozna.intel on Sep 19, 2019

* Fix conv2d+dequantize squash for residual fusion

test=develop

* Change condition

test=develop

3f1d0234

Add a pass to fuse fc+elementwise_add+layernorm (#19776) · 3cd985a6

Committed by Yiqun Liu on Sep 19, 2019

* Add fc_elementwise_layernorm_fuse pass and unittest.

* Add fused_fc_elementwise_layernorm op and its GPU kernel.
test=develop

* Apply fc_elementwise_layernorm_fuse_pass to GPU inference.

* Add the setting of attrs in the definition of binary_op.
test=develop

* Add comment.

* Implement the unittest.
test=develop

* Change the unittest name of layer_norm.
test=develop

3cd985a6
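
To make the fc + elementwise_add + layernorm entry above concrete, here is a minimal sketch of what such a fused op computes for a single row: a fully connected layer, a residual addition, then layer normalization over the output dimension. The function names, the nested-list data layout and the `eps` default are assumptions for illustration; this is not the GPU kernel added by the commit.

```python
import math


def fc(x, w, b):
    """x: k floats, w: k x n nested list, b: n floats. Returns x @ w + b."""
    return [sum(x[p] * w[p][j] for p in range(len(x))) + b[j]
            for j in range(len(b))]


def layer_norm(v, scale, bias, eps=1e-5):
    """Normalize v to zero mean and unit variance, then apply scale and bias."""
    mean = sum(v) / len(v)
    var = sum((e - mean) ** 2 for e in v) / len(v)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(e - mean) * inv_std * s + b for e, s, b in zip(v, scale, bias)]


def unfused(x, w, b, y, scale, bias):
    """Three separate steps, as the graph looks before the fuse pass."""
    h = fc(x, w, b)
    h = [a + r for a, r in zip(h, y)]
    return layer_norm(h, scale, bias)


def fused(x, w, b, y, scale, bias, eps=1e-5):
    """Same math in one function: accumulate fc + residual, then normalize."""
    n = len(b)
    h = [0.0] * n
    for j in range(n):
        acc = b[j] + y[j]
        for p in range(len(x)):
            acc += x[p] * w[p][j]
        h[j] = acc
    mean = sum(h) / n
    var = sum((e - mean) ** 2 for e in h) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(h[j] - mean) * inv_std * scale[j] + bias[j] for j in range(n)]
```

Both paths produce the same values up to floating-point rounding; the benefit of fusing is fewer intermediate tensors and kernel launches, which this sketch does not measure.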

18 Sep, 2019 · 2 commits

fix gc bug in controlflow ops, test=develop (#19827) · 3fd3b663

Committed by Zeng Jinle on Sep 18, 2019

3fd3b663

[Bug fix] Disable memory reuse on feeded variables (#19835) · db26de83

Committed by Zeng Jinle on Sep 18, 2019

* fix memory reuse bug on feeding variables, test=develop

* add comments to reference count members, test=develop

db26de83

16 Sep, 2019 · 2 commits

Fix warning info of build_strategy (#19805) · 82814970

Committed by chengduo on Sep 16, 2019

* fix warning info
test=develop

* fix bug of all_reduce_deps_pass
test=develop

82814970

Enhance fc_fuse_pass to enable fusing relu to fc_op (#19733) · c67c8758

Committed by Yiqun Liu on Sep 16, 2019

* Refine the codes related to fc op.

* Add GPU implementation for fc functor.

* Apply fc_fuse_pass in GPU inference.
test=develop

* Change the cmake for fc op.

* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.

* Add an attribute to set the activation type in fc_op.

* Enhance the unittest of fc_op.
test=develop

* Remove the declaration of FCOpGrad back to the header file.
test=develop

* Set default value for newly added arguments in test_fc_op.
test=develop

* Enhance fc_fuse_pass to enable fusing relu.

* Allow print the shapes of var_desc in graph.
test=develop

* Enhance fc_fuse_pass_tester.

* Remove the use of PADDLE_ENFORCE.
test=develop

* Correct the number of ops after fusing.
test=develop

* Fix a typo.
test=develop

* Set activation_type to null when there is no relu in fc.
test=develop

* Refine fc_fuse_pass's codes.

* Enable the set of shape for tensor.

* Refine repeated_fc_relu_pass and add unittest.
test=develop

c67c8758

13 Sep, 2019 · 1 commit

Open fuse all reduce option (#19765) · 056fdedd

Committed by chengduo on Sep 13, 2019

* Open fuse all reduce op
test=develop

* Add Fuse optimization op log

* Add log in fuse_optimizer op pass and fuse all_reduce op pass

* replace with boost::optional<bool>
test=develop

* Polish code
test=develop

* fix code coverage
test=develop

056fdedd

11 Sep, 2019 · 3 commits

Open fuse broadcast option (#18833) · e506c99c

Committed by chengduo on Sep 11, 2019

* fix vlog level and fuse option type
test=develop

e506c99c

Implement the GPU kernel of fc operator (#19687) · a65c728e

Committed by Yiqun Liu on Sep 11, 2019

* Refine the codes related to fc op.

* Add GPU implementation for fc functor.

* Apply fc_fuse_pass in GPU inference.
test=develop

* Change the cmake for fc op.

* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.

* Add an attribute to set the activation type in fc_op.

* Enhance the unittest of fc_op.
test=develop

* Remove the declaration of FCOpGrad back to the header file.
test=develop

* Set default value for newly added arguments in test_fc_op.
test=develop

a65c728e

Enable fused_all_reduce_op_handle support GPU and CPU Gradients (#19418) · 5866a7a5

Committed by chengduo on Sep 11, 2019

* Enable fused_all_reduce_op_handle support GPU and CPU Gradients

5866a7a5

06 Sep, 2019 · 1 commit

codegen for fused elementwise operation (#19520) · ed8f44ea

Committed by wangchaochaohu on Sep 06, 2019

* test=develop codegen for fused elementwise operation

* fix test=develop

ed8f44ea

04 Sep, 2019 · 1 commit

Enable ngraph through build_strategy (#19266) · a3a4b6e5

Committed by baojun on Sep 04, 2019

* enable ngraph through build_strategy test=develop

* add unittest test=develop

* put use_ngraph unconditional test=develop

* remove paddle_enforce test=develop

* remove paddle_enforce test=develop

* fix copyright test=develop

* limit for ngraph only test=develop

a3a4b6e5

03 Sep, 2019 · 2 commits

refine PADDLE_ENFORCE codes for unify PADDLE_ASSERT_MSG (#19603) · 75d15719

Committed by Tao Luo on Sep 03, 2019

test=develop

75d15719

Add a pass to enable the use of cudnn (#19346) · c5548178

Committed by Yiqun Liu on Sep 03, 2019

* Add an interface to enable cudnn for inference.

* Add cudnn_placement_pass.
test=develop

* Set the default value of cudnn_enabled_op_types to null.
test=develop

* Write the common basic class, placement_pass_base, to refine the codes.
test=develop

* Call EnableCUDNN in unittest.
test=develop

* Refine cudnn_placement_pass tester.

* Enable the testing of cudnn_placement_pass in inference's unittest.
test=develop

* Add the check of op kernels.
test=develop

c5548178

30 Aug, 2019 · 1 commit

Add a pass to replace dropout_op with scale_op when is_test is true (#19297) · fcec365d

Committed by Yiqun Liu on Aug 30, 2019

* Add simplify_with_basic_ops_pass to replace dropout_op with scale_op when is_test is true.
test=develop

* Delete dropout_op directly when upscale_in_train is true.
test=develop

* Improve the debug string, adding the print of op_desc information.

* Fix the case when dropout's input x is reused as the next op's output.

* Add the pass to inference.
test=develop

* Change the log level.
test=develop

* Add unittest for inplace case.

* Add comment to explain the pass.

* Apply the pass for CPU inference.
test=develop

* Fix the typo.
test=develop

* Add the check of AttrType.
test=develop

fcec365d
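
The dropout entry above describes a graph rewrite, which can be sketched on a toy op list. At inference time dropout is deterministic: with the `downgrade_in_infer` implementation the output is the input times `1 - dropout_prob`, and with `upscale_in_train` the output equals the input, so the op can become a scale op or be removed. The dict-based op representation and the function name `simplify_dropout` are assumptions for illustration, not Paddle's IR or pass API, and the in-place case mentioned in the entry is not handled.

```python
def simplify_dropout(ops):
    """Rewrite test-mode dropout ops in a toy, linear op list.

    Each op is a dict with 'type', 'input', 'output' and 'attrs'.
    """
    result = []
    rename = {}  # output of a removed dropout -> variable that replaces it
    for op in ops:
        # Re-point the input if an earlier dropout producing it was removed.
        op = dict(op, input=rename.get(op["input"], op["input"]))
        attrs = op.get("attrs", {})
        if op["type"] == "dropout" and attrs.get("is_test"):
            if attrs.get("dropout_implementation") == "upscale_in_train":
                # Identity at inference: drop the op and forward its input.
                rename[op["output"]] = op["input"]
                continue
            # downgrade_in_infer: out = x * (1 - dropout_prob)
            op = {
                "type": "scale",
                "input": op["input"],
                "output": op["output"],
                "attrs": {"scale": 1.0 - attrs["dropout_prob"]},
            }
        result.append(op)
    return result
```

For example, an `fc -> dropout -> relu` chain with `dropout_prob = 0.25` becomes `fc -> scale(0.75) -> relu` under `downgrade_in_infer`, and `fc -> relu` under `upscale_in_train`.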

28 Aug, 2019 · 1 commit

Fix the correctness of async mode at distributed training (#18863) · 65c73684

Committed by tangwei12 on Aug 28, 2019

* fix correctness of the communicator

* fix a bug in send thread when sending var context is empty, test=develop

* add lookup_table_prefetch_op and prefetch optimize, test=develop

* remove remote prefetch GPU supported

* word2vec force with CPU, test=develop

* test dist remote lookup table force with CPU, test=develop

65c73684

27 Aug, 2019 · 1 commit

Add conv dequant squash for int8 (#18905) · 2e3ec66b

Committed by joanna.wozna.intel on Aug 27, 2019

2e3ec66b

23 Aug, 2019 · 1 commit

remove unused conv_elementwise_add2_act_fuse.cc (#19344) · c82280e4

Committed by Tao Luo on Aug 23, 2019

test=develop

c82280e4

21 Aug, 2019 · 1 commit

Add generalized Conv+Activation MKLDNN fuse pass creation Part2 (#19237) · 97d1db18

Committed by Adam on Aug 21, 2019

* Add generalized Conv+Activation MKLDNN fuse pass creation Part2
test=develop

* Undefined behaviour of GetAttrIfExists<> FIX
test=develop

97d1db18

19 Aug, 2019 · 3 commits

Fix BUG: Mask RCNN inference diff When using AnalysisPredictor. (#19213) · 76c95af0

Committed by Zhaolong Xing on Aug 19, 2019

* fix mask rcnn bug:
1. affine channel fuse (diff)
2. condition block op (memory leak)
3. merge lod tensor op (diff)
4. memory optim (diff)
test=develop

* fix ci about PADDLE_ENFORCE
fix merge lod infer op ut
test=develop

76c95af0

fix compilation issue in windows vs2017 (#19183) · 50582071

Committed by liuwei1031 on Aug 19, 2019

* fix compilation issue in windows vs2017, test=develop

* fix gtest lib not found issue, test=develop

50582071

remove the warning for reminding user to avoid using the OriginProgram method,... · 5368b365

Committed by juncaipeng on Aug 19, 2019

remove the warning for reminding user to avoid using the OriginProgram method, test=develop (#19244)

This log information may annoy users who don't need to care about it.

5368b365
