Commits · 4dc4d5fc45efd20d62ca9aebc94343ee7c5f8f30 · BaiXuePrincess / Paddle

20 Oct, 2022 1 commit
- log only if > 0 (#47181) · d6208aad
  Authored by Sylwester Fraczek on Oct 20, 2022
19 Oct, 2022 2 commits
- Support stream overlap for c_allreduce_sum (#47030) · d00b7d83
  Authored by Ruibiao Chen on Oct 19, 2022
```
* Support stream overlap for c_allreduce_sum

* Test CI

* Add notes

* Add SingleStreamGuard for BuildOpFuncList
```
- [Dy2St]Fix recurrent op eager deletion pass error in dy2st (#47105) · 94132190
  Authored by WangZhen on Oct 19, 2022
```
* Fix recurrent op eager deletion pass error in dy2st

* Polish code

* Refine error message
```
18 Oct, 2022 2 commits

Merge layernorm trt fuse (#46320) · 5e9f491e

Authored by Wang Bojun on Oct 18, 2022

* first version, accuracy corrected

* disable debug print

* use blockReduceSum in phi

* add UT

* add opCompat

* code style

* code refine

* bug fix

* code refine

* test fix

* bugfix

* codesytle fix

* code style

* code-style

* code-style

* code-style

FC + activation fuse passes (#45183) · b7a23adb

Authored by Sławomir Siwek on Oct 18, 2022

* git

* style

* leave default relu in kernel

* style

* cleanup FCMKLDNN pattern

* merge conflicts

* update develop

* update develop

* add const

* rename to oneDNN and adjust attributes

* whitespace

17 Oct, 2022 4 commits
- Revert "add common subexpression elimination (#44386)" (#47062) · 7c6835ca
  Authored by hong on Oct 17, 2022
```
This reverts commit 166ff39a.
```
- Layernorm shift partition enhance (#46816) · 9e08633c
  Authored by Wang Bojun on Oct 17, 2022
```
* first version of ln_s_p with s>0

* refine and UT

* pass opt draft

* pass opt

* code refine

* code-style

* bug fix

* fix ci test

* code style
```
- fix for conv_bias_mkldnn_pass (#47037) · acbda3e4
  Authored by jakpiase on Oct 17, 2022
- skip ReplaceAllReduceOp in GraphtoBlock when nccl_ctxs_ is nullptr (#46911) · 2e7dc666
  Authored by pangyoki on Oct 17, 2022
```
* skip ReplaceAllReduceOp in GraphtoBlock when nccl_ctxs_ is nullptr

* update ut

* test_dist_allreduce_op failed

* fix test_dist_allreduce_op

* add ut

* fix nccl cpu compile

* fix
```
16 Oct, 2022 1 commit
- add common subexpression elimination (#44386) · 166ff39a
  Authored by ZeKai Zhou on Oct 16, 2022
13 Oct, 2022 2 commits

Fix quantize model deploy bugs when using MKLDNN (#45920) · 561fd8c8

Authored by yeliang2258 on Oct 13, 2022

* fix immutable op quantize bugs

* fix

* fix build bug

* fix test

* notest,test=inference

* fix ppyoloe acc drop bugs

* fix test

* fix test

* add test

* fix

* fix

* fix test

* fix refined name bug

* fix test

* bias fix

* fix matmul weight dequant bug

* re-ci

* fix tester

* fix test

* fix tester

* update weight dequantize func

* update code

* update test for converage

* update test

* update cmake

* update cmakelist

* update code

* rerun ci

* remove useless code

Add unsigned int8 scale propagation (#46378) · c72b3bfa

Authored by joanna.wozna.intel on Oct 13, 2022

* Add unsigned int8 propagation

* Add or modify unit tests

* Correct concat scale checking

* Apply review suggestions

* Corrections

12 Oct, 2022 1 commit
- remove all control_vars in IR graph (#46888) · bf1dc548
  Authored by weishengying on Oct 12, 2022
11 Oct, 2022 2 commits
- add logging to fc residual fuse pass (#46760) · 21668cb2
  Authored by Sylwester Fraczek on Oct 11, 2022
```
* add logging to fc residual fuse pass

* expand logging message to fc residual fuse pass

* Add test for fc residual not fusing with activation
```
- Remove LoDTensor using in fluid (Part 1) (#46663) · 940d8f25
  Authored by Chen Weihang on Oct 11, 2022
```
* remove using lodtensor part1

* polish history code format
```
10 Oct, 2022 3 commits
- Add fc residual pattern (#46757) · 0c789ae5
  Authored by Sylwester Fraczek on Oct 10, 2022
```
* fix fc pattern

remove use_bias
add residual input switch
fix references to pattern

* review fixes
```
- add function FindInputNameByVarName (#46759) · 8eaff62d
  Authored by Sylwester Fraczek on Oct 10, 2022
```
* Add methods that find input or output name by var name

* kind of bugfix - initialize variables

* ci fix

* review fixed
```
- [Paddle-TRT] support new quant format from slim (#46022) · 7987a905
  Authored by zhoutianzi666 on Oct 10, 2022
30 Sep, 2022 2 commits
- [IPU] paddle-inference support custom-ops (#45235) · a6b4bee3
  Authored by Allen Guo on Sep 30, 2022
```
* paddle-inference support custom-ops
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>

* fix tolower
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
```
- change mkldnn kernel layout, ALL_LAYOUT->ONEDNN (#46629) · abee2210
  Authored by HongyuJia on Sep 30, 2022
29 Sep, 2022 1 commit
- Remove calibration file path when deploy quantize model (#46283) · d71f1b3f
  Authored by yeliang2258 on Sep 29, 2022
```
* remove calibration file path

* remove useless code
```
28 Sep, 2022 3 commits

Remove the declaration of using Tensor in framework/tensor.h (#46432) · e12a905e

Authored by Chen Weihang on Sep 28, 2022

* remove needless using tensor

* remove needless using tensor

* resolve conflict

* replace tensor using

* fix format error

* revert needless changing

* fix rocm and npu compile error

* fix cinn compile error

* fix format error

* fix mkldnn format error

* fix mkldnn format error

* fix cinn compile error

* fix cinn compile error

* fix cinn compile error

* resolve conflict

Convert GradMergeAllReduceOpHandle in GraphToBlock (#46544) · 6a706e63
Authored by Ruibiao Chen on Sep 28, 2022
```
* Convert GradMergeAllReduceOpHandle in GraphToBlock

* Set FLAGS_CONVERT_GRAPH_TO_PROGRAM to False
```

remove const qualifier in function return (#46546) · 8c5b9cf8
Authored by Leo Chen on Sep 28, 2022

27 Sep, 2022 1 commit
- [Paddle Inference]support n lookup_tables fuse to embeddinglayernorm(3) (#46243) · 4d772144
  Authored by Wangzheee on Sep 27, 2022
```
* [Paddle Inference]support n lookup_tables fuse to embeddinglayernorm(3)
```
22 Sep, 2022 2 commits

[PHI] Migrate gelu kernels (#45596) · 567e2fc8

Authored by Sławomir Siwek on Sep 22, 2022

* gaussian random

* mkldnn to onednn renaming

* fix merge conflicts

* remove fluid code

* onednn renaming

* gelu fwd

* sort activations

* gelu gradient

* remove unused macros

* merge conflicts

* fix merge conflicts

* remove extra contraint from gelu op

convert grad_merge_all_reduce in graph to program (#46353) · 0a144ca1
Authored by Leo Chen on Sep 22, 2022

21 Sep, 2022 2 commits

Enable PaddleInference to use CINN. (#45009) · 3aa6bd57

Authored by Zhen Wang on Sep 21, 2022

* use cinn in the paddle inference

* fix some cmake errors

* Avoid division by zero in the arange_kernel.

* Avoid dynamic ops.

* Remove some useless codes.

* Use OpTransInfo to encapsulate some codes used in the build_cinn_pass.

residual_no_bias (#46129) · aa0e84e3
Authored by wenbin on Sep 21, 2022
```
* residual_no_bias

* comments

* more ut

* fix input
```

20 Sep, 2022 1 commit
- [Inference] fix preln_residual_bias_fuse_pass bug in TNT_small model (#46178) · bfee398b
  Authored by zhoutianzi666 on Sep 20, 2022
```
* fix preln_residual_bias_fuse_pass bug in TNT_small model 
```
19 Sep, 2022 1 commit

Fix wrong eigen header include (#46082) · 59a2a987

Authored by zyfncg on Sep 19, 2022

* fix wrong eigen header include

* fix complie bug

* fix nan_inf_utils_detail

* fix resource_manager

* fix conv_miopen_helper

13 Sep, 2022 1 commit
- add softmax infer kernel (#45955) · 01888482
  Authored by JingZhuangzhuang on Sep 13, 2022
```
* add softmax infer kernel
```
09 Sep, 2022 1 commit

[new-exe] convert fused_all_reduce_op_handle to program (#45774) · e755c07e

Authored by Leo Chen on Sep 09, 2022

* add operator<< for BuildStrategy

* add fake_coalesce

* fit allreduce mode for new_exe

* remove dubeg code

* follow comments

07 Sep, 2022 1 commit

Layernorm shift partition (#45736) · 960109af

Authored by wenbin on Sep 07, 2022

* first commit

* conver done

* correct format

* layernorm_shift_partition

* correct convert

* redefine plugin

* runable

* bug fix

* modify ShiftPartitionPattern

* correct

* add UT

* modify ut

* compile

* modify enforce

* modify UT

06 Sep, 2022 1 commit

[Paddle Inference] fix bugs in quant_conv2d_dequant_fuse_pass when weight is shared between ops (#45719) · ddc244d3

Authored by zhoutianzi666 on Sep 06, 2022

* fix_old_format

* fix bug in quant_conv2d_dequant

* fix bug in quant_conv2d_dequant

05 Sep, 2022 2 commits

New format quant model support for MKLDNN (#45416) · 4e4f4586

Authored by yeliang2258 on Sep 05, 2022

* support onnx format quantized model

* update code

* add test

* add test

* fix

* fix test

* fix cmake

* update code

* change scale file path to calibration file path

* update code

* update code

* fix build bug

* fix build bugs

* fix

* fix

fix bugs for vit attention pass (#45721) · b9d66e6b
Authored by feng_shuai on Sep 05, 2022
```
* fix: vit attention pass

* reflash CI
```

31 Aug, 2022 1 commit
- add del dropout op pass to jit pe enigne (#45439) · 46bc06b5
  Authored by Hui Zhang on Aug 31, 2022
```
* add del dropout op pass to jit pe enigne

* add delete dropout test
```
30 Aug, 2022 2 commits

Remove extra attribute in OpMaker (#44310) · fe321f9a

Authored by zyfncg on Aug 30, 2022

* add runtime config in phi

* add runtime attr for op desc and op

* fix no proto error

* adjust opdesc set_attr impl

* try to remove conv_op extra attrs

* add init runtime attr map

* change extra header path

* fix runtime_attr

* fix trace_op

* fix bug of pass

* fix merge conflict

* fix dygraph attrs

* fix bug of pass

* fix dygraph bug

* fix unittest module

* delete extra attr default

* fix dropout kernel

* polish code

* fix extra output of instance_norm

* fix merge confilct

* fix op_desc bug

* add extra attr in yaml for conv3d_transpose

* don't remove extra input and output

* fix save_inference_model

* fix bug of batch_norm

* revert some change

* polish log

* polish code

* add code comment
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

[Paddle-TRT] constant-folding (#45494) · 97f43a8e
Authored by zhoutianzi666 on Aug 30, 2022
```
add constant folding pass, for some model, it will get less latency;
```