Commits · 519e7426dd4bbf0b7134b0b59fd9db9cbb0c7102 · BaiXuePrincess / Paddle

15 Nov, 2022 · 2 commits

Added optimization pass for oneDNN layernorm kernel (#47782) · 519e7426
Committed by jakpiase on Nov 15, 2022
```
* optimization for ln

* fix

* added output to gpd

* added formatting

* fix
```

mkldnn directory cleanup (#47779) · 8a339d24

Committed by Sławomir Siwek on Nov 15, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid depedency


14 Nov, 2022 · 1 commit

fix squueze_transpose (#47911) · f50de679

Committed by yeliang2258 on Nov 14, 2022

11 Nov, 2022 · 2 commits

[IPU]: add model_runtime backend support in IPU (#47363) · 21b901cb

Committed by czr-gc on Nov 11, 2022

* feat(ipu): add model_runtime backend support in IPU.

* fix(ipu_executor): fix error message format.

* fix(ipu_executor): fix format.

* fix(ipu_executor): fix format again.

* fix(ipu_executor): fix format again.

* fix(ipu_executor): fix format again.


Generate static graph code for some ops by yaml (part3) (#47803) · 31f3f643

Committed by zyfncg on Nov 11, 2022

* generate static graph code for some ops by yaml

* remove deleted files

* update cmake

* update cmake

* udpate cmake


10 Nov, 2022 · 3 commits

[search && paddle inference]add roformer pass&&plugin novarlen version (#47523) · 0f3fb562
Committed by zhangxin81 on Nov 10, 2022
```
* add roformer pass&&plugin（novarlen）
```

skip_merge_layernorm (#47810) · 1c6013dd
Committed by wenbin on Nov 10, 2022
```
* skip_merge_layernorm

* add UT

* modify comments
```

Fuse multi transformer layer pass (#47541) · 1e3245a8
Committed by RichardWooSJTU on Nov 10, 2022
```
* add fuse_multi_transformer_layer_pass
```

08 Nov, 2022 · 3 commits

Migrate old C++ unit tests to Python framework (#47006) · 0c9f09b8

Committed by Sławomir Siwek on Nov 08, 2022

* softplus+activation

* fc + elementwise_add test refactored

* rename MKLDNN to OneDNN

* fc+activation tests refactored

* remove softplus ut

* whitespace

* whitespace

* codestyle

* codestyle

* add more cases to fc+act

* remove softplus+hard_sigmoid pass

* remove softplus + hard_sigmoid UT

* add approximate for gelu

* swish beta range

* new codestyle

* reduce number of tests

[Paddle Inference] allow fold fill_constant && allow nms3 into trt in int8 model (#47551) · c3a69111
Committed by zhoutianzi666 on Nov 08, 2022
```
* allow fold fill_constant && allow nms3 into trt in int8 model
* use unordered_map
* fix CI failing
```

Split quant (#47449) · 130db92a

Committed by Paulina Gacek on Nov 08, 2022

* Split kernel registered, tests for uint/int added

* Split quantized

* Split output scales calculated only once

* NearestInterp test fix reversed

* DequantizeOutputs corrected


07 Nov, 2022 · 2 commits

suqeeze2 + transpose2 fuse onednn (#47592) · fa874a46

Committed by Hui Zhang on Nov 07, 2022

* suqeeze2 transpose2 fuse onednn

* format

* fix output shape

* fix conflict

* format

* format

* remove useless

* remove log

* simply pass

* fix comment

* fix

* fix msg

* fix error msg

* format

[PHI] Migrate batch_norm (#47652) · 2337e609
Committed by Sławomir Siwek on Nov 07, 2022
```
* init changes

* bnorm

* method signature

* change order

* bnorm

* removed unused args
```

04 Nov, 2022 · 1 commit

Optimized oneDNN FC and added operator+unsqueeze2 and operator+reshape2 oneDNN fuse passes (#47391) · 9e006987
Committed by jakpiase on Nov 04, 2022
```
* tmp save

* minor chnage

* CI fix

* added FC optimizations

* latest update

* CI fix

* fixed bug with fusing fc
```

03 Nov, 2022 · 3 commits

Fix ComputePropagateScalesMkldnnPass of MKLDNN (#47574) · 5fc92943
Committed by yeliang2258 on Nov 03, 2022
```
* add constant_folding_pass pass for mkldnn int8

* update UpdateScaleOpInOutScales
```

[PHI] Migrate softmax kernel (#47339) · b8ae3858

Committed by Sławomir Siwek on Nov 03, 2022

* add extra attr property set

* add type_info for all context

* add onednn context to all context

* fix context compile error

* simplify conv kernel args

* pass runtime attr into dev_ctx

* fix marco error

* clear conv_grad_kernel extra args

* merge conv_grad_grad into conv_grad

* clear conv2d_grad_grad extra attrs

* remove redundant imports

* migrate softmax

* clear yaml and eager extra attr

* fix conv1d error

* change to thread local

* fix npu compile failed

* try to fix windows compile failed

* add conv2d onednn phi kernel

* fix ci bugs (#36)

* fix compile bugs (#38)

* fix extra input transform bug (#39)

* support dynamic created attr (#40)

* reset extra info gen code

* rm conv_grad_grad kernel

* reimpl pass attr adapting

* add int attr support

* remove vector inputnames creating

* merge dev

* fix map at error

* adjust attribute

* adapt funcs to PHI

Co-authored-by: Chen Weihang <chenweihang@baidu.com>
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>

bug fix (#47611) · 5160628c

Committed by wenbin on Nov 03, 2022


02 Nov, 2022 · 1 commit

Logsigmoid and Tanhshrink ops convert to trt (#47322) · b045fdfb

Committed by 丁一 on Nov 02, 2022

01 Nov, 2022 · 1 commit

fix memory copy in prepare_data of FusedMultiTransformer pass (#47306) · 9ad0e37e
Committed by Kaipeng Deng on Nov 01, 2022
```
* fix memory copy in prepare_data. test=develop
```

31 Oct, 2022 · 1 commit

feat: add int8 support for vit (#47330) · 2953b708
Committed by feng_shuai on Oct 31, 2022
```
* feat: add int8 support for vit

* test:add test
```

27 Oct, 2022 · 2 commits

make all cpp tests dynamic linked to libpaddle.so [except windows] (#47088) · 2096448b

Committed by Leo Chen on Oct 27, 2022

* make all cpp tests dynamic linked to libpaddle.so

* add comments

* keep old cc_test for some tests

* fix some ut

* make some ut use cc_test_old

* fix typos and fit for win32

* fix lib path

* fix some tests

* skip lite test

* fit for rocm

* fit for cinn

* fit for mac

* fit for win32

* skip inference ut

* skip  windows

* fix coverage

Fix compile error of mkldnn and tensorrt (#47388) · 19feba38
Committed by Chen Weihang on Oct 26, 2022
```
* fix compile error of mkldnn

* fix tensorrt error
```

26 Oct, 2022 · 3 commits

Preln_Layernorm_Shift_Partition (#47099) · d17d0cd1

Committed by wenbin on Oct 26, 2022

* prelnlayernorm_shift

* add ut

* remove paddle_enforce

* remove useless

* add UT

* remove UT

* add UT

* set timeout


FC/matmul(v2) + scale fuse pass (#47127) · c1c2be2d

Committed by Sławomir Siwek on Oct 26, 2022

* fc/matmuls + scale fuse pass

* remove double-extension

* add unit tests

* comments from review

* codestyle

* add pass to int8 list

* new codestyle

* attr name typo

Remove the declaration of using LoDTensor in framework/lod_tensor.h (Part2) (#46953) · 1cb12ff5
Committed by Chen Weihang on Oct 25, 2022
```
* remove using lodtensor part2

* resolve code format error

* resolve conflict

* resolve conflict

* replace added frameworrk tensor
```

24 Oct, 2022 · 1 commit

Fix compilation bug caused by incorrect log information (#47254) · 40212582
Committed by yeliang2258 on Oct 24, 2022
```
* fix log bugs

* more fix

* fix bugs
```

21 Oct, 2022 · 1 commit

fix runtime error (#47133) · 016766cc

Committed by Allen Guo on Oct 21, 2022

20 Oct, 2022 · 2 commits

Add FusedMultiTransformer fuse pass for GPT3 (#45907) · 5a2e5179
Committed by Kaipeng Deng on Oct 20, 2022
```
* add fused_multi_transformer_encoder/decoder pass, run GPT-3 success
```

log only if > 0 (#47181) · d6208aad

Committed by Sylwester Fraczek on Oct 20, 2022

19 Oct, 2022 · 2 commits

Support stream overlap for c_allreduce_sum (#47030) · d00b7d83
Committed by Ruibiao Chen on Oct 19, 2022
```
* Support stream overlap for c_allreduce_sum

* Test CI

* Add notes

* Add SingleStreamGuard for BuildOpFuncList
```

[Dy2St]Fix recurrent op eager deletion pass error in dy2st (#47105) · 94132190
Committed by WangZhen on Oct 19, 2022
```
* Fix recurrent op eager deletion pass error in dy2st

* Polish code

* Refine error message
```

18 Oct, 2022 · 2 commits

Merge layernorm trt fuse (#46320) · 5e9f491e

Committed by Wang Bojun on Oct 18, 2022

* first version, accuracy corrected

* disable debug print

* use blockReduceSum in phi

* add UT

* add opCompat

* code style

* code refine

* bug fix

* code refine

* test fix

* bugfix

* codesytle fix

* code style

* code-style

* code-style

* code-style


FC + activation fuse passes (#45183) · b7a23adb

Committed by Sławomir Siwek on Oct 18, 2022

* git

* style

* leave default relu in kernel

* style

* cleanup FCMKLDNN pattern

* merge conflicts

* update develop

* update develop

* add const

* rename to oneDNN and adjust attributes

* whitespace


17 Oct, 2022 · 4 commits

Revert "add common subexpression elimination (#44386)" (#47062) · 7c6835ca
Committed by hong on Oct 17, 2022
```
This reverts commit 166ff39a.
```

Layernorm shift partition enhance (#46816) · 9e08633c
Committed by Wang Bojun on Oct 17, 2022
```
* first version of ln_s_p with s>0

* refine and UT

* pass opt draft

* pass opt

* code refine

* code-style

* bug fix

* fix ci test

* code style
```

fix for conv_bias_mkldnn_pass (#47037) · acbda3e4

Committed by jakpiase on Oct 17, 2022

skip ReplaceAllReduceOp in GraphtoBlock when nccl_ctxs_ is nullptr (#46911) · 2e7dc666
Committed by pangyoki on Oct 17, 2022
```
* skip ReplaceAllReduceOp in GraphtoBlock when nccl_ctxs_ is nullptr

* update ut

* test_dist_allreduce_op failed

* fix test_dist_allreduce_op

* add ut

* fix nccl cpu compile

* fix
```

16 Oct, 2022 · 1 commit

add common subexpression elimination (#44386) · 166ff39a

Committed by ZeKai Zhou on Oct 16, 2022

13 Oct, 2022 · 2 commits

Fix quantize model deploy bugs when using MKLDNN (#45920) · 561fd8c8

Committed by yeliang2258 on Oct 13, 2022

* fix immutable op quantize bugs

* fix

* fix build bug

* fix test

* notest,test=inference

* fix ppyoloe acc drop bugs

* fix test

* fix test

* add test

* fix

* fix

* fix test

* fix refined name bug

* fix test

* bias fix

* fix matmul weight dequant bug

* re-ci

* fix tester

* fix test

* fix tester

* update weight dequantize func

* update code

* update test for converage

* update test

* update cmake

* update cmakelist

* update code

* rerun ci

* remove useless code


Add unsigned int8 scale propagation (#46378) · c72b3bfa

Committed by joanna.wozna.intel on Oct 13, 2022

* Add unsigned int8 propagation

* Add or modify unit tests

* Correct concat scale checking

* Apply review suggestions

* Corrections
