Commits · 39c85064a27a6a6ab0d8eed8d8e996caf5302ff8 · PaddlePaddle / Paddle

16 Nov, 2022 1 commit
- feat(ipu): add paddle inference support for model_runtime. (#47364) · 39c85064
  Committed by czr-gc on Nov 16, 2022
  
  39c85064
15 Nov, 2022 2 commits

Added optimization pass for oneDNN layernorm kernel (#47782) · 519e7426
Committed by jakpiase on Nov 15, 2022
```
* optimization for ln

* fix

* added output to gpd

* added formatting

* fix
```
519e7426

mkldnn directory cleanup (#47779) · 8a339d24

Committed by Sławomir Siwek on Nov 15, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid dependency

8a339d24

14 Nov, 2022 2 commits
- Do not release memory cache after build_op_func_list in interpretercore (#47910) · 8347354d
  Committed by Ruibiao Chen on Nov 14, 2022
  
  8347354d
- fix squeeze_transpose (#47911) · f50de679
  Committed by yeliang2258 on Nov 14, 2022
  
  f50de679
11 Nov, 2022 3 commits

[IPU]: add model_runtime backend support in IPU (#47363) · 21b901cb

Committed by czr-gc on Nov 11, 2022

* feat(ipu): add model_runtime backend support in IPU.

* fix(ipu_executor): fix error message format.

* fix(ipu_executor): fix format.

* fix(ipu_executor): fix format again.

* fix(ipu_executor): fix format again.

* fix(ipu_executor): fix format again.

21b901cb

Refine shape op launch method for standalone executor (#47843) · 981d1a10

Committed by zhangbo9674 on Nov 11, 2022

* refine shape op in new_exe

* Revert "refine shape op in new_exe"

This reverts commit 0e0336ddc5eede3da019b348a0bcc0ef0f3be64e.

* refine shape op in new_exe

* refine shape expected_kernel_type

* add SelectedRows check for shape op

* refine code

981d1a10

Generate static graph code for some ops by yaml (part3) (#47803) · 31f3f643

Committed by zyfncg on Nov 11, 2022

* generate static graph code for some ops by yaml

* remove deleted files

* update cmake

* update cmake

* update cmake

31f3f643

10 Nov, 2022 4 commits
- [search && paddle inference]add roformer pass&&plugin novarlen version (#47523) · 0f3fb562
  Committed by zhangxin81 on Nov 10, 2022
```
* add roformer pass&&plugin（novarlen）
```
  0f3fb562
- skip_merge_layernorm (#47810) · 1c6013dd
  Committed by wenbin on Nov 10, 2022
```
* skip_merge_layernorm

* add UT

* modify comments
```
  1c6013dd
- fix paddle with cinn cannot link relu op bug (#47793) · 8e65ac5d
  Committed by jiangcheng on Nov 10, 2022
```
* fix paddle with cinn cannot link relu op bug

* change cmake activation_op to generator_op
```
  8e65ac5d
- Fuse multi transformer layer pass (#47541) · 1e3245a8
  Committed by RichardWooSJTU on Nov 10, 2022
```
* add fuse_multi_transformer_layer_pass
```
  1e3245a8
09 Nov, 2022 2 commits

Final changes to introduce mem_desc to be held in Tensor (#46768) · 14f261ad

Committed by Jacek Czaja on Nov 09, 2022

* first commit

- more fixes

- compilation fix

- compilation fix

- fix

- another fix

- yet another fix

- Fix

- fix to fused ops

- compilation fix

- compilation fix

- another compilation fix

- another fix

- fix

- fix

- fix

- fix

- yet another fix

- fix

- fix

- cosmetic fix

- lint

- Revert some changes (to be brought back later)

- fix to build

- Added prototype of slice

- fix

compilation fix

- compilation fix

- fix

- fix

- Fix

- fix

 fix
	modified:   cmake/flags.cmake

* lint

* rerun of CI

* - Fix

* - lint

* - lint2

14f261ad

[PHI decoupling] Move fluid op generator into fluid (#47714) · f369b2b1

Committed by Chen Weihang on Nov 09, 2022

* move fluid op generator into fluid

* remove parsed op

* resolve sig undef error

* append python interp find logic

* remove dup code

f369b2b1

08 Nov, 2022 3 commits

Migrate old C++ unit tests to Python framework (#47006) · 0c9f09b8

Committed by Sławomir Siwek on Nov 08, 2022

* softplus+activation

* fc + elementwise_add test refactored

* rename MKLDNN to OneDNN

* fc+activation tests refactored

* remove softplus ut

* whitespace

* whitespace

* codestyle

* codestyle

* add more cases to fc+act

* remove softplus+hard_sigmoid pass

* remove softplus + hard_sigmoid UT

* add approximate for gelu

* swish beta range

* new codestyle

* reduce number of tests

0c9f09b8

[Paddle Inference] allow fold fill_constant && allow nms3 into trt in int8 model (#47551) · c3a69111
Committed by zhoutianzi666 on Nov 08, 2022
```
* allow fold fill_constant && allow nms3 into trt in int8 model
* use unordered_map
* fix CI failing
```
c3a69111

Split quant (#47449) · 130db92a

Committed by Paulina Gacek on Nov 08, 2022

* Split kernel registered, tests for uint/int added

* Split quantized

* Split output scales calculated only once

* NearestInterp test fix reversed

* DequantizeOutputs corrected

130db92a

07 Nov, 2022 3 commits

squeeze2 + transpose2 fuse onednn (#47592) · fa874a46

Committed by Hui Zhang on Nov 07, 2022

* squeeze2 transpose2 fuse onednn

* format

* fix output shape

* fix conflict

* format

* format

* remove useless

* remove log

* simply pass

* fix comment

* fix

* fix msg

* fix error msg

* format

fa874a46

[Restore PR] Remove hard code of PADDLE_WITH_CUDA (#47630) · 908a381d

Committed by HongyuJia on Nov 07, 2022

* move cudnn hardcode outside GetExpectedKernelType

* add header file

* debug

* update interpreter_util with hardcode

* update interpreter_util headerfile

* solve activation hardcode

* debug with CI

* add mkldnn_op_list header file

* temporarily uncomment mkldnn

* temporarily uncomment mkldnn

* delete sequence_softmax cudnn hardcode

* add hardcode to data_transfer.cc

* update data_transfer headerfile

* try fix segment fault

* update cudnn&miopen_helper

* reset HasAttr of DygraphExctnCtx

* debug, this commit should pass all CI

* debug should pass CI, temporarily disable activation

* debug should pass CI

* fix default_attr=nullptr bug

* clean debug code

* Call SetDnnFallback function in the base class

* activation fallback to plain kernel

* fix default GetExpectedKernelType find wrong kernel

* search cudnn kernel instead of fallback

* fix cudnn_handle bug

* remove tanh use_cudnn

* restore tanh use_cudnn

* debug tanh

* fix tanh bug

* delete activation cudnn kernel

* polish code

908a381d

[PHI] Migrate batch_norm (#47652) · 2337e609
Committed by Sławomir Siwek on Nov 07, 2022
```
* init changes

* bnorm

* method signature

* change order

* bnorm

* removed unused args
```
2337e609

05 Nov, 2022 1 commit
- Use a unified FLAGS_check_nan_inf_level to control the result of checking infinite. (#47672) · 54bc3b46
  Committed by Yiqun Liu on Nov 05, 2022
  
  54bc3b46
04 Nov, 2022 2 commits
- Generate static graph code for some activation ops by Yaml (part3) (#47640) · 40cd5271
  Committed by zyfncg on Nov 04, 2022
```
* generate static graph code for some activation op

* fix bug

* fix infermeta of selected_rows
```
  40cd5271
- Optimized oneDNN FC and added operator+unsqueeze2 and operator+reshape2 oneDNN fuse passes (#47391) · 9e006987
  Committed by jakpiase on Nov 04, 2022
```
* tmp save

* minor change

* CI fix

* added FC optimizations

* latest update

* CI fix

* fixed bug with fusing fc
```
  9e006987
03 Nov, 2022 6 commits

Improve performance of coalesce_tensor and depend op in standalone executor (#47606) · 5fb1e824

Committed by Ruibiao Chen on Nov 03, 2022

* Dispatch computation OPs before communication in standalone executor

* Update code

* Fix CI errors

* Improve performance of coalesce_tensor and depend OP in standalone executor

* pre-commit check

5fb1e824

Fix ComputePropagateScalesMkldnnPass of MKLDNN (#47574) · 5fc92943
Committed by yeliang2258 on Nov 03, 2022
```
* add constant_folding_pass pass for mkldnn int8

* update UpdateScaleOpInOutScales
```
5fc92943

clean unused code: save_load_util.cc/.h (#47588) · 6d0f730d
Committed by Leo Chen on Nov 03, 2022

6d0f730d

[Opt Kernel Selection] Opt CanMKLDNNBeUsed performance (#47563) · 9adad42d

Committed by HongyuJia on Nov 03, 2022

* opt CanMKLDNNBeUsed performance

* fix nullptr bug

* fix OpBase default_attrs=nullptr bug

* fix OpBase default_attrs=nullptr bug

* fix OpBase default_attrs=nullptr bug

9adad42d

[PHI] Migrate softmax kernel (#47339) · b8ae3858

Committed by Sławomir Siwek on Nov 03, 2022

* add extra attr property set

* add type_info for all context

* add onednn context to all context

* fix context compile error

* simplify conv kernel args

* pass runtime attr into dev_ctx

* fix macro error

* clear conv_grad_kernel extra args

* merge conv_grad_grad into conv_grad

* clear conv2d_grad_grad extra attrs

* remove redundant imports

* migrate softmax

* clear yaml and eager extra attr

* fix conv1d error

* change to thread local

* fix npu compile failed

* try to fix windows compile failed

* add conv2d onednn phi kernel

* fix ci bugs (#36)

* fix compile bugs (#38)

* fix extra input transform bug (#39)

* support dynamic created attr (#40)

* reset extra info gen code

* rm conv_grad_grad kernel

* reimpl pass attr adapting

* add int attr support

* remove vector inputnames creating

* merge dev

* fix map at error

* adjust attribute

* adapt funcs to PHI
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>

b8ae3858

bug fix (#47611) · 5160628c
Committed by wenbin on Nov 03, 2022

5160628c

02 Nov, 2022 4 commits
- Revert "[Kernel Selection] Remove hard code of PADDLE_WITH_CUDA (#47325)" (#47582) · a57a19ea
  Committed by HongyuJia on Nov 02, 2022
```
This reverts commit f9134045.
```
  a57a19ea
- Logsigmoid and Tanhshrink ops convert to trt (#47322) · b045fdfb
  Committed by 丁一 on Nov 02, 2022
  
  b045fdfb
- Dispatch computation OPs before communication in standalone executor (#47471) · 5ed487bf
  Committed by Ruibiao Chen on Nov 02, 2022
```
* Dispatch computation OPs before communication in standalone executor

* Update code

* Fix CI errors
```
  5ed487bf
- Improve the tool for checking nan and inf, and support to compute the max, min... · ad39043f
  Committed by Yiqun Liu on Nov 02, 2022
```
Improve the tool for checking nan and inf, and support to compute the max, min and mean of output tensor. (#47095)

* Improve the tool for checking nan and inf, and support to compute the max, min and mean of output tensor.

* Add a FLAGS to control whether abort when meets inf/nan and polish codes.

* Fix unittest.

* Change the computing of mean.
```
  ad39043f
01 Nov, 2022 6 commits

[Kernel Selection] Remove hard code of PADDLE_WITH_CUDA (#47325) · f9134045

Committed by HongyuJia on Nov 01, 2022

* move cudnn hardcode outside GetExpectedKernelType

* add header file

* debug

* update interpreter_util with hardcode

* update interpreter_util headerfile

* solve activation hardcode

* debug with CI

* add mkldnn_op_list header file

* temporarily uncomment mkldnn

* temporarily uncomment mkldnn

* delete sequence_softmax cudnn hardcode

* add hardcode to data_transfer.cc

* update data_transfer headerfile

* try fix segment fault

* update cudnn&miopen_helper

* reset HasAttr of DygraphExctnCtx

* debug, this commit should pass all CI

* debug should pass CI, temporarily disable activation

* debug should pass CI

* fix default_attr=nullptr bug

* clean debug code

f9134045

[Paddle Inference] add RegisterOutputHook interface (#47050) · db323927
Committed by Yuanle Liu on Nov 01, 2022

db323927

[PHI]Standardise some C++ API (Part2) (#47510) · 399047d7
Committed by YuanRisheng on Nov 01, 2022
```
* standard_api

* add hardtanh
```
399047d7

Support custom stream for standalone executor (#47411) · e12b6c04

Committed by Ruibiao Chen on Nov 01, 2022

* [Auto Parallel] Improve the c++ dist attr

* [Auto Parallel] Modify test_program.py

* Support custom stream for standalone executor
Co-authored-by: Yulong Ao <aoyulong@baidu.com>

e12b6c04

fix memory copy in prepare_data of FusedMultiTransformer pass (#47306) · 9ad0e37e
Committed by Kaipeng Deng on Nov 01, 2022
```
* fix memory copy in prepare_data. test=develop
```
9ad0e37e

Adapting device-specific Extra Attributes for the PHI kernel (#46342) · c923e6c9

Committed by Chen Weihang on Oct 31, 2022

* add extra attr property set

* add type_info for all context

* add onednn context to all context

* fix context compile error

* simplify conv kernel args

* pass runtime attr into dev_ctx

* fix macro error

* clear conv_grad_kernel extra args

* merge conv_grad_grad into conv_grad

* clear conv2d_grad_grad extra attrs

* clear yaml and eager extra attr

* fix conv1d error

* change to thread local

* fix npu compile failed

* try to fix windows compile failed

* add conv2d onednn phi kernel

* fix ci bugs (#36)

* fix compile bugs (#38)

* fix extra input transform bug (#39)

* support dynamic created attr (#40)

* reset extra info gen code

* rm conv_grad_grad kernel

* reimpl pass attr adapting

* add int attr support

* remove vector inputnames creating

* fix map at error

* Update paddle/phi/kernels/onednn/conv_grad_kernel.cc
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

* remove useless extra attrs

* replace mkldnn_engine by onednn_engine
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

c923e6c9

31 Oct, 2022 1 commit
- feat: add int8 support for vit (#47330) · 2953b708
  Committed by feng_shuai on Oct 31, 2022
```
* feat: add int8 support for vit

* test:add test
```
  2953b708

PaddlePaddle / Paddle synced successfully about 1 year ago