Commits · 8a339d24c7aa15fb071a02ab85f3438e99af4b69 · PaddlePaddle / Paddle

15 Nov, 2022 · 1 commit

mkldnn directory cleanup (#47779) · 8a339d24

Committed by Sławomir Siwek on Nov 15, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid depedency

8a339d24

14 Nov, 2022 · 3 commits
- Refactor collective communication send_partial, recv_partial, all_gather_partial C++ API (#47863) · 25e63dca
  Committed by Wen Sun on Nov 14, 2022
```
* refactor: simplify send, recv interfaces

* refactor: rm send_partial, recv_partial, all_gather_partial
```
  25e63dca
- [Paddle Inference] Add where trt converter (#47820) · dac0f7dd
  Committed by xiaoxiaohehe001 on Nov 14, 2022
  
  dac0f7dd
- Add InferShape for Depend OP (#47907) · 5478e1a5
  Committed by Ruibiao Chen on Nov 14, 2022
  
  5478e1a5

11 Nov, 2022 · 3 commits

[Zero-Dim] fix batch_norm op infermeta bug (#47858) · 18549417
Committed by zhouweiwei2014 on Nov 11, 2022

18549417

Refine shape op lanch method for standalone executor (#47843) · 981d1a10

Committed by zhangbo9674 on Nov 11, 2022

* refine shape op in new_exe

* Revert "refine shape op in new_exe"

This reverts commit 0e0336ddc5eede3da019b348a0bcc0ef0f3be64e.

* refine shape op in new_exe

* refine shape expected_kernel_type

* add SelectedRows check for shape op

* refine code

981d1a10

Generate static graph code for some ops by yaml (part3) (#47803) · 31f3f643

Committed by zyfncg on Nov 11, 2022

* generate static graph code for some ops by yaml

* remove deleted files

* update cmake

* update cmake

* udpate cmake

31f3f643

10 Nov, 2022 · 5 commits

[phi] migrate prelu (#47422) · cdd8c8ab
Committed by Sylwester Fraczek on Nov 10, 2022
```
* migrate prelu

* remove cache

* review fixes
```
cdd8c8ab

[PHI]Standardise some C++ API (Part4) (#47702) · 594bd723

Committed by YuanRisheng on Nov 10, 2022

* standard api

* fix sparse bugs

* fix xpu bugs, test=kunlun

* remove hard code for custom unittest

* open ci, test=kunlun

* deal with conflict

594bd723

XPU multi-card support eager mode (#47445) · 3b91f8f3

Committed by james on Nov 10, 2022

* XPU support eager mode

* add unittest for XPU eager mode

* minor bugfix

* minor bugfix, test=kunlun

* correct copyright info

* 1. remove unsed vars/funcs
2. ProcessGroupBKCL inherit from ProcessGroupStream

* bugfix for fp16 in eager mode multi-card, test=kunlun

* rebase & fix a few issues

* use new processgroup interface, test=kunlun

* fix compile issue, test=kunlun

3b91f8f3

Add CI check for script of auto code-gen (#47814) · 00ea0b2f
Committed by zyfncg on Nov 10, 2022
```
* add ci check for code-gen script

* update
```
00ea0b2f

support pow_triple_grad op (#47799) · 7964119b
Committed by Charles-hit on Nov 10, 2022

7964119b

09 Nov, 2022 · 7 commits

clean repetitious GetKernelTypeForVar (#47763) · c551e55d
Committed by HongyuJia on Nov 09, 2022

c551e55d

fix for missing reorders in profiling (#47777) · a97b3630
Committed by jakpiase on Nov 09, 2022

a97b3630

cleanup unused code (#47762) · fb16fea3
Committed by Sławomir Siwek on Nov 09, 2022

fb16fea3

Final changes to introduce mem_desc to be hold in Tensor (#46768) · 14f261ad

Committed by Jacek Czaja on Nov 09, 2022

* first commit

- more fixes

- compilation fix

- compilation fix

- fix

- another fix

- yet another fix

- Fix

- fix to fused ops

- compilation fix

- compilation fix

- another compilation fix

- another fix

- fix

- fix

- fix

- fix

- yet another fix

- fix

- fix

- cosmetic fix

:- lint

- Revert some changes (to be brought back later)

- fix to build

- Added prototype of slice

- fix

compilation fix

- compilation fix

- fix

- fix

- Fix

- fix

 fix
	modified:   cmake/flags.cmake

* lint

* rerun of CI

* - Fix

* - lint

* - lint2

14f261ad

Generate static graph code for some ops by yaml (part2) (#47752) · ccb47076
Committed by zyfncg on Nov 09, 2022
```
* generate static graph code of some op

* polish code

* fix bug

* update default value
```
ccb47076

[PHI decoupling] Move fluid op generator into fluid (#47714) · f369b2b1

Committed by Chen Weihang on Nov 09, 2022

* move fluid op generator into fluid

* remove parsed op

* resolve sig undef error

* append python interp find logic

* remove dup code

f369b2b1

new mp_allreduce_sum_op (#47715) · 18d33346
Committed by LiYuRio on Nov 09, 2022

18d33346

08 Nov, 2022 · 4 commits

Split quant (#47449) · 130db92a

Committed by Paulina Gacek on Nov 08, 2022

* Split kernel registered, tests for uint/int added

* Split quantized

* Split output scales calculated only once

* NearestInterp test fix reversed

* DequantizeOutputs corrected

130db92a

removing dependent to fluid/framework/eigen.h in phi (#47675) · c7cd8d98
Committed by jzhang533 on Nov 08, 2022
```
* removing dependent to fluid/framework/eigen.h in phi

* more fix according to PR-CI-Py3 fail
```
c7cd8d98

support pow double grad op (#47691) · 6fe9dfb2

Committed by Charles-hit on Nov 08, 2022

* support pow_double_grad op

* add unit test for pow double grad

* fix pow double grad

* optimize pow double grad kernel

* fix pow double grad kernel

6fe9dfb2

fix cinn_instruction_run_op_test when FLAGS_use_system_allocator=True (#47731) · a4a9ce0e
Committed by TeFeng Chen on Nov 08, 2022

a4a9ce0e

07 Nov, 2022 · 6 commits

suqeeze2 + transpose2 fuse onednn (#47592) · fa874a46

Committed by Hui Zhang on Nov 07, 2022

* suqeeze2 transpose2 fuse onednn

* format

* fix output shape

* fix conflict

* format

* format

* remove useless

* remove log

* simply pass

* fix comment

* fix

* fix msg

* fix error msg

* format

fa874a46

remove hardcoded -Wunused-variable compiler flags (#47706) · 45bc4542
Committed by Wang Xin on Nov 07, 2022

45bc4542

[Restore PR] Remove hard code of PADDLE_WITH_CUDA (#47630) · 908a381d

Committed by HongyuJia on Nov 07, 2022

* move cudnn hardcode outside GetExpectedKernelType

* add header file

* debug

* update interpreter_util with hardcode

* update interpreter_util headerfile

* solve activation hardcode

* debug with CI

* add mkldnn_op_list header file

* temporarily uncomment mkldnn

* temporarily uncomment mkldnn

* delete sequence_softmax cudnn hardcode

* add hardcode to data_transfer.cc

* update data_transfer headerfile

* try fix segment fault

* update cudnn&miopen_helper

* reset HasAttr of DygraphExctnCtx

* debug, this commit should pass all CI

* debug should pass CI, temporarily disable activation

* debug should pass CI

* fix default_attr=nullptr bug

* clean debug code

* Call SetDnnFallback function in the base class

* activation fallback to plain kernel

* fix default GetExpectedKernelType find wrong kernel

* search cudnn kernel instead of fallback

* fix cudnn_handle bug

* remove tanh use_cudnn

* restore tanh use_cudnn

* debug tanh

* fix tanh bug

* delete activation cudnn kernel

* polish code

908a381d

Refactor collective communication all_gather, all_reduce, broadcast & barrier C++ API (#47481) · e1a1c354
Committed by Wen Sun on Nov 07, 2022

e1a1c354

[PHI] Migrate batch_norm (#47652) · 2337e609
Committed by Sławomir Siwek on Nov 07, 2022
```
* init changes

* bnorm

* method signature

* change order

* bnorm

* removed unused args
```
2337e609

[PHI] Migrate depthwise_conv2d_grad and conv3d_grad kernels (#47686) · b0c38568
Committed by Sławomir Siwek on Nov 07, 2022
```
* remove fwd funcs

* migrate conv grads
```
b0c38568

04 Nov, 2022 · 7 commits

Generate static graph code for some activation ops by Yaml (part3) (#47640) · 40cd5271
Committed by zyfncg on Nov 04, 2022
```
* generate static graph code for some activation op

* fix bug

* fix infermeta of selected_rows
```
40cd5271

slice & mul & requantize tensors to use mem_desc (#47617) · 2cff0e8a
Committed by Jacek Czaja on Nov 04, 2022
```
* slice & mul & requantize

* - Fix to requentize test
```
2cff0e8a

forbid backward for comm (#47636) · eac973d1
Committed by LiYuRio on Nov 04, 2022

eac973d1

migrate convs (#47658) · 4a4f3f80
Committed by Sławomir Siwek on Nov 04, 2022

4a4f3f80

[PHI] Migrate pool2d and pool2d_grad kernels (#47423) · ca4bed7b

Committed by Piotr Paturej on Nov 04, 2022

* add extra attr property set

* add type_info for all context

* add onednn context to all context

* fix context compile error

* simplify conv kernel args

* pass runtime attr into dev_ctx

* fix marco error

* clear conv_grad_kernel extra args

* merge conv_grad_grad into conv_grad

* clear conv2d_grad_grad extra attrs

* clear yaml and eager extra attr

* fix conv1d error

* change to thread local

* fix npu compile failed

* try to fix windows compile failed

* add conv2d onednn phi kernel

* fix ci bugs (#36)

* fix compile bugs (#38)

* fix extra input transform bug (#39)

* support dynamic created attr (#40)

* reset extra info gen code

* rm conv_grad_grad kernel

* reimpl pass attr adapting

* add int attr support

* remove vector inputnames creating

* fix map at error

* Update paddle/phi/kernels/onednn/conv_grad_kernel.cc
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

* remove useless extra attrs

* replace mkldnn_engine by onednn_engine

* Migrate pool+grad to PHI

* Update paddle/fluid/operators/mkldnn/test_mkldnn_op_nhwc.cc
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

* Update paddle/phi/kernels/onednn/pool_grad_kernel.cc
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

* Update paddle/phi/kernels/onednn/pool_kernel.cc
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: Chen Weihang <chenwhpro@163.com>
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

ca4bed7b

[PHI] Migrate softplus kernel (#47406) · 1831919f

Committed by Sławomir Siwek on Nov 04, 2022

* add extra attr property set

* add type_info for all context

* add onednn context to all context

* fix context compile error

* simplify conv kernel args

* pass runtime attr into dev_ctx

* fix marco error

* clear conv_grad_kernel extra args

* merge conv_grad_grad into conv_grad

* clear conv2d_grad_grad extra attrs

* remove redundant imports

* migrate softmax

* clear yaml and eager extra attr

* fix conv1d error

* change to thread local

* fix npu compile failed

* try to fix windows compile failed

* add conv2d onednn phi kernel

* fix ci bugs (#36)

* fix compile bugs (#38)

* fix extra input transform bug (#39)

* support dynamic created attr (#40)

* reset extra info gen code

* rm conv_grad_grad kernel

* reimpl pass attr adapting

* add int attr support

* remove vector inputnames creating

* merge dev

* fix map at error

* adjust attribute

* adapt funcs to PHI

* init

* adjust imports

* support postops

* format codeblocks

* revert changes to softmax
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>

1831919f

Optimized oneDNN FC and added operator+unsqueeze2 and operator+reshape2 oneDNN fuse passes (#47391) · 9e006987
Committed by jakpiase on Nov 04, 2022
```
* tmp save

* minor chnage

* CI fix

* added FC optimizations

* latest update

* CI fix

* fixed bug with fusing fc
```
9e006987

03 Nov, 2022 · 3 commits

Fix oneDNN elementwise_sub dnnl_error in unit test (#47237) · 30c7758f

Committed by Piotr Paturej on Nov 03, 2022

* Fix dnnl errors in elementwise_sub tests

* Fix model accuracy attempt

* Add new fix

* Add proper fix

* Refactor by removing code repetition

30c7758f

Improve performance of coalesce_tensor and depend op in standalone executor (#47606) · 5fb1e824

Committed by Ruibiao Chen on Nov 03, 2022

* Dispath computation OPs before communication in standalone executor

* Update code

* Fix CI errors

* Improve performance of coalesce_tensor and depend OP in standalone executor

* pre-commit check

5fb1e824

[PHI] Migrate softmax kernel (#47339) · b8ae3858

Committed by Sławomir Siwek on Nov 03, 2022

* add extra attr property set

* add type_info for all context

* add onednn context to all context

* fix context compile error

* simplify conv kernel args

* pass runtime attr into dev_ctx

* fix marco error

* clear conv_grad_kernel extra args

* merge conv_grad_grad into conv_grad

* clear conv2d_grad_grad extra attrs

* remove redundant imports

* migrate softmax

* clear yaml and eager extra attr

* fix conv1d error

* change to thread local

* fix npu compile failed

* try to fix windows compile failed

* add conv2d onednn phi kernel

* fix ci bugs (#36)

* fix compile bugs (#38)

* fix extra input transform bug (#39)

* support dynamic created attr (#40)

* reset extra info gen code

* rm conv_grad_grad kernel

* reimpl pass attr adapting

* add int attr support

* remove vector inputnames creating

* merge dev

* fix map at error

* adjust attribute

* adapt funcs to PHI
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>

b8ae3858

02 Nov, 2022 · 1 commit
- Revert "[Kernel Selection] Remove hard code of PADDLE_WITH_CUDA (#47325)" (#47582) · a57a19ea
  Committed by HongyuJia on Nov 02, 2022
```
This reverts commit f9134045.
```
  a57a19ea

PaddlePaddle / Paddle · synced successfully almost 2 years ago