Commits · ac2a94c768be26d11e36270652f154deb2ea4749 · PaddlePaddle / Paddle

04 Nov, 2022 8 commits

[XPU] add cumsum op. test=kunlun (#47585) · ac2a94c7

Committed by houj04 on Nov 04, 2022

* [XPU] add cumsum op. test=kunlun

* try to fix linker. test=kunlun

* try to fix linker. test=kunlun

* try to fix linker. test=kunlun

* debug. test=kunlun

* update xpu.cmake. remove unnecessary codes. test=kunlun.

ac2a94c7

migrate convs (#47658) · 4a4f3f80
Committed by Sławomir Siwek on Nov 04, 2022

4a4f3f80

[PHI] Migrate pool2d and pool2d_grad kernels (#47423) · ca4bed7b

Committed by Piotr Paturej on Nov 04, 2022

* add extra attr property set

* add type_info for all context

* add onednn context to all context

* fix context compile error

* simplify conv kernel args

* pass runtime attr into dev_ctx

* fix marco error

* clear conv_grad_kernel extra args

* merge conv_grad_grad into conv_grad

* clear conv2d_grad_grad extra attrs

* clear yaml and eager extra attr

* fix conv1d error

* change to thread local

* fix npu compile failed

* try to fix windows compile failed

* add conv2d onednn phi kernel

* fix ci bugs (#36)

* fix compile bugs (#38)

* fix extra input transform bug (#39)

* support dynamic created attr (#40)

* reset extra info gen code

* rm conv_grad_grad kernel

* reimpl pass attr adapting

* add int attr support

* remove vector inputnames creating

* fix map at error

* Update paddle/phi/kernels/onednn/conv_grad_kernel.cc
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

* remove useless extra attrs

* replace mkldnn_engine by onednn_engine

* Migrate pool+grad to PHI

* Update paddle/fluid/operators/mkldnn/test_mkldnn_op_nhwc.cc
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

* Update paddle/phi/kernels/onednn/pool_grad_kernel.cc
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

* Update paddle/phi/kernels/onednn/pool_kernel.cc
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: Chen Weihang <chenwhpro@163.com>
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

ca4bed7b

[PHI] Migrate softplus kernel (#47406) · 1831919f

Committed by Sławomir Siwek on Nov 04, 2022

* add extra attr property set

* add type_info for all context

* add onednn context to all context

* fix context compile error

* simplify conv kernel args

* pass runtime attr into dev_ctx

* fix marco error

* clear conv_grad_kernel extra args

* merge conv_grad_grad into conv_grad

* clear conv2d_grad_grad extra attrs

* remove redundant imports

* migrate softmax

* clear yaml and eager extra attr

* fix conv1d error

* change to thread local

* fix npu compile failed

* try to fix windows compile failed

* add conv2d onednn phi kernel

* fix ci bugs (#36)

* fix compile bugs (#38)

* fix extra input transform bug (#39)

* support dynamic created attr (#40)

* reset extra info gen code

* rm conv_grad_grad kernel

* reimpl pass attr adapting

* add int attr support

* remove vector inputnames creating

* merge dev

* fix map at error

* adjust attribute

* adapt funcs to PHI

* init

* adjust imports

* support postops

* format codeblocks

* revert changes to softmax
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>

1831919f

remove global var (#47659) · 7fe7eebc
Committed by LiYuRio on Nov 04, 2022

7fe7eebc
fix deepfm and deep_wide bug, add embedding_sparse_grad kernel, test=kunlun (#47365) · f53e920d
Committed by ykkk2333 on Nov 04, 2022

f53e920d
Optimized oneDNN FC and added operator+unsqueeze2 and operator+reshape2 oneDNN fuse passes (#47391) · 9e006987
Committed by jakpiase on Nov 04, 2022
```
* tmp save

* minor chnage

* CI fix

* added FC optimizations

* latest update

* CI fix

* fixed bug with fusing fc
```
9e006987
fix cc_library link python lib (#47605) · cd59c10c
Committed by wanghuancoder on Nov 04, 2022
```
* fix cc_library link python lib
```
cd59c10c

03 Nov, 2022 10 commits

[Paddle Inference]disable_lookup_table_v2 (#47515) · f778c170
Committed by Wangzheee on Nov 03, 2022
```
* disable_lookup_table_v2
```
f778c170

Fix oneDNN elementwise_sub dnnl_error in unit test (#47237) · 30c7758f

Committed by Piotr Paturej on Nov 03, 2022

* Fix dnnl errors in elementwise_sub tests

* Fix model accuracy attempt

* Add new fix

* Add proper fix

* Refactor by removing code repetition

30c7758f

Improve performance of coalesce_tensor and depend op in standalone executor (#47606) · 5fb1e824

Committed by Ruibiao Chen on Nov 03, 2022

* Dispath computation OPs before communication in standalone executor

* Update code

* Fix CI errors

* Improve performance of coalesce_tensor and depend OP in standalone executor

* pre-commit check

5fb1e824

sparse attention kernel is used from 11.8 (#47594) · 7648f429
Committed by zhouweiwei2014 on Nov 03, 2022

7648f429
Fix ComputePropagateScalesMkldnnPass of MKLDNN (#47574) · 5fc92943
Committed by yeliang2258 on Nov 03, 2022
```
* add constant_folding_pass pass for mkldnn int8

* update UpdateScaleOpInOutScales
```
5fc92943
clean unused code: save_load_util.cc/.h (#47588) · 6d0f730d
Committed by Leo Chen on Nov 03, 2022

6d0f730d

[Opt Kernel Selection] Opt CanMKLDNNBeUsed performance (#47563) · 9adad42d

Committed by HongyuJia on Nov 03, 2022

* opt CanMKLDNNBeUsed performance

* fix nullptr bug

* fix OpBase default_attrs=nullptr bug

* fix OpBase default_attrs=nullptr bug

* fix OpBase default_attrs=nullptr bug

9adad42d

[PHI] Migrate softmax kernel (#47339) · b8ae3858

Committed by Sławomir Siwek on Nov 03, 2022

* add extra attr property set

* add type_info for all context

* add onednn context to all context

* fix context compile error

* simplify conv kernel args

* pass runtime attr into dev_ctx

* fix marco error

* clear conv_grad_kernel extra args

* merge conv_grad_grad into conv_grad

* clear conv2d_grad_grad extra attrs

* remove redundant imports

* migrate softmax

* clear yaml and eager extra attr

* fix conv1d error

* change to thread local

* fix npu compile failed

* try to fix windows compile failed

* add conv2d onednn phi kernel

* fix ci bugs (#36)

* fix compile bugs (#38)

* fix extra input transform bug (#39)

* support dynamic created attr (#40)

* reset extra info gen code

* rm conv_grad_grad kernel

* reimpl pass attr adapting

* add int attr support

* remove vector inputnames creating

* merge dev

* fix map at error

* adjust attribute

* adapt funcs to PHI
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>

b8ae3858

bug fix (#47611) · 5160628c
Committed by wenbin on Nov 03, 2022

5160628c
[CodeStyle] remove unused-variable warning in linux (#47558) · e67d6f17
Committed by Wang Xin on Nov 03, 2022
```
* remove unused-variable warning in linux

* fix unused-variable error in GpuPS
```
e67d6f17

02 Nov, 2022 10 commits
- Revert "[Kernel Selection] Remove hard code of PADDLE_WITH_CUDA (#47325)" (#47582) · a57a19ea
  Committed by HongyuJia on Nov 02, 2022
```
This reverts commit f9134045.
```
  a57a19ea
- [inference][trt] bilinear support OutSize input (#47495) · c061c082
  Committed by Zhang Jun on Nov 02, 2022
```
* add bilinear OutSize
```
  c061c082
- Logsigmoid and Tanhshrink ops convert to trt (#47322) · b045fdfb
  Committed by 丁一 on Nov 02, 2022
  
  b045fdfb
- Dispatch computation OPs before communication in standalone executor (#47471) · 5ed487bf
  Committed by Ruibiao Chen on Nov 02, 2022
```
* Dispath computation OPs before communication in standalone executor

* Update code

* Fix CI errors
```
  5ed487bf
- [PHI]Standardise some C++ API (Part3) (#47532) · fe8c6796
  Committed by YuanRisheng on Nov 02, 2022
```
* Standardise batch norm

* standardize conv3d and depwise_conv2d

* fix ci bugs
```
  fe8c6796
- [Zero-Dim] support input 0D Tensor for some binary api (#46909) · cad2e68d
  Committed by zhouweiwei2014 on Nov 02, 2022
  
  cad2e68d
- Improve the tool for checking nan and inf, and support to compute the max, min... · ad39043f
  Committed by Yiqun Liu on Nov 02, 2022
```
Improve the tool for checking nan and inf, and support to compute the max, min and mean of output tensor. (#47095)

* Improve the tool for checking nan and inf, and support to compute the max, min and mean of output tensor.

* Add a FLAGS to control whether abort when meets inf/nan and polish codes.

* Fix unittest.

* Change the computing of mean.
```
  ad39043f
- Support generating static code of high order grad op by yaml (#47511) · bafa890a
  Committed by zyfncg on Nov 02, 2022
```
* support generating static code of high order grad op by yaml

* polish code
```
  bafa890a
- [XPU] add int64 support for slice and subtract. (#47409) · 77395619
  Committed by houj04 on Nov 02, 2022
```
* [XPU] add int64 support for slice and subtract. test=kunlun

* try to fix xpu compile. test=kunlun

* try to fix xpu compile. test=kunlun

* try to fix xpu compile. test=kunlun

* remove unnecessary modification. test=kunlun
```
  77395619
- Add build option for CUDNN Frontend API (#47524) · eb100c7b
  Committed by Tian Zheng on Nov 02, 2022
```
* Add build option for CUDNN Frontend API

* Fix review comments

* Change namespace for cudnn_frontend.h
```
  eb100c7b
01 Nov, 2022 12 commits

fix dynamic link of xpu library (#47434) · 9d801855

Committed by Leo Chen on Nov 01, 2022

* refine comments,test=kunlun

* link xpu lib, test=kunlun

* add sleep for test, test=kunlun

* merge develop, fix compile, test=kunlun

* remove debug code, test=kunlun

* add dependency to avoid potential concurrency error, test=kunlun

9d801855

[Kernel Selection] Remove hard code of PADDLE_WITH_CUDA (#47325) · f9134045

Committed by HongyuJia on Nov 01, 2022

* move cudnn hardcode outside GetExpectedKernelType

* add header file

* debug

* update interpreter_util with hardcode

* update interpreter_util headerfile

* solve activation hardcode

* debug with CI

* add mkldnn_op_list header file

* temporarily uncomment mkldnn

* temporarily uncomment mkldnn

* delete sequence_softmax cudnn hardcode

* add hardcode to data_transfer.cc

* update data_transfer headerfile

* try fix segment fault

* update cudnn&miopen_helper

* reset HasAttr of DygraphExctnCtx

* debug, this commit should pass all CI

* debug should pass CI, temporarily disable activation

* debug should pass CI

* fix default_attr=nullptr bug

* clean debug code

f9134045

[Paddle Inference] add RegisterOutputHook interface (#47050) · db323927
Committed by Yuanle Liu on Nov 01, 2022

db323927
clean mkldnn headerfile (#47507) · a341bb8c
Committed by HongyuJia on Nov 01, 2022

a341bb8c

Fix bugs in tranpose kernel (#47212) · ec7fe888

Committed by limingshu on Nov 01, 2022

* first commit

* transpose_kernel_optimization

* first complishment of transpose op

* second commit

* refine code logics of tranpose_kernel

* refine transpose kernel

* first commit

* fix DtoD copy bugs for hip

* refine code according to the PR advice

* change dim to int64_t type.

* fix some type error

ec7fe888

[PHI]Standardise some C++ API (Part2) (#47510) · 399047d7
Committed by YuanRisheng on Nov 01, 2022
```
* standard_api

* add hardtanh
```
399047d7
fix (#47537) · 957fbb02
Committed by shentanyue on Nov 01, 2022

957fbb02

[CodeStyle][E712] use `if cond`/`if cond is True` for comparison with `True` (#47464) · 5a2ab683

Committed by Nyakku Shigure on Nov 01, 2022

* [CodeStyle][E712] use `if cond`/`if cond is True` for comparison with `True`

* revert changes in fluid

* revert unrelated file

* revert changes in norm

* revert changes in auto_parallel_amp

* fix norm and auto_parallel_amp

* revert a typo fix due to fixed at #47477

5a2ab683
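
For readers unfamiliar with the E712 check named in the commit above, here is a minimal sketch of the three spellings involved. It is not code from the commit; the function names are hypothetical and it relies only on standard Python semantics. It suggests why `== True`, plain truthiness, and `is True` cannot be swapped blindly when a value may not be a real `bool`.

```python
# Hypothetical helpers, written only to illustrate the E712 rule.

def label_equality(flag):
    # Style flagged by E712: equality comparison with True.
    return "on" if flag == True else "off"  # noqa: E712


def label_truthy(flag):
    # Preferred when any truthy value should count.
    return "on" if flag else "off"


def label_identity(flag):
    # Preferred when only the bool singleton True should count.
    return "on" if flag is True else "off"


if __name__ == "__main__":
    for value in (True, 1, 2, "yes", False, 0, None):
        print(repr(value), label_equality(value),
              label_truthy(value), label_identity(value))
```

For an actual `bool` all three agree. For values such as `1`, `2`, or a non-empty string they diverge, so each call site has to be checked for what it really receives before the comparison is rewritten.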

Support custom stream for standalone executor (#47411) · e12b6c04

Committed by Ruibiao Chen on Nov 01, 2022

* [Auto Parallel] Improve the c++ dist attr

* [Auto Parallel] Modify test_program.py

* Support custom stream for standalone executor
Co-authored-by: Yulong Ao <aoyulong@baidu.com>

e12b6c04

fix memory copy in prepare_data of FusedMultiTransformer pass (#47306) · 9ad0e37e
Committed by Kaipeng Deng on Nov 01, 2022
```
* fix memory copy in prepare_data. test=develop
```
9ad0e37e
[Lite][XPU] Upgrade lite subgraph api of xpu (#47373) · 8a1124b1
Committed by shentanyue on Nov 01, 2022

8a1124b1
fix:add no support for cuda_arch<700 (#47509) · 974f8f32
Committed by feng_shuai on Nov 01, 2022

974f8f32

PaddlePaddle / Paddle synced successfully more than 1 year ago
