Commits · ac2a94c768be26d11e36270652f154deb2ea4749 · PaddlePaddle / Paddle

04 Nov, 2022 10 commits

[XPU] add cumsum op. test=kunlun (#47585) · ac2a94c7

Committed by houj04 on Nov 04, 2022

* [XPU] add cumsum op. test=kunlun

* try to fix linker. test=kunlun

* try to fix linker. test=kunlun

* try to fix linker. test=kunlun

* debug. test=kunlun

* update xpu.cmake. remove unnecessary codes. test=kunlun.

ac2a94c7

add cudnn error (#47666) · eb9e4601
Committed by pangyoki on Nov 04, 2022

eb9e4601

migrate convs (#47658) · 4a4f3f80
Committed by Sławomir Siwek on Nov 04, 2022

4a4f3f80

[PHI] Migrate pool2d and pool2d_grad kernels (#47423) · ca4bed7b

Committed by Piotr Paturej on Nov 04, 2022

* add extra attr property set

* add type_info for all context

* add onednn context to all context

* fix context compile error

* simplify conv kernel args

* pass runtime attr into dev_ctx

* fix macro error

* clear conv_grad_kernel extra args

* merge conv_grad_grad into conv_grad

* clear conv2d_grad_grad extra attrs

* clear yaml and eager extra attr

* fix conv1d error

* change to thread local

* fix npu compile failed

* try to fix windows compile failed

* add conv2d onednn phi kernel

* fix ci bugs (#36)

* fix compile bugs (#38)

* fix extra input transform bug (#39)

* support dynamic created attr (#40)

* reset extra info gen code

* rm conv_grad_grad kernel

* reimpl pass attr adapting

* add int attr support

* remove vector inputnames creating

* fix map at error

* Update paddle/phi/kernels/onednn/conv_grad_kernel.cc
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

* remove useless extra attrs

* replace mkldnn_engine by onednn_engine

* Migrate pool+grad to PHI

* Update paddle/fluid/operators/mkldnn/test_mkldnn_op_nhwc.cc
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

* Update paddle/phi/kernels/onednn/pool_grad_kernel.cc
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

* Update paddle/phi/kernels/onednn/pool_kernel.cc
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: Chen Weihang <chenwhpro@163.com>
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

ca4bed7b

[PHI] Migrate softplus kernel (#47406) · 1831919f

Committed by Sławomir Siwek on Nov 04, 2022

* add extra attr property set

* add type_info for all context

* add onednn context to all context

* fix context compile error

* simplify conv kernel args

* pass runtime attr into dev_ctx

* fix macro error

* clear conv_grad_kernel extra args

* merge conv_grad_grad into conv_grad

* clear conv2d_grad_grad extra attrs

* remove redundant imports

* migrate softmax

* clear yaml and eager extra attr

* fix conv1d error

* change to thread local

* fix npu compile failed

* try to fix windows compile failed

* add conv2d onednn phi kernel

* fix ci bugs (#36)

* fix compile bugs (#38)

* fix extra input transform bug (#39)

* support dynamic created attr (#40)

* reset extra info gen code

* rm conv_grad_grad kernel

* reimpl pass attr adapting

* add int attr support

* remove vector inputnames creating

* merge dev

* fix map at error

* adjust attribute

* adapt funcs to PHI

* init

* adjust imports

* support postops

* format codeblocks

* revert changes to softmax
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>

1831919f

remove global var (#47659) · 7fe7eebc
Committed by LiYuRio on Nov 04, 2022

7fe7eebc

fix deepfm and deep_wide bug, add embedding_sparse_grad kernel, test=kunlun (#47365) · f53e920d
Committed by ykkk2333 on Nov 04, 2022

f53e920d
Optimized oneDNN FC and added operator+unsqueeze2 and operator+reshape2 oneDNN fuse passes (#47391) · 9e006987
Committed by jakpiase on Nov 04, 2022
```
* tmp save

* minor change

* CI fix

* added FC optimizations

* latest update

* CI fix

* fixed bug with fusing fc
```
9e006987

matmul_v2 support new case and fix masked_select bug for xpu, test=kunlun (#47370) · 6916215e
Committed by zhangyikun02 on Nov 04, 2022

6916215e
fix cc_library link python lib (#47605) · cd59c10c
Committed by wanghuancoder on Nov 04, 2022
```
* fix cc_library link python lib
```
cd59c10c

03 Nov, 2022 19 commits

[Paddle Inference]disable_lookup_table_v2 (#47515) · f778c170
Committed by Wangzheee on Nov 03, 2022
```
* disable_lookup_table_v2
```
f778c170
fix paddle ci script to show more debug log (#47599) · dce81ee1
Committed by zhouweiwei2014 on Nov 03, 2022

dce81ee1

Weight and bias's stop_gradient of BatchNorm must be True or False at the same time (#47634) · 21277904
Committed by wanghuancoder on Nov 03, 2022

21277904

Fix oneDNN elementwise_sub dnnl_error in unit test (#47237) · 30c7758f

Committed by Piotr Paturej on Nov 03, 2022

* Fix dnnl errors in elementwise_sub tests

* Fix model accuracy attempt

* Add new fix

* Add proper fix

* Refactor by removing code repetition

30c7758f

Test CUDNN Frontend Build (#47612) · 605bc003
Committed by Tian Zheng on Nov 03, 2022

605bc003

Improve performance of coalesce_tensor and depend op in standalone executor (#47606) · 5fb1e824

Committed by Ruibiao Chen on Nov 03, 2022

* Dispatch computation OPs before communication in standalone executor

* Update code

* Fix CI errors

* Improve performance of coalesce_tensor and depend OP in standalone executor

* pre-commit check

5fb1e824

sparse attention kernel is used from 11.8 (#47594) · 7648f429
Committed by zhouweiwei2014 on Nov 03, 2022

7648f429
Fix ComputePropagateScalesMkldnnPass of MKLDNN (#47574) · 5fc92943
Committed by yeliang2258 on Nov 03, 2022
```
* add constant_folding_pass pass for mkldnn int8

* update UpdateScaleOpInOutScales
```
5fc92943

[CodeStyle][py2][U008] remove unnecessary args in `super()` (#47549) · 3de3e45e

Committed by Nyakku Shigure on Nov 03, 2022

* [CodeStyle][py2][U008] remove unnecessary args in `super()`

* remove remained args

* revert changes in test_pylayer_op

* Revert "revert changes in test_pylayer_op"

This reverts commit ff185a9ae738afac3b0264f61bde6c6b7f72e7c4.

* revert some changes in example code

3de3e45e

clean unused code: save_load_util.cc/.h (#47588) · 6d0f730d
Committed by Leo Chen on Nov 03, 2022

6d0f730d

[Opt Kernel Selection] Opt CanMKLDNNBeUsed performance (#47563) · 9adad42d

Committed by HongyuJia on Nov 03, 2022

* opt CanMKLDNNBeUsed performance

* fix nullptr bug

* fix OpBase default_attrs=nullptr bug

* fix OpBase default_attrs=nullptr bug

* fix OpBase default_attrs=nullptr bug

9adad42d

fix gemm compute_type (#47613) · 954be40d
Committed by sneaxiy on Nov 03, 2022

954be40d

[PHI] Migrate softmax kernel (#47339) · b8ae3858

Committed by Sławomir Siwek on Nov 03, 2022

* add extra attr property set

* add type_info for all context

* add onednn context to all context

* fix context compile error

* simplify conv kernel args

* pass runtime attr into dev_ctx

* fix macro error

* clear conv_grad_kernel extra args

* merge conv_grad_grad into conv_grad

* clear conv2d_grad_grad extra attrs

* remove redundant imports

* migrate softmax

* clear yaml and eager extra attr

* fix conv1d error

* change to thread local

* fix npu compile failed

* try to fix windows compile failed

* add conv2d onednn phi kernel

* fix ci bugs (#36)

* fix compile bugs (#38)

* fix extra input transform bug (#39)

* support dynamic created attr (#40)

* reset extra info gen code

* rm conv_grad_grad kernel

* reimpl pass attr adapting

* add int attr support

* remove vector inputnames creating

* merge dev

* fix map at error

* adjust attribute

* adapt funcs to PHI
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>

b8ae3858

[Sparse] Unified api args name (#47529) · f9a0605d
Committed by zhangkaihuo on Nov 03, 2022

f9a0605d

add cuDNN dynamic-link libraries into wheel package (#47552) · 1a404657

Committed by pangyoki on Nov 03, 2022

* add cudnn into whl package

* add cudnn dso into whl package

* let WITH_CUDNN_DSO be consistent with WITH_GPU

* fix WITH_CUDNN_DSO in paddle_build

1a404657

[Zero-Dim] support input 0D Tensor for min/max/amin/amax/prod/logsumexp/all/any (#47501) · a7509ce3
Committed by zhouweiwei2014 on Nov 03, 2022

a7509ce3

bug fix (#47611) · 5160628c
Committed by wenbin on Nov 03, 2022

5160628c
[CodeStyle] remove unused-variable warning in linux (#47558) · e67d6f17
Committed by Wang Xin on Nov 03, 2022
```
* remove unused-variable warning in linux

* fix unused-variable error in GpuPS
```
e67d6f17

fix xpu ci bugs, test=kunlun (#47581) · da083436
Committed by YuanRisheng on Nov 03, 2022

da083436

02 Nov, 2022 11 commits
- Revert "[Kernel Selection] Remove hard code of PADDLE_WITH_CUDA (#47325)" (#47582) · a57a19ea
  Committed by HongyuJia on Nov 02, 2022
```
This reverts commit f9134045.
```
  a57a19ea
- [inference][trt] bilinear support OutSize input (#47495) · c061c082
  Committed by Zhang Jun on Nov 02, 2022
```
* add bilinear OutSize
```
  c061c082
- fix link order (#47584) · 05a4be36
  Committed by Leo Chen on Nov 02, 2022
  
  05a4be36
- fix ci bug (#47583) · 0967506e
  Committed by zhangbo9674 on Nov 02, 2022
```
* fix ci bug

* test
```
  0967506e
- Logsigmoid and Tanhshrink ops convert to trt (#47322) · b045fdfb
  Committed by 丁一 on Nov 02, 2022
  
  b045fdfb
- Dispatch computation OPs before communication in standalone executor (#47471) · 5ed487bf
  Committed by Ruibiao Chen on Nov 02, 2022
```
* Dispatch computation OPs before communication in standalone executor

* Update code

* Fix CI errors
```
  5ed487bf
- fix amax/amin/max/min write overflow (#47570) · 6f7a80c3
  Committed by Tao Luo on Nov 02, 2022
  
  6f7a80c3
- Add storage properties into DenseTensor for supporting extra device properties (#47527) · 246fb841
  Committed by Chen Weihang on Nov 02, 2022
```
* add storage properties for npu

* fix compile failed

* fix api name mismatch

* polish design
```
  246fb841
- [PHI]Standardise some C++ API (Part3) (#47532) · fe8c6796
  Committed by YuanRisheng on Nov 02, 2022
```
* Standardise batch norm

* standardize conv3d and depwise_conv2d

* fix ci bugs
```
  fe8c6796
- [Zero-Dim] support input 0D Tensor for some binary api (#46909) · cad2e68d
  Committed by zhouweiwei2014 on Nov 02, 2022
  
  cad2e68d
- Improve the tool for checking nan and inf, and support to compute the max, min... · ad39043f
  Committed by Yiqun Liu on Nov 02, 2022
```
Improve the tool for checking nan and inf, and support to compute the max, min and mean of output tensor. (#47095)

* Improve the tool for checking nan and inf, and support to compute the max, min and mean of output tensor.

* Add a FLAGS to control whether abort when meets inf/nan and polish codes.

* Fix unittest.

* Change the computing of mean.
```
  ad39043f

PaddlePaddle / Paddle · synced successfully more than 1 year ago
