Commits · d7e74e634de7ec122eeafbaedf87ac48375896d7 · 机器未来 / Paddle

14 Sep, 2022 · 4 commits
- add check_memory_continue kernel (#45999) · a5021e89
  Committed by Leo Chen on Sep 14, 2022
- [AMP] Support AMP-O2 for bfloat16 (#45541) · e8809d99
  Committed by zhangbo9674 on Sep 14, 2022
```
* support bfloat16 for amp_decorate

* add check_finite for bf16

* fix bug

* add ut

* add ut

* refine code
```
- Simplify the codes of conv. (#45966) · 3a5b5048
  Committed by Yiqun Liu on Sep 14, 2022
- [Sparse]Sparse add support gpu (#45974) · da33f7b0
  Committed by zhangkaihuo on Sep 14, 2022

13 Sep, 2022 · 3 commits
- add softmax infer kernel (#45955) · 01888482
  Committed by JingZhuangzhuang on Sep 13, 2022
```
* add softmax infer kernel
```
- migrate squeeze kernel to phi, test=kunlun (#45968) · d3366853
  Committed by ykkk2333 on Sep 13, 2022
- fix transformer bug, test=kunlun (#45927) · e6d397e6
  Committed by ykkk2333 on Sep 13, 2022

09 Sep, 2022 · 7 commits
- Fix namespace error (#45925) · a687b531
  Committed by engineer1109 on Sep 09, 2022
```
paddle::platform::CudaAtomicAdd
https://github.com/PaddlePaddle/Paddle/issues/45881
```
- Fix softmax op when the input shape is larger than INT32_MAX (#45897) · 38edea9a
  Committed by sneaxiy on Sep 09, 2022
```
* fix softmax int64

* follow comments
```
- optimization of max_pool3d forward (#45820) · 2632d77d
  Committed by 5u13 on Sep 09, 2022
- [Phi] Migrate load kernel (#45891) · a001f263
  Committed by Chen Weihang on Sep 09, 2022
```
* migrate load kernel

* remove load op

* fix test failed
```
- [Phi] Add fusion kernel dir and migrate fused_softmax_mask op (#45802) · 2b4f44d5
  Committed by Chen Weihang on Sep 09, 2022
```
* add fusion dir and fuse_softmax_mask kernel

* remove fusion kernel dir

* migrate infershape

* fix code error
```
- modify slice op Infershape (#45855) · 97847ae8
  Committed by xiaoguoguo626807 on Sep 09, 2022
```
* modify slice infershape

* code style

* modify slice_unittest
```
- Simplify size op impl (#45808) · c252b1de
  Committed by Chen Weihang on Sep 09, 2022
```
* simplify size op

* trans to cuda manually

* fix copy error
```

08 Sep, 2022 · 2 commits

- [PHI] Migrate cast, clip+grad and pool+grad oneDNN kernels (#45775) · 1a929c31
  Committed by piotrekobi on Sep 08, 2022

  * gaussian random
  * mkldnn to onednn renaming
  * fix merge conflicts
  * remove fluid code
  * onednn renaming
  * Move classes from mkldnn_reuse.h to onednn_reuse.h
  * Migrate pool+grad, clip+grad and cast oneDNN kernels to PHI
  * Refactor grad kernels into separate files
  * Fix CI failures
  * Fix Codestyle
  * Implement reviewer suggestions
  * Add new lines after includes for readability

  Co-authored-by: NSilv3S <slawomir.siwek@intel.com>

- Migrate roi_align and roi_align_grad to phi. test=kunlun (#45858) · 8add11a0
  Committed by Leo Guo on Sep 08, 2022

07 Sep, 2022 · 9 commits

- [Phi] Migrate save kernel (#45665) · fc66fdb7
  Committed by Chen Weihang on Sep 07, 2022

  * add save kernel
  * add save_sr_kernel
  * remove original save_op
  * add save gpu kernel
  * remove combine kernel
  * add port.h include
  * add save selected rows test
  * remove useless kernel.h

- [XPU] update xdnn to 0907. (#45777) · 1e981d0d
  Committed by houj04 on Sep 07, 2022
```
* [XPU] update xdnn to 0906. test=kunlun

* [XPU] update xdnn to 0907. test=kunlun
```

- [PHI] Migrate reduce sum+grad, mean+grad, min and max oneDNN kernels (#45536) · 22255528
  Committed by piotrekobi on Sep 07, 2022

  * gaussian random
  * mkldnn to onednn renaming
  * fix merge conflicts
  * Migrate reduce_op oneDNN kernels to phi
  * Remove unnecessary header
  * remove fluid code
  * onednn renaming
  * Change std::vector<int64_t> to IntArray
  * Fix code style
  * Move classes from mkldnn_reuse.h to onednn_reuse.h
  * Move more functions from mkldnn_helper.h to onednn_helper.h
  * Change MKLDNN to OneDNN in VLOG message
  * Implement reviewer suggestions

  Co-authored-by: NSilv3S <slawomir.siwek@intel.com>

- [OpAttr]Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose (#45620) · fe169bf1
  Committed by WangZhen on Sep 07, 2022
```
Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose
```
- [XPU] move rnn op to phi. (#45822) · 91631492
  Committed by houj04 on Sep 07, 2022
- Performance fix for broadcast kernel [Part2] (#40051) · 87cba48b
  Committed by limingshu on Sep 07, 2022
```
* first commit

* merged with develop

* merged with develop

* fix merge sequential one dims bugs
```
- [PHI] Migrate scale kernel (#45537) · 429b5b5b
  Committed by Sławomir Siwek on Sep 07, 2022
```
* scale kernel

* endline

* add inplace

* fix merge conflicts

* Merge conflicts
```
- [Sparse]Rename sparse kernel (#45730) · 36739748
  Committed by zhangkaihuo on Sep 07, 2022
- Fix UpdateLossScalingKernel to prevent data transform error (#45809) · c084a7b1
  Committed by sneaxiy on Sep 07, 2022
```
* fix amp kernel

* update to remove PADDLE_WITH_XPU macro
```

06 Sep, 2022 · 10 commits
- [PHI]Add TensorArray for PHI (#45479) · 68f99b78
  Committed by YuanRisheng on Sep 06, 2022
```
* add tensor array

* fix ci bugs

* fix ci bugs

* fix ci bugs

* fix ci bugs

* update by comment

* update code
```
- migrate deformable_conv and merged momentum kernels to phi, test=kunlun (#45691) · 7f3c7aeb
  Committed by ykkk2333 on Sep 06, 2022
- migrate unsqueeze kernels to phi, test=kunlun (#45673) · 4acf1ef7
  Committed by ykkk2333 on Sep 06, 2022
- [Eager, Performance optimization] reduce_all interface move reduce_all flag... · 192b3033
  Committed by Weilong Wu on Sep 06, 2022
```
[Eager, Performance optimization] reduce_all interface move reduce_all flag from python to C++ (#45744)

* [Eager, Performance optimization] move reduce_all flag from python to c++

* polish reduce_all

* fix ci error

* fix errors
```
- [Eager, Performance optimization] Reduce min/max kernel polish (#45755) · a6476418
  Committed by Weilong Wu on Sep 06, 2022
```
* [Eager, Performance optimization] reduce_max / min polish

* polish reduce_max / min

* update min/max kernel reduce_all logic

* fix a mistake

* fix ci errors

* fix errors
```
- elementwise op support fp16 (#45496) · f6d9ec27
  Committed by xiaohemaikoo on Sep 06, 2022
- Fix grad error of groupnorm op when cuda version==11.7 (#45738) · b0a3638f
  Committed by LielinJiang on Sep 06, 2022
```
* fix grad error of groupnorm op when cuda version==11.7
```
- polish xpu enforce msg, test=kunlun (#45749) · b1f1dd05
  Committed by Chen Weihang on Sep 06, 2022
- Completes basic dtypes for collective api in eager mode (#45574) · 7a92e74b
  Committed by Wen Sun on Sep 06, 2022
- [XPU] rmsprop to phi. (#45734) · 1137677a
  Committed by houj04 on Sep 06, 2022

05 Sep, 2022 · 5 commits

- [PHI] Move oneDNN helper classes to new location (#45626) · 269bd1fe
  Committed by piotrekobi on Sep 05, 2022

  * gaussian random
  * mkldnn to onednn renaming
  * fix merge conflicts
  * remove fluid code
  * onednn renaming
  * Move classes from mkldnn_reuse.h to onednn_reuse.h
  * Move more functions from mkldnn_helper.h to onednn_helper.h
  * Change MKLDNN to OneDNN in VLOG message

  Co-authored-by: NSilv3S <slawomir.siwek@intel.com>

- [Bug Fix] fix compile error in gcc540 (#45702) · fd56f08e
  Committed by kangguangli on Sep 05, 2022
```
* fix compile error in gcc540
```
- [OpAttr]Fix compilation error of XPU from Pool2dGradKernel (#45727) · a8da1625
  Committed by Aurelius84 on Sep 05, 2022
```
* [OpAttr]Fix compilation error of XPU from Pool2dGradKernel

* test=kunlun
```

- [phi] Migrate memcpy kernel to PHI, hold NPU op (#45622) · 2f19a364
  Committed by HongyuJia on Sep 05, 2022

  * migrate memcpy to phi
  * fix typo error
  * fix typo error
  * fix bug and testcase
  * fix typo, uniform_random_kernel.cc header
  * fix Alloc pinned bug
  * change GPUContext::GetPinnedPlace
  * add GetPinnedPlace function
  * add GetPinnedPlace function
  * restore default throw error
  * fix Unimplemented error
  * skip StandaloneExecutor testcase
  * delete memcpy_sig

- fix some op int32 exceed range (#45711) · a1dbee23
  Committed by sneaxiy on Sep 05, 2022
