Commits · 7f346a76bc4e5fabdba3e54613f711acdeb74045 · 机器未来 / Paddle

18 Sep, 2022 · 1 commit
- Delete redundant param in SoftmaxFunctor (#46003) · 7f346a76
  Committed by YuanRisheng on Sep 18, 2022
```
* perfect softmax functor

* fix compile bugs

* fix ci bugs
```
17 Sep, 2022 · 1 commit

Fix bug of reduce_sum op. (#46045) · 28b4240b

Committed by Ghost Screaming on Sep 17, 2022

* Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result
is wrong.

* Fix some problems.
1. Change fluid head files to phi files.
2. Delete useless code.
3. Fix code style problems.

* Fix some code style problems.

* Fix some code style problems.
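
The first bullet of this commit describes a result that goes wrong once `input.numel()` exceeds INT32_MAX. The sketch below is not Paddle code, and the kernel change itself is not shown in this log. It only illustrates, using a hypothetical `wrap_int32` helper, how an element count above INT32_MAX turns negative when it is stored in a signed 32-bit integer, which is one plausible way such a bug shows up.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Emulate storing `value` in a signed 32-bit integer (two's complement).

    Hypothetical helper for illustration only; it is not part of Paddle.
    """
    value &= 0xFFFFFFFF  # keep the low 32 bits
    return value - 2**32 if value >= 2**31 else value


numel = INT32_MAX + 5        # assumed element count just above INT32_MAX
wrapped = wrap_int32(numel)  # what a 32-bit counter would hold instead
print(numel, wrapped)
```

A loop bounded by the wrapped value would cover the wrong range of elements, so widening the index type to 64 bits is the usual remedy for this class of bug.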

16 Sep, 2022 · 2 commits

Support broadcast elementwise operators with int64 index type (#45741) · 20b5bf84

Committed by sneaxiy on Sep 16, 2022

* support int64 non-broadcast

* support broadcast case for int64 index

* fix bug

* support more Arity

* remove some codes

* upgrade patchelf to v0.15.0 to pass CI build

* fix bug

* fix patchelf installation

* add debug flags

* remove useless codes

* fix viterbi_decode and set_value op uts

* remove always enable int64
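
For readers unfamiliar with what a broadcast index computation looks like, here is a minimal sketch. It is not taken from the Paddle kernels; `broadcast_offset` is a hypothetical function that maps a flat index in the broadcast output back to a flat offset in an input, assuming row-major layout and shapes of equal rank. Python integers do not overflow, so the example only shows that the flat index itself can exceed INT32_MAX, which is presumably why a 64-bit index type is needed in the real kernels.

```python
def broadcast_offset(out_index, out_shape, in_shape):
    """Map a flat index into the broadcast output to a flat input offset.

    Hypothetical helper: `in_shape` must have the same rank as `out_shape`,
    and each input dim is either equal to the output dim or 1 (broadcast).
    """
    offset = 0
    stride = 1
    # Walk the dims from innermost to outermost (row-major layout).
    for out_dim, in_dim in zip(reversed(out_shape), reversed(in_shape)):
        coord = out_index % out_dim
        out_index //= out_dim
        if in_dim != 1:  # broadcast dims contribute nothing to the offset
            offset += coord * stride
        stride *= in_dim
    return offset


# Output shape (2, 3); the input of shape (1, 3) is broadcast along dim 0.
print(broadcast_offset(4, (2, 3), (1, 3)))

# A flat output index above INT32_MAX still maps to a small input offset.
print(broadcast_offset(2**32 + 7, (3, 2**31), (1, 2**31)))
```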

Correct spelling errors (#46108) · 08186f14
Committed by Zhang Zheng on Sep 16, 2022

15 Sep, 2022 · 5 commits
- updating mul and matmul with set_mem_desc (#45624) · 416e0de7
  Committed by Jacek Czaja on Sep 15, 2022
```
* - mul & matmul changes

- fix

- bs16 correction of strides

* - cosmetic fixes

* - lint

* - fix

* - fix

* - format -> mem_desc

* - fix

* - fix

* - fix

* - fix

* - fix
```
- Optimize flip kernel by eliminating H2D data transfer, test=develop (#46046) · b3283f4c
  Committed by 傅剑寒 on Sep 15, 2022
- Support 0 shapes input Tensor for MKL slice (#45930) · 1d78681d
  Committed by WangZhen on Sep 15, 2022
```
Support 0 shapes input Tensor for MKL slice kernel
```
- Performance fix for broadcast kernel [Part3] (#45854) · f48b1264
  Committed by limingshu on Sep 15, 2022
```
* first commit

* fix some bugs in code

* fix bugs

* to optimize merge one dimension feature
```
- add determine action for embed_grad and index_add. (#46040) · 0c40d889
  Committed by Li Min on Sep 15, 2022
14 Sep, 2022 · 8 commits
- [PHI] Support bmm and bmm_grad in xpu (#45887) · 6bd2762c
  Committed by Jiabin Yang on Sep 14, 2022
```
* support bmm and bmm_grad in xpu

* add error removal

* test=kunlun

* refactor code for better structure

* test=kunlun

* add fp16 kernel for bmm

* test=kunlun
```
- Support fp16 for index_select and index_add (#45601) · 61012a76
  Committed by Li Min on Sep 14, 2022
- Migrate scale and scatter to phi, and modify the code style for... · 1349584e
  Committed by Leo Guo on Sep 14, 2022
```
Migrate scale and scatter to phi, and modify the code style for roi_align_kernel. test=kunlun (#45938)
```
- [XPU] migrate reduce kernels to phi, test=kunlun (#45973) · 5829069d
  Committed by ykkk2333 on Sep 14, 2022
- add check_memory_continue kernel (#45999) · a5021e89
  Committed by Leo Chen on Sep 14, 2022
- [AMP] Support AMP-O2 for bfloat16 (#45541) · e8809d99
  Committed by zhangbo9674 on Sep 14, 2022
```
* support bfloat16 for amp_decorate

* add check_finite for bf16

* fix bug

* add ut

* add ut

* refine code
```
- Simplify the codes of conv. (#45966) · 3a5b5048
  Committed by Yiqun Liu on Sep 14, 2022
- [Sparse]Sparse add support gpu (#45974) · da33f7b0
  Committed by zhangkaihuo on Sep 14, 2022
13 Sep, 2022 · 3 commits
- add softmax infer kernel (#45955) · 01888482
  Committed by JingZhuangzhuang on Sep 13, 2022
```
* add softmax infer kernel
```
- migrate squeeze kernel to phi, test=kunlun (#45968) · d3366853
  Committed by ykkk2333 on Sep 13, 2022
- fix transformer bug, test=kunlun (#45927) · e6d397e6
  Committed by ykkk2333 on Sep 13, 2022
09 Sep, 2022 · 7 commits
- Fix namespace error (#45925) · a687b531
  Committed by engineer1109 on Sep 09, 2022
```
paddle::platform::CudaAtomicAdd
https://github.com/PaddlePaddle/Paddle/issues/45881
```
- Fix softmax op when the input shape is larger than INT32_MAX (#45897) · 38edea9a
  Committed by sneaxiy on Sep 09, 2022
```
* fix softmax int64

* follow comments
```
- optimization of max_pool3d forward (#45820) · 2632d77d
  Committed by 5u13 on Sep 09, 2022
- [Phi] Migrate load kernel (#45891) · a001f263
  Committed by Chen Weihang on Sep 09, 2022
```
* migrate load kernel

* remove load op

* fix test failed
```
- [Phi] Add fusion kernel dir and migrate fused_softmax_mask op (#45802) · 2b4f44d5
  Committed by Chen Weihang on Sep 09, 2022
```
* add fusion dir and fuse_softmax_mask kernel

* remove fusion kernel dir

* migrate infershape

* fix code errror
```
- modify slice op Infershape (#45855) · 97847ae8
  Committed by xiaoguoguo626807 on Sep 09, 2022
```
* modify slice infershape

* code style

* modify slice_unittest
```
- Simplify size op impl (#45808) · c252b1de
  Committed by Chen Weihang on Sep 09, 2022
```
* simplify size op

* trans to cuda manuly

* fix copy error
```
08 Sep, 2022 · 2 commits

[PHI] Migrate cast, clip+grad and pool+grad oneDNN kernels (#45775) · 1a929c31

Committed by piotrekobi on Sep 08, 2022

* gaussian random

* mkldnn to onednn renaming

* fix merge conflicts

* remove fluid code

* onednn renaming

* Move classes from mkldnn_reuse.h to onednn_reuse.h

* Migrate pool+grad, clip+grad and cast oneDNN kernels to PHI

* Refactor grad kernels into separate files

* Fix CI failures

* Fix Codestyle

* Implement reviewer suggestions

* Add new lines after includes for readability
Co-authored-by: NSilv3S <slawomir.siwek@intel.com>

Migrate roi_align and roi_align_grad to phi. test=kunlun (#45858) · 8add11a0
Committed by Leo Guo on Sep 08, 2022

07 Sep, 2022 · 9 commits

[Phi] Migrate save kernel (#45665) · fc66fdb7

Committed by Chen Weihang on Sep 07, 2022

* add save kernel

* add save_sr_kernel

* remove original save_op

* add save gpu kernel

* remove combine kernel

* add port.h include

* add save selected rows test

* remove useless kernel.h

[XPU] update xdnn to 0907. (#45777) · 1e981d0d
Committed by houj04 on Sep 07, 2022
```
* [XPU] update xdnn to 0906. test=kunlun

* [XPU] update xdnn to 0907. test=kunlun
```

[PHI] Migrate reduce sum+grad, mean+grad, min and max oneDNN kernels (#45536) · 22255528

Committed by piotrekobi on Sep 07, 2022

* gaussian random

* mkldnn to onednn renaming

* fix merge conflicts

* Migrate reduce_op oneDNN kernels to phi

* Remove unnecessary header

* remove fluid code

* onednn renaming

* Change std::vector<int64_t> to IntArray

* Fix code style

* Move classes from mkldnn_reuse.h to onednn_reuse.h

* Move more functions from mkldnn_helper.h to onednn_helpper.h

* Change MKLDNN to OneDNN in VLOG message

* Implement reviewer suggestions
Co-authored-by: NSilv3S <slawomir.siwek@intel.com>

[OpAttr]Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose (#45620) · fe169bf1
Committed by WangZhen on Sep 07, 2022
```
Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose
```
[XPU] move rnn op to phi. (#45822) · 91631492
Committed by houj04 on Sep 07, 2022

Performance fix for broadcast kernel [Part2] (#40051) · 87cba48b
Committed by limingshu on Sep 07, 2022
```
* first commit

* merged with develop

* merged with develop

* fix merge sequential one dims bugs
```
[PHI] Migrate scale kernel (#45537) · 429b5b5b
Committed by Sławomir Siwek on Sep 07, 2022
```
* scale kernel

* endline

* add inplace

* fix merge conflicts

* Merge conflicts
```
[Sparse]Rename sparse kernel (#45730) · 36739748
Committed by zhangkaihuo on Sep 07, 2022

Fix UpdateLossScalingKernel to prevent data transform error (#45809) · c084a7b1
Committed by sneaxiy on Sep 07, 2022
```
* fix amp kernel

* update to remove PADDLE_WITH_XPU macro
```

06 Sep, 2022 · 2 commits
- [PHI]Add TensorArray for PHI (#45479) · 68f99b78
  Committed by YuanRisheng on Sep 06, 2022
```
* add tensor array

* fix ci bugs

* fix ci bugs

* fix ci bugs

* fix ci bugs

* update by comment

* update code
```
- migrate deformable_conv and merged momentum kernels to phi, test=kunlun (#45691) · 7f3c7aeb
  Committed by ykkk2333 on Sep 06, 2022

机器未来 / Paddle is in sync with its fork source project
