Commits · a5021e894290aca676542094c8a0edcaa52476a5 · PaddlePaddle / Paddle

14 Sep, 2022 · 7 commits
- add check_memory_continue kernel (#45999) · a5021e89
  Committed by Leo Chen on Sep 14, 2022
- [AMP] Support AMP-O2 for bfloat16 (#45541) · e8809d99
  Committed by zhangbo9674 on Sep 14, 2022
```
* support bfloat16 for amp_decorate

* add check_finite for bf16

* fix bug

* add ut

* add ut

* refine code
```
- Fix arm fp16 compile error (#45991) · 9f4f18f2
  Committed by Chen Weihang on Sep 14, 2022
```
* fix arm fp16 compile error

* polish macro impl
```
- support assign op backward refuse forward (#45879) · 65dd828e
  Committed by Charles-hit on Sep 14, 2022
- Simplify the codes of conv. (#45966) · 3a5b5048
  Committed by Yiqun Liu on Sep 14, 2022
- [PHI] Normalize yaml op label (#45976) · e43e4825
  Committed by Chen Weihang on Sep 14, 2022
```
* normalize yaml op label

* revert op_compat yaml change

* fix prelu and rnn compat problem

* replace api by op
```
- [Sparse]Sparse add support gpu (#45974) · da33f7b0
  Committed by zhangkaihuo on Sep 14, 2022
13 Sep, 2022 · 7 commits
- support concat backward refuse forward (#45940) · ff1da188
  Committed by Charles-hit on Sep 13, 2022
- support tile op backward refuse forward (#45942) · c6f173b0
  Committed by Charles-hit on Sep 13, 2022
- support expand_v2 op backward refuse forward (#45941) · 1eefd66a
  Committed by Charles-hit on Sep 13, 2022
- Clear extra attributes of activation op in OpMaker (#45772) · c7b373f2
  Committed by zyfncg on Sep 13, 2022
```
* clear extra attr of activation op in opmaker

* fix syntax bug

* fix mkldnn kernel

* fix merge conflict

* fix bug
```
- add softmax infer kernel (#45955) · 01888482
  Committed by JingZhuangzhuang on Sep 13, 2022
```
* add softmax infer kernel
```
- migrate squeeze kernel to phi, test=kunlun (#45968) · d3366853
  Committed by ykkk2333 on Sep 13, 2022
- fix transformer bug, test=kunlun (#45927) · e6d397e6
  Committed by ykkk2333 on Sep 13, 2022
09 Sep, 2022 · 11 commits
- Fix namespace error (#45925) · a687b531
  Committed by engineer1109 on Sep 09, 2022
```
paddle::platform::CudaAtomicAdd
https://github.com/PaddlePaddle/Paddle/issues/45881
```
- Fix softmax op when the input shape is larger than INT32_MAX (#45897) · 38edea9a
  Committed by sneaxiy on Sep 09, 2022
```
* fix softmax int64

* follow comments
```
- Fix split bug in static mode (#45906) · bd8f998b
  Committed by Charles-hit on Sep 09, 2022
```
* fix split bug in static mode

* modify code style

* modify code style

* add unit test for split
```
- normalize yaml file name (#45894) · 54e1a7cc
  Committed by Chen Weihang on Sep 09, 2022
- [new-exe] convert fused_all_reduce_op_handle to program (#45774) · e755c07e
  Committed by Leo Chen on Sep 09, 2022
```
* add operator<< for BuildStrategy

* add fake_coalesce

* fit allreduce mode for new_exe

* remove debug code

* follow comments
```
- optimization of max_pool3d forward (#45820) · 2632d77d
  Committed by 5u13 on Sep 09, 2022
- [Phi] Migrate load kernel (#45891) · a001f263
  Committed by Chen Weihang on Sep 09, 2022
```
* migrate load kernel

* remove load op

* fix test failed
```
- support cumsum flip reverse backward refuse forward (#45892) · d6b5d91c
  Committed by Charles-hit on Sep 09, 2022
- [Phi] Add fusion kernel dir and migrate fused_softmax_mask op (#45802) · 2b4f44d5
  Committed by Chen Weihang on Sep 09, 2022
```
* add fusion dir and fuse_softmax_mask kernel

* remove fusion kernel dir

* migrate infershape

* fix code error
```
- modify slice op Infershape (#45855) · 97847ae8
  Committed by xiaoguoguo626807 on Sep 09, 2022
```
* modify slice infershape

* code style

* modify slice_unittest
```
- Simplify size op impl (#45808) · c252b1de
  Committed by Chen Weihang on Sep 09, 2022
```
* simplify size op

* trans to cuda manually

* fix copy error
```
08 Sep, 2022 · 4 commits

- [PHI] Migrate cast, clip+grad and pool+grad oneDNN kernels (#45775) · 1a929c31
  Committed by piotrekobi on Sep 08, 2022

* gaussian random

* mkldnn to onednn renaming

* fix merge conflicts

* remove fluid code

* onednn renaming

* Move classes from mkldnn_reuse.h to onednn_reuse.h

* Migrate pool+grad, clip+grad and cast oneDNN kernels to PHI

* Refactor grad kernels into separate files

* Fix CI failures

* Fix Codestyle

* Implement reviewer suggestions

* Add new lines after includes for readability
Co-authored-by: NSilv3S <slawomir.siwek@intel.com>


- Migrate roi_align and roi_align_grad to phi. test=kunlun (#45858) · 8add11a0
  Committed by Leo Guo on Sep 08, 2022
- fix fp16 compile error (#45873) · e56a2853
  Committed by Chen Weihang on Sep 08, 2022
- polish code comment, test=doc (#45859) · 447d79da
  Committed by HongyuJia on Sep 08, 2022

07 Sep, 2022 · 11 commits

- [Phi] Migrate save kernel (#45665) · fc66fdb7
  Committed by Chen Weihang on Sep 07, 2022

* add save kernel

* add save_sr_kernel

* remove original save_op

* add save gpu kernel

* remove combine kernel

* add port.h include

* add save selected rows test

* remove useless kernel.h


- [XPU] update xdnn to 0907. (#45777) · 1e981d0d
  Committed by houj04 on Sep 07, 2022
```
* [XPU] update xdnn to 0906. test=kunlun

* [XPU] update xdnn to 0907. test=kunlun
```
- [Phi] Fix infermeta bug for vector input and output (#45810) · 420d186a
  Committed by Chen Weihang on Sep 07, 2022
```
* fix infermeta bug for vector input and output

* add unittest
```
- fix nullptr bug of BmmGradInferMeta (#45765) · 26d161ef
  Committed by BiynXu on Sep 07, 2022

- [PHI] Migrate reduce sum+grad, mean+grad, min and max oneDNN kernels (#45536) · 22255528
  Committed by piotrekobi on Sep 07, 2022

* gaussian random

* mkldnn to onednn renaming

* fix merge conflicts

* Migrate reduce_op oneDNN kernels to phi

* Remove unnecessary header

* remove fluid code

* onednn renaming

* Change std::vector<int64_t> to IntArray

* Fix code style

* Move classes from mkldnn_reuse.h to onednn_reuse.h

* Move more functions from mkldnn_helper.h to onednn_helpper.h

* Change MKLDNN to OneDNN in VLOG message

* Implement reviewer suggestions
Co-authored-by: NSilv3S <slawomir.siwek@intel.com>


- [OpAttr]Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose (#45620) · fe169bf1
  Committed by WangZhen on Sep 07, 2022
```
Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose
```
- Clear extra attrs of reduce op in OpMaker (#45786) · 63b6a11b
  Committed by zyfncg on Sep 07, 2022
```
* clear extra attrs of reduce op in opmaker

* fix reduce_mean
```
- [XPU] move rnn op to phi. (#45822) · 91631492
  Committed by houj04 on Sep 07, 2022
- Performance fix for broadcast kernel [Part2] (#40051) · 87cba48b
  Committed by limingshu on Sep 07, 2022
```
* first commit

* merged with develop

* merged with develop

* fix merge sequential one dims bugs
```
- [PHI] Migrate scale kernel (#45537) · 429b5b5b
  Committed by Sławomir Siwek on Sep 07, 2022
```
* scale kernel

* endline

* add inplace

* fix merge conflicts

* Merge conflicts
```

- [InferMeta] add compile-time infermeta logic for stack infermeta. (#45528) · 5a4ceb32
  Committed by xiongkun on Sep 07, 2022

* add compile-time infermeta logic for stack infermeta.

* add unittest for stack infermeta where -1 exists in shapes.

* remove backward changes.


PaddlePaddle / Paddle · synced successfully about 1 year ago
