提交 · 5f6b9f1b04a1e167e41774f225f91fa34cf418b7 · BaiXuePrincess / Paddle

18 10月, 2022 1 次提交
- W
  [Cherry pick] trt pool2d adaptive ifx (#47069) · 5f6b9f1b
  由 Wang Bojun 提交于 10月 18, 2022
```
* draft with debug print
* remove debug print
* bug fix for ci
```
  5f6b9f1b
17 10月, 2022 3 次提交

Z
[cherry-pick]Sparse static graph (#46838) · 10225d22
由 zhangkaihuo 提交于 10月 17, 2022
```
cherry-pick : #46322, #46245
Sparse API 支持静态图
```
10225d22

Optimize performance of depthwise_conv (#46896) · 976af0da

由 Zhang Zheng 提交于 10月 17, 2022

Optimize performance of depthwise_conv

Config: input[2048, 1024, 4, 4], filter[1024, 1, 4, 4], stride=1, pad=0, dilation=1

976af0da

[Cherry-Pick]Move valid check from python to kernel (#46980) · 8bfd45ad

由 Zhang Zheng 提交于 10月 17, 2022

为了提升性能，将label的边界检查从python端转移到kernel内，减少额外op的调用，如min、max和同步拷贝等
    当前的模板参数IgnoreIndex仅在ignore_index取值范围在[0, dim)时才生效，但是当某个label值超出了边界，ignore_index等于该label，这种情况下是应该仍然能正常计算。虽然当前的计算逻辑在结果上不会出错，但逻辑上仍是有问题的，且模板参数IgnoreIndex是没有必要的

8bfd45ad

13 10月, 2022 1 次提交

[cherry-pick] [PHI] transpose2_grad op migration (#46139) (#46873) · 0280c0b9

由 Sławomir Siwek 提交于 10月 13, 2022

* Revert pool+grad oneDNN kernel conversion (#45989)

* [PHI] transpose2_grad op migration (#46139)

* op migrated, Copy(OneDNNContext, ...) added

* mutable_data & op registration in fluid removed

* refactoring

* OneDNNGetDataType to uppercase

* missing cpu check added, handler moved to .h file

* name changed to transpose_grad

* Copy changed back to TensorCopy

* Resizing corrected, Copy(OneDNNContext) removed
Co-authored-by: NPiotr Paturej <48731682+piotrekobi@users.noreply.github.com>
Co-authored-by: NPaulina Gacek <paulina.gacek@intel.com>

0280c0b9

11 10月, 2022 6 次提交
- F
  
  set_value_op: add support for complex types (#46885) · b051455f
  由 Feiyu Chan 提交于 10月 11, 2022
  
  b051455f
- S
  
  add seed check (#46858) · 2190da20
  由 Sławomir Siwek 提交于 10月 11, 2022
  
  2190da20
- S
  
  hard_swish grad (#46857) · 2c6bd4ad
  由 Sławomir Siwek 提交于 10月 11, 2022
  
  2c6bd4ad
- S
  [cherry-pick] [PHI] relu6_grad kernel (#46501) (#46862) · 2bcbf8b0
  由 Sławomir Siwek 提交于 10月 11, 2022
```
* [PHI] Migrate gelu kernels (#45596)

* gaussian random

* mkldnn to onednn renaming

* fix merge conflicts

* remove fluid code

* onednn renaming

* gelu fwd

* sort activations

* gelu gradient

* remove unused macros

* merge conflicts

* fix merge conflicts

* remove extra contraint from gelu op

* [PHI] relu6_grad kernel (#46501)

* Relu6

* remove fluid handler

* add individual kernel signature

* coding style

* replace bounded_relu with clip

* whitespace

* code style
```
  2bcbf8b0
- S
  Revert pool+grad oneDNN kernel conversion (#45989) (#46860) · 7b3837e6
  由 Sławomir Siwek 提交于 10月 11, 2022
```
Co-authored-by: NPiotr Paturej <48731682+piotrekobi@users.noreply.github.com>
```
  7b3837e6
- Y
  [BugFix]Fix concat bugs when call onednn kernel (#46518) (#46845) · 6a6c7493
  由 YuanRisheng 提交于 10月 11, 2022
```
* fix concat bug

* fix ci bugs

* fix ci bugs
```
  6a6c7493
10 10月, 2022 5 次提交

[cherry-pick] [PHI] Migrate concat+grad, expand+grad, fill_constant … oneDNN... · fdd0d6d0

由 Sławomir Siwek 提交于 10月 10, 2022

[cherry-pick] [PHI] Migrate concat+grad, expand+grad, fill_constant … oneDNN kernels (#45863) (#46727)

* [PHI] Migrate concat+grad, expand+grad, fill_constant, nearest_interp and bilinear_interp oneDNN kernels (#45863)

* Migrate concat+grad, expand+grad, fill_constant, nearest_interp_v2 and bilinear_interp_v2 oneDNN kernels to PHI

* Remove old namespace variable

* Fix invalid out dims error

* Add mutable_data method to concat output

* Add check for -1 dim before computing out_dims

* Capitalize oneDNNGetDataType function name

* Change fill_constant kernel to correct PHI kernel

* Attempt to fix dims error

* Fix fill_constant (full) kernel

* update dependencies
Co-authored-by: NPiotr Paturej <48731682+piotrekobi@users.noreply.github.com>

fdd0d6d0

[cherry-pick] [PHI] Migrate sgd and stack oneDNN kernels (#46374) (#46729) · 25d61cd1

由 Sławomir Siwek 提交于 10月 10, 2022

* [PHI] Migrate sgd and stack oneDNN kernels (#46374)

* Convert slice+grad oneDNN fluid kernels to PHI

* Change mutable_data to Alloc

* Refactor licences

* update dependencies
Co-authored-by: NPiotr Paturej <48731682+piotrekobi@users.noreply.github.com>

25d61cd1

[PHI] Migrate slice, slice_grad, split, pad and pad3d oneDNN kernels (#46101) (#46726) · 51a91fee

由 Sławomir Siwek 提交于 10月 10, 2022

* Convert split, pad and pad3d kernels

* Convert slice+grad oneDNN fluid kernels to PHI

* change out->mutable_data to dev_ctx.Alloc
Co-authored-by: NPiotr Paturej <48731682+piotrekobi@users.noreply.github.com>

51a91fee

S
[PHI] migrate softmax_grad kernel (#46257) (#46725) · 44ecae6c
由 Sławomir Siwek 提交于 10月 10, 2022
```
* init

* remove softmaxop

* merge dev

* correct dir

* style
```
44ecae6c

[PHI] Shape op migration (#46051) (#46724) · 3cc3f60f

由 Sławomir Siwek 提交于 10月 10, 2022

* First approach

* Shape kernel corrected

* Compilation error fixed

* Resize corrected

* Registered types added

* Mistake corrected & types added

* sum kernel deleted
Co-authored-by: NPaulina Gacek <paulina.gacek.pl@gmail.com>

3cc3f60f

29 9月, 2022 2 次提交
- 傅
  [cherry-pick] Add FP16 support for uniform in dygraph mode on Nvidia GPU (#46641) · a58663f3
  由傅剑寒提交于 9月 29, 2022
```
Add FP16 support for uniform in dygraph mode on Nvidia GPU
Dev PR link PR46212
```
  a58663f3
- L
  [CherryPick][Fix] Remove std::trunc() in FloorDivideFunctor and... · f5956bec
  由 Lin Manhui 提交于 9月 29, 2022
```
[CherryPick][Fix] Remove std::trunc() in FloorDivideFunctor and InverseFloorDivideFunctor (#45051) (#46504)
```
  f5956bec
27 9月, 2022 1 次提交
- Z
  
  fix shard_index kernel (#46491) (#46511) · 5711bbee
  由 zhaoyingli 提交于 9月 27, 2022
  
  5711bbee
20 9月, 2022 6 次提交

H
[cherry-pick][xpu] update xdnn activations (#46282) · a43f960e
由 houj04 提交于 9月 20, 2022
```
* [XPU] update xdnn activations. (#46246)

* [XPU] update xpu cmake. test=kunlun
```
a43f960e
H
[PolishComments] Polish some code comments (#46032) (#46261) · 42e56f65
由 HongyuJia 提交于 9月 20, 2022
```
* polish code comments

* polish data_device_transform.cc
```
42e56f65

[Cherry-pick] Fix amp error cp (#46272) · da173c40

由 Jiabin Yang 提交于 9月 20, 2022

* [Eager] Fix ocr (#46124)

* fix linspace error in amp

* fix log

* fix amp error

* fix ocr error which caused by amp

* add more check

* rename dtype ns

* [Eager Bug fix]Fix Detection (#46147)

* fix linspace error in amp

* fix log

* fix amp error

* Revert "Simplify size op impl (#45808)"

This reverts commit c252b1de.

* fix_seg

* fix detection
Co-authored-by: NChen Weihang <sunny_cwh@163.com>
Co-authored-by: NChen Weihang <sunny_cwh@163.com>

da173c40

[Release/2.4][Cherry-pick] Fix bug of reduce_sum op (#46160) · 759736df

由 Ghost Screaming 提交于 9月 20, 2022

* Fix bug of reduce_sum op. When input.numel() > INT32_MAX,
its result is wrong.

* Cherry-pick of PR 46045

* Fix bug of reduce_sum kp op.

* Fix bug of reduce_sum kp operator compilation.
If compilation device is XPU, eigen kernel should be ignored.

759736df

[Cherry-pick] Sparse add InferMeta (#46235) · fd8ec4a1

由 zhangkaihuo 提交于 9月 20, 2022

cherry-pick : #46016, #46021, #45974

* [Sparse]Sparse add support gpu (#45974)

* [Sparse]Remove unused code (#46021)

* [Sparse] Add infer meta (#46016)

fd8ec4a1

Fix wrong eigen header include (#46082) (#46202) · ac8cce20

由 zyfncg 提交于 9月 20, 2022

* fix wrong eigen header include

* fix complie bug

* fix nan_inf_utils_detail

* fix resource_manager

* fix conv_miopen_helper

ac8cce20

19 9月, 2022 4 次提交
- R
  [vision.ops.nms] Fix return order error and duplicate results with specific... · be84cac7
  由 RichardWooSJTU 提交于 9月 19, 2022
```
[vision.ops.nms] Fix return order error and duplicate results with specific inputs (#46148) (#46193)

* fix return order error and duplicate results with specific inputs
```
  be84cac7
- J
  [Cherry-pick] Support bmm and bmm_grad in xpu (#45887) (#46132) · 1c7e95cc
  由 Jiabin Yang 提交于 9月 19, 2022
```
* [PHI] Support bmm and bmm_grad in xpu (#45887)

* support bmm and bmm_grad in xpu

* add error removal

* test=kunlun

* refactor code for better structure

* test=kunlun

* add fp16 kernel for bmm

* test=kunlun

* test=kunlun
```
  1c7e95cc
- S
  
  fix broadcast kernel (#46158) · 860f6077
  由 sneaxiy 提交于 9月 19, 2022
  
  860f6077
- C
  Revert "Simplify size op impl (#45808)" (#46168) · dabb8f23
  由 Chen Weihang 提交于 9月 19, 2022
```
This reverts commit c252b1de.
```
  dabb8f23
15 9月, 2022 1 次提交
- W
  Support 0 shapes input Tensor for MKL slice (#45930) (#46072) · 903c87bd
  由 WangZhen 提交于 9月 15, 2022
```
Support 0 shapes input Tensor for MKL slice kernel
```
  903c87bd
14 9月, 2022 2 次提交
- [chery-pick] Fix namespace error (#45925) (#46029) · 925e84bf
  由 engineer1109 提交于 9月 14, 2022
```
修复cuda11.7编译出错的问题
```
  925e84bf
- Y
  
  fix transformer bug, test=kunlun (#45983) · 20d168d9
  由 ykkk2333 提交于 9月 14, 2022
  
  20d168d9
13 9月, 2022 1 次提交
- J
  
  cherry pick softmax infer kernel (#45957) · 0903020d
  由 JingZhuangzhuang 提交于 9月 13, 2022
  
  0903020d
09 9月, 2022 3 次提交
- C
  [Phi] Add fusion kernel dir and migrate fused_softmax_mask op (#45802) · 2b4f44d5
  由 Chen Weihang 提交于 9月 09, 2022
```
* add fusion dir and fuse_softmax_mask kernel

* remove fusion kernel dir

* migrate infershape

* fix code errror
```
  2b4f44d5
- X
  modify slice op Infershape (#45855) · 97847ae8
  由 xiaoguoguo626807 提交于 9月 09, 2022
```
* modify slice infershape

* code style

* modify slice_unittest
```
  97847ae8
- C
  Simplify size op impl (#45808) · c252b1de
  由 Chen Weihang 提交于 9月 09, 2022
```
* simplify size op

* trans to cuda manuly

* fix copy error
```
  c252b1de
08 9月, 2022 2 次提交

[PHI] Migrate cast, clip+grad and pool+grad oneDNN kernels (#45775) · 1a929c31

由 piotrekobi 提交于 9月 08, 2022

* gaussian random

* mkldnn to onednn renaming

* fix merge conflicts

* remove fluid code

* onednn renaming

* Move classes from mkldnn_reuse.h to onednn_reuse.h

* Migrate pool+grad, clip+grad and cast oneDNN kernels to PHI

* Refactor grad kernels into separate files

* Fix CI failures

* Fix Codestyle

* Implement reviewer suggestions

* Add new lines after includes for readability
Co-authored-by: NSilv3S <slawomir.siwek@intel.com>

1a929c31

L

Migrate roi_align and roi_align_grad to phi. test=kunlun (#45858) · 8add11a0
由 Leo Guo 提交于 9月 08, 2022

8add11a0

07 9月, 2022 2 次提交

[Phi] Migrate save kernel (#45665) · fc66fdb7

由 Chen Weihang 提交于 9月 07, 2022

* add save kernel

* add save_sr_kernel

* remove original save_op

* add save gpu kernel

* remove combine kernel

* add port.h include

* add save selected rows test

* remove useless kernel.h

fc66fdb7

H
[XPU] update xdnn to 0907. (#45777) · 1e981d0d
由 houj04 提交于 9月 07, 2022
```
* [XPU] update xdnn to 0906. test=kunlun

* [XPU] update xdnn to 0907. test=kunlun
```
1e981d0d

BaiXuePrincess / Paddle 与 Fork 源项目一致

BaiXuePrincess / Paddle
与 Fork 源项目一致