Commits · a0f473504bdddb414497861335c83b7a10917c68 · PaddlePaddle / Paddle

23 Nov, 2022 1 commit
- add warpctc kernel and change cast_v2 to cast for xpu, test=kunlun (#48134) · 25ffe9c2
  Committed by zhangyikun02 on Nov 23, 2022
21 Nov, 2022 2 commits
- refine reduce_all (#48133) · 56f15c43
  Committed by wanghuancoder on Nov 21, 2022
```
* refine reduce_all
```
- add adamw suppor xpu, test=kunlun (#48114) · 27e252d9
  Committed by taixiurong on Nov 21, 2022
18 Nov, 2022 2 commits

correct sync behavior for XPU distributed training (#47882) · aafa9820

Committed by james on Nov 18, 2022

* correct sync behavior for XPU distributed training

XPU supports an event mechanism similar to CUDA events, so it is advisable to
use an event to sync compute/comm streams for performance. However, this
mechanism has never been fully tested, and inconsistent loss/ending_epochs have
been reported. Therefore, this PR replaces event sync with stream waiting as
a temporary solution.

* remove compile warning

cast and gradient_accumulator support double for xpu, test=kunlun (#47800) · 982d5ff7
Committed by zhangyikun02 on Nov 18, 2022

17 Nov, 2022 2 commits
- [PHI]Standardise some C++ API (Part5) (#47860) · f3650201
  Committed by YuanRisheng on Nov 17, 2022
```
* standard api

* fix xpu bugs
```
- xpu-paddlepaddle-41 [Task] ffn and attention test=kunlun (#46658) · 071708fa
  Committed by taixiurong on Nov 17, 2022
16 Nov, 2022 1 commit

Fix paddle rec, kim, dsin models' bugs (#47792) · e23dfed9

Committed by ykkk2333 on Nov 16, 2022

* add stat tool

* add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun

* embedding and embedding_grad add int32 input, test=kunlun

15 Nov, 2022 1 commit
- [Zero-Dim] support input 0D Tensor for xpu kernel, test=kunlun (#47849) · d4d3d7ed
  Committed by zhouweiwei2014 on Nov 15, 2022
11 Nov, 2022 1 commit
- [Zero-Dim] fix batch_norm op infermeta bug (#47858) · 18549417
  Committed by zhouweiwei2014 on Nov 11, 2022
10 Nov, 2022 6 commits

conv2d_transpose and deformable_conv unrestricted some limit for xpu2, test=kunlun (#47837) · a38fc5e1
Committed by zhangyikun02 on Nov 10, 2022

[PHI]Standardise some C++ API (Part4) (#47702) · 594bd723

Committed by YuanRisheng on Nov 10, 2022

* standard api

* fix sparse bugs

* fix xpu bugs, test=kunlun

* remove hard code for custom unittest

* open ci, test=kunlun

* deal with conflict

[PHI decoupling] remove fluid/framework/generator.h from phi (#47822) · 28c56d77
Committed by Wang Xin on Nov 10, 2022
```
* remove fluid/framework/generator.h from phi

* fix PR-CI-Kunlun-KP-Build fail
```

[PHI Decoupling] remove "paddle/fluid/platform/float16.h" and... · 8164b97a

Committed by huangjiyi on Nov 10, 2022

[PHI Decoupling] remove "paddle/fluid/platform/float16.h" and "paddle/fluid/platform/for_range.h" in phi. (#47817)

* rm "paddle/fluid/platform/float16.h" in phi

* rm "paddle/fluid/platform/for_range.h" in phi

[Zero-Dim] support input 0D Tensor for xpu compare kernel, test=kunlun (#47812) · d01109fc
Committed by zhouweiwei2014 on Nov 10, 2022

XPU multi-card support eager mode (#47445) · 3b91f8f3

Committed by james on Nov 10, 2022

* XPU support eager mode

* add unittest for XPU eager mode

* minor bugfix

* minor bugfix, test=kunlun

* correct copyright info

* 1. remove unsed vars/funcs
2. ProcessGroupBKCL inherit from ProcessGroupStream

* bugfix for fp16 in eager mode multi-card, test=kunlun

* rebase & fix a few issues

* use new processgroup interface, test=kunlun

* fix compile issue, test=kunlun

09 Nov, 2022 1 commit

[PHI decoupling] remove framework/data_type.h from phi (#47776) · 1631836f

Committed by Wang Xin on Nov 09, 2022

* remove framework/data_type.h from phi

* fix CI fail: map proto::VarType to phi::DataType

* refactor code to add more detailed comments

08 Nov, 2022 2 commits
- add adadelta op for xpu, test=kunlun (#47661) · 047971f0
  Committed by zhangyikun02 on Nov 08, 2022
- argsort support n > 16384 and add argsort_grad op for xpu, test=kunlun (#47701) · 6a6a3ff1
  Committed by zhangyikun02 on Nov 08, 2022
07 Nov, 2022 2 commits
- support kldiv_loss/kldiv_loss_grad for kunlun (#47638) · 5f0a8adc
  Committed by QingshuChen on Nov 07, 2022
```
*test=kunlun
```
- add roll and roll_grad kernels and strided_slice and strided_slice_grad... · 5a4d2186
  Committed by ykkk2333 on Nov 07, 2022
```
add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun (#47368)

* add stat tool

* add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun
```
04 Nov, 2022 3 commits
- [XPU] add cumsum op. test=kunlun (#47585) · ac2a94c7
  Committed by houj04 on Nov 04, 2022
```
* [XPU] add cumsum op. test=kunlun

* try to fix linker. test=kunlun

* try to fix linker. test=kunlun

* try to fix linker. test=kunlun

* debug. test=kunlun

* update xpu.cmake. remove unnecessary codes. test=kunlun.
```
- fix deepfm and deep_wide bug, add embedding_sparse_grad kernel, test=kunlun (#47365) · f53e920d
  Committed by ykkk2333 on Nov 04, 2022
- matmul_v2 support new case and fix masked_select bug for xpu, test=kunlun (#47370) · 6916215e
  Committed by zhangyikun02 on Nov 04, 2022
03 Nov, 2022 1 commit
- fix xpu ci bugs, test=kunlun (#47581) · da083436
  Committed by YuanRisheng on Nov 03, 2022
02 Nov, 2022 4 commits
- fix ci bug (#47583) · 0967506e
  Committed by zhangbo9674 on Nov 02, 2022
```
* fix ci bug

* test
```
- [PHI]Standardise some C++ API (Part3) (#47532) · fe8c6796
  Committed by YuanRisheng on Nov 02, 2022
```
* Standardise batch norm

* standardize conv3d and depwise_conv2d

* fix ci bugs
```
- [Zero-Dim] support input 0D Tensor for some binary api (#46909) · cad2e68d
  Committed by zhouweiwei2014 on Nov 02, 2022
- [XPU] add int64 support for slice and subtract. (#47409) · 77395619
  Committed by houj04 on Nov 02, 2022
```
* [XPU] add int64 support for slice and subtract. test=kunlun

* try to fix xpu compile. test=kunlun

* try to fix xpu compile. test=kunlun

* try to fix xpu compile. test=kunlun

* remove unnecessary modification. test=kunlun
```
01 Nov, 2022 3 commits

[PHI]Standardise some C++ API (Part2) (#47510) · 399047d7
Committed by YuanRisheng on Nov 01, 2022
```
* standard_api

* add hardtanh
```
remove unused-local-typedefs warning on linux (#47513) · 96f36962
Committed by Wang Xin on Nov 01, 2022

Adapting device-specific Extra Attributes for the PHI kernel (#46342) · c923e6c9

Committed by Chen Weihang on Oct 31, 2022

* add extra attr property set

* add type_info for all context

* add onednn context to all context

* fix context compile error

* simplify conv kernel args

* pass runtime attr into dev_ctx

* fix marco error

* clear conv_grad_kernel extra args

* merge conv_grad_grad into conv_grad

* clear conv2d_grad_grad extra attrs

* clear yaml and eager extra attr

* fix conv1d error

* change to thread local

* fix npu compile failed

* try to fix windows compile failed

* add conv2d onednn phi kernel

* fix ci bugs (#36)

* fix compile bugs (#38)

* fix extra input transform bug (#39)

* support dynamic created attr (#40)

* reset extra info gen code

* rm conv_grad_grad kernel

* reimpl pass attr adapting

* add int attr support

* remove vector inputnames creating

* fix map at error

* Update paddle/phi/kernels/onednn/conv_grad_kernel.cc
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

* remove useless extra attrs

* replace mkldnn_engine by onednn_engine
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

31 Oct, 2022 1 commit
- [PHI]Standardise some C++ API (#47385) · 60e0c506
  Committed by YuanRisheng on Oct 31, 2022
```
* standard api

* fix ci bugs

* fix ci bugs

* fix ce bugs
```
25 Oct, 2022 1 commit
- [Zero-Dim] support input 0D Tensor for softmax/log_softmax/gumbel_softmax (#47251) · ac3b882f
  Committed by zhouweiwei2014 on Oct 25, 2022
21 Oct, 2022 1 commit
- fix bug of abs_grad in eager mode for kunlun, test=kunlun (#47164) · a9ac608f
  Committed by zhangyikun02 on Oct 21, 2022
18 Oct, 2022 1 commit
- [XPU] update xpu cmake to 1016. test=kunlun (#47041) · 55ac9c46
  Committed by houj04 on Oct 18, 2022
```
* [XPU] update xpu cmake to 1016. test=kunlun

* fix special case of transpose op. test=kunlun
```
17 Oct, 2022 2 commits
- [PHI]Modify DataLayout's namespace from paddle::experimental to phi (#46869) · ec749398
  Committed by YuanRisheng on Oct 17, 2022
```
* namespace modify

* update by comment
```
- Fix the bug of PHI kernel of reduce_sum in kunlun when using eager mode. (#47004) · f9c1cdc1
  Committed by Leo Guo on Oct 17, 2022
```
test=kunlun
```
30 Sep, 2022 1 commit

fix bugs of tipc, test=kunlun (#46540) · d16360c8

Committed by ykkk2333 on Sep 30, 2022

* migrate sigmoid with cross entropy, and tile xpu kernels to phi, test=kunlun

* migrate add_n kernep to phi, test=kunlun

* fix bugs of tipc, test=kunlun

29 Sep, 2022 1 commit

Add index_select, index_select_grad, reduce_min kernel and their unittests for... · 9a1855ff

Committed by Leo Guo on Sep 29, 2022

Add index_select, index_select_grad, reduce_min kernel and their unittests for kunlun. Add registers of index_select, index_select_grad, reduce_min, sqrt, sqrt_grad to xpu2_op_list.test=kunlun. (#46557)

PaddlePaddle / Paddle: synced successfully about 1 year ago
