提交 · 981d1a1067098f9de39cf19eeaec9d049e1fdd8a · PaddlePaddle / Paddle

10 11月, 2022 1 次提交

XPU multi-card support eager mode (#47445) · 3b91f8f3

由 james 提交于 11月 10, 2022

* XPU support eager mode

* add unittest for XPU eager mode

* minor bugfix

* minor bugfix, test=kunlun

* correct copyright info

* 1. remove unsed vars/funcs
2. ProcessGroupBKCL inherit from ProcessGroupStream

* bugfix for fp16 in eager mode multi-card, test=kunlun

* rebase & fix a few issues

* use new processgroup interface, test=kunlun

* fix compile issue, test=kunlun

3b91f8f3

08 11月, 2022 2 次提交
- Z
  
  add adadelta op for xpu, test=kunlun (#47661) · 047971f0
  由 zhangyikun02 提交于 11月 08, 2022
  
  047971f0
- Z
  
  argsort support n > 16384 and add argsort_grad op for xpu, test=kunlun (#47701) · 6a6a3ff1
  由 zhangyikun02 提交于 11月 08, 2022
  
  6a6a3ff1
07 11月, 2022 3 次提交

Q
support kldiv_loss/kldiv_loss_grad for kunlun (#47638) · 5f0a8adc
由 QingshuChen 提交于 11月 07, 2022
```
*test=kunlun
```
5f0a8adc

add roll and roll_grad kernels and strided_slice and strided_slice_grad... · 5a4d2186

由 ykkk2333 提交于 11月 07, 2022

add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun (#47368)

* add stat tool

* add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun

5a4d2186

[Restore PR] Remove hard code of PADDLE_WITH_CUDA (#47630) · 908a381d

由 HongyuJia 提交于 11月 07, 2022

* move cudnn hardcode outside GetExpectedKernelType

* add header file

* debug

* update interpreter_util with hardcode

* update interpreter_util headerfile

* solve activation hardcode

* debug with CI

* add mkldnn_op_list header file

* temporarily uncomment mkldnn

* temporarily uncomment mkldnn

* delete sequence_softmax cudnn hardcode

* add hardcode to data_transfer.cc

* update data_transfer headerfile

* try fix segment fault

* update cudnn&miopen_helper

* reset HasAttr of DygraphExctnCtx

* debug, this commit should pass all CI

* debug should pass CI, temporarily disable activation

* debug should pass CI

* fix default_attr=nullptr bug

* clean debug code

* Call SetDnnFallback function in the base class

* activation fallback to plain kernel

* fix default GetExpectedKernelType find wrong kernel

* search cudnn kernel instead of fallback

* fix cudnn_handle bug

* remove tanh use_cudnn

* restore tanh use_cudnn

* debug tanh

* fix tanh bug

* delete activation cudnn kernel

* polish code

908a381d

04 11月, 2022 2 次提交
- H
  [XPU] add cumsum op. test=kunlun (#47585) · ac2a94c7
  由 houj04 提交于 11月 04, 2022
```
* [XPU] add cumsum op. test=kunlun

* try to fix linker. test=kunlun

* try to fix linker. test=kunlun

* try to fix linker. test=kunlun

* debug. test=kunlun

* update xpu.cmake. remove unnecessary codes. test=kunlun.
```
  ac2a94c7
- Y
  
  fix deepfm and deep_wide bug, add embedding_sparse_grad kernel, test=kunlun (#47365) · f53e920d
  由 ykkk2333 提交于 11月 04, 2022
  
  f53e920d
02 11月, 2022 2 次提交

H
Revert "[Kernel Selection] Remove hard code of PADDLE_WITH_CUDA (#47325)" (#47582) · a57a19ea
由 HongyuJia 提交于 11月 02, 2022
```
This reverts commit f9134045.
```
a57a19ea

[XPU] add int64 support for slice and subtract. (#47409) · 77395619

由 houj04 提交于 11月 02, 2022

* [XPU] add int64 support for slice and subtract. test=kunlun

* try to fix xpu compile. test=kunlun

* try to fix xpu compile. test=kunlun

* try to fix xpu compile. test=kunlun

* remove unnecessary modification. test=kunlun

77395619

01 11月, 2022 2 次提交

[Kernel Selection] Remove hard code of PADDLE_WITH_CUDA (#47325) · f9134045

由 HongyuJia 提交于 11月 01, 2022

* move cudnn hardcode outside GetExpectedKernelType

* add header file

* debug

* update interpreter_util with hardcode

* update interpreter_util headerfile

* solve activation hardcode

* debug with CI

* add mkldnn_op_list header file

* temporarily uncomment mkldnn

* temporarily uncomment mkldnn

* delete sequence_softmax cudnn hardcode

* add hardcode to data_transfer.cc

* update data_transfer headerfile

* try fix segment fault

* update cudnn&miopen_helper

* reset HasAttr of DygraphExctnCtx

* debug, this commit should pass all CI

* debug should pass CI, temporarily disable activation

* debug should pass CI

* fix default_attr=nullptr bug

* clean debug code

f9134045

Adapting device-specific Extra Attributes for the PHI kernel (#46342) · c923e6c9

由 Chen Weihang 提交于 10月 31, 2022

* add extra attr property set

* add type_info for all context

* add onednn context to all context

* fix context compile error

* simplify conv kernel args

* pass runtime attr into dev_ctx

* fix marco error

* clear conv_grad_kernel extra args

* merge conv_grad_grad into conv_grad

* clear conv2d_grad_grad extra attrs

* clear yaml and eager extra attr

* fix conv1d error

* change to thread local

* fix npu compile failed

* try to fix windows compile failed

* add conv2d onednn phi kernel

* fix ci bugs (#36)

* fix compile bugs (#38)

* fix extra input transform bug (#39)

* support dynamic created attr (#40)

* reset extra info gen code

* rm conv_grad_grad kernel

* reimpl pass attr adapting

* add int attr support

* remove vector inputnames creating

* fix map at error

* Update paddle/phi/kernels/onednn/conv_grad_kernel.cc
Co-authored-by: NSławomir Siwek <slawomir.siwek@intel.com>

* remove useless extra attrs

* replace mkldnn_engine by onednn_engine
Co-authored-by: NYuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: NSławomir Siwek <slawomir.siwek@intel.com>

c923e6c9

25 10月, 2022 1 次提交
- H
  
  opt conv_transpose cudnn (#47294) · afd5a96b
  由 HongyuJia 提交于 10月 25, 2022
  
  afd5a96b
21 10月, 2022 1 次提交
- Y
  fix nvprof_nvtx_push interface bug (#47232) · 340009d6
  由 Yuanle Liu 提交于 10月 21, 2022
```
* fix nvprof_nvtx_push interface bug
```
  340009d6
19 10月, 2022 2 次提交
- Y
  
  add nvtxRangePush/Pop for naive_executor and refine some code (#47139) · de6e7431
  由 Yuanle Liu 提交于 10月 19, 2022
  
  de6e7431
- L
  clean unused code: piece.cc/h (#47103) · e435d695
  由 Leo Chen 提交于 10月 19, 2022
```
* clean unused code: piece.cc/h

* clean usage
```
  e435d695
17 10月, 2022 1 次提交
- Y
  [PHI]Modify DataLayout's namespace from paddle::experimental to phi (#46869) · ec749398
  由 YuanRisheng 提交于 10月 17, 2022
```
* namespace modify

* update by comment
```
  ec749398
11 10月, 2022 2 次提交
- W
  
  Completes bfloat16 dtype for collective api in eager mode (#45844) · e4eb8d36
  由 Wen Sun 提交于 10月 11, 2022
  
  e4eb8d36
- C
  Remove LoDTensor using in fluid (Part 1) (#46663) · 940d8f25
  由 Chen Weihang 提交于 10月 11, 2022
```
* remove using lodtensor part1

* polish history code format
```
  940d8f25
30 9月, 2022 3 次提交

[IPU] paddle-inference support custom-ops (#45235) · a6b4bee3

由 Allen Guo 提交于 9月 30, 2022

* paddle-inference support custom-ops
Co-authored-by: NZhixin Yao <zhixiny@graphcore.ai>

* fix tolower
Co-authored-by: NZhixin Yao <zhixiny@graphcore.ai>

a6b4bee3

fix bugs of tipc, test=kunlun (#46540) · d16360c8

由 ykkk2333 提交于 9月 30, 2022

* migrate sigmoid with cross entropy, and tile xpu kernels to phi, test=kunlun

* migrate add_n kernep to phi, test=kunlun

* fix bugs of tipc, test=kunlun

d16360c8

support pure bfloat16 for more ops (#46364) · b7b231a6

由 sneaxiy 提交于 9月 30, 2022

* support pure bfloat16

* support bf16 linear

* update PR to pass CI

* tiny fix where_grad_kernel.cu

* add bfloat16 to selu_grad to pass CI

* fix selu grad compilation error

b7b231a6

29 9月, 2022 1 次提交

Add index_select, index_select_grad, reduce_min kernel and their unittests for... · 9a1855ff

由 Leo Guo 提交于 9月 29, 2022

Add index_select, index_select_grad, reduce_min kernel and their unittests for kunlun. Add registers of index_select, index_select_grad, reduce_min, sqrt, sqrt_grad to xpu2_op_list.test=kunlun. (#46557)

9a1855ff

28 9月, 2022 1 次提交

Remove the declaration of using Tensor in framework/tensor.h (#46432) · e12a905e

由 Chen Weihang 提交于 9月 28, 2022

* remove needless using tensor

* remove needless using tensor

* resolve conflict

* replace tensor using

* fix format error

* revert needless changing

* fix rocm and npu compile error

* fix cinn compile error

* fix format error

* fix mkldnn format error

* fix mkldnn format error

* fix cinn compile error

* fix cinn compile error

* fix cinn compile error

* resolve conflict

e12a905e

26 9月, 2022 1 次提交
- C
  
  [MLU] fluid: add mluop (#46429) · 3e1e482b
  由 cifar10 提交于 9月 26, 2022
  
  3e1e482b
16 9月, 2022 2 次提交

Support broadcast elementwise operators with int64 index type (#45741) · 20b5bf84

由 sneaxiy 提交于 9月 16, 2022

* support int64 non-broadcast

* support broadcast case for int64 index

* fix bug

* support more Arity

* remove some codes

* upgrade patchelf to v0.15.0 to pass CI build

* fix bug

* fix patchelf installation

* add debug flags

* remove useless codes

* fix viterbi_decode and set_value op uts

* remove always enable int64

20b5bf84

[CustomDevice] add new executor support (#46038) · 268f097e

由 ronnywang 提交于 9月 16, 2022

* [CustomDevice] add custom_device_resource_pool & device_event_custom_device

* update

* update

* update

* update

268f097e

09 9月, 2022 1 次提交
- C
  
  [MLU] fix mluinfo compile error. (#45886) · f06ab336
  由 Chenxiao Niu 提交于 9月 09, 2022
  
  f06ab336
08 9月, 2022 1 次提交
- T
  xpu-paddlepaddle-40 [任务] fused_gemm_epilogue 支持xpu (#45706) · 7085cb97
  由 taixiurong 提交于 9月 08, 2022
```
* add gemm_epilogue

* xpu-paddlepaddle-40 [任务] fused_gemm_epilogue 支持 test=kunlun
```
  7085cb97
07 9月, 2022 1 次提交
- H
  
  [XPU] move rnn op to phi. (#45822) · 91631492
  由 houj04 提交于 9月 07, 2022
  
  91631492
05 9月, 2022 2 次提交
- C
  
  Fix jetson compile error (#45692) · cfaee812
  由 chalsliu 提交于 9月 05, 2022
  
  cfaee812
- S
  
  fix some op int32 exceed range (#45711) · a1dbee23
  由 sneaxiy 提交于 9月 05, 2022
  
  a1dbee23
01 9月, 2022 2 次提交
- H
  
  [XPU] add c_embedding_op_xpu. (#45617) · ed2ad5d9
  由 houj04 提交于 9月 01, 2022
  
  ed2ad5d9
- T
  xpu-paddlepaddle-37 [任务] 迁移lamb到phi (#45520) · 1a0ef45e
  由 taixiurong 提交于 9月 01, 2022
```
test=kunlun
```
  1a0ef45e
29 8月, 2022 2 次提交

[IPU] support depthwise_conv2d ops (#45234) · a237ff8e

由 Allen Guo 提交于 8月 29, 2022

* support depthwise_conv2d ops
Co-authored-by: NZhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: NZhaorui Chen <zhaoruic@graphcore.ai>

* fix duplicate name
Co-authored-by: NZhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: NZhaorui Chen <zhaoruic@graphcore.ai>

a237ff8e

A

fix compile (#45441) · f49f3b4f
由 Allen Guo 提交于 8月 29, 2022

f49f3b4f

26 8月, 2022 1 次提交
- H
  
  [XPU] add load_combine_op_xpu. test=kunlun (#45436) · 3055d71a
  由 houj04 提交于 8月 26, 2022
  
  3055d71a
25 8月, 2022 1 次提交
- H
  
  add temporal shift and grad *test=kunlun (#45300) · 63d9a175
  由 haosicheng 提交于 8月 25, 2022
  
  63d9a175
24 8月, 2022 1 次提交

Support fp16 of adam operator in xpu environment (#45292) · a012d426

由 mengqingchun02 提交于 8月 24, 2022

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support fp16 of adam operator in xpu environment. test=kunlun

* support fp16 of adam operator in xpu environment. test=kunlun

* support fp16 of adam operator in xpu environment. test=kunlun

a012d426

19 8月, 2022 1 次提交
- H
  
  [XPU] c_allreduce support int. update bkcl to 1.0.5. test=kunlun (#45248) · 9f1f1b0a
  由 houj04 提交于 8月 19, 2022
  
  9f1f1b0a

PaddlePaddle / Paddle 大约 1 年 前同步成功

PaddlePaddle / Paddle
大约 1 年前同步成功