Commits · ac9debee877a34342e2d7744313524f1d3c17af2 · PaddlePaddle / Paddle

12 Jan, 2023 · 3 commits
- deal with conflict (#49766) · 27aec62b
  Committed by YuanRisheng on Jan 12, 2023
- Fix the bugs of set_value and set_value_grad ops and add register in (#49750) · 438975fd
  Committed by Leo Guo on Jan 12, 2023
```
xpu2_op_list.cc. test=kunlun
```
- [PHI]Rename some PHI Kernel (#49470) · 30f5e39b
  Committed by YuanRisheng on Jan 12, 2023
```
* rename kernel

* delete sig

* modify code according comment

* fix ci bugs
```
10 Jan, 2023 · 2 commits
- Optimization for StackGradCUDAKernel for last dimension stack case. (#48992) · 0cae5c7f
  Committed by limingshu on Jan 10, 2023
```
* add stack grad kernel optimization

* add basic optimization kernel for stack_grad_kernel

* optimization of stack_grad_kernel for last dim stack and change code format with pre-commit
```
- Add cuda compiled arch check (#49592) · c0d6ec63
  Committed by MarDino on Jan 10, 2023
09 Jan, 2023 · 2 commits
- add fill/fill_any for kunlun (#49645) · 31ea3231
  Committed by QingshuChen on Jan 09, 2023
- [XPU] add einsum fill diagonal and diagonal kernels (#49465) · a5bf156b
  Committed by ykkk2333 on Jan 09, 2023
```
* migrate shaple sgd, split,sign xpu kernels to phi, test=kunlun

* fix dlrm throughput problem, test=kunlun

* add xpu einsum, fill_diagonal, and diagonal kernels, test=kunlun
```
06 Jan, 2023 · 3 commits
- Dev (#49591) · 07db4a9f
  Committed by RuohengMa on Jan 06, 2023
```
* add bitwise and, bitwise not, bitwise or and bitwise xor

* correct typo
```
- fix typo, compatiable->compatible, test=document_fix (#49552) · 6ec8dfdd
  Committed by HongyuJia on Jan 06, 2023
- Expansions of some unmaintained pr (#49551) · 419c2d14
  Committed by 张春乔 on Jan 06, 2023
03 Jan, 2023 · 1 commit
- H2D data transfer optimization for concat kernel (#49040) · 0de94cd9
  Committed by limingshu on Jan 03, 2023
27 Dec, 2022 · 1 commit
- add unbind op for xpu (#49356) · 16931039
  Committed by zhangyikun02 on Dec 27, 2022
26 Dec, 2022 · 1 commit

Committed by ykkk2333 on Dec 26, 2022

* migrate shaple sgd, split,sign xpu kernels to phi, test=kunlun

* fix dlrm throughput problem, test=kunlun

c8f76337

23 Dec, 2022 · 2 commits
- square_grad support fp16 *test=kunlun (#48847) · ae544586
  Committed by haosicheng on Dec 23, 2022
- add rnn-t loss and api (#49199) · c088f9ec
  Committed by Hui Zhang on Dec 23, 2022
```
* add warp transducer code
```
22 Dec, 2022 · 1 commit
- fix softmax_with_cross_entropy bug for kunlun (#49207) · b421d7a5
  Committed by QingshuChen on Dec 22, 2022
20 Dec, 2022 · 1 commit

[PHI decouple] move dropout_impl and cuda_graph_with_memory_pool from fluid to phi (#49139) · 579784e2

Committed by huangjiyi on Dec 20, 2022

* move dropout_impl from fluid to phi

* move cuda_graph_with_memory_pool from fluid to phi

* update namespace

* remove cuad_graph in fluid

* fix mac-build

* fix bugs

* correct CodeStyle

* fix mac-build

* fix mutable_data

* fix stl include

* fix copy param


19 Dec, 2022 · 2 commits
- refactor: rename process group (#49137) · 22e416cf
  Committed by Wen Sun on Dec 19, 2022
- add diag_v2 op for xpu, test=kunlun (#49088) · 922f0868
  Committed by zhangyikun02 on Dec 19, 2022
17 Dec, 2022 · 1 commit
- refactor: rename xccl files (#49127) · d4f43ad4
  Committed by Wen Sun on Dec 17, 2022
16 Dec, 2022 · 1 commit
- refactor: rename files (#49117) · 40f3f4f0
  Committed by Wen Sun on Dec 16, 2022
15 Dec, 2022 · 1 commit
- [PHI decoupling] move softmax from fluid to phi and remove cpu_vec.h in fluid (#48970) · 344b99e1
  Committed by huangjiyi on Dec 15, 2022
14 Dec, 2022 · 1 commit

nullptr bugfix for XPU pg mode (#49043) · f0dab193

Committed by james on Dec 14, 2022

* nullptr bugfix for XPU pg mode

Also a few kernels is added to xpu whitelist

* increase error msg length


12 Dec, 2022 · 1 commit

Optimization of Eigh op with ssyevj_batched runtime api (#48560) · 16e364d3

Committed by 傅剑寒 on Dec 12, 2022

* fix codestyle

* add double complex<float> complex<double> dtype support for syevj_batched

* fix use_syevj flag for precision loss when input dtype of syevj_batch is complex128 in some case

* optimize eigh in different case

* fix missing ; bug

* fix use_syevj bug

* fix use_cusolver_syevj_batched flag


09 Dec, 2022 · 3 commits
- xpu support inplace flatten (#48909) · e6fdcd90
  Committed by james on Dec 09, 2022
```
This is a PR to catch up with latest xpu white list strategy
(https://github.com/PaddlePaddle/Paddle/pull/48606)
, since original list only include 'fluid' fashion names, but new list
must include 'phi' fashion as well.
Refer to paddle/phi/core/kernel_factory.cc for more details.
```
- temporally disable set_value (#48942) · 905be668
  Committed by haosicheng on Dec 09, 2022
- [PHI decoupling] move "flags.h" from fluid to phi (#48696) · 39ffef0d
  Committed by PuQing on Dec 09, 2022
08 Dec, 2022 · 3 commits
- [XPU] add set_value and set_value_grad (#48845) · 94fe929a
  Committed by haosicheng on Dec 08, 2022
- [XPU] add load op into oplist. (#48860) · 2bba3e18
  Committed by houj04 on Dec 08, 2022
```
* [XPU] add load op into oplist.

* remove test_sampling_id_op_xpu.py
```
- [PHI decoupling] move cuda_graph from fluid to phi (#48686) · a4d9851b
  Committed by huangjiyi on Dec 08, 2022
```
* move cuda_graph from fluid to phi

* move device_memory_aligment from fluid to phi

* Revert "move device_memory_aligment from fluid to phi"

This reverts commit b92fcd39a0a50fdac13278f49be0237a85f3a13f.

* update xpu cmake
```
07 Dec, 2022 · 2 commits
- update kl1 op list and optimize matmul unitest for kunlun (#48775) · 93b7ccf5
  Committed by QingshuChen on Dec 07, 2022
```
*test=kunlun
```
- modify d2d copy to xpu::copy in xpu kernel, test=kunlun (#48710) · 0d8ddf9f
  Committed by zhangyikun02 on Dec 07, 2022
06 Dec, 2022 · 1 commit
- add xpu_support op function (#48606) · 06b32b38
  Committed by QingshuChen on Dec 06, 2022
```
*test=kunlun
```
05 Dec, 2022 · 1 commit
- move device_memory_aligment from fluid to phi (#48694) · 796499fd
  Committed by huangjiyi on Dec 05, 2022
30 Nov, 2022 · 2 commits

[PHI decoupling] migrate transpose_op.cu.h and gpu_utils.h to phi (#48286) · 8a9bef70

Committed by Netpunk on Nov 30, 2022

* migrate transpose_op.cu.h and gpu_utils.h

* format code style

* fix some problems

* format code

* reset tranpose_op.cc

* test commit

* recover transpose_op.h

* delete transpose_op.h

* adjust header files order in transpose_op.cc


use correct xpu stream for synchronization (#48470) · 16562a9d

Committed by james on Nov 30, 2022

some legacy code still use xpu_wait() for stream sync -- it only syncs
default stream. this PR replaces them with dev_ctx.Wait() to ensure
that correct stream is always used


29 Nov, 2022 · 4 commits

[PHI] traspose2 kernel migration (#47748) · d86aa4ca

Committed by Paulina Gacek on Nov 29, 2022

* traspose2 kernel migrated

* Got rid of mutable_data

* x modification added

* ops added in extra info file

* Formatting fix

* 2 fuse passes with tanpose2 commented

* nr of outs changed in 2 passes, passes uncommented

* Changes in passes reverted

* transpose chnaged in operator.cc

* MKLDNN check in operator.cc

* Transpose fixes

* Fix deleted from operato

* template corrected
Co-authored-by: NPaulina Gacek <paulinagacek@intel.com>


- [PHI decoupling]migrate enforce_custom.h from fluid to phi (#48422) · 9896ac1e
  Committed by Asthestarsfalll on Nov 29, 2022
```
* migrate enforce_custom.h from fluid to phi

* move to backends/custom/
```

[PHI] Migrate matmul kernel (#48162) · f41ccbd5

Committed by Sławomir Siwek on Nov 29, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid depedency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

* mul_grad

* matmul fwd

* add extra attr

* temp disable passes

* re-enable passes

* workaround for matmul+act

* fix for matmul+eltwise_add

* fix typo

* merge bugfix #48364

* remove merge conflict


- [PHI decoupling] Move MKLDNN code (#48352) · fa051eec
  Committed by Sławomir Siwek on Nov 29, 2022

PaddlePaddle / Paddle · synced successfully about 1 year ago