Commits · c088f9ec6f98d23b25ee303f7e04e4baff2da072 · PaddlePaddle / Paddle

23 Dec, 2022 1 commit
- H
  add rnn-t loss and api (#49199) · c088f9ec
  Authored by Hui Zhang on Dec 23, 2022
```
* add warp transducer code
```
  c088f9ec
22 Dec, 2022 1 commit
- Q
  
  fix softmax_with_cross_entropy bug for kunlun (#49207) · b421d7a5
  Authored by QingshuChen on Dec 22, 2022
  
  b421d7a5
20 Dec, 2022 1 commit

[PHI decouple] move dropout_impl and cuda_graph_with_memory_pool from fluid to phi (#49139) · 579784e2

Authored by huangjiyi on Dec 20, 2022

* move dropout_impl from fluid to phi

* move cuda_graph_with_memory_pool from fluid to phi

* update namespace

* remove cuda_graph in fluid

* fix mac-build

* fix bugs

* correct CodeStyle

* fix mac-build

* fix mutable_data

* fix stl include

* fix copy param

579784e2

19 Dec, 2022 2 commits
- W
  
  refactor: rename process group (#49137) · 22e416cf
  Authored by Wen Sun on Dec 19, 2022
  
  22e416cf
- Z
  
  add diag_v2 op for xpu, test=kunlun (#49088) · 922f0868
  Authored by zhangyikun02 on Dec 19, 2022
  
  922f0868
17 Dec, 2022 1 commit
- W
  
  refactor: rename xccl files (#49127) · d4f43ad4
  Authored by Wen Sun on Dec 17, 2022
  
  d4f43ad4
16 Dec, 2022 1 commit
- W
  
  refactor: rename files (#49117) · 40f3f4f0
  Authored by Wen Sun on Dec 16, 2022
  
  40f3f4f0
15 Dec, 2022 1 commit
- H
  
  [PHI decoupling] move softmax from fluid to phi and remove cpu_vec.h in fluid (#48970) · 344b99e1
  Authored by huangjiyi on Dec 15, 2022
  
  344b99e1
14 Dec, 2022 1 commit

nullptr bugfix for XPU pg mode (#49043) · f0dab193

Authored by james on Dec 14, 2022

* nullptr bugfix for XPU pg mode

Also a few kernels is added to xpu whitelist

* increase error msg length

f0dab193

12 Dec, 2022 1 commit

Optimization of Eigh op with ssyevj_batched runtime api (#48560) · 16e364d3

Authored by 傅剑寒 on Dec 12, 2022

* fix codestyle

* add double complex<float> complex<double> dtype support for syevj_batched

* fix use_syevj flag for precision loss when input dtype of syevj_batch is complex128 in some case

* optimize eigh in different case

* fix missing ; bug

* fix use_syevj bug

* fix use_cusolver_syevj_batched flag

16e364d3

09 Dec, 2022 3 commits
- J
  xpu support inplace flatten (#48909) · e6fdcd90
  Authored by james on Dec 09, 2022
```
This is a PR to catch up with latest xpu white list strategy
(https://github.com/PaddlePaddle/Paddle/pull/48606)
, since original list only include 'fluid' fashion names, but new list
must include 'phi' fashion as well.
Refer to paddle/phi/core/kernel_factory.cc for more details.
```
  e6fdcd90
- H
  
  temporally disable set_value (#48942) · 905be668
  Authored by haosicheng on Dec 09, 2022
  
  905be668
- P
  
  [PHI decoupling] move "flags.h" from fluid to phi (#48696) · 39ffef0d
  Authored by PuQing on Dec 09, 2022
  
  39ffef0d
08 Dec, 2022 3 commits
- H
  
  [XPU] add set_value and set_value_grad (#48845) · 94fe929a
  Authored by haosicheng on Dec 08, 2022
  
  94fe929a
- H
  [XPU] add load op into oplist. (#48860) · 2bba3e18
  Authored by houj04 on Dec 08, 2022
```
* [XPU] add load op into oplist.

* remove test_sampling_id_op_xpu.py
```
  2bba3e18
- H
  [PHI decoupling] move cuda_graph from fluid to phi (#48686) · a4d9851b
  Authored by huangjiyi on Dec 08, 2022
```
* move cuda_graph from fluid to phi

* move device_memory_aligment from fluid to phi

* Revert "move device_memory_aligment from fluid to phi"

This reverts commit b92fcd39a0a50fdac13278f49be0237a85f3a13f.

* update xpu cmake
```
  a4d9851b
07 Dec, 2022 2 commits
- Q
  update kl1 op list and optimize matmul unitest for kunlun (#48775) · 93b7ccf5
  Authored by QingshuChen on Dec 07, 2022
```
*test=kunlun
```
  93b7ccf5
- Z
  
  modify d2d copy to xpu::copy in xpu kernel, test=kunlun (#48710) · 0d8ddf9f
  Authored by zhangyikun02 on Dec 07, 2022
  
  0d8ddf9f
06 Dec, 2022 1 commit
- Q
  add xpu_support op function (#48606) · 06b32b38
  Authored by QingshuChen on Dec 06, 2022
```
*test=kunlun
```
  06b32b38
05 Dec, 2022 1 commit
- H
  
  move device_memory_aligment from fluid to phi (#48694) · 796499fd
  Authored by huangjiyi on Dec 05, 2022
  
  796499fd
30 Nov, 2022 2 commits

[PHI decoupling] migrate transpose_op.cu.h and gpu_utils.h to phi (#48286) · 8a9bef70

Authored by Netpunk on Nov 30, 2022

* migrate transpose_op.cu.h and gpu_utils.h

* format code style

* fix some problems

* format code

* reset transpose_op.cc

* test commit

* recover transpose_op.h

* delete transpose_op.h

* adjust header files order in transpose_op.cc

8a9bef70

use correct xpu stream for synchronization (#48470) · 16562a9d

Authored by james on Nov 30, 2022

some legacy code still use xpu_wait() for stream sync -- it only syncs
default stream. this PR replaces them with dev_ctx.Wait() to ensure
that correct stream is always used

16562a9d
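
The commit message above says a global wait only synchronizes the default stream, while waiting through the device context targets the stream the work actually runs on. The following is a minimal sketch of that difference using a toy model. `Stream`, `DefaultStream`, `GlobalWait`, and `DeviceContext` are hypothetical stand-ins invented for illustration; they are not the XPU runtime's or Paddle's real types or signatures.

```cpp
#include <cassert>

// Hypothetical toy model; not the XPU runtime or Paddle API.
struct Stream {
  int pending = 0;                     // queued, unfinished tasks
  void Enqueue() { ++pending; }        // models launching a kernel
  void Synchronize() { pending = 0; }  // models blocking until the queue drains
};

// The single process-wide default stream.
inline Stream& DefaultStream() {
  static Stream s;
  return s;
}

// Models a legacy global wait: it always targets the default stream.
inline void GlobalWait() { DefaultStream().Synchronize(); }

// Models a device context that knows which stream its kernels run on.
struct DeviceContext {
  Stream* stream;
  void Wait() { stream->Synchronize(); }
};

// Small self-check showing the difference between the two waits.
inline void Demo() {
  Stream compute;                // a non-default stream
  DeviceContext ctx{&compute};
  compute.Enqueue();
  GlobalWait();                  // drains only the default stream
  assert(compute.pending == 1);  // work on `compute` is still outstanding
  ctx.Wait();                    // drains the stream the context owns
  assert(compute.pending == 0);
}
```

In this model, code that runs on a non-default stream is only safely synchronized by the context-level wait, which is the behavior the commit describes.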

29 Nov, 2022 4 commits

[PHI] traspose2 kernel migration (#47748) · d86aa4ca

Authored by Paulina Gacek on Nov 29, 2022

* traspose2 kernel migrated

* Got rid of mutable_data

* x modification added

* ops added in extra info file

* Formatting fix

* 2 fuse passes with transpose2 commented

* nr of outs changed in 2 passes, passes uncommented

* Changes in passes reverted

* transpose changed in operator.cc

* MKLDNN check in operator.cc

* Transpose fixes

* Fix deleted from operator

* template corrected

Co-authored-by: Paulina Gacek <paulinagacek@intel.com>

d86aa4ca

[PHI decoupling]migrate enforce_custom.h from fluid to phi (#48422) · 9896ac1e
Authored by Asthestarsfalll on Nov 29, 2022
```
* migrate enforce_custom.h from fluid to phi

* move to backends/custom/
```
9896ac1e

[PHI] Migrate matmul kernel (#48162) · f41ccbd5

Authored by Sławomir Siwek on Nov 29, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid dependency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

* mul_grad

* matmul fwd

* add extra attr

* temp disable passes

* re-enable passes

* workaround for matmul+act

* fix for matmul+eltwise_add

* fix typo

* merge bugfix #48364

* remove merge conflict

f41ccbd5

[PHI decoupling] Move MKLDNN code (#48352) · fa051eec
Authored by Sławomir Siwek on Nov 29, 2022

fa051eec

28 Nov, 2022 4 commits

[PHI decoupling] move several header files from fluid to phi (#48415) · fd9c91c3

Authored by huangjiyi on Nov 28, 2022

* decouple cudnn_desc.h from fluid

* move cudnn_desc.h from fluid to phi

* fix bugs

* decouple cudnn_helper.h from fluid

* fix bugs

* move cudnn_helper.h from fluid to phi

* add fluid cudnn_helper.h

* move miopen_desc.h from fluid to phi

* move miopen_helper.h from fluid to phi

* fix bugs

* move gpu_dnn.h from fluid to phi

* fix bugs

* update copyright year

* simplify gpu_dnn.h in fluid

* fix bugs

* fix xpu build bug

* fix compile bug

* fix bug

fd9c91c3

[BugFix]Fix OneDNN Kernels Bug when use pass (#48364) · df82fd35
Authored by YuanRisheng on Nov 28, 2022
```
* Fix onednn kernel bugs

* fix gpu bugs
```
df82fd35

add cpu_info.h (#48403) · 923ad5dc
Authored by PuQing on Nov 28, 2022

923ad5dc

[Phi decouple] remove dependece to "paddle/fluid/platform/device/xpu/xxx.h" in phi (#48420) · 2bae75ed

Authored by huangjiyi on Nov 28, 2022

* rm fluid “xpu_header.h” deps in phi

* move part of xpu_op_list.h from fluid to phi

* add fluid xpu_op_list deps

* add glog deps for xpu_op_list in phi

* fix PR-CI-Kunlun

2bae75ed

25 Nov, 2022 2 commits
- W
  for xpu multi thread bug test (#48373) · a1bdc652
  Authored by wanghuancoder on Nov 25, 2022
```
* for xpu multi thread bug test
```
  a1bdc652
- S
  
  fix cuda 116 compile error (#48342) · 080349cd
  Authored by sneaxiy on Nov 25, 2022
  
  080349cd
24 Nov, 2022 2 commits
- P
  
  [PHI decoupling] remove "paddle/fluid/platform/enforce.h" in phi (#48049) · df23c7c3
  Authored by PuQing on Nov 24, 2022
  
  df23c7c3
- S
  
  [PHI] Migrate batch_norm_grad kernel (#48288) · 561b7278
  Authored by Sławomir Siwek on Nov 24, 2022
  
  561b7278
23 Nov, 2022 1 commit
- S
  Make bfloat16 implicitly convert to float/double (#48238) · 1066094a
  Authored by sneaxiy on Nov 23, 2022
```
* make bfloat16 implicit convert to float/double

* fix bfloat16_test ut compile
```
  1066094a
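
The commit above makes bfloat16 convert implicitly to float/double. As a rough illustration of what an implicit conversion of this kind looks like in C++, here is a minimal sketch. `BFloat16` and `Twice` are hypothetical names invented for this example; this is not Paddle's actual bfloat16 type, and the real change may differ (for instance in rounding or in which conversion operators it defines).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical minimal bfloat16 type for illustration; not Paddle's implementation.
struct BFloat16 {
  uint16_t bits = 0;

  BFloat16() = default;

  // Build from float by keeping the upper 16 bits (truncation, no rounding).
  explicit BFloat16(float f) {
    uint32_t u;
    std::memcpy(&u, &f, sizeof(u));
    bits = static_cast<uint16_t>(u >> 16);
  }

  // Non-explicit conversion operator: a BFloat16 can be used where a float is expected.
  operator float() const {
    uint32_t u = static_cast<uint32_t>(bits) << 16;
    float f;
    std::memcpy(&f, &u, sizeof(f));
    return f;
  }
};

// Takes a double; a BFloat16 argument converts via float (user-defined conversion)
// followed by the standard float-to-double conversion.
inline double Twice(double x) { return 2.0 * x; }

// Small self-check using a value that bfloat16 represents exactly.
inline void Demo() {
  BFloat16 h(1.5f);
  float f = h;  // implicit conversion, no cast needed
  assert(f == 1.5f);
  assert(Twice(h) == 3.0);
}
```

Without the non-explicit `operator float()`, both the assignment to `f` and the call to `Twice(h)` would need an explicit cast.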
21 Nov, 2022 2 commits

[PHI] Migrate mul_grad kernel (#48061) · 55f6fb3d

Authored by Sławomir Siwek on Nov 21, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid dependency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

* mul_grad

55f6fb3d

add new map instance (#48145) · 2a47416c
Authored by LiYuRio on Nov 21, 2022

2a47416c

18 Nov, 2022 3 commits

[PHI] Migrate matmul_grad kernel (#48023) · 4ab18ada

Authored by Sławomir Siwek on Nov 18, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid dependency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

4ab18ada

[PHI decoupling] move "gpu_device_function.h" from fluid to phi (#48097) · 27ee6e71

Authored by huangjiyi on Nov 18, 2022

* move "paddle/phi/backends/gpu/gpu_device_function.h" to phi

* update copyright years

* rm "fluid/platform/device/gpu/gpu_device_function.h" in phi

* fix rocm-compile bugs

27ee6e71

correct sync behavior for XPU distributed training (#47882) · aafa9820

Authored by james on Nov 18, 2022

* correct sync behavior for XPU distributed training

XPU support event mechanism similar to cuda event, so it is advisable to
use an event to sync compute/comm streams for performance. However this
mechanism is never fully tested, and inconsistent loss/ending_epochs are
reported. Therefore, this PR replaces event sync with stream waiting as
a temporary solution.

* remove compile warning

aafa9820

PaddlePaddle / Paddle: last synced successfully about 1 year ago