Commits · 0754e09d1a7223be50f19555d289389a993ab4e1 · BaiXuePrincess / Paddle

02 Dec, 2022 2 commits
- add silu, silu_grad, unfold and unfold_grad xpu kernels (#48325) · f71de378
  Committed by ykkk2333 on Dec 02, 2022
```
* add stat tool

* add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun

* add silu, unfold and their grads, test=kunlun
```
- polish fusion kernel naming (#48609) · 61486bf2
  Committed by Chen Weihang on Dec 02, 2022
01 Dec, 2022 2 commits
- Rename kernel for top_k, slogdeterminant, generate_proposals_v2 (#48594) · 3d35aa80
  Committed by zyfncg on Dec 01, 2022
```
* rename kernel for top_k, slogdeterminant, generate_proposals_v2

* fix bug
```
- change d2d copy to api copy in xpu kernel, test=kunlun (#48505) · 4f834cb2
  Committed by zhangyikun02 on Dec 01, 2022
30 Nov, 2022 7 commits
- fix phi header file without fluid header, test=develop (#48488) · cbb1cfbb
  Committed by Qi Li on Nov 30, 2022
- Fix error log for yaml check (#48126) · f62b3fc8
  Committed by zyfncg on Nov 30, 2022
```
* fix error log for yaml check

* remove grad_op of increment
```
- [PHI decoupling] migrate transpose_op.cu.h and gpu_utils.h to phi (#48286) · 8a9bef70
  Committed by Netpunk on Nov 30, 2022
```
* migrate transpose_op.cu.h and gpu_utils.h

* format code style

* fix some problems

* format code

* reset transpose_op.cc

* test commit

* recover transpose_op.h

* delete transpose_op.h

* adjust header files order in transpose_op.cc
```
- [Perf]Fix interploate OutSize data transform problem (#48498) · 0b2a66bb
  Committed by Aurelius84 on Nov 30, 2022
```
* [Perf]Fix interploate OutSize data transform problem

* fix code style

* fix grad

* fix phi kernel
```
- Fix the name map of operator from Phi to fluid (#48496) · e337d280
  Committed by zyfncg on Nov 30, 2022
```
* rename some kernel name

* fix compile problem
```
- use correct xpu stream for synchronization (#48470) · 16562a9d
  Committed by james on Nov 30, 2022
```
Some legacy code still uses xpu_wait() for stream sync, which only syncs
the default stream. This PR replaces those calls with dev_ctx.Wait() to
ensure that the correct stream is always used.
```
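The stream-sync commit message above contrasts a wait on the default stream with a wait on the device context's own stream. The following is a minimal Python sketch of why that difference matters. It uses toy classes only: `Stream`, `DeviceContext`, `wait_default_stream` and `demo` are hypothetical names for illustration and are not Paddle or XPU runtime APIs.

```python
class Stream:
    """Toy stand-in for a device stream: a queue of pending work items."""

    def __init__(self, name):
        self.name = name
        self.pending = []

    def launch(self, task):
        # Enqueue work; nothing is "finished" until synchronize() is called.
        self.pending.append(task)

    def synchronize(self):
        # Drain the queue and return the tasks that were completed.
        done, self.pending = self.pending, []
        return done


DEFAULT_STREAM = Stream("default")


def wait_default_stream():
    """Models the legacy-style call: it only drains the default stream."""
    return DEFAULT_STREAM.synchronize()


class DeviceContext:
    """Toy device context that owns the stream its kernels run on."""

    def __init__(self, stream):
        self.stream = stream

    def wait(self):
        # Synchronize the stream this context actually uses.
        return self.stream.synchronize()


def demo():
    ctx = DeviceContext(Stream("compute"))
    ctx.stream.launch("kernel_a")

    wait_default_stream()                      # wrong stream: work is still pending
    left_after_legacy = list(ctx.stream.pending)

    ctx.wait()                                 # correct stream: work is drained
    left_after_ctx = list(ctx.stream.pending)
    return left_after_legacy, left_after_ctx
```

In this toy model, work launched on a non-default stream is still pending after the default-stream wait and is only drained by the context-level wait, which is the behaviour the commit message describes.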
- optimize for argsort with xpu, test=kunlun (#48440) · 7bf7e6e0
  Committed by zhangyikun02 on Nov 30, 2022
29 Nov, 2022 12 commits
- rename use_cudnn to use_gpudnn in phi (#48443) · 41f15537
  Committed by HongyuJia on Nov 29, 2022

- [PHI] traspose2 kernel migration (#47748) · d86aa4ca
  Committed by Paulina Gacek on Nov 29, 2022

* traspose2 kernel migrated

* Got rid of mutable_data

* x modification added

* ops added in extra info file

* Formatting fix

* 2 fuse passes with transpose2 commented

* nr of outs changed in 2 passes, passes uncommented

* Changes in passes reverted

* transpose changed in operator.cc

* MKLDNN check in operator.cc

* Transpose fixes

* Fix deleted from operato

* template corrected
Co-authored-by: Paulina Gacek <paulinagacek@intel.com>

- [CodeStyle][isort] introduce isort (part4) (#48402) · f85def97
  Committed by Nyakku Shigure on Nov 29, 2022
```
* isort all files

* revert conflicting files

* revert conflicting files

* revert conflicting files
```
- [PHI decoupling]migrate enforce_custom.h from fluid to phi (#48422) · 9896ac1e
  Committed by Asthestarsfalll on Nov 29, 2022
```
* migrate enforce_custom.h from fluid to phi

* move to backends/custom/
```
- eltwise_div + scale [PHI] (#48484) · fa10524d
  Committed by Sławomir Siwek on Nov 29, 2022
- Optimize the implementation of the argsort operator. (#47738) · 9e9b705a
  Committed by Vvsmile on Nov 29, 2022
```
Optimize the implementation of the argsort operator
```

- [PHI] Migrate matmul kernel (#48162) · f41ccbd5
  Committed by Sławomir Siwek on Nov 29, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid dependency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

* mul_grad

* matmul fwd

* add extra attr

* temp disable passes

* re-enable passes

* workaround for matmul+act

* fix for matmul+eltwise_add

* fix typo

* merge bugfix #48364

* remove merge conflict

- add floor fp32 op *test=kunlun (#48458) · 9d4b4be3
  Committed by haosicheng on Nov 29, 2022
- [PHI decoupling] Move MKLDNN code (#48352) · fa051eec
  Committed by Sławomir Siwek on Nov 29, 2022

- Generate static graph code for lerp by yaml (#48322) · d5387de2
  Committed by HappyHeavyRain on Nov 29, 2022

* generate static graph code for lerp by yaml, test=develop

* modify the op_compat.yaml of lerp, test=develop

* generate static graph code for lerp by yaml, test=develop

* modify the op_compat.yaml of lerp, test=develop

* remove the 'attrs' of lerp, test=develop
Signed-off-by: lizhiyu02 <1528794076@qq.com>

- [Sparse]BatchNorm use inplace (#48254) · d33d6db0
  Committed by zhangkaihuo on Nov 29, 2022
- group the index in not cutlass mode (#48439) · 41ba2722
  Committed by zhangkaihuo on Nov 29, 2022

28 Nov, 2022 12 commits
- Generate static graph code for some ops by yaml (part5) (#48284) · b5c6c36c
  Committed by zyfncg on Nov 28, 2022
```
* generate static graph code for some operators

* add some ops generate

* revert npu gelu
```

- [PHI decoupling] move several header files from fluid to phi (#48415) · fd9c91c3
  Committed by huangjiyi on Nov 28, 2022

* decouple cudnn_desc.h from fluid

* move cudnn_desc.h from fluid to phi

* fix bugs

* decouple cudnn_helper.h from fluid

* fix bugs

* move cudnn_helper.h from fluid to phi

* add fluid cudnn_helper.h

* move miopen_desc.h from fluid to phi

* move miopen_helper.h from fluid to phi

* fix bugs

* move gpu_dnn.h from fluid to phi

* fix bugs

* update copyright year

* simplify gpu_dnn.h in fluid

* fix bugs

* fix xpu build bug

* fix compile bug

* fix bug

- Optimize the log of broadcast and decrease the log level. (#48327) · 8424cf28
  Committed by Yiqun Liu on Nov 28, 2022

* Optimize the log of broadcast and decrease the log level.

* Remove the redundant brackets.

* Change op benchmark ci to test the tests module.

* Remove the observe of elementwise and reduce_ops sub-directory.

- [BugFix]Fix OneDNN Kernels Bug when use pass (#48364) · df82fd35
  Committed by YuanRisheng on Nov 28, 2022
```
* Fix onednn kernel bugs

* fix gpu bugs
```
- migrate top_k_function_cuda.h from fluid to phi (#48251) · b4b926f4
  Committed by Asthestarsfalll on Nov 28, 2022
- add cpu_info.h (#48403) · 923ad5dc
  Committed by PuQing on Nov 28, 2022

- [NPU] apply npu_identity to conv bn and copy2cpu, test=develop (#48039) · 32143f44
  Committed by Qi Li on Nov 28, 2022

* [NPU] apply npu_identity to conv bn and copy2cpu, test=develop

* update npu identity to share data with x, test=develop

* address review comments, test=develop

- [Phi decouple] remove dependece to "paddle/fluid/platform/device/xpu/xxx.h" in phi (#48420) · 2bae75ed
  Committed by huangjiyi on Nov 28, 2022

* rm fluid “xpu_header.h” deps in phi

* move part of xpu_op_list.h from fluid to phi

* add fluid xpu_op_list deps

* add glog deps for xpu_op_list in phi

* fix PR-CI-Kunlun

- Fix bug of TransToFluidOpName (#48355) · d3f52efd
  Committed by zyfncg on Nov 28, 2022

* add fluid_op_name_map

* rename some kernel name

* add comments for op-kernel map

* refine map name of op to kernel

- Use phi layernorm (#48276) · 86d92092
  Committed by MarDino on Nov 28, 2022
- fix expand as op (#48336) · 827fd5cd
  Committed by Thomas Young on Nov 28, 2022
```
* fix expand as op

* fix bug
```
- add square fp16 *test=kunlun (#48095) · 81d0a3cc
  Committed by haosicheng on Nov 28, 2022

25 Nov, 2022 5 commits
- for xpu multi thread bug test (#48373) · a1bdc652
  Committed by wanghuancoder on Nov 25, 2022
```
* for xpu multi thread bug test
```
- Group norm fp16 support (#48222) · 34fd65cf
  Committed by Wang Bojun on Nov 25, 2022
```
* group norm fp16 support
```
- [PROFILER] add flops for Profiler (#47766) · 3d1981ad
  Committed by Chitsing KUI on Nov 25, 2022
```
* attr ready

* op ip ready

* start dynamic

* end2end ok

* input shape to map, stat by op

* layer wip

* first version ready

* fix proto depds

* fix profiler deps

* fix flops typo, rm tuple shape
```
- [XPU] Support Sharding stage2 on XPU (#48310) · 145cc262
  Committed by Roc on Nov 25, 2022
```
* support xpu scalar inplace

* sharding for xpu
Co-authored-by: heyanru <81976792+heyanru01@users.noreply.github.com>
```
- fix embedding_bug (#48318) · ea83f898
  Committed by wanghuancoder on Nov 25, 2022

BaiXuePrincess / Paddle: up to date with the fork's source project