Commits · 41483383d4955c5f478ca2239a21681e9d1ce548 · PaddlePaddle / Paddle

21 Nov, 2022 4 commits

[PHI] Migrate mul_grad kernel (#48061) · 55f6fb3d

Authored by Sławomir Siwek on Nov 21, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid dependency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

* mul_grad

55f6fb3d

mma qk tensor_core (#48087) · d79eda71

Authored by lzy on Nov 21, 2022

* use mma for QK dot computing in fused_multi_transformer.
* Update fused_multi_transformer_op.cu.h

d79eda71

[PHI decoupling] move cross_entropy from fluid to phi (#48160) · 3501ff7d

Authored by huangjiyi on Nov 21, 2022

* move cross_entropy from fluid to phi

* replace mutable_data with Alloc

* use .template

3501ff7d

Unify `ProcessGroupNCCL` APIs underlying implementation (#48163) · 88410225

Authored by Wen Sun on Nov 21, 2022

* refactor: replace Collective & PointToPoint with NCCLEnv

* refactor: rename to RunFnInNCCLEnv

* refactor: pass std::function by value

88410225

18 Nov, 2022 7 commits

Fused QKVBiasAdd and Transpose with Split Q, KV (#47680) · d595928e

Authored by MarDino on Nov 18, 2022

* fused qkvBiasAdd and transpose with split qkv

* fix typo

* fix format

* fix name

* add annotation

* fix comment

d595928e

[PHI] Migrate matmul_grad kernel (#48023) · 4ab18ada

Authored by Sławomir Siwek on Nov 18, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid dependency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

4ab18ada

[PHI] Migrate conv_transpose kernel (#48119) · 9aacb31b

Authored by Zuza Gawrysiak on Nov 18, 2022

* Migrate conv_transpose to phi

* Move handler to kernel

* kernel m

* Fix formatting

* handler

* remove fluid

* revert tcp_store

* tcp_store

* remove unused

* Fix declaration

* add dnn input

* Fix typo

Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

9aacb31b

Optimize FusedBiasAddGelu Kernel (#47679) · b0e28540

Authored by MarDino on Nov 18, 2022

* Add quick gelu and fused bias add kernel

* fix annotation

* remove useless code

* add fast gelu option and set it in multi transformer op

* add flag to restrict if use fast gelu approximate

* fix flags conflict

* fix use tanh function instead

* add cudart version limit

* use phi fast tanh func

* fix comment

b0e28540

[PHI decoupling] remove "gpu_primitives.h" in fluid (#48063) · 9918bf9c

Authored by Wang Xin on Nov 18, 2022

* remove "gpu_primitives.h" in fluid namespace

* fix PR-CI-GpuPS fail

* fix PR-CI-GpuPS fail

9918bf9c

fix: support huge length of attention (#48053) · 42f35841

Authored by feng_shuai on Nov 18, 2022

42f35841

rm "paddle/fluid/operators/amp/fp16_type_traits.h" in phi (#48051) · e4670d80

Authored by huangjiyi on Nov 18, 2022

e4670d80

17 Nov, 2022 6 commits

Clip intermediate output of op when save inference model (#48026) · fafc7be2

Authored by zyfncg on Nov 17, 2022

* clip extra and intermediate output of op

* fix bug

* fix bug

* polish code

* polish log

fafc7be2

rm "paddle/fluid/framework/convert_utils.h" in phi (#48001) · 2f34fc7a

Authored by huangjiyi on Nov 17, 2022

2f34fc7a

[PHI]Standardise some C++ API (Part5) (#47860) · f3650201

Authored by YuanRisheng on Nov 17, 2022

* standard api

* fix xpu bugs

f3650201

xpu-paddlepaddle-41 [Task] ffn and attention test=kunlun (#46658) · 071708fa

Authored by taixiurong on Nov 17, 2022

071708fa

[PHI decoupling] move "paddle/fluid/operators/math.h" to phi (#48062) · f62bd3b4

Authored by huangjiyi on Nov 17, 2022

* rm "paddle/fluid/operators/math.h" in phi

* rm "paddle/fluid/operators/math.h" in fluid

f62bd3b4

generate static graph code for some op (#48036) · 7cc0d171

Authored by zyfncg on Nov 17, 2022

7cc0d171

15 Nov, 2022 4 commits

fix onednn bugs, test=document_fix (#48013) · 21d4fa02

Authored by YuanRisheng on Nov 15, 2022

21d4fa02

Added optimization pass for oneDNN layernorm kernel (#47782) · 519e7426

Authored by jakpiase on Nov 15, 2022

* optimization for ln

* fix

* added output to gpd

* added formatting

* fix

519e7426

[Zero-Dim] Make auto parallel judge dim more strict (#47961) · 626d7bcb

Authored by zhouweiwei2014 on Nov 15, 2022

626d7bcb

mkldnn directory cleanup (#47779) · 8a339d24

Authored by Sławomir Siwek on Nov 15, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid dependency

8a339d24

14 Nov, 2022 3 commits

Refactor collective communication send_partial, recv_partial, all_gather_partial C++ API (#47863) · 25e63dca

Authored by Wen Sun on Nov 14, 2022

* refactor: simplify send, recv interfaces

* refactor: rm send_partial, recv_partial, all_gather_partial

25e63dca

[Paddle Inference] Add where trt converter (#47820) · dac0f7dd

Authored by xiaoxiaohehe001 on Nov 14, 2022

dac0f7dd

Add InferShape for Depend OP (#47907) · 5478e1a5

Authored by Ruibiao Chen on Nov 14, 2022

5478e1a5

11 Nov, 2022 3 commits

[Zero-Dim] fix batch_norm op infermeta bug (#47858) · 18549417

Authored by zhouweiwei2014 on Nov 11, 2022

18549417

Refine shape op launch method for standalone executor (#47843) · 981d1a10

Authored by zhangbo9674 on Nov 11, 2022

* refine shape op in new_exe

* Revert "refine shape op in new_exe"

This reverts commit 0e0336ddc5eede3da019b348a0bcc0ef0f3be64e.

* refine shape op in new_exe

* refine shape expected_kernel_type

* add SelectedRows check for shape op

* refine code

981d1a10

Generate static graph code for some ops by yaml (part3) (#47803) · 31f3f643

Authored by zyfncg on Nov 11, 2022

* generate static graph code for some ops by yaml

* remove deleted files

* update cmake

* update cmake

* update cmake

31f3f643

10 Nov, 2022 5 commits

[phi] migrate prelu (#47422) · cdd8c8ab

Authored by Sylwester Fraczek on Nov 10, 2022

* migrate prelu

* remove cache

* review fixes

cdd8c8ab

[PHI]Standardise some C++ API (Part4) (#47702) · 594bd723

Authored by YuanRisheng on Nov 10, 2022

* standard api

* fix sparse bugs

* fix xpu bugs, test=kunlun

* remove hard code for custom unittest

* open ci, test=kunlun

* deal with conflict

594bd723

XPU multi-card support eager mode (#47445) · 3b91f8f3

Authored by james on Nov 10, 2022

* XPU support eager mode

* add unittest for XPU eager mode

* minor bugfix

* minor bugfix, test=kunlun

* correct copyright info

* 1. remove unused vars/funcs
  2. ProcessGroupBKCL inherit from ProcessGroupStream

* bugfix for fp16 in eager mode multi-card, test=kunlun

* rebase & fix a few issues

* use new processgroup interface, test=kunlun

* fix compile issue, test=kunlun

3b91f8f3

Add CI check for script of auto code-gen (#47814) · 00ea0b2f

Authored by zyfncg on Nov 10, 2022

* add ci check for code-gen script

* update

00ea0b2f

support pow_triple_grad op (#47799) · 7964119b

Authored by Charles-hit on Nov 10, 2022

7964119b

09 Nov, 2022 7 commits

clean repetitious GetKernelTypeForVar (#47763) · c551e55d

Authored by HongyuJia on Nov 09, 2022

c551e55d

fix for missing reorders in profiling (#47777) · a97b3630

Authored by jakpiase on Nov 09, 2022

a97b3630

cleanup unused code (#47762) · fb16fea3

Authored by Sławomir Siwek on Nov 09, 2022

fb16fea3

Final changes to introduce mem_desc to be held in Tensor (#46768) · 14f261ad

Authored by Jacek Czaja on Nov 09, 2022

* first commit

- more fixes

- compilation fix

- compilation fix

- fix

- another fix

- yet another fix

- Fix

- fix to fused ops

- compilation fix

- compilation fix

- another compilation fix

- another fix

- fix

- fix

- fix

- fix

- yet another fix

- fix

- fix

- cosmetic fix

- lint

- Revert some changes (to be brought back later)

- fix to build

- Added prototype of slice

- fix

- compilation fix

- compilation fix

- fix

- fix

- Fix

- fix

- fix (modified: cmake/flags.cmake)

* lint

* rerun of CI

* Fix

* lint

* lint2

14f261ad

Generate static graph code for some ops by yaml (part2) (#47752) · ccb47076

Authored by zyfncg on Nov 09, 2022

* generate static graph code of some op

* polish code

* fix bug

* update default value

ccb47076

[PHI decoupling] Move fluid op generator into fluid (#47714) · f369b2b1

Authored by Chen Weihang on Nov 09, 2022

* move fluid op generator into fluid

* remove parsed op

* resolve sig undef error

* append python interp find logic

* remove dup code

f369b2b1

new mp_allreduce_sum_op (#47715) · 18d33346

Authored by LiYuRio on Nov 09, 2022

18d33346

08 Nov, 2022 1 commit

Split quant (#47449) · 130db92a

Authored by Paulina Gacek on Nov 08, 2022

* Split kernel registered, tests for uint/int added

* Split quantized

* Split output scales calculated only once

* NearestInterp test fix reversed

* DequantizeOutputs corrected

130db92a

PaddlePaddle / Paddle synced successfully about 1 year ago
