Commits · 88cac16b2cd3797c5009b0505fbed40ec39714ae · BaiXuePrincess / Paddle

Nov 23, 2022 · 4 commits
- [PHI decoupling] move im2col from fluid to phi (#48174) · 88cac16b
  Committed by huangjiyi on Nov 23, 2022
```
* decouple im2col from fluid

* move im2col to phi

* fix build error

* delete redundant comment
```
- add support of controlflow op for custom device (#48259) · edf46919
  Committed by duanyanhui on Nov 23, 2022
- add warpctc kernel and change cast_v2 to cast for xpu, test=kunlun (#48134) · 25ffe9c2
  Committed by zhangyikun02 on Nov 23, 2022
- Use cublaslt in multi transformer FFN (#48052) · b07e6b45
  Committed by MarDino on Nov 23, 2022
```
* use fused mlp in multi transformer
* Restruct code
* use cublaslt to fuse ffn
* fix conflict
```

Nov 22, 2022 · 7 commits

[PHI] Migrate elementwise_div + all elementwise grad kernels (#48210) · 78b30e97
Committed by Piotr Paturej on Nov 22, 2022
```
* Migrate elementwise_div

* Migrate elementwise grad kernels
```

fix typo error (#48156) · 0cdca676
Committed by HongyuJia on Nov 22, 2022

[PHI decoupling] move vol2col from fluid to phi (#48175) · aa36c6aa
Committed by huangjiyi on Nov 22, 2022
```
* move vol2col from fluid to phi

* update copyright year
```

CudnnNormConvolution is no longer supported on NVIDIA Hopper GPUs (#48203) · df4dfda0
Committed by Tian Zheng on Nov 22, 2022
```
* Skip tests that use fused_ops on H100

* Add error message to FusedOps on H100
```

Some residualdata fixes (#48118) · 7bbdbe5b

Committed by Sylwester Fraczek on Nov 22, 2022

* Removed ResidualData and Bias from ExtraAttrProperties because it's not an attribute.
* Removed bug with checking for ResidualData attribute in matmul_elementwise_add_fuse_pass
* Removed residualData from list of matmul outputs in cpu_bfloat16_pass.cc because it's input

Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

Delete caching from requantize_mkldnn_op and changed to Acquire API (#48113) · 7d6a4a54
Committed by Hulek on Nov 22, 2022
```
* Delete caching from requantize_mkldnn_op and changed to Acquire API
* Fixed codestyle and implementation
```

[PHI decoupling] remove "gpu_device_function.h" in fluid. (#48117) · 4da1a0fe

Committed by huangjiyi on Nov 22, 2022

* move "paddle/phi/backends/gpu/gpu_device_function.h" to phi

* update copyright years

* rm "fluid/platform/device/gpu/gpu_device_function.h" in phi

* rm dependence to "gpu_device_function.h" in fluid

* rm gpu_device_function.h etc in fluid

* fix rocm-complie bugs

* fix cuda_helper_test.cu bugs

Nov 21, 2022 · 5 commits

add fc-residual quantization (#46917) · fed0ed34

Committed by Sylwester Fraczek on Nov 21, 2022

* add fc-residual quantization

* revert removal of check for use_mkldnn

* fix bug

* add disable_logs

* review fix

call twice AreScalesPresntForNodes instead of if-else

* rewrite residual input to output

* revert fc mkldnn taking residual data

* format fix

* fix LoDTensor->DenseTensor

* LoDTensor->DenseTensor

* output->input

* revert changes to unsupported script

revert changes to unsupported script

* remove fc residualdata from output blocklist in cpu_bfloat16_pass.cc

[PHI] Migrate mul_grad kernel (#48061) · 55f6fb3d

Committed by Sławomir Siwek on Nov 21, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid depedency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

* mul_grad

mma qk tensor_core (#48087) · d79eda71

Committed by lzy on Nov 21, 2022

* use mma for QK dot computing in fused_multi_transformer.
* Update fused_multi_transformer_op.cu.h

[PHI decoupling] move cross_entropy from fluid to phi (#48160) · 3501ff7d
Committed by huangjiyi on Nov 21, 2022
```
* move cross_entropy from fluid to phi

* replace mutable_data with Alloc

* use .template
```

Unify `ProcessGroupNCCL` APIs underlying implementation (#48163) · 88410225

Committed by Wen Sun on Nov 21, 2022

* refactor: replace Collective & PointToPoint with NCCLEnv

* refactor: rename to RunFnInNCCLEnv

* refactor: pass std::function by value

Nov 18, 2022 · 7 commits

Fused QKVBiasAdd and Transpose with Split Q, KV (#47680) · d595928e

Committed by MarDino on Nov 18, 2022

* fused qkvBiasAdd and transpose with split qkv

* fix typo

* fix format

* fix name

* add annotation

* fix comment

[PHI] Migrate matmul_grad kernel (#48023) · 4ab18ada

Committed by Sławomir Siwek on Nov 18, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid depedency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

[PHI] Migrate conv_transpose kernel (#48119) · 9aacb31b

Committed by Zuza Gawrysiak on Nov 18, 2022

* Migrate conv_transpose to phi

* Move handler to kernel

* kernel m

* Fix formatting

* handler

* remove fluid

* revert tcp_store

* tcp_store

* remove unused

* Fix declaration

* add dnn input

* Fix typo

Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

Optimize FusedBiasAddGelu Kernel (#47679) · b0e28540

Committed by MarDino on Nov 18, 2022

* Add quick gelu and fused bias add kernel

* fix annotation

* remove useless code

* add fast gelu option and set it in multi transformer op

* add flag to restrict if use fast gelu approximate

* fix flags conflict

* fix use tanh function instead

* add cudart version limit

* use phi fast tanh func

* fix comment

[PHI decoupling] remove "gpu_primitives.h" in fluid (#48063) · 9918bf9c
Committed by Wang Xin on Nov 18, 2022
```
* remove "gpu_primitives.h" in fluid namespace

* fix PR-CI-GpuPS fail

* fix PR-CI-GpuPS fail
```

fix: supoort huge length of attention (#48053) · 42f35841
Committed by feng_shuai on Nov 18, 2022

rm "paddle/fluid/operators/amp/fp16_type_traits.h" in phi (#48051) · e4670d80
Committed by huangjiyi on Nov 18, 2022

Nov 17, 2022 · 6 commits
- Clip intermediate output of op when save inference model (#48026) · fafc7be2
  Committed by zyfncg on Nov 17, 2022
```
* clip extra and intermediate output of op

* fix bug

* fix bug

* polich code

* polich log
```
- rm "paddle/fluid/framework/convert_utils.h" in phi (#48001) · 2f34fc7a
  Committed by huangjiyi on Nov 17, 2022
- [PHI]Standardise some C++ API (Part5) (#47860) · f3650201
  Committed by YuanRisheng on Nov 17, 2022
```
* standard api

* fix xpu bugs
```
- xpu-paddlepaddle-41 [Task] ffn and attention test=kunlun (#46658) · 071708fa
  Committed by taixiurong on Nov 17, 2022
- [PHI decoupling] move "paddle/fluid/operators/math.h" to phi (#48062) · f62bd3b4
  Committed by huangjiyi on Nov 17, 2022
```
* rm "paddle/fluid/operators/math.h" in phi

* rm "paddle/fluid/operators/math.h" in fluit
```
- generate static graph code for some op (#48036) · 7cc0d171
  Committed by zyfncg on Nov 17, 2022

Nov 15, 2022 · 4 commits

fix onednn bugs, test=document_fix (#48013) · 21d4fa02
Committed by YuanRisheng on Nov 15, 2022

Added optimization pass for oneDNN layernorm kernel (#47782) · 519e7426
Committed by jakpiase on Nov 15, 2022
```
* optimization for ln

* fix

* added output to gpd

* added formatting

* fix
```

[Zero-Dim] Make auto parallel judge dim more strict (#47961) · 626d7bcb
Committed by zhouweiwei2014 on Nov 15, 2022

mkldnn directory cleanup (#47779) · 8a339d24

Committed by Sławomir Siwek on Nov 15, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid depedency

Nov 14, 2022 · 3 commits
- Refactor collective communication send_partial, recv_partial, all_gather_partial C++ API (#47863) · 25e63dca
  Committed by Wen Sun on Nov 14, 2022
```
* refactor: simplify send, recv interfaces

* refactor: rm send_partial, recv_partial, all_gather_partial
```
- [Paddle Inference] Add where trt converter (#47820) · dac0f7dd
  Committed by xiaoxiaohehe001 on Nov 14, 2022
- Add InferShape for Depend OP (#47907) · 5478e1a5
  Committed by Ruibiao Chen on Nov 14, 2022

Nov 11, 2022 · 3 commits

[Zero-Dim] fix batch_norm op infermeta bug (#47858) · 18549417
Committed by zhouweiwei2014 on Nov 11, 2022

Refine shape op lanch method for standalone executor (#47843) · 981d1a10

Committed by zhangbo9674 on Nov 11, 2022

* refine shape op in new_exe

* Revert "refine shape op in new_exe"

This reverts commit 0e0336ddc5eede3da019b348a0bcc0ef0f3be64e.

* refine shape op in new_exe

* refine shape expected_kernel_type

* add SelectedRows check for shape op

* refine code

Generate static graph code for some ops by yaml (part3) (#47803) · 31f3f643

Committed by zyfncg on Nov 11, 2022

* generate static graph code for some ops by yaml

* remove deleted files

* update cmake

* update cmake

* udpate cmake

Nov 10, 2022 · 1 commit
- [phi] migrate prelu (#47422) · cdd8c8ab
  Committed by Sylwester Fraczek on Nov 10, 2022
```
* migrate prelu

* remove cache

* review fixes
```

BaiXuePrincess / Paddle is in sync with its fork source project.
