提交 · 7bf7e6e0f97b40e739858b10e353a3a9998458d8 · BaiXuePrincess / Paddle

29 11月, 2022 12 次提交

由 lzy 提交于 11月 29, 2022

* fix mma_tensorcore (__CUDA_ARCH__)

* disable tensorcore by default.

disable tensorcore by default, because the judgment of __CUDA_ARCH__ will cause undefined behavior in some environments, can manually enable it on a machine that supports tensorcore.

bf4d1792

[PHI] traspose2 kernel migration (#47748) · d86aa4ca

由 Paulina Gacek 提交于 11月 29, 2022

* traspose2 kernel migrated

* Got rid of mutable_data

* x modification added

* ops added in extra info file

* Formatting fix

* 2 fuse passes with tanpose2 commented

* nr of outs changed in 2 passes, passes uncommented

* Changes in passes reverted

* transpose chnaged in operator.cc

* MKLDNN check in operator.cc

* Transpose fixes

* Fix deleted from operato

* template corrected
Co-authored-by: NPaulina Gacek <paulinagacek@intel.com>

d86aa4ca

张

Replace LoDTensor with phi::DenseTensor in fluid\operators (#48417) · 91dd8a2e

由张春乔提交于 11月 29, 2022

* replace LoDTensor with phi::DenseTensor in fluid\operators

* replace LoDTensor with phi::DenseTensor in fluid\operators

* Update split_lod_tensor_op.cc

* Update warpctc_op.cc

* Update broadcast_tensors_op.cc

* Update crf_decoding_op.cc

* Update lstm_op.cc

* Update lstm_op.cc

* Update lod_reset_op.cc

* Update gru_op.cc

* Update linear_chain_crf_op.cc

* resume 2 files for confilct

* Update gru_op.cc

* Update linear_chain_crf_op.cc

* Update lstm_op.cc

91dd8a2e

N
[CodeStyle][isort] introduce isort (part4) (#48402) · f85def97
由 Nyakku Shigure 提交于 11月 29, 2022
```
* isort all files

* revert conflicting files

* revert conflicting files

* revert conflicting files
```
f85def97
S

eltwise_div + scale [PHI] (#48484) · fa10524d
由 Sławomir Siwek 提交于 11月 29, 2022

fa10524d

[PHI] Migrate matmul kernel (#48162) · f41ccbd5

由 Sławomir Siwek 提交于 11月 29, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid depedency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

* mul_grad

* matmul fwd

* add extra attr

* temp disable passes

* re-enable passes

* workaround for matmul+act

* fix for matmul+eltwise_add

* fix typo

* merge bugfix #48364

* remove merge conflict

f41ccbd5

[Control Flow] replace executor in while op with InterpreterCore (#47573) · 6dbfbfa5

由 kangguangli 提交于 11月 29, 2022

* fix:add no support for cuda_arch<700

* replace Executor in while op with InterpreterCore

* cache InterpreterCore as the member of WhileOp

* fix bug: tensor place changed because of assign op in while loop

* refine code

* refine code

* refine code

* hot fix

* fix compile

* merge develop

* follow comments

* add log for test

* remove LoDTensor

* set flag control_flow_use_new_executor false
Co-authored-by: Nfengshuai <fengshuai03@baidu.com>
Co-authored-by: Nzhiqiu <chenqiuliang@baidu.com>

6dbfbfa5

J
Bugfix for Collective default calc stream (#48308) · a66bb67a
由 JZ-LIANG 提交于 11月 29, 2022
```
* get default calc stream from execution ctx instead of global dev ctx pool.
```
a66bb67a

[Fluid API]Remove multiple APIs in control_flow (#48279) · c0d31dac

由 LiYuRio 提交于 11月 29, 2022

* remove lod_tensor_to_array, array_to_lod_tensor, DynamicRNN

* remove less_equal, greater_than, greater_equal, equal, not_equal

c0d31dac

S

[PHI decoupling] Move MKLDNN code (#48352) · fa051eec
由 Sławomir Siwek 提交于 11月 29, 2022

fa051eec

Generate static graph code for lerp by yaml (#48322) · d5387de2

由 HappyHeavyRain 提交于 11月 29, 2022

* generate static graph code for lerp by yaml, test=develop

* modify the op_compat.yaml of lerp, test=develop

* generate static graph code for lerp by yaml, test=develop

* modify the op_compat.yaml of lerp, test=develop

* remove the 'attrs' of lerp, test=develop
Signed-off-by: lizhiyu02 <1528794076@qq.com>
Signed-off-by: lizhiyu02 <1528794076@qq.com>

d5387de2

Z

[Sparse]BatchNorm use inplace (#48254) · d33d6db0
由 zhangkaihuo 提交于 11月 29, 2022

d33d6db0

28 11月, 2022 9 次提交
- S
  
  eltwises + scale fuse pass (#48400) · a0930484
  由 Sławomir Siwek 提交于 11月 28, 2022
  
  a0930484
- J
  Reenabled reshape, squeeze and flatten oneDNN kernels (#48359) · 98aaf797
  由 jakpiase 提交于 11月 28, 2022
```
* re-enabled reshape, squeeze and flatten kernels

* added formatting
```
  98aaf797
- W
  fix: multihead matmul biasqk broadcast support for [1,1,seq,seq] shape (#47975) · 11b9d85f
  由 Wang Bojun 提交于 11月 28, 2022
```
* add trt support
```
  11b9d85f
- Z
  Generate static graph code for some ops by yaml (part5) (#48284) · b5c6c36c
  由 zyfncg 提交于 11月 28, 2022
```
* generate static graph code for some operators

* add some ops generate

* revert npu gelu
```
  b5c6c36c
- H
  [PHI decoupling] move several header files from fluid to phi (#48415) · fd9c91c3
  由 huangjiyi 提交于 11月 28, 2022
```
* decouple cudnn_desc.h from fluid

* move cudnn_desc.h from fluid to phi

* fix bugs

* decouple cudnn_helper.h from fluid

* fix bugs

* move cudnn_helper.h from fluid to phi

* add fluid cudnn_helper.h

* move miopen_desc.h from fluid to phi

* move miopen_helper.h from fluid to phi

* fix bugs

* move gpu_dnn.h from fluid to phi

* fix bugs

* update copyright year

* simplify gpu_dnn.h in fluid

* fix bugs

* fix xpu build bug

* fix compile bug

* fix bug
```
  fd9c91c3
- 张
  
  replace LoDTensor with phi::DenseTensor in fluid\operators\*\ except sequence_ops (#48418) · 30a31a53
  由张春乔提交于 11月 28, 2022
  
  30a31a53
- A
  
  migrate top_k_function_cuda.h from fluid to phi (#48251) · b4b926f4
  由 Asthestarsfalll 提交于 11月 28, 2022
  
  b4b926f4
- Use phi layernorm (#48276) · 86d92092
  由 MarDino 提交于 11月 28, 2022
  
  86d92092
- W
  
  add pbtxt (#48326) · d7540a4a
  由 wenbin 提交于 11月 28, 2022
  
  d7540a4a
24 11月, 2022 2 次提交

[PHI decoupling] simplify "convert_utils.h" in fluid (#48168) · de4310e6

由 huangjiyi 提交于 11月 24, 2022

* rm dependence to "convert_utils.h" in some files

* fix bugs

* replace DataType2String with DataTypeToString

* replace framework::DataTypeSize with phi::SizeOf

* mv convert_function from fluid to phi and rm old map

* recommit with pre-commit

* repalce ProtoVarType with ProtoDataType and update comment.

* fix error about include "dnnl.hpp"

* revert add dep mkldnn to convert_utils in phi

* add mkldnn deps in convert_utils.h in phi

* move deps to convert_utils.h in phi

de4310e6

S

[PHI] Migrate batch_norm_grad kernel (#48288) · 561b7278
由 Sławomir Siwek 提交于 11月 24, 2022

561b7278

23 11月, 2022 4 次提交
- H
  [PHI decoupling] move im2col from fluid to phi (#48174) · 88cac16b
  由 huangjiyi 提交于 11月 23, 2022
```
* decouple im2col from fluid

* move im2col to phi

* fix build error

* delete redundant comment
```
  88cac16b
- D
  
  add support of controlflow op for custom device (#48259) · edf46919
  由 duanyanhui 提交于 11月 23, 2022
  
  edf46919
- Z
  
  add warpctc kernel and change cast_v2 to cast for xpu, test=kunlun (#48134) · 25ffe9c2
  由 zhangyikun02 提交于 11月 23, 2022
  
  25ffe9c2
- Use cublaslt in multi transformer FFN (#48052) · b07e6b45
  由 MarDino 提交于 11月 23, 2022
```
* use fused mlp in multi transformer
* Restruct code
* use cublaslt to fuse ffn
* fix conflict
```
  b07e6b45
22 11月, 2022 7 次提交

P
[PHI] Migrate elementwise_div + all elementwise grad kernels (#48210) · 78b30e97
由 Piotr Paturej 提交于 11月 22, 2022
```
* Migrate elementwise_div

* Migrate elementwise grad kernels
```
78b30e97
H

fix typo error (#48156) · 0cdca676
由 HongyuJia 提交于 11月 22, 2022

0cdca676
H
[PHI decoupling] move vol2col from fluid to phi (#48175) · aa36c6aa
由 huangjiyi 提交于 11月 22, 2022
```
* move vol2col from fluid to phi

* update copyright year
```
aa36c6aa
T
CudnnNormConvolution is no longer supported on NVIDIA Hopper GPUs (#48203) · df4dfda0
由 Tian Zheng 提交于 11月 22, 2022
```
* Skip tests that use fused_ops on H100

* Add error message to FusedOps on H100
```
df4dfda0

Some residualdata fixes (#48118) · 7bbdbe5b

由 Sylwester Fraczek 提交于 11月 22, 2022

Removed ResidualData and Bias from ExtraAttrProperties because it's not an attribute.
Removed bug with checking for ResidualData attribute in matmul_elementwise_add_fuse_pass
Removed residualData from list of matmul outputs in cpu_bfloat16_pass.cc because it's input
Co-authored-by: NTomasz Socha <tomasz.socha@intel.com>

7bbdbe5b

H
Delete caching from requantize_mkldnn_op and changed to Acquire API (#48113) · 7d6a4a54
由 Hulek 提交于 11月 22, 2022
```
* Delete caching from requantize_mkldnn_op and changed to Acquire API
* Fixed codestyle and implementation
```
7d6a4a54

[PHI decoupling] remove "gpu_device_function.h" in fluid. (#48117) · 4da1a0fe

由 huangjiyi 提交于 11月 22, 2022

* move "paddle/phi/backends/gpu/gpu_device_function.h" to phi

* update copyright years

* rm "fluid/platform/device/gpu/gpu_device_function.h" in phi

* rm dependence to "gpu_device_function.h" in fluid

* rm gpu_device_function.h etc in fluid

* fix rocm-complie bugs

* fix cuda_helper_test.cu bugs

4da1a0fe

21 11月, 2022 5 次提交

add fc-residual quantization (#46917) · fed0ed34

由 Sylwester Fraczek 提交于 11月 21, 2022

* add fc-residual quantization

* revert removal of check for use_mkldnn

* fix bug

* add disable_logs

* review fix

call twice AreScalesPresntForNodes instead of if-else

* rewrite residual input to output

* revert fc mkldnn taking residual data

* format fix

* fix LoDTensor->DenseTensor

* LoDTensor->DenseTensor

* output->input

* revert changes to unsupported script

revert changes to unsupported script

* remove fc residualdata from output blocklist in cpu_bfloat16_pass.cc

fed0ed34

[PHI] Migrate mul_grad kernel (#48061) · 55f6fb3d

由 Sławomir Siwek 提交于 11月 21, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid depedency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

* mul_grad

55f6fb3d

mma qk tensor_core (#48087) · d79eda71

由 lzy 提交于 11月 21, 2022

* use mma for QK dot computing in fused_multi_transformer.
* Update fused_multi_transformer_op.cu.h

d79eda71

H
[PHI decoupling] move cross_entropy from fluid to phi (#48160) · 3501ff7d
由 huangjiyi 提交于 11月 21, 2022
```
* move cross_entropy from fluid to phi

* replace mutable_data with Alloc

* use .template
```
3501ff7d

Unify `ProcessGroupNCCL` APIs underlying implementation (#48163) · 88410225

由 Wen Sun 提交于 11月 21, 2022

* refactor: replace Collective & PointToPoint with NCCLEnv

* refactor: rename to RunFnInNCCLEnv

* refactor: pass std::function by value

88410225

18 11月, 2022 1 次提交

Fused QKVBiasAdd and Transpose with Split Q, KV (#47680) · d595928e

由 MarDino 提交于 11月 18, 2022

* fused qkvBiasAdd and transpose with split qkv

* fix typo

* fix format

* fix name

* add annotation

* fix comment

d595928e

BaiXuePrincess / Paddle 与 Fork 源项目一致

BaiXuePrincess / Paddle
与 Fork 源项目一致