Commits · cbe8e6e9a8a492ff59f24afce8ded226f99fb53d · PaddlePaddle / Paddle

07 Apr, 2023 · 20 commits
- [AMP OP&Test] Fix the logic of calling infer_dtype func in op test (#52581) · cbe8e6e9
  Committed by Zhang Zheng on Apr 07, 2023
```
* [AMP OP&Test] Fix the logic of calling infer_dtype func in op test

* add fp16
```
- Isolate DenseTensor::set_type and DenseTensor::set_layout from header file (#52591) · f5ae67e8
  Committed by Ruibiao Chen on Apr 07, 2023
```
* Isolate DenseTensor::set_type from header file

* Fix selected_rows
```
- [AMP]register bf16 for communication ops (#52555) · 9a0de116
  Committed by shaojie_wang on Apr 07, 2023
```
* register bf16 for communication ops

* fix bfloat16 type finding compile error in c_allreduce_max_op
```
- Polish dy2st error message (#52527) · 8da89b81
  Committed by WangZhen on Apr 07, 2023
  
- [Dy2St] fix train step random failed on Windows (#52580) · d2939cab
  Committed by Nyakku Shigure on Apr 07, 2023
  
- add autogen code support for warpctc op (#52610) · a62de41a
  Committed by Zhenghai Zhang on Apr 07, 2023
  
- fix qat export bug when input_spec is None (#52615) · fa949b1b
  Committed by Guanghua Yu on Apr 07, 2023
  
- add distributed p_send/p_recv/reduce_scatter operator (#51858) · 2b12a117
  Committed by TaoTao Li on Apr 07, 2023
```
fix merge conflicts
```
- [prim] support set output dtype for autogen (#52475) · d6a38532
  Committed by Xiaoxu Chen on Apr 07, 2023
  
- [FLUID_API_CLEAN]remove zeros (#52536) · f6d4ae3d
  Committed by risemeup1 on Apr 07, 2023
```
* remove zeros

* remove zeros

* apply gcc12 to py3

* apply gcc12 to py3

* fluid api clear

* fluid api clean

* fluid api clean
```
- support auto generate static for tril_indices and triu_indices (#52537) · f3e8c4be
  Committed by RedContritio on Apr 07, 2023
  
- [Dy2static] [bugfix] fixed a bug which happens while parsing grad var name (#52110) · e459ac03
  Committed by feifei-111 on Apr 07, 2023
```
* fix dy2s grad name parse

* pre-commit

* bug fix

* Fix grad/ error

* Format code

---------
Co-authored-by: 0x45f <wangzhen45@baidu.com>
```
- fix_build_ci_error (#52576) · 8630375c
  Committed by risemeup1 on Apr 07, 2023
```
* fix_build_ci_error

* fix_build_ci_error

* fix_build_ci_error
```
- unify kernel (#52594) · 09da1c4c
  Committed by YuanRisheng on Apr 07, 2023
  
- [CustomDevice] Add enable custom device C Api (#52568) · 5662adcc
  Committed by wenzhe.wang on Apr 07, 2023
```
fix bugs
Co-authored-by: wenzhe.wang <wenzhe.wang@xdxct.com>
```
- fix mkdir (#52570) · 41226d55
  Committed by Roc on Apr 07, 2023
```
* fix mkdir

* update
```
- [Test MV] standalone_executor (#52520) · 5e63038a
  Committed by Happyd99 on Apr 07, 2023
```
* [Test MV] standalone_executor

* update

as

* update

as

* update codestyle
```
- clean up WITH_MLU (#52546) · e75c01f9
  Committed by Wang Xin on Apr 07, 2023
  
- [kunlun] bugfix for collective softmax_with_ce (#52565) · 075d6b14
  Committed by jameszhang on Apr 07, 2023
  
- add argmax to ops (#52562) · d947b20a
  Committed by engineer1109 on Apr 07, 2023
  
06 Apr, 2023 · 20 commits

support hybrid parallel in qat (#52219) · 5c19bfc8
Committed by ceci3 on Apr 06, 2023

fix build bug (#52566) · 6c01ce8a
Committed by yuehuayingxueluo on Apr 06, 2023

[StandaloneExe] improving sequentialRun mode of standaloneExecutor (#52111) · 14fe4b54
Committed by kangguangli on Apr 06, 2023

* Verify SequentialRun Model of StandaloneExecutor

* fix

* fix

* fix

* remove redundant code

* fix CI

* fix CI

* recover multi-step dependency

Committed by huangjiyi on Apr 06, 2023

* update

* fix compile bug

* fix bug

* fix bug

* revert crop_op

* fix xpu compile

* fix cinn compile

* fix bug

* fix bug

* fix bug

* fix bug

* update

* update

* update

058ca61d

Remove oneDNN-specific attributes from matmul (#49444) · 4d97b25d

Committed by Sławomir Siwek on Apr 06, 2023

* replace matmul with matmul_v2 in fuse passes

* Remove fusion logic from matmul

* removing fusion methods

* add proper name

* adjust namespaces

* clean attrs in python tests

* delete checkpoint and restore matmul version

* remove unused code

* matmul and reshape/transpose fuses migrated

* split MatmulOneDNN headers

* fuse activation and eltwise_add

* add fuse_activation

* matmul_transpose_reshape/reshape_transpose_matmul

* matmul + elementwise_add (fused)

* activation temporary modification

* restore matmul(v1) version 0

* merge newest develop

* remove dependency from other PR

* revert pbtxt

* remove placeholders from matmul_v2

* add description in OPMaker

* remove matmul_v2_op.h and all dependencies

* remove dims changing in base op

* add possibility to fuse already fused_matmul

* restart broken CI

* Empty-Commit

* revert matmul_utils.h

* codestyle

* adjust imports

* add pbtxt file

* 100% matmul unit tests coverage

* trigger CI with minimal changes to develop

* adjust changes to develop

* add fused_matmul op

* inherit base ops

* add "v2"

* move OPMaker

* Gradually add fused_matmul files

* second batch of fused_matmul changes

* split infershapes of matmul_v2 and fused_matmul

* merge code from other PR

* 2023

* inherit fused_matmul from matmul_v2

* Update paddle/phi/backends/onednn/onednn_reuse.h
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

* Update paddle/phi/kernels/fusion/onednn/fused_matmul_kernel.cc
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

* resolve conflicts

* codestyle

* simplify isgemmlinear

* 2023

* remove import

* reuse methods

* matmul_v2_mkldnn cleanup

* simplify ExecuteMatMulV1Grad

* matmul refactored

* fc

* SetOutMemDescWithLogicalLayoutFusesSupport

* matmul_v2

* alpha support

* group repetitive funcs

* matmul utils

* execute matmul methods

* restore registered kernel names

* split header and impl files

* remove double negatives

* reduce number of modified files

* adjust ExecuteMatmul

* add scales for ut

* dates

* limit number of modified files

* fluid imports

* remove alpha

* codestyle

---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

Move fused_attention op to phi [migrate forward GPU OpKernel] (#51743) · a7ec8958
Committed by Sonder on Apr 06, 2023

* add kernel functions

* update kernel functions

* update func parameters' name

* create codes for gpu device

* adjust file locations

* fix include error

* remove dependent files to phi/

* restore fused_attention_op.cu

* fix dependence errors

* fix dependence errors

* fix include error

* fix all dependence errors[build success]

* remove useless include

* recover useless include

* use phi::ToNCCLDataType

* fix namespace

* update new register code

* fix error in fused_gemm_epilogue_utils

* fix error in FusedAttentionKernel parm

* finish fused_attention registe code[build success]

* add paddle::optional

* add sig file

* fix build error

* fix a include error

* update CMakeLists

* fix parameter sequence

* add include file

* update #if before include

* fix grammar error

* update codes for DropoutParam

* remove const cast

* trans some fluid api to phi api

* add #if

* update test code

* update test codes

* recover test codes

* trans fused_attention to fluid

* move #endif to end

* move #endif

* delete useless files

* use fused attention utils and recover random seed

* remove fluid include in phi

add autogen code support for logical_and, logical_not, logical_or and logical_xor (#52451) · 6df4a667
Committed by scotty on Apr 06, 2023

support auto generate static for assign_value (#52534) · d394c9ed
Committed by RedContritio on Apr 06, 2023

support auto generate static for decode_jpeg (#52542) · c1f97a9b
Committed by RedContritio on Apr 06, 2023

mv PADDLE_WITH_ASCEND_CL (#52535) · 80dd1672
Committed by 张春乔 on Apr 06, 2023

support more custom vjp (#52533) · 29c28e2f
Committed by Jiabin Yang on Apr 06, 2023

Delete some flags in COMMON_FLAGS (#52564) · 5257a79e
Committed by Galaxy1458 on Apr 06, 2023

* delete [-Wno-error=terminate], test=develop

* remove GPUps[-Wterminate],test=develop

* remove some -Wno-, test=develop

[CodeStyle][B017] catch more specific exceptions in unittests (#52553) · 9dbfadab
Committed by Nyakku Shigure on Apr 06, 2023

Modify the condition of _get_places in fp16 (#52508) · 160dfd01
Committed by Zhang Zheng on Apr 06, 2023

feat: add composite rule of roll grad (#52532) · 348a36b5
Committed by Kang Zhao on Apr 06, 2023

* feat: add relu composite rule

* feat: add relu composite rule, maximum op

* feat: add relu composite rule, maximum op

* feat: add relu composite rule, polish comments

* feat: add relu composite rule, polish comments

* feat: add relu composite rule, add python api of relu

* feat: add relu composite rule, commit hook

* fix: maximum type error & ban cinn test

* fix: maximum input sequence bugs

* resolve conflicts

* fix: code style bugs

* add: relu fp16 test

* feat: add rsqrt composite rule

* feat: add rsqrt composite rule

* resolve conflicts of composite rule

* fix: delete check eager

* feat: add roll grad composite rule

* fix minus shift

* fix test roll op

Rename conv2d transpose grad grad (#52371) · 49bbd466
Committed by zhangyuqin1998 on Apr 06, 2023
```
* Rename conv2d transpose grad grad

* fix
```
[Retirement of Ascend and Cambricon code] No.9 Clean up PADDLE_WITH_ASCEND-related code (#52403) · 262ea02a
Committed by 陈沧夜 on Apr 06, 2023

[CINN] disable CINN test_mean_op unittest to pass CINN CI (#52510) · b36ac56d
Committed by jiangcheng on Apr 06, 2023
```
* [CINN] disable CINN test_mean_op unittest to pass CINN CI

* disable test_mean_op for pass ci
```
fix backend bug (#52526) · 380a9bf7
Committed by Chitsing KUI on Apr 06, 2023

Fix flash attention bug (#52551) · 8ac5a6b6
Committed by sneaxiy on Apr 06, 2023
```
* fix flash attn

* fix another API
```

PaddlePaddle / Paddle · synced successfully about 1 year ago
