Commits · 392787315da0eeff839700bb5d0b26bb0c1a8e9a · PaddlePaddle / Paddle

07 Apr, 2023 11 commits
- [Executor] remove run_program branch (#52471) · 39278731
  Committed by kangguangli on Apr 07, 2023
```
* remove run_program

* remove FLAGS_USE_STANDALONE_EXECUTOR
```
  39278731
- 【Hackathon4】No58. empty (#52388) · 47c740e7
  Committed by cyberslack_lee on Apr 07, 2023
  
  47c740e7
- [AMP OP&Test] Fix the logic of calling infer_dtype func in op test (#52581) · cbe8e6e9
  Committed by Zhang Zheng on Apr 07, 2023
```
* [AMP OP&Test] Fix the logic of calling infer_dtype func in op test

* add fp16
```
  cbe8e6e9
- Polish dy2st error message (#52527) · 8da89b81
  Committed by WangZhen on Apr 07, 2023
  
  8da89b81
- fix qat export bug when input_spec is None (#52615) · fa949b1b
  Committed by Guanghua Yu on Apr 07, 2023
  
  fa949b1b
- add distributed p_send/p_recv/reduce_scatter operator (#51858) · 2b12a117
  Committed by TaoTao Li on Apr 07, 2023
```
fix merge conflicts
```
  2b12a117
- [FLUID_API_CLEAN]remove zeros (#52536) · f6d4ae3d
  Committed by risemeup1 on Apr 07, 2023
```
* remove zeros

* remove zeros

* apply gcc12 to py3

* apply gcc12 to py3

* fluid api clear

* fluid api clean

* fluid api clean
```
  f6d4ae3d
- [Dy2static] [bugfix] fixed a bug which happens while parsing grad var name (#52110) · e459ac03
  Committed by feifei-111 on Apr 07, 2023
```
* fix dy2s grad name parse

* pre-commit

* bug fix

* Fix grad/ error

* Format code

---------
Co-authored-by: 0x45f <wangzhen45@baidu.com>
```
  e459ac03
- fix mkdir (#52570) · 41226d55
  Committed by Roc on Apr 07, 2023
```
* fix mkdir

* update
```
  41226d55
- [Test MV] standalone_executor (#52520) · 5e63038a
  Committed by Happyd99 on Apr 07, 2023
```
* [Test MV] standalone_executor

* update

as

* update

as

* update codestyle
```
  5e63038a
- clean up WITH_MLU (#52546) · e75c01f9
  Committed by Wang Xin on Apr 07, 2023
  
  e75c01f9
06 Apr, 2023 12 commits

support hybrid parallel in qat (#52219) · 5c19bfc8
Committed by ceci3 on Apr 06, 2023

5c19bfc8

Remove oneDNN-specific attributes from matmul (#49444) · 4d97b25d

Committed by Sławomir Siwek on Apr 06, 2023

* replace matmul with matmul_v2 in fuse passes

* Remove fusion logic from matmul

* removing fusion methods

* add proper name

* adjust namespaces

* clean attrs in python tests

* delete checkpoint and restore matmul version

* remove unused code

* matmul and reshape/transpose fuses migrated

* split MatmulOneDNN headers

* fuse activation and eltwise_add

* add fuse_activation

* matmul_transpose_reshape/reshape_transpose_matmul

* matmul + elementwise_add (fused)

* activation temporary modifciation

* restore matmul(v1) version 0

* merge newest develop

* remove depedency from other PR

* revert pbtxt

* remove placeholders from matmul_v2

* add description in OPMaker

* remove matmul_v2_op.h and all depedencies

* remove dims changing in base op

* add possibility to fuse already fused_matmul

* restart broken CI

* Empty-Commit

* revert matmul_utils.h

* codestyle

* adjust imports

* add pbtxt file

* 100% matmul unit tests coverage

* trigger CI with minimal changes to develop

* adjust changes to develop

* add fused_matmul op

* inherit base ops

* add "v2"

* move OPMaker

* Gradually add fused_matmul files

* second batch of fused_matmul changes

* split infershapes of matmul_v2 and fused_matmul

* merge code from other PR

* 2023

* inherit fused_matmul from matmul_v2

* Update paddle/phi/backends/onednn/onednn_reuse.h
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

* Update paddle/phi/kernels/fusion/onednn/fused_matmul_kernel.cc
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

* resolve conflicts

* codestyle

* simplify isgemmlinear

* 2023

* remove import

* reuse methods

* matmul_v2_mkldnn cleanup

* simplify ExecuteMatMulV1Grad

* matmul refactored

* fc

* SetOutMemDescWithLogicalLayoutFusesSupport

* matmul_v2

* alpha support

* group repetetive funcs

* matmul utils

* execute matmul methods

* restore registered kernel names

* split header and impl files

* remove double negatives

* reduce numer of modified files

* adjust ExecuteMatmul

* add scales for ut

* dates

* limit number of modified files

* fluid imports

* remove alpha

* codestyle

---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

4d97b25d

Move fused_attention op to phi [migrate forward GPU OpKernel] (#51743) · a7ec8958

Committed by Sonder on Apr 06, 2023

* add kernel functions

* update kernel functions

* update func parameters' name

* create codes for gpu device

* adjust file locations

* fix include error

* remove dependent files to phi/

* restore fused_attention_op.cu

* fix dependence errors

* fix dependence errors

* fix include error

* fix all depandence errors[build success]

* remove useless include

* recover useless include

* use phi::ToNCCLDataType

* fix namespace

* update new register code

* fix error in fused_gemm_epilogue_utils

* fix error in FusedAttentionKernel parm

* finish fused_attention registe code[build success]

* add paddle::optional

* add sig file

* fix build error

* fix a include error

* update CMkaeList

* fix parameter sequence

* add include file

* update #if before include

* fix grammly error

* update codes for DropoutParam

* remove const cast

* trans some fluid api to phi api

* add #if

* update test code

* update test codes

* recover test codes

* trans fused_attention to fluid

* move #endif to end

* move #endif

* delete useless files

* use fused attention utils and recover random seed

* remove fluid include in phi

a7ec8958

add autogen code support for logical_and, logical_not, logical_or and logical_xor (#52451) · 6df4a667
Committed by scotty on Apr 06, 2023

6df4a667

[CodeStyle][B017] catch more specific exceptions in unittests (#52553) · 9dbfadab
Committed by Nyakku Shigure on Apr 06, 2023

9dbfadab

Modify the condition of _get_places in fp16 (#52508) · 160dfd01
Committed by Zhang Zheng on Apr 06, 2023

160dfd01

feat: add composite rule of roll grad (#52532) · 348a36b5

Committed by Kang Zhao on Apr 06, 2023

* feat: add relu composite rule

* feat: add relu composite rule, maximum op

* feat: add relu composite rule, maximum op

* feat: add relu composite rule, polish comments

* feat: add relu composite rule, polish comments

* feat: add relu composite rule, add python api of relu

* feat: add relu composite rule, commit hook

* fix: maximum type error & ban cinn test

* fix: maximum input sequence bugs

* resolve conflicts

* fix: code style bugs

* add: relu fp16 test

* feat: add rsqrt composite rule

* feat: add rsqrt composite rule

* resolve conflicts of composite rule

* fix: delete check eager

* feat: add roll grad composite rule

* fix minus shift

* fix test roll op

348a36b5

[CINN] disable CINN test_mean_op unittest to pass CINN CI (#52510) · b36ac56d
Committed by jiangcheng on Apr 06, 2023
```
* [CINN] disable CINN test_mean_op unittest to pass CINN CI

* disable test_mean_op for pass ci
```
b36ac56d
Fix flash attention bug (#52551) · 8ac5a6b6
Committed by sneaxiy on Apr 06, 2023
```
* fix flash attn

* fix another API
```
8ac5a6b6

rem is_compiled_with_npu (#52385) · 7976e2a3

Committed by Kim Yann on Apr 06, 2023

* rem is_compiled_with_npu

* rem nup related code

* make lint happy

* rem test

* remove some tests

* Update grad_scaler.py

* fix an error

7976e2a3

【PaddlePaddle Hackathon 4】No.63 add fp16 and bf16 for eye and frame (#51819) · ae10133a

Committed by LoneRanger on Apr 06, 2023

* add fp16 and bf16 for eye and frame

* fix bug

* fix bug

* fix bug

* Update test_frame_op.py

fix code style

* fix bug

* fix bug

ae10133a

[AMP OP&Test]Add fp16/bf16 support logical op (#52112) · b10e4577

Committed by WJJ1995 on Apr 06, 2023

* fixed glog

* add

* add bfloat16 test for logical op

* rm useless code

* add uint16

* deal with comments

* fixed code style

* fixed code style

* fixed for ci

* deal with comments

* fixed for ci

b10e4577

05 Apr, 2023 1 commit
- fix Tensor.item to np.array(Tensor).item (#52483) · d95eaa17
  Committed by zhouweiwei2014 on Apr 05, 2023
  
  d95eaa17
04 Apr, 2023 12 commits

Add Gloo Gather Function (#52334) · 5f6376b7

Committed by yuehuayingxueluo on Apr 04, 2023

* add gloo gather

* add gloo_tools

* fix CI bug

* use gloo gather

* remove redundant code

* fix process_group_gloo.py

* rename send_recv

* fix conflict

* fix conflict

* fix codestyle

* fix CI bug

* add PADDLE_ENFORCE_NE

5f6376b7

bugfix on dist.alltoall_single (#52495) · e6e62342
Committed by Tian on Apr 04, 2023

e6e62342
【Hackathon No.62】Add BF16 support and unit tests for the pool3d operator; add FP16/BF16 unit tests for the lgamma and masked_select operators (#51837) · b0dbf9fe
Committed by chenxujun on Apr 04, 2023
```
* Add pool3d lgamma masked_select tests

* Fix code
```
b0dbf9fe

remove some left apis in fluid.nn (#52503) · f6f104d5
Committed by JYChen on Apr 04, 2023

f6f104d5
[Dy2St] support train step in to_static (#51693) · 7728efb4
Committed by Nyakku Shigure on Apr 04, 2023
```
Co-authored-by: xiongkun <xiongkun03@baidu.com>
```
7728efb4

speedup eager_legacy_trace_op (#52467) · 80a4a2e5
Committed by wanghuancoder on Apr 04, 2023

80a4a2e5

Improve new executor static build (#51149) · 5bac67d4

Committed by Ruibiao Chen on Apr 04, 2023

* Improve new executor static build

* Skip GC for static build

* Skip infershape for static build

* Handle read_op

* Add fused_attention to OpsWithFluidKernelNeedMoveToPhi

* Fix argsort typos

* Add sequence_pool to OpsWithFluidKernelNeedMoveToPhi

* Fix skip share lod errors

* Fix errors for adam

* Fix errors for eigvals, memcpy and fake_quantize

* Add static_build.cc

* Add black list

* Fix CI errors

* Fix CI errors

* Fix CI errors

* Fix TensorArray

* Fix TensorArray

* Add update_loss_scaling to OpsNeedSetOutputDtypeWhenRegisterPhiKernel

* Fix copy

* Fix errors

* Fix momentum

* Skip mkldnn

* Fix CI errors

* Fix c_sync_calc_stream_op

* Fix CINN

* Fix while op

* All CI pass, disable FLAGS to merge code, enable it after more tests in future

* Add UTs

* Fix typos

* Fix typos

* Add mkldnn UT

* Remove mkldnn test

* Fix typos

* Fix dist test

* Fix typos

* Fix CI errors

* Fix CI errors

* Add UTs

* Fix typos

* Fix typos

* Add sparse tests

* ToComplexType -> ToComplex

* Add test_matmul_op_static_build to disable_win_inference_test

5bac67d4

relocate debugger.py (#52048) · 076bc5d6
Committed by LoneRanger on Apr 04, 2023
```
* relocate debugger.py

* fix bug

* fix bug

* fix bug

* fix bug
```
076bc5d6

【Prim】Support fuse jit save (#52344) · fca0a4bf

Committed by Jiabin Yang on Apr 04, 2023

* fix_prim

* fix bug

* add note

* fix logic

* fix

* add note

* fix check

* fix bug

* fix bug

* fix bug

* add debug

* fix check

* fix bug

* sync print log

* fix test case

* change default

* support jit save with fuse

* add more check

* sync with pr 52120

* add more ut

---------
Co-authored-by: cyber-pioneer <chenzhuo@tju.edu.cn>

fca0a4bf

[CustomDevice] Change use_custom_device in eager_op_test from method to variable (#52480) · 67a6dd32
Committed by Kai Song on Apr 04, 2023

67a6dd32

remove op.py in fluid (#52248) · 273783b3

Committed by LoneRanger on Apr 04, 2023

* remove op.py

* [Zero-Dim] change Tensor.numpy() usage to other equivalent usage, avoid hack (#52197)

* [BugFix] fix compute error in fused_dropout_add (#52261)

* fix bg

* add utest

* add utest

* [CodeStyle][UP034] remove (()) cases (#52060)

* add up34

* modify var name in loop

* revert changes in test_slice

* Revert "modify var name in loop"

This reverts commit 6d748e371afb417054ed0c6b36fd11e87959a90d.

* temporarily ignore test_slice.py

* add comment

* empty commit, re-trigger all ci

* fix inc

---------
Co-authored-by: SigureMo <sigure.qaq@gmail.com>

* [AMP OP&Test] add unittest for log_softmax (#52264)

* Fix_Linux_[-Wterminate]warning (#52186)

* [CustomOP Inplace] Automap inplace dtype and shape, prepare for vector<Tensor> output (#52214)

* [CustomOP Inplace] Automap inplace dtype and shape, prepare for vector<Tensor> output

* delete dtype,shape func of multi_inplace op

* [CustomOP Inplace] Automap inplace dtype and shape, support vector<Tensor> output

* [CustomOP Inplace] Auto-generate python API for inplace vector<Tensor> output

* [AMP OP&Test] add float16 optest for reshape_op (#51678)

* [AMP OP&Test] add float16 optest for reshape_op

* add public_python_api

* [AMP OP&Test] Add fp16/bf16 to clip op (#52158)

* add fp16/bf16 to clip op

* fix as reviewed

* update test_clip_op.py

* update test_clip_op.py

* fix bug

* fix code style

* fix bug

* fix bug

---------
Co-authored-by: Zhou Wei <1183042833@qq.com>
Co-authored-by: ShenLiang <1422485404@qq.com>
Co-authored-by: 张春乔 <83450930+Liyulingyue@users.noreply.github.com>
Co-authored-by: SigureMo <sigure.qaq@gmail.com>
Co-authored-by: Ccc <52520497+juncaipeng@users.noreply.github.com>
Co-authored-by: Galaxy1458 <55453380+Galaxy1458@users.noreply.github.com>
Co-authored-by: HongyuJia <jiahongyu@baidu.com>
Co-authored-by: zhaoyingli <86812880+zhaoyinglia@users.noreply.github.com>
Co-authored-by: wuyefeilin <30919197+wuyefeilin@users.noreply.github.com>

273783b3

support amp logic (#52397) · ecf586a5
Committed by Jiabin Yang on Apr 04, 2023

ecf586a5

03 Apr, 2023 4 commits
- [Prim] polish prim arg None check (#52449) · 35a7ae21
  Committed by cyber-pioneer on Apr 03, 2023
```
* polish prim arg None check

* fix bug
```
  35a7ae21
- [CustomOP Optional Inplace] Custom operator supports inplace optional vector Tensor input (#52421) · 59c9d75e
  Committed by HongyuJia on Apr 03, 2023
```
* [CustomOP Optional Inplace] Custom operator supports inplace optional vector Tensor input

* uncomment unittest codes
```
  59c9d75e
- [Prim] simplify bn vjp code (#51933) · 04f8c24e
  Committed by cyber-pioneer on Apr 03, 2023
```
* simplify bn vjp code

* simplify composite rule

* polish name
```
  04f8c24e
- Add margin_cross_entropy, transfer_layout, dropout_nd tests (#52369) · 648563dd
  Committed by chenxujun on Apr 03, 2023
  
  648563dd

PaddlePaddle / Paddle: synced successfully about 1 year ago
