Commits · 8d99dd0ce8729c9a0621bf604c89786639c97df0 · PaddlePaddle / Paddle

10 Nov, 2022 13 commits
- [PHI]Standardise some C++ API (Part4) (#47702) · 594bd723
  Committed by YuanRisheng on Nov 10, 2022
```
* standard api

* fix sparse bugs

* fix xpu bugs, test=kunlun

* remove hard code for custom unittest

* open ci, test=kunlun

* deal with conflict
```
  594bd723
- refactor Recompute example doc (#47783) · 60ec3107
  Committed by wuhuachaocoding on Nov 10, 2022
```
* add recompute doc.

* add recompute doc.

* add recompute doc.

* update.

* update.
```
  60ec3107
- [search && paddle inference]add roformer pass&&plugin novarlen version (#47523) · 0f3fb562
  Committed by zhangxin81 on Nov 10, 2022
```
* add roformer pass&&plugin（novarlen）
```
  0f3fb562
- fix dp completion (#47804) · 3a14857b
  Committed by zhaoyingli on Nov 10, 2022
  
  3a14857b
- [Auto Parallel]Add c_concat pass for reshard (#47809) · 831db343
  Committed by caozhou on Nov 10, 2022
```
* add c_concat pass for reshard

* add unittest
```
  831db343
- [Zero-Dim] support input 0D Tensor for xpu compare kernel, test=kunlun (#47812) · d01109fc
  Committed by zhouweiwei2014 on Nov 10, 2022
  
  d01109fc
- XPU multi-card support eager mode (#47445) · 3b91f8f3
  Committed by james on Nov 10, 2022
```
* XPU support eager mode

* add unittest for XPU eager mode

* minor bugfix

* minor bugfix, test=kunlun

* correct copyright info

* 1. remove unsed vars/funcs
2. ProcessGroupBKCL inherit from ProcessGroupStream

* bugfix for fp16 in eager mode multi-card, test=kunlun

* rebase & fix a few issues

* use new processgroup interface, test=kunlun

* fix compile issue, test=kunlun
```
  3b91f8f3
- Remove unnecessary operations of GroupNorm in eager mode (#47791) · 8785537c
  Committed by Zhang Zheng on Nov 10, 2022
  
  8785537c
- skip_merge_layernorm (#47810) · 1c6013dd
  Committed by wenbin on Nov 10, 2022
```
* skip_merge_layernorm

* add UT

* modify comments
```
  1c6013dd
- [Dygraph] Support grad division to nranks before reduce in sharding stage2 (#47764) · 3addd568
  Committed by Haohongxiang on Nov 10, 2022
  
  3addd568
- support pow_triple_grad op (#47799) · 7964119b
  Committed by Charles-hit on Nov 10, 2022
  
  7964119b
- [AutoParallel] fix insert concat op (#47710) · 658387b0
  Committed by zhaoyingli on Nov 10, 2022
```
* fix insert concat op

* fix fp16 assert
```
  658387b0
- [CodeStyle][F821] fix test_exception in test_unpool3d_op and test_unpool_op (#47756) · 1b250710
  Committed by Nyakku Shigure on Nov 10, 2022
```
* [Fix][F821] fix TestUnpoolOpException

* fix TestUnpoolOpException

* fix TestUnpool3DOpException

* remove unused variables

* fix the regexp does not match the C++ traceback

* add missing error message for gpu unpool_kernel

* Revert "add missing error message for gpu unpool_kernel"

This reverts commit 17ef7a127e1c3ee00f9102c37ad8cea35953f20c.

* assertion indices_value_error errors are only reported on the CPU

* for test

* run test_exception in dygraph mode
```
  1b250710
09 Nov, 2022 12 commits

Get grads from cpp for optimizer to avoid gpu idel time (#47709) · 261ebb0c

Committed by WangZhen on Nov 09, 2022

* Get params and grads in cpp to avoid gpu idel time

* Using python param instead of cpp return param to fix test_asp_optimize_dynamic.py

* Get grads from cpp and construct params_grads on python

* Check meta and remove comments

261ebb0c

Enable fc passes (#45704) · 7e914386

Committed by Paulina Gacek on Nov 09, 2022

* Analysis API interface for disabling fc passes

* Unit tests corrected

* Python API added

* test runs only when PADDLE_WITH_MKLDNN

* Fc op changed to relu in matmul_op_test

* Disable fc passes in tests where acc drops

* code formating

* Unit test for analysisConf added

* Unit test gpu added

* fc passes disabled when iterations=0 in gru test

* style

* passes disabled when fp32 in gru test

* fc passes disabled in lstm test

* Import from inference, not fluid in doc

7e914386

[CodeStyle][E266] remove multiple '#' in comments (#47772) · 8c8cf0fd
Committed by Tony Cao on Nov 09, 2022
```
* fix flake8 CodeStyle E266

* fix comments
```
8c8cf0fd
manage no shape var type (#47775) · 339aefac
Committed by zhaoyingli on Nov 09, 2022

339aefac
Replace built-in print with logger in distributed_strategy.py (#47761) · ecefd4e3
Committed by Roc on Nov 09, 2022

ecefd4e3
add sin triple grad operator (#47753) · 267b218f
Committed by cyber-pioneer on Nov 09, 2022

267b218f

fix The first round of evaluation (#47256) · 0a051297

Committed by 超级码牛 on Nov 09, 2022

* fix paddle.get_default_dtype 

Chinese and English return values are inconsistent

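* fix paddle.matmul doc evaluation #4407

Corrected the function's output

* fix paddle.std doc evaluation #4370

Added a code example with unbiased=False; numpy was not added, to avoid confusion.

* fix paddle.load doc evaluation #4455

Only split the code into 5 segments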

* try

* try

* try

* Update io.py

* Update io.py

* Update creation.py

* Update creation.py

* [Docs]add name description

* [Docs]fix broadcasting issue
Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>

0a051297

[CodeStyle][py2] use new syntax for metaclass declaration (PEP 3115) (#47730) · 2dce4320
Committed by Nyakku Shigure on Nov 09, 2022

2dce4320

[CodeStyle][py2] remove unnecessary u-prefix in string literal (#47727) · 433e67bd

Committed by Nyakku Shigure on Nov 09, 2022

* [CodeStyle][py2] remove unnecessary u-prefix in string literal

* `"{}".format(x)` -> `x`

* remove duplicated dtype literals

* revert changes in data_feeder.py

* remove u-prefix in data_feeder

* revert remove duplicated dtype literals in data_feeder

* remove unnecessary convert to str

* for test

* add some comments

* refine comment

* restore a removed str conversion

* re-trigger all ci, empty commit

433e67bd

new mp_allreduce_sum_op (#47715) · 18d33346
Committed by LiYuRio on Nov 09, 2022

18d33346

fix ScaleKernel configuration error where input numel is 0 (#47111) · 38ba5f2e

Committed by FlyingQianMM on Nov 09, 2022

* fix scale kernel configuration error where input numel is 0

* fix code stype

* add unit test case for scale op when numel of input x is zero

* fix ci codestyle check

* add cpu and gpu unit test case for scale op when numel of input x is zero

* add uninitialized judgment for input of scale

38ba5f2e

[Paddle Inference]upgrade scale and slice op convert for Paddle-TensorRT (#47746) · cdd7b956
Committed by Wangzheee on Nov 09, 2022
```
* upgrade scale and slice op convert for Paddle-TensorRT
```
cdd7b956

08 Nov, 2022 15 commits
- [Auto Parallel] Sharding Optimization：Partition Algorithm & Stage2 Parameter... · e5eb3f55
  Committed by JZ-LIANG on Nov 08, 2022
```
[Auto Parallel] Sharding Optimization：Partition Algorithm & Stage2 Parameter Bucket communication  (#47180)

* partition param by order

* add logging

* reorder opt

* config

* stage2 bucket

* update unitest
```
  e5eb3f55
- refine comm api implementation (#47713) · 84c9a0d6
  Committed by LiYuRio on Nov 08, 2022
  
  84c9a0d6
- [Zero-Dim] support input 0D Tensor for sundary api (#47734) · 3198af20
  Committed by zhouweiwei2014 on Nov 08, 2022
```
* [Zero-Dim] support input 0D Tensor for sundary api

* fix comment
```
  3198af20
- Migrate old C++ unit tests to Python framework (#47006) · 0c9f09b8
  Committed by Sławomir Siwek on Nov 08, 2022
```
* softplus+activation

* fc + elementwise_add test refactored

* rename MKLDNN to OneDNN

* fc+activation tests refactored

* remove softplus ut

* whitespace

* whitespace

* codestyle

* codestyle

* add more cases to fc+act

* remove softplus+hard_sigmoid pass

* remove softplus + hard_sigmoid UT

* add approximate for gelu

* swish beta range

* new codestyle

* reduce number of tests
```
  0c9f09b8
- [Zero-Dim] support input 0D Tensor for distribution transform api (#47677) · dc85b393
  Committed by zhouweiwei2014 on Nov 08, 2022
```
* [Zero-Dim] support input 0D Tensor for distribution api

* fix comment
```
  dc85b393
- add adadelta op for xpu, test=kunlun (#47661) · 047971f0
  Committed by zhangyikun02 on Nov 08, 2022
  
  047971f0
- argsort support n > 16384 and add argsort_grad op for xpu, test=kunlun (#47701) · 6a6a3ff1
  Committed by zhangyikun02 on Nov 08, 2022
  
  6a6a3ff1
- fix npu:0 stage (#47729) · 793c35ef
  Committed by shentanyue on Nov 08, 2022
  
  793c35ef
- [CodeStyle][py2] remove the `next` method for python2 compatibility (PEP 3114) (#47728) · 4061b1b8
  Committed by Nyakku Shigure on Nov 08, 2022
  
  4061b1b8
- [BugFix] fix tensor_array slice bugs in _getitem_impl_ (#46447) · fccf664f
  Committed by xiongkun on Nov 08, 2022
```
* fix tensor_array slice bugs in _getitem_impl_

* fix when var is a paddle.Tensor

* code format
```
  fccf664f
- [Paddle Inference] allow fold fill_constant && allow nms3 into trt in int8 model (#47551) · c3a69111
  Committed by zhoutianzi666 on Nov 08, 2022
```
* allow fold fill_constant && allow nms3 into trt in int8 model
* use unordered_map
* fix CI failing
```
  c3a69111
- [CodeStyle][py2][U004] unecessary explicit `object` inheritance in class definition (#47642) · 888272b5
  Committed by Nyakku Shigure on Nov 08, 2022
```
* [CodeStyle][py2][U004] unecessary explicit `object` inheritance in class definition

* fix an increment
```
  888272b5
- fix examplce code of slice api (#47735) · e5bb8785
  Committed by zyfncg on Nov 08, 2022
  
  e5bb8785
- Split quant (#47449) · 130db92a
  Committed by Paulina Gacek on Nov 08, 2022
```
* Split kernel registered, tests for uint/int added

* Split quantized

* Split output scales calculated only once

* NearestInterp test fix reversed

* DequantizeOutputs corrected
```
  130db92a
- remove dist xpu tests for R200 (#47381) · ef21b58b
  Committed by tianshuo78520a on Nov 08, 2022
```
* disable distributed xpu tests

* test=kunlun

* test=document_fix;test=kunlun

* test=document_fix;test=kunlun

* test=document_fix;test=kunlun

* test=document_fix;test=kunlun
```
  ef21b58b

PaddlePaddle / Paddle synced successfully about 1 year ago