Commits · 083853cd4e4a9bdad22c70fa48eb9a036d2def27 · PaddlePaddle / Paddle

22 Sep, 2022 1 commit
- [Auto Parallel] fix lazyinit (#46355) (#46382) · 083853cd
  Committed by zhaoyingli on Sep 22, 2022
21 Sep, 2022 2 commits
- [Cherry-pick][BugFix]Fix pooling output_size bug if encounter list[Tensor] (#46360) · cc3e7cd8
  Committed by Aurelius84 on Sep 21, 2022
```
* [Check]Enhance pooling output_size type check

* add unittest
```
- remove tmp fp32 var for gaussian_random (#46285) · b027652b
  Committed by Guoxia Wang on Sep 21, 2022
20 Sep, 2022 10 commits

[Paddle-TRT][Cherry-Pick]Fix cast bug (#46293) · 230b9a82
Committed by zhoutianzi666 on Sep 20, 2022
```
* fix cast bug
```

[PolishComments] Polish some code comments (#46032) (#46261) · 42e56f65
Committed by HongyuJia on Sep 20, 2022
```
* polish code comments

* polish data_device_transform.cc
```

[Cherry-Pick][AutoParallel] change import way and fix strategy (#46270) · c43ebfcf

Committed by zhaoyingli on Sep 20, 2022

* [Auto Parallel] Change the import way of Auto Parallel (#46115)

* fix strategy (#46256)

* [Auto Parallel] performance improvement for Sharding-DP hybrid parallelism (#46180)

* remove no need grad allreduce communication when sharding-dp

* remove no need grad allreduce communication when sharding-dp

* bugfix

* bugfix

* bugfix
Co-authored-by: Yulong Ao <aoyulong@baidu.com>
Co-authored-by: JZ-LIANG <jianzhongliang10@gmail.com>

[Paddle-TRT] Support matmul_v2 in Paddle-TensorRT (#46177) · 654807cd
Committed by zhoutianzi666 on Sep 20, 2022
```
* Support matmul_v2 in Paddle-TensorRT converter.
```

Fix TransDataBackend Error when call unsqueeze using MKL Tensor (#46094) (#46186) · 50340302
Committed by WangZhen on Sep 20, 2022
```
* Fix TransDataBackend Error when call unsqueeze using MKL Tensor

* Add UT

* Refine UT
```

[Cherry-pick] Sparse add InferMeta (#46235) · fd8ec4a1

Committed by zhangkaihuo on Sep 20, 2022

cherry-pick : #46016, #46021, #45974

* [Sparse]Sparse add support gpu (#45974)

* [Sparse]Remove unused code (#46021)

* [Sparse] Add infer meta (#46016)

[Eager] Fix linspace error in amp (#46088) (#46206) · 38c0fd02
Committed by Jiabin Yang on Sep 20, 2022
```
* fix linspace error in amp

* fix log

* fix amp error
```

(cherry-pick)Support some op refuse forward and fix some bugs (#46211) · bc92d5f5

Committed by Charles-hit on Sep 20, 2022

* support cast op backward refuse forward and fix some bugs (#46173)

* support cast op backward refuse forward

* Fix the bug of high order unit test framework

* support sign op backward refuse forward (#46002)

Run_program_op add scope cache & reuse (#45813) (#46223) · 4f28a4c2

Committed by zhangbo9674 on Sep 20, 2022

* add scope cache & reuse

* add gc scope for end of each train step

* del scope reuse for jit

* refine code

* test

[Cherry-pick] Update layoutautotune for inplace (#45826) (#46226) · c0324e82

Committed by niuliling123 on Sep 20, 2022

cherry-pick from #45826
LayoutAutotune supports inplace ops.
Adjust UseAutotune according to the review comments on "Add eager layout autotune" #45409.
Move the LayoutAutotune check into the controller, consistent with the AMP check.

19 Sep, 2022 10 commits

Recompute unify incubate (#46073) (#46210) · 4bced24a
Committed by wuhuachaocoding on Sep 19, 2022

[vision.ops.nms] Fix return order error and duplicate results with specific inputs (#46148) (#46193) · be84cac7

Committed by RichardWooSJTU on Sep 19, 2022

* fix return order error and duplicate results with specific inputs

[cherry-pick] add abs,mean,sum,ge,gt,pow,etc higher-order differentiation operators (#46184) · ad8beaaf

Committed by Xiaoxu Chen on Sep 19, 2022

* [cherry-pick] extend reduce_sum,reduce_sum,eq,ne,ge,abs,pow,etc higher order operators

* add reduce_mean,reduce_sum primitive ops
* add ne_p gt_p primitive operators
* add ge_p abs_p primitive operators
* add cast primitive operators
* add pow,square prim2orig rules
* add elementwise_div orig2prim rule

* [cherry-pick] add mean,sum,ge,gt,ne,abs,etc higher-order differentiation operators(#45888)

* add reduce_mean,reduce_sum primitive ops

* add ne_p gt_p primitive operators

* add ge_p abs_p primitive operators

[JitLayer]Save property meta file to correct path (#46131) (#46195) · 45a3c656
Committed by WangZhen on Sep 19, 2022

[cherry-pick] [dy2static] support user to use decorator in their program (#46194) · d1ce974e

Committed by feifei-111 on Sep 19, 2022

* [dy2static] support user to use decorator in their program (#45768)

* support deco

* fix deco ast type

* arg_str

* 1

* support callable deco

* code style

* codestyle

* test_error

* fix decos in another file

* recover conflict codes

* [BugFix] fixed a bug in decorator transformer, it can not analyze decorator with params correctly (#46055)

* fix deco call

* add raise

* add test

* add warn, fix paddle api

* fix error type

* fix coverage

Add symbolic shape deduction function for general Plugin mechanism (#46179) · a0566010
Committed by weishengying on Sep 19, 2022

(cherry-pick)support some op backward refuse forward (#46201) · adab3c59

Committed by Charles-hit on Sep 19, 2022

* add unit test for sum higher level op (#45961)

* support slice op backward refuse forward and add high level unit test (#45960)

* support tile op backward refuse forward (#45942)

* support expand_v2 op backward refuse forward (#45941)

* support concat backward refuse forward (#45940)

Add INT8 support for fused_multi_transformer_op (#45284) (#46169) · db368d5b
Committed by minghaoBD on Sep 19, 2022
```
Co-authored-by: RichardWooSJTU <37864677+RichardWooSJTU@users.noreply.github.com>
```

[Cherry-pick][Auto Parallel] Improve the APIs (#46164) · c5cc4278

Committed by Yulong Ao on Sep 19, 2022

* [AutoParallel] adapt gradient merge pass (#45915)

* adapt gradient merge

* fix op_role

* fix strategy

* [Auto Parallel] Gradient Fuse Allreduce (#45643)

* bugfix (#45332)

* dist embedding support lookup table v1

* add unittest

* customize wait_comm

* group gradients

* bugfix

* update program

* [Auto Parallel] Improve the APIs (#45776)

* [Auto Parallel] Use c++ dist attr in the completion process

* [Auto Parallel] Add minor changes

* [Auto Parallel] Use c++ dist attr in the completion process

* [Auto Parallel] Add minor changes

* [Auto Parallel] Add the serialization process for dist attrs

* [Auto Parallel] Remove unnecessary comments

* [Auto Parallel] Fix some bugs

* [Auto Parallel] Fix the code style

* [Auto Parallel] Remove unnecessary impls

* [Auto Parallel] Fix the importing error

* [Auto Parallel] Fix the copy from bugs of op dist attr

* [Auto Parallel] Replace the use of constexpr if

* [Auto Parallel] Redesign the shard_tensor, shard_op and ProcessMesh

* [Auto Parallel] Change API of the completion unittest

* [Auto Parallel] Fix the bug when set_attr an int

* [Auto Parallel] Add the unittest for the serialization

* [Auto Parallel] Add some unit tests

* [Auto Parallel] Unify the strategy

* [Auto Parallel] Improve the engine api

* [Auto Parallel] Reset the changes made to the framework

* [Auto Parallel] Change the engine unittest

* [Auto Parallel] Update API of the completion and partitioner

* [Auto Parallel] Update unit tests using engine api

* update shard annotation

* [Auto Parallel] Remove the modifications of other modules

* [Auto Parallel] Add docs for APIs

* add new strategy

* [Auto Parallel] Replace the logger

* [Auto Parallel] Restore the test_program.py

* [Auto Parallel] Change the import rules

* [Auto Parallel] Add the examples for Engine

* [Auto Parallel] Do some minor changes

* [Auto Parallel] Remove yaml dependency

* [Auto Parallel] Fix the unittests

* add valid after train

* bug fix
Co-authored-by: zhaoyingli <zhaoyingli@baidu.com>
Co-authored-by: caozhou <caozhou@radi.ac.cn>
Co-authored-by: caozhou <48191911+Caozhou1995@users.noreply.github.com>

* [Auto Parallel] Bugfix allreduce fuse for MP (#46086)

* bugfix

* bugfix

* typos fixed

* update strategy (#46138)
Co-authored-by: zhaoyingli <86812880+zhaoyinglia@users.noreply.github.com>
Co-authored-by: JZ-LIANG <jianzhongliang10@gmail.com>
Co-authored-by: zhaoyingli <zhaoyingli@baidu.com>
Co-authored-by: caozhou <caozhou@radi.ac.cn>
Co-authored-by: caozhou <48191911+Caozhou1995@users.noreply.github.com>

Unify core avx and core_noavx to libpaddle (#46095) (#46113) · 4261ae34

Committed by Chen Weihang on Sep 19, 2022

* unify  core_avx and core_noavx

* fix except error

* revert mac compile logic

* revert dylib to so

* add core_noavx branch

* remove core_noavx

* replace paddle_core by lib paddle

* polish var name

* replace paddle_core by libpaddle

* update custom device commit

* polish code by comments

16 Sep, 2022 3 commits

(cherry-pick) Fix split infershape in static mode and add convert rules for fill_any_like op (#46079) · 4e09e402

Committed by Charles-hit on Sep 16, 2022

* Fix split bug in static mode (#45906)

* fix split bug in static mode

* modify code style

* modify code style

* add unit test for split

* add convert rules for fill_any_like op in paddle science (#45985)

* add convert rules for fill_any_like op in paddle science

* add unit test for fill_any_like op in paddle science

* modify fill_any_like convert rule

* modify fill_any_like convert rule dtype

[cherry-pick][jit] Jit skip forward (#45926) · e25e9471
Committed by Hui Zhang on Sep 16, 2022
```
* skip forward save

* fix bug

* more ci for jit skip forward
```

[Cherry-pick] Normalize yaml name and label (#46052) · 8caaf85a

Committed by Chen Weihang on Sep 16, 2022

* normalize yaml file name (#45894)

* Clear extra attributes of activation op in OpMaker (#45772)

* clear extra attr of activation op in opmaker

* fix syntax bug

* fix mkldnn kernel

* fix merge conflict

* fix bug

* [PHI] Normalize yaml op label (#45976)

* normalize yaml op label

* revert op_compat yaml change

* fix prelu and rnn compat problem

* replace api by op

* support assign op backward refuse forward (#45879)

* normalize yaml backward op label (#46028)
Co-authored-by: zyfncg <zhangyunfei07@baidu.com>
Co-authored-by: Charles-hit <56987902+Charles-hit@users.noreply.github.com>

15 Sep, 2022 4 commits

[ Dy2Static ] Fix bugs when select inputs meeting different shape or undefined-var (#45916) (#46020) · 00486956

Committed by xiongkun on Sep 15, 2022

* fix select_input with different shape errors:
1. select_input_with_buildin_type directly return non-undefinedvar branch when meeting undefined var
2. the output shape of select_input is inferred from inputs.

* reverse the logic in select_input

Support 0 shapes input Tensor for MKL slice (#45930) (#46072) · 903c87bd
Committed by WangZhen on Sep 15, 2022
```
Support 0 shapes input Tensor for MKL slice kernel
```

General Plugin Mechanism (#45355) (#46070) · 07933116
Committed by weishengying on Sep 15, 2022

fix trt multiclass_nms3 (#45166) (#46034) · 61a3e30b
Committed by Zhang Jun on Sep 15, 2022
```
* Support dynamic shape in multiclass_nms3 Plugin for Paddle-TensorRT.
```

14 Sep, 2022 2 commits
- merge python lib (#46013) · 5130b0a1
  Committed by JingZhuangzhuang on Sep 14, 2022
- delete new executor log (#45917) · e223cf7b
  Committed by pangyoki on Sep 14, 2022
13 Sep, 2022 1 commit
- [cherry-pick] Allow manually set py_reader name in standalone executor (#45898) (#45931) · 29c44eb2
  Committed by Ruibiao Chen on Sep 13, 2022
```
* Allow manually set py_reader name in standalone executor

* Fix CI errors
```
09 Sep, 2022 6 commits
- support cumsum flip reverse backward refuse forward (#45892) · d6b5d91c
  Committed by Charles-hit on Sep 09, 2022
- [AutoParallel] adapt lazyinit & fix pass (#45840) · bc2265f8
  Committed by zhaoyingli on Sep 09, 2022
```
* adapt lazy init and fix pass

* add unittest

* update comment

* fix amp and sharding

* remove clip_by_norm
```
- [ Dy2Static ] convert_call support staticmethod for class. (#44983) · d0096eaf
  Committed by xiongkun on Sep 09, 2022
```
* convert_call support staticmethod for class.

* while support for python container.
It is convenient to convert more dynamic graph codes into static graphs.

* cond support python container

* add unittest for staticmethod convert_call

* fix bugs

* add unittest for item interface

* fix bugs

* change to np.testing.assert_allclose

* code format

* fix comments.

* code format
```
- Enhance slice to support 0 shape Tensor (#45861) · 4a675b7a
  Committed by WangZhen on Sep 09, 2022
```
* Enhance slice to support 0 dims Tensor

* Add UT
```
- modify slice op Infershape (#45855) · 97847ae8
  Committed by xiaoguoguo626807 on Sep 09, 2022
```
* modify slice infershape

* code style

* modify slice_unittest
```
- fix fused_attention with mp unit test case fail on A100 with CUDA >= 11.6 (#45883) · a5836222
  Committed by LiYuRio on Sep 09, 2022
08 Sep, 2022 1 commit

[Dy2Static] fix non-local error while dealing push_pop names (#45828) · 67d77846

Committed by xiongkun on Sep 08, 2022

* 1. fix non-local error while dealing push_pop names
2. escape "'" in push_pop_names to avoid syntax errors.
3. unified the non-local stmt creation processes in getter and setter.
4. split the nonlocal_names and getter/setter names.

* fix bugs

* 1. revert setter and getter, push_pop_names must have non-local

* fix bugs.

* code format
