Commits · e5dc9d61064c68a47e88f445e5e400baf73926ea · Crayon鑫 / Paddle

19 Sep, 2022 6 commits

refactor mp. (#45803) (#46121) · e5dc9d61

Committed by wuhuachaocoding on Sep 19, 2022

* refactor mp.

* update setup.py.

* update mp_layers.py for compatibility.

* add documents for mp_layers.py

* update init.py

* update collective.py.

* update.

* update mp_ops.py

* update.

* update code style.

* update code style.

e5dc9d61

[Cherry-pick][Auto Parallel] Improve the APIs (#46164) · c5cc4278

Committed by Yulong Ao on Sep 19, 2022

* [AutoParallel] adapt gradient merge pass (#45915)

* adapt gradient merge

* fix op_role

* fix strategy

* [Auto Parallel] Gradient Fuse Allreduce (#45643)

* bugfix (#45332)

* dist embedding support lookup table v1

* add unitest

* customize wait_comm

* group gradients

* bugfix

* update program

* [Auto Parallel] Improve the APIs (#45776)

* [Auto Parallel] Use c++ dist attr in the completion process

* [Auto Parallel] Add minor changes

* [Auto Parallel] Use c++ dist attr in the completion process

* [Auto Parallel] Add minor changes

* [Auto Parallel] Add the serialization process for dist attrs

* [Auto Parallel] Remove unnecessary comments

* [Auto Parallel] Fix some bugs

* [Auto Parallel] Fix the code style

* [Auto Parallel] Remove unnecessary impls

* [Auto Parallel] Fix the importing error

* [Auto Parallel] Fix the copy from bugs of op dist attr

* [Auto Parallel] Replace the use of constexpr if

* [Auto Parallel] Redesign the shard_tensor, shard_op and ProcessMesh

* [Auto Parallel] Change API of the completion unittest

* [Auto Parallel] Fix the bug when set_attr an int

* [Auto Parallel] Add the unittest for the serialization

* [Auto Parallel] Add some unit tests

* [Auto Paralle] Unify the strategy

* [Auto Parallel] Improve the engine api

* [Auto Parallel] Reset the changes made to the framework

* [Auto Parallel] Change the engine unittest

* [Auto Parallel] Update API of the completion and partitioner

* [Auto Parallel] Update unit tests using engine api

* update shard annotation

* [Auto Parallel] Remove the modifications of other modules

* [Auto Parallel] Add docs for APIs

* add new strategy

* [Auto Parallel] Replace the logger

* [Auto Parallel] Restore the test_program.py

* [Auto Parallel] Change the import rules

* [Auto Parallel] Add the examples for Engine

* [Auto Parallel] Do some minor changes

* [Auto Parallel] Remove yaml dependency

* [Auto Parallel] Fix the unittests

* add valid after train

* bug fix
Co-authored-by: zhaoyingli <zhaoyingli@baidu.com>
Co-authored-by: caozhou <caozhou@radi.ac.cn>
Co-authored-by: caozhou <48191911+Caozhou1995@users.noreply.github.com>

* [Auto Parallel] Bugfix allreduce fuse for MP (#46086)

* bugfix

* bugfix

* typos fixed

* update strategy (#46138)
Co-authored-by: zhaoyingli <86812880+zhaoyinglia@users.noreply.github.com>
Co-authored-by: JZ-LIANG <jianzhongliang10@gmail.com>
Co-authored-by: zhaoyingli <zhaoyingli@baidu.com>
Co-authored-by: caozhou <caozhou@radi.ac.cn>
Co-authored-by: caozhou <48191911+Caozhou1995@users.noreply.github.com>

c5cc4278

Revert "Simplify size op impl (#45808)" (#46168) · dabb8f23
Committed by Chen Weihang on Sep 19, 2022
```
This reverts commit c252b1de.
```
dabb8f23

rename fleetx, develop=document_fix (#46141) · 7a6db0a3
Committed by ShenLiang on Sep 19, 2022

7a6db0a3

[Cherry-pick] fix bug for TransformedDistribution (#46157) · a5d4f571
Committed by MayYouBeProsperous on Sep 19, 2022
```
fix bug for TransformedDistribution
```
a5d4f571

Unify core avx and core_noavx to libpaddle (#46095) (#46113) · 4261ae34

Committed by Chen Weihang on Sep 19, 2022

* unify  core_avx and core_noavx

* fix except error

* revert mac compile logic

* revert dylib to so

* add core_noavx branch

* remove core_noavx

* replace paddle_core by lib paddle

* polish var name

* replace paddle_core by libpaddle

* update custom device commit

* polish code by comments

4261ae34

17 Sep, 2022 1 commit

V2.4 - cherry-pick (#46126) · a76fa414

Committed by ziyoujiyi on Sep 17, 2022

* back fl

* delete ssl cert

* .

* make warning

* .

* unittest paral degree

* solve unittest

* heter & multi cloud commm ready

* .

* .

* fix gloo compile warning

* adapt for nn fl-ps

a76fa414

16 Sep, 2022 3 commits

（cherry-pick）Fix split infershape in static mode and add convert rules for... · 4e09e402

Committed by Charles-hit on Sep 16, 2022

（cherry-pick）Fix split infershape in static mode and add convert rules for fill_any_like op (#46079)

* Fix split bug in static mode (#45906)

* fix split bug in static mode

* modify code style

* modify code style

* add unit test for split

* add convert rules for fill_any_like op in paddle science (#45985)

* add convert rules for fill_any_like op in paddle science

* add unit test for fill_any_like op in paddle science

* modify fill_any_like convert rule

* modify fill_any_like convert rule dtype

4e09e402

[cherry-pick][jit] Jit skip forward (#45926) · e25e9471
Committed by Hui Zhang on Sep 16, 2022
```
* skip forward save

* fix bug

* more ci for jit skip forward
```
e25e9471

[Cherry-pick] Normalize yaml name and label (#46052) · 8caaf85a

Committed by Chen Weihang on Sep 16, 2022

* normalize yaml file name (#45894)

* Clear extra attributes of activation op in OpMaker (#45772)

* clear extra attr of activation op in opmaker

* fix syntax bug

* fix mkldnn kernel

* fix merge conflict

* fix bug

* [PHI] Normalize yaml op label (#45976)

* normalize yaml op label

* revert op_compat yaml change

* fix prelu and rnn compat problem

* replace api by op

* support assign op backward refuse forward (#45879)

* normize yaml backward op label (#46028)
Co-authored-by: zyfncg <zhangyunfei07@baidu.com>
Co-authored-by: Charles-hit <56987902+Charles-hit@users.noreply.github.com>

8caaf85a

15 Sep, 2022 5 commits
- [ Dy2Static ] Fix bugs when select inputs meeting different shape or... · 00486956
  Committed by xiongkun on Sep 15, 2022
```
[ Dy2Static ] Fix bugs when select inputs meeting different shape or undefined-var (#45916) (#46020)

* fix select_input with different shape errors:
1. select_input_with_buildin_type directly return non-undefinedvar branch when meeting undefined var
2. the output shape of select_input is inferred from inputs.

* reverse the logic in select_input
```
  00486956
- Support 0 shapes input Tensor for MKL slice (#45930) (#46072) · 903c87bd
  Committed by WangZhen on Sep 15, 2022
```
Support 0 shapes input Tensor for MKL slice kernel
```
  903c87bd
- General Plugin Mechanism (#45355) (#46070) · 07933116
  Committed by weishengying on Sep 15, 2022
  
  07933116
- fix distributed bug caused by fill_any_like (#45978) (#46041) · 9012e8bc
  Committed by Charles-hit on Sep 15, 2022
  
  9012e8bc
- fix trt multiclass_nms3 (#45166) (#46034) · 61a3e30b
  Committed by Zhang Jun on Sep 15, 2022
```
* Support dynamic shape in multiclass_nms3 Plugin for Paddle-TensorRT.
```
  61a3e30b
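
The Dy2Static fix listed above (00486956) states two rules for `select_input`: when one branch holds an undefined variable, the defined branch is returned directly, and otherwise the output shape is inferred from both inputs. The pure-Python sketch below illustrates that logic only; it is not Paddle's implementation. `UndefinedVar`, `infer_output_shape` and `select_input_sketch` are hypothetical names, shapes are modeled as plain tuples, and two behaviors are assumptions: a mismatched dimension is marked `-1` (unknown until run time), and a rank mismatch raises an error.

```python
class UndefinedVar:
    """Hypothetical placeholder for a name that one branch never assigns."""

    def __init__(self, name):
        self.name = name


def infer_output_shape(shape_a, shape_b):
    """Merge two static shapes; dimensions that disagree become -1 (assumption)."""
    if len(shape_a) != len(shape_b):
        # Assumption for this sketch: branches must agree on rank.
        raise ValueError("branches must produce tensors of the same rank")
    return tuple(a if a == b else -1 for a, b in zip(shape_a, shape_b))


def select_input_sketch(false_shape, true_shape):
    """Return the static shape recorded for the merged output of two branches."""
    false_undefined = isinstance(false_shape, UndefinedVar)
    true_undefined = isinstance(true_shape, UndefinedVar)
    if false_undefined and true_undefined:
        return None  # neither branch defines the variable
    if false_undefined:
        return true_shape  # rule 1: return the defined branch directly
    if true_undefined:
        return false_shape
    return infer_output_shape(false_shape, true_shape)  # rule 2
```

For example, `select_input_sketch((2, 3), (2, 4))` yields `(2, -1)`, while passing an `UndefinedVar` for either branch returns the other branch's shape unchanged.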
14 Sep, 2022 2 commits
- merge python lib (#46013) · 5130b0a1
  Committed by JingZhuangzhuang on Sep 14, 2022
  
  5130b0a1
- delete new executor log (#45917) · e223cf7b
  Committed by pangyoki on Sep 14, 2022
  
  e223cf7b
13 Sep, 2022 1 commit
- [cherry-pick] Allow manaully set py_reader name in standalone executor (#45898) (#45931) · 29c44eb2
  Committed by Ruibiao Chen on Sep 13, 2022
```
* Allow manaully set py_reader name in standalone executor

* Fix CI errors
```
  29c44eb2
09 Sep, 2022 9 commits

support cumsum flip reverse backward refuse forward (#45892) · d6b5d91c
Committed by Charles-hit on Sep 09, 2022

d6b5d91c

[AutoParallel] adapt lazyinit & fix pass (#45840) · bc2265f8

Committed by zhaoyingli on Sep 09, 2022

* adapt lazy init and fix pass

* add unittest

* update comment

* fix amp and sharding

* remove clip_by_norm

bc2265f8

[ Dy2Static ] convert_call support staticmethod for class. (#44983) · d0096eaf

Committed by xiongkun on Sep 09, 2022

* convert_call support staticmethod for class.

* while support for python container.
It is convenient to convert more dynamic graph codes into static graphs.

* cond support python container

* add unittest for staticmethod convert_call

* fix bugs

* add unittest for item interface

* fix bugs

* change to np.testing.assert_allclose

* code format

* fix comments.

* code format

d0096eaf

Enhance slice to support 0 shape Tensor (#45861) · 4a675b7a
Committed by WangZhen on Sep 09, 2022
```
* Enhance slice to support 0 dims Tensor

* Add UT
```
4a675b7a

modify slice op Infershape (#45855) · 97847ae8
Committed by xiaoguoguo626807 on Sep 09, 2022
```
* modify slice infershape

* code style

* modify slice_unittest
```
97847ae8

Simplify size op impl (#45808) · c252b1de
Committed by Chen Weihang on Sep 09, 2022
```
* simplify size op

* trans to cuda manuly

* fix copy error
```
c252b1de

fix dygraph pp + mp nan after async send/recv (#45869) · 5d7e1c91
Committed by Yuang Liu on Sep 09, 2022

5d7e1c91

fix fused_attention with mp unit test case fail on A100 with CUDA >= 11.6 (#45883) · a5836222
Committed by LiYuRio on Sep 09, 2022

a5836222

Add paddle audio feature module (#45424) · 7c1dc754

Committed by YangZhou on Sep 09, 2022

* add audio feature dataset

* fix coding style

* fix coding style2

* rm librosa

* rm voxceleb

* rm librosa in test

* add scipy fftpack

* add functional

* fix setup

* fix setup2

* rm colorlog

* refactor dataset __init__.py

* fix converage

* fix librosa import error

* fix windows test

* fix windows ci

* rm datasets

* fix setup

* remove testdata

* add librosa in requirement

* add librosa in requirement2

* change librosa to 0.8.1

* update ci docker

* fix ci error

* fix ci error2

* fix ci coverage

* fix converage

* fix coverage

* rm audio_base in test, notest,test=coverage

* fix copyright

* rm backend

* rm compliance&&add function test

* fix setup

* fix windows

* fix windows2

* fix test timeout

* fix ci time issue

* rm test_audio_feature

* avoid windows isssue, tmp

* note windows isssue

* skip windows issue

* fix dtype in layers.mfcc && fix ci-static-check

* add relative accuracy

* modity API.spec

* skip cuda11.2 test

* skip cuda11.2 test2

* skip cuda11.2

7c1dc754

08 Sep, 2022 10 commits
- [Dy2Static] fix non-local error while dealing push_pop names (#45828) · 67d77846
  Committed by xiongkun on Sep 08, 2022
```
* 1. fix non-local error while dealing push_pop names
2. escape "'" in push_pop_names to avoid syntax errors.
3. unified the non-local stmt creation processes in getter and setter.
4. split the nonlocal_names and getter/setter names.

* fix bugs

* 1. revert setter and getter, push_pop_names must have non-local

* fix bugs.

* code format
```
  67d77846
- copyright (#45866) · 38325636
  Committed by wenbin on Sep 08, 2022
  
  38325636
- backward refuse foward part1 (#45868) · d3f52bcc
  Committed by Charles-hit on Sep 08, 2022
```
* support more op for high level

* add unit test for high level op

* remove unnecessary comments
```
  d3f52bcc
- xpu-paddlepaddle-40 [Task] fused_gemm_epilogue supports xpu (#45706) · 7085cb97
  Committed by taixiurong on Sep 08, 2022
```
* add gemm_epilogue

* xpu-paddlepaddle-40 [Task] fused_gemm_epilogue support test=kunlun
```
  7085cb97
- new executor support compiled_program constructed by graph (#45836) · ca1cab3e
  Committed by pangyoki on Sep 08, 2022
  
  ca1cab3e
- [ Hackathon 3rd No.2 ] add paddle.iinfo (#45321) · 40a0a46b
  Committed by OccupyMars2025 on Sep 08, 2022
  
  40a0a46b
- Increase the threshold of softmax and imperative qat UT (#45819) · bd4ce23e
  Committed by Leo Chen on Sep 08, 2022
  
  bd4ce23e
- add group argument (#44758) · bb725e3a
  Committed by LiYuRio on Sep 08, 2022
  
  bb725e3a
- Fix where xpu bug (#45832) · 2cda4e21
  Committed by Siming Dai on Sep 08, 2022
  
  2cda4e21
- fix ptq UT error (#45846) · 83cf6758
  Committed by Guanghua Yu on Sep 08, 2022
  
  83cf6758
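
The `paddle.iinfo` commit listed above (40a0a46b) adds an integer type-information query. The limits such a query reports follow from two's-complement arithmetic, which the pure-Python sketch below reproduces without importing Paddle. `IntInfo` and `int_info` are hypothetical names used only for illustration; the attributes and accepted dtypes of the real API should be checked against the Paddle documentation.

```python
from typing import NamedTuple


class IntInfo(NamedTuple):
    """Hypothetical record of the limits of a fixed-width integer type."""

    bits: int
    min: int
    max: int


def int_info(bits, signed=True):
    """Compute the representable range of a `bits`-wide integer type."""
    if bits <= 0:
        raise ValueError("bits must be a positive integer")
    if signed:
        # Two's complement: one bit carries the sign.
        return IntInfo(bits, -(1 << (bits - 1)), (1 << (bits - 1)) - 1)
    return IntInfo(bits, 0, (1 << bits) - 1)
```

Under these assumptions, `int_info(8)` gives a range of -128 to 127 and `int_info(8, signed=False)` gives 0 to 255.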
07 Sep, 2022 3 commits
- [XPU] update xdnn to 0907. (#45777) · 1e981d0d
  Committed by houj04 on Sep 07, 2022
```
* [XPU] update xdnn to 0906. test=kunlun

* [XPU] update xdnn to 0907. test=kunlun
```
  1e981d0d
- Fix test_custom_relu_op_jit windows error (#45812) · 352babaa
  Committed by Chen Weihang on Sep 07, 2022
```
* fix test_custom_relu_op_jit windows error

* polish assert format
```
  352babaa
- [OpAttr]Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose (#45620) · fe169bf1
  Committed by WangZhen on Sep 07, 2022
```
Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose
```
  fe169bf1

Crayon鑫 / Paddle is in sync with the upstream project it was forked from