Commits · c1c6b86944a6323a338af730c8947ba143cdd5c0 · PaddlePaddle / Paddle

Mar 16, 2022 (5 commits)

update; test=develop · 733d3109
Committed by phlrain on Mar 16, 2022

[PHI] Migrate index_select op (#40260) · 99452af7
Committed by chenenquan on Mar 16, 2022

* [PHI] Migrate index_select op

* [PHI] Fix bug in test_variable

* [PHI] migrate index_select op


Add Support Layer List to ASP (#40253) · c040bbd7
Committed by Ming-Xu Huang on Mar 16, 2022

fix xpu op test, *test=kunlun (#40409) · d1a98f0b
Committed by TTerror on Mar 16, 2022

[Auto Parallel] Add the support for the auto completion of while_op (#39939) · ec6b8fbd
Committed by Yulong Ao on Mar 16, 2022

* [Auto Parallel] Support the auto completion of while_op

* [Auto Parallel] Improve the completion algorithms

* [Auto Parallel] Fix bugs for ernie inference

* [Auto Parallel] Remove attrs which cannot be pickled

* [Auto Parallel] make the dims_mappings of LodTensorArray vars empty

* [Auto Parallel] Fix bugs for the ernie inference in the pipeline parallel

* [Auto Parallel] Remove unnecessary comments

* [Auto Parallel] Fix a bug of the CMakeLists

* [Auto Parallel] Use the newest APIs to write the unit test

* [Auto Parallel] Remove unnecessary statements


Mar 15, 2022 (13 commits)

add number count op (#39224) · 9bdee437
Committed by Roc on Mar 15, 2022

* add expert count op

add ut for expert_count

* update UT only for cuda

* fix for rocm

* update ut

* add moe module

* add expert count op

add ut for expert_count

* update UT only for cuda

* update ut

* add moe module

* make expert count private

* rename expert count op
Co-authored-by: hlygit66666 <2570058140@qq.com>


run python api in eager model and filter the out in argument list (#40523) · 4d886f75
Committed by xiongkun on Mar 15, 2022

* run python api in eager model and filter the out in argument list

* fix code

[einsum] refactored and supporting unknown shapes in static mode (#40360) · 187fcfa3
Committed by Tongxin Bai on Mar 15, 2022

* formatted.

* Remove dead code.

* Fix error message in the unit test.

* polish formats.

* [Einsum] fix bugs.

[Auto Parallel] Add the recorder and trial class for the tuner (#40555) · 2c5edb4f
Committed by Yulong Ao on Mar 15, 2022

Add the recorder

oneDNN NHWC fixes (#40049) · dde9cec0
Committed by Jacek Czaja on Mar 15, 2022

* - Prototype of third solution

- fix

- compilation fixes

- fix

- fixe

- fix

- fix

- compilation fix

- comment fix

- lint

update mkldnn conv_elementwise_add_fuse_pass ut

- NHWC changes to prelu

- alpha dims

- UT fix

- fix to UT

- lint

- Some fixes

- added to BWD of prelu NHWC support

- reverted removal of resetting cu_layout in clearing of caching

* - Small changes

* - compilation fix

* - fix

* - fix

* lint

* - fixes after internal review

* - compilation fix

* - lint


change CUDA implementation of randperm OP (#40464) · 813f61d2
Committed by zhouweiwei2014 on Mar 15, 2022

update · 9cd5cd4e
Committed by phlrain on Mar 15, 2022

Move one hot to phi (#39876) · 7701db37
Committed by hong on Mar 15, 2022

* move one hot to phi; test=develop

* fix bugs; test=develop

* fix bugs; test=develop

* add infer meta; test=develop

* fix bugs; test=develop

* resolve conflict

* resolve conflict

* fix bug;

* fix error; test=develop

* update; test=develop

* polish code; test=develop

* add one api in eager mode; test=develop

* add one hot test; test=develop

* remove useless code; test=develop

* fix bug; test=develop

* polish code; test=develop

* polish code; test=develop


New design for launch/run (#40086) · 67c6ddff
Committed by kuizhiqing on Mar 15, 2022

[Auto parallel] Redesign the tuner for auto parallel (#40121) · f84b54eb
Committed by Yulong Ao on Mar 15, 2022

* [Auto Parallel] Redesign the tuner for Auto Parallel

[MLU] add check_finite_and_unscale op for amp (#40458) · 42c7bb47
Committed by qipengh on Mar 15, 2022

[IPU] add IPU related CI configures (#40354) · 8852591f
Committed by Allen Guo on Mar 15, 2022

* add ci

* rm retry tests

* format

* restore retry tests

* update timeout for ipu uts

[Dygraph] Refactoring of reducer in DataParallel (#40389) · 1a32391c
Committed by Haohongxiang on Mar 15, 2022

* refactor reducer

* modify cmakelists

* solve conflicts

* rename group and update process_group

* fix bugs of ProcessGroupNCCL

* modify for CIs

* refactoring reducer


Mar 14, 2022 (8 commits)

[Phi]Add diag_v2 grad kernel (#40447) · e157f2af
Committed by Siming Dai on Mar 14, 2022

* Add diag grad kernel

* fix unittest case

* add float16, remove const &

* delete diag_grad in op_utils.h


Add an elementwise + activation fusion pass. (#36541) · 3f219160
Committed by Tomasz Socha on Mar 14, 2022

* Add elementwise add and activation fuse pass

* Fix copy elision

* More flexible pattern detector

* More flexible fusion pass

* Update lists for pass

* Add support for Pow operator

* Add support for more activation types

* Style

* Rename fusion pass

* First version of tests

* Dirty version of pass

* Polished version

* Update pbtxt

* Style

* Update names

* Style

* Use PADDLE_ENFORCE_EQ

* Save error message to variable

* WO for error checks

* CR

* Static style check

* Add missing 'activation_scale' attribute

* Add relu6 and sigmoid activations

* Style

* Fix fuse list formatting

* Sync filenames for fuse pass files

* Fix cmake after move

* Fix registration

* Fix pass name in tests

* Add missing activations to checker

* WIPS

* Working mul op

* Working sub

* Working Add

* Remove pten includes

* Remove some forward declarations

* Remove Includes

* Fixes

* Remove default kernels

* Add check if post_ops attributes are available

* Style

* Code adjustment

* Register default kernels

* We have year 2022 not 2021...
Co-authored-by: jakpiase <jakpia21@gmail.com>
Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>

* Fast review fixes
Co-authored-by: jakpiase <jakpia21@gmail.com>
Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>

* Review Fix

* Rename one_dnn -> onednn

* Style after review

* Fast and dirty fix for quantization

* Update tests

* Style

* Fix mkldnn_quantizer config

* Add Joanna's suggestion.

* Check if operator is explicitly disabled on OneDNN

* Try to use unregistered attributes

* Style

* Test new framework

* FXI

* FXII

* Update test

* Style
Co-authored-by: jakpiase <jakpia21@gmail.com>
Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>


[MLU] add merged_momentum mlu kernel (#40406) · 1f7b2516
Committed by fwenguang on Mar 14, 2022

Support custom op and paddle.autograd.backward in eager (#40423) · 227fa408
Committed by Jiabin Yang on Mar 14, 2022

* eager, test=develop

* fix bug, test=develop

* eager, test=develop

* merge legacy to fluid

* eager, test=develop

* eager, test=develop

* Refactor TensorAdd func by template and remove gradient_accumulation in eager

* Remove needless target name

* eager, test=develop

* eager, test=develop

* Use overload instead of template

* Remove legacy code

* Remove legacy code

* selectedrows, test=develop

* Remove DataType test

* eager, test=develop

* eager, test=develop

* support gan, test=develop

* Using Tensor directly instead of using EagerTensor

* support gradient_accumulation

* make test_imperative_lod_tensor_to_selected_rows longer

* make test_imperative_lod_tensor_to_selected_rows longer

* refine code

* ptb, test=develop

* Rename all EagerTensor to Tensor

* Rename some EagerTensor to Tensor

* rename EagerTensor to EagerVariable

* eager, test=develop

* eager, test=develop

* eager, test=develop

* eager, test=develop

* add more test

* eager, test=develop

* Support copiable selected rows and merge develop

* save load, eager, test=develop

* save load, eager, test=develop

* refine, test=develop

* remove useless _set_value method

* refine, test=develop

* refine, test=develop

* revert static_runner, test=develop

* EagerTensor to Tensor, test=develop

* refine, test=develop

* refine, test=develop

* clear grad, test=develop

* merge, develop

* merge, develop

* merge, test=develop

* merge, test=develop

* Support quant and part of slice

* support legacy static save

* extend slim tests time

* remove imperative on inference

* remove imperative on inference

* merge develop

* fix typo

* fix typo

* split slice related code into 2 part for imperative and eager

* split slice from inference

* split slice from inference

* fix test_tensor_register_hook

* support custom op in eager mode

* fix inference deps error

* split eager utils from custom operator

* fix type match

* fix typo
Co-authored-by: Wang Huan <wanghuan29@baidu.com>
Co-authored-by: Weilong Wu <veyron_wu@163.com>
Co-authored-by: wanghuancoder <wanghuancoder@163.com>


adjust params order for eager.Tensor._copy_to (#40449) · c6ec8b9f
Committed by 0x45f on Mar 14, 2022

[KP] Add unittests for brelu,ceil,celu,elu,floor,hard_shrink,hard_sigmoid,log1p,logsigmoid,relu6,silu,soft_relu,softsign,swish (#40448) · f269ca3f
Committed by Lijunhui on Mar 14, 2022

* solve unexecuted UT

* add 24 activation op UT

* append swish&thresholded_relu to kpfirst_list

* rm thresholded_relu


[AutoParallel] Converter (#40434) · 3881b6cb
Committed by zhaoyingli on Mar 14, 2022

* [AutoParallel] Converter
Converter API

[multiprocessing] Add paddle.incubate.multiprocessing for sharing tensors between python processes. (#37302) · e553f758
Committed by Zhong Hui on Mar 14, 2022

* Add support for paddle.multiprocessing
* move multiprocessing to incubate.


Mar 11, 2022 (5 commits)

[hybrid] Support tensor parallel and cache structure for fused attention op. (#40101) · 1882c496
Committed by Yuang Liu on Mar 11, 2022

[MLU]add allgather_op mlu kernel (#40356) · dc773828
Committed by zn on Mar 11, 2022

update square & sigmoid unittest (#40404) · 807bff4a
Committed by z8hanghuan on Mar 11, 2022

minor fix matmul and onehot xpu. test=kunlun (#40419) · 594e412d
Committed by houj04 on Mar 11, 2022

fix_import_distribute_bugs (#40396) · bd2d4fd0
Committed by Baibaifan on Mar 11, 2022

Mar 10, 2022 (3 commits)

[Auto Parallel]Update reshard for while sub block (#40366) · 2747de2b
Committed by caozhou on Mar 10, 2022

* update reshard for while sub block

* fix code format error

add tril_triu for xpu, *test=kunlun (#40246) · 1128db30
Committed by z8hanghuan on Mar 10, 2022

* add tril_triu for xpu, *test=kunlun

* add tril_triu for xpu, *test=kunlun

* add tril_triu for xpu, *test=kunlun

* add tril_triu for xpu, *test=kunlun

* add tril_triu for xpu, *test=kunlun


Move dropout to phi (#40148) · 99fc1b08
Committed by hong on Mar 10, 2022

* move dropout to phi; test=develop

* fix xpu, npu compile error; test=develop


Mar 09, 2022 (6 commits)

add_sharding_api (#40129) · f40ed5f4
Committed by Baibaifan on Mar 09, 2022

change timeout for pool (#40341) · 1defc8f3
Committed by feng_shuai on Mar 09, 2022

fix the full_like with fill the value of inf (#40232) · ec582895
Committed by wawltor on Mar 09, 2022

* fix the full_like with fill the value of inf

* update the test case for the fill_any_like

* update the comments for the full_like

adapt run_program OP for eager (#40198) · 3e9601ba
Committed by 0x45f on Mar 09, 2022

* adapt run_program OP for eager

* fix program_id

* refine code

* fix test

Fix time of utest in distributed (#40163) · 7ea9235c
Committed by ShenLiang on Mar 09, 2022

* fix time of utest

[hybrid] fused_feedforward op support tensor model parallel (#40160) · e0866dc6
Committed by WangXi on Mar 09, 2022

PaddlePaddle / Paddle · synced successfully more than 1 year ago