Commits · 4b269baaad1c61184d62510afbb7dad514df07e6 · Crayon鑫 / Paddle

17 Mar, 2022 · 5 commits

Revert "[Eager Grad] Support eager grad interface (#40170)" · 4b269baa
Committed by Weilong Wu on Mar 17, 2022
```
This reverts commit 4db8cf24.
```

add time of unittests for dataparallel in dygraph mode (#40639) · e3a67782
Committed by Haohongxiang on Mar 17, 2022

[Eager Grad] Support eager grad interface (#40170) · 4db8cf24

Committed by Weilong Wu on Mar 17, 2022

* [Eager] Support eager grad interface, draft version

* Support eager grad interface with allow_unused and multi startup_op

* Fix code format

* Fix allow_unused case, return PyNone if tensor is not initialized

* Support output's stop_gradient related to create_graph

* Support grad exception case in eager mode, fix coverage CI

* Update ToPyObject, return PyNone if not initialized

* AccumulationNode add FLAGS_retain_grad_for_all_tensor

* Fix ci issue

* Fix CI issue

* fix, use core.eager.Tensor

* Add func SetBufferSlotRankZeros for GradTensorHolder

* Support retain_graph by using ClearTensorWrappers

* Support retain_graph by using ClearTensorWrappers

* Update retain_graph and no_grad_vars related test case

* Update code gen logic for ClearTensorWrappers

* Fix by override statement

* fix override func args

* Support retain_graph, update unit tests

* Updated ClearTensorWrappers logic

* fix grad python interface

* Use deep copy and update unit tests

* Polish code

* Polish code

* Fix CI issue, use deep copy only when user sets grad_tensors

* Fix CI, use Backward instead of RunBackward

* Fix CI, Declare kernel explicitly in test file

* Polish, remove vector of TensorWrapper

* Refactor the logic of grad/backward, polish codes

* Update code after merge upstream develop

* Polish after merge upstream develop

* Update to adapt new GradNodeBase superclass

* Fix error introduced during conflict resolution

* Update purify potential_startup_nodes logic

* Fix errors

* Polish code

* Remove useless args for ToPyObject

* Remove useless TensorWrappersSet

* Fix code-format, re-install pre-commit

* Fix pre-process logic for potential_startup_ops

* Update unit tests, use eager mode

Refine io for test_mnist.py (#40496) · 1e045cae

Committed by 0x45f on Mar 17, 2022

* for test_mnist.py

* remove comments

* use type() instead of isinstance()

* valid vars for run program OP in io.py

* open test_mnist in eager_guard for coverage

Optimize the performance of C++ API (#40640) · add304ed

Committed by zyfncg on Mar 17, 2022

* Optimize performance

* optimize C++ API performance

* remove unused code

* fix paddle throw

* update format

16 Mar, 2022 · 14 commits
- [Ops] segment pool op support for int int64 kernel. (#40577) · 6849d33b
  Committed by Zhong Hui on Mar 16, 2022
```
* segment pool support for int int64 kernel.

* add support in python api
```
- [KP] Fix registry and add UT for thresholded_relu & softshrink (#40524) · bef6f2e1
  Committed by Lijunhui on Mar 16, 2022
```
* init commit

* correct namespace
```
- Add yaml config for pool2d (#40563) · ac5cc136
  Committed by From00 on Mar 16, 2022
```
* Add yaml config for pool2d

* Fix CI error

* Fix code format error
```
- Refactor elementwise op grad classes (#40187) · 7004f65c
  Committed by piotrekobi on Mar 16, 2022
```
* Refactor elementwise op grad classes

* Add more refactor changes

* Revert set layout and format deletion

* Fix failing elementwise test
```
- Modify save_quant_model to support different input and output filenames (#40542) · dec2b1ca
  Committed by joanna.wozna.intel on Mar 16, 2022
```
* Modify save_quant_model.py to support different input and output filenames

* Correct wrong order of arguments
```
- clean up DeviceManager in advance manually (#40504) · 23c036d6
  Committed by ronnywang on Mar 16, 2022
- fix paddle.optimizer.SGD en docs (#40479) · 8e631715
  Committed by Nyakku Shigure on Mar 16, 2022
```
* align to cn docs

* add parameter `weight_decay`
```
- [PHI] Migrate index_select op (#40260) · 99452af7
  Committed by chenenquan on Mar 16, 2022
```
* [PHI] Migrate index_select op

* [PHI] Fix bug in test_variable

* [PHI] migrate index_select op
```
- Add Support Layer List to ASP (#40253) · c040bbd7
  Committed by Ming-Xu Huang on Mar 16, 2022
- fix xpu op test, *test=kunlun (#40409) · d1a98f0b
  Committed by TTerror on Mar 16, 2022
- [MLU] support amp O1 of mlu (#40461) · ad81f22c
  Committed by qipengh on Mar 16, 2022
- Polish reshape error message under @to_static (#40599) · 80194bde
  Committed by Aurelius84 on Mar 16, 2022
- [Auto Parallel] Add the support for the auto completion of while_op (#39939) · ec6b8fbd
  Committed by Yulong Ao on Mar 16, 2022
```
* [Auto Parallel] Support the auto completion of while_op

* [Auto Parallel] Improve the completion algorithms

* [Auto Parallel] Fix bugs for ernie inference

* [Auto Parallel] Remove attrs which cannot be pickled

* [Auto Parallel] make the dims_mappings of LodTensorArray vars empty

* [Auto Parallel] Fix bugs for the ernie inference in the pipeline parallel

* [Auto Parallel] Remove unnecessary comments

* [Auto Parallel] Fix a bug of the CMakeLists

* [Auto Parallel] Use the newest APIs to write the unit test

* [Auto Parallel] Remove unnecessary statements
```
- fix IterableDataset may block model when num_workers > 0. test=develop (#40541) · a991b6a0
  Committed by Kaipeng Deng on Mar 16, 2022
15 Mar, 2022 · 17 commits

Support some ops for full quantization (#40083) · 7ced3017
Committed by Guanghua Yu on Mar 15, 2022
```
* add some op for full_quantization
```

add number count op (#39224) · 9bdee437

Committed by Roc on Mar 15, 2022

* add expert count op

add ut for expert_count

* update UT only for cuda

* fix for rocm

* update ut

* add moe module

* add expert count op

add ut for expert_count

* update UT only for cuda

* update ut

* add moe module

* make expert count private

* rename expert count op
Co-authored-by: hlygit66666 <2570058140@qq.com>

run python api in eager model and filter the out in argument list (#40523) · 4d886f75
Committed by xiongkun on Mar 15, 2022
```
* run python api in eager model and filter the out in argument list

* fix code
```
[einsum] refactored and supporting unknown shapes in static mode (#40360) · 187fcfa3
Committed by Tongxin Bai on Mar 15, 2022
```
* formatted.

* Remove dead code.

* Fix error message in the unit test.

* polish formats.

* [Einsum] fix bugs.
```
[NPU] add AMP O1 support (#40362) · 69dd43d1
Committed by furnace on Mar 15, 2022
```
* [NPU] add AMP O1 support

* [NPU] fix NOTE and warnings
```
[Auto Parallel] Add the recorder and trial class for the tuner (#40555) · 2c5edb4f
Committed by Yulong Ao on Mar 15, 2022
```
Add the recorder
```

oneDNN NHWC fixes (#40049) · dde9cec0

Committed by Jacek Czaja on Mar 15, 2022

* - Prototype of third solution

- fix

- compilation fixes

- fix

- fix

- fix

- fix

- compilation fix

- comment fix

- lint

update mkldnn conv_elementwise_add_fuse_pass ut

- NHWC changes to prelu

- alpha dims

- UT fix

- fix to UT

- lint

- Some fixes

- added to BWD of prelu NHWC support

- reverted removal of resetting cu_layout in clearing of caching

* - Small changes

* - compilation fix

* - fix

* - fix

* lint

* - fixes after internal review

* - compilation fix

* - lint

change CUDA implementation of randperm OP (#40464) · 813f61d2
Committed by zhouweiwei2014 on Mar 15, 2022

add softmax yaml and add_raw infermeta (#40534) · 7039f61e
Committed by Chen Weihang on Mar 15, 2022

Added more profile signposts to dygraph (#40201) · 36db75b4

Committed by Zhanlue Yang on Mar 15, 2022

* Added more signposts to dygraph profiling

* Fixed minor issues

* Refactored signpost names

* Fixed typo

* Removed debug codes

* Fixed typo

* Adjusted signpost names

* Fixed issues from branch merge

Move one hot to phi (#39876) · 7701db37

Committed by hong on Mar 15, 2022

* move one hot to phi; test=develop

* fix bugs; test=develop

* fix bugs; test=develop

* add infer meta; test=develop

* fix bugs; test=develop

* resolve conflict

* resolve conflict

* fix bug;

* fix error; test=develop

* update; test=develop

* polish code; test=develop

* add one api in eager mode; test=develop

* add one hot test; test=develop

* remove useless code; test=develop

* fix bug; test=develop

* polish code; test=develop

* polish code; test=develop

New design for launch/run (#40086) · 67c6ddff
Committed by kuizhiqing on Mar 15, 2022

[Auto parallel] Redesign the tuner for auto parallel (#40121) · f84b54eb
Committed by Yulong Ao on Mar 15, 2022
```
* [Auto Parallel] Redesign the tuner for Auto Parallel
```
[MLU] add check_finite_and_unscale op for amp (#40458) · 42c7bb47
Committed by qipengh on Mar 15, 2022

add yaml (#40533) · 5cb506b0
Committed by YuanRisheng on Mar 15, 2022

[IPU] add IPU related CI configures (#40354) · 8852591f
Committed by Allen Guo on Mar 15, 2022
```
* add ci

* rm retry tests

* format

* restore retry tests

* update timeout for ipu uts
```

[Dygraph] Refactoring of reducer in DataParallel (#40389) · 1a32391c

Committed by Haohongxiang on Mar 15, 2022

* refactor reducer

* modify cmakelists

* solve conflicts

* rename group and update process_group

* fix bugs of ProcessGroupNCCL

* modify for CIs

* refactoring reducer

14 Mar, 2022 · 4 commits

[Phi]Add diag_v2 grad kernel (#40447) · e157f2af

Committed by Siming Dai on Mar 14, 2022

* Add diag grad kernel

* fix unittest case

* add float16, remove const &

* delete diag_grad in op_utils.h

Add an elementwise + activation fusion pass. (#36541) · 3f219160

Committed by Tomasz Socha on Mar 14, 2022

* Add elementwise add and activation fuse pass

* Fix copy elision

* More flexible pattern detector

* More flexible fusion pass

* Update lists for pass

* Add support for Pow operator

* Add support for more activation types

* Style

* Rename fusion pass

* First version of tests

* Dirty version of pass

* Polished version

* Update pbtxt

* Style

* Update names

* Style

* Use PADDLE_ENFORCE_EQ

* Save error message to variable

* WO for error checks

* CR

* Static style check

* Add missing 'activation_scale' attribute

* Add relu6 and sigmoid activations

* Style

* Fix fuse list formatting

* Sync filenames for fuse pass files

* Fix cmake after move

* Fix registration

* Fix pass name in tests

* Add missing activations to checker

* WIPS

* Working mul op

* Working sub

* Working Add

* Remove pten includes

* Remove some forward declarations

* Remove Includes

* Fixes

* Remove default kernels

* Add check if post_ops attributes are available

* Style

* Code adjustment

* Register default kernels

* We have year 2022 not 2021...
Co-authored-by: jakpiase <jakpia21@gmail.com>
Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>

* Fast review fixes
Co-authored-by: jakpiase <jakpia21@gmail.com>
Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>

* Review Fix

* Rename one_dnn -> onednn

* Style after review

* Fast and dirty fix for quantization

* Update tests

* Style

* Fix mkldnn_quantizer config

* Add Joanna's suggestion.

* Check if operator is explicitly disabled on OneDNN

* Try to use unregistered attributes

* Style

* Test new framework

* FXI

* FXII

* Update test

* Style
Co-authored-by: jakpiase <jakpia21@gmail.com>
Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>

[MLU] add merged_momentum mlu kernel (#40406) · 1f7b2516
Committed by fwenguang on Mar 14, 2022

Support custom op and paddle.autograd.backward in eager (#40423) · 227fa408

Committed by Jiabin Yang on Mar 14, 2022

* eager, test=develop

* fix bug, test=develop

* eager, test=develop

* merge legacy to fluid

* eager, test=develop

* eager, test=develop

* Refactor TensorAdd func by template and remove gradient_accumulation in eager

* Remove needless target name

* eager, test=develop

* eager, test=develop

* Use overload instead of template

* Remove legacy code

* Remove legacy code

* selectedrows, test=develop

* Remove DataType test

* eager, test=develop

* eager, test=develop

* support gan, test=develop

* Using Tensor directly instead of using EagerTensor

* support gradient_accumulation

* make test_imperative_lod_tensor_to_selected_rows longer

* make test_imperative_lod_tensor_to_selected_rows longer

* refine code

* ptb, test=develop

* Rename all EagerTensor to Tensor

* Rename some EagerTensor to Tensor

* rename EagerTensor to EagerVariable

* eager, test=develop

* eager, test=develop

* eager, test=develop

* eager, test=develop

* add more test

* eager, test=develop

* Support copiable selected rows and merge develop

* save load, eager, test=develop

* save load, eager, test=develop

* refine, test=develop

* remove useless _set_value method

* refine, test=develop

* refine, test=develop

* revert static_runner, test=develop

* EagerTensor to Tensor, test=develop

* refine, test=develop

* refine, test=develop

* clear grad, test=develop

* merge, develop

* merge, develop

* merge, test=develop

* merge, test=develop

* Support quant and part of slice

* support legacy static save

* extend slim tests time

* remove imperative on inference

* remove imperative on inference

* merge develop

* fix typo

* fix typo

* split slice related code into 2 part for imperative and eager

* split slice from inference

* split slice from inference

* fix test_tensor_register_hook

* support custom op in eager mode

* fix inference deps error

* split eager utils from custom operator

* fix type match

* fix typo
Co-authored-by: Wang Huan <wanghuan29@baidu.com>
Co-authored-by: Weilong Wu <veyron_wu@163.com>
Co-authored-by: wanghuancoder <wanghuancoder@163.com>
