Commits · 3f219160bee15a3afa7107439197361f8266dc57 · Crayon鑫 / Paddle

14 Mar, 2022 · 17 commits

Add an elementwise + activation fusion pass. (#36541) · 3f219160

Committed by Tomasz Socha on Mar 14, 2022

* Add elementwise add and activation fuse pass

* Fix copy elision

* More flexible pattern detector

* More flexible fusion pass

* Update lists for pass

* Add support for Pow operator

* Add support for more activation types

* Style

* Rename fusion pass

* First version of tests

* Dirty version of pass

* Polished version

* Update pbtxt

* Style

* Update names

* Style

* Use PADDLE_ENFORCE_EQ

* Save error message to variable

* WO for error checks

* CR

* Static style check

* Add missing 'activation_scale' attribute

* Add relu6 and sigmoid activations

* Style

* Fix fuse list formatting

* Sync filenames for fuse pass files

* Fix cmake after move

* Fix registration

* Fix pass name in tests

* Add missing activations to checker

* WIPS

* Working mul op

* Working sub

* Working Add

* Remove pten includes

* Remove some forward declarations

* Remove Includes

* Fixes

* Remove default kernels

* Add check if post_ops attributes are available

* Style

* Code adjustment

* Register default kernels

* We have year 2022 not 2021...
Co-authored-by: jakpiase <jakpia21@gmail.com>
Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>

* Fast review fixes
Co-authored-by: jakpiase <jakpia21@gmail.com>
Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>

* Review Fix

* Rename one_dnn -> onednn

* Style after review

* Fast and dirty fix for quantization

* Update tests

* Style

* Fix mkldnn_quantizer config

* Add Joanna's suggestion.

* Check if operator is explicitly disabled on OneDNN

* Try to use unregistered attributes

* Style

* Test new framework

* FXI

* FXII

* Update test

* Style
Co-authored-by: jakpiase <jakpia21@gmail.com>
Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>

3f219160

[MLU] add merged_momentum mlu kernel (#40406) · 1f7b2516
Committed by fwenguang on Mar 14, 2022

1f7b2516

optimize group_norm op backward (#39944) · 5720537e

Committed by crystal on Mar 14, 2022

* optimize backward

* optimize group_norm backward

* Add vectorized code

* move assignment code

* merge function

* move code

* optimize code

* Modify function name

5720537e

Optimize bilinear_interp backward (#39423) · 9e1f762c

Committed by Lijunhui on Mar 14, 2022

* bilinear_bw init

* optimize code

* optimize

* optimize 2

* optimize functions

* modify func name

9e1f762c

[phi]migrate fmax,fmin kernel to phi (#40140) · bb801960
Committed by Xiaoxu Chen on Mar 14, 2022

bb801960

Support custom op and paddle.autograd.backward in eager (#40423) · 227fa408

Committed by Jiabin Yang on Mar 14, 2022

* eager, test=develop

* fix bug, test=develop

* eager, test=develop

* merge legacy to fluid

* eager, test=develop

* eager, test=develop

* Refactor TensorAdd func by template and remove gradient_accumulation in eager

* Remove needless target name

* eager, test=develop

* eager, test=develop

* Use overload instead of template

* Remove legacy code

* Remove legacy code

* selectedrows, test=develop

* Remove DataType test

* eager, test=develop

* eager, test=develop

* support gan, test=develop

* Using Tensor directly instead of using EagerTensor

* support gradient_accumulation

* make test_imperative_lod_tensor_to_selected_rows longer

* make test_imperative_lod_tensor_to_selected_rows longer

* refine code

* ptb, test=develop

* Rename all EagerTensor to Tensor

* Rename some EagerTensor to Tensor

* rename EagerTensor to EagerVariable

* eager, test=develop

* eager, test=develop

* eager, test=develop

* eager, test=develop

* add more test

* eager, test=develop

* Support copiable selected rows and merge develop

* save load, eager, test=develop

* save load, eager, test=develop

* refine, test=develop

* remove useless _set_value method

* refine, test=develop

* refine, test=develop

* revert static_runner, test=develop

* EagerTensor to Tensor, test=develop

* refine, test=develop

* refine, test=develop

* clear grad, test=develop

* merge, develop

* merge, develop

* merge, test=develop

* merge, test=develop

* Support quant and part of slice

* support legacy static save

* extend slim tests time

* remove imperative on inference

* remove imperative on inference

* merge develop

* fix typo

* fix typo

* split slice related code into 2 part for imperative and eager

* split slice from inference

* split slice from inference

* fix test_tensor_register_hook

* support custom op in eager mode

* fix inference deps error

* split eager utils from custom operator

* fix type match

* fix typo
Co-authored-by: Wang Huan <wanghuan29@baidu.com>
Co-authored-by: Weilong Wu <veyron_wu@163.com>
Co-authored-by: wanghuancoder <wanghuancoder@163.com>

227fa408

Optimize performance of log_softmax (#38992) · 250e254f

Committed by Zhang Zheng on Mar 14, 2022

* Optimize performance of log_softmax

* delete unity build

* modify to phi

* fix

* fixfixfixfix

* fix

* fix

* fix

* fix

* simplify

* fix

* fix enforce

250e254f

adjust params order for eager.Tensor._copy_to (#40449) · c6ec8b9f
Committed by 0x45f on Mar 14, 2022

c6ec8b9f

[KP] Add unittests for... · f269ca3f

Committed by Lijunhui on Mar 14, 2022

[KP] Add unittests for brelu,ceil,celu,elu,floor,hard_shrink,hard_sigmoid,log1p,logsigmoid,relu6,silu,soft_relu,softsign,swish (#40448)

* solve unexecuted UT

* add 24 activation op UT

* append swish&thresholded_relu to kpfirst_list

* rm thresholded_relu

f269ca3f

【phi】migrate matrix_rank to phi (#40074) · b9d4285b

Committed by crystal on Mar 14, 2022

* migrate matrix_rank to phi

* migrate eigh and matrix_rank to phi

* fix matrix_rank

* optimize code

* move matrix_rank to phi

* add max functor

* migrate matrix_rank to phi

* optimize code

b9d4285b

[Phi] Migrate triangular_solve dependence to phi (#40417) · 930a5136
Committed by zhouweiwei2014 on Mar 14, 2022

930a5136

Update profiler (#40460) · 89a70c76
Committed by liutiexing on Mar 14, 2022

89a70c76

Adjust Yaml name parsing to satisfy Sparse-related APIs (#40480) · d6e99fe4
Committed by Zhanlue Yang on Mar 14, 2022

d6e99fe4

[GPUPS]fix instag lod information (#40483) · e5c59fc9
Committed by zmxdream on Mar 14, 2022

e5c59fc9

[multiprocessing] Add paddle.incubate.multiprocessing for sharing tensors ... · e553f758

Committed by Zhong Hui on Mar 14, 2022

[multiprocessing] Add paddle.incubate.multiprocessing for sharing tensors between Python processes. (#37302)

* Add support for paddle.multiprocessing
* move multiprocessing to incubate.

e553f758

Move Pool OPs to phi (#40208) · 88ec08a7
Committed by From00 on Mar 14, 2022
```
* Move Pool OPs to phi

* Fix CI error

* Fix conflicts
```
88ec08a7

Refine partial_program for new run_program OP (#40355) · afafb1c3

Committed by 0x45f on Mar 14, 2022

* refine partial_program

* fix code for test_mnist.py train

* support quantify UT

* make __fake_vars and _double_grads to lazy

* fix comments

afafb1c3

13 Mar, 2022 · 2 commits
- polish several details (#40485) · 1b0cecb7
  Committed by Chen Weihang on Mar 13, 2022
  
  1b0cecb7
- [PHI] Refactor infermeta files (Part2) (#40367) · f3f27d25
  Committed by zyfncg on Mar 13, 2022
```
* refactor infermeta files

* update
```
  f3f27d25
12 Mar, 2022 · 6 commits
- [Phi] Add softmax infermeta functions (#40471) · ec09ef26
  Committed by Chen Weihang on Mar 12, 2022
```
* rename softmax kernel name

* move softmax infershape

* fix failed test
```
  ec09ef26
- [Phi] Move allclose op kernel into phi (#40469) · 76f87034
  Committed by Chen Weihang on Mar 12, 2022
```
* move allclose kernel

* remove allclose op kernel

* fix coverage failed
```
  76f87034
- [PHI] Move forward kernel of roi_align into phi (#40382) · 39de9b8a
  Committed by zyfncg on Mar 12, 2022
```
* move roi_align kernel to phi

* fix bug of roi_align xpu
```
  39de9b8a
- [custom kernel] fix static object de-initialize bug (#40414) · 573ca984
  Committed by Aganlengzi on Mar 12, 2022
```
* [custom kernel] fix static object de-initialize bug

* fix text

* fix text

* refine log info
```
  573ca984
- fix NetBuilder API Name bug in cinn_lib_test (#40392) · 69a01c47
  Committed by jiangcheng on Mar 12, 2022
```
* fix NetBuilder API Name bug in cinn_lib_test

* update cinn version to newest
```
  69a01c47
- Fix eager benchmark test failed (#40468) · 70f83f1d
  Committed by Chen Weihang on Mar 12, 2022
```
* fix eager benchmark test failed

* fix test_tracer failed
```
  70f83f1d
11 Mar, 2022 · 15 commits
- Use OneDNN's LayerNorm kernel (#40418) · d1811010
  Committed by Tomasz Socha on Mar 11, 2022
  
  d1811010
- fix the bug for processgroup_hccl compiling (#40437) · f70f5e4f
  Committed by lilong12 on Mar 11, 2022
  
  f70f5e4f
- polish trace op detail (#40425) · 88c03071
  Committed by Chen Weihang on Mar 11, 2022
  
  88c03071
- Added Final State Matmul_v2 to C++ performance test (#40391) · 6d830f6c
  Committed by Zhanlue Yang on Mar 11, 2022
  
  6d830f6c
- refactor conv+elementwise_add (residual) (#40005) · 47459e98
  Committed by Sylwester Fraczek on Mar 11, 2022
  
  47459e98
- Move psroi_pool OP to phi (#40353) · c0e29233
  Committed by From00 on Mar 11, 2022
```
* Move psroi_pool OP to phi

* Replace platform::TensorCopy with phi::Copy
```
  c0e29233
- [Phi] Remove needless deps in unittests (#40256) · 89ed57e2
  Committed by Chen Weihang on Mar 11, 2022
```
* remove needless deps in unittests

* add gpu macro

* fix other unittests

* fix kernel name error

* fix test_prepare_op

* fix failed dygraph unittests

* fix gpu failed tests

* fix cinn test failed

* fix cinn test failed

* fix dropout tests
```
  89ed57e2
- [hybrid] Support tensor parallel and cache structure for fused attention op. (#40101) · 1882c496
  Committed by Yuang Liu on Mar 11, 2022
  
  1882c496
- [Phi]migrate cholesky_solve op to phi (#40387) · e24ca55e
  Committed by zhouweiwei2014 on Mar 11, 2022
  
  e24ca55e
- [MLU]add allgather_op mlu kernel (#40356) · dc773828
  Committed by zn on Mar 11, 2022
  
  dc773828
- [Phi] Migrate tile_op into Phi (#40371) · 282cba48
  Committed by Aurelius84 on Mar 11, 2022
```
* [Phi] Migrate tile_op into Phi

* fix tile_sig

* fix include headers

* fix using
```
  282cba48
- [Phi] Reduce grad (#40263) · f452ad5c
  Committed by chentianyu03 on Mar 11, 2022
```
* add reduce_sum grad kernel

* add reduce_grad

* modify reduce grad

* update reduce grad functions

* fix build error

* add argument mapping

* move cast input after grad

* add dims.size=1 cpu reduce_sum grad compute method

* update reduce grad GPU

* remove raw reduce_sum_grad kernel

* modify header files

* add namespace funcs for reduce_grad_functions
```
  f452ad5c
- [PHI] Migrate shard_index op (#40254) · ad037caa
  Committed by Jeffrey Chen on Mar 11, 2022
  
  ad037caa
- [Phi]Move expand_as kernel to phi (#40373) · 8cabb9f3
  Committed by Zhang Zheng on Mar 11, 2022
```
* first commit

* fix

* fix

* fix

* fix

* fix

* fix xpu and npu

* fix
```
  8cabb9f3
- [phi] Move erf op to phi (#40388) · 42ddee4e
  Committed by wuyefeilin on Mar 11, 2022
```
* mv erf op to phi

* fix as review

* fix as review

* fix format
```
  42ddee4e

Crayon鑫 / Paddle is in sync with its fork source project