Commits · 753964a2d69dc0515b53c5ffdcfa711a474fa028 · 机器未来 / Paddle

24 Mar, 2022 · 4 commits

Correct MultipleQuantizeSquash (#40717) · 753964a2
Committed by joanna.wozna.intel on Mar 24, 2022
```
* Correct MultipleQuantizeSquash

* Correct logging
```

Refine events waiter (#40876) · 36ee6dd3

Committed by liutiexing on Mar 24, 2022

* add align for WorkQueue

* add spinlock

* merge develop

* merge

* Add EventsWaiter

* Add EventsWaiter

* update

* Revert "Add EventsWaiter"

This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.

* update

* update Error MSG

* update EventsWaiter

* update
Co-authored-by: liutiexing <liutiexing@google.com>


test gpu graph engine's performance (#40775) · 83ae1619

Committed by seemingwang on Mar 24, 2022

* extract sub-graph

* graph-engine merging

* fix

* fix

* fix heter-ps config

* test performance

* test performance

* test performance

* test

* test

* update bfs

* change cmake


[Phi] Move mul op kernel into phi (#40833) · 1b491818

Committed by Chen Weihang on Mar 24, 2022

* add mul phi kernel

* remove mul op kernel

* remove original mul grad op

* fix cinn test

* fix dygraph test failed


23 Mar, 2022 · 6 commits

[new-exec] gc skip var that is not tensor, selectedrows, tensorarray (#40859) · 521cded2
Committed by Leo Chen on Mar 23, 2022

two-phase training for ps (#40762) · b1a4668c

Committed by zhaocaibei123 on Mar 23, 2022

* fix benchmark and communicator config

* fix bugs of the_one_ps

* multi program and fix bug in optimizer

* multi program in the_one_ps

* public commcontext

* ps optimizer multi programs

* cvm & datanorm backend

* fix dim

* fix unittest

* fix

* the one ps merge

* remove comm

* add DownpourLiteWorker

* all

* fix

* fix

* device worker downpour lite

* fix

* fix bug in global shuffle

* save inference model

* fix & add log

* fix

* remove log

* fix

* fix save summary

* fix

* fix pscore

* fix

* fix

* fix

* fix

* fix

* remove logs

* fix

* fix

* fix

* fix

* fix

* add some comments

* fix
Co-authored-by: esythan <esythan@126.com>


AddAwaitableTask (#40770) · 323d55a7

Committed by liutiexing on Mar 23, 2022


* AddAwaitableTask for WorkQueue
Co-authored-by: liutiexing <liutiexing@google.com>


[Phi]Remove InferShape and Kernel of flatten_contiguous_range op (#40638) · 778008d7

Committed by YuanRisheng on Mar 23, 2022

* remove flatten infermeta

* fix bugs when run inference ci

* fix bugs when run inference ci

* fix bugs when run ci

* support infrt

* inplace infershape code


[Phi]Move log/log2/log10/log1p Kernels to Phi (#40785) · 13c99434
Committed by YuanRisheng on Mar 23, 2022
```
* move activation

* fix bugs when run ce
```
Removed redundant use of declarations.h (#40703) · 2a1b4c07
Committed by Zhanlue Yang on Mar 23, 2022
```
* Removed redundant use of declarations.h

* Fixed minor bug
```

22 Mar, 2022 · 1 commit

[new-exec] async prepare deps (#40713) · 814f7211

Committed by Leo Chen on Mar 22, 2022

* async prepare deps

* fix bug that std::future is not set

* add ut

* refine code

* fix standalone ut

* disable prof


21 Mar, 2022 · 3 commits

Move conv-transpose OPs to phi (#40675) · 1eb96eec
Committed by From00 on Mar 21, 2022
```
* Move conv-transpose OPs to phi

* Fix CI errors

* Fix CI errors
```

[GpuPs] Update graph sampling method (#40085) · 78fad09b

Committed by Siming Dai on Mar 21, 2022

* gpu ps graph engine

* remove logs

* Add neighbor sampling method

* Add actual_sample_size and offset for sampling

* Delete Chinese comment

* Fix code style
Co-authored-by: seemingwang <zsasuke@qq.com>


[IPU] update ipu_backend (#40685) · d67fe921

Committed by Allen Guo on Mar 21, 2022

* sync changes

* copy sOpNamescope

* fix UTs

* add authors
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Zhaorui Chen <zhaoruic@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

* fix code-format

* fix compile error

* add comments for feed_op
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Zhaorui Chen <zhaoruic@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>


18 Mar, 2022 · 3 commits

[Phi] Migrate gelu/log_softmax/prelu op kernel and infershape (#40393) · aed6faf2

Committed by shentanyue on Mar 18, 2022

* add gelu

* fix gelu

* add log_softmax

* add prelu kernel and prelu/gelu/logsoftmax infershape

* fix

* fix

* fix

* fix

* fix ci

* log_softmax rewrite

* fix

* fix

* fix conflict

* fix compile error

* fix comment

* fix

* ci_fix
Co-authored-by: Yan Li <liyan665@gmail.com>


[Phi]Move hierarchical_sigmoid kernel to phi (#40553) · 64a7cbd3

Committed by Zhang Zheng on Mar 18, 2022

* first commit

* fix compile error

* support std::vector<std::string>

* fix

* fix op support on GPU by chenweihang

* pass test

* infershape

* add set_dtype

* fix order

* fix

* unify the impl of dt and sr

* fix


[Phi] move reduce_grad kernel into phi (#40522) · 70726696

Committed by chentianyu03 on Mar 18, 2022

* move reduce_mean_grad kernel into phi

* move reduce_max/min_grad into phi

* remove raw max/min grad kernel

* fix bug

* fix max/min grad error

* move all reduce_grad kernel into one file

* add prod grad kernel

* add infermeta for prod kernel


17 Mar, 2022 · 6 commits
- [Phi] Move assign kernel into phi (#40022) · 1904572a
  Committed by Chen Weihang on Mar 17, 2022
```
* move assign kernel init commit

* change vec<tensor> to vec<tensor*>

* support tensor array

* support api declare

* fix test_list failed

* fix npu and xpu failed

* fix infrt failed

* remove assign array size in operator

* move assign sr header into sr dir

* add infermeta for assign

* test op success

* fix test_list failed

* fix kunlun failed

* add set host allocator in tests

* support tensor array in arg ctx

* open set layout in share_meta

* fix meta tensor layout error

* fix test failed
```
- merge cpu and gpu graph engines (#40597) · 31776199
  Committed by seemingwang on Mar 17, 2022
```
* extract sub-graph

* graph-engine merging

* fix

* fix

* fix heter-ps config
```
- fix double-free bug in variables of cinn subgraph (#40609) · 7dad9f70
  Committed by TeFeng Chen on Mar 17, 2022
- move infershape of set_value to phi (#40636) · c335288d
  Committed by zyfncg on Mar 17, 2022
- move activation sigmoid (#40626) · ed8a9370
  Committed by YuanRisheng on Mar 17, 2022
- support gpu mixed precision inference (#40531) · 06fee998
  Committed by baoachun on Mar 17, 2022

16 Mar, 2022 · 5 commits

Quantize elementwise mul (#40546) · 2def79bc

Committed by Zuza on Mar 16, 2022

* Quantize elementwise mul op

* Parametrize elementwise functions

* Fix code formatting


Add tensor desc size check (#40518) · 849bfbbf
Committed by zlsh80826 on Mar 16, 2022

[KP]fix bug that cannot fallback to CPU normally in XPU KP (#40576) · 603f8425
Committed by Liu-xiandong on Mar 16, 2022
```
* [kp]fix bug that cannot fallback to CPU normally in XPU KP

* fix bug in static graph
```

[MLU] support amp O1 of mlu (#40461) · ad81f22c
Committed by qipengh on Mar 16, 2022


[Auto Parallel] Add the support for the auto completion of while_op (#39939) · ec6b8fbd

Committed by Yulong Ao on Mar 16, 2022

* [Auto Parallel] Support the auto completion of while_op

* [Auto Parallel] Improve the completion algorithms

* [Auto Parallel] Fix bugs for ernie inference

* [Auto Parallel] Remove attrs which cannot be pickled

* [Auto Parallel] make the dims_mappings of LodTensorArray vars empty

* [Auto Parallel] Fix bugs for the ernie inference in the pipeline parallel

* [Auto Parallel] Remove unnecessary comments

* [Auto Parallel] Fix a bug of the CMakeLists

* [Auto Parallel] Use the newest APIs to write the unit test

* [Auto Parallel] Remove unnecessary statements


15 Mar, 2022 · 4 commits

oneDNN NHWC fixes (#40049) · dde9cec0

Committed by Jacek Czaja on Mar 15, 2022

* - Prototype of third solution

- fix

- compilation fixes

- fix

- fix

- fix

- fix

- compilation fix

- comment fix

- lint

update mkldnn conv_elementwise_add_fuse_pass ut

- NHWC changes to prelu

- alpha dims

- UT fix

- fix to UT

- lint

- Some fixes

- added to BWD of prelu NHWC support

- reverted removal of resetting cu_layout in clearing of caching

* - Small changes

* - compilation fix

* - fix

* - fix

* lint

* - fixes after internal review

* - compilation fix

* - lint


add shard_id (#40261) · 6b7d4845
Committed by Thunderbrook on Mar 15, 2022
```
* shard_id

* format
```

Move one hot to phi (#39876) · 7701db37

Committed by hong on Mar 15, 2022

* move one hot to phi; test=develop

* fix bugs; test=develop

* fix bugs; test=develop

* add infer meta; test=develop

* fix bugs; test=develop

* resolve conflict

* resolve conflict

* fix bug;

* fix error; test=develop

* update; test=develop

* polish code; test=develop

* add one api in eager mode; test=develop

* add one hot test; test=develop

* remove use less code; test=develop

* fix bug; test=develop

* polish code; test=develop

* polish code; test=develop


[Phi]Move Tanh/BRelu/LeakyRelu/ThresholdedRelu Kernels to Phi (#40385) · d7112180

Committed by YuanRisheng on Mar 15, 2022

* move activation op

* adjust code format

* fix compile bugs

* fix ci bugs

* code format adjust

* code format adjust2

* activate ci status

* modify according to comment

* move activation kernel

* revert relu6

* reduce add code

* perfect use_phi_functor

* completing func name

* fix bugs when run ci

* fix bugs when run infr

* modify infrt get kernel signature


14 Mar, 2022 · 3 commits

Add an elementwise + activation fusion pass. (#36541) · 3f219160

Committed by Tomasz Socha on Mar 14, 2022

* Add elementwise add and activation fuse pass

* Fix copy elision

* More flexible pattern detector

* More flexible fusion pass

* Update lists for pass

* Add support for Pow operator

* Add support for more activation types

* Style

* Rename fusion pass

* First version of tests

* Dirty version of pass

* Polished version

* Update pbtxt

* Style

* Update names

* Style

* Use PADDLE_ENFORCE_EQ

* Save error message to variable

* WO for error checks

* CR

* Static style check

* Add missing 'activation_scale' attribute

* Add relu6 and sigmoid activations

* Style

* Fix fuse list formatting

* Sync filenames for fuse pass files

* Fix cmake after move

* Fix registration

* Fix pass name in tests

* Add missing activations to checker

* WIPS

* Working mul op

* Working sub

* Working Add

* Remove pten includes

* Remove some forward declarations

* Remove Includes

* Fixes

* Remove default kernels

* Add check if post_ops attributes are available

* Style

* Code adjustment

* Register default kernels

* We have year 2022 not 2021...
Co-authored-by: jakpiase <jakpia21@gmail.com>
Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>

* Fast review fixes
Co-authored-by: jakpiase <jakpia21@gmail.com>
Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>

* Review Fix

* Rename one_dnn -> onednn

* Style after review

* Fast and dirty fix for quantization

* Update tests

* Style

* Fix mkldnn_quantizer config

* Add Joanna's suggestion.

* Check if operator is explicitly disabled on OneDNN

* Try to use unregistered attributes

* Style

* Test new framework

* FXI

* FXII

* Update test

* Style
Co-authored-by: jakpiase <jakpia21@gmail.com>
Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>


Support custom op and paddle.autograd.bacward in eager (#40423) · 227fa408

Committed by Jiabin Yang on Mar 14, 2022

* eager, test=develop

* fix bug, test=develop

* eager, test=develop

* merge legacy to fluid

* eager, test=develop

* eager, test=develop

* Refactor TensorAdd func by template and remove gradient_accumulation in eager

* Remove needless target name

* eager, test=develop

* eager, test=develop

* Use overload instead of template

* Remove legacy code

* Remove legacy code

* selectedrows, test=develop

* Remove DataType test

* eager, test=develop

* eager, test=develop

* support gan, test=develop

* Using Tensor directly instead of using EagerTensor

* support gradient_accumulation

* make test_imperative_lod_tensor_to_selected_rows longer

* make test_imperative_lod_tensor_to_selected_rows longer

* refine code

* ptb, test=develop

* Rename all EagerTensor to Tensor

* Rename some EagerTensor to Tensor

* rename EagerTensor to EagerVariable

* eager, test=develop

* eager, test=develop

* eager, test=develop

* eager, test=develop

* add more test

* eager, test=develop

* Support copiable selected rows and merge develop

* save load, eager, test=develop

* save load, eager, test=develop

* refine, test=develop

* remove useless _set_value method

* refine, test=develop

* refine, test=develop

* revert static_runner, test=develop

* EagerTensor to Tensor, test=develop

* refine, test=develop

* refine, test=develop

* clear grad, test=develop

* merge, develop

* merge, develop

* merge, test=develop

* merge, test=develop

* Support quant and part of slice

* support legacy static save

* extend slim tests time

* remove imperative on inference

* remove imperative on inference

* merge develop

* fix typo

* fix typo

* split slice related code into 2 part for imperative and eager

* split slice from inference

* split slice from inference

* fix test_tensor_register_hook

* support custom op in eager mode

* fix inference deps error

* split eager utils from custom operator

* fix type match

* fix typo
Co-authored-by: Wang Huan <wanghuan29@baidu.com>
Co-authored-by: Weilong Wu <veyron_wu@163.com>
Co-authored-by: wanghuancoder <wanghuancoder@163.com>


Move Pool OPs to phi (#40208) · 88ec08a7
Committed by From00 on Mar 14, 2022
```
* Move Pool OPs to phi

* Fix CI error

* Fix conflicts
```

13 Mar, 2022 · 1 commit
- polish several details (#40485) · 1b0cecb7
  Committed by Chen Weihang on Mar 13, 2022

12 Mar, 2022 · 1 commit
- fix NetBuilder API Name bug in cinn_lib_test (#40392) · 69a01c47
  Committed by jiangcheng on Mar 12, 2022
```
* fix NetBuilder API Name bug in cinn_lib_test

* update cinn version to newest
```
11 Mar, 2022 · 3 commits

refactor conv+relementwise_add (residual) (#40005) · 47459e98
Committed by Sylwester Fraczek on Mar 11, 2022

[Phi] Remove needless deps in unittests (#40256) · 89ed57e2

Committed by Chen Weihang on Mar 11, 2022

* remove needless deps in unittests

* add gpu macro

* fix other unittests

* fix kernel name error

* fix test_prepare_op

* fix failed dygraph unittests

* fix gpu failed tests

* fix cinn test failed

* fix cinn test failed

* fix dropout tests


[Phi] Reduce grad (#40263) · f452ad5c

Committed by chentianyu03 on Mar 11, 2022

* add reduce_sum grad kernel

* add reduce_grad

* modify reduce grad

* update reduce grad functions

* fix build error

* add argument mapping

* move cast input after grad

* add dims.size=1 cpu reduce_sum grad compute method

* update reduce grad GPU

* remove raw reduce_sum_grad kernel

* modify header files

* add namespace funcs for reduce_grad_functions


机器未来 / Paddle is up to date with the fork source project
