Commits · c47ae621c31aa94001c4d1d8e55ca4230aa4a25f · BaiXuePrincess / Paddle

04 Mar, 2022 · 2 commits

add eager test in rnn and fc; test=develop (#40149) · c47ae621
Committed by hong on Mar 04, 2022

Committed by hong on Mar 04, 2022

* move conv to pten

* move conv to pten; test=develop

* fix bug;

* add conv cudnn impl; test=develop

* update

* update operator; test=develop

* fix bug; test=develop

* move operator and prepared_operator to develop; test=develop

* resolve conflict; test=develop

* remove useless code;test=develop

* add depency ; test=develop

* fix bug;

* add sig.cc ; test=develop

* fix use_op error; test=develop

* fix bug; test=develop

* fix bug; test=develop

* add conv3d register; test=develop

* fix star gan and conv_nn_grad test failed; test=develop

* add header; test=develop

* manul to recover to develop;

* resolve confilct; test=develop

* remove useless code

* fix bug;

* remove conv2d_cudnn; test=develop

* fix bugs; test=develop

* fix cpu rocm compile bugs; test=develop

* fix blas error; test=develop

* fix compile bug; test=develop

* fix windows compile error; test=develop

* fix windows error; test=develop

* resolve confilct; test=develop

d50fb43e

03 Mar, 2022 · 10 commits

reduce size of max_input_shape so that the ut can pass on win6 (#40088) · 831b69d9
Committed by Sing_chan on Mar 03, 2022

EmbEltwiseLayernorm fix (#40015) · c3f3643b
Committed by wenbin on Mar 03, 2022
```
* emb fix

* fix trt6 compile

* fix half

* absolute error fix
```
Add support of int16 for gather op. (#40052) · 3e56e816
Committed by Li Min on Mar 03, 2022

* add support of int16 for gather op.

* Recover formats.

* Recover formats.

* fix.

* Fix format.

* Fix format.

add communication api for ProcessGroupNCCL (#40097) · b565b349
Committed by lilong12 on Mar 03, 2022

[PHI] Code auto-generate for Sparse API (#40060) · 31d3d857
Committed by zyfncg on Mar 03, 2022

* suppport sparse api in yaml

* support auto-gen code of sparse api

* do some refactor

* add unittest test_sparse_conv_api

* add unitest file
Co-authored-by: zkh2016 <zhangkaihuo@baidu.com>

change_ASP_sharding_option (#40028) · 815f7a67
Committed by Baibaifan on Mar 03, 2022

Support slim eager (#39874) · da47544c
Committed by Jiabin Yang on Mar 03, 2022

* eager, test=develop

* fix bug, test=develop

* eager, test=develop

* merge legacy to fluid

* eager, test=develop

* eager, test=develop

* Refactor TensorAdd func by template and remove gradient_accumulation in eager

* Remove needless target name

* eager, test=develop

* eager, test=develop

* Use overload instead of template

* Remove legacy code

* Remove legacy code

* selectedrows, test=develop

* Remove DataType test

* eager, test=develop

* eager, test=develop

* support gan, test=develop

* Using Tensor directly instead of using EagerTensor

* support gradient_accumulation

* make test_imperative_lod_tensor_to_selected_rows longer

* make test_imperative_lod_tensor_to_selected_rows longer

* refine code

* ptb, test=develop

* Rename all EagerTensor to Tensor

* Rename some EagerTensor to Tensor

* rename EagerTensor to EagerVariable

* eager, test=develop

* eager, test=develop

* eager, test=develop

* eager, test=develop

* add more test

* eager, test=develop

* Support copiable selected rows and merge develop

* save load, eager, test=develop

* save load, eager, test=develop

* refine, test=develop

* remove useless _set_value method

* refine, test=develop

* refine, test=develop

* revert static_runner, test=develop

* EagerTensor to Tensor, test=develop

* refine, test=develop

* refine, test=develop

* clear grad, test=develop

* merge, develop

* merge, develop

* merge, test=develop

* merge, test=develop

* Support quant and part of slice

* support legacy static save

* extend slim tests time

* remove imperative on inference

* remove imperative on inference

* merge develop

* fix typo

* fix typo

* split slice related code into 2 part for imperative and eager

* split slice from inference

* split slice from inference

* fix test_tensor_register_hook
Co-authored-by: Wang Huan <wanghuan29@baidu.com>
Co-authored-by: Weilong Wu <veyron_wu@163.com>
Co-authored-by: wanghuancoder <wanghuancoder@163.com>

adjust the args checking of backward in yaml (#40091) · d9884e20
Committed by zyfncg on Mar 03, 2022

Move bn to pten (#39347) · ebd0f512
Committed by hong on Mar 03, 2022

* add bn cpu version; test=develop

* move batch norm to pten

* move batch norm to pten; test=develop

* fix bug; test=develop

* fix func::tranpose depend bug; test=develop

* fix compile bugs; test=develop

* fix use_op batch_norm bug; test=develop

* fix cudnn bn add relu test; test=develop

* fix pten context build and double grad bug; test= develop

* remve useless code; test=develop

* add batch norm gpu fp16 support; test=develop

* fix test bn op bug; test=develop

* remove output dtype set; test=develop

* fix bug; test=develop

* fix bug; test=develop

* fix applay pass to program bug; test=develop

* revert to develop; test=develop

* fix rocm bug; test=develop

* revert operator to develop; test=develop

* fix pre_commit; test=develop

* fix statci check error; test=develop

* resolve conflict; test=develop

* ana batch norm bug;

* revert batch norm op

* resolve conlict

* fix nan inf and speed bug; test=develop

* fix bug; test=develop

* fix error; test=develop

* test expand op; test=develop

* fix bug; test=develop

* resolve confilct

* resolve confilct; test=develop

* polish code; test=develop

* polish code; test=develop

* change mutable data to ctx alloc; test=develop

* make format same with ci; test=develop

* fix format error with ci; test=develop

Add the implementation of Gloo for ProcessGroup (#39892) · c16f85f9
Committed by lilong12 on Mar 03, 2022
```
* add pg_gloo
```

02 Mar, 2022 · 17 commits

[MLU] add mlu ci script (#39805) · a8e02ef1
Committed by fwenguang on Mar 02, 2022
```
* [MLU] add mlu ci script

* Update CMakeLists.txt
```
add check for backward hook (#40041) · 1980e33a
Committed by Leo Chen on Mar 02, 2022
```
* add check for backward hook

* refine ut
```
Move transpose to pten (#39327) · 7a857924
Committed by hong on Mar 02, 2022

* immigrate_transpose_to_pten cpu kernel only; test=develop

* fix bug; test=develop

* add transpose cuda api

* bug fix;

* fix bugs

* fix bugs; test=develop

* bug fix;

* move transepose to pten; test=develop

* fix bug; test=develop

* fix bugs; test=develop

* add transpose grad fp16 support; test=develop

* fix bug; test=develop

* fix npu bug; test=develop

* fix nemul = 0 bug; test=develop

* add fp16 support; test=develop

* fix data type register bug; test=develop

* fix transpose bug; test=develop

* update transpose

* fix transpose bug; test=develop

* remove useless code; test=develop

* remove useless code; test=develop

* fix transpose alias bug; test=develop

* polish code; test=develop

* resolve confict; test=develop

* resolve confilct; test=develop

* recover prepared operator; test=develop

* fix bug; test=develop

* polish code; test=develop

* fix bug; test=develop

* fix bug; test=develop

run recompute's real backward with amp disabled (#40042) · 28795771
Committed by Leo Chen on Mar 02, 2022

new fleet_desc builder (#39948) · 1c4e3e5d
Committed by ziyoujiyi on Mar 02, 2022

* delete gloo connect retry

* the_one_ps dirs reconstruct

* .

* .

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* the one ps dirs modify

* the one ps dirs modify

* the one ps dirs modify

* the one ps dirs modify

* refactor ps optimize

* refactor ps optimize

* refactor ps optimize

* .

* .

* .

* .

* .

* .

* refactor theoneps

* the_one_ps

* add ps pass unittest

* add ps pass unittest

* ps unitest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* add cpu_async_ps_mode test

* add cpu_async_ps_mode test

* add cpu_async_ps_mode test

* ps unittest ready

* ps unittest ready

* solve dist_pass init conflict

* solve import CommContext error

* unittest ok

* implement AllocateFrom

* solve setup.py.in conflict

* solve conflict

* solve conflict

* solve conflict

* .

* .

* cpu-async-ps minimize test ok & gpu minimize test ok

* add heter 2stage unittest

* add heter 2stage unittest

* add heter 2stage unittest

* sync/geo test ok & fix heter_worker program ok

* .

* new fleet desc generator

* new fleet_desc builder

* new fleet_desc builder

* .

* .

* correct ps.proto compile

* .
Co-authored-by: zkh2016 <zhangkaihuo@baidu.com>

[bf16] add bf16 kernel: softmax & log_softmax (#39999) · 4a4215ff
Committed by zhangbo9674 on Mar 02, 2022
```
* add softmax log_softmax

* refine rocm

* refine unittest
```
[Auto Parallel] Adapt Partitioner & DistOp for ERNIE3.0 Inference and cache (#39895) · c9cd47d9
Committed by JZ-LIANG on Mar 02, 2022
```
* adapot dist op

* add dist_fill_constant_batch_size_like

* remvoe print

* update compitable

* add unitest
```
[IPU] update ipu unittests p0 (#39707) · 1db188f3
Committed by Allen Guo on Mar 02, 2022

* update ipu UTs part0

* rename UT

* sync api changes

* update uts for new api

* use_ipumodel() as classmethod

add logic kernel for mlu (#39940) · bc113e10
Committed by joeqiao12 on Mar 02, 2022

[fleet_executor] Add entrance of FleetExecutor in AnalysisPredictor for distributed inference (#39992) · 244ae318
Committed by Yuang Liu on Mar 02, 2022

[MLU] adapt matmul op (#39727) · b4d931e8
Committed by qipengh on Mar 02, 2022
```
* [MLU] adapt matmul op

* [MLU] fix phi namespace
```
[MLU] add transpose2 mlu kernel (#39994) · 4cab812e
Committed by fwenguang on Mar 02, 2022

add_new_comm_primitive (#40040) · 4e00d2bb
Committed by Baibaifan on Mar 02, 2022

fix unittests for eignvalsh (#39841) · aa47297a
Committed by lkylkylky on Mar 02, 2022

optimize CUDA implementaion of randint OP (#39952) · fb635089
Committed by zhouweiwei2014 on Mar 02, 2022
```
* change CUDA implementaion of randint OP,move distribution common func to phi

* fix CI

* fix CI
```
[Eager] open eager when WITH_PYTHON (#39979) · 9af72957
Committed by wanghuancoder on Mar 02, 2022

* open eager when WITH_PYTHON, test=develop

* refine, test=develop

* refine, test=develop

* add DWITH_PYTHON for gen_fluid_lib, test=develop

[Eager] Support gnn ptb_rnn in eager mode (#39993) · dbcf8797
Committed by Weilong Wu on Mar 02, 2022


01 Mar, 2022 · 11 commits
- fix bug of paddle.to_tensor and paddle.moveaxis (#39662) · 4617c1b2
  Committed by zhouweiwei2014 on Mar 01, 2022
```
* fix bug of paddle.to_tensor and paddle.moveaxis

* fix CI
```
- fix compiling and running with ipu (#39920) · 69ab2700
  Committed by Allen Guo on Mar 01, 2022
  
- [Phi]rm reduce infershape (#39820) · 09039636
  Committed by chentianyu03 on Mar 01, 2022
```
* modify infershape utils and rm reduce infershape

* merge develop

* fix infermete bug

* add IsForInferShape func in ArgumentMappingContext

* add reduce_mean infermeta

* modify annotation

* add default dims
```
- Add mobilenetv3_large performance test for bf16 and int8 (#39738) · eb7c211a
  Committed by joanna.wozna.intel on Mar 01, 2022
```
* Add mobilenetv3_large performance test

* Disable the BF16 test if the device does not support BF16 computations

* Change test timeout
```
- [bf16] add bf16 kernel: layer_norm p_norm reduce_sum (#39843) · ce8ed978
  Committed by zhangbo9674 on Mar 01, 2022
```
* add layer norm

* add p norm

* add reduce sum

* refine layer norm register bf16 for cudnn811

* add bf16 cast for hip

* add unittest

* refine rocm

* refine layer_norm unittest

* refine reduce op

* refine unittest

* enhance atol for reduce unittest
```
- remove conv_affine_channel_fuse_pass (#39817) · fc06be9d
  Committed by wenbin on Mar 01, 2022
```
* remove

* pass

* more pass
```
- add test_warpctc_op in mac (#39983) · 25650774
  Committed by zhangchunle on Mar 01, 2022
  
- [bf16] add bf16 kernel: scale gather sum (#39683) · 6d26b332
  Committed by zhangbo9674 on Mar 01, 2022
```
* add scale gather sum

* refine CUDA_ATOMIC_WRAPPER ADD for bf16

* add gather unittest

* solve conflict

* add scale uinttest

* add sum unittest

* solve conflict

* refine gather unittest

* refine unittest
```
- update error_string when target is out of bound (#40001) · a7acfc5b
  Committed by HydrogenSulfate on Mar 01, 2022
  
- [PHI] Support Multi Input and Output for InferShape (#39870) · e8d45583
  Committed by zyfncg on Mar 01, 2022
```
* add multi input for infer_shape

* support multi output for infershape

* fix split bug

* fix bug of concat

* support vector<MetaTensor*> in infrt

* fix bug
```
- [Phi] Migrate logical_and/or/not/xor into Phi (#39942) · 8c237973
  Committed by Aurelius84 on Mar 01, 2022
```
* [Phi] Migrate logical_and/or/not/xor into Phi

* fix unittest

* fix function name
```

BaiXuePrincess / Paddle · in sync with the fork's source project