Commits · 5e839e4da584d073c065a60c39db4f81b16df110 · 机器未来 / Paddle

12 Jan, 2021 · 2 commits

add sparse embedding & load vars for 2.0 & gloo bug fix (#30306) · 5e839e4d

Committed by tangwei12 on Jan 12, 2021

* add sparse embedding & load vars for 2.0

Change-Id: I36b59ed5f015189dc9d9d2e34a9357722d369f1b

* fix hdfs gloo

Change-Id: Ia84d579053720ad804183e54c9a04b4f031c79c6

* fix gloo hdfs

Change-Id: I5ab982fd483cddc10adcdef0b8aa83aca976cb9e

* move loadvar/sparse embedding from incubate to static

Change-Id: I57081d3545ad2efab78c72420d2162c0eacaf3a0

Fix/distributed proto (#29981) · 25f80fd3

Committed by tangwei12 on Jan 12, 2021

* rename sendrecv.proto to namespace paddle.distributed

* split ps with distributed

11 Jan, 2021 · 2 commits
- Support vector<double> as type of op attribute and op set_value support... · b4989fb7
  Committed by liym27 on Jan 11, 2021
```
Support vector<double> as type of op attribute and op set_value support vector<double> as value (#30126)
```
- fix header file paths of gflags, commit 1, test=develop (#30271) · 8ce2482b
  Committed by 石晓伟 on Jan 11, 2021
10 Jan, 2021 · 1 commit
- reduce the occupied size of memory for the fused pattern of elementwise_add... · af80859d
  Committed by wangchaochaohu on Jan 10, 2021
```
reduce the occupied size of memory for the fused pattern of elementwise_add Op and activation Op (relu Op for example) (#29885)
```
08 Jan, 2021 · 4 commits

Support pure fp16 training for AMP API. (#29544) · 7f7dfccf

Committed by Zhen Wang on Jan 08, 2021

* add cast ops before and after unsupported fp16 ops.

* Keep partial net in FP32 pattern.

* Support check_finite_and_unscale and update_loss_scaling for FP16 calculation mode.

* Add fp16 support for adam op.

* add multi precision attr for adam.

* Fix the bug of test_multi_precision_fp16_train UT.

* Code format for CI.

* Fix the redefine error about MPTypeTrait on windows.

* fix bugs of the _create_accumulators func in Momentum.

* fix bug when inserting post cast op.

* Add the update_loss_scaling op in allow_set of UnusedVarCheck.

* Update for ci coverage.

* Add some doc for OptimizerWithMixedPrecision.

* Fix the code style.

* Improve the doc of `amp_init`.

* Change for fp16 testing if users have the infer program defined in separate way.

use cuda generator in bernoulli cuda kernel (#30199) · 789743e1
Committed by Leo Chen on Jan 08, 2021

Add callback after TensorCopy (#30123) · 1f97d61c

Committed by Leo Chen on Jan 08, 2021

* change to tensor copy sync

* change to tensor copy sync

* make copy_to safe when use TensorCopy

* refine code

* add ut

* add cudapinned garbagecollector

* add testcase: cpu place -> cuda pinned place

【Paddle.Fleet】Fix tensor table (#30075) · 528e03fc
Committed by Chengmo on Jan 08, 2021
```
* add tensor table
```

07 Jan, 2021 · 3 commits
- Refine PADDLE_ENFORCE Error Messages. test=develop (#30149) · 54bf3f5a
  Committed by Huihuang Zheng on Jan 07, 2021
```
Improve some error messages in parallel_executor.cc, conditional_block_op.cc, recurrent_op.cc
```
- [Complex] Simplify prepared op impl to improve performance (#30153) · d0fb06b2
  Committed by Chen Weihang on Jan 07, 2021
```
* simplify prepared op impl to improve performance

* fix kunlun compile error

* continue fix kunlun compile error

* only transform diff place when dtype diff

* fix failed unittests

* remove useless file

* polish impl by review comment
```
- fix assign_op_xpu concat_op_xpu warning (#30120) · 15fac5e7
  Committed by liuyuhui on Jan 07, 2021
06 Jan, 2021 · 1 commit
- fix a bug in op_version_registry, test=develop, test=op_version (#29994) · 53bb1265
  Committed by 石晓伟 on Jan 06, 2021
05 Jan, 2021 · 2 commits
- fix xpu pe sync, test=notest (#30095) · 254ad619
  Committed by liuyuhui on Jan 05, 2021
- add topo-aware in heter-ps (#30087) · 0b8e1fad
  Committed by Thunderbrook on Jan 05, 2021
```
* add topo aware

* resource.h

* topo aware

* format
```
04 Jan, 2021 · 2 commits
- Optimization grad merge performance (#29784) · ee16006b
  Committed by WangXi on Jan 04, 2021
- fix op version checker of pass bug (#30028) · 08dc5bc2
  Committed by Shang Zhizhou on Jan 04, 2021
```
* fix op version checker of pass bug

* fix code style

* update pass version
```
31 Dec, 2020 · 1 commit

Add mkldnn nearest_interp and bilinear_interp op (#30016) · c3c064a8

Committed by cc on Dec 31, 2020

* Add mkldnn nearest_interp and bilinear_interp op
* don't run mkldnn interpolate in default
* add interpolate_mkldnn_pass

30 Dec, 2020 · 3 commits
- add the support the op version check for matmul, test=op_version (#30011) · cc2f9462
  Committed by wawltor on Dec 30, 2020
```
* add the support the op version check for matmul, test=op_version
```
- add the op version check for the elementwise ops, test=op_version (#30010) · b33aaea8
  Committed by wawltor on Dec 30, 2020
```
* add the op version check for the elementwise ops, test=op_version

* add the support check for elementwise_ops, test=op_version
```
- Enhance debugging (#30001) · 47d10c55
  Committed by Leo Chen on Dec 30, 2020
```
* add debug code

* add place info

* fix compile problem

* add place for output
```
29 Dec, 2020 · 3 commits
- change the elementwise ops version check, test=op_version · 8f49f9d5
  Committed by wawltor on Dec 29, 2020
```
change the elementwise ops version check, test=op_version
```
- add include (#29952) · 0ca6de17
  Committed by Thunderbrook on Dec 29, 2020
- map matmul/squeeze2+matmul/reshape2+matmul to mul (#29911) · 6a0102b0
  Committed by cc on Dec 29, 2020
```
* map matmul/squeeze2+matmul/reshape2+matmul to mul
```
28 Dec, 2020 · 4 commits
- add gru op_register_version; test=op_version; (#29931) · 5a4e42ca
  Committed by Jack Zhou on Dec 28, 2020
```
* add gru op_register_version; test=op_version;

* Update fc,mul version;test=op_version;
```
- [Inference] Solve 2.0 trt performance reduce compare 1.8. (#29925) · 2b1d796c
  Committed by Wilber on Dec 28, 2020
- flush denormals to zero, test=develop (#29924) · 181ea187
  Committed by 石晓伟 on Dec 28, 2020
```
* flush denormals to zero, test=develop

* add comments, test=develop
```
- [Kunlun] bug fix of PR2: Support MultiDevicePass and BKCL in parallel executor (#29926) · 3d1741b7
  Committed by liuyuhui on Dec 28, 2020
27 Dec, 2020 · 1 commit

[Dynamic Inplace] Support ShareInplaceVersionCounterWith for C++ Tensor (#29842) · 9602a182

Committed by liym27 on Dec 27, 2020

* Revert "[inplace] Add ShareHolderWith for class Variable and SharePlaceholderWith in VarBase.detach() to share the same Tensor/SelectedRows (#29267)"

This reverts commit b10ecd9d.

* Support ShareInplaceVersionCounterWith to share the same inplace version counter for VarBase

26 Dec, 2020 · 1 commit
- [Kunlun] PR2: Support MultiDevicePass and BKCL in parallel executor (#29574) · 4427df37
  Committed by liuyuhui on Dec 26, 2020
25 Dec, 2020 · 4 commits
- remove duplicate ut names (#29809) · 2a01756b
  Committed by YUNSHEN XIE on Dec 25, 2020
- [Complex] Handle complex to real after type promotion (#29855) · a6072055
  Committed by Chen Weihang on Dec 25, 2020
```
* try to add fwd op input dtypes

* refactor base impl

* return tmp_ins after dygraph prepare data

* fix typo found in debug

* polish comment & add complex net test

* revert detail change

* fix unittest failed

* add complex kernel condition control

* fix xpu test failed & polish comment

* polish details by review comments
```
- fix TransferInplaceBack (#29830) · 6b258317
  Committed by Leo Chen on Dec 25, 2020
- feat: support check_nan_inf for kunlun/xpu device (#29694) · 59b47f3b
  Committed by QingshuChen on Dec 25, 2020
```
* feat: support check_nan_inf for kunlun device

* support kunlun stack

* minor
```
24 Dec, 2020 · 2 commits
- [Feature] one ps (3/4) (#29604) · 032414ca
  Committed by tangwei12 on Dec 24, 2020
```
* oneps (3/4)
Co-authored-by: MrChengmo <cmchengmo@163.com>
Co-authored-by: malin10 <malin10@baidu.com>
Co-authored-by: chengmo <chengmo@baidu.com>
```
- Added fc + activation fuse pass (currently only gelu, sigmoid and tanh are supported) (#29772) · edc06c6a
  Committed by jakpiase on Dec 24, 2020
23 Dec, 2020 · 2 commits

remove duplicate ut reload (#29810) · 24ce051a
Committed by YUNSHEN XIE on Dec 23, 2020
```
* remove duplicate ut reload

* remove duplicate ut define in cmakelist
```

heter box (#29734) · 09b6e719

Committed by Thunderbrook on Dec 23, 2020

* add heter box

* add trainer, worker, wrapper...

* format

* for ci

* format

* remove boost get

* boost & copyright

* rename

* rename

* format

* format

* format
Co-authored-by: yaoxuefeng6 <yaoxuefeng@baidu.com>

22 Dec, 2020 · 1 commit
- [oneDNN] Tensor copy fix to oneDNN tensors (#29771) · 7b33720c
  Committed by Jacek Czaja on Dec 22, 2020
```
* - Tensor copy fix to oneDNN tensors

* - Fixes after review
```
21 Dec, 2020 · 1 commit
- format code (#29714) · 224f3bcb
  Committed by Leo Chen on Dec 21, 2020
