Commits · 3d015f1cf529915ab52cb8aef7c475f67fb128b5 · BaiXuePrincess / Paddle

13 Jan, 2021 3 commits

Set expected place in child thread for dataloader to avoid costing cuda memory... · 3d015f1c

Committed by Leo Chen on Jan 13, 2021

Set expected place in child thread for dataloader to avoid costing cuda memory on other card (#30338)

* set expected place in child thread for dataloader

* set device id when set tensor from numpy

* revert tensor_py change

* add compile guard

* fix ci

* fix bug

3d015f1c

optimize memcpy perf for kunlun (#30291) · 2c1bba02
Committed by QingshuChen on Jan 13, 2021
```
* optimize memcpy perf for kunlun

* remove useless unitest for kunlun mean

* minor
```
2c1bba02

Support unused parameters in dynamic graph distributed (#30224) · a60f17b8
Committed by ShenLiang on Jan 13, 2021

a60f17b8

12 Jan, 2021 11 commits
- Recompute Offload (#30233) · 75936d83
  Committed by JZ-LIANG on Jan 12, 2021
  
  75936d83
- correct the allowed dimension size (#30326) · a60893f6
  Committed by lidanqing on Jan 12, 2021
  
  a60893f6
- remove c++ stacktrace hint (#30325) · c8c8f205
  Committed by Chen Weihang on Jan 12, 2021
  
  c8c8f205
- add sparse embedding & load vars for 2.0 & gloo bug fix (#30306) · 5e839e4d
  Committed by tangwei12 on Jan 12, 2021
```
* add sparse embedding & load vars for 2.0

Change-Id: I36b59ed5f015189dc9d9d2e34a9357722d369f1b

* fix hdfs gloo

Change-Id: Ia84d579053720ad804183e54c9a04b4f031c79c6

* fix gloo hdfs

Change-Id: I5ab982fd483cddc10adcdef0b8aa83aca976cb9e

* move loadvar/sparse embedding from incubute to static

Change-Id: I57081d3545ad2efab78c72420d2162c0eacaf3a0
```
  5e839e4d
- Fix/distributed proto (#29981) · 25f80fd3
  Committed by tangwei12 on Jan 12, 2021
```
* rename sendrecv.proto to namespace paddle.distributed

* split ps with distributed
```
  25f80fd3
- 【Paddle.Fleet】Support local save sparse param (#30175) · d479ae17
  Committed by Chengmo on Jan 12, 2021
```
* add save tensor support
Co-authored-by: NseiriosPlus <tangwei12@baidu.com>
```
  d479ae17
- fix elugradgrad test fail & error message opt (#30171) · 231501fe
  Committed by Double_V on Jan 12, 2021
```
* fix elugradgrad test fail and error message opt

* fix unitest,test=develop

* Update prroi_pool_op.h

fix error message

* opt message,test=develop

* fix ci fail,test=develop
```
  231501fe
- Fix the accuracy problem of allclose op when using float64 data type in static mode. (#29890) · fb49ea38
  Committed by Zhen Wang on Jan 12, 2021
```
* Fix the accuracy problem of allclose op when using float64 data type in static mode.

* Format the code style.
```
  fb49ea38
- fix datanorm error msg (#30294) · 4656525e
  Committed by yaoxuefeng on Jan 12, 2021
  
  4656525e
- add fp16 support for tril_triu op (#30186) · 77051cc9
  Committed by furnace on Jan 12, 2021
  
  77051cc9
- fix header file paths of gflags, commit 3, test=develop (#30273) · efa54629
  Committed by 石晓伟 on Jan 12, 2021
  
  efa54629
11 Jan, 2021 12 commits
- Fix server.h include device_context (#30243) · 5b2c15af
  Committed by Chengmo on Jan 11, 2021
```
* fix cmake
Co-authored-by: NseiriosPlus <tangwei12@baidu.com>
```
  5b2c15af
- enhance error msgs of fusion_seqpool_cvm_concat_op.cc, test=develop (#30240) · a0ee0914
  Committed by 石晓伟 on Jan 11, 2021
  
  a0ee0914
- fix header file paths of gflags, commit 4, test=develop (#30274) · a66eebab
  Committed by 石晓伟 on Jan 11, 2021
  
  a66eebab
- fix header file paths of gflags, commit 2, test=develop (#30272) · 8c4500ff
  Committed by 石晓伟 on Jan 11, 2021
  
  8c4500ff
- Support vector<double> as type of op attribute and op set_value suppport... · b4989fb7
  Committed by liym27 on Jan 11, 2021
```
Support vector<double> as type of op attribute and op set_value suppport vector<double> as value (#30126)
```
  b4989fb7
- register OPMaker and Infer Shape Check for fused_elementwise_add (#30259) · 8dcae0c5
  Committed by wangchaochaohu on Jan 11, 2021
  
  8dcae0c5
- Add tf32 switch for cuDNN (#29192) · 924aac22
  Committed by AshburnLee on Jan 11, 2021
  
  924aac22
- fix header file paths of gflags, commit 1, test=develop (#30271) · 8ce2482b
  Committed by 石晓伟 on Jan 11, 2021
  
  8ce2482b
- type promotion for grad (#30177) · c7371b7b
  Committed by chentianyu03 on Jan 11, 2021
```
* type promotion for grad

* add type promotion for div op
```
  c7371b7b
- Check the rank of input in kernel of set_value op (#30147) · 3ce878f3
  Committed by liym27 on Jan 11, 2021
  
  3ce878f3
- modify error message based on comments (#30189) · 66dc4ac7
  Committed by WeiXin on Jan 11, 2021
```
* modify error message based on comments

* edit code according to review.

* Correct spelling according to review.
```
  66dc4ac7
- just add the op error message for the matmul xpu (#30246) · fee42441
  Committed by wawltor on Jan 11, 2021
```
 add the op error message for the matmul xpu 
```
  fee42441
10 Jan, 2021 2 commits
- optimize softmax forward (#30217) · 0a21924a
  Committed by GaoWei8 on Jan 10, 2021
```
* optimize softmax forward
```
  0a21924a
- reduce the occupied size of memory for the fused pattern of elementwise_add... · af80859d
  Committed by wangchaochaohu on Jan 10, 2021
```
reduce the  occupied size  of memory for the fused pattern of elementwise_add Op and activation Op(relu Op for example) (#29885)
```
  af80859d
09 Jan, 2021 3 commits

enhance error message, test=develop (#30220) · 5932fee6
Committed by zhang wenhui on Jan 09, 2021

5932fee6

add View(reuse allocation) strategy on squeeze, unsqueeze, reshape, flatten op (#29913) · da16b33f

Committed by pangyoki on Jan 09, 2021

* add view strategy on squeeze,unsqueeze,reshape,flatten

* add squeeze unittest

* add unittests

* use View strategy as name rather than Reuse Allacation

* fix view api doc

* fix format

* use core.ops when input of reshape2 is Tensor

* fix test_cross_entropy_loss error because of reshape2

* delete selected_rows

* change op_function

* little change

* solve HandleViewBetweenInputAndOutput

da16b33f

[oneDNN] Added UT for testing elementwise_mul caching (#30203) · 4aba17b5
Committed by Jacek Czaja on Jan 09, 2021
```
* - Added UT for testing elementwise_mul caching

* lint fixes
```
4aba17b5

08 Jan, 2021 9 commits

Support pure fp16 training for AMP API. (#29544) · 7f7dfccf

Committed by Zhen Wang on Jan 08, 2021

* add cast ops before and after unsupported fp16 ops.

* Keep partial net in FP32 pattern.

* Support check_finite_and_unscale and update_loss_scaling for FP16 calculation mode.

* Add fp16 support for adam op.

* add multi precision attr for adam.

* Fix the bug of test_multi_precision_fp16_train UT.

* Code format for CI.

* Fix the redefine error about MPTypeTrait on windows.

* fix bugs of the _create_accumulators func in Momentum.

* fix bug when inserting post cast op.

* Add the update_loss_scaling op in allow_set of UnusedVarCheck.

* Update for ci coverage.

* Add some doc for OptimizerWithMixedPrecision.

* Fix the code style.

* Imporve the doc of `amp_init`.

* Change for fp16 testing if users have the infer program defined in separate way.

7f7dfccf

use cuda generator in bernoulli cuda kernel (#30199) · 789743e1
Committed by Leo Chen on Jan 08, 2021

789743e1

Fix dtype of ungenerated grad var (#28511) · 8696335f

Committed by Leo Chen on Jan 08, 2021

* fix dtype of ungenerated grad var

* update ut

* refine code

* set default dtype

* fix could_use_cudnn bug

* remove debug code

* re-implement

* fix bug

8696335f

shape op support int8 and uint8 tensor (#30201) · 609c0222
Committed by Wilber on Jan 08, 2021

609c0222

fix windows compile when WITH_PYTHON=ON and WITH_TENSORRT=ON (#30194) · 01a287bf
Committed by Wilber on Jan 08, 2021

01a287bf

Add version checking, test=op_version (#30129) · e42e1e80
Committed by ruri on Jan 08, 2021

e42e1e80

Add callback after TensorCopy (#30123) · 1f97d61c

Committed by Leo Chen on Jan 08, 2021

* change to tensor copy sync

* change to tensor copy sync

* make copy_to safe when use TensorCopy

* refine code

* add ut

* add cudapinned garbagecollector

* add testcase: cpu place -> cuda pinned place

1f97d61c

【Paddle.Fleet】Fix tensor table (#30075) · 528e03fc
Committed by Chengmo on Jan 08, 2021
```
* add tensor table
```
528e03fc

disable mkldnn inplace pass on windows (#30164) · ade24494
Committed by Wilber on Jan 08, 2021

ade24494

BaiXuePrincess / Paddle is in sync with its fork source project