Commits · bc379ca3d5895eadbc1748bc5b71606011563ee1 · PaddlePaddle / Paddle

Apr 28, 2021 · 10 commits

Added pure_bf16 mode (#32281) · bc379ca3
Committed by arlesniak on Apr 28, 2021

bc379ca3
Add fake interface for register_hook in static mode (#32642) · 9aad7527
Committed by Chen Weihang on Apr 28, 2021
```
* add fake interface for hook in static mode

* add unittests

* fix failed unittests
```
9aad7527

Committed by denglin-github on Apr 28, 2021

* Add dlnne engine runtime

* Fix log

* Remove <const_cast> and remove unrelated modify with dlnne, +clang-format

* Fix CMakeList format error

* Add copyright message

* Fix dlnne CMakeList.txt

* Add some paddlepaddle_pass to support more networks

* Fix some format bug

* Add delete dropout_op pass

* Fix some format bug

* Fix format bug

abcb3f54

modify spectralnorm (#32633) · bda0e609
Committed by wangna11BD on Apr 28, 2021

bda0e609

[PsCore] solve Brpc dep (#32632) · 4ead9a5a

Committed by Thunderbrook on Apr 28, 2021

* Revert "Revert "[PsCore] optimize performance of large kv (#32535)" (#32599)"

This reverts commit 809ac036.

* brpc dep

4ead9a5a

Fix some error message (#32614) · 9ee709fc

Committed by Kqnonrime on Apr 28, 2021

* fix two error message

* fix two error message

* fix error

* fix error

* fix error

* fix error

* fix some error message

* fix some error

* fix error

* fix some error

* fix some error

* fix some error

* fix one error

* fix some error

* fix seven error message

* fix error

* fix error

* fix error

* fix error

* fix some error message

* fix error

* fix some error

* fix some error

9ee709fc

[Rocm] fix test_var_base (#32639) · 7a245b7a
Committed by zhulei on Apr 28, 2021

7a245b7a
Reduce the time cost for the elementwise_add test case (#32628) · 6d3eb3d0
Committed by wawltor on Apr 28, 2021
```
Reduce the time cost for the elementwise_add test case (#32628)
```
6d3eb3d0
[oneDNN] Added clearing oneDNN cache per executor (#32499) · ba610761
Committed by Jacek Czaja on Apr 28, 2021
```
* - Added clearing oneDNN per executor

* - Executor is not always having FLAGS_use_mkldnn set to true
```
ba610761

Optimize update_loss_scaling_op (#32554) · 0dc02dc7

Committed by jiangcheng on Apr 28, 2021

* optimize update_loss_scaling_op by fused for loop to one kernel, test=develop

* remove useless while loop and optimize variable name, test=develop

* optimize variable name from out_addrs_tensor to out_addrs_mem, test=develop

* optimize variable name for readable by change prefix identifier from t_ to local_

0dc02dc7

Apr 27, 2021 · 23 commits
- add alltoall api (#32507) · db41b742
  Committed by lilong12 on Apr 27, 2021
```
* add alltoall api, test=develop
```
  db41b742
- [Docker] support cuda11.2 and using gcc5.4 in cuda10.1 (#32531) · 31326950
  Committed by pangyoki on Apr 27, 2021
```
* support cuda11.2 and using gcc5.4 in cuda10.1

* fix manylinux py36 bug

* support cuda11.2

* fix python36 pip version problem in ubuntu

* save cuda11.0
```
  31326950
- update 2.0 public api in nn (#31912) · 3b81f2b8
  Committed by zhiboniu on Apr 27, 2021
```
* update 2.0 public api in nn

* replace Chinese character cause error in ci;synchronization with pr:#32588 to avoid 'ascii' codec in python2

* numbers used in paddle.nn.functional.norm but not imported
```
  3b81f2b8
- update 2.0 public api in paddle.init (#32034) · 125e4816
  Committed by zhiboniu on Apr 27, 2021
```
Co-authored-by: XiaoguangHu <46782768+XiaoguangHu01@users.noreply.github.com>
```
  125e4816
- edit paddle.save/load API (#32532) · 79f7ba69
  Committed by WeiXin on Apr 27, 2021
```
* edit paddle.save/load API

* Update io.py

edit doc

* delete cpython-37.pyc

* Update io.py

edit doc

* Update io.py

recommit

* Update io.py

recommit

* Update io.py

recommit

* Update io.py

recommit
```
  79f7ba69
- clear 'BasicEngine' when an exception occurs in the backward. (#32546) · 797b2dfd
  Committed by WeiXin on Apr 27, 2021
```
* clear 'BasicEngine' when an exception occurs in the backward.

* deal with conflict.

* deal with conflict.
```
  797b2dfd
- conservative judgment (#32556) · f285f4c1
  Committed by wenbin on Apr 27, 2021
  
  f285f4c1
- [OPs] Bug fix, fix the segment mean for illegal syncthreads usage. (#32596) · 1afe1ac9
  Committed by Zhong Hui on Apr 27, 2021
```
* [OPs] Bug fix, fix the segment mean for illegal syncthreads usage.
```
  1afe1ac9
- Unify the implementation of activation operation (#32348) · eca8dcc7
  Committed by Zhang Zheng on Apr 27, 2021
  
  eca8dcc7
- slove develop bugs (#32560) · 6f6e159a
  Committed by Baibaifan on Apr 27, 2021
  
  6f6e159a
- 'jit.save/load' support save/load function without parameters. (#32430) · 0372f1dd
  Committed by WeiXin on Apr 27, 2021
```
* jit.save/load support function.

* delete unnittest test_jit_load_model_incomplete.

* edit code according to CI

* Modify the documentation.

* add note to doc.
```
  0372f1dd
- [Docs] Modified the docs of some api for supporting list/tuple args. (#32360) · 15158927
  Committed by xiemoyuan on Apr 27, 2021
```
* fixed docs.

* Fixed docs. test=document_fix

code bak.

fixed docs. test=document_fix

* Revert to previous version of python/paddle/fluid/backward.py

* fixed bugs.

* test=document_fix. Fixed examples.
```
  15158927
- fix cross_entropy calculation error (#32545) · 23d3e36a
  Committed by Guanghua Yu on Apr 27, 2021
```
* fix cross_entropy calculation error

* add unittest and fix static
```
  23d3e36a
- str in python2 is different to python3's, it make mistakes for some api's docstring (#32588) · 97794eca
  Committed by Ren Wei (任卫) on Apr 27, 2021
```
* UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1788: ordinal not in range(128)

test=document_fix

str(doc) in python2

test=document_fix

* update md5 function in count_api_without_core_ops.py

str in py2 is different.

test=document_fix
```
  97794eca
- Support list and tuple for args. (#32344) · a08a118d
  Committed by xiemoyuan on Apr 27, 2021
```
* Support list and tuple for parameters of layer_norm, multiprocess_reader, DatasetFolder and ImageFolder.

* add unittest for layer_norm.

* add require gpu for example.
```
  a08a118d
- support depthwise_conv2d_transpose (#32593) · 85e697d7
  Committed by Pei Yang on Apr 27, 2021
  
  85e697d7
- Revert "[PsCore] optimize performance of large kv (#32535)" (#32599) · 809ac036
  Committed by tianshuo78520a on Apr 27, 2021
```
This reverts commit 4b7242b0.
```
  809ac036
- Fix grad calculation bug in tensor_array_to_tensor (#32558) · 6579432f
  Committed by Aurelius84 on Apr 27, 2021
  
  6579432f
- Check for cuda errors immediately after kernel launch (#32557) · 19eefef4
  Committed by XiangGao on Apr 27, 2021
```
Co-authored-by: Yang Zhang <yangzhang@live.com>
```
  19eefef4
- [HybridParallel] Fix amp bug in ModelParallel (#32579) · c1db7e32
  Committed by ShenLiang on Apr 27, 2021
```
* fix amp bug

* fix name of wordsize
```
  c1db7e32
- update 2.0 public api in dataset&framework (#31985) · 9930a582
  Committed by zhiboniu on Apr 27, 2021
  
  9930a582
- update 2.0 public api in tensor (#32026) · f1bc322c
  Committed by zhiboniu on Apr 27, 2021
  
  f1bc322c
- update 2.0 public api in utils (#32008) · 0bc97e92
  Committed by zhiboniu on Apr 27, 2021
  
  0bc97e92
Apr 26, 2021 · 7 commits
- add send/recv api (#32504) · c47bafc6
  Committed by lilong12 on Apr 26, 2021
```
* add sendrecv, test=develop
```
  c47bafc6
- deal with conflict. (#32578) · a7be32cc
  Committed by WeiXin on Apr 26, 2021
  
  a7be32cc
- add barrier for new group (#32572) · 4ba49af5
  Committed by ShenLiang on Apr 26, 2021
  
  4ba49af5
- fix no-value-for-parameter in iscan (#32551) · fcd18ef1
  Committed by zhangchunle on Apr 26, 2021
  
  fcd18ef1
- Fix OPENBLAS ci and fix windows CPU CI to parallel compile (#32548) · 1ec9525a
  Committed by Zhou Wei on Apr 26, 2021
```
* clear CUDA compile environment on windows

* fix Windows CI

* fix Windows CI

* fix Windows CI
```
  1ec9525a
- Optimize where_index_op(prefix sum) (#30601) · 6ec4e640
  Committed by jiangcheng on Apr 26, 2021
```
* new optimize for where_index_op with prefix sum version.

* write a scan prefix sum kernel with stream for where index op.

* optimize where_index by using cub::DeviceScan::InclusiveSum instead of imperfect self-kernel.

* remove CheckTrue struct and rename stide_array for readable.

* optimize variable name for readable.

* optimize function name and annotation.
```
  6ec4e640
- [PsCore] optimize performance of large kv (#32535) · 4b7242b0
  Committed by Thunderbrook on Apr 26, 2021
```
* optimize pull sparse

* optimize pull sparse

* change macro

* format
```
  4b7242b0

PaddlePaddle / Paddle · synced successfully about 1 year ago