Commits · 4fec182d24e17675b59d1b19b3417e67e04de998 · BaiXuePrincess / Paddle

30 Nov, 2020 8 commits

[Lite-Subgraph] Fix compile error for lite subgraph. (#29146) · 4fec182d
Committed by Wilber on Nov 30, 2020

4fec182d

Committed by 123malin on Nov 30, 2020

* fix parameter prefetch & device guard
Co-authored-by: NMrChengmo <cmchengmo@163.com>
Co-authored-by: Nchengmo <chengmo@baidu.com>

b5c63423

Check whether there is any inplace operation affecting gradient calculation. (#27901) · 865a4598

Committed by liym27 on Nov 30, 2020

* Add a class TensorInplaceVersion to count the inplace version and put it in framework::Tensor instead of Allocation or Variable.

* Add a new attribute `_inplace_version` for VarBase.

* Raise exception if an inplace operation can result in incorrect gradient computation.

* Add a new interface _bump_inplace_version() for VarBase to bump the version whenever the Tensor is modified through an inplace operation.

* For api assign, call _bump_inplace_version() when it's an inplace operation in dynamic mode.

* Use original var_wrapper if the inplace_version is not changed.

* Replace SnapshotVarWrapperList with SnapshotVarWrapper to optimize performance.
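
The version-counting idea described in this commit message can be illustrated with a minimal Python sketch. This is not Paddle's implementation: the toy `Tensor` class, the `assign_` method, and the `SavedForBackward` wrapper are hypothetical names chosen for illustration; only `_inplace_version` and `_bump_inplace_version()` are taken from the commit message.

```python
class Tensor:
    """Toy tensor that counts how many in-place modifications it has seen."""

    def __init__(self, values):
        self.values = list(values)
        self._inplace_version = 0  # attribute name taken from the commit message

    def _bump_inplace_version(self):
        # Called whenever the tensor is modified through an in-place operation.
        self._inplace_version += 1

    def assign_(self, new_values):
        # Hypothetical in-place assign: overwrite the data, then bump the version.
        self.values[:] = new_values
        self._bump_inplace_version()


class SavedForBackward:
    """Hypothetical snapshot of a tensor that a backward pass will need later."""

    def __init__(self, tensor):
        self.tensor = tensor
        self.saved_version = tensor._inplace_version

    def get(self):
        # If the version changed after the snapshot, the saved value is stale
        # and a gradient computed from it could be wrong, so raise an error.
        if self.tensor._inplace_version != self.saved_version:
            raise RuntimeError(
                "tensor was modified in place after being saved for backward: "
                f"expected version {self.saved_version}, "
                f"got {self.tensor._inplace_version}"
            )
        # Version unchanged: the original tensor can be reused as-is.
        return self.tensor
```

In this sketch, taking a snapshot and calling `get()` succeeds as long as no in-place operation has run in between. Calling `assign_` after the snapshot makes the next `get()` raise.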

865a4598

prefetch optimize (#29095) · 03d4665f
Committed by 123malin on Nov 30, 2020
```
* test=develop, optimize async prefetch
```
03d4665f
optimizer amp, all use fp16 communication, overlap last comm and compute (#28957) · 0c2a51d2
Committed by WangXi on Nov 30, 2020

0c2a51d2

Polish unittests details and execution conditions to adapt to MUSL (#29044) · 0b032fae

Committed by Chen Weihang on Nov 30, 2020

* fix failed tests in the list given by yingchun

* add unittests into static_mode_white_list

* add enable static

* fix dist unittest

* skip test_sigmoid_focal_loss_op & add gym

* revert no need skip unittests

* remove gym

0b032fae

Add quantization of multi_gru op and tests (#28615) · 4fd4095d
Committed by Wojciech Uss on Nov 30, 2020

4fd4095d
fix gru gcc7.4 bug for the gru compile · bc6033f8
Committed by Jack Zhou on Nov 30, 2020
```
fix gru gcc7.4 bug for the gru compile
```
bc6033f8

28 Nov, 2020 1 commit
- optimize cumsum OP (#29193) · b818429a
  Committed by wangchaochaohu on Nov 28, 2020
  
  b818429a
27 Nov, 2020 8 commits

Support dynamic graph distributed (#28997) · e2d01eb6

Committed by ShenLiang on Nov 27, 2020

* add reducer

* refine event for memorycopy

* add concat&split for allreduce

* apply concat & split for fuse tensor

* fix nccl dep

* fix the unittest, compile problem and ddp initialize problem

* fix unittest for mac & add some comments & solve the repeated param in sublayers

* fix unittest for windows & fix document

e2d01eb6

update expand as op to use the shape of the target tensor instead of the... · 7e5e9934
Committed by lilong12 on Nov 27, 2020
```
update expand as op to use the shape of the target tensor instead of the target tensor itself. (#29020)

* update, test=develop
```
7e5e9934
fix CUDA 11 error on windows (#29101) · e668cb07
Committed by Zhou Wei on Nov 27, 2020

e668cb07
Add eigen gru and fix the dropout bug in the rnn · 085260f3
Committed by Jack Zhou on Nov 27, 2020
```
Add eigen gru and fix the dropout bug in the rnn 
```
085260f3
add user_define_dump (#28596) · 545df287
Committed by yaoxuefeng on Nov 27, 2020

545df287
Fixes mkldnn dygraph learning rate scheduler crashes (#28988) · bc902044
Committed by arlesniak on Nov 27, 2020

bc902044

detect tensorRT plugin fp16 in runtime (#27933) · b9e76a01

Committed by Shang Zhizhou on Nov 27, 2020

* remove -DSUPPORTS_CUDA_FP16 in cuda.cmake

* compile with cuda9

* add some unittest

* notest;test=coverage

* add unittest for trt plugin swish && split

* update ernie unittest

* fix some error message

* remove repeated judgement of CUDA version in mbEltwiseLayerNormOpConverter

* fix compile error when CUDA_ARCH_NAME < Pascal

* fix compile error

* update unittest timeout

* compile with cuda9

* update error msg

* fix code style

* add some comments

* add define IF_CUDA_ARCH_SUPPORT_FP16

* rename IF_CUDA_ARCH_SUPPORT_FP16 to CUDA_ARCH_FP16_SUPPORTED

b9e76a01

fix typo of flag name (#29154) · fd3fcb05
Committed by Leo Chen on Nov 27, 2020

fd3fcb05

26 Nov, 2020 9 commits
- Fix ops doc for some ops · da71173b
  Committed by Noel on Nov 26, 2020
```
Fix ops doc for some ops 
```
  da71173b
- Split train_mode and has_grad for tracer (#29064) · 770395cb
  Committed by Leo Chen on Nov 26, 2020
```
* split train_mode and has_grad

* fix format

* fix ci problems

* fix sample code
```
  770395cb
- Polish CUDA Information stdout (#29109) · 7ae3cb55
  Committed by Aurelius84 on Nov 26, 2020
  
  7ae3cb55
- optimize fast graph executor (#28962) · 173c22ae
  Committed by WangXi on Nov 26, 2020
  
  173c22ae
- fix unittest trt_dynamic_shape_transformer_prune_test error (#29122) · 562ded10
  Committed by Shang Zhizhou on Nov 26, 2020
  
  562ded10
- add API serialize_program, serialize_persistables, save_to_file,... · db412585
  Committed by Shibo Tao on Nov 26, 2020
```
add API serialize_program, serialize_persistables, save_to_file, deserialize_program, deserialize_persistables, load_from_file. (#29034)
```
  db412585
- Add bf16 pool2d and unify bf16 unit tests (#29039) · b0d1ac16
  Committed by joanna.wozna.intel on Nov 26, 2020
```
* Add bf16 pool2d and unify bf16 unit tests

* Add change default ops test
```
  b0d1ac16
- Fix cpu_bfloat16_pass (#28730) · fddea674
  Committed by joanna.wozna.intel on Nov 26, 2020
```
* Fix cpu_bfloat16_pass

* Add output_format

* Fix incorrect SetOutput

* Change formatting
```
  fddea674
- fix win ci failure, test=develop (#29089) · 2fd16cf6
  Committed by Qi Li on Nov 26, 2020
```
* fix win ci failure, test=develop

* add ci test, test=develop
```
  2fd16cf6
25 Nov, 2020 7 commits
- Hide the C++ stack by default and add hints (#29042) · fea0e294
  Committed by Chen Weihang on Nov 25, 2020
```
* do not show cpp stack by default & add hint

* fix failed unittest

* fix failed unittests
```
  fea0e294
- add uint8 for reshape op (#28996) · 582c0a04
  Committed by joejiong on Nov 25, 2020
```
add uint8 for reshape operator
```
  582c0a04
- fix tensor detach to zero copy (#27921) · 8ca0a8a8
  Committed by Zhou Wei on Nov 25, 2020
```
* fix tensor detach to zero copy

* fix tensor detach to zero copy
```
  8ca0a8a8
- add xpu elementwise ops (#29031) · a5aa4dc7
  Committed by taixiurong on Nov 25, 2020
  
  a5aa4dc7
- Update pow (#29000) · b04c78ef
  Committed by joejiong on Nov 25, 2020
```
Simple code clean up
```
  b04c78ef
- remove eigen threadpool for the speed up · b2c8a007
  Committed by wawltor on Nov 25, 2020
```
remove eigen threadpool for the speed up
```
  b2c8a007
- Add multi_gru_fuse_pass and tests (#28601) · 7b5a8e46
  Committed by Wojciech Uss on Nov 25, 2020
```
* Add multi_gru_fuse_pass and tests

* fix date

* cleaned up headers
```
  7b5a8e46
24 Nov, 2020 3 commits
- update, test=develop (#28700) · 767d0ba2
  Committed by lilong12 on Nov 24, 2020
  
  767d0ba2
- Add multi_gru_seq_fuse_pass and tests (#28604) · 991345b3
  Committed by Wojciech Uss on Nov 24, 2020
```
* Add multi_gru_seq_fuse_pass and tests

* fix date

* removed unused functions
```
  991345b3
- 【paddle.distributed.fleet】Optimize ParameterServer's Async Mode (#28442) · fbf9564f
  Committed by 123malin on Nov 24, 2020
```
* test=develop, optimize global_step
```
  fbf9564f
23 Nov, 2020 4 commits
- enable pipeline to run with Executor.run() (#28373) · f77a78cd
  Committed by lilong12 on Nov 23, 2020
```
* update, test=develop
```
  f77a78cd
- support ps-gpu (#28752) · 0073f9bd
  Committed by Thunderbrook on Nov 23, 2020
```
* ps gpu transpile

* ps gpu

* remove op

* gps trainer

* local ps

* add macro

* HeterBox

* def cuda

* tab

* code style

* style

Co-authored-by: Thunderbrook <a754913769#163.com>
```
  0073f9bd
- polish two api doc detail, test=document_fix (#28971) · 768dab44
  Committed by Chen Weihang on Nov 23, 2020
  
  768dab44
- refactor momentum op to combine weight (#27414) · 8ff35506
  Committed by furnace on Nov 23, 2020
```
* refactor momentum op to combine weight_decay (scale op and sum op)
```
  8ff35506

BaiXuePrincess / Paddle is in sync with its fork source project