Commits · cb74dac3816ed68d32bb8005252314b238470bc4 · 机器未来 / Paddle

30 Aug, 2019 · 1 commit

[Cherry-pick] Support memory eager deletion on recurrent OP (#19411) · cb74dac3

Committed by Huihuang Zheng on Aug 30, 2019

* Support memory eager deletion on recurrent OP (#17710)

Test PaddingRNN on V100 GPU device.

Test configuration: large model, padding mode (which is the mode using recurrentOp), one GPU.
                   
GPU memory (MiB):   6414 (this PR)     vs   6837 (without this PR)
Speed (steps/s):         10.28 (this PR)    vs    9.89 (without this PR)

* Fix random test_recurrent_op failure (#18718)

The change does three things:

1. Set CPU_NUM to 1 in the tests, because otherwise ParallelExecutor prints a warning that CPU_NUM is not set and falls back to the default of 1.

2. The old tests compared two RNNs, a hand-written simple RNN and the same RNN built with Paddle, but initialized the RNN weights separately with NumPy random and Paddle random. Fixed by setting the weight and bias values explicitly.

3. Also set the NumPy random seed in the tests. The difference between the two RNNs can now be held to a smaller tolerance in the tests (rtol lowered from 0.1 and 0.2 to 0.01).
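
The test fix described above (fixed seed, explicitly shared weights, tighter rtol) can be sketched in plain Python. This is only an illustration, not the actual test_recurrent_op code: the scalar RNN, the function names, and the weight values are all assumptions made for the example.

```python
import math
import random


def rnn_handwritten(xs, w_x, w_h, b):
    """Explicit loop: h_t = tanh(w_x * x_t + w_h * h_{t-1} + b), with h_0 = 0."""
    h, outputs = 0.0, []
    for x in xs:
        h = math.tanh(w_x * x + w_h * h + b)
        outputs.append(h)
    return outputs


def rnn_recursive(xs, w_x, w_h, b):
    """Same recurrence written recursively; stands in for the framework-built RNN."""
    if not xs:
        return []
    prev = rnn_recursive(xs[:-1], w_x, w_h, b)
    h_prev = prev[-1] if prev else 0.0
    return prev + [math.tanh(w_x * xs[-1] + w_h * h_prev + b)]


def outputs_match(seed=123, steps=8, rtol=0.01):
    rng = random.Random(seed)            # fixed seed: inputs are reproducible
    xs = [rng.uniform(-1.0, 1.0) for _ in range(steps)]
    w_x, w_h, b = 0.5, -0.3, 0.1         # hypothetical weights, set once and shared
    a = rnn_handwritten(xs, w_x, w_h, b)
    c = rnn_recursive(xs, w_x, w_h, b)
    return all(math.isclose(u, v, rel_tol=rtol) for u, v in zip(a, c))
```

Because both implementations receive the same weights and the same seeded inputs, any remaining difference comes from the implementations themselves, which is what allows a tight tolerance.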

29 Aug, 2019 · 3 commits

[Cherry pick] Support feed single persistable variable to PE (#19435) · a7a4b72b
Committed by chengduo on Aug 29, 2019
```
* update executor feed
```
test=release/1.5, fix multiple Layers parameter missing error in dygraph mode (#19491) · 5860cc47
Committed by Jiabin Yang on Aug 29, 2019
```
This PR cherry-picks the fix for the multiple Layers parameter missing error in dygraph mode; the original PR is #18968.
```

Distributed training cherry-pick for Release 1.5 (#19486) · 416922e2

Committed by tangwei12 on Aug 29, 2019

* fix bug in Class MultiSlotDataGenerator's function _gen_str, test=develop (#18222)
* fix some bug when merge sparse embedding parameters, test=develop (#18223)
* fix communicator with pyreader (#18350)
* delete AllocatorFacade destructor  (#18606)
* fix distribute transpiler GRPC error code 4, RPC Deadline (#18984)
* merge pr #18441

28 Aug, 2019 · 1 commit
- [Cherry pick] Remove unnecessary op when trainable is false (#19434) · 9048229b
  Committed by chengduo on Aug 28, 2019
```
* fix optimizer bug
test=develop
```
27 Aug, 2019 · 2 commits

test=release/1.5, fix problem that get_attr method can't using default mode... · 5b3d33bd

Committed by Jiabin Yang on Aug 27, 2019

test=release/1.5, fix problem that get_attr method can't using default mode when we call has_attr in dygraph (#19328) (#19414)

* add default getItem

* test=develop, fix has_attr disabled error in Layer

* test=develop, fix GroupNorm and deepcf bug on attrs

Fix depthwise conv gpu kernel bug (#18582) (#19392) · 07e7ebeb
Committed by LielinJiang on Aug 27, 2019
```
* fix depthwise conv gpu kernel bug, test=develop
* add more depthwise conv test, test=develop
```
26 Aug, 2019 · 3 commits

Make roi_perspective_transform op return mask and transform matrix,test=release/1.5 (#19391) · ec64f44f
Committed by LielinJiang on Aug 26, 2019
```
* make_roi_perspective_transform_op_return_mask_and_matrix

* make_roi_perspective_transform_op_return_mask_and_matrix
```

CHERRY PICK FROM 18941, 18860, 19213: Fix Mask RCNN bug AND Paddle-TRT fp16 support (#19378) · 6fbd224e

Committed by Zhaolong Xing on Aug 26, 2019

* CHERRY_PICK 18941, 18860: TRT fp16 support.

test=release/1.5

* CHERRY_PICK 19213: Fix BUG: Mask RCNN inference diff When using AnalysisPredictor.
    1. fix affine channel fuse pass.
    2. fix condition block op.
    3. fix merge lod tensor op bug.
    4. fix memory optim cause by reset lod op.

    test=release/1.5

Fusion: seqpool_cvm_concat, test=release/1.5 (#19381) · fae79811
Committed by 石晓伟 on Aug 26, 2019
```
* add fusion_seqpool_cvm_concat test=develop

* simplify pass, test=develop

* fix code style, test=develop
```
21 Aug, 2019 · 2 commits

[Cherry Pick] Add error info during compile (#19300) · c737116c
Committed by chengduo on Aug 21, 2019
```
* Add call stack info during runtime and compile time
test=develop
```
[Cherry Pick] Bug fix and speedup dygraph multi-cards on v1.5 (#19298) · 71168dad

Committed by chengduo on Aug 21, 2019

* add warning info for CPU_NUM
test=develop

* update dygraph parallel.py
test=develop

* prune the feed op in compiler
test=release/1.5

* remove compile from PE
test=develop

* test CUDAPinnedPlace in reader
test=release/1.5

20 Aug, 2019 · 1 commit
- [Cherry pick] Fix register op without gradient (#19272) · 305bd25b
  Committed by chengduo on Aug 20, 2019
```
* fix REGISTER_OP_WITHOUT_GRADIENT
test=develop
```
29 Jul, 2019 · 2 commits
- [Cherry pick] Fix backward error (#18835) · cc3ba765
  Committed by chengduo on Jul 29, 2019
```
* fix backward bug
```
- fix affine_channel no_need buffer bug, test=release/1.5 (#18849) · 46c5345f
  Committed by Zeng Jinle on Jul 29, 2019
08 Jul, 2019 · 2 commits
- test=release/1.5, cherry-pick hide not_support for dygraph (#18528) · 7c73a68f
  Committed by Jiabin Yang on Jul 08, 2019
```
* test=release/1.5, cherry-pick hide not_support for dygraph

* test=release/1.5, cherry-pick hide not_support for dygraph
```
- cherry-pick Fix topk cannot handle 1D vector bug (#18466) · 856536b9
  Committed by zhaoyuchen2018 on Jul 08, 2019
```
Add path to handle 1D vector
```
05 Jul, 2019 · 1 commit
- cherry-pick Make fuse_all_reduce_op_pass support mix_precision test=develop test=release (#18490) · 3232618a
  Committed by gongweibao on Jul 05, 2019
29 Jun, 2019 · 1 commit

[cherry-pick] Update lamb optimizer (#18333) (#18380) · 880fb833

Committed by Yibing Liu on Jun 29, 2019

* Update lamb optimizer (#18333)

* Update lamb optimizer

* Regenerate api spec

test=release/1.5

* Give an experimental warning

test=release/1.5

28 Jun, 2019 · 1 commit
- test=develop, disable basic gru related ut (#18329) (#18387) · b5556f2d
  Committed by Hongyu Liu on Jun 28, 2019
25 Jun, 2019 · 2 commits

Sequence mask support tensor (#18249) (#18318) · c8d00cb2

Committed by Hongyu Liu on Jun 25, 2019

* sequence mask support max length tensor input; test=develop

* add rnn_impl.py; test=develop

* add basic gru lstm unittest; test=develop

* fix api spec; test=develop

* fix sequence_mask op bug;
test=develop
test=document_preview

* change +-*x to elementwise_op; test=develop

* add mkl flag; test=develop

* fix rnn impl bug; test=develop

* update api spec; test=develop

* fix doc bug; test=develop

* fix lstm bugs; test=develop
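
As a rough illustration of what a sequence mask computes, the sketch below builds a 0/1 mask from per-sequence lengths, with the maximum length passed as a run-time value. It is written in plain Python with a hypothetical helper name and is not Paddle's sequence_mask implementation.

```python
def sequence_mask(lengths, maxlen=None):
    """Return mask[i][j] = 1 if j < lengths[i], else 0.

    `maxlen` may be supplied at run time (the role a tensor input plays in the
    commit above); when omitted, the longest sequence sets the width.
    """
    if maxlen is None:
        maxlen = max(lengths, default=0)
    return [[1 if j < n else 0 for j in range(maxlen)] for n in lengths]


if __name__ == "__main__":
    for row in sequence_mask([1, 3, 2]):
        print(row)
```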

Revert "Cherry pick install check for multi gpu" (#18312) · f6432604

Committed by Jiabin Yang on Jun 25, 2019

* Revert "Cherry pick install check for multi gpu (#18245)"

This reverts commit d0219002.

* test=release/1.5, ci start

24 Jun, 2019 · 1 commit

Fix slice op shape=-1 bug (#18107) (#18227) · 8aa5757a

Committed by Hongyu Liu on Jun 24, 2019

* fix slice op bug; test=develop

* fix variable test bug; test=develop

* remove slice while true; test=develop

21 Jun, 2019 · 1 commit

Cherry pick install check for multi gpu (#18245) · d0219002

Committed by Jiabin Yang on Jun 21, 2019

* test=develop, add add_multi_gpu_install_check (#18157)

* test=develop, add add_multi_gpu_install_check

* test=develop, refine warning doc

* test=develop, refine warning doc

* test=develop, refine warning doc

* test=develop, support multi cpu

* test=release/1.5, cherry-picked from develop

20 Jun, 2019 · 2 commits

[cherry-pick]Update backward appending strategy to support double backward. (#18216) · a839f724

Committed by qingqing01 on Jun 20, 2019

* Update backward appending strategy to support double backward and fix some bug. (#18104)

* Update backward.py:
     - If there is no input grad var in all outputs of previous ops, do not append this op into graph.
     - Only apply this strategy when double backward.
* Update some double backward op.
* Update sum_op to judge whether a tensor is empty by numel or IsInitialized().

Fix spelling errors (#18213) · 6e310e2d
Committed by 翟飞跃 on Jun 19, 2019

19 Jun, 2019 · 4 commits

Release/1.5 cherry pick (#18139) · 598addf1

Committed by tangwei12 on Jun 19, 2019

* fix save/load in fleet (#17675)

* fix save/load in Fleet
* add UT framework of Fleet (#18058)

* add paddle cloud role maker for customized usage, note this is only for industrial users that have cloud environment pre-configuration (#18121)

Add a paddle cloud role maker for specific cloud usage. This PR simplifies users' configuration in distributed training.

* assign role_maker before use (#18137)

Cherry pick retinanet_target_assign_op(#17893), sigmoid_focal_loss_op(#17895)... · 3305045c

Committed by FlyingQianMM on Jun 19, 2019

Cherry pick retinanet_target_assign_op(#17893), sigmoid_focal_loss_op(#17895) and retinanet_detection_output_op(#17896) for supporting retinanet (#18141)

* test=release/1.5
Fix conflicts in test_layers.py when adding target assign operator for supporting retinanet. Cherry pick #17893

* test=release/1.5
Add sigmoid focal loss operator for supporting retinanet. Cherry pick #17895

* test=release/1.5
Add detection output operator for supporting retinanet. Cherry pick #17896

* test=release/1.5
fix wrong code style in test_layers.py when cherry pick retinanet_target_assign #17893

* test=release/1.5
Fix type error of std::pow in sigmoid_focal_loss. Cherry pick #17895

[cherry-pick] Fix logging to release/1.5 (#18026) · 7c7afef7

Committed by Kaipeng Deng on Jun 19, 2019

* fix logging unable. test=develop

* unset sys.stdout for stream handler. test=develop

* fix newly add basicConfig. test=develop

* fix import error. test=release/1.5

[Cherry Pick] Not init nccl when rank is 1 (#18170) · 041bc72c

Committed by chengduo on Jun 19, 2019

* remove nccl dep when the number of GPU is 1
test=develop

* use multi card run syncBN
test=release/1.5

18 Jun, 2019 · 2 commits
- add cascade rcnn support (#18136) · 262a7c0a
  Committed by AIFollowers on Jun 18, 2019
```
Add cascade rcnn support.
```
- test=release/1.5 (#18134) · c50fb58c
  Committed by cjt222 on Jun 18, 2019
```
cherry pick for deform roi pooling
```
15 Jun, 2019 · 2 commits
- [Release/1.5][Cherry-pick #18108] Fix py_reader iterable bug (#18109) · 31ef8c1c
  Committed by Zeng Jinle on Jun 15, 2019
```
* fix py_reader iterable bug, test=release/1.5

* move data from buffered_reader,test=release/1.5
```
- [Cherry pick]Update CPU_NUM config (#18110) · be8c82cc
  Committed by chengduo on Jun 15, 2019
```
* update CPU_NUM config
test=develop
```
14 Jun, 2019 · 1 commit
- cherry-pick fixncclid 18025 test=release/1.5 (#18093) · 751497db
  Committed by gongweibao on Jun 14, 2019
13 Jun, 2019 · 2 commits

Fix gather and scatter op has same index bug cherry-pick from #17952 · 072347ff

Committed by wawltor on Jun 13, 2019

test=release/1.5
cherry-pick from #17952
The scatter op computes wrong results when the indices contain duplicates, because it uses overwrite mode for repeated indices. This fix uses accumulate mode for repeated indices instead. The gather op has the same bug when it computes the gradient. OpenBLAS and Eigen are used to reduce the time cost of accumulate mode.
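
The difference between overwrite and accumulate mode is easiest to see in the gather gradient. The sketch below is plain Python with hypothetical function names; it is an illustration of the idea, not the Paddle kernel.

```python
def gather(src, indices):
    """Forward pass: pick src[i] for every i in indices (duplicates allowed)."""
    return [src[i] for i in indices]


def gather_grad_overwrite(size, indices, grad_out):
    """Buggy variant: a later duplicate index replaces the earlier contribution."""
    grad = [0.0] * size
    for i, g in zip(indices, grad_out):
        grad[i] = g
    return grad


def gather_grad_accumulate(size, indices, grad_out):
    """Correct variant: contributions to the same index are summed."""
    grad = [0.0] * size
    for i, g in zip(indices, grad_out):
        grad[i] += g
    return grad


if __name__ == "__main__":
    indices = [0, 2, 2]            # index 2 is gathered twice
    grad_out = [1.0, 1.0, 1.0]
    print(gather_grad_overwrite(4, indices, grad_out))   # [1.0, 0.0, 1.0, 0.0]
    print(gather_grad_accumulate(4, indices, grad_out))  # [1.0, 0.0, 2.0, 0.0]
```

Since element 2 feeds two outputs, its gradient must be the sum of both upstream gradients; overwrite mode silently drops one of them.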

cherry pick concat op support negative axis (#18050) · a114a39e
Committed by tensor-tang on Jun 13, 2019
```
test=release/1.5
```
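
Negative-axis support is commonly implemented by mapping the axis into the non-negative range before dispatching. The sketch below assumes that convention (axis + rank for negative values) and uses hypothetical helpers on 2-D nested lists; it does not show the Paddle concat op itself.

```python
def normalize_axis(axis, rank):
    """Map an axis in [-rank, rank) to its non-negative equivalent."""
    if not -rank <= axis < rank:
        raise ValueError("axis %d is out of range for rank %d" % (axis, rank))
    return axis + rank if axis < 0 else axis


def concat_2d(a, b, axis):
    """Concatenate two 2-D nested lists along axis 0 or 1 (negative values allowed)."""
    axis = normalize_axis(axis, 2)
    if axis == 0:
        return a + b                               # stack rows
    return [ra + rb for ra, rb in zip(a, b)]       # extend each row


if __name__ == "__main__":
    print(concat_2d([[1, 2]], [[3, 4]], axis=-1))  # same as axis=1
    print(concat_2d([[1, 2]], [[3, 4]], axis=-2))  # same as axis=0
```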
12 Jun, 2019 · 1 commit

Cherry-pick: fix random CI failure. (#17976) · 21554bcb

Committed by Huihuang Zheng on Jun 12, 2019

* Cherry-pick fix random Python3 CI failure.

In some tests, engineers used "print('xxx').format('xxx')". That syntax
is supported only in Python 2, not Python 3. However, since those
lines are related to data download, the tests pass when the CI machine
already has the data. That is why the failure appeared random.

* Cherry-pick: disable CUDNN case of test_warpctc_op

test=release
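
The `print('xxx').format('xxx')` failure mentioned above can be reproduced in a few lines. In Python 2 the statement prints the formatted string; in Python 3 `print()` is a function that returns None, so the chained `.format` call raises AttributeError. The function names and message text below are made up for the example.

```python
def announce_py2_style(name):
    # Python 3: prints the raw template, then fails because print() returned None.
    return print("downloading {}").format(name)


def announce_fixed(name):
    # Format first, then print: works in both Python 2 and Python 3.
    message = "downloading {}".format(name)
    print(message)
    return message


if __name__ == "__main__":
    try:
        announce_py2_style("dataset")
    except AttributeError as err:
        print("Python 3 failure:", err)
    announce_fixed("dataset")
```

The bug stays hidden whenever the surrounding branch is skipped, for example when the data is already cached and the download message is never printed.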

10 Jun, 2019 · 2 commits

Ignore a unit test which failed on cuda9/10 python3 ci task (#17950) · 9f519baf
Committed by Huihuang Zheng on Jun 10, 2019
```
TODO: it is a temporary fix for Paddle release 1.5. We have to fix
this failed unit test soon.

test=develop
```
Enable seq_pool op to accept len 0 input (#17284) · 33d1e565

Committed by Yibing Liu on Jun 10, 2019

* Enable seq_pool op to accept len 0 input

test=develop

* Update sequence_pool's api

test=develop

* Add more unittest cases for seq_pool op

test=develop

* Remove legacy comments

test=develop

* Don't use template in op maker

test=develop
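
One way to let a sequence pooling routine accept zero-length sequences is to emit a fill value for them. The sketch below shows sum pooling over a flat value list with per-sequence lengths; the function name and the `pad_value` parameter are assumptions for illustration and are not taken from the commit.

```python
def sequence_sum_pool(values, lengths, pad_value=0.0):
    """Sum-pool consecutive segments of `values`; empty segments yield pad_value."""
    if sum(lengths) != len(values):
        raise ValueError("lengths must add up to len(values)")
    pooled, start = [], 0
    for n in lengths:
        segment = values[start:start + n]
        pooled.append(sum(segment) if n > 0 else pad_value)
        start += n
    return pooled


if __name__ == "__main__":
    # Three sequences of lengths 2, 0 and 1 packed into one flat list.
    print(sequence_sum_pool([1.0, 2.0, 3.0], [2, 0, 1]))
```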

机器未来 / Paddle is in sync with the fork source project