Commits · cb74dac3816ed68d32bb8005252314b238470bc4 · BaiXuePrincess / Paddle

30 Aug, 2019 · 1 commit

[Cherry-pick] Support memory eager deletion on recurrent OP (#19411) · cb74dac3

Committed by Huihuang Zheng on Aug 30, 2019

* Support memory eager deletion on recurrent OP (#17710)

Test PaddingRNN on V100 GPU device.

Test configuration: large model, padding mode (the mode that uses recurrentOp), one GPU.

GPU memory (MiB): 6414 (this PR) vs 6837 (without this PR)
Speed (steps/s): 10.28 (this PR) vs 9.89 (without this PR)
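
For scale, the figures above work out to roughly 6.2% less GPU memory and roughly 3.9% more steps per second. The short calculation below only restates the reported numbers; the variable names are illustrative and not taken from the PR.

```python
# Reported measurements for PaddingRNN on a V100 (copied from the commit message).
mem_with_pr_mib = 6414
mem_without_pr_mib = 6837
speed_with_pr = 10.28      # steps/s
speed_without_pr = 9.89    # steps/s

# Absolute and relative GPU memory saving.
mem_saved_mib = mem_without_pr_mib - mem_with_pr_mib
mem_saved_pct = 100.0 * mem_saved_mib / mem_without_pr_mib

# Relative throughput gain.
speedup_pct = 100.0 * (speed_with_pr / speed_without_pr - 1.0)

print(f"memory saved: {mem_saved_mib} MiB ({mem_saved_pct:.1f}%)")
print(f"speed gain:   {speedup_pct:.1f}%")
```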

* Fix random test_recurrent_op failure (#18718)

The change includes 3 things:

1. Set CPU_NUM to 1 in the tests; otherwise the ParallelExecutor prints a warning that CPU_NUM is not set and falls back to the default of 1.

2. The old tests compared two RNNs, a hand-written simple RNN and the same RNN built by Paddle, but initialized the RNN weights separately with numpy random and Paddle random. Fixed by setting the weight and bias values explicitly.

3. Also set the numpy random seed in the tests. The allowed difference between the two RNNs can now be smaller (rtol reduced from 0.1 and 0.2 to 0.01).
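
The idea behind points 2 and 3 can be sketched in plain numpy. This is a minimal illustration, not the actual test_recurrent_op code: the function names, shapes, and constant values are assumptions, and the second RNN run stands in for the Paddle-built network.

```python
import numpy as np

def simple_rnn_step(x, h_prev, w_x, w_h, b):
    """One step of a plain tanh RNN (hypothetical reference implementation)."""
    return np.tanh(x @ w_x + h_prev @ w_h + b)

def run_rnn(inputs, w_x, w_h, b):
    """Run the RNN over a [seq_len, batch, in_dim] input, return the last hidden state."""
    h = np.zeros((inputs.shape[1], w_h.shape[0]))
    for x in inputs:
        h = simple_rnn_step(x, h, w_x, w_h, b)
    return h

# Fix the seed so the generated inputs are identical on every run.
np.random.seed(123)
seq_len, batch, in_dim, hidden = 5, 2, 3, 4
inputs = np.random.uniform(-0.1, 0.1, (seq_len, batch, in_dim))

# Set weights and bias to explicit values instead of drawing them from
# two independent random generators.
w_x = np.full((in_dim, hidden), 0.05)
w_h = np.full((hidden, hidden), 0.05)
b = np.zeros(hidden)

reference = run_rnn(inputs, w_x, w_h, b)
# Stand-in for the framework-built RNN: same inputs and same parameters.
candidate = run_rnn(inputs.copy(), w_x.copy(), w_h.copy(), b.copy())

# With shared parameters, a tight relative tolerance is enough.
np.testing.assert_allclose(candidate, reference, rtol=0.01)
```

With both networks sharing inputs and parameters, any remaining difference comes from the implementations themselves, which is why a tolerance as tight as 0.01 becomes practical.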

cb74dac3

29 Aug, 2019 · 1 commit

Distributed training cherry-pick for Release 1.5 (#19486) · 416922e2

Committed by tangwei12 on Aug 29, 2019

* fix bug in the _gen_str function of class MultiSlotDataGenerator, test=develop (#18222)
* fix some bugs when merging sparse embedding parameters, test=develop (#18223)
* fix communicator with pyreader (#18350)
* delete AllocatorFacade destructor (#18606)
* fix distribute transpiler GRPC error code 4, RPC Deadline (#18984)
* merge pr #18441

416922e2

27 Aug, 2019 · 1 commit

Fix depthwise conv gpu kernel bug (#18582) (#19392) · 07e7ebeb

Committed by LielinJiang on Aug 27, 2019
```
* fix depthwise conv gpu kernel bug, test=develop
* add more depthwise conv test, test=develop
```
07e7ebeb

26 Aug, 2019 · 4 commits

Make roi_perspective_transform op return mask and transform matrix, test=release/1.5 (#19391) · ec64f44f

Committed by LielinJiang on Aug 26, 2019
```
* make_roi_perspective_transform_op_return_mask_and_matrix

* make_roi_perspective_transform_op_return_mask_and_matrix
```
ec64f44f

CHERRY PICK FROM 18941, 18860, 19213: Fix Mask RCNN bug AND Paddle-TRT fp16 support (#19378) · 6fbd224e

Committed by Zhaolong Xing on Aug 26, 2019

* CHERRY_PICK 18941, 18860: TRT fp16 support.

test=release/1.5

* CHERRY_PICK 19213: Fix BUG: Mask RCNN inference diff when using AnalysisPredictor.
    1. fix affine channel fuse pass.
    2. fix condition block op.
    3. fix merge lod tensor op bug.
    4. fix memory optim issue caused by reset lod op.

    test=release/1.5

6fbd224e

tensor_array_to_tensor_op.cc, test=develop (#19380) · c328a9e5

Committed by 石晓伟 on Aug 26, 2019

c328a9e5

Fusion: seqpool_cvm_concat, test=release/1.5 (#19381) · fae79811

Committed by 石晓伟 on Aug 26, 2019
```
* add fusion_seqpool_cvm_concat test=develop

* simplify pass, test=develop

* fix code style, test=develop
```
fae79811

21 Aug, 2019 · 1 commit

[Cherry Pick] Bug fix and speedup dygraph multi-cards on v1.5 (#19298) · 71168dad

Committed by chengduo on Aug 21, 2019

* add warning info for CPU_NUM
test=develop

* update dygraph parallel.py
test=develop

* prune the feed op in compiler
test=release/1.5

* remove compile from PE
test=develop

* test CUDAPinnedPlace in reader
test=release/1.5

71168dad

20 Aug, 2019 · 1 commit

Modify PADDLE_ENFORCE to PADDLE_ENFORCE_CUDA_SUCCESS (#19247) · 3b5f3548

Committed by silingtong123 on Aug 20, 2019

* add PADDLE_ENFORCE_CUDA_SUCCESS, test=develop (#19211)

* test=develop,Modify PADDLE_ENFORCE to PADDLE_ENFORCE_CUDA_SUCCESS

3b5f3548

15 Aug, 2019 · 1 commit

[Cherry-pick] Fix gather op bug (#19169) · de5dec84

Committed by chengduo on Aug 15, 2019
```
* fix gather op bug
test=release/1.5
```
de5dec84

29 Jul, 2019 · 2 commits

[Cherry pick] Fix backward error (#18835) · cc3ba765

Committed by chengduo on Jul 29, 2019
```
* fix backward bug
```
cc3ba765

fix affine_channel no_need buffer bug, test=release/1.5 (#18849) · 46c5345f

Committed by Zeng Jinle on Jul 29, 2019

46c5345f

26 Jul, 2019 · 1 commit

fix roi_align_op cpu backward's bug (#18825) · deee78ab

Committed by FDInSky on Jul 26, 2019
```
[cherry pick]fix roi_align_op cpu backward's bug
```
deee78ab

25 Jul, 2019 · 3 commits

Cudnn convolution reconstruction (#18284) (#18776) · 1b22dd2a

Committed by wangchaochaohu on Jul 25, 2019
```
* rewrite the conv_op using cudnn_conv_helper

* add workspace limit for v7 test=develop

* fix test=develop

* add half float test=develop

* fix test=develop

* fix test=develop

* revise code style test=develop

* fix test=develop
```
1b22dd2a

Fix CPU implementation of roi_align_op backward (#18728) (#18742) · 7af67f9f

Committed by qingqing01 on Jul 25, 2019

7af67f9f

Refine Infershape in activation_op for double_grad (#18731) · 11a1284c

Committed by qingqing01 on Jul 25, 2019

11a1284c

08 Jul, 2019 · 2 commits

CHERRY-Pick: Inference: fix mask rcnn model diff, optim memory usage, memory leak. #18532 (#18547) · bc9fd1fc

Committed by Zhaolong Xing on Jul 08, 2019
```
fix mask rcnn
add interface for setting optim_cache_dir (e.g. in trt int8 mode, when the model is loaded from memory, there should be an interface for setting the trt calibration table data dir)

test=release/1.5
```
bc9fd1fc

cherry-pick Fix topk cannot handle 1D vector bug (#18466) · 856536b9

Committed by zhaoyuchen2018 on Jul 08, 2019
```
Add path to handle 1D vector
```
856536b9

05 Jul, 2019 · 1 commit

cherry-pick Make fuse_all_reduce_op_pass support mix_precision test=develop test=release (#18490) · 3232618a

Committed by gongweibao on Jul 05, 2019

3232618a

29 Jun, 2019 · 1 commit

[cherry-pick] Update lamb optimizer (#18333) (#18380) · 880fb833

Committed by Yibing Liu on Jun 29, 2019

* Update lamb optimizer (#18333)

* Update lamb optimizer

* Regenerate api spec

test=release/1.5

* Give an experimental warning

test=release/1.5

880fb833

28 Jun, 2019 · 2 commits

Simplify multi_box_head API in detection.py and remove assign op. (#18310) (#18388) · 5b103c24

Committed by qingqing01 on Jun 28, 2019
```
* Simplify multi_box_head API in detection.py and remove assign op.
```
5b103c24

Update the Anakin interfaces for content-dnn and MLU, test=release/1.5 (#18028) · 924e53b7

Committed by 石晓伟 on Jun 28, 2019

* Update the Anakin interfaces for content-dnn and MLU (#17890)

* update anakin-engine interfaces for content-dnn

test=develop

* support only-gpu mode of Anakin

modify eltwise parse

test=develop

* modification for thread-safe

test=develop

* Integrated template instance

test=develop

* increase template parameters

test=develop

* support MLU predictor

test=develop

* update anakin cmake files

test=develop

* update TargetWrapper::set_device

* update the initialization of anakin subgraph

test=develop

* use the default constructor of base class

test=develop

* modify the access level of anakin engine (#18015)

test=develop

* fix ci test cmake test=develop

924e53b7

26 Jun, 2019 · 1 commit

cherry pick fix softrelu doc (#18328) · e517202c

Committed by tensor-tang on Jun 26, 2019
```
* fix softrelu doc
* update API doc

test=release/1.5
```
e517202c

25 Jun, 2019 · 4 commits

Sequence mask support tensor (#18249) (#18318) · c8d00cb2

Committed by Hongyu Liu on Jun 25, 2019

* sequence mask support max length tensor input; test=develop

* add rnn_impl.py; test=develop

* add basic gru lstm unittest; test=develop

* fix api spec; test=develop

* fix sequence_mask op bug;
test=develop
test=document_preview

* change +-*x to elementwise_op; test=develop

* add mkl flag; test=develop

* fix rnn impl bug; test=develop

* update api spec; test=develop

* fix doc bug; test=develop

* fix lstm bugs; test=develop

c8d00cb2

cherry-pick from #17935 (#18051) · 5cd4bbfe

Committed by Guo Sheng on Jun 25, 2019

test=release/1.5

* Fix the GetExpectedKernelType of add_position_encoding_op.

* Fix the doc of lstm_unit outputs in nn.py.

5cd4bbfe

Optimize fused_elewise_activation_grad op. (#18282) · 8640456b

Committed by Yiqun Liu on Jun 25, 2019
```
test=release/1.5
```
8640456b

Fix the bug of sequence_unpad op (#18290) (#18305) · 45bd5898

Committed by Yibing Liu on Jun 25, 2019
```
* Use TensorCopySync for sequence_unpad op

* Fix the tensor memory alloc bug

test=release/1.5
```
45bd5898

24 Jun, 2019 · 2 commits

Fix slice op shape=-1 bug (#18107) (#18227) · 8aa5757a

Committed by Hongyu Liu on Jun 24, 2019

* fix slice op bug; test=develop

* fix variable test bug; test=develop

* remove slice while true; test=develop

8aa5757a

[cherry pick] update load_error_info (#18256) · 618c2c75

Committed by lujun on Jun 24, 2019

Improve the error prompt: when loading parameters fails, users are prompted to check whether the model or parameter files are damaged.

* cherry pick 18000, test=release/1.5

618c2c75

20 Jun, 2019 · 2 commits

[cherry-pick] Update backward appending strategy to support double backward. (#18216) · a839f724

Committed by qingqing01 on Jun 20, 2019

* Update backward appending strategy to support double backward and fix some bugs. (#18104)

* Update backward.py:
     - If no input grad var is found in any of the outputs of previous ops, do not append this op to the graph.
     - Only apply this strategy for double backward.
* Update some double backward ops.
* Update sum_op to judge whether a tensor is empty by numel or IsInitialized().

a839f724

Fix spelling errors (#18213) · 6e310e2d

Committed by 翟飞跃 on Jun 19, 2019

6e310e2d

19 Jun, 2019 · 2 commits

Release/1.5 cherry pick (#18139) · 598addf1

Committed by tangwei12 on Jun 19, 2019

* fix save/load in fleet (#17675)

* fix save/load in Fleet
* add UT framework of Fleet (#18058)

* add paddle cloud role maker for customized usage; note that this is only for industrial users that have a pre-configured cloud environment (#18121)

Add paddle cloud role maker for specific cloud usage. This PR simplifies users' configuration in distributed training.

* assign role_maker before use (#18137)

598addf1

Cherry pick retinanet_target_assign_op(#17893), sigmoid_focal_loss_op(#17895)... · 3305045c

Committed by FlyingQianMM on Jun 19, 2019

Cherry pick retinanet_target_assign_op(#17893), sigmoid_focal_loss_op(#17895) and retinanet_detection_output_op(#17896) for supporting retinanet (#18141)

* test=release/1.5
Fix conflicts in test_layers.py when adding target assign operator for supporting retinanet. Cherry pick #17893

* test=release/1.5
Add sigmoid focal loss operator for supporting retinanet. Cherry pick #17895

* test=release/1.5
Add detection output operator for supporting retinanet. Cherry pick #17896

* test=release/1.5
fix wrong code style in test_layers.py when cherry pick retinanet_target_assign #17893

* test=release/1.5
Fix type error of std::pow in sigmoid_focal_loss. Cherry pick #17895

3305045c

18 Jun, 2019 · 2 commits

add cascade rcnn support (#18136) · 262a7c0a

Committed by AIFollowers on Jun 18, 2019
```
Add cascade rcnn support.
```
262a7c0a

test=release/1.5 (#18134) · c50fb58c

Committed by cjt222 on Jun 18, 2019
```
cherry pick for deform roi pooling
```
c50fb58c

15 Jun, 2019 · 2 commits

[Release/1.5][Cherry-pick #18108] Fix py_reader iterable bug (#18109) · 31ef8c1c

Committed by Zeng Jinle on Jun 15, 2019
```
* fix py_reader iterable bug, test=release/1.5

* move data from buffered_reader,test=release/1.5
```
31ef8c1c

[Cherry pick] Update CPU_NUM config (#18110) · be8c82cc

Committed by chengduo on Jun 15, 2019
```
* update CPU_NUM config
test=develop
```
be8c82cc

13 Jun, 2019 · 3 commits

Fix gather and scatter op bug with duplicate indices, cherry-pick from #17952 · 072347ff

Committed by wawltor on Jun 13, 2019

test=release/1.5
cherry-pick from #17952
The scatter op had a calculation bug when the indices contain the same index more than once: it used overwrite mode for the repeated index. This is fixed by using accumulate mode for repeated indices. The gather op had the same bug when computing its gradient. The open-blas and eigen libraries are used to reduce the time cost of accumulate mode.
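
The difference between the two modes can be illustrated with the gradient of a gather in plain numpy. This is a sketch of the general technique only, not Paddle's kernel; the function names are hypothetical.

```python
import numpy as np

def gather_grad_overwrite(x_shape, index, grad_out):
    """Buggy variant: plain indexed assignment keeps only one contribution
    for an index that appears several times."""
    grad_x = np.zeros(x_shape, dtype=grad_out.dtype)
    grad_x[index] = grad_out
    return grad_x

def gather_grad_accumulate(x_shape, index, grad_out):
    """Correct variant: np.add.at performs an unbuffered add, so every
    occurrence of a repeated index contributes to the sum."""
    grad_x = np.zeros(x_shape, dtype=grad_out.dtype)
    np.add.at(grad_x, index, grad_out)
    return grad_x

# Index 0 is gathered twice, so its gradient must be the sum of both
# upstream gradients (1.0 + 3.0).
index = np.array([0, 2, 0])
grad_out = np.array([1.0, 2.0, 3.0])

overwritten = gather_grad_overwrite((4,), index, grad_out)
accumulated = gather_grad_accumulate((4,), index, grad_out)

print("overwrite :", overwritten)
print("accumulate:", accumulated)
```

In the overwrite variant one of the two contributions to element 0 is silently lost, which is the kind of error the commit message describes.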

072347ff

Cherry-pick of #17814 and #18030 (#18067) · 80a3fd2e

Committed by Wojciech Uss on Jun 13, 2019

Added unit test for QAT FP32 & INT8 comparison (#17814)
Disable MKLDNN FC in Resnet50 test (#18030)

test=release/1.5

80a3fd2e

cherry pick concat op support negative axis (#18050) · a114a39e

Committed by tensor-tang on Jun 13, 2019
```
test=release/1.5
```
a114a39e
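
Negative-axis support for a concat, as named in the commit above, usually means mapping an axis in the range [-rank, rank) to its non-negative equivalent before concatenating. The sketch below shows that convention with numpy; `normalize_axis` is a hypothetical helper, not a function from the Paddle code base.

```python
import numpy as np

def normalize_axis(axis, rank):
    """Map an axis in [-rank, rank) to its non-negative equivalent."""
    if not -rank <= axis < rank:
        raise ValueError(f"axis {axis} is out of range for rank {rank}")
    return axis + rank if axis < 0 else axis

a = np.ones((2, 3))
b = np.ones((2, 5))

# axis=-1 refers to the last dimension, i.e. axis 1 for rank-2 inputs.
axis = normalize_axis(-1, a.ndim)
out = np.concatenate([a, b], axis=axis)

print(axis, out.shape)
```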
