Commits · 12542320c52720c9812e2d558a953bcc397c8546 · BaiXuePrincess / Paddle

Sep 11, 2019 · 6 commits

Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989) · 12542320

Committed by Huihuang Zheng on Sep 11, 2019

TemporaryAllocator is a singleton used to allocate memory for cuDNN. Removing this singleton gives better memory performance.

This commit replaces TemporaryAllocator with CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to free the memory allocated for the stream, avoiding the singleton.

Also adds data_feed_proto to operator to fix CI for CPU compilation.

12542320
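The stream-callback idea described in the commit message above can be illustrated with a minimal sketch. This is not Paddle's or CUDA's API: `FakeStream`, `StreamScopedAllocator`, and `demo` are hypothetical names, and the "stream" is just a FIFO queue. The sketch only shows why deferring the free to a callback lets work queued earlier still see the buffer.

```python
from collections import deque


class FakeStream:
    """Toy stand-in for a GPU stream: runs queued tasks in FIFO order."""

    def __init__(self):
        self._tasks = deque()

    def enqueue(self, task):
        self._tasks.append(task)

    def synchronize(self):
        # Drain the queue in submission order.
        while self._tasks:
            self._tasks.popleft()()


class StreamScopedAllocator:
    """Hypothetical allocator whose buffers are freed by a stream callback."""

    def __init__(self, stream):
        self._stream = stream
        self.live = {}  # buffer id -> backing storage
        self._next_id = 0

    def allocate(self, size):
        buf_id = self._next_id
        self._next_id += 1
        self.live[buf_id] = bytearray(size)
        return buf_id

    def release(self, buf_id):
        # Defer the free: it runs only after work queued earlier on the stream.
        self._stream.enqueue(lambda: self.live.pop(buf_id))


def demo():
    stream = FakeStream()
    allocator = StreamScopedAllocator(stream)
    log = []
    buf = allocator.allocate(16)
    # A "kernel" queued before the release still finds the buffer alive.
    stream.enqueue(lambda: log.append(("kernel", buf in allocator.live)))
    allocator.release(buf)
    live_before_sync = buf in allocator.live
    stream.synchronize()
    return log, live_before_sync, buf in allocator.live
```

Calling `demo()` shows the buffer surviving until the queued kernel has run and being gone after `synchronize()`. No global singleton is involved because each allocator is tied to its own stream.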

Make leaky relu inplacable (#19676) · 0daa5c97

Committed by Zeng Jinle on Sep 11, 2019

* make leaky relu inplacable, test=develop

* force add unittests to pass coverage, test=develop

0daa5c97

Open fuse broadcast option (#18833) · e506c99c

Committed by chengduo on Sep 11, 2019

* fix vlog level and fuse option type
test=develop

e506c99c

Implement the GPU kernel of fc operator (#19687) · a65c728e

Committed by Yiqun Liu on Sep 11, 2019

* Refine the codes related to fc op.

* Add GPU implementation for fc functor.

* Apply fc_fuse_pass in GPU inference.
test=develop

* Change the cmake for fc op.

* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.

* Add an attribute to set the activation type in fc_op.

* Enhance the unittest of fc_op.
test=develop

* Move the declaration of FCOpGrad back to the header file.
test=develop

* Set default value for newly added arguments in test_fc_op.
test=develop

a65c728e

Enable fused_all_reduce_op_handle support GPU and CPU Gradients (#19418) · 5866a7a5

Committed by chengduo on Sep 11, 2019

* Enable fused_all_reduce_op_handle support GPU and CPU Gradients

5866a7a5

paddle::framework::vectorize() templatization (#19730) · ec9bc1bd

Committed by Tao Luo on Sep 11, 2019

remove unused accuracy-diff warpctc-cudnn implementation

test=develop

ec9bc1bd

Sep 10, 2019 · 2 commits

add logs to left var memory size, test=develop (#19722) · bb4f8dee

Committed by Zeng Jinle on Sep 10, 2019

bb4f8dee

merge empty lod tensor, test=develop (#19228) · 25dcd74d

Committed by wangguanzhong on Sep 10, 2019

* merge_empty_lod_tensor, test=develop

* fix multiclass_nms, test=develop

* refine API.spec, test=develop

* add unittest case for fetch, test=develop

* add lod tensor test, test=develop

* return index for multiclass_nms, test=develop

* add api for multiclass_nms2

* update API.spec, test=develop

* refine api doc, test=develop

* fix test_detection.py, test=develop

* polish code, test=develop

* add more unittest case, test=develop

25dcd74d

Sep 09, 2019 · 1 commit

refine tensor.mutable_data, test=develop (#19680) · 713c05dd

Committed by Zeng Jinle on Sep 09, 2019

713c05dd

Sep 08, 2019 · 1 commit

fix cmakelist deps (#19668) · 1ca6ea03

Committed by hutuxian on Sep 08, 2019

fix cmakelist deps: remove unnecessary deps and add proper op deps

1ca6ea03

Sep 07, 2019 · 1 commit

remove -Wmaybe-uninitialized warning (#19653) · bcddbc78

Committed by Tao Luo on Sep 07, 2019

* remove -Wmaybe-uninitialized warning

test=develop

* remove uninitialized op_handle_ in scale_loss_grad_op_handle.cc

test=develop

bcddbc78

Sep 06, 2019 · 1 commit

codegen for fused elementwise operation (#19520) · ed8f44ea

Committed by wangchaochaohu on Sep 06, 2019

* test=develop codegen for fused elementwise operation

* fix test=develop

ed8f44ea

Sep 05, 2019 · 3 commits

add feed_var_names to Prune interface (#19589) · dca9b6c5

Committed by mapingshuo on Sep 05, 2019

* Fix bug: add feed_vars to the prune function

dca9b6c5

unify PADDLE_ASSERT_MSG into PADDLE_ENFORCE(error_message) (#19631) · 3ae939e4

Committed by Tao Luo on Sep 05, 2019

* remove assert.h

* change PADDLE_ASSERT_MSG to PADDLE_ENFORCE

test=develop

* fix tensorrt paddle_enforce

test=develop

3ae939e4

fix scope lock bug on infer (#19624) · e3e98ed6

Committed by tensor-tang on Sep 05, 2019

e3e98ed6

Sep 04, 2019 · 3 commits

refine some PADDLE_ENFORCE codes for unify PADDLE_ASSERT_MSG (#19607) · 0a46d345

Committed by Tao Luo on Sep 04, 2019

test=develop

0a46d345

Enable ngraph through build_strategy (#19266) · a3a4b6e5

Committed by baojun on Sep 04, 2019

* enable ngraph through build_strategy test=develop

* add unittest test=develop

* put use_ngraph unconditional test=develop

* remove paddle_enforce test=develop

* remove paddle_enforce test=develop

* fix copyright test=develop

* limit for ngraph only test=develop

a3a4b6e5

paddle::framework::vectorize() templatization (#19611) · 8d6d95cc

Committed by Adam on Sep 04, 2019

test=develop

8d6d95cc

Sep 03, 2019 · 4 commits

refine PADDLE_ENFORCE codes for unify PADDLE_ASSERT_MSG (#19603) · 75d15719

Committed by Tao Luo on Sep 03, 2019

test=develop

75d15719

Add a pass to enable the use of cudnn (#19346) · c5548178

Committed by Yiqun Liu on Sep 03, 2019

* Add an interface to enable cudnn for inference.

* Add cudnn_placement_pass.
test=develop

* Set the default value of cudnn_enabled_op_types to null.
test=develop

* Write the common basic class, placement_pass_base, to refine the codes.
test=develop

* Call EnableCUDNN in unittest.
test=develop

* Refine cudnn_placement_pass tester.

* Enable the testing of cudnn_placement_pass in inference's unittest.
test=develop

* Add the check of op kernels.
test=develop

c5548178

using MKLDNNMemoryFormat = mkldnn::memory::format changes (#19568) · e94b26da

Committed by Adam on Sep 03, 2019

* using MKLDNNMemoryFormat = mkldnn::memory::format changes
test=develop

* PADDLE_ENFORCE update
test=develop

e94b26da

Change backward_guard to optimize_guard to maximize the allreduce overlap. (#19506) · abaf87be

Committed by gongweibao on Sep 03, 2019

Change backward_guard to optimize_guard to maximize the allreduce overlap

abaf87be

Sep 02, 2019 · 2 commits

fix fast pe to run highest priority ops first, test=develop (#19575) · 19474019

Committed by Zeng Jinle on Sep 02, 2019

19474019

fix seg fault of share lod, test=develop (#19573) · 0af85497

Committed by Zeng Jinle on Sep 02, 2019

0af85497

Aug 31, 2019 · 1 commit

Paddlebox Framework (#18982) · c756b5d2

Committed by hutuxian on Aug 31, 2019

* Support looking up embeddings from BoxPS.
* Add a _pull_box_sparse op, for now this op is not exposed to users.
* Add a BoxHelper class, providing 'BeginPass', 'EndPass', 'FeedPass' functions and so on.
* Add 'BoxPSDataset' in python code.
* Add a compile option WITH_BOX_PS and a macro PADDLE_WITH_BOX_PS.
* Add UT.
* For more details, please refer to: https://github.com/PaddlePaddle/Paddle/pull/18982

c756b5d2

Aug 30, 2019 · 5 commits

[MKL-DNN] Fix to face model on AVX512 platforms (#19282) · ecd9f330

Committed by Jacek Czaja on Aug 30, 2019

- Refactor step 1

- Compilation fix

- Yet another compilation fix

- Even more compilation fix

- Lint fixes

test=develop

- Removed deprecated PADDLE_ENFORCE occurrence

test=develop

- Candidate fix to BN forward

- Lint fixes

test=develop

- Refactoring in data_layout_transform

- compilation fix

- Another compilation fix

- Step further into darkness

- Yet another compilation fix

- Yet another compilation fix

- missing header

- compilation fix

- Added MKLDNN -> Paddle conversion in fetch op

test=develop

- Compilation fix

test=develop

- Lint

test=develop

- Mul fix

- Fix to MKLDNN MUL op and Elementwise MUL UT

test=develop

- Workaround for different representations of weights with groups in Paddle vs
  MKL-DNN.

test=develop

- Candidate fix for 5D convolution with groups

- Refactor of fix for conv3d and conv2d in fetch op

test=develop

- Compilation fix

- Still same compilation fix

- Compilation fix

- Compilation fix

- Reverted refactoring of fixes

- Adapted test_conv2d_int8_mkldnn so it expects data in NCHW format,
  not NHWC

test=develop

- minor fix in UT

test=develop

- Lint fixes

test=develop

ecd9f330

add thread scope stat accurate metrics test=develop (#19480) · 10ca3f96

Committed by yaoxuefeng on Aug 30, 2019

* add thread scope stat accurate metrics test=develop

* fix style

* fix style

* fix style

* fix style test=develop

* fix style test=develop

* fix style test=develop

* fix style test=develop

* fix style test=develop

* fix style test=develop

* fix style test=develop

* fix conflict

* fix style

* fix style test=develop

* fix error test=develop

* fix error test=develop

10ca3f96

remove unused assert.h (#19529) · 02270b3e

Committed by Tao Luo on Aug 30, 2019

test=develop

02270b3e

Support feed single persistable variable to PE (#19417) · e340df01

Committed by chengduo on Aug 30, 2019

* update executor feed

e340df01

Add a pass to replace dropout_op with scale_op when is_test is true (#19297) · fcec365d

Committed by Yiqun Liu on Aug 30, 2019

* Add simplify_with_basic_ops_pass to replace dropout_op with scale_op when is_test is true.
test=develop

* Delete dropout_op directly when upscale_in_train is true.
test=develop

* Improve the debug string, adding the print of op_desc information.

* Fix the case when dropout's input x is reused as the next op's output.

* Add the pass to inference.
test=develop

* Change the log level.
test=develop

* Add unittest for inplace case.

* Add comment to explain the pass.

* Apply the pass for CPU inference.
test=develop

* Fix the typo.
test=develop

* Add the check of AttrType.
test=develop

fcec365d
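The rewrite described in the commit above can be sketched as a small rule. This is only an illustration under stated assumptions: ops are modeled as plain dicts rather than Paddle graph nodes, `simplify_dropout`, `dropout_inference`, and `scale` are hypothetical helpers, and the scale factor `1 - dropout_prob` assumes the `downgrade_in_infer` behavior of dropout at inference time.

```python
def dropout_inference(x, dropout_prob, implementation="downgrade_in_infer"):
    """Reference inference-time dropout on a list of floats (no random masking)."""
    if implementation == "upscale_in_train":
        return list(x)  # identity at inference
    return [v * (1.0 - dropout_prob) for v in x]


def scale(x, factor):
    """Toy scale op: multiply every element by a constant."""
    return [v * factor for v in x]


def simplify_dropout(op):
    """Rewrite one op dict; returns the op, a scale op, or None if it can be deleted."""
    if op["type"] != "dropout" or not op["attrs"].get("is_test", False):
        return op  # leave training-mode dropout and other ops untouched
    attrs = op["attrs"]
    impl = attrs.get("dropout_implementation", "downgrade_in_infer")
    if impl == "upscale_in_train":
        return None  # dropout is an identity here, so the op is removed
    return {"type": "scale", "attrs": {"scale": 1.0 - attrs["dropout_prob"]}}
```

For example, a test-mode dropout with `dropout_prob=0.25` becomes a scale op with factor `0.75`, and applying `scale` with that factor gives the same values as `dropout_inference`.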

Aug 29, 2019 · 3 commits

support debug each output of each ins (#19004) · 1fe468d3

Committed by Thunderbrook on Aug 29, 2019

* dump slot

* test

* proto

* dump slot

* test

* proto

* code style

* code style

* code style

* style

* add delete after unseen days

* add unseen days

* code style

* conflict solve
test=develop

* add clear model

* code style
test=develop

* code style
test=develop

* support debug tensor of each ins
test=develop

* support debug tensor of each ins
test=develop

* learning rate

* code style

* code style

* code style

* code style

* code style

* code style

* code style

* code style

* code style

* code style

* code style

* code style

* code style
test=develop

* code style
test=develop

* unittest

* style

* style

* multi phase

* add channel

* code style

* style

* style

* unittest

* style

* define

* define
test=develop

* style
test=develop

* rm define
test=develop

* linux

* linux
test=develop

* style
test=develop

* output format
test=develop

* windows ci
test=develop

1fe468d3

refine inplace inference registry, test=develop (#19032) · 5c8f210c

Committed by Zeng Jinle on Aug 29, 2019

5c8f210c

Increase num_iteration_per_drop_scope (#19075) · b6d1d890

Committed by chengduo on Aug 29, 2019

* increase num_iteration_per_drop_scope
test=develop

* Fix bug of while_op
test=develop

* fix bug of whileOp
test=develop

b6d1d890

Aug 28, 2019 · 1 commit

Fix the correctness of async mode at distributed training (#18863) · 65c73684

Committed by tangwei12 on Aug 28, 2019

* fix correctness of the communicator

* fix a bug in send thread when sending var context is empty, test=develop

* add lookup_table_prefetch_op and prefetch optimize, test=develop

* remove GPU support for remote prefetch

* word2vec force with CPU, test=develop

* test dist remote lookup table force with CPU, test=develop

65c73684

Aug 27, 2019 · 1 commit

Add conv dequant squash for int8 (#18905) · 2e3ec66b

Committed by joanna.wozna.intel on Aug 27, 2019

2e3ec66b

Aug 23, 2019 · 1 commit

remove unused conv_elementwise_add2_act_fuse.cc (#19344) · c82280e4

Committed by Tao Luo on Aug 23, 2019

test=develop

c82280e4

Aug 22, 2019 · 2 commits

Enhance OpTest to check the consistency of operators when using and not using inplace (#19101) · a9d5fc51

Committed by Leo Chen on Aug 22, 2019

* add pybind interface to get all inplace ops, test=develop

* enhance OpTest to check the consistency of operators when using and not using inplace, test=develop

* handle corner cases in op_test, test=develop

* support outputs without tensor holder_, like XShape in reshape_op, test=develop

* fix bug, some op has GradOpMaker, but actually no grad_op in OpInfoMap, test=develop

* use reshape_grad instead of reshape in FlattenGradOp, test=develop

* fix error debug dims info for variables like XShape, test=develop

* change computational order in sum_op to relieve computation difference using inplace, test=develop

* add inplace_atol to check group_norm, and skip inplace_grad for mkldnn, test=develop

* follow sneaxiy's comments, test=develop

* remove unused DefaultGradOpDescMaker in mkldnn op, test=develop

a9d5fc51

stronger the error message of tensor's mutable_data (#19303) · e3c68bde

Committed by Tao Luo on Aug 22, 2019

* stronger the error message of tensor's mutable_data

test=develop

* update error message

test=develop

e3c68bde

Aug 21, 2019 · 1 commit

Add generalized Conv+Activation MKLDNN fuse pass creation Part2 (#19237) · 97d1db18

Committed by Adam on Aug 21, 2019

* Add generalized Conv+Activation MKLDNN fuse pass creation Part2
test=develop

* Undefined behaviour of GetAttrIfExists<> FIX
test=develop

97d1db18

Aug 19, 2019 · 1 commit

Fix BUG: Mask RCNN inference diff When using AnalysisPredictor. (#19213) · 76c95af0

Committed by Zhaolong Xing on Aug 19, 2019

* fix mask rcnn bug:
1. affine channel fuse (diff)
2. condition block op (memory leak)
3. merge lod tensor op (diff)
4. memory optim (diff)
test=develop

* fix ci about PADDLE_ENFORCE
fix merge lod infer op ut
test=develop

76c95af0

BaiXuePrincess / Paddle is in sync with its fork source project
