Commits · e2372750172be6d27afd5c48bcea9a0e9485492a · 机器未来 / Paddle

20 Sep, 2019 · 1 commit
- fix multi-thread exec of trt, test=develop (#19338) · d004a0f5
  Committed by 石晓伟 on Sep 20, 2019

  d004a0f5
19 Sep, 2019 · 1 commit

Add a pass to fuse fc+elementwise_add+layernorm (#19776) · 3cd985a6

Committed by Yiqun Liu on Sep 19, 2019

* Add fc_elementwise_layernorm_fuse pass and unittest.

* Add fused_fc_elementwise_layernorm op and its GPU kernel.
test=develop

* Apply fc_elementwise_layernorm_fuse_pass to GPU inference.

* Add the setting of attrs in the definition of binary_op.
test=develop

* Add comment.

* Implement the unittest.
test=develop

* Change the unittest name of layer_norm.
test=develop

3cd985a6
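
The fusion named in this commit (fc, then elementwise_add, then layer_norm) can be illustrated with a small reference computation. This is only a sketch under assumptions: the function names, argument layout, and `eps` default are hypothetical and pure Python stands in for the actual GPU kernel, which is not shown in this log. It shows that computing the three steps in one pass per row gives the same result as running them as separate ops.

```python
import math

def fc(x, w, b):
    """Fully connected layer: x is rows of length K, w is K x N, b has length N."""
    return [[sum(xi * w[k][n] for k, xi in enumerate(row)) + b[n]
             for n in range(len(b))] for row in x]

def elementwise_add(a, y):
    """Add two equally shaped 2-D lists element by element."""
    return [[u + v for u, v in zip(ra, ry)] for ra, ry in zip(a, y)]

def layer_norm(a, scale, bias, eps=1e-5):
    """Normalize each row to zero mean and unit variance, then scale and shift."""
    out = []
    for row in a:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        inv_std = 1.0 / math.sqrt(var + eps)
        out.append([(v - mean) * inv_std * s + t
                    for v, s, t in zip(row, scale, bias)])
    return out

def fused_fc_elementwise_layernorm(x, w, b, y, scale, bias, eps=1e-5):
    """Hypothetical fused version: one pass per row, no intermediate tensors."""
    out = []
    for row, yrow in zip(x, y):
        # fc output plus the residual input, computed together
        acc = [sum(xi * w[k][n] for k, xi in enumerate(row)) + b[n] + yrow[n]
               for n in range(len(b))]
        mean = sum(acc) / len(acc)
        var = sum((v - mean) ** 2 for v in acc) / len(acc)
        inv_std = 1.0 / math.sqrt(var + eps)
        out.append([(v - mean) * inv_std * s + t
                    for v, s, t in zip(acc, scale, bias)])
    return out
```

The practical benefit of such a fusion is usually fewer kernel launches and no intermediate buffers; the arithmetic itself is unchanged.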

18 Sep, 2019 · 1 commit
- support MLU nums, test=develop (#19372) · 71b2ed61
  Committed by 石晓伟 on Sep 18, 2019

  71b2ed61
17 Sep, 2019 · 2 commits
- zerocopytensor support uint8, analysis config support profile, analysis... · 9cbc1eff
  Committed by Pei Yang on Sep 17, 2019
```
zerocopytensor support uint8, analysis config support profile, analysis predictor support GetInputTensorShape, test=develop (#19822)
```
  9cbc1eff
- fix memory optimization type (#19781) · 110be57c
  Committed by Zhaolong Xing on Sep 17, 2019
```
test=develop
```
  110be57c
16 Sep, 2019 · 1 commit

Enhance fc_fuse_pass to enable fusing relu to fc_op (#19733) · c67c8758

Committed by Yiqun Liu on Sep 16, 2019

* Refine the codes related to fc op.

* Add GPU implementation for fc functor.

* Apply fc_fuse_pass in GPU inference.
test=develop

* Change the cmake for fc op.

* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.

* Add an attribute to set the activation type in fc_op.

* Enhance the unittest of fc_op.
test=develop

* Move the declaration of FCOpGrad back to the header file.
test=develop

* Set default value for newly added arguments in test_fc_op.
test=develop

* Enhance fc_fuse_pass to enable fusing relu.

* Allow printing the shapes of var_desc in graph.
test=develop

* Enhance fc_fuse_pass_tester.

* Remove the use of PADDLE_ENFORCE.
test=develop

* Correct the number of ops after fusing.
test=develop

* Fix a typo.
test=develop

* Set activation_type to null when there is no relu in fc.
test=develop

* Refine fc_fuse_pass's codes.

* Enable setting the shape of a tensor.

* Refine repeated_fc_relu_pass and add unittest.
test=develop

c67c8758
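
As an illustration of what fusing relu into fc means at the graph level, here is a minimal sketch, not the pass from this commit. It assumes a simplified graph stored as a linear list of op dictionaries, assumes fc is formed from a mul followed by an elementwise_add, and uses an empty string where the commit message says activation_type is "null"; the dictionary layout and the function name `fuse_fc` are hypothetical.

```python
def fuse_fc(ops):
    """Rewrite mul -> elementwise_add [-> relu] chains into a single fc op.

    Simplification: a real pass must also verify that the intermediate
    outputs are not consumed by any other op before removing them.
    """
    fused, i = [], 0
    while i < len(ops):
        is_mul_add = (
            i + 1 < len(ops)
            and ops[i]["type"] == "mul"
            and ops[i + 1]["type"] == "elementwise_add"
            and ops[i + 1]["inputs"][0] == ops[i]["output"]
        )
        if not is_mul_add:
            fused.append(ops[i])
            i += 1
            continue
        mul, add = ops[i], ops[i + 1]
        out, act, step = add["output"], "", 2
        # Absorb a trailing relu that reads only the add's output.
        if (i + 2 < len(ops) and ops[i + 2]["type"] == "relu"
                and ops[i + 2]["inputs"] == [add["output"]]):
            out, act, step = ops[i + 2]["output"], "relu", 3
        fused.append({
            "type": "fc",
            "inputs": mul["inputs"] + [add["inputs"][1]],  # x, w, bias
            "output": out,
            "activation_type": act,
        })
        i += step
    return fused
```

With a relu present, three ops collapse into one fc whose `activation_type` is `"relu"`; without it, two ops collapse and the attribute stays empty.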

11 Sep, 2019 · 1 commit

Implement the GPU kernel of fc operator (#19687) · a65c728e

Committed by Yiqun Liu on Sep 11, 2019

* Refine the codes related to fc op.

* Add GPU implementation for fc functor.

* Apply fc_fuse_pass in GPU inference.
test=develop

* Change the cmake for fc op.

* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.

* Add an attribute to set the activation type in fc_op.

* Enhance the unittest of fc_op.
test=develop

* Move the declaration of FCOpGrad back to the header file.
test=develop

* Set default value for newly added arguments in test_fc_op.
test=develop

a65c728e

09 Sep, 2019 · 1 commit

paddle::framework::vectorize() templatization [PART3] (#19643) · f05d2c51

Committed by Tao Luo on Sep 09, 2019

* paddle::framework::vectorize() templatization

test=develop

* update pybind/imperative.cc

test=develop

* revert update on unsqueeze_op.cc and warpctc_cudnn_op.cu.cc

test=develop

f05d2c51

05 Sep, 2019 · 1 commit

unify PADDLE_ASSERT_MSG into PADDLE_ENFORCE(error_message) (#19631) · 3ae939e4

Committed by Tao Luo on Sep 05, 2019

* remove assert.h

* change PADDLE_ASSERT_MSG to PADDLE_ENFORCE

test=develop

* fix tensorrt paddle_enforce

test=develop

3ae939e4

04 Sep, 2019 · 1 commit

Enable ngraph through build_strategy (#19266) · a3a4b6e5

Committed by baojun on Sep 04, 2019

* enable ngraph through build_strategy test=develop

* add unittest test=develop

* put use_ngraph unconditional test=develop

* remove paddle_enforce test=develop

* remove paddle_enforce test=develop

* fix copyright test=develop

* limit for ngraph only test=develop

a3a4b6e5

03 Sep, 2019 · 1 commit

Add a pass to enable the use of cudnn (#19346) · c5548178

Committed by Yiqun Liu on Sep 03, 2019

* Add an interface to enable cudnn for inference.

* Add cudnn_placement_pass.
test=develop

* Set the default value of cudnn_enabled_op_types to null.
test=develop

* Write the common basic class, placement_pass_base, to refine the codes.
test=develop

* Call EnableCUDNN in unittest.
test=develop

* Refine cudnn_placement_pass tester.

* Enable the testing of cudnn_placement_pass in inference's unittest.
test=develop

* Add the check of op kernels.
test=develop

c5548178

30 Aug, 2019 · 2 commits

add dynamic C runtime support on windows, test=develop (#19502) · d6cb1a41
Committed by liuwei1031 on Aug 30, 2019

d6cb1a41

Add a pass to replace dropout_op with scale_op when is_test is true (#19297) · fcec365d

Committed by Yiqun Liu on Aug 30, 2019

* Add simplify_with_basic_ops_pass to replace dropout_op with scale_op when is_test is true.
test=develop

* Delete dropout_op directly when upscale_in_train is true.
test=develop

* Improve the debug string, adding the print of op_desc information.

* Fix the case when dropout's input x is reused as the next op's output.

* Add the pass to inference.
test=develop

* Change the log level.
test=develop

* Add unittest for inplace case.

* Add comment to explain the pass.

* Apply the pass for CPU inference.
test=develop

* Fix the typo.
test=develop

* Add the check of AttrType.
test=develop

fcec365d
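
The rewrite described in this commit relies on dropout being deterministic at inference time. The sketch below is an illustration under assumptions, not the pass itself: it assumes the two dropout modes are named `downgrade_in_infer` and `upscale_in_train`, and the op dictionary layout and the helper name `simplify_dropout` are hypothetical.

```python
def dropout_infer(x, dropout_prob, implementation):
    """Reference inference-time behavior of dropout on a flat list of values."""
    if implementation == "upscale_in_train":
        # Scaling already happened during training, so inference is the identity.
        return list(x)
    # "downgrade_in_infer": scale the activations down at inference time.
    return [v * (1.0 - dropout_prob) for v in x]

def simplify_dropout(op):
    """Return the replacement for a dropout op, or None if it can be deleted.

    Ops that are not in test mode are returned unchanged.
    """
    if op["type"] != "dropout" or not op["is_test"]:
        return op
    if op["dropout_implementation"] == "upscale_in_train":
        return None  # identity at inference: the op can be removed
    return {"type": "scale", "scale": 1.0 - op["dropout_prob"], "bias": 0.0}
```

In the `downgrade_in_infer` case the replacement scale op multiplies by `1 - dropout_prob`, which matches what `dropout_infer` computes for the same input.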

22 Aug, 2019 · 1 commit

add local user data conversion into full_pascalvoc_test_preprocess.py (#19283) · 9240e532

Committed by lidanqing on Aug 22, 2019

* add local user data conversion into full_pascalvoc_test_preprocess.py
test=develop

* change PADDLE_ENFORCE to PADDLE_ENFORCE_GE
test=develop

* change according to reviews
test=develop

9240e532

21 Aug, 2019 · 1 commit

Add generalized Conv+Activation MKLDNN fuse pass creation Part2 (#19237) · 97d1db18

Committed by Adam on Aug 21, 2019

* Add generalized Conv+Activation MKLDNN fuse pass creation Part2
test=develop

* Undefined behaviour of GetAttrIfExists<> FIX
test=develop

97d1db18

19 Aug, 2019 · 2 commits
- Fix BUG: Mask RCNN inference diff When using AnalysisPredictor. (#19213) · 76c95af0
  Committed by Zhaolong Xing on Aug 19, 2019
```
* fix mask rcnn bug:
1. affine channel fuse (diff)
2. condition block op (memory leak)
3. merge lod tensor op (diff)
4. memory optim (diff)
test=develop

* fix ci about PADDLE_ENFORCE
fix merge lod infer op ut
test=develop
```
  76c95af0
- merge develop to solve conflict, also fix API doc, test=develop (#18823) · 5b6673c4
  Committed by Zeng Jinle on Aug 19, 2019

  5b6673c4
15 Aug, 2019 · 2 commits
- Fix mAP problem in unit test of int8 object detection test (#18946) · 07a4d8f8
  Committed by lidanqing on Aug 15, 2019
```
* change the top1 comparison to mAP comparison
test=develop

* change the mobilenet-ssd tester demo data and batch_size settings
test=develop
```
  07a4d8f8
- Add generalized Conv+Activation MKLDNN fuse pass creation (#19072) · b837689e
  Committed by Adam on Aug 15, 2019
```
test=develop
```
  b837689e
12 Aug, 2019 · 1 commit
- add tensorrt support for windows (#19084) · 80b7ef6f
  Committed by wopeizl on Aug 12, 2019
```
* add tensorrt support for windows
```
  80b7ef6f
09 Aug, 2019 · 1 commit
- inference_shared_library support profile (#16275) · 741ce8bb
  Committed by Tao Luo on Aug 09, 2019
```
test=develop
```
  741ce8bb
08 Aug, 2019 · 1 commit

[WIP] Add Imdb train demo (#18895) · 4ad7c9d5

Committed by mapingshuo on Aug 08, 2019

* add train demo for imdb text classification task

* make inference library release data_feed dataset dataset_factory data_feed_factory

* add String Data Generator

* new feature of train demo: save model params

* New feature of train demo: set training config using gflags

* change code style for CI

* add readme and dataset for imdb demo trainer

4ad7c9d5

05 Aug, 2019 · 1 commit
- test=develop,Synchronize the contents of develop with release1.5 (#18937) · fd3b666d
  Committed by silingtong123 on Aug 05, 2019
```
Fix the third-party openblas dependency for paddle on windows
```
  fd3b666d
02 Aug, 2019 · 2 commits

Fix the CE error which is caused by paddle-trt version (#18941) · 3816d221

Committed by Zhaolong Xing on Aug 02, 2019

* Fix Mask rcnn predictor
    1. refine memory optim algorithm to support the model with the block op.
    2. output diff : modify the affine channel fuse
    3. add condition_block_infer op
add interface for setting trt calib table dir
test=develop

* add the missing files.
test=develop

* 1 add trt fp16 support
test=develop

* fix trt fp16 ce error
test=develop

* add a vlog if the user uses trt4 and specifies fp16.
test=develop

3816d221

Fusion: seqpool_cvm_concat (#18471) · ee2f296e

Committed by 石晓伟 on Aug 02, 2019

* add fusion_seqpool_cvm_concat test=develop

* simplify pass, test=develop

* fix code style, test=develop

ee2f296e

31 Jul, 2019 · 2 commits

fix several security bugs reported by security team (#18831) · 0d996908

Committed by liuwei1031 on Jul 31, 2019

* fix security issue, test=develop

* bug fix, test=develop

* throw an exception when null pointer data with non-zero length PaddleBuf is passed, test=develop

0d996908

Trt fp16 support (#18860) · 61238d31

Committed by Zhaolong Xing on Jul 31, 2019

* Fix Mask rcnn predictor
    1. refine memory optim algorithm to support the model with the block op.
    2. output diff : modify the affine channel fuse
    3. add condition_block_infer op
add interface for setting trt calib table dir
test=develop

* add the missing files.
test=develop

* 1 add trt fp16 support
test=develop

61238d31

30 Jul, 2019 · 1 commit

Revert "use static variable to do cache instead of thread local in thread... · 10eeed93

Committed by Leo Zhao on Jul 30, 2019

Revert "use static variable to do cache instead of thread local in thread frequent switching case (#18428)" (#18879)

This reverts commit ce38bb53.

test=develop

10eeed93

27 Jul, 2019 · 1 commit
- Merge cuda 9/10 dockerfile with root dockerfile (#18693) · cfce4994
  Committed by Huihuang Zheng on Jul 27, 2019
```
Also fix a dependency error which may cause compile error
```
  cfce4994
24 Jul, 2019 · 1 commit

Update trt5 for paddle-trt (#18645) · 26ae6d49

Committed by Zhaolong Xing on Jul 24, 2019

* update paddle-trt for:
    1. fix bug: when batch > 2, core in split plugin.
    2. add leaky_relu trt5.0 support (yolov3 from 65ms to 42ms.)
    3. add new attr to dropout.
    4. shuffle channel, swish, relu6 support
    test=develop

* 1. fix ci
test=develop

26ae6d49

17 Jul, 2019 · 2 commits

remove async executor and add data_feed.proto to the deps of train demo (#18659) · d714bf03
Committed by guru4elephant on Jul 17, 2019
```
* remove async executor and add data_feed.proto to the deps of train demo
```
d714bf03

Fix Bitmain Predictor::Clone() (#18599) · 25d80791

Committed by 石晓伟 on Jul 17, 2019

* update anakin-engine interfaces for content-dnn

test=develop

* support only-gpu mode of Anakin

modify eltwise parse

test=develop

* modification for thread-safe

test=develop

* Integrated template instance

test=develop

* increase template parameters

test=develop

* support MLU predictor

test=develop

* update anakin cmake files

test=develop

* update TargetWrapper::set_device

* update the initialization of anakin subgraph

test=develop

* use the default constructor of base class

test=develop

* load model from buffer with length

test=develop

* modify the access level of class

test=develop

* support anakin for bitmain arch

test=develop

* remove files

* checkout cmakelists

test=develop

* modify interfaces

test=develop

* add cmake dependencies

test=develop

* enforce the outputs of net

test=develop

25d80791

11 Jul, 2019 · 1 commit

add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy (#18580) · 076f8331

Committed by Tao Luo on Jul 11, 2019

* add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy

test=develop

* enhance MkldnnPostReset

test=develop

* add comments for mkldnn_cache_capacity field

test=develop

076f8331

09 Jul, 2019 · 1 commit

Fix/gcc 4.8 ubt link error (#18558) · 667f88f9

Committed by Jiabin Yang on Jul 09, 2019

* test=develop, fix docker with paddle nccl problem

* test=develop, fix/gcc_4.8_ubt_link_error

* test=develop, fix code format

667f88f9

08 Jul, 2019 · 4 commits

Inference: fix mask rcnn model diff, optim memory usage, memory leak. (#18532) · 88b52a27

Committed by Zhaolong Xing on Jul 08, 2019

* Fix Mask rcnn predictor
    1. refine memory optim algorithm to support the model with the block op.
    2. output diff : modify the affine channel fuse
    3. add condition_block_infer op
add interface for setting trt calib table dir
test=develop

* add the missing files.
test=develop

88b52a27

Support Bitmain Anakin (#18542) · 15291548

Committed by 石晓伟 on Jul 08, 2019

* update anakin-engine interfaces for content-dnn

test=develop

* support only-gpu mode of Anakin

modify eltwise parse

test=develop

* modification for thread-safe

test=develop

* Integrated template instance

test=develop

* increase template parameters

test=develop

* support MLU predictor

test=develop

* update anakin cmake files

test=develop

* update TargetWrapper::set_device

* update the initialization of anakin subgraph

test=develop

* use the default constructor of base class

test=develop

* load model from buffer with length

test=develop

* modify the access level of class

test=develop

* support anakin for bitmain arch

test=develop

* remove files

* checkout cmakelists

test=develop

15291548

use static variable to do cache instead of thread local in thread frequent switching case (#18428) · ce38bb53
Committed by Leo Zhao on Jul 08, 2019

ce38bb53

add mkldnn shapeblob cache clear strategy (#18513) · fe32879d

Committed by Tao Luo on Jul 08, 2019

* add mkldnn shapeblob cache clear strategy

test=develop

* refine with comments

test=develop

* make cache clear strategy safer

test=develop

* add lock for GetShapeBlobSize

test=develop

fe32879d

05 Jul, 2019 · 1 commit
- fix command line bug in int8v2 readme (#18507) · 3fe6bf5e
  Committed by bingyanghuang on Jul 05, 2019
  
  3fe6bf5e
03 Jul, 2019 · 1 commit
- Remove the obsolete cmake options (#18481) · 047bba85
  Committed by 石晓伟 on Jul 03, 2019
```
* remove the obsolete cmake options, test=develop

* remove unittests, test=develop
```
  047bba85
