Commits · fcec365d29a69e81e95578e3720faaabccafbae7 · Crayon鑫 / Paddle

30 Aug, 2019 1 commit

Add a pass to replace dropout_op with scale_op when is_test is true (#19297) · fcec365d

Committed by Yiqun Liu on Aug 30, 2019

* Add simplify_with_basic_ops_pass to replace dropout_op with scale_op when is_test is true.
test=develop

* Delete dropout_op directly when upscale_in_train is true.
test=develop

* Improve the debug string, adding the print of op_desc information.

* Fix the case when dropout's input x is reused as the next op's output.

* Add the pass to inference.
test=develop

* Change the log level.
test=develop

* Add unittest for inplace case.

* Add comment to explain the pass.

* Apply the pass for CPU inference.
test=develop

* Fix the typo.
test=develop

* Add the check of AttrType.
test=develop

fcec365d
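
The rewrite described in this commit rests on how dropout behaves at inference time. The following is a minimal sketch in plain Python, not Paddle code: the helper names `dropout_infer` and `scale` are hypothetical, and the `downgrade_in_infer` mode name is assumed from Paddle's documented dropout implementations. It shows that with `is_test` set, dropout reduces either to a constant scale or to an identity:

```python
def dropout_infer(x, dropout_prob, implementation="downgrade_in_infer"):
    """Reference behaviour of dropout when is_test is true (illustrative only)."""
    if implementation == "upscale_in_train":
        # Scaling was already applied during training, so inference is identity.
        return list(x)
    # downgrade_in_infer: scale every element by the keep probability.
    return [v * (1.0 - dropout_prob) for v in x]


def scale(x, factor):
    """Hypothetical stand-in for scale_op: multiply every element by a constant."""
    return [v * factor for v in x]


x = [1.0, 2.0, 4.0]
# In downgrade_in_infer mode, dropout matches a scale with factor 1 - p.
assert dropout_infer(x, 0.25) == scale(x, 0.75)
# In upscale_in_train mode, the op is a no-op and can be deleted outright.
assert dropout_infer(x, 0.25, "upscale_in_train") == x
```

A graph pass can therefore substitute the cheaper op, or drop the node entirely, without changing inference results.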

22 Aug, 2019 1 commit

add local user data conversion into full_pascalvoc_test_preprocess.py (#19283) · 9240e532

Committed by lidanqing on Aug 22, 2019

* add local user data conversion into full_pascalvoc_test_preprocess.py
test=develop

* change PADDLE_ENFORCE to PADDLE_ENFORCE_GE
test=develop

* change according to reviews
test=develop

9240e532

21 Aug, 2019 1 commit

Add generalized Conv+Activation MKLDNN fuse pass creation Part2 (#19237) · 97d1db18

Committed by Adam on Aug 21, 2019

* Add generalized Conv+Activation MKLDNN fuse pass creation Part2
test=develop

* Undefined behaviour of GetAttrIfExists<> FIX
test=develop

97d1db18

19 Aug, 2019 2 commits
- Fix BUG: Mask RCNN inference diff When using AnalysisPredictor. (#19213) · 76c95af0
  Committed by Zhaolong Xing on Aug 19, 2019
```
* fix mask rcnn bug:
1. affine channel fuse (diff)
2. condition block op (memory leak)
3. merge lod tensor op (diff)
4. memory optim (diff)
test=develop

* fix ci about PADDLE_ENFORCE
fix merge lod infer op ut
test=develop
```
  76c95af0
- merge develop to solve conflict, also fix API doc, test=develop (#18823) · 5b6673c4
  Committed by Zeng Jinle on Aug 19, 2019
  
  5b6673c4
15 Aug, 2019 1 commit
- Add generalized Conv+Activation MKLDNN fuse pass creation (#19072) · b837689e
  Committed by Adam on Aug 15, 2019
```
test=develop
```
  b837689e
12 Aug, 2019 1 commit
- add tensorrt support for windows (#19084) · 80b7ef6f
  Committed by wopeizl on Aug 12, 2019
```
* add tensorrt support for windows
```
  80b7ef6f
09 Aug, 2019 1 commit
- inference_shared_library support profile (#16275) · 741ce8bb
  Committed by Tao Luo on Aug 09, 2019
```
test=develop
```
  741ce8bb
05 Aug, 2019 1 commit
- test=develop,Synchronize the contents of develop with release1.5 (#18937) · fd3b666d
  Committed by silingtong123 on Aug 05, 2019
```
Fix the third-party openblas dependency for paddle on windows
```
  fd3b666d
02 Aug, 2019 1 commit

Fusion: seqpool_cvm_concat (#18471) · ee2f296e

Committed by 石晓伟 on Aug 02, 2019

* add fusion_seqpool_cvm_concat test=develop

* simplify pass, test=develop

* fix code style, test=develop

ee2f296e

31 Jul, 2019 2 commits

fix several security bugs reported by security team (#18831) · 0d996908

Committed by liuwei1031 on Jul 31, 2019

* fix security issue, test=develop

* bug fix, test=develop

* throw an exception when null pointer data with non-zero length PaddleBuf is passed, test=develop

0d996908
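
The last item in this commit describes a defensive check on buffers passed to the predictor. As a rough sketch of that kind of validation, the class below is a hypothetical Python stand-in for a (data, length) pair such as `PaddleBuf`. It is not the real C++ class, and the exception type and message are assumptions:

```python
class Buffer:
    """Hypothetical stand-in for a (data, length) buffer; not the Paddle API."""

    def __init__(self, data, length):
        # A missing payload is only valid when the declared length is zero.
        if data is None and length != 0:
            raise ValueError("null data pointer with non-zero length")
        self.data = data
        self.length = length


# An empty buffer is accepted; an inconsistent one is rejected up front.
empty = Buffer(None, 0)
try:
    Buffer(None, 16)
    rejected = False
except ValueError:
    rejected = True
```

Failing at construction time keeps later code from reading through a null pointer.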

Trt fp16 support (#18860) · 61238d31

Committed by Zhaolong Xing on Jul 31, 2019

* Fix Mask rcnn predictor
    1. refine memory optim algorithm to support the model with the block op.
    2. output diff : modify the affine channel fuse
    3. add condition_block_infer op
add interface for setting trt calib table dir
test=develop

* add the missing files.
test=develop

* 1 add trt fp16 support
test=develop

61238d31

24 Jul, 2019 1 commit

Update trt5 for paddle-trt (#18645) · 26ae6d49

Committed by Zhaolong Xing on Jul 24, 2019

* update paddle-trt for:
    1. fix bug: core dump in split plugin when batch > 2.
    2. add leaky_relu trt5.0 support (yolov3 from 65ms to 42ms.)
    3. add new attr to dropout.
    4. shuffle channel, swish, relu6 support
    test=develop

* 1. fix ci
test=develop

26ae6d49

17 Jul, 2019 1 commit

Fix Bitmain Predictor::Clone() (#18599) · 25d80791

Committed by 石晓伟 on Jul 17, 2019

* update anakin-engine interfaces for content-dnn

test=develop

* support only-gpu mode of Anakin

modify eltwise parse

test=develop

* modification for thread-safe

test=develop

* Integrated template instance

test=develop

* increase template parameters

test=develop

* support MLU predictor

test=develop

* update anakin cmake files

test=develop

* update TargetWrapper::set_device

* update the initialization of anakin subgraph

test=develop

* use the default constructor of base class

test=develop

* load model from buffer with length

test=develop

* modify the access level of class

test=develop

* support anakin for bitmain arch

test=develop

* remove files

* checkout cmakelists

test=develop

* modify interfaces

test=develop

* add cmake dependencies

test=develop

* enforce the outputs of net

test=develop

25d80791

11 Jul, 2019 1 commit

add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy (#18580) · 076f8331

Committed by Tao Luo on Jul 11, 2019

* add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy

test=develop

* enhance MkldnnPostReset

test=develop

* add comments for mkldnn_cache_capacity field

test=develop

076f8331

09 Jul, 2019 1 commit

Fix/gcc 4.8 ubt link error (#18558) · 667f88f9

Committed by Jiabin Yang on Jul 09, 2019

* test=develop, fix docker with paddle nccl problem

* test=develop, fix/gcc_4.8_ubt_link_error

* test=develop, fix code format

667f88f9

08 Jul, 2019 2 commits

Inference: fix mask rcnn model diff, optim memory usage, memory leak. (#18532) · 88b52a27

Committed by Zhaolong Xing on Jul 08, 2019

* Fix Mask rcnn predictor
    1. refine memory optim algorithm to support the model with the block op.
    2. output diff : modify the affine channel fuse
    3. add condition_block_infer op
add interface for setting trt calib table dir
test=develop

* add the missing files.
test=develop

88b52a27

Support Bitmain Anakin (#18542) · 15291548

Committed by 石晓伟 on Jul 08, 2019

* update anakin-engine interfaces for content-dnn

test=develop

* support only-gpu mode of Anakin

modify eltwise parse

test=develop

* modification for thread-safe

test=develop

* Integrated template instance

test=develop

* increase template parameters

test=develop

* support MLU predictor

test=develop

* update anakin cmake files

test=develop

* update TargetWrapper::set_device

* update the initialization of anakin subgraph

test=develop

* use the default constructor of base class

test=develop

* load model from buffer with length

test=develop

* modify the access level of class

test=develop

* support anakin for bitmain arch

test=develop

* remove files

* checkout cmakelists

test=develop

15291548

03 Jul, 2019 1 commit
- Remove the obsolete cmake options (#18481) · 047bba85
  Committed by 石晓伟 on Jul 03, 2019
```
* remove the obsolete cmake options, test=develop

* remove unittests, test=develop
```
  047bba85
02 Jul, 2019 1 commit
- remove unused AnalysisPredictor::SetMkldnnThreadID() (#18444) · 3123d187
  Committed by Tao Luo on Jul 02, 2019
```
test=develop
```
  3123d187
01 Jul, 2019 1 commit

Fix Pooling output scale (#18186) · 7023a86c

Committed by Michał Gallus on Jul 01, 2019

* Int8: Fix Pooling output scale

test=develop

* Update scales quantization for certain operators

These include: concat, transpose, pool and reshape. test=develop

* Move concat minimum scale finding to quantizer

test=develop

7023a86c
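
The "concat minimum scale" item can be illustrated with a small sketch. It assumes, purely for illustration, a symmetric INT8 scheme in which a tensor's scale is `127 / max|x|`; the helper names are hypothetical and are not taken from the Paddle quantizer. Because a concatenated output must share one scale, taking the minimum over the inputs keeps the widest-ranging input from overflowing:

```python
def tensor_scale(values, max_int=127.0):
    """Assumed symmetric scale: map the largest magnitude onto max_int."""
    return max_int / max(abs(v) for v in values)


def concat_scale(inputs):
    """One shared scale for all concat inputs: the smallest per-input scale."""
    return min(tensor_scale(t) for t in inputs)


def quantize(values, scale_factor, max_int=127):
    """Round to integers and clamp to the symmetric INT8 range."""
    return [max(-max_int, min(max_int, round(v * scale_factor))) for v in values]


a = [0.5, -1.0]   # narrow range, large individual scale
b = [2.0, 4.0]    # wide range, small individual scale
shared = concat_scale([a, b])
quantized = quantize(a + b, shared)
# With the shared (minimum) scale, no value exceeds the INT8 range.
assert all(abs(q) <= 127 for q in quantized)
```

Using the larger of the two scales instead would push the values of `b` past 127 and force clamping.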

27 Jun, 2019 2 commits
- Reset DeviceContext after quantization warmup (#18182) · 84096932
  Committed by Michał Gallus on Jun 27, 2019
```
test=develop
```
  84096932
- add int8 mkldnn prior_box (#17242) · 9252e8fa
  Committed by Sylwester Fraczek on Jun 27, 2019
```
add prior_box quantization code

add scale algo rules for prior box

test=develop
```
  9252e8fa
21 Jun, 2019 1 commit
- fix package generation for inference test=develop (#18220) · daa32d53
  Committed by wopeizl on Jun 21, 2019
  
  daa32d53
19 Jun, 2019 1 commit

fix spelling errors (#17941) · 802ea509

Committed by 翟飞跃 on Jun 19, 2019

* fix spelling errors; test=develop

* Update API.spec

update md5

* Update API.spec

* change the order of api;test=develop

802ea509

12 Jun, 2019 1 commit
- modify the access level of anakin engine (#18015) · 04ea7cb0
  Committed by 石晓伟 on Jun 12, 2019
```
test=develop
```
  04ea7cb0
11 Jun, 2019 1 commit

Update the Anakin interfaces for content-dnn and MLU (#17890) · bce259e5

Committed by 石晓伟 on Jun 11, 2019

* update anakin-engine interfaces for content-dnn

test=develop

* support only-gpu mode of Anakin

modify eltwise parse

test=develop

* modification for thread-safe

test=develop

* Integrated template instance

test=develop

* increase template parameters

test=develop

* support MLU predictor

test=develop

* update anakin cmake files

test=develop

* update TargetWrapper::set_device

* update the initialization of anakin subgraph

test=develop

* use the default constructor of base class

test=develop

bce259e5

06 Jun, 2019 3 commits

update the initialization of anakin subgraph (#17880) · d008260f
Committed by 石晓伟 on Jun 06, 2019
```
test=develop
```
d008260f
fix: high RAM usage when loading the model from memory (#17788) · ae576f3c
Committed by Zhaolong Xing on Jun 06, 2019
```
test=develop
```
ae576f3c

INT8 MKL-DNN v2 integrate to slim (#17634) · 993c703b

Committed by 翟飞跃 on Jun 06, 2019

* refactor PR 16865

* delete mergetool files

* test=develop

* test=develop

* test=develop

* test=develop

* create dir for int8 model before call SaveOptimModel

* test=develop

* mkldnn int8 only support linux; test=develop

* refine code; test=develop

* remove comment; test=develop

* refine code; test=develop

* fix bug; test=develop

* add exception for mkldnn_post_training_strategy

* reuse int8v2 CAPI dataset; test=develop

* fix accuracy check bug; test=develop

* remove tab

* convert files to unix format

* test=develop

* reduce CI time;test=develop

* reduce CI time and refine code;test=develop

* refine comment; test=develop

* add cmake FLAGS;test=develop

* remove predict_num;test=develop

993c703b

03 Jun, 2019 1 commit
- make omp thread num default 1 after inference run (#17801) · e089e454
  Committed by Tao Luo on Jun 03, 2019
```
test=develop
```
  e089e454
29 May, 2019 1 commit
- Capi for a ngraph engine (#17037) · 5eb81fe5
  Committed by mozga-intel on May 28, 2019
  
  5eb81fe5
28 May, 2019 2 commits

Improve mobilenetv2 INT8 performance by using INT8 relu as post-op (#17570) · 04b6c29e

Committed by lidanqing on May 28, 2019

* add INT8 conv+relu6 fuse and enable mobilenetv2 INT8 test
test=develop

* change false and 0.0 to fuse_brelu and brelu_threshold
test=develop

change the "fuse_relu||fuse_brelu" to "unsigned_output"
test=develop

* Use relu instead of brelu as INT8 post-op because INT8 brelu is not enabled in mkldnn v0.18
test=develop

* continuous-integration fix
test=develop

04b6c29e

[MKL-DNN] conv_transpose mkldnn bias pass (#17644) · 6d8075ec

Committed by Jacek Czaja on May 28, 2019

* - changes to graph detector

- Changes to pass

- Added ut for new pass

- use_pass

- Added pass to mkldnn passes

- fix to registration

- improved verbose messaging for conv bias passes

- Lint fixes

test=develop

* - Lint fixes

test=develop

6d8075ec

27 May, 2019 3 commits

add Concat quantization (#17448) · 96845d21

Committed by Sylwester Fraczek on May 27, 2019

* add Concat quantization
add unit test for quantizing concat
fix for wrong value when the input is not in map of calculated scales
add use_quantizer to concat_op.cc
add scale_algo rules for concat

test=develop

* missing fix for multiple inputs quantize-squash

* wojtuss review fix: adding comment

test=develop

96845d21

Fix the bug in the AnalysisPredictor and add more directions about io APIs. (#17639) · 8bd651b7

Committed by Zhen Wang on May 27, 2019

* fix the bug that sub_scope_ may be null in AnalysisPredictor::Run.

* add more directions about io APIs' docs.

* update the API.spec. test=develop test=document_preview

8bd651b7

Code clean of Allocator (#17602) · 4aa931dd

Committed by Zeng Jinle on May 27, 2019

* Revert "Revert "Fix allocator bug""

This reverts commit 174d0d0b.

* Revert "fix travis ci"

This reverts commit 5656fa9f.

test=develop

* add inlined_vector.h, test=develop

* add inlined_vector_test,test=develop

* clean code of allocator,test=develop

* delete zero_size_allocator.h,test=develop

* fix failed unittest,test=develop

4aa931dd

25 May, 2019 1 commit

TRT: Support set dynamic range in int8 mode. (#17524) · 61221ebc

Committed by Zhaolong Xing on May 25, 2019

* fluid int8 train and trt int8 predict align.
trt int8 predict init
op converter

* 2. align fluid int8 train and trt int8 inference.
enhance quant dequant fuse pass
enhance op converter, trt engine, trt engine op, trt subgraph pass.

* 3. add delete_quant_dequant_pass for trt

test=develop

* 4. add the missing file
test=develop

* 5. I modified the C++ interface but forgot to update the pybind code
fix the IS_TRT_VERSION_GE bug, and fix elementwise op converter
test=develop

61221ebc

24 May, 2019 2 commits

[MKL-DNN] Add Fully Connected Op for inference only (#15226) · 0c39b97b

Committed by Michał Gallus on May 24, 2019

* fuse mul and elementwise add to fc

* Reimplement the FC forward operator

* Fix FC MKLDNN integration by transposing weights

* Add FC MKLDNN Pass

test=develop

* FC MKLDNN Pass: change memcpy to std::copy

* Fix MKLDNN FC handling of mismatch input and weights dims

* Lower tolerance for MKL-DNN in resnet50 test

test=develop

* Adjust FC to support MKLDNN Op placement

test=develop

* Adjust Placement Op to set use_mkldnn attribute for graph

test=develop

* MKLDNN FC: fix weights format so that gemm version is called

test=develop

* FC MKLDNN: Remove tolerance decrease from tester_helper

* FC MKL-DNN: Refactor the code, change input reorder to weight reorder

* MKL-DNN FC: Introduce operator caching

test=develop

* FC MKL-DNN: Fix the tensor type in ExpectedKernelType

test=develop

* FC MKL-DNN: fix style changes

test=develop

* FC MKL-DNN: fallback to native on non-supported dim sizes

test=develop

* FC MKLDNN: fix CMake paths

test=develop

* FC MKLDNN: Refine placement pass graph mkldnn attribute

test=develop

* Fix Transpiler error for fuse_conv_eltwise

test=develop

* Fix missing STL includes in files

test=develop

* FC MKL-DNN: Enable new output size computation

Also, refine pass to comply with newest interface.
test=develop

* FC MKL-DNN: enable only when fc_mkldnn_pass is enabled

* FC MKL-DNN: Allow Weights to use oi or io format

* FC MKL-DNN: Adjust UT to work with correct dims

test=develop

* Enable MKL DEBUG for resnet50 analyzer

test=develop

* FC MKL-DNN: Improve Hashing function

test=develop

* FC MKL-DNN: Fix shape for fc weights in transpiler

* FC MKL-DNN: Update input pointer in re-used fc primitive

* Add log for not handling fc fuse for unsupported dims

test=develop

* FC MKL-DNN: Move transpose from pass to Op Kernel

test=develop

* FC MKL-DNN: Disable transpose in unit test

test=develop

* FC MKL-DNN: Remove fc_mkldnn_pass from default list

* Correct Flag for fake data analyzer tests

test=develop

* FC MKL-DNN: Add comment about fc mkldnn pass disablement

test=develop

* FC MKL-DNN: Disable fc in int8 tests

test=develop

0c39b97b
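
The first item of this commit, fusing `mul` and `elementwise_add` into `fc`, relies on the two-op sequence and the fused op computing the same result. The sketch below uses plain Python with nested lists; the function names mirror the op names for readability but are illustrative stand-ins, not the Paddle kernels:

```python
def mul(x, w):
    """Matrix product of x (m x k) and w (k x n), both nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def elementwise_add(y, bias):
    """Add a bias vector to every row of y."""
    return [[v + b for v, b in zip(row, bias)] for row in y]


def fc(x, w, bias):
    """Fused form: compute x @ w + bias in a single pass per output element."""
    return [
        [sum(a * b for a, b in zip(row, col)) + bj for col, bj in zip(zip(*w), bias)]
        for row in x
    ]


x = [[1.0, 2.0]]
w = [[1.0, 2.0], [3.0, 4.0]]
bias = [10.0, 20.0]
# The unfused two-op sequence and the fused op agree.
assert elementwise_add(mul(x, w), bias) == fc(x, w, bias) == [[17.0, 30.0]]
```

Fusing removes one intermediate tensor and one kernel launch per occurrence, which is what makes the pattern worth matching in a graph pass.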

Conv concat relu quantization (#17466) · 5b2a3c4b

Committed by Sylwester Fraczek on May 24, 2019

* add conv_concat_relu fuse

test=develop

* add test code

test=develop

* added missing include with unordered_map

test=develop

* review fixes for wojtuss

test=develop

* remove 'should (not) be fused' comment statements

one of them was invalid anyway

test=develop

5b2a3c4b
