Commits · c5548178b0a7dc428d545a532bf2bfcc74ffde3d · 机器未来 / Paddle

03 Sep, 2019 1 commit

Add a pass to enable the use of cudnn (#19346) · c5548178

Committed by Yiqun Liu on Sep 03, 2019

* Add an interface to enable cudnn for inference.

* Add cudnn_placement_pass.
test=develop

* Set the default value of cudnn_enabled_op_types to null.
test=develop

* Write a common base class, placement_pass_base, to refine the code.
test=develop

* Call EnableCUDNN in unittest.
test=develop

* Refine cudnn_placement_pass tester.

* Enable the testing of cudnn_placement_pass in inference's unittest.
test=develop

* Add the check of op kernels.
test=develop

c5548178
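
Note: the placement pass described in this commit can be pictured roughly as follows. This is a minimal Python sketch, not Paddle's C++ code: `Op`, `placement_pass`, and `has_kernel` are hypothetical names, and treating an empty op-type set as "all op types" is an assumption about what the null default means.

```python
from dataclasses import dataclass, field


# Hypothetical stand-in for a graph op node; not Paddle's real class.
@dataclass
class Op:
    type: str
    attrs: dict = field(default_factory=dict)


def placement_pass(ops, attr_name, enabled_op_types, has_kernel):
    """Set attrs[attr_name] = True on every op that qualifies.

    An empty enabled_op_types is treated as "all op types" (assumption).
    has_kernel(op_type) stands for the kernel check named in the commit.
    Returns the number of ops that were marked.
    """
    marked = 0
    for op in ops:
        if attr_name not in op.attrs:
            continue  # the op does not expose this switch at all
        if enabled_op_types and op.type not in enabled_op_types:
            continue  # filtered out by the user-supplied op types
        if not has_kernel(op.type):
            continue  # no kernel registered for this library
        op.attrs[attr_name] = True
        marked += 1
    return marked


ops = [
    Op("conv2d", {"use_cudnn": False}),
    Op("pool2d", {"use_cudnn": False}),
    Op("relu", {}),
]
cudnn_kernels = {"conv2d"}  # pretend only conv2d has a cuDNN kernel
count = placement_pass(ops, "use_cudnn", set(), lambda t: t in cudnn_kernels)
```

Passing the attribute name as a parameter is one way a shared base could serve several placement passes; whether placement_pass_base works this way is not stated in the commit message.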

19 Aug, 2019 1 commit

Fix BUG: Mask RCNN inference diff When using AnalysisPredictor. (#19213) · 76c95af0

Committed by Zhaolong Xing on Aug 19, 2019

* fix mask rcnn bug:
1. affine channel fuse (diff)
2. condition block op (memory leak)
3. merge lod tensor op (diff)
4. memory optim (diff)
test=develop

* fix ci about PADDLE_ENFORCE
fix merge lod infer op ut
test=develop

76c95af0

31 Jul, 2019 1 commit

Trt fp16 support (#18860) · 61238d31

Committed by Zhaolong Xing on Jul 31, 2019

* Fix Mask rcnn predictor
    1. refine memory optim algorithm to support the model with the block op.
    2. output diff : modify the affine channel fuse
    3. add condition_block_infer op
add interface for setting trt calib table dir
test=develop

* add the missing files.
test=develop

* 1 add trt fp16 support
test=develop

61238d31

11 Jul, 2019 1 commit

add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy (#18580) · 076f8331

Committed by Tao Luo on Jul 11, 2019

* add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy

test=develop

* enhance MkldnnPostReset

test=develop

* add comments for mkldnn_cache_capacity field

test=develop

076f8331
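
Note: a capacity-bounded cache of the kind this commit talks about could look like the sketch below. It is a generic Python illustration, not the MKL-DNN cache in Paddle: `ShapeCache` and `get_or_create` are hypothetical names, and evicting the oldest shape first is an assumed policy, since the commit message does not spell out the clear strategy.

```python
from collections import OrderedDict


class ShapeCache:
    """Cache keyed by input shape that keeps at most `capacity` entries.

    Hypothetical illustration; oldest-first eviction is an assumption.
    """

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()

    def get_or_create(self, shape, build):
        key = tuple(shape)
        if key in self._items:
            return self._items[key]  # cache hit: reuse the stored object
        if len(self._items) >= self.capacity:
            self._items.popitem(last=False)  # drop the oldest shape
        value = build(key)
        self._items[key] = value
        return value

    def shapes(self):
        return list(self._items)


cache = ShapeCache(capacity=2)
requests = [(1, 3, 224, 224), (1, 3, 320, 320), (1, 3, 224, 224), (1, 3, 512, 512)]
for shape in requests:
    cache.get_or_create(shape, lambda s: "primitive for %s" % (s,))
```

With a capacity of 2, the fourth request pushes out the first shape that was cached, so memory stays bounded when input shapes keep changing.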

08 Jul, 2019 1 commit

Inference: fix mask rcnn model diff, optim memory usage, memory leak. (#18532) · 88b52a27

Committed by Zhaolong Xing on Jul 08, 2019

* Fix Mask rcnn predictor
    1. refine memory optim algorithm to support the model with the block op.
    2. output diff : modify the affine channel fuse
    3. add condition_block_infer op
add interface for setting trt calib table dir
test=develop

* add the missing files.
test=develop

88b52a27

16 Jun, 2019 1 commit

reuse C-API INT8 unit test application (#18077) · c26130f3

Committed by Wojciech Uss on Jun 16, 2019

* reuse C-API INT8 unit test application

test=develop

* updates after review

test=develop

c26130f3

11 Jun, 2019 2 commits

Update the Anakin interfaces for content-dnn and MLU (#17890) · bce259e5

Committed by 石晓伟 on Jun 11, 2019

* update anakin-engine interfaces for content-dnn

test=develop

* support only-gpu mode of Anakin

modify eltwise parse

test=develop

* modification for thread-safe

test=develop

* Integrated template instance

test=develop

* increase template parameters

test=develop

* support MLU predictor

test=develop

* update anakin cmake files

test=develop

* update TargetWrapper::set_device

* update the initialization of anakin subgraph

test=develop

* use the default constructor of base class

test=develop

bce259e5

Light mem reuse strategy for inference. (#17925) · 4e8d5a03

Committed by Zhaolong Xing on Jun 11, 2019

* fix: when using the load-model-from-memory mode, RAM usage is high

test=develop

* light mem reuse
test=develop

* fix cpplint
test=develop

4e8d5a03

06 Jun, 2019 1 commit

fix: when using the load-model-from-memory mode, RAM usage is high (#17788) · ae576f3c

Committed by Zhaolong Xing on Jun 06, 2019

test=develop

ae576f3c

29 May, 2019 1 commit

Capi for a ngraph engine (#17037) · 5eb81fe5

Committed by mozga-intel on May 28, 2019

5eb81fe5

25 May, 2019 1 commit

TRT: Support set dynamic range in int8 mode. (#17524) · 61221ebc

Committed by Zhaolong Xing on May 25, 2019

* fluid int8 train and trt int8 predict align.
trt int8 predict init
op converter

* 2. align fluid int8 train and trt int8 inference.
enhance quant dequant fuse pass
enhance op converter, trt engine, trt engine op, trt subgraph pass.

* 3. add delete_quant_dequant_pass for trt

test=develop

* 4. add the missing file
test=develop

* 5. modified the C++ interface but forgot to modify the pybind code
fix the IS_TRT_VERSION_GE bug, and fix elementwise op converter
test=develop

61221ebc
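
Note: the idea of a per-tensor dynamic range in int8 mode can be illustrated with plain symmetric quantization. The Python sketch below is generic arithmetic, not TensorRT or Paddle code: `quantize_int8` and `dequantize_int8` are hypothetical helpers, and mapping the range to a scale of `dynamic_range / 127` is the usual symmetric convention, assumed here.

```python
def quantize_int8(values, dynamic_range):
    """Map floats to int8 codes using a symmetric dynamic range.

    Values outside [-dynamic_range, dynamic_range] saturate at -127 or 127.
    """
    if dynamic_range <= 0:
        raise ValueError("dynamic_range must be positive")
    scale = dynamic_range / 127.0
    codes = []
    for v in values:
        q = round(v / scale)
        codes.append(max(-127, min(127, q)))  # clamp to the int8 range used
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float values from int8 codes."""
    return [c * scale for c in codes]


# A range of 63.5 gives a scale of exactly 0.5.
codes, scale = quantize_int8([0.0, 1.2, -3.4, 100.0, -63.5], dynamic_range=63.5)
restored = dequantize_int8(codes, scale)
```

The value 100.0 lies outside the range and saturates, which is why the range taken from training has to match what the inference engine uses.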

16 May, 2019 1 commit

Add setting Scope function for the graph class (#17417) · 4a1b7fec

Committed by Zhen Wang on May 16, 2019

* add set_not_owned function for graph

* add scope set. test=develop

* add scope_ptr enforce not null before setting. test=develop

4a1b7fec

09 May, 2019 1 commit

fix: (#17279) · 7a3bb061

Committed by Zhaolong Xing on May 09, 2019

1. inference multi card occupy
2. facebox model inference occupy too much
test=develop

7a3bb061

07 May, 2019 1 commit

Cherry-pick benchmark related changes from release/1.4 (#17156) · a72dbe9a

Committed by 石晓伟 on May 07, 2019

* cherry-pick commit from 88770542

* cherry-pick commit from 3f0b97df

* cherry-pick from 16691:Anakin subgraph support yolo_v3 and faster-rcnn

(cherry picked from commit 8643dbc2)

* Cherry-Pick from 16662 : Anakin subgraph cpu support

(cherry picked from commit 7ad182e1)

* Cherry-pick from 1662, 16797.. : add anakin int8 support

(cherry picked from commit e14ab180)

* Cherry-pick from 16813 : change singleton to graph RegistBlock
test=release/1.4

(cherry picked from commit 4b9fa423)

* Cherry Pick : 16837 Support ShuffleNet and MobileNet-v2

Support ShuffleNet and MobileNet-v2, test=release/1.4

(cherry picked from commit a6fb066f)

* Cherry-pick : anakin subgraph add opt config layout argument #16846
test=release/1.4

(cherry picked from commit 8121b3ec)

* 1. add shuffle_channel_detect

(cherry picked from commit 6efdea89)

* update shuffle_channel op convert, test=release/1.4

(cherry picked from commit e4726a06)

* Modify symbol export rules

test=develop

a72dbe9a

29 Mar, 2019 2 commits

update tensorrt subgraph_util test=develop · 7b9fc710

Committed by Shixiaowei02 on Mar 29, 2019

7b9fc710

resolve conflicts with the develop branch test=develop · bddb2cd3

Committed by Shixiaowei02 on Mar 28, 2019

bddb2cd3

28 Mar, 2019 2 commits

Anakin ssd support · d065b5bf

Committed by nhzlx on Mar 28, 2019

refine trt first run
add quant dequant fuse pass
omit simplify_anakin_priorbox_detection template
omit transpose_flatten_concat_fuse template
test=develop

d065b5bf

Fix the interface of Pass::Apply (#16484) · ed61d67c

Committed by chengduo on Mar 27, 2019

* modify the interface of Pass::Apply
test=develop

* Polish code
test=develop

* Fix Travis CI
test=develop

* fix Pass::Apply interface
test=develop

* Fix Travis CI
test=develop

ed61d67c

25 Mar, 2019 1 commit

Move cpu_quantize_* passes into mkldnn subfolder · 46677fb0

Committed by Wojciech Uss on Mar 25, 2019

test=develop

46677fb0

22 Mar, 2019 1 commit

1. Add ANAKIN_ROOT compile option · f3a2e4b3

Committed by nhzlx on Mar 22, 2019

2. refine trt code
test=develop

f3a2e4b3

21 Mar, 2019 1 commit

Add enabling quantization (#16326) · cbe2dbf0

Committed by Wojciech Uss on Mar 21, 2019

* Add enabling quantization

test=develop

* remove unused (here) function

cbe2dbf0

20 Mar, 2019 6 commits

git cherry-pick from feature/anakin-engine: update anakin subgraph #16278 · 07dcf285

Committed by nhzlx on Mar 20, 2019

07dcf285

cherry-pick from feature/anakin-engine: deal with the changing shape when using anakin #16189 · a25331bc

Committed by nhzlx on Mar 20, 2019

a25331bc

cherry-pick from feature/anakin-engine: add batch interface for pd-anakin #16178 · c79f06d3

Committed by nhzlx on Mar 20, 2019

c79f06d3

cherry-pick from feature/anakin-engine: refine anakin subgraph. #16157 · 69d37f81

Committed by nhzlx on Mar 20, 2019

support change input size

69d37f81

cherry-pick from feature/anakin-engine: Anakin support facebox #16111 · a1d200a5

Committed by nhzlx on Mar 20, 2019

a1d200a5

cherry-pick from feature/anakin-engine: Add subgraph fuse support and anakin engine #16018 · b21770a2

Committed by nhzlx on Mar 20, 2019

b21770a2

19 Mar, 2019 1 commit

add allocator flags · 22715487

Committed by zhhsplendid on Mar 19, 2019

test=develop

22715487

18 Mar, 2019 1 commit

Add cpu_quantize_pass for C-API quantization (#16127) · 2579ade4

Committed by Wojciech Uss on Mar 18, 2019

* Add cpu_quantize_pass for C-API quantization

test=develop

* add cpu_quantize_pass test

* fix lint: add include memory unordered_map and unordered_set

test=develop

* fuse_relu 1

test=develop

* tuned 2 without squash

* fixes

test=develop

* remove unused vars

test=develop

* refactored

test=develop

* fix lint c-style cast -> C++ style cast

test=develop

* remove QuantMax and c style casts

test=develop

* last usage of QuantMax removed

test=develop

* Fix Analysis Predictor UT

Check if memory_optimize_pass has already been added
to the analysis config before adding a new one, so
that it is not added multiple times.
test=develop

* change map to unordered_map

fix the forgotten part of cpu_quantize_pass_tester.cc

test=develop

* removed quantized attribute

* fixed cpu_quantize_pass_tester and op attr comments

test=develop

* removed redundant line

test=debug

* removed gmock

test=develop

* fix after merge

2579ade4
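
Note: the "Fix Analysis Predictor UT" item above describes checking whether memory_optimize_pass is already present before adding it. The Python sketch below shows that check in isolation. `PassList` and `append_once` are hypothetical names, not Paddle's pass builder API; only the two pass names come from the commit message.

```python
class PassList:
    """Hypothetical ordered list of pass names; not Paddle's pass builder."""

    def __init__(self, passes=()):
        self._passes = list(passes)

    def append_once(self, name):
        """Append `name` unless it is already present.

        Returns True if the pass was added, False if it was already there.
        """
        if name in self._passes:
            return False
        self._passes.append(name)
        return True

    def names(self):
        return list(self._passes)


passes = PassList(["cpu_quantize_pass"])
first = passes.append_once("memory_optimize_pass")   # added
second = passes.append_once("memory_optimize_pass")  # skipped, already present
```

Making the append idempotent means repeated configuration calls leave the pass list unchanged, which matches the intent stated in the commit.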

08 Mar, 2019 7 commits

can not pass ci · 2891070c

Committed by nhzlx on Mar 07, 2019

add if use static engine for trt
test=develop

2891070c

fix comments and fix cpplint · 4b59646e

Committed by nhzlx on Feb 27, 2019

test=develop

4b59646e

6. delete useless predictor id · 5863c861

Committed by nhzlx on Feb 26, 2019

test=develop

5863c861

5. add static trt load model · f3d164fa

Committed by nhzlx on Feb 22, 2019

1). add static trt load model
2). fix bug: when device_id is not 0, the trt will have a bug
test=develop

f3d164fa

4. do the trt_engine optim during init. · 31008100

Committed by nhzlx on Feb 18, 2019

add simple static mode loading
test=develop

31008100

3. when running in trt mode, do not allocate memory for parameters in fluid. · 4f77248d

Committed by nhzlx on Feb 15, 2019

test=develop

4f77248d

add static model load for trt · 88c24baa

Committed by nhzlx on Feb 14, 2019

1. bind trt input and output to fluid tensors

88c24baa

07 Mar, 2019 1 commit

can not pass ci · a9ed4277

Committed by nhzlx on Mar 07, 2019

add if use static engine for trt
test=develop

a9ed4277

27 Feb, 2019 1 commit

fix comments and fix cpplint · 06a088a1

Committed by nhzlx on Feb 27, 2019

test=develop

06a088a1

26 Feb, 2019 1 commit

6. delete useless predictor id · 0ed63b21

Committed by nhzlx on Feb 26, 2019

test=develop

0ed63b21

22 Feb, 2019 1 commit

5. add static trt load model · 1d5ef7c9

Committed by nhzlx on Feb 22, 2019

1). add static trt load model
2). fix bug: when device_id is not 0, the trt will have a bug
test=develop

1d5ef7c9
