Commits · a8dfff991122ee208bd8b33010b38cccec27cf9b · 机器未来 / Paddle

02 Feb, 2021 1 commit

add DLA support: C++ && Python API (#30165) (#30810) · a8dfff99

Committed by Shang Zhizhou on Feb 02, 2021

* add dla

* add python api
Co-authored-by: shangzhizhou <root@szth-rp-fanyi-opera49.szth.baidu.com>

a8dfff99

03 Nov, 2020 1 commit

Optimize ERNIE model inference performance in TensorRT and support variable-length input (#28367) · ea851796

Committed by Shang Zhizhou on Nov 03, 2020

* fp16 result ok

* change -DWITH_NVINFER_PLUGIN to config.EnableTensorRtOSS

* auto detect special slice op converter for ernie with trt oss

* ernie oss only support fp16

* fix special_slice_plugin serialize bug

* matmul in tensorrt ok

* ernie unittest ok

* add matmul tensorrt unittest

* remove demo code

ea851796

24 Sep, 2020 1 commit

use iwyu clean include (#27267) · df43905f

Committed by wanghuancoder on Sep 24, 2020

* use iwyu clean include, test=develop, test=win

* compilation error, test=develop

* fix compilation error2, test=develop

* fix compilation error3, test=develop

* fix compilation error4, test=develop

* fix compilation error5, test=develop

* fix compilation error6, test=develop

* fix compilation error7, test=develop

* fix compilation error8, test=develop

* fix compilation error8, test=develop

* fix compilation error10, test=develop

* fix compilation error11, test=develop

df43905f

14 Sep, 2020 1 commit

refine error message related to paddle-TRT (#27256) · aae41c6f

Committed by Pei Yang on Sep 14, 2020

aae41c6f

19 Aug, 2020 1 commit

fix dy shape bug in trt7.1 (#26273) · b7a86e92

Committed by Zhaolong Xing on Aug 19, 2020

test=develop

b7a86e92

05 Aug, 2020 1 commit

Fix registering trt plugin (#25744) · b717895f

Committed by Pei Yang on Aug 05, 2020

* develop dynamic shape serialization

* add test param for gelu

* fix bugs

* delete redundant comments

* debug

* fix conflict. test=develop

* fix bug. test=develop

* add trt dynamic shape serialized support

* fix ernie serialized bug
test=develop

* fix codestyle
test=develop

* fix bug
test=develop

* fix bug.test=develop

* modify cmakelist test=develop

* fix bug
test=develop

* fix error message.  test=develop

* fix trt register plugin based on pr#25003

* add trt dynload

* fix deserialization bug of not finding plugin registration

* refine code style

* recover engine key in tensorrt_subgraph_pass

* for ci coverage

* add unittest for deserialization

Co-authored-by: haozech <chenhaoze94@gmail.com>

b717895f

15 Jun, 2020 1 commit

bugfix for unique_ptr of IOptimizationProfile (#23917) · bef4afa6

Committed by Jeng Bai-Cheng on Jun 15, 2020

This commit fixes the compilation bug regarding unique_ptr of IOptimizationProfile.

IOptimizationProfile has a protected destructor and its lifetime is managed by TensorRT
internally. The application shouldn't delete an IOptimizationProfile pointer.
See the TensorRT documentation: https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/c_api/classnvinfer1_1_1_i_builder.html#a9ac47e100454151d8206ac91d543299a
test=develop

bef4afa6

26 Mar, 2020 1 commit

[Paddle-TRT]: Ernie Dynamic shape support. (#23138) · 430b0099

Committed by Zhaolong Xing on Mar 26, 2020

* add dynamic plugin support.
test=develop

* change emb eltwise layernorm to math function
test=develop

* add emb eltwise layernorm
test=develop

* can run dynamic shape ernie
test=develop

* fix ci
test=develop

* add ut for trt ernie dynamic

test=develop

* refine dynamic shape c++ interface.
test=develop

* fix comments
test=develop

* fix comments
test=develop

430b0099

09 Mar, 2020 1 commit

[Paddle-TRT] : (Part1) Dynamic shape support (#22868) · dd67d44a

Committed by Zhaolong Xing on Mar 09, 2020

* change the ci trt from version 5. to 6.0

* paddle-trt dynamic shape support init

* conv+bias or conv+bn dynamic shape support
test=develop

* modify trt engine opconvert
test=develop

* fix ci error
test=develop

dd67d44a

05 Feb, 2020 1 commit

[Fix BUG]: Core when multi thread + clone + paddle-trt (#22442) · ceda0b9b

Committed by Zhaolong Xing on Feb 05, 2020

* add mutex for trt engine
test=develop

* add the test for copy_to_cpu
test=develop

ceda0b9b

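The fix named in this commit is a mutex around the TRT engine, for crashes seen when cloned predictors run on several threads. The general pattern can be sketched in Python as below. This is only an illustration under assumptions: `SharedEngine`, `run`, and `worker` are hypothetical names, not Paddle's classes or API, and the sketch assumes the clones share a single engine object.

```python
import threading


class SharedEngine:
    """Hypothetical stand-in for one engine shared by cloned predictors."""

    def __init__(self):
        self._lock = threading.Lock()  # serializes access to shared state
        self.calls = 0

    def run(self, x):
        # Only one thread at a time may touch the engine's internal state.
        with self._lock:
            self.calls += 1
            return x * 2


def worker(engine, results, idx):
    # Each thread plays the role of one cloned predictor using the same engine.
    results[idx] = [engine.run(i) for i in range(1000)]


engine = SharedEngine()
results = [None] * 4
threads = [
    threading.Thread(target=worker, args=(engine, results, i)) for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock in place the call counter is updated consistently regardless of how the threads interleave; without it, the read-modify-write on shared state would be a data race.
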
20 Nov, 2019 1 commit

fix trt weight bug (#21231) · 2e2f92a5

Committed by Pei Yang on Nov 20, 2019

added splitter "__" between weight name and suffix number to avoid conflicts.

2e2f92a5

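The commit message states the reasoning only briefly: when a suffix number is appended directly to a weight name, two different weights can end up with the same name. A small Python sketch of why a splitter helps follows. The weight names and both helper functions are made up for illustration and are not taken from Paddle's code.

```python
def name_without_splitter(weight, suffix):
    # Plain concatenation: the boundary between name and number is lost.
    return f"{weight}{suffix}"


def name_with_splitter(weight, suffix):
    # A "__" splitter keeps the name and the suffix number distinguishable.
    return f"{weight}__{suffix}"


# Two different (weight, suffix) pairs that collide without a splitter.
pairs = [("conv1", 11), ("conv11", 1)]

plain = {name_without_splitter(w, s) for w, s in pairs}
split = {name_with_splitter(w, s) for w, s in pairs}

print(plain)  # a single name: both pairs map to "conv111"
print(split)  # two distinct names
```
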
21 Sep, 2019 1 commit

Fix BUGS: paddle-TRT repeatedly sets weight_map and overdeletes repetitive_params (#19825) · 74812d1c

Committed by Pei Yang on Sep 21, 2019

* fix trt bugs when sharing params, test=develop

* add unittest for cascade_rcnn

74812d1c

20 Sep, 2019 1 commit

fix multi-thread exec of trt, test=develop (#19338) · d004a0f5

Committed by 石晓伟 on Sep 20, 2019

d004a0f5

12 Aug, 2019 1 commit

add tensorrt support for windows (#19084) · 80b7ef6f

Committed by wopeizl on Aug 12, 2019

* add tensorrt support for windows

80b7ef6f

31 Jul, 2019 1 commit

Trt fp16 support (#18860) · 61238d31

Committed by Zhaolong Xing on Jul 31, 2019

* Fix Mask rcnn predictor
    1. refine memory optim algorithm to support the model with the block op.
    2. output diff : modify the affine channel fuse
    3. add condition_block_infer op
add interface for setting trt calib table dir
test=develop

* add the missing files.
test=develop

* 1 add trt fp16 support
test=develop

61238d31

06 Jun, 2019 1 commit

fix: when use the load model from memory mode, the RAM occupy is high (#17788) · ae576f3c

Committed by Zhaolong Xing on Jun 06, 2019

test=develop

ae576f3c

25 May, 2019 1 commit

TRT: Support set dynamic range in int8 mode. (#17524) · 61221ebc

Committed by Zhaolong Xing on May 25, 2019

* fluid int8 train and trt int8 predict align.
trt int8 predict init
op converter

* 2. align fluid int8 train and trt int8 inference.
enhance quant dequant fuse pass
enhance op converter, trt engine, trt engine op, trt subgraph pass.

* 3. add delete_quant_dequant_pass for trt

test=develop

* 4. add the missing file
test=develop

* 5. the C++ interface was modified, but the pybind code was not updated to match
fix the IS_TRT_VERSION_GE bug, and fix elementwise op converter
test=develop

61221ebc

08 Mar, 2019 5 commits

6. delete useless predictor id · 5863c861

Committed by nhzlx on Feb 26, 2019

test=develop

5863c861

5. add static trt load model · f3d164fa

Committed by nhzlx on Feb 22, 2019

1). add static trt load model
2). fix bug: when device_id is not 0, the trt will have a bug
test=develop

f3d164fa

4. do the trt_engine optim during init. · 31008100

Committed by nhzlx on Feb 18, 2019

add simple static mode loading
test=develop

31008100

2. TRTEngine using stream only when execute. · 8c171902

Committed by nhzlx on Feb 14, 2019

8c171902

add static model load for trt · 88c24baa

Committed by nhzlx on Feb 14, 2019

1. bind trt input and output to fluid tensors

88c24baa

26 Feb, 2019 1 commit

6. delete useless predictor id · 0ed63b21

Committed by nhzlx on Feb 26, 2019

test=develop

0ed63b21

22 Feb, 2019 1 commit

5. add static trt load model · 1d5ef7c9

Committed by nhzlx on Feb 22, 2019

1). add static trt load model
2). fix bug: when device_id is not 0, the trt will have a bug
test=develop

1d5ef7c9

18 Feb, 2019 1 commit

4. do the trt_engine optim during init. · 2070fb24

Committed by nhzlx on Feb 18, 2019

add simple static mode loading
test=develop

2070fb24

14 Feb, 2019 2 commits

2. TRTEngine using stream only when execute. · 9cc6249c

Committed by nhzlx on Feb 14, 2019

9cc6249c

add static model load for trt · 034ba1c2

Committed by nhzlx on Feb 14, 2019

1. bind trt input and output to fluid tensors

034ba1c2

22 Jan, 2019 1 commit

fix trt stream bug. · ec213730

Committed by nhzlx on Jan 22, 2019

BUG: After continuing to input different data, the output cannot be aligned
test=develop

ec213730

17 Jan, 2019 1 commit

fix unit test bug · 8817841c

Committed by nhzlx on Jan 17, 2019

test=develop

8817841c

16 Jan, 2019 1 commit

add trt int8 calibration support · 312fe0ec

Committed by nhzlx on Jan 16, 2019

fix comments

test=develop

312fe0ec

09 Jan, 2019 1 commit

add trt int8 support · 4e3522e5

Committed by nhzlx on Jan 09, 2019

test=develop

4e3522e5

20 Nov, 2018 1 commit

Implement the Tensorrt plugin for elementwise op (#14487) · 8bc1c5d2

Committed by Yiqun Liu on Nov 20, 2018

* Initialize the elementwise plugin.

* Implement the basic CUDA kernel of elementwise plugin.
test=develop

8bc1c5d2

16 Nov, 2018 2 commits

Refine commit message to enable ci, test=develop · 6a7b9957

Committed by hjchen2 on Nov 16, 2018

6a7b9957

Complete PRelu plugin and Conv2d transpose op converter · 21f33b42

Committed by hjchen2 on Nov 15, 2018

21f33b42

13 Nov, 2018 2 commits

fix comments · 0b962680

Committed by nhzlx on Nov 13, 2018

test=develop

0b962680

add plugin support and offer a simple split sample · d38fd6a0

Committed by nhzlx on Nov 13, 2018

d38fd6a0

09 Nov, 2018 1 commit

Exhaustive search for cuDNN conv. (#14286) · abe20923

Committed by qingqing01 on Nov 09, 2018

* exhaustive search for cuDNN conv.
* Refine code and add unit testing.
* Fix model load in fluid/inference and unit testing in conv2d
* Follow comments.
* Fix compiling test=develop

abe20923

06 Nov, 2018 1 commit

fix comments and fix bug · 86b99ac9

Committed by nhzlx on Nov 06, 2018

86b99ac9

17 Oct, 2018 1 commit

fix googlenet bug with relu · 849a6874

Committed by nhzlx on Oct 16, 2018

849a6874

17 Aug, 2018 1 commit

1. change tensorrt op from cpu to gpu · 1600ba86

Committed by nhzlx on Aug 17, 2018

1600ba86

机器未来 / Paddle: in sync with the fork's source project