Commits · 0a51098a715fd648d8ff9cf95f48f5db6ba3ebf9 · Crayon鑫 / Paddle

06 Jan, 2020 1 commit

Add TRT support for BERT (#21135) · 0a51098a

Committed by Pei Yang on Jan 06, 2020

* add gelu plugin

* align trt bert with gpu

* add support for fused fc with relu

* add unittest for bert trt

0a51098a

30 Dec, 2019 1 commit

Modify demo_ci to support Windows, prepare for PR_Windows_Inference (#21873) · e66f92d1

Committed by zhouwei25 on Dec 30, 2019
  
e66f92d1

27 Dec, 2019 1 commit

fix multi-thread error of fc_gru_fuse_pass.cc, test=develop (#21841) · 03479469

Committed by 石晓伟 on Dec 27, 2019
```
* fix multi-thread error of fc_gru_fuse_pass.cc, test=develop

* export FLAGS and GLOG symbols, test=develop
```
03479469

20 Dec, 2019 1 commit

Disable memory opt pass when DNNL is on (#21826) · 253e6642

Committed by Michał Gallus on Dec 20, 2019

* Disable memory opt pass when DNNL is on

* Refine comment above mem optimization pass enablement

test=develop

253e6642

16 Dec, 2019 1 commit

fix analysis_predictor when func is called multiple times, test=release/1.6 (#21665) · 2bb13582

Committed by 石晓伟 on Dec 16, 2019
  
2bb13582

12 Dec, 2019 1 commit

Add reshape int8 mkldnn op (#21428) · d419b859

Committed by joanna.wozna.intel on Dec 12, 2019

* Add reshape int8 op

test=develop

* Change test to CPUPlace

test=develop

* Correct tests

test=develop

d419b859

11 Dec, 2019 1 commit

there is a bug for inference using auto growth allocator (#21621) · fbbd94a6

Committed by Zhaolong Xing on Dec 11, 2019
```
test=develop
```
fbbd94a6

10 Dec, 2019 1 commit

fix: fail to call ZeroCopyTensor::mutable_data() when device_id is no… (#21461) · 7f5d532a

Committed by rensilin on Dec 10, 2019
```
* ZeroCopyTensor::mutable_data in the right device, test=develop

* add unittest for zerocopy, test=develop
```
7f5d532a

05 Dec, 2019 1 commit

fix glog warning, test=develop (#21573) · 20d61414

Committed by Pei Yang on Dec 05, 2019
  
20d61414

04 Dec, 2019 1 commit

make config option DisableGlogInfo() able to mute all inference logs (#21318) · 122b37ce

Committed by Pei Yang on Dec 04, 2019
```
* make DisableGlogInfo able to mute all logs in inference. 
```
122b37ce

03 Dec, 2019 1 commit

specify the auto growth allocator for inference. (#21448) · b39c0116

Committed by Zhaolong Xing on Dec 03, 2019
```
test=develop
```
b39c0116

02 Dec, 2019 1 commit

Fix transpose conv (#21406) · 37f3e56d

Committed by Lv Mengsi on Dec 02, 2019
```
* fix transpose conv,test=develop

* fix comments
test=develop
```
37f3e56d

27 Nov, 2019 2 commits

fix C++ multicard inference bug. (#20955) · d1a6e112

Committed by Zhaolong Xing on Nov 27, 2019
```
test=develop
```
d1a6e112

INT8 Fully-connected (#17641) · 5d7d5482

Committed by Michał Gallus on Nov 27, 2019

* Implement Int8 FC

* Integrate FC into INT8v2

test=develop

* int8 FC: transpose weights before computing scales

test=develop

* Add support for activation_type string in FC

test=develop

* Disable MKL-DNN's FC in VGG16 and 19

test=develop

* Disable FC quantization when mkldnn FC is disabled

test=develop

* Solve PADDLE_ENFORCES in FC int8

* Fix Paddle enforces and remove const cast

test=develop

* Fix style changes

test=develop

* Fix quantizer_tester test and add fc quantization

test=develop

* Fix FC test fail on CUDA

* Remove unnecessary log from quantize placement pass

test=develop

* Add Thread ID to FC hash key

test=develop

* Add comments to MKL-DNN FC Kernel

test=develop

* Refactor quantizer

test=develop

* Fix linter issues

test=develop

* Fix crash in slim googlenet

test=develop

* Fix PADDLE_ENFORCE messages

test=develop

5d7d5482

26 Nov, 2019 1 commit

add prediction demo and script on windows (#21248) · 45c1e7bb

Committed by silingtong123 on Nov 26, 2019
  
45c1e7bb

19 Nov, 2019 1 commit

Determine whether to copy and link inference lib by ON_INFER (#20931) · c0dcb090

Committed by zhouwei25 on Nov 19, 2019
  
c0dcb090

18 Nov, 2019 1 commit

TRT int8: refine trt int8 for dynamic range set (#21112) · 65f70525

Committed by Zhaolong Xing on Nov 18, 2019
```
* refine trt int8 for dynamic range set
test=develop

* refine trt int8
test=develop
```
65f70525

08 Nov, 2019 1 commit

Add transpose2 INT8 for mkl-dnn (#19424) · 77c20835

Committed by joanna.wozna.intel on Nov 08, 2019

* Add transpose2 INT8 for mkl-dnn

test=develop

* Fix test_transpose_int8_mkldnn

test=develop

* Revert "Merge branch 'develop' into transpose_int8_mkldnn_2"

This reverts commit 34011bdb, reversing
changes made to 2ce6473f.

* Revert "Revert "Merge branch 'develop' into transpose_int8_mkldnn_2""

This reverts commit 23754dd7.

* Add template to TransposeMKLDNNHandler

test=develop

* Resolve conflict

test=develop

* Restore get_size and refactor

test=develop

77c20835

23 Oct, 2019 2 commits

Bug Fix: Paddle-TRT cannot handle adaptive pooling in pool2d op converter and... · e89c16b9

Committed by Pei Yang on Oct 23, 2019
```
Bug Fix: Paddle-TRT cannot handle adaptive pooling in pool2d op converter and "num" attribute in split op converter (#20733)

* fix pool2d trt converter, test=develop

* add fix for split op converter, test=develop
```
e89c16b9

optimize version error, test=develop (#20715) · e742760f

Committed by 石晓伟 on Oct 23, 2019
  
e742760f

18 Oct, 2019 1 commit

Ensure backward compatibility with the anakin interface, test=develop (#20691) · d8f4f423

Committed by 石晓伟 on Oct 18, 2019
```
* support MLU nums, test=develop

* change anakin apis, test=develop
```
d8f4f423

14 Oct, 2019 1 commit

add DisableGlogInfo() to AnalysisConfig, test=develop (#20581) · 443f604c

Committed by Pei Yang on Oct 14, 2019
  
443f604c

13 Oct, 2019 1 commit

Add Multihead matmul fuse pass (#20167) · b8333ede

Committed by zhaoyuchen2018 on Oct 13, 2019

* Add multihead fuse pass for ernie opt

* Refine softmax

test=develop

* Refine cuda kernel

* Refine cuda version

* Refine cmake

test=develop

* refine header file

* refine test case and pass
* refine comments

b8333ede

12 Oct, 2019 1 commit

Add ConvTranspose + BatchNorm fuse pass (#20161) · 7faa3e95

Committed by Adam on Oct 12, 2019

* Add ConvTranspose + BatchNorm fuse pass
test=develop

* Add tests for conv+bn and conv_transpose+bn passes
test=develop

7faa3e95
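
For context on the commit above: folding a batch-norm layer into the preceding convolution (or transposed convolution) rescales each output channel's weights and bias so that no separate normalization step is needed at inference time. The sketch below is not Paddle's implementation. The function names and the flattened per-output-channel weight layout are assumptions made for illustration, and a real conv_transpose filter keeps its output channels on a different axis.

```python
import math


def fold_batch_norm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold batch-norm parameters into per-output-channel weights and bias.

    weights[c] is the flattened filter of output channel c (hypothetical layout).
    """
    folded_w, folded_b = [], []
    for w_c, b_c, g, bt, m, v in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)  # per-channel multiplier
        folded_w.append([w * scale for w in w_c])
        folded_b.append((b_c - m) * scale + bt)
    return folded_w, folded_b


def linear(weights, bias, x):
    """Stand-in for a convolution at one spatial position: dot product plus bias."""
    return [sum(w * xi for w, xi in zip(w_c, x)) + b_c
            for w_c, b_c in zip(weights, bias)]


def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch norm applied per channel."""
    return [g * (yc - m) / math.sqrt(v + eps) + bt
            for yc, g, bt, m, v in zip(y, gamma, beta, mean, var)]
```

Applying `linear` with the folded parameters should give the same result as `linear` followed by `batch_norm` with the original ones, up to floating-point rounding.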

10 Oct, 2019 1 commit

fix analysis_predictor ci, test=release/1.6 (#20141) · 2c28e328

Committed by 石晓伟 on Oct 10, 2019
  
2c28e328

27 Sep, 2019 1 commit

update operator compatible info, test=develop (#19978) · 01b9d079

Committed by 石晓伟 on Sep 27, 2019

* update operator compatible info, test=develop

* revert cmake/version.cmake, test=develop

* add unit_tests and fix bugs, test=develop

* update ../paddle/fluid/framework/framework.proto, test=develop

* fix bug of paddle/fluid/inference/api/analysis_predictor.cc, test=develop

* update paddle/fluid/framework/version_test.cc, test=develop

* add comments and rename interfaces, test=develop

01b9d079

25 Sep, 2019 1 commit

Fix C++ inference BUG: When open memory optim and enable trt subgraph at the... · e89b1288

Committed by Zhaolong Xing on Sep 25, 2019

Fix C++ inference BUG: When open memory optim and enable trt subgraph at the same time, there is a bug (#19969)

* fix memory optimization type
test=develop

* 1. fix BUG: open trt and memory optim will trigger bug.
2. Clean memory optim bug.
test=develop

e89b1288

19 Sep, 2019 1 commit

Add a pass to fuse fc+elementwise_add+layernorm (#19776) · 3cd985a6

Committed by Yiqun Liu on Sep 19, 2019

* Add fc_elementwise_layernorm_fuse pass and unittest.

* Add fused_fc_elementwise_layernorm op and its GPU kernel.
test=develop

* Apply fc_elementwise_layernorm_fuse_pass to GPU inference.

* Add the setting of attrs in the definition of binary_op.
test=develop

* Add comment.

* Implement the unittest.
test=develop

* Change the unittest name of layer_norm.
test=develop

3cd985a6
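
For context on the commit above: the fused pattern computes `layer_norm(fc(x) + y)` in one operator instead of three. The sketch below illustrates the arithmetic for a single row in plain Python. It is not the GPU kernel from the commit; the Python function names, argument order, and the choice to gather the mean and variance in the same loop as the matrix product are assumptions made for illustration.

```python
import math


def fc(x, w, b):
    """Fully connected layer for one row: x has K values, w is K x N, b has N values."""
    return [sum(x[k] * w[k][n] for k in range(len(x))) + b[n]
            for n in range(len(b))]


def layer_norm(v, scale, bias, eps=1e-5):
    """Layer normalization over one row."""
    mean = sum(v) / len(v)
    var = sum((e - mean) ** 2 for e in v) / len(v)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(e - mean) * inv_std * s + bb for e, s, bb in zip(v, scale, bias)]


def fused_fc_elementwise_layernorm(x, w, b, y, scale, bias, eps=1e-5):
    """Compute layer_norm(fc(x) + y), accumulating statistics while producing fc(x) + y."""
    n_out = len(b)
    summed, total, total_sq = [], 0.0, 0.0
    for n in range(n_out):
        val = sum(x[k] * w[k][n] for k in range(len(x))) + b[n] + y[n]
        summed.append(val)
        total += val
        total_sq += val * val
    mean = total / n_out
    var = total_sq / n_out - mean * mean  # E[v^2] - E[v]^2
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std * s + bb
            for v, s, bb in zip(summed, scale, bias)]
```

The fused function should match the unfused composition `layer_norm([a + c for a, c in zip(fc(x, w, b), y)], scale, bias)` up to floating-point rounding.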

18 Sep, 2019 1 commit

support MLU nums, test=develop (#19372) · 71b2ed61

Committed by 石晓伟 on Sep 18, 2019
  
71b2ed61

17 Sep, 2019 1 commit

zerocopytensor support uint8, analysis config support profile, analysis... · 9cbc1eff

Committed by Pei Yang on Sep 17, 2019

zerocopytensor support uint8, analysis config support profile, analysis predictor support GetInputTensorShape, test=develop (#19822)

9cbc1eff

11 Sep, 2019 1 commit

Implement the GPU kernel of fc operator (#19687) · a65c728e

Committed by Yiqun Liu on Sep 11, 2019

* Refine the codes related to fc op.

* Add GPU implementation for fc functor.

* Apply fc_fuse_pass in GPU inference.
test=develop

* Change the cmake for fc op.

* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.

* Add an attribute to set the activation type in fc_op.

* Enhance the unittest of fc_op.
test=develop

* Move the declaration of FCOpGrad back to the header file.
test=develop

* Set default value for newly added arguments in test_fc_op.
test=develop

a65c728e

09 Sep, 2019 1 commit

paddle::framework::vectorize() templatization [PART3] (#19643) · f05d2c51

Committed by Tao Luo on Sep 09, 2019

* paddle::framework::vectorize() templatization

test=develop

* update pybind/imperative.cc

test=develop

* revert update on unsqueeze_op.cc and warpctc_cudnn_op.cu.cc

test=develop

f05d2c51

03 Sep, 2019 1 commit

Add a pass to enable the use of cudnn (#19346) · c5548178

Committed by Yiqun Liu on Sep 03, 2019

* Add an interface to enable cudnn for inference.

* Add cudnn_placement_pass.
test=develop

* Set the default value of cudnn_enabled_op_types to null.
test=develop

* Write the common basic class, placement_pass_base, to refine the codes.
test=develop

* Call EnableCUDNN in unittest.
test=develop

* Refine cudnn_placement_pass tester.

* Enable the testing of cudnn_placement_pass in inference's unittest.
test=develop

* Add the check of op kernels.
test=develop

c5548178

30 Aug, 2019 2 commits

add dynamic C runtime support on windows, test=develop (#19502) · d6cb1a41

Committed by liuwei1031 on Aug 30, 2019

d6cb1a41

Add a pass to replace dropout_op with scale_op when is_test is true (#19297) · fcec365d

Committed by Yiqun Liu on Aug 30, 2019

* Add simplify_with_basic_ops_pass to replace dropout_op with scale_op when is_test is true.
test=develop

* Delete dropout_op directly when upscale_in_train is true.
test=develop

* Improve the debug string, adding the print of op_desc information.

* Fix the case when dropout's input x is reused as the next op's output.

* Add the pass to inference.
test=develop

* Change the log level.
test=develop

* Add unittest for inplace case.

* Add comment to explain the pass.

* Apply the pass for CPU inference.
test=develop

* Fix the typo.
test=develop

* Add the check of AttrType.
test=develop

fcec365d
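
For context on the commit above: when `is_test` is true, dropout no longer drops anything. With the `downgrade_in_infer` implementation it only multiplies its input by `1 - dropout_prob`, and with `upscale_in_train` it is the identity, which is why the pass can replace the op with a scale op or delete it. The sketch below models that decision in plain Python. It is not the pass from the commit; the function names and the `(op_type, attrs)` tuple used to represent an op are hypothetical.

```python
def dropout_at_inference(x, dropout_prob, implementation):
    """Reference behaviour of dropout when is_test is true."""
    if implementation == "downgrade_in_infer":
        return [v * (1.0 - dropout_prob) for v in x]
    if implementation == "upscale_in_train":
        return list(x)  # identity at inference time
    raise ValueError("unknown dropout implementation: " + implementation)


def simplify_dropout(dropout_prob, implementation):
    """Return the replacement op as (op_type, attrs), or None if dropout can be deleted."""
    if implementation == "upscale_in_train":
        return None
    if implementation == "downgrade_in_infer":
        return ("scale", {"scale": 1.0 - dropout_prob, "bias": 0.0})
    raise ValueError("unknown dropout implementation: " + implementation)


def run(op, x):
    """Evaluate the simplified op (hypothetical mini-interpreter for this sketch)."""
    if op is None:
        return list(x)
    op_type, attrs = op
    if op_type != "scale":
        raise ValueError("unsupported op: " + op_type)
    return [v * attrs["scale"] + attrs["bias"] for v in x]
```

In both modes, `run(simplify_dropout(p, mode), x)` should equal `dropout_at_inference(x, p, mode)`.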

22 Aug, 2019 1 commit

add local user data conversion into full_pascalvoc_test_preprocess.py (#19283) · 9240e532

Committed by lidanqing on Aug 22, 2019

* add local user data conversion into full_pascalvoc_test_preprocess.py
test=develop

* change PADDLE_ENFORCE to PADDLE_ENFORCE_GE
test=develop

* change according to reviews
test=develop

9240e532

21 Aug, 2019 1 commit

Add generalized Conv+Activation MKLDNN fuse pass creation Part2 (#19237) · 97d1db18

Committed by Adam on Aug 21, 2019

* Add generalized Conv+Activation MKLDNN fuse pass creation Part2
test=develop

* Undefined behaviour of GetAttrIfExists<> FIX
test=develop

97d1db18

19 Aug, 2019 2 commits

Fix BUG: Mask RCNN inference diff When using AnalysisPredictor. (#19213) · 76c95af0

Committed by Zhaolong Xing on Aug 19, 2019
```
* fix mask rcnn bug:
1. affine channel fuse (diff)
2. condition block op (memory leak)
3. merge lod tensor op (diff)
4. memory optim (diff)
test=develop

* fix ci about PADDLE_ENFORCE
fix merge lod infer op ut
test=develop
```
76c95af0

merge develop to solve conflict, also fix API doc, test=develop (#18823) · 5b6673c4

Committed by Zeng Jinle on Aug 19, 2019
  
5b6673c4

15 Aug, 2019 1 commit

Add generalized Conv+Activation MKLDNN fuse pass creation (#19072) · b837689e

Committed by Adam on Aug 15, 2019
```
test=develop
```
  b837689e

Crayon鑫 / Paddle is in sync with the fork source project
