Commits · af5bb2a3d98cba4733e98b57c5bd34a7edce9767 · 机器未来 / Paddle

13 Jan, 2020 · 1 commit
- revert paddle_fluid.map, test=release/1.7 (#22227) · af5bb2a3
  Committed by 石晓伟 on Jan 13, 2020
  
  af5bb2a3
10 Jan, 2020 · 2 commits

[cherry-pick] Add FC padding, ernie test unit and layernorm parallel (#22198) · 3df38f5c

Committed by GaoWei8 on Jan 10, 2020

* Optimize the kernel implementation of layernorm with openmp (#20895)

* Add ernie c++ inference test (#21015)

* Add ernie unit test
test=develop

* Add ernie unit test
test=develop

* Add ernie unit test
test=develop

* remove ngraph

* optimize gpu test
test=develop

* optimize codes
test=develop

* fix cmake fails on inference_download_and_uncompress (#21185)

* solve cmake fails on inference_download_and_uncompress
test=develop

* solve cmake fails on inference_download_and_uncompress
test=develop

* Add fc padding to improve mkl GEMM's performance when N and K are multiple of 128. (#20972)

* Add fc padding to solve mkl performance
test=develop

* fix gpu pass and error information
test=develop

* fix fc_fuse_pass_test
test=develop

* fix error information
test=develop

* fix error information
test=develop

* fix name and add fc op padding test
test=develop

* fix attributes
test=develop

* optimize fc padding
test=develop

* fix test
test=develop

* Polish the codes of fc when needs padding (#21378)

test=develop

* Add ernie large c++ inference test (#21365)

* add ernie-large test
test=develop

* add ernie large c++ inference test
test=develop

* Modify padding strategy: remove weight copy in fc padding (#21650)

test=develop

* optimize fc jit (#21878)

test=develop
Co-authored-by: Yihua Xu <yihuaxu@hotmail.com>

3df38f5c

fix multi-thread error of fc_gru_fuse_pass.cc, test=develop (#21841) (#22185) · e8e12499
Committed by 石晓伟 on Jan 10, 2020
```
* fix multi-thread error of fc_gru_fuse_pass.cc, test=develop

* export FLAGS and GLOG symbols, test=develop
```
e8e12499

16 Dec, 2019 · 1 commit
- fix analysis_predictor when func is called multiple times, test=release/1.6 (#21663) · 70c073a0
  Committed by 石晓伟 on Dec 16, 2019
  
  70c073a0
09 Dec, 2019 · 1 commit
- Revert "CHERRY_PICK: TRT int8: refine trt int8 for dynamic range set (#21112) (#21449)" (#21619) · f7c629d9
  Committed by Zhaolong Xing on Dec 09, 2019
```
This reverts commit 0473cdb8.
```
  f7c629d9
08 Dec, 2019 · 1 commit
- CHERRY_PICK: Fix the bug for inference when using auto growth allocator (#21623) · d0943dbe
  Committed by Zhaolong Xing on Dec 08, 2019
```
test=release/1.6
```
  d0943dbe
06 Dec, 2019 · 1 commit
- fix ZeroCopyTensor::mutable_data(), test=release/1.6 (#21581) · e228e707
  Committed by 石晓伟 on Dec 06, 2019
  
  e228e707
05 Dec, 2019 · 1 commit
- cherry-pick fix muting glog warning message, test=release/1.6 (#21576) · a6433f8b
  Committed by Pei Yang on Dec 05, 2019
  
  a6433f8b
04 Dec, 2019 · 2 commits
- make config option DisableGlogInfo() able to mute all inference logs (#21544) · 857cd9f8
  Committed by Pei Yang on Dec 04, 2019
```
make config option DisableGlogInfo() able to mute all inference logs
```
  857cd9f8
- [cherry-pick] NV JETSON support and auto_growth strategy for inference. (#21500) · 20a09375
  Committed by Zhaolong Xing on Dec 04, 2019
```
* ADD NV JETSON SUPPORT
test=release/1.6

* CHERRY_PICK: specify the auto growth allocator for inference.
test=release/1.6
```
  20a09375
03 Dec, 2019 · 2 commits
- Fix transpose conv (#21406), test=release/1.6 (#21510) · 1fbc45b7
  Committed by Lv Mengsi on Dec 03, 2019
```
* fix transpose conv,test=develop

* fix comments
test=develop
```
  1fbc45b7
- revert ProgOptimUnsupported check, test=release/1.6 (#21475) · 5c7c6b1e
  Committed by 石晓伟 on Dec 03, 2019
  
  5c7c6b1e
02 Dec, 2019 · 1 commit
- CHERRY_PICK: TRT int8: refine trt int8 for dynamic range set (#21112) (#21449) · 0473cdb8
  Committed by Zhaolong Xing on Dec 02, 2019
  
  0473cdb8
29 Nov, 2019 · 2 commits
- fix trt weight bug (#21231) (#21443) · 77268831
  Committed by Pei Yang on Nov 29, 2019
```
added splitter "__" between weight name and suffix number to avoid conflicts.
```
  77268831
- Fp32 vs int8 qat C++ performance (#21244) (#21432) · 06545fcf
  Committed by Wojciech Uss on Nov 29, 2019
  
  06545fcf
25 Nov, 2019 · 1 commit

Fix the CAPI ZeroCopy shape error and reuse the code to get output (#21240) (#21345) · c75b162a

Committed by liu zhengxi on Nov 25, 2019

* fix the CAPI ZeroCopy shape error and refactor how the output is obtained

* use an anonymous namespace to cover the functor

* fix unit tests because the output of typeid(T).name() differs between Linux and Windows, test=develop

c75b162a

02 Nov, 2019 · 1 commit
- fix infer crashes caused by conv/pool upgrades, test=release/1.6 (#20969) · 53f1e024
  Committed by 石晓伟 on Nov 02, 2019
```
* fix infer crashes caused by conv/pool upgrades, test=release/1.6

* fix bug, test=release/1.6
```
  53f1e024
01 Nov, 2019 · 1 commit

CHERRY_PICK: 20955, 20966 (#20968) · 692a04ec

Committed by Zhaolong Xing on Nov 01, 2019

Paddle-trt inference: filter conv, depthwise_conv, pooling when padding size > 4
fix C++ multicard inference bug.
test=develop

692a04ec

31 Oct, 2019 · 1 commit

Bug Fix: Paddle-TRT cannot handle adaptive pooling in pool2d op converter and... · 1948210c

Committed by Pei Yang on Oct 31, 2019

Bug Fix: Paddle-TRT cannot handle adaptive pooling in pool2d op converter and "num" attribute in split op converter (#20733) (#20902)

* fix pool2d trt converter, test=develop

* add fix for split op converter, test=develop

1948210c

20 Oct, 2019 · 1 commit
- update int8 doc (#20738) · 06452f1a
  Committed by bingyanghuang on Oct 20, 2019
  
  06452f1a
18 Oct, 2019 · 2 commits
- change anakin apis, test=develop (#20692) · 8fb760da
  Committed by 石晓伟 on Oct 18, 2019
  
  8fb760da
- [cherry-pick] c api update in PD_PredictorRun (#20705) · 33a58e58
  Committed by liu zhengxi on Oct 18, 2019
```
* improve the performance of capi in PD_PredictorRun (#20665)

* alter the capi of PD_PredictorRun to provide proper function, test=release/1.6
```
  33a58e58
16 Oct, 2019 · 1 commit
- Add document for int8 object detection quantization (#19356) (#20669) · 3880f3d2
  Committed by Michał Gallus on Oct 16, 2019
```
test=release/1.6 test=document_fix
```
  3880f3d2
15 Oct, 2019 · 1 commit
- fix the PD_ZeroCopyPredictorRun output problem and cmake, test=release/1.6 (#20624) · 1822f86e
  Committed by liu zhengxi on Oct 15, 2019
  
  1822f86e
14 Oct, 2019 · 2 commits

add DisableGlogInfo() to AnalysisConfig, test=develop (#20581) (#20600) · fed1263c
Committed by Pei Yang on Oct 14, 2019

fed1263c

[cherry-pick] Add multihead matmul fuse pass(#20167) (#20592) · cefbcf77

Committed by zhaoyuchen2018 on Oct 13, 2019

* Add Multihead matmul fuse pass (#20167)

* Add multihead fuse pass for ernie opt

* Refine softmax

test=develop

* Refine cuda kernel

* Refine cuda version

* Refine cmake

test=develop

* refine header file

* refine test case and pass
* refine comments

* Delete useless code.

test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>

cefbcf77

12 Oct, 2019 · 1 commit
- remove incorrect new in c style, test=release/1.6 (#20456) · f0a0c338
  Committed by liu zhengxi on Oct 12, 2019
  
  f0a0c338
10 Oct, 2019 · 1 commit

[Cherry-pick] Add C-API for fluid inference api (#20259) · f72d82cc

Committed by liu zhengxi on Oct 10, 2019

* Add capi for fluid inference api (#20092)

* add capi for fluid inference api, including AnalysisConfig, AnalysisPredictor, PaddleBuf, PaddleTensor, ZeroCopyTensor

* add dll to inference capi (#20180)

* add dll to inference capi, test=develop

* add if win32 in cmakelists, test=develop

f72d82cc

01 Oct, 2019 · 1 commit
- fix analysis_predictor ci, test=release/1.6 (#20140) · 411f7b42
  Committed by 石晓伟 on Oct 01, 2019
  
  411f7b42
27 Sep, 2019 · 1 commit

update operator compatible info, test=develop (#19978) · 01b9d079

Committed by 石晓伟 on Sep 27, 2019

* update operator compatible info, test=develop

* revert cmake/version.cmake, test=develop

* add unit_tests and fix bugs, test=develop

* update ../paddle/fluid/framework/framework.proto, test=develop

* fix bug of paddle/fluid/inference/api/analysis_predictor.cc, test=develop

* update paddle/fluid/framework/version_test.cc, test=develop

* add comments and rename interfaces, test=develop

01b9d079

25 Sep, 2019 · 2 commits

Fix C++ inference BUG: When open memory optim and enable trt subgraph at the... · e89b1288

Committed by Zhaolong Xing on Sep 25, 2019

Fix C++ inference BUG: When open memory optim and enable trt subgraph at the same time, there is a bug (#19969)

* fix memory optimization type
test=develop

* 1. fix BUG: open trt and memory optim will trigger bug.
2. Clean memory optim bug.
test=develop

e89b1288

Removing length dims constraints of seq_pad and seq_unpad (#19497) · 99a9615a

Committed by Aurelius84 on Sep 25, 2019

* Removing last dims constraints of seq_pad and seq_unpad test=develop

* fix test_layer api code test=develop

* fix sequence_pad_op.cc conflict test=develop

* remove test_analyzer_mm_dnn test=develop

* fix vectorize bug test=develop

* fix vectorize<int> test=develop

99a9615a

21 Sep, 2019 · 3 commits
- Add two extra flags for test_analyzer_int8_image_classification to disable fp32/int8 (#19840) · 2c5c6365
  Committed by pawelpiotrowicz on Sep 21, 2019
```
test=develop
```
  2c5c6365
- Add TRT input shape check between model and runtime (#19864) · baccd7e2
  Committed by Pei Yang on Sep 21, 2019
```
* add TRT shape check, test=develop

* model_input_shape == runtime_input_shape, refine message, test=develop
```
  baccd7e2
- Fix BUGS: paddle-TRT repeatedly sets weight_map and overdeletes repetitive_params (#19825) · 74812d1c
  Committed by Pei Yang on Sep 21, 2019
```
* fix trt bugs when sharing params, test=develop

* add unittest for cascade_rcnn
```
  74812d1c
20 Sep, 2019 · 1 commit
- fix multi-thread exec of trt, test=develop (#19338) · d004a0f5
  Committed by 石晓伟 on Sep 20, 2019
  
  d004a0f5
19 Sep, 2019 · 1 commit

Add a pass to fuse fc+elementwise_add+layernorm (#19776) · 3cd985a6

Committed by Yiqun Liu on Sep 19, 2019

* Add fc_elementwise_layernorm_fuse pass and unittest.

* Add fused_fc_elementwise_layernorm op and its GPU kernel.
test=develop

* Apply fc_elementwise_layernorm_fuse_pass to GPU inference.

* Add the setting of attrs in the definition of binary_op.
test=develop

* Add comment.

* Implement the unittest.
test=develop

* Change the unittest name of layer_norm.
test=develop

3cd985a6

18 Sep, 2019 · 1 commit
- support MLU nums, test=develop (#19372) · 71b2ed61
  Committed by 石晓伟 on Sep 18, 2019
  
  71b2ed61
17 Sep, 2019 · 2 commits
- zerocopytensor support uint8, analysis config support profile, analysis... · 9cbc1eff
  Committed by Pei Yang on Sep 17, 2019
```
zerocopytensor support uint8, analysis config support profile, analysis predictor support GetInputTensorShape, test=develop (#19822)
```
  9cbc1eff
- fix memory optimization type (#19781) · 110be57c
  Committed by Zhaolong Xing on Sep 17, 2019
```
test=develop
```
  110be57c
