Commits · 9cbc1eff2dc11142893805c2260a4386e39ebfcd · PaddlePaddle / Paddle

17 Sep, 2019 1 commit

zerocopytensor support uint8, analysis config support profile, analysis... · 9cbc1eff

Committed by Pei Yang on Sep 17, 2019

zerocopytensor support uint8, analysis config support profile, analysis predictor support GetInputTensorShape, test=develop (#19822)

9cbc1eff

16 Sep, 2019 1 commit

Enhance fc_fuse_pass to enable fusing relu to fc_op (#19733) · c67c8758

Committed by Yiqun Liu on Sep 16, 2019

* Refine the codes related to fc op.

* Add GPU implementation for fc functor.

* Apply fc_fuse_pass in GPU inference.
test=develop

* Change the cmake for fc op.

* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.

* Add an attribute to set the activation type in fc_op.

* Enhance the unittest of fc_op.
test=develop

* Remove the declaration of FCOpGrad back to the header file.
test=develop

* Set default value for newly added arguments in test_fc_op.
test=develop

* Enhance fc_fuse_pass to enable fusing relu.

* Allow print the shapes of var_desc in graph.
test=develop

* Enhance fc_fuse_pass_tester.

* Remove the use of PADDLE_ENFORCE.
test=develop

* Correct the number of ops after fusing.
test=develop

* Fix a typo.
test=develop

* Set activation_type to null when there is no relu in fc.
test=develop

* Refine fc_fuse_pass's codes.

* Enable the set of shape for tensor.

* Refine repeated_fc_relu_pass and add unittest.
test=develop

c67c8758

03 Sep, 2019 1 commit

Add a pass to enable the use of cudnn (#19346) · c5548178

Committed by Yiqun Liu on Sep 03, 2019

* Add a interface to enable cudnn for inference.

* Add cudnn_placement_pass.
test=develop

* Set the default value of cudnn_enabled_op_types to null.
test=develop

* Write the common basic class, placement_pass_base, to refine the codes.
test=develop

* Call EnableCUDNN in unittest.
test=develop

* Refine cudnn_placement_pass tester.

* Enable the testing of cudnn_placement_pass in inference's unittest.
test=develop

* Add the check of op kernels.
test=develop

c5548178

22 Aug, 2019 1 commit

add local user data conversion into full_pascalvoc_test_preprocess.py (#19283) · 9240e532

Committed by lidanqing on Aug 22, 2019

* add local user data conversion into full_pascalvoc_test_preprocess.py
test=develop

* change PADDLE_ENFORCE to PADDLE_ENFORCE_GE
test=develop

* change according to reviews
test=develop

9240e532

15 Aug, 2019 1 commit

Fix mAP problem in unit test of int8 object detection test (#18946) · 07a4d8f8

Committed by lidanqing on Aug 15, 2019

* change the top1 comparison to mAP comparison
test=develop

* change the mobilenet-ssd tester demo data and batch_size settings
test=develop

07a4d8f8

30 Jul, 2019 1 commit

Revert "use static variable to do cache instead of thread local in thread... · 10eeed93

Committed by Leo Zhao on Jul 30, 2019

Revert "use static variable to do cache instead of thread local in thread frequent switching case (#18428)" (#18879)

This reverts commit ce38bb53.

test=develop

10eeed93

11 Jul, 2019 1 commit

add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy (#18580) · 076f8331

Committed by Tao Luo on Jul 11, 2019

* add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy

test=develop

* enhance MkldnnPostReset

test=develop

* add comments for mkldnn_cache_capacity field

test=develop

076f8331

08 Jul, 2019 2 commits
- use static variable to do cache instead of thread local in thread frequent switching case (#18428) · ce38bb53
  Committed by Leo Zhao on Jul 08, 2019
  
  ce38bb53
- add mkldnn shapeblob cache clear strategy (#18513) · fe32879d
  Committed by Tao Luo on Jul 08, 2019
```
* add mkldnn shapeblob cache clear strategy

test=develop

* refine with comments

test=develop

* make cache clear strategy more safe

test=develop

* add lock for GetShapeBlobSize

test=develop
```
  fe32879d
05 Jul, 2019 1 commit
- fix command line bug in int8v2 readme (#18507) · 3fe6bf5e
  Committed by bingyanghuang on Jul 05, 2019
  
  3fe6bf5e
03 Jul, 2019 2 commits
- Remove the obsolete cmake options (#18481) · 047bba85
  Committed by 石晓伟 on Jul 03, 2019
```
* remove the obsolete cmake options, test=develop

* remove unittests, test=develop
```
  047bba85
- add transfer_scope_cache unit-test (#18467) · d234aa02
  Committed by Tao Luo on Jul 03, 2019
```
test=develop
```
  d234aa02
02 Jul, 2019 1 commit
- remove unused AnalysisPredictor::SetMkldnnThreadID() (#18444) · 3123d187
  Committed by Tao Luo on Jul 02, 2019
```
test=develop
```
  3123d187
27 Jun, 2019 1 commit

some fixes for int8 mobilenet_ssd tester (#18112) · 5fd68ac1

Committed by lidanqing on Jun 27, 2019

* some fixes for int8 mobilenet_ssd tester
test=develop

* change wrong data file name
test=develop

* change test images bin file from 200 images to 100 images

* change directory existence to file existence during downloading
test=develop

* reuse download_data
test=develop

* run full dataset when iterations=0
test=develop

5fd68ac1

19 Jun, 2019 2 commits
- Change int8v2 CAPI unit test name and add log in the prediction stage (#18200) · de42fe8f
  Committed by 翟飞跃 on Jun 19, 2019
```
* fix issue 18111;test=develop

* fix timer;test=develop

* refine code;test=develop
```
  de42fe8f
- add mkldnn Int8v2 slim doc (#17909) · 78441c54
  Committed by 翟飞跃 on Jun 19, 2019
  
  78441c54
16 Jun, 2019 2 commits
- unify FP32 vs. INT8 comparison tests output (#18111) · ca5642c8
  Committed by Wojciech Uss on Jun 16, 2019
```
test=develop
```
  ca5642c8
- reuse C-API INT8 unit test application (#18077) · c26130f3
  Committed by Wojciech Uss on Jun 16, 2019
```
* reuse C-API INT8 unit test application

test=develop

* updates after review

test=develop
```
  c26130f3
14 Jun, 2019 1 commit

add Mobilienet ssd int8 analyzer tester (#18075) · 46625415

Committed by lidanqing on Jun 14, 2019

* add pascalvoc preprocess script and mobilenet-ssd analyzer_tester, wait 17737

* change converting local dataset to downloading and converting tarfile
test=develop

* change the test data_path
test=develop

* change copyright (c) 2016 to copyright (c) 2019
test=develop

46625415

13 Jun, 2019 2 commits
- fix ci test cmake test=develop (#18060) · 42f12a4a
  Committed by 石晓伟 on Jun 13, 2019
  
  42f12a4a
- Disable MKLDNN FC in Resnet50 test (#18030) · 8462e2b8
  Committed by Michał Gallus on Jun 13, 2019
  
  8462e2b8
11 Jun, 2019 1 commit

Update the Anakin interfaces for content-dnn and MLU (#17890) · bce259e5

Committed by 石晓伟 on Jun 11, 2019

* update anakin-engine interfaces for content-dnn

test=develop

* support only-gpu mode of Anakin

modify eltwise parse

test=develop

* modification for thread-safe

test=develop

* Integrated template instance

test=develop

* increase template parameters

test=develop

* support MLU predictor

test=develop

* update anakin cmake files

test=develop

* update TargetWrapper::set_device

* update the initialization of anakin subgraph

test=develop

* use the default constructor of base class

test=develop

bce259e5

06 Jun, 2019 2 commits
- [NGraph] Bert model for a capi, ngraph's support test=develop (#17844) · c1379bf2
  Committed by mozga-intel on Jun 06, 2019
  
  c1379bf2
- fix: when use the load model from memory mode, the RAM occupy is high (#17788) · ae576f3c
  Committed by Zhaolong Xing on Jun 06, 2019
```
test=develop
```
  ae576f3c
29 May, 2019 3 commits
- add fc_mkldnn_pass in compare_mkldnn (#17712) · b4b16946
  Committed by Tao Luo on May 29, 2019
```
test=develop
```
  b4b16946
- fix trt ci timeout error (#17701) · 4337009b
  Committed by Zhaolong Xing on May 29, 2019
```
test=develop
```
  4337009b
- Capi for a ngraph engine (#17037) · 5eb81fe5
  Committed by mozga-intel on May 28, 2019
  
  5eb81fe5
28 May, 2019 1 commit

Improve mobilenetv2 INT8 performance by using INT8 relu as post-op (#17570) · 04b6c29e

Committed by lidanqing on May 28, 2019

* add INT8 conv+relu6 fuse and enable mobilenetv2 INT8 test
test=develop

* change false and 0.0 to fuse_brelu and brelu_threshold
test=develop

change the "fuse_relu||fuse_brelu" to "unsigned_output"
test=develop

* Use relu instead of brelu as INT8 post-op because INT8 brelu is not enabled in mkldnn v0.18
test=develop

* continuous-integration fix
test=develop

04b6c29e

24 May, 2019 2 commits

[MKL-DNN] Add Fully Connected Op for inference only(#15226) · 0c39b97b

Committed by Michał Gallus on May 24, 2019

* fuse mul and elementwise add to fc

* Reimplement the FC forward operator

* Fix FC MKLDNN integration by transposing weights

* Add FC MKLDNN Pass

test=develop

* FC MKLDNN Pass: change memcpy to std::copy

* Fix MKLDNN FC handling of mismatch input and weights dims

* Lower tolerance for MKL-DNN in resnet50 test

test=develop

* Adjust FC to support MKLDNN Op placement

test=develop

* Adjust Placement Op to set use_mkldnn attribute for graph

test=develop

* MKLDNN FC: fix weights format so that gemm version is called

test=develop

* FC MKLDNN: Remove tolerance decrease from tester_helper

* FC MKL-DNN: Refactor the code, change input reorder to weight reorder

* MKL-DNN FC: Introduce operator caching

test=develop

* FC MKL-DNN: Fix the tensor type in ExpectedKernelType

test=develop

* FC MKL-DNN: fix style changes

test=develop

* FC MKL-DNN: fallback to native on non-supported dim sizes

test=develop

* FC MKLDNN: fix CMake paths

test=develop

* FC MKLDNN: Refine placement pass graph mkldnn attribute

test=develop

* Fix Transpiler error for fuse_conv_eltwise

test=develop

* Fix missing STL includes in files

test=develop

* FC MKL-DNN: Enable new output size computation

Also, refine pass to comply with newest interface.
test=develop

* FC MKL-DNN: enable only when fc_mkldnn_pass is enabled

* FC MKL-DNN: Allow Weights to use oi or io format

* FC MKL-DNN: Adjust UT to work with correct dims

test=develop

* Enable MKL DEBUG for resnet50 analyzer

test=develop

* FC MKL-DNN: Improve Hashing function

test=develop

* FC MKL-DNN: Fix shape for fc weights in transpiler

* FC MKL-DNN: Update input pointer in re-used fc primitive

* Add log for not handling fc fuse for unsupported dims

test=develop

* FC MKL-DNN: Move transpose from pass to Op Kernel

test=develop

* FC MKL-DNN: Disable transpose in unit test

test=develop

* FC MKL-DNN: Remove fc_mkldnn_pass from default list

* Correct Flag for fake data analyzer tests

test=develop

* FC MKL-DNN: Add comment about fc mkldnn pass disablement

test=develop

* FC MKL-DNN: Disable fc in int8 tests

test=develop

0c39b97b

fix quantize_squash_pass segfault when no tensor linked to Bias (#17292) · bccb0ba4

Committed by Sylwester Fraczek on May 24, 2019

* fix quantize_squash_pass segfault when there is no tensor linked to Bias input

test=develop

* add googlenet test

test=develop

* fix concat CreateKey not using input format

test=develop

bccb0ba4

22 May, 2019 1 commit
- fix bug that saved optimal model path in test_analyzer_save_model con… (#17555) · daf88968
  Committed by lijianshe02 on May 22, 2019
```
* modify saved model path in analyzer_save_model.cc test=develop
```
  daf88968
21 May, 2019 2 commits
- remove unused SERIAL compiler option (#17500) · 3d19f44a
  Committed by Tao Luo on May 21, 2019
```
test=develop
```
  3d19f44a
- Enabling resnet101, vgg16, vgg19 INT8v2 model tests (#17468) · 36757ed2
  Committed by lidanqing on May 21, 2019
```
* Add 6 models tests support in CMake

* enabling resnet101, vgg16, vgg19 INT8v2 model tests
test=develop

* remove SERIAL
test=develop
```
  36757ed2
15 May, 2019 1 commit
- bug fix (#17392) · e48dd92f
  Committed by flame on May 15, 2019
```
fix secure bug
```
  e48dd92f
08 May, 2019 1 commit
- improved unit test output (#17266) · 984aa905
  Committed by Wojciech Uss on May 08, 2019
```
added printing data type to differentiate int8 and fp32 latency results

test=develop
```
  984aa905
07 May, 2019 1 commit

call SetNumThreads everytime to avoid missing omp thread setting (#17224) · 54636a19

Committed by Leo Zhao on May 07, 2019

* call SetNumThreads everytime to avoid missing omp thread setting

resolve #17153
test=develop

* add paddle_num_threads into config for test_analyzer_pyramid_dnn

resolve #17153
test=develop

54636a19

05 May, 2019 1 commit
- use two GPUs to run the exclusive test test=develop (#17187) · 83c4f772
  Committed by wopeizl on May 05, 2019
  
  83c4f772
30 Apr, 2019 1 commit

fix bn fuse vardesc and add model saver (#17143) · 79ed1c76

Committed by tensor-tang on Apr 30, 2019

* fix bn fuse vardesc and add model saver

test=develop

* unify save model in test helper

test=develop

* fix mkdir on windows

test=develop

* remove magic number use bn bias var desc

test=develop

79ed1c76

23 Apr, 2019 1 commit
- fix runtime_context_cache bug when gpu model has an op runs only on cpu · 490e7462
  Committed by luotao1 on Apr 23, 2019
```
test=develop
```
  490e7462
22 Apr, 2019 1 commit

add parallel build script to ci … (#16901) · d9991dcc

Committed by wopeizl on Apr 22, 2019

* add parallel build script to ci test=develop
* 1. classify the test case as single card/two cards/multiple cards type
   2. run test case according to the run type

d9991dcc

PaddlePaddle / Paddle: synced successfully about 1 year ago