- 01 Jul 2019, 1 commit

Committed by Brian Liu
* Fix a bug in the quantize kernel that caused a crash with the VGG16/19 models. test=develop
* Refine the code to reduce verbose code. test=develop
* Remove useless code. test=develop

- 28 Jun 2019, 1 commit

Committed by Leo Zhao
1. Some key-generation methods were not aligned with PR #17965.
2. Extend the pointer's lifetime to avoid the memory being released when SetBlob fails; otherwise it core dumps. test=develop

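The lifetime problem above is generic to caches that take ownership of buffers. A minimal C++ sketch of the idea, assuming a hypothetical BlobCache with a fallible SetBlob (not PaddlePaddle's actual DeviceContext interface): keeping a shared_ptr alive in the caller means a failed SetBlob can no longer leave the memory freed while it is still in use.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical cache mapping blob names to reference-counted buffers.
class BlobCache {
 public:
  bool SetBlob(const std::string& key, std::shared_ptr<void> blob) {
    if (key.empty()) return false;  // insertion may fail
    blobs_[key] = std::move(blob);
    return true;
  }

 private:
  std::unordered_map<std::string, std::shared_ptr<void>> blobs_;
};

std::shared_ptr<float> AcquireBuffer(BlobCache* cache, const std::string& key,
                                     std::size_t len) {
  // Hold the buffer through a shared_ptr owned by the caller: even if
  // SetBlob fails and the cache never stores it, the memory stays alive
  // until the caller is done with it, instead of being released early.
  std::shared_ptr<float> buf(new float[len], std::default_delete<float[]>());
  cache->SetBlob(key, buf);
  return buf;
}
```
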
- 13 Jun 2019, 2 commits

Committed by lidanqing
* Refactor the function ConvFwdPrimitiveDesc. test=develop
* Change according to the review. test=develop
* Use a plain pointer instead of boost::optional. test=develop
* Pass the vector to the function by reference instead of as a raw vector. test=develop
* Change the pointer to a shared_ptr. test=develop

Committed by Wojciech Uss
* Added a unit test for QAT FP32 & INT8 comparison. test=develop
* Enabled other models and updated filenames. test=develop
* Added an accuracy check and multiple-batch handling. test=develop
* Removed quantization_mkldnn_pass.py. test=develop
* Cleanup. test=develop
* Updated model paths. test=develop
* Renamed tests without MKL-DNN. test=develop
* Fix reusing of the mkldnn pool2d primitive. test=develop
* Add performance measuring. test=develop
* Fix accuracy statistics. test=develop
* Removed non-mkldnn tests. test=develop
* Added the conv2d_depthwise->conv2d mkldnn transformation. test=develop
* Format update. test=develop
* Fixed creating the key for pool2d grad. test=develop
* Added a pass.
* Fix the accuracy issue while using float precision to get the scale. test=develop
* Fix the format issue when 'X' is not nchw. test=develop
* Removed output comparison and changed the number of images. test=develop
* CMake and comment fix. test=develop
* Updated the accuracy threshold for QAT comparison tests. test=develop
* Added the OMP_NUM_THREADS setting. test=develop
* Enable all QAT INT8 tests. test=develop
* Restored the upstream version of a file. test=develop
* Modified directory names. test=develop

- 11 Jun 2019, 1 commit

Committed by Jacek Czaja
* Removed is_reusing_.
* Added the TID (thread ID) to the keys used for reusing primitives, apart from the softmax PD.
* Compilation fix.
* Yet another compilation fix.
* Batch Norm and Conv adapted.
* Fix to softmax MT (multi-threading).
* Fixes to the multi-threaded code of MKL-DNN.
* Lint fixes. test=develop

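A sketch of the thread-ID suffix mentioned above, with a hypothetical CreateKey helper (the real key builders live in the individual MKL-DNN handlers): giving each thread its own key prevents two executor threads from racing on the same cached primitive.

```cpp
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical key builder for a cached MKL-DNN primitive/handler.
std::string CreateKey(const std::string& op_type,
                      const std::vector<int>& input_dims) {
  std::ostringstream key;
  key << op_type;
  for (int d : input_dims) key << "_" << d;
  // Appending the thread id makes cached primitives per-thread, so
  // multi-threaded execution does not reuse another thread's objects.
  key << "_tid" << std::this_thread::get_id();
  return key.str();
}
```
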
- 10 Jun 2019, 1 commit

Committed by Zeng Jinle
* Remove the attribute argument in Allocator::Allocate. test=develop
* Fix the Travis CI error. test=develop

- 07 Jun 2019, 1 commit

Committed by Yihua Xu
test=develop

- 31 May 2019, 1 commit

Committed by lidanqing
* Fix the MobileNet-SSD INT8 inference bug without overloading GetHash. test=develop
* Remove the out_grad->format() call in TransposeMKLDNNGradOpKernel. test=develop

- 28 May 2019, 1 commit

Committed by lidanqing
* Add the INT8 conv+relu6 fuse and enable the MobileNetV2 INT8 test. test=develop
* Change false and 0.0 to fuse_brelu and brelu_threshold; change "fuse_relu || fuse_brelu" to "unsigned_output". test=develop
* Use relu instead of brelu as the INT8 post-op, because INT8 brelu is not enabled in MKL-DNN v0.18. test=develop
* Continuous-integration fix. test=develop

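For context on the relu6/brelu and "unsigned_output" items above: both ReLU and bounded ReLU clamp negatives to zero, so the fused conv output is non-negative and can be quantized as u8. A minimal sketch (plain C++, not the MKL-DNN post-op API):

```cpp
#include <algorithm>

// ReLU6, i.e. bounded ReLU ("brelu") with threshold 6: clamp to [0, 6].
inline float Relu6(float x, float brelu_threshold = 6.0f) {
  return std::min(std::max(x, 0.0f), brelu_threshold);
}

// Plain ReLU: clamp to [0, +inf). Either way the result is non-negative,
// which is why a conv fused with relu or relu6 can mark its output as
// unsigned for INT8 quantization.
inline float Relu(float x) { return std::max(x, 0.0f); }
```
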
- 24 May 2019, 2 commits

Committed by Michał Gallus
* Fuse mul and elementwise_add into fc (see the sketch after this list).
* Reimplement the FC forward operator.
* Fix FC MKLDNN integration by transposing weights.
* Add the FC MKLDNN pass. test=develop
* FC MKLDNN pass: change memcpy to std::copy.
* Fix MKLDNN FC handling of mismatched input and weights dims.
* Lower the tolerance for MKL-DNN in the resnet50 test. test=develop
* Adjust FC to support MKLDNN op placement. test=develop
* Adjust the placement op to set the use_mkldnn attribute for the graph. test=develop
* MKLDNN FC: fix the weights format so that the gemm version is called. test=develop
* FC MKLDNN: remove the tolerance decrease from tester_helper.
* FC MKL-DNN: refactor the code, change the input reorder to a weight reorder.
* MKL-DNN FC: introduce operator caching. test=develop
* FC MKL-DNN: fix the tensor type in ExpectedKernelType. test=develop
* FC MKL-DNN: fix style changes. test=develop
* FC MKL-DNN: fall back to native on unsupported dim sizes. test=develop
* FC MKLDNN: fix CMake paths. test=develop
* FC MKLDNN: refine the placement pass graph mkldnn attribute. test=develop
* Fix the transpiler error for fuse_conv_eltwise. test=develop
* Fix missing STL includes in files. test=develop
* FC MKL-DNN: enable the new output size computation; also refine the pass to comply with the newest interface. test=develop
* FC MKL-DNN: enable only when fc_mkldnn_pass is enabled.
* FC MKL-DNN: allow weights to use the oi or io format.
* FC MKL-DNN: adjust the UT to work with correct dims. test=develop
* Enable MKL DEBUG for the resnet50 analyzer. test=develop
* FC MKL-DNN: improve the hashing function. test=develop
* FC MKL-DNN: fix the shape of fc weights in the transpiler.
* FC MKL-DNN: update the input pointer in the re-used fc primitive.
* Add a log for not handling the fc fuse for unsupported dims. test=develop
* FC MKL-DNN: move the transpose from the pass to the op kernel. test=develop
* FC MKL-DNN: disable the transpose in the unit test. test=develop
* FC MKL-DNN: remove fc_mkldnn_pass from the default list.
* Correct the flag for the fake-data analyzer tests. test=develop
* FC MKL-DNN: add a comment about the fc mkldnn pass disablement. test=develop
* FC MKL-DNN: disable fc in the int8 tests. test=develop

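The first item above folds a mul (matrix multiply) followed by an elementwise_add of a bias into a single fc op. A naive C++ sketch of the equivalence the fusion relies on (illustrative only; the real pass rewrites graph nodes rather than computing anything):

```cpp
#include <vector>

// y[m][n] = sum_k x[m][k] * w[k][n] + b[n]
// i.e. mul (matrix multiply) followed by elementwise_add of a broadcast
// bias is exactly a fully connected (fc) layer, so the two ops can fuse.
std::vector<float> FC(const std::vector<float>& x,  // M x K, row-major
                      const std::vector<float>& w,  // K x N, row-major
                      const std::vector<float>& b,  // N
                      int M, int K, int N) {
  std::vector<float> y(M * N, 0.0f);
  for (int m = 0; m < M; ++m)
    for (int k = 0; k < K; ++k)
      for (int n = 0; n < N; ++n)
        y[m * N + n] += x[m * K + k] * w[k * N + n];  // the mul part
  for (int m = 0; m < M; ++m)
    for (int n = 0; n < N; ++n)
      y[m * N + n] += b[n];                           // the elementwise_add part
  return y;
}
```
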
Committed by Sylwester Fraczek
* Fix a quantize_squash_pass segfault when there is no tensor linked to the Bias input. test=develop
* Add a GoogLeNet test. test=develop
* Fix concat CreateKey not using the input format. test=develop

- 22 May 2019, 1 commit

Committed by guomingz
* Relu6 is the bottleneck op for MobileNet-v2. As MKL-DNN supports conv/relu6 fusion, we implement the fusion via the cpass approach (a sketch of the fuse pass follows this list). Since INT8 support for this fusion will only arrive in MKL-DNN v0.20, this PR focuses on the FP32 optimization. The table below shows the benchmark (FPS) measured on SKX-8180 (28 cores).

  | Batch size | With fusion | Without fusion |
  | -- | -- | -- |
  | 1 | 214.7 | 53.4 |
  | 50 | 1219.727 | 137.280 |

  test=develop
* Fix the format issue. test=develop
* Add the missing nolint comments. test=develop
* Fix the typos. test=develop
* Register the conv_brelu_mkldnn_fuse_pass for the MKLDNN engine. test=develop
* Adjust the indentation. test=develop
* Add the test_conv_brelu_mkldnn_fuse_pass case. test=develop
* Slightly update the code per Baidu comments: let the parameter definition be embedded in the code, which makes it easier to understand. test=develop

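A much-simplified sketch of what a conv+relu6 fuse pass does, using a hypothetical flat list of ops instead of Paddle's graph IR and pattern detector: every relu6 that directly follows a conv2d is folded into the conv as a bounded-ReLU post-op and then dropped.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, much-simplified IR: a linear chain of ops.
struct OpNode {
  std::string type;           // e.g. "conv2d", "relu6"
  bool fuse_brelu = false;    // attribute the fused conv would carry
  float brelu_threshold = 6.0f;
  bool removed = false;
};

// Fold each relu6 that directly follows a conv2d into the conv as a
// bounded-ReLU post-op, then drop the relu6 node.
void ConvBReluFusePass(std::vector<OpNode>* ops) {
  for (std::size_t i = 0; i + 1 < ops->size(); ++i) {
    OpNode& conv = (*ops)[i];
    OpNode& act = (*ops)[i + 1];
    if (conv.type == "conv2d" && act.type == "relu6" && !act.removed) {
      conv.fuse_brelu = true;
      conv.brelu_threshold = 6.0f;
      act.removed = true;     // the activation is now part of the conv
    }
  }
}
```
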
- 16 Apr 2019, 1 commit

Committed by Jacek Czaja
* Reuse of the conv PD; conv transpose PD reused; added PD reusing for softmax and Batch Norm; refactoring and removal of unneeded routines of MKL-DNN ops; fix to reusing conv; lint fixes and a lint workaround. test=develop
* Fix after review on including boost as a third-party header. test=develop
* Fix after review: renamed to something more descriptive. test=develop

- 28 Mar 2019, 1 commit

Committed by Jacek Czaja
* Revert "[MKL-DNN] Fix to crash of Transformer when mkldnn is to be used (#16233)". This reverts commit 13816dd4, apart from the enabling of the Transformer for MKL-DNN.
* Revert "- MKL-DNN pooling updated to set_prim_desc". This reverts commit c63f6b20. Conflicts: paddle/fluid/operators/mkldnn/concat_mkldnn_op.cc
* Revert "[MKL-DNN] MKL-DNN specific Tensor modification (#15429)". This reverts commit dec9cf53. test=develop
* Concat compilation fix; lint fixes; fix to the Transpose MKLDNN op. test=develop

- 22 Mar 2019, 1 commit

Committed by xiaolil1
* Enable the INT8 Concat kernel to improve the performance of MobileNet-SSD. test=develop
* Optimize the UT format. test=develop
* Fix the UT file path issue. test=develop
* Refine the license year. test=develop
* Optimize the code for the new API. test=develop
* Restructure the INT8 Concat kernel. test=develop

- 19 Mar 2019, 1 commit

Committed by zhhsplendid
test=develop

- 18 Mar 2019, 7 commits

Committed by dengkaipeng

Committed by dengkaipeng

Committed by dengkaipeng

Committed by dengkaipeng

Committed by xiaolil1
* Enable the INT8 transpose kernel for the MobileNet-SSD improvement. test=develop
* Refine the license year. test=develop
* Delete redundant code. test=develop
* Add an axis check. test=develop

Committed by luotao1
test=develop

Committed by Wojciech Uss
* Add cpu_quantize_pass for C-API quantization (see the sketch below). test=develop
* Add a cpu_quantize_pass test.
* Fix lint: include <memory>, <unordered_map>, and <unordered_set>. test=develop
* fuse_relu 1. test=develop
* Tuned 2 without squash.
* Fixes. test=develop
* Remove unused vars. test=develop
* Refactored. test=develop
* Fix lint: C-style cast -> C++-style cast. test=develop
* Remove QuantMax and C-style casts. test=develop
* Last usage of QuantMax removed. test=develop
* Fix the Analysis Predictor UT: check whether memory_optimize_pass has already been added to the analysis config before adding a new one, so that it is not added multiple times. test=develop
* Change map to unordered_map; fix the forgotten part of cpu_quantize_pass_tester.cc. test=develop
* Removed the quantized attribute.
* Fixed cpu_quantize_pass_tester and op attr comments. test=develop
* Removed a redundant line. test=debug
* Removed gmock. test=develop
* Fix after merge.

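The core arithmetic behind such a quantization pass is a per-tensor scale derived from the maximum absolute value, used by the inserted quantize/dequantize ops. A hedged sketch (illustrative helpers, not the actual cpu_quantize_pass code):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Scale that maps float data into the signed INT8 range [-127, 127].
float GetQuantScale(const std::vector<float>& data) {
  float max_abs = 0.0f;
  for (float v : data) max_abs = std::max(max_abs, std::fabs(v));
  return max_abs > 0.0f ? 127.0f / max_abs : 1.0f;
}

// q = round(real * scale)
std::vector<int8_t> Quantize(const std::vector<float>& data, float scale) {
  std::vector<int8_t> out(data.size());
  for (std::size_t i = 0; i < data.size(); ++i)
    out[i] = static_cast<int8_t>(std::round(data[i] * scale));
  return out;
}

// real ≈ q / scale
std::vector<float> Dequantize(const std::vector<int8_t>& q, float scale) {
  std::vector<float> out(q.size());
  for (std::size_t i = 0; i < q.size(); ++i)
    out[i] = static_cast<float>(q[i]) / scale;
  return out;
}
```
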
- 15 Mar 2019, 1 commit

Committed by luotao1
test=develop

- 06 Mar 2019, 1 commit

Committed by xiaolil1
* Enable the INT8 ReQuantize op. test=develop
* Clean code. test=develop
* Add comments. test=develop
* Revert "Clean code". This reverts commit a7a49b8a. test=develop
* Modify the requantize op test. test=develop
* Fix the requantize UT by moving the public function to a public test file. test=develop
* Fix a test failure caused by the file path change. test=develop
* Change the file path for the requantize op. test=develop

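Requantization rescales data that is already INT8 from one scale to another, so consecutive INT8 kernels can be chained without converting back to float in between. A sketch of the arithmetic under the convention q = real_value * scale (not the op's real kernel):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Re-express values quantized with scale_in using scale_out:
//   real ≈ q_in / scale_in  =>  q_out ≈ q_in * (scale_out / scale_in)
std::vector<int8_t> Requantize(const std::vector<int8_t>& q_in,
                               float scale_in, float scale_out) {
  const float ratio = scale_out / scale_in;
  std::vector<int8_t> q_out(q_in.size());
  for (std::size_t i = 0; i < q_in.size(); ++i) {
    float v = std::round(static_cast<float>(q_in[i]) * ratio);
    // Clamp to the signed INT8 range before narrowing.
    q_out[i] = static_cast<int8_t>(std::max(-127.0f, std::min(127.0f, v)));
  }
  return q_out;
}
```
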
- 04 Mar 2019, 2 commits

Committed by xiaolil1
test=develop

Committed by xiaoli.liu@intel.com
test=develop

- 27 Feb 2019, 2 commits

- 26 Feb 2019, 2 commits

Committed by Jacek Czaja
* MKLDNN ops revisited.
* Disabled softmax modifications.
* Disabled elementwise_add.
* Reverted LRN modifications.
* Reverted the SUM primitive.
* Partial revising of softmax; enable softmax; softmax changes.
* LRN is back; LRN partially disabled; LRN is back; LRN fix.
* Compilation fixes.
* Sum fixed (hopefully).
* Enabling (partially) elementwise_add; fixes to elementwise_add.
* Lint fixes; quantize fix; compilation fix. test=develop
* Disabling pooling.
* Disabled the quantize op. test=develop

Committed by xiaoli.liu@intel.com
test=develop

- 25 Feb 2019, 3 commits

Committed by Michal Gallus
test=develop

Committed by liangan1
test=develop

Committed by Jacek Czaja
* Implemented a draft of keeping the primitive desc in Tensor. test=develop
  - TransposeMKLDNNHandler::AcquireSrcMemory was reimplemented.
  - Added nchw and nc format setting for the sake of compatibility; fixed unit tests.
  - Workaround for the problem with 5D data in conv.
  - Added 3D and 1D MKL-DNN formats for tensor name handles. test=develop
  - Fix to UTs. test=develop
  - The conv fp32 op was updated; cosmetic fixes. test=develop
  - Tensor mkldnn cosmetics. test=develop
  - Moved most of the MKL-DNN-specific code from Tensor to the MKL-DNN utils.
* Lint fixes. test=develop
* Setting the prim desc in Tensor also sets the layout to kMKLDNN. test=develop
* Moved the creation of the prim desc entirely out of Tensor. test=develop
* Cosmetic fixes after review. test=develop

- 22 Feb 2019, 1 commit

Committed by Sylwester Fraczek
Reason: dereferencing a smart pointer is the same as dereferencing the underlying raw pointer. test=develop

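To illustrate the stated reason: dereferencing a std::shared_ptr yields exactly the object its raw pointer points to, so patterns like *ptr.get() are redundant. A small self-contained example:

```cpp
#include <cassert>
#include <memory>

struct Handler {
  int value = 42;
};

int main() {
  auto sp = std::make_shared<Handler>();
  // Dereferencing the smart pointer and the underlying raw pointer
  // refer to the very same object, so the extra .get() buys nothing.
  assert(&*sp == sp.get());
  assert((*sp).value == (*sp.get()).value);
  return 0;
}
```
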
- 21 Feb 2019, 1 commit

Committed by Krzysztof Binias
test=develop

- 13 Feb 2019, 1 commit

Committed by Michal Gallus
test=develop

- 29 Jan 2019, 2 commits

Committed by Krzysztof Binias
test=develop

Committed by Krzysztof Binias
test=develop