Commits · 0c39b97b4ee0f732a6a9a349511b3ac3cdc1633c · 机器未来 / Paddle

24 May, 2019 · 18 commits

[MKL-DNN] Add Fully Connected Op for inference only(#15226) · 0c39b97b

Authored by Michał Gallus on May 24, 2019

* fuse mul and elementwise add to fc

* Reimplement the FC forward operator

* Fix FC MKLDNN integration by transposing weights

* Add FC MKLDNN Pass

test=develop

* FC MKLDNN Pass: change memcpy to std::copy

* Fix MKLDNN FC handling of mismatched input and weights dims

* Lower tolerance for MKL-DNN in resnet50 test

test=develop

* Adjust FC to support MKLDNN Op placement

test=develop

* Adjust Placement Op to set use_mkldnn attribute for graph

test=develop

* MKLDNN FC: fix weights format so that gemm version is called

test=develop

* FC MKLDNN: Remove tolerance decrease from tester_helper

* FC MKL-DNN: Refactor the code, change input reorder to weight reorder

* MKL-DNN FC: Introduce operator caching

test=develop

* FC MKL-DNN: Fix the tensor type in ExpectedKernelType

test=develop

* FC MKL-DNN: fix style changes

test=develop

* FC MKL-DNN: fallback to native on non-supported dim sizes

test=develop

* FC MKLDNN: fix CMake paths

test=develop

* FC MKLDNN: Refine placement pass graph mkldnn attribute

test=develop

* Fix Transpiler error for fuse_conv_eltwise

test=develop

* Fix missing STL includes in files

test=develop

* FC MKL-DNN: Enable new output size computation

Also, refine pass to comply with newest interface.
test=develop

* FC MKL-DNN: enable only when fc_mkldnn_pass is enabled

* FC MKL-DNN: Allow Weights to use oi or io format

* FC MKL-DNN: Adjust UT to work with correct dims

test=develop

* Enable MKL DEBUG for resnet50 analyzer

test=develop

* FC MKL-DNN: Improve Hashing function

test=develop

* FC MKL-DNN: Fix shape for fc weights in transpiler

* FC MKL-DNN: Update input pointer in re-used fc primitive

* Add log for not handling fc fuse for unsupported dims

test=develop

* FC MKL-DNN: Move transpose from pass to Op Kernel

test=develop

* FC MKL-DNN: Disable transpose in unit test

test=develop

* FC MKL-DNN: Remove fc_mkldnn_pass from default list

* Correct Flag for fake data analyzer tests

test=develop

* FC MKL-DNN: Add comment about fc mkldnn pass disablement

test=develop

* FC MKL-DNN: Disable fc in int8 tests

test=develop

0c39b97b
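
The first bullet of commit 0c39b97b fuses `mul` and `elementwise_add` into a single `fc` op. As a minimal sketch of why that rewrite preserves results, the plain-Python stand-ins below compute the same output both ways. All three functions are hypothetical helpers written for illustration; they are not Paddle or MKL-DNN code, and they ignore weight layout (oi/io) and operator caching.

```python
# Illustrative stand-ins only; not Paddle's operators.

def mul(x, w):
    """Matrix product of x (M x K) and w (K x N), both nested lists."""
    cols = list(zip(*w))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in x]


def elementwise_add(y, bias):
    """Add a length-N bias vector to every row of y (M x N)."""
    return [[v + b for v, b in zip(row, bias)] for row in y]


def fc(x, w, bias):
    """Fused fully connected layer: x @ w + bias in a single pass."""
    cols = list(zip(*w))
    return [
        [sum(a * b for a, b in zip(row, col)) + bj for col, bj in zip(cols, bias)]
        for row in x
    ]


x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0, -1.0], [0.5, 2.0, 1.0]]
bias = [0.1, 0.2, 0.3]

unfused = elementwise_add(mul(x, w), bias)  # two separate ops
fused = fc(x, w, bias)                      # one fused op
print(fused == unfused)
```

The fused form avoids materializing the intermediate `mul` output, which is the usual motivation for this kind of graph pass.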

add __str__ method for tensor and lodtensor to support print test=dev… (#17588) · 6724a652
Authored by wopeizl on May 24, 2019
```
* add __str__ method for tensor and lodtensor to support print test=develop
```
6724a652
Enable logical operators for the nGraph Bridge. (#17543) · e9216d06
Authored by Krzysztof Binias on May 24, 2019
```
test=develop
```
e9216d06

Fix api example [ lstm, sequence_enumerate, sequence_expand,sequence_expand_as ] (#17210) · cbaf9e53

Authored by Hongyu Liu on May 24, 2019

* fix example; test=develop

* fix api spec; test=develop

* fix api spec; test=develop

* add doc check
test=develop
test=document_preview

* test=develop,test=document_preview

add blank line to fix format, add one more "import"

* fix bug; test=develop

* fix bug; test=develop

cbaf9e53

add Run Prepared Ctx (#17616) · 326bf829
Authored by guru4elephant on May 24, 2019
```
add Run Prepared Ctx, fix pybind problem
```
326bf829
Fix trust ratio in lamb (#17614) · e8990e64
Authored by Yibing Liu on May 24, 2019
```
test=develop
```
e8990e64

Fix the example code in some Python API. (#17343) · 2a7b3211

Authored by Guo Sheng on May 24, 2019

* Fix the example code in some Python API.
test=develop

* Fix the example code in some Python API by adding import.
test=develop

2a7b3211

Add broadcast operators (#17503) · b5f4d5ed

Authored by chengduo on May 24, 2019

* This PR adds broadcast for multi-process training. It can also be used in dynamic graph mode to broadcast parameters.

b5f4d5ed

BuildStrategy api comment (#17348) · 2280f185
Authored by flame on May 24, 2019
```
Python examples of fluid.layers.io.double_buffer and some BuildStrategy's methods.
```
2280f185

Conv concat relu quantization (#17466) · 5b2a3c4b

Authored by Sylwester Fraczek on May 24, 2019

* add conv_concat_relu fuse

test=develop

* add test code

test=develop

* added missing include with unordered_map

test=develop

* review fixes for wojtuss

test=develop

* remove 'should (not) be fused' comment statements

one of them was invalid anyway

test=develop

5b2a3c4b

fix quantize_squash_pass segfault when no tensor linked to Bias (#17292) · bccb0ba4

Authored by Sylwester Fraczek on May 24, 2019

* fix quantize_squash_pass segfault when there is no tensor linked to Bias input

test=develop

* add googlenet test

test=develop

* fix concat CreateKey not using input format

test=develop

bccb0ba4

Add profiler in tracer (#17076) · 2dc1c6f2

Authored by chengduo on May 24, 2019

* add profiler in tracer.cc

* add profiler in layer.cc
test=develop

* add profiler in Layer.cc
test=develop

2dc1c6f2

[NGraph] Enable elementwise mul operator (#17552) · 0d4cbdad
Authored by mozga-intel on May 23, 2019

0d4cbdad
Delete LoDTensorset in API.spec (#17577) · cee9dcc3
Authored by tianshuo78520a on May 24, 2019
```
* test=develop

* test=develop

* test=develop

* del #
```
cee9dcc3
[NGraph] Enable assign operator for a ngraph, test=develop (#17437) · f2694e12
Authored by mozga-intel on May 23, 2019
```
*  Enable assign operator for a ngraph, test=develop

* Cross_entropy operators need to be updated
```
f2694e12
Enable elementwise sub operator for ngraph (#17527) · cf02cb5e
Authored by mozga-intel on May 23, 2019

cf02cb5e
polish_executor_and_add_ctx_cache (#17536) · 7f8bc49d
Authored by guru4elephant on May 24, 2019
```
* polish_executor_and_add_ctx_cache
```
7f8bc49d

[CPU] refine cpu softmax bwd (#17534) · 7ae461eb

Authored by tensor-tang on May 24, 2019

* refine softmax fwd

test=develop

* refine cpu softmax bwd

test=develop

* fix batch size

test=develop

* fix compile issue with gpu

test=develop

* add value clip

7ae461eb
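
Commit 7ae461eb refines the CPU softmax forward and backward passes and adds a value clip. The sketch below shows one common way these pieces fit together: subtract the row maximum, clip the shifted values from below so `exp` never sees extremely negative inputs, then normalize. The clip threshold of -64 and both function names are assumptions made for illustration; the commit log does not state the actual values or the kernel structure.

```python
import math

CLIP_MIN = -64.0  # assumed lower bound for shifted logits (illustrative)


def softmax(row, clip_min=CLIP_MIN):
    """Softmax over one row, using max-subtraction plus a lower clip."""
    peak = max(row)
    shifted = [max(v - peak, clip_min) for v in row]
    exps = [math.exp(v) for v in shifted]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_backward(out, dout):
    """Gradient w.r.t. the logits: dx_i = out_i * (dout_i - sum_j out_j * dout_j)."""
    dot = sum(o * d for o, d in zip(out, dout))
    return [o * (d - dot) for o, d in zip(out, dout)]


# Large-magnitude logits would overflow or underflow a naive exp().
probs = softmax([1000.0, 999.0, -1000.0])
grads = softmax_backward(probs, [1.0, 0.0, 0.0])
print(probs, grads)
```

With the clip in place, the smallest probability stays strictly positive instead of underflowing to zero, and the gradients still sum to approximately zero as expected for softmax.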

23 May, 2019 · 12 commits
- Add exponential moving average (#17562) · 6e11f977
  Authored by Yibing Liu on May 23, 2019
```
* Add exponential moving average

test=develop, test=document_preview

* Polish documents

test=develop, test=document_preview

* Update API spec

test=develop, test=document_preview
```
  6e11f977
- [CPU] refine softmax op fwd on CPU (#17522) · 0600b370
  Authored by tensor-tang on May 23, 2019
```
* refine softmax fwd

test=develop

* fix compile issue with gpu

test=develop

* add value clip to avoid exp
```
  0600b370
- Fix allocator bug (#16712) · c6189637
  Authored by Zeng Jinle on May 23, 2019
```
* Revert "Revert "Fix allocator bug""

This reverts commit 174d0d0b.

* Revert "fix travis ci"

This reverts commit 5656fa9f.

test=develop

* add inlined_vector.h, test=develop

* add inlined_vector_test,test=develop
```
  c6189637
- Enable elementwise min operator for ngraph (#17521) · 03577151
  Authored by mozga-intel on May 23, 2019
  
  03577151
- fix API python example (#17226) · cf60e5a2
  Authored by Kaipeng Deng on May 23, 2019
```
* fix api example. test=develop

* fix API.spec. test=develop

* fix spectral_norm format. test=develop

* merge develop

* add import. test=develop

* fix indent. test=develop

* fix indent. test=develop

* add import fluid. test=develop
```
  cf60e5a2
- fix distribute doc test=develop (#17318) · 92e7d5d7
  Authored by Qiao Longfei on May 23, 2019
```
* fix distribute doc
```
  92e7d5d7
- Fix GetExpectedKernelType in Concat op (#17459) · c1aae8b8
  Authored by jerrywgz on May 23, 2019
```
* fix concat op vartype check, test=develop
```
  c1aae8b8
- Async exe support communicator (#17386) · 58f7695a
  Authored by Qiao Longfei on May 23, 2019
```
Async exe support communicator
```
  58f7695a
- fix trt ci bug temporary. (#17565) · 38da1030
  Authored by Zhaolong Xing on May 23, 2019
```
ban all trt ut. will fix it later.

test=develop
```
  38da1030
- [NGraph] Enable reshape operator test=develop (#17512) · 109b5aed
  Authored by mozga-intel on May 22, 2019
  
  109b5aed
- fix bpr_loss data_norm teacher_student_sigmoid_loss api & fix continuous_value_model (#17331) · 9bb6a421
  Authored by zhang wenhui on May 23, 2019
```
* fix bpr data_norm teacher_student_sigmoid , test=develop test=document_preview

Fixed the three APIs bpr_loss, data_norm and teacher_student_sigmoid_loss, and fixed English spelling errors in the continuous_value_model documentation.
```
  9bb6a421
- fix api-doc related bugs test=develop test=document_preview (#17360) · 300bd750
  Authored by lijianshe02 on May 23, 2019
```
* fix api doc according to the reviewer's comment test=develop
```
  300bd750
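
Commit 6e11f977 in the list above adds an exponential moving average of parameters. As a hedged sketch of the general technique (not Paddle's API), the hypothetical `EMA` class below keeps a shadow value per parameter, updates it as `decay * shadow + (1 - decay) * value`, and applies the standard bias correction `1 - decay ** steps` when reading the average. The class name, its methods and the bias-correction step are assumptions for illustration.

```python
class EMA:
    """Exponential moving average over named scalar parameters (illustrative)."""

    def __init__(self, decay=0.999):
        self.decay = decay
        self.shadow = {}  # running averages, keyed by parameter name
        self.steps = 0

    def update(self, params):
        """Fold the current parameter values into the running averages."""
        self.steps += 1
        for name, value in params.items():
            prev = self.shadow.get(name, 0.0)
            self.shadow[name] = self.decay * prev + (1.0 - self.decay) * value

    def averaged(self):
        """Return bias-corrected averages; empty before the first update."""
        if self.steps == 0:
            return {}
        correction = 1.0 - self.decay ** self.steps
        return {name: value / correction for name, value in self.shadow.items()}


ema = EMA(decay=0.9)
for weight in [1.0, 1.0, 1.0]:
    ema.update({"w": weight})
avg = ema.averaged()
print(avg)
```

The bias correction matters early in training: without it the shadow value starts near zero and underestimates the parameter for roughly the first `1 / (1 - decay)` steps.
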
22 May, 2019 · 6 commits

fix bug that saved optimal model path in test_analyzer_save_model con… (#17555) · daf88968
Authored by lijianshe02 on May 22, 2019
```
* modify saved model path in analyzer_save_model.cc test=develop
```
daf88968
Enable square operator for the nGraph Bridge. (#17551) · 43d15b9d
Authored by Krzysztof Binias on May 22, 2019
```
test=develop
```
43d15b9d
[NGraph] add increment op to ngraph engine (#16929) · f86f49e7
Authored by Sevin F. Varoglu on May 21, 2019
```
* add increment op to ngraph engine

test=develop

* fix style errors

test=develop
```
f86f49e7
NGraph enable parse serialized graph test=develop (#17453) · 8923612b
Authored by baojun on May 21, 2019

8923612b

Fix examples of fluid.layers.sums and fluid.layers.DynamicRNN (#17308) · cf5d271c

Authored by Yiqun Liu on May 22, 2019

* Fix examples of fluid.layers.sums.
test=document_preview

* Correct the example of DynamicRNN and its functions.
test=develop

* Add 'import paddle.fluid as fluid' to examples.
test=develop

* Update API.spec.
test=develop

* Add space lines.
test=develop

* Update the API.spec.
test=develop

cf5d271c

Enable the convolution/relu6(bounded_relu) fusion for FP32 on Intel platform. (#17130) · 2281ebf0

Authored by guomingz on May 22, 2019

* Relu6 is the bottleneck op for Mobilenet-v2. Since MKLDNN supports the conv/relu6 fusion, we implement this fusion as a pass. Because int8 support for this fusion will only arrive in MKLDNN v0.20, this PR focuses on the fp32 optimization.

The table below shows the benchmark (FPS) measured on skx-8180 (28 cores):

Batch size | with fusion | without fusion
-- | -- | --
1 | 214.7 | 53.4
50 | 1219.727 | 137.280

test=develop

* Fix the format issue

test=develop

* Add the missing nolint comments.

test=develop

* Fix the typos.

test=develop

* Register the conv_brelu_mkldnn_fuse_pass for the MKLDNN engine.

test=develop

* Adjust the indentation.

test=develop

* Add the test_conv_brelu_mkldnn_fuse_pass case.

test=develop

* Slightly update the code per Baidu comments.
Embed the parameter definitions in the code,
which makes the code easier to understand.

test=develop

2281ebf0
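
Commit 2281ebf0 fuses a convolution with relu6 (bounded relu). To make the operation concrete, the sketch below defines bounded relu as `min(max(x, 0), upper)` and applies it either as a separate step after a tiny 1-D convolution or inside the same loop. Every function here is a hypothetical stand-in written for illustration; the real pass rewrites the graph so that MKL-DNN runs the activation as part of the convolution primitive.

```python
def bounded_relu(x, upper=6.0):
    """Clamp x to the range [0, upper]; upper=6 gives relu6."""
    return min(max(x, 0.0), upper)


def conv1d(signal, kernel):
    """Valid 1-D cross-correlation of signal with kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def conv1d_brelu_fused(signal, kernel, upper=6.0):
    """Same result as conv1d followed by bounded_relu, in a single pass."""
    k = len(kernel)
    return [
        bounded_relu(sum(signal[i + j] * kernel[j] for j in range(k)), upper)
        for i in range(len(signal) - k + 1)
    ]


signal = [1.0, -2.0, 3.0, 8.0]
kernel = [1.0, 1.0]

unfused = [bounded_relu(v) for v in conv1d(signal, kernel)]
fused = conv1d_brelu_fused(signal, kernel)
print(fused)  # [0.0, 1.0, 6.0]
```

The fused version never writes the raw convolution output to memory, which is consistent with the throughput gain reported in the benchmark table of the commit message.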

21 May, 2019 · 4 commits

Add LAMB Optimizer support (#17489) · f9796b12

Authored by Yibing Liu on May 21, 2019

* Add LAMB optimizer

* Expose LAMB Optimizer's APIs

test=develop, test=document_preview

* Cleanup code & doc

test=develop, test=document_preview

* Update lamb optimizer's formula

test=develop

f9796b12
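
Commit f9796b12 adds the LAMB optimizer, and the later commit e8990e64 fixes its trust ratio. The sketch below follows the update rule as it is usually stated for LAMB: Adam-style moments, a weight-decay term, and a per-layer trust ratio `||w|| / ||update||` that rescales the step. It is an assumption-laden illustration, not Paddle's kernel: the function name, default hyperparameters, the use of bias correction and the fallback ratio of 1.0 are all chosen here for the example.

```python
import math


def lamb_step(w, g, m, v, step, lr=0.001, beta1=0.9, beta2=0.999,
              eps=1e-6, weight_decay=0.01):
    """One LAMB update for a single layer given as flat lists (illustrative)."""
    # Adam-style first and second moment estimates.
    m = [beta1 * mi + (1.0 - beta1) * gi for mi, gi in zip(m, g)]
    v = [beta2 * vi + (1.0 - beta2) * gi * gi for vi, gi in zip(v, g)]
    m_hat = [mi / (1.0 - beta1 ** step) for mi in m]
    v_hat = [vi / (1.0 - beta2 ** step) for vi in v]

    # Adam direction plus decoupled weight decay.
    update = [
        mh / (math.sqrt(vh) + eps) + weight_decay * wi
        for mh, vh, wi in zip(m_hat, v_hat, w)
    ]

    # Layer-wise trust ratio; fall back to 1.0 if either norm is zero.
    w_norm = math.sqrt(sum(wi * wi for wi in w))
    u_norm = math.sqrt(sum(ui * ui for ui in update))
    trust_ratio = w_norm / u_norm if w_norm > 0.0 and u_norm > 0.0 else 1.0

    new_w = [wi - lr * trust_ratio * ui for wi, ui in zip(w, update)]
    return new_w, m, v, trust_ratio


w0 = [3.0, 4.0]
new_w, m1, v1, ratio = lamb_step(w0, g=[0.1, -0.2], m=[0.0, 0.0], v=[0.0, 0.0], step=1)
step_norm = math.sqrt(sum((a - b) ** 2 for a, b in zip(new_w, w0)))
print(new_w, ratio, step_norm)
```

Because of the trust ratio, the length of the applied step equals `lr * ||w||` regardless of the gradient scale, which is the property that lets LAMB use large batch sizes.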

Enabled ngraph elementwise max operator (#17517) · 99ab5712
Authored by mozga-intel on May 21, 2019

99ab5712
remove unused SERIAL compiler option (#17500) · 3d19f44a
Authored by Tao Luo on May 21, 2019
```
test=develop
```
3d19f44a

Add api doc code examples (#17285) · dfdcd918

Authored by zhaoyuchen2018 on May 21, 2019

* Add api doc code examples

add or fix topk, squeeze, stack, StaticRNN,
StaticRNN memory in doc

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Add squeeze md5.

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Add import package

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

dfdcd918

机器未来 / Paddle is in sync with its fork source project
