Commits · ac92e4c0669fdb75b1b3043fe81bb71c6b54bd84 · 机器未来 / Paddle

30 May, 2019 · 2 commits

Optimize recurrent_op using Prepare and RunPreparedContext, avoiding create... · 2704479b
Committed by Yiqun Liu on May 30, 2019
```
Optimize recurrent_op using Prepare and RunPreparedContext, avoiding create operators in every iter. (#17689)

test=develop
```
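The idea named in this commit title, preparing the step block once and then only running the prepared context in each iteration, can be sketched as follows. This is a toy illustration and not Paddle's C++ implementation: `Op`, `prepare`, `run_prepared_context`, and `run_recurrent` are hypothetical stand-ins that only mirror the names in the commit message.

```python
class Op:
    """Toy operator; `created` counts how many instances were ever built."""
    created = 0

    def __init__(self, fn):
        Op.created += 1
        self.fn = fn

    def run(self, scope):
        self.fn(scope)


def prepare(block):
    """Build every operator of the step block once and return them as a context."""
    return [Op(fn) for fn in block]


def run_prepared_context(ctx, scope):
    """Execute already-built operators against the given scope."""
    for op in ctx:
        op.run(scope)


def run_recurrent(block, steps):
    """Run the step block `steps` times, creating operators only once."""
    scope = {"h": 0}
    ctx = prepare(block)  # operator creation happens outside the loop
    for _ in range(steps):
        run_prepared_context(ctx, scope)
    return scope["h"]


def _add_one(scope):
    scope["h"] += 1


def _double(scope):
    scope["h"] *= 2


# Hypothetical two-operator step block, executed for three time steps.
STEP_BLOCK = [_add_one, _double]
result = run_recurrent(STEP_BLOCK, steps=3)
```

In this sketch the number of `Op` instances stays at the size of the step block regardless of how many steps run, which is the saving the commit title describes.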

Enable less_than ngraph operator (#17642) · 9b998764

Committed by pawelpiotrowicz on May 30, 2019

* Enable less_than ngraph operator

test=develop

* Added compare unit-tests test=develop

* Update: date && removed import test=develop


29 May, 2019 · 6 commits
- remove useless input 'Softmax@GRAD' from softmax_with_cross_entropy op (#17612) · 4ff87c04
  Committed by hutuxian on May 29, 2019
- [NGraph] Add reduce_sum operator for Ngraph (#17450) · 70a887af
  Committed by pawelpiotrowicz on May 29, 2019
```
test=develop
```
- add depthwise_conv2d op to ngraph engine (#17454) · 29baca0d
  Committed by baojun on May 29, 2019
```
* add depthwise_conv2d test=develop

* use cpu for ngraph test=develop
```
- fix 2dconn test=develop (#17681) · 0d561ef4
  Committed by gongweibao on May 29, 2019
- [Lite] Enable cast operator test=develop (#17294) · ccf9e232
  Committed by mozga-intel on May 28, 2019
- Optimize the concat and split kernel for special cases when the number of... · 5782ddda
  Committed by Yiqun Liu on May 29, 2019
```
Optimize the concat and split kernel for special cases when the number of inputs/outputs is 2 (#17415)

* Optimize the concat and split kernel for special cases that the number of inputs/outputs is 2.
test=develop

* Refine codes.
test=develop

* Correct the condition.
test=develop

* Move the define of tmp_data outside the if statement.

* Print the cudnn minor version.
test=develop

* Fix the case when in_num/o_num is 1 in concat/split op.
test=develop

* Remove const_cast.
test=develop
```
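The special-casing described in commit 5782ddda (a dedicated path when there are exactly two inputs, plus the later fix for a single input) can be pictured as a dispatch on the input count. The sketch below is only an assumption about the general shape of such a dispatch, written for column-wise concatenation of nested lists. It says nothing about the actual CUDA kernels, and every function name in it is hypothetical.

```python
def concat_cols_generic(mats):
    """General path: walk every input for every output row."""
    rows = len(mats[0])
    return [[x for m in mats for x in m[r]] for r in range(rows)]


def concat_cols_two(a, b):
    """Dedicated path for exactly two inputs: join matching rows directly."""
    return [row_a + row_b for row_a, row_b in zip(a, b)]


def concat_cols(mats):
    """Pick a path based on the number of inputs."""
    if len(mats) == 1:
        # Single input: concatenation degenerates to a copy.
        return [row[:] for row in mats[0]]
    if len(mats) == 2:
        return concat_cols_two(mats[0], mats[1])
    return concat_cols_generic(mats)


a = [[1, 2], [3, 4]]
b = [[5], [6]]
joined = concat_cols([a, b])
```

Both paths must produce the same result for two inputs; the dedicated path only avoids the bookkeeping that the general path needs for an arbitrary number of inputs.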
28 May, 2019 · 2 commits

Improve mobilenetv2 INT8 performance by using INT8 relu as post-op (#17570) · 04b6c29e

Committed by lidanqing on May 28, 2019

* add INT8 conv+relu6 fuse and enable mobilenetv2 INT8 test
test=develop

* change false and 0.0 to fuse_brelu and brelu_threshold
test=develop

change the "fuse_relu||fuse_brelu" to "unsigned_output"
test=develop

* Use relu instead of brelu as INT8 post-op because INT8 brelu is not enabled in mkldnn v0.18
test=develop

* continuous-integration fix
test=develop


Revert "Enable SQRT operator for the nGraph Bridge (#17549)" (#17680) · 962eed6f
Committed by Tao Luo on May 28, 2019
```
This reverts commit f34830e2.
```

27 May, 2019 · 7 commits

Enable SQRT operator for the nGraph Bridge (#17549) · f34830e2
Committed by Krzysztof Binias on May 27, 2019
```
* Enable sqrt operator for the nGraph Bridge.

test=develop

* Update activation_op.h
```

add Concat quantization (#17448) · 96845d21

Committed by Sylwester Fraczek on May 27, 2019

* add Concat quantization
add unit test for quantizing concat
fix for wrong value when the input is not in map of calculated scales
add use_quantizer to concat_op.cc
add scale_algo rules for concat

test=develop

* missing fix for multiple inputs quantize-squash

* wojtuss review fix: adding comment

test=develop


Add multi-ncclcomm and 2D ncclallreduce support. (#17263) · 65bbf950
Committed by gongweibao on May 27, 2019

[NGraph] Enable gelu operator for the nGraph Bridge. (#17547) · b1bd483a
Committed by Krzysztof Binias on May 27, 2019
```
test=develop
```

Polish Print Op (#17651) · 34301732
Committed by chengduo on May 27, 2019
```
* enhance print
```

Code clean of Allocator (#17602) · 4aa931dd

Committed by Zeng Jinle on May 27, 2019

* Revert "Revert "Fix allocator bug""

This reverts commit 174d0d0b.

* Revert "fix travis ci"

This reverts commit 5656fa9f.

test=develop

* add inlined_vector.h, test=develop

* add inlined_vector_test,test=develop

* clean code of allocator,test=develop

* delete zero_size_allocator.h,test=develop

* fix failed unittest,test=develop


Fix the usage of out_grad lod in sequence_slice_op. (#17625) · 430e2565
Committed by Guo Sheng on May 27, 2019
```
test=develop
```

25 May, 2019 · 3 commits

Gather Op Index Support int64_t datatype (#17610) · 1670db5e

Committed by hutuxian on May 25, 2019

* gather_op support int64_t index by adding a template typename

* add UT and rename typename

test=develop


Enable elementwise pow operator for ngraph (#17526) · 2b83d75b
Committed by mozga-intel on May 24, 2019

TRT: Support set dynamic range in int8 mode. (#17524) · 61221ebc

Committed by Zhaolong Xing on May 25, 2019

* fluid int8 train and trt int8 predict align.
trt int8 predict init
op converter

* 2. align fluid int8 train and trt int8 inference.
enhance quant dequant fuse pass
enhance op converter, trt engine, trt engine op, trt subgraph pass.

* 3. add delete_quant_dequant_pass for trt

test=develop

* 4. add the missing file
test=develop

* 5. I modified the C++ interface but forgot to update the pybind code
fix the IS_TRT_VERSION_GE bug, and fix elementwise op converter
test=develop


24 May, 2019 · 9 commits

[MKL-DNN] Add Fully Connected Op for inference only (#15226) · 0c39b97b

Committed by Michał Gallus on May 24, 2019

* fuse mul and elementwise add to fc

* Reimplement the FC forward operator

* Fix FC MKLDNN integration by transposing weights

* Add FC MKLDNN Pass

test=develop

* FC MKLDNN Pass: change memcpy to std::copy

* Fix MKLDNN FC handling of mismatch input and weights dims

* Lower tolerance for MKL-DNN in resnet50 test

test=develop

* Adjust FC to support MKLDNN Op placement

test=develop

* Adjust Placement Op to set use_mkldnn attribute for graph

test=develop

* MKLDNN FC: fix weights format so that gemm version is called

test=develop

* FC MKLDNN: Remove tolerance decrease from tester_helper

* FC MKL-DNN: Refactor the code, change input reorder to weight reorder

* MKL-DNN FC: Introduce operator caching

test=develop

* FC MKL-DNN: Fix the tensor type in ExpectedKernelType

test=develop

* FC MKL-DNN: fix style changes

test=develop

* FC MKL-DNN: fallback to native on non-supported dim sizes

test=develop

* FC MKLDNN: fix CMake paths

test=develop

* FC MKLDNN: Refine placement pass graph mkldnn attribute

test=develop

* Fix Transpiler error for fuse_conv_eltwise

test=develop

* Fix missing STL includes in files

test=develop

* FC MKL-DNN: Enable new output size computation

Also, refine pass to comply with newest interface.
test=develop

* FC MKL-DNN: enable only when fc_mkldnn_pass is enabled

* FC MKL-DNN: Allow Weights to use oi or io format

* FC MKL-DNN: Adjust UT to work with correct dims

test=develop

* Enable MKL DEBUG for resnet50 analyzer

test=develop

* FC MKL-DNN: Improve Hashing function

test=develop

* FC MKL-DNN: Fix shape for fc weights in transpiler

* FC MKL-DNN: Update input pointer in re-used fc primitive

* Add log for not handling fc fuse for unsupported dims

test=develop

* FC MKL-DNN: Move transpose from pass to Op Kernel

test=develop

* FC MKL-DNN: Disable transpose in unit test

test=develop

* FC MKL-DNN: Remove fc_mkldnn_pass from default list

* Correct Flag for fake data analyzer tests

test=develop

* FC MKL-DNN: Add comment about fc mkldnn pass disablement

test=develop

* FC MKL-DNN: Disable fc in int8 tests

test=develop


Enable logical operators for the nGraph Bridge. (#17543) · e9216d06
Committed by Krzysztof Binias on May 24, 2019
```
test=develop
```

Fix trust ratio in lamb (#17614) · e8990e64
Committed by Yibing Liu on May 24, 2019
```
test=develop
```

Add broadcast operators (#17503) · b5f4d5ed

Committed by chengduo on May 24, 2019

* This PR adds broadcast for multi-process. And it could be used in dynamic graph to broadcast parameters.


fix quantize_squash_pass segfault when no tensor linked to Bias (#17292) · bccb0ba4

Committed by Sylwester Fraczek on May 24, 2019

* fix quantize_squash_pass segfault when there is no tensor linked to Bias input

test=develop

* add googlenet test

test=develop

* fix concat CreateKey not using input format

test=develop


[NGraph] Enable elementwise mul operator (#17552) · 0d4cbdad
Committed by mozga-intel on May 23, 2019

[NGraph] Enable assign operator for a ngraph, test=develop (#17437) · f2694e12
Committed by mozga-intel on May 23, 2019
```
* Enable assign operator for a ngraph, test=develop

* Cross_entropy operators need to be updated
```

Enable elementwise sub operator for ngraph (#17527) · cf02cb5e
Committed by mozga-intel on May 23, 2019

[CPU] refine cpu softmax bwd (#17534) · 7ae461eb

Committed by tensor-tang on May 24, 2019

* refine softmax fwd

test=develop

* refine cpu softmax bwd

test=develop

* fix batch size

test=develop

* fix compile issue with gpu

test=develop

* add value clip

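The "add value clip" step in the CPU softmax commits can be illustrated with a numerically guarded softmax: shift each row by its maximum, clip the shifted values from below, then exponentiate and normalize. This is a minimal sketch of the general technique, not the Paddle kernel. The constant `CLIP_THRESHOLD` and its value are assumptions chosen for illustration.

```python
import math

# Assumed lower bound for shifted logits; illustrative only.
CLIP_THRESHOLD = -64.0


def softmax(row):
    """Softmax over one row with max-shift and lower clipping."""
    row_max = max(row)
    # After the shift every value is <= 0; clipping bounds how small exp() gets.
    shifted = [max(x - row_max, CLIP_THRESHOLD) for x in row]
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]


# Widely spread logits that would underflow to exactly 0 without clipping.
probs = softmax([1000.0, 0.0, -1000.0])
```

With the clip in place, very negative inputs still map to a tiny positive probability instead of an exact zero, which keeps later operations such as a logarithm finite.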

23 May, 2019 · 5 commits
- [CPU] refine softmax op fwd on CPU (#17522) · 0600b370
  Committed by tensor-tang on May 23, 2019
```
* refine softmax fwd

test=develop

* fix compile issue with gpu

test=develop

* add value clip to avoid exp
```
- Enable elementwise min operator for ngraph (#17521) · 03577151
  Committed by mozga-intel on May 23, 2019
- Fix GetExpectedKernelType in Concat op (#17459) · c1aae8b8
  Committed by jerrywgz on May 23, 2019
```
* fix concat op vartype check, test=develop
```
- Async exe support communicator (#17386) · 58f7695a
  Committed by Qiao Longfei on May 23, 2019
```
Async exe support communicator
```
- [NGraph] Enable reshape operator test=develop (#17512) · 109b5aed
  Committed by mozga-intel on May 22, 2019
22 May, 2019 · 4 commits

Enable square operator for the nGraph Bridge. (#17551) · 43d15b9d
Committed by Krzysztof Binias on May 22, 2019
```
test=develop
```

[NGraph] add increment op to ngraph engine (#16929) · f86f49e7
Committed by Sevin F. Varoglu on May 21, 2019
```
* add increment op to ngraph engine

test=develop

* fix style errors

test=develop
```
NGraph enable parse serialized graph test=develop (#17453) · 8923612b
Committed by baojun on May 21, 2019

Enable the convolution/relu6(bounded_relu) fusion for FP32 on Intel platform. (#17130) · 2281ebf0

Committed by guomingz on May 22, 2019

* Relu6 is the bottleneck op for Mobilenet-v2. Since MKL-DNN supports conv/relu6 fusion, we implement this fusion as a fuse pass. INT8 support for this fusion will arrive in MKLDNN v0.20, so this PR focuses on the FP32 optimization.

The table below shows the benchmark (FPS) measured on skx-8180 (28 cores)
Batch size | with fusion | without fusion
-- | -- | --
1 | 214.7 | 53.4
50 | 1219.727 | 137.280

test=develop

* Fix the format issue

test=develop

* Add the missing nolint comments.

test=develop

* Fix the typos.

test=develop

* Register the conv_brelu_mkldnn_fuse_pass for the MKLDNN engine.

test=develop

* Adjust the indentation.

test=develop

* Add the test_conv_brelu_mkldnn_fuse_pass case.

test=develop

* Slightly update the code per Baidu comments.
Embed the parameter definition in the code.
That will make the code easier to understand.

test=develop

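The benchmark table in commit 2281ebf0 gives raw FPS only. A short calculation makes the relative gain explicit. The figures below are copied from that table; the dictionary layout and the `speedup` helper are illustrative names, not part of the PR.

```python
# FPS figures copied from the benchmark table in commit 2281ebf0.
BENCHMARK_FPS = {
    1: {"with_fusion": 214.7, "without_fusion": 53.4},
    50: {"with_fusion": 1219.727, "without_fusion": 137.280},
}


def speedup(row):
    """Ratio of fused to unfused throughput for one batch size."""
    return row["with_fusion"] / row["without_fusion"]


speedups = {batch: speedup(row) for batch, row in BENCHMARK_FPS.items()}
for batch, factor in speedups.items():
    print(f"batch size {batch}: {factor:.2f}x")
```

By these numbers the fusion yields roughly a 4x throughput gain at batch size 1 and close to 9x at batch size 50.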

21 May, 2019 · 2 commits

Add LAMB Optimizer support (#17489) · f9796b12

Committed by Yibing Liu on May 21, 2019

* Add LAMB optimizer

* Expose LAMB Optimizer's APIs

test=develop, test=document_preview

* Cleanup code & doc

test=develop, test=document_preview

* Update lamb optimizer's formula

test=develop

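For readers unfamiliar with LAMB, the sketch below shows one update step following the published LAMB formulation: Adam-style moment estimates, a weight-decay term, and a per-layer trust ratio. Whether the Paddle operator matches it term for term (for example in bias correction) is not stated in this log, so treat the function `lamb_step` and its default hyperparameters as assumptions.

```python
import math


def lamb_step(w, g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999,
              eps=1e-6, weight_decay=0.01):
    """One LAMB update for a single layer given weights w and gradient g."""
    # Adam-style first and second moment estimates.
    m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]
    v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, g)]
    # Bias correction for step t (t starts at 1).
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    # Adam direction plus decoupled weight decay.
    update = [mh / (math.sqrt(vh) + eps) + weight_decay * wi
              for mh, vh, wi in zip(m_hat, v_hat, w)]
    # Trust ratio: layer weight norm over update norm.
    w_norm = math.sqrt(sum(x * x for x in w))
    u_norm = math.sqrt(sum(x * x for x in update))
    trust = w_norm / u_norm if w_norm > 0 and u_norm > 0 else 1.0
    w = [wi - lr * trust * ui for wi, ui in zip(w, update)]
    return w, m, v, trust


w0 = [1.0, -2.0]
w1, m1, v1, trust = lamb_step(w0, g=[0.5, 0.5], m=[0.0, 0.0], v=[0.0, 0.0], t=1)
```

A consequence of the trust ratio is that the length of each layer's step equals the learning rate times the norm of that layer's weights, independent of the gradient scale.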

Enabled ngraph elementwise max operator (#17517) · 99ab5712
Committed by mozga-intel on May 21, 2019
