Commits · 3df38f5cdd0866c1e78f1c2674d3d6cf3166d35f · 机器未来 / Paddle

10 Jan, 2020 · 1 commit

[cherry-pick] Add FC padding, ernie test unit and layernorm parallel (#22198) · 3df38f5c

Committed by GaoWei8 on Jan 10, 2020

* Optimize the kernel implementation of layernorm with openmp (#20895)

* Add ernie c++ inference test (#21015)

* Add ernie unit test
test=develop

* Add ernie unit test
test=develop

* Add ernie unit test
test=develop

* remove ngraph

* optimize gpu test
test=develop

* optimize codes
test=develop

* fix cmake fails on inference_download_and_uncompress (#21185)

* solve cmake fails on inference_download_and_uncompress
test=develop

* solve cmake fails on inference_download_and_uncompress
test=develop

* Add fc padding to improve mkl GEMM's performance when N and K are multiple of 128. (#20972)

* Add fc padding to solve mkl performance
test=develop

* fix gpu pass and error information
test=develop

* fix fc_fuse_pass_test
test=develop

* fix error information
test=develop

* fix error information
test=develop

* fix name and add fc op padding test
test=develop

* fix attributes
test=develop

* optimize fc padding
test=develop

* fix test
test=develop

* Polish the codes of fc when needs padding (#21378)

test=develop

* Add ernie large c++ inference test (#21365)

* add ernie-large test
test=develop

* add ernie large c++ inference test
test=develop

* Modify padding strategy: remove weight copy in fc padding (#21650)

test=develop

* optimize fc jit (#21878)

test=develop
Co-authored-by: Yihua Xu <yihuaxu@hotmail.com>


29 Oct, 2019 · 1 commit

[Cherry-pick to 1.6] Block part of "tensor should not be null" error message (#20845) · d29e9aa4

Committed by Chen Weihang on Oct 29, 2019

* Add IndicateVarDataType interface to block tensor is not initialized problem in OP GetExceptedKernelType (#20044)

* add indicate_var_data_type interface, test=develop

* add unittests & polish error message, test=develop

* remove needless include, test=develop

* extract public function & polish message, test=develop

* delete empty var check, test=develop

* change data_type to pointer parameter, test=develop

* polish details, test=develop

* Replace risky GetInputType method with secure IndicateVarDataType interface (#20668)

* replace part of the old implementation, test=develop

* restore concat op, test=develop

* update all ops implementation & delete GetDataTypeOfVar func, test=develop

test=release/1.6


16 Sep, 2019 · 1 commit

Enhance fc_fuse_pass to enable fusing relu to fc_op (#19733) · c67c8758

Committed by Yiqun Liu on Sep 16, 2019

* Refine the codes related to fc op.

* Add GPU implementation for fc functor.

* Apply fc_fuse_pass in GPU inference.
test=develop

* Change the cmake for fc op.

* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.

* Add an attribute to set the activation type in fc_op.

* Enhance the unittest of fc_op.
test=develop

* Remove the declaration of FCOpGrad back to the header file.
test=develop

* Set default value for newly added arguments in test_fc_op.
test=develop

* Enhance fc_fuse_pass to enable fusing relu.

* Allow print the shapes of var_desc in graph.
test=develop

* Enhance fc_fuse_pass_tester.

* Remove the use of PADDLE_ENFORCE.
test=develop

* Correct the number of ops after fusing.
test=develop

* Fix a typo.
test=develop

* Set activation_type to null when there is no relu in fc.
test=develop

* Refine fc_fuse_pass's codes.

* Enable the set of shape for tensor.

* Refine repeated_fc_relu_pass and add unittest.
test=develop


11 Sep, 2019 · 1 commit

Implement the GPU kernel of fc operator (#19687) · a65c728e

Committed by Yiqun Liu on Sep 11, 2019

* Refine the codes related to fc op.

* Add GPU implementation for fc functor.

* Apply fc_fuse_pass in GPU inference.
test=develop

* Change the cmake for fc op.

* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.

* Add an attribute to set the activation type in fc_op.

* Enhance the unittest of fc_op.
test=develop

* Remove the declaration of FCOpGrad back to the header file.
test=develop

* Set default value for newly added arguments in test_fc_op.
test=develop


19 Mar, 2019 · 1 commit

add allocator flags · 22715487

Committed by zhhsplendid on Mar 19, 2019

test=develop

18 Mar, 2019 · 1 commit

refine with comments · d9f0e725

Committed by luotao1 on Mar 18, 2019

test=develop

15 Mar, 2019 · 1 commit

refine fc_infershape · 721c2c00

Committed by luotao1 on Mar 15, 2019

test=develop

19 Feb, 2019 · 1 commit

fix warnings (#15790) · e1c707fe

Committed by tensor-tang on Feb 19, 2019

* fix warnings

test=develop

* fix enforce test

test=develop

18 Dec, 2018 · 1 commit

rewrite ddim · a500dfa5

Committed by sneaxiy on Dec 18, 2018

test=develop

12 Dec, 2018 · 1 commit

Change tensor uses proto::VarType::type · 9bd70a1e

Committed by Yu Yang on Dec 11, 2018

test=develop

14 Nov, 2018 · 2 commits

fix typo to pass the ci · 980a6753

Committed by Tao Luo on Nov 14, 2018

test=develop

add in_num_col_dims for fc · 8ea13e33

Committed by Tao Luo on Nov 14, 2018

21 Aug, 2018 · 1 commit

fea/link ir to inference analysis and fc fuse support (#12789) · 896a37b6

Committed by Yan Chunwei on Aug 21, 2018

* link IR graph to analysis graph

* add clean code and update

* add infer_clean_pass

* add ir_pass_manager

* support fc fuse execution

* fix ir circle


16 Aug, 2018 · 2 commits

fix lod and op test · df28a3b4

Committed by tensor-tang on Aug 16, 2018

refine fc and use the fc compute in fusion_lstm · f3cd2612

Committed by tensor-tang on Aug 16, 2018

14 Aug, 2018 · 3 commits

fix unknown omp pragmas · 742300ba

Committed by tensor-tang on Aug 14, 2018

enable fc op in normal case · 4b5986bb

Committed by tensor-tang on Aug 14, 2018

enable native fc forward · e133df60

Committed by tensor-tang on Aug 13, 2018

13 Aug, 2018 · 1 commit

add bias for fc op · 038cbf79

Committed by tensor-tang on Aug 13, 2018

07 Jun, 2018 · 1 commit

Mkldnn layout (#11040) · 3ff9ba0e

Committed by mozga-intel on Jun 07, 2018

* Add MKLDNN layout support in Paddle

Add MKLDNN layout in Paddle so that an MKLDNN-friendly memory layout
can be used in MKLDNN-enabled OP kernels. Before this commit, NCHW
was hardcoded in all MKLDNN op kernels. As a result, a
non-optimized execution path was selected in the MKLDNN primitive, which
hurt performance.
Besides the framework change, three MKLDNN OP kernels were updated
to use the new MKLDNN layout: conv, pool2d, and batch_norm.
Other MKLDNN OP kernels also need to be updated in a similar way to
achieve the best performance.

* Add MKLDNN layout support in activation OP

* Don't populate layout from input to output when kMKLDNN in

* Refine pool mkldnn op kernel

* MKLDNN layout

* Remove the inheritance from tensor file

* MKLDNN layout: refactoring

* Remove additional #define to register new operator

* Prepare mkldnn tests to work with layout


08 May, 2018 · 1 commit

Clean OpProtoAndCheckerMaker · 0e78cb69

Committed by Yu Yang on May 08, 2018

Do not use ctor

* Reduce line of codes.
* We can use virtual function for Maker now.
* The implementation does not care what maker holds, it is easier to
refactor later.


19 Apr, 2018 · 1 commit

add semicolon to op registry (#10034) · e04c43d5

Committed by Yang Yang(Tony) on Apr 18, 2018

* script to add semicolon

* fix typo

17 Apr, 2018 · 1 commit

script to fix all · ce7c2e86

Committed by Yang Yang on Apr 16, 2018

03 Apr, 2018 · 3 commits

Enforce: 2 and 4 dims, remove information about out in format · 46e14bbc

Committed by mozga-intel on Apr 03, 2018

Remove additional message · 32f8ac7d

Committed by mozga-intel on Mar 30, 2018

Added new fc files, register fc kernel · 34a80843

Committed by mozga-intel on Mar 29, 2018
