Commits · 3d2a99e7119d2559de69cce3fc347ef7a7a09d13 · PaddlePaddle / Paddle-Lite

14 Apr, 2020 2 commits
- [windows compile] support inference library compiling on windows (#3403) · 58b2d7dd
  Committed by silingtong123 on Apr 14, 2020
  
  58b2d7dd
- [mac python] mac env supports outputs of python lib and installer (#3394) · 3190c354
  Committed by huzhiqiang on Apr 14, 2020
  
  3190c354
09 Apr, 2020 1 commit

Committed by jackzhang235 on Apr 09, 2020

[MLU] add some basic support for MLU, including related passes, kernels, gtests and some api in paddle_api.h
Passes: mlu_subgraph_pass, mlu_postprocess_pass
Kernels: act, batch_norm, concat, conv, elementwise, fc, interpolate, pool, scale, softmax

dc481d49

08 Apr, 2020 3 commits

Add hard_swish, ctc_align and reciprocal op (#3354) · 47869a59

Committed by cc on Apr 08, 2020

* Add hard_swish, ctc_align and reciprocal op, test=develop
* Move some activation ops to extra, test=develop

47869a59

[Core][XPU] Add XPU op kernels (#3274) · 99deb7d9

Committed by hong19860320 on Apr 08, 2020

* [LITE][XPU] bind xpu resnet50 kernels

* [LITE][XPU] fuse resnet50 and encoder

* [LITE][XPU] bind xpu bert kernels

* [LITE][XPU] refine xpu_resnet_fuse_pass.cc

* [LITE][XPU] add xpu stack kernel

* [LITE][XPU] add xpu slice/tanh kernel

* [LITE][XPU] refine resnet50 and encoder fusor

* [LITE][XPU] split resnet50 and multi_encoder op from subgraph_op.h

* [LITE][XPU] clean workspace

* [LITE][XPU] add build script

* [LITE][XPU] fix compilation errors

* [LITE][XPU] fix kernel matmul

* [LITE][XPU] fix kernel ewadd ewsub

* [LITE][XPU] add xpu cast kernel

* [LITE][XPU] fix kernel slice

* [LITE][XPU] switch dev by LITE_XPU_DEV env

* [LITE][XPU] eliminate useless cast op

* [LITE][XPU] add PerThread Ops

* [LITE][X86] add SequenceUnpad op and kernel

* [LITE][XPU] add LITE_WITH_XTCL option

* [LITE][X86] add SequenceConv kernel

* [LITE][XPU] fix cmake dependency

* [LITE][XPU] add xpu sigmoid kernel

* [XPU] Remove the dependencies of framework.pb.h
test=develop

Change-Id: Icfb44efb0482a6369b365b5c09017765328fc10d

* [XPU] Fix the precision of cast kernel
test=develop

Change-Id: Icb18be47d7ab490de9fb9c92eae1165f49dbf492

* [Core] Fix the compiling error when build for the target that disable XPU
test=develop

Change-Id: I38ec53f222391d3bf06b70512e6c3ad1282e4683

* [XPU] Add io_copy kernel for xpu<->arm
test=develop

Change-Id: Iec7ea066f040534285557f9948b73e6a1970aed7

* fix
test=develop

Change-Id: I4db1c93df48e22afbba904ce6c3b0babd9fda4c3

* fix target matching of type_target_cast_pass and remove the unnecessary registration of io_copy kernel
test=develop

Change-Id: I432c10c9d1064e778d43fd0d12d8cf0599252f7a

* [X86] Add the keyword 'template' to avoid the compiling errors
test=develop

Change-Id: I015d5d323adafb3884029c8287ced66c90ad931e

* Fix the build.sh for XPU and x86
test=develop

Change-Id: I7d9575243669ce02af69a8ddbd6421db31902bd6

* [XPU] Add the keyword 'template' to avoid the compiling errors
test=develop

Change-Id: I46d0b3b6861286a73ee2999934b8e185e453e749

* [XPU] Add XTCL compiling option in build.sh
test=develop

Change-Id: I8b3fd998ca5f898d5bd2e665646e3874b3b73c80

* fix namespace conflicts, test=develop

* [API][XPU] Move the XPU related APIs into CxxConfig
test=develop

Change-Id: I75ac35e8bae96bcb835683f413f01b9db45afbf9

* [API][XPU] Remove the LITE_WITH_XPU in paddle_api.h
test=develop

Change-Id: Idbd64013bdf331ad876919511c1c349332d46f93

* [API][XPU] Remove XPUSetWorkspaceL3SizePerThread and XPUSetDevPerThread
test=develop

Change-Id: I515958f56f8e129280bae61c923513cc91fb9728

* [API][Core][XPU] Refine the test case and remove the unnecessary modifications
test=develop

Change-Id: I1e0e2957a2f9d5f4207b06c0bc98a5ab611fee56

* [Core] Remove useless code
test=develop

Change-Id: I6293faa10424aea2836d09d85ddb6a30f7811678

* [XPU] Refine the test cases
test=develop

Change-Id: I6818fc3addf1bca5b96a7d66ee99263242e3374f

* [XPU] Remove useless scripts and code
test=develop

Change-Id: I965ba6712d3cf881d0038f0473fec27d4c1bc684

* [XPU] Use InferShapeImpl in sequence_unpad, resnet50 and multi_encoder op
test=develop

Change-Id: I5375f524d36836a394d426b4b2bc9fb44be0b59c

* test=develop

Change-Id: I42ee68c8a5e891dd0f3e95d6cfbc498be7cf1519

* test=develop

Change-Id: If679e5aa73e1368e0ee5bd5f286d2e1b4c2f354e

* [XPU] Add __xpu__ prefix to the op and graph pass name of resnet50 and multi_encoder
test=develop

Change-Id: Idb61c99b4b8429cb87665bfd6835ab4d7d263be2

* [XPU] Fix and refine the xpu fuse pass
test=develop

Change-Id: If1c5b6788d994e2809c1a00d9384685a89440907

* test=develop

Change-Id: Icfa333e322fc4351700103692c46cfcb3d4f9a89

* [XPU] Remove the dependency on xpu api for xpu fuse passes
test=develop

Change-Id: I6094b5536f58ae18bab068284b32f9bd10a2ab92

* [XPU] Move unit tests from lite/api to lite/tests/api
test=develop

Change-Id: I7ba27abb23abeffb0c95fdbbefec7ac16cdbd250

* test=develop

Change-Id: I33230c84d6c4e61bf19f46668bae2baa3ef68794

* [XPU] Refine code
test=develop

Change-Id: I37bc5b948b4927e44cd3ea2594ebe3fd7671be06

* [XPU] Add env XPU_ENABLE_XTCL to enable xpu_subgraph_pass
test=develop

Change-Id: Ifb8e07e86f307f562adaca3ce792015a6f2a2204

* [XPU] refine code
test=develop

Change-Id: I1380654b930d51ae704dbc0cd855464d9c3b5b79

* [XPU] Refine code
test=develop

Change-Id: I73285c2718ccd3612490eb2635bef4fd608c9bde

* [XPU] Add comments for the XPU APIs
test=develop

Change-Id: Ieb5015f37984f8869b90c4c625c5894bb26164fd
Co-authored-by: miaotianxiang <miaotianxiang@baidu.com>
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

99deb7d9

[x86] Fix x86 code style (#3287) · 77734ce7

Committed by huzhiqiang on Apr 08, 2020

77734ce7

29 Feb, 2020 1 commit
- split fill_constant & fill_constant_batch_size_like, enable and add uts (#3023) · 2882edb1
  Committed by zhupengyang on Feb 29, 2020
  
  2882edb1
17 Feb, 2020 1 commit
- Fix bug in reduce_ops when keep_dim is true. (#2906) · be6f1fb4
  Committed by Yiqun Liu on Feb 17, 2020
```
test=develop
```
  be6f1fb4
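
For readers unfamiliar with the `keep_dim` attribute named in the commit above, the sketch below shows how it is commonly understood to affect the output shape of a reduce op: the reduced axis is kept with extent 1 when `keep_dim` is true and dropped otherwise. This is an illustration only, not the Paddle-Lite code; `ReducedShape` is a hypothetical helper, and it assumes a single reduction axis.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not from Paddle-Lite): computes the output shape of a
// reduction over one axis. A negative axis counts from the last dimension.
std::vector<int64_t> ReducedShape(const std::vector<int64_t>& in_shape,
                                  int axis,
                                  bool keep_dim) {
  const int rank = static_cast<int>(in_shape.size());
  if (axis < 0) axis += rank;  // normalize a negative axis
  std::vector<int64_t> out_shape;
  for (int i = 0; i < rank; ++i) {
    if (i != axis) {
      out_shape.push_back(in_shape[i]);  // untouched dimension
    } else if (keep_dim) {
      out_shape.push_back(1);  // reduced dimension kept with extent 1
    }
    // otherwise the reduced dimension is dropped
  }
  return out_shape;
}
```

With an input shape of `{2, 3, 4}` and `axis = 1`, the sketch yields `{2, 1, 4}` when `keep_dim` is true and `{2, 4}` when it is false.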
14 Feb, 2020 2 commits

Replace Softsign Eigen with c implementation (#2864) · 9f0a1964
Committed by GaoWei8 on Feb 14, 2020
```
* Replace Softsign Eigen with c implementation
test=develop
```
9f0a1964

[X86] Optimize gru and softmax (#2877) · 0a890050

Committed by Yiqun Liu on Feb 14, 2020

* Optimize softmax. When the input tensor is 2-D and axis is 1, there is no need to resize.

* Optimize the gru, avoid calling Tensor::Slice.
test=develop

* Remove a std::vector in softmax.
test=develop

* Define CalculateSeqWidth to get the width of a sequence.
test=develop

0a890050

11 Feb, 2020 2 commits

Optimize the compute implementation of several operators. (#2843) · 53604bac

Committed by Yiqun Liu on Feb 11, 2020

* Optimize the transform from Paddle's Tensor to EigenVector, avoiding defining multiple DDim.

* Optimize the compute implementation of several operators.
test=develop

53604bac

Optimize the InferShape of several operators. (#2839) · 800f5ce6

Committed by Yiqun Liu on Feb 11, 2020

* Optimize the InferShape of several operators.
test=develop

* Remove the new function, resize and CheckPositive in DDim.
test=develop

* Fix a bug in fc_op's InferShape.
test=develop

800f5ce6

07 Feb, 2020 1 commit
- add leaky_relu op for x86, test=develop (#2819) · 48b2b942
  Committed by liu zhengxi on Feb 07, 2020
  
  48b2b942
15 Jan, 2020 1 commit
- fix var_conv_2d to support cascading use. test=develop (#2766) · d8143103
  Committed by Wilber on Jan 15, 2020
```
- Fix the computation bug that occurs when var_conv_2d is used in cascade
- Explicitly set the lod level to 3 in the x86 var_conv_2d
```
  d8143103
06 Jan, 2020 1 commit
- fix build errors, test=develop (#2728) · 947cda26
  Committed by 石晓伟 on Jan 06, 2020
  
  947cda26
26 Dec, 2019 1 commit
- fix fluid-lite-subgraph x86 compile error test=develop (#2682) · 53a5906c
  Committed by Wilber on Dec 26, 2019
```
- fix fluid-lite-subgraph x86 compile error
    - Replace FLAGS with environment variables
```
  53a5906c
25 Dec, 2019 1 commit

[X86] Polish the implementation of fc and improve the unittest (#2656) · 28481458

Committed by Yiqun Liu on Dec 25, 2019

* Remove GEMM padding in fc_compute.
test=develop

* Write a common ParallelFor function to run the for loop in parallel.

* Add the codes of padding GEMM back in fc.

* Refine the code of fc when padding_weight is false to avoid the definition of temporary Tensor.

* Refine the unit test of fc and add testing case of padding and parallel.
test=develop

* Enable more test cases in common fc unittest, including padding and parallel for x86 target.

* Remove the fc test under kernels/x86.
test=develop

* Disable relu in test of fc for non-x86 target.
test=develop

* Change the eps of arm.
test=develop

28481458

24 Dec, 2019 1 commit

[LITE][NPU][XPU] Support multiple types for XPU and NPU op bridges (#2646) · 05da0c72

Committed by hong19860320 on Dec 24, 2019

* Support multiple types for XPU and NPU op bridges

* Add lookup_table, gather, slice, stack and scale op bridges for supporting BERT

* Fix the definition of lookup_table kernel for X86

05da0c72

19 Dec, 2019 1 commit
- optimize cuda kernel test=develop (#2628) · 09aa15a5
  Committed by Wilber on Dec 19, 2019
```
* optimize content-dnn cuda kernel
```
  09aa15a5
10 Dec, 2019 1 commit
- modify code_style of CMakeList.txt to make ci_build_test_server able to work on gcc_4.8.2 · 4ae43560
  Committed by huzhiqiang on Dec 10, 2019
  
  4ae43560
08 Dec, 2019 1 commit
- Add fc op on lite x86 platform (#2568) · d76c529a
  Committed by liu zhengxi on Dec 08, 2019
  
  d76c529a
02 Dec, 2019 2 commits
- delete useless code for x86 platform (#2535) · 1b875ae8
  Committed by liu zhengxi on Dec 02, 2019
  
  1b875ae8
- Fix the conv compute shape (#2534) · 0a100120
  Committed by liu zhengxi on Dec 02, 2019
  
  0a100120
28 Nov, 2019 1 commit
- add dependency of fluid_data_type into gather_compute_x86 test=develop (#2516) · 1d232071
  Committed by huzhiqiang on Nov 28, 2019
  
  1d232071
27 Nov, 2019 1 commit
- fill_constant op support param shape can be tensor or tensorlist, test=develop (#2459) · 89df8f01
  Committed by myq406450149 on Nov 27, 2019
```
* fill_constant can support shape is tensor or tensorlist
```
  89df8f01
26 Nov, 2019 1 commit
- Add gather op on x86 platform (#2419) · f0683804
  Committed by liu zhengxi on Nov 26, 2019
```
* add gather op on x86 platform and add its unittests, test=develop
```
  f0683804
25 Nov, 2019 1 commit
- update x86 op and kernel to run content-dnn model test=develop (#2481) · a1a41934
  Committed by Wilber on Nov 25, 2019
```
* update x86 op and kernel to run content-dnn model test=develop
```
  a1a41934
22 Nov, 2019 2 commits

update conv 2-pad to 4-pad (#2404) · 820eb6d4

Committed by HappyAngel on Nov 22, 2019

* fix conv 2-pad to 4-pad

* fix compute conv shape

* fix pad, test=develop

* rename conv_depthwise_3x3s1_fp.cc to conv3x3s1p01_depthwise_fp32.cc to distinguish it from conv3x3s1_depthwise_fp32.cc

* delete printf note in conv3x3s1, test=develop

* delete printf note, test=develop

* delete gemm_sdot.h, test=develop

it is copied from __gemm_sdot_meta_.h

* update compute padding, test=develop

* fix padding size, must be 2 or 4. test=develop

* fix format in operators/conv_op.cc, test=develop

* change #if 0 to #if 1, test=develop

* put 2-pad to 4-pad in AttachImpl, test=develop

* fix clang-format error in tests/math/conv_compute_test, test=develop

* fix x86 test result error, test=develop

* add asymmetric padding test case in lite/tests/math/conv_compute.cc, test=develop

* change paddings type to support dynamic modification, test=develop

* fix x86 build error in conv_compute_test, test=develop

* fix opencl build error, test=develop

* fix opencl build error, test=develop

* fix opencl/conv_compute build error, test=develop

* fix opencl/conv_compute build error, test=develop

* fix format in kernels/opencl/conv_compute_test, test=develop

* fix build error, test=develop

fix build error in kernels/x86/conv_compute.h

820eb6d4
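
The "2-pad to 4-pad" change named in this commit can be pictured with the sketch below. It assumes the common convention that a 2-element padding list `{pad_h, pad_w}` expands to `{pad_top, pad_bottom, pad_left, pad_right}` by duplicating each value. `ExpandPaddings` is a hypothetical helper written for illustration, not the function used in Paddle-Lite; the size check mirrors the commit note that the padding size must be 2 or 4.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical helper (not from Paddle-Lite): normalizes conv/pool paddings
// to the 4-element form {top, bottom, left, right}.
std::vector<int> ExpandPaddings(const std::vector<int>& paddings) {
  if (paddings.size() == 4) {
    return paddings;  // already in 4-pad form, possibly asymmetric
  }
  if (paddings.size() != 2) {
    throw std::invalid_argument("paddings size must be 2 or 4");
  }
  // Symmetric case: {pad_h, pad_w} -> {pad_h, pad_h, pad_w, pad_w}.
  return {paddings[0], paddings[0], paddings[1], paddings[1]};
}
```

Under this assumption, `{1, 2}` becomes `{1, 1, 2, 2}`, while an asymmetric list such as `{0, 1, 0, 1}` passes through unchanged.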

update pooling 2-padding to 4-padding (#2410) · a7f7d49b

Committed by HappyAngel on Nov 22, 2019

* fix pooling bug and speed

* fix build error

* delete VLOG in pool, test=develop

* add openmp, test=develop

* fix lite/kernels/arm/pool_compute_test basic_pooling compute error bug, test=develop

* update pooling 2-pad to 4-pad, test=develop

* fix 2-pad to 4-pad in operators/pool_op.h. AttachKernel will set param, so the 2-pad to 4-pad funcs should be put in AttachKernel. test=develop

* put 2-pad to 4-pad in AttachImpl, test=develop

* according to reviews, fix some format error. test=develop

* fix format error, add (). test=develop

* change paddings type to support dynamic modification, test=develop

* update padding type in other devices, test=develop

* fix x86 build error on shared_ptr, test=develop

* fix format in operators pool_op.cc, test=develop

a7f7d49b

20 Nov, 2019 3 commits
- fix x86 search_grnn, add cuda search_grnn and unit test (#2448) · e1b67433
  Committed by juncaipeng on Nov 20, 2019
```
* fix x86 search_grnn and add unit test
* add cuda search_grnn and unit test
```
  e1b67433
- Add layer_norm op on Lite x86 platform (#2463) · fa8c8971
  Committed by liu zhengxi on Nov 20, 2019
  
  fa8c8971
- Add stack op on Lite x86 platform and fix extra cmake error (#2458) · 79f8f42d
  Committed by liu zhengxi on Nov 20, 2019
```
* add stack op and its unit tests, test=develop
```
  79f8f42d
19 Nov, 2019 3 commits
- add search_seq_softmax op; register search_seq_softmax x86 kernel and cuda kernel (#2445) · f9930fc1
  Committed by zhupengyang on Nov 19, 2019
```
test=develop
```
  f9930fc1
- add x86 kernels: search_fc and sequence_topk_avg_pooling (#2443) · 68fe5b5c
  Committed by huzhiqiang on Nov 18, 2019
```
* add x86 op and kernel : search_fc and sequence_topk_avg_pooling   for content-dnn model test=develop
```
  68fe5b5c
- [X86][CUDA] add attention_padding_mask op, x86 kernel, cuda kernel and unit tests (#2437) · ef6f7b84
  Committed by zhupengyang on Nov 19, 2019
```
* [X86] add attention_padding_mask op, x86 kernel and unit test

test=develop

* [CUDA] add attention_padding_mask cuda kernel and unit test

test=develop
```
  ef6f7b84
18 Nov, 2019 3 commits
- add var_conv_2d cuda kernel and unit test test=develop (#2441) · 884c840d
  Committed by Wilber on Nov 18, 2019
```
- add var_conv_2d cuda kernel

- add var_conv_2d cuda kernel unit test

- temporarily set to two input mode, remove input(ROW) and input(COLUMN)
```
  884c840d
- add search_group_padding op and x86 kernel, test=develop (#2440) · 1e88d1e8
  Committed by Pei Yang on Nov 18, 2019
```
add search_group_padding op and x86 kernel
```
  1e88d1e8
- [X86][CUDA] add sequence_arithmetic op, x86 kernel, cuda kernel and unit test (#2436) · 8599c042
  Committed by zhupengyang on Nov 18, 2019
```
* [X86][CUDA] add sequence_arithmetic op , x86 kernel, cuda kernel and unit test

test=develop

* add sequence_arithmetic cuda kernel unit test

test=develop
```
  8599c042
16 Nov, 2019 1 commit
- [LITE][X86] Add search_aligned_mat_mul and search_seq_fc op for X86 (#2428) · 78f76834
  Committed by hong19860320 on Nov 16, 2019
  
  78f76834
15 Nov, 2019 1 commit

Add content-dnn ops (#2429) · 603b810f

Committed by juncaipeng on Nov 15, 2019

* add search_seq_depadding x86 and cuda
* add match_matrix_tensor x86
* add search_grnn x86, no test

603b810f