Commits · f16090b5085fc28b7b7c354efdc47288bef47b65 · PaddlePaddle / Paddle-Lite

01 Jun, 2020 1 commit
- [CUDA] [Framework] [FP16] Lite framework support fp16. (#3673) · c0a8e2dd
  Committed by Wilber on Jun 01, 2020
  
  c0a8e2dd
28 May, 2020 1 commit

[Libsize] Reduce size of dynamic library ".so" (#3717) · ec8ef528

Committed by T8T9 on May 28, 2020

* reduce .so size. test=develop

* compile all targets when LITE_ON_TINY_PUBLISH=OFF

* unordered_map is more convenient when key is customized class

* test=develop

ec8ef528

09 May, 2020 1 commit
- fix graphics memory leak problem. test=develop (#3598) · 89ec0241
  Committed by Wilber on May 09, 2020
  
  89ec0241
08 May, 2020 1 commit
- add eltwise_activate fuse. test=develop (#3367) · 2a344823
  Committed by Wilber on May 08, 2020
```
* add eltwise_activate_fuse. test=develop
```
  2a344823
13 Apr, 2020 1 commit
- lite cuda support exec multi-stream. (#2949) · 4a7284f9
  Committed by Wilber on Apr 13, 2020
```
lite cuda support exec multi-stream
```
  4a7284f9
01 Apr, 2020 1 commit
- add cuda kernels. test=develop (#3315) · 91a58fba
  Committed by Wilber on Apr 01, 2020
```
add cuda kernel.

abs, tanh, elementwise_sub
```
  91a58fba
17 Mar, 2020 1 commit
- For cuda compilation products and ci (#3152) · 774b4652
  Committed by Wilber on Mar 17, 2020
```
add cuda ci.

Organize cuda compilation products.
```
  774b4652
20 Feb, 2020 1 commit
- Optimize cuda kernel and remove io_copy added by default due to missing fetch_cuda kernel (#2920) · 823f0dae
  Committed by Wilber on Feb 20, 2020
```
Optimize cuda kernel and remove io_copy added by default due to missing fetch_cuda kernel
```
  823f0dae
17 Jan, 2020 1 commit
- support dynamic cuda libs, test=develop (#2780) · 9343782b
  Committed by 石晓伟 on Jan 17, 2020
  
  9343782b
19 Dec, 2019 1 commit
- optimize cuda kernel test=develop (#2628) · 09aa15a5
  Committed by Wilber on Dec 19, 2019
```
* optimize content-dnn cuda kernel
```
  09aa15a5
15 Dec, 2019 1 commit
- optimize search_grnn test=develop (#2608) · dad43f81
  Committed by Wilber on Dec 15, 2019
```
optimize search_grnn
```
  dad43f81
04 Dec, 2019 1 commit

[cuda] [int8] resnet50 cuda int8 support (#2417) · f7574646

Committed by Zhaolong Xing on Dec 04, 2019

* init resnet cuda int8 support
test=develop

* refine cuda unit test
test=develop

* add the forgotten file.
test=develop

f7574646

22 Nov, 2019 1 commit

update conv 2-pad to 4-pad (#2404) · 820eb6d4

Committed by HappyAngel on Nov 22, 2019

* fix conv 2-pad to 4-pad

* fix compute conv shape

* fix pad, test=develop

* rename conv_depthwise_3x3s1_fp.cc to conv3x3s1p01_depthwise_fp32.cc to distinguish it from conv3x3s1_depthwise_fp32.cc

* delete printf note in conv3x3s1, test=develop

* delete printf note, test=develop

* delete gem_sdot.h, test=develop

it is copied from __gemm_sdot_meta_.h

* update compute padding, test=develop

* fix padding size, must be 2 or 4. test=develop

* fix format in operators/conv_op.cc, test=develop

* change #if 0 to #if 1, test=develop

* put 2-pad to 4-pad in AttachImpl, test=develop

* fix clang-format error in tests/math/conv_compute_test, test=develop

* fix x86 test result error, test=develop

* add asymmetric padding test case in lite/tests/math/conv_compute.cc, test=develop

* change paddings type to support dynamically modify, test=develop

* fix x86 build error in conv_compute_test, test=develop

* fix opencl build error, test=develop

* fix opencl build error, test=develop

* fix opencl/conv_compute build error, test=develop

* fix opencl/conv_compute build error, test=develop

* fix format in kernels/opencl/conv_compute_test, test=develop

* fix build error, test=develop

fix build error in kernels/x86/conv_compute.h

820eb6d4

21 Nov, 2019 1 commit

fix cuda build error, test=develop (#2464) · d8ddbcc6

Committed by 石晓伟 on Nov 21, 2019

* fix cuda building, test=develop

* remove sequence_pool from cmake because build error, test=develop

d8ddbcc6

19 Nov, 2019 1 commit
- [LITE][CUDA] Add CUDA kernel for search_aligned_mat_mul and search_seq_fc Op (#2449) · 8373aec5
  Committed by hong19860320 on Nov 19, 2019
  
  8373aec5
17 Nov, 2019 1 commit
- Add cuda match_matrix_tensor op and test (#2434) · ce21ff5d
  Committed by juncaipeng on Nov 17, 2019
```
* add cuda match_matrix_tensor op and test, test=develop
```
  ce21ff5d
23 Oct, 2019 2 commits
- modify yolobox_cuda to support multiple runs (#2245) · a4a19ba4
  Committed by Wilber on Oct 23, 2019
```
* modify yolobox_cuda to support multiple runs test=develop
```
  a4a19ba4
- add python api (#2225) · 76e74ef1
  Committed by sangoly on Oct 23, 2019
```
* [python api] init add python api test=develop
```
  76e74ef1
21 Oct, 2019 2 commits
- link static library with cuda, test=develop (#2228) · 7c69b6b4
  Committed by 石晓伟 on Oct 21, 2019
```
* add static libraries of cuda, test=develop

* update cuda make
```
  7c69b6b4
- to support yolov3 unet alexnet can run on tx2 (#2216) · 57d8e42e
  Committed by myq406450149 on Oct 21, 2019
```
* add gpu kernel mul pool relu scale softmax dropout bilinear_interp and can run in tx2

* rm GREATER_EQUAL
```
  57d8e42e
16 Oct, 2019 1 commit
- Ban feed and fetch op during inference (#2198) · 75e8a6fc
  Committed by Zhaolong Xing on Oct 16, 2019
```
* init: delete feed and fetch op, using zero copy
test=develop

* delete the unused test
test=develop
```
  75e8a6fc
14 Oct, 2019 1 commit
- align yolov3 cuda int8 (#2183) · 80d35725
  Committed by Zhaolong Xing on Oct 14, 2019
```
test=develop
```
  80d35725
11 Oct, 2019 1 commit

CUDA: can run yolov3 int8 (#2172) · 7931104f

Committed by Zhaolong Xing on Oct 11, 2019

* add conv int8 support (for cases where the input or output channel count is not a multiple of 4)
add add_kernel for cuda.

* can run yolov3 fp32
test=develop

* 1. fix bug with yolov3 run
test=develop

* can run yolov3 int8 test=develop

7931104f

10 Oct, 2019 1 commit
- fix yolobox_cuda bug · f4ac2768
  Committed by Wilber on Oct 10, 2019
```
* fix yolobox_cuda bug 
* update code format
```
  f4ac2768
27 Sep, 2019 1 commit

can run yolov3 fp32 on cuda devices (#2092) · 3d6d744f

Committed by Zhaolong Xing on Sep 27, 2019

* add conv int8 support (for cases where the input or output channel count is not a multiple of 4)
add add_kernel for cuda.

* can run yolov3 fp32
test=develop

* 1. fix bug with yolov3 run
test=develop

3d6d744f

19 Sep, 2019 1 commit

add full_api_static target and fix building errors, test=develop (#2064) · eef7ea0f

Committed by 石晓伟 on Sep 19, 2019

* add full_api_static target and fix building errors, test=develop

* fix build errors, test=develop

* fix code style, test=develop

* fix lite/model_parser/pb/var_desc.cc, test=develop

* fix building errors, test=develop

* modify lite/tools/debug/CMakeLists.txt, test=develop

eef7ea0f

12 Sep, 2019 1 commit
- add transpose kernel for cuda test=develop (#1997) · cba5736f
  Committed by Wilber on Sep 12, 2019
```
add transpose kernel for cuda
```
  cba5736f
11 Sep, 2019 1 commit
- make passes related to the device type, test=develop (#2012) · 8ca10db8
  Committed by 石晓伟 on Sep 11, 2019
```
* make passes related to the device type, test=develop

* improve tips, test=develop
```
  8ca10db8
06 Sep, 2019 1 commit

add cudnn conv fp32, int8 support (#1974) · f3124b30

Committed by Zhaolong Xing on Sep 06, 2019

* paddle lite cuda init
can run model with leaky_relu

* add the missing file.
test=develop

* add the load from memory interface.
test=develop

* refine this pr. fix comments
fix ci error
test=develop

* conv impl
fp32:
conv, conv+bias, conv+bias+relu, conv+bias+leaky_relu

int8:
conv, conv+bias+relu(int8 or fp32 output), conv+bias+leaky_relu(int8 or fp32 output)

can run conv+ bias+relu using cxx_api
test=develop

* move the lite/cuda/math to backends/cuda/math
test=develop

f3124b30

03 Sep, 2019 1 commit
- create backends directory and move hardware backends into it (#1954) · 31ee212a
  Committed by huzhiqiang on Sep 03, 2019
  
  31ee212a