Commits · f16090b5085fc28b7b7c354efdc47288bef47b65 · PaddlePaddle / Paddle-Lite

01 Jun, 2020 1 commit
- [CUDA] [Framework] [FP16] Lite framework support fp16. (#3673) · c0a8e2dd
  Committed by Wilber on Jun 01, 2020
  
  c0a8e2dd
01 Apr, 2020 1 commit
- add cuda kernels. test=develop (#3315) · 91a58fba
  Committed by Wilber on Apr 01, 2020
```
add cuda kernel.

abs, tanh, elementwise_sub
```
  91a58fba
25 Mar, 2020 1 commit
- [Python lib] Add opt lib into python lib (#3209) · 5fea8e10
  Committed by huzhiqiang on Mar 25, 2020
  
  5fea8e10
17 Mar, 2020 1 commit
- For cuda compilation products and ci (#3152) · 774b4652
  Committed by Wilber on Mar 17, 2020
```
add cuda ci.

Organize cuda compilation products.
```
  774b4652
20 Feb, 2020 1 commit
- Optimize cuda kernel and remove io_copy added by default due to missing fetch_cuda kernel (#2920) · 823f0dae
  Committed by Wilber on Feb 20, 2020
```
Optimize cuda kernel and remove io_copy added by default due to missing fetch_cuda kernel
```
  823f0dae
28 Dec, 2019 1 commit
- Upgrade of Model_optimize_tool (#2624) · 4300ef75
  Committed by huzhiqiang on Dec 28, 2019
  
  4300ef75
23 Dec, 2019 1 commit
- add sequence_pool_concat fuse and kernel test=develop (#2645) · 1b74fded
  Committed by Wilber on Dec 23, 2019
```
add sequence_pool_concat fuse pass

add fuse kernel
```
  1b74fded
15 Dec, 2019 1 commit
- optimize search_grnn test=develop (#2608) · dad43f81
  Committed by Wilber on Dec 15, 2019
```
optimize search_grnn
```
  dad43f81
04 Dec, 2019 1 commit

[cuda] [int8] resnet50 cuda int8 support (#2417) · f7574646

Committed by Zhaolong Xing on Dec 04, 2019

* init resnet cuda int8 support
test=develop

* refine cuda unit test
test=develop

* add the forgotten file.
test=develop

f7574646

22 Nov, 2019 1 commit
- add search_group_padding cuda kernel, test=develop (#2472) · 36c0068e
  Committed by Pei Yang on Nov 22, 2019
  
  36c0068e
21 Nov, 2019 3 commits
- add cuda kernel for sequence_topk_avg_pooling and search_fc (#2451) · 3a881861
  Committed by huzhiqiang on Nov 21, 2019
```
* cuda kernel for sequence_topk_avg_pooling and search_fc test=develop
```
  3a881861
- remove duplicate cmake targets of sequence-pool (#2467) · bf2c6fca
  Committed by Pei Yang on Nov 21, 2019
  
  bf2c6fca
- fix cuda build error, test=develop (#2464) · d8ddbcc6
  Committed by 石晓伟 on Nov 21, 2019
```
* fix cuda building, test=develop

* remove sequence_pool from cmake because of a build error, test=develop
```
  d8ddbcc6
20 Nov, 2019 2 commits

fix x86 search_grnn, add cuda search_grnn and unit test (#2448) · e1b67433
Committed by juncaipeng on Nov 20, 2019
```
* fix x86 search_grnn and add unit test
* add cuda search_grnn and unit test
```
e1b67433

fix sequence pool cuda (#2457) · b094b2b6

Committed by Pei Yang on Nov 20, 2019

* add sequence_pool cuda kernel, test=develop

* fix sequence_pool cuda,test=develop

* fix and complete unittest, test=develop

b094b2b6

19 Nov, 2019 2 commits
- [LITE][CUDA] Add CUDA kernel for search_aligned_mat_mul and search_seq_fc Op (#2449) · 8373aec5
  Committed by hong19860320 on Nov 19, 2019
  
  8373aec5
- [X86][CUDA] add attention_padding_mask op, x86 kernel, cuda kernel and unit tests (#2437) · ef6f7b84
  Committed by zhupengyang on Nov 19, 2019
```
* [X86] add attention_padding_mask op, x86 kernel and unit test

test=develop

* [CUDA] add attention_padding_mask cuda kernel and unit test

test=develop
```
  ef6f7b84
18 Nov, 2019 3 commits
- add var_conv_2d cuda kernel and unit test test=develop (#2441) · 884c840d
  Committed by Wilber on Nov 18, 2019
```
- add var_conv_2d cuda kernel

- add var_conv_2d cuda kernel unit test

- temporarily set to two input mode, remove input(ROW) and input(COLUMN)
```
  884c840d
- add sequence_pool cuda kernel, test=develop (#2430) · 3d73dea9
  Committed by Pei Yang on Nov 18, 2019
```
add sequence_pool cuda kernel
```
  3d73dea9
- [X86][CUDA] add sequence_arithmetic op, x86 kernel, cuda kernel and unit test (#2436) · 8599c042
  Committed by zhupengyang on Nov 18, 2019
```
* [X86][CUDA] add sequence_arithmetic op , x86 kernel, cuda kernel and unit test

test=develop

* add sequence_arithmetic cuda kernel unit test

test=develop
```
  8599c042
17 Nov, 2019 1 commit
- Add cuda match_matrix_tensor op and test (#2434) · ce21ff5d
  Committed by juncaipeng on Nov 17, 2019
```
* add cuda match_matrix_tensor op and test, test=develop
```
  ce21ff5d
15 Nov, 2019 1 commit

Add content-dnn ops (#2429) · 603b810f

Committed by juncaipeng on Nov 15, 2019

* add search_seq_depadding x86 and cuda
* add match_matrix_tensor x86
* add search_grnn x86, no test

603b810f

13 Nov, 2019 2 commits

add sequence_reverse op and kernel for arm and cuda test=develop (#2397) · acf09294
Committed by Wilber on Nov 13, 2019
```
- add sequence_reverse op

- add sequence_reverse kernel for x86 and cuda

- add sequence_reverse_test for x86 and cuda
```
acf09294

add sequence_concat op kernel and test test=develop (#2414) · 8a1d942a

Committed by Wilber on Nov 13, 2019

- add sequence_concat op

- add sequence_concat kernel for x86 and cuda

- add sequence_concat_test for x86 and cuda

8a1d942a

11 Nov, 2019 1 commit
- add cuda kernel:lookup table, test=develop (#2403) · 15eccb9e
  Committed by Pei Yang on Nov 11, 2019
```
add cuda kernel:lookup table
```
  15eccb9e
23 Oct, 2019 2 commits
- Add kernel version table and update framework.proto, test=develop (#2243) · 362275ed
  Committed by 石晓伟 on Oct 23, 2019
```
* update framework.proto

* add compatibility check, test=develop

* remove head files, test=develop
```
  362275ed
- add python api (#2225) · 76e74ef1
  Committed by sangoly on Oct 23, 2019
```
* [python api] init add python api test=develop
```
  76e74ef1
21 Oct, 2019 2 commits

support running yolov3, unet, and alexnet on tx2 (#2216) · 57d8e42e
Committed by myq406450149 on Oct 21, 2019
```
* add gpu kernels mul, pool, relu, scale, softmax, dropout, and bilinear_interp; can run on tx2

* rm GREATER_EQUAL
```
57d8e42e

add cuda op(pool & softmax), support conv with padding_algorithm · 305130fc

Committed by yiicy on Oct 21, 2019

* cuda add softmax and pool op

* * fix armlinux can find sys/system_properties.h
* conv add padding_algorithm
test=develop

* delete padding_algorithm in op param, test=develop

* fix bugs, test=develop

305130fc

17 Oct, 2019 1 commit
- add bilinear_interp_cuda_op, test=develop (#2197) · 4ac51a6b
  Committed by juncaipeng on Oct 17, 2019
  
  4ac51a6b
11 Oct, 2019 1 commit

CUDA: can run yolov3 int8 (#2172) · 7931104f

Committed by Zhaolong Xing on Oct 11, 2019

* add conv int8 support (for cases where the input or output channel count is not a multiple of 4)
add add_kernel for cuda.

* can run yolov3 fp32
test=develop

* 1. fix bug with yolov3 run
test=develop

* can run yolov3 int8 test=develop

7931104f

27 Sep, 2019 1 commit

can run yolov3 fp32 on cuda devices (#2092) · 3d6d744f

Committed by Zhaolong Xing on Sep 27, 2019

* add conv int8 support (for cases where the input or output channel count is not a multiple of 4)
add add_kernel for cuda.

* can run yolov3 fp32
test=develop

* 1. fix bug with yolov3 run
test=develop

3d6d744f

12 Sep, 2019 1 commit
- add transpose kernel for cuda test=develop (#1997) · cba5736f
  Committed by Wilber on Sep 12, 2019
```
add transpose kernel for cuda
```
  cba5736f
09 Sep, 2019 2 commits

Add concat and elementwise_add cuda kernel (#1979) · 6d1da405

Committed by Pei Yang on Sep 09, 2019

* add nearest_interp_cuda kernel, test=develop

* add concat op and elementwise_add op

* remove eigen dependency from nearest_interp cuda kernel, test=develop

* free cuda pointers, test=develop

6d1da405

add calib cuda kernel. (#1977) · da328594
Committed by Zhen Wang on Sep 09, 2019
```
* add calib cuda kernel.

* add unit test for calib cuda kernel. test=develop
```
da328594

06 Sep, 2019 1 commit

add cudnn conv fp32, int8 support (#1974) · f3124b30

Committed by Zhaolong Xing on Sep 06, 2019

* paddle lite cuda init
can run model with leaky_relu

* add the missing file.
test=develop

* add the load from memory interface.
test=develop

* refine this pr. fix comments
fix ci error
test=develop

* conv impl
fp32:
conv, conv+bias, conv+bias+relu, conv+bias+leaky_relu

int8:
conv, conv+bias+relu(int8 or fp32 output), conv+bias+leaky_relu(int8 or fp32 output)

can run conv+ bias+relu using cxx_api
test=develop

* move the lite/cuda/math to backends/cuda/math
test=develop

f3124b30

30 Aug, 2019 1 commit
- add nearest_interp_cuda kernel, test=develop (#1920) · 029971b4
  Committed by Pei Yang on Aug 30, 2019
```
add nearest_interp cuda kernel for Paddle-Lite
```
  029971b4
29 Aug, 2019 1 commit

Add yolo_box_cuda multiclass_nms_host kernel. (#1908) · de43e479

Committed by Wilber on Aug 29, 2019

* add yolo_box_compute cuda

* move multiclass_nms(arm) to host

* add lod in scale op

* add yolo_box_cuda cmake config

* modify shuffle_channel_fuse and transpose_softmax_transpose_fuse to support running the ssd model. test=develop

* reshape and transpose op don't have xshape output.

* modify yolo_box_compute_cuda, use tensor to manage cuda memory test=develop

* add yolo_box use kernel test=develop

de43e479

27 Aug, 2019 1 commit
- lite cuda init: can run a simple model with leaky_relu (#1860) · 05d3b19b
  Committed by Zhaolong Xing on Aug 27, 2019
```
* paddle lite cuda init
can run model with leaky_relu

* add the missing file.
test=develop
```
  05d3b19b
16 Aug, 2019 1 commit
- publish lite (#1800) · 699d6cd0
  Committed by Yan Chunwei on Aug 16, 2019
  
  699d6cd0