Commits · a1527e8088694c4edae1d4a2014535d21fb0b4bb · PaddlePaddle / Paddle-Lite

03 Jan, 2020 2 commits
- W
  temporarily remove cuda fc fuse because we don't support cuda fc now. test=develop (#2715) · a1527e80
  Committed by Wilber on Jan 03, 2020
```
temporarily remove cuda fc fuse because we don't support cuda fc now
```
  a1527e80
- H
  
  [LITE][NPU][XPU] Fix the data feeding of the input tensors in subgraph_pass_test (#2714) · e4aa194b
  Committed by hong19860320 on Jan 03, 2020
  
  e4aa194b
02 Jan, 2020 3 commits
- 石
  
  Jit macro definition ambiguity fix, test=develop (#2713) · 0176d4bf
  Committed by 石晓伟 on Jan 02, 2020
  
  0176d4bf
- G
  [X86] Enhance fc_fuse_pass to enable fusing relu to fc_op (#2701) · 73450636
  Committed by GaoWei8 on Jan 02, 2020
```
* Enhance fc_fuse_pass to enable fusing relu to fc_op
test=develop

* restrict fusing relu in x86
test=develop
```
  73450636
- H
  
  [LITE][XPU] Supporting llvm and xpu device target (#2711) · 8c0397c6
  Committed by hong19860320 on Jan 02, 2020
  
  8c0397c6
31 Dec, 2019 5 commits

X

[mobile][opencl] suite model male2fe ,support a type element_mul ,test=mobile (#2705) · ab30ccc2
Committed by xiebaiyuan on Dec 31, 2019

ab30ccc2
N

[mobile][opencl] add universal conv_transpose. test=develop (#2710) · ec18a843
Committed by NazgulLee on Dec 31, 2019

ec18a843

X86 and cuda compile simultaneously: cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo... · f1cedb8f

Committed by Wilber on Dec 31, 2019

X86 and cuda compile simultaneously: cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_MKL=ON -DLITE_WITH_CUDA=ON -DWITH_MKLDNN=OFF -DLITE_WITH_X86=ON -DLITE_WITH_PROFILE=OFF -DWITH_LITE=OFF -DLITE_WITH_LIGHT_WEIGHT_FRAMEWORK=OFF -DWITH_PYTHON=OFF -DWITH_TESTING=ON -DLITE_WITH_ARM=OFF -DLITE_ON_TINY_PUBLISH=OFF -DCUDNN_ROOT=/usr/local/cudnn/ -DLITE_BUILD_EXTRA=ON (#2708)

x86 and cuda compile simultaneously

f1cedb8f

Z
[XPU] bn unit test (#2706) · bc6d5adc
Committed by zhupengyang on Dec 31, 2019
```
test=develop
```
bc6d5adc

[LITE][NPU][XPU] Refine the registration and implementation of op bridges (#2700) · a29c84a2

Committed by hong19860320 on Dec 31, 2019

* Fix the compile error that occurs when specifying the ddk_root path and building for Huawei NPU.

* Refine the registration of op bridges and make it similar to the registration of op and kernel.

* Refine the interfaces of the graph and node for op bridges, and support creating constant and data node automatically according to the attribute 'persistable' of the target tensor.

* Add the unit test of the scale and softmax op bridge for NPU.

a29c84a2

30 Dec, 2019 2 commits
- J
  Fix yolo_box bug (#2704) · bc5bd154
  Committed by juncaipeng on Dec 30, 2019
```
* fix yolov3 bug when run several times, test=develop
```
  bc5bd154
- Y
  Optimize the execution of RuntimeProgram by saving the bool whether the op is... · bb1cf7ff
  Committed by Yiqun Liu on Dec 30, 2019
```
Optimize the execution of RuntimeProgram by saving the bool whether the op is feed/fetch op. (#2703)

test=develop
```
  bb1cf7ff
29 Dec, 2019 1 commit
- X
  
  [mobile][opencl] suite inputs or outputs mismatch, test=mobile (#2698) · 9188571c
  Committed by xiebaiyuan on Dec 29, 2019
  
  9188571c
28 Dec, 2019 3 commits
- X
  [mobile][opencl] make model attr support issue more readable ,fastfail before... · 165b02f1
  Committed by xiebaiyuan on Dec 28, 2019
```
[mobile][opencl] make model attr support issue more readable ,fastfail before cxxlib err ,test=mobile (#2697)
```
  165b02f1
- J
  
  [Mobile][ARM]fix possible memory leak of tensor in CPUContext.test=mobile (#2695) · 2856c2b7
  Committed by Jiaying Zhao on Dec 28, 2019
  
  2856c2b7
- H
  
  Upgrade of Model_optimize_tool (#2624) · 4300ef75
  Committed by huzhiqiang on Dec 28, 2019
  
  4300ef75
27 Dec, 2019 6 commits
- X
  [mobile][opencl]optimise log print , use kNOLOG to close develop time… (#2693) · 86762e1e
  Committed by xiebaiyuan on Dec 27, 2019
```
* [mobile][opencl]optimise log print , use kNOLOG to close develop time logs ,test=mobile

* [mobile][opencl]optimise log print , use kNOLOG to close develop time logs ,test=mobile
```
  86762e1e
- 石
  
  update profiler, test=develop (#2644) · 9171b70e
  Committed by 石晓伟 on Dec 27, 2019
  
  9171b70e
- Y
  
  move flatten op from extra to basic, test=develop (#2659) · ba32906a
  Committed by yiicy on Dec 27, 2019
  
  ba32906a
- H
  
  fix ios compile failure (#2688) · 2bf61581
  Committed by huzhiqiang on Dec 27, 2019
  
  2bf61581
- H
  
  [LITE][NPU][XPU] Add kernel context to NPU/XPU subgraph engine (#2686) · 1e10b471
  Committed by hong19860320 on Dec 27, 2019
  
  1e10b471
- H
  remove test_models ci projects, because these projects have been removed in ci... · ad1dfbf2
  Committed by huzhiqiang on Dec 27, 2019
```
remove test_models ci projects, because these projects have been removed in ci test test=develop (#2669)
```
  ad1dfbf2
26 Dec, 2019 6 commits
- X
  [mobile]Develop common depthwise & fix bug in element mul (#2687) · 8d332397
  Committed by xiebaiyuan on Dec 26, 2019
```
* [mobile][opencl]common depthwise conv,test=mobile

* [mobile][opencl]revert depthwise 3x3 for stable ,test = mobile

* [mobile][opencl]format convkernel.inc.cl with clang-format ,test = mobile

* [mobile][opencl] suite 1*X Y element_y ,test=mobile

* [mobile][opencl] add whole print method for cl_image ,test=mobile
```
  8d332397
- W
  fix fluid-lite-subgraph x86 compile error test=develop (#2682) · 53a5906c
  Committed by Wilber on Dec 26, 2019
```
-fix fluid-lite-subgraph x86 compile error
    - Replace FLAGS with environment variables
```
  53a5906c
- J
  
  [Mobile][OpenCL]Fix CI compile error. (#2684) · 52f325e3
  Committed by Jiaying Zhao on Dec 26, 2019
  
  52f325e3
- J
  [Mobile][ARM]Fix memory leaks when using input tensor created with external... · 1f262f9f
  Committed by Jiaying Zhao on Dec 26, 2019
```
[Mobile][ARM]Fix memory leaks when using input tensor created with external pointer in lod_mode. (#2681)
```
  1f262f9f
- X
  add multi_thread ut (#2677) · 19c08de2
  Committed by xiaogang on Dec 26, 2019
```
* feat: add multi_thread ut
```
  19c08de2
- Z
  [XPU] mul unittest (#2676) · 6bce0133
  Committed by zhupengyang on Dec 26, 2019
```
test=develop
```
  6bce0133
25 Dec, 2019 6 commits

J
fix mask rcnn error when run twice, test=develop (#2675) · cd49b0a3
Committed by juncaipeng on Dec 25, 2019
```
add clear for tensor
```
cd49b0a3
J
fix op inputs and outputs type (#2647) · 168ce9a9
Committed by juncaipeng on Dec 25, 2019
```
* fix op inputs and outputs type, test=develop
```
168ce9a9
W
optimize softmax cuda kernel test=develop (#2660) · 8f593443
Committed by Wilber on Dec 25, 2019
```
optimize softmax cuda kernel
```
8f593443
J

add benchmark in cmakefile, test=develop (#2663) · 00fee283
Committed by juncaipeng on Dec 25, 2019

00fee283
H

[LITE][XPU] Fix matmul op bridge (#2668) · 3fe5cddf
Committed by hong19860320 on Dec 25, 2019

3fe5cddf

[X86] Polish the implementation of fc and improve the unittest (#2656) · 28481458

Committed by Yiqun Liu on Dec 25, 2019

* Remove GEMM padding in fc_compute.
test=develop

* Write a common ParallelFor function to run the for loop in parallel.

* Add the codes of padding GEMM back in fc.

* Refine the code of fc when padding_weight is false to avoid the definition of temporary Tensor.

* Refine the unit test of fc and add testing case of padding and parallel.
test=develop

* Enable more test cases in common fc unittest, including padding and parallel for x86 target.

* Remove the fc test under kernels/x86.
test=develop

* Disable relu in test of fc for non-x86 target.
test=develop

* Change the eps of arm.
test=develop

28481458

24 Dec, 2019 6 commits
- Z
  
  [XPU] matmul bridge and unit test (#2666) · d345a7fc
  Committed by zhupengyang on Dec 24, 2019
  
  d345a7fc
- H
  
  [LITE][XPU] Fix dropout op bridge and unit test for BERT (#2665) · d444ecbf
  Committed by hong19860320 on Dec 24, 2019
  
  d444ecbf
- H
  
  conclude model_test in CI into Android test (#2639) · 18f9ea1b
  Committed by huzhiqiang on Dec 24, 2019
  
  18f9ea1b
- H
  [LITE][NPU][XPU] Support multiple types for XPU and NPU op bridges (#2646) · 05da0c72
  Committed by hong19860320 on Dec 24, 2019
```
* Support multiple types for XPU and NPU op bridges

* Add lookup_table, gather, slice, stack and scale op bridges for supporting BERT

* Fix the definition of lookup_table kernel for X86
```
  05da0c72
- Y
  
  [ARM] multiclass_nms op add index output, test=develop (#2654) · e1c4adfd
  Committed by yiicy on Dec 24, 2019
  
  e1c4adfd
- Z
  [XPU] add dropout bridge and unit test (#2650) · d904c9dd
  Committed by zhupengyang on Dec 24, 2019
```
test=develop
```
  d904c9dd