Commits · c37e0b5500dc12c7f779545e6615e2a1a9353e7b · PaddlePaddle / Paddle-Lite

27 Sep, 2020 1 commit
- [APU] Add model cache (#4456) · c37e0b55
  Committed by barry-ai on Sep 27, 2020
  c37e0b55
25 Sep, 2020 1 commit

[arm] fix xiaodu a53 crash problem (#4437) · 2da739da

Committed by HappyAngel on Sep 25, 2020

* set a53 use or no_use, test=develop

* fix xiaodu a53 crash . test=develop

* fix Mac OS build error. test=develop

* fix build error. test=develop

2da739da

08 Sep, 2020 1 commit

platform portability of tls, test=develop (#4261) · 914219cc

Committed by 石晓伟 on Sep 08, 2020

* platform portability of tls, test=develop

* update build_ios.sh, test=develop

* add static keyword for tls, test=develop

* rename the alias of tls, test=develop

914219cc

30 Jul, 2020 1 commit
- [BugFix][OPENCL] Fix initialization sequence of opencl backend valid API. test=develop (#4003) · de299b9e
  Committed by ysh329 on Jul 30, 2020
```
* fix opencl backend. test=develop
```
  de299b9e
24 Jul, 2020 1 commit

[ASCEND] Add Huawei Ascend310 support (#3936) · 769ba40b

Committed by Qi Li on Jul 24, 2020

* [ASCEND] Add Huawei Ascend310 support, test=develop

* [ASCEND] fix some typos, test=develop

* [ASCEND] address comments and fix opt ci python file, test=develop

* [ASCEND] update based on new ascend env, test=develop

* [ASCEND] update after develop merge, test=develop

769ba40b

22 Jul, 2020 1 commit

[Core] Add the graph optimization of subblocks for transformer model (#3947) · 7af1a258

Committed by hong19860320 on Jul 22, 2020

* [Core][ARM] Fix beam_search, eltwise_mul supports broadcast and int64_t data type, add print op and kernel, add exception
test=develop

* Fix the dims of parent idx of the arm kernel of beam_search op

* elementwise_mul supports int64_t data type with broadcasting

* Add print op and kernel for debugging

* Support throwing the exception when the internal error occurs

* Refine while and conditional_block op kernel

* Support the graph optimization on subblocks

* Pass program_desc and block_idx into the kernel of the control flow ops (while/conditional_block/subgraph), and create the RuntimeProgram online, which makes it possible to call the control flow ops recursively

* Add unit test for masked transformer model

7af1a258

13 Jul, 2020 1 commit

[LITE][XPU] Support ResnetCbam and MMDNN (#3844) · 4780849f

Committed by Cwndmiao on Jul 13, 2020

* [LITE][XPU] accommodate resnet_cbam

* [LITE][XPU] accommodate content-dnn

* fix pr comments test=develop

* fix pr comments test=develop

* fix pr comments test=develop test=xpu

* fix compilation error, test=develop test=xpu

* [X86] Fix the unit test of slice op
test=develop test=xpu
Co-authored-by: hong19860320 <9973393+hong19860320@users.noreply.github.com>

4780849f

06 Jul, 2020 1 commit
- [MLU] add cast on MLU as default, test=develop (#3776) · cc927184
  Committed by MaxwellDing on Jul 06, 2020
  cc927184
04 Jun, 2020 1 commit
- refactor any.h, test=develop (#3736) · aa10b11e
  Committed by 石晓伟 on Jun 04, 2020
  aa10b11e
28 May, 2020 1 commit

[Libsize] Reduce size of dynamic library ".so" (#3717) · ec8ef528

Committed by T8T9 on May 28, 2020

* reduce .so size. test=develop

* compile all targets when LITE_ON_TINY_PUBLISH=OFF

* unordered_map is more convenient when the key is a custom class

* test=develop

ec8ef528

13 May, 2020 1 commit
- [NPU] save subgraph model cache (#3589) · 56ea5fff
  Committed by zhupengyang on May 13, 2020
  56ea5fff
12 May, 2020 1 commit

[LITE][XPU] 1. Add precision switch(int16/int31) in XPUMultiEncoderOp; 2. Fix... · ca51f68f

Committed by Cwndmiao on May 12, 2020

[LITE][XPU] 1. Add precision switch(int16/int31) in XPUMultiEncoderOp; 2. Fix identity_dropout_eliminate_pass, |AttrType| of 'is_test' in OpDesc can be INT or BOOLEAN; 3. Enhance |__xpu__multi_encoder_fuse_pass|; (#3596)

* [LITE][XPU] Add precision switch(int16/int31) in XPUMultiEncoderOp

* [LITE][XPU] fix identity_dropout_eliminate_pass, |AttrType| of 'is_test' in OpDesc can be INT or BOOLEAN

* test=develop

* [LITE][XPU] suppress linkage error
test=develop

* [LITE][XPU] 1. Reorder |identity_dropout_eliminate_pass| before |__xpu__multi_encoder_fuse_pass|; 2. Enhance |__xpu__multi_encoder_fuse_pass|, it works well in more scenarios;
test=develop

* [LITE][XPU] Remove XPUConfig
test=develop

ca51f68f

06 May, 2020 1 commit

[LITE][BM] support hd all models,test=develop (#3540) · 94416b2c

Committed by Santa An on May 06, 2020

* fix reshape infer shape issue
* adaptive pool
* support adaptive pool2
* multi thread ok
* optimize global pool
* support faceboxes and behavior image
* realize bm device info
* multi device ok
* support multi cards
* support efficientnet

94416b2c

22 Apr, 2020 1 commit
- [XPU] Add more XPU op kernels (#3457) · d5a6a1e5
  Committed by Cwndmiao on Apr 22, 2020
  d5a6a1e5
19 Apr, 2020 1 commit

[LITE][OPENCL]Fix opencl (#3433) · 9bd9311b

Committed by xiebaiyuan on Apr 19, 2020

* [lite][opencl] remove event with clfinish, add strict check for cl warning. add conv 3x3opt fallback opt layout cast, test=develop

* [LITE][OPENCL]rm event in element_add_buffer_compute test=develop

* [LITE][OPENCL]suite cl_functions_test.cc test=develop

* [LITE][OPENCL] suite cl_common.sh lint check test=develop

* [LITE][OPENCL] suite conv_image_compute.cc lint check test=develop

* [LITE][OPENCL] suite cl_wait_list() lint check test=develop

9bd9311b

15 Apr, 2020 1 commit
- [APU] Add MTK APU backend (#3407) · 355d080b
  Committed by hong19860320 on Apr 15, 2020
  355d080b
14 Apr, 2020 1 commit
- [RKNPU] Add Rockchip NPU backend (#3382) · fbe0799e
  Committed by airockchip on Apr 14, 2020
  fbe0799e
13 Apr, 2020 1 commit
- lite cuda support exec multi-stream. (#2949) · 4a7284f9
  Committed by Wilber on Apr 13, 2020
```
lite cuda support exec multi-stream
```
  4a7284f9
09 Apr, 2020 1 commit

Committed by jackzhang235 on Apr 09, 2020

[MLU] add some basic support for MLU, including related passes, kernels, gtests and some APIs in paddle_api.h
Passes: mlu_subgraph_pass, mlu_postprocess_pass
Kernels: act, batch_norm, concat, conv, elementwise, fc, interpolate, pool, scale, softmax

dc481d49

08 Apr, 2020 1 commit

[Core][XPU] Add XPU op kernels (#3274) · 99deb7d9

Committed by hong19860320 on Apr 08, 2020

* [LITE][XPU] bind xpu resnet50 kernels

* [LITE][XPU] fuse resnet50 and encoder

* [LITE][XPU] bind xpu bert kernels

* [LITE][XPU] refine xpu_resnet_fuse_pass.cc

* [LITE][XPU] add xpu stack kernel

* [LITE][XPU] add xpu slice/tanh kernel

* [LITE][XPU] refine resnet50 and encoder fusor

* [LITE][XPU] split resnet50 and multi_encoder op from subgraph_op.h

* [LITE][XPU] clean workspace

* [LITE][XPU] add build script

* [LITE][XPU] fix compilation errors

* [LITE][XPU] fix kernel matmul

* [LITE][XPU] fix kernel ewadd ewsub

* [LITE][XPU] add xpu cast kernel

* [LITE][XPU] fix kernel slice

* [LITE][XPU] switch dev by LITE_XPU_DEV env

* [LITE][XPU] eliminate useless cast op

* [LITE][XPU] add PerThread Ops

* [LITE][X86] add SequenceUnpad op and kernel

* [LITE][XPU] add LITE_WITH_XTCL option

* [LITE][X86] add SequenceConv kernel

* [LITE][XPU] fix cmake dependency

* [LITE][XPU] add xpu sigmoid kernel

* [XPU] Remove the dependencies of framework.pb.h
test=develop

Change-Id: Icfb44efb0482a6369b365b5c09017765328fc10d

* [XPU] Fix the precision of cast kernel
test=develop

Change-Id: Icb18be47d7ab490de9fb9c92eae1165f49dbf492

* [Core] Fix the compiling error when build for the target that disable XPU
test=develop

Change-Id: I38ec53f222391d3bf06b70512e6c3ad1282e4683

* [XPU] Add io_copy kernel for xpu<->arm
test=develop

Change-Id: Iec7ea066f040534285557f9948b73e6a1970aed7

* fix
test=develop

Change-Id: I4db1c93df48e22afbba904ce6c3b0babd9fda4c3

* fix target matching of type_target_cast_pass and remove the unnecessary registration of io_copy kernel
test=develop

Change-Id: I432c10c9d1064e778d43fd0d12d8cf0599252f7a

* [X86] Add the keyword 'template' to avoid the compiling errors
test=develop

Change-Id: I015d5d323adafb3884029c8287ced66c90ad931e

* Fix the build.sh for XPU and x86
test=develop

Change-Id: I7d9575243669ce02af69a8ddbd6421db31902bd6

* [XPU] Add the keyword 'template' to avoid the compiling errors
test=develop

Change-Id: I46d0b3b6861286a73ee2999934b8e185e453e749

* [XPU] Add XTCL compiling option in build.sh
test=develop

Change-Id: I8b3fd998ca5f898d5bd2e665646e3874b3b73c80

* fix namespace conflicts, test=develop

* [API][XPU] Move the XPU related APIs into CxxConfig
test=develop

Change-Id: I75ac35e8bae96bcb835683f413f01b9db45afbf9

* [API][XPU] Remove the LITE_WITH_XPU in paddle_api.h
test=develop

Change-Id: Idbd64013bdf331ad876919511c1c349332d46f93

* [API][XPU] Remove XPUSetWorkspaceL3SizePerThread and XPUSetDevPerThread
test=develop

Change-Id: I515958f56f8e129280bae61c923513cc91fb9728

* [API][Core][XPU] Refine the test case and remove the necessary modifications
test=develop

Change-Id: I1e0e2957a2f9d5f4207b06c0bc98a5ab611fee56

* [Core] Remove useless code
test=develop

Change-Id: I6293faa10424aea2836d09d85ddb6a30f7811678

* [XPU] Refine the test cases
test=develop

Change-Id: I6818fc3addf1bca5b96a7d66ee99263242e3374f

* [XPU] Remove useless scripts and code
test=develop

Change-Id: I965ba6712d3cf881d0038f0473fec27d4c1bc684

* [XPU] Use InferShapeImpl in sequence_unpad, resnet50 and multi_encoder op
test=develop

Change-Id: I5375f524d36836a394d426b4b2bc9fb44be0b59c

* test=develop

Change-Id: I42ee68c8a5e891dd0f3e95d6cfbc498be7cf1519

* test=develop

Change-Id: If679e5aa73e1368e0ee5bd5f286d2e1b4c2f354e

* [XPU] Add __xpu__ prefix to the op and graph pass name of resnet50 and multi_encoder
test=develop

Change-Id: Idb61c99b4b8429cb87665bfd6835ab4d7d263be2

* [XPU] Fix and refine the xpu fuse pass
test=develop

Change-Id: If1c5b6788d994e2809c1a00d9384685a89440907

* test=develop

Change-Id: Icfa333e322fc4351700103692c46cfcb3d4f9a89

* [XPU] Remove the dependency on xpu api for xpu fuse passes
test=develop

Change-Id: I6094b5536f58ae18bab068284b32f9bd10a2ab92

* [XPU] Move unit tests from lite/api to lite/tests/api
test=develop

Change-Id: I7ba27abb23abeffb0c95fdbbefec7ac16cdbd250

* test=develop

Change-Id: I33230c84d6c4e61bf19f46668bae2baa3ef68794

* [XPU] Refine code
test=develop

Change-Id: I37bc5b948b4927e44cd3ea2594ebe3fd7671be06

* [XPU] Add env XPU_ENABLE_XTCL to enable xpu_subgraph_pass
test=develop

Change-Id: Ifb8e07e86f307f562adaca3ce792015a6f2a2204

* [XPU] refine code
test=develop

Change-Id: I1380654b930d51ae704dbc0cd855464d9c3b5b79

* [XPU] Refine code
test=develop

Change-Id: I73285c2718ccd3612490eb2635bef4fd608c9bde

* [XPU] Add comments for the XPU APIs
test=develop

Change-Id: Ieb5015f37984f8869b90c4c625c5894bb26164fd
Co-authored-by: miaotianxiang <miaotianxiang@baidu.com>
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

99deb7d9

07 Apr, 2020 1 commit
- solve cuda_x86_xpu compile problem. test=develop (#3353) · 720590c9
  Committed by Wilber on Apr 07, 2020
  720590c9
31 Mar, 2020 1 commit
- [MLU][lib_increased] add some basic mlu definition (#3275) · 181af56c
  Committed by jackzhang235 on Mar 31, 2020
  181af56c
25 Mar, 2020 1 commit
- [Python lib] Add opt lib into python lib (#3209) · 5fea8e10
  Committed by huzhiqiang on Mar 25, 2020
  5fea8e10
04 Mar, 2020 1 commit
- [opencl compile] add into build.sh (#3031) · 51e14609
  Committed by huzhiqiang on Mar 04, 2020
```
* test=devellop

* add cl file into resulted lib test=develop

* test=develop

* test=develop
```
  51e14609
14 Jan, 2020 1 commit
- Support bitman backend,test=develop (#2761) · 14811017
  Committed by myq406450149 on Jan 14, 2020
```
* Support bitman backend
```
  14811017
13 Dec, 2019 1 commit
- [LITE][NPU][XPU] Refine subgraph pass, and support NPU/XPU model generation at... · d5434aa2
  Committed by hong19860320 on Dec 13, 2019
```
[LITE][NPU][XPU] Refine subgraph pass, and support NPU/XPU model generation at execution time (#2576)
```
  d5434aa2
21 Nov, 2019 1 commit

fix cuda build error, test=develop (#2464) · d8ddbcc6

Committed by 石晓伟 on Nov 21, 2019

* fix cuda building, test=develop

* remove sequence_pool from cmake because build error, test=develop

d8ddbcc6

20 Nov, 2019 1 commit
- support build C++ cuda shared lib (#2401) · 7635d699
  Committed by myq406450149 on Nov 20, 2019
```
* support build C++ cuda shared lib
```
  7635d699
13 Nov, 2019 1 commit
- update test=develop (#2416) · 694f7517
  Committed by Wilber on Nov 13, 2019
```
add CUDAContext assignment operator to fix cuda compile bug
```
  694f7517
05 Nov, 2019 1 commit
- [framework][util] fix any class copy bug test=develop (#2367) · 686a414a
  Committed by sangoly on Nov 05, 2019
  686a414a
28 Oct, 2019 1 commit

[LITE][XPU] initial support for XPU (#2202) · 06d058fe

Committed by hong19860320 on Oct 28, 2019

* Initial support for XPU
* Fix compiling errors of XPU
* Move XPU op kernel bridges from backends to kernels to fix deps order
* Change the namespace and directory of XPU bridges
* Add XPU SDK
* Fix header files and namespace of XPU SDK
* Add unit tests for relu and conv2d ops
* Restore the modification of paddle_api_test
* Supports simple model which contains only a relu layer
* Add compiling scripts for XPU
* Fix compiling errors of XPU
* Add comments for XPU LoadModel and BuildModel

06d058fe

15 Oct, 2019 1 commit

[NPU] Fix and refine the supporting of multi NPU models (#2037) · 7a731b7f

Committed by hong19860320 on Oct 15, 2019

* [NPU] Fix the bug of loading multi NPU models
test=develop

* [NPU] Use lite tensor to store NPU model, fix the management of multi NPU models, support loading NPU model from memory and reduce the modification of framework
test=develop

* [NPU] Remove redundant header files for NPU bridges,
test=develop

* [NPU] fix NPU deps
test=develop

* [NPU] refine the compiling script for NPU
test=develop

* [NPU] remove redundant subdirectory in lite/CMakeLists.txt
test=develop

* [NPU] Fix and refine NPU test case
test=develop

* [NPU] revoke the modification of other non-NPU modules
test=develop

* [NPU] Remove NPU bridges if target is tiny publish
test=develop

7a731b7f

11 Oct, 2019 1 commit

[LITE][OPENCL] support image2d type (#2158) · 77cdbdce

Committed by Yuan Shuai on Oct 11, 2019

* [LITE][OPENCL] support image2d. test=develop

* add context changed with consider image*. test=develop

* add layout, relu image kernels. test=develop

* replace image_data with data, mutable_image_data with mutable_data, test=develop

* comment unused var. test=develop

* remove unused var. test=develop

77cdbdce

27 Sep, 2019 1 commit

can run yolov3 fp32 on cuda devices (#2092) · 3d6d744f

Committed by Zhaolong Xing on Sep 27, 2019

* add conv int8 support (for the case where the input or output channel count is not a multiple of 4)
add add_kernel for cuda.

* can run yolov3 fp32
test=develop

* 1. fix bug with yolov3 run
test=develop

3d6d744f

19 Sep, 2019 1 commit

add full_api_static target and fix building errors, test=develop (#2064) · eef7ea0f

Committed by 石晓伟 on Sep 19, 2019

* add full_api_static target and fix building errors, test=develop

* fix build errors, test=develop

* fix code style, test=develop

* fix lite/model_parser/pb/var_desc.cc, test=develop

* fix building errors, test=develop

* modify lite/tools/debug/CMakeLists.txt, test=develop

eef7ea0f

11 Sep, 2019 1 commit
- make model_optimize_tool run on host (#1990) · 83d4b0e8
  Committed by Yan Chunwei on Sep 11, 2019
  83d4b0e8
03 Sep, 2019 2 commits
- move npu into backends(directory) and move python/ into tools/python (#1958) · c5e65402
  Committed by huzhiqiang on Sep 03, 2019
  c5e65402
- create backends directory and move hardware backends into it (#1954) · 31ee212a
  Committed by huzhiqiang on Sep 03, 2019
  31ee212a
27 Aug, 2019 1 commit
- lite cuda init: can run a simple model with leaky_relu (#1860) · 05d3b19b
  Committed by Zhaolong Xing on Aug 27, 2019
```
* paddle lite cuda init
can run model with leaky_relu

* add the missing file.
test=develop
```
  05d3b19b
24 Aug, 2019 1 commit

support setting cluster and threads in MobileConfig (#1848) · bc79142c

Committed by Xiaoyang LI on Aug 24, 2019

* fix building ios tiny publish lib error

* support setting cluster and threads in MobileConfig

* fix build error, test=develop

* fix building server publish error, test=develop

bc79142c