Commits · 8d2351c3ef8e8725a77abb9071eefa4533b7e45b · PaddlePaddle / Paddle-Lite

20 Sep, 2020 1 commit
- [XPU] add resnet50-D fusion (#4276) · 8d2351c3
  Committed by weihaoji on Sep 20, 2020
17 Sep, 2020 1 commit

[PROFILE] Add ENV var controls whether write output tensor of each op to... · d353b126

Committed by ysh329 on Sep 17, 2020

[PROFILE] Add an ENV var that controls whether to write the output tensor of each op to files; rename output tensor name when mem_reuse pass is enabled by default, etc. (#4348)

* Add an ENV var that controls whether to write the output tensor of each op to files;
* Rename output tensor name when mem_reuse pass is enabled by default, etc.

15 Sep, 2020 1 commit
- [CORE][PROFILE] Write output tensor to file for each OP when precision profiler enabled (#4255) · 78a303c8
  Committed by ysh329 on Sep 15, 2020
```
* [PROFILE] Write output tensor to file for each OP when precision profiler enabled. test=develop

* create output tensor files dir. test=develop
```
02 Sep, 2020 1 commit
- [core] [XPU] add xpu conv2d fuse, vis fuse and many ops for wangpan clarity feature (#4084) · 42e62a74
  Committed by sunsetlh on Sep 02, 2020
17 Aug, 2020 1 commit
- add reshape pass. test=develop (#4073) · 5f6d8ce6
  Committed by myq406450149 on Aug 17, 2020
12 Aug, 2020 2 commits
- add reverse embedding. test=develop (#4106) · 5dcd5637
  Committed by Wilber on Aug 12, 2020
- [CUDA] [Fuse] Add scals fuse. (#4094) · ebcdb28b
  Committed by Wilber on Aug 12, 2020
11 Aug, 2020 1 commit
- [CUDA] [Fuse] Fuse relu. (#4090) · a4586f48
  Committed by Wilber on Aug 11, 2020
24 Jul, 2020 1 commit

[ASCEND] Add Huawei Ascend310 support (#3936) · 769ba40b

Committed by Qi Li on Jul 24, 2020

* [ASCEND] Add Huawei Ascend310 support, test=develop

* [ASCEND] fix some typos, test=develop

* [ASCEND] address comments and fix opt ci python file, test=develop

* [ASCEND] update based on new ascend env, test=develop

* [ASCEND] update after develop merge, test=develop

22 Jul, 2020 2 commits

[Core] Add the graph optimization of subblocks for transformer model (#3947) · 7af1a258

Committed by hong19860320 on Jul 22, 2020

* [Core][ARM] Fix beam_search, eltwise_mul supports broadcast and int64_t data type, add print op and kernel, add exception
test=develop

* Fix the dims of parent idx of the arm kernel of beam_search op

* elementwise_mul supports int64_t data type with broadcasting

* Add print op and kernel for debugging

* Support throwing the exception when the internal error occurs

* Refine while and conditional_block op kernel

* Support the graph optimization on subblocks

* Pass program_desc and block_idx into the kernel of the control flow ops (while/conditional_block/subgraph), and create the RuntimeProgram online, which makes it possible to call the control flow ops recursively

* Add unit test for masked transformer model

[arm] add conv+conv fusion (#3967) · f358cdb8
Committed by HappyAngel on Jul 21, 2020
```
* add conv+conv(1x1s1p0) fusion

* fix build and run error

* fix format. test=develop
```

13 Jul, 2020 1 commit

[LITE][XPU] Support ResnetCbam and MMDNN (#3844) · 4780849f

Committed by Cwndmiao on Jul 13, 2020

* [LITE][XPU] accommodate resnet_cbam

* [LITE][XPU] accommodate content-dnn

* fix pr comments test=develop

* fix pr comments test=develop

* fix pr comments test=develop test=xpu

* fix compilation error, test=develop test=xpu

* [X86] Fix the unit test of slice op
test=develop test=xpu
Co-authored-by: hong19860320 <9973393+hong19860320@users.noreply.github.com>

06 Jul, 2020 1 commit
- [MLU] add cast on MLU as default, test=develop (#3776) · cc927184
  Committed by MaxwellDing on Jul 06, 2020
12 Jun, 2020 1 commit
- [LITE][PASS] Remove reshape2 / squeeze2 for tf_mobilenetv1/v2 (#3773) · 29771f27
  Committed by Yuan Shuai on Jun 12, 2020
```
* [LITE][PASS] Add pass for removing useless reshape2 / squeeze2. test=develop
```
09 Jun, 2020 1 commit
- [Parl] Add CxxPredictor->Clone() method (#3759) · 24d37695
  Committed by huzhiqiang on Jun 09, 2020
12 May, 2020 1 commit

[LITE][XPU] 1. Add precision switch(int16/int31) in XPUMultiEncoderOp; 2. Fix... · ca51f68f

Committed by Cwndmiao on May 12, 2020

[LITE][XPU] 1. Add precision switch(int16/int31) in XPUMultiEncoderOp; 2. Fix identity_dropout_eliminate_pass, |AttrType| of 'is_test' in OpDesc can be INT or BOOLEAN; 3. Enhance |__xpu__multi_encoder_fuse_pass|; (#3596)

* [LITE][XPU] Add precision switch(int16/int31) in XPUMultiEncoderOp

* [LITE][XPU] fix identity_dropout_eliminate_pass, |AttrType| of 'is_test' in OpDesc can be INT or BOOLEAN

* test=develop

* [LITE][XPU] suppress linkage error
test=develop

* [LITE][XPU] 1. Reorder |identity_dropout_eliminate_pass| before |__xpu__multi_encoder_fuse_pass|; 2. Enhance |__xpu__multi_encoder_fuse_pass|, it works well in more scenarios;
test=develop

* [LITE][XPU] Remove XPUConfig
test=develop

08 May, 2020 1 commit
- add eltwise_activate fuse. test=develop (#3367) · 2a344823
  Committed by Wilber on May 08, 2020
```
* add eltwise_activate_fuse. test=develop
```
24 Apr, 2020 1 commit
- [arm] add scale+relu/relu6/leakyrelu fusion (#3461) · 1a64347a
  Committed by HappyAngel on Apr 24, 2020
```
* add scale+relu/relu6/leakyrelu test=develop
* fix format, test=develop
```
22 Apr, 2020 1 commit
- [XPU] Add more XPU op kernels (#3457) · d5a6a1e5
  Committed by Cwndmiao on Apr 22, 2020
15 Apr, 2020 1 commit
- [APU] Add MTK APU backend (#3407) · 355d080b
  Committed by hong19860320 on Apr 15, 2020
14 Apr, 2020 1 commit
- [RKNPU] Add Rockchip NPU backend (#3382) · fbe0799e
  Committed by airockchip on Apr 14, 2020
13 Apr, 2020 1 commit
- lite cuda support exec multi-stream. (#2949) · 4a7284f9
  Committed by Wilber on Apr 13, 2020
```
lite cuda support exec multi-stream
```
09 Apr, 2020 1 commit

Committed by jackzhang235 on Apr 09, 2020

[MLU] add some basic support for MLU, including related passes, kernels, gtests and some APIs in paddle_api.h
Passes: mlu_subgraph_pass, mlu_postprocess_pass
Kernels: act, batch_norm, concat, conv, elementwise, fc, interpolate, pool, scale, softmax

dc481d49

08 Apr, 2020 1 commit

[Core][XPU] Add XPU op kernels (#3274) · 99deb7d9

Committed by hong19860320 on Apr 08, 2020

* [LITE][XPU] bind xpu resnet50 kernels

* [LITE][XPU] fuse resnet50 and encoder

* [LITE][XPU] bind xpu bert kernels

* [LITE][XPU] refine xpu_resnet_fuse_pass.cc

* [LITE][XPU] add xpu stack kernel

* [LITE][XPU] add xpu slice/tanh kernel

* [LITE][XPU] refine resnet50 and encoder fusor

* [LITE][XPU] split resnet50 and multi_encoder op from subgraph_op.h

* [LITE][XPU] clean workspace

* [LITE][XPU] add build script

* [LITE][XPU] fix compilation errors

* [LITE][XPU] fix kernel matmul

* [LITE][XPU] fix kernel ewadd ewsub

* [LITE][XPU] add xpu cast kernel

* [LITE][XPU] fix kernel slice

* [LITE][XPU] switch dev by LITE_XPU_DEV env

* [LITE][XPU] eliminate useless cast op

* [LITE][XPU] add PerThread Ops

* [LITE][X86] add SequenceUnpad op and kernel

* [LITE][XPU] add LITE_WITH_XTCL option

* [LITE][X86] add SequenceConv kernel

* [LITE][XPU] fix cmake dependency

* [LITE][XPU] add xpu sigmoid kernel

* [XPU] Remove the dependencies of framework.pb.h
test=develop

Change-Id: Icfb44efb0482a6369b365b5c09017765328fc10d

* [XPU] Fix the precision of cast kernel
test=develop

Change-Id: Icb18be47d7ab490de9fb9c92eae1165f49dbf492

* [Core] Fix the compiling error when building for a target that disables XPU
test=develop

Change-Id: I38ec53f222391d3bf06b70512e6c3ad1282e4683

* [XPU] Add io_copy kernel for xpu<->arm
test=develop

Change-Id: Iec7ea066f040534285557f9948b73e6a1970aed7

* fix
test=develop

Change-Id: I4db1c93df48e22afbba904ce6c3b0babd9fda4c3

* fix target matching of type_target_cast_pass and remove the unnecessary registration of io_copy kernel
test=develop

Change-Id: I432c10c9d1064e778d43fd0d12d8cf0599252f7a

* [X86] Add the keyword 'template' to avoid the compiling errors
test=develop

Change-Id: I015d5d323adafb3884029c8287ced66c90ad931e

* Fix the build.sh for XPU and x86
test=develop

Change-Id: I7d9575243669ce02af69a8ddbd6421db31902bd6

* [XPU] Add the keyword 'template' to avoid the compiling errors
test=develop

Change-Id: I46d0b3b6861286a73ee2999934b8e185e453e749

* [XPU] Add XTCL compiling option in build.sh
test=develop

Change-Id: I8b3fd998ca5f898d5bd2e665646e3874b3b73c80

* fix namespace conflicts, test=develop

* [API][XPU] Move the XPU related APIs into CxxConfig
test=develop

Change-Id: I75ac35e8bae96bcb835683f413f01b9db45afbf9

* [API][XPU] Remove the LITE_WITH_XPU in paddle_api.h
test=develop

Change-Id: Idbd64013bdf331ad876919511c1c349332d46f93

* [API][XPU] Remove XPUSetWorkspaceL3SizePerThread and XPUSetDevPerThread
test=develop

Change-Id: I515958f56f8e129280bae61c923513cc91fb9728

* [API][Core][XPU] Refine the test case and remove the necessary modifications
test=develop

Change-Id: I1e0e2957a2f9d5f4207b06c0bc98a5ab611fee56

* [Core] Remove useless code
test=develop

Change-Id: I6293faa10424aea2836d09d85ddb6a30f7811678

* [XPU] Refine the test cases
test=develop

Change-Id: I6818fc3addf1bca5b96a7d66ee99263242e3374f

* [XPU] Remove useless scripts and code
test=develop

Change-Id: I965ba6712d3cf881d0038f0473fec27d4c1bc684

* [XPU] Use InferShapeImpl in sequence_unpad, resnet50 and multi_encoder op
test=develop

Change-Id: I5375f524d36836a394d426b4b2bc9fb44be0b59c

* test=develop

Change-Id: I42ee68c8a5e891dd0f3e95d6cfbc498be7cf1519

* test=develop

Change-Id: If679e5aa73e1368e0ee5bd5f286d2e1b4c2f354e

* [XPU] Add __xpu__ prefix to the op and graph pass name of resnet50 and multi_encoder
test=develop

Change-Id: Idb61c99b4b8429cb87665bfd6835ab4d7d263be2

* [XPU] Fix and refine the xpu fuse pass
test=develop

Change-Id: If1c5b6788d994e2809c1a00d9384685a89440907

* test=develop

Change-Id: Icfa333e322fc4351700103692c46cfcb3d4f9a89

* [XPU] Remove the dependency on xpu api for xpu fuse passes
test=develop

Change-Id: I6094b5536f58ae18bab068284b32f9bd10a2ab92

* [XPU] Move unit tests from lite/api to lite/tests/api
test=develop

Change-Id: I7ba27abb23abeffb0c95fdbbefec7ac16cdbd250

* test=develop

Change-Id: I33230c84d6c4e61bf19f46668bae2baa3ef68794

* [XPU] Refine code
test=develop

Change-Id: I37bc5b948b4927e44cd3ea2594ebe3fd7671be06

* [XPU] Add env XPU_ENABLE_XTCL to enable xpu_subgraph_pass
test=develop

Change-Id: Ifb8e07e86f307f562adaca3ce792015a6f2a2204

* [XPU] refine code
test=develop

Change-Id: I1380654b930d51ae704dbc0cd855464d9c3b5b79

* [XPU] Refine code
test=develop

Change-Id: I73285c2718ccd3612490eb2635bef4fd608c9bde

* [XPU] Add comments for the XPU APIs
test=develop

Change-Id: Ieb5015f37984f8869b90c4c625c5894bb26164fd
Co-authored-by: miaotianxiang <miaotianxiang@baidu.com>
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

10 Mar, 2020 1 commit
- [CORE] Support the fully quantized model for MTK and RK NPU (#3096) · 08a3ed12
  Committed by hong19860320 on Mar 10, 2020
26 Feb, 2020 1 commit
- [opencl]add pre_process attribute into layoutop (#3001) · 360b4013
  Committed by huzhiqiang on Feb 26, 2020
21 Feb, 2020 1 commit
- [NPU][XPU][BM] Remove the dependencies from X86 and ARM kernels (#2963) · 294375f9
  Committed by hong19860320 on Feb 21, 2020
06 Feb, 2020 1 commit

Support weight quantization (#2791) · 6329a9a2

Committed by juncaipeng on Feb 06, 2020

* optimize quant_dequant_fuse_pass, test=develop

* update, test=develop

* update, test=develop

* fix bug for accessing the removed node, test=develop

* set the bias of int8 conv as float, test=develop

* support weight quantization, test=develop

* up, test=develop

* up, test=develop

* up, test=develop

23 Dec, 2019 1 commit
- add sequence_pool_concat fuse and kernel test=develop (#2645) · 1b74fded
  Committed by Wilber on Dec 23, 2019
```
add sequence_pool_concat fuse pass

add fuse kernel
```
20 Dec, 2019 1 commit
- add var_conv_2d_relu pass test=develop (#2631) · 8304bc84
  Committed by Wilber on Dec 20, 2019
```
add var_conv_2d + relu fuse pass
```
17 Dec, 2019 1 commit

[lite]add some fusion (#2604) · ec8353e8

Committed by HappyAngel on Dec 17, 2019

* add cv image process

* fix arm linux build error

* add LITE_WITH_CV define to make cv, test=develop

* fix cv format, and add description in utils/cv

* delete some meaningless comments, test=develop

* set LITE_WITH_CV=OFF in build.sh, test=develop

* delete cv_enum.h in utils/cv, move the contents of cv_enum.h to paddle_image_preprocess.h, test=develop

* redefine paddle_image_preprocess.h according to reviews, test=develop

* add detailed note of flipParam, test=develop

* fix format in paddle_image_preprocess.h, test=develop

* fix error when build x86. test=develop

lite_with_X86 does not contain lite_with_cv

* fix cmake error in lite/CMakeLists.txt, missing mkdir cxx, test=develop

* according to review change, test=develop

* change grb to rgb, test=develop

* add elementwise mul constant elimination and deconv+relu, deconv+batchnorm fusion, test=develop

* fix format, test=develop

13 Dec, 2019 1 commit
- [LITE][NPU][XPU] Refine subgraph pass, and support NPU/XPU model generation at... · d5434aa2
  Committed by hong19860320 on Dec 13, 2019
```
[LITE][NPU][XPU] Refine subgraph pass, and support NPU/XPU model generation at execution time (#2576)
```
04 Dec, 2019 1 commit

[cuda] [int8] resnet50 cuda int8 support (#2417) · f7574646

Committed by Zhaolong Xing on Dec 04, 2019

* init resnet cuda int8 support
test=develop

* refine cuda unit test
test=develop

* add the forgotten file.
test=develop

22 Nov, 2019 1 commit
- [LITE][ALL] Refine NPU and XPU passes, fix the pass matching based on the... · c62fd634
  Committed by hong19860320 on Nov 22, 2019
```
[LITE][ALL] Refine NPU and XPU passes, fix the pass matching based on the bound targets and excluded targets (#2477)
```
18 Nov, 2019 1 commit

[LITE][OPENCL] Enable full and light api for OpenCL (#2331) · d242bdfb

Committed by Yuan Shuai on Nov 18, 2019

* Fix bug target for kHost and kARM not equal. test=develop

* Fix license. test=develop

* add debug -g option. test=develop

* enable opencl demo. test=develop

* Fix model_optimize_tool found no opencl kernel. test=develop

* add more vlog. test=develop

* remove macro LITE_WITH_OPENCL, LITE_WITH_FPGA in passes. test=develop

* Fix valid_places in mobilenetv1_test. test=develop

* Fix bug of finding no real output of fetch after tool OPs of optimizer passes. test=develop

* Fix vlog as log message in model_optimize_tool. test=develop

* fix miscs. test=develop

* fix comment. test=develop

* Fix misspell of opencl, fpga kernels name in lite/api/CMakeLists.txt. test=develop

* add opencl macro in full_api of demo. test=develop

28 Oct, 2019 1 commit

[LITE][XPU] initial support for XPU (#2202) · 06d058fe

Committed by hong19860320 on Oct 28, 2019

* Initial support for XPU
* Fix compiling errors of XPU
* Move XPU op kernel bridges from backends to kernels to fix deps order
* Change the namespace and directory of XPU bridges
* Add XPU SDK
* Fix header files and namespace of XPU SDK
* Add unit tests for relu and conv2d ops
* Restore the modification of paddle_api_test
* Supports simple model which contains only a relu layer
* Add compiling scripts for XPU
* Fix compiling errors of XPU
* Add comments for XPU LoadModel and BuildModel

22 Oct, 2019 1 commit
- remove feed and fetch for npu subgraph pass (#2230) · 4e05ea29
  Committed by zhupengyang on Oct 22, 2019
```
test=develop
```
16 Oct, 2019 1 commit

[framework][place] remove prefered_place and kHost in valid_places (#2192) · 3012088b

Committed by sangoly on Oct 16, 2019

* [framework][place] remove prefered_place, use place order in valid_place array instead test=develop

* remove kHost from valid_places test=develop

15 Oct, 2019 2 commits

fix pass selection, test=develop (#2187) · da55f674
Committed by 石晓伟 on Oct 15, 2019

[NPU] Fix and refine the supporting of multi NPU models (#2037) · 7a731b7f

Committed by hong19860320 on Oct 15, 2019

* [NPU] Fix the bug of loading multi NPU models
test=develop

* [NPU] Use lite tensor to store NPU model, fix the management of multi NPU models, support loading NPU model from memory and reduce the modification of framework
test=develop

* [NPU] Remove redundant header files for NPU bridges,
test=develop

* [NPU] fix NPU deps
test=develop

* [NPU] refine the compiling script for NPU
test=develop

* [NPU] remove redundant subdirectory in lite/CMakeLists.txt
test=develop

* [NPU] Fix and refine NPU test case
test=develop

* [NPU] revoke the modification of other non-NPU modules
test=develop

* [NPU] Remove NPU bridges if target is tiny publish
test=develop
