- 20 Sep, 2020 1 commit
-
-
Committed by weihaoji
-
- 17 Sep, 2020 1 commit
-
-
Committed by ysh329
[PROFILE] Add ENV var to control whether to write the output tensor of each op to files; Rename output tensor name when mem_reuse pass enabled by default, etc. (#4348)
* Add ENV var to control whether to write the output tensor of each op to files
* Rename output tensor name when mem_reuse pass enabled by default, etc.
-
- 15 Sep, 2020 1 commit
-
-
Committed by ysh329
* [PROFILE] Write output tensor to file for each OP when precision profiler enabled. test=develop
* create output tensor files dir. test=develop
-
- 02 Sep, 2020 1 commit
-
-
Committed by sunsetlh
-
- 17 Aug, 2020 1 commit
-
-
Committed by myq406450149
-
- 12 Aug, 2020 2 commits
- 11 Aug, 2020 1 commit
-
-
Committed by Wilber
-
- 24 Jul, 2020 1 commit
-
-
Committed by Qi Li
* [ASCEND] Add Huawei Ascend310 support, test=develop
* [ASCEND] fix some typos, test=develop
* [ASCEND] address comments and fix opt ci python file, test=develop
* [ASCEND] update based on new ascend env, test=develop
* [ASCEND] update after develop merge, test=develop
-
- 22 Jul, 2020 2 commits
-
-
Committed by hong19860320
* [Core][ARM] Fix beam_search, eltwise_mul supports broadcast and int64_t data type, add print op and kernel, add exception test=develop
* Fix the dims of parent idx of the arm kernel of beam_search op
* elementwise_mul supports int64_t data type with broadcasting
* Add print op and kernel for debugging
* Support throwing an exception when an internal error occurs
* Refine while and conditional_block op kernel
* Support the graph optimization on subblocks
* Pass program_desc and block_idx into the kernel of the control flow ops (while/conditional_block/subgraph), and create the RuntimeProgram online, which makes it possible to call the control flow ops recursively
* Add unit test for masked transformer model
-
Committed by HappyAngel
* add conv+conv(1x1s1p0) fusion
* fix build and run error
* fix format. test=develop
-
- 13 Jul, 2020 1 commit
-
-
Committed by Cwndmiao
* [LITE][XPU] accommodate resnet_cbam
* [LITE][XPU] accommodate content-dnn
* fix pr comments test=develop
* fix pr comments test=develop
* fix pr comments test=develop test=xpu
* fix compilation error, test=develop test=xpu
* [X86] Fix the unit test of slice op test=develop test=xpu
Co-authored-by: hong19860320 <9973393+hong19860320@users.noreply.github.com>
-
- 06 Jul, 2020 1 commit
-
-
Committed by MaxwellDing
-
- 12 Jun, 2020 1 commit
-
-
Committed by Yuan Shuai
* [LITE][PASS] Add pass for removing useless reshape2 / squeeze2. test=develop
-
- 09 Jun, 2020 1 commit
-
-
Committed by huzhiqiang
-
- 12 May, 2020 1 commit
-
-
Committed by Cwndmiao
[LITE][XPU] 1. Add precision switch(int16/int31) in XPUMultiEncoderOp; 2. Fix identity_dropout_eliminate_pass, |AttrType| of 'is_test' in OpDesc can be INT or BOOLEAN; 3. Enhance |__xpu__multi_encoder_fuse_pass|; (#3596)
* [LITE][XPU] Add precision switch(int16/int31) in XPUMultiEncoderOp
* [LITE][XPU] fix identity_dropout_eliminate_pass, |AttrType| of 'is_test' in OpDesc can be INT or BOOLEAN
* test=develop
* [LITE][XPU] suppress linkage error test=develop
* [LITE][XPU] 1. Reorder |identity_dropout_eliminate_pass| before |__xpu__multi_encoder_fuse_pass|; 2. Enhance |__xpu__multi_encoder_fuse_pass|, it works well in more scenarios; test=develop
* [LITE][XPU] Remove XPUConfig test=develop
-
- 08 May, 2020 1 commit
-
-
Committed by Wilber
* add eltwise_activate_fuse. test=develop
-
- 24 Apr, 2020 1 commit
-
-
Committed by HappyAngel
* add scale+relu/relu6/leakyrelu test=develop
* fix format, test=develop
-
- 22 Apr, 2020 1 commit
-
-
Committed by Cwndmiao
-
- 15 Apr, 2020 1 commit
-
-
Committed by hong19860320
-
- 14 Apr, 2020 1 commit
-
-
Committed by airockchip
-
- 13 Apr, 2020 1 commit
-
-
Committed by Wilber
lite cuda support exec multi-stream
-
- 09 Apr, 2020 1 commit
-
-
Committed by jackzhang235
[MLU] add some basic support for MLU, including related passes, kernels, gtests and some api in paddle_api.h
Passes: mlu_subgraph_pass, mlu_postprocess_pass
Kernels: act, batch_norm, concat, conv, elementwise, fc, interpolate, pool, scale, softmax
-
- 08 Apr, 2020 1 commit
-
-
Committed by hong19860320
* [LITE][XPU] bind xpu resnet50 kernels
* [LITE][XPU] fuse resnet50 and encoder
* [LITE][XPU] bind xpu bert kernels
* [LITE][XPU] refine xpu_resnet_fuse_pass.cc
* [LITE][XPU] add xpu stack kernel
* [LITE][XPU] add xpu slice/tanh kernel
* [LITE][XPU] refine resnet50 and encoder fusor
* [LITE][XPU] split resnet50 and multi_encoder op from subgraph_op.h
* [LITE][XPU] clean workspace
* [LITE][XPU] add build script
* [LITE][XPU] fix compilation errors
* [LITE][XPU] fix kernel matmul
* [LITE][XPU] fix kernel ewadd ewsub
* [LITE][XPU] add xpu cast kernel
* [LITE][XPU] fix kernel slice
* [LITE][XPU] switch dev by LITE_XPU_DEV env
* [LITE][XPU] eliminate useless cast op
* [LITE][XPU] add PerThread Ops
* [LITE][X86] add SequenceUnpad op and kernel
* [LITE][XPU] add LITE_WITH_XTCL option
* [LITE][X86] add SequenceConv kernel
* [LITE][XPU] fix cmake dependency
* [LITE][XPU] add xpu sigmoid kernel
* [XPU] Remove the dependencies of framework.pb.h test=develop Change-Id: Icfb44efb0482a6369b365b5c09017765328fc10d
* [XPU] Fix the precision of cast kernel test=develop Change-Id: Icb18be47d7ab490de9fb9c92eae1165f49dbf492
* [Core] Fix the compiling error when build for the target that disable XPU test=develop Change-Id: I38ec53f222391d3bf06b70512e6c3ad1282e4683
* [XPU] Add io_copy kernel for xpu<->arm test=develop Change-Id: Iec7ea066f040534285557f9948b73e6a1970aed7
* fix test=develop Change-Id: I4db1c93df48e22afbba904ce6c3b0babd9fda4c3
* fix target matching of type_target_cast_pass and remove the unnecessary registration of io_copy kernel test=develop Change-Id: I432c10c9d1064e778d43fd0d12d8cf0599252f7a
* [X86] Add the keyword 'template' to avoid the compiling errors test=develop Change-Id: I015d5d323adafb3884029c8287ced66c90ad931e
* Fix the build.sh for XPU and x86 test=develop Change-Id: I7d9575243669ce02af69a8ddbd6421db31902bd6
* [XPU] Add the keyword 'template' to avoid the compiling errors test=develop Change-Id: I46d0b3b6861286a73ee2999934b8e185e453e749
* [XPU] Add XTCL compiling option in build.sh test=develop Change-Id: I8b3fd998ca5f898d5bd2e665646e3874b3b73c80
* fix namespace conflicts, test=develop
* [API][XPU] Move the XPU related APIs into CxxConfig test=develop Change-Id: I75ac35e8bae96bcb835683f413f01b9db45afbf9
* [API][XPU] Remove the LITE_WITH_XPU in paddle_api.h test=develop Change-Id: Idbd64013bdf331ad876919511c1c349332d46f93
* [API][XPU] Remove XPUSetWorkspaceL3SizePerThread and XPUSetDevPerThread test=develop Change-Id: I515958f56f8e129280bae61c923513cc91fb9728
* [API][Core][XPU] Refine the test case and remove the necessary modifications test=develop Change-Id: I1e0e2957a2f9d5f4207b06c0bc98a5ab611fee56
* [Core] Remove useless code test=develop Change-Id: I6293faa10424aea2836d09d85ddb6a30f7811678
* [XPU] Refine the test cases test=develop Change-Id: I6818fc3addf1bca5b96a7d66ee99263242e3374f
* [XPU] Remove useless scripts and code test=develop Change-Id: I965ba6712d3cf881d0038f0473fec27d4c1bc684
* [XPU] Use InferShapeImpl in sequence_unpad, resnet50 and multi_encoder op test=develop Change-Id: I5375f524d36836a394d426b4b2bc9fb44be0b59c
* test=develop Change-Id: I42ee68c8a5e891dd0f3e95d6cfbc498be7cf1519
* test=develop Change-Id: If679e5aa73e1368e0ee5bd5f286d2e1b4c2f354e
* [XPU] Add __xpu__ prefix to the op and graph pass name of resnet50 and multi_encoder test=develop Change-Id: Idb61c99b4b8429cb87665bfd6835ab4d7d263be2
* [XPU] Fix and refine the xpu fuse pass test=develop Change-Id: If1c5b6788d994e2809c1a00d9384685a89440907
* test=develop Change-Id: Icfa333e322fc4351700103692c46cfcb3d4f9a89
* [XPU] Remove the dependency on xpu api for xpu fuse passes test=develop Change-Id: I6094b5536f58ae18bab068284b32f9bd10a2ab92
* [XPU] Move unit tests from lite/api to lite/tests/api test=develop Change-Id: I7ba27abb23abeffb0c95fdbbefec7ac16cdbd250
* test=develop Change-Id: I33230c84d6c4e61bf19f46668bae2baa3ef68794
* [XPU] Refine code test=develop Change-Id: I37bc5b948b4927e44cd3ea2594ebe3fd7671be06
* [XPU] Add env XPU_ENABLE_XTCL to enable xpu_subgraph_pass test=develop Change-Id: Ifb8e07e86f307f562adaca3ce792015a6f2a2204
* [XPU] refine code test=develop Change-Id: I1380654b930d51ae704dbc0cd855464d9c3b5b79
* [XPU] Refine code test=develop Change-Id: I73285c2718ccd3612490eb2635bef4fd608c9bde
* [XPU] Add comments for the XPU APIs test=develop Change-Id: Ieb5015f37984f8869b90c4c625c5894bb26164fd
Co-authored-by: miaotianxiang <miaotianxiang@baidu.com>
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
-
- 10 Mar, 2020 1 commit
-
-
Committed by hong19860320
-
- 26 Feb, 2020 1 commit
-
-
Committed by huzhiqiang
-
- 21 Feb, 2020 1 commit
-
-
Committed by hong19860320
-
- 06 Feb, 2020 1 commit
-
-
Committed by juncaipeng
* optimize quant_dequant_fuse_pass, test=develop
* update, test=develop
* update, test=develop
* fix bug for accessing the removed node, test=develop
* set the bias of int8 conv as float, test=develop
* support weight quantization, test=develop
* up, test=develop
* up, test=develop
* up, test=develop
-
- 23 Dec, 2019 1 commit
-
-
Committed by Wilber
add sequence_pool_concat fuse pass, add fuse kernel
-
- 20 Dec, 2019 1 commit
-
-
Committed by Wilber
add var_conv_2d + relu fuse pass
-
- 17 Dec, 2019 1 commit
-
-
Committed by HappyAngel
* add cv image process
* fix arm linux build error
* add LITE_WITH_CV define to make cv, test=develop
* fix cv format, and add description in utils/cv
* delete some meaningless comments, test=develop
* set LITE_WITH_CV=OFF in build.sh, test=develop
* delete cv_enum.h in utils/cv, push the contents in cv_enum.h to paddle_image_preprocess.h, test=develop
* according to reviews to redefine paddle_image_preprocess.h, test=develop
* add detailed note of flipParam, test=develop
* fix format in paddle_image_preprocess.h, test=develop
* fix error when build x86. test=develop lite_with_X86 does not contain lite_with_cv
* fix cmake error in lite/CMakeLists.txt, missing mkdir cxx, test=develop
* according to review change, test=develop
* change grb to rgb, test=develop
* add elementwise mul constant elimination and deconv+relu, deconv+batchnorm fusion, test=develop
* fix format, test=develop
-
- 13 Dec, 2019 1 commit
-
-
Committed by hong19860320
[LITE][NPU][XPU] Refine subgraph pass, and support NPU/XPU model generation at execution time (#2576)
-
- 04 Dec, 2019 1 commit
-
-
Committed by Zhaolong Xing
* init resnet cuda int8 support test=develop
* refine cuda unit test test=develop
* add the forgotten file. test=develop
-
- 22 Nov, 2019 1 commit
-
-
Committed by hong19860320
[LITE][ALL] Refine NPU and XPU passes, fix the pass matching based on the bound targets and excluded targets (#2477)
-
- 18 Nov, 2019 1 commit
-
-
Committed by Yuan Shuai
* Fix bug target for kHost and kARM not equal. test=develop
* Fix license. test=develop
* add debug -g option. test=develop
* enable opencl demo. test=develop
* Fix model_optimize_tool found no opencl kernel. test=develop
* add more vlog. test=develop
* remove macro LITE_WITH_OPENCL, LITE_WITH_FPGA in passes. test=develop
* Fix valid_places in mobilenetv1_test. test=develop
* Fix bug of find no real output of fetch, after tool OPs of optimizer passes. test=develop
* Fix vlog as log message in model_optimize_tool. test=develop
* fix miscs. test=develop
* fix comment. test=develop
* Fix misspell of opencl, fpga kernels name in lite/api/CMakeLists.txt. test=develop
* add opencl macro in full_api of demo. test=develop
-
- 28 Oct, 2019 1 commit
-
-
Committed by hong19860320
* Initial support for XPU
* Fix compiling errors of XPU
* Move XPU op kernel bridges from backends to kernels to fix deps order
* Change the namespace and directory of XPU bridges
* Add XPU SDK
* Fix header files and namespace of XPU SDK
* Add unit tests for relu and conv2d ops
* Restore the modification of paddle_api_test
* Supports simple model which contains only a relu layer
* Add compiling scripts for XPU
* Fix compiling errors of XPU
* Add comments for XPU LoadModel and BuildModel
-
- 22 Oct, 2019 1 commit
-
-
Committed by zhupengyang
test=develop
-
- 16 Oct, 2019 1 commit
-
-
Committed by sangoly
* [framework][place] remove prefered_place, use place order in valid_place array instead test=develop
* remove kHost from valid_places test=develop
-
- 15 Oct, 2019 2 commits
-
-
Committed by 石晓伟
-
Committed by hong19860320
* [NPU] Fix the bug of loading multi NPU models test=develop
* [NPU] Use lite tensor to store NPU model, fix the management of multi NPU models, support loading NPU model from memory and reduce the modification of framework test=develop
* [NPU] Remove redundant header files for NPU bridges, test=develop
* [NPU] fix NPU deps test=develop
* [NPU] refine the compiling script for NPU test=develop
* [NPU] remove redundant subdirectory in lite/CMakeLists.txt test=develop
* [NPU] Fix and refine NPU test case test=develop
* [NPU] revoke the modification of other non-NPU modules test=develop
* [NPU] Remove NPU bridges if target is tiny publish test=develop
-