- 19 Jul, 2020 1 commit
-
-
Committed by ysh329
* [OPENCL][API] add opencl valid api for device. test=develop (#3951)
-
- 17 Jul, 2020 1 commit
-
-
Committed by ysh329
* [cherry-pick][OPENCL] remove conv redundant's for opencl kernel. test=develop
Co-authored-by: xiebaiyuan <xiebaiyuan@qq.com>
-
- 13 Jul, 2020 1 commit
-
-
Committed by Qi Li
* [NPU] enhance cache offline model, test=develop
-
- 09 Jul, 2020 3 commits
-
-
Committed by ysh329
[cherry-pick][OPENCL] Fix opencl create cmd with prop, fix int16 model for fc opencl kernel. (#3918)
* [OPENCL] Fix opencl fc int16 model bug caused by fc kernel (#3900)
* fix opencl fc kernel caused int16 model weight abnormal. test=develop
-
Committed by ysh329
* fix opencl fc kernel caused int16 model weight abnormal. test=develop
-
Committed by HappyAngel
* [arm] add 2x2s2p1 pooling (#3705)
* fix pooling bug and speed
* add 2x2s2p1 pooling. test=develop
* fix conflict, test=develop
* fix conflict in wino
* [arm] add 3x3s1 Winograd int8 (#3767)
* fix: winograd support unequal pad test=develop
* feat: add winograd int8 kernel test=develop
* fix: style fix test=develop
* fix winograd_int8 ut segmentation fault. test=develop
* close basic_test, test=develop
Co-authored-by: MyPandaShaoxiang <txg4794@163.com>
* fix xiaodu crash in gemm prepacked
* on the huwen phone, 3x3s2p0 avg pooling crashes randomly; other phones do not show this behavior
* [arm] update conv int8 kernel choose (#3834)
* fix conv int8 kernel choose and softmax compute bug
* change axis_size = 4 kernel choose, test=develop
* fix format. test=develop
* fix format. test=develop
* fix build test=develop
* fix build error test=develop
* fix wino_int8 compute error. test=develop
* Update the link to debug, test=develop, test=document_fix (#3870) (#3871)
Co-authored-by: MyPandaShaoxiang <txg4794@163.com>
Co-authored-by: cc <52520497+juncaipeng@users.noreply.github.com>
-
- 10 May, 2020 1 commit
-
-
Committed by Wilber
* update cuda demo.
-
- 01 May, 2020 1 commit
-
-
Committed by huzhiqiang
-
- 30 Apr, 2020 1 commit
-
-
Committed by hong19860320
-
- 29 Apr, 2020 1 commit
-
-
Committed by Yuan Shuai
* [LITE][OPENCL] add gpu perf mode, priority level for qcom adreno. test=develop
-
- 27 Apr, 2020 1 commit
-
-
Committed by huzhiqiang
-
- 21 Apr, 2020 2 commits
- 20 Apr, 2020 3 commits
-
-
Committed by xiebaiyuan
-
Committed by xiaogang
-
Committed by yiicy
improve sgemm performance on A53
-
- 19 Apr, 2020 1 commit
-
-
Committed by xiebaiyuan
* [lite][opencl] remove event with clfinish, add strict check for cl warning. add conv 3x3opt fallback opt layout cast, test=develop
* [LITE][OPENCL] rm event in element_add_buffer_compute test=develop
* [LITE][OPENCL] suite cl_functions_test.cc test=develop
* [LITE][OPENCL] suite cl_common.sh lint check test=develop
* [LITE][OPENCL] suite conv_image_compute.cc lint check test=develop
* [LITE][OPENCL] suite cl_wait_list() lint check test=develop
-
- 15 Apr, 2020 3 commits
-
-
Committed by Yuan Shuai
* fix bilinear opencl kernel. test=develop
* [LITE][OPENCL] replace map with memsync. test=develop
-
Committed by MaxwellDing
refactor(*): reduce Wsign-compare warning
-
Committed by hong19860320
-
- 14 Apr, 2020 4 commits
-
-
Committed by silingtong123
-
Committed by airockchip
-
Committed by xiaogang
-
Committed by huzhiqiang
-
- 13 Apr, 2020 2 commits
- 12 Apr, 2020 1 commit
-
-
Committed by Yuan Shuai
-
- 11 Apr, 2020 2 commits
-
-
Committed by Yuan Shuai
1. clean code; 2. change `cl::Kernel` from unique to shared ptr; 3. `reset` `cl::Program` and `clear` `device_info_` in the destructor of CLRuntime; 4. remove clFlush in the destructor of CLRuntime.
-
Committed by xiebaiyuan
fix opencl hang on mali
-
- 10 Apr, 2020 1 commit
-
-
Committed by Yuan Shuai
* [LITE][OPENCL] fix OpenCL global static resources. test=develop
* Fix Cxx and light api. test=develop
-
- 09 Apr, 2020 2 commits
-
-
Committed by xiebaiyuan
Merged ahead of schedule in order to respond promptly to an issue with a business launch.
-
Committed by jackzhang235
[MLU] add some basic support for MLU, including related passes, kernels, gtests and some api in paddle_api.h
Passes: mlu_subgraph_pass, mlu_postprocess_pass
Kernels: act, batch_norm, concat, conv, elementwise, fc, interpolate, pool, scale, softmax
-
- 08 Apr, 2020 5 commits
-
-
Committed by cc
* Add hard_swish, ctc_align and reciprocal op, test=develop
* Move some activation ops to extra, test=develop
-
Committed by hong19860320
* [LITE][XPU] bind xpu resnet50 kernels
* [LITE][XPU] fuse resnet50 and encoder
* [LITE][XPU] bind xpu bert kernels
* [LITE][XPU] refine xpu_resnet_fuse_pass.cc
* [LITE][XPU] add xpu stack kernel
* [LITE][XPU] add xpu slice/tanh kernel
* [LITE][XPU] refine resnet50 and encoder fusor
* [LITE][XPU] split resnet50 and multi_encoder op from subgraph_op.h
* [LITE][XPU] clean workspace
* [LITE][XPU] add build script
* [LITE][XPU] fix compilation errors
* [LITE][XPU] fix kernel matmul
* [LITE][XPU] fix kernel ewadd ewsub
* [LITE][XPU] add xpu cast kernel
* [LITE][XPU] fix kernel slice
* [LITE][XPU] switch dev by LITE_XPU_DEV env
* [LITE][XPU] eliminate useless cast op
* [LITE][XPU] add PerThread Ops
* [LITE][X86] add SequenceUnpad op and kernel
* [LITE][XPU] add LITE_WITH_XTCL option
* [LITE][X86] add SequenceConv kernel
* [LITE][XPU] fix cmake dependency
* [LITE][XPU] add xpu sigmoid kernel
* [XPU] Remove the dependencies of framework.pb.h test=develop Change-Id: Icfb44efb0482a6369b365b5c09017765328fc10d
* [XPU] Fix the precision of cast kernel test=develop Change-Id: Icb18be47d7ab490de9fb9c92eae1165f49dbf492
* [Core] Fix the compiling error when build for the target that disable XPU test=develop Change-Id: I38ec53f222391d3bf06b70512e6c3ad1282e4683
* [XPU] Add io_copy kernel for xpu<->arm test=develop Change-Id: Iec7ea066f040534285557f9948b73e6a1970aed7
* fix test=develop Change-Id: I4db1c93df48e22afbba904ce6c3b0babd9fda4c3
* fix target matching of type_target_cast_pass and remove the unnecessary registration of io_copy kernel test=develop Change-Id: I432c10c9d1064e778d43fd0d12d8cf0599252f7a
* [X86] Add the keyword 'template' to avoid the compiling errors test=develop Change-Id: I015d5d323adafb3884029c8287ced66c90ad931e
* Fix the build.sh for XPU and x86 test=develop Change-Id: I7d9575243669ce02af69a8ddbd6421db31902bd6
* [XPU] Add the keyword 'template' to avoid the compiling errors test=develop Change-Id: I46d0b3b6861286a73ee2999934b8e185e453e749
* [XPU] Add XTCL compiling option in build.sh test=develop Change-Id: I8b3fd998ca5f898d5bd2e665646e3874b3b73c80
* fix namespace conflicts, test=develop
* [API][XPU] Move the XPU related APIs into CxxConfig test=develop Change-Id: I75ac35e8bae96bcb835683f413f01b9db45afbf9
* [API][XPU] Remove the LITE_WITH_XPU in paddle_api.h test=develop Change-Id: Idbd64013bdf331ad876919511c1c349332d46f93
* [API][XPU] Remove XPUSetWorkspaceL3SizePerThread and XPUSetDevPerThread test=develop Change-Id: I515958f56f8e129280bae61c923513cc91fb9728
* [API][Core][XPU] Refine the test case and remove the necessary modifications test=develop Change-Id: I1e0e2957a2f9d5f4207b06c0bc98a5ab611fee56
* [Core] Remove useless code test=develop Change-Id: I6293faa10424aea2836d09d85ddb6a30f7811678
* [XPU] Refine the test cases test=develop Change-Id: I6818fc3addf1bca5b96a7d66ee99263242e3374f
* [XPU] Remove useless scripts and code test=develop Change-Id: I965ba6712d3cf881d0038f0473fec27d4c1bc684
* [XPU] Use InferShapeImpl in sequence_unpad, resnet50 and multi_encoder op test=develop Change-Id: I5375f524d36836a394d426b4b2bc9fb44be0b59c
* test=develop Change-Id: I42ee68c8a5e891dd0f3e95d6cfbc498be7cf1519
* test=develop Change-Id: If679e5aa73e1368e0ee5bd5f286d2e1b4c2f354e
* [XPU] Add __xpu__ prefix to the op and graph pass name of resnet50 and multi_encoder test=develop Change-Id: Idb61c99b4b8429cb87665bfd6835ab4d7d263be2
* [XPU] Fix and refine the xpu fuse pass test=develop Change-Id: If1c5b6788d994e2809c1a00d9384685a89440907
* test=develop Change-Id: Icfa333e322fc4351700103692c46cfcb3d4f9a89
* [XPU] Remove the dependency on xpu api for xpu fuse passes test=develop Change-Id: I6094b5536f58ae18bab068284b32f9bd10a2ab92
* [XPU] Move unit tests from lite/api to lite/tests/api test=develop Change-Id: I7ba27abb23abeffb0c95fdbbefec7ac16cdbd250
* test=develop Change-Id: I33230c84d6c4e61bf19f46668bae2baa3ef68794
* [XPU] Refine code test=develop Change-Id: I37bc5b948b4927e44cd3ea2594ebe3fd7671be06
* [XPU] Add env XPU_ENABLE_XTCL to enable xpu_subgraph_pass test=develop Change-Id: Ifb8e07e86f307f562adaca3ce792015a6f2a2204
* [XPU] refine code test=develop Change-Id: I1380654b930d51ae704dbc0cd855464d9c3b5b79
* [XPU] Refine code test=develop Change-Id: I73285c2718ccd3612490eb2635bef4fd608c9bde
* [XPU] Add comments for the XPU APIs test=develop Change-Id: Ieb5015f37984f8869b90c4c625c5894bb26164fd
Co-authored-by: miaotianxiang <miaotianxiang@baidu.com>
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
-
Committed by Yuan Shuai
* [LITE][OPENCL] Add ReleaseResource for OpenCL when Predictor dead. test=develop
* fix void for destructor. test=develop
* fix miscs. test=develop
* fix miscs. test=develop
* fix miscs. test=develop
* fix miscs. test=develop
* [LITE][OPENCL] fix Hang of mobilenetv1_test and kernel test. test=develop
* [LITE][OPENCL] Fix miscs. test is ok. test=develop
-
Committed by HappyAngel
* add boxcoder opencl kernel, test=develop
* fix format, test=develop
* fix, test=develop
-
Committed by huzhiqiang
-
- 05 Apr, 2020 1 commit
-
-
Committed by Yuan Shuai
[LITE][OPENCL] Fix opencl backend: Free opencl resources; Fix AddKernel/GetKernel, program and all opencl kernels (#3344)
* [DONT MERGE] Fix opencl backend.
* [LITE][OPENCL] Fix kernels overlapped when add/get for kernels of mnasnet/yolonano. test=develop
* remove useless. test=develop
* add all image kernels for Get/Add kernel. test=develop
* add all image kernels for Get/Add kernel. test=develop
* fix buffer kernels of opencl. test=develop
* fix release opencl. test=develop
-
- 03 Apr, 2020 1 commit
-
-
Committed by xiebaiyuan
* [LITE][OPENCL][Image] add lws turn & close cl check when shutdownlog, test=develop
* [LITE][OPENCL][Image] add lws turn & close cl check when shutdownlog, test=develop
* [LITE][OPENCL][Image] add lws turn & close cl check when shutdownlog, test=develop
* [LITE][OPENCL][Image] add lws turn & close cl check when shutdownlog, test=develop
* [LITE][OPENCL][Image] add lws turn & close cl check when shutdownlog, test=develop
-
- 01 Apr, 2020 1 commit
-
-
Committed by Wilber
add cuda kernel. abs, tanh, elementwise_sub
-