  1. 14 Apr, 2020 2 commits
  2. 09 Apr, 2020 1 commit
  3. 08 Apr, 2020 3 commits
    • Add hard_swish, ctc_align and reciprocal op (#3354) · 5dced130
      committed by cc
      * Add hard_swish, ctc_align and reciprocal op, test=develop
      * Move some activation ops to extra, test=develop
      5dced130
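For readers unfamiliar with the three ops named in this commit, the sketch below shows what each one computes on scalars and a single token sequence. It is not the code from the commit: the function names `HardSwish`, `Reciprocal` and `CtcAlign` are hypothetical, and the hard_swish defaults (threshold 6, scale 6, offset 3) are assumed from the usual definition of the op.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hard-swish: x * min(max(0, x + offset), threshold) / scale.
// The default values below are an assumption, not taken from the commit.
inline float HardSwish(float x, float threshold = 6.f, float scale = 6.f,
                       float offset = 3.f) {
  assert(scale != 0.f);
  return x * std::min(std::max(0.f, x + offset), threshold) / scale;
}

// Reciprocal: elementwise 1 / x.
inline float Reciprocal(float x) { return 1.f / x; }

// CTC alignment for one sequence: optionally merge consecutive repeated
// tokens, then drop the blank token.
inline std::vector<int> CtcAlign(const std::vector<int>& tokens, int blank,
                                 bool merge_repeated = true) {
  std::vector<int> out;
  bool has_prev = false;
  int prev = 0;
  for (int t : tokens) {
    const bool repeat = has_prev && t == prev;
    if (t != blank && !(merge_repeated && repeat)) out.push_back(t);
    prev = t;
    has_prev = true;
  }
  return out;
}
```

A blank between two equal tokens keeps both of them, which is why the blank is removed only after repeats are merged.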
    • [Core][XPU] Add XPU op kernels (#3274) · 2b80bab6
      committed by hong19860320
      * [LITE][XPU] bind xpu resnet50 kernels
      
      * [LITE][XPU] fuse resnet50 and encoder
      
      * [LITE][XPU] bind xpu bert kernels
      
      * [LITE][XPU] refine xpu_resnet_fuse_pass.cc
      
      * [LITE][XPU] add xpu stack kernel
      
      * [LITE][XPU] add xpu slice/tanh kernel
      
      * [LITE][XPU] refine resnet50 and encoder fusor
      
      * [LITE][XPU] split resnet50 and multi_encoder op from subgraph_op.h
      
      * [LITE][XPU] clean workspace
      
      * [LITE][XPU] add build script
      
      * [LITE][XPU] fix compilation errors
      
      * [LITE][XPU] fix kernel matmul
      
      * [LITE][XPU] fix kernel ewadd ewsub
      
      * [LITE][XPU] add xpu cast kernel
      
      * [LITE][XPU] fix kernel slice
      
      * [LITE][XPU] switch dev by LITE_XPU_DEV env
      
      * [LITE][XPU] eliminate useless cast op
      
      * [LITE][XPU] add PerThread Ops
      
      * [LITE][X86] add SequenceUnpad op and kernel
      
      * [LITE][XPU] add LITE_WITH_XTCL option
      
      * [LITE][X86] add SequenceConv kernel
      
      * [LITE][XPU] fix cmake dependency
      
      * [LITE][XPU] add xpu sigmoid kernel
      
      * [XPU] Remove the dependencies of framework.pb.h
      test=develop
      
      Change-Id: Icfb44efb0482a6369b365b5c09017765328fc10d
      
      * [XPU] Fix the precision of cast kernel
      test=develop
      
      Change-Id: Icb18be47d7ab490de9fb9c92eae1165f49dbf492
      
      * [Core] Fix the compiling error when build for the target that disable XPU
      test=develop
      
      Change-Id: I38ec53f222391d3bf06b70512e6c3ad1282e4683
      
      * [XPU] Add io_copy kernel for xpu<->arm
      test=develop
      
      Change-Id: Iec7ea066f040534285557f9948b73e6a1970aed7
      
      * fix
      test=develop
      
      Change-Id: I4db1c93df48e22afbba904ce6c3b0babd9fda4c3
      
      * fix target matching of type_target_cast_pass and remove the unnecessary registration of io_copy kernel
      test=develop
      
      Change-Id: I432c10c9d1064e778d43fd0d12d8cf0599252f7a
      
      * [X86] Add the keyword 'template' to avoid the compiling errors
      test=develop
      
      Change-Id: I015d5d323adafb3884029c8287ced66c90ad931e
      
      * Fix the build.sh for XPU and x86
      test=develop
      
      Change-Id: I7d9575243669ce02af69a8ddbd6421db31902bd6
      
      * [XPU] Add the keyword 'template' to avoid the compiling errors
      test=develop
      
      Change-Id: I46d0b3b6861286a73ee2999934b8e185e453e749
      
      * [XPU] Add XTCL compiling option in build.sh
      test=develop
      
      Change-Id: I8b3fd998ca5f898d5bd2e665646e3874b3b73c80
      
      * fix namespace conflicts, test=develop
      
      * [API][XPU] Move the XPU related APIs into CxxConfig
      test=develop
      
      Change-Id: I75ac35e8bae96bcb835683f413f01b9db45afbf9
      
      * [API][XPU] Remove the LITE_WITH_XPU in paddle_api.h
      test=develop
      
      Change-Id: Idbd64013bdf331ad876919511c1c349332d46f93
      
      * [API][XPU] Remove XPUSetWorkspaceL3SizePerThread and XPUSetDevPerThread
      test=develop
      
      Change-Id: I515958f56f8e129280bae61c923513cc91fb9728
      
      * [API][Core][XPU] Refine the test case and remove the unnecessary modifications
      test=develop
      
      Change-Id: I1e0e2957a2f9d5f4207b06c0bc98a5ab611fee56
      
      * [Core] Remove useless code
      test=develop
      
      Change-Id: I6293faa10424aea2836d09d85ddb6a30f7811678
      
      * [XPU] Refine the test cases
      test=develop
      
      Change-Id: I6818fc3addf1bca5b96a7d66ee99263242e3374f
      
      * [XPU] Remove useless scripts and code
      test=develop
      
      Change-Id: I965ba6712d3cf881d0038f0473fec27d4c1bc684
      
      * [XPU] Use InferShapeImpl in sequence_unpad, resnet50 and multi_encoder op
      test=develop
      
      Change-Id: I5375f524d36836a394d426b4b2bc9fb44be0b59c
      
      * test=develop
      
      Change-Id: I42ee68c8a5e891dd0f3e95d6cfbc498be7cf1519
      
      * test=develop
      
      Change-Id: If679e5aa73e1368e0ee5bd5f286d2e1b4c2f354e
      
      * [XPU] Add __xpu__ prefix to the op and graph pass name of resnet50 and multi_encoder
      test=develop
      
      Change-Id: Idb61c99b4b8429cb87665bfd6835ab4d7d263be2
      
      * [XPU] Fix and refine the xpu fuse pass
      test=develop
      
      Change-Id: If1c5b6788d994e2809c1a00d9384685a89440907
      
      * test=develop
      
      Change-Id: Icfa333e322fc4351700103692c46cfcb3d4f9a89
      
      * [XPU] Remove the dependency on xpu api for xpu fuse passes
      test=develop
      
      Change-Id: I6094b5536f58ae18bab068284b32f9bd10a2ab92
      
      * [XPU] Move unit tests from lite/api to lite/tests/api
      test=develop
      
      Change-Id: I7ba27abb23abeffb0c95fdbbefec7ac16cdbd250
      
      * test=develop
      
      Change-Id: I33230c84d6c4e61bf19f46668bae2baa3ef68794
      
      * [XPU] Refine code
      test=develop
      
      Change-Id: I37bc5b948b4927e44cd3ea2594ebe3fd7671be06
      
      * [XPU] Add env XPU_ENABLE_XTCL to enable xpu_subgraph_pass
      test=develop
      
      Change-Id: Ifb8e07e86f307f562adaca3ce792015a6f2a2204
      
      * [XPU] refine code
      test=develop
      
      Change-Id: I1380654b930d51ae704dbc0cd855464d9c3b5b79
      
      * [XPU] Refine code
      test=develop
      
      Change-Id: I73285c2718ccd3612490eb2635bef4fd608c9bde
      
      * [XPU] Add comments for the XPU APIs
      test=develop
      
      Change-Id: Ieb5015f37984f8869b90c4c625c5894bb26164fd
      Co-authored-by: miaotianxiang <miaotianxiang@baidu.com>
      Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
      2b80bab6
    • [x86] Fix x86 code style (#3287) · aa1db862
      committed by huzhiqiang
      aa1db862
  4. 29 Feb, 2020 1 commit
  5. 17 Feb, 2020 1 commit
  6. 14 Feb, 2020 2 commits
    • Replace Softsign Eigen with C implementation (#2864) · e83d6b87
      committed by GaoWei8
      * Replace Softsign Eigen with C implementation
      test=develop
      e83d6b87
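As an illustration of what replacing an Eigen expression with a plain loop can look like, here is a minimal sketch of softsign, x / (1 + |x|), over a raw buffer. The name `SoftsignLoop` and its signature are hypothetical and are not taken from the commit.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Softsign applied elementwise with a plain loop: out[i] = x[i] / (1 + |x[i]|).
// `x` and `out` may point to the same buffer.
inline void SoftsignLoop(const float* x, float* out, std::size_t n) {
  assert(n == 0 || (x != nullptr && out != nullptr));
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = x[i] / (1.f + std::fabs(x[i]));
  }
}
```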
    • [X86] Optimize gru and softmax (#2877) · 6b30c58a
      committed by Yiqun Liu
      * Optimize softmax. When the input tensor is 2-D and axis is 1, there is no need to resize.
      
      * Optimize the gru by avoiding calls to Tensor::Slice.
      test=develop
      
      * Remove a std::vector in softmax.
      test=develop
      
      * Define CalculateSeqWidth to get the width of a sequence.
      test=develop
      6b30c58a
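The softmax note in this commit relies on the fact that a 2-D, row-major tensor with axis 1 is already laid out one row per softmax group, so it can be processed in place without reshaping. A minimal sketch of that case follows. `SoftmaxRows` is a hypothetical name and this is not the kernel from the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// In-place softmax along axis 1 of a row-major [rows, cols] buffer.
// Each row is normalized independently, so no reshape is required.
inline void SoftmaxRows(std::vector<float>* data, int rows, int cols) {
  assert(data != nullptr && rows >= 0 && cols > 0);
  assert(data->size() == static_cast<std::size_t>(rows) * cols);
  for (int r = 0; r < rows; ++r) {
    float* row = data->data() + static_cast<std::size_t>(r) * cols;
    // Subtract the row maximum for numerical stability.
    const float max_v = *std::max_element(row, row + cols);
    float sum = 0.f;
    for (int c = 0; c < cols; ++c) {
      row[c] = std::exp(row[c] - max_v);
      sum += row[c];
    }
    for (int c = 0; c < cols; ++c) row[c] /= sum;
  }
}
```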
  7. 11 Feb, 2020 2 commits
  8. 07 Feb, 2020 1 commit
  9. 15 Jan, 2020 1 commit
  10. 06 Jan, 2020 1 commit
  11. 26 Dec, 2019 1 commit
  12. 25 Dec, 2019 1 commit
    • [X86] Polish the implementation of fc and improve the unittest (#2656) · a0f01efa
      committed by Yiqun Liu
      * Remove GEMM padding in fc_compute.
      test=develop
      
      * Write a common ParallelFor function to run the for loop in parallel.
      
      * Add the GEMM padding code back to fc.
      
      * Refine the code of fc when padding_weight is false to avoid the definition of temporary Tensor.
      
      * Refine the unit test of fc and add test cases for padding and parallel execution.
      test=develop
      
      * Enable more test cases in common fc unittest, including padding and parallel for x86 target.
      
      * Remove the fc test under kernels/x86.
      test=develop
      
      * Disable relu in test of fc for non-x86 target.
      test=develop
      
      * Change the eps of arm.
      test=develop
      a0f01efa
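The commit mentions a common ParallelFor function for running a for loop in parallel. One possible shape for such a helper is sketched below, assuming OpenMP as the backend and a serial fallback when OpenMP is not enabled. The signature and the `ScaleInPlace` usage helper are hypothetical and do not reproduce the code from the commit.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Runs body(i) for every i in [begin, end). Iterations must be independent.
// With OpenMP enabled the loop is split across threads; otherwise it runs
// serially.
template <typename Body>
inline void ParallelFor(std::int64_t begin, std::int64_t end,
                        const Body& body) {
  assert(begin <= end);
#ifdef _OPENMP
#pragma omp parallel for
#endif
  for (std::int64_t i = begin; i < end; ++i) {
    body(i);
  }
}

// Usage example: scale every element; each iteration touches one element.
inline void ScaleInPlace(std::vector<float>* values, float factor) {
  assert(values != nullptr);
  float* data = values->data();
  ParallelFor(0, static_cast<std::int64_t>(values->size()),
              [data, factor](std::int64_t i) { data[i] *= factor; });
}
```

Keeping the loop body as a callable lets the same kernel code run with or without a threading backend.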
  13. 24 Dec, 2019 1 commit
  14. 19 Dec, 2019 1 commit
  15. 10 Dec, 2019 1 commit
  16. 08 Dec, 2019 1 commit
  17. 02 Dec, 2019 2 commits
  18. 28 Nov, 2019 1 commit
  19. 27 Nov, 2019 1 commit
  20. 26 Nov, 2019 1 commit
  21. 25 Nov, 2019 1 commit
  22. 22 Nov, 2019 2 commits
    • update conv 2-pad to 4-pad (#2404) · b3a5fc1a
      committed by HappyAngel
      * fix conv 2-pad to 4-pad
      
      * fix compute conv shape
      
      * fix pad, test=develop
      
      * rename conv_depthwise_3x3s1_fp.cc to conv3x3s1p01_depthwise_fp32.cc to distinguish it from conv3x3s1_depthwise_fp32.cc
      
      * delete printf note in conv3x3s1, test=develop
      
      * delete printf note, test=develop
      
      * delete gem_sdot.h, test=develop
      
      it is copied from __gemm_sdot_meta_.h
      
      * update compute padding, test=develop
      
      * fix padding size, must be 2 or 4. test=develop
      
      * fix format in operators/conv_op.cc, test=develop
      
      * change #if 0 to #if 1, test=develop
      
      * put 2-pad to 4-pad in AttachImpl, test=develop
      
      * fix clang-format error in tests/math/conv_compute_test, test=develop
      
      * fix x86 test result error, test=develop
      
      * add asymmetric padding test case in lite/tests/math/conv_compute.cc, test=develop
      
      * change the paddings type to support dynamic modification, test=develop
      
      * fix x86 build error in conv_compute_test, test=develop
      
      * fix opencl build error, test=develop
      
      * fix opencl build error, test=develop
      
      * fix opencl/conv_compute build error, test=develop
      
      * fix opencl/conv_compute build error, test=develop
      
      * fix format in kernels/opencl/conv_compute_test, test=develop
      
      * fix build error, test=develop
      
      fix build error in kernels/x86/conv_compute.h
      b3a5fc1a
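The 2-pad to 4-pad change can be pictured as expanding a symmetric {pad_h, pad_w} pair into one value per side, which is what makes asymmetric padding expressible. The sketch below assumes the order {top, bottom, left, right}; that order and the names `ExpandPaddings` and `ConvOutSize` are assumptions for illustration, not the code from the commit.

```cpp
#include <cassert>
#include <vector>

// Expands {pad_h, pad_w} to {top, bottom, left, right} (assumed order).
// A 4-element input is returned unchanged; other sizes are rejected,
// matching the "must be 2 or 4" rule stated in the commit message.
inline std::vector<int> ExpandPaddings(const std::vector<int>& paddings) {
  assert(paddings.size() == 2 || paddings.size() == 4);
  if (paddings.size() == 4) return paddings;
  return {paddings[0], paddings[0], paddings[1], paddings[1]};
}

// Output extent along one spatial dimension with separate begin/end padding.
inline int ConvOutSize(int in, int kernel, int pad_begin, int pad_end,
                       int stride, int dilation = 1) {
  assert(stride > 0);
  const int effective_kernel = dilation * (kernel - 1) + 1;
  return (in + pad_begin + pad_end - effective_kernel) / stride + 1;
}
```

With per-side padding, the output size uses the sum of the two sides rather than twice a single value.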
    • update pooling 2-padding to 4-padding (#2410) · 4bdb6171
      committed by HappyAngel
      * fix pooling bug and speed
      
      * fix build error
      
      * delete VLOG in pool, test=develop
      
      * add openmp, test=develop
      
      * fix lite/kernels/arm/pool_compute_test basic_pooling compute error bug, test=develop
      
      * update pooling 2-pad to 4-pad, test=develop
      
      * fix 2-pad to 4-pad in operators/pool_op.h; AttachKernel will set param, so the 2-pad to 4-pad funcs should be put in AttachKernel. test=develop
      
      * put 2-pad to 4-pad in AttachImpl, test=develop
      
      * according to reviews, fix some format errors. test=develop
      
      * fix format error, add (). test=develop
      
      * change the paddings type to support dynamic modification, test=develop
      
      * update padding type in other devices, test=develop
      
      * fix x86 build error on shared_ptr, test=develop
      
      * fix format in operators pool_op.cc, test=develop
      4bdb6171
  23. 20 Nov, 2019 3 commits
  24. 19 Nov, 2019 3 commits
  25. 18 Nov, 2019 3 commits
  26. 16 Nov, 2019 1 commit
  27. 15 Nov, 2019 1 commit
    • Add content-dnn ops (#2429) · 7f408ee8
      committed by juncaipeng
      * add search_seq_depadding x86 and cuda
      * add match_matrix_tensor x86
      * add search_grnn x86, no test
      7f408ee8