1. 27 Sep, 2020 1 commit
  2. 25 Sep, 2020 1 commit
  3. 08 Sep, 2020 1 commit
  4. 30 Jul, 2020 1 commit
  5. 24 Jul, 2020 1 commit
    • [ASCEND] Add Huawei Ascend310 support (#3936) · 769ba40b
      Committed by Qi Li
      * [ASCEND] Add Huawei Ascend310 support, test=develop
      
      * [ASCEND] fix some typos, test=develop
      
      * [ASCEND] address comments and fix opt ci python file, test=develop
      
      * [ASCEND] update based on new ascend env, test=develop
      
      * [ASCEND] update after develop merge, test=develop
  6. 22 Jul, 2020 1 commit
    • [Core] Add the graph optimization of subblocks for transformer model (#3947) · 7af1a258
      Committed by hong19860320
      * [Core][ARM] Fix beam_search, eltwise_mul supports broadcast and int64_t data type, add print op and kernel, add exception
      test=develop
      
      * Fix the dims of parent idx of the arm kernel of beam_search op
      
      * elementwise_mul supports int64_t data type with broadcasting
      
      * Add print op and kernel for debugging
      
      * Support throwing the exception when the internal error occurs
      
      * Refine while and conditional_block op kernel
      
      * Support the graph optimization on subblocks
      
      * Pass program_desc and block_idx into the kernel of the control flow ops (while/conditional_block/subgraph) and create the RuntimeProgram online, which makes it possible to call the control flow ops recursively

      * Add unit test for masked transformer model
  7. 13 Jul, 2020 1 commit
  8. 06 Jul, 2020 1 commit
  9. 04 Jun, 2020 1 commit
  10. 28 May, 2020 1 commit
  11. 13 May, 2020 1 commit
  12. 12 May, 2020 1 commit
    • [LITE][XPU] 1. Add precision switch(int16/int31) in XPUMultiEncoderOp; 2. Fix... · ca51f68f
      Committed by Cwndmiao
      [LITE][XPU] 1. Add precision switch(int16/int31) in XPUMultiEncoderOp; 2. Fix identity_dropout_eliminate_pass, |AttrType| of 'is_test' in OpDesc can be INT or BOOLEAN; 3. Enhance |__xpu__multi_encoder_fuse_pass|; (#3596)
      
      * [LITE][XPU] Add precision switch(int16/int31) in XPUMultiEncoderOp
      
      * [LITE][XPU] fix identity_dropout_eliminate_pass, |AttrType| of 'is_test' in OpDesc can be INT or BOOLEAN
      
      * test=develop
      
      * [LITE][XPU] suppress linkage error
      test=develop
      
      * [LITE][XPU] 1. Reorder |identity_dropout_eliminate_pass| before |__xpu__multi_encoder_fuse_pass|; 2. Enhance |__xpu__multi_encoder_fuse_pass|, it works well in more scenarios;
      test=develop
      
      * [LITE][XPU] Remove XPUConfig
      test=develop
  13. 06 May, 2020 1 commit
    • [LITE][BM] support hd all models, test=develop (#3540) · 94416b2c
      Committed by Santa An
      * fix reshape infer shape issue
      * adaptive pool
      * support adaptive pool2
      * multi thread ok
      * optimize global pool
      * support faceboxes and behavior image
      * realize bm device info
      * multi device ok
      * support multi cards
      * support efficientnet
  14. 22 Apr, 2020 1 commit
  15. 19 Apr, 2020 1 commit
    • [LITE][OPENCL] Fix opencl (#3433) · 9bd9311b
      Committed by xiebaiyuan
      * [lite][opencl] remove event with clfinish, add strict check for cl warning. add conv 3x3opt fallback opt layout cast, test=develop
      
      * [LITE][OPENCL]rm event in element_add_buffer_compute test=develop
      
      * [LITE][OPENCL]suite cl_functions_test.cc test=develop
      
      * [LITE][OPENCL] suite cl_common.sh lint check test=develop
      
      * [LITE][OPENCL] suite conv_image_compute.cc lint check test=develop
      
      * [LITE][OPENCL] suite cl_wait_list() lint check test=develop
  16. 15 Apr, 2020 1 commit
  17. 14 Apr, 2020 1 commit
  18. 13 Apr, 2020 1 commit
  19. 09 Apr, 2020 1 commit
  20. 08 Apr, 2020 1 commit
    • [Core][XPU] Add XPU op kernels (#3274) · 99deb7d9
      Committed by hong19860320
      * [LITE][XPU] bind xpu resnet50 kernels
      
      * [LITE][XPU] fuse resnet50 and encoder
      
      * [LITE][XPU] bind xpu bert kernels
      
      * [LITE][XPU] refine xpu_resnet_fuse_pass.cc
      
      * [LITE][XPU] add xpu stack kernel
      
      * [LITE][XPU] add xpu slice/tanh kernel
      
      * [LITE][XPU] refine resnet50 and encoder fusor
      
      * [LITE][XPU] split resnet50 and multi_encoder op from subgraph_op.h
      
      * [LITE][XPU] clean workspace
      
      * [LITE][XPU] add build script
      
      * [LITE][XPU] fix compilation errors
      
      * [LITE][XPU] fix kernel matmul
      
      * [LITE][XPU] fix kernel ewadd ewsub
      
      * [LITE][XPU] add xpu cast kernel
      
      * [LITE][XPU] fix kernel slice
      
      * [LITE][XPU] switch dev by LITE_XPU_DEV env
      
      * [LITE][XPU] eliminate useless cast op
      
      * [LITE][XPU] add PerThread Ops
      
      * [LITE][X86] add SequenceUnpad op and kernel
      
      * [LITE][XPU] add LITE_WITH_XTCL option
      
      * [LITE][X86] add SequenceConv kernel
      
      * [LITE][XPU] fix cmake dependency
      
      * [LITE][XPU] add xpu sigmoid kernel
      
      * [XPU] Remove the dependencies of framework.pb.h
      test=develop
      
      Change-Id: Icfb44efb0482a6369b365b5c09017765328fc10d
      
      * [XPU] Fix the precision of cast kernel
      test=develop
      
      Change-Id: Icb18be47d7ab490de9fb9c92eae1165f49dbf492
      
      * [Core] Fix the compiling error when build for the target that disable XPU
      test=develop
      
      Change-Id: I38ec53f222391d3bf06b70512e6c3ad1282e4683
      
      * [XPU] Add io_copy kernel for xpu<->arm
      test=develop
      
      Change-Id: Iec7ea066f040534285557f9948b73e6a1970aed7
      
      * fix
      test=develop
      
      Change-Id: I4db1c93df48e22afbba904ce6c3b0babd9fda4c3
      
      * fix target matching of type_target_cast_pass and remove the unnecessary registration of io_copy kernel
      test=develop
      
      Change-Id: I432c10c9d1064e778d43fd0d12d8cf0599252f7a
      
      * [X86] Add the keyword 'template' to avoid the compiling errors
      test=develop
      
      Change-Id: I015d5d323adafb3884029c8287ced66c90ad931e
      
      * Fix the build.sh for XPU and x86
      test=develop
      
      Change-Id: I7d9575243669ce02af69a8ddbd6421db31902bd6
      
      * [XPU] Add the keyword 'template' to avoid the compiling errors
      test=develop
      
      Change-Id: I46d0b3b6861286a73ee2999934b8e185e453e749
      
      * [XPU] Add XTCL compiling option in build.sh
      test=develop
      
      Change-Id: I8b3fd998ca5f898d5bd2e665646e3874b3b73c80
      
      * fix namespace conflicts, test=develop
      
      * [API][XPU] Move the XPU related APIs into CxxConfig
      test=develop
      
      Change-Id: I75ac35e8bae96bcb835683f413f01b9db45afbf9
      
      * [API][XPU] Remove the LITE_WITH_XPU in paddle_api.h
      test=develop
      
      Change-Id: Idbd64013bdf331ad876919511c1c349332d46f93
      
      * [API][XPU] Remove XPUSetWorkspaceL3SizePerThread and XPUSetDevPerThread
      test=develop
      
      Change-Id: I515958f56f8e129280bae61c923513cc91fb9728
      
      * [API][Core][XPU] Refine the test case and remove the necessary modifications
      test=develop
      
      Change-Id: I1e0e2957a2f9d5f4207b06c0bc98a5ab611fee56
      
      * [Core] Remove useless code
      test=develop
      
      Change-Id: I6293faa10424aea2836d09d85ddb6a30f7811678
      
      * [XPU] Refine the test cases
      test=develop
      
      Change-Id: I6818fc3addf1bca5b96a7d66ee99263242e3374f
      
      * [XPU] Remove useless scripts and code
      test=develop
      
      Change-Id: I965ba6712d3cf881d0038f0473fec27d4c1bc684
      
      * [XPU] Use InferShapeImpl in sequence_unpad, resnet50 and multi_encoder op
      test=develop
      
      Change-Id: I5375f524d36836a394d426b4b2bc9fb44be0b59c
      
      * test=develop
      
      Change-Id: I42ee68c8a5e891dd0f3e95d6cfbc498be7cf1519
      
      * test=develop
      
      Change-Id: If679e5aa73e1368e0ee5bd5f286d2e1b4c2f354e
      
      * [XPU] Add __xpu__ prefix to the op and graph pass name of resnet50 and multi_encoder
      test=develop
      
      Change-Id: Idb61c99b4b8429cb87665bfd6835ab4d7d263be2
      
      * [XPU] Fix and refine the xpu fuse pass
      test=develop
      
      Change-Id: If1c5b6788d994e2809c1a00d9384685a89440907
      
      * test=develop
      
      Change-Id: Icfa333e322fc4351700103692c46cfcb3d4f9a89
      
      * [XPU] Remove the dependency on xpu api for xpu fuse passes
      test=develop
      
      Change-Id: I6094b5536f58ae18bab068284b32f9bd10a2ab92
      
      * [XPU] Move unit tests from lite/api to lite/tests/api
      test=develop
      
      Change-Id: I7ba27abb23abeffb0c95fdbbefec7ac16cdbd250
      
      * test=develop
      
      Change-Id: I33230c84d6c4e61bf19f46668bae2baa3ef68794
      
      * [XPU] Refine code
      test=develop
      
      Change-Id: I37bc5b948b4927e44cd3ea2594ebe3fd7671be06
      
      * [XPU] Add env XPU_ENABLE_XTCL to enable xpu_subgraph_pass
      test=develop
      
      Change-Id: Ifb8e07e86f307f562adaca3ce792015a6f2a2204
      
      * [XPU] refine code
      test=develop
      
      Change-Id: I1380654b930d51ae704dbc0cd855464d9c3b5b79
      
      * [XPU] Refine code
      test=develop
      
      Change-Id: I73285c2718ccd3612490eb2635bef4fd608c9bde
      
      * [XPU] Add comments for the XPU APIs
      test=develop
      
      Change-Id: Ieb5015f37984f8869b90c4c625c5894bb26164fd
      Co-authored-by: miaotianxiang <miaotianxiang@baidu.com>
      Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
  21. 07 Apr, 2020 1 commit
  22. 31 Mar, 2020 1 commit
  23. 25 Mar, 2020 1 commit
  24. 04 Mar, 2020 1 commit
  25. 14 Jan, 2020 1 commit
  26. 13 Dec, 2019 1 commit
  27. 21 Nov, 2019 1 commit
  28. 20 Nov, 2019 1 commit
  29. 13 Nov, 2019 1 commit
  30. 05 Nov, 2019 1 commit
  31. 28 Oct, 2019 1 commit
    • [LITE][XPU] initial support for XPU (#2202) · 06d058fe
      Committed by hong19860320
      * Initial support for XPU
      * Fix compiling errors of XPU
      * Move XPU op kernel bridges from backends to kernels to fix deps order
      * Change the namespace and directory of XPU bridges
      * Add XPU SDK
      * Fix header files and namespace of XPU SDK
      * Add unit tests for relu and conv2d ops
      * Restore the modification of paddle_api_test
      * Support a simple model that contains only a relu layer
      * Add compiling scripts for XPU
      * Fix compiling errors of XPU
      * Add comments for XPU LoadModel and BuildModel
  32. 15 Oct, 2019 1 commit
    • [NPU] Fix and refine the supporting of multi NPU models (#2037) · 7a731b7f
      Committed by hong19860320
      * [NPU] Fix the bug of loading multi NPU models
      test=develop
      
      * [NPU] Use lite tensor to store NPU model, fix the management of multi NPU models, support loading NPU model from memory and reduce the modification of framework
      test=develop
      
      * [NPU] Remove redundant header files for NPU bridges,
      test=develop
      
      * [NPU] fix NPU deps
      test=develop
      
      * [NPU] refine the compiling script for NPU
      test=develop
      
      * [NPU] remove redundant subdirectory in lite/CMakeLists.txt
      test=develop
      
      * [NPU] Fix and refine NPU test case
      test=develop
      
      * [NPU] revoke the modification of other non-NPU modules
      test=develop
      
      * [NPU] Remove NPU bridges if target is tiny publish
      test=develop
  33. 11 Oct, 2019 1 commit
    • [LITE][OPENCL] support image2d type (#2158) · 77cdbdce
      Committed by Yuan Shuai
      * [LITE][OPENCL] support image2d. test=develop
      
      * add context changed with consider image*. test=develop
      
      * add layout, relu image kernels. test=develop
      
      * replace image_data with data, mutable_image_data with mutable_data, test=develop
      
      * comment unused var. test=develop
      
      * remove unused var. test=develop
  34. 27 Sep, 2019 1 commit
    • can run yolov3 fp32 on cuda devices (#2092) · 3d6d744f
      Committed by Zhaolong Xing
      * add conv int8 support (for the case where the input or output channel count is not a multiple of 4)
      add add_kernel for cuda.
      
      * can run yolov3 fp32
      test=develop
      
      * 1. fix bug with yolov3 run
      test=develop
  35. 19 Sep, 2019 1 commit
  36. 11 Sep, 2019 1 commit
  37. 03 Sep, 2019 2 commits
  38. 27 Aug, 2019 1 commit
  39. 24 Aug, 2019 1 commit