1. 27 Jan, 2022 1 commit
    • [PluggableDevice] Add custom kernel support based on pten kernel management (#38848) · a8879215
      Aganlengzi committed
      * [Demo] custom kernel based on pten kernel
      
      * merge and npu custom work well
      
      * del comments
      
      * delete other code
      
      * fix CUDAContext
      
      * fix not found small_vector.h
      
      * support NPU
      
      * fix NPUContext
      
      * fix DeviceContext support
      
      * add UT
      
      * fix call
      
      * add UT
      
      * fix
      
      * fix for comments and ut
      
      * add MACRO control
      
      * fix multi input output
      
      * support env CUSTOM_DEVICE_ROOT
      
      * deal with special cases
      
      * fix for Windows
      
      * try coverage with test_custom_kernel_dot.py
      
      * fix test_custom_kernel_dot
      
      * fix test_custom_kernel_dot
      
      * fix merge
      
      * fix merge
      
      * fix CI
      
      * update
      
      * merge and fix
      
      * remove WITH_CUSTOM_KERNEL
      
      * fix merge
      
      * merge and fix
      
      * fix ut
      
      * fix ut for mac
      
      * add more UT
      
      * add more UT
      
      * fix
  2. 26 Jan, 2022 1 commit
  3. 25 Jan, 2022 1 commit
    • [Move selected_rows PR #3] Change the relationship of [include/Cmake]. (#39128) · 2bafd338
      Weilong Wu committed
      * Added selected_rows and rw_lock to pten
      
      * Renamed the unit test target to fix CI
      
      * Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid
      
      * Remove rw_lock.h,rw_lock_test.cc in fluid
      
      * Use pten::RWLock and pten::AutoRDLock, fix CI
      
      * Use pten::SelectedRows
      
      * Use pten::SelectedRows
      
      * Fix to pass NPU CI
      
      * Use pten::SelectedRows, to pass NPU CI
      
      * To fix NPU CI
      
      * To fix NPU CI again
  4. 24 Jan, 2022 1 commit
  5. 18 Jan, 2022 2 commits
    • Mish FP32/BF16 kernel, conv and fc fuse passes (#38623) · 1d18bc2c
      Sławomir Siwek committed
      * Mish
      
      * Change exp() library
      
      * mish fuse pass
      
      * mish attrs
      
      * fixes
      
      * mishop maker
      
      * remove attrs
      
      * mish kernel for bf16
      
      * fc+mish fuse
      
      * fix code format error
      
      * Resolve merge conflicts
      
      * Update mish operator version
      
      * update mish variable to new naming convention
    • [Unify Tensors PR #8] Merged Tensor into DenseTensor, test=allcases (#38914) · 2052f1e3
      Zhanlue Yang committed
      * Merged LoDTensor with Tensor,test=allcases
      
      * Patched python level LoDTensor
      
      * Patched python level LoDTensor
      
      * Merge Tensor into DenseTensor
      
      * Fixed namespace issues,test=allcases
      
      * Fixed merge issues
      
      * Fixed inference issues
      
      * Fixed NPU test issues
      
      * Fixed merge issues
  6. 17 Jan, 2022 1 commit
    • [Pten] Replace platform::Place to pten::Place. (#38899) · c48a9ad5
      Wilber committed
      * add pten::Place data structure.
      
      * update ci problem
      
      * fix ci problem
      
      * update
      
      * using platform::Place=pten::Place
      
      * remove BOOST_GET_CONST for CPUPlace and GPUPlace
      
      * compile pass 25%.
      
      * compile pass 45%
      
      * compile pass 60%
      
      * remove boost_get for xpu npu mlu and ipu
      
      * compile pass on cpu and gpu.
      
      * fix compile problem
      
      * fix compile error.
      
      * update
      
      * fix ci problem
      
      * update
      
      * ci approve
      
      * fix ci problem
      
      * fix ci eager test problem
      
      * remove BOOST_GET_CONST
      
      * fix npu compile
  7. 15 Jan, 2022 1 commit
  8. 14 Jan, 2022 1 commit
    • add flatten_contiguous_range OpConvert for Paddle-TRT (#38922) · 050aa6fe
      heliqi committed
      * add trt_convert_flatten_contiguous_rang op
      
      * trt version >7,support trt_convert_flatten_contiguous_rang
      
      * trt version >7,support trt_convert_flatten_contiguous_rang
      
      * trt version >7,support trt_convert_flatten_contiguous_rang
      
      * test case add trt version >=7 skip
  9. 13 Jan, 2022 2 commits
  10. 10 Jan, 2022 1 commit
    • [Unify Tensors PR #5] framework::Tensor inherits from DenseTensor,test=allcases (#38632) · 5c73a6ea
      Zhanlue Yang committed
      * Added shared_ptr<Allocation> member & corresponding interfaces to Storage
      
      * Removed original pten::Allocation from Storage and adjusted the interfaces accordingly
      
      * Fixed issues with storage offset
      
      * Used place to malloc allocation for TensorStorage
      
      * [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor
      
      * Fixed issues with place
      
      * Added comments
      
      * Moved mutable_data with stream argument to DenseTensor
      
      * Added set_offset interface
      
      * Fixed CI issues,test=allcases
      
      * [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor
      
      * Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor
      
      * Modified framework::Tensor to inherit from DenseTensor
      
      * Reverted changes to pten_layout() interface
      
      * Removed friend classes
      
      * Rearranged cfunction calls from tensor.data<void>() to tensor.data()
      
      * Fixed CI issues
      
      * Fixed lite issues
      
      * Fixed data() interface issues,test=allcases
      
      * Resolved IsInitialized() issues
      
      * Fixed ResetHolder() issues
      
      * Fixed MKLDNN & Storage issues
      
      * Resolved ShareBufferWith() issues
      
      * Fixed LoD issues
  11. 05 Jan, 2022 1 commit
  12. 31 Dec, 2021 1 commit
  13. 30 Dec, 2021 2 commits
  14. 23 Dec, 2021 2 commits
  15. 20 Dec, 2021 2 commits
  16. 17 Dec, 2021 1 commit
  17. 15 Dec, 2021 2 commits
  18. 14 Dec, 2021 2 commits
  19. 07 Dec, 2021 1 commit
  20. 03 Dec, 2021 2 commits
  21. 29 Nov, 2021 1 commit
  22. 27 Nov, 2021 1 commit
    • [NPU] reorganization for device API abstraction (#37110) · 72241a6a
      Aganlengzi committed
      * [NPU] reorganization for device API abstraction
      
      * [NPU] delete old files
      
      * [NPU] fix npu_collective_helper
      
      * [NPU] fix collective_helper
      
      * [NPU] fix ut
      
      * [NPU] mod memory allocation and hccl_helper
      
      * [NPU] fix place_type
      
      * [NPU] split enforce.h
      
      * move acl* call into npu_info
      
      * merge conflict
      
      * fix merge
      
      * merge conflict
      
      * merge conflict
  23. 24 Nov, 2021 1 commit
  24. 19 Nov, 2021 1 commit
  25. 15 Nov, 2021 1 commit
    • [Pten] Refactor the implementation of custom operator (#37122) · 1e598f1a
      Chen Weihang committed
      * move extension into pten [no-verify]
      
      * append tensor methods by ext_tensor [no-verify]
      
      * append other tensor methods [no-verify]
      
      * ext related files tidy [no-verify]
      
      * include relation tidy [no-verify]
      
      * add pten tensor test [no-verify]
      
      * replace tensor in custom op & compile success
      
      * refine tensor constructor for unittest
      
      * custom relu jit run success
      
      * fix all custom op unittests
      
      * add inference cmake adapt [no-verify]
      
      * fix failed unittests
      
      * fix windows failed unittests
      
      * try to fix kunlun and inference failed
      
      * fix test_elementwise_api error
      
      * try to fix win compile failed
      
      * fix kunlun fp16 type error
      
      * remove useless handle error macro
      
      * add custom linear op test
      
      * fix compile failed & add win symbols
      
      * fix non pten kernel cast failed
      
      * add dll decl for api
      
      * polish several details
      
      * polish details by review comment
      
      * add dll_decl for register
  26. 11 Nov, 2021 1 commit
    • Added softplus + activation oneDNN fuse pass (#36657) · a346c4dc
      jakpiase committed
      * added softplus + activation fuse pass
      
      * minor change
      
      * implemented reviewer suggestion
      
      * minor fix
      
      * minor fix
      
      * added scale_out parameter
      
      * minor fix
      
      * fix for iScan CI
      
      * conditionally disabled logs
      
      * refactored pass builder
  27. 02 Nov, 2021 1 commit
  28. 27 Oct, 2021 3 commits
  29. 26 Oct, 2021 2 commits
    • [Paddle-Inference]Add MatmulV2ToMatmul convert Pass, fix (matmul_v2, matmul, mul) convert pass, fix (matmul, mul) op_teller (#36652) · 93c591e2
      Wangzheee committed
      * new_Matmul2ToMatmulToMul
      
      * new_Matmul2ToMatmulToMul
      
      * fix paddle_pass_builder
      
      * fix paddle_pass_builder
      
      * fix paddle_pass_builder
      
      * tem
      
      * tem
      
      * Add MatmulV2ToMatmul convert Pass; MatmulV2ToMul convert Pass
      
      * Add MatmulV2ToMatmul convert Pass; MatmulV2ToMul convert Pass
      
      * add matmul_broadcast_unitest
      
      * fix op_teller
    • Pool3d 2.0 (#36545) · 229bae81
      feng_shuai committed