1. 14 Feb, 2022 · 1 commit
    • [Bug fix] prevent squashing pair u8 dequantize -> s8 quantize (#39346) · 66b5348e
      Committed by Sylwester Fraczek
      * prevent squashing pair u8 dequantize -> s8 quantize
      
      * add relu op to check for uint8
      
      * fix ptq fc attr name fuse_activation->activation_type
      
      * fix
      
      * add unit test
      
      * remove unused variable
      
      * test fix unsuccessful
      
      * fix test and logic
      
      * multiline comment
      
      * remove cout
      
      * Revert "fix ptq fc attr name fuse_activation->activation_type"
      
      This reverts commit ffd023353a5e9b0fd15e81b9e9f9fe1794035017.
      
      * fix ptq fc attr name fuse_activation->activation_type
  2. 11 Feb, 2022 · 1 commit
  3. 10 Feb, 2022 · 1 commit
  4. 09 Feb, 2022 · 1 commit
  5. 27 Jan, 2022 · 1 commit
    • [PluggableDevice] Add custom kernel support based on pten kernel management (#38848) · a8879215
      Committed by Aganlengzi
      * [Demo] custom kernel based on pten kernel
      
      * merge and npu custom work well
      
      * del comments
      
      * delete other code
      
      * fix CUDAContext
      
      * fix not found small_vector.h
      
      * support NPU
      
      * fix NPUContext
      
      * fix DeviceContext support
      
      * add UT
      
      * fix call
      
      * add UT
      
      * fix
      
      * fix for comments and ut
      
      * add MACRO control
      
      * fix multi input output
      
      * support env CUSTOM_DEVICE_ROOT
      
      * deal with special cases
      
      * fix for Windows
      
      * try coverage with test_custom_kernel_dot.py
      
      * fix test_custom_kernel_dot
      
      * fix test_custom_kernel_dot
      
      * fix merge
      
      * fix merge
      
      * fix CI
      
      * update
      
      * merge and fix
      
      * remove WITH_CUSTOM_KERNEL
      
      * fix merge
      
      * merge and fix
      
      * fix ut
      
      * fix ut for mac
      
      * add more UT
      
      * add more UT
      
      * fix
  6. 26 Jan, 2022 · 1 commit
  7. 25 Jan, 2022 · 1 commit
    • [Move selected_rows PR #3] Change the relationship of [include/Cmake]. (#39128) · 2bafd338
      Committed by Weilong Wu
      * Added selected_rows and rw_lock to pten
      
      * Renamed the unit test target to fix CI
      
      * Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid
      
      * Remove rw_lock.h,rw_lock_test.cc in fluid
      
      * Use pten::RWLock and pten::AutoRDLock, fix CI
      
      * Use pten::SelectedRows
      
      * Use pten::SelectedRows
      
      * Fix to pass NPU CI
      
      * Use pten::SelectedRows, to pass NPU CI
      
      * To fix NPU CI
      
      * To fix NPU CI again
  8. 24 Jan, 2022 · 1 commit
  9. 18 Jan, 2022 · 2 commits
    • Mish FP32/BF16 kernel, conv and fc fuse passes (#38623) · 1d18bc2c
      Committed by Sławomir Siwek
      * Mish
      
      * Change exp() library
      
      * mish fuse pass
      
      * mish attrs
      
      * fixes
      
      * mishop maker
      
      * remove attrs
      
      * mish kernel for bf16
      
      * fc+mish fuse
      
      * fix code format error
      
      * Resolve merge conflicts
      
      * Update mish operator version
      
      * update mish variable to new naming convention
    • [Unify Tensors PR #8] Merged Tensor into DenseTensor, test=allcases (#38914) · 2052f1e3
      Committed by Zhanlue Yang
      * Merged LoDTensor with Tensor,test=allcases
      
      * Patched python level LoDTensor
      
      * Patched python level LoDTensor
      
      * Merge Tensor into DenseTensor
      
      * Fixed namespace issues,test=allcases
      
      * Fixed merge issues
      
      * Fixed inference issues
      
      * Fixed NPU test issues
      
      * Fixed merge issues
  10. 17 Jan, 2022 · 1 commit
    • [Pten] Replace platform::Place to pten::Place. (#38899) · c48a9ad5
      Committed by Wilber
      * add pten::Place data structure.
      
      * update ci problem
      
      * fix ci problem
      
      * update
      
      * using platform::Place=pten::Place
      
      * remove BOOST_GET_CONST for CPUPlace and GPUPlace
      
      * compile pass 25%.
      
      * compile pass 45%
      
      * compile pass 60%
      
      * remove boost_get for xpu npu mlu and ipu
      
      * compile pass on cpu and gpu.
      
      * fix compile problem
      
      * fix compile error.
      
      * update
      
      * fix ci problem
      
      * update
      
      * ci approve
      
      * fix ci problem
      
      * fix ci eager test problem
      
      * remove BOOST_GET_CONST
      
      * fix npu compile
  11. 15 Jan, 2022 · 1 commit
  12. 14 Jan, 2022 · 1 commit
    • add flatten_contiguous_range OpConvert for Paddle-TRT (#38922) · 050aa6fe
      Committed by heliqi
      * add trt_convert_flatten_contiguous_rang op
      
      * trt version >7, support trt_convert_flatten_contiguous_rang
      
      * trt version >7, support trt_convert_flatten_contiguous_rang
      
      * trt version >7, support trt_convert_flatten_contiguous_rang
      
      * test case add trt version >=7 skip
  13. 13 Jan, 2022 · 2 commits
  14. 10 Jan, 2022 · 1 commit
    • [Unify Tensors PR #5] framework::Tensor inherits from DenseTensor,test=allcases (#38632) · 5c73a6ea
      Committed by Zhanlue Yang
      * Added shared_ptr<Allocation> member & corresponding interfaces to Storage
      
      * Removed original pten::Allocation from Storage and adjusted the interfaces accordingly
      
      * Fixed issues with storage offset
      
      * Used place to malloc allocation for TensorStorage
      
      * [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor
      
      * Fixed issues with place
      
      * Added comments
      
      * Moved mutable_data with stream argument to DenseTensor
      
      * Added set_offset interface
      
      * Fixed CI issues,test=allcases
      
      * [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor
      
      * Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor
      
      * Modified framework::Tensor to inherit from DenseTensor
      
      * Reverted changes to pten_layout() interface
      
      * Removed friend classes
      
      * Rearranged cfunction calls from tensor.data<void>() to tensor.data()
      
      * Fixed CI issues
      
      * Fixed lite issues
      
      * Fixed data() interface issues,test=allcases
      
      * Resolved IsInitialized() issues
      
      * Fixed ResetHolder() issues
      
      * Fixed MKLDNN & Storage issues
      
      * Resolved ShareBufferWith() issues
      
      * Fixed LoD issues
  15. 05 Jan, 2022 · 1 commit
  16. 31 Dec, 2021 · 1 commit
  17. 30 Dec, 2021 · 2 commits
  18. 23 Dec, 2021 · 2 commits
  19. 20 Dec, 2021 · 2 commits
  20. 17 Dec, 2021 · 1 commit
  21. 15 Dec, 2021 · 2 commits
  22. 14 Dec, 2021 · 2 commits
  23. 07 Dec, 2021 · 1 commit
  24. 03 Dec, 2021 · 2 commits
  25. 29 Nov, 2021 · 1 commit
  26. 27 Nov, 2021 · 1 commit
    • [NPU] reorganization for device API abstraction (#37110) · 72241a6a
      Committed by Aganlengzi
      * [NPU] reorganization for device API abstraction
      
      * [NPU] delete old files
      
      * [NPU] fix npu_collective_helper
      
      * [NPU] fix collective_helper
      
      * [NPU] fix ut
      
      * [NPU] mod memory allocation and hccl_helper
      
      * [NPU] fix place_type
      
      * [NPU] split enforce.h
      
      * move acl* call into npu_info
      
      * merge conflict
      
      * fix merge
      
      * merge conflict
      
      * merge conflict
  27. 24 Nov, 2021 · 1 commit
  28. 19 Nov, 2021 · 1 commit
  29. 15 Nov, 2021 · 1 commit
    • [Pten] Refactor the implementation of custom operator (#37122) · 1e598f1a
      Committed by Chen Weihang
      * move extension into pten [no-verify]
      
      * append tensor methods by ext_tensor [no-verify]
      
      * append other tensor methods [no-verify]
      
      * ext related files tidy [no-verify]
      
      * include relation tidy [no-verify]
      
      * add pten tensor test [no-verify]
      
      * replace tensor in custom op & compile success
      
      * refine tensor constructor for unittest
      
      * custom relu jit run success
      
      * fix all custom op unittests
      
      * add inference cmake adapt [no-verify]
      
      * fix failed unittests
      
      * fix windows failed unittests
      
      * try to fix kunlun and inference failed
      
      * fix test_elementwise_api error
      
      * try to fix win compile failed
      
      * fix kunlun fp16 type error
      
      * remove useless handle error macro
      
      * add custom linear op test
      
      * fix compile failed & add win symbols
      
      * fix non pten kernel cast failed
      
      * add dll decl for api
      
      * polish several details
      
      * polish details by review comment
      
      * add dll_decl for register
  30. 11 Nov, 2021 · 1 commit
    • Added softplus + activation oneDNN fuse pass (#36657) · a346c4dc
      Committed by jakpiase
      * added softplus + activation fuse pass
      
      * minor change
      
      * implemented reviewer suggestion
      
      * minor fix
      
      * minor fix
      
      * added scale_out parameter
      
      * minor fix
      
      * fix for iScan CI
      
      * conditionally disabled logs
      
      * refactored pass builder
  31. 02 Nov, 2021 · 1 commit
  32. 27 Oct, 2021 · 1 commit