1. 19 Feb, 2022 1 commit
    • [Pten] Add selected_rows kernel for Full (#39465) · 79f8eeca
      zyfncg committed
      * Add selected_rows kernel for full
      
      * remove fill_constant register in fluid
      
      * fix bug without GPU
      
      * add jit_kernel_helper dependency for fc
      
      * do some refactor
      
      * add unittest for ops signatures
      
      * add coverage unittest
      
      * fix merge conflict
      
      * fix full selected_rows bug
  2. 18 Feb, 2022 1 commit
  3. 16 Feb, 2022 2 commits
  4. 15 Feb, 2022 4 commits
    • [PluggableDevice] Add custom runtime support (#38740) · 3e7825f3
      ronnywang committed
      * [CustomRuntime] Add DeviceManager
      
      * [CustomRuntime] Add DeviceInterface
      
      * [CustomRuntime] Add Stream, Event, DeviceGuard, CallbackManager
      
      * [CustomRuntime] Add plug-in device
      
      * [CustomRuntime] Memory module support PluggableDevice
      
      * [CustomRuntime] Add WITH_PLUGGABLE_DEVICE cmake option
      
      * update
      
      * [API] update API doc based on comments, test=develop
      Co-authored-by: qili93 <qili93@qq.com>
    • [Pten] move paddle/operators/math/functors.h and compound_functors.h (#39514) · 0d46a108
      Feiyu Chan committed
      * move paddle/operators/math/functors.h
      * move paddle/operators/math/compound_functors.h
    • move algorithm.h (#39502) · 7eb9593e
      Feiyu Chan committed
      Move paddle/fluid/operators/math/algorithm.h to paddle/pten/kernels/funcs and rename all references to symbols in it.
    • [PTen]Migrate proto::VarType outside of Pten (#39411) · 7e7e9404
      Aurelius84 committed
      * #1 migrate dist-related type()-> dtype()
      
      * move datatype function from pten -> fluid/framework
      
      * change type() in imperative into convert(dtype())
      
      * modify xx_tensor->type into xx_tensor->dtype
      
      * change the set_type interface and the caller
      
      * modify xx_tensor.type into xx_tensor.dtype
      
      * fix mutable_data(place, dtype())
      
      * change caller of mutable_data in pten and distributed
      
      * change the caller of mutable_data in fluid/framework
      
      * change the caller of mutable_data in imperative directory
      
      * mutable_data: inference
      
      * update the call of mutable_data
      
      * transfer MakePenScalarArray MakePtenScalar ResetHolderWithType
      
      * pass the compile. the next step is remove VarType in Pten
      
      * fix all and remove VarType from pten. success in linux. Next task is other platform
      
      * fix conflict with develop
      
      * fix compiled error
      
      * Fix reset conversion
      
      * fix conflict
      
      * fix compiled problem
      
      * fix typo
      
      * Fix << in tensor_utils.cc
      
      * fix type->dtype
      
      * fix unittest
      
      * fix tensor init constructor
      
      * fix DataTypeSize for BFloat16
      
      * fix code style
      
      * fix npu compiled error
      
      * fix npu
      
      * compile npu successfully
      
      * fix conflict
      
      * fix conflict
      Co-authored-by: xiongkun <xiongkun03@baidu.com>
  5. 11 Feb, 2022 1 commit
  6. 09 Feb, 2022 1 commit
  7. 08 Feb, 2022 1 commit
  8. 06 Feb, 2022 1 commit
  9. 27 Jan, 2022 1 commit
  10. 25 Jan, 2022 1 commit
    • [Move selected_rows PR #3] Change the relationship of [include/Cmake]. (#39128) · 2bafd338
      Weilong Wu committed
      * Added selected_rows and rw_lock to pten
      
      * Renamed the unit test target to fix CI
      
      * Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid
      
      * Remove rw_lock.h,rw_lock_test.cc in fluid
      
      * Use pten::RWLock and pten::AutoRDLock, fix CI
      
      * Use pten::SelectedRows
      
      * Use pten::SelectedRows
      
      * Fix to pass NPU CI
      
      * Use pten::SelectedRows, to pass NPU CI
      
      * To fix NPU CI
      
      * To fix NPU CI again
  11. 24 Jan, 2022 2 commits
    • [Pten]Refactor elementwise_add grad / double grad / triple grad Kernel and... · 3bf3a6ee
      YuanRisheng committed
      [Pten]Refactor elementwise_add grad / double grad / triple grad Kernel and move them to pten (#39048)
      
      * refactor elementwise add grad
      
      * fix compile bugs
      
      * fix unit test bugs
      
      * fix file conflicts
      
      * fix bugs when buildPtenContext
    • support sparse of adam, *test=kunlun (#38483) · e106901e
      z8hanghuan committed
      * support sparse of adam, *test=kunlun
      
      * add pre-commit-config.yaml
      
      * support sparse of adam in KL2,*test=kunlun
      
      * support sparse of adam in KL2, *test=kunlun
      
      * modify xpu.cmake, *test=kunlun
      
      * support sparse of adam, rm some wait, *test=kunlun
      
      * support sparse of adam, rm some wait, *test=kunlun
      
      * support sparse of adam, *test=kunlun
      
      * support sparse of adam, *test=kunlun
      
      * support sparse of adam, *test=kunlun
      
      * support sparse of adam, *test=kunlun
      
      * support sparse of adam, *test=kunlun
  12. 21 Jan, 2022 4 commits
  13. 20 Jan, 2022 1 commit
  14. 18 Jan, 2022 1 commit
  15. 17 Jan, 2022 2 commits
    • [Pten] Replace platform::Place with pten::Place. (#38899) · c48a9ad5
      Wilber committed
      * add pten::Place data structure.
      
      * update ci problem
      
      * fix ci problem
      
      * update
      
      * using platform::Place=pten::Place
      
      * remove BOOST_GET_CONST for CPUPlace and GPUPlace
      
      * compile pass 25%.
      
      * compile pass 45%
      
      * compile pass 60%
      
      * remove boost_get for xpu npu mlu and ipu
      
      * compile pass on cpu and gpu.
      
      * fix compile problem
      
      * fix compile error.
      
      * update
      
      * fix ci problem
      
      * update
      
      * ci approve
      
      * fix ci problem
      
      * fix ci eager test problem
      
      * remove BOOST_GET_CONST
      
      * fix npu compile
    • add squared_l2_norm (#38968) · 6eeb16b8
      sneaxiy committed
  16. 15 Jan, 2022 1 commit
  17. 13 Jan, 2022 1 commit
  18. 12 Jan, 2022 2 commits
    • [PTen] Remove hybird dir (#38863) · 5f5f626b
      Chen Weihang committed
      * remove hybird dir
      
      * resolve conflict
    • Adjust wrapper of gpu_launch_config (#38654) · f5166284
      limingshu committed
      * first commit
      
      * fix wrong filename
      
      * fix the wrong spell name
      
      * fix gpu config wrapper
      
      * modify according to pr advices
      
      * fix GpuLauchConfig1D api bugs
      
      * change the config for dropout grad
      
      * fix bugs
      
      * modification according to pr advices
      
      * modification according to pr advices
  19. 11 Jan, 2022 1 commit
    • [PTen] Add dot and matmul grad kernel in pten (#38713) · be817719
      zyfncg committed
      * refactor matmul directory in pten
      
      * fix merge conflict
      
      * add dot_grad kernel
      
      * add dot_grad kernel in pten
      
      * add matmul_grad kernel
      
      * update the code
      
      * delete useless code in fluid
      
      * fix some bug of running matmul grad kernel
      
      * fix merge conflict
      
      * refactor some code
      
      * refactor code
  20. 10 Jan, 2022 2 commits
    • [Unify Tensors PR #5] framework::Tensor inherits from DenseTensor,test=allcases (#38632) · 5c73a6ea
      Zhanlue Yang committed
      * Added shared_ptr<Allocation> member & corresponding interfaces to Storage
      
      * Removed original pten::Allocation from Storage and adjusted the interfaces accordingly
      
      * Fixed issues with storage offset
      
      * Used place to malloc allocation for TensorStorage
      
      * [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor
      
      * Fixed issues with place
      
      * Added comments
      
      * Moved mutable_data with stream argument to DenseTensor
      
      * Added set_offset interface
      
      * Fixed CI issues,test=allcases
      
      * [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor
      
      * Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor
      
      * Modified framework::Tensor to inherit from DenseTensor
      
      * Reverted changes to pten_layout() interface
      
      * Removed friend classes
      
      * Rearranged cfunction calls from tensor.data<void>() to tensor.data()
      
      * Fixed CI issues
      
      * Fixed lite issues
      
      * Fixed data() interface issues,test=allcases
      
      * Resolved IsInitialized() issues
      
      * Fixed ResetHolder() issues
      
      * Fixed MKLDNN & Storage issues
      
      * Resolved ShareBufferWith() issues
      
      * Fixed LoD issues
    • Add MaxUnPool3D op and MaxUnPool1D op (#38716) · 7e31542c
      andyjpaddle committed
      * add maxunpool3d op
      
      * update doc for maxunpool3d op
      
      * update doc for maxunpool3d op
      
      * update doc for maxunpool3d op
      
      * update sample code for maxunpool3d
      
      * add maxunpool1d op
      
      * update some code for maxunpool1d
  21. 30 Dec, 2021 2 commits
  22. 24 Dec, 2021 1 commit
  23. 20 Dec, 2021 2 commits
  24. 17 Dec, 2021 1 commit
    • add launch bound to limit the registers usage for volta architecture (#38113) · 18a59822
      zlsh80826 committed
      According to --ptxas-options=-v, SegmentOpsKernel uses 66 registers in a block.
      There are two ways to resolve this problem:
          * Reduce the threads-per-block launch configuration.
          * Add __launch_bounds__ so that the nvcc compiler has the information it needs to reduce register usage.
      This PR chooses the __launch_bounds__ solution because changing gpu_launch_config may affect other ops.
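The register arithmetic behind the commit above can be sketched in a few lines of host-side C++. This is only an illustration under stated assumptions: it reads the reported 66 registers as a per-thread figure (which is how ptxas normally reports usage), it assumes the kernel may be launched with 1,024 threads per block, and it ignores register allocation granularity. The constants are the documented Volta (compute capability 7.0) limits; the function names are hypothetical and do not come from the PR.

```cpp
#include <cstdint>

// Documented per-block limits for Volta (compute capability 7.0).
constexpr std::uint32_t kRegistersPerBlockLimit = 65536;  // 32-bit registers
constexpr std::uint32_t kThreadsPerBlockLimit = 1024;

// Hypothetical helper: true if a block of `threads` threads, each using
// `regs_per_thread` registers, fits in the per-block register budget.
// Simplification: allocation granularity is ignored.
constexpr bool BlockFitsRegisterBudget(std::uint32_t regs_per_thread,
                                       std::uint32_t threads) {
  return static_cast<std::uint64_t>(regs_per_thread) * threads <=
         kRegistersPerBlockLimit;
}

// Hypothetical helper: the largest per-thread register count that still
// allows a block of `threads` threads to launch. Requires threads > 0.
constexpr std::uint32_t MaxRegistersPerThread(std::uint32_t threads) {
  return kRegistersPerBlockLimit / threads;
}

// Hypothetical helper: the largest block size that fits when every thread
// uses `regs_per_thread` registers. Requires regs_per_thread > 0.
constexpr std::uint32_t MaxThreadsForRegisters(std::uint32_t regs_per_thread) {
  return (kRegistersPerBlockLimit / regs_per_thread) < kThreadsPerBlockLimit
             ? (kRegistersPerBlockLimit / regs_per_thread)
             : kThreadsPerBlockLimit;
}
```

Under these assumptions, 66 registers per thread times 1,024 threads is 67,584 registers, which exceeds the 65,536 budget, so the launch cannot succeed. The two options in the commit message map onto the two helpers: shrinking the block (to at most 992 threads here), or declaring the kernel with `__launch_bounds__(1024)` so that nvcc caps register usage at 64 per thread.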
  25. 09 Dec, 2021 2 commits
  26. 08 Dec, 2021 1 commit