1. 03 Mar, 2022 1 commit
  2. 02 Mar, 2022 2 commits
  3. 01 Mar, 2022 2 commits
  4. 28 Feb, 2022 2 commits
  5. 25 Feb, 2022 1 commit
  6. 24 Feb, 2022 2 commits
  7. 23 Feb, 2022 3 commits
  8. 22 Feb, 2022 2 commits
  9. 21 Feb, 2022 1 commit
  10. 20 Feb, 2022 1 commit
  11. 19 Feb, 2022 1 commit
    • [Pten] Unify paddle/pten::framework::ddim into pten::ddim (#39614) · 2fe04264
      Aurelius84 committed
      * Unify paddle/pten::framework::ddim into pten::ddim
      
      * fix paddle namespace
      
      * compile successfully
      
      * fix npu src file
      
      * fix conflict
      
      * fix conflict
      
      * fix tensorrt compiler error
      
      * fix conflict
      
      * fix conflict
      
      * fix test file conflict
      
      * fix conflict
      
      * fix mlu file conflict
      
      * fix mlu file conflict
      
      * fix cinn header file conflict
      
      * fix conflict
      
      * fix conflict
      
      * fix conflict
      
      * fix conflict
  12. 18 Feb, 2022 3 commits
  13. 17 Feb, 2022 1 commit
    • add softplus op for kunlun2. test=kunlun (#39555) · 9f99b591
      houj04 committed
      * add softplus op for kunlun2. test=kunlun
      
      * add softplus op for kunlun2. test=kunlun
      
      * fix code style. test=kunlun
      
      * fix code style. test=kunlun
      
      * add more test cases. test=kunlun
  14. 16 Feb, 2022 1 commit
    • [bf16] pten matmul cuda kernel support bf16 (#39485) · d5a0d31a
      Leo Chen committed
      * pten matmul cuda kernel support bf16
      
      * fix pten kernel name
      
      * add matmul_grad bf16 kernel
      
      * add emptylike bf16 kernel
      
      * fix compile
      
      * support rocm
      
      * fix error
      
      * fix rocm
      
      * add bf16 header file
      
      * fix compile
  15. 15 Feb, 2022 2 commits
    • [PluggableDevice] Add custom runtime support (#38740) · 3e7825f3
      ronnywang committed
      * [CustomRuntime] Add DeviceManager
      
      * [CustomRuntime] Add DeviceInterface
      
      * [CustomRuntime] Add Stream, Event, DeviceGuard, CallbackManager
      
      * [CustomRuntime] Add plug-in device
      
      * [CustomRuntime] Memory module support PluggableDevice
      
      * [CustomRuntime] Add WITH_PLUGGABLE_DEVICE cmake option
      
      * update
      
      * [API] update API doc based on comments, test=develop
      Co-authored-by: qili93 <qili93@qq.com>
    • [PTen] Migrate proto::VarType outside of Pten (#39411) · 7e7e9404
      Aurelius84 committed
      * #1 migrate dist-related type()-> dtype()
      
      * move datatype function from pten -> fluid/framework
      
      * change type() in imperative into convert(dtype())
      
      * modify xx_tensor->type into xx_tensor->dtype
      
      * change the set_type interface and the caller
      
      * modify xx_tensor.type into xx_tensor.dtype
      
      * fix mutable_data(place, dtype())
      
      * change caller of mutable_data in pten and distributed
      
      * change the caller of mutable_data in fluid/framework
      
      * change the caller of mutable_data in imperative directory
      
      * mutable_data: inference
      
      * update the call of mutable_data
      
      * transfer MakePenScalarArray MakePtenScalar ResetHolderWithType
      
      * pass compilation; the next step is to remove VarType from Pten
      
      * fix all and remove VarType from pten. Succeeds on Linux; the next task is the other platforms
      
      * fix conflict with develop
      
      * fix compile error
      
      * Fix reset conversion
      
      * fix conflict
      
      * fix compile problem
      
      * fix typo
      
      * Fix << in tensor_utils.cc
      
      * fix type->dtype
      
      * fix unittest
      
      * fix tensor init constructor
      
      * fix DataTypeSize for BFloat16
      
      * fix code style
      
      * fix npu compile error
      
      * fix npu
      
      * compile npu successfully
      
      * fix conflict
      
      * fix conflict
      Co-authored-by: xiongkun <xiongkun03@baidu.com>
  16. 09 Feb, 2022 1 commit
  17. 08 Feb, 2022 1 commit
    • Support allocating CUDA managed memory (#39075) · 42910361
      From00 committed
      * Rough implementation for experiment
      
      * Support allocating CUDA managed memory
      
      * Fix CI error
      
      * Modify UT
      
      * Check whether memory oversubscription is supported
      
      * Fix ROCM Compile error
      
      * Fix ROCM Compile error
      
      * Fix UT cuda_managed_memory_test
      
      * Set UT timeout to 40
      
      * Add UT OOMExceptionTest
      
      * Set UT timeout to 50
  18. 07 Feb, 2022 1 commit
  19. 06 Feb, 2022 1 commit
  20. 30 Jan, 2022 1 commit
  21. 29 Jan, 2022 2 commits
    • Add xpu2 compiler (#37254) · 92da5055
      Liu-xiandong committed
      * Add XPU compiler for paddle, test=develop
      
      * clean code
      
      * clean useless code
      
      * clean useless code
      
      * clean useless code
      
      * test
      
      * add include path
      
      * use clang compiler
      
      * xpu2.cmake
      
      * XPU2 compiler passed
      
      * update
      
      * update after pten
      
      * combine WITH_XPU and WITH_XPU2
      
      * update the fuse operation in WITH_XPU and WITH_XPU2
      
      * update
      
      * update
      
      * update
      
      * fix the merge error
      
      * update
      
      * update the code
      
      * update the code
      
      * add run_kp_kernel flag
      
      * update
      
      * update
      
      * fix prepared type_ bug
      
      * clean and update the code
      
      * reset the kernel_primitives
      
      * update
      
      * clean the code
      
      * delete useless comment
      
      * fix the bug in WITH_XPU
      
      * update
      
      * update
      
      * modify the abi
      
      * delete some useless code
      
      * Parameter automation in xpu compilation
      
      * Parameter automation in xpu compilation
      
      * delete kps in cmake
      
      * delete useless comment
      
      * clean the code
      
      * clean the code
    • fix kunlun2 softmax unit test bug (#39274) · 23bb2836
      QingshuChen committed
      * fix kunlun2 softmax unit test bug
      *test=kunlun
      
      * minor
  22. 28 Jan, 2022 1 commit
  23. 27 Jan, 2022 2 commits
  24. 26 Jan, 2022 3 commits
  25. 25 Jan, 2022 2 commits
    • [MLU] add mlu kernel for split and concat (#39020) · ac3dc0bb
      joeqiao12 committed
      * [MLU]add mlu kernel for concat and split op
      
      * delete device_context DEPS
    • Optimize nearest_interp forward (#38528) · 232bbce2
      Lijunhui committed
      * init commit
      
      * remove comments
      
      * remove nchw branch
      
      * optimize code
      
      * apply fast div mod in 1D kernel, rm 3D kernel
      
      * move init of FastDivMode to CPU
      
      * 3D kernel for nchw, FastDiv for 1D kernel
      
      * debug done. process boundary
      
      * 2^n
      
      * optimize
      
      * optimize
      
      * change code & optimize code