1. 12 4月, 2023 1 次提交
  2. 20 3月, 2023 1 次提交
  3. 16 3月, 2023 1 次提交
    • H
      Update from_blob API (#51646) · c07c7712
      Huang Jiyi 提交于
      * remove contexts in tensor_utils
      
      * update from_blob
      
      * update from_blob
      
      * update from_blob
      
      * fix bug
      
      * fix bug
      c07c7712
  4. 06 3月, 2023 1 次提交
    • H
      [phi decoupling] decouple dependency to device_context in phi (Part 1) (#50865) · a1006b2b
      Huang Jiyi 提交于
      * move DeviceContextPool to phi
      
      * add EmplaceExternalContextFunc
      
      * update namespace
      
      * update cmake
      
      * fix bugs and create context_pool_impl.h
      
      * replace platform::is_xxx_place
      
      * fix bugs
      
      * update generator
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix enforce usage
      
      * Revert "fix enforce usage"
      
      This reverts commit 5f521f08a69713cee506e64a00ec6d9fba709e27.
      
      * fix bugs
      
      * rm XPUDeviceContext and CustomDeviceContext
      
      * fix bugs
      
      * fix fix context init bug
      
      * fix bugs after merge
      
      * fix bugs
      
      * fix name
      
      * fix mutable_data
      
      * update and fix bugs
      
      * fix bugs
      
      * update
      
      * fix bugs
      
      * fix name
      
      * fix bugs
      
      * merge
      
      * fix bugs
      
      * create context_pool in phi/backends
      
      * create context_pool in phi/backends
      
      * fix bugs
      
      * fix xpu bugs
      
      * fix rocm bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix xpu bugs
      
      * update
      
      * update
      
      * fix bugs
      
      * fix bugs
      a1006b2b
  5. 21 2月, 2023 1 次提交
  6. 17 2月, 2023 1 次提交
    • Y
      Rename MultiTensorAdam To FusedAdam (#50449) · e6af9bd2
      yuehuayingxueluo 提交于
      * rename multi_tensor_adam to fused_adam
      
      * fix some bugs
      
      * fix CI coverage
      
      * rename test_fused_adam.py
      
      * fix some bug
      
      * add test_fused_adam_op.py
      
      * fix some bugs
      
      * fix fused_adam_op.cc
      
      * fix CI bugs
      
      * fix CI bug
      
      * fix CI bug
      e6af9bd2
  7. 09 2月, 2023 2 次提交
    • H
      [PHI decoupling] move strided_memcpy.h to phi (#50346) · 17318c1a
      Huang Jiyi 提交于
      * decouple strided_memcpy
      
      * move strided_memcpy
      
      * move strided_memcpy to phi
      
      * fix namespace
      
      * update
      
      * fix gpu compile bugs
      17318c1a
    • Y
      Add MultiTenosrAdam OP (#49220) · 10654c77
      yuehuayingxueluo 提交于
      * add multi_tenosr_adam
      
      * update multi_tensor_base.py, test_multi_tensor_adam.py, adamw.py
      
      * fix adam.py optimizer.py
      
      * fix adamw.py
      
      * fix test_multi_tensor_adam.py
      
      * fix CI bug
      
      * fix CI coverage
      
      * fix ci bug
      
      * fix betapow
      
      * fix some bugs
      
      * fix test_adamw_op.py
      
      * fix CI coverage
      
      * fix multi_tensor_adam_kernel.cc
      
      * fix CI bug
      
      * fix multi_tensor_adam_op.cc and test_multi_tensor_adam.py
      
      * fix code style
      
      * update C++ parts
      
      * remove python parts modification temporarily
      
      * add C++ ut
      
      * update betapow copy code logic
      
      * fix ci ut
      
      * fix windows ci
      
      * fix coverage ci
      
      * improve coverage rate
      
      ---------
      Co-authored-by: Nsneaxiy <sneaxiy@126.com>
      10654c77
  8. 15 12月, 2022 1 次提交
  9. 08 11月, 2022 1 次提交
  10. 26 10月, 2022 1 次提交
  11. 30 9月, 2022 1 次提交
  12. 19 9月, 2022 1 次提交
    • L
      Performance fix for broadcast kernel [Part3] (#46071) · 46e4fb2a
      limingshu 提交于
      * first commit
      
      * refine code with template argument
      
      * refine code with template argument
      
      * add ternary broadcast test file
      
      * add ternary broadcast test file
      
      * fix accoriding to ci
      
      * fix op-benchmark ci error
      46e4fb2a
  13. 26 8月, 2022 1 次提交
    • K
      Transfer transfer_layout from fluid to phi (#45261) · 985f2a4a
      kangguangli 提交于
      * remove fluid kernel and activate phi kernel
      
      * fix parameter error
      
      * transfer mkldnn part
      
      * modify header file path
      
      * fix compile error
      
      * transfer special case
      
      * fix lod setting and special case for layout setting
      
      * add testcase and refine code
      985f2a4a
  14. 25 8月, 2022 1 次提交
    • K
      Transfer memcpy d2h from fluid to phi (#45150) · 0d14e74a
      kangguangli 提交于
      * transfer memcpy_d2h from fluid to phi
      
      * refine arg check and add comment
      
      * fix cannot fallback to phi kernel
      
      * fix gpu_context host alloc when tensor size = 0
      
      * add kernel for std::vector<DenseTensor> args
      
      * fix bugs in MemcpyD2HMultiIOKernel
      
      * remove useless header file
      
      * polish format
      
      * fix typo
      
      * add testcase for cudapinned place
      
      * refine check condition in test
      
      * polish error message
      
      * polish error message
      
      * remove header in fluid  directory
      
      * merge memcpy_h2d and memcpy_d2h into one file, change register method to simplify implementation
      
      * fix code style check
      0d14e74a
  15. 23 6月, 2022 1 次提交
  16. 04 6月, 2022 1 次提交
  17. 29 3月, 2022 1 次提交
  18. 27 3月, 2022 1 次提交
    • J
      Add StringTensor (#39830) · 0695e1ac
      Jack Zhou 提交于
      * add string tensor and case convert kernels
      
      * Add strings empty kernel; Reorganize the structure of case convert kernel
      
      * Add string infermeta
      
      * Update mutable_data of string tensor
      
      * rename kernel name
      
      * add string copy tmp
      
      * Fix strings copy device bug
      
      * add utf8 gpu converter
      
      * add string tensor c++ api
      
      * Remove mutable_data of string tensor
      
      * update string tensor interface
      
      * remove charcases_flag.h
      
      * remove some fluid headers
      
      * Add make_ddim
      
      * __HIPCC__ -> PADDLE_WITH_HIP
      
      * remove fluid headers
      
      * fix cpu compile
      
      * remove std::hash
      
      * Fix cudaMalloc
      
      * Remove strings/impl directory
      
      * Fix infrt/get_phi_kernel_info.py;Add custom_kernels deps
      
      * Add empty kernel test
      
      * Remove some comments
      
      * Modify lower/upper api encoding type: string->bool
      
      * STRING->PSTRING; Add CreateInferLikeMeta
      
      * Add code gen for C++ String API
      
      * remove strings_api_utils.h
      
      * Add ignore file (strings_api.h, strings_api.cc)
      
      * update strings gen script
      
      * change args order of case convert kernels
      
      * Add comments for pstring, StringTensor
      
      * cpstring_internal.h -> cpstring_impl.h
      
      * Update accordding to comments:
      
      1. Remove fluid headers
      2. paddle::platform::errors -> phi::errors
      3. Use 'place.GetType() == phi::AllocationType::GPU' instead of 'paddle::platform::is_cpu_space()'
      4. Use camel code style
      
      * Remove all singletons in strings kernels
      
      * fix rocm compile
      
      * Fix py3 compile
      
      * Fix c++ coverage
      
      * 1. Add pstring proto type
      2. Add StringTensor debug info
      3. Rename case_convert_kernel to strings_lower_upper
      4. Remove serialize derialize strings kernel
      
      * DataLayout::PSTRING -> DataLayout::PSTRING_UNION
      
      * Register pstring data type
      
      * Fix strings api gen
      
      * Fix dense tensor register pstring dtype
      
      * Fix error messages
      
      * remove line
      
      * add pstring unittest
      
      * remove test string api unitest
      
      * remove empty line
      
      * Remove some headers to decrease the size of executable file
      0695e1ac
  19. 18 3月, 2022 1 次提交
  20. 04 3月, 2022 1 次提交
  21. 28 2月, 2022 1 次提交
    • Z
      Add sparse conv3d kernel (#39879) · bc99a76c
      zhangkaihuo 提交于
      * fix incorrect dims settings
      
      * sparse conv3d
      
      * fix out dims
      
      * test performance
      
      * test large shape success
      
      * opt scatter, double performance
      
      * test float16
      
      * remove profiling code
      
      * remove pten
      
      * opt code lines
      
      * correct boundary judgment
      
      * only cpu
      
      * test ci
      
      * test ci
      
      * remove the including paddle/fluid header; extract the conmmon function
      
      * opt code lines
      
      * use DenseTensor::data() instead of mutable_data
      
      * return rulebook for backward
      
      * specify layout
      
      * rename:conv -> sparse_conv3d
      bc99a76c
  22. 24 2月, 2022 1 次提交
  23. 20 2月, 2022 1 次提交
  24. 17 2月, 2022 1 次提交
  25. 14 2月, 2022 1 次提交
    • C
      [pten] add split kernel (#39060) · d0df5632
      chentianyu03 提交于
      * add split kernel
      
      * add split kernel signature
      
      * fix split bug
      
      * modify MakePtenScalarArrayFromVarList
      
      * modify MakePtenScalarArrayFromVarList
      
      * fix split windows register error
      
      * add test case for split kernel
      
      * replace raw split kernel with pten kernel
      
      * fix makeScalar/ScalarArray bug
      
      * remove debug log
      
      * remove int64_t type in buildPtcontext
      
      * update by code review
      
      * fix split dev test failed
      
      * change DenseTensorMeta to MetaTensor
      
      * change split api code from auto gen to manual
      
      * split cuda kernel support bfloat16 type
      
      * fix conflict
      
      * rm raw split kernel
      
      * merge develop branch
      
      * change to pten::errors
      d0df5632
  26. 30 1月, 2022 1 次提交
    • Z
      Add a Sparse OP : to_sparse_coo (#39264) · 78132fe1
      zhangkaihuo 提交于
      * dense_to_sparse_coo
      
      * optimize unit testing; support rocm
      
      * 1. delete fluid related header file
      2. update the copyright
      
      * fix hipMemcpy
      
      * update dense_to_sparsecoo
      
      * add namespace sparse
      78132fe1
  27. 21 1月, 2022 1 次提交
  28. 28 12月, 2021 1 次提交
  29. 23 12月, 2021 1 次提交
  30. 20 12月, 2021 1 次提交
  31. 29 11月, 2021 1 次提交
    • C
      [Pten] Add reduce mean kernel, replace with mean API (#37559) · f9e9fd19
      chentianyu03 提交于
      * add pten reduce kernel
      
      * add reduce_sum kernel
      
      * update attribute args and order
      
      * make out dtype undefined
      
      * fix empty input error
      
      * merge develop branch
      
      * rename sum as reduce function
      
      * rename sum as reduce function
      
      * fix reducekernelImpl args error
      
      * add reduce cuda kernel
      
      * modify dims type to const &
      
      * remove unsed log
      
      * fix reduce_all out eigen function error
      
      * remove unused codes
      
      * add the missing sum api define and testcase
      
      * merge develop branch
      
      * fix sum test axis value error
      
      * replace pten mean kernel with reduce_mean
      
      * revcover meam cuda to original implement
      f9e9fd19
  32. 22 11月, 2021 1 次提交
    • C
      [PTen] Add variable transform to/from ptenTensor and add cast kernel (#36916) · 5caa6fc5
      chentianyu03 提交于
      * add cast kernel
      
      * add cast cuda kernel
      
      * add cast kernel
      
      * make cast kernel output dtype undefined
      
      * get cast dtype from vardesc
      
      * move cast to manipulation and add test case
      
      * add castinfershape
      
      * avoid reinitilaze variable
      
      * InitializeVariable support datatype
      
      * merge develop branch
      
      * fix merge bug
      
      * revert modify initializeVariable
      
      * revert modify on InitializeVariable
      
      * revert modify on InitializeVariable
      
      * mutable support reset dtype
      
      * enable make pten tensor from variable when def_arg.type is undefined
      
      * fix build pten ctx start_idx error
      
      * copy pten out tensor to variable
      
      * merge develop branch
      
      * fix non pten kernel cast failed
      
      * add reset allocation place for remake tensor
      
      * fix inplace realloc error
      
      * add mutable on pten kernles and remove unused cast files
      
      * rename function names
      
      * fix output type error
      
      * fix conflict with develop branch
      
      * set data type to variable with pten's dtype
      
      * fix test_cast_api type mismatch
      
      * densorTensro mutable_data support 0 bytes value
      
      * fix the inplace bug of reshape kernel
      
      * fix pten.backend != variable.place when moving storage, palce mismatch bug
      
      * fix conflict with develop branch
      
      * Fix bug of paddle::experimental::MovesStorage
      
      * fix ReMakePtenDenseTensor place mismatch bug
      
      * Revert "fix ReMakePtenDenseTensor place mismatch bug"
      
      This reverts commit 86336032f60b8a15eacd2c1ff2fa513f5d8dfd1a.
      
      * fix ReMakePtenDenseTensor place mismatch bug
      
      * reverts the set_lod interface, test=develop
      
      * modify by the review options
      
      * modify error message
      
      * add & for const input arguments
      
      * add reference in params
      
      * elementwise_sub add mutable_data
      
      * fix ResetHolderWithType check size bug
      
      * add dependence pten_tensor to test_cast_api object
      
      * remove unused code to pass ci coverage
      Co-authored-by: NChen Weihang <chenweihang@baidu.com>
      Co-authored-by: NYuanRisheng <yuanrisheng@baidu.com>
      Co-authored-by: Nshixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
      5caa6fc5
  33. 16 11月, 2021 1 次提交
    • Y
      Add API and unit test for reshape (#37232) · 79b49c20
      YuanRisheng 提交于
      * reshape kernel refactor
      
      * fix compile bugs when run ci
      
      * support xpu for reshape
      
      * fix bugs when run unittest in kunlun ci
      
      * fix compile bugs when run kunlun
      
      * perfect code according to suggestion
      
      * add api and unit test for reshape
      79b49c20
  34. 12 11月, 2021 1 次提交
    • Y
      [Pten]Refactor the Elementwise_add Kernel (#37043) · c1310343
      YuanRisheng 提交于
      * elementwise_add kernel refactor
      
      * fix compile bugs in elementwise_add refactor
      
      * fix compile bugs when run in npu/xpu
      
      * fix bugs when run unit test
      
      * fix bugs when run ci-windows
      
      * modify code as recommended
      
      * code format adjust
      
      * fix bugs when run ci
      
      * fix compile bug when run in ci-windwos
      c1310343
  35. 05 11月, 2021 1 次提交