1. 28 1月, 2022 5 次提交
  2. 27 1月, 2022 11 次提交
    • Z
      implement AllocateFrom (#39280) · d89f246c
      zhangkaihuo 提交于
      d89f246c
    • C
      Add kernelsignature constructor for windows (#39253) · 33e3f5ac
      Chen Weihang 提交于
      * add constructor for win
      
      * change impl
      
      * fix bug
      33e3f5ac
    • Z
      【PTen】Remove ReMakePtenDenseTensor (#39094) · 98c1829b
      zyfncg 提交于
      * remove remake densetensor
      
      * fix eager test error
      
      * fix bug in eager
      98c1829b
    • Y
      refactor elementwise sub grad (#39225) · 7a1e1193
      YuanRisheng 提交于
      7a1e1193
    • A
      [PTen]Support AllocateFrom in Tensor and Alloc/HostAlloc in Context (#39022) · 5631da9c
      Aurelius84 提交于
      * Support allocate_from in Tensor and allocate_data in Context
      
      * fix #ifdef CUDA
      
      * fix cycle depends
      
      * fix test_xxx_dev_api failed
      
      * fix windows compiling error
      
      * fix unittest
      
      * modify into PImpl
      
      * fix selected rows
      
      * add TODO comment
      
      * refine interface according reviewer
      5631da9c
    • C
      [PTen] Add infermeta registry (#39204) · f3f16126
      Chen Weihang 提交于
      * add infermeta registry
      
      * add infermeta registry
      
      * add unittest
      
      * polish details
      f3f16126
    • A
      [PluggableDevice] Add custom kernel support based on pten kernel management (#38848) · a8879215
      Aganlengzi 提交于
      * [Demo] custom kernel based on pten kernel
      
      * merge and npu custom work well
      
      * del comments
      
      * delete other code
      
      * fix CUDAContext
      
      * fix not found small_vector.h
      
      * support NPU
      
      * fix NPUContext
      
      * fix DeviceContext support
      
      * add UT
      
      * fix call
      
      * add UT
      
      * fix
      
      * fix for comments and ut
      
      * add MACRO control
      
      * fix multi input output
      
      * support env CUSTOM_DEVICE_ROOT
      
      * deal with special cases
      
      * fix for Windows
      
      * try coverage with test_custom_kernel_dot.py
      
      * fix test_custom_kernel_dot
      
      * fix test_custom_kernel_dot
      
      * fix merge
      
      * fix merge
      
      * fix CI
      
      * update
      
      * merge and fix
      
      * remove WITH_CUSTOM_KERNEL
      
      * fix merge
      
      * merge and fix
      
      * fix ut
      
      * fix ut for mac
      
      * add more UT
      
      * add more UT
      
      * fix
      a8879215
    • C
      [pten] add full xpu kernel (#39172) · 93839717
      chentianyu03 提交于
      * add full_kernel xpu
      
      * fix full xpu register device type error
      
      * fix full kernel bug
      
      * add fulllike kernel impl and replace with raw kernel
      
      * fix dev_ctx convert template args error
      
      * modify namespace and header file
      
      * add isinf check
      
      * fix input type args in TensorSetConstantXPU error
      93839717
    • Q
      optimize kunlun/xpu softmax_with_cross_entropy add add unitest (#39180) · 2b9bb8bb
      QingshuChen 提交于
      * optimize kunlun/xpu softmax_with_cross_entropy add add unitest
      *test=kunlun
      
      * minor
      *test=kunlun
      
      * minor
      *test=kunlun
      
      * minor
      *test=kunlun
      
      * minor
      *test=kunlun
      2b9bb8bb
    • Z
      Add SparseCooTensor and SparseCsrTensor (#38906) · a7edb3f3
      zhangkaihuo 提交于
      * fix bug:
      1. atten: set the default value of attn_dropout_rate to None
      2. ffn: add activation parameter
      
      * for pure fp16
      
      * Add a SparseCsrTensor
      
      * remove unused functional
      
      * remove const
      
      * remove SetMemoberTensor
      
      * remove non_zero_nums_, the number of non zero elements of each batch can be obtained from the crows
      
      * SparseCooTensor
      
      * add SetMember
      
      * merge upstream; add SetMember
      
      * merge upstream
      
      * merge upstream; add newline at end of file
      
      * add newline at end of file
      
      * remove newline at end of file
      
      * remove newline at end of file
      
      * stash
      
      * user pten::framework::make_ddim
      
      * user pten::framework::make_ddim
      
      * merge upstream; use the latest mutable_data
      
      * merge upstream; use the latest mutable_data
      
      * return mutable dense tensor
      a7edb3f3
    • F
      move math_cuda_utils.h to pten/kernels/funcs (#39246) · 809a10b6
      Feiyu Chan 提交于
      809a10b6
  3. 26 1月, 2022 9 次提交
    • L
      [pten] remove deprecated fluid op kernel for pten (#38842) · 3ab9aef1
      Leo Chen 提交于
      * update cmake file to remove fluid kernel
      
      * add pten declaration.h to where pybind.h used
      
      * fix sync_bn and tensorrt_engine
      
      * refine detection_library
      
      * fix interpreter_core
      
      * support eager legacy
      
      * fit eager legacy for pten
      
      * fall back to cpu if not found kernel
      
      * fix compile problem
      
      * fix compile problem
      
      * refine fallback logic
      
      * fit operator.run()
      
      * fix xpu compile
      
      * fit for new_exec
      
      * add REGISTER_OP_WITHOUT_GRADIENT
      
      * un-cache pt_kernel_context
      
      * fix compile
      
      * fix cudnn
      
      * fix compiling with on_infer
      
      * fix mkldnn
      
      * fix isfinite_v2
      
      * fix xpu problem
      
      * fix op_device
      
      * refine fallback for xpu
      
      * fix xpu compile
      
      * merge develop
      
      * refine code format
      
      * fix compile
      
      * fix compile
      
      * add data_transfer
      
      * fix PreparePtenData
      
      * fix cpu context
      
      * merge develop
      
      * fix compile
      
      * fix error device context
      
      * fix xpu
      
      * fix dev_ctx
      3ab9aef1
    • C
      [pten] Cast xpu kernel (#39179) · 93d2f0a6
      chentianyu03 提交于
      * cast xpu kernel init
      
      * cast xpu kernel
      
      * replace with raw cast xpu kernel
      
      * fix cast kernel bug
      
      * add the missing break
      
      * modify namespace and header file
      93d2f0a6
    • X
      add dependences of enforce (#39237) · 2c0160e5
      xiongkun 提交于
      2c0160e5
    • W
      [Move selected_rows PR #5] VisitDataType use Pten::DataType (#39236) · 42a0947e
      Weilong Wu 提交于
      * Added selected_rows and rw_lock to pten
      
      * Renamed the unit test target to fix CI
      
      * Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid
      
      * Remove rw_lock.h,rw_lock_test.cc in fluid
      
      * Use pten::RWLock and pten::AutoRDLock, fix CI
      
      * Use pten::SelectedRows
      
      * Use pten::SelectedRows
      
      * Fix to pass NPU CI
      
      * Selected_Rows inherits from TensorBase
      
      * Use pten::SelectedRows, to pass NPU CI
      
      * To fix NPU CI
      
      * To fix NPU CI again
      
      * Use paddle/pten/core/enforce and polish code
      
      * Use pten::DataType instead of using proto_type
      
      * Move part of data_type to pten
      
      * Polish Code
      42a0947e
    • Y
      [Pten]Move kernel_primitives lib to Pten directory (#39169) · 452bcbe2
      YuanRisheng 提交于
      * move kernel_primitives
      
      * use pten's errors
      452bcbe2
    • W
      [PTEN] cpu_context add eigen deps (#39234) · bd5c962d
      Wilber 提交于
      * add eigen deps
      
      * update
      bd5c962d
    • W
      [Move selected_rows PR #4] SelectedRows inherits from TensorBase. (#39162) · 3e80253a
      Weilong Wu 提交于
      * Added selected_rows and rw_lock to pten
      
      * Renamed the unit test target to fix CI
      
      * Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid
      
      * Remove rw_lock.h,rw_lock_test.cc in fluid
      
      * Use pten::RWLock and pten::AutoRDLock, fix CI
      
      * Use pten::SelectedRows
      
      * Use pten::SelectedRows
      
      * Fix to pass NPU CI
      
      * Selected_Rows inherits from TensorBase
      
      * Use pten::SelectedRows, to pass NPU CI
      
      * To fix NPU CI
      
      * To fix NPU CI again
      
      * Use paddle/pten/core/enforce and polish code
      3e80253a
    • C
      [PTen] Unify InferMeta(Shape) Function in pten and fluid op (#38976) · b75507d3
      Chen Weihang 提交于
      * infermeta context init design
      
      * support infermeta called in fluid op
      
      * add hasattr and attr methods
      
      * add dygraah GetVarPtrs support
      
      * rename arg_map_context to arg_map_utils
      
      * add registry for arg map func
      
      * resolve conflit
      
      * refactor op utils design
      
      * polish meta config
      
      * fix details
      
      * remove hasattr method
      
      * resolve conflit
      
      * revert cmake order change
      
      * revert some change
      
      * change init pos
      
      * fix compile faileed
      
      * fix typo
      
      * fix inference failed
      
      * fix windows ccompile failed
      
      * polish format
      Co-authored-by: NWang Huan <wanghuan29@baidu.com>
      b75507d3
  4. 25 1月, 2022 9 次提交
  5. 24 1月, 2022 6 次提交