1. 09 Feb, 2022 6 commits
    • C
      move stream into pten (#39392) · 266955a9
      Chen Weihang committed
    • H
      update basic infrastructure (#39383) · b12e7a17
      hong committed
      * update basic infrastructure; support string, support vector<int>, add tensor args type index; test=develop
      
      * remove useless code; test=develop
      
      * fix bug; test=develop
      
      * polish code; test=develop
    • Z
    • N
      Delete BASE_SIZE in elementwise_base.h (#39390) · b007a031
      niuliling123 committed
    • Z
      Add a Sparse Op: to_sparse_csr (#39333) · 76d527e1
      zhangkaihuo committed
      * implement AllocateFrom
      
      * dense_to_sparse_coo
      
      * optimize unit testing; support rocm
      
      * 1. delete fluid related header file
      2. update the copyright
      
      * fix hipMemcpy
      
      * update dense_to_sparsecoo
      
      * add namespace sparse
      
      * sparse_csr_to_dense
      
      * test to_sparse_coo: csr_to_coo
      
      * fix writing error
      
      * to_sparse_csr: dense_to_sparse_csr and sparse_coo_to_csr
      
      * fix check shape
      
      * fix unit test
      
      * replace CUDADeviceContext by GPUContext
    • H
      Move norm to pten (#39324) · ece200b3
      hong committed
      * add norm cpu
      
      * update code;
      
      * norm bug fix
      
      * move norm op to pten; test=develop
      
      * move norm op to pten; test=develop
      
      * add norm util; test=develop
      
      * fix norm npu bug; test=develop
      
      * fix norm kernel bug; test=develop
      
      * move kernel args to pten; test=develop
      
      * move kernel args to pten sig; test=develop
  2. 08 Feb, 2022 7 commits
  3. 07 Feb, 2022 1 commit
  4. 06 Feb, 2022 1 commit
  5. 04 Feb, 2022 2 commits
  6. 02 Feb, 2022 1 commit
  7. 30 Jan, 2022 4 commits
    • Z
      Add a Sparse OP:sparse_csr_to_coo (#39266) · bafea65c
      zhangkaihuo committed
      * dense_to_sparse_coo
      
      * optimize unit testing; support rocm
      
      * 1. delete fluid related header file
      2. update the copyright
      
      * fix hipMemcpy
      
      * update dense_to_sparsecoo
      
      * add namespace sparse
      
      * sparse_csr_to_dense
      
      * test to_sparse_coo: csr_to_coo
      
      * fix writing error
    • C
      [PTen] Change all InferMeta functions (#39222) · 7e29cea9
      Chen Weihang committed
      * change unary infermeta
      
      * change other infermeta
      
      * change all infermeta format
      
      * resolve conflict
      
      * fix test failed
      
      * resolve reshape conflict
      
      * fix compile failed
      
      * adapt auto api gen
      
      * fix reshape failed
      
      * fix concat failed
      
      * resolve conflict
    • Z
      Add a Sparse OP : to_sparse_coo (#39264) · 78132fe1
      zhangkaihuo committed
      * dense_to_sparse_coo
      
      * optimize unit testing; support rocm
      
      * 1. delete fluid related header file
      2. update the copyright
      
      * fix hipMemcpy
      
      * update dense_to_sparsecoo
      
      * add namespace sparse
    • L
      [pten] fit get all register op kernels (#39288) · eefe5feb
      Leo Chen committed
      * upgrade _get_all_register_op_kernels
      
      * add ut
      
      * support xpu/npu
      
      * fix device id
      
      * enhance TransToFluidPlace
      
      * fix compile
  8. 29 Jan, 2022 2 commits
    • C
      rename utils to manual (#39320) · 96bcf2df
      Chen Weihang committed
    • C
      [PTen] Tidy pten core headers (#39188) · dd990981
      Chen Weihang committed
      * open header for custom kernel
      
      * add core utils
      
      * tidy core code
      
      * tidy header
      
      * tidy include
      
      * tidy namespace
      
      * resolve conflict
      
      * fix unittest and coverage
      
      * remove platform using
      
      * resolve conflict
      
      * resolve conflict
      
      * fix digamma namespace error
      
      * fix xpu full kernel error
      
      * fix xpu full kernel error
      
      * polish details
      
      * add place for lib storage
  9. 28 Jan, 2022 5 commits
  10. 27 Jan, 2022 11 commits
    • Z
      implement AllocateFrom (#39280) · d89f246c
      zhangkaihuo committed
    • C
      Add kernelsignature constructor for windows (#39253) · 33e3f5ac
      Chen Weihang committed
      * add constructor for win
      
      * change impl
      
      * fix bug
    • Z
      【PTen】Remove ReMakePtenDenseTensor (#39094) · 98c1829b
      zyfncg committed
      * remove remake densetensor
      
      * fix eager test error
      
      * fix bug in eager
    • Y
      refactor elementwise sub grad (#39225) · 7a1e1193
      YuanRisheng committed
    • A
      [PTen]Support AllocateFrom in Tensor and Alloc/HostAlloc in Context (#39022) · 5631da9c
      Aurelius84 committed
      * Support allocate_from in Tensor and allocate_data in Context
      
      * fix #ifdef CUDA
      
      * fix cycle depends
      
      * fix test_xxx_dev_api failed
      
      * fix windows compiling error
      
      * fix unittest
      
      * modify into PImpl
      
      * fix selected rows
      
      * add TODO comment
      
      * refine interface according to reviewer
    • C
      [PTen] Add infermeta registry (#39204) · f3f16126
      Chen Weihang committed
      * add infermeta registry
      
      * add infermeta registry
      
      * add unittest
      
      * polish details
    • A
      [PluggableDevice] Add custom kernel support based on pten kernel management (#38848) · a8879215
      Aganlengzi committed
      * [Demo] custom kernel based on pten kernel
      
      * merge and npu custom work well
      
      * del comments
      
      * delete other code
      
      * fix CUDAContext
      
      * fix not found small_vector.h
      
      * support NPU
      
      * fix NPUContext
      
      * fix DeviceContext support
      
      * add UT
      
      * fix call
      
      * add UT
      
      * fix
      
      * fix for comments and ut
      
      * add MACRO control
      
      * fix multi input output
      
      * support env CUSTOM_DEVICE_ROOT
      
      * deal with special cases
      
      * fix for Windows
      
      * try coverage with test_custom_kernel_dot.py
      
      * fix test_custom_kernel_dot
      
      * fix test_custom_kernel_dot
      
      * fix merge
      
      * fix merge
      
      * fix CI
      
      * update
      
      * merge and fix
      
      * remove WITH_CUSTOM_KERNEL
      
      * fix merge
      
      * merge and fix
      
      * fix ut
      
      * fix ut for mac
      
      * add more UT
      
      * add more UT
      
      * fix
    • C
      [pten] add full xpu kernel (#39172) · 93839717
      chentianyu03 committed
      * add full_kernel xpu
      
      * fix full xpu register device type error
      
      * fix full kernel bug
      
      * add fulllike kernel impl and replace with raw kernel
      
      * fix dev_ctx convert template args error
      
      * modify namespace and header file
      
      * add isinf check
      
      * fix input type args in TensorSetConstantXPU error
    • Q
      optimize kunlun/xpu softmax_with_cross_entropy add add unitest (#39180) · 2b9bb8bb
      QingshuChen committed
      * optimize kunlun/xpu softmax_with_cross_entropy add add unitest
      *test=kunlun
      
      * minor
      *test=kunlun
      
      * minor
      *test=kunlun
      
      * minor
      *test=kunlun
      
      * minor
      *test=kunlun
    • Z
      Add SparseCooTensor and SparseCsrTensor (#38906) · a7edb3f3
      zhangkaihuo committed
      * fix bug:
      1. atten: set the default value of attn_dropout_rate to None
      2. ffn: add activation parameter
      
      * for pure fp16
      
      * Add a SparseCsrTensor
      
      * remove unused functional
      
      * remove const
      
      * remove SetMemoberTensor
      
      * remove non_zero_nums_, the number of non zero elements of each batch can be obtained from the crows
      
      * SparseCooTensor
      
      * add SetMember
      
      * merge upstream; add SetMember
      
      * merge upstream
      
      * merge upstream; add newline at end of file
      
      * add newline at end of file
      
      * remove newline at end of file
      
      * remove newline at end of file
      
      * stash
      
      * use pten::framework::make_ddim
      
      * use pten::framework::make_ddim
      
      * merge upstream; use the latest mutable_data
      
      * merge upstream; use the latest mutable_data
      
      * return mutable dense tensor
    • F
      move math_cuda_utils.h to pten/kernels/funcs (#39246) · 809a10b6
      Feiyu Chan committed