1. 15 Feb, 2022, 6 commits
    • Move Abs OP to pten (#39492) · fb473067
      From00 committed
      * Move Abs op to pten
      
      * Fix NPU compilation error
      
      * Fix CI error
      
      * Use LaunchSameDimsElementwiseCudaKernel in pten
    • [Pten] Support SelectedRows in C++ API (#39497) · 5bb3b668
      zyfncg committed
      * add data_transform in pten api
      
      * support GetKernelTypeForVar
      
      * fix compile problem of bfloat16
      
      * add scale_sr in api
      
      * support select_row in C++ api
      
      * merge code
    • move algorithm.h (#39502) · 7eb9593e
      Feiyu Chan committed
      Move paddle/fluid/operators/math/algorithm.h to paddle/pten/kernels/funcs and rename all references to symbols in it.
    • [Pten]Move expand_v2 to pten (#39471) · 2d16d69b
      Linjie Chen committed
      * move expand to pten
      
      * move expand_v2 to pten
      
      * move expand_v2 to pten
      
      * fix grad register
      
      * fix grad register
      
      * fix tensorcopy
      
      * fix tensorcopy
      
      * fix tensorcopy
      
      * fix tensorcopy
      
      * fix tensorcopy
      
      * fix ci
      
      * fix tensorcopy
    • [PTen] Polish trace moving (#39510) · ab866777
      Chen Weihang committed
      * polish trace moving
      
      * remove useless header
    • [PTen]Migrate proto::VarType outside of Pten (#39411) · 7e7e9404
      Aurelius84 committed
      * #1 migrate dist-related type()-> dtype()
      
      * move datatype function from pten -> fluid/framework
      
      * change type() in imperative into convert(dtype())
      
      * modify xx_tensor->type into xx_tensor->dtype
      
      * change the set_type interface and the caller
      
      * modify xx_tensor.type into xx_tensor.dtype
      
      * fix mutable_data(place, dtype())
      
      * change caller of mutable_data in pten and distributed
      
      * change the caller of mutable_data in fluid/framework
      
      * change the caller of mutable_data in imperative directory
      
      * mutable_data: inference
      
      * update the call of mutable_data
      
      * transfer MakePtenScalarArray MakePtenScalar ResetHolderWithType
      
      * pass compilation; the next step is to remove VarType in Pten
      
      * fix all and remove VarType from pten; succeeds on Linux. Next task is other platforms
      
      * fix conflict with develop
      
      * fix compile error
      
      * Fix reset conversion
      
      * fix conflict
      
      * fix compile problem
      
      * fix typo
      
      * Fix << in tensor_utils.cc
      
      * fix type->dtype
      
      * fix unittest
      
      * fix tensor init constructor
      
      * fix DataTypeSize for BFloat16
      
      * fix code style
      
      * fix npu compile error
      
      * fix npu
      
      * compile npu successfully
      
      * fix conflict
      
      * fix conflict
      Co-authored-by: xiongkun <xiongkun03@baidu.com>
  2. 14 Feb, 2022, 1 commit
    • [pten] add split kernel (#39060) · d0df5632
      chentianyu03 committed
      * add split kernel
      
      * add split kernel signature
      
      * fix split bug
      
      * modify MakePtenScalarArrayFromVarList
      
      * modify MakePtenScalarArrayFromVarList
      
      * fix split windows register error
      
      * add test case for split kernel
      
      * replace raw split kernel with pten kernel
      
      * fix makeScalar/ScalarArray bug
      
      * remove debug log
      
      * remove int64_t type in buildPtcontext
      
      * update by code review
      
      * fix split dev test failed
      
      * change DenseTensorMeta to MetaTensor
      
      * change split api code from auto gen to manual
      
      * split cuda kernel support bfloat16 type
      
      * fix conflict
      
      * rm raw split kernel
      
      * merge develop branch
      
      * change to pten::errors
  3. 11 Feb, 2022, 3 commits
  4. 10 Feb, 2022, 3 commits
    • move Masked select to pten (#39193) · e2ad433b
      hong committed
      * move masked select cpu kernel
      
      * add masked selected gpu kernel; test=develop
      
      * fix bugs; test=develop
      
      * bug fix; test=develop
      
      * bug fix; test=develop
      
      * add namespace to set mask array; test=develop
      
      * fix bug; test=develop
      
      * fix bugs; test=develop
      
      * fix ddim bug; test=develop
      
      * fix npu op bug; test=develop
      
      * fix xpu dependency bug; test=develop
      
      * move kernel args to sig.cc; test=develop
    • [bf16] add bf16 kernel: dropout & reshape & slice (#39395) · e8ac7fc3
      zhangbo9674 committed
      * add dropout
      
      * add reshape
      
      * add slice
      
      * refine slice unittest
      
      * refine slice unittest
      
      * add cpu bf16 kernel
    • Fix code conflict of empty dev_api (#39430) · 2a5d858c
      zyfncg committed
      * fix code conflict
      
      * clear cache
      
      * just try
  5. 09 Feb, 2022, 10 commits
    • [Pten] Adjust the Empty dev_api (#39143) · 9d4d0c3b
      zyfncg committed
      * adjust the Empty dev_api
      
      * fix merge conflict
      
      * fix sparse_utils_kernel
    • Fix trace conflict (#39421) · 87f4a681
      hong committed
      * add trace op
      
      * bug fix
      
      * bug fix; test=develop
      
      * thrust bug fix; test=develop
      
      * remove useless register; test=develop
      
      * fix bug; test=develop
      
      * update trace kernel; test=develop
      
      * move kernel args to trace_sig; test=develop
      
      * try to fix trace kernel conflict; test=develop
    • Add a Sparse Op to_dense (#39335) · aca86470
      zhangkaihuo committed
      * implement AllocateFrom
      
      * dense_to_sparse_coo
      
      * optimize unit testing; support rocm
      
      * 1. delete fluid related header file
      2. update the copyright
      
      * fix hipMemcpy
      
      * update dense_to_sparsecoo
      
      * add namespace sparse
      
      * sparse_csr_to_dense
      
      * test to_sparse_coo: csr_to_coo
      
      * fix writing error
      
      * to_sparse_csr: dense_to_sparse_csr and sparse_coo_to_csr
      
      * fix check shape
      
      * fix unit test
      
      * to_dense: sparse_coo_to_dense, sparse_csr_to_dense
      
      * replace CUDADeviceContext by GPUContext
    • Move trace op to pten (#39227) · d7dddf94
      hong committed
      * add trace op
      
      * bug fix
      
      * bug fix; test=develop
      
      * thrust bug fix; test=develop
      
      * remove useless register; test=develop
      
      * fix bug; test=develop
      
      * update trace kernel; test=develop
      
      * move kernel args to trace_sig; test=develop
    • Delete BASE_SIZE in elementwise_base.h (#39390) · b007a031
      niuliling123 committed
    • Add a Sparse Op: to_sparse_csr (#39333) · 76d527e1
      zhangkaihuo committed
      * implement AllocateFrom
      
      * dense_to_sparse_coo
      
      * optimize unit testing; support rocm
      
      * 1. delete fluid related header file
      2. update the copyright
      
      * fix hipMemcpy
      
      * update dense_to_sparsecoo
      
      * add namespace sparse
      
      * sparse_csr_to_dense
      
      * test to_sparse_coo: csr_to_coo
      
      * fix writing error
      
      * to_sparse_csr: dense_to_sparse_csr and sparse_coo_to_csr
      
      * fix check shape
      
      * fix unit test
      
      * replace CUDADeviceContext by GPUContext
    • Move norm to pten (#39324) · ece200b3
      hong committed
      * add norm cpu
      
      * update code;
      
      * norm bug fix
      
      * move norm op to pten; test=develop
      
      * move norm op to pten; test=develop
      
      * add norm util; test=develop
      
      * fix norm npu bug; test=develop
      
      * fix norm kernel bug; test=develop
      
      * move kernel args to pten; test=develop
      
      * move kernel args to pten sig; test=develop
  6. 08 Feb, 2022, 3 commits
  7. 06 Feb, 2022, 1 commit
  8. 04 Feb, 2022, 1 commit
  9. 30 Jan, 2022, 3 commits
    • Add a Sparse OP: sparse_csr_to_coo (#39266) · bafea65c
      zhangkaihuo committed
      * dense_to_sparse_coo
      
      * optimize unit testing; support rocm
      
      * 1. delete fluid related header file
      2. update the copyright
      
      * fix hipMemcpy
      
      * update dense_to_sparsecoo
      
      * add namespace sparse
      
      * sparse_csr_to_dense
      
      * test to_sparse_coo: csr_to_coo
      
      * fix writing error
    • [PTen] Change all InferMeta functions (#39222) · 7e29cea9
      Chen Weihang committed
      * change unary infermeta
      
      * change other infermeta
      
      * change all infermeta format
      
      * resolve conflict
      
      * fix test failed
      
      * resolve reshape conflict
      
      * fix compile failed
      
      * adapt auto api gen
      
      * fix reshape failed
      
      * fix concat failed
      
      * resolve conflict
    • Add a Sparse OP: to_sparse_coo (#39264) · 78132fe1
      zhangkaihuo committed
      * dense_to_sparse_coo
      
      * optimize unit testing; support rocm
      
      * 1. delete fluid related header file
      2. update the copyright
      
      * fix hipMemcpy
      
      * update dense_to_sparsecoo
      
      * add namespace sparse
  10. 29 Jan, 2022, 1 commit
    • [PTen] Tidy pten core headers (#39188) · dd990981
      Chen Weihang committed
      * open header for custom kernel
      
      * add core utils
      
      * tidy core code
      
      * tidy header
      
      * tidy include
      
      * tidy namespace
      
      * resolve conflict
      
      * fix unittest and coverage
      
      * remove platform using
      
      * resolve conflict
      
      * resolve conflict
      
      * fix digamma namespace error
      
      * fix xpu full kernel error
      
      * fix xpu full kernel error
      
      * polish details
      
      * add place for lib storage
  11. 28 Jan, 2022, 2 commits
  12. 27 Jan, 2022, 6 commits
    • implement AllocateFrom (#39280) · d89f246c
      zhangkaihuo committed
    • refactor elementwise sub grad (#39225) · 7a1e1193
      YuanRisheng committed
    • [PTen]Support AllocateFrom in Tensor and Alloc/HostAlloc in Context (#39022) · 5631da9c
      Aurelius84 committed
      * Support allocate_from in Tensor and allocate_data in Context
      
      * fix #ifdef CUDA
      
      * fix cycle depends
      
      * fix test_xxx_dev_api failed
      
      * fix windows compiling error
      
      * fix unittest
      
      * modify into PImpl
      
      * fix selected rows
      
      * add TODO comment
      
      * refine interface according to reviewer
    • [pten] add full xpu kernel (#39172) · 93839717
      chentianyu03 committed
      * add full_kernel xpu
      
      * fix full xpu register device type error
      
      * fix full kernel bug
      
      * add fulllike kernel impl and replace with raw kernel
      
      * fix dev_ctx convert template args error
      
      * modify namespace and header file
      
      * add isinf check
      
      * fix input type args in TensorSetConstantXPU error
    • Add SparseCooTensor and SparseCsrTensor (#38906) · a7edb3f3
      zhangkaihuo committed
      * fix bug:
      1. atten: set the default value of attn_dropout_rate to None
      2. ffn: add activation parameter
      
      * for pure fp16
      
      * Add a SparseCsrTensor
      
      * remove unused functional
      
      * remove const
      
      * remove SetMemoberTensor
      
      * remove non_zero_nums_, the number of non zero elements of each batch can be obtained from the crows
      
      * SparseCooTensor
      
      * add SetMember
      
      * merge upstream; add SetMember
      
      * merge upstream
      
      * merge upstream; add newline at end of file
      
      * add newline at end of file
      
      * remove newline at end of file
      
      * remove newline at end of file
      
      * stash
      
      * use pten::framework::make_ddim
      
      * use pten::framework::make_ddim
      
      * merge upstream; use the latest mutable_data
      
      * merge upstream; use the latest mutable_data
      
      * return mutable dense tensor
    • move math_cuda_utils.h to pten/kernels/funcs (#39246) · 809a10b6
      Feiyu Chan committed