- 18 1月, 2022 1 次提交
-
-
由 Zhanlue Yang 提交于
* Merged LoDTensor with Tensor,test=allcases * Patched python level LoDTensor * Patched python level LoDTensor * Merge Tensor into DenseTensor * Fixed namespace issues,test=allcases * Fixed merge issues * Fixed inference issues * Fixed NPU test issues * Fixed merge issues
-
- 17 1月, 2022 1 次提交
-
-
由 Wilber 提交于
* add pten::Place data structure. * update ci problem * fix ci problem * update * using platform::Place=pten::Place * remove BOOST_GET_CONST for CPUPlace and GPUPlace * compile pass 25%. * compile pass 45% * compile pass 60% * remove boost_get for xpu npu mlu and ipu * compile pass on cpu and gpu. * fix compile problem * fix compile error. * update * fix ci problem * update * ci approve * fix ci problem * fix ci eager test problem * remove BOOST_GET_CONST * fix npu compile
-
- 15 1月, 2022 3 次提交
-
-
由 石晓伟 提交于
-
由 石晓伟 提交于
-
由 Zhanlue Yang 提交于
* Merged LoDTensor with Tensor,test=allcases * Patched python level LoDTensor * Fixed example code failure * Polished function names, removed duplicated forward declarations
-
- 14 1月, 2022 1 次提交
-
-
由 石晓伟 提交于
-
- 13 1月, 2022 2 次提交
-
-
由 chentianyu03 提交于
* move dot_dev api into dot_kernel.h * add infermate header * modify to dotkerel in dot_op.h * mvoe conj dev api into complex_kernel.h * move sign dev api into sign_kernel.h * move scale dev api into kernel.h and remove infermete.h * rm paddle/pten/include/math.h * rm paddle/pten/include/math.h * rm include dir * rm paddle/pten/include/math.h * fix conflict with develop branch * rm devContext in conj_op.h * add the missing complex_kernel header
-
由 石晓伟 提交于
-
- 11 1月, 2022 1 次提交
-
-
由 Wilber 提交于
* add pten::Place data structure. * update ci problem * fix ci problem * update
-
- 10 1月, 2022 2 次提交
-
-
由 Zhanlue Yang 提交于
* Added shared_ptr<Allocation> member & corresponding interfaces to Storage * Removed original pten::Allocation from Storage and adjusted the interfaces accordingly * Fixed issues with storage offset * Used place to malloc allocation for TensorStorage * [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor * Fixed issues with place * Added comments * Moved mutable_data with stream argument to DenseTensor * Added set_offset interface * Fixed CI issues,test=allcases * [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor * Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor * Modified framework::Tensor to inherit from DenseTensor * Reverted changes too pten_layout() interface * Removed friend classes * Rearranged cfunction calls from tensor.data<void>() to tensor.data() * Fixed CI issues * Fixed lite issues * Fixed data() interface issues,test=allcases * Resolved IsInitialized() issues * Fixed ResetHolder() issues * Fixed MKLDNN & Storage issues * Resolved ShareBufferWith() issues * Fixed LoD issues
-
由 Chen Weihang 提交于
* unify infer_shape func calling * support set grad infer shape fn for custom op * unify infershape in new executor and eager * remove todo comment * revert infershape in operator
-
- 04 1月, 2022 1 次提交
-
-
由 YuanRisheng 提交于
* change 'math' to 'math_kernel' * fix compile bugs * merge develop * fix compile bugs * move cpu_impl of elementwise kernel to new directory
-
- 31 12月, 2021 2 次提交
-
-
由 Chen Weihang 提交于
* unify data layout * fix test_transfer_layout error
-
由 YuanRisheng 提交于
* change 'math' to 'math_kernel' * fix compile bugs * merge develop * fix compile bugs
-
- 30 12月, 2021 1 次提交
-
-
由 Chen Weihang 提交于
* remove offset in storage * revert api change * fix custom op slice bug * fix mutable_data error
-
- 28 12月, 2021 4 次提交
-
-
由 Jiabin Yang 提交于
* Rearranged Eager AutoCodeGen directory structure * Removed USE_OP in Eager AutoCodeGen * Enabled generation for Operators without Grad/Inputs/Outputs * Resolved operators without input * Fixed merge conflicts * Enabled Eager AutoCodeGen for 10+ more operators * Refactored Eager AutoCodeGen with more organized helper objects * Enabled Eager AutoCodeGen for operators with multiple OpBases * Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument * Handled Dispensable Inputs/Outputs in Eager AutoCodeGen * Adjusted function generation/call between Python-C API & Dygraph API * Synchronized auto-generated Python-C API with Dygraph Forward Functions * support more eager tensor api * fix merge compile error * fix compile error and fit develop code * support pure CPU * fix some logic error in eager_mode * support _varbase_creator in eager mode * Added safe_initialized interface to EagerTensor for use in processing dispensable inputs * for eager mode * refine * support multiple constructor for eager tensor * add place related code * polish code * specific randint with dtype of int64 * Support pure cpu test * eager logic * refine test in pure cpu * eager logic * eager logic * eager logic, test=develop * skip core.eager when in inference, test=develop * refine, test=develop * refine, test=develop * call RetainGrad after run forward kernel, test=develop * refine, test=develop * support dygraph util, meta, guard test * support inference test * refine test and fix initializer failed * support create varbase and fix retain grad error * fix windows error * support test code coverage * support test code coverage * support test code coverage Co-authored-by: Njim19930609 <jim19930609@gmail.com> Co-authored-by: NWang Huan <wanghuan29@baidu.com>
-
由 zyfncg 提交于
* refactor matmul directory in pten * fix merge conflict
-
由 chentianyu03 提交于
* remove intype arg in cast kernel * modify conj config in api.yaml by dictionary order * rm unused code in cast_kernel.cu
-
由 Zhanlue Yang 提交于
-
- 27 12月, 2021 2 次提交
-
-
由 YuanRisheng 提交于
* move reshape * fix compile bugs * delete manipulation file * fix compile bugs
-
由 Chen Weihang 提交于
* rename to api to copy_to * revert needless change * polish format
-
- 24 12月, 2021 3 次提交
-
-
由 Chen Weihang 提交于
-
由 chentianyu03 提交于
* combine reduce_cuda codes * support float16 in pten redcue_mean * replace ReduceCudaKernel impl with pten reduce impl * mv reduce funcs into reduce_cuda_impl * rm unsed codes and headers * mv GetReduceDim into reduce_cuda_impl * recover GetReduceDim in reduce_op.h * add new dispatch macro * fix pool op output not inited and cause transform to pten::denseTensor error * fix output tensor not initialized error * rename new dispatch macro and format code style * rm reduce_functor_op.h file
-
由 Zhanlue Yang 提交于
[Unify Tensors PR #1] Replaced pten::Allocation with shared_ptr<memory::Allocation> for Storage (#38301) * Added shared_ptr<Allocation> member & corresponding interfaces to Storage * Removed original pten::Allocation from Storage and adjusted the interfaces accordingly * Fixed issues with storage offset * Used place to malloc allocation for TensorStorage
-
- 23 12月, 2021 5 次提交
-
-
由 Chen Weihang 提交于
-
由 zyfncg 提交于
* add empty and empty_like kernel in pten * add empty dev_api
-
由 Chen Weihang 提交于
-
由 Chen Weihang 提交于
* move dot kernel impl * remove needless cmake items
-
由 石晓伟 提交于
* updates the pten allocation, test=develop * avoids an error message, test=develop
-
- 22 12月, 2021 2 次提交
-
-
由 Chen Weihang 提交于
* add pten kernel cmake * add pten kernel cmake function * fix compile error * add enforce include for full kernel * fix compile failed * change cuda to gpu * fix cmake function error
-
由 YuanRisheng 提交于
* move flatten * fix bugs of test * modify header file * add copy declare * fix compile bugs
-
- 21 12月, 2021 2 次提交
-
-
由 Chen Weihang 提交于
* rename cuda to gpu * revert CMake change * resolve conflit * rename other cuda to gpu * poish details
-
由 Chen Weihang 提交于
* remove eigen and blas dir * fix declare error
-
- 20 12月, 2021 1 次提交
-
-
由 chentianyu03 提交于
* add pten conj kernel * modify conj_kernel file path * add defined cuda macro to cuda/conj_kernel.h
-
- 17 12月, 2021 2 次提交
-
-
由 Jiabin Yang 提交于
* support more eager tensor api * support multiple constructor for eager tensor * add place related code * polish code * specific randint with dtype of int64 * Support pure cpu test * refine test in pure cpu * refine test in pure cpu
-
由 chentianyu03 提交于
* modify sum mean args * add GetExpectedPtenKernelArgs for redcue_op * modify kernel args number * modify kernel args number
-
- 16 12月, 2021 2 次提交
-
-
由 Chen Weihang 提交于
* add register_ctx_kernel and move scale kernel * polish details by reviewer comment * fix xpu compile failed * fix cmake error
-
由 Chen Weihang 提交于
* unify device context entrance * move all_context include to header * polish cmake relay for device_context * fix npu compile failed * fix npu compile failed * revert part of change
-
- 15 12月, 2021 2 次提交
-
-
由 Chen Weihang 提交于
-
由 Chen Weihang 提交于
-