1. 24 Dec, 2021 13 commits
    • [pten] combine reduce_cuda codes (#38328) · 08941eda
      chentianyu03 committed
      * combine reduce_cuda codes
      
      * support float16 in pten reduce_mean
      
      * replace ReduceCudaKernel impl with pten reduce impl
      
      * mv reduce funcs into reduce_cuda_impl
      
      * rm unused codes and headers
      
      * mv GetReduceDim into reduce_cuda_impl
      
      * recover GetReduceDim in reduce_op.h
      
      * add new dispatch macro
      
      * fix pool op output not inited and cause transform to pten::denseTensor error
      
      * fix output tensor not initialized error
      
      * rename new dispatch macro and format code style
      
      * rm reduce_functor_op.h file
    • [Unify Tensors PR #1] Replaced pten::Allocation with shared_ptr<memory::Allocation> for Storage (#38301) · 42cf2bee
      Zhanlue Yang committed
      * Added shared_ptr<Allocation> member & corresponding interfaces to Storage
      
      * Removed original pten::Allocation from Storage and adjusted the interfaces accordingly
      
      * Fixed issues with storage offset
      
      * Used place to malloc allocation for TensorStorage
    • [heterps] move pre-init id logic from common_sparse_table to sparse_geo_table (#38173) · 52329f6f
      zmxdream committed
      * remove pre-init id in common_sparse_table.cc
    • add new API/OP: paddle.Tensor.exponential_ (#38256) · 33185000
      zhouweiwei2014 committed
      * add new API/OP:paddle.Tensor.exponential_
      
      * fix CI
    • [MLU] add mlu op interface (#38241) · c396ee65
      努力努力在努力丶 committed
      * [MLU]add mlu op interface
      
      * [MLU]fix alpha of activation op
    • add pull gpups sparse op (#37124) · 572b3e90
      yaoxuefeng committed
       add pull gpups sparse op
    • fix share buffer to (#38407) · 9409ff6b
      Baibaifan committed
    • 4b3d5195
    • add register general kernel macro (#38409) · fc0a50aa
      Chen Weihang committed
    • Add new API cholesky_solve (#38167) · 39f7c41f
      zhiboniu committed
    • add new API/OP: paddle.poisson (#38117) · bcf86e5c
      zhouweiwei2014 committed
      * add new API/OP:paddle.poisson
      
      * fix comment
    • add conv+hard_sigmoid and conv+hard_swish fuse pass ut (#37553) · a858326a
      baoachun committed
      * add conv+hard_sigmoid fuse pass ut
      
      * update conv_elementwise_add_mkldnn_fuse_pass ut
      
      * update conv_hard_sigmoid_mkldnn_fuse_pass ut
      
      * update conv+hard_sigmoid and conv+hard_swish fuse pass ut
      
      * update ut
      
      * update ut
    • Support test imperative basic in eager (#38313) · d48f7c89
      Jiabin Yang committed
      * Rearranged Eager AutoCodeGen directory structure
      
      * Removed USE_OP in Eager AutoCodeGen
      
      * Enabled generation for Operators without Grad/Inputs/Outputs
      
      * Resolved operators without input
      
      * Fixed merge conflicts
      
      * Enabled Eager AutoCodeGen for 10+ more operators
      
      * Refactored Eager AutoCodeGen with more organized helper objects
      
      * Enabled Eager AutoCodeGen for operators with multiple OpBases
      
      * Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument
      
      * Handled Dispensable Inputs/Outputs in Eager AutoCodeGen
      
      * Adjusted function generation/call between Python-C API & Dygraph API
      
      * Synchronized auto-generated Python-C API with Dygraph Forward Functions
      
      * support more eager tensor api
      
      * fix merge compile error
      
      * fix compile error and fit develop code
      
      * support pure CPU
      
      * fix some logic error in eager_mode
      
      * support _varbase_creator in eager mode
      
      * Added safe_initialized interface to EagerTensor for use in processing dispensable inputs
      
      * for eager mode
      
      * refine
      
      * support multiple constructor for eager tensor
      
      * add place related code
      
      * polish code
      
      * specific randint with dtype of int64
      
      * Support pure cpu test
      
      * eager logic
      
      * refine test in pure cpu
      
      * eager logic
      
      * eager logic
      
      * eager logic, test=develop
      
      * skip core.eager when in inference, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * call RetainGrad after run forward kernel, test=develop
      
      * refine, test=develop
      
      * support dygraph util, meta, guard test
      
      * support inference test
      
      * refine test and fix initializer failed
      Co-authored-by: jim19930609 <jim19930609@gmail.com>
      Co-authored-by: Wang Huan <wanghuan29@baidu.com>
  2. 23 Dec, 2021 16 commits
  3. 22 Dec, 2021 11 commits