1. 10 Jan, 2022 11 commits
    • Z
      [Unify Tensors PR #6] Removed interfaces & members from lod_tensor,test=allcases (#38811) · 953638e0
      Zhanlue Yang committed
      * Added shared_ptr<Allocation> member & corresponding interfaces to Storage
      
      * Removed original pten::Allocation from Storage and adjusted the interfaces accordingly
      
      * Fixed issues with storage offset
      
      * Used place to malloc allocation for TensorStorage
      
      * [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor
      
      * Fixed issues with place
      
      * Added comments
      
      * Moved mutable_data with stream argument to DenseTensor
      
      * Added set_offset interface
      
      * Fixed CI issues,test=allcases
      
      * [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor
      
      * Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor
      
      * Modified framework::Tensor to inherit from DenseTensor
      
      * Reverted changes to pten_layout() interface
      
      * Removed friend classes
      
      * Rearranged cfunction calls from tensor.data<void>() to tensor.data()
      
      * Fixed CI issues
      
      * Fixed lite issues
      
      * Fixed data() interface issues,test=allcases
      
      * Resolved IsInitialized() issues
      
      * Fixed ResetHolder() issues
      
      * Fixed MKLDNN & Storage issues
      
      * Resolved ShareBufferWith() issues
      
      * Fixed LoD issues
      
      * Removed interfaces & members from lod_tensor,test=allcases
      953638e0
    • S
      [bug fix] fix unfold runtime bug (#38819) · 5c357504
      shangliang Xu committed
      5c357504
    • L
      Profiler skeleton (#38826) · a8afed69
      liutiexing committed
      * add align for WorkQueue
      
      * add spinlock
      
      * merge develop
      
      * merge
      
      * Add EventsWaiter
      
      * Revert "Add EventsWaiter"
      
      This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.
      
      * profiler skeleton
      
      * update
      
      * update
      
      * update
      Co-authored-by: liutiexing <liutiexing@google.com>
      a8afed69
    • T
    • W
      fix attr missing in conv cudnn kernel (#38827) · 066a8063
      wangxinxin08 committed
      066a8063
    • Z
      [Unify Tensors PR #5] framework::Tensor inherits from DenseTensor,test=allcases (#38632) · 5c73a6ea
      Zhanlue Yang committed
      * Added shared_ptr<Allocation> member & corresponding interfaces to Storage
      
      * Removed original pten::Allocation from Storage and adjusted the interfaces accordingly
      
      * Fixed issues with storage offset
      
      * Used place to malloc allocation for TensorStorage
      
      * [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor
      
      * Fixed issues with place
      
      * Added comments
      
      * Moved mutable_data with stream argument to DenseTensor
      
      * Added set_offset interface
      
      * Fixed CI issues,test=allcases
      
      * [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor
      
      * Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor
      
      * Modified framework::Tensor to inherit from DenseTensor
      
      * Reverted changes to pten_layout() interface
      
      * Removed friend classes
      
      * Rearranged cfunction calls from tensor.data<void>() to tensor.data()
      
      * Fixed CI issues
      
      * Fixed lite issues
      
      * Fixed data() interface issues,test=allcases
      
      * Resolved IsInitialized() issues
      
      * Fixed ResetHolder() issues
      
      * Fixed MKLDNN & Storage issues
      
      * Resolved ShareBufferWith() issues
      
      * Fixed LoD issues
      5c73a6ea
    • C
      Support setting infershape function for custom grad op (#38776) · 046553c7
      Chen Weihang committed
      * unify infer_shape func calling
      
      * support set grad infer shape fn for custom op
      
      * unify infershape in new executor and eager
      
      * remove todo comment
      
      * revert infershape in operator
      046553c7
    • L
      [fleet_executor] Add barrier rpc (#38799) · cd2855b0
      LiYuRio committed
      cd2855b0
    • A
      Add MaxUnPool3D op and MaxUnPool1D op (#38716) · 7e31542c
      andyjpaddle committed
      * add maxunpool3d op
      
      * update doc for maxunpool3d op
      
      * update doc for maxunpool3d op
      
      * update doc for maxunpool3d op
      
      * update sample code for maxunpool3d
      
      * add maxunpool1d op
      
      * update some code for maxunpool1d
      7e31542c
    • G
    • G
  2. 07 Jan, 2022 6 commits
  3. 06 Jan, 2022 10 commits
  4. 05 Jan, 2022 13 commits
    • L
      optimize elementwise_mul_grad using new interfaces (#37728) · 36a102f8
      Lijunhui committed
      * init commit: new elem_mul_grad
      
      * add template specialization for complex in multiply
      
      * reply review comments
      
      * correct dx and dy computation when T is complex
      
      * reply review comments
      
      * update to new ReduceFunctor
      
      * mul-output broadcast
      
      * call functions
      
      * call functions with comments
      
      * remove comments
      36a102f8
    • F
      Fix bug for UT GetAllocatorInterfaceTest (#38720) · 905c8022
      From00 committed
      * Fix bug of GetAllocatorInterfaceTest
      
      * Replace some shared_ptr with unique_ptr
      
      * Change Alloc call
      905c8022
    • J
      60c51de5
    • T
      update masked_select_op for kunlun (#38678) · 40078103
      TTerror committed
      40078103
    • W
      [Eager] Support test imperative basic in eager test_empty_grad (#38376) · 9108e777
      wanghuancoder committed
      * Rearranged Eager AutoCodeGen directory structure
      
      * Removed USE_OP in Eager AutoCodeGen
      
      * Enabled generation for Operators without Grad/Inputs/Outputs
      
      * Resolved operators without input
      
      * Fixed merge conflicts
      
      * Enabled Eager AutoCodeGen for 10+ more operators
      
      * Refactored Eager AutoCodeGen with more organized helper objects
      
      * Enabled Eager AutoCodeGen for operators with multiple OpBases
      
      * Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument
      
      * Handled Dispensable Inputs/Outputs in Eager AutoCodeGen
      
      * Adjusted function generation/call between Python-C API & Dygraph API
      
      * Synchronized auto-generated Python-C API with Dygraph Forward Functions
      
      * support more eager tensor api
      
      * fix merge compile error
      
      * fix compile error and fit develop code
      
      * support pure CPU
      
      * fix some logic error in eager_mode
      
      * support _varbase_creator in eager mode
      
      * Added safe_initialized interface to EagerTensor for use in processing dispensable inputs
      
      * for eager mode
      
      * refine
      
      * support multiple constructor for eager tensor
      
      * add place related code
      
      * polish code
      
      * specific randint with dtype of int64
      
      * Support pure cpu test
      
      * eager logic
      
      * refine test in pure cpu
      
      * eager logic
      
      * eager logic
      
      * eager logic, test=develop
      
      * skip core.eager when in inference, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * call RetainGrad after run forward kernel, test=develop
      
      * refine, test=develop
      
      * support dygraph util, meta, guard test
      
      * eager test case
      
      * support inference test
      
      * refine test and fix initializer failed
      
      * modify eagertensor patch method
      
      * add eagertensor.clear_gradient, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * call monkey_patch_varbase in _test_eager_guard, test=develop
      
      * split clear_gradient to clear_gradient and zero_grads, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      Co-authored-by: jim19930609 <jim19930609@gmail.com>
      Co-authored-by: JiabinYang <360788950@qq.com>
      9108e777
    • W
      add depthwise_conv2d op for mkldnn (#38484) · e1cc2236
      wangxinxin08 committed
      e1cc2236
    • C
      [pten]Move reduce code new (#38648) · 7a4a512d
      chentianyu03 committed
      * change 'math' to 'math_kernel'
      
      * fix compile bugs
      
      * merge develop
      
      * fix compile bugs
      
      * fix compile bugs
      
      * move reduce files by new rule
      
      * add set header
      
      * format code style
      
      * merge develop and fix conflict
      
      * merge develop and fix conflict
      Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
      7a4a512d
    • J
      Fix for matmul_v2 oneDNN op broadcasting when inputs dims have different lengths (#38665) · 67923124
      jakpiase committed
      * fix for matmul_v2 broadcasting
      
      * fix for output shape not broadcasted
      67923124
    • W
      inference c_api support std::string (#38667) · f289cf85
      Wilber committed
      * c_api support std::string
      
      * update
      
      * update
      
      * add NOTE
      
      * fix delete error.
      f289cf85
    • J
      Quantize nearest_interp and nearest_interp_v2 (#38622) · 1456b02d
      joanna.wozna.intel committed
      * Quantize nearest_interp and nearest_interp_v2
      
      * Check if avx_core supported
      
      * Add depthwise_conv2d to supported quantization list
      1456b02d
    • T
      add huber_loss for kunlun (#38589) · a268c7ce
      TTerror committed
      * add huber_loss for kunlun
      
      * update xpu.cmake
      
      * update unit tests
      
      * update unit tests
      
      * update elementwise_add
      
      * update elementwise_add
      
      * update elementwise_add
      a268c7ce
    • W
      Support EagerTensor initialization with kwargs (#38488) · 4ba6d4e4
      Weilong Wu committed
      * Support EagerTensor init with kwargs
      
      * Updated comments
      
      * Updated unit tests case
      
      * Refactor InitTensor related code to reduce duplicate code
      
      * Updated the error reporting msg
      
      * Updated VLOG msg
      
      * Merge develop and Update EagerTensor init func
      
      * Polish switch case, reduce some code
      
      * Add SyntaxError unit test case
      
      * Refactor the related initialization func of EagerTensor
      
      * Remove ParseStopGradient and ParseZeroCopy and ParsePersistable, construct ParseBooleanArgs instead.
      
      * Updated error msg to pass CI
      
      * Updated PADDLE_ENFORCE error type
      4ba6d4e4
    • C
      implementation of broadcast div backward by reduce (#38044) · 55cd9cb8
      crystal committed
      * add elementwise div
      
      * move mul and div grad functor
      
      * Combine multiple CUDA kernels
      
      * Update the reduce interface call
      
      * add multi-output
      
      * add multi-output div
      
      * add branch judge
      
      * Package branch
      
      * Combine the x and y functions into one
      55cd9cb8