- 06 Jan, 2022 3 commits
-
-
Committed by zyfncg
* adjust the full kernel * remove creation.h * use Empty to create tensor in full
-
Committed by YuanRisheng
* move gpu_impl of elementwise kernel * change copyright to 2022
-
Committed by jakpiase
* added exp activation and use_dst_for_bwd kernels * CI RERUN * minor change
-
- 05 Jan, 2022 15 commits
-
-
Committed by Lijunhui
* init commit: new elem_mul_grad * add template specialization for complex in multiply * reply review comments * correct dx and dy computation when T is complex * reply review comments * update to new ReduceFunctor * mul-output broadcast * call functions * call functions with comments * remove comments
-
Committed by From00
* Fix bug of GetAllocatorInterfaceTest * Replace some shared_ptr with unique_ptr * Change Alloc call
-
Committed by joanna.wozna.intel
-
Committed by TTerror
-
Committed by wanghuancoder
* Rearranged Eager AutoCodeGen directory structure * Removed USE_OP in Eager AutoCodeGen * Enabled generation for Operators without Grad/Inputs/Outputs * Resolved operators without input * Fixed merge conflicts * Enabled Eager AutoCodeGen for 10+ more operators * Refactored Eager AutoCodeGen with more organized helper objects * Enabled Eager AutoCodeGen for operators with multiple OpBases * Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument * Handled Dispensable Inputs/Outputs in Eager AutoCodeGen * Adjusted function generation/call between Python-C API & Dygraph API * Synchronized auto-generated Python-C API with Dygraph Forward Functions * support more eager tensor api * fix merge compile error * fix compile error and fit develop code * support pure CPU * fix some logic error in eager_mode * support _varbase_creator in eager mode * Added safe_initialized interface to EagerTensor for use in processing dispensable inputs * for eager mode * refine * support multiple constructor for eager tensor * add place related code * polish code * specific randint with dtype of int64 * Support pure cpu test * eager logic * refine test in pure cpu * eager logic * eager logic * eager logic, test=develop * skip core.eager when in inference, test=develop * refine, test=develop * refine, test=develop * call RetainGrad after run forward kernel, test=develop * refine, test=develop * support dygraph util, meta, guard test * eager test case * support inference test * refine test and fix initializer failed * modify eagertensor patch method * add eagertensor.clear_gradient, test=develop * refine, test=develop * refine, test=develop * refine, test=develop * call monkey_patch_varbase in _test_eager_guard, test=develop * split clear_gradient to clear_gradient and zero_grads, test=develop * refine, test=develop * refine, test=develop * refine, test=develop Co-authored-by: jim19930609 <jim19930609@gmail.com> Co-authored-by: JiabinYang <360788950@qq.com>
-
Committed by wangxinxin08
-
Committed by chentianyu03
* change 'math' to 'math_kernel' * fix compile bugs * merge develop * fix compile bugs * fix compile bugs * move reduce files by new rule * add set header * format code style * merge develop and fix conflict * merge develop and fix conflict Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
-
Committed by Chen Weihang
* polish infermeta filename * polish infermeta filename
-
Committed by jakpiase
* fix for matmul_v2 broadcasting * fix for output shape not broadcasted
-
Committed by Wilber
* c_api support std::string * update * update * add NOTE * fix delete error.
-
Committed by joanna.wozna.intel
* Quantize nearest_interp and nearest_interp_v2 * Check if avx_core supported * Add depthwise_conv2d to supported quantization list
-
Committed by TTerror
* add huber_loss for kunlun * update xpu.cmake * update unit tests * update unit tests * update elementwise_add * update elementwise_add * update elementwise_add
-
Committed by Weilong Wu
* Support EagerTensor init with kwargs * Updated comments * Updated unit test cases * Refactor InitTensor related code to reduce duplicate code * Updated the error reporting msg * Updated VLOG msg * Merge develop and Update EagerTensor init func * Polish switch case, reduce some code * Add SyntaxError unit test case * Refactor the related initialization func of EagerTensor * Remove ParseStopGradient and ParseZeroCopy and ParsePersistable, construct ParseBooleanArgs instead. * Updated error msg to pass CI * Updated PADDLE_ENFORCE error type
-
Committed by crystal
* add elementwise div * move mul and div grad functor * Combine multiple CUDA kernels * Update the reduce interface call * add multi-output * add multi-output div * add branch judge * Package branch * Combine the x and y functions into one
-
Committed by 王明冬
-
- 04 Jan, 2022 14 commits
-
-
Committed by niuliling123
Add OpFunctor and replace cast, scale, clip, bce_loss and abs_grad with elementwise_no_broadcast (#38500)
-
Committed by Qi Li
-
Committed by Aurelius84
* Fix memcpyD2H sync behavior with other stream * add wait * add wait * add wait
-
Committed by YuanRisheng
* change 'math' to 'math_kernel' * fix compile bugs * merge develop * fix compile bugs * move cpu_impl of elementwise kernel to new directory
-
Committed by furnace
[NPU] add pad and pad_grad
-
Committed by LiYuRio
-
Committed by jakpiase
-
Committed by zhangkaihuo
-
Committed by 王明冬
-
Committed by Zhanlue Yang
[Unify Tensors PR #3]Port framework::Tensor members & interfaces to pten::DenseTensor, test=allcases (#38473) * Added shared_ptr<Allocation> member & corresponding interfaces to Storage * Removed original pten::Allocation from Storage and adjusted the interfaces accordingly * Fixed issues with storage offset * Used place to malloc allocation for TensorStorage * [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor * Fixed issues with place * Added comments * Moved mutable_data with stream argument to DenseTensor * Added set_offset interface * Fixed CI issues,test=allcases * [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor * Reverted changes to pten_layout() interface * Removed friend classes
-
Committed by houj04
-
Committed by Chen Weihang
* move inner cast api to cast_kernel.h * resolve conflict
-
Committed by yaoxuefeng
heter context support dynamic mf dim
-
Committed by zlsh80826
-
- 31 Dec, 2021 8 commits
-
-
Committed by Zhangjingyu06
* [XPU]add split op for kunlun2,*test=kunlun * [XPU]add split op for kunlun2,*test=kunlun * [XPU]add split op for kunlun,*test=kunlun Co-authored-by: QingshuChen <chenqingshu@baidu.com>
-
Committed by JYChen
* add new api/op kthvalue * kthvalue cuda kernel to cub sorting * fix example code error * throw errors instead of LOG in cuda sort * throw errors by PADDLE_ENFORCE
-
Committed by baoachun
* add mul_gru_fuse_pass ut * update ut * update ut * update ut timeout setting * update ut
-
Committed by jakpiase
* glog fix * changed approach
-
Committed by jakpiase
* 6 dims fix * removed limitations of max dims
-
Committed by YuanRisheng
* change 'math' to 'math_kernel' * fix compile bugs * merge develop * fix compile bugs * fix compile bugs
-
Committed by zmxdream
-
Committed by tianshuo78520a
-