- 31 Dec, 2021 12 commits
-
-
Committed by JYChen
* add new api/op kthvalue * kthvalue cuda kernel to cub sorting * fix example code error * throw errors instead of LOG in cuda sort * throw errors by Paddle_ENFORCE
-
Committed by baoachun
* add mul_gru_fuse_pass ut * update ut * update ut * update ut timeout setting * update ut
-
Committed by Xiaoxu Chen
* add beta distribution * add kl_divergence and register_kl api
-
Committed by fwenguang
* [MLU]support calling mlu op from python interface * [MLU]fix * fix * [mlu]fix mlu_places * [mlu]fix required mlu * fix * [MLU]fix tensor copy * [mlu] fix MLUPlace call path
-
Committed by JYChen
* add new api paddle.quantile and paddle.Tensor.quantile * add take_todo and fix UT
-
Committed by zhiboniu
-
Committed by xiayanming
* [Auto Parallel] add gradient merge pass * fix ci issue * fix ci issue * fix ci issue * fix ci issue * fix ci issue * fix ci issue * fix ci issue * fix ci issue * fix ci issue * fix pr review * fix pr review * fix pr review * fix pr review * fix pr review * fix pr review
-
Committed by xiaoting
* add fold operators, test=develop * add fold operators, test=develop * add fold operators, test=develop * update fold op error test, test=develop * fix unittest, test=develop * fix unittest, test=develop
-
Committed by Double_V
-
Committed by Huihuang Zheng
Paddle new APIs: put_along_axis. Xu Huang is on holiday so we created this PR to work on it. It is based on his PR: https://github.com/PaddlePaddle/Paddle/pull/37921
-
Committed by zhiboniu
-
Committed by Chen Weihang
* unify data layout * fix test_transfer_layout error
-
- 30 Dec, 2021 13 commits
-
-
Committed by zhiboniu
LGTM
-
Committed by houj04
* add sigmoid cross entropy with logits to kl1. test=kunlun * add sigmoid cross entropy with logits to kl1. test=kunlun
-
Committed by zhangyk0314
Add exp, abs_grad, reciprocal, reciprocal_grad operators for XPU and update xpu2_op_list.h, test=kunlun (#38570)
-
Committed by JYChen
* add new OP mode * rename trans-variable name and fix UT
-
Committed by Yulong Ao
* [Auto parallel] Make the id of var and op unique * [Auto Parallel] Rename back dist_context to distop_context
-
Committed by Haohongxiang
* add cpu kernel of lstsq * update * modify code style * modify unittest * remove support for complex
-
Committed by Jiabin Yang
* Rearranged Eager AutoCodeGen directory structure * Removed USE_OP in Eager AutoCodeGen * Enabled generation for Operators without Grad/Inputs/Outputs * Resolved operators without input * Fixed merge conflicts * Enabled Eager AutoCodeGen for 10+ more operators * Refactored Eager AutoCodeGen with more organized helper objects * Enabled Eager AutoCodeGen for operators with multiple OpBases * Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument * Handled Dispensable Inputs/Outputs in Eager AutoCodeGen * Adjusted function generation/call between Python-C API & Dygraph API * Synchronized auto-generated Python-C API with Dygraph Forward Functions * support more eager tensor api * fix merge compile error * fix compile error and fit develop code * support pure CPU * fix some logic error in eager_mode * support _varbase_creator in eager mode * Added safe_initialized interface to EagerTensor for use in processing dispensable inputs * for eager mode * refine * support multiple constructor for eager tensor * add place related code * polish code * specific randint with dtype of int64 * Support pure cpu test * eager logic * refine test in pure cpu * eager logic * eager logic * eager logic, test=develop * skip core.eager when in inference, test=develop * refine, test=develop * refine, test=develop * call RetainGrad after run forward kernel, test=develop * refine, test=develop * support dygraph util, meta, guard test * support inference test * refine test and fix initializer failed * support create varbase and fix retain grad error * fix windows error * support test_imperative_basic test in eager mode * remove additional log in variable.h * remove additional log in variable.h * remove additional code create in merge Co-authored-by: jim19930609 <jim19930609@gmail.com> Co-authored-by: Wang Huan <wanghuan29@baidu.com>
-
Committed by jakpiase
* working test for padding only * added full conv2d grad kernel * removed some trash * minor change * Ci fix * format fix
-
Committed by zmxdream
-
Committed by Chen Weihang
* remove offset in storage * revert api change * fix custom op slice bug * fix mutable_data error
-
Committed by Xiaoxu Chen
* extend Distribution base class to support multivariate distributions and the prob method * add ExponentialFamily base class and entropy using Bregman divergence * add dirichlet probability distribution
-
Committed by Xiaoxu Chen
* add dirichlet sample op and cpu backend kernel * add Dirichlet op cuda kernel (#6) * add dirichlet op hip kernel Co-authored-by: Feiyu Chan <chenfeiyu@baidu.com>
-
Committed by Leo Guo
* Fix the bug of batch_norm and batch_norm_grad op. Add the "roi_align" and "roi_align_grad" op in xpu2 op list. * Fix the bug of batch_norm and batch_norm_grad op. Add the "roi_align" and "roi_align_grad" op in xpu2 op list. test=kunlun Co-authored-by: Zibin <guozibin@baidu.com>
-
- 29 Dec, 2021 11 commits
-
-
Committed by Leo Chen
-
Committed by ShenLiang
* fix bug of dp in pfp16 * fix topo
-
Committed by zhouweiwei2014
-
Committed by zhangbo9674
* add bn_1d_2d_3d for fp16 decorate * add unittest
-
Committed by JZ-LIANG
* auto parallel sharding base * chmod * add unittest * set unittest cmake dist label * revise code according to review * chmod
-
Committed by 小湉湉
-
Committed by ykkk2333
-
Committed by Shang Zhizhou
-
Committed by heliqi
* del mkldnn options of baseline * add timeout for matmul_scale_fuse_pass * add timeout for matmul
-
Committed by TTerror
* add argsort/scatter for kunlun * update test_scatter * update xpu.cmake * update xpu.cmake * fix scatter
-
Committed by Tao Luo
-
- 28 Dec, 2021 4 commits
-
-
Committed by zhiboniu
-
Committed by baoachun
-
Committed by From00
* fix reshape move storage error * remove needless set type * alloc tensor by shared storage * Utilize StreamSafeCUDAAllocator to support fast GC in new executor * Fix compile error for Windows and ROCm * Fix compile error for Windows * Modify UT stream_safe_cuda_alloc_test * Modify UT stream_safe_cuda_alloc_test * Rewrite fast GC * Rewrite fast GC * Fix compile error for BOOST_GET_CONST * Fix compile error for BOOST_GET_CONST * Changes default stream for StreamSafeCUDAAllocator * Fix a small CI error * Remove some redundant code * Fix conflict * Fix compile error for ROCm * Fix Windows CI error * Fix CI error * Remove some unnecessary code * Fix CI error * Add UT for fast GC * Fix CI error * add device-agnostic stream class * add stream.h * fix ut * fix cpu compile * Use RWLock in GetAllocator * Fix CI error Co-authored-by: Chen Weihang <chenweihang@baidu.com> Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
-
Committed by heliqi
* add matmul_to_mul matmul_v2_to_mul matmul_v2_to_matmul test case * modify skip func to ignore_pass_case func * rebuild CI * rebuild CI * add test_map_xx_pass timeout * add test_map_xx_pass timeout * merge from develop * add timeout notest;test=coverage * Cmakelist add timeout * add timeout * add attr of matmul_v2 * add trt skip * delete trt config * add skip, mul diff on 3080
-