- 13 May, 2023 1 commit
-
-
Committed by Zhang Jun
* scale, square, sum, swish trt op converter support zero dim (#53660)
* [Paddle-Inference] Support trt 0dims of expand_as_v2 and mish. (#53627)
* support_expand_mish
* add unittest for reshape 0 dims (#53685)
* Add trt pow converter. (#53462)
* Add trt pow converter.
* update to use AddConstantLayer
* add dims=0 ut
* [inference Zero-Dim]add equal, elementwise_op trt 0d (#53704)
* [inference Zero-Dim]prelu trt converter support zero dim tensor (#53634)
* prelu op trt converter support zero dim
* [Inference Zero-Dim] Support trt 0dim of gelu, hard_swish, hard_sigmoid and leaky_relu (#53714)
* support_act
* delete_silu
* [inference zero dim] softmax, stack op trt converter support zero dim (#53729)
* softmax support
* support stack
* remove unused code
* update
---------
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
Co-authored-by: xiaoxiaohehe001 <49090790+xiaoxiaohehe001@users.noreply.github.com>
Co-authored-by: zhoutianzi666 <39978853+zhoutianzi666@users.noreply.github.com>
Co-authored-by: Wilber <jiweibo@baidu.com>
-
- 11 May, 2023 1 commit
-
-
Committed by JYChen
* up warning level
* numpy still vlog-0
-
- 10 May, 2023 1 commit
-
-
Committed by Bo Zhang
* Support different dtypes of inputs for broadcast for dropout optimization (#52093)
* change judgement for DropoutGradGPUKernelDriver
* add UnrollerWithoutVecSize and after this Loaddata to be refined
* pass unittest
* use same unroller with XPU
* BroadcastWithInt64Index
* BroadcastDataLoader template partial specialization
* fix compile errors in ROCm
* PR comment
* dropout_nd_optimization (#51479)
* with printf
* add DropOutNdForwardKernel
* PR comment
* Dropout optimize & clean broadcast inT and ElementwiseType (#52969)
* change judgement for DropoutGradGPUKernelDriver
* add UnrollerWithoutVecSize and after this Loaddata to be refined
* pass unittest
* use same unroller with XPU
* BroadcastWithInt64Index
* BroadcastDataLoader template partial specialization
* fix compile errors in ROCm
* clean ElementwiseT and InT for BroadcastKernel
* default axis and clean inT
* remove redundant fast divmod computation
* optimize drop_nd & drop_nd_grad
* optimize BroadcastDataLoader bf16 fp16
* rm InT etc. after merge develop
* delete constexpr for windows ci
* fix conflict
* fix conflict with develop
* fix conflict
* new clean
* clean
* Fix xpu2 kp compile error (#53548)
* fix conflict
* conflict
-
- 09 May, 2023 8 commits
-
-
由 zqw_1997 提交于
* fix doc erros, test=allcase * conflict * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * test=allcase * fix doc erros, test=allcase * fix the to_tensor error
-
Committed by zhouweiwei2014
-
Committed by limingshu
Cherry pick fused linear
-
Committed by Zhang Jun
* [inference][trt]trt support 0 dims (#53383)
* trt support 0 dim
* update activation ut
* fix trt unary operations not supporting 0d when TRT < 8.6
* Update op_teller.cc
* update unary ut
* add rsqrt to unary_list
* move rsqrt to act_list
-
Committed by GGBond8488
* add complex support for optest
* add complex grad test
* append one
* move some debug info
* move some debug info
* move some debug info
* move some debug info
* add more complex test
* Fix naming ambiguity
* Revert "add more complex test" This reverts commit dbcb0516b8e53ba42e2d6089878a39b395345969.
* change backward gradient, add TODO
-
Committed by zhouweiwei2014
* [Zero-Dim] make functools.reduce safer with an initial value, to support empty list (#53182)
* [Zero-Dim] support 0d tensor for shape and squeeze onednn kernel (#52832)
* support 0d tensor for shape and squeeze onednn kernel
* set python api for shape op ut
* [Zero-Dim] distributed scatter/all_to_all support input 0D tensor (#53186)
* [Zero-Dim] Support paddle.sum/mean/loss api output 0D, test=allcase (#52739)
* [CINN Support 0D-Tensor] CINN supports 0D-Tensor with trick temporarily (#53382)
* [CINN Support 0D-Tensor] CINN supports 0D-Tensor with trick temporarily
* Add unittest
* [CINN Support 0D-Tensor] CINN hack squeeze2 with trick temporarily (#53454)
* fix test_autograd_dynamic (#53473)
Co-authored-by: zhwesky2010 <zhouwei25@baidu.com>
---------
Co-authored-by: YangQun <qun.yang@intel.com>
Co-authored-by: HongyuJia <jiahongyu@baidu.com>
Co-authored-by: HydrogenSulfate <490868991@qq.com>
-
Committed by JYChen
* support 0-D output and 0-D as index in __getitem__
* fix tests
* fix inference and UT
* add unittest for setitem
* fix xpu test
* fix xpu 0-d
* fix right value is 0d and index is List/Tensor
* Hack __getitem__ from 0-d to 1-d with FLAGS_set_to_1d
* change PHI_DECLARE_xxx to DECLARE_xxx since the change not merged to 2.5
* hack 1-D tensor to Scalar
* throw warning at __getitem__, not slice_utils
-
Committed by cyber-pioneer
-
- 08 May, 2023 3 commits
-
-
Committed by zhoutianzi666
* add ```converter_type``` for op converter
-
Committed by niuliling123
Fix optimizer precision check bug
-
Committed by GGBond8488
* add 0D output support for linalg.slogdet, test=allcase
* fix zero dim test error, test=allcase
* fix test error, test=allcase
* add static backward test, test=allcase
* support_0D_output_for_matrix_rank_multi_dot, test=allcase
* add 0D output test for matrix_rank and multi_dot, test=allcase
* fix assert error, test=allcase
* fix test error, test=allcase
* fix other test error, test=allcase
* fix other test error, test=allcase
* fix test error, test=allcase
* fix matrix_rank and multi_dot test error, test=allcase
* fix test error, test=allcase
* fix zero dim test, test=allcase
* add static backward test for multi_dot, test=allcase
* add tol 2d broadcast test case, test=allcase
* fix test error, test=allcase
* fix test error, test=allcase
* test=allcase
* support_0d_output_for_linalg.norm
* fix test error, test=allcase
* fix 0D test
* fix test error, test=allcase
* fix test error, test=allcase
* fix tests, test=allcase
* fix error, test=allcase
* fix errors, test=allcase
* add static backward, test=allcase
* add static backward test, test=allcase
* slogdet_support_0D_output
* add new case
* fix tests, test=allcase
* cherry-pick
* cherry-pick
* fix trace gpu kernel 0d error, test=allcase
* fix windows error, test=allcase
* add matrixrank cherry-pick
-
- 06 May, 2023 1 commit
-
-
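Committed by Zhang Zheng
Low-precision operator support and additional unit tests; merges cherry-picks of 17 Hackathon PRs, covering low-precision support and improvements for 25 ops in total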
-
- 05 May, 2023 2 commits
-
-
Committed by Aurelius84
[Dy2St]Get grad names when call append backward to fix high order gradient (#53250)
Co-authored-by: WangZhen <23097963+0x45f@users.noreply.github.com>
-
Committed by Zhang Ting
Cherry-pick AMP
-
- 04 May, 2023 1 commit
-
-
Committed by Yuanle Liu
-
- 28 April, 2023 1 commit
-
-
Committed by duanyanhui
-
- 27 April, 2023 1 commit
-
-
Committed by zhouweiwei2014
[cherry-pick2.5] [Zero-Dim] Support all/any/min/max/prod/logsumexp/amax/amin/some loss output 0D (#53192)
-
- 25 April, 2023 2 commits
-
-
Committed by niuliling123
Remove excessive log printing
-
Committed by niuliling123
Add enable_tensor_checker and disable_tensor_checker APIs (#52936)
-
- 24 April, 2023 2 commits
-
-
Committed by niuliling123
Print the forward's stack when backward op has nan/inf and FLAGS_check_nan_inf_level = 0
Delete temp param in eager_gen
- 23 April, 2023 2 commits
-
-
Committed by JYChen
* support 0-D output and 0-D as index in __getitem__
* fix tests
* fix inference and UT
* add unittest for setitem
* fix xpu test
* fix xpu 0-d
-
Committed by Ghost Screaming
* Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result is wrong.
* Remove climits.
* Fix bug of BlockDesc::MoveFrom(). It is used to rebuild main_program_desc from the ProgramDesc modified by a Fusion Pass. Because some fused operators need to create new Variables in the modified ProgramDesc, MoveFrom uses std::move() to move these VarDesc objects to main_program_desc. As a result, the pointers held by the modified ProgramDesc become nullptr. Calling block()->Program()->proto() first calls ProgramDesc::Flush(), which may cause a segmentation fault.
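The reduce_sum fix concerns inputs with more than INT32_MAX elements. The commit message does not state the root cause; one plausible reason, assumed here purely for illustration, is an element count or index stored in a signed 32-bit integer. The sketch below uses plain Python (not Paddle) to show what such a value becomes after narrowing; `wrap_int32` and the tensor shape are hypothetical.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Return the value a signed 32-bit variable would hold (two's complement)."""
    return (value + 2**31) % 2**32 - 2**31


# Hypothetical shape whose element count exceeds INT32_MAX.
shape = (65536, 32769)
numel = shape[0] * shape[1]  # 2**31 + 65536

print(numel > INT32_MAX)   # the true count does not fit in int32
print(wrap_int32(numel))   # the count as seen through a 32-bit variable
```

A count that turns negative or small after narrowing would make any loop bound or offset derived from it wrong, which is consistent with a wrong result rather than a crash.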
-
- 20 April, 2023 3 commits
-
-
Committed by chalsliu
-
Committed by Yuanle Liu
* remove c++14 assert and remove include tensor.h in phi
* update
* remove delete_cast_op_pass
-
Committed by ronnywang
* [CustomDevice] add c_identity op
* fix use calc stream
-
- 17 April, 2023 10 commits
-
-
Committed by YuanRisheng
* unify kernel
* fix ci bugs
* fix py3 bugs
* fix py3 bugs
* polish code
-
Committed by lzydev
* fix bug in parse args
* fix bug
* recover legacy_*.yaml
* change 'Out' to Output
-
Committed by LoneRanger
-
Committed by Galaxy1458
-
Committed by wangzhen38
* [CINN] fix concat&pow
* update concat
* composite_backward_api
* for ci
* for ci
* update test & fix opmaker
-
Committed by JingZhuangzhuang
-
Committed by 张春乔
-
Committed by Sonder
* add register info for eigh and eig_grad
* add sync_batch_norm_op.cu register info
* add lamb output register info
* add unique register info
* change type name
* change type name
* add output register info for check_finite_and_unscale
* update cmake and config file
* add register info for adagrad
* fix build error
* add sync to run_unittests.sh
* add register info for unique_consecutive
* fix build error
* add eigh to STATIC_BUILD_TESTS
* update eig_kernel.cc
* update eig_kernel.cc
* fix infer meta error
* fix unique register error
* fix lamb register info error
* fix lamb register info
* update lamb register info
* fix lamb
* remove one Output Register
* update static build file
* add eigh op to disable_wingpu_test
* update run_unittests
-
Committed by Zhang Zheng
* [AMP OP&Test] Sync_batch_norm support bfloat16
* fix
* fix
-
Committed by Haohongxiang
-
- 15 April, 2023 1 commit
-
-
Committed by HongyuJia
-