- 10 May, 2023 5 commits
-
-
Committed by Yiqun Liu
cherry-pick #53659
-
Committed by Zhang Zheng
Fix bug in log_softmax kernel when lastdim is larger than 100000. There is an unexpected log in the calculation. Cherry-Pick: #53654
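For context, a numerically stable log_softmax over the last dimension takes exactly one logarithm, applied to the sum of the shifted exponentials. The pure-Python sketch below is illustrative only and is not the Paddle kernel; the function name `log_softmax` and the list-based input are assumptions made for the example.

```python
import math


def log_softmax(row):
    """Reference log_softmax for one row (illustrative, not the Paddle kernel)."""
    row_max = max(row)  # subtract the max so exp() cannot overflow
    sum_exp = sum(math.exp(v - row_max) for v in row)
    log_sum = math.log(sum_exp)  # the only logarithm in the computation
    return [v - row_max - log_sum for v in row]


if __name__ == "__main__":
    # A row longer than 100000 elements, the size range named in the commit.
    out = log_softmax([0.5] * 100001)
    print(out[0])
```

A quick sanity check for any implementation is that the exponentials of the outputs sum to 1.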
-
Committed by Qi Shao
Revert argsort to the version without full sort algorithm implemented
-
Committed by Bo Zhang
* Support different dtypes of inputs for broadcast for dropout optimization (#52093)
* change judgement for DropoutGradGPUKernelDriver
* add UnrollerWithoutVecSize and after this Loaddata to be refined
* pass unittest
* use same unroller with XPU
* BroadcastWithInt64Index
* BroadcastDataLoader template partial specialization
* fix compile errs in ROCms
* PR comment
* dropout_nd_optimization (#51479)
* with printf
* add DropOutNdForwardKernel
* PR comment
* Dropout optimize & clean broadcast inT and ElementwiseType (#52969)
* change judgement for DropoutGradGPUKernelDriver
* add UnrollerWithoutVecSize and after this Loaddata to be refined
* pass unittest
* use same unroller with XPU
* BroadcastWithInt64Index
* BroadcastDataLoader template partial specialization
* fix compile errs in ROCms
* clean ElementwiseT and InT for BroadcastKernel
* default axis and clean inT
* remove redundant fast divmod computation
* optimize drop_nd & drop_nd_grad
* optimize BroadcastDataLoader bf16 fp16
* rm InT etc. after merge develop
* delete constexpr for windows ci
* fix conflict
* fix conflict with develop
* fix conflict
* new clean
* clean
* Fix xpu2 kp compile error (#53548)
* fix conflict
* conflict
-
Committed by zhouweiwei2014
-
- 09 May, 2023 4 commits
-
-
Committed by limingshu
Cherry pick fused linear
-
Committed by GGBond8488
* add complex support for optest
* add complex grad test
* append one
* move some debug info
* move some debug info
* move some debug info
* move some debug info
* add more complex test
* Fix naming ambiguity
* Revert "add more complex test" This reverts commit dbcb0516b8e53ba42e2d6089878a39b395345969.
* change backward gradient, add TODO
-
Committed by zhouweiwei2014
* [Zero-Dim] make functools.reduce safer with an initial value, to support an empty list (#53182)
* [Zero-Dim] support 0d tensor for shape and squeeze onednn kernel (#52832)
* support 0d tensor for shape and squeeze onednn kernel
* set python api for shape op ut
* [Zero-Dim] distributed scatter/all_to_all support input 0D tensor (#53186)
* [Zero-Dim] Support paddle.sum/mean/loss api output 0D,test=allcase (#52739)
* [CINN Support 0D-Tensor] CINN supports 0D-Tensor with trick temporarily (#53382)
* [CINN Support 0D-Tensor] CINN supports 0D-Tensor with trick temporarily
* Add unittest
* [CINN Support 0D-Tensor] CINN hack squeeze2 with trick temporarily (#53454)
* fix test_autograd_dynamic (#53473) Co-authored-by: zhwesky2010 <zhouwei25@baidu.com>
---------
Co-authored-by: YangQun <qun.yang@intel.com>
Co-authored-by: HongyuJia <jiahongyu@baidu.com>
Co-authored-by: HydrogenSulfate <490868991@qq.com>
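The first item in this commit relies on general Python behaviour: `functools.reduce` raises `TypeError` on an empty sequence unless an initial value is supplied. A minimal sketch, assuming a hypothetical helper `numel` that multiplies the entries of a shape (a 0-D tensor has the empty shape `[]`); the helper names are illustrative and not taken from the Paddle source:

```python
from functools import reduce
import operator


def numel(shape):
    """Number of elements for a shape; hypothetical helper for illustration."""
    # The initial value 1 makes the empty shape [] (a 0-D tensor) return 1.
    return reduce(operator.mul, shape, 1)


def numel_unsafe(shape):
    """Same product, but without an initial value."""
    # reduce() raises TypeError here when `shape` is an empty list.
    return reduce(operator.mul, shape)


if __name__ == "__main__":
    print(numel([]))         # empty shape is handled
    print(numel([2, 3, 4]))  # ordinary shape
```

Passing the identity element of the operation (1 for multiplication) as the initial value leaves non-empty results unchanged.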
-
Committed by JYChen
* support 0-D output and 0-D as index in __getitem__
* fix tests
* fix inference and UT
* add unittest for setitem
* fix xpu test
* fix xpu 0-d
* fix right value is 0d and index is List/Tensor
* Hack __getitem__ from 0-d to 1-d with FLAGS_set_to_1d
* change PHI_DECLARE_xxx to DECLARE_xxx since the change not merged to 2.5
* hack 1-D tensor to Scalar
* throw warning at __getitem__, not slice_utils
-
- 08 May, 2023 3 commits
-
-
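Committed by Zhang Zheng
Cherry-Pick: #53582 Change: in the division out = x / y, the backward formula for y is changed from dy = -dout * out / y to dy = -dout * ((x / y) / y). Reason: when the forward result is used as the backward input, the cast at low precision has already introduced some precision loss, so recomputing the quotient gives a more accurate result. Impact: this change makes the result more precise, with negligible effect on performance.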
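The effect described in this commit can be sketched in plain Python. The example assumes that the stored forward result `out` has been rounded to float16, while the recomputation of `x / y` runs at higher precision (Python's double here); the helper names `to_fp16`, `grad_y_from_out`, and `grad_y_recomputed` are hypothetical and are not Paddle functions.

```python
import struct


def to_fp16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", value))[0]


def grad_y_from_out(dout, out, y):
    # Old formula: reuses the stored forward result `out`.
    return -dout * out / y


def grad_y_recomputed(dout, x, y):
    # New formula: recomputes x / y instead of reading the stored result.
    return -dout * ((x / y) / y)


if __name__ == "__main__":
    x, y, dout = 1.0, 3.0, 1.0
    out_fp16 = to_fp16(x / y)    # assumed: forward result after a float16 cast
    exact = -dout * x / (y * y)  # analytic gradient of x / y with respect to y
    print("error reusing out:", abs(grad_y_from_out(dout, out_fp16, y) - exact))
    print("error recomputing:", abs(grad_y_recomputed(dout, x, y) - exact))
```

The two formulas are algebraically identical; they differ only in whether the rounding error already present in `out` is carried into the gradient.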
-
Committed by Yiqun Liu
* Add fused_gate_attention API. (#53432)
* Add PADDLE_THROW in take_along_axis kernel when the datatype of index is wrong. (#53556)
-
Committed by GGBond8488
* add 0D output support for linalg.slogdet, test=allcase
* fix zero dim test error test=allcase
* fix test error test=allcase
* add static backward test, test=allcase
* support_0D_output_for_matrix_rank_multi_dot, test=allcase
* add 0D output test for matrix_rank and multi_dot test=allcase
* fix assert error, test=allcase
* fix test error, test=allcase
* fix other test error, test=allcase
* fix other test error, test=allcase
* fix test error, test=allcase
* fix matrix_rank and multi dot test err test=allcase
* fix test error test=allcase
* fix test zero dim test, test=allcase
* add static backward test for multi_dot, test=allcase
* add tol 2d broadcast test case, test=allcase
* fix test error test=allcase
* fix test error test=allcase
* test=allcase
* support_0d_output_for_linalg.norm
* fix test error test=allcase
* fix 0D test
* fix test error test=allcase
* fix test error test=allcase
* fix tests, test=allcase
* fix error, test=allcase
* fix errors, test=allcase
* add static backward, test=allcase
* add static backward test, test=allcase
* slogdet_support_0D_output
* add new case
* fix tests, test=allcase
* cherry-pick
* cherry-pick
* fix trace gpu kernel 0d error, test=allcase
* fix windows error, test=allcase
* add matrixrank cherry-pick
-
- 06 May, 2023 2 commits
-
-
Committed by zhangkaihuo
att, cherry-pick: #52902 #53113
-
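Committed by Zhang Zheng
Low-precision operator support and additional unit tests: merges the cherry-picks of 17 Hackathon PRs, covering low-precision support and improvements for 25 OPs in total.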
-
- 27 April, 2023 2 commits
-
-
Committed by zhouweiwei2014
[cherry-pick2.5] [Zero-Dim] Support all/any/min/max/prod/logsumexp/amax/amin/some loss output 0D (#53192)
-
Committed by wangfengsheng1999
[Cherry-Pick] Support output 0D for is_empty/as_complex/inner/dot/rank/tensordot/squeeze_/static.accuracy/static.auc/metric.accuracy (#53199)
* support output 0D for is_empty/as_complex/inner/dot/rank/tensordot/squeeze_/static.accuracy/static.auc/metric.accuracy
* test_dot_py
* test_dot_py
-
- 25 April, 2023 2 commits
- 24 April, 2023 1 commit
-
- 23 April, 2023 1 commit
-
-
Committed by JYChen
* support 0-D output and 0-D as index in __getitem__
* fix tests
* fix inference and UT
* add unittest for setitem
* fix xpu test
* fix xpu 0-d
-
- 21 April, 2023 1 commit
-
-
Committed by JYChen
* fix the set_value error in cpu
* add a unit test for set_value OP
* fix platform::is_gpu_place
* add todo note for set_value
* fix test
-
- 20 April, 2023 1 commit
-
-
Committed by GGBond8488
* add 0D output support for linalg.slogdet, test=allcase
* fix zero dim test error test=allcase
* fix test error test=allcase
* add static backward test, test=allcase
-
- 19 April, 2023 1 commit
-
-
Committed by Zhang Zheng
unique now supports the float16 and bfloat16 data types, and the related unit tests are improved.
-
- 17 April, 2023 7 commits
-
-
Committed by Chitsing KUI
* add random control for fused dropout add
* add __init__
-
Committed by Vvsmile
* fix multinomial
* fix test_elementwise
* fix convert_float_to_uint16
* add test_multimial_op
* fix code style
-
Committed by thunder95
* untracked files
* bce_loss_fp16
* remove unused files
* back max_rel_erro still big
* simplify code
* upd
* fix max_relative_error
* restart ci
* Update test_bce_loss.py
* Update test_bce_loss.py
* Update test_bce_loss.py
* Update test_bce_loss.py
* try to pass test
* restore file
* remove error value
* fix bug
---------
Co-authored-by: Zhang Ting <Douyaer2020@qq.com>
-
Committed by Jiabin Yang
* fix multiply double grad error
* fix multiply dy only kernel
-
Committed by Hanchiao
* Implement optimized kernel for OP-expand_as.
* Support fp16. Co-authored-by: Timber-Ye <ye_hanqiao@163.com> Co-authored-by: BrianQian1999 <brianqianhitsz@gmail.com>
* remove fp16 support
* remove MAX_RANK_SUPPORTED
---------
Co-authored-by: BrianQian1999 <brianqianhitsz@gmail.com>
-
Committed by zhangyuqin1998
-
Committed by Sonder
* add register info for eigh and eig_grad
* add sync_batch_norm_op.cu register info
* add lamb output register info
* add unique register info
* change type name
* change type name
* add output register info for check_finite_and_unscale
* update cmake and config file
* add register info for adagrad
* fix build error
* add sync to run_unittests.sh
* add register info for unique_consecutive
* fix build error
* add eigh to STATIC_BUILD_TESTS
* update eig_kernel.cc
* update eig_kernel.cc
* fix infer meta error
* fix unique register error
* fix lamb register info error
* fix lamb register info
* update lamb register info
* fix lamb
* remove one Output Register
* update static build file
* add eigh op to disable_wingpu_test
* update run_unittests
-
- 14 April, 2023 10 commits
-
-
Committed by Zhang Zheng
-
Committed by cyberslack_lee
-
Committed by cyberslack_lee
-
Committed by chenxujun
* Add digamma, dirichlet tests
* Fix code
-
Committed by superwinner1
* add erf FP16 test
-
Committed by chenxujun
-
Committed by umiswing
-
Committed by YangQun
[Zero-Dim] support 0-D tensor for reduce/reshape/stack/prelu/expand_v2/gaussian onednn kernels (#52185)
* support 0-D tensor for reduce/reshape/stack/prelu/expand_v2/gaussian ops
* fix gaussian random mkldnn op ut
-
Committed by gouzil
* [phi] move sequence_pool kernel to phi
* [phi] mv sequence_pooling to phi funcs
* [phi] mv sequence_pooling_test
* [phi] RollBACK `paddle/fluid/operators/sequence_ops/sequence_pool_op.cc`
* [phi][funcs] fix mutable_data
* [phi][funcs] fix mutable_data
-
Committed by sneaxiy
-