- 12 Apr, 2023 1 commit
Committed by Yiqun Liu
-
- 11 Apr, 2023 1 commit
Committed by Yiqun Liu
* Fix scale kernel for low precision, cherry pick #50998.
* Fix the FP16 precision problem of add_n. (#50129)
* Change squared_l2_norm to reuse ReduceKernel, and register fp16 and bf16 kernel, which is cherry pick #48315.
* Cherry-pick the fix of MPTypeTrait in KP, which is implemented in #50993.
* Cherry-pick the multi-precision support of AdamW for bf16, #48041.
* Fix compiling error.
* Cherry-pick the fix of CubTensorReduceImpl for bfloat16 in #50993.
* Fix unittest.
---------
Co-authored-by: liuruyan <44316842+liuruyan@users.noreply.github.com>
-
- 10 Apr, 2023 1 commit
Committed by Yiqun Liu
* Broadcast the master weight along with param for distributed training.
* Fix codestyle.
-
- 09 Apr, 2023 2 commits
Committed by Yiqun Liu
* Cherry-pick the register of bfloat16 for amp_kernel, pull request #45541.
* Cherry-pick the master_grad support of adamw, pull request #51141.
* add bf16 for some ops in static mode (#51582)
* Add bfloat16 support for some api in static mode.
* Fix codestyle.
* Revert the change of layer_function_generator.py.
---------
Co-authored-by: Shaojie WANG <wsjmessi@163.com>
-
Committed by Yiqun Liu
* Register exp/expm1/logit bf16 activation op kernels (#48702)
* register more bf16 ops
* update to register corresponding backward ops
* Addition of bf16 type support for Compare OP (#46413)
* first commit
* clarify the quotes
* change code style format
* support bfloat16
* add bfloat16 support for more ops (#48272)
* [Bfloat16]register bfloat16 datatype for squared l2 norm (#50908)
* Sync the pull request #51903.
* Add some header files back.
* modify cmake file for cuda11.8 compile (#49020)
* modify cmake file for cuda11.8 compile
* add op_library(fused_embedding_eltwise_layernorm_op DEPS bert_encoder_functor)
* Fix compiling error.
* Cherry-pick pull request #51396.
---------
Co-authored-by: sneaxiy <32832641+sneaxiy@users.noreply.github.com>
Co-authored-by: limingshu <61349199+JamesLim-sy@users.noreply.github.com>
Co-authored-by: Shaojie WANG <wsjmessi@163.com>
Co-authored-by: zqw_1997 <118182234+zhengqiwen1997@users.noreply.github.com>
-
- 03 Apr, 2023 1 commit
Committed by zhaoyingli
-
- 20 Mar, 2023 1 commit
Committed by LiYuRio
-
- 09 Mar, 2023 1 commit
Committed by JZ-LIANG
-
- 17 Feb, 2023 1 commit
Committed by Wen Sun
-
- 04 Jan, 2023 1 commit
Committed by YUNSHEN XIE
* resolve conflict
* fix format error
-
- 03 Jan, 2023 2 commits
Committed by xiaoting
* fix fold for large bs
* fix fold for large bs
* fix pre-commit
-
Committed by feng_shuai
* cherry-pick: Some versions of TensorRT don't support qkv_plugin
* cherry-pick: support coverage CI
-
- 30 Dec, 2022 1 commit
Committed by Chenxiao Niu
* [MLU] fix compute error of dropout op (#45923)
* [MLU] add mergedAdam kernel. (#45965)
* [MLU] add int64 support for mlu one_hot_v2 (#46313)
* [MLU] fix profiler compile failure (#46208)
* [MLU] add barrier_op kernel. (#46417)
* [MLU] fluid: add mluop (#46429)
* [MLU] add huber_loss kernel. (#46455)
* [MLU] add mlu kernel for add_reduce_max_grad (#45651)
Co-authored-by: liupeiyu <liupeiyu@cambricon.com>
* [MLU] add_fluid_mluop_yolo_box (#46573)
* [MLU] fix phi::Tensor compile error of mlu. (#46649)
* [MLU] add fluid MLUOps prior_box (#46585)
* [MLU] fix cmake error (#46772)
* [MLU]fix unittest of sync_bn (#46797)
* [MLU] add masterparam support for mlu adamw. (#46804)
* [MLU] add int64 support for allgather. (#46830)
* [MLU] fix compile error & add mlu blacklist function. (#47439)
* [MLU] fix softmax_with_cross_entropy failed in 370-X8.
* [MLU] fix cncl stuck caused by multiple initializations.
* [MLU] fix code style check.
Co-authored-by: qipengh <huangqipeng@cambricon.com>
Co-authored-by: cifar10 <41565156+cifar10@users.noreply.github.com>
Co-authored-by: Lux et Veritas <1004239791@qq.com>
Co-authored-by: liupeiyu <liupeiyu@cambricon.com>
Co-authored-by: ronnywang <ronny1996@163.com>
-
- 29 Dec, 2022 2 commits
Committed by zhouweiwei2014
-
Committed by YuanRisheng
* cherry-pick 45860
* [BUG FIX]Fix MetaTensor's bug when run infermeta (#46265)
* fix sum bug
* fix ci bugs
* fix ci bugs
* update code according to comment
-
- 28 Dec, 2022 1 commit
Committed by Huihuang Zheng
Fix CUDA11.8 Unittest Accuracy
-
- 27 Dec, 2022 1 commit
Committed by HongyuJia
* [Release2.4] Revert python link prs (#48573)
* Revert "Fix mac link python (#48017)" This reverts commit 3fa7a736.
* Revert "[Cherry-pick] Fix python link error (#47811)" This reverts commit ff642c68.
* Update config.go
* fix custom operator backward=None (#48656)
* [Custom Extension] Fix custom double_grad backward=None (#49224)
* fix custom double_grad backward=None
* fix custom_relu.cu bug && polish testcase of double_grad
* remove old dynamic graph test
* add import fluid
* add import fluid
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
-
- 22 Dec, 2022 1 commit
Committed by Guanghua Yu
-
- 21 Dec, 2022 2 commits
Committed by Aganlengzi
-
Committed by zhangkaihuo
-
- 28 Nov, 2022 1 commit
Committed by zlsh80826
* Reduce squeeze2_matmul_fuse_pass, flatten tests time (#47098)
* Add missing fp32 config and reduce the testing combination
* Reduce trt matmul pass test max examples
* Loose TRT fp16 tests tolerance (#47100)
* Loose TRT half test tolerance to 1e-3 (#47101)
* Loose TRT half test tolerance to 1e-3 (#47106)
* Update distributed_strategy.proto (#46531)
* Close popen pipe after used (#47053)
* Add launch_bounds (#47285)
* Fix TRT UT failures (#47488)
* Format cherry-picked commits
* CudnnNormConvolution is no longer supported on NVIDIA Hopper GPUs (#48203)
* Skip tests that use fused_ops on H100
* Add error message to FusedOps on H100
Co-authored-by: Shijie <505749828@qq.com>
Co-authored-by: Leo Chen <39020268+leo0519@users.noreply.github.com>
Co-authored-by: Tian Zheng <tizheng@nvidia.com>
-
- 24 Nov, 2022 1 commit
Committed by ustiniankw
* fixdocs, test=document_fix
* fixdocs, test=document_fix
-
- 11 Nov, 2022 1 commit
Committed by Haohongxiang
-
- 10 Nov, 2022 1 commit
Committed by wuhuachaocoding
* cherry-pick recompute doc update.
* update.
-
- 09 Nov, 2022 1 commit
Committed by JYChen
* remove functions not belonging to public-api from __all__
* fix code style
* fix error in distributed
-
- 07 Nov, 2022 4 commits
Committed by Ligoml
* #46165
* #45752
* fix some doc bug test=document_fix (#45488)
* fix some doc bug test=document_fix
* fix some docs issues, test=document_fix
* beta -> \beta in softplus
* threshold -> \varepsilon in softplus
* parameter name
* delta -> \delta in smooth_l1_loss
* fix some docs test=document_fix
* fix docs test=document_fix
* fix docs && add blank lines test=document_fix
* Update python/paddle/nn/functional/activation.py, test=document_fix
* Update python/paddle/nn/layer/activation.py, test=document_fix
Co-authored-by: SigureMo <sigure.qaq@gmail.com>
* [docs] add ipustrategy Hyperlink (#46422)
* [docs] add ipustrategy Hyperlink
* fix ipu_shard_guard docs; test=document_fix
* [docs] add set_ipu_shard note
* [docs] fix hyperlink
* update framework.py
* fix mlu_places docs; test=document_fix
* fix put_along_axis docs; test=document_fix
* fix flake8 W293 error, test=document_fix
* fix typo in typing, test=document_fix
Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>
Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
* #46659
* Update README_cn.md (#46927) Fixed typos
* #46738
* fix paddle.get_default_dtype (#47040) Chinese and English return values are inconsistent
* fix bug
Co-authored-by: 张春乔 <83450930+Liyulingyue@users.noreply.github.com>
Co-authored-by: Infinity_lee <luhputu0815@gmail.com>
Co-authored-by: mrcangye <chenloong@88.com>
Co-authored-by: SigureMo <sigure.qaq@gmail.com>
Co-authored-by: gouzil <66515297+gouzil@users.noreply.github.com>
Co-authored-by: Hamid Zare <12127420+hamidzr@users.noreply.github.com>
Co-authored-by: Sqhttwl <61459740+Sqhttwl@users.noreply.github.com>
Co-authored-by: OccupyMars2025 <31559413+OccupyMars2025@users.noreply.github.com>
Co-authored-by: 超级码牛 <54444805+SuperCodebull@users.noreply.github.com>
Co-authored-by: jzhang533 <jzhang533@gmail.com>
-
Committed by Yuang Liu
* code format change
* update the split logic for uniform (#47670)
-
Committed by Ligoml
* #46765
* #47042
* Remove redundant numpy import (#47483)
* #47555
* resolve conflict
* resolve conflict
* resolve conflict
* resolve conflict
* resolve conflict
* for_codestyle
* fix sample code paddle.linalg.multi_dot
Co-authored-by: Kevin吴嘉文 <417333277@qq.com>
-
Committed by Aurelius84
* Fix set_attr modify underlying type (#47500)
* reformat code
* Revert "reformat code" This reverts commit f11a5d7658633e53c279f11612254937e2d87feb.
-
- 04 Nov, 2022 2 commits
Committed by xiongkun
* [ Dy2Static ] Fix bugs when select inputs meeting different shape or undefined-var (#45916)
* fix select_input with different shape errors:
  1. select_input_with_buildin_type directly return non-undefinedvar branch when meeting undefined var
  2. the output shape of select_input is inferred from inputs.
* reverse the logic in select_input
* [warning] added warning message in cond block when one branch returns variable and another returns None (#46031)
* [cherry-pick] Allow manually set py_reader name in standalone executor (#45898) (#45931)
* Allow manually set py_reader name in standalone executor
* [BugFix] while cond receives dict as input (#47299)
* fix bugs while cond receives dict as input
* add unittest
* change flatten -> _is_sequence_except_dict
* code format
Co-authored-by: feifei-111 <wuzhanfei@baidu.com>
-
Committed by Ligoml
* only run pre-commit
* only run pre-commit
-
- 03 Nov, 2022 3 commits
Committed by Sławomir Siwek
-
Committed by zhangkaihuo
Unified api args name
-
Committed by ShenLiang
* add unbalanced data
* fix utest
-
- 01 Nov, 2022 3 commits
Committed by zhangkaihuo
Fix English documents of sparse API
-
Committed by zyfncg
* support generating code of opmaker for backward op invoke forward op (#46912)
* [code-gen] Support code-gen for opmaker of sparse op (#46993)
* support generating code of opmaker for backward op invoke forward op
* support code-gen of opmaker for sparse op
* refine logic of choose phi kernel
* fix compile bug
* fix code_gen bug
* fix bug
* fix kernel signature code-gen
* fix compile bug of VarType
* fix compile bug of VarType
* fix test_sparse_conv_op
* fix test_sparse_norm_op
* [Phi] Refactor logic of judging whether having a phi kernel (#46920)
* refine logic of choose phi kernel
* fix compile bug
* update cmake
-
Committed by sneaxiy
-
- 31 Oct, 2022 3 commits
Committed by zhaoyingli
* update codestyle
* [AutoParallel] fix fp16 for subblock (#47189)
* [AutoParallel] fix fp16 for subblock
* fix engine
* fix comment
* [AutoParallel] fix engine _build and cost method (#47263)
* fix engine build method
* fix import
* update engine cost
* update raise error
* update cmakelist
* revert optimizer
* revert optimizer
* fix unittest
* fix unittest
Co-authored-by: caozhou <caozhou@radi.ac.cn>
Co-authored-by: caozhou <caozhou@radi.ac.cn>
-
Committed by YangZhou
[Cherry-pick][audio] rm kaiser window in audio get_window function && rm audio utils(#47469) (#47479)
* [audio] rm kaiser window in audio get_window function && rm audio utils (#47469)
* rm kaiser window in audio window function
* rm paddle audio utils which is redundant
* rm kaiser in test_audio_functions.py
Conflicts:
python/paddle/audio/utils/error.py
python/paddle/tests/test_audio_functions.py
* format
-
Committed by Guanghua Yu
* update dygraph PTQ export_model api
* remove postprocess
-