- 12 May, 2023 1 commit
-
-
Authored by Leo Chen
-
- 25 April, 2023 1 commit
-
-
Authored by JZ-LIANG
* support tp sync for auto parallel
* support tp sync for auto parallel1
* support tp sync for auto parallel1
* support tp sync for auto parallel1
-
- 24 April, 2023 1 commit
-
-
Authored by Zhang Zheng
-
- 21 April, 2023 1 commit
-
-
Authored by zhouweiwei2014
-
- 19 April, 2023 1 commit
-
-
Authored by zhaoyingli
-
- 14 April, 2023 1 commit
-
-
Authored by JZ-LIANG
* pr1
* pr2
* pr3
* fixed unittest
* adopt for scale
-
- 12 April, 2023 3 commits
- 11 April, 2023 1 commit
-
-
Authored by Yiqun Liu
* Fix scale kernel for low precision, cherry pick #50998.
* Fix the FP16 precision problem of add_n. (#50129)
* Change squared_l2_norm to reuse ReduceKernel, and register fp16 and bf16 kernel, which is cherry pick #48315.
* Cherry-pick the fix of MPTypeTrait in KP, which is implemented in #50993.
* Cherry-pick the multi-precision support of AdamW for bf16, #48041.
* Fix compiling error.
* Cherry-pick the fix of CubTensorReduceImpl for bfloat16 in #50993.
* Fix unittest.
---------
Co-authored-by: liuruyan <44316842+liuruyan@users.noreply.github.com>
-
- 10 April, 2023 1 commit
-
-
Authored by Yiqun Liu
* Broadcast the master weight along with param for distributed training.
* Fix codestyle.
-
- 09 April, 2023 3 commits
-
-
Authored by Yiqun Liu
* Cherry-pick the register of bfloat16 for amp_kernel, pull request #45541.
* Cherry-pick the master_grad support of adamw, pull request #51141.
* add bf16 for some ops in static mode (#51582)
* Add bfloat16 support for some api in static mode.
* Fix codestyle.
* Revert the change of layer_function_generator.py.
---------
Co-authored-by: Shaojie WANG <wsjmessi@163.com>
-
Authored by Yiqun Liu
* Register exp/expm1/logit bf16 activation op kernels (#48702)
* register more bf16 ops
* update to register corresponding backward ops
* Addition of bf16 type support for Compare OP (#46413)
* first commit
* clarify the quotes
* change code style format
* support bfloat16
* add bfloat16 support for more ops (#48272)
* [Bfloat16]register bfloat16 datatype for squared l2 norm (#50908)
* Sync the pull request #51903.
* Add some header files back.
* modify cmake file for cuda11.8 compile (#49020)
* modify cmake file for cuda11.8 compile
* add op_library(fused_embedding_eltwise_layernorm_op DEPS bert_encoder_functor)
* Fix compiling error.
* Cherry-pick pull request #51396.
---------
Co-authored-by: sneaxiy <32832641+sneaxiy@users.noreply.github.com>
Co-authored-by: limingshu <61349199+JamesLim-sy@users.noreply.github.com>
Co-authored-by: Shaojie WANG <wsjmessi@163.com>
Co-authored-by: zqw_1997 <118182234+zhengqiwen1997@users.noreply.github.com>
-
Authored by shaojie_wang
-
- 07 April, 2023 1 commit
-
-
Authored by zhaoyingli
-
- 03 April, 2023 1 commit
-
-
Authored by zhaoyingli
-
- 30 March, 2023 1 commit
-
-
Authored by Yuang Liu
-
- 28 March, 2023 1 commit
-
-
Authored by LiYuRio
-
- 24 March, 2023 1 commit
-
-
Authored by LiYuRio
-
- 20 March, 2023 1 commit
-
-
Authored by LiYuRio
-
- 09 March, 2023 1 commit
-
-
Authored by JZ-LIANG
-
- 17 February, 2023 1 commit
-
-
Authored by Wen Sun
-
- 13 January, 2023 2 commits
-
-
Authored by xiaoxiaohehe001
-
Authored by Yuanle Liu
* fix fc kernel diff
* disable fc_elementwise_layernorm_fuse_pass
-
- 12 January, 2023 1 commit
-
-
Authored by xiaoxiaohehe001
-
- 09 January, 2023 1 commit
-
-
Authored by Haohongxiang
-
- 04 January, 2023 2 commits
-
-
Authored by Yuanle Liu
* disable scale op in amp pass
* Do not insert redundant cast op
* fix fused_fc_elementwise_layernorm kernel diff
* fix fc kernel diff
-
Authored by YUNSHEN XIE
* resolve conflict
* fix format error
-
- 03 January, 2023 2 commits
-
-
Authored by xiaoting
* fix fold for large bs
* fix fold for large bs
* fix pre-commit
-
Authored by feng_shuai
* cherry-pick: Some versions of TensorRT don't support qkv_plugin
* cherry-pick: support coverage CI
-
- 30 December, 2022 1 commit
-
-
Authored by Chenxiao Niu
* [MLU] fix compute error of dropout op (#45923)
* [MLU] add mergedAdam kernel. (#45965)
* [MLU] add int64 support for mlu one_hot_v2 (#46313)
* [MLU] fix profiler compile failure (#46208)
* [MLU] add barrier_op kernel. (#46417)
* [MLU] fluid: add mluop (#46429)
* [MLU] add huber_loss kernel. (#46455)
* [MLU] add mlu kernel for add_reduce_max_grad (#45651)
Co-authored-by: liupeiyu <liupeiyu@cambricon.com>
* [MLU] add_fluid_mluop_yolo_box (#46573)
* [MLU] fix phi::Tensor compile error of mlu. (#46649)
* [MLU] add fluid MLUOps prior_box (#46585)
* [MLU] fix cmake error (#46772)
* [MLU] fix unittest of sync_bn (#46797)
* [MLU] add masterparam support for mlu adamw. (#46804)
* [MLU] add int64 support for allgather. (#46830)
* [MLU] fix compile error & add mlu blacklist function. (#47439)
* [MLU] fix softmax_with_cross_entropy failed in 370-X8.
* [MLU] fix cncl stuck caused by multiple initializations.
* [MLU] fix code style check.
Co-authored-by: qipengh <huangqipeng@cambricon.com>
Co-authored-by: cifar10 <41565156+cifar10@users.noreply.github.com>
Co-authored-by: Lux et Veritas <1004239791@qq.com>
Co-authored-by: liupeiyu <liupeiyu@cambricon.com>
Co-authored-by: ronnywang <ronny1996@163.com>
-
- 29 December, 2022 2 commits
-
-
Authored by zhouweiwei2014
-
Authored by YuanRisheng
* cherry-pick 45860
* [BUG FIX]Fix MetaTensor's bug when run infermeta (#46265)
* fix sum bug
* fix ci bugs
* fix ci bugs
* update code according to comment
-
- 28 December, 2022 1 commit
-
-
Authored by Huihuang Zheng
Fix CUDA11.8 Unittest Accuracy
-
- 27 December, 2022 2 commits
-
-
Authored by Yuanle Liu
-
Authored by HongyuJia
* [Release2.4] Revert python link prs (#48573)
* Revert "Fix mac link python (#48017)" This reverts commit 3fa7a736.
* Revert "[Cherry-pick] Fix python link error (#47811)" This reverts commit ff642c68.
* Update config.go
* fix custom operator backward=None (#48656)
* [Custom Extension] Fix custom double_grad backward=None (#49224)
* fix custom double_grad backward=None
* fix custom_relu.cu bug && polish testcase of double_grad
* remove old dynamic graph test
* add import fluid
* add import fluid
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
-
- 22 December, 2022 3 commits
-
-
Authored by Guanghua Yu
-
Authored by Yuanle Liu
* [Release2.4] Revert python link prs (#48573)
* Revert "Fix mac link python (#48017)" This reverts commit 3fa7a736.
* Revert "[Cherry-pick] Fix python link error (#47811)" This reverts commit ff642c68.
* Update config.go
* fix mixed precision inference
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
-
Authored by Ligoml
-
- 21 December, 2022 1 commit
-
-
Authored by Aganlengzi
-