- 09 Apr, 2023 1 commit
Committed by shaojie_wang
-
- 07 Apr, 2023 1 commit
Committed by zhaoyingli
-
- 30 Mar, 2023 1 commit
Committed by Yuang Liu
-
- 20 Mar, 2023 1 commit
Committed by LiYuRio
-
- 17 Feb, 2023 1 commit
Committed by Wen Sun
-
- 13 Jan, 2023 1 commit
Committed by Yuanle Liu
* fix fc kernel diff
* disable fc_elementwise_layernorm_fuse_pass
-
- 04 Jan, 2023 1 commit
Committed by Yuanle Liu
* disable scale op in amp pass
* Do not insert redundant cast op
* fix fused_fc_elementwise_layernorm kernel diff
* fix fc kernel diff
-
- 30 Dec, 2022 1 commit
Committed by Chenxiao Niu
* [MLU] fix compute error of dropout op (#45923)
* [MLU] add mergedAdam kernel. (#45965)
* [MLU] add int64 support for mlu one_hot_v2 (#46313)
* [MLU] fix profiler compile failure (#46208)
* [MLU] add barrier_op kernel. (#46417)
* [MLU] fluid: add mluop (#46429)
* [MLU] add huber_loss kernel. (#46455)
* [MLU] add mlu kernel for add_reduce_max_grad (#45651)
Co-authored-by: liupeiyu <liupeiyu@cambricon.com>
* [MLU] add_fluid_mluop_yolo_box (#46573)
* [MLU] fix phi::Tensor compile error of mlu. (#46649)
* [MLU] add fluid MLUOps prior_box (#46585)
* [MLU] fix cmake error (#46772)
* [MLU] fix unittest of sync_bn (#46797)
* [MLU] add masterparam support for mlu adamw. (#46804)
* [MLU] add int64 support for allgather. (#46830)
* [MLU] fix compile error & add mlu blacklist function. (#47439)
* [MLU] fix softmax_with_cross_entropy failed in 370-X8.
* [MLU] fix cncl stuck caused by multiple initializations.
* [MLU] fix code style check.
Co-authored-by: qipengh <huangqipeng@cambricon.com>
Co-authored-by: cifar10 <41565156+cifar10@users.noreply.github.com>
Co-authored-by: Lux et Veritas <1004239791@qq.com>
Co-authored-by: liupeiyu <liupeiyu@cambricon.com>
Co-authored-by: ronnywang <ronny1996@163.com>
-
- 29 Dec, 2022 1 commit
Committed by YuanRisheng
* cherry-pick 45860
* [BUG FIX] Fix MetaTensor's bug when run infermeta (#46265)
* fix sum bug
* fix ci bugs
* fix ci bugs
* update code according to comments
-
- 20 Dec, 2022 1 commit
Committed by ShenLiang
Co-authored-by: Ming-Xu Huang <mingh@nvidia.com>
-
- 29 Nov, 2022 1 commit
Committed by yeliang2258
[cherry-pick] updating mul and matmul with set_mem_desc and fix squeeze_transpose for MKLDNN (#47951)
* Fix slice bugs in MKLDNN when input dims are zeros (#46671)
* fix slice bugs
* fix
* update code
* fix
* update code
* updating mul and matmul with set_mem_desc (#45624)
* - mul & matmul changes - fix - bs16 correction of strides
* - cosmetic fixes
* - lint
* - fix
* - fix
* - format -> mem_desc
* - fix
* - fix
* - fix
* - fix
* - fix
* fix squeeze_transpose (#47911)
Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>
-
- 28 Nov, 2022 1 commit
Committed by zlsh80826
* Reduce squeeze2_matmul_fuse_pass, flatten tests time (#47098)
* Add missing fp32 config and reduce the testing combination
* Reduce trt matmul pass test max examples
* Loosen TRT fp16 tests tolerance (#47100)
* Loosen TRT half test tolerance to 1e-3 (#47101)
* Loosen TRT half test tolerance to 1e-3 (#47106)
* Update distributed_strategy.proto (#46531)
* Close popen pipe after use (#47053)
* Add launch_bounds (#47285)
* Fix TRT UT failures (#47488)
* Format cherry-picked commits
* CudnnNormConvolution is no longer supported on NVIDIA Hopper GPUs (#48203)
* Skip tests that use fused_ops on H100
* Add error message to FusedOps on H100
Co-authored-by: Shijie <505749828@qq.com>
Co-authored-by: Leo Chen <39020268+leo0519@users.noreply.github.com>
Co-authored-by: Tian Zheng <tizheng@nvidia.com>
-
- 11 Nov, 2022 1 commit
Committed by yeliang2258
* fix slice bugs
* fix
* update code
* fix
* update code
-
- 09 Nov, 2022 1 commit
Committed by Hui Zhang
* squeeze2 + transpose2 fuse onednn cherrypick 2.4
* format
* fix merge
-
- 08 Nov, 2022 1 commit
Committed by jakpiase
* fc cherrypick
* other files added
* added transpose cherrypick
* reverted somebody's fc changes
* minor fix
* minor fix
* cherry-pick of fc+act changes
* minor fix
* fix
-
- 07 Nov, 2022 1 commit
Committed by Ligoml
* #46165
* #45752
* fix some doc bug test=document_fix (#45488)
* fix some doc bug test=document_fix
* fix some docs issues, test=document_fix
* beta -> \beta in softplus
* threshold -> \varepsilon in softplus
* parameter name
* delta -> \delta in smooth_l1_loss
* fix some docs test=document_fix
* fix docs test=document_fix
* fix docs && add blank lines test=document_fix
* Update python/paddle/nn/functional/activation.py, test=document_fix
* Update python/paddle/nn/layer/activation.py, test=document_fix
Co-authored-by: SigureMo <sigure.qaq@gmail.com>
* [docs] add ipustrategy Hyperlink (#46422)
* [docs] add ipustrategy Hyperlink
* fix ipu_shard_guard docs; test=document_fix
* [docs] add set_ipu_shard note
* [docs] fix hyperlink
* update framework.py
* fix mlu_places docs; test=document_fix
* fix put_along_axis docs; test=document_fix
* fix flake8 W293 error, test=document_fix
* fix typo in typing, test=document_fix
Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>
Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
* #46659
* Update README_cn.md (#46927) Fixed typos
* #46738
* fix paddle.get_default_dtype (#47040) Chinese and English return values are inconsistent
* fix bug
Co-authored-by: 张春乔 <83450930+Liyulingyue@users.noreply.github.com>
Co-authored-by: Infinity_lee <luhputu0815@gmail.com>
Co-authored-by: mrcangye <chenloong@88.com>
Co-authored-by: SigureMo <sigure.qaq@gmail.com>
Co-authored-by: gouzil <66515297+gouzil@users.noreply.github.com>
Co-authored-by: Hamid Zare <12127420+hamidzr@users.noreply.github.com>
Co-authored-by: Sqhttwl <61459740+Sqhttwl@users.noreply.github.com>
Co-authored-by: OccupyMars2025 <31559413+OccupyMars2025@users.noreply.github.com>
Co-authored-by: 超级码牛 <54444805+SuperCodebull@users.noreply.github.com>
Co-authored-by: jzhang533 <jzhang533@gmail.com>
-
- 03 Nov, 2022 1 commit
Committed by Sławomir Siwek
-
- 01 Nov, 2022 1 commit
Committed by zyfncg
* support generating code of opmaker for backward op invoke forward op (#46912)
* [code-gen] Support code-gen for opmaker of sparse op (#46993)
* support generating code of opmaker for backward op invoke forward op
* support code-gen of opmaker for sparse op
* refine logic of choosing phi kernel
* fix compile bug
* fix code_gen bug
* fix bug
* fix kernel signature code-gen
* fix compile bug of VarType
* fix compile bug of VarType
* fix test_sparse_conv_op
* fix test_sparse_norm_op
* [Phi] Refactor logic of judging whether having a phi kernel (#46920)
* refine logic of choosing phi kernel
* fix compile bug
* update cmake
-
- 28 Oct, 2022 2 commits
Committed by Aurelius84
* [JIT] Add Predictor for JITLayer (#47379)
* add predictor_engine
* add predictor_engine
* fix zero shape
* fix lodTensor
* fix unittest
* fix code style
* update CmakeList
* fix new executor
-
Committed by zhangkaihuo
add sync_batch_norm_bn and deliver indices_dict
-
- 27 Oct, 2022 1 commit
Committed by zhangkaihuo
* cherry-pick #46359 and resolve conflict
-
- 26 Oct, 2022 3 commits
Committed by zyfncg
* fix inference performance problem caused by selecting cudnn kernel for softmax
* recover use_cudnn in opmaker of softmax
-
Committed by yeliang2258
* return proper state
* fix for dims
* fix
Co-authored-by: jakpiase <jakpia21@gmail.com>
-
Committed by sneaxiy
[Cherry-pick][Release/2.4] Refine the memory usage of fused_attention and fused_feedforward ops (#47235)
* fix fused_attention fused_feedforward
* fix ci
* fix ci
* fix ci PADDLE_GET_CONST
* fix ci ut
-
- 21 Oct, 2022 1 commit
Committed by JingZhuangzhuang
* Add infer prune function
* add fusion op
-
- 20 Oct, 2022 4 commits
Committed by Yiqun Liu
* Simplify the codes of conv. (#45966)
* Enable recording whether the conv algo was obtained by exhaustive search, to fix autotune cache bug. (#47065)
-
Committed by zhoutianzi666
* stride_to_24
* fix CI failing
-
Committed by sneaxiy
Fix some operators when the tensor.numel() > INT32_MAX
-
Committed by sneaxiy
support pure bfloat16 for more ops
-
- 19 Oct, 2022 2 commits
- 18 Oct, 2022 2 commits
Committed by zhoutianzi666
-
Committed by Haohongxiang
* [Dygraph] Fix performance of pp+mp by using send/recv_calc_stream instead of send/recv (#46116)
* [Dygraph] Fix Perf of FusedFeedForward and FusedAttention with AllReduce (#46780)
* update
-
- 17 Oct, 2022 1 commit
Committed by zhangkaihuo
cherry-pick: #46322, #46245 Sparse API supports static graph
-
- 14 Oct, 2022 1 commit
Committed by Aurelius84
* [BUG] Fix expand_as_v2 bug when X and Y have different dtypes
* fix commit
-
- 13 Oct, 2022 2 commits
Committed by 傅剑寒
Fix set_value failure when source tensor is fp16 Dtype and destination value is a number (dev PR link: #46801)
-
Committed by Sławomir Siwek
* Revert pool+grad oneDNN kernel conversion (#45989)
* [PHI] transpose2_grad op migration (#46139)
* op migrated, Copy(OneDNNContext, ...) added
* mutable_data & op registration in fluid removed
* refactoring
* OneDNNGetDataType to uppercase
* missing cpu check added, handler moved to .h file
* name changed to transpose_grad
* Copy changed back to TensorCopy
* Resizing corrected, Copy(OneDNNContext) removed
Co-authored-by: Piotr Paturej <48731682+piotrekobi@users.noreply.github.com>
Co-authored-by: Paulina Gacek <paulina.gacek@intel.com>
-
- 11 Oct, 2022 3 commits
Committed by Sławomir Siwek
* [PHI] Migrate gelu kernels (#45596)
* gaussian random
* mkldnn to onednn renaming
* fix merge conflicts
* remove fluid code
* onednn renaming
* gelu fwd
* sort activations
* gelu gradient
* remove unused macros
* merge conflicts
* fix merge conflicts
* remove extra constraint from gelu op
* [PHI] relu6_grad kernel (#46501)
* Relu6
* remove fluid handler
* add individual kernel signature
* coding style
* replace bounded_relu with clip
* whitespace
* code style
-
Committed by Sławomir Siwek
Co-authored-by: Piotr Paturej <48731682+piotrekobi@users.noreply.github.com>
-
Committed by ceci3
-