- 04 1月, 2023 1 次提交
-
-
由 Yuanle Liu 提交于
* disable scale op in amp pass * Do not insert redundant cast op * fix fused_fc_elementwise_layernorm kernel diff * fix fc kerenl diff
-
- 29 12月, 2022 1 次提交
-
-
由 YuanRisheng 提交于
* cherry-pick 45860 * [BUG FIX]Fix MetaTensor's bug when run infermeta (#46265) * fix sum bug * fix ci bugs * fix ci bugs * update code according comment
-
- 22 12月, 2022 1 次提交
-
-
由 Yuanle Liu 提交于
* [Release2.4] Revert python link prs (#48573) * Revert "Fix mac link python (#48017)" This reverts commit 3fa7a736. * Revert "[Cherry-pick] Fix python link error (#47811)" This reverts commit ff642c68. * Update config.go * fix mixed precision inference Co-authored-by: NChen Weihang <chenweihang@baidu.com>
-
- 19 12月, 2022 1 次提交
-
-
由 Yuanle Liu 提交于
* [Release2.4] Revert python link prs (#48573) * Revert "Fix mac link python (#48017)" This reverts commit 3fa7a736. * Revert "[Cherry-pick] Fix python link error (#47811)" This reverts commit ff642c68. * Update config.go * [Paddle Inference] Add float_to_half_pass to support inference with mixed precision (#47993) * [Inference] optimize some code and fix some bug (#48780) * clean ir_pass_manager and fix map_depthwise_conv_to_conv_pass * fix unitest timeout * [Paddle Inference] clean unused code (#48392) * fix * update * update Co-authored-by: NChen Weihang <chenweihang@baidu.com>
-
- 29 11月, 2022 1 次提交
-
-
由 yeliang2258 提交于
[cherry-pick] updating mul and matmul with set_mem_desc and fix squeeze_transpose for MKLDNN (#47951) * Fix slice bugs in MKLDNN when input dims are zeros (#46671) * fix slice bugs * fix * update code * fix * update code * updating mul and matmul with set_mem_desc (#45624) * - mul & matmul changes - fix - bs16 correction of strides * - cosmetic fixes * - lint * - fix * - fix * - format -> mem_desc * - fix * - fix * - fix * - fix * - fix * fix squueze_transpose (#47911) Co-authored-by: NJacek Czaja <jacek.czaja@intel.com>
-
- 28 11月, 2022 1 次提交
-
-
由 zlsh80826 提交于
* Reduce squeeze2_matmul_fuse_pass, flattent tests time (#47098) * Add missing fp32 config and reduce the testing combination * Reduce trt matmul pass test max examples * Loose TRT fp16 tests tolerance (#47100) * Loose TRT half test tolerance to 1e-3 (#47101) * Loose TRT half test tolerance to 1e-3 (#47106) * Update distributed_strategy.proto (#46531) * Close popen pipe after used (#47053) * Add launch_bounds (#47285) * Fix TRT UT failures (#47488) * Format cherry-picked commits * CudnnNormConvolution is no longer supported on NVIDIA Hopper GPUs (#48203) * Skip tests that use fused_ops on H100 * Add error message to FusedOps on H100 Co-authored-by: NShijie <505749828@qq.com> Co-authored-by: NLeo Chen <39020268+leo0519@users.noreply.github.com> Co-authored-by: NTian Zheng <tizheng@nvidia.com>
-
- 10 11月, 2022 1 次提交
-
-
由 RichardWooSJTU 提交于
* add fuse_multi_transformer_layer_pass
-
- 09 11月, 2022 1 次提交
-
-
由 Hui Zhang 提交于
* suqeeze2 + transpose2 fuse onednn cherrypick 2.4 * format * fix merge
-
- 08 11月, 2022 1 次提交
-
-
由 jakpiase 提交于
* fc cherrypick * another files added * added transpose cherrypick * reverter somebodys fc changes * minor fix * minor fix * cherry-pick of fc+act changes * minor fix * fix
-
- 03 11月, 2022 3 次提交
-
-
由 Sławomir Siwek 提交于
-
由 yeliang2258 提交于
* add constant_folding_pass pass for mkldnn int8 * update UpdateScaleOpInOutScales
-
由 Kaipeng Deng 提交于
* fix memory copy in prepare_data. test=develop * add cache_kv fp16 support. test=develop * fit for simplify_with_basic_ops_pass. test=develop
-
- 01 11月, 2022 1 次提交
-
-
由 zyfncg 提交于
* support generating code of opmaker for backward op invoke forward op (#46912) * [code-gen] Support code-gen for opmaker of sparse op (#46993) * support generating code of opmaker for backward op invoke forward op * gsupport code-gen of opmaker for sparse op * refind logic of choose phi kernrel * fix complie budg * fix code_gen bug * fix bug * fix kernel signature code-gen * fix complie bug of VarType * fix complie bug of VarType * fix test_sparse_conv_op * fix test_sparse_norm_op * [Phi] Refactor logic of judging whether having a phi kernrel (#46920) * refind logic of choose phi kernrel * fix complie budg * update cmake
-
- 20 10月, 2022 3 次提交
-
-
由 Kaipeng Deng 提交于
* add fused_attention_pass. test=develop * support fp16. test=develop * fix format. test=develop
-
由 yeliang2258 提交于
* Fix quantize model deploy bugs when using MKLDNN (#45920) * fix immutable op quantize bugs * fix * fix build bug * fix test * notest,test=inference * fix ppyoloe acc drop bugs * fix test * fix test * add test * fix * fix * fix test * fix refined name bug * fix test * bias fix * fix matmul weight dequant bug * re-ci * fix tester * fix test * fix tester * update weight dequantize func * update code * update test for converage * update test * update cmake * update cmakelist * update code * rerun ci * remove useless code * re-ci * update code * update code * fix header * update code for log
-
由 Wang Bojun 提交于
* Enhance the layernorm shift partation fuse op when shift size > 0 (roll shifting) * fix cherry-pick test
-
- 19 10月, 2022 3 次提交
-
-
由 yeliang2258 提交于
* Add unsigned int8 propagation * Add or modify unit tests * Correct concat scale checking * Apply review suggestions * Corrections Co-authored-by: Njoanna.wozna.intel <joanna.wozna@intel.com>
-
由 Ghost Screaming 提交于
* Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result is wrong. * Support allow_partial switch, which can be configure in pipeline_configs. If sent tensor are not the same from different hosts, they shouldn't been sent partially and then concated as a whole tensor. * Change name allow_partial to enable_partial_send_recv. * Add global variable _enable_partial_send_recv
-
由 WangZhen 提交于
[CherryPick][Dy2St]Fix recurrent op eager deletion pass error in dy2st
-
- 17 10月, 2022 2 次提交
-
-
由 zhangkaihuo 提交于
cherry-pick : #46322, #46245 Sparse API 支持静态图
-
由 Allen Guo 提交于
* paddle-inference support custom-ops Co-authored-by: NZhixin Yao <zhixiny@graphcore.ai> * fix tolower Co-authored-by: NZhixin Yao <zhixiny@graphcore.ai> Co-authored-by: NZhixin Yao <zhixiny@graphcore.ai>
-
- 14 10月, 2022 1 次提交
-
-
由 zhoutianzi666 提交于
-
- 13 10月, 2022 1 次提交
-
-
由 zhangbo9674 提交于
-
- 11 10月, 2022 1 次提交
-
-
由 Sławomir Siwek 提交于
* [PHI] Migrate gelu kernels (#45596) * gaussian random * mkldnn to onednn renaming * fix merge conflicts * remove fluid code * onednn renaming * gelu fwd * sort activations * gelu gradient * remove unused macros * merge conflicts * fix merge conflicts * remove extra contraint from gelu op * [PHI] relu6_grad kernel (#46501) * Relu6 * remove fluid handler * add individual kernel signature * coding style * replace bounded_relu with clip * whitespace * code style
-
- 27 9月, 2022 1 次提交
-
-
由 zyfncg 提交于
* Clear extra attrs of elementwise op in OpMaker (#45845) * clear extra attrs of elementwise op in opmaker * fix op_debug_string_test * fix bug of grad_add * fix sort of runtime attrs * Clear extra attrs of scale in OpMaker (#45984) * clear extra attr of scale in opmaker * fix sum bug * fix merge conflict * fix minus * Clear extra attributes of some Op in OpMaker (Part4) (#46060) * clear extra attr of some ops in opmaker * revert clear use_cudnn for pool * fix test_operator_desc * fix Attr interface of OperatorBase * fix code stype
-
- 23 9月, 2022 1 次提交
-
-
由 zyfncg 提交于
-
- 20 9月, 2022 5 次提交
-
-
由 HongyuJia 提交于
* polish code comments * polish data_device_transform.cc
-
由 Leo Chen 提交于
* add config * add config * follow comments * fix serial run
-
由 zhoutianzi666 提交于
* fix preln_residual_bias_fuse_pass bug in TNT_small model
-
由 zhangbo9674 提交于
* add scope cache & reuse * add gc scope for end of each train step * del scope reuse for jit * refine code * test
-
由 zyfncg 提交于
* fix wrong eigen header include * fix complie bug * fix nan_inf_utils_detail * fix resource_manager * fix conv_miopen_helper
-
- 19 9月, 2022 1 次提交
-
-
由 xiaoxiaohehe001 提交于
-
- 16 9月, 2022 1 次提交
-
-
由 Chen Weihang 提交于
* normalize yaml file name (#45894) * Clear extra attributes of activation op in OpMaker (#45772) * clear extra attr of activation op in opmaker * fix syntax bug * fix mkldnn kernel * fix merge conflict * fix bug * [PHI] Normalize yaml op label (#45976) * normalize yaml op label * revert op_compat yaml change * fix prelu and rnn compat problem * replace api by op * support assign op backward refuse forward (#45879) * normize yaml backward op label (#46028) Co-authored-by: Nzyfncg <zhangyunfei07@baidu.com> Co-authored-by: NCharles-hit <56987902+Charles-hit@users.noreply.github.com>
-
- 15 9月, 2022 2 次提交
- 14 9月, 2022 3 次提交
-
-
由 JingZhuangzhuang 提交于
-
由 Leo Chen 提交于
-
由 pangyoki 提交于
-
- 13 9月, 2022 2 次提交
-
-
由 JingZhuangzhuang 提交于
-
由 Ruibiao Chen 提交于
* Allow manaully set py_reader name in standalone executor * Fix CI errors
-