- 17 Feb, 2023 1 commit
-
-
Committed by zhupengyang
[XPU] add multi_encoder_xpu_slice_fuse_pass, generate_sequence_xpu_fuse_pass, generate_sequence_xpu kernel (#50570)
-
- 16 Feb, 2023 1 commit
-
-
Committed by zhupengyang
-
- 11 Feb, 2023 1 commit
-
-
Committed by Wang Bojun
* eleadd_trans first version log fix * refine code for linear format, add pass check * linear format refine and ut fix * fix ut * windows ut * windows ut 2 * move tensorMeta and alloc to configure
-
- 10 Feb, 2023 1 commit
-
-
Committed by zhupengyang
-
- 09 Feb, 2023 1 commit
-
-
Committed by Wang Bojun
* trans_layernorm
-
- 08 Feb, 2023 1 commit
-
-
Committed by Paulina Gacek
* QuantTranpose pattern is being found by pass * quant + transpose fuse * code style changes * UT written, reorder fixed * Dequantize + transpose2 fuse added * pass name changed * UT added & shift corrected * got rid of redundancy * review changes * AsIntermediate corrected * compat added
-
- 06 Feb, 2023 1 commit
-
-
Committed by Yuanle Liu
* disable conv2d_fusion_layout_transfer_pass temporarily * disable conv2d_fusion_layout_transfer_pass temporarily
-
- 31 Jan, 2023 1 commit
-
-
Committed by wenbin
* gn_silu * add ut * set TIMEOUT * correct comments * comments * disable windows ut * rename parameter
-
- 16 Jan, 2023 2 commits
-
-
Committed by Yuanle Liu
* add trt_support_nhwc_pass
-
Committed by Yuanle Liu
* add gpu_cpu_map_matmul_to_mul_pass to kGpuLowerPrecisionPasses * disable fc_elementwise_layernorm_fuse_pass in mixed precision
-
- 13 Jan, 2023 1 commit
-
-
Committed by Wang Bojun
* add fmha_flashattention oss plugin
-
- 09 Jan, 2023 2 commits
- 06 Jan, 2023 1 commit
-
-
Committed by Yuanle Liu
-
- 05 Jan, 2023 1 commit
-
-
Committed by Wilber
-
- 04 Jan, 2023 1 commit
-
-
Committed by lzy
-
- 03 Jan, 2023 1 commit
-
-
Committed by zhoutianzi666
* Implement conv2d_fusion NHWC format using CUTLASS * Add unit testing for CUTLASS Conv in inference * Add experimental API for CUTLASS.
-
- 22 Dec, 2022 1 commit
-
-
Committed by gem5
-
- 19 Dec, 2022 1 commit
-
-
Committed by Wangzheee
* General optimization for no_varlen embedding layernorm
-
- 14 Dec, 2022 2 commits
-
-
Committed by Yuanle Liu
-
Committed by Hulek
* Deleted mkldnn_inplace_pass code * Fixed error with cmake * Resolve conflicts
-
- 12 Dec, 2022 1 commit
-
-
Committed by feng_shuai
-
- 08 Dec, 2022 4 commits
-
-
Committed by RichardWooSJTU
* rewrite delete_weight_deqquant_linear_op_encoder/decoder pass
-
Committed by Wangzheee
* general optimization no_varlen embedding layernorm
-
Committed by Wilber
-
Committed by Wilber
-
- 06 Dec, 2022 1 commit
-
-
Committed by Yuanle Liu
-
- 05 Dec, 2022 1 commit
-
-
Committed by Wang Bojun
* pass * pass * draft version * share mem opt * remove sharemem * add pattern for the case with circle_shift=0 * add UT * pass opt * test_fix * code-commit * code-style * code style * code-style * ut-fix * op teller refine * resolve conflict * adjust position op_teller list and pass order for swin * ut code style update * adjust paddle pass order * refine pass order * refine pass order * refine pass order
-
- 30 Nov, 2022 2 commits
-
-
Committed by feng_shuai
-
Committed by RichardWooSJTU
* delete unnecessary shape and slice op Co-authored-by: NYour Name <you@example.com>
-
- 23 Nov, 2022 1 commit
-
-
Committed by Wilber
-
- 21 Nov, 2022 2 commits
-
-
Committed by Sylwester Fraczek
* add fc-residual quantization * revert removal of check for use_mkldnn * fix bug * add disable_logs * review fix call twice AreScalesPresntForNodes instead of if-else * rewrite residual input to output * revert fc mkldnn taking residual data * format fix * fix LoDTensor->DenseTensor * LoDTensor->DenseTensor * output->input * revert changes to unsupported script revert changes to unsupported script * remove fc residualdata from output blocklist in cpu_bfloat16_pass.cc
-
Committed by RichardWooSJTU
-
- 16 Nov, 2022 1 commit
-
-
Committed by Piotr Paturej
* Enable bf16 in oneDNN bilinear_interp kernel * Fix bilinear_interp_v2 not enabled in models * Remove unnecessary checks
-
- 15 Nov, 2022 1 commit
-
-
Committed by jakpiase
* optimization for ln * fix * added output to gpd * added formatting * fix
-
- 10 Nov, 2022 2 commits
-
-
Committed by zhangxin81
* add roformer pass&&plugin(novarlen)
-
Committed by RichardWooSJTU
* add fuse_multi_transformer_layer_pass
-
- 09 Nov, 2022 2 commits
-
-
Committed by joanna.wozna.intel
-
Committed by Paulina Gacek
* Analysis API interface for disabling fc passes * Unit tests corrected * Python API added * test runs only when PADDLE_WITH_MKLDNN * Fc op changed to relu in matmul_op_test * Disable fc passes in tests where acc drops * code formating * Unit test for analysisConf added * Unit test gpu added * fc passes disabled when iterations=0 in gru test * style * passes disabled when fp32 in gru test * fc passes disabled in lstm test * Import from inference, not fluid in doc
-
- 08 Nov, 2022 1 commit
-
-
Committed by Kaipeng Deng
-