- 30 Nov, 2022 2 commits
-
-
Committed by feng_shuai
-
Committed by RichardWooSJTU
* delete unnecessary shape and slice op Co-authored-by: Your Name <you@example.com>
-
- 23 Nov, 2022 1 commit
-
-
Committed by Wilber
-
- 21 Nov, 2022 2 commits
-
-
Committed by Sylwester Fraczek
* add fc-residual quantization * revert removal of check for use_mkldnn * fix bug * add disable_logs * review fix call twice AreScalesPresntForNodes instead of if-else * rewrite residual input to output * revert fc mkldnn taking residual data * format fix * fix LoDTensor->DenseTensor * LoDTensor->DenseTensor * output->input * revert changes to unsupported script revert changes to unsupported script * remove fc residualdata from output blocklist in cpu_bfloat16_pass.cc
-
Committed by RichardWooSJTU
-
- 16 Nov, 2022 1 commit
-
-
Committed by Piotr Paturej
* Enable bf16 in oneDNN bilinear_interp kernel * Fix bilinear_interp_v2 not enabled in models * Remove unnecessary checks
-
- 15 Nov, 2022 1 commit
-
-
Committed by jakpiase
* optimization for ln * fix * added output to gpd * added formatting * fix
-
- 10 Nov, 2022 2 commits
-
-
Committed by zhangxin81
* add roformer pass&&plugin(novarlen)
-
Committed by RichardWooSJTU
* add fuse_multi_transformer_layer_pass
-
- 09 Nov, 2022 2 commits
-
-
Committed by joanna.wozna.intel
-
Committed by Paulina Gacek
* Analysis API interface for disabling fc passes * Unit tests corrected * Python API added * test runs only when PADDLE_WITH_MKLDNN * Fc op changed to relu in matmul_op_test * Disable fc passes in tests where acc drops * code formatting * Unit test for analysisConf added * Unit test gpu added * fc passes disabled when iterations=0 in gru test * style * passes disabled when fp32 in gru test * fc passes disabled in lstm test * Import from inference, not fluid in doc
-
- 08 Nov, 2022 1 commit
-
-
Committed by Kaipeng Deng
-
- 07 Nov, 2022 1 commit
-
-
Committed by Hui Zhang
* squeeze2 transpose2 fuse onednn * format * fix output shape * fix conflict * format * format * remove useless * remove log * simply pass * fix comment * fix * fix msg * fix error msg * format
-
- 04 Nov, 2022 1 commit
-
-
Committed by jakpiase
* tmp save * minor change * CI fix * added FC optimizations * latest update * CI fix * fixed bug with fusing fc
-
- 03 Nov, 2022 1 commit
-
-
Committed by yeliang2258
* add constant_folding_pass pass for mkldnn int8 * update UpdateScaleOpInOutScales
-
- 26 Oct, 2022 2 commits
-
-
Committed by wenbin
* prelnlayernorm_shift * add ut * remove paddle_enforce * remove useless * add UT * remove UT * add UT * set timeout
-
Committed by Sławomir Siwek
* fc/matmuls + scale fuse pass * remove double-extension * add unit tests * comments from review * codestyle * add pass to int8 list * new codestyle * attr name typo
-
- 20 Oct, 2022 2 commits
-
-
Committed by feng_shuai
-
Committed by Kaipeng Deng
* add fused_multi_transformer_encoder/decoder pass, run GPT-3 success
-
- 18 Oct, 2022 1 commit
-
-
Committed by Wang Bojun
* first version, accuracy corrected * disable debug print * use blockReduceSum in phi * add UT * add opCompat * code style * code refine * bug fix * code refine * test fix * bugfix * codestyle fix * code style * code-style * code-style * code-style
-
- 17 Oct, 2022 2 commits
-
-
Committed by Wang Bojun
* first version of ln_s_p with s>0 * refine and UT * pass opt draft * pass opt * code refine * code-style * bug fix * fix ci test * code style
- 16 Oct, 2022 1 commit
-
-
Committed by ZeKai Zhou
-
- 10 Oct, 2022 1 commit
-
-
Committed by zhoutianzi666
-
- 27 Sep, 2022 1 commit
-
-
Committed by Wangzheee
* [Paddle Inference]support n lookup_tables fuse to embeddinglayernorm(3)
-
- 21 Sep, 2022 1 commit
-
-
Committed by zhoutianzi666
* Remove trt_reshape2_matmul_fuse_pass
-
- 07 Sep, 2022 1 commit
-
-
Committed by wenbin
* first commit * convert done * correct format * layernorm_shift_partition * correct convert * redefine plugin * runnable * bug fix * modify ShiftPartitionPattern * correct * add UT * modify ut * compile * modify enforce * modify UT
-
- 02 Sep, 2022 1 commit
-
-
Committed by Sylwester Fraczek
-
- 30 Aug, 2022 1 commit
-
-
Committed by zhoutianzi666
Add constant folding pass; for some models, it reduces latency.
-
- 22 Aug, 2022 3 commits
-
-
Committed by joanna.wozna.intel
* Add int8 support for matmul+elementwise_add fuse * Corrections after review and ernie test fix
-
Committed by Sławomir Siwek
* merge conv_concat_relu to conv_act * fix typo * extend unit test * reuse existing gpd * codestyle * enforce mkldnn conv
-
Committed by Yuanle Liu
-
- 16 Aug, 2022 1 commit
-
-
Committed by feng_shuai
* convert multihead to oss * fix:bug * fix:delete const cast * fix:don't support bias_qk * add vit pass * fix:convert bug and add preln_residual_bias * support length=-1 * add UT for convert * add no_bias_qk support for gpu_multihead_op * delete infer_shape depends on bias_qk * oss just can be used in T4 and A* * fix:change api for ROCM CI
-
- 15 Aug, 2022 1 commit
-
-
Committed by Yuanle Liu
-
- 14 Aug, 2022 1 commit
-
-
Committed by xiaoxiaohehe001
This reverts commit 84bf5c31.
-
- 10 Aug, 2022 1 commit
-
-
Committed by xiaoxiaohehe001
* cuda_graph * cuda_graph_ * cuda_graph_ * cuda_graph_
-
- 05 Aug, 2022 1 commit
-
-
Committed by Sławomir Siwek
* remove v2_transpose_reshape * matmul_transpose_reshape * reshape_transpose_matmul * restore ut * adjust old ut * restore parallel UT rules * feedback from review
-
- 04 Aug, 2022 1 commit
-
-
Committed by Sławomir Siwek
* Add unit tests * matmul_v2 + activation * matmuls + elementwise_add * matmul_v2 postops * transform matmul to v2 * opcompat * fix fusing matmul with multiple outs * add shape constraints * remove unused vars * change pass order * - Unit tests to be debugged - fix - refactor - diagnostic - more diagnostic - fix - Fix number two - fix - fix - fix - alpha added - more fixes - compilation fix - removed diagnostic code - cosmetic fixes * lint * add alpha constraint * merge matmul refactor * trigger CI * - fix * - another fix * code style * add support for matmul+elementwise_add+activation * code style * fix bfloat16 bugs * change append_binary to append_sum Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>
-
- 02 Aug, 2022 1 commit
-
-
Committed by Wilber
* multihead matmul add fp16 * fix windows error * fix rocm error * fix rocm error
-
- 29 Jul, 2022 1 commit
-
-
Committed by ming1753
* fused_fc_elementwise_layernorm support fp16 * fused_fc_elementwise_layernorm support double
-