- 22 Feb, 2023 1 commit
-
-
Committed by Shuangchi He
* Fix some typos. Signed-off-by: Yulv-git <yulvchi@qq.com> * pre-commit Signed-off-by: Yulv-git <yulvchi@qq.com> --------- Signed-off-by: Yulv-git <yulvchi@qq.com>
-
- 17 Feb, 2023 1 commit
-
-
Committed by yuehuayingxueluo
* rename multi_tensor_adam to fused_adam * fix some bugs * fix CI coverage * rename test_fused_adam.py * fix some bug * add test_fused_adam_op.py * fix some bugs * fix fused_adam_op.cc * fix CI bugs * fix CI bug * fix CI bug
-
- 16 Feb, 2023 1 commit
-
-
Committed by Huang Jiyi
* move layer_norm_kernel.cu.h to phi * fix bugs * fix namespace * fix bugs * fix CI-Windows * replace mutable_data * fix bugs * fix bugs
-
- 15 Feb, 2023 1 commit
-
-
Committed by lzy
* make FusedMultiTransformer supports variable-lengths. * modify ffn2 when cuda_version >= 11.6 because of #49392. * code style * delete remove_padding
-
- 14 Feb, 2023 1 commit
-
-
Committed by limingshu
* first commit. * a little changes * add some changes for get vec_size efficiently * fix bugs --------- Co-authored-by: zhangbopd <1299246947@qq.com>
-
- 08 Feb, 2023 3 commits
-
-
Committed by Yuang Liu
-
Committed by Huang Jiyi
-
Committed by YuanRisheng
* unify_kernel * fix compile bugs * modify macro name * perfect code according comment * fix compile bugs * fix compile bugs * fix ci bugs * fix ci bug * fix ci bugs * fix ci bugs * modify code according comment * rm conv_fusion_op
-
- 06 Feb, 2023 2 commits
-
-
Committed by zyfncg
* remove extra input of conv2d * fix bug * fix unittest bug * adjust conv2d.pbtxt * fix cpu_quantize_pass_tester * revert use_addto of conv2d * fix runtime attribute * fix bug * recover force_fp32_output in conv2d * refine error info * fix bug
-
Committed by engineer1109
-
- 03 Feb, 2023 2 commits
-
-
Committed by Sławomir Siwek
* replace matmul with matmul_v2 in fuse passes * Remove fusion logic from matmul * removing fusion methods * add proper name * adjust namespaces * clean attrs in python tests * delete checkpoint and restore matmul version * remove unused code * matmul and reshape/transpose fuses migrated * split MatmulOneDNN headers * fuse activation and eltwise_add * add fuse_activation * matmul_transpose_reshape/reshape_transpose_matmul * matmul + elementwise_add (fused) * activation temporary modification * merge newest develop * remove dependency from other PR * revert pbtxt * remove placeholders from matmul_v2 * add description in OPMaker * remove matmul_v2_op.h and all dependencies * remove dims changing in base op * add possibility to fuse already fused_matmul * restart broken CI * Empty-Commit * revert matmul_utils.h * codestyle * adjust imports * add pbtxt file * 100% matmul unit tests coverage * trigger CI with minimal changes to develop * adjust changes to develop * add fused_matmul op * inherit base ops * add "v2" * move OPMaker * Gradually add fused_matmul files * second batch of fused_matmul changes * split infershapes of matmul_v2 and fused_matmul * inherit fused_matmul from matmul_v2 * Update paddle/phi/backends/onednn/onednn_reuse.h Co-authored-by: Tomasz Socha <tomasz.socha@intel.com> * Update paddle/phi/kernels/fusion/onednn/fused_matmul_kernel.cc Co-authored-by: Tomasz Socha <tomasz.socha@intel.com> --------- Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
-
Committed by Yuang Liu
-
- 01 Feb, 2023 1 commit
-
-
Committed by Wang Bojun
* preln_residual 2 fused_bias_residual * skip layernorm fix and ut * code refine * code style refine * fix ut * fix output * add trt layer fall back info * refine op teller and ut * DropoutMaskOut output fix
-
- 13 Jan, 2023 1 commit
-
-
Committed by Yuanle Liu
-
- 06 Jan, 2023 1 commit
-
-
Committed by MarDino
-
- 05 Jan, 2023 1 commit
-
-
Committed by Yuang Liu
-
- 04 Jan, 2023 3 commits
-
-
Committed by Wilber
-
Committed by Yuanle Liu
-
Committed by HongyuJia
* execute use kernel_key first * change OpKernelType->KernelKey * fix py3 compile error, remove redundant header files * fix build_strategy_test * fix DataType::RAW * fix custom_type test: operator_test.cc * fix transform place * fix backends_are_same_class * try fix place TransDataDevice * support all KernelKey * fix TransformData * fix place_are_same_class * fix merge * fix test_params_no_grad * fix specific place of GetExpectedKernelType * fix specific place of GetExpectedKernelType * fix GetKernelTypeForVar * fix dtype error * fix fetch_v2 * change GetKernelTypeForVar * fix interpreter * fix typo error * polish codes * polish codes * polish codes * fix conflict
-
- 03 Jan, 2023 1 commit
-
-
Committed by zhoutianzi666
* Implement conv2d_fusion NHWC format using CUTLASS * Add unit testing for CUTLASS Conv in inference * Add experimental API for CUTLASS.
-
- 29 Dec, 2022 2 commits
-
-
Committed by MarDino
-
Committed by Wang Bojun
* fusedAttenGrad_noGrad * code style fix * add ut * remove unnecessary log
-
- 23 Dec, 2022 1 commit
-
-
Committed by lzy
-
- 20 Dec, 2022 1 commit
-
-
Committed by huangjiyi
* move dropout_impl from fluid to phi * move cuda_graph_with_memory_pool from fluid to phi * update namespace * remove cuda_graph in fluid * fix mac-build * fix bugs * correct CodeStyle * fix mac-build * fix mutable_data * fix stl include * fix copy param
-
- 19 Dec, 2022 1 commit
-
-
Committed by Wen Sun
-
- 16 Dec, 2022 1 commit
-
-
Committed by Wen Sun
-
- 15 Dec, 2022 2 commits
-
-
Committed by huangjiyi
-
Committed by Sławomir Siwek
* fix wrong handler name * mkldnn_engine -> onednn_engine * remove fluid/errors.h imports * remove fluid/enforce.h imports * remove note and unnecessary import * remove fluid/pretty_log.h imports * remove fluid/place.h imports * remove fluid/data_layout_transform.h imports * remove fluid/device_context.h imports * remove mkldnn_helper code * remove fluid/mkldnn_reuse.h imports * pretty_log import
-
- 14 Dec, 2022 2 commits
-
-
Committed by Ming-Xu Huang
-
Committed by zqw_1997
* modify cmake file for cuda11.8 compile * add op_library(fused_embedding_eltwise_layernorm_op DEPS bert_encoder_functor)
-
- 13 Dec, 2022 1 commit
-
-
Committed by sneaxiy
* save fused_attention memory when dropout_rate = 0.0 * add ut * fix ut bug * fix fused_layernorm_residual_dropout_bias_test.cu
-
- 12 Dec, 2022 1 commit
-
-
Committed by huangjiyi
* move norm_utils.cu.h from fluid to phi * remove norm_utils.h in fluid * fix bugs and replace mutable_data with Alloc * replace mutable_data with Alloc
-
- 09 Dec, 2022 2 commits
- 08 Dec, 2022 1 commit
-
-
Committed by limingshu
-
- 07 Dec, 2022 1 commit
-
-
Committed by 张春乔
-
- 06 Dec, 2022 1 commit
-
-
Committed by zyfncg
* delete Bias and ResidualData in OpMaker of conv2d * delete extra input of conv3d * refactor pass of conv_bias_fusion * fix mkldnn dependency * fix mkldnn compile * fix test_conv_bias_mkldnn_fuse_pass * polish some code * remove useless log * fix analyzer_vit_ocr_tester * fix conv_activation_mkldnn_fuse_pass * fix test_analyzer_ocr * add fused_conv_sig * fix performance regression * fix performance regression
-
- 05 Dec, 2022 2 commits
-
-
Committed by limingshu
* first commit * fix bugs according to ci * add some changes * change file name into function.cu.h * remove const_cast
-
Committed by zhoutianzi666
-
- 01 Dec, 2022 1 commit
-
-
Committed by minghaoBD
* fuse-mt passes compatible with structured pruning
-