- 22 May, 2023 2 commits
-
-
Committed by zhupengyang
-
Committed by zhangyikun02
-
- 19 May, 2023 2 commits
-
-
Committed by wz1qqx
-
Committed by limingshu
* Reorganize the forward codes of flash-attention.
* Fix forward.
* Remove some unused codes.
* Simplify codes and fix backward.
* Change all LOG(INFO) to VLOG and fix the backward.
* add scale for AF2 flash_attn, much thanks to xreki and shaojie for debug these codes
* decrease the effect of debug print on performance
* Unify the initialize of flashattn arguments.
* Rewrite the reshape of temp_mask and temp_bias.
* API support use_flash_attn.
* Fix compiling error on CI.
* Try to crop the flash-attention lib.
* Correct the condition of whether can use flash-attn.
* Remove the softmax_out argument.
* Remove is_causal.
* Polish codes.
* Fix qkv_transpose_out's shape and scaling of Q * K.
* Update commit of flash-attention.
---------
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
-
- 18 May, 2023 2 commits
- 15 May, 2023 2 commits
-
-
Committed by Galaxy1458
* test,test=develop
* test,test=develop
* test,test=develop
* test,test=develop
* test,test=develop
* test,test=develop
-
Committed by RuohengMa
-
- 12 May, 2023 2 commits
- 11 May, 2023 4 commits
-
-
Committed by lijin23
* add pad op for xpu
* add pad op for xpu
* add pad op for xpu
-
Committed by jjyaoao
-
Committed by SaltFish11
* add_depthwise_conv2d_transpose
* Update test_depthwise_conv2d_transpose_op_xpu.py: remove print statements
-
Committed by 张春乔
-
- 10 May, 2023 1 commit
-
-
Committed by wz1qqx
* fix as review, add fp16 conv2d_transpose
* fix unittest of bn and reduce_mean
* fix bn unittest
* fix ci
* fix ci
-
- 09 May, 2023 3 commits
-
-
Committed by Galaxy1458
* test,test=develop
* test,test=develop
* test,test=develop
* test,test=develop
-
Committed by RuohengMa
* bind sparse_coo_tensor, reduce_max/max_int32, range/arange_int32, equal_bool, scatter_grad_float32, nearest_interp_int64 kernels
* add more unit tests; modify compilation logic of xpu sparse kernels
-
Committed by Galaxy1458
* test,test=develop
* test,test=develop
-
- 08 May, 2023 2 commits
- 06 May, 2023 1 commit
-
-
Committed by csy0225
-
- 28 April, 2023 1 commit
-
-
Committed by lj970926
* clang format
* add cumsum_grad op to xpu2_op_list
-
- 27 April, 2023 1 commit
-
-
Committed by houj04
-
- 26 April, 2023 1 commit
-
-
Committed by risemeup1
* Optimize prompt information
* add_information
* add_information
-
- 25 April, 2023 1 commit
-
-
Committed by YuanRisheng
* add flags for phi
* fix compile bugs
* fix ci bugs
* fix inference bugs
* fix cinn' bugs
* fix cinn bugs
* perfect code according comment
* fix ci bugs
* fix ci bugs
-
- 24 April, 2023 1 commit
-
-
Committed by Galaxy1458
* test,test=develop
* test,test=develop
* test,test=develop
* test,test=develop
* test,test=develop
* test,test=develop
* test,test=develop
* test ,test=develop
-
- 20 April, 2023 2 commits
- 19 April, 2023 1 commit
-
-
Committed by houj04
-
- 17 April, 2023 1 commit
-
-
Committed by 张春乔
-
- 14 April, 2023 2 commits
- 13 April, 2023 3 commits
-
-
Committed by jjyaoao
* delete WITH_ASCEND_CL
* delete NPU/ and WITH_MLU
-
Committed by HongyuJia
* [enforce.h Decouple logging.h] Delete glog/logging.h from enforce.h
* Add logging.h for profiler.cc
* Add logging.h for gloo_utils.h
* Add logging.h for addmm_kernel_impl.h
* Add logging.h for addmm_grad_kernel_impl.h
* Add logging.h for p_send_kernel.cu
* Add logging.h for determinant_grad_kernel_impl.h
* Add logging.h for p_recv_kernel.cu
* Add logging.h for elementwise_grad_base.h
* Add logging.h for transfer_layout_kernel.cc
* Add logging.h for eigvals_kernel.cc and index_select_impl.h
* Add logging.h for all files in kernel directory
* Add logging.h for xpu_info.cc
* Add logging.h for xpu
-
Committed by csy0225
-
- 10 April, 2023 2 commits
-
-
Committed by HongyuJia
* [enforce.h Decouple gflags.h] Move gflags.h from enforce.h to enforce.cc
* Add gflags.h for other files
* Add gflags.h for other files
* Add gflags.h for blas_impl.hip.h
* Add gflags.h for miopen_helper.h
-
Committed by lishicheng1996
-
- 09 April, 2023 1 commit
-
-
Committed by ronnywang
* [PHI CAPI] support complex dtype kernel
* update
-
- 07 April, 2023 1 commit
-
-
Committed by Wang Xin
-
- 06 April, 2023 1 commit
-
-
Committed by Sławomir Siwek
* replace matmul with matmul_v2 in fuse passes
* Remove fusion logic from matmul
* removing fusion methods
* add proper name
* adjust namespaces
* clean attrs in python tests
* delete checkpoint and restore matmul version
* remove unused code
* matmul and reshape/transpose fuses migrated
* split MatmulOneDNN headers
* fuse activation and eltwise_add
* add fuse_activation
* matmul_transpose_reshape/reshape_transpose_matmul
* matmul + elementwise_add (fused)
* activation temporary modification
* restore matmul(v1) version 0
* merge newest develop
* remove dependency from other PR
* revert pbtxt
* remove placeholders from matmul_v2
* add description in OPMaker
* remove matmul_v2_op.h and all dependencies
* remove dims changing in base op
* add possibility to fuse already fused_matmul
* restart broken CI
* Empty-Commit
* revert matmul_utils.h
* codestyle
* adjust imports
* add pbtxt file
* 100% matmul unit tests coverage
* trigger CI with minimal changes to develop
* adjust changes to develop
* add fused_matmul op
* inherit base ops
* add "v2"
* move OPMaker
* Gradually add fused_matmul files
* second batch of fused_matmul changes
* split infershapes of matmul_v2 and fused_matmul
* merge code from other PR
* 2023
* inherit fused_matmul from matmul_v2
* Update paddle/phi/backends/onednn/onednn_reuse.h Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
* Update paddle/phi/kernels/fusion/onednn/fused_matmul_kernel.cc Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
* resolve conflicts
* codestyle
* simplify isgemmlinear
* 2023
* remove import
* reuse methods
* matmul_v2_mkldnn cleanup
* simplify ExecuteMatMulV1Grad
* matmul refactored
* fc
* SetOutMemDescWithLogicalLayoutFusesSupport
* matmul_v2
* alpha support
* group repetitive funcs
* matmul utils
* execute matmul methods
* restore registered kernel names
* split header and impl files
* remove double negatives
* reduce number of modified files
* adjust ExecuteMatmul
* add scales for ut
* dates
* limit number of modified files
* fluid imports
* remove alpha
* codestyle
---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
-