- 21 Apr, 2023 1 commit
Committed by zhupengyang
- 19 Apr, 2023 1 commit
Committed by csy0225
- 17 Apr, 2023 1 commit
Committed by zhoutianzi666
* initial commit for cutlass_teller
* second commit for cutlass_teller
* add conv2d_depthwise python template
* add conv2d_depthwise cutlass template
* /zhoukangkang/paddle_cutlass/Paddle/paddle/fluid/framework/ir/cutlass_teller.h
* refine code in Conv2dFusionCanSupport
* add macro in cutlass_teller.h
* add 3x3 5x5 teller
* add groups not 1 or conv2d_depthwise teller
* only generate conv2d_depthwise kernels whose ic is a multiple of 8
* add EXPLICIT in cutlass_teller.h
* final commit
* add split_k_slices in conv2d_depthwise
* make stages == 2
* refactor part of the code
* add CutlassFusionType
* solve illegal memory
* make stride_h=stride_w && make dilation==1
* must check HasAttr(use_cutlass) before GetAttrIfExists
* add CONV2D_DEPTHWISE_BIAS_SILU to OpType2String
* modify decl.h and util.cu
- 14 Apr, 2023 1 commit
Committed by zhupengyang
- 13 Apr, 2023 5 commits
Committed by Wangzheee
* Paddle-Trt: Replace fc mul matmul matmul_v2 with matrix_multiply
Committed by csy0225
Committed by HongyuJia
* [enforce.h Decouple logging.h] Delete glog/logging.h from enforce.h
* Add logging.h for profiler.cc
* Add logging.h for gloo_utils.h
* Add logging.h for addmm_kernel_impl.h
* Add logging.h for addmm_grad_kernel_impl.h
* Add logging.h for p_send_kernel.cu
* Add logging.h for determinant_grad_kernel_impl.h
* Add logging.h for p_recv_kernel.cu
* Add logging.h for elementwise_grad_base.h
* Add logging.h for transfer_layout_kernel.cc
* Add logging.h for eigvals_kernel.cc and index_select_impl.h
* Add logging.h for all files in kernel directory
* Add logging.h for xpu_info.cc
* Add logging.h for xpu
Committed by zhupengyang
Committed by csy0225
- 12 Apr, 2023 1 commit
Committed by Yuanle Liu
- 11 Apr, 2023 1 commit
Committed by wz1qqx
- 10 Apr, 2023 1 commit
Committed by xiaoxiaohehe001
* Support two inputs of multihead attention named qk_multihead
- 06 Apr, 2023 3 commits
Committed by huangjiyi
* update
* fix compile bug
* fix bug
* fix bug
* revert crop_op
* fix xpu compile
* fix cinn compile
* fix bug
* fix bug
* fix bug
* fix bug
* update
* update
* update
Committed by Sławomir Siwek
* replace matmul with matmul_v2 in fuse passes
* Remove fusion logic from matmul
* removing fusion methods
* add proper name
* adjust namespaces
* clean attrs in python tests
* delete checkpoint and restore matmul version
* remove unused code
* matmul and reshape/transpose fuses migrated
* split MatmulOneDNN headers
* fuse activation and eltwise_add
* add fuse_activation
* matmul_transpose_reshape/reshape_transpose_matmul
* matmul + elementwise_add (fused)
* activation temporary modification
* restore matmul(v1) version 0
* merge newest develop
* remove dependency from other PR
* revert pbtxt
* remove placeholders from matmul_v2
* add description in OPMaker
* remove matmul_v2_op.h and all dependencies
* remove dims changing in base op
* add possibility to fuse already fused_matmul
* restart broken CI
* Empty-Commit
* revert matmul_utils.h
* codestyle
* adjust imports
* add pbtxt file
* 100% matmul unit tests coverage
* trigger CI with minimal changes to develop
* adjust changes to develop
* add fused_matmul op
* inherit base ops
* add "v2"
* move OPMaker
* Gradually add fused_matmul files
* second batch of fused_matmul changes
* split infershapes of matmul_v2 and fused_matmul
* merge code from other PR
* 2023
* inherit fused_matmul from matmul_v2
* Update paddle/phi/backends/onednn/onednn_reuse.h Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
* Update paddle/phi/kernels/fusion/onednn/fused_matmul_kernel.cc Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
* resolve conflicts
* codestyle
* simplify isgemmlinear
* 2023
* remove import
* reuse methods
* matmul_v2_mkldnn cleanup
* simplify ExecuteMatMulV1Grad
* matmul refactored
* fc
* SetOutMemDescWithLogicalLayoutFusesSupport
* matmul_v2
* alpha support
* group repetitive funcs
* matmul utils
* execute matmul methods
* restore registered kernel names
* split header and impl files
* remove double negatives
* reduce number of modified files
* adjust ExecuteMatmul
* add scales for ut
* dates
* limit number of modified files
* fluid imports
* remove alpha
* codestyle
---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
Committed by Xinyu Chen
- 04 Apr, 2023 1 commit
Committed by handiz
* change skip-layernorm to adapt a new method
* fix review problem and add vlog
* fix review problem
- 03 Apr, 2023 1 commit
Committed by wz1qqx
- 31 Mar, 2023 2 commits
Committed by YuanRisheng
* remove distribute
* fix py3 bugs
* fix gpu-ps bugs
* fix compile bugs
* fix unittest bugs
Committed by Wangzheee
* fix skiplayernorm, add trt_version check
- 30 Mar, 2023 2 commits
Committed by zhupengyang
Committed by zhupengyang
- 29 Mar, 2023 2 commits
Committed by zhupengyang
Committed by yuehuayingxueluo
* add fuse adamw pass
* fix some bugs
* fix CI bug
* change chunk_size
* fix CI bug
* rm test_fused_adam_op.py
* fix CI bugs
* fix fuse_adamw_op_pass.cc
* change code style
* fix CI bug
* fix ut bug and use_adamw_op_pass.cc
* fix test_fuse_adamw_pass.py
* fix CI bug
* remove fluid
* fix ci bug
* fix CI bug
- 27 Mar, 2023 1 commit
Committed by Sławomir Siwek
* extract Op and OPMaker to .h
* extend pattern for fused_op
* set "with_residual" default to false
* adjust fuse passes
* remove fc+eltwise flag
* fused_output_scale
* activation attrs
* remove extra attrs
* fix int8/bf16 unit tests
* simplify RecomputeOutputDims
* remove unused method
* Add description for attributes
* add extra check
* adjust op compats
* update quantize test
* fix protobuf parsing error
* fix int8 performance
* fused elementwises
* merge develop
* remove activation
* restore activation for existing add/sub ops
- 22 Mar, 2023 5 commits
Committed by joanna.wozna.intel
Committed by Ghost Screaming
* Add fused_feed_forward pass for semi-automatic static graph training.
* Add fused_feedforward property in parallel_executor.cc
* Polish code.
* Polish fused feed_forward pass code. Support use_dropout1 and use_dropout2 option.
* Support model parallel in fused_feedforward pass.
Committed by Sławomir Siwek
* extract common methods to reuse
* add header for transpose ops
* fused_transpose
* Split big function
* transpose2 tests
* fused_transpose
* Apply extra attributes
* add pbtxt file
* update pbtxt
* Merge develop
* add more strict op compats
* code style
* remove mkldnn_data_type
* unify SetOutMemDescWithReshape2FuseSupport
* adjust quantize-dequantize for transpose
* remove appendact
* transpose2 quantization
* fix int8 tests
* adjust transpose_op to current develop
* delete fusion code from transpose_kernel
* add fused transpose to NHWC unittest
* change order
Committed by zhupengyang
Committed by Sylwester Fraczek
- 21 Mar, 2023 1 commit
Committed by iSerendipity
* move DataType from paddle::experimental to phi
* convert namespace
* convert namespace
* convert namespace
* clarify namespace
* convert more datatype
* Revert "convert more datatype" This reverts commit 083b462959e6a22d4d8767707b628b95b396642e.
* convert more in auto_code_generator
* fix conflicts for XPU
* fix namespace conflicts
* fix errors
* Revert "fix errors" This reverts commit f9d9958b54ee32141112274c8a5c3c381ab0f876.
* fix errors
* fix formatting
- 20 Mar, 2023 2 commits
- 16 Mar, 2023 1 commit
Committed by wenbin
* split pass
* fix compile
* fix ut
* more time
* modify ut
* reduce dim
* fix compile
* reshape weight
* tensor
* remove enforce
* static shape ut
* batchsize
* reorder pass
* minus test cases
* windows timeout
* windows time out
* remove test for windows
* correct
* sssss
* xxx
- 15 Mar, 2023 1 commit
Committed by iSerendipity
* Revert "Revert "【Hackathon No.67】remove operator.h in blas.h (#50989)" (#51467)" This reverts commit b9d91531.
* remove cout
* add header
* fix missing header
* fix refer fluid error
* fix missing header
* Update repeat_interleave_grad_kernel_impl.h Change to phi style datatype.
* Update repeat_interleave_grad_kernel_impl.h Fix missing header
* datatype fluid -> phi
* paddle::experimental -> phi
* fix reference error
* fix reference error
* fix reference error
* fix errors
* fix missing FLAGS
* fix missing headers
* fix missing headers
* fix missing headers
* fix missing headers
* fix missing header
* fix missing header
* fix errors
- 14 Mar, 2023 1 commit
Committed by Sonder
- 13 Mar, 2023 3 commits
Committed by Sławomir Siwek
* mkldnn->onednn
* fused softplus op + kernel
* remove extra attributes
* add missing handler
* change var name
Committed by zhoutianzi666
* use python to generate cutlass code
* refine CommonConvKernelPart1, CommonConvKernelPart2
* remove useless code in generate_cutlass_code.sh
* add more config in conv2d_residual
* CommonCutlassConvKernelPart1 and CommonCutlassConvKernelPart2
* add group conv support in util.cu
* remove .sh
* refine name
* make name goodgit status!
* add fuse_alpha
* make code easy to understand
* mot fopen generate in py
* use python script to generate conv2d,group=1 cutlass code
* use const &
* use const & && use python script to generate conv2d/group=1 code
Committed by zhupengyang
- 09 Mar, 2023 1 commit
Committed by Wang Xin
- 07 Mar, 2023 1 commit
Committed by zhupengyang