1. 20 Apr, 2023 1 commit
    • move_elementwise_raw (#53010) · 7a72f7a2
      zhangyuqin1998 committed
      * setup
      
      * Update elementwise_kernel.cc
      
      * Update elementwise_kernel.cc
      
      * fix
      
      * fix
      
      * Update elementwise_kernel.cu
      
      * fix
      
      * Update elementwise_kernel.cc
      
      * Update elementwise_kernel.cc
      
      * Update elementwise_kernel.cc
      
      * Update elementwise_kernel.cc
      
      * Update elementwise_kernel.cc
      
      * Update elementwise_kernel.cc
  2. 17 Apr, 2023 1 commit
    • [Paddle-Inference] Add cutlass conv2d_depthwise (#51792) · bd3b096a
      zhoutianzi666 committed
      * initial commit for cutlass_teller
      
      * second commit for cutlass_teller
      
      * add conv2d_depthwise python template
      
      * add conv2d_depthwise cutlass template
      
      * /zhoukangkang/paddle_cutlass/Paddle/paddle/fluid/framework/ir/cutlass_teller.h
      
      * refine code in Conv2dFusionCanSupport
      
      * add macro in cutlass_teller.h
      
      * add 3x3 5x5 teller
      
      * add groups not 1 or conv2d_depthwise teller
      
      * only generate conv2d_depthwise kernels where ic is a multiple of 8
      
      * add EXPLICIT in cutlass_teller.h
      
      * final commit
      
      * add split_k_slices in conv2d_depthwise
      
      * make stages == 2
      
      * refactor some code
      
      * add CutlassFusionType
      
      * solve illegal memory
      
      * make stride_h=stride_w && make dilation==1
      
      * must check HasAttr(use_cutlass) before GetAttrIfExists
      
      * add CONV2D_DEPTHWISE_BIAS_SILU to OpType2String
      
      * modify decl.h and util.cu
  3. 14 Apr, 2023 1 commit
    • [phi] move sequence_pool to phi - Step 2 : sequence_pool_op (#52750) · b281b221
      gouzil committed
      * [phi] move sequence_pool kernel to phi
      
      * [phi] mv sequence_pooling to phi funcs
      
      * [phi] mv sequence_pooling_test
      
      * [phi] RollBACK `paddle/fluid/operators/sequence_ops/sequence_pool_op.cc`
      
      * [phi][funcs] fix mutable_data
      
      * [phi][funcs] fix mutable_data
  4. 04 Apr, 2023 1 commit
  5. 31 Mar, 2023 1 commit
  6. 29 Mar, 2023 1 commit
  7. 24 Mar, 2023 1 commit
    • Memory Efficient Attention (#51867) · e5ad3859
      ZhangDY-6483 committed
      * first version, notest
      
      * return final rst, notest
      
      * use infinity() instead of max
      
      * ut structure
      
      * start up of ut
      
      * generate lse
      
      * update
      
      * add dependency
      
      * reconstruct cmake
      
      * move file
      
      * add memory efficient attention and fix blasimpl
      
      * update
      
      * update cmake
      
      * add namespace
      
      * update cmake
      
      * use .cu
      
      * update for pad3d
      
      * bug fix
      
      * bug fix
      
      * update
      
      * bug fix
      
      * update enforce
      
      * add test case
      
      * merge the lse pad
      
      * fix kernel_fn of backward
      
      * fix PADDLE_ENFORCE_EQ and phi_api
      
      * fix PADDLE_ENFORCE
      
      * fix PADDLE_ENFORCE
      
      * rerun coverage
      
      * fix memory efficient attention test
      
      * rerun ci
      
      * add cuda version condition
      
      * add cuda version condition
      
      * delete WIP test
      
      * replace PADDLE_ENFORCE
      
      * edit the namespace of datatype in multiple.cc
      
      * rerun
      
      * rerun
      
      ---------
      Co-authored-by: Nliuyuang <liuyuang@baidu.com>
  8. 22 Mar, 2023 2 commits
  9. 16 Mar, 2023 1 commit
    • Update from_blob API (#51646) · c07c7712
      Huang Jiyi committed
      * remove contexts in tensor_utils
      
      * update from_blob
      
      * update from_blob
      
      * update from_blob
      
      * fix bug
      
      * fix bug
  10. 15 Mar, 2023 1 commit
  11. 13 Mar, 2023 1 commit
    • [Paddle Inference] use python to generate cutlass code (#50603) · 4e9e23cb
      zhoutianzi666 committed
      * use python to generate cutlass code
      
      * refine CommonConvKernelPart1, CommonConvKernelPart2
      
      * remove useless code in generate_cutlass_code.sh
      
      * add more config in conv2d_residual
      
      * CommonCutlassConvKernelPart1 and CommonCutlassConvKernelPart2
      
      * add group conv support in util.cu
      
      * remove .sh
      
      * refine name
      
      * make name good
      
      * add fuse_alpha
      
      * make code easy to understand
      
      * mot fopen generate in py
      
      * use python script to generate conv2d,group=1 cutlass code
      
      * use const &
      
      * use const & && use python script to generate conv2d/group=1 code
  12. 10 Mar, 2023 1 commit
    • [New features]Add function node in phi_kernel for MKLDNN (#51073) · a0a6dc6a
      HappyHeavyRain committed
      * Add function node in phi_kernel for MKLDNN
      
      * fix the bug in 'BuildInferVarKernelContext'
      
      * add infer_varkernel_utils.cc
      
      * fix the bug: the first two parameters of 'BuildInferVarKernelContext' can't be template variables
      
      * change the code according to first review
      
      * change the code according to first review
      
      * change the mode of paddle_build.sh
      
      * change 'infer_var_kernel_fn_' to 'get_kerneltype_forvar_fn_'
      
      * add the error information
      
      * fix NotFound information warning
      
      * fix NotFound information warning
      
      * fix NotFound information warning
  13. 09 Mar, 2023 1 commit
    • Add comm context manager, add phi broadcast op (#51072) · c191b707
      TaoTao Li committed
      * add comm context for device context
      
      * add broadcast phi operator kernel and api
      
      * add broadcast support dtype, update ut
      
      * fix broadcast bfloat16 type
      
      * fix ut
      
      * update test_collective_broadcast_api timeout to 300
  14. 01 Mar, 2023 1 commit
    • Integration flash attention (#49869) · 61611786
      Chitsing KUI committed
      * flash attn
      
      * seed
      
      * almost
      
      * softmax
      
      * fix workspace
      
      * add unittest; Linux only
      
      * fix setup
      
      * fix datatype include
      
      * fix setup typo
      
      * fix def scope
      
      * new error api
      
      * use paddle fork
      
      * fix attr bug; complete ut
      
      * update flash hash
      
      * fix rng reset
      
      * fix offset
      
      * fix comments
  15. 16 Feb, 2023 1 commit
    • [phi decoupling] remove variable.h in phi (#50407) · 905cefd4
      Huang Jiyi committed
      * move variable_utils from phi_api_utils to fluid
      
      * fix comment
      
      * update include
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * update
      
      * update
      
      * fix CI-Windows-OpenBLAS
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * update include
      
      * move variable_utils to phi_utils
      
      * fix namespace
  16. 03 Jan, 2023 1 commit
  17. 23 Dec, 2022 1 commit
  18. 22 Dec, 2022 1 commit
  19. 19 Dec, 2022 1 commit
  20. 17 Dec, 2022 1 commit
  21. 16 Dec, 2022 1 commit
  22. 06 Dec, 2022 1 commit
    • Clear extra input (Bias, ResidualData) in OpMaker of conv2d (#47579) · 0a2dfa38
      zyfncg committed
      * delete Bias and ResidualData in OpMaker of conv2d
      
      * delete extra input of conv3d
      
      * refactor pass of conv_bias_fusion
      
      * fix mkldnn dependency
      
      * fix mkldnn compile
      
      * fix test_conv_bias_mkldnn_fuse_pass
      
      * polish some code
      
      * remove useless log
      
      * fix analyzer_vit_ocr_tester
      
      * fix conv_activation_mkldnn_fuse_pass
      
      * fix test_analyzer_ocr
      
      * add fused_conv_sig
      
      * fix performance regression
      
      * fix performance regression
  23. 05 Dec, 2022 1 commit
  24. 18 Nov, 2022 1 commit
    • CUDNN v8 Implementation of Convolution Kernels (#47454) · 14a6e67b
      Tian Zheng committed
      * Refactor conv_kernel and conv_grad_kernel to provide interface for CUDNNv8 implementation
      
      * Fix macro
      
      * Add implementation for conv_kernel and conv_grad_kernel
      
      * Modification after rebase onto latest develop
      
      * Modify plan cache to comply with the API of phi::autotune
      
      * Refactor to reduce duplicate code
      
      * Review fix:
      - move functions in conv_kernel_impl_v8.h and conv_grad_kernel_impl_v8.h to conv_kernel.cu and conv_grad_kernel.cu
      - add const specifier for input tensor
      - add logging when plans fail to execute
      - move CudnnConvBwdFilterV8 and CudnnConvBwdDataV8 to conv_cudnn_frontend.h
      
      * move plan building outside of cache
      
      * Fix ROCM build
  25. 31 Oct, 2022 1 commit
  26. 20 Oct, 2022 1 commit
  27. 19 Sep, 2022 1 commit
  28. 09 Sep, 2022 1 commit
  29. 06 Sep, 2022 1 commit
  30. 02 Sep, 2022 1 commit
  31. 30 Aug, 2022 1 commit
    • [phi] Transfer coalesce_tensor to phi (#45478) · cf9d651b
      HongyuJia committed
      * add coalesce_tensor kernel
      
      * polish coalesce_tensor kernel
      
      * add sig and InferMeta
      
      * add testcase
      
      * add legacy_api.yaml
      
      * fix infermeta
      
      * fix yaml
      
      * fix kernel implementation
      
      * add compile dependency of phi/kernels
      
      * fix MetaConfig
      
      * add python api
      
      * add and fix testcase
      
      * rnn.py add import
      
      * change _C_ops.coalesce_tensor
      
      * remove useless comments
      
      * add SetBackend
      
      * restore XPU kernel temporarily
      
      * fix code according to PR comments
  32. 26 Aug, 2022 1 commit
    • Transfer transfer_layout from fluid to phi (#45261) · 985f2a4a
      kangguangli committed
      * remove fluid kernel and activate phi kernel
      
      * fix parameter error
      
      * transfer mkldnn part
      
      * modify header file path
      
      * fix compile error
      
      * transfer special case
      
      * fix lod setting and special case for layout setting
      
      * add testcase and refine code
  33. 12 Aug, 2022 1 commit
  34. 05 Aug, 2022 2 commits
    • [MKLDNN]Move mkldnn activation kernel to phi (#44365) · 2dfa88d2
      YuanRisheng committed
      * move mkldnn activation kernel
      
      * fix compile bugs
      
      * fix compile bugs
      
      * deal with conflict
      
      * fix compile bugs
      
      * fix windows compile bugs
      
      * mkldnn unittest fix
      
      * change mutable to alloc
      
      * fix unittest bugs
      
      * modify code according to comments
    • move fft kernels to phi (#44714) · 153f1138
      Feiyu Chan committed
      * move fft kernels to phi, done with cufft, pocketfft, mkl_cdft, hipfft
      * make stft_op use fft from phi/kernels/funcs, clean code
  35. 03 Aug, 2022 1 commit
  36. 01 Aug, 2022 1 commit
  37. 29 Jul, 2022 1 commit
  38. 19 Jul, 2022 1 commit