1. 29 Mar, 2023 1 commit
  2. 24 Mar, 2023 1 commit
    • Memory Efficient Attention (#51867) · e5ad3859
      ZhangDY-6483 committed
      * first version, notest
      
      * return final rst, notest
      
      * use infinity() instead of max
      
      * ut structure
      
      * start up of ut
      
      * generate lse
      
      * update
      
      * add dependency
      
      * reconstruct cmake
      
      * move file
      
      * add memory efficient attention and fix blasimpl
      
      * update
      
      * update cmake
      
      * add namespace
      
      * update cmake
      
      * use .cu
      
      * update for pad3d
      
      * bug fix
      
      * bug fix
      
      * update
      
      * bug fix
      
      * update enforce
      
      * add test case
      
      * merge the lse pad
      
      * fix kernel_fn of backward
      
      * fix PADDLE_ENFORCE_EQ and phi_api
      
      * fix PADDLE_ENFORCE
      
      * fix PADDLE_ENFORCE
      
      * rerun coverage
      
      * fix memory efficient attention test
      
      * rerun ci
      
      * add cuda version condition
      
      * add cuda version condition
      
      * delete WIP test
      
      * replace PADDLE_ENFORCE
      
      * edit the namespace of datatype in multiple.cc
      
      * rerun
      
      * rerun
      
      ---------
      Co-authored-by: Nliuyuang <liuyuang@baidu.com>
  3. 22 Mar, 2023 2 commits
  4. 16 Mar, 2023 1 commit
    • Update from_blob API (#51646) · c07c7712
      Huang Jiyi committed
      * remove contexts in tensor_utils
      
      * update from_blob
      
      * update from_blob
      
      * update from_blob
      
      * fix bug
      
      * fix bug
  5. 15 Mar, 2023 1 commit
  6. 13 Mar, 2023 1 commit
    • [Paddle Inference] use python to generate cutlass code (#50603) · 4e9e23cb
      zhoutianzi666 committed
      * use python to generate cutlass code
      
      * refine CommonConvKernelPart1, CommonConvKernelPart2
      
      * remove useless code in generate_cutlass_code.sh
      
      * add more config in conv2d_residual
      
      * CommonCutlassConvKernelPart1 and CommonCutlassConvKernelPart2
      
      * add group conv support in util.cu
      
      * remove .sh
      
      * refine name
      
      * make name good
      
      * add fuse_alpha
      
      * make code easy to understand
      
      * mot fopen generate in py
      
      * use python script to generate conv2d,group=1 cutlass code
      
      * use const &
      
      * use const & && use python script to generate conv2d/group=1 code
  7. 10 Mar, 2023 1 commit
    • [New features] Add function node in phi_kernel for MKLDNN (#51073) · a0a6dc6a
      HappyHeavyRain committed
      * Add function node in phi_kernel for MKLDNN
      
      * fix the bug in 'BuildInferVarKernelContext'
      
      * add infer_varkernel_utils.cc
      
      * fix the bug: the first two parameters of 'BuildInferVarKernelContext' can't be template variables
      
      * change the code according to first review
      
      * change the code according to first review
      
      * change the mode of paddle_build.sh
      
      * change 'infer_var_kernel_fn_' to 'get_kerneltype_forvar_fn_'
      
      * add the error information
      
      * fix NotFound information warning
      
      * fix NotFound information warning
      
      * fix NotFound information warning
  8. 09 Mar, 2023 1 commit
    • Add comm context manager, add phi broadcast op (#51072) · c191b707
      TaoTao Li committed
      * add comm context for device context
      
      * add broadcast phi operator kernel and api
      
      * add broadcast support dtype, update ut
      
      * fix broadcast bfloat16 type
      
      * fix ut
      
      * update test_collective_broadcast_api timeout to 300
  9. 01 Mar, 2023 1 commit
    • Integration flash attention (#49869) · 61611786
      Chitsing KUI committed
      * flash attn
      
      * seed
      
      * almost
      
      * softmax
      
      * fix workspace
      
      * add unittest; linux only
      
      * fix setup
      
      * fix datatype include
      
      * fix setup typo
      
      * fix def scope
      
      * new error api
      
      * use paddle fork
      
      * fix attr bug; complete ut
      
      * update flash hash
      
      * fix rng reset
      
      * fix offset
      
      * fix comments
  10. 16 Feb, 2023 1 commit
    • [phi decoupling] remove variable.h in phi (#50407) · 905cefd4
      Huang Jiyi committed
      * move variable_utils from phi_api_utils to fluid
      
      * fix comment
      
      * update include
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * update
      
      * update
      
      * fix CI-Windows-OpenBLAS
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * update include
      
      * move variable_utils to phi_utils
      
      * fix namespace
  11. 03 Jan, 2023 1 commit
  12. 23 Dec, 2022 1 commit
  13. 22 Dec, 2022 1 commit
  14. 19 Dec, 2022 1 commit
  15. 17 Dec, 2022 1 commit
  16. 16 Dec, 2022 1 commit
  17. 06 Dec, 2022 1 commit
    • Clear extra input (Bias, ResidualData) in OpMaker of conv2d (#47579) · 0a2dfa38
      zyfncg committed
      * delete Bias and ResidualData in OpMaker of conv2d
      
      * delete extra input of conv3d
      
      * refactor pass of conv_bias_fusion
      
      * fix mkldnn dependency
      
      * fix mkldnn compile
      
      * fix test_conv_bias_mkldnn_fuse_pass
      
      * polish some code
      
      * remove useless log
      
      * fix analyzer_vit_ocr_tester
      
      * fix conv_activation_mkldnn_fuse_pass
      
      * fix test_analyzer_ocr
      
      * add fused_conv_sig
      
      * fix performance regression
      
      * fix performance regression
  18. 05 Dec, 2022 1 commit
  19. 18 Nov, 2022 1 commit
    • CUDNN v8 Implementation of Convolution Kernels (#47454) · 14a6e67b
      Tian Zheng committed
      * Refactor conv_kernel and conv_grad_kernel to provide interface for CUDNNv8 implementation
      
      * Fix macro
      
      * Add implementation for conv_kernel and conv_grad_kernel
      
      * Modification after rebase onto latest develop
      
      * Modify plan cache to comply with the API of phi::autotune
      
      * Refactor to reduce duplicate code
      
      * Review fix:
      - move functions in conv_kernel_impl_v8.h and conv_grad_kernel_impl_v8.h to conv_kernel.cu and conv_grad_kernel.cu
      - add const specifier for input tensor
      - add logging when plans fail to execute
      - move CudnnConvBwdFilterV8 and CudnnConvBwdDataV8 to conv_cudnn_frontend.h
      
      * move plan building outside of cache
      
      * Fix ROCM build
  20. 31 Oct, 2022 1 commit
  21. 20 Oct, 2022 1 commit
  22. 19 Sep, 2022 1 commit
  23. 09 Sep, 2022 1 commit
  24. 06 Sep, 2022 1 commit
  25. 02 Sep, 2022 1 commit
  26. 30 Aug, 2022 1 commit
    • [phi] Transfer coalesce_tensor to phi (#45478) · cf9d651b
      HongyuJia committed
      * add coalesce_tensor kernel
      
      * polish coalesce_tensor kernel
      
      * add sig and InferMeta
      
      * add testcase
      
      * add legacy_api.yaml
      
      * fix infermeta
      
      * fix yaml
      
      * fix kernel implementation
      
      * add compile dependency of phi/kernels
      
      * fix MetaConfig
      
      * add python api
      
      * add and fix testcase
      
      * rnn.py add import
      
      * change _C_ops.coalesce_tensor
      
      * remove useless comments
      
      * add SetBackend
      
      * restore XPU kernel temporarily
      
      * fix code according to PR comments
  27. 26 Aug, 2022 1 commit
    • Transfer transfer_layout from fluid to phi (#45261) · 985f2a4a
      kangguangli committed
      * remove fluid kernel and activate phi kernel
      
      * fix parameter error
      
      * transfer mkldnn part
      
      * modify header file path
      
      * fix compile error
      
      * transfer special case
      
      * fix lod setting and special case for layout setting
      
      * add testcase and refine code
  28. 12 Aug, 2022 1 commit
  29. 05 Aug, 2022 2 commits
    • [MKLDNN] Move mkldnn activation kernel to phi (#44365) · 2dfa88d2
      YuanRisheng committed
      * move mkldnn activation kernel
      
      * fix compile bugs
      
      * fix compile bugs
      
      * deal with conflict
      
      * fix compile bugs
      
      * fix windows compile bugs
      
      * mkldnn unittest fix
      
      * change mutable to alloc
      
      * fix unittest bugs
      
      * modify code according to comments
    • move fft kernels to phi (#44714) · 153f1138
      Feiyu Chan committed
      * move fft kernels to phi, done with cufft, pocketfft, mkl_cdft, hipfft
      * make stft_op use fft from phi/kernels/funcs, clean code
  30. 03 Aug, 2022 1 commit
  31. 01 Aug, 2022 1 commit
  32. 29 Jul, 2022 1 commit
  33. 19 Jul, 2022 1 commit
  34. 16 Jul, 2022 1 commit
    • [Phi] Migrate solve kernel to phi (#44363) · c0a7830f
      Weilong Wu committed
      * draft version
      
      * draft version
      
      * draft version
      
      * migrate solve kernel to phi
      
      * polish
      
      * polish
      
      * remove useless header file, fix a bug in grad_kernel_impl
      
      * add header file in need
  35. 14 Jul, 2022 1 commit
    • [Phi] Improve the mechanism for mkldnn kernel in PHI (#43941) · e9b4d0be
      YuanRisheng committed
      * adapt mkldnn kernel in PHI
      
      * fix ci compile bugs
      
      * fix compile bugs
      
      * fix compile bugs
      
      * fix compile bugs
      
      * fix compile bugs
      
      * delete comment
      
      * fix compile bugs in windows-inference
      
      * delete code for coverage
      
      * modify code by review
      
      * modify code by review
      
      * add todo
      
      * fix compile bugs
      
      * fix compile bugs
      
      * fix compile bugs
      
      * fix unittest bugs
  36. 29 Jun, 2022 1 commit
  37. 24 Jun, 2022 1 commit
    • [Phi] Change Copy from Kernel to basic component utils (#43622) · 2739bd73
      YuanRisheng committed
      * perfect copy
      
      * deal with conflict
      
      * deal with conflict
      
      * fix compile bugs
      
      * fix unittest bugs
      
      * change code format
      
      * deal with conflict
      
      * modify code by review
      
      * fix ce bugs
      
      * fix ce bugs
      
      * add lo
      
      * perfect code format
      
      * deal with conflicts
  38. 23 Jun, 2022 1 commit