1. 30 May, 2023 (4 commits)
  2. 26 May, 2023 (1 commit)
    • [PHI Decoupling] Create PHI shared lib (#53735) · da50a009
      Committed by YuanRisheng
      * create phi so
      
      * fix ci bugs
      
      * fix py3 bugs
      
      * add file
      
      * fix py3 bugs
      
      * fix windows bugs
      
      * perfect so
      
      * fix py3 bugs
      
      * delete all static target in phi
      
      * fix windows bugs
      
      * fix py3 bugs
      
      * fix ci bugs
      
      * fix windows bugs
      
      * fix bugs: gflags can't be linked by dynamic and static lib
      
      * fix bugs that can not load 3rd party
      
      * fix ci bugs
      
      * fix compile bugs
      
      * fix py3 bugs
      
      * fix conflict
      
      * fix xpu bugs
      
      * fix mac compile bugs
      
      * fix psgpu bugs
      
      * fix inference failed
      
      * deal with conflict
      
      * fix LIBRARY_PATH bug
      
      * fix windows bugs
      
      * fix onednn error
      
      * fix windows compile bugs
      
      * fix windows compile bugs
      
      * fix test_cuda_graph_static_mode_error aborted
      
      * fix windows bugs
      
      * fix mac-python3 error
      
      * fix hip compile bugs
      
      * change mode to static
      
      * change to static mode
      
      * fix ci bugs
      
      * fix py3 bugs
      
      * fix windows bugs
      
      * fix bugs
      
      * add static flag
      
      * add PADDLE_API
      
      * change position of PADDLE_API
      
      * fix windows bugs
      
      * change mode to dynamic lib
      
      * fix windows static bugs
      
      * deal with conflict
      
      * fix windows unit bug
      
      * fix coverage
      
      * deal with conflict
      
      * fix windows-inference
      
      * fix py3 bugs
      
      * fix bugs when compile type_info
      
      * fix compile bugs
      
      * fix py3 bugs
      
      * fix windows bugs
      
      * fix windows openblas
      
      * fix xpu bugs
      
      * fix enforce_test in windows
      
      * update code according comment
      
      * fix windows cmake bug
      
      * fix windows bugs
      
      * fix windows bugs
      
      * delete cinn unittest
      
      * fix cinn bugs
      
      ---------
      Co-authored-by: lzydev <1528794076@qq.com>
  3. 25 May, 2023 (2 commits)
  4. 24 May, 2023 (1 commit)
  5. 23 May, 2023 (11 commits)
  6. 22 May, 2023 (3 commits)
  7. 19 May, 2023 (5 commits)
    • add minimum grad composite rules (#52561) · 97690816
      Committed by warrentdrew
      * add minimum grad composite rules
      
      * add public python api
      
      * fix format
      
      * fix format
      
      * update testcase
      
      * fix testcase
      
      * fix format
      
      * fix cmakelist.txt
      
      * fix format
      
      * fix param problem
      
      * fix op and composite rule
      
      * fix bf16 cpu support problem
      
      * fix bf16 cpu issue
      
      * fix axis error log
      
      * add axis for maximum
      
      * revert commit
      
      * remove .orig
      
      * fix generic problem
      
      * revert max op
      
      * fix axis error
      
      * fix maximum axis
      
      * fix test_check_output
      
      * fix cinn
      
      * fix minimum maximum axis check
    • Add flash attention to speedup fused_gate_attention. (#52731) · d29c1f8e
      Committed by limingshu
      * Reorganize the forward codes of flash-attention.
      
      * Fix forward.
      
      * Remove some unused code.
      
      * Simplify codes and fix backward.
      
      * Change all LOG(INFO) to VLOG and fix the backward.
      
      * add scale for AF2 flash_attn, much thanks to xreki and shaojie for debug these codes
      
      * decrease the effect of debug print on performance
      
      * Unify the initialize of flashattn arguments.
      
      * Rewrite the reshape of temp_mask and temp_bias.
      
      * API support use_flash_attn.
      
      * Fix compiling error on CI.
      
      * Try to crop the flash-attention lib.
      
      * Correct the condition of whether can use flash-attn.
      
      * Remove the softmax_out argument.
      
      * Remove is_causal.
      
      * Polish codes.
      
      * Fix qkv_transpose_out's shape and scaling of Q * K.
      
      * Update commit of flash-attention.
      
      ---------
      Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
    • test,test=develop (#53811) · 10758725
      Committed by Galaxy1458
    • 【prim】merge branch for GradOpMaker codeGen to clear code (#53874) · 6cb53e91
      Committed by xiaoguoguo626807
      * review
      
      * modify opcompat bug
      
      * modify pybind
    • [CustomDevice] fix buffered reader exception (#53925) · b922e711
      Committed by ronnywang
  8. 18 May, 2023 (6 commits)
    • Fused elementwises kernels and ops (#51427) · fb4a6ecf
      Committed by Hulek
      * Fused elementwises kernels and ops
      
      * change fuse pass name
      
      * adjust .pbtxt files
      
      * adjust quantization attributes
      
      * add missing arguments and fix others, review fixed
      
      * simplify fused kernel registration
      
      * fix elementwise unit tests
      
      * reuse one fused elementwise op
      
      * adjust proto
      
      * Add supported datatypes
      
      * Change 'Scale' to 'scale' in tests, change some tests to onednn
      
      * Revert breaking changes
      
      * Fix unit tests
      
      * Delete obsolete test cases
      
      * Delete commented out code
      
      * Fix codestyle
      
      * delete temporary condition
      
      * fix conflicts and delete duplicate fusing
      
      * Fix code after merge
      
      * Move tests to new directory
      
      * fix tests volatility
      
      * Rename test_elementwise_add_onednn_op.py to test_elementwise_add_mkldnn_op.py
      
      * Update CMakeLists.txt add mkldnn op test
      
      ---------
      Co-authored-by: Silv3S <slawomir.siwek@intel.com>
    • move fusion_group kernel to phi (#53781) · 26da689d
      Committed by huangjiyi
    • move sequence_mask op InferShape func (#53782) · a862debf
      Committed by Wang Xin
      * move sequence_mask op InferShape func
      
      * add dtype infer
    • Fix typos in elementwise dir (#53907) · 2782b291
      Committed by co63oc
    • support auto generate for op layer_norm (#53178) · 4f07b653
      Committed by RedContritio
      * simplify layer_norm_op.cc
      
      * support auto generate for op layer_norm
      
      * update unittest for composite_layer_norm
      
      * remove layer_norm_op.cc from scripts
      
      * replace layer_norm_op with generated_op
      
      * add get_expected_kernel for layer_norm
      
      * update cmake kernel register function for layer_norm_mkldnn_op
    • Fix typos in send_v2_op.cu.cc (#53904) · 65ce6886
      Committed by co63oc
  9. 17 May, 2023 (1 commit)
  10. 16 May, 2023 (6 commits)
    • remove some [-Wunused-parameter] warning and fix a file to pass cpplint (#53814) · 10a38b4e
      Committed by Galaxy1458
      * test,test=develop
      
      * test,test=develop
      
      * test,test=develop
      
      * test,test=develop
      
      * test,test=develop
    • 【static】modify backward prune logic for EmptygradOpMaker (#53746) · 69161a96
      Committed by xiaoguoguo626807
      * add rules
      
      * modify no kernel yaml parse
      
      * success op generate
      
      * success test_silu_double
      
      * modify bug
      
      * modify static error
      
      * modify silu_grad input
      
      * modify kernel signature
      
      * modify kernel signature
      
      * code style
      
      * code style
      
      * review
      
      * delete opinfo modify
      
      * modify gradOpMaker
      
      * modify gradOpMaker
      
      * modify generated-j2
      
      * add approve rules
      
      * modify autograd_functional_static_test
    • move cudnn_lstm kernel to phi (#53730) · 52889e38
      Committed by huangjiyi
      * update
      
      * fix bug
      
      * test
      
      * test
      
      * update
      
      * update mutable_data
      
      * fix bug
      
      * update
      
      * fix bug
      
      * update output type reg
      
      * update
      
      * update
    • Retire Ascend and Cambricon related code: retire NPU-related code, part 3 (#53699) · 5b054d2f
      Committed by 张春乔
      * rm npu
      
      * rm use_npu
      
      * rm npuid
      
      * rm use_npu
      
      * rm npuid
      
      * delete npupinned
      
      * roll back sth.
      
      * roll back sth.
      
      * delete npupinned
      
      * roll back sth.
      
      * roll back sth.
      
      * rm npu
      
      * rollback something
      
      * rollback npu identity
      
      * rollback npu identity
    • Move fused batchnorm to Phi (#53476) · 5e5481d8
      Committed by Sonder
      * trans fused batch norm Compute function
      
      * trans batch norm register info to phi
      
      * trans fused batch norm grad Compute
      
      * trans batch norm grad register info
      
      * add sig file
      
      * update sig file
      
      * Update fused_bn_activation_kernel.cu
      
      * Update fused_bn_activation_grad_kernel.cu
      
      * fix
      
      * Rename fused_bn_activation_kernel_grad.cu to fused_bn_activation_kernel.cu
      
      * fix
      
      * fix
      
      * fix CudnnDataType error
      
      * fix
      
      * fix include
      
      * update
      
      * add #if
      
      * add fused bn act to cmakelist.txt
      
      * update  cmakelist
      
      * fix #ifdef error
      
      * add timeout set
      
      * add env set
      
      * fix
      
      * fix
      
      * Update fused_bn_activation_sig.cc
    • static graph autogen code support for softmax op (#53581) · 312f0187
      Committed by Wang Xin
      * static graph autogen code support for softmax op
      
      * bug fixed
      
      * fix PR-CI-Windows error
      
      * fix CI error
      
      * bug fixed
      
      * fix conflicts