1. 13 Apr, 2023 2 commits
  2. 12 Apr, 2023 4 commits
    • Z
      Optimize performance of unique kernel (#52736) · 8cbeefea
      Zhang Zheng committed
      * Optimize performance of unique kernel
      
      * fix ci
    • W
      [AMP OP&Test] add fp16/bf16 unittest for pool2d op (#52288) · f9b155f9
      Wei Shengyu committed
      * add bf16 support and bf16/fp16 unittest for pool2d
      
      * add include files
      
      * dbg
      
      * reformat
      
      * reformat
      
      * modify code according to review comment
      
      * remove duplicate code
      
      * remove dup code
      
      * remove useless include
      
      * dbg
    • W
      Patch del (#52754) · 189e0d44
      wangzhen38 committed
      * [DO NOT MERGE] adadelta lr support
      
      * [DO NOT MERGE] gpu support
      
      * [test] follow torch
      
      * fix acc update order
      
      * for ci
      
      * [bug fix] update master para
      
      * [bug fix] update test
      
      * [bug fix] for ci test
      
      * for ci
      
      * fix xpu
      
      * [adadelta fix] del fluid head file
      
      * for ci
      
      * del notes
    • G
      [AMP OP&Test] support bf16 for batch norm (#52407) · 523f8a26
      Guoxia Wang committed
      * [AMP OP&Test] support bf16 for batchnorm
      
      * codestyle
      
      * Update batch_norm_grad_kernel.cu
      
      * Update batch_norm_kernel.cu
      
      * fix codestyle
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * Update batch_norm_kernel.cc
  3. 11 Apr, 2023 7 commits
  4. 10 Apr, 2023 9 commits
  5. 09 Apr, 2023 2 commits
  6. 07 Apr, 2023 2 commits
  7. 06 Apr, 2023 10 commits
    • Y
      fix build bug (#52566) · 6c01ce8a
      yuehuayingxueluo committed
    • S
      Remove oneDNN-specific attributes from matmul (#49444) · 4d97b25d
      Sławomir Siwek committed
      * replace matmul with matmul_v2 in fuse passes
      
      * Remove fusion logic from matmul
      
      * removing fusion methods
      
      * add proper name
      
      * adjust namespaces
      
      * clean attrs in python tests
      
      * delete checkpoint and restore matmul version
      
      * remove unused code
      
      * matmul and reshape/transpose fuses migrated
      
      * split MatmulOneDNN headers
      
      * fuse activation and eltwise_add
      
      * add fuse_activation
      
      * matmul_transpose_reshape/reshape_transpose_matmul
      
      * matmul + elementwise_add (fused)
      
      * activation temporary modification
      
      * restore matmul(v1) version 0
      
      * merge newest develop
      
      * remove dependency from other PR
      
      * revert pbtxt
      
      * remove placeholders from matmul_v2
      
      * add description in OPMaker
      
      * remove matmul_v2_op.h and all dependencies
      
      * remove dims changing in base op
      
      * add possibility to fuse already fused_matmul
      
      * restart broken CI
      
      * Empty-Commit
      
      * revert matmul_utils.h
      
      * codestyle
      
      * adjust imports
      
      * add pbtxt file
      
      * 100% matmul unit tests coverage
      
      * trigger CI with minimal changes to develop
      
      * adjust changes to develop
      
      * add fused_matmul op
      
      * inherit base ops
      
      * add "v2"
      
      * move OPMaker
      
      * Gradually add fused_matmul files
      
      * second batch of fused_matmul changes
      
      * split infershapes of matmul_v2 and fused_matmul
      
      * merge code from other PR
      
      * 2023
      
      * inherit fused_matmul from matmul_v2
      
      * Update paddle/phi/backends/onednn/onednn_reuse.h
      Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
      
      * Update paddle/phi/kernels/fusion/onednn/fused_matmul_kernel.cc
      Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
      
      * resolve conflicts
      
      * codestyle
      
      * simplify isgemmlinear
      
      * 2023
      
      * remove import
      
      * reuse methods
      
      * matmul_v2_mkldnn cleanup
      
      * simplify ExecuteMatMulV1Grad
      
      * matmul refactored
      
      * fc
      
      * SetOutMemDescWithLogicalLayoutFusesSupport
      
      * matmul_v2
      
      * alpha support
      
      * group repetitive funcs
      
      * matmul utils
      
      * execute matmul methods
      
      * restore registered kernel names
      
      * split header and impl files
      
      * remove double negatives
      
      * reduce number of modified files
      
      * adjust ExecuteMatmul
      
      * add scales for ut
      
      * dates
      
      * limit number of modified files
      
      * fluid imports
      
      * remove alpha
      
      * codestyle
      
      ---------
      Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
    • S
      Move fused_attention op to phi [migrate forward GPU OpKernel] (#51743) · a7ec8958
      Sonder committed
      * add kernel functions
      
      * update kernel functions
      
      * update func parameters' name
      
      * create codes for gpu device
      
      * adjust file locations
      
      * fix include error
      
      * remove dependent files to phi/
      
      * restore fused_attention_op.cu
      
      * fix dependence errors
      
      * fix dependence errors
      
      * fix include error
      
      * fix all dependence errors [build success]
      
      * remove useless include
      
      * recover useless include
      
      * use phi::ToNCCLDataType
      
      * fix namespace
      
      * update new register code
      
      * fix error in fused_gemm_epilogue_utils
      
      * fix error in FusedAttentionKernel parm
      
      * finish fused_attention register code [build success]
      
      * add paddle::optional
      
      * add sig file
      
      * fix build error
      
      * fix a include error
      
      * update CMakeList
      
      * fix parameter sequence
      
      * add include file
      
      * update #if before include
      
      * fix grammly error
      
      * update codes for DropoutParam
      
      * remove const cast
      
      * trans some fluid api to phi api
      
      * add #if
      
      * update test code
      
      * update test codes
      
      * recover test codes
      
      * trans fused_attention to fluid
      
      * move #endif to end
      
      * move #endif
      
      * delete useless files
      
      * use fused attention utils and recover random seed
      
      * remove fluid include in phi
    • mv PADDLE_WITH_ASCEND_CL (#52535) · 80dd1672
      张春乔 committed
    • Z
      Rename conv2d transpose grad grad (#52371) · 49bbd466
      zhangyuqin1998 committed
      * Rename conv2d transpose grad grad
      
      * fix
    • C
      fix backend bug (#52526) · 380a9bf7
      Chitsing KUI committed
    • S
      Fix flash attention bug (#52551) · 8ac5a6b6
      sneaxiy committed
      * fix flash attn
      
      * fix another API
    • Z
      [PHI] Adjust files of fusion kernel in PHI (#52420) · 84bb7a96
      zyfncg committed
      * update readme
      
      * remove unused header file
      
      * fix bug
      
      * fix onednn
      
      * fix onednn
      
      * rename header file
    • L
      【PaddlePaddle Hackathon 4】No.63 add fp16 and bf16 for eye and frame (#51819) · ae10133a
      LoneRanger committed
      * add fp16 and bf16 for eye and frame
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * Update test_frame_op.py
      
      fix code style
      
      * fix bug
      
      * fix bug
    • W
      [AMP OP&Test]Add fp16/bf16 support logical op (#52112) · b10e4577
      WJJ1995 committed
      * fixed glog
      
      * add
      
      * add bfloat16 test for logical op
      
      * rm useless code
      
      * add uint16
      
      * deal with comments
      
      * fixed code style
      
      * fixed code style
      
      * fixed for ci
      
      * deal with comments
      
      * fixed for ci
  8. 04 Apr, 2023 4 commits
    • C
      【Hackathon No.62】Add BF16 support and unit tests for the pool3d operator, and FP16/BF16 unit tests for the lgamma and masked_select operators (#51837) · b0dbf9fe
      chenxujun committed
      * Add pool3d lgamma masked_select tests
      
      * Fix code
    • Y
      fix xpu compile bugs (#52501) · 81054ad4
      YuanRisheng committed
    • R
      Improve new executor static build (#51149) · 5bac67d4
      Ruibiao Chen committed
      * Improve new executor static build
      
      * Skip GC for static build
      
      * Skip infershape for static build
      
      * Handle read_op
      
      * Add fused_attention to OpsWithFluidKernelNeedMoveToPhi
      
      * Fix argsort typos
      
      * Add sequence_pool to OpsWithFluidKernelNeedMoveToPhi
      
      * Fix skip share lod errors
      
      * Fix errors for adam
      
      * Fix errors for eigvals, memcpy and fake_quantize
      
      * Add static_build.cc
      
      * Add black list
      
      * Fix CI errors
      
      * Fix CI errors
      
      * Fix CI errors
      
      * Fix TensorArray
      
      * Fix TensorArray
      
      * Add update_loss_scaling to OpsNeedSetOutputDtypeWhenRegisterPhiKernel
      
      * Fix copy
      
      * Fix errors
      
      * Fix momentum
      
      * Skip mkldnn
      
      * Fix CI errors
      
      * Fix c_sync_calc_stream_op
      
      * Fix CINN
      
      * Fix while op
      
      * All CI pass, disable FLAGS to merge code, enable it after more tests in future
      
      * Add UTs
      
      * Fix typos
      
      * Fix typos
      
      * Add mkldnn UT
      
      * Remove mkldnn test
      
      * Fix typos
      
      * Fix dist test
      
      * Fix typos
      
      * Fix CI errors
      
      * Fix CI errors
      
      * Add UTs
      
      * Fix typos
      
      * Fix typos
      
      * Add sparse tests
      
      * ToComplexType -> ToComplex
      
      * Add test_matmul_op_static_build to disable_win_inference_test
    • Z
      rename_bilinear_tensor_product (#52375) · 34069c46
      zhangyuqin1998 committed
      * rename_bilinear_tensor_product
      
      * fix