1. 05 May, 2023 2 commits
  2. 04 May, 2023 1 commit
  3. 28 Apr, 2023 1 commit
  4. 27 Apr, 2023 4 commits
  5. 26 Apr, 2023 1 commit
  6. 25 Apr, 2023 3 commits
  7. 24 Apr, 2023 4 commits
  8. 23 Apr, 2023 2 commits
    • R
      apply gcc12 to gpups (#52960) · cbfd43e4
      risemeup1 committed
      * apply gcc12 to gpups
      
      * apply gcc12 to gpups
      
      * apply gcc12 to gpups
      
      * apply gcc12 to gpups
      
      * apply gcc12 to gpups
      
      * apply gcc12 to gpups
      
      * apply gcc12 to gpups
      
      * apply gcc12 to gpups
      
      * apply gcc12 to gpups
      
      * test
      
      * test
      
      * apply gcc12 to gpups
      
      * apply_gcc12_to_gpups
      
      * fix compiler bug
      
      * fix compiler bug
      
      * test
      
      * fix dangling-pointer compiler
      
      * fix dangling-pointer compiler
      
      * fix dangling-pointer compiler
      
      * apply_gcc12_to_gpups
      
      * apply gcc12 to gpups
      
      * Update cuda_streams_py.cc
    • G
      remove some [-Wunused-parameter] (#53162) · b02687cc
      Galaxy1458 committed
      * test,test=develop
      
      * test,test=develop
      
      * test,test=develop
      
      * test,test=develop
      
      * test,test=develop
      
      * test,test=develop
      
      * test,test=develop
  9. 21 Apr, 2023 4 commits
  10. 20 Apr, 2023 4 commits
  11. 19 Apr, 2023 4 commits
  12. 18 Apr, 2023 3 commits
  13. 17 Apr, 2023 4 commits
    • Z
      [Paddle-Inference] Add cutlass conv2d_depthwise (#51792) · bd3b096a
      zhoutianzi666 committed
      * initial commit for cutlass_teller
      
      * second commit for cutlass_teller
      
      * add conv2d_depthwise python template
      
      * add conv2d_depthwise cutlass template
      
      * /zhoukangkang/paddle_cutlass/Paddle/paddle/fluid/framework/ir/cutlass_teller.h
      
      * refine code in Conv2dFusionCanSupport
      
      * add macro in cutlass_teller.h
      
      * add 3x3 5x5 teller
      
      * add groups not 1 or conv2d_depthwise teller
      
      * only generate conv2d_depthwise kernels whose ic is a multiple of 8
      
      * add EXPLICIT in cutlass_teller.h
      
      * final commit
      
      * add split_k_slices in conv2d_depthwise
      
      * make stages == 2
      
      * refactor some code
      
      * add CutlassFusionType
      
      * solve illegal memory
      
      * make stride_h=stride_w && make dilation==1
      
      * must check HasAttr(use_cutlass) before GetAttrIfExists
      
      * add CONV2D_DEPTHWISE_BIAS_SILU to OpType2String
      
      * modify decl.h and util.cu
    • G
      remove some [-Wunused-parameter] warning (#52924) · 337cc2ca
      Galaxy1458 committed
    • S
      Add output defs for some kernels Phi register (#52941) · 23f87442
      Sonder committed
      * add register info for eigh and eig_grad
      
      * add sync_batch_norm_op.cu register info
      
      * add lamb output register info
      
      * add unique register info
      
      * change type name
      
      * change type name
      
      * add output register info for check_finite_and_unscale
      
      * update cmake and config file
      
      * add register info for adagrad
      
      * fix build error
      
      * add sync to run_unittests.sh
      
      * add register info for unique_consecutive
      
      * fix build error
      
      * add eigh to STATIC_BUILD_TESTS
      
      * update eig_kernel.cc
      
      * update eig_kernel.cc
      
      * fix infer meta error
      
      * fix unique register error
      
      * fix lamb register info error
      
      * fix lamb register info
      
      * update lamb register info
      
      * fix lamb
      
      * remove one Output Register
      
      * update static build file
      
      * add eigh op to disable_wingpu_test
      
      * update run_unittests
  14. 14 Apr, 2023 3 commits
    • J
      delete SupportNPU(), SupportMLU() (#52911) · 8601859e
      jjyaoao committed
      * delete SupportNPU(), SupportMLU()
      
      * delete npu branch
    • F
      1. modify set_value op, use Scalars to represent attr `values`, instead of a... · dd2a749a
      Feiyu Chan committed
      1. modify set_value op, use Scalars to represent attr `values`, instead of a bunch of attributes of various types; (#52408)
      
      2. add a program converter, with the set_value op as an example, which converts `paddle::framework::ProgramDesc` between the old and new formats (the differences are mainly operators whose definitions received incompatible updates);
      3. the program version and operator version map are now always saved when serializing `paddle::framework::ProgramDesc`, to identify the version;
      4. provide an option `legacy_format=false` in the serialization of `paddle::framework::ProgramDesc`; it decides whether to convert the ProgramDesc back to a legacy format that paddle 2.4.2 or earlier versions can load and execute;
      5. deserialization of `paddle::framework::ProgramDesc` now automatically detects whether the bytes it receives are in the legacy format (contain any of the operators that have been incompatibly updated and have any attribute of type `Scalar`) and converts them to the new format. If you want a faithful deserialization without the automatic conversion, you can use protobuf's deserialization instead; this is not recommended, but it can be useful for testing.
    • Z