1. 17 Jul, 2023 9 commits
  2. 14 Jul, 2023 6 commits
  3. 13 Jul, 2023 11 commits
  4. 12 Jul, 2023 8 commits
    • 276c159d
    • [Semi Auto] Softmax SPMD Rule (#55196) · 885d1aec
      JZ-LIANG authored
      * resolute input sharding conflict maybe
      
      * fixed comment
      
      ---------
      Co-authored-by: Yichen Zhang <zhangyichen03@baidu.com>
      Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
      885d1aec
    • [Inference] rewrite identity_op_clean_pass (#55240) · 2363e623
      Yuanle Liu authored
      * rewrite identity_op_clean_pass
      
      * fix
      
      * adjust identity_op_clean_pass order in gpu passes
      
      * fix ut
      2363e623
    • [bug fix] gpups ci (#55314) · 766fcdf0
      wangzhen38 authored
      766fcdf0
    • Support selected rows new ir (#54987) · fc66b5d7
      hong authored
      * refine program translator
      
      * fix warning: not override
      
      * fix bug
      
      * merge new modifications
      
      * modify by reviews
      
      * resolve conflicts
      
      * resolve conflicts
      
      * fix
      
      * fix
      
      * update
      
      * support selected rows
      
      * update
      
      * add selectrows
      
      * fix bug
      
      * add ut
      
      * refine code
      
      * refine code
      
      * update
      
      * update
      
      * support selected rows
      
      * support selected rows
      
      * support dense tensor
      
      * remove useless code
      
      * polish code
      
      * remote standalone executor test
      
      ---------
      Co-authored-by: kangguangli <kangguangli@hotmail.com>
      Co-authored-by: zhangbo9674 <zhangbo54@baidu.com>
      fc66b5d7
    • [ONEDNN] Upgrade oneDNN version to v3.1 (#52463) · cfa513f7
      YangQun authored
      * squash pick the poc code
      * fix build after rebase
      * fix int8 conv and fc uts
      * Fix and clean-up Get_SRC_Scale_Memory
      * fix floating point fc uts
      * fix test_analyzer_int8_googlenet
      * test_analyzer_int8_mobilenetv1
      * fix int8 mobilenet v2 and v3
      * fix build error after rebase
      * [oneDNN] rename library version
      * fix conv bias datatype
      * try to fix import error
      * fix rebase error
      * [oneDNN] pack library into python wheel
      * add MKLDNN_SHARED_LIB_3 to env_dict
      * fix test_analyzer_bert
      * fix fill_constant op kernel
      * fix ernie and matmul op ut
      * fix softplus ut
      * fix conv+relu6 fusion ut
      * fix hardswish fusion
      * fix quant+transpose fusion ut
      * fix sgd ut
      * fix int8 matmul with flatten
      * fix fc+scale fusion
      * fix conv/matmul+gelu fusion uts
      * fix rebase error
      * Revert "fix conv/matmul+gelu fusion uts"
      This reverts commit 47eb5e49972bd8f7271a233def9bfb3e98ce78e1.
      * upgrade to onednn v3.1
      * remove older version onednn
      * use densetensor::data() for achieving mean and var in layernorm impl
      * comments for atol of integer tests
      * fix clang-format
      * Revert "remove older version onednn"
      This reverts commit 783e57ddfd4401254596eae7d47adb9b03590c09.
      * improve binary handle
      * fix expand kernel
      * Revert "use densetensor::data() for achieving mean and var in layernorm impl"
      * always use forward_inference for conv
      * remove activation scales
      * rollback changes to mkldnn.cmake
      * address comments
      * port changes to dequantize kernel
      * fix merge error
      * fix fused_elementwise_kernel
      * upgrade onednn version to v3.1.1
      * fix some approval error
      * fix error msg format
      * remove old onednn libs
      * try to fix symbolic link issue
      * fix cinn test case segfault
      * do not explicit link test with onednn
      * remove unnecessary changes
      * integrate CINN with onednn v3
      * link with mkldnn project
      * fix cinn build file
      
      ---------
      Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
      Co-authored-by: Chen, Xinyu1 <xinyu1.chen@intel.com>
      Co-authored-by: tianshuo78520a <707759223@qq.com>
      cfa513f7
    • [clang-tidy] enable `readability-container-size-empty` check (#55279) · be3a6fa7
      Wang Xin authored
      * [clang-tidy] enable readability-container-size-empty check
      
      * fix test_custom_kernel Failed
      
      * add clang-tidy-10 in dockerfile
      
      * add clang-tidy in dockerfile
      
      * fix bug
      be3a6fa7
  5. 11 Jul, 2023 6 commits
    • support sharding parallel (#54634) · b7a05057
      pangengzheng authored
      * support sharding parallel
      
      * fix name
      
      * fix
      
      * update
      
      * test amp for sharding
      
      ---------
      
      Co-authored-by: pangengzheng <pangengzheng.baidu.com>
      b7a05057
    • replace the AdagradOptimizer... · 94365855
      LoneRanger authored
      replace the AdagradOptimizer, AdamaxOptimizer, AdadeltaOptimizer, RMSPropOptimizer, LambOptimizer and Momentum (#54152)
      
      * replace the AdadeltaOptimizer with Adadelta
      
      * replace the RMSPropOptimizer with RMSProp
      
      * replace the LambOptimizer with lamb
      
      * replace the momentum in contrib/optimizer.py with Momentum in python/paddle/optimizer/momentum.py
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * fix bug of Lamb
      
      * fix bug of Lamb
      
      * fix bug of import
      
      * replace the AdamaxOptimizer with Adamax and change the optimizer base for AdagradOptimizer
      
      * fix bug
      
      * fix bug
      
      * Update optimizer.py
      
      * fix bug
      
      * fix bug
      94365855
    • Integrate rmsnorm kernel (#54998) · 97d3d6ee
      MarDino authored
      * add rmsnorm kernel
      * add static graph test
      * fix round type
      * use alignas to avoid msvc compile error
      * remove redundant headerfile to avoid rocm compile error
      * fix rocm compile not found cub
      * Add document
      97d3d6ee
    • [CodeStyle][CINN] ruff F403 in test/cinn (#55255) · f4bdfa60
      张春乔 authored
      f4bdfa60
    • [IR] Add op compat info for grad op (#55277) · b4d7e1e0
      zhangbo9674 authored
      * fix bug
      
      * fix bug
      
      * fix bug
      b4d7e1e0
    • [0D-Tensor] Support isclose and polish codes (#55292) · 036c0ae1
      HongyuJia authored
      036c0ae1