1. 20 Jul, 2023 8 commits
    • [XPU][PHI Kernels] bind reduce_max_int64 set_value_bool sin_grad_fp32... · ab00c96c
      lijin23 committed
      [XPU][PHI Kernels] bind reduce_max_int64 set_value_bool sin_grad_fp32 cos_grad_fp32 for XPU (#55375)
      
      * bind kernels for xpu
      
      * format code
      
      * format code
      
      * 0d support for set value
      
      * refine set_value
    • fix bug of constant folding pass (#55556) · bc61c796
      ming1753 committed
    • [Kunlun] Modify some legacy code on distributed training (#55515) · 806f8d2b
      XiaociZhang committed
      * [Kunlun] Modify some legacy code on distributed training
      
      Previously, XPUs had limitations: concat/split was not
      supported, and c_broadcast only supported fp32. These limitations
      have recently been lifted.
      
      This PR also adds support for multi-device profiling on XPU.
      Without it, devices that enable profiling issue a broadcast that
      hangs, eventually leading to a kernel timeout error.
      
      * fix typo
    • shard grad reduce (#55495) · 284e0d12
      zhenhailiu committed
    • [Semi Auto] Entropy SPMD Rule (#55394) · 5f376f00
      JZ-LIANG committed
      * base rule
      
      * add sharding merge
      
      * add sharding axis merge
      
      * define unified data class for inferencing dist_attr
      
      * test wrap DistTensorSpec in dygraph mode
      
      * matmul main logic done
      
      * shape int64
      
      * common cc
      
      * define unified data class for inferencing dist_attr
      
      * test wrap DistTensorSpec in dygraph mode
      
      * define python api and wrap function in static mode for DistTensorSpec
      
      * revise syntax
      
      * map bugfix
      
      * broadcast func
      
      * compile 1
      
      * add unittest
      
      * add registry
      
      * update unittest
      
      * bugfix
      
      * bugfix
      
      * add pybind
      
      * bugfix
      
      * bugfix macro global namespace
      
      * bugfix macro global namespace
      
      * pybind
      
      * pybind test
      
      * pybind bugfixed1
      
      * pybind bugfixed2
      
      * pybind unittest
      
      * merge dev
      
      * merge dev
      
      * merge dev
      
      * fixed cmake conflict
      
      * fixed cmake conflict
      
      * rename get method
      
      * revise inferforward output type
      
      * revise comment
      
      * replicated rule
      
      * replicated rule 2
      
      * revert bug deps
      
      * add rule
      
      * add unittest
      
      * add rule
      
      * add unittest
      
      * move ut of auto_parallel
      
      * fix ut
      
      * bugfix
      
      * bugfix
      
      * bugfix
      
      * bugfix
      
      * bugfix
      
      * bugfix
      
      * bugfix
      
      * resolve input sharding conflict maybe
      
      * fixed comment
      
      * add rule
      
      * add unittest
      
      * fixed typos
      
      ---------
      Co-authored-by: Yichen Zhang <zhangyichen03@baidu.com>
      Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
    • fix data load error in static mode (#55541) · 746e7cdc
      Kai Song committed
    • [IR] Add variable name prefix for BuildScope (#55536) · 44f409cf
      zhangbo9674 committed
      * add interface
      
      * add code
      
      * add code
      
      * add code
      
      * add code
      
      * fix bug
      
      * fix bug
      
      * add var prefix
    • pp comm overlap use tensor fusion helper (#55540) · 1f79fd47
      Yuang Liu committed
  2. 19 Jul, 2023 22 commits
  3. 18 Jul, 2023 10 commits