1. 19 Aug, 2022 3 commits
    • [XPU] add merged_momentum unittest and change momentum (#45241) · e0f1c9f2
      dongfangshenzhu committed
      * add merged_momentum *test=kunlun
      
      * add fp16 to merged_momentum,*test=kunlun
      
      * change dist_model.cc
      
      * add merged_momentum unittest and change momentum,test=kunlun
    • fix some auto code generation bugs (#45232) · 9556c688
      Charles-hit committed
      * Fix an index-out-of-range error when generating dynamic graph code if the output has no configured name.
      
      * decide forward_return[0] is not none
      
      * In the backward yaml, when the forward op has only one output and no name is configured, automatically name the output out
      
      * modify code style
    • Support beam search decode op in XPU environment (#44917) · adaffb7b
      mengqingchun02 committed
      * support beam_search operator on xpu. test=kunlun
      
      * fix beam_search operator bugs on xpu. test=kunlun
      
      * support beam_search_decode operator on xpu. test=kunlun
  2. 18 Aug, 2022 7 commits
  3. 17 Aug, 2022 10 commits
  4. 16 Aug, 2022 11 commits
  5. 15 Aug, 2022 7 commits
    • Refine TRT unit test (#45102) · 3512bf11
      zlsh80826 committed
      * Reduce pool2d test configuration
      
      * Reduce depthwise_conv2d test configuration
      
      * Reduce trt_convert_conv2d_fusion test configuration
      
      * Reduce trt_convert_conv2d test configuration
      
      * Reduce trt_convert_conv2d_transpose test configuration
      
      * Reduce trt_convert_hard_swish test configuration
      
      * Enhance trt auto scan test error message and mechanism
      
      * Increase FP16 trt ut tolerance
    • add mish and mish_grad for XPU, test=kunlun (#45098) · 6815c8ab
      zhangyikun02 committed
    • [jit] rm useless property pybind (#44962) · 8788513b
      Hui Zhang committed
      * rm useless pybind
      
      * rm useless ut
    • [Auto Parallel] Move the distributed info from python to c++ (#44510) · a52357fe
      Yulong Ao committed
      * [Auto Parallel] Move the distributed info from python to c++
      
      * [Auto Parallel] Add dist_attrs for VarDesc and OpDesc
      
      * [Auto Parallel] Add the lost file
      
      * [Auto Parallel] Make the dist attr be unique_ptr
      
      * [Auto Parallel] Add the proto conversion
      
      * [Auto Parallel] Improve the proto support
      
      * [Auto Parallel] Fix the bugs for adding a device or a link
      
      * [Auto Parallel] Add the C++ ProcessMesh and DistributedMapper
      
      * [Auto Parallel] Improve the impl of these dist attrs
      
      * [Auto Parallel] Pybind11 ProcessMesh and DeviceMesh
      
      * [Auto Parallel] Fix the unittest problem
      
      * [Auto Parallel] Explicitly add the src file for auto_parallel target
      
      * [Auto Parallel] Add the proto dependency explicitly
      
      * [Auto Parallel] Fix the cmake bug on windows and mac
      
      * [Auto Parallel] Remove the pybind11 header file in process_mesh.h
      
      * [Auto Parallel] Remove unused codes
      
      * [Auto Parallel] Check whether the dist attr is null
      
      * [Auto Parallel] Implement the assign operator for OpDesc explicitly
    • [XPU] add some collective ops. (#45049) · 7e2a20d5
      houj04 committed
      * [XPU] add some collective ops. test=kunlun
      
      * use XPUOpTestWrapper. test=kunlun
      
      * skip kl1 for collective ops. fix typo: deivce -> device. test=kunlun
    • convert_fp16 support multi block (#45050) · 9aecf286
      Wilber committed
      * convert_fp16 support multi block
      
      * update
  6. 14 Aug, 2022 1 commit
  7. 13 Aug, 2022 1 commit
    • Refine program cache (#45005) · e96dae8b
      Leo Chen committed
      * add cached_serialize_str_
      
      * support program hash
      
      * add sha
      
      * add ut
      
      * use hash_str only for new_exe
      
      * fix attr order