1. 29 Nov, 2022 1 commit
    • [cherry-pick] updating mul and matmul with set_mem_desc and fix... · 9e2ba9b9
      yeliang2258 committed
      [cherry-pick] updating mul and matmul with set_mem_desc and fix squeeze_transpose for MKLDNN (#47951)
      
      * Fix slice bugs in MKLDNN when input dims are zeros (#46671)
      
      * fix slice bugs
      
      * fix
      
      * update code
      
      * fix
      
      * update code
      
      * updating mul and matmul with set_mem_desc (#45624)
      
      * - mul & matmul changes
      
      - fix
      
      - bs16 correction of strides
      
      * - cosmetic fixes
      
      * - lint
      
      * - fix
      
      * - fix
      
      * - format -> mem_desc
      
      * - fix
      
      * - fix
      
      * - fix
      
      * - fix
      
      * - fix
      
      * fix squeeze_transpose (#47911)
      Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>
  2. 09 Nov, 2022 1 commit
  3. 08 Nov, 2022 1 commit
  4. 03 Nov, 2022 1 commit
  5. 28 Oct, 2022 1 commit
  6. 20 Oct, 2022 1 commit
  7. 17 Oct, 2022 2 commits
  8. 11 Oct, 2022 1 commit
    • [cherry-pick] [PHI] relu6_grad kernel (#46501) (#46862) · 2bcbf8b0
      Sławomir Siwek committed
      * [PHI] Migrate gelu kernels (#45596)
      
      * gaussian random
      
      * mkldnn to onednn renaming
      
      * fix merge conflicts
      
      * remove fluid code
      
      * onednn renaming
      
      * gelu fwd
      
      * sort activations
      
      * gelu gradient
      
      * remove unused macros
      
      * merge conflicts
      
      * fix merge conflicts
      
      * remove extra constraint from gelu op
      
      * [PHI] relu6_grad kernel (#46501)
      
      * Relu6
      
      * remove fluid handler
      
      * add individual kernel signature
      
      * coding style
      
      * replace bounded_relu with clip
      
      * whitespace
      
      * code style
  9. 19 Sep, 2022 2 commits
  10. 15 Sep, 2022 1 commit
  11. 14 Sep, 2022 2 commits
  12. 09 Sep, 2022 2 commits
  13. 08 Sep, 2022 2 commits
  14. 07 Sep, 2022 1 commit
  15. 06 Sep, 2022 1 commit
    • Update protobuf output format for profiler (#45724) · 23bc0e3c
      chenjian committed
      * update protobuf format
      
      * fix protobuf content
      
      * fix file mode
      
      * fix compiling error when gpu not exists
      
      * fix compiling error when gpu not exists
      
      * fix compiling error when gpu not exists
      
      * fix compiling error when gpu not exists
      
      * support rocm
  16. 05 Sep, 2022 3 commits
  17. 02 Sep, 2022 1 commit
  18. 01 Sep, 2022 3 commits
  19. 29 Aug, 2022 3 commits
  20. 26 Aug, 2022 1 commit
  21. 25 Aug, 2022 2 commits
    • optimize conv algo cache (#41891) · 1cd7e68b
      hong committed
      * optimize conv algo speed
      
      * code polish
      
      * remove useless code
      
      * fix compile error
      
      * fix cpu compile error
      
      * not use cudnn algo t
      
      * add search cache max number
      
      * polish code
      
      * fix cache test bug
      
      * add groups data format to conv args
      
      * fix cache test bug
      
      * fix cudnn_deterministic bug
      
      * fix test switch auto tune bug
      
      * fix test switch autotune bug;
      
      * fix conv cache bug
      
      * fix cache test error
      
      * fix cache test bug
      
      * fix windows mac compile error
      
      * fix workspace search error
      
      * update cudnn cache
      
      * fix cache test bug; test=develop
      
      * fix autotune switch test error
      
      * polish code
      
      * polish code
    • add temporal shift and grad *test=kunlun (#45300) · 63d9a175
      haosicheng committed
  22. 24 Aug, 2022 1 commit
    • Support fp16 of adam operator in xpu environment (#45292) · a012d426
      mengqingchun02 committed
      * support beam_search operator on xpu. test=kunlun
      
      * support beam_search operator on xpu. test=kunlun
      
      * support beam_search operator on xpu. test=kunlun
      
      * support beam_search operator on xpu. test=kunlun
      
      * support beam_search operator on xpu. test=kunlun
      
      * support fp16 of adam operator in xpu environment. test=kunlun
      
      * support fp16 of adam operator in xpu environment. test=kunlun
      
      * support fp16 of adam operator in xpu environment. test=kunlun
  23. 23 Aug, 2022 1 commit
  24. 22 Aug, 2022 1 commit
  25. 19 Aug, 2022 3 commits
    • [XPU] add merged_momentum unittest and change momentum (#45241) · e0f1c9f2
      dongfangshenzhu committed
      * add merged_momentum *test=kunlun
      
      * add merged_momentum *test=kunlun
      
      * add fp16 to merged_momentum,*test=kunlun
      
      * change dist_model.cc
      
      * add merged_momentum unittest and change momentum,test=kunlun

      * add merged_momentum unittest and change momentum,test=kunlun

      * add merged_momentum unittest and change momentum,test=kunlun

      * add merged_momentum unittest and change momentum,test=kunlun
    • Support beam search decode op in XPU environment (#44917) · adaffb7b
      mengqingchun02 committed
      * support beam_search operator on xpu. test=kunlun
      
      * support beam_search operator on xpu. test=kunlun
      
      * support beam_search operator on xpu. test=kunlun
      
      * support beam_search operator on xpu. test=kunlun
      
      * support beam_search operator on xpu. test=kunlun
      
      * support beam_search operator on xpu. test=kunlun
      
      * support beam_search operator on xpu. test=kunlun
      
      * fix beam_search operator bugs on xpu. test=kunlun
      
      * fix beam_search operator bugs on xpu. test=kunlun
      
      * fix beam_search operator bugs on xpu. test=kunlun
      
      * fix beam_search operator bugs on xpu. test=kunlun
      
      * support beam_search_decode operator on xpu. test=kunlun
      
      * support beam_search_decode operator on xpu. test=kunlun
      
      * support beam_search_decode operator on xpu. test=kunlun
      
      * support beam_search_decode operator on xpu. test=kunlun
      
      * support beam_search_decode operator on xpu. test=kunlun
      
      * support beam_search_decode operator on xpu. test=kunlun
      
      * support beam_search_decode operator on xpu. test=kunlun
      
      * support beam_search_decode operator on xpu. test=kunlun
      
      * support beam_search_decode operator on xpu. test=kunlun
      
      * support beam_search_decode operator on xpu. test=kunlun
      
      * support beam_search_decode operator on xpu. test=kunlun
      
      * support beam_search_decode operator on xpu. test=kunlun
      
      * support beam_search_decode operator on xpu. test=kunlun
      
      * support beam_search_decode operator on xpu. test=kunlun
      
      * support beam_search_decode operator on xpu. test=kunlun
  26. 18 Aug, 2022 1 commit