1. 05 Sep, 2022 1 commit
  2. 02 Sep, 2022 1 commit
  3. 01 Sep, 2022 3 commits
  4. 29 Aug, 2022 3 commits
  5. 26 Aug, 2022 1 commit
  6. 25 Aug, 2022 2 commits
    • optimize conv algo cache (#41891) · 1cd7e68b
      Committed by hong
      * optimize conv algo speed
      
      * code polish
      
      * remove useless code
      
      * fix compile error
      
      * fix cpu compile error
      
      * not use cudnn algo t
      
      * add search cache max number
      
      * polish code
      
      * fix cache test bug
      
      * add groups data format to conv args
      
      * fix cache test bug
      
      * fix cudnn_deterministic bug
      
      * fix test switch auto tune bug
      
      * fix test switch autotune bug
      
      * fix conv cache bug
      
      * fix cache test error
      
      * fix cache test bug
      
      * fix windows mac compile error
      
      * fix workspace search error
      
      * update cudnn cache
      
      * fix cache test bug; test=develop
      
      * fix autotune switch test error
      
      * polish code
      
      * polish code
    • add temporal shift and grad *test=kunlun (#45300) · 63d9a175
      Committed by haosicheng
  7. 24 Aug, 2022 1 commit
    • Support fp16 of adam operator in xpu environment (#45292) · a012d426
      Committed by mengqingchun02
      * support beam_search operator on xpu. test=kunlun
      
      * support fp16 of adam operator in xpu environment. test=kunlun
  8. 23 Aug, 2022 1 commit
  9. 22 Aug, 2022 1 commit
  10. 19 Aug, 2022 3 commits
    • [XPU] add merged_momentum unittest and change momentum (#45241) · e0f1c9f2
      Committed by dongfangshenzhu
      * add merged_momentum *test=kunlun
      
      * add fp16 to merged_momentum,*test=kunlun
      
      * change dist_model.cc
      
      * add merged_momentum unittest and change momentum, test=kunlun
    • Support beam search decode op in XPU environment (#44917) · adaffb7b
      Committed by mengqingchun02
      * support beam_search operator on xpu. test=kunlun
      
      * fix beam_search operator bugs on xpu. test=kunlun
      
      * support beam_search_decode operator on xpu. test=kunlun
  11. 18 Aug, 2022 1 commit
  12. 17 Aug, 2022 1 commit
  13. 16 Aug, 2022 1 commit
  14. 15 Aug, 2022 2 commits
  15. 12 Aug, 2022 2 commits
    • fix compilation (#45087) · 4eec94dd
      Committed by Allen Guo
    • [geometric] Add paddle.geometric.send_ue_recv API (#43174) · 615b15a3
      Committed by Siming Dai
      * add init file
      
      * add op definition and infermeta
      
      * add kernel definition funcs
      
      * add broadcast infer shape
      
      * add gpu forward kernel
      
      * delete SUB and DIV
      
      * add x_grad
      
      * add template
      
      * add e_grad for min and max
      
      * fix small bug
      
      * temp commit
      
      * add e_grad for sum and mean
      
      * fix some compile bug
      
      * fix compile bugs
      
      * fix compile problem
      
      * add sum forward unittest
      
      * fix broadcast error, add kernel sig, register e_grad, change unit test
      
      * fix grad
      
      * add temp grad fix
      
      * temp commit
      
      * add min max unittest
      
      * add max, min unittest, fix mul bug
      
      * add cpu forward sum and mean
      
      * add forward min max, fix mean unittest
      
      * add cpu backward min max
      
      * fix code-style
      
      * add backward sum mean
      
      * fix rocm ci
      
      * set unittest timeout
      
      * fix bug of x broadcast to e, gpu grad
      
      * fix bug of x broadcast to e, cpu grad
      
      * rename BOOST_GET_CONST macro
      
      * fix rocm ci
      
      * mv graph_send_e_recv to graph_send_ue_recv
      
      * move out_size to IntArray
      
      * add eager op test
      
      * fix max pool type bug, add unittest for api
      
      * revise api doc
      
      * add fp16 for atomic min and max, add unittest
      
      * add unittest
      
      * add fp16 support for graph_send_recv
      
      * fix unittest fp16 bug
      
      * change OutSizeTensor to Out_size
      
      * move E to Y
      
      * add copyright, fix comment
      
      * review code
      
      * fix thread block size
      
      * change api attribute name: pool_type to reduce_op, compute_type to message_op
      
      * change api attribute name, move pool_type to reduce_op, move compute_type to message_op
  16. 11 Aug, 2022 1 commit
  17. 10 Aug, 2022 2 commits
  18. 09 Aug, 2022 1 commit
  19. 08 Aug, 2022 1 commit
  20. 05 Aug, 2022 3 commits
  21. 04 Aug, 2022 3 commits
  22. 03 Aug, 2022 2 commits
  23. 02 Aug, 2022 2 commits
  24. 01 Aug, 2022 1 commit
    • unify gpu context (#44740) · 86763023
      Committed by Leo Chen
      * remove cudaDeviceContext
      
      * remove more template
      
      * fix rocm compile
      
      * remove alias name CUDADeviceContext
      
      * fix compile
      
      * fix tests
      
      * revert changes