1. 26 Jun, 2023 1 commit
  2. 20 Jun, 2023 1 commit
    • [XPU] avoid compile issue in non-xpu env (#54711) · e2690526
      Committed by XiaociZhang
      * [kunlun] avoid compile issue in non-xpu env
      
      also rename macro WITH_XPU_XPTI to WITH_XPTI
      
      * move get_xpti_dependency.sh to tools/xpu
      
      * move get_xpti_dependency.sh to tools/xpu
      
      * call get_xpti_dependency.sh only in need
  3. 16 Jun, 2023 1 commit
    • [kunlun] support xpu runtime profiler (#54685) · 82eeda69
      Committed by jameszhang
      * [kunlun] support xpu runtime profiler
      
      * fix cmake error
      
      * add libxpti.so to paddle package
      
      * fix for style check
      
      * sync change in setup.py and python/setup.py.in
      
      * remove libxpti.so from paddle output dir in this PR
  4. 26 May, 2023 1 commit
    • [PHI Decoupling]Create PHI shared lib (#53735) · da50a009
      Committed by YuanRisheng
      * create phi so
      
      * fix ci bugs
      
      * fix py3 bugs
      
      * add file
      
      * fix py3 bugs
      
      * fix windows bugs
      
      * perfect so
      
      * fix py3 bugs
      
      * delete all static target in phi
      
      * fix windows bugs
      
      * fix py3 bugs
      
      * fix ci bugs
      
      * fix windows bugs
      
      * fix bugs: gflags can't be linked by dynamic and static lib
      
      * fix bugs that can not load 3rd party
      
      * fix ci bugs
      
      * fix compile bugs
      
      * fix py3 bugs
      
      * fix conflict
      
      * fix xpu bugs
      
      * fix mac compile bugs
      
      * fix psgpu bugs
      
      * fix inference failed
      
      * deal with conflict
      
      * fix LIBRARY_PATH bug
      
      * fix windows bugs
      
      * fix onednn error
      
      * fix windows compile bugs
      
      * fix windows compile bugs
      
      * fix test_cuda_graph_static_mode_error aborted
      
      * fix windows bugs
      
      * fix mac-python3 error
      
      * fix hip compile bugs
      
      * change mode to static
      
      * change to static mode
      
      * fix ci bugs
      
      * fix py3 bugs
      
      * fix windows bugs
      
      * fix bugs
      
      * add static flag
      
      * add PADDLE_API
      
      * change position of PADDLE_API
      
      * fix windows bugs
      
      * change mode to dynamic lib
      
      * fix windows static bugs
      
      * deal with conflict
      
      * fix windows unit bug
      
      * fix coverage
      
      * deal with conflict
      
      * fix windows-inference
      
      * fix py3 bugs
      
      * fix bugs when compile type_info
      
      * fix compile bugs
      
      * fix py3 bugs
      
      * fix windows bugs
      
      * fix windows openblas
      
      * fix xpu bugs
      
      * fix enforce_test in windows
      
      * update code according comment
      
      * fix windows cmake bug
      
      * fix windows bugs
      
      * fix windows bugs
      
      * delete cinn unittest
      
      * fix cinn bugs
      
      ---------
      Co-authored-by: lzydev <1528794076@qq.com>
  5. 14 Apr, 2023 1 commit
  6. 08 Apr, 2023 1 commit
  7. 03 Apr, 2023 1 commit
  8. 06 Jan, 2023 1 commit
  9. 12 Dec, 2022 1 commit
    • Optimization of Eigh op with ssyevj_batched runtime api (#48560) · 16e364d3
      Committed by 傅剑寒
      * fix codestyle
      
      * add double complex<float> complex<double> dtype support for syevj_batched
      
      * fix use_syevj flag for precision loss when input dtype of syevj_batch is complex128 in some case
      
      * optimize eigh in different case
      
      * fix missing ; bug
      
      * fix use_syevj bug
      
      * fix use_cusolver_syevj_batched flag
  10. 03 Nov, 2022 1 commit
  11. 19 Oct, 2022 1 commit
  12. 18 Sep, 2022 1 commit
  13. 14 Sep, 2022 1 commit
  14. 01 Aug, 2022 1 commit
  15. 22 Jul, 2022 1 commit
  16. 18 Jul, 2022 1 commit
  17. 28 Jun, 2022 1 commit
  18. 26 Jun, 2022 1 commit
  19. 24 Jun, 2022 1 commit
  20. 18 Jun, 2022 1 commit
  21. 15 Jun, 2022 1 commit
  22. 13 Jun, 2022 1 commit
  23. 09 Jun, 2022 1 commit
  24. 05 Jun, 2022 1 commit
  25. 04 Jun, 2022 1 commit
  26. 04 May, 2022 1 commit
  27. 22 Apr, 2022 1 commit
    • [WIP] Algorithm Cache of cuBlasLt Epilogue (#41010) · 19650d72
      Committed by Ming-Xu Huang
      * Fix leading dimension setting error in fused_gemm_epilogue_grad_op.
      
      * Add dyload to cuBlasLt functions.
      
      * Added cublasLtMatmulAlgoGetHeuristic to improve performance.
      
      * Added FLAGS_cublaslt_exhaustive_search_times to cublasLt epilogue
      
      * Added UTs to FLAGS_cublaslt_exhaustive_search_times
      
      * Added warmup runs in algo searching of Gemm epilogue.
      
      * Update copyright and documents.
      
      * Fixed error handling.
  28. 11 Mar, 2022 1 commit
  29. 28 Feb, 2022 2 commits
  30. 25 Feb, 2022 1 commit
  31. 24 Feb, 2022 1 commit
  32. 20 Feb, 2022 1 commit
  33. 24 Jan, 2022 1 commit
  34. 10 Jan, 2022 1 commit
    • Add gpu kernel for new api : linalg.lstsq (#38621) · 405103d8
      Committed by Haohongxiang
      * add lstsq gpu kernel
      
      * update
      
      * add docs_en
      
      * modify ut
      
      * fix bugs
      
      * modify example in docs_en
      
      * remove lstsq_op.cu from ROCM cmake
      
      * modify docs_en
      
      * modify docs_en
      
      * modify docs_en
      
      * remove unnecessary TensorCopy
  35. 04 Jan, 2022 1 commit
  36. 30 Dec, 2021 3 commits
  37. 29 Dec, 2021 1 commit