1. 20 Jan, 2021 1 commit
    • use nvtx push pop in timeline (#30567) · 90773473
      Committed by wanghuancoder
      * delete empty line of pybind.cc, test=develop
      
      * use nvtx push pop in timeline, test=develop
      
      * change year, test=develop
      
      * add #ifdef PADDLE_WITH_CUDA, test=develop
      
      * add #ifndef WIN32, test=develop
      
      * is_pushed to is_pushed_, test=develop
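  2. 06 Jan, 2021 1 commit
  3. 25 Dec, 2020 1 commit
  4. 16 Dec, 2020 1 commit
    • Add ROCm platform support code (#29342) · 76738504
      Committed by Y_Xuan
      * Add ROCm platform support code
      
      * Fix some issues
      
      * Resolve some ambiguities and add comments
      
      * Fix code formatting
      
      * Code changes after resolving conflicts
      
      * Modify operators.cmake
      
      * Fix formatting
      
      * Fix errors
      
      * Unify interfaces
      
      * Update the date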
  5. 01 Dec, 2020 1 commit
  6. 27 Nov, 2020 1 commit
  7. 23 Nov, 2020 1 commit
  8. 17 Nov, 2020 1 commit
  9. 03 Nov, 2020 1 commit
    • Optimize ERNIE model inference performance in TensorRT, support variable-length input (#28367) · ea851796
      Committed by Shang Zhizhou
      * fp16 result ok
      
      * change -DWITH_NVINFER_PLUGIN to config.EnableTensorRtOSS
      
      * auto detect special slice op converter for ernie with trt oss
      
      * ernie oss only support fp16
      
      * fix special_slice_plugin serialize bug
      
      * matmul in tensorrt ok
      
      * ernie unittest ok
      
      * add matmul tensorrt unittest
      
      * remove demo code
  10. 21 Oct, 2020 1 commit
  11. 19 Oct, 2020 1 commit
  12. 14 Oct, 2020 1 commit
  13. 28 Sep, 2020 1 commit
  14. 27 Sep, 2020 1 commit
    • add support to float64 input of warpctc op. (#27399) · 1501a80f
      Committed by Li Fuchen
      * add float64 input to ctc_loss
      
      * modified error message of warpctc
      
      * update repo and tag of warpctc
      
      * add test for warpctc with float64 input
      
      * modified warpctc.cmake to make sure build always
      
      * resolved sample code bug of warpctc
      
      * add core.ops in warpctc dygraph
      
      * fix a bug of test
  15. 24 Sep, 2020 2 commits
    • fix tensorrt 6 build error. test=develop (#27511) · 8f7bb52b
      Committed by Shibo Tao
      * fix tensorrt 6 build error. test=develop
      
      * fix. test=develop
      
      * bug fix
      
      * test=develop
    • use iwyu clean include (#27267) · df43905f
      Committed by wanghuancoder
      * use iwyu clean include, test=develop, test=win
      
      * compilation error, test=develop
      
      * fix compilation error2, test=develop
      
      * fix compilation error3, test=develop
      
      * fix compilation error4, test=develop
      
      * fix compilation error5, test=develop
      
      * fix compilation error6, test=develop
      
      * fix compilation error7, test=develop
      
      * fix compilation error8, test=develop
      
      * fix compilation error8, test=develop
      
      * fix compilation error10, test=develop
      
      * fix compilation error11, test=develop
  16. 23 Sep, 2020 1 commit
  17. 18 Sep, 2020 1 commit
  18. 07 Sep, 2020 1 commit
  19. 03 Sep, 2020 1 commit
  20. 19 Aug, 2020 1 commit
  21. 07 Aug, 2020 1 commit
  22. 05 Aug, 2020 2 commits
    • [CUDNN8 support] : support CUDNN8 (#25664) · 358bc06c
      Committed by Zhaolong Xing
      * cudnn8 support
      test=develop
      
      * fix ci error
      test=develop
    • Fix registering trt plugin (#25744) · b717895f
      Committed by Pei Yang
      * develop dynamic shape serialization
      
      * add test param for gelu
      
      * fix bugs
      
      * delete redundant comments
      
      * debug
      
      * fix conflict. test=develop
      
      * fix bug. test=develop
      
      * add trt dynamic shape serialized support
      
      * fix ernie serialized bug
      test=develop
      
      * fix codestyle
      test=develop
      
      * fix bug
      test=develop
      
      * fix bug.test=develop
      
      * modify cmakelist test=develop
      
      * fix bug
      test=develop
      
      * fix error message.  test=develop
      
      * fix trt register plugin based on pr#25003
      
      * add trt dynload
      
      * fix deserialization bug of not finding plugin registration
      
      * refine code style
      
      * recover engine key in tensorrt_subgraph_pass
      
      * for ci coverage
      
      * add unittest for deserialization
      Co-authored-by: Nhaozech <chenhaoze94@gmail.com>
  23. 20 Jul, 2020 1 commit
  24. 15 Jul, 2020 1 commit
  25. 09 Jul, 2020 2 commits
  26. 07 Jul, 2020 1 commit
  27. 03 Jul, 2020 1 commit
  28. 02 Jul, 2020 1 commit
  29. 24 Jun, 2020 1 commit
    • Add default cudnn lib path (#25175) · 353ea9e8
      Committed by Chen Weihang
      * add default cudnn lib path, test=develop
      
      * change default path in func, test=develop
      
      * move to linux branch, test=develop
      
      * fix var error in other plat, test=develop
  30. 05 Jun, 2020 1 commit
    • Support SelectedRows allreduce in multi-cards imperative mode (#24690) · 4a702ef3
      Committed by Chen Weihang
      * support selectedrows allreduce in multi-cards dygraph, test=develop
      
      * remove useless import modules in unittests, test=develop
      
      * add nccl cmake to get nccl version, test=develop
      
      * add if-condition to compile correctly, test=develop
      
      * add detailed version parsing for old nccl, test=develop
      
      * polish cmake details, test=develop
      
      * fix remove test cmake error, test=develop
      
      * fix cmake condition, test=develop
      
      * change unittest cmake list, test=develop
      
      * fix unittest cmake rule, test=develop, test=framep0
  31. 18 May, 2020 1 commit
    • Add some check for CUDA Driver API and NVRTC (#22719) · 560c8153
      Committed by Yiqun Liu
      * Add the check for whether CUDA Driver and NVRTC are available for the runtime system.
      
      * Call cuInit to initialize the CUDA Driver API before all CUDA calls.
      test=develop
      
      * Change the behavior when libnvrtc.so can not be found, printing a warning instead of exiting.
      test=develop
      
      * Do not initialize CUDA Driver API for windows and macos.
      test=develop
      
      * Remove the call of cuInit when entering paddle and enable the test_code_generator.
      test=develop
      
      * Add some built-in functions for __half.
      test=develop
      
      * Change save_intermediate_out to false in unittest.
      test=develop
      
      * Fix erroneous reference to a temporary variable when setting the include path for device_code.
      test=develop
  32. 08 May, 2020 1 commit
  33. 30 Apr, 2020 1 commit
    • Fix cusolver loader for Windows (#24157) · 1fc6cc50
      Committed by Guo Sheng
      * Fix cusolver loader for Windows in dynamic_loader.cc. test=develop
      
      * Fix missing CUSOLVER_ROUTINE_EACH_R1.
      test=gpu
      test=develop
      
      * Temporarily mark cusolver as unsupported on Windows. test=develop
      
      * Fix GetCusolverDsoHandle error message. test=develop
  34. 27 Apr, 2020 1 commit
  35. 24 Apr, 2020 1 commit
    • Add cholesky_op (#23543) · a8c0fb4e
      Committed by Guo Sheng
      * Add cholesky_op forward part. test=develop
      
      * Complete cholesky_op forward part. test=develop
      
      * Add cholesky_op backward part. test=develop
      
      * Complete cholesky_op backward part. test=develop
      
      * Refine cholesky_op error check and docs. test=develop
      
      * Add grad_check unit test for cholesky_op. test=develop
      
      * Fix sample code in cholesky doc. test=develop
      
      * Refine some error messages of cholesky_op. test=develop
      
      * Refine some error messages of cholesky_op. test=develop
      
      * Remove unused input in cholesky_grad. test=develop
      
      * Remove unused input in cholesky_grad. test=develop
      
      * Fix stream for cusolverDnSetStream. test=develop
      
      * Update PADDLE_ENFORCE_CUDA_SUCCESS from cholesky_op to adapt to latest code.
      test=develop
      
      * Add CUSOLVER ERROR in enforce.h
      test=develop
      
      * Fix the missing return value in cholesky. test=develop
  36. 10 Apr, 2020 2 commits