1. 16 Sep, 2020 · 1 commit
  2. 15 Sep, 2020 · 3 commits
  3. 07 Sep, 2020 · 1 commit
  4. 03 Sep, 2020 · 1 commit
  5. 19 Aug, 2020 · 1 commit
  6. 07 Aug, 2020 · 1 commit
  7. 05 Aug, 2020 · 2 commits
    • [CUDNN8 support] : support CUDNN8 (#25664) · 358bc06c
      Committed by Zhaolong Xing
      * cudnn8 support
      test=develop
      
      * fix ci error
      test=develop
    • Fix registering trt plugin (#25744) · b717895f
      Committed by Pei Yang
      * develop dynamic shape serialization
      
      * add test param for gelu
      
      * fix bugs
      
      * delete redundant comments
      
      * debug
      
      * fix conflict. test=develop
      
      * fix bug. test=develop
      
      * add trt dynamic shape serialized support
      
      * fix ernie serialized bug
      test=develop
      
      * fix codestyle
      test=develop
      
      * fix bug
      test=develop
      
      * fix bug.test=develop
      
      * modify cmakelist test=develop
      
      * fix bug
      test=develop
      
      * fix error message.  test=develop
      
      * fix trt register plugin based on pr#25003
      
      * add trt dynload
      
      * fix deserialization bug of not finding plugin registration
      
      * refine code style
      
      * recover engine key in tensorrt_subgraph_pass
      
      * for ci coverage
      
      * add unittest for deserialization
      Co-authored-by: Nhaozech <chenhaoze94@gmail.com>
  8. 20 Jul, 2020 · 1 commit
  9. 15 Jul, 2020 · 1 commit
  10. 09 Jul, 2020 · 2 commits
  11. 07 Jul, 2020 · 1 commit
  12. 03 Jul, 2020 · 1 commit
  13. 02 Jul, 2020 · 1 commit
  14. 24 Jun, 2020 · 1 commit
    • Add default cudnn lib path (#25175) · 353ea9e8
      Committed by Chen Weihang
      * add default cudnn lib path, test=develop
      
      * change default path in func, test=develop
      
      * move to linux branch, test=develop
      
      * fix var error in other plat, test=develop
  15. 05 Jun, 2020 · 1 commit
    • Support SelectedRows allreduce in multi-cards imperative mode (#24690) · 4a702ef3
      Committed by Chen Weihang
      * support selectedrows allreduce in multi-cards dygraph, test=develop
      
      * remove useless import modules in unittests, test=develop
      
      * add nccl cmake to get nccl version, test=develop
      
      * add if-condition to compile correctly, test=develop
      
      * add detailed version parsing for old nccl, test=develop
      
      * polish cmake details, test=develop
      
      * fix remove test cmake error, test=develop
      
      * fix cmake condition, test=develop
      
      * change unittest cmake list, test=develop
      
      * fix unittest cmake rule, test=develop, test=framep0
  16. 18 May, 2020 · 1 commit
    • Add some check for CUDA Driver API and NVRTC (#22719) · 560c8153
      Committed by Yiqun Liu
      * Add a check for whether the CUDA Driver and NVRTC are available on the runtime system.
      
      * Call cuInit to initialize the CUDA Driver API before any CUDA calls.
      test=develop
      
      * Change the behavior when libnvrtc.so cannot be found: print a warning instead of exiting.
      test=develop
      
      * Do not initialize CUDA Driver API for windows and macos.
      test=develop
      
      * Remove the call of cuInit when entering paddle and enable the test_code_generator.
      test=develop
      
      * Add some built-in functions for __half.
      test=develop
      
      * Change save_intermediate_out to false in unittest.
      test=develop
      
      * Fix the erroneous reference to a temporary variable when setting the include path for device_code.
      test=develop
  17. 08 May, 2020 · 1 commit
  18. 30 Apr, 2020 · 1 commit
    • Fix cusolver loader for Windows (#24157) · 1fc6cc50
      Committed by Guo Sheng
      * Fix cusolver loader for Windows in dynamic_loader.cc. test=develop
      
      * Fix missing CUSOLVER_ROUTINE_EACH_R1.
      test=gpu
      test=develop
      
      * Temporarily mark cusolver as unsupported on Windows. test=develop
      
      * Fix GetCusolverDsoHandle error message. test=develop
  19. 27 Apr, 2020 · 1 commit
  20. 24 Apr, 2020 · 1 commit
    • Add cholesky_op (#23543) · a8c0fb4e
      Committed by Guo Sheng
      * Add cholesky_op forward part. test=develop
      
      * Complete cholesky_op forward part. test=develop
      
      * Add cholesky_op backward part. test=develop
      
      * Complete cholesky_op backward part. test=develop
      
      * Refine cholesky_op error check and docs. test=develop
      
      * Add grad_check unit test for cholesky_op. test=develop
      
      * Fix sample code in cholesky doc. test=develop
      
      * Refine some error messages of cholesky_op. test=develop
      
      * Refine some error messages of cholesky_op. test=develop
      
      * Remove unused input in cholesky_grad. test=develop
      
      * Remove unused input in cholesky_grad. test=develop
      
      * Fix stream for cusolverDnSetStream. test=develop
      
      * Update PADDLE_ENFORCE_CUDA_SUCCESS from cholesky_op to adapt to latest code.
      test=develop
      
      * Add CUSOLVER ERROR in enforce.h
      test=develop
      
      * Fix the missing return value in cholesky. test=develop
  21. 10 Apr, 2020 · 2 commits
  22. 05 Feb, 2020 · 1 commit
  23. 03 Jan, 2020 · 1 commit
    • Add the first implementation of fusion_group op (#19621) · d4832077
      Committed by Yiqun Liu
      * Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
      test=develop
      
      * Call CUDA driver api to launch the kernel compiled by nvrtc.
      test=develop
      
      * Disable for mac and windows.
      test=develop
      
      * Refine the codes to support manually specified num_threads and workload_per_thread.
      test=develop
      
      * Refine the CUDA kernel to support large dims.
      test=develop
      
      * Add DeviceCodePool to manage all device codes.
      
      * Add the first implementation of fusion_group op.
      
      * Add unit-test for fusion_group op.
      
      * Add the check of result.
      
      * Add the check of nvrtc in unit-test.
      test=develop
      
      * Add comment to explain the inputs, outputs and features of fusion_group op.
      test=develop
      
      * Disable fusion_group op for mac and windows.
      test=develop
      
      * Make the compiling of device code return status instead of hanging up.
      test=develop
      
      * Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.
      
      * Unify fusion_group_op's input and output names.
      test=develop
      
      * Add the check of CUDA driver library in unittest.
      test=develop
      
      * Refine the calling of PADDLE_ENFORCE.
      test=develop
  24. 01 Dec, 2019 · 1 commit
  25. 30 Sep, 2019 · 1 commit
  26. 28 Sep, 2019 · 2 commits
    • Enable users to create custom cpp op outside framework. (#19256) · 1a3eef02
      Committed by qingqing01
      * Custom ops must be written according to the framework OP spec.
      * Package fluid_framework.so and headers into whl.
      * Add paddle.sysconfig.get_include() and paddle.sysconfig.get_lib() to get include dir and lib dir.
      * Export some C-APIs to merge OpInfo between core.so and custom_op.so.
      * Add unit testing.
      * Update API.spec.
    • fix pool2d pool3d, support asymmetric padding and channel_last (#19739) · 24010472
      Committed by liym27
      * fix pool2d pool3d:
      1. support asymmetric padding;
      2. support padding algorithm:"SAME" and "VALID";
      3. support channel_last: data_format NHWC and NDHWC;
      4. support inferring shape when input with negative dims in compile time;
      5. change doc of python API and c++;
      6. fix bug in cuda kernel when Attr(adaptive) is true.
      
      test=develop,test=document_preview
      
      * fix 'tensors' to 'Tensors'. test=develop,test=document_preview
      
      * add test to cover ValueError. test=develop,test=document_preview
      
      * resolve conflict in test_pool2d. test=develop
  27. 14 Sep, 2019 · 1 commit
  28. 05 Sep, 2019 · 1 commit
    • Integrate NVRTC to support compiling CUDA kernel at runtime (#19422) · 42b5bec6
      Committed by Yiqun Liu
      * Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
      test=develop
      
      * Call CUDA driver api to launch the kernel compiled by nvrtc.
      test=develop
      
      * Disable for mac and windows.
      test=develop
      
      * Refine the codes to support manually specified num_threads and workload_per_thread.
      test=develop
      
      * Refine the CUDA kernel to support large dims.
      test=develop
  29. 02 Sep, 2019 · 1 commit
  30. 20 Aug, 2019 · 1 commit
  31. 12 Aug, 2019 · 1 commit
  32. 05 Aug, 2019 · 1 commit
    • fix warpctc.dll not found issue (#18761) · a43a763b
      Committed by liuwei1031
      * fix warpctc.dll not found issue, test=develop
      
      * revert the linux platform change, test=develop
      
      * delete warpctc_lib_path.h.in, test=develop
      
      * add SetPySitePackagePath function
      
      * fix warpctc.dylib not found issue on Mac, test=develop
      
      * improve the paddle lib path setting logic, test=develop
      
      * fix mac ci issue caused by test_warpctc_op unittest, test=develop
      
      * tweak code, test=develop
  33. 29 Jul, 2019 · 1 commit
  34. 27 Jul, 2019 · 1 commit