- 03 Jan, 2020 1 commit
-
-
Committed by Yiqun Liu
* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop
* Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop
* Disable for mac and windows. test=develop
* Refine the codes to support manually specified num_threads and workload_per_thread. test=develop
* Refine the CUDA kernel to support large dims. test=develop
* Add DeviceCodePool to manage all device codes.
* Add the first implementation fusion_group op.
* Add unit-test for fusion_group op.
* Add the check of result.
* Add the check of nvrtc in unit-test. test=develop
* Add comment to explain the inputs, outputs and features of fusion_group op. test=develop
* Disable fusion_group op for mac and windows. test=develop
* Make the compiling of device code return status instead of hanging up. test=develop
* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.
* Unify fusion_group_op's input and output names. test=develop
* Add the check of CUDA driver library in unittest. test=develop
* Refine the calling of PADDLE_ENFORCE. test=develop
-
- 05 Sep, 2019 1 commit
-
-
Committed by Yiqun Liu
* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop
* Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop
* Disable for mac and windows. test=develop
* Refine the codes to support manually specified num_threads and workload_per_thread. test=develop
* Refine the CUDA kernel to support large dims. test=develop
-
- 20 Jun, 2018 1 commit
-
-
Committed by tensor-tang
-
- 08 Apr, 2018 1 commit
-
-
Committed by Yi Wang
* Update source files.
* Update headers
* Update
* Update
* Update
* Update
* Fix a CMake dependency
-
- 12 Feb, 2018 1 commit
-
-
Committed by qingqing01
-
- 10 Feb, 2018 2 commits
- 26 Dec, 2017 1 commit
-
-
Committed by Luo Tao
-
- 15 Dec, 2017 1 commit
-
-
Committed by Yu Yang
-
- 24 Oct, 2017 1 commit
-
-
Committed by Yu Yang
* "add nccl enforce"
* Dev
* Update comment
* Add nccl test
* Follow comments
-
- 20 Aug, 2017 2 commits
- 16 Aug, 2017 1 commit
-
-
Committed by Yu Yang
* Also simplify pybind implementation by using OperatorBase as holder type.
-
- 01 Aug, 2017 1 commit
-
-
Committed by Yu Yang
-
- 26 Jul, 2017 2 commits
- 25 Jul, 2017 1 commit
-
-
Committed by liaogang
-
- 17 Jul, 2017 2 commits
-
-
Committed by Yu Yang
-
Committed by Yan Chunwei
* add NDEBUG switch to PADDLE_ENFORCE
-
- 11 Jul, 2017 2 commits
-
-
Committed by dongzhihong
-
Committed by dongzhihong
-
- 06 Jul, 2017 2 commits
- 05 Jul, 2017 1 commit
-
-
Committed by liaogang
-
- 04 Jul, 2017 4 commits
- 03 Jul, 2017 1 commit
-
-
Committed by liaogang
* Free will be added soon
-
- 28 Jun, 2017 2 commits