- 28 Jan, 2021 1 commit
Committed by Qi Li
* [ROCM] update fluid platform for rocm35 (part1), test=develop
* address review comments, test=develop
- 20 Jan, 2021 1 commit
Committed by wanghuancoder
* delete empty line of pybing.cc, test=develop
* use nvtx push pop in timeline, test=develop
* change year, test=develop
* add #ifdef PADDLE_WITH_CUDA, test=develop
* add #ifndef WIN32, test=develop
* is_pushed to is_pushed_, test=develop
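- 16 Dec, 2020 1 commit
Committed by Y_Xuan
* Add support code for the ROCm platform
* Fix some issues
* Resolve some ambiguities and add comments
* Fix code formatting
* Code changes after resolving conflicts
* Modify operators.cmake
* Fix formatting
* Fix errors
* Unify interfaces
* Update the date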
- 24 Apr, 2020 1 commit
Committed by Guo Sheng
* Add cholesky_op forward part. test=develop
* Complete cholesky_op forward part. test=develop
* Add cholesky_op backward part. test=develop
* Complete cholesky_op backward part. test=develop
* Refine cholesky_op error check and docs. test=develop
* Add grad_check unit test for cholesky_op. test=develop
* Fix sample code in cholesky doc. test=develop
* Refine some error messages of cholesky_op. test=develop
* Refine some error messages of cholesky_op. test=develop
* Remove unused input in cholesky_grad. test=develop
* Remove unused input in cholesky_grad. test=develop
* Fix stream for cusolverDnSetStream. test=develop
* Update PADDLE_ENFORCE_CUDA_SUCCESS from cholesky_op to adapt to latest code. test=develop
* Add CUSOLVER ERROR in enforce.h test=develop
* Fix the missing return value in cholesky. test=develop
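- 05 Feb, 2020 1 commit
Committed by Wilber
Add a WITH_NCCL cmake option that explicitly specifies whether to compile the NCCL-related code. WITH_NCCL is ON by default, but it is turned off if WITH_GPU is OFF.
Add the PADDLE_WITH_NCCL definition.
NCCL compilation can be disabled for single-machine, single-GPU use. Multi-GPU use requires NCCL to stay enabled (the default); if NCCL is disabled, only a single GPU can be used.
Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>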
- 05 Sep, 2019 1 commit
Committed by Yiqun Liu
* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop
* Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop
* Disable for mac and windows. test=develop
* Refine the codes to support manually specified num_threads and workload_per_thread. test=develop
* Refine the CUDA kernel to support large dims. test=develop
- 05 Aug, 2019 1 commit
Committed by liuwei1031
* fix warpctc.dll not found issue, test=develop
* revert the linux platform change, test=develop
* delete warpctc_lib_path.h.in, test=develop
* add SetPySitePackagePath function
* fix warpctc.dylib not found issue on Mac, test=develop
* improve the paddle lib path setting logic, test=develop
* fix mac ci issue caused by test_warpctc_op unittest, test=develop
* tweak code, test=develop
- 29 Jul, 2019 1 commit
Committed by Huihuang Zheng
- 27 Jul, 2019 1 commit
Committed by Huihuang Zheng
Also fix a dependency error which may cause compile error
- 07 May, 2019 1 commit
Committed by Tao Luo
* remove unused FLAGS_warpctc_dir test=develop
* remove FLAGS_warpctc_dir test=develop
- 03 Apr, 2019 1 commit
Committed by Chen Weihang
test=develop
This reverts commit c38c7c56.
- 02 Apr, 2019 1 commit
Committed by Chen Weihang
* link the libwbaes.so into paddle
* polish detail, test=develop
* try fix mac_pr_ci error, test=develop
* add compile option, test=develop
* fix ci error, test=develop
* ignore failed to find mac lib, test=develop
* change cdn to bj, cdn can't get the latest version
* trigger ci, test=develop
* temporary delete win32 lib linking, test=develop
* change https to http, test=develop
* turn compile option on to off
* turn compile option off to on, test=develop
* try lib compiled by gcc4.8, test=develop
* update lib version, test=develop
* link other lib, test=develop
* add setup config
* delete false, test=develop
* delete no_soname, test=develop
* recover so name set
* fix, test=develop
* adjust make config, test=develop
* remove link to wbaes, test=develop
* remove useless define, test=develop
- 18 Dec, 2018 3 commits
- 27 Aug, 2018 3 commits
- 24 Aug, 2018 1 commit
Committed by dzhwinter
- 20 Aug, 2018 1 commit
Committed by dzhwinter
* cudnn windows
* "add comment"
* "windows support"
* "fix cmake error"
- 17 Aug, 2018 2 commits
- 23 Jun, 2018 1 commit
Committed by Yi Wang
* Make paddle no longer depend on boost
* Update enforce.h
- 20 Jun, 2018 1 commit
Committed by tensor-tang
- 16 Apr, 2018 2 commits
Committed by Luo Tao
Committed by Yan Chunwei
- 28 Feb, 2018 1 commit
Committed by Yu Yang
* Make CUPTI_LIB_PATH not passing by macro.
* Add missing header
- 26 Feb, 2018 1 commit
Committed by Xin Pan
- 14 Feb, 2018 1 commit
Committed by Yang Yang(Tony)
* compile with nccl2
* add ncclGroup; it is necessary in nccl2
* add back libnccl-dev
- 10 Feb, 2018 2 commits
- 07 Feb, 2018 1 commit
Committed by Yang Yang
- 09 Jan, 2018 1 commit
Committed by Yiqun Liu
* Add Seq2BatchFunctor, which will be used in WarpCTCOp.
* Implement WrapCTCFunctor and WrapCTCKernel.
* Add unittest of warpctc_op.
* Modify the check_output interface in python unittest framework to allow checking a subset of outputs.
* Use absolute offset lod in warpctc_op and related functors.
* Refine the comments of warpctc_op.
* The new python unittest supports checking a subset of the outputs, so revoke the previous change.
* Rename the transform from LoDTensor to Tensor with shape [max_sequence_length, num_sequences, sequence_width] to PaddingSequenceFunctor.
* Update to the newest codes.
* Rename the PaddingSequenceFunctor to PaddingLoDTensorFunctor and remove the computation of dimensions out of the functors.
- 24 Nov, 2017 1 commit
Committed by Qiao Longfei
* make enforce a target and dependent on nccl when gpu is enabled
* add some more dependency
- 24 Oct, 2017 2 commits
- 15 Oct, 2017 1 commit
Committed by Dong Zhihong
- 31 Aug, 2017 1 commit
Committed by dangqingqing
- 12 Jul, 2017 1 commit
Committed by qijun
- 11 Jul, 2017 1 commit
Committed by Yu Yang