- 21 Apr, 2022 1 commit
-
-
Committed by JingZhuangzhuang
* update ampere sm
* update ampere sm
* update ampere sm
-
- 12 Apr, 2022 1 commit
-
-
Committed by Zhanlue Yang
-
- 22 Mar, 2022 1 commit
-
-
Committed by Zhanlue Yang
-
- 02 Mar, 2022 1 commit
-
-
Committed by Zhanlue Yang
* Adjust GPU arches for whl releases
* Adjusted CUDA arches
* fixed minor issue
* adjusted GPU arches
-
- 11 Feb, 2022 1 commit
-
-
Committed by zhangchunle
-
- 31 Aug, 2021 1 commit
-
-
Committed by Zhanlue Yang
[Background] Code-size growth is hard to reverse in the long run and leads to huge release packages, which not only hamper the user experience but also exceed a hard limit imposed by PyPI. Here, the NV_FATBIN section takes up 86% of the compiled dylib size, owing to the large number of GPU arches supported. This PR aims to prune NV_FATBIN.

[Solution] The new release strategy involves two types of whl packages:
* Cubin PIP package: maintains a narrower window of GPU arch support, containing sm_60, sm_70, sm_75, and sm_80 cubins, covering the Pascal through Ampere arches.
* JIT release package: a backup for the Cubin PIP package, containing compute_35, compute_50, compute_60, compute_70, compute_75, and compute_80, with the best performance and GPU arch coverage. However, it takes around 10 minutes to install because of JIT compilation.

[How to use] The new release strategy is disabled by default.
* To compile the Cubin PIP package, add this to cmake: -DCUBIN_RELEASE_PIP
* To compile the JIT release package, add this to cmake: -DJIT_RELEASE_WHL
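
The difference between the two package types can be sketched as the `-gencode` options passed to nvcc. This is only an illustration, not the code in this commit: the arch lists come from the description above, while the function name `gencode_flags` and the mode strings `"cubin"` and `"jit"` are hypothetical. It assumes the usual nvcc convention that `code=sm_XX` embeds a prebuilt cubin and `code=compute_XX` embeds PTX that the driver JIT-compiles on first use.

```python
# Arch lists are taken from the commit description above.
CUBIN_ARCHES = ["60", "70", "75", "80"]              # Cubin PIP package
JIT_ARCHES = ["35", "50", "60", "70", "75", "80"]    # JIT release package


def gencode_flags(mode):
    """Return nvcc -gencode options for a release mode (hypothetical helper).

    mode is "cubin" or "jit"; both names are assumptions made for this sketch.
    """
    if mode == "cubin":
        # code=sm_XX: ship machine code only, no JIT at install or load time.
        return [f"-gencode arch=compute_{a},code=sm_{a}" for a in CUBIN_ARCHES]
    if mode == "jit":
        # code=compute_XX: ship PTX only, compiled by the driver when needed.
        return [f"-gencode arch=compute_{a},code=compute_{a}" for a in JIT_ARCHES]
    raise ValueError(f"unknown release mode: {mode!r}")


if __name__ == "__main__":
    for mode in ("cubin", "jit"):
        print(mode, " ".join(gencode_flags(mode)))
```

Shipping only cubins keeps the binary small but restricts it to the listed arches, whereas shipping only PTX widens coverage at the cost of the JIT step mentioned above.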
-
- 14 Jul, 2021 1 commit
-
-
Committed by zhouweiwei2014
* Support sccache to speed up compilation on Windows
* Support sccache to speed up compilation on Windows
-
- 06 Jul, 2021 1 commit
-
-
Committed by Zeng Jinle
* add gpu implementation of shuffle batch test=develop
* add thrust cuda patches test=develop
* fix macro guard
* fix shuffle batch compile on windows/hip
* fix hip compilation error
* refine CMakeLists.txt
* fix windows compile error
* try to fix windows CI compilation error
* fix windows compilation again
* fix shuffle_batch op test on Windows
-
- 02 Jun, 2021 1 commit
-
-
Committed by Pei Yang
-
- 26 May, 2021 1 commit
-
-
Committed by Zhou Wei
* fix ninja compilation bug on windows
* polish windows ci
* polish windows ci
-
- 31 Mar, 2021 2 commits
-
-
Committed by tianshuo78520a
-
Committed by wuhuanzhou
* update compilation with C++14, test=develop
* fix compilation error in eigen, test=develop
-
- 30 Mar, 2021 1 commit
-
-
Committed by Yiqun Liu
-
- 17 Mar, 2021 1 commit
-
-
Committed by Zhou Wei
-
- 19 Feb, 2021 1 commit
-
-
Committed by Wojciech Uss
* Modify relu native implementation
* fix GPU performance
-
- 14 Jan, 2021 1 commit
-
-
Committed by Zhou Wei
-
- 27 Nov, 2020 1 commit
-
-
Committed by Shang Zhizhou
* remove -DSUPPORTS_CUDA_FP16 in cuda.cmake
* compile with cuda9
* add some unittest
* notest;test=coverage
* add unittest for trt plugin swish && split
* update ernie unittest
* fix some error message
* remove repeated check of CUDA version in mbEltwiseLayerNormOpConverter
* fix compile error when CUDA_ARCH_NAME < Pascal
* fix compile error
* update unittest timeout
* compile with cuda9
* update error msg
* fix code style
* add some comments
* add define IF_CUDA_ARCH_SUPPORT_FP16
* rename IF_CUDA_ARCH_SUPPORT_FP16 to CUDA_ARCH_FP16_SUPPORTED
-
- 21 Oct, 2020 2 commits
- 18 Sep, 2020 1 commit
-
-
Committed by Pei Yang
-
- 09 Sep, 2020 1 commit
-
-
Committed by wangchaochaohu
-
- 07 Sep, 2020 1 commit
-
-
Committed by wangchaochaohu
-
- 20 Aug, 2020 1 commit
-
-
Committed by Zhou Wei
specify cuda arch when detection fails
-
- 10 Aug, 2020 1 commit
-
-
Committed by Zhou Wei
* Fixed compile warning about incorrect compile options, fix paddle_build.bat
* make paddle_build.bat safer
-
- 09 Jul, 2020 1 commit
-
-
Committed by Chen Weihang
-
- 16 Jun, 2020 1 commit
-
-
Committed by T8T9
-
- 10 Jun, 2020 1 commit
-
-
Committed by Zhou Wei
fix bug in CUDA_NVCC_FLAGS and CMAKE_CUDA_FLAGS
-
- 08 Jun, 2020 1 commit
-
-
Committed by T8T9
* add -DPADDLE_CUDA_BINVER. test=develop, test=win_gpu
* nvcc picks up options set via add_compile_options; avoid using it if you don't want to pass arguments to nvcc. test=develop
* test=develop, test=win_gpu
-
- 05 Jun, 2020 1 commit
-
-
Committed by T8T9
* support CUDA using the cmake built-in way (#24395)
* support CUDA using the cmake built-in way. test=develop
* test=develop
* cmake_minimum_required 3.10
* test=develop
-
- 28 May, 2020 1 commit
-
-
Committed by Zhou Wei
-
- 13 May, 2020 1 commit
-
- 12 May, 2020 1 commit
-
-
Committed by Shibo Tao
* support CUDA using the cmake built-in way. test=develop
* test=develop
-
- 08 May, 2020 1 commit
-
-
Committed by Pei Yang
-
- 30 Apr, 2020 1 commit
-
-
Committed by Guo Sheng
* Fix cusolver loader for Windows in dynamic_loader.cc. test=develop
* Fix missing CUSOLVER_ROUTINE_EACH_R1. test=gpu test=develop
* Mark cusolver as unsupported on Windows temporarily. test=develop
* Fix GetCusolverDsoHandle error message. test=develop
-
- 22 Apr, 2020 1 commit
-
-
Committed by Zeng Jinle
-
- 12 Apr, 2020 1 commit
-
-
Committed by Zhaolong Xing
-
- 26 Mar, 2020 1 commit
-
-
Committed by Zhaolong Xing
* add dynamic plugin support. test=develop
* change emb eltwise layernorm to math function test=develop
* add emb eltwise layernorm test=develop
* can run dynamic shape ernie test=develop
* fix ci test=develop
* add ut for trt ernie dynamic test=develop
* refine dynamic shape c++ interface. test=develop
* fix comments test=develop
* fix comments test=develop
-
- 03 Dec, 2019 1 commit
-
-
Committed by Zhaolong Xing
* add Jetson compile support test=develop
* refine the cmake test=develop
-
- 26 Nov, 2019 1 commit
-
-
Committed by Tao Luo
* make CUDA_ARCH_NAME default to Auto test=develop
* refine warning test=develop
-
- 14 Aug, 2019 1 commit
-
-
Committed by Tao Luo
test=develop
-