Created by: qingqing01
Fix #5712 (closed)
1. Automatically detect GPU arch and only specify the detected arch by default.
- For example, on the Tesla K40m, automatically get and specify the sm_35 arch.
- -DCUDA_ARCH_NAME=All in the TeamCity.
- -DCUDA_ARCH_NAME=All by default.
- The developers can set -DCUDA_ARCH_NAME=Auto
- Specify -DCUDA_ARCH_NAME=All when releasing a new PaddlePaddle version.
- Supported archs: Kepler, Maxwell, Pascal, Volta.
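To illustrate why building for a single detected arch is cheaper than building for all of them, here is a minimal Python sketch of how a CUDA_ARCH_NAME-style setting could expand into nvcc -gencode flags. It is not the CMake logic from this PR: the table ARCH_TO_CAPABILITIES, the function gencode_flags, and the particular compute capabilities listed per family are assumptions chosen for illustration.

```python
# Hypothetical table: architecture family -> a few of its compute capabilities.
# This is an illustrative subset, not the list used by the PR's CMake code.
ARCH_TO_CAPABILITIES = {
    "Kepler": ["30", "35"],
    "Maxwell": ["50", "52"],
    "Pascal": ["60", "61"],
    "Volta": ["70"],
}


def gencode_flags(arch_name, detected=None):
    """Return nvcc -gencode flags for a CUDA_ARCH_NAME-style value.

    `detected` stands in for the compute capability reported by the
    local GPU (for example "3.5" on a Tesla K40m); it is only used
    when arch_name is "Auto".
    """
    if arch_name == "Auto":
        if detected is None:
            raise ValueError("Auto needs a detected compute capability, e.g. '3.5'")
        caps = [detected.replace(".", "")]
    elif arch_name == "All":
        caps = [c for group in ARCH_TO_CAPABILITIES.values() for c in group]
    else:
        caps = ARCH_TO_CAPABILITIES[arch_name]
    return [f"-gencode arch=compute_{c},code=sm_{c}" for c in caps]


# One flag for the detected arch versus one flag per listed capability.
print(gencode_flags("Auto", detected="3.5"))
print(len(gencode_flags("All")))
```

With the detected capability 3.5 the sketch emits a single sm_35 flag, whereas "All" emits one flag per capability in the table, so every CUDA source file is compiled several times over.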
Speed:
- TeamCity:
  - The GPU compiling time: about 14min ~ 16min -> about 9min in TeamCity
- local machine: env: centos, cuda 7.5, make -j8, WITH_GPU=ON
  - raw: time: 31m24.320s
  - https://github.com/PaddlePaddle/Paddle/pull/5573 : 26m43.523s
  - This PR: 15m0.158s
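The local-machine timings above can be compared with a short script. This is only a convenience sketch: the helper to_seconds and the labels in timings are names introduced here, and the input strings are copied from the measurements listed above.

```python
import re


def to_seconds(text):
    """Convert a `time`-style string such as '31m24.320s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Durations copied from the local-machine measurements.
timings = {
    "raw": "31m24.320s",
    "PR 5573": "26m43.523s",
    "this PR": "15m0.158s",
}

baseline = to_seconds(timings["raw"])
for label, text in timings.items():
    seconds = to_seconds(text)
    # Speedup is expressed relative to the raw build time.
    print(f"{label}: {seconds:.1f}s, {baseline / seconds:.2f}x vs raw")
```

Relative to the raw build, this PR comes out at roughly a 2.1x speedup, or about half the original compile time.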
Compile time interval:
[14:49:56]W: [Step 1/1] + nvidia-docker run -i ... WITH_GPU=ON
[14:59:19] : [Step 1/1] Running unit tests ...
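As a quick check of the "about 9min" figure, the interval between the two log timestamps above can be computed directly. The variable names are introduced here for illustration, and the sketch assumes both timestamps fall on the same day.

```python
from datetime import datetime

# Timestamps copied from the two TeamCity log lines.
start = datetime.strptime("14:49:56", "%H:%M:%S")
end = datetime.strptime("14:59:19", "%H:%M:%S")

elapsed = end - start
minutes, seconds = divmod(int(elapsed.total_seconds()), 60)
print(f"{minutes}m{seconds}s")  # prints "9m23s"
```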