OPT ccache support CUDA (!24337) · Merge Request · PaddlePaddle / Paddle

Created by: T8T9

Motivation: The build stage of CI is too slow because ccache does not currently cache CUDA object files. The details:

Paddle currently uses find_package(CUDA) to bring in CUDA, which does not work with ccache. Old CMake versions have no native CUDA support, so the ccache prefix can only be added to the gcc and g++ commands. find_package(CUDA) loads the FindCUDA module, which generates a .cu.o.Release.cmake script for every .cu file and then runs cmake -P <cmake script> to compile that file. CMake does not add the ccache prefix to this cmake command, so nvcc is never invoked through ccache. To confirm this, check paddle/fluid/operators/math/CMakeFiles/fc.dir/build.make and paddle/fluid/operators/math/CMakeFiles/fc.dir/fc_generated_fc.cu.o.Release.cmake in your build directory.

CMake 3.10 supports CUDA natively: it compiles .cu files by calling nvcc directly, just as it calls gcc and g++. So, to cache CUDA object files with ccache, we need to use CMake's built-in CUDA support.
CMake passes -x cu to nvcc so that .cu files are explicitly compiled as CUDA. ccache versions older than 3.7.9 do not recognize this option; the bug is fixed in ccache 3.7.9.
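
The version requirement above could be checked in a CI script along these lines. This is a minimal sketch, not part of the PR: the function names are hypothetical, and it assumes the version text looks like "ccache version 3.7.9".

```python
import re

# Minimum ccache release that, per the description above, understands "-x cu".
MIN_CCACHE = (3, 7, 9)


def parse_ccache_version(version_output):
    """Extract (major, minor, patch) from text such as 'ccache version 3.7.9'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        raise ValueError("no version number found")
    major, minor, patch = match.groups()
    # A missing patch component (e.g. "3.6") is treated as 0.
    return (int(major), int(minor), int(patch or 0))


def supports_cuda_caching(version_output):
    """Return True if the reported ccache version is at least MIN_CCACHE."""
    return parse_ccache_version(version_output) >= MIN_CCACHE
```

In practice the input string would likely come from running ccache --version, for example through subprocess.run(["ccache", "--version"], capture_output=True, text=True).stdout.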

Paddle CI currently uses ccache 3.6, so ccache must be upgraded to 3.7.9 to cache CUDA object files.
Paddle passes compiler options (-Wno-unused-function, -Werror, etc.) to gcc/g++ by adding -Xcompiler -Wno-unused-function -Xcompiler -Werror to the nvcc flags. Note the space between -Xcompiler and -Werror: ccache may treat them as two separate flags. The problem is that -Werror is a built-in option of both g++ and nvcc, and because it is a compiler option, ccache removes it when preprocessing source files. In this case ccache turns -Xcompiler -Werror into a bare -Xcompiler, which the preprocessor cannot recognize, so preprocessing fails and ccache fails with it.

To fix this, we should use -Xcompiler=-Werror to tell ccache that the two flags are bound together and must be processed as one.
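
The effect of binding the flags can be illustrated with a small sketch. Both helpers are hypothetical: bind_xcompiler_flags only shows the rewrite, and strip_werror is a toy model of the stripping described above, not ccache's real argument handling.

```python
def bind_xcompiler_flags(nvcc_flags):
    """Rewrite each '-Xcompiler', '<flag>' pair as one '-Xcompiler=<flag>' token."""
    bound = []
    i = 0
    while i < len(nvcc_flags):
        flag = nvcc_flags[i]
        if flag == "-Xcompiler" and i + 1 < len(nvcc_flags):
            bound.append("-Xcompiler=" + nvcc_flags[i + 1])
            i += 2  # consume both the prefix and the forwarded flag
        else:
            bound.append(flag)
            i += 1
    return bound


def strip_werror(flags):
    """Toy model: drop stand-alone '-Werror' tokens, as ccache is said to do."""
    return [f for f in flags if f != "-Werror"]


separate = ["-Xcompiler", "-Wno-unused-function", "-Xcompiler", "-Werror"]

# Space-separated form: the trailing '-Xcompiler' is left dangling.
broken = strip_werror(separate)

# Bound form: '-Xcompiler=-Werror' is a single token and survives intact.
fixed = strip_werror(bind_xcompiler_flags(separate))
```

Here broken ends with a bare -Xcompiler that has nothing to forward, while fixed keeps both options attached to their prefix.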

Changes:

Use enable_language(CUDA), CMake's built-in CUDA support, which works with ccache.
Upgrade ccache to 3.7.9.
Set ccache max_size to 20GB. With CUDA supported, ccache needs about 7GB of disk space to cache the object files of one clean build, but the default max_size is 5GB. That triggers frequent automatic cleanup, so cached entries are evicted over and over and the cache miss rate rises significantly. max_size therefore has to exceed 7GB. Moreover, 7GB covers only a single clean build, while Paddle CI builds many times for different PRs, so max_size should be well above 7GB to keep more object files and maintain a high cache hit rate. I think 20GB is enough.
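
Why a 5GB limit hurts so much with a 7GB build can be seen in a toy simulation. This sketch assumes strict least-recently-used eviction, objects of exactly 1GB each, and identical rebuilds; ccache's real cleanup and real CI builds are less regular, so the numbers are illustrative only.

```python
from collections import OrderedDict


def hit_rate(cache_gb, build_gb, rebuilds):
    """Hit rate over `rebuilds` (>= 1) warm builds of `build_gb` 1GB objects."""
    cache = OrderedDict()
    hits = lookups = 0
    for build in range(rebuilds + 1):  # build 0 is the cold build that fills the cache
        for obj in range(build_gb):
            warm = build > 0
            if warm:
                lookups += 1
            if obj in cache:
                cache.move_to_end(obj)  # mark as most recently used
                if warm:
                    hits += 1
            else:
                cache[obj] = True
                if len(cache) > cache_gb:
                    cache.popitem(last=False)  # evict the least recently used object
    return hits / lookups


small = hit_rate(cache_gb=5, build_gb=7, rebuilds=3)
large = hit_rate(cache_gb=20, build_gb=7, rebuilds=3)
```

In this model the 5GB cache never hits, because each object is evicted just before it is needed again, while the 20GB cache hits on every lookup after the cold build.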

Expected: The build stage should take about 11 minutes, down from about 55 minutes.
