Unverified · Commit 71ecf7e6 · Author: eltonzheng · Committer: GitHub

Add Windows support in README, use c++17 on Windows to support latest VC & cuda build tool (#1151)

* Add Windows support in README, use c++17 on Windows to support latest vc build tool

* Add detailed cpp build tools version in README
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Parent 8e48756c
@@ -96,6 +96,12 @@ If you would like to pre-install any of the DeepSpeed extensions/ops (instead
 of JIT compiling) or install pre-compiled ops via PyPI please see our [advanced
 installation instructions](https://www.deepspeed.ai/tutorials/advanced-install/).
+On Windows you can build wheel with following steps, currently only inference mode is supported.
+1. Install pytorch, such as pytorch 1.8 + cuda 11.1
+2. Install visual cpp build tools, such as VS2019 C++ x64/x86 build tools
+3. Launch cmd console with Administrator privilege for creating required symlink folders
+4. Run `python setup.py bdist_wheel` to build wheel in `dist` folder
 # Features
 Below we provide a brief feature list, see our detailed [feature
 overview](https://www.deepspeed.ai/features/) for descriptions and usage.
......
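The Windows build steps added to the README in the hunk above could be scripted roughly as follows. This is a minimal sketch and not part of the commit: the helper names `can_create_symlinks` and `wheel_build_command` are hypothetical, and probing for symlink permission is only an assumed way to check in advance whether the console has the privilege that step 3 asks for.

```python
import os
import sys
import tempfile


def can_create_symlinks() -> bool:
    """Return True if this process is allowed to create a directory symlink.

    Hypothetical helper: it probes inside a throwaway temp directory, so it
    never touches the source tree.
    """
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target")
        link = os.path.join(tmp, "link")
        os.mkdir(target)
        try:
            os.symlink(target, link, target_is_directory=True)
        except (OSError, NotImplementedError):
            # On Windows this typically means the console is not elevated.
            return False
        return os.path.islink(link)


def wheel_build_command(python_exe: str = sys.executable) -> list:
    """Return the build command from step 4 as an argument list."""
    return [python_exe, "setup.py", "bdist_wheel"]


if __name__ == "__main__":
    if can_create_symlinks():
        print("Run from the repository root:", " ".join(wheel_build_command()))
    else:
        print("Symlink creation failed; on Windows, relaunch cmd as Administrator.")
```

Passing the returned list to `subprocess.run(..., check=True)` from the repository root would perform the actual build; the resulting wheel is expected in the `dist` folder, as step 4 states.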
@@ -364,6 +364,18 @@ class CUDAOpBuilder(OpBuilder):
         else:
             return ['-O3', '-std=c++14', '-g', '-Wno-reorder']
+    def nvcc_args(self):
+        args = [
+            '-O3',
+            '--use_fast_math',
+            '-std=c++17' if sys.platform == "win32" else '-std=c++14',
+            '-U__CUDA_NO_HALF_OPERATORS__',
+            '-U__CUDA_NO_HALF_CONVERSIONS__',
+            '-U__CUDA_NO_HALF2_OPERATORS__'
+        ]
+        return args + self.compute_capability_args()
     def libraries_args(self):
         if sys.platform == "win32":
             return ['cublas', 'curand']
......
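The hunk above adds `nvcc_args` to the base class and selects the C++ standard by platform, so the per-op builders in the hunks that follow can drop their own copies. A minimal sketch of that pattern, assuming simplified stand-in classes: the `Sketch...` names and the `platform` parameter are hypothetical and exist only to make the example testable, and `compute_capability_args` is left out.

```python
import sys


class SketchCUDAOpBuilder:
    """Simplified stand-in for the base builder; not the DeepSpeed class."""

    def nvcc_args(self, platform: str = sys.platform) -> list:
        # Same flag list as in the diff; only the -std flag depends on the platform.
        return [
            '-O3',
            '--use_fast_math',
            '-std=c++17' if platform == "win32" else '-std=c++14',
            '-U__CUDA_NO_HALF_OPERATORS__',
            '-U__CUDA_NO_HALF_CONVERSIONS__',
            '-U__CUDA_NO_HALF2_OPERATORS__',
        ]


class SketchInferenceBuilder(SketchCUDAOpBuilder):
    """Subclass without an override: it inherits the shared nvcc_args."""


builder = SketchInferenceBuilder()
print(builder.nvcc_args(platform="win32")[2])
print(builder.nvcc_args(platform="linux")[2])
```

The two calls print `-std=c++17` and `-std=c++14` respectively. Keeping the platform check in one base method means a later change to the standard only has to be made in one place.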
@@ -62,15 +62,3 @@ class CPUAdamBuilder(CUDAOpBuilder):
             '-fopenmp',
             SIMD_WIDTH
         ]
-    def nvcc_args(self):
-        args = [
-            '-O3',
-            '--use_fast_math',
-            '-std=c++14',
-            '-U__CUDA_NO_HALF_OPERATORS__',
-            '-U__CUDA_NO_HALF_CONVERSIONS__',
-            '-U__CUDA_NO_HALF2_OPERATORS__'
-        ]
-        args += self.compute_capability_args()
-        return args
@@ -21,15 +21,3 @@ class QuantizerBuilder(CUDAOpBuilder):
     def include_paths(self):
         return ['csrc/includes']
-    def nvcc_args(self):
-        args = [
-            '-O3',
-            '--use_fast_math',
-            '-std=c++14',
-            '-U__CUDA_NO_HALF_OPERATORS__',
-            '-U__CUDA_NO_HALF_CONVERSIONS__',
-            '-U__CUDA_NO_HALF2_OPERATORS__'
-        ]
-        return args + self.compute_capability_args()
@@ -30,15 +30,3 @@ class TransformerBuilder(CUDAOpBuilder):
     def include_paths(self):
         return ['csrc/includes']
-    def nvcc_args(self):
-        args = [
-            '-O3',
-            '--use_fast_math',
-            '-std=c++14',
-            '-U__CUDA_NO_HALF_OPERATORS__',
-            '-U__CUDA_NO_HALF_CONVERSIONS__',
-            '-U__CUDA_NO_HALF2_OPERATORS__'
-        ]
-        return args + self.compute_capability_args()
@@ -24,15 +24,3 @@ class InferenceBuilder(CUDAOpBuilder):
     def include_paths(self):
         return ['csrc/transformer/inference/includes']
-    def nvcc_args(self):
-        args = [
-            '-O3',
-            '--use_fast_math',
-            '-std=c++14',
-            '-U__CUDA_NO_HALF_OPERATORS__',
-            '-U__CUDA_NO_HALF_CONVERSIONS__',
-            '-U__CUDA_NO_HALF2_OPERATORS__',
-        ]
-        return args + self.compute_capability_args()