# torch.utils.bottleneck
> Translator: [belonHan](https://github.com/belonHan)
`torch.utils.bottleneck` is a tool that can be used as an initial step for debugging bottlenecks in your program. It summarizes runs of your script with the Python profiler and PyTorch’s autograd profiler.
Run it on the command line with
```py
python -m torch.utils.bottleneck /path/to/source/script.py [args]
```
where [args] are any number of arguments to `script.py`, or run `python -m torch.utils.bottleneck -h` for more usage instructions.
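For concreteness, a minimal, hypothetical target script (its name and contents are illustrative only, not part of the tool):
```py
# script.py -- a tiny, hypothetical workload to profile with bottleneck
import torch

def main():
    x = torch.randn(256, 256)
    for _ in range(200):
        x = torch.relu(x @ x.t())  # repeated matmul + activation
        x = x / x.norm()           # renormalize so values stay finite
    print(x.sum().item())

if __name__ == '__main__':
    main()
```
Running `python -m torch.utils.bottleneck script.py` prints environment information followed by the cProfile and autograd profiler summaries.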
Warning
Because your script will be profiled, please ensure that it exits in a finite amount of time.
Warning
Due to the asynchronous nature of CUDA kernels, when running against CUDA code, the cProfile output and CPU-mode autograd profilers may not show correct timings: the reported CPU time reports the amount of time used to launch the kernels but does not include the time the kernel spent executing on a GPU unless the operation does a synchronize. Ops that do synchronize appear to be extremely expensive under regular CPU-mode profilers. In these case where timings are incorrect, the CUDA-mode autograd profiler may be helpful.
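To see why CPU-side timing of asynchronous kernels is misleading, compare the time measured right after a kernel launch with the time after an explicit synchronize; a minimal sketch, assuming a CUDA-capable machine:
```py
import time
import torch

x = torch.randn(4096, 4096, device='cuda')
torch.cuda.synchronize()      # make sure no earlier work is still pending

t0 = time.time()
y = x @ x                     # the kernel is *launched* here, asynchronously
t_launch = time.time() - t0   # usually tiny: measures only the launch

torch.cuda.synchronize()      # block until the matmul actually finishes
t_total = time.time() - t0    # now includes the GPU execution time

print('launch: %.6fs, launch+execute: %.6fs' % (t_launch, t_total))
```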
Note
To decide which (CPU-only-mode or CUDA-mode) autograd profiler output to look at, you should first check if your script is CPU-bound (“CPU total time is much greater than CUDA total time”). If it is CPU-bound, looking at the results of the CPU-mode autograd profiler will help. If on the other hand your script spends most of its time executing on the GPU, then it makes sense to start looking for responsible CUDA operators in the output of the CUDA-mode autograd profiler.
Of course the reality is much more complicated and your script might not be in one of those two extremes depending on the part of the model you’re evaluating. If the profiler outputs don’t help, you could try looking at the result of [`torch.autograd.profiler.emit_nvtx()`](autograd.html#torch.autograd.profiler.emit_nvtx "torch.autograd.profiler.emit_nvtx") with `nvprof`. However, please take into account that the NVTX overhead is very high and often gives a heavily skewed timeline.
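A sketch of the `emit_nvtx()` pattern referenced above (run the script under `nvprof` to capture the ranges; `model` and `x` are placeholders for your own module and input):
```py
import torch

model = torch.nn.Linear(128, 128).cuda()       # placeholder module
x = torch.randn(32, 128, device='cuda')        # placeholder input

model(x)                                       # warm-up, outside profiling
with torch.cuda.profiler.profile():
    with torch.autograd.profiler.emit_nvtx():  # annotate ops with NVTX ranges
        model(x)
```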
Warning
If you are profiling CUDA code, the first profiler that `bottleneck` runs (cProfile) will include the CUDA startup time (CUDA buffer allocation cost) in its time reporting. This should not matter if your bottlenecks result in code much slower than the CUDA startup time.
For more complicated uses of the profilers (like in a multi-GPU case), please see [https://docs.python.org/3/library/profile.html](https://docs.python.org/3/library/profile.html) or [`torch.autograd.profiler.profile()`](autograd.html#torch.autograd.profiler.profile "torch.autograd.profiler.profile") for more information.
# torch.utils.checkpoint
> Translator: [belonHan](https://github.com/belonHan)
Note
Checkpointing is implemented by rerunning a forward-pass segment for each checkpointed segment during backward. This can cause persistent states like the RNG state to be more advanced than they would be without checkpointing. By default, checkpointing includes logic to juggle the RNG state such that checkpointed passes making use of RNG (through dropout for example) have deterministic output as compared to non-checkpointed passes. The logic to stash and restore RNG states can incur a moderate performance hit depending on the runtime of checkpointed operations. If deterministic output compared to non-checkpointed passes is not required, set the global flag `torch.utils.checkpoint.preserve_rng_state=False` to omit stashing and restoring the RNG state during each checkpoint.
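A sketch of toggling the flag described above (the flag name follows this document; it trades reproducibility of RNG-dependent ops such as dropout for a small speedup):
```py
import torch.utils.checkpoint

# Skip stashing/restoring the RNG state on every checkpoint. Dropout inside
# checkpointed segments may then differ from a non-checkpointed run.
torch.utils.checkpoint.preserve_rng_state = False
```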
```py
torch.utils.checkpoint.checkpoint(function, *args)
```
Checkpoint a model or part of the model
Checkpointing works by trading compute for memory. Rather than storing all intermediate activations of the entire computation graph for computing backward, the checkpointed part does **not** save intermediate activations, and instead recomputes them in backward pass. It can be applied on any part of a model.
Specifically, in the forward pass, `function` will run in `torch.no_grad()` manner, i.e., not storing the intermediate activations. Instead, the forward pass saves the inputs tuple and the `function` parameter. In the backwards pass, the saved inputs and `function` are retrieved, and the forward pass is computed on `function` again, now tracking the intermediate activations, and then the gradients are calculated using these activation values.
Warning
Checkpointing doesn’t work with [`torch.autograd.grad()`](autograd.html#torch.autograd.grad "torch.autograd.grad"), but only with [`torch.autograd.backward()`](autograd.html#torch.autograd.backward "torch.autograd.backward").
Warning
If `function` invocation during backward does anything different than the one during forward, e.g., due to some global variable, the checkpointed version won’t be equivalent, and unfortunately it can’t be detected.
Parameters:
* **function** – describes what to run in the forward pass of the model or part of the model. It should also know how to handle the inputs passed as the tuple. For example, in LSTM, if user passes `(activation, hidden)`, `function` should correctly use the first input as `activation` and the second input as `hidden`
* **args** – tuple containing inputs to the `function`
| Returns: | Output of running `function` on `*args` |
| --- | --- |
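A minimal usage sketch (module and shapes are illustrative; the input carries `requires_grad` so the recomputed forward participates in backward):
```py
>>> import torch
>>> from torch.utils.checkpoint import checkpoint
>>> run_fn = torch.nn.Sequential(torch.nn.Linear(10, 10), torch.nn.ReLU())
>>> x = torch.randn(2, 10, requires_grad=True)
>>> out = checkpoint(run_fn, x)   # forward runs under no_grad; inputs are saved
>>> out.sum().backward()          # forward re-runs here to rebuild activations
```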
```py
torch.utils.checkpoint.checkpoint_sequential(functions, segments, *inputs)
```
A helper function for checkpointing sequential models.
Sequential models execute a list of modules/functions in order (sequentially). Therefore, we can divide such a model in various segments and checkpoint each segment. All segments except the last will run in `torch.no_grad()` manner, i.e., not storing the intermediate activations. The inputs of each checkpointed segment will be saved for re-running the segment in the backward pass.
See [`checkpoint()`](#torch.utils.checkpoint.checkpoint "torch.utils.checkpoint.checkpoint") on how checkpointing works.
Warning
Checkpointing doesn’t work with [`torch.autograd.grad()`](autograd.html#torch.autograd.grad "torch.autograd.grad"), but only with [`torch.autograd.backward()`](autograd.html#torch.autograd.backward "torch.autograd.backward").
Parameters:
* **functions** – A [`torch.nn.Sequential`](nn.html#torch.nn.Sequential "torch.nn.Sequential") or the list of modules or functions (comprising the model) to run sequentially.
* **segments** – Number of chunks to create in the model
* **inputs** – tuple of Tensors that are inputs to `functions`
| Returns: | Output of running `functions` sequentially on `*inputs` |
| --- | --- |
Example
```py
>>> model = nn.Sequential(...)
>>> input_var = checkpoint_sequential(model, chunks, input_var)
```
# torch.utils.cpp_extension
> Translator: [belonHan](https://github.com/belonHan)
```py
torch.utils.cpp_extension.CppExtension(name, sources, *args, **kwargs)
```
Creates a `setuptools.Extension` for C++.
Convenience method that creates a `setuptools.Extension` with the bare minimum (but often sufficient) arguments to build a C++ extension.
All arguments are forwarded to the `setuptools.Extension` constructor.
Example
```py
>>> from setuptools import setup
>>> from torch.utils.cpp_extension import BuildExtension, CppExtension
>>> setup(
        name='extension',
        ext_modules=[
            CppExtension(
                name='extension',
                sources=['extension.cpp'],
                extra_compile_args=['-g']),
        ],
        cmdclass={
            'build_ext': BuildExtension
        })
```
```py
torch.utils.cpp_extension.CUDAExtension(name, sources, *args, **kwargs)
```
Creates a `setuptools.Extension` for CUDA/C++.
Convenience method that creates a `setuptools.Extension` with the bare minimum (but often sufficient) arguments to build a CUDA/C++ extension. This includes the CUDA include path, library path and runtime library.
All arguments are forwarded to the `setuptools.Extension` constructor.
Example
```py
>>> from setuptools import setup
>>> from torch.utils.cpp_extension import BuildExtension, CUDAExtension
>>> setup(
        name='cuda_extension',
        ext_modules=[
            CUDAExtension(
                name='cuda_extension',
                sources=['extension.cpp', 'extension_kernel.cu'],
                extra_compile_args={'cxx': ['-g'],
                                    'nvcc': ['-O2']})
        ],
        cmdclass={
            'build_ext': BuildExtension
        })
```
```py
torch.utils.cpp_extension.BuildExtension(*args, **kwargs)
```
A custom `setuptools` build extension.
This `setuptools.build_ext` subclass takes care of passing the minimum required compiler flags (e.g. `-std=c++11`) as well as mixed C++/CUDA compilation (and support for CUDA files in general).
When using [`BuildExtension`](#torch.utils.cpp_extension.BuildExtension "torch.utils.cpp_extension.BuildExtension"), it is allowed to supply a dictionary for `extra_compile_args` (rather than the usual list) that maps from languages (`cxx` or `cuda`) to a list of additional compiler flags to supply to the compiler. This makes it possible to supply different flags to the C++ and CUDA compiler during mixed compilation.
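A sketch of the dictionary form of `extra_compile_args` (extension and file names are placeholders):
```py
>>> from setuptools import setup
>>> from torch.utils.cpp_extension import BuildExtension, CUDAExtension
>>> setup(
        name='mixed_extension',
        ext_modules=[
            CUDAExtension(
                name='mixed_extension',
                sources=['ext.cpp', 'ext_kernel.cu'],
                # dict instead of list: separate flags per compiler
                extra_compile_args={'cxx': ['-O2'],
                                    'nvcc': ['-O2', '--use_fast_math']})
        ],
        cmdclass={'build_ext': BuildExtension})
```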
```py
torch.utils.cpp_extension.load(name, sources, extra_cflags=None, extra_cuda_cflags=None, extra_ldflags=None, extra_include_paths=None, build_directory=None, verbose=False, with_cuda=None, is_python_module=True)
```
Loads a PyTorch C++ extension just-in-time (JIT).
To load an extension, a Ninja build file is emitted, which is used to compile the given sources into a dynamic library. This library is subsequently loaded into the current Python process as a module and returned from this function, ready for use.
By default, the directory to which the build file is emitted and the resulting library compiled to is `<tmp>/torch_extensions/<name>`, where `<tmp>` is the temporary folder on the current platform and `<name>` the name of the extension. This location can be overridden in two ways. First, if the `TORCH_EXTENSIONS_DIR` environment variable is set, it replaces `<tmp>/torch_extensions` and all extensions will be compiled into subfolders of this directory. Second, if the `build_directory` argument to this function is supplied, it overrides the entire path, i.e. the library will be compiled into that folder directly.
To compile the sources, the default system compiler (`c++`) is used, which can be overridden by setting the `CXX` environment variable. To pass additional arguments to the compilation process, `extra_cflags` or `extra_ldflags` can be provided. For example, to compile your extension with optimizations, pass `extra_cflags=['-O3']`. You can also use `extra_cflags` to pass further include directories.
CUDA support with mixed compilation is provided. Simply pass CUDA source files (`.cu` or `.cuh`) along with other sources. Such files will be detected and compiled with nvcc rather than the C++ compiler. This includes passing the CUDA lib64 directory as a library directory, and linking `cudart`. You can pass additional flags to nvcc via `extra_cuda_cflags`, just like with `extra_cflags` for C++. Various heuristics for finding the CUDA install directory are used, which usually work fine. If not, setting the `CUDA_HOME` environment variable is the safest option.
Parameters:
* **name** – The name of the extension to build. This MUST be the same as the name of the pybind11 module!
* **sources** – A list of relative or absolute paths to C++ source files.
* **extra_cflags** – optional list of compiler flags to forward to the build.
* **extra_cuda_cflags** – optional list of compiler flags to forward to nvcc when building CUDA sources.
* **extra_ldflags** – optional list of linker flags to forward to the build.
* **extra_include_paths** – optional list of include directories to forward to the build.
* **build_directory** – optional path to use as build workspace.
* **verbose** – If `True`, turns on verbose logging of load steps.
* **with_cuda** – Determines whether CUDA headers and libraries are added to the build. If set to `None` (default), this value is automatically determined based on the existence of `.cu` or `.cuh` in `sources`. Set it to `True` to force CUDA headers and libraries to be included.
* **is_python_module** – If `True` (default), imports the produced shared library as a Python module. If `False`, loads it into the process as a plain dynamic library.
| Returns: | If `is_python_module` is `True`, returns the loaded PyTorch extension as a Python module. If `is_python_module` is `False` returns nothing (the shared library is loaded into the process as a side effect). |
| --- | --- |
Example
```py
>>> from torch.utils.cpp_extension import load
>>> module = load(
        name='extension',
        sources=['extension.cpp', 'extension_kernel.cu'],
        extra_cflags=['-O2'],
        verbose=True)
```
```py
torch.utils.cpp_extension.load_inline(name, cpp_sources, cuda_sources=None, functions=None, extra_cflags=None, extra_cuda_cflags=None, extra_ldflags=None, extra_include_paths=None, build_directory=None, verbose=False, with_cuda=None, is_python_module=True)
```
Loads a PyTorch C++ extension just-in-time (JIT) from string sources.
This function behaves exactly like [`load()`](#torch.utils.cpp_extension.load "torch.utils.cpp_extension.load"), but takes its sources as strings rather than filenames. These strings are stored to files in the build directory, after which the behavior of [`load_inline()`](#torch.utils.cpp_extension.load_inline "torch.utils.cpp_extension.load_inline") is identical to [`load()`](#torch.utils.cpp_extension.load "torch.utils.cpp_extension.load").
See [the tests](https://github.com/pytorch/pytorch/blob/master/test/test_cpp_extensions.py) for good examples of using this function.
Sources may omit two required parts of a typical non-inline C++ extension: the necessary header includes, as well as the (pybind11) binding code. More precisely, strings passed to `cpp_sources` are first concatenated into a single `.cpp` file. This file is then prepended with `#include <torch/extension.h>`.
Furthermore, if the `functions` argument is supplied, bindings will be automatically generated for each function specified. `functions` can either be a list of function names, or a dictionary mapping from function names to docstrings. If a list is given, the name of each function is used as its docstring.
The sources in `cuda_sources` are concatenated into a separate `.cu` file and prepended with `torch/types.h`, `cuda.h` and `cuda_runtime.h` includes. The `.cpp` and `.cu` files are compiled separately, but ultimately linked into a single library. Note that no bindings are generated for functions in `cuda_sources` per se. To bind to a CUDA kernel, you must create a C++ function that calls it, and either declare or define this C++ function in one of the `cpp_sources` (and include its name in `functions`).
See [`load()`](#torch.utils.cpp_extension.load "torch.utils.cpp_extension.load") for a description of arguments omitted below.
Parameters:
* **cpp_sources** – A string, or list of strings, containing C++ source code.
* **cuda_sources** – A string, or list of strings, containing CUDA source code.
* **functions** – A list of function names for which to generate function bindings. If a dictionary is given, it should map function names to docstrings (which are otherwise just the function names).
* **with_cuda** – Determines whether CUDA headers and libraries are added to the build. If set to `None` (default), this value is automatically determined based on whether `cuda_sources` is provided. Set it to `True` to force CUDA headers and libraries to be included.
Example
```py
>>> from torch.utils.cpp_extension import load_inline
>>> source = '''
at::Tensor sin_add(at::Tensor x, at::Tensor y) {
  return x.sin() + y.sin();
}
'''
>>> module = load_inline(name='inline_extension',
                         cpp_sources=[source],
                         functions=['sin_add'])
```
```py
torch.utils.cpp_extension.include_paths(cuda=False)
```
Get the include paths required to build a C++ or CUDA extension.
| Parameters: | **cuda** – If `True`, includes CUDA-specific include paths. |
| --- | --- |
| Returns: | A list of include path strings. |
| --- | --- |
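For example, to inspect the directories that would be added for a CUDA build:
```py
>>> from torch.utils.cpp_extension import include_paths
>>> for path in include_paths(cuda=True):
...     print(path)
```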
```py
torch.utils.cpp_extension.check_compiler_abi_compatibility(compiler)
```
Verifies that the given compiler is ABI-compatible with PyTorch.
| Parameters: | **compiler** ([_str_](https://docs.python.org/3/library/stdtypes.html#str "(in Python v3.7)")) – The compiler executable name to check (e.g. `g++`). Must be executable in a shell process. |
| --- | --- |
| Returns: | False if the compiler is (likely) ABI-incompatible with PyTorch, else True. |
| --- | --- |
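A quick usage sketch:
```py
>>> from torch.utils.cpp_extension import check_compiler_abi_compatibility
>>> ok = check_compiler_abi_compatibility('g++')  # False if likely incompatible
```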
```py
torch.utils.cpp_extension.verify_ninja_availability()
```
Returns `True` if the [ninja](https://ninja-build.org/) build system is available on the system.