Commit 33d44668, author: Travis CI

Deploy to GitHub Pages: 1a3d4b0d

Parent 83a7b5c6
# Design Doc: The Keys of Operator Kernel Type
## Problem
An operator can have different kernel implementations, and each operator has a map that stores its kernels. Fluid uses `OpKernelType` as the key that identifies a unique kernel. Before an operator runs, a certain kernel must be chosen using an `OpKernelType` key. Currently, `OpKernelType` is defined as follows:
```cpp
struct OpKernelType {
  platform::Place place_;
  proto::DataType data_type_;
};
```
For more details, refer to the [code](https://github.com/PaddlePaddle/Paddle/blob/2d5ec16bc8a09fb8e0f62c89b116b0cd1d333907/paddle/framework/operator.h#L348-L374) on GitHub.
It contains two keys, `Place` and `DataType`, which are hashed into a single key that identifies a certain type of kernel. However, these two keys are not enough; we need a more complete representation of `OpKernelType`.
We often implement a kernel of an operator with some computing library on a certain device (place). Note that computing libraries and devices do not correspond one-to-one: a device can have many computing libraries, and a computing library can support several devices.
For example, the Eigen library supports Nvidia GPUs, AMD GPUs, and CPUs, and the MKLDNN library supports Intel CPUs and Intel FPGAs. Both `Place` and `Library` should therefore be keys of `OpKernelType`.
Different DataTypes, such as fp64/fp32/int8, obviously need different kernels. The data layout of a Tensor also leads to different implementations; see the batch norm operator [kernels](https://github.com/PaddlePaddle/Paddle/blob/a948fac4d0ad7e0412d373b8aabeb711c2899563/paddle/operators/batch_norm_op.cc#L180-L209). Data layout should therefore be taken into consideration as well.
## Solution
Four keys determine the kernel type of an operator: `Place`/`Library`/`DataType`/`Layout`.
```cpp
struct OpKernelType {
  platform::Place place_;
  platform::Library library_;
  proto::DataType data_type_;
  framework::Layout layout_;
};
```
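To illustrate how the four keys could be hashed into a single key for the per-operator kernel map, here is a minimal sketch. It is not Fluid's actual implementation: the enums stand in for `platform::Place`, `platform::Library`, `proto::DataType` and `framework::Layout`, and `OpKernelTypeHash`, `KernelMap`, `RegisterKernel` and `FindKernel` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Stand-in enums (assumption): the real design uses platform::Place,
// platform::Library, proto::DataType and framework::Layout.
enum class Place { kCPU, kCUDA };
enum class Library { kPlain, kMKLDNN, kCUDNN };
enum class DataType { kFP32, kFP64, kINT32, kINT64 };
enum class Layout { kNCHW, kNHWC };

struct OpKernelType {
  Place place_;
  Library library_;
  DataType data_type_;
  Layout layout_;

  bool operator==(const OpKernelType& o) const {
    return place_ == o.place_ && library_ == o.library_ &&
           data_type_ == o.data_type_ && layout_ == o.layout_;
  }
};

// Hypothetical hash: each key gets its own 8-bit field, so two different
// key combinations always produce different packed values.
struct OpKernelTypeHash {
  std::size_t operator()(const OpKernelType& k) const {
    std::size_t packed = (static_cast<std::size_t>(k.place_) << 24) |
                         (static_cast<std::size_t>(k.library_) << 16) |
                         (static_cast<std::size_t>(k.data_type_) << 8) |
                         static_cast<std::size_t>(k.layout_);
    return std::hash<std::size_t>()(packed);
  }
};

// Per-operator kernel table; a kernel is represented by its name only.
using KernelMap =
    std::unordered_map<OpKernelType, std::string, OpKernelTypeHash>;

// Registers a kernel; at most one kernel may exist per key.
inline void RegisterKernel(KernelMap* kernels, const OpKernelType& key,
                           const std::string& name) {
  bool inserted = kernels->emplace(key, name).second;
  assert(inserted && "a kernel is already registered for this key");
  (void)inserted;
}

// Returns the kernel name registered for `key`, or "" if there is none.
inline std::string FindKernel(const KernelMap& kernels,
                              const OpKernelType& key) {
  auto it = kernels.find(key);
  return it == kernels.end() ? std::string() : it->second;
}
```

With this shape, two kernels that share `Place` and `DataType` but differ in `Library` or `Layout` occupy separate map entries, which is what the two-key version cannot express.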
The details of each key are as follows:
### Place
`Place` is defined as follows:
```cpp
typedef boost::variant<CUDAPlace, ROCmPlace, FPGAPlace, CPUPlace> Place;
```
`Place` represents the device memory where the data is located.
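As a rough sketch of how a variant-based `Place` can be inspected, the example below uses `std::variant` (C++17) in place of `boost::variant` so that it has no external dependency. The place structs are simplified stand-ins, and `IsGPUPlace`, `GetCUDADeviceId` and `ToString` are hypothetical helpers, not names taken from the Fluid code base.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Simplified stand-in place types (assumption): only the CUDA place
// carries a device id in this sketch.
struct CPUPlace {};
struct CUDAPlace {
  int device = 0;
};

// std::variant replaces boost::variant here to keep the sketch standalone.
using Place = std::variant<CUDAPlace, CPUPlace>;

// True if the data lives in CUDA device memory.
inline bool IsGPUPlace(const Place& p) {
  return std::holds_alternative<CUDAPlace>(p);
}

// Returns the CUDA device id; the place must be a CUDAPlace.
inline int GetCUDADeviceId(const Place& p) {
  assert(IsGPUPlace(p));
  return std::get<CUDAPlace>(p).device;
}

// Visitor with one overload per alternative of the variant.
struct PlaceName {
  std::string operator()(const CPUPlace&) const { return "CPUPlace"; }
  std::string operator()(const CUDAPlace& p) const {
    return "CUDAPlace(" + std::to_string(p.device) + ")";
  }
};

inline std::string ToString(const Place& p) {
  return std::visit(PlaceName{}, p);
}
```

Because a variant holds exactly one alternative at a time, a visitor must handle every place type, so adding a new place is flagged at compile time wherever a visitor does not cover it.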
### Library
An operator kernel is usually implemented on top of one library. `Library` is defined as an enum:
```cpp
enum Library { Plain, MKLDNN, CUDNN };
```
We use the `Plain` enumerator to represent the default library. Since most operators in Fluid are implemented with the `Eigen` library, `Plain` stands for `Eigen`.
A library usually has a corresponding `DeviceContext`, which contains the handles needed for computation. Fluid currently has two default DeviceContexts, one for CPU and one for CUDA: `CPUDeviceContext` and `CUDADeviceContext`. `CPUDeviceContext` contains an Eigen library handle, and `CUDADeviceContext` contains an Eigen library handle and a cuBLAS handle.
To support a new library, a new enumerator needs to be added to `Library` and a corresponding new `LibraryDeviceContext` needs to be created.
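The pairing of a `Library` enumerator with its own device context might look roughly like the sketch below. Everything here is an assumption made for illustration: `MKLDNNDeviceContext` and `CreateCPUContext` are hypothetical names, the contexts hold no real library handles, and the actual Fluid classes have a different interface.

```cpp
#include <cassert>
#include <memory>
#include <string>

enum class Library { kPlain, kMKLDNN, kCUDNN };

// Minimal stand-in for the DeviceContext base class.
class DeviceContext {
 public:
  virtual ~DeviceContext() = default;
  virtual std::string Name() const = 0;
};

// Default CPU context; in Fluid it would hold an Eigen handle.
class CPUDeviceContext : public DeviceContext {
 public:
  std::string Name() const override { return "CPUDeviceContext"; }
};

// Hypothetical context for the MKLDNN library. It extends the CPU context,
// since MKLDNN kernels run on the CPU; a real one would own library handles.
class MKLDNNDeviceContext : public CPUDeviceContext {
 public:
  std::string Name() const override { return "MKLDNNDeviceContext"; }
};

// Chooses the CPU-side context that matches a library enumerator.
inline std::unique_ptr<DeviceContext> CreateCPUContext(Library library) {
  switch (library) {
    case Library::kPlain:
      return std::make_unique<CPUDeviceContext>();
    case Library::kMKLDNN:
      return std::make_unique<MKLDNNDeviceContext>();
    case Library::kCUDNN:
      break;  // cuDNN has no CPU context in this sketch.
  }
  assert(false && "library has no CPU device context");
  return nullptr;
}
```

Supporting another library under this scheme means adding one enumerator, one context class, and one `case` in the factory.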
### DataType
`DataType` is defined in [framework.proto](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto). Currently, int32/int64/fp32/fp64 are supported.
### Layout
A Tensor is a view of a block of memory. Besides a pointer to the memory, we also need other descriptions of this block, such as shape (ddim), stride, and layout.
Different layouts lead to different implementations of an operator kernel. There are four main principles we have to follow to support layout in the Fluid framework.
- Layout is a data member of Tensor. Layout is an enum. If Fluid is built with MKLDNN, the memory formats of MKLDNN are added to this enum as well.
- Users have to set the layout of input data. Operators that generate data, such as fill_constant/random, also have to set the layout of the data they generate. We can provide a default layout, such as NCHW.
- Layout is inferred at run time, not at compile time.
- Every operator has to implement different kernels for different layouts. Taking MKLDNN as an example: to implement an MKLDNN convolution operator, we have to implement kernels for all the layouts listed [here](http://01org.github.io/mkl-dnn/structmkldnn_1_1memory.html). A special macro will be provided to register kernels for MKLDNN operators.
`Layout` is also defined as an enum:
```cpp
enum Layout {
  kNCHW,
  kNHWC,
#ifdef PADDLE_WITH_MKLDNN
  knChw8c
  ...
#endif
};
```
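To make concrete why layout affects a kernel's implementation, the sketch below computes where element `(n, c, h, w)` lives in a dense buffer under NCHW and under NHWC, and dispatches on the layout at run time. `Dims4`, `Offset` and `ChannelSum` are hypothetical helpers written for this example; they are not part of Fluid, and the MKLDNN blocked formats are left out.

```cpp
#include <cassert>
#include <cstddef>

enum class Layout { kNCHW, kNHWC };

// Extents of a 4-D tensor: batch, channels, height, width.
struct Dims4 {
  std::size_t n, c, h, w;
};

// Flat offset of element (n, c, h, w) in a dense tensor stored in `layout`.
inline std::size_t Offset(Layout layout, const Dims4& d, std::size_t n,
                          std::size_t c, std::size_t h, std::size_t w) {
  assert(n < d.n && c < d.c && h < d.h && w < d.w);
  switch (layout) {
    case Layout::kNCHW:
      return ((n * d.c + c) * d.h + h) * d.w + w;
    case Layout::kNHWC:
      return ((n * d.h + h) * d.w + w) * d.c + c;
  }
  return 0;  // Not reached: every enumerator is handled above.
}

// Sums all values of channel `c`; the layout is only known at run time.
inline float ChannelSum(const float* data, Layout layout, const Dims4& d,
                        std::size_t c) {
  float sum = 0.0f;
  for (std::size_t n = 0; n < d.n; ++n) {
    for (std::size_t h = 0; h < d.h; ++h) {
      for (std::size_t w = 0; w < d.w; ++w) {
        sum += data[Offset(layout, d, n, c, h, w)];
      }
    }
  }
  return sum;
}
```

The same buffer yields different per-channel results depending on the layout it is interpreted with, which is why a kernel has to be selected with the layout as part of its key.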
<!DOCTYPE html>
<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Design Doc: The Keys of Operator Kernel Type &mdash; PaddlePaddle documentation</title>
<link rel="stylesheet" href="../_static/css/theme.css" type="text/css" />
<link rel="index" title="Index"
href="../genindex.html"/>
<link rel="search" title="Search" href="../search.html"/>
<link rel="top" title="PaddlePaddle documentation" href="../index.html"/>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/css/perfect-scrollbar.min.css" type="text/css" />
<link rel="stylesheet" href="../_static/css/override.css" type="text/css" />
<script>
var _hmt = _hmt || [];
(function() {
var hm = document.createElement("script");
hm.src = "//hm.baidu.com/hm.js?b9a314ab40d04d805655aab1deee08ba";
var s = document.getElementsByTagName("script")[0];
s.parentNode.insertBefore(hm, s);
})();
</script>
<script src="../_static/js/modernizr.min.js"></script>
</head>
<body class="wy-body-for-nav" role="document">
<header class="site-header">
<div class="site-logo">
<a href="/"><img src="../_static/images/PP_w.png"></a>
</div>
<div class="site-nav-links">
<div class="site-menu">
<a class="fork-on-github" href="https://github.com/PaddlePaddle/Paddle" target="_blank"><i class="fa fa-github"></i>Fork me on Github</a>
<div class="language-switcher dropdown">
<a type="button" data-toggle="dropdown">
<span>English</span>
<i class="fa fa-angle-up"></i>
<i class="fa fa-angle-down"></i>
</a>
<ul class="dropdown-menu">
<li><a href="/doc_cn">中文</a></li>
<li><a href="/doc">English</a></li>
</ul>
</div>
<ul class="site-page-links">
<li><a href="/">Home</a></li>
</ul>
</div>
<div class="doc-module">
<ul>
<li class="toctree-l1"><a class="reference internal" href="../getstarted/index_cn.html">GET STARTED</a></li>
<li class="toctree-l1"><a class="reference internal" href="../howto/index_cn.html">HOW TO</a></li>
<li class="toctree-l1"><a class="reference internal" href="../api/index_cn.html">API</a></li>
<li class="toctree-l1"><a class="reference internal" href="../faq/index_cn.html">FAQ</a></li>
<li class="toctree-l1"><a class="reference internal" href="../mobile/index_cn.html">MOBILE</a></li>
</ul>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div>
</div>
</header>
<div class="main-content-wrap">
<nav class="doc-menu-vertical" role="navigation">
<ul>
<li class="toctree-l1"><a class="reference internal" href="../getstarted/index_cn.html">Getting Started</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../getstarted/build_and_install/index_cn.html">Installation and Build</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../getstarted/build_and_install/pip_install_cn.html">Install with pip</a></li>
<li class="toctree-l3"><a class="reference internal" href="../getstarted/build_and_install/docker_install_cn.html">Install and Run with Docker</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/dev/build_cn.html">Build and Test PaddlePaddle with Docker</a></li>
<li class="toctree-l3"><a class="reference internal" href="../getstarted/build_and_install/build_from_source_cn.html">Build from Source</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../getstarted/concepts/use_concepts_cn.html">Basic Concepts</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../howto/index_cn.html">Advanced Guide</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../howto/usage/cmd_parameter/index_cn.html">Command-Line Arguments</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cmd_parameter/use_case_cn.html">Use Cases</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cmd_parameter/arguments_cn.html">Argument Overview</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cmd_parameter/detail_introduction_cn.html">Detailed Descriptions</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../howto/usage/cluster/cluster_train_cn.html">Distributed Training</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cluster/fabric_cn.html">Fabric Cluster</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cluster/openmpi_cn.html">OpenMPI Cluster</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cluster/k8s_cn.html">Kubernetes on a Single Node</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cluster/k8s_distributed_cn.html">Distributed Kubernetes</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cluster/k8s_aws_cn.html">Kubernetes Cluster Training on AWS</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../howto/dev/contribute_to_paddle_cn.html">How to Contribute Code</a></li>
<li class="toctree-l2"><a class="reference internal" href="../howto/dev/write_docs_cn.html">How to Contribute to or Edit the Documentation</a></li>
<li class="toctree-l2"><a class="reference internal" href="../howto/deep_model/rnn/index_cn.html">RNN Models</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../howto/deep_model/rnn/rnn_config_cn.html">RNN Configuration</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/deep_model/rnn/recurrent_group_cn.html">Recurrent Group Tutorial</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/deep_model/rnn/hierarchical_layer_cn.html">Layers Supporting Nested Sequences as Input</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/deep_model/rnn/hrnn_rnn_api_compare_cn.html">API Comparison of Single-Level and Nested RNNs</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../howto/optimization/gpu_profiling_cn.html">GPU Profiling and Tuning</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../api/index_cn.html">API</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../api/v2/model_configs.html">Model Configuration</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/activation.html">Activation</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/layer.html">Layers</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/evaluators.html">Evaluators</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/optimizer.html">Optimizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/pooling.html">Pooling</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/networks.html">Networks</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/attr.html">Parameter Attribute</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../api/v2/data.html">Data Access</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/data/data_reader.html">Data Reader Interface</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/data/image.html">Image Interface</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/data/dataset.html">Dataset</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../api/v2/run_logic.html">Training and Inference</a></li>
<li class="toctree-l2"><a class="reference internal" href="../api/v2/fluid.html">Fluid</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/layers.html">Layers</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/data_feeder.html">DataFeeder</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/executor.html">Executor</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/initializer.html">Initializer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/evaluator.html">Evaluator</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/nets.html">Nets</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/optimizer.html">Optimizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/param_attr.html">ParamAttr</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/profiler.html">Profiler</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/regularizer.html">Regularizer</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../faq/index_cn.html">FAQ</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../faq/build_and_install/index_cn.html">Build, Installation, and Unit Tests</a></li>
<li class="toctree-l2"><a class="reference internal" href="../faq/model/index_cn.html">Model Configuration</a></li>
<li class="toctree-l2"><a class="reference internal" href="../faq/parameter/index_cn.html">Parameter Settings</a></li>
<li class="toctree-l2"><a class="reference internal" href="../faq/local/index_cn.html">Local Training and Prediction</a></li>
<li class="toctree-l2"><a class="reference internal" href="../faq/cluster/index_cn.html">Cluster Training and Prediction</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../mobile/index_cn.html">MOBILE</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../mobile/cross_compiling_for_android_cn.html">Build Guide for Android</a></li>
<li class="toctree-l2"><a class="reference internal" href="../mobile/cross_compiling_for_ios_cn.html">Build Guide for iOS</a></li>
<li class="toctree-l2"><a class="reference internal" href="../mobile/cross_compiling_for_raspberry_cn.html">Build Guide for Raspberry Pi</a></li>
</ul>
</li>
</ul>
</nav>
<section class="doc-content-wrap">
<div class="wy-nav-content" id="doc-content">
<div class="rst-content">
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<div class="section" id="design-doc-the-keys-of-operator-kernel-type">
<span id="design-doc-the-keys-of-operator-kernel-type"></span><h1>Design Doc: The Keys of Operator Kernel Type<a class="headerlink" href="#design-doc-the-keys-of-operator-kernel-type" title="Permalink to this headline"></a></h1>
<div class="section" id="problem">
<span id="problem"></span><h2>Problem<a class="headerlink" href="#problem" title="Permalink to this headline"></a></h2>
<p>An operator can have different kernel implementations, and each operator has a map that stores its kernels. Fluid uses <code class="docutils literal"><span class="pre">OpKernelType</span></code> as the key that identifies a unique kernel. Before an operator runs, a specific kernel must be chosen by its <code class="docutils literal"><span class="pre">OpKernelType</span></code> key. Currently, <code class="docutils literal"><span class="pre">OpKernelType</span></code> is defined as follows:</p>
<div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="k">struct</span> <span class="n">OpKernelType</span> <span class="p">{</span>
<span class="n">platform</span><span class="o">::</span><span class="n">Place</span> <span class="n">place_</span><span class="p">;</span>
<span class="n">proto</span><span class="o">::</span><span class="n">DataType</span> <span class="n">data_type_</span><span class="p">;</span>
<span class="p">};</span>
</pre></div>
</div>
<p>For more details, please refer to the <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/2d5ec16bc8a09fb8e0f62c89b116b0cd1d333907/paddle/framework/operator.h#L348-L374">code</a> on GitHub.</p>
<p>It contains two keys, <code class="docutils literal"><span class="pre">Place</span></code> and <code class="docutils literal"><span class="pre">DataType</span></code>. These two keys are hashed into a single key that represents a certain type of kernel. However, these two keys are not enough. We need a more complete representation of <code class="docutils literal"><span class="pre">OpKernelType</span></code>.</p>
<p>We often implement a kernel of an operator with some computing library on a certain device (place). Note that computing libraries and devices do not correspond one-to-one. A device can have many computing libraries, and a computing library can also support several devices.</p>
<p>For example, the Eigen library supports Nvidia GPUs, AMD GPUs, and CPUs, and the MKLDNN library supports Intel CPUs and Intel FPGAs. Both <code class="docutils literal"><span class="pre">Place</span></code> and <code class="docutils literal"><span class="pre">Library</span></code> should therefore be keys of <code class="docutils literal"><span class="pre">OpKernelType</span></code>.</p>
<p>It&#8217;s obvious that different DataTypes, such as fp64/fp32/int8, will have different kernels. But the data layout of a Tensor also leads to different implementations. Please refer to the batch norm operator <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/a948fac4d0ad7e0412d373b8aabeb711c2899563/paddle/operators/batch_norm_op.cc#L180-L209">kernels</a>. Data layout should therefore also be taken into consideration.</p>
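<p>To make the gap concrete, here is a minimal sketch, not the actual Fluid code, of a kernel map keyed only by place and data type. The enums are simplified stand-ins for <code class="docutils literal"><span class="pre">platform::Place</span></code> and <code class="docutils literal"><span class="pre">proto::DataType</span></code>, and the names <code class="docutils literal"><span class="pre">TwoFieldKey</span></code>, <code class="docutils literal"><span class="pre">KernelMap</span></code>, <code class="docutils literal"><span class="pre">RegisterConvKernels</span></code> and the kernel names are hypothetical. Two kernels of the same operator that differ only in their computing library end up with the same key, so one of them is lost:</p>
<div class="highlight-cpp"><div class="highlight"><pre>#include &lt;cassert&gt;
#include &lt;map&gt;
#include &lt;string&gt;
#include &lt;utility&gt;

// Simplified stand-ins for platform::Place and proto::DataType (assumptions).
enum class Place { kCPU, kCUDA };
enum class DataType { kFP32, kFP64 };

// The two-field key: place and data type only.
using TwoFieldKey = std::pair&lt;Place, DataType&gt;;
using KernelMap = std::map&lt;TwoFieldKey, std::string&gt;;

// Registers a plain (Eigen-based) kernel and a cuDNN kernel for one operator.
KernelMap RegisterConvKernels() {
  KernelMap kernels;
  const TwoFieldKey key{Place::kCUDA, DataType::kFP32};
  kernels[key] = "conv2d_plain";
  kernels[key] = "conv2d_cudnn";  // same key: overwrites the plain kernel
  return kernels;
}

int main() {
  const KernelMap kernels = RegisterConvKernels();
  assert(kernels.size() == 1);  // only one of the two kernels survives
  return 0;
}
</pre></div>
</div>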
</div>
<div class="section" id="solution">
<span id="solution"></span><h2>Solution<a class="headerlink" href="#solution" title="Permalink to this headline"></a></h2>
<p>There are four keys to determine a kernel type of an operator: <code class="docutils literal"><span class="pre">Place</span></code>/<code class="docutils literal"><span class="pre">Library</span></code>/<code class="docutils literal"><span class="pre">DataType</span></code>/<code class="docutils literal"><span class="pre">Layout</span></code>.</p>
<div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="k">struct</span> <span class="n">OpKernelType</span> <span class="p">{</span>
<span class="n">platform</span><span class="o">::</span><span class="n">Place</span> <span class="n">place_</span><span class="p">;</span>
<span class="n">platform</span><span class="o">::</span><span class="n">Library</span> <span class="n">library_</span><span class="p">;</span>
<span class="n">proto</span><span class="o">::</span><span class="n">DataType</span> <span class="n">data_type_</span><span class="p">;</span>
<span class="n">framework</span><span class="o">::</span><span class="n">Layout</span> <span class="n">layout_</span><span class="p">;</span>
<span class="p">};</span>
</pre></div>
</div>
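<p>As a rough sketch of how such a four-field key could be used as a map key, the example below adds an equality operator and a hash functor. The enums are simplified stand-ins for the real Paddle types, and <code class="docutils literal"><span class="pre">OpKernelTypeHash</span></code>, the bit-packing scheme, and the kernel names are assumptions made for illustration only, not the hashing that Fluid actually uses:</p>
<div class="highlight-cpp"><div class="highlight"><pre>#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;functional&gt;
#include &lt;string&gt;
#include &lt;unordered_map&gt;

// Simplified stand-ins for the real Paddle types (assumptions).
enum class Place { kCPU, kCUDA };
enum class Library { Plain, MKLDNN, CUDNN };
enum class DataType { kFP32, kFP64 };
enum class Layout { kNCHW, kNHWC };

struct OpKernelType {
  Place place_;
  Library library_;
  DataType data_type_;
  Layout layout_;

  bool operator==(const OpKernelType&amp; o) const {
    return place_ == o.place_ &amp;&amp; library_ == o.library_ &amp;&amp;
           data_type_ == o.data_type_ &amp;&amp; layout_ == o.layout_;
  }
};

// Packs the four enum values into disjoint bit ranges, then hashes the
// packed integer. Assumes every enum has fewer than 256 enumerators.
struct OpKernelTypeHash {
  std::size_t operator()(const OpKernelType&amp; k) const {
    const std::size_t packed =
        (static_cast&lt;std::size_t&gt;(k.place_) &lt;&lt; 24) |
        (static_cast&lt;std::size_t&gt;(k.library_) &lt;&lt; 16) |
        (static_cast&lt;std::size_t&gt;(k.data_type_) &lt;&lt; 8) |
        static_cast&lt;std::size_t&gt;(k.layout_);
    return std::hash&lt;std::size_t&gt;()(packed);
  }
};

int main() {
  std::unordered_map&lt;OpKernelType, std::string, OpKernelTypeHash&gt; kernels;
  const OpKernelType plain{Place::kCUDA, Library::Plain, DataType::kFP32,
                           Layout::kNCHW};
  const OpKernelType cudnn{Place::kCUDA, Library::CUDNN, DataType::kFP32,
                           Layout::kNCHW};
  kernels[plain] = "conv2d_plain";
  kernels[cudnn] = "conv2d_cudnn";
  assert(kernels.size() == 2);  // the library field keeps the keys apart
  assert(kernels.at(cudnn) == "conv2d_cudnn");
  return 0;
}
</pre></div>
</div>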
<p>The details are as follows:</p>
<div class="section" id="place">
<span id="place"></span><h3>Place<a class="headerlink" href="#place" title="Permalink to this headline"></a></h3>
<p><code class="docutils literal"><span class="pre">Place</span></code> is defined as follows:</p>
<div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="k">typedef</span> <span class="n">boost</span><span class="o">::</span><span class="n">variant</span><span class="o">&lt;</span><span class="n">CUDAPlace</span><span class="p">,</span> <span class="n">ROCmPlace</span><span class="p">,</span> <span class="n">FPGAPlace</span><span class="p">,</span> <span class="n">CPUPlace</span><span class="o">&gt;</span> <span class="n">Place</span><span class="p">;</span>
</pre></div>
</div>
<p><code class="docutils literal"><span class="pre">Place</span></code> represents the device memory where the data is located.</p>
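<p>A variant-based place can be inspected at run time to find out where the data lives. The sketch below assumes C++17 and uses <code class="docutils literal"><span class="pre">std::variant</span></code> as a stand-in for <code class="docutils literal"><span class="pre">boost::variant</span></code>. The place structs are reduced to two hypothetical, simplified types, and <code class="docutils literal"><span class="pre">IsGPUPlace</span></code> is a helper name chosen for this example:</p>
<div class="highlight-cpp"><div class="highlight"><pre>#include &lt;cassert&gt;
#include &lt;variant&gt;

// Hypothetical, simplified place types (assumptions for this sketch).
struct CPUPlace {};
struct CUDAPlace {
  int device = 0;  // index of the GPU that owns the memory
};

// std::variant (C++17) stands in for boost::variant here.
using Place = std::variant&lt;CUDAPlace, CPUPlace&gt;;

// Returns true when the data held at `place` lives in GPU memory.
bool IsGPUPlace(const Place&amp; place) {
  return std::holds_alternative&lt;CUDAPlace&gt;(place);
}

int main() {
  Place p = CUDAPlace{1};
  assert(IsGPUPlace(p));
  assert(std::get&lt;CUDAPlace&gt;(p).device == 1);
  p = CPUPlace{};
  assert(!IsGPUPlace(p));
  return 0;
}
</pre></div>
</div>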
</div>
<div class="section" id="library">
<span id="library"></span><h3>Library<a class="headerlink" href="#library" title="Permalink to this headline"></a></h3>
<p>One operator kernel is usually implemented based on one library. <code class="docutils literal"><span class="pre">Library</span></code> is defined as an enum:</p>
<div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="k">enum</span> <span class="n">Library</span> <span class="p">{</span> <span class="n">Plain</span><span class="p">,</span> <span class="n">MKLDNN</span><span class="p">,</span> <span class="n">CUDNN</span> <span class="p">};</span>
</pre></div>
</div>
<p>We use the <code class="docutils literal"><span class="pre">Plain</span></code> enumerator to represent the default library. Since most operators in Fluid are implemented with the <code class="docutils literal"><span class="pre">Eigen</span></code> library, we take <code class="docutils literal"><span class="pre">Eigen</span></code> as the <code class="docutils literal"><span class="pre">Plain</span></code> library.
A library usually has a corresponding <code class="docutils literal"><span class="pre">DeviceContext</span></code>, which contains the handles needed for computation. Fluid currently has two default DeviceContexts, one for CPU and one for CUDA: <code class="docutils literal"><span class="pre">CPUDeviceContext</span></code> and <code class="docutils literal"><span class="pre">CUDADeviceContext</span></code>. <code class="docutils literal"><span class="pre">CPUDeviceContext</span></code> contains an Eigen library handle, and <code class="docutils literal"><span class="pre">CUDADeviceContext</span></code> contains an Eigen library handle and a cuBLAS handle.</p>
<p>To support a new library, a new enumerator needs to be added to <code class="docutils literal"><span class="pre">Library</span></code> and a corresponding new <code class="docutils literal"><span class="pre">LibraryDeviceContext</span></code> needs to be created.</p>
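<p>The pairing between a library enumerator and its device context might look roughly like the following. This is a sketch under assumptions: the <code class="docutils literal"><span class="pre">DeviceContext</span></code> interface is reduced to a single hypothetical <code class="docutils literal"><span class="pre">Name()</span></code> method, <code class="docutils literal"><span class="pre">MKLDNNDeviceContext</span></code> and <code class="docutils literal"><span class="pre">MakeCPUContext</span></code> are illustrative names, and no real library handles are created:</p>
<div class="highlight-cpp"><div class="highlight"><pre>#include &lt;cassert&gt;
#include &lt;memory&gt;
#include &lt;string&gt;

enum class Library { Plain, MKLDNN, CUDNN };

// Hypothetical base class; a real device context would own library handles.
class DeviceContext {
 public:
  virtual ~DeviceContext() = default;
  virtual std::string Name() const = 0;
};

// Context for the default (Plain, Eigen-based) library on CPU.
class CPUDeviceContext : public DeviceContext {
 public:
  std::string Name() const override { return "CPUDeviceContext"; }
};

// Added together with the MKLDNN enumerator (illustrative only).
class MKLDNNDeviceContext : public DeviceContext {
 public:
  std::string Name() const override { return "MKLDNNDeviceContext"; }
};

// Picks the context that matches a CPU-side library. CUDNN is a GPU
// library, so this CPU-only sketch returns nullptr for it.
std::unique_ptr&lt;DeviceContext&gt; MakeCPUContext(Library library) {
  switch (library) {
    case Library::Plain:
      return std::make_unique&lt;CPUDeviceContext&gt;();
    case Library::MKLDNN:
      return std::make_unique&lt;MKLDNNDeviceContext&gt;();
    default:
      return nullptr;
  }
}

int main() {
  assert(MakeCPUContext(Library::Plain)-&gt;Name() == "CPUDeviceContext");
  assert(MakeCPUContext(Library::MKLDNN)-&gt;Name() == "MKLDNNDeviceContext");
  assert(MakeCPUContext(Library::CUDNN) == nullptr);
  return 0;
}
</pre></div>
</div>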
</div>
<div class="section" id="datatype">
<span id="datatype"></span><h3>DataType<a class="headerlink" href="#datatype" title="Permalink to this headline"></a></h3>
<p><code class="docutils literal"><span class="pre">DataType</span></code> is defined in <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto">framework.proto</a>. Currently, int32/int64/fp32/fp64 are supported.</p>
</div>
<div class="section" id="layout">
<span id="layout"></span><h3>Layout<a class="headerlink" href="#layout" title="Permalink to this headline"></a></h3>
<p>A Tensor is a view of a block of memory. Besides a pointer to the memory, we also need other descriptions of this block of memory, such as shape (ddim), stride, and layout.</p>
<p>Different layouts lead to different operator kernel implementations. There are four main principles we have to follow to support layout in the Fluid framework.</p>
<ul class="simple">
<li>We take layout as a data member of Tensor. Layout is an enum. If Fluid is built with MKLDNN, the memory formats in MKLDNN are added to this enum too.</li>
<li>Users have to set the layout of input data. Some operators, such as fill_constant/random, also have to set the layout of the data they generate. Of course, we can have a default layout, such as NCHW.</li>
<li>Layout is inferred at run time, not at compile time.</li>
<li>Every operator has to implement different kernels for different layouts. Take MKLDNN as an example: if we want to implement an MKLDNN convolution operator, we have to implement kernels for all the different layouts listed <a class="reference external" href="http://01org.github.io/mkl-dnn/structmkldnn_1_1memory.html">here</a>. We will also have a special macro for registering kernels for MKLDNN operators.</li>
</ul>
<p><code class="docutils literal"><span class="pre">Layout</span></code> is also defined as an enum:</p>
<div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="k">enum</span> <span class="n">Layout</span> <span class="p">{</span>
<span class="n">kNCHW</span><span class="p">,</span>
<span class="n">kNHWC</span><span class="p">,</span>
<span class="cp">#ifdef PADDLE_WITH_MKLDNN</span>
<span class="n">knChw8c</span>
<span class="p">...</span>
<span class="cp">#endif</span>
<span class="p">};</span>
</pre></div>
</div>
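<p>The principles of storing the layout in the Tensor, defaulting to NCHW, and resolving it at run time can be illustrated with a small sketch. The <code class="docutils literal"><span class="pre">Tensor</span></code> struct here is a hypothetical minimal type, not Fluid&#8217;s Tensor class, and <code class="docutils literal"><span class="pre">ChannelAxis</span></code> is a helper name invented for this example. It shows why a kernel such as batch norm has to branch on layout: the channel dimension sits at a different index.</p>
<div class="highlight-cpp"><div class="highlight"><pre>#include &lt;cassert&gt;
#include &lt;vector&gt;

enum class Layout { kNCHW, kNHWC };

// Hypothetical minimal tensor: layout is a data member with an NCHW default.
struct Tensor {
  std::vector&lt;int&gt; dims;
  Layout layout = Layout::kNCHW;
};

// Index of the channel dimension of a 4-D tensor, decided at run time
// from the layout stored in the tensor.
int ChannelAxis(const Tensor&amp; t) {
  return t.layout == Layout::kNCHW ? 1 : 3;
}

int main() {
  Tensor a{{8, 3, 32, 32}};                 // default layout: NCHW
  Tensor b{{8, 32, 32, 3}, Layout::kNHWC};  // layout set explicitly
  assert(a.dims[ChannelAxis(a)] == 3);
  assert(b.dims[ChannelAxis(b)] == 3);
  return 0;
}
</pre></div>
</div>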
</div>
</div>
</div>
</div>
</div>
<footer>
<hr/>
<div role="contentinfo">
<p>
&copy; Copyright 2016, PaddlePaddle developers.
</p>
</div>
Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/snide/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT:'../',
VERSION:'',
COLLAPSE_INDEX:false,
FILE_SUFFIX:'.html',
HAS_SOURCE: true,
SOURCELINK_SUFFIX: ".txt",
};
</script>
<script type="text/javascript" src="../_static/jquery.js"></script>
<script type="text/javascript" src="../_static/underscore.js"></script>
<script type="text/javascript" src="../_static/doctools.js"></script>
<script type="text/javascript" src="../_static/translations.js"></script>
<script type="text/javascript" src="https://cdn.bootcss.com/mathjax/2.7.0/MathJax.js"></script>
<script type="text/javascript" src="../_static/js/theme.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script>
<script src="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/js/perfect-scrollbar.jquery.min.js"></script>
<script src="../_static/js/paddle_doc_init.js"></script>
</body>
</html>