1. A graph is composed of *variables* and *operators*.
1. The description of graphs must be capable of being serialized/deserialized, so that:
   1. It can be sent to the cloud for distributed execution, and
   1. It can be sent to clients for mobile or enterprise deployment.
...
...
* `Eigen::Tensor` contains basic math and element-wise functions.
  * Note that `Eigen::Tensor` has a broadcast implementation.
  * Limit the number of `tensor.device(dev) =` assignments in your code.
* `thrust::transform` and `std::transform`.
  * `thrust` has the same API as the C++ standard library. Using `transform`, one can quickly implement customized element-wise kernels (see the sketch after this list).
* `thrust` also has more complex APIs, like `scan`, `reduce`, `reduce_by_key`.
* Hand-writing `GPUKernel` and `CPU` code
  * Do not write kernels in header (`.h`) files. CPU kernels should go in C++ source (`.cc`) files and GPU kernels in CUDA (`.cu`) files, because GCC cannot compile GPU code.
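
Below is a minimal sketch, not Paddle's actual kernel code, of the element-wise pattern these bullets describe: the kernel body is written once as a functor; on CPU it runs through `std::transform`, and a GPU version would swap in `thrust::transform` over device vectors with the same call shape. The `scale` value here is a hypothetical operator attribute.

```cpp
#include <algorithm>
#include <iostream>
#include <vector>

int main() {
  std::vector<float> x = {1.f, 2.f, 3.f};
  std::vector<float> y(x.size());
  const float scale = 2.f;  // hypothetical attribute of a Scale op

  // Element-wise kernel: y[i] = scale * x[i]. thrust::transform takes the
  // same iterator/functor arguments, but over device memory.
  std::transform(x.begin(), x.end(), y.begin(),
                 [scale](float v) { return scale * v; });

  for (float v : y) std::cout << v << " ";  // prints: 2 4 6
  return 0;
}
```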
---
# Operator Registration
## Why is registration necessary?
We need a method to build mappings between Op type names and Op classes.
## How is registration implemented?
Maintain a map whose key is the type name and whose value is the corresponding Op constructor (see the sketch below).
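
A minimal sketch of such a registry, assuming toy `OpBase`/`ScaleOp` types rather than Paddle's real classes:

```cpp
#include <functional>
#include <iostream>
#include <map>
#include <memory>
#include <string>

struct OpBase {
  virtual ~OpBase() = default;
  virtual void Run() = 0;
};

struct ScaleOp : OpBase {
  void Run() override { std::cout << "running scale\n"; }
};

// The map described above: Op type name -> Op constructor.
std::map<std::string, std::function<std::unique_ptr<OpBase>()>>& Registry() {
  static std::map<std::string, std::function<std::unique_ptr<OpBase>()>> r;
  return r;
}

int main() {
  // Registration: typically performed once by a macro at namespace scope.
  Registry()["scale"] = [] { return std::make_unique<ScaleOp>(); };
  // Lookup by type name, e.g. while deserializing a graph description.
  std::unique_ptr<OpBase> op = Registry()["scale"]();
  op->Run();
  return 0;
}
```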
---
...
...
# Related Concepts
### Op_Maker
Its constructor takes `proto` and `checker`; both are completed during Op_Maker's construction, as in the simplified sketch below. ([ScaleOpMaker](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/scale_op.cc#L37))
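
A hypothetical, heavily simplified sketch of that idea; `OpProto` and `OpAttrChecker` here are stand-ins, not Paddle's real types:

```cpp
#include <string>
#include <vector>

struct OpProto {             // stand-in for the real proto message
  std::vector<std::string> inputs, outputs;
  std::string comment;
};

struct OpAttrChecker {       // stand-in for the real attribute checker
  std::vector<std::string> required_attrs;
};

// The maker's only job is to fill in proto and checker in its constructor.
class ScaleOpMakerSketch {
 public:
  ScaleOpMakerSketch(OpProto* proto, OpAttrChecker* checker) {
    proto->inputs.push_back("X");
    proto->outputs.push_back("Out");
    proto->comment = "Out = scale * X";
    checker->required_attrs.push_back("scale");
  }
};

int main() {
  OpProto proto;
  OpAttrChecker checker;
  ScaleOpMakerSketch maker(&proto, &checker);  // both are now completed
  return 0;
}
```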
### Register Macros
```cpp
...
...
---
# Backward Module (2/2)
### Build Backward Network
- **Input**: graph of forward operators
- **Output**: graph of backward operators
- **Corner cases in construction**
  - Shared Variables => insert an `Add` operator to combine gradients (see the sketch at the end of this slide)
...
...
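
A minimal sketch of the shared-variable corner case flagged above, assuming toy structs and illustrative `@GRAD` names rather than Paddle's real backward builder: when two backward ops both write the gradient of `x`, their outputs are renamed and an `Add` op is inserted to combine them.

```cpp
#include <iostream>
#include <map>
#include <string>
#include <vector>

struct Op {
  std::string type;
  std::vector<std::string> inputs;
  std::string output;
};

int main() {
  // Two backward ops each emit a gradient for the shared variable x.
  std::vector<Op> backward = {{"mul_grad", {"dy1"}, "x@GRAD"},
                              {"fc_grad", {"dy2"}, "x@GRAD"}};
  // Rename duplicate outputs: x@GRAD -> x@GRAD_0, x@GRAD_1, ...
  std::map<std::string, int> seen;
  std::vector<std::string> parts;
  for (auto& op : backward) {
    int idx = seen[op.output]++;
    op.output += "_" + std::to_string(idx);
    parts.push_back(op.output);
  }
  // ...then insert an Add op that combines the renamed partial gradients.
  backward.push_back({"add", parts, "x@GRAD"});
  for (const auto& op : backward)
    std::cout << op.type << " -> " << op.output << "\n";
  return 0;
}
```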
---
# Block (in design)
## The difference between the original RNNOp and Block
- As an operator, a Block is more intuitive than `RNNOp`,
- It offers a new interface `Eval(targets)` to deduce the minimal block to `Run` (see the sketch below),
- It fits the compile-time/runtime separation design paradigm.
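
A minimal sketch of what "deduce the minimal block" could mean, assuming a toy op list rather than Paddle's real Block/`Eval` API: starting from the target variables, walk producers backward and keep only the ops actually needed.

```cpp
#include <iostream>
#include <set>
#include <string>
#include <vector>

struct Op {
  std::string name;
  std::vector<std::string> inputs, outputs;
};

// Keep only the ops needed to compute the target variables.
std::vector<Op> MinimalBlock(const std::vector<Op>& ops,
                             std::set<std::string> needed) {
  std::vector<Op> kept;
  for (auto it = ops.rbegin(); it != ops.rend(); ++it) {
    bool produces_needed = false;
    for (const auto& out : it->outputs)
      if (needed.count(out)) produces_needed = true;
    if (!produces_needed) continue;   // op is irrelevant to the targets
    kept.insert(kept.begin(), *it);   // keep, preserving original order
    for (const auto& in : it->inputs) needed.insert(in);
  }
  return kept;
}

int main() {
  std::vector<Op> block = {{"mul", {"x", "w"}, {"h"}},
                           {"add", {"h", "b"}, {"y"}},
                           {"unused", {"x"}, {"z"}}};
  for (const auto& op : MinimalBlock(block, {"y"}))
    std::cout << op.name << "\n";  // prints: mul add ("unused" is pruned)
  return 0;
}
```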