Commit d0fc49e1, author: Travis CI

Deploy to GitHub Pages: 43702a89

Parent ba5f789e
...@@ -3,17 +3,17 @@ ...@@ -3,17 +3,17 @@
## The Problem Posed ## The Problem Posed
Currently, for each C++ operator class definition, there registers a *gradient operator creator* function, which takes a C++ operator instance and returns the corresponding gradient operator instance. Currently, for each C++ operator class definition, a *gradient operator creator* function is registered, which takes as input a C++ operator instance and returns the corresponding gradient operator instance.
However, we noticed two problems with the current deisgn: However, we noticed two problems with the current design:
1. As we decided to separate the *compilation* and *execution* phases, we need to change the creator to take an `OpDesc` protobuf message in a `ProgramDesc` and inserts corresponding `OpDesc` messages into the `ProgramDesc` message. 1. As we decided to separate the *compilation* and the *execution* phases, we need to change the creator to take an `OpDesc` protobuf message in a `ProgramDesc` and inserts corresponding `OpDesc` messages into the `ProgramDesc` message.
1. Some operator's gradient computation requires more than one gradient operators. For example, the gradient of *minus* consists of two operators -- an identity operaotr and a scale operator. So we need to make the registration mechanism to support the mapping from an operator to a set of operators for gradient computation. 1. For some operators, the gradient computation can be written in terms of existing operators. For example, the gradient of *minus* operator consists of two operators -- an *identity* operator followed by a *scale* operator. Hence the registration mechanism needs to support mapping from an operator to a set of operators for the gradient computation.
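To make the second point concrete, here is a minimal sketch of a one-to-many mapping for *minus* (`Out = X - Y`). It assumes a simplified stand-in `OpDesc` struct rather than the real protobuf message, and the `@GRAD` variable suffix, the slot names, and the `scale` attribute name are illustrative assumptions, not confirmed by this document.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the OpDesc protobuf message.
struct OpDesc {
  std::string type;
  std::map<std::string, std::string> inputs;   // slot name -> variable name
  std::map<std::string, std::string> outputs;  // slot name -> variable name
  std::map<std::string, float> attrs;          // attribute name -> value
};

// Maps the forward `minus` op to the two ops that compute its gradient:
// dX = dOut (identity) and dY = -dOut (scale by -1).
std::vector<OpDesc> MinusGradOps(const OpDesc& fwd) {
  // Assumed naming convention for gradient variables.
  const std::string d_out = fwd.outputs.at("Out") + "@GRAD";
  const std::string d_x = fwd.inputs.at("X") + "@GRAD";
  const std::string d_y = fwd.inputs.at("Y") + "@GRAD";

  OpDesc identity;
  identity.type = "identity";
  identity.inputs["X"] = d_out;
  identity.outputs["Out"] = d_x;

  OpDesc scale;
  scale.type = "scale";
  scale.inputs["X"] = d_out;
  scale.outputs["Out"] = d_y;
  scale.attrs["scale"] = -1.0f;

  return {identity, scale};
}
```

A single `grad_op_type` string cannot express this result, which is why the registration needs to return a collection of operator descriptions.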
## The Current Implementation ## The Current Implementation
The C++ class `OpInfos` store in a association map which key is the operator type. The `grad_op_type` indicate associated gradient operator type. Operator can create gradient operator by `OpInfo::creator_` of gradient. The pseudo code is Instances of the C++ class `OpInfo` are stored an associative map whose key is the operator type. The `grad_op_type` indicates the associated gradient operator type. An operator can create the gradient operator by invoking `OpInfo::creator_` of the gradient operator. The pseudo code is as follows
```cpp ```cpp
struct OpInfo { struct OpInfo {
...@@ -31,16 +31,16 @@ OperatorBase* CreateGradientOperator(const OperatorBase& op) { ...@@ -31,16 +31,16 @@ OperatorBase* CreateGradientOperator(const OperatorBase& op) {
## Proposed Solution ## Proposed Solution
The mapping relationship between an operator and its gradient operators is a function. The interface of that function is: The mapping relationship between an operator and its gradient operators is a function. The interface of this function is:
```cpp ```cpp
// (OpDesc) --> vector<OpDesc> // (OpDesc) --> vector<OpDesc>
std::function<std::vector<OpDescBind>(const OpDescBind&)>; std::function<std::vector<OpDescBind>(const OpDescBind&)>;
``` ```
The function takes an `OpDescBind` of the forward operator and returns one or many gradient operator descriptions. `OpDescBind` is a C++ wrapper for protobuf message `OpDesc` to manipulate `OpDesc` fast. The function takes an `OpDescBind` of the forward operator and returns one or many gradient operator descriptions. `OpDescBind` is a C++ wrapper for the protobuf message `OpDesc` for rapid manipulation of `OpDesc`.
The `GradOpDescMaker` will be registered in `OpInfo`, to replace `grad_op_type_` field. The `OpInfo` should be The `GradOpDescMaker` will be registered in `OpInfo` and will replace the `grad_op_type_` field. The `OpInfo` should look like
```cpp ```cpp
struct OpInfo { struct OpInfo {
...@@ -49,7 +49,7 @@ struct OpInfo { ...@@ -49,7 +49,7 @@ struct OpInfo {
}; };
``` ```
The `grad_op_maker_ ` is `nullptr` if the operator does not have associated gradient operators. The `grad_op_maker_ ` is a `nullptr` if the operator does not have any associated gradient operators.
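As a rough illustration of how a caller might handle the `nullptr` case, the sketch below looks up an `OpInfo` and only invokes `grad_op_maker_` when it is set. The `OpDescBind` struct here is a simplified stand-in, and `OpInfoMap` and `MakeGradOps` are hypothetical names introduced only for this example.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the real OpDescBind wrapper.
struct OpDescBind {
  std::string type;
};

using GradOpMaker =
    std::function<std::vector<std::unique_ptr<OpDescBind>>(const OpDescBind&)>;

struct OpInfo {
  GradOpMaker grad_op_maker_;  // empty (compares equal to nullptr) if no gradient
};

// Hypothetical registry keyed by operator type.
std::unordered_map<std::string, OpInfo>& OpInfoMap() {
  static std::unordered_map<std::string, OpInfo> map;
  return map;
}

// Returns the gradient op descriptions for `fwd`, or an empty vector when
// the operator has no registered gradient maker. Throws std::out_of_range
// if the operator type itself is not registered.
std::vector<std::unique_ptr<OpDescBind>> MakeGradOps(const OpDescBind& fwd) {
  const OpInfo& info = OpInfoMap().at(fwd.type);
  if (info.grad_op_maker_ == nullptr) {
    return {};
  }
  return info.grad_op_maker_(fwd);
}
```

Returning an empty vector is one possible policy; a backward pass could equally treat a missing maker as an error, and the design doc does not say which is intended.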
We propose a base class called `GradOpDescMakerBase` to let operator developers generate `Gradient Operators` easily. The public interface of that class is We propose a base class called `GradOpDescMakerBase` to let operator developers generate `Gradient Operators` easily. The public interface of that class is
...@@ -74,7 +74,7 @@ func = [] (const OpDescBind& fwd_op) { ...@@ -74,7 +74,7 @@ func = [] (const OpDescBind& fwd_op) {
We can write many helper functions since the `GradOpDescMakerBase` is a class now. The basic helper functions get the variables of `Input`, `Output`, `InputGradient` and `OutputGradient` in the forwarding operator. We can write many helper functions since the `GradOpDescMakerBase` is a class now. The basic helper functions get the variables of `Input`, `Output`, `InputGradient` and `OutputGradient` in the forwarding operator.
We should chagne register macros at the same time. In the current solution, there is no difference between forwarding operators and backward operators. So `REGISTER_OP` just register one operator. If the `REGISTER_OPERATOR ` contains `OpProtoAndCheckerMaker` and `GradOpDescMaker`, we just list them in the same macro. It can be done by a macro contains `__VA_ARGS__`. We should change register macros at the same time. In the current solution, there is no difference between forwarding operators and backward operators. So `REGISTER_OP` just register one operator. If the `REGISTER_OPERATOR ` contains `OpProtoAndCheckerMaker` and `GradOpDescMaker`, we just list them in the same macro. It can be done by a macro contains `__VA_ARGS__`.
The user interface should be The user interface should be
......
...@@ -185,16 +185,16 @@ ...@@ -185,16 +185,16 @@
<span id="design-doc-gradient-operators-registration"></span><h1>Design Doc: Gradient Operators Registration<a class="headerlink" href="#design-doc-gradient-operators-registration" title="Permalink to this headline"></a></h1> <span id="design-doc-gradient-operators-registration"></span><h1>Design Doc: Gradient Operators Registration<a class="headerlink" href="#design-doc-gradient-operators-registration" title="Permalink to this headline"></a></h1>
<div class="section" id="the-problem-posed"> <div class="section" id="the-problem-posed">
<span id="the-problem-posed"></span><h2>The Problem Posed<a class="headerlink" href="#the-problem-posed" title="Permalink to this headline"></a></h2> <span id="the-problem-posed"></span><h2>The Problem Posed<a class="headerlink" href="#the-problem-posed" title="Permalink to this headline"></a></h2>
<p>Currently, for each C++ operator class definition, there registers a <em>gradient operator creator</em> function, which takes a C++ operator instance and returns the corresponding gradient operator instance.</p> <p>Currently, for each C++ operator class definition, a <em>gradient operator creator</em> function is registered, which takes as input a C++ operator instance and returns the corresponding gradient operator instance.</p>
<p>However, we noticed two problems with the current deisgn:</p> <p>However, we noticed two problems with the current design:</p>
<ol class="simple"> <ol class="simple">
<li>As we decided to separate the <em>compilation</em> and <em>execution</em> phases, we need to change the creator to take an <code class="docutils literal"><span class="pre">OpDesc</span></code> protobuf message in a <code class="docutils literal"><span class="pre">ProgramDesc</span></code> and inserts corresponding <code class="docutils literal"><span class="pre">OpDesc</span></code> messages into the <code class="docutils literal"><span class="pre">ProgramDesc</span></code> message.</li> <li>As we decided to separate the <em>compilation</em> and the <em>execution</em> phases, we need to change the creator to take an <code class="docutils literal"><span class="pre">OpDesc</span></code> protobuf message in a <code class="docutils literal"><span class="pre">ProgramDesc</span></code> and inserts corresponding <code class="docutils literal"><span class="pre">OpDesc</span></code> messages into the <code class="docutils literal"><span class="pre">ProgramDesc</span></code> message.</li>
<li>Some operator&#8217;s gradient computation requires more than one gradient operators. For example, the gradient of <em>minus</em> consists of two operators &#8211; an identity operaotr and a scale operator. So we need to make the registration mechanism to support the mapping from an operator to a set of operators for gradient computation.</li> <li>For some operators, the gradient computation can be written in terms of existing operators. For example, the gradient of <em>minus</em> operator consists of two operators &#8211; an <em>identity</em> operator followed by a <em>scale</em> operator. Hence the registration mechanism needs to support mapping from an operator to a set of operators for the gradient computation.</li>
</ol> </ol>
</div> </div>
<div class="section" id="the-current-implementation"> <div class="section" id="the-current-implementation">
<span id="the-current-implementation"></span><h2>The Current Implementation<a class="headerlink" href="#the-current-implementation" title="Permalink to this headline"></a></h2> <span id="the-current-implementation"></span><h2>The Current Implementation<a class="headerlink" href="#the-current-implementation" title="Permalink to this headline"></a></h2>
<p>The C++ class <code class="docutils literal"><span class="pre">OpInfos</span></code> store in a association map which key is the operator type. The <code class="docutils literal"><span class="pre">grad_op_type</span></code> indicate associated gradient operator type. Operator can create gradient operator by <code class="docutils literal"><span class="pre">OpInfo::creator_</span></code> of gradient. The pseudo code is</p> <p>Instances of the C++ class <code class="docutils literal"><span class="pre">OpInfo</span></code> are stored an associative map whose key is the operator type. The <code class="docutils literal"><span class="pre">grad_op_type</span></code> indicates the associated gradient operator type. An operator can create the gradient operator by invoking <code class="docutils literal"><span class="pre">OpInfo::creator_</span></code> of the gradient operator. The pseudo code is as follows</p>
<div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="k">struct</span> <span class="n">OpInfo</span> <span class="p">{</span> <div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="k">struct</span> <span class="n">OpInfo</span> <span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">function</span><span class="o">&lt;</span><span class="n">OperatorBase</span><span class="o">*</span><span class="p">(...)</span><span class="o">&gt;</span> <span class="n">creator_</span><span class="p">;</span> <span class="n">std</span><span class="o">::</span><span class="n">function</span><span class="o">&lt;</span><span class="n">OperatorBase</span><span class="o">*</span><span class="p">(...)</span><span class="o">&gt;</span> <span class="n">creator_</span><span class="p">;</span>
<span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">grad_op_type_</span><span class="p">;</span> <span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">grad_op_type_</span><span class="p">;</span>
...@@ -211,20 +211,20 @@ ...@@ -211,20 +211,20 @@
</div> </div>
<div class="section" id="proposed-solution"> <div class="section" id="proposed-solution">
<span id="proposed-solution"></span><h2>Proposed Solution<a class="headerlink" href="#proposed-solution" title="Permalink to this headline"></a></h2> <span id="proposed-solution"></span><h2>Proposed Solution<a class="headerlink" href="#proposed-solution" title="Permalink to this headline"></a></h2>
<p>The mapping relationship between an operator and its gradient operators is a function. The interface of that function is:</p> <p>The mapping relationship between an operator and its gradient operators is a function. The interface of this function is:</p>
<div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="c1">// (OpDesc) --&gt; vector&lt;OpDesc&gt;</span> <div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="c1">// (OpDesc) --&gt; vector&lt;OpDesc&gt;</span>
<span class="n">std</span><span class="o">::</span><span class="n">function</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">OpDescBind</span><span class="o">&gt;</span><span class="p">(</span><span class="k">const</span> <span class="n">OpDescBind</span><span class="o">&amp;</span><span class="p">)</span><span class="o">&gt;</span><span class="p">;</span> <span class="n">std</span><span class="o">::</span><span class="n">function</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">OpDescBind</span><span class="o">&gt;</span><span class="p">(</span><span class="k">const</span> <span class="n">OpDescBind</span><span class="o">&amp;</span><span class="p">)</span><span class="o">&gt;</span><span class="p">;</span>
</pre></div> </pre></div>
</div> </div>
<p>The function takes an <code class="docutils literal"><span class="pre">OpDescBind</span></code> of the forward operator and returns one or many gradient operator descriptions. <code class="docutils literal"><span class="pre">OpDescBind</span></code> is a C++ wrapper for protobuf message <code class="docutils literal"><span class="pre">OpDesc</span></code> to manipulate <code class="docutils literal"><span class="pre">OpDesc</span></code> fast.</p> <p>The function takes an <code class="docutils literal"><span class="pre">OpDescBind</span></code> of the forward operator and returns one or many gradient operator descriptions. <code class="docutils literal"><span class="pre">OpDescBind</span></code> is a C++ wrapper for the protobuf message <code class="docutils literal"><span class="pre">OpDesc</span></code> for rapid manipulation of <code class="docutils literal"><span class="pre">OpDesc</span></code>.</p>
<p>The <code class="docutils literal"><span class="pre">GradOpDescMaker</span></code> will be registered in <code class="docutils literal"><span class="pre">OpInfo</span></code>, to replace <code class="docutils literal"><span class="pre">grad_op_type_</span></code> field. The <code class="docutils literal"><span class="pre">OpInfo</span></code> should be</p> <p>The <code class="docutils literal"><span class="pre">GradOpDescMaker</span></code> will be registered in <code class="docutils literal"><span class="pre">OpInfo</span></code> and will replace the <code class="docutils literal"><span class="pre">grad_op_type_</span></code> field. The <code class="docutils literal"><span class="pre">OpInfo</span></code> should look like</p>
<div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="k">struct</span> <span class="n">OpInfo</span> <span class="p">{</span> <div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="k">struct</span> <span class="n">OpInfo</span> <span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">function</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">unique_ptr</span><span class="o">&lt;</span><span class="n">OpDescBind</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="k">const</span> <span class="n">OpDescBind</span><span class="o">&amp;</span><span class="p">)</span><span class="o">&gt;</span> <span class="n">grad_op_maker_</span><span class="p">;</span> <span class="n">std</span><span class="o">::</span><span class="n">function</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">unique_ptr</span><span class="o">&lt;</span><span class="n">OpDescBind</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="k">const</span> <span class="n">OpDescBind</span><span class="o">&amp;</span><span class="p">)</span><span class="o">&gt;</span> <span class="n">grad_op_maker_</span><span class="p">;</span>
<span class="p">...</span> <span class="p">...</span>
<span class="p">};</span> <span class="p">};</span>
</pre></div> </pre></div>
</div> </div>
<p>The <code class="docutils literal"><span class="pre">grad_op_maker_</span></code> is <code class="docutils literal"><span class="pre">nullptr</span></code> if the operator does not have associated gradient operators.</p> <p>The <code class="docutils literal"><span class="pre">grad_op_maker_</span></code> is a <code class="docutils literal"><span class="pre">nullptr</span></code> if the operator does not have any associated gradient operators.</p>
<p>We propose a base class called <code class="docutils literal"><span class="pre">GradOpDescMakerBase</span></code> to let operator developers generate <code class="docutils literal"><span class="pre">Gradient</span> <span class="pre">Operators</span></code> easily. The public interface of that class is</p> <p>We propose a base class called <code class="docutils literal"><span class="pre">GradOpDescMakerBase</span></code> to let operator developers generate <code class="docutils literal"><span class="pre">Gradient</span> <span class="pre">Operators</span></code> easily. The public interface of that class is</p>
<div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">GradOpDescMakerBase</span> <span class="p">{</span> <div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">GradOpDescMakerBase</span> <span class="p">{</span>
<span class="k">public</span><span class="o">:</span> <span class="k">public</span><span class="o">:</span>
...@@ -243,7 +243,7 @@ ...@@ -243,7 +243,7 @@
</pre></div> </pre></div>
</div> </div>
<p>We can write many helper functions since the <code class="docutils literal"><span class="pre">GradOpDescMakerBase</span></code> is a class now. The basic helper functions get the variables of <code class="docutils literal"><span class="pre">Input</span></code>, <code class="docutils literal"><span class="pre">Output</span></code>, <code class="docutils literal"><span class="pre">InputGradient</span></code> and <code class="docutils literal"><span class="pre">OutputGradient</span></code> in the forwarding operator.</p> <p>We can write many helper functions since the <code class="docutils literal"><span class="pre">GradOpDescMakerBase</span></code> is a class now. The basic helper functions get the variables of <code class="docutils literal"><span class="pre">Input</span></code>, <code class="docutils literal"><span class="pre">Output</span></code>, <code class="docutils literal"><span class="pre">InputGradient</span></code> and <code class="docutils literal"><span class="pre">OutputGradient</span></code> in the forwarding operator.</p>
<p>We should chagne register macros at the same time. In the current solution, there is no difference between forwarding operators and backward operators. So <code class="docutils literal"><span class="pre">REGISTER_OP</span></code> just register one operator. If the <code class="docutils literal"><span class="pre">REGISTER_OPERATOR</span></code> contains <code class="docutils literal"><span class="pre">OpProtoAndCheckerMaker</span></code> and <code class="docutils literal"><span class="pre">GradOpDescMaker</span></code>, we just list them in the same macro. It can be done by a macro contains <code class="docutils literal"><span class="pre">__VA_ARGS__</span></code>.</p> <p>We should change register macros at the same time. In the current solution, there is no difference between forwarding operators and backward operators. So <code class="docutils literal"><span class="pre">REGISTER_OP</span></code> just register one operator. If the <code class="docutils literal"><span class="pre">REGISTER_OPERATOR</span></code> contains <code class="docutils literal"><span class="pre">OpProtoAndCheckerMaker</span></code> and <code class="docutils literal"><span class="pre">GradOpDescMaker</span></code>, we just list them in the same macro. It can be done by a macro contains <code class="docutils literal"><span class="pre">__VA_ARGS__</span></code>.</p>
<p>The user interface should be</p> <p>The user interface should be</p>
<div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="n">vector</span><span class="o">&lt;</span><span class="n">OpDesc</span><span class="o">&gt;</span> <span class="n">MinusOpGradMaker</span><span class="p">(</span><span class="n">OpDesc</span><span class="p">)</span> <span class="p">{...}</span> <div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="n">vector</span><span class="o">&lt;</span><span class="n">OpDesc</span><span class="o">&gt;</span> <span class="n">MinusOpGradMaker</span><span class="p">(</span><span class="n">OpDesc</span><span class="p">)</span> <span class="p">{...}</span>
<span class="n">REGISTER_OPERATOR</span><span class="p">(</span><span class="n">minus</span><span class="p">,</span> <span class="n">MinusOp</span><span class="p">,</span> <span class="n">MinusOpProtoAndCheckerMaker</span><span class="p">,</span> <span class="n">SumOpGradMaker</span><span class="p">);</span> <span class="n">REGISTER_OPERATOR</span><span class="p">(</span><span class="n">minus</span><span class="p">,</span> <span class="n">MinusOp</span><span class="p">,</span> <span class="n">MinusOpProtoAndCheckerMaker</span><span class="p">,</span> <span class="n">SumOpGradMaker</span><span class="p">);</span>
......
The source diff is too large to display. You can view the blob instead.
...@@ -3,17 +3,17 @@ ...@@ -3,17 +3,17 @@
## The Problem Posed ## The Problem Posed
Currently, for each C++ operator class definition, there registers a *gradient operator creator* function, which takes a C++ operator instance and returns the corresponding gradient operator instance. Currently, for each C++ operator class definition, a *gradient operator creator* function is registered, which takes as input a C++ operator instance and returns the corresponding gradient operator instance.
However, we noticed two problems with the current deisgn: However, we noticed two problems with the current design:
1. As we decided to separate the *compilation* and *execution* phases, we need to change the creator to take an `OpDesc` protobuf message in a `ProgramDesc` and inserts corresponding `OpDesc` messages into the `ProgramDesc` message. 1. As we decided to separate the *compilation* and the *execution* phases, we need to change the creator to take an `OpDesc` protobuf message in a `ProgramDesc` and inserts corresponding `OpDesc` messages into the `ProgramDesc` message.
1. Some operator's gradient computation requires more than one gradient operators. For example, the gradient of *minus* consists of two operators -- an identity operaotr and a scale operator. So we need to make the registration mechanism to support the mapping from an operator to a set of operators for gradient computation. 1. For some operators, the gradient computation can be written in terms of existing operators. For example, the gradient of *minus* operator consists of two operators -- an *identity* operator followed by a *scale* operator. Hence the registration mechanism needs to support mapping from an operator to a set of operators for the gradient computation.
## The Current Implementation ## The Current Implementation
The C++ class `OpInfos` store in a association map which key is the operator type. The `grad_op_type` indicate associated gradient operator type. Operator can create gradient operator by `OpInfo::creator_` of gradient. The pseudo code is Instances of the C++ class `OpInfo` are stored an associative map whose key is the operator type. The `grad_op_type` indicates the associated gradient operator type. An operator can create the gradient operator by invoking `OpInfo::creator_` of the gradient operator. The pseudo code is as follows
```cpp ```cpp
struct OpInfo { struct OpInfo {
...@@ -31,16 +31,16 @@ OperatorBase* CreateGradientOperator(const OperatorBase& op) { ...@@ -31,16 +31,16 @@ OperatorBase* CreateGradientOperator(const OperatorBase& op) {
## Proposed Solution ## Proposed Solution
The mapping relationship between an operator and its gradient operators is a function. The interface of that function is: The mapping relationship between an operator and its gradient operators is a function. The interface of this function is:
```cpp ```cpp
// (OpDesc) --> vector<OpDesc> // (OpDesc) --> vector<OpDesc>
std::function<std::vector<OpDescBind>(const OpDescBind&)>; std::function<std::vector<OpDescBind>(const OpDescBind&)>;
``` ```
The function takes an `OpDescBind` of the forward operator and returns one or many gradient operator descriptions. `OpDescBind` is a C++ wrapper for protobuf message `OpDesc` to manipulate `OpDesc` fast. The function takes an `OpDescBind` of the forward operator and returns one or many gradient operator descriptions. `OpDescBind` is a C++ wrapper for the protobuf message `OpDesc` for rapid manipulation of `OpDesc`.
The `GradOpDescMaker` will be registered in `OpInfo`, to replace `grad_op_type_` field. The `OpInfo` should be The `GradOpDescMaker` will be registered in `OpInfo` and will replace the `grad_op_type_` field. The `OpInfo` should look like
```cpp ```cpp
struct OpInfo { struct OpInfo {
...@@ -49,7 +49,7 @@ struct OpInfo { ...@@ -49,7 +49,7 @@ struct OpInfo {
}; };
``` ```
The `grad_op_maker_ ` is `nullptr` if the operator does not have associated gradient operators. The `grad_op_maker_ ` is a `nullptr` if the operator does not have any associated gradient operators.
We propose a base class called `GradOpDescMakerBase` to let operator developers generate `Gradient Operators` easily. The public interface of that class is We propose a base class called `GradOpDescMakerBase` to let operator developers generate `Gradient Operators` easily. The public interface of that class is
...@@ -74,7 +74,7 @@ func = [] (const OpDescBind& fwd_op) { ...@@ -74,7 +74,7 @@ func = [] (const OpDescBind& fwd_op) {
We can write many helper functions since the `GradOpDescMakerBase` is a class now. The basic helper functions get the variables of `Input`, `Output`, `InputGradient` and `OutputGradient` in the forwarding operator. We can write many helper functions since the `GradOpDescMakerBase` is a class now. The basic helper functions get the variables of `Input`, `Output`, `InputGradient` and `OutputGradient` in the forwarding operator.
We should chagne register macros at the same time. In the current solution, there is no difference between forwarding operators and backward operators. So `REGISTER_OP` just register one operator. If the `REGISTER_OPERATOR ` contains `OpProtoAndCheckerMaker` and `GradOpDescMaker`, we just list them in the same macro. It can be done by a macro contains `__VA_ARGS__`. We should change register macros at the same time. In the current solution, there is no difference between forwarding operators and backward operators. So `REGISTER_OP` just register one operator. If the `REGISTER_OPERATOR ` contains `OpProtoAndCheckerMaker` and `GradOpDescMaker`, we just list them in the same macro. It can be done by a macro contains `__VA_ARGS__`.
The user interface should be The user interface should be
......
...@@ -199,16 +199,16 @@ ...@@ -199,16 +199,16 @@
<span id="design-doc-gradient-operators-registration"></span><h1>Design Doc: Gradient Operators Registration<a class="headerlink" href="#design-doc-gradient-operators-registration" title="永久链接至标题"></a></h1> <span id="design-doc-gradient-operators-registration"></span><h1>Design Doc: Gradient Operators Registration<a class="headerlink" href="#design-doc-gradient-operators-registration" title="永久链接至标题"></a></h1>
<div class="section" id="the-problem-posed"> <div class="section" id="the-problem-posed">
<span id="the-problem-posed"></span><h2>The Problem Posed<a class="headerlink" href="#the-problem-posed" title="永久链接至标题"></a></h2> <span id="the-problem-posed"></span><h2>The Problem Posed<a class="headerlink" href="#the-problem-posed" title="永久链接至标题"></a></h2>
<p>Currently, for each C++ operator class definition, there registers a <em>gradient operator creator</em> function, which takes a C++ operator instance and returns the corresponding gradient operator instance.</p> <p>Currently, for each C++ operator class definition, a <em>gradient operator creator</em> function is registered, which takes as input a C++ operator instance and returns the corresponding gradient operator instance.</p>
<p>However, we noticed two problems with the current deisgn:</p> <p>However, we noticed two problems with the current design:</p>
<ol class="simple"> <ol class="simple">
<li>As we decided to separate the <em>compilation</em> and <em>execution</em> phases, we need to change the creator to take an <code class="docutils literal"><span class="pre">OpDesc</span></code> protobuf message in a <code class="docutils literal"><span class="pre">ProgramDesc</span></code> and inserts corresponding <code class="docutils literal"><span class="pre">OpDesc</span></code> messages into the <code class="docutils literal"><span class="pre">ProgramDesc</span></code> message.</li> <li>As we decided to separate the <em>compilation</em> and the <em>execution</em> phases, we need to change the creator to take an <code class="docutils literal"><span class="pre">OpDesc</span></code> protobuf message in a <code class="docutils literal"><span class="pre">ProgramDesc</span></code> and inserts corresponding <code class="docutils literal"><span class="pre">OpDesc</span></code> messages into the <code class="docutils literal"><span class="pre">ProgramDesc</span></code> message.</li>
<li>Some operator&#8217;s gradient computation requires more than one gradient operators. For example, the gradient of <em>minus</em> consists of two operators &#8211; an identity operaotr and a scale operator. So we need to make the registration mechanism to support the mapping from an operator to a set of operators for gradient computation.</li> <li>For some operators, the gradient computation can be written in terms of existing operators. For example, the gradient of <em>minus</em> operator consists of two operators &#8211; an <em>identity</em> operator followed by a <em>scale</em> operator. Hence the registration mechanism needs to support mapping from an operator to a set of operators for the gradient computation.</li>
</ol> </ol>
</div> </div>
<div class="section" id="the-current-implementation"> <div class="section" id="the-current-implementation">
<span id="the-current-implementation"></span><h2>The Current Implementation<a class="headerlink" href="#the-current-implementation" title="永久链接至标题"></a></h2> <span id="the-current-implementation"></span><h2>The Current Implementation<a class="headerlink" href="#the-current-implementation" title="永久链接至标题"></a></h2>
<p>The C++ class <code class="docutils literal"><span class="pre">OpInfos</span></code> store in a association map which key is the operator type. The <code class="docutils literal"><span class="pre">grad_op_type</span></code> indicate associated gradient operator type. Operator can create gradient operator by <code class="docutils literal"><span class="pre">OpInfo::creator_</span></code> of gradient. The pseudo code is</p> <p>Instances of the C++ class <code class="docutils literal"><span class="pre">OpInfo</span></code> are stored an associative map whose key is the operator type. The <code class="docutils literal"><span class="pre">grad_op_type</span></code> indicates the associated gradient operator type. An operator can create the gradient operator by invoking <code class="docutils literal"><span class="pre">OpInfo::creator_</span></code> of the gradient operator. The pseudo code is as follows</p>
<div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="k">struct</span> <span class="n">OpInfo</span> <span class="p">{</span> <div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="k">struct</span> <span class="n">OpInfo</span> <span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">function</span><span class="o">&lt;</span><span class="n">OperatorBase</span><span class="o">*</span><span class="p">(...)</span><span class="o">&gt;</span> <span class="n">creator_</span><span class="p">;</span> <span class="n">std</span><span class="o">::</span><span class="n">function</span><span class="o">&lt;</span><span class="n">OperatorBase</span><span class="o">*</span><span class="p">(...)</span><span class="o">&gt;</span> <span class="n">creator_</span><span class="p">;</span>
<span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">grad_op_type_</span><span class="p">;</span> <span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">grad_op_type_</span><span class="p">;</span>
...@@ -225,20 +225,20 @@ ...@@ -225,20 +225,20 @@
</div> </div>
<div class="section" id="proposed-solution"> <div class="section" id="proposed-solution">
<span id="proposed-solution"></span><h2>Proposed Solution<a class="headerlink" href="#proposed-solution" title="Permalink to this headline"></a></h2> <span id="proposed-solution"></span><h2>Proposed Solution<a class="headerlink" href="#proposed-solution" title="Permalink to this headline"></a></h2>
<p>The mapping relationship between an operator and its gradient operators is a function. The interface of that function is:</p> <p>The mapping relationship between an operator and its gradient operators is a function. The interface of this function is:</p>
<div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="c1">// (OpDesc) --&gt; vector&lt;OpDesc&gt;</span> <div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="c1">// (OpDesc) --&gt; vector&lt;OpDesc&gt;</span>
<span class="n">std</span><span class="o">::</span><span class="n">function</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">OpDescBind</span><span class="o">&gt;</span><span class="p">(</span><span class="k">const</span> <span class="n">OpDescBind</span><span class="o">&amp;</span><span class="p">)</span><span class="o">&gt;</span><span class="p">;</span> <span class="n">std</span><span class="o">::</span><span class="n">function</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">OpDescBind</span><span class="o">&gt;</span><span class="p">(</span><span class="k">const</span> <span class="n">OpDescBind</span><span class="o">&amp;</span><span class="p">)</span><span class="o">&gt;</span><span class="p">;</span>
</pre></div> </pre></div>
</div> </div>
<p>The function takes an <code class="docutils literal"><span class="pre">OpDescBind</span></code> of the forward operator and returns one or many gradient operator descriptions. <code class="docutils literal"><span class="pre">OpDescBind</span></code> is a C++ wrapper for protobuf message <code class="docutils literal"><span class="pre">OpDesc</span></code> to manipulate <code class="docutils literal"><span class="pre">OpDesc</span></code> fast.</p> <p>The function takes an <code class="docutils literal"><span class="pre">OpDescBind</span></code> of the forward operator and returns one or many gradient operator descriptions. <code class="docutils literal"><span class="pre">OpDescBind</span></code> is a C++ wrapper for the protobuf message <code class="docutils literal"><span class="pre">OpDesc</span></code> for rapid manipulation of <code class="docutils literal"><span class="pre">OpDesc</span></code>.</p>
<p>The <code class="docutils literal"><span class="pre">GradOpDescMaker</span></code> will be registered in <code class="docutils literal"><span class="pre">OpInfo</span></code>, to replace <code class="docutils literal"><span class="pre">grad_op_type_</span></code> field. The <code class="docutils literal"><span class="pre">OpInfo</span></code> should be</p> <p>The <code class="docutils literal"><span class="pre">GradOpDescMaker</span></code> will be registered in <code class="docutils literal"><span class="pre">OpInfo</span></code> and will replace the <code class="docutils literal"><span class="pre">grad_op_type_</span></code> field. The <code class="docutils literal"><span class="pre">OpInfo</span></code> should look like</p>
<div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="k">struct</span> <span class="n">OpInfo</span> <span class="p">{</span> <div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="k">struct</span> <span class="n">OpInfo</span> <span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">function</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">unique_ptr</span><span class="o">&lt;</span><span class="n">OpDescBind</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="k">const</span> <span class="n">OpDescBind</span><span class="o">&amp;</span><span class="p">)</span><span class="o">&gt;</span> <span class="n">grad_op_maker_</span><span class="p">;</span> <span class="n">std</span><span class="o">::</span><span class="n">function</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">unique_ptr</span><span class="o">&lt;</span><span class="n">OpDescBind</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="k">const</span> <span class="n">OpDescBind</span><span class="o">&amp;</span><span class="p">)</span><span class="o">&gt;</span> <span class="n">grad_op_maker_</span><span class="p">;</span>
<span class="p">...</span> <span class="p">...</span>
<span class="p">};</span> <span class="p">};</span>
</pre></div> </pre></div>
</div> </div>
<p>The <code class="docutils literal"><span class="pre">grad_op_maker_</span></code> is <code class="docutils literal"><span class="pre">nullptr</span></code> if the operator does not have associated gradient operators.</p> <p>The <code class="docutils literal"><span class="pre">grad_op_maker_</span></code> is a <code class="docutils literal"><span class="pre">nullptr</span></code> if the operator does not have any associated gradient operators.</p>
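<p>As a rough illustration of this mapping, the sketch below builds the two gradient descriptions for <em>minus</em> (an identity operator and a scale operator) from a forward description. It is only a sketch under stated assumptions: <code class="docutils literal"><span class="pre">OpDescSketch</span></code>, <code class="docutils literal"><span class="pre">OpInfoSketch</span></code>, <code class="docutils literal"><span class="pre">GradName</span></code>, <code class="docutils literal"><span class="pre">MinusGradMakerSketch</span></code>, the slot names and the <code class="docutils literal"><span class="pre">@GRAD</span></code> suffix are hypothetical stand-ins chosen for this example, not the real <code class="docutils literal"><span class="pre">OpDescBind</span></code> API.</p>

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for OpDescBind: only the fields this sketch needs.
struct OpDescSketch {
  std::string type;
  std::map<std::string, std::string> inputs;   // slot name -> variable name
  std::map<std::string, std::string> outputs;  // slot name -> variable name
  std::map<std::string, float> attrs;          // attribute name -> value
};

// Same shape as the proposed grad_op_maker_ field, over the stand-in type.
using GradOpMakerSketch = std::function<
    std::vector<std::unique_ptr<OpDescSketch>>(const OpDescSketch&)>;

struct OpInfoSketch {
  GradOpMakerSketch grad_op_maker_;  // empty if the operator has no gradient
};

// Assumed naming convention for gradient variables (not taken from the text).
inline std::string GradName(const std::string& var) { return var + "@GRAD"; }

// Out = X - Y, so dX = dOut (identity) and dY = -dOut (scale by -1).
std::vector<std::unique_ptr<OpDescSketch>> MinusGradMakerSketch(
    const OpDescSketch& fwd) {
  const std::string d_out = GradName(fwd.outputs.at("Out"));

  auto identity = std::make_unique<OpDescSketch>();
  identity->type = "identity";
  identity->inputs["X"] = d_out;
  identity->outputs["Out"] = GradName(fwd.inputs.at("X"));

  auto scale = std::make_unique<OpDescSketch>();
  scale->type = "scale";
  scale->inputs["X"] = d_out;
  scale->outputs["Out"] = GradName(fwd.inputs.at("Y"));
  scale->attrs["scale"] = -1.0f;

  std::vector<std::unique_ptr<OpDescSketch>> grads;
  grads.push_back(std::move(identity));
  grads.push_back(std::move(scale));
  return grads;
}
```

<p>A caller would first check whether <code class="docutils literal"><span class="pre">grad_op_maker_</span></code> is set, and only then invoke it on the forward description and append the returned descriptions to the program.</p>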
<p>We propose a base class called <code class="docutils literal"><span class="pre">GradOpDescMakerBase</span></code> to let operator developers generate <code class="docutils literal"><span class="pre">Gradient</span> <span class="pre">Operators</span></code> easily. The public interface of that class is</p> <p>We propose a base class called <code class="docutils literal"><span class="pre">GradOpDescMakerBase</span></code> to let operator developers generate <code class="docutils literal"><span class="pre">Gradient</span> <span class="pre">Operators</span></code> easily. The public interface of that class is</p>
<div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">GradOpDescMakerBase</span> <span class="p">{</span> <div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">GradOpDescMakerBase</span> <span class="p">{</span>
<span class="k">public</span><span class="o">:</span> <span class="k">public</span><span class="o">:</span>
...@@ -257,7 +257,7 @@ ...@@ -257,7 +257,7 @@
</pre></div> </pre></div>
</div> </div>
<p>We can write many helper functions since the <code class="docutils literal"><span class="pre">GradOpDescMakerBase</span></code> is a class now. The basic helper functions get the variables of <code class="docutils literal"><span class="pre">Input</span></code>, <code class="docutils literal"><span class="pre">Output</span></code>, <code class="docutils literal"><span class="pre">InputGradient</span></code> and <code class="docutils literal"><span class="pre">OutputGradient</span></code> in the forwarding operator.</p> <p>We can write many helper functions since the <code class="docutils literal"><span class="pre">GradOpDescMakerBase</span></code> is a class now. The basic helper functions get the variables of <code class="docutils literal"><span class="pre">Input</span></code>, <code class="docutils literal"><span class="pre">Output</span></code>, <code class="docutils literal"><span class="pre">InputGradient</span></code> and <code class="docutils literal"><span class="pre">OutputGradient</span></code> in the forwarding operator.</p>
<p>We should chagne register macros at the same time. In the current solution, there is no difference between forwarding operators and backward operators. So <code class="docutils literal"><span class="pre">REGISTER_OP</span></code> just register one operator. If the <code class="docutils literal"><span class="pre">REGISTER_OPERATOR</span></code> contains <code class="docutils literal"><span class="pre">OpProtoAndCheckerMaker</span></code> and <code class="docutils literal"><span class="pre">GradOpDescMaker</span></code>, we just list them in the same macro. It can be done by a macro contains <code class="docutils literal"><span class="pre">__VA_ARGS__</span></code>.</p> <p>We should change the registration macros at the same time. In the current solution, there is no difference between forward operators and backward operators, so <code class="docutils literal"><span class="pre">REGISTER_OP</span></code> just registers one operator. If <code class="docutils literal"><span class="pre">REGISTER_OPERATOR</span></code> contains <code class="docutils literal"><span class="pre">OpProtoAndCheckerMaker</span></code> and <code class="docutils literal"><span class="pre">GradOpDescMaker</span></code>, we just list them in the same macro. This can be done with a macro that uses <code class="docutils literal"><span class="pre">__VA_ARGS__</span></code>.</p>
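<p>One possible way to let a single macro accept an optional list of makers is to forward <code class="docutils literal"><span class="pre">__VA_ARGS__</span></code> into a variadic template, where each listed type fills its own slot of the operator info. The sketch below only illustrates that idea. Every name in it (<code class="docutils literal"><span class="pre">RegInfoSketch</span></code>, <code class="docutils literal"><span class="pre">RegistrySketch</span></code>, <code class="docutils literal"><span class="pre">OperatorRegistrarSketch</span></code>, <code class="docutils literal"><span class="pre">REGISTER_OPERATOR_SKETCH</span></code> and the <code class="docutils literal"><span class="pre">Fill</span></code> convention) is hypothetical and is not the actual registration code.</p>

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified operator info: flags record what was registered.
struct RegInfoSketch {
  bool has_creator = false;
  bool has_proto_maker = false;
  bool has_grad_maker = false;
};

// Function-local static avoids initialization-order problems between
// the registry and the static registrar objects defined below.
inline std::map<std::string, RegInfoSketch>& RegistrySketch() {
  static std::map<std::string, RegInfoSketch> registry;
  return registry;
}

// Assumed convention: each registrable part fills its own slot.
struct MinusOpSketch {
  static void Fill(RegInfoSketch* info) { info->has_creator = true; }
};
struct MinusProtoMakerSketch {
  static void Fill(RegInfoSketch* info) { info->has_proto_maker = true; }
};
struct MinusGradDescMakerSketch {
  static void Fill(RegInfoSketch* info) { info->has_grad_maker = true; }
};
struct MinusGradOpSketch {
  static void Fill(RegInfoSketch* info) { info->has_creator = true; }
};

template <typename... Parts>
struct OperatorRegistrarSketch {
  explicit OperatorRegistrarSketch(const char* op_type) {
    RegInfoSketch info;
    // Expand the pack: call Fill once for every listed part, in order.
    int expand[] = {0, (Parts::Fill(&info), 0)...};
    (void)expand;
    RegistrySketch()[op_type] = info;
  }
};

// __VA_ARGS__ carries however many parts the developer lists.
#define REGISTER_OPERATOR_SKETCH(op_type, ...)           \
  static OperatorRegistrarSketch<__VA_ARGS__>            \
      registrar_sketch_##op_type(#op_type)

// A forward operator lists its proto maker and gradient maker ...
REGISTER_OPERATOR_SKETCH(minus, MinusOpSketch, MinusProtoMakerSketch,
                         MinusGradDescMakerSketch);
// ... while a hand-written gradient operator lists only itself.
REGISTER_OPERATOR_SKETCH(minus_grad, MinusGradOpSketch);
```

<p>With this shape, an operator that has no proto maker or gradient maker simply omits them from the macro call, and the corresponding slots stay unset.</p>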
<p>The user interface should be</p> <p>The user interface should be</p>
<div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="n">vector</span><span class="o">&lt;</span><span class="n">OpDesc</span><span class="o">&gt;</span> <span class="n">MinusOpGradMaker</span><span class="p">(</span><span class="n">OpDesc</span><span class="p">)</span> <span class="p">{...}</span> <div class="highlight-cpp"><div class="highlight"><pre><span></span><span class="n">vector</span><span class="o">&lt;</span><span class="n">OpDesc</span><span class="o">&gt;</span> <span class="n">MinusOpGradMaker</span><span class="p">(</span><span class="n">OpDesc</span><span class="p">)</span> <span class="p">{...}</span>
<span class="n">REGISTER_OPERATOR</span><span class="p">(</span><span class="n">minus</span><span class="p">,</span> <span class="n">MinusOp</span><span class="p">,</span> <span class="n">MinusOpProtoAndCheckerMaker</span><span class="p">,</span> <span class="n">MinusOpGradMaker</span><span class="p">);</span> <span class="n">REGISTER_OPERATOR</span><span class="p">(</span><span class="n">minus</span><span class="p">,</span> <span class="n">MinusOp</span><span class="p">,</span> <span class="n">MinusOpProtoAndCheckerMaker</span><span class="p">,</span> <span class="n">MinusOpGradMaker</span><span class="p">);</span>
......