提交 ad5dbb39 编写于 作者: T Travis CI

Deploy to GitHub Pages: 696874ac

上级 8cdddf35
## Optimizer Design
### The Problem
A PaddlePaddle program, or a block, is a sequence of operators operating variables. A training program needs to do three kinds of works:
1. the forward pass, which computes intermediate results and the cost(s),
1. the backward pass, which derives gradients from intermediate results and costs, and
1. the optimization pass, which update model parameters to optimize the cost(s).
These works rely on three kinds of operators:
1. forward operators,
1. gradient operators, and
1. optimization operators.
It's true that users should be able to create all these operators manually by calling some low-level API, but it would be much more convenient if they could only describe the forward pass and let PaddlePaddle create the backward and optimization operators automatically.
In this design, we propose a high-level API that automatically derives the optimisation pass and operators from the forward pass.
### High-level Python API to describe the training process
1. User write code to describe the network:
```python
images = layer.data("images")
labels = layer.data("labels")
w1 = pd.var("w1")
b1 = pd.var("b1")
hidden = layer.fc(images, w=w1, b=b1)
cost = layer.mse(hidden, labels)
```
The above code snippet will create forward operators in [Block](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/block.md).
2. Users create a certain kind of Optimizer with some argument.
```python
optimizer = AdagradOptimizer(learing_rate=0.001)
```
3. Users use the optimizer to `minimize` a certain `cost` through updating parameters in parameter_list.
```python
opt_op_list = optimizer.minimize(cost, parameter_list=[w1, b1])
```
The above code snippet will create gradient and optimization operators in Block. The return value of `minimize()` is list of optimization operators that will be run by session.
4. Users use Session/Executor to run this opt_op_list as target to do training.
```python
sess.run(target= opt_op_list, ...)
```
#### Optimizer Python interface:
```python
class Optimizer(object):
"""Optimizer Base class.
"""
def __init__(self):
pass
def create_backward_pass(self, loss, parameter_list=None):
"""
create and add gradient Operators in BlockDesc to Compute gradients of `loss`
for parameters in parameter_list
Args:
loss: an variable generated by cost function.
parameter_list: parameters that need to compute gradient and update to optimize the lost.
Returns:
list of (parameters, gradients) pair.
"""
return None
def create_optimization_pass(self, parameters_and_grads):
"""Add optimization operators to update gradients to variables.
Args:
parameters_and_grads: a list of (variable, gradient) pair to update.
Returns:
optmization_op_list: a list of optimization operator that will update parameter using gradient.
"""
return None
def minimize(self, loss, parameter_list):
"""Add operations to minimize `loss` by updating `parameter_list`.
This method combines interface `create_backward_pass()` and
`create_optimization_pass()` into one.
"""
params_grads = self.create_backward_pass(loss, parameter_list)
update_ops = self.create_optimization_pass(params_grads)
return update_ops
```
Users can inherit the Optimizer above to create their own Optimizer with some special logic, such as AdagradOptimizer.
......@@ -214,3 +214,7 @@ def fc_layer(input, size, ...):
out.writer = op
return out
```
## Optimizer
[Optimizer Design Doc](./optimizer.md)
<!DOCTYPE html>
<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Optimizer Design &mdash; PaddlePaddle documentation</title>
<link rel="stylesheet" href="../_static/css/theme.css" type="text/css" />
<link rel="index" title="Index"
href="../genindex.html"/>
<link rel="search" title="Search" href="../search.html"/>
<link rel="top" title="PaddlePaddle documentation" href="../index.html"/>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/css/perfect-scrollbar.min.css" type="text/css" />
<link rel="stylesheet" href="../_static/css/override.css" type="text/css" />
<script>
var _hmt = _hmt || [];
(function() {
var hm = document.createElement("script");
hm.src = "//hm.baidu.com/hm.js?b9a314ab40d04d805655aab1deee08ba";
var s = document.getElementsByTagName("script")[0];
s.parentNode.insertBefore(hm, s);
})();
</script>
<script src="../_static/js/modernizr.min.js"></script>
</head>
<body class="wy-body-for-nav" role="document">
<header class="site-header">
<div class="site-logo">
<a href="/"><img src="../_static/images/PP_w.png"></a>
</div>
<div class="site-nav-links">
<div class="site-menu">
<a class="fork-on-github" href="https://github.com/PaddlePaddle/Paddle" target="_blank"><i class="fa fa-github"></i>Fork me on Github</a>
<div class="language-switcher dropdown">
<a type="button" data-toggle="dropdown">
<span>English</span>
<i class="fa fa-angle-up"></i>
<i class="fa fa-angle-down"></i>
</a>
<ul class="dropdown-menu">
<li><a href="/doc_cn">中文</a></li>
<li><a href="/doc">English</a></li>
</ul>
</div>
<ul class="site-page-links">
<li><a href="/">Home</a></li>
</ul>
</div>
<div class="doc-module">
<ul>
<li class="toctree-l1"><a class="reference internal" href="../getstarted/index_en.html">GET STARTED</a></li>
<li class="toctree-l1"><a class="reference internal" href="../howto/index_en.html">HOW TO</a></li>
<li class="toctree-l1"><a class="reference internal" href="../api/index_en.html">API</a></li>
</ul>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div>
</div>
</header>
<div class="main-content-wrap">
<nav class="doc-menu-vertical" role="navigation">
<ul>
<li class="toctree-l1"><a class="reference internal" href="../getstarted/index_en.html">GET STARTED</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../getstarted/build_and_install/index_en.html">Install and Build</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../getstarted/build_and_install/docker_install_en.html">PaddlePaddle in Docker Containers</a></li>
<li class="toctree-l3"><a class="reference internal" href="../getstarted/build_and_install/build_from_source_en.html">Installing from Sources</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../howto/index_en.html">HOW TO</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../howto/usage/cmd_parameter/index_en.html">Set Command-line Parameters</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cmd_parameter/use_case_en.html">Use Case</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cmd_parameter/arguments_en.html">Argument Outline</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cmd_parameter/detail_introduction_en.html">Detail Description</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../howto/usage/cluster/cluster_train_en.html">Run Distributed Training</a></li>
<li class="toctree-l2"><a class="reference internal" href="../howto/usage/k8s/k8s_en.html">Paddle On Kubernetes</a></li>
<li class="toctree-l2"><a class="reference internal" href="../howto/usage/k8s/k8s_aws_en.html">Distributed PaddlePaddle Training on AWS with Kubernetes</a></li>
<li class="toctree-l2"><a class="reference internal" href="../howto/dev/build_en.html">Build PaddlePaddle from Source Code and Run Unit Test</a></li>
<li class="toctree-l2"><a class="reference internal" href="../howto/dev/new_layer_en.html">Write New Layers</a></li>
<li class="toctree-l2"><a class="reference internal" href="../howto/dev/contribute_to_paddle_en.html">Contribute Code</a></li>
<li class="toctree-l2"><a class="reference internal" href="../howto/deep_model/rnn/index_en.html">RNN Models</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../howto/deep_model/rnn/rnn_config_en.html">RNN Configuration</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../howto/optimization/gpu_profiling_en.html">Tune GPU Performance</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../api/index_en.html">API</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../api/v2/model_configs.html">Model Configuration</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/activation.html">Activation</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/layer.html">Layers</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/evaluators.html">Evaluators</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/optimizer.html">Optimizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/pooling.html">Pooling</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/networks.html">Networks</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/attr.html">Parameter Attribute</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../api/v2/data.html">Data Reader Interface and DataSets</a></li>
<li class="toctree-l2"><a class="reference internal" href="../api/v2/run_logic.html">Training and Inference</a></li>
</ul>
</li>
</ul>
</nav>
<section class="doc-content-wrap">
<div role="navigation" aria-label="breadcrumbs navigation">
<ul class="wy-breadcrumbs">
<li>Optimizer Design</li>
</ul>
</div>
<div class="wy-nav-content" id="doc-content">
<div class="rst-content">
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<div class="section" id="optimizer-design">
<span id="optimizer-design"></span><h1>Optimizer Design<a class="headerlink" href="#optimizer-design" title="Permalink to this headline"></a></h1>
<div class="section" id="the-problem">
<span id="the-problem"></span><h2>The Problem<a class="headerlink" href="#the-problem" title="Permalink to this headline"></a></h2>
<p>A PaddlePaddle program, or a block, is a sequence of operators operating variables. A training program needs to do three kinds of works:</p>
<ol class="simple">
<li>the forward pass, which computes intermediate results and the cost(s),</li>
<li>the backward pass, which derives gradients from intermediate results and costs, and</li>
<li>the optimization pass, which update model parameters to optimize the cost(s).</li>
</ol>
<p>These works rely on three kinds of operators:</p>
<ol class="simple">
<li>forward operators,</li>
<li>gradient operators, and</li>
<li>optimization operators.</li>
</ol>
<p>It&#8217;s true that users should be able to create all these operators manually by calling some low-level API, but it would be much more convenient if they could only describe the forward pass and let PaddlePaddle create the backward and optimization operators automatically.</p>
<p>In this design, we propose a high-level API that automatically derives the optimisation pass and operators from the forward pass.</p>
</div>
<div class="section" id="high-level-python-api-to-describe-the-training-process">
<span id="high-level-python-api-to-describe-the-training-process"></span><h2>High-level Python API to describe the training process<a class="headerlink" href="#high-level-python-api-to-describe-the-training-process" title="Permalink to this headline"></a></h2>
<ol>
<li><p class="first">User write code to describe the network:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">images</span> <span class="o">=</span> <span class="n">layer</span><span class="o">.</span><span class="n">data</span><span class="p">(</span><span class="s2">&quot;images&quot;</span><span class="p">)</span>
<span class="n">labels</span> <span class="o">=</span> <span class="n">layer</span><span class="o">.</span><span class="n">data</span><span class="p">(</span><span class="s2">&quot;labels&quot;</span><span class="p">)</span>
<span class="n">w1</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">var</span><span class="p">(</span><span class="s2">&quot;w1&quot;</span><span class="p">)</span>
<span class="n">b1</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">var</span><span class="p">(</span><span class="s2">&quot;b1&quot;</span><span class="p">)</span>
<span class="n">hidden</span> <span class="o">=</span> <span class="n">layer</span><span class="o">.</span><span class="n">fc</span><span class="p">(</span><span class="n">images</span><span class="p">,</span> <span class="n">w</span><span class="o">=</span><span class="n">w1</span><span class="p">,</span> <span class="n">b</span><span class="o">=</span><span class="n">b1</span><span class="p">)</span>
<span class="n">cost</span> <span class="o">=</span> <span class="n">layer</span><span class="o">.</span><span class="n">mse</span><span class="p">(</span><span class="n">hidden</span><span class="p">,</span> <span class="n">labels</span><span class="p">)</span>
</pre></div>
</div>
<p>The above code snippet will create forward operators in <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/block.md">Block</a>.</p>
</li>
</ol>
<ol>
<li><p class="first">Users create a certain kind of Optimizer with some argument.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">optimizer</span> <span class="o">=</span> <span class="n">AdagradOptimizer</span><span class="p">(</span><span class="n">learing_rate</span><span class="o">=</span><span class="mf">0.001</span><span class="p">)</span>
</pre></div>
</div>
</li>
<li><p class="first">Users use the optimizer to <code class="docutils literal"><span class="pre">minimize</span></code> a certain <code class="docutils literal"><span class="pre">cost</span></code> through updating parameters in parameter_list.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">opt_op_list</span> <span class="o">=</span> <span class="n">optimizer</span><span class="o">.</span><span class="n">minimize</span><span class="p">(</span><span class="n">cost</span><span class="p">,</span> <span class="n">parameter_list</span><span class="o">=</span><span class="p">[</span><span class="n">w1</span><span class="p">,</span> <span class="n">b1</span><span class="p">])</span>
</pre></div>
</div>
<p>The above code snippet will create gradient and optimization operators in Block. The return value of <code class="docutils literal"><span class="pre">minimize()</span></code> is list of optimization operators that will be run by session.</p>
</li>
<li><p class="first">Users use Session/Executor to run this opt_op_list as target to do training.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">target</span><span class="o">=</span> <span class="n">opt_op_list</span><span class="p">,</span> <span class="o">...</span><span class="p">)</span>
</pre></div>
</div>
</li>
</ol>
<div class="section" id="optimizer-python-interface">
<span id="optimizer-python-interface"></span><h3>Optimizer Python interface:<a class="headerlink" href="#optimizer-python-interface" title="Permalink to this headline"></a></h3>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">Optimizer</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;Optimizer Base class.</span>
<span class="sd"> &quot;&quot;&quot;</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">pass</span>
<span class="k">def</span> <span class="nf">create_backward_pass</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">loss</span><span class="p">,</span> <span class="n">parameter_list</span><span class="o">=</span><span class="bp">None</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;</span>
<span class="sd"> create and add gradient Operators in BlockDesc to Compute gradients of `loss`</span>
<span class="sd"> for parameters in parameter_list</span>
<span class="sd"> Args:</span>
<span class="sd"> loss: an variable generated by cost function.</span>
<span class="sd"> parameter_list: parameters that need to compute gradient and update to optimize the lost.</span>
<span class="sd"> Returns:</span>
<span class="sd"> list of (parameters, gradients) pair.</span>
<span class="sd"> &quot;&quot;&quot;</span>
<span class="k">return</span> <span class="bp">None</span>
<span class="k">def</span> <span class="nf">create_optimization_pass</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">parameters_and_grads</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;Add optimization operators to update gradients to variables.</span>
<span class="sd"> Args:</span>
<span class="sd"> parameters_and_grads: a list of (variable, gradient) pair to update.</span>
<span class="sd"> Returns:</span>
<span class="sd"> optmization_op_list: a list of optimization operator that will update parameter using gradient.</span>
<span class="sd"> &quot;&quot;&quot;</span>
<span class="k">return</span> <span class="bp">None</span>
<span class="k">def</span> <span class="nf">minimize</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">loss</span><span class="p">,</span> <span class="n">parameter_list</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;Add operations to minimize `loss` by updating `parameter_list`.</span>
<span class="sd"> This method combines interface `create_backward_pass()` and</span>
<span class="sd"> `create_optimization_pass()` into one.</span>
<span class="sd"> &quot;&quot;&quot;</span>
<span class="n">params_grads</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">create_backward_pass</span><span class="p">(</span><span class="n">loss</span><span class="p">,</span> <span class="n">parameter_list</span><span class="p">)</span>
<span class="n">update_ops</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">create_optimization_pass</span><span class="p">(</span><span class="n">params_grads</span><span class="p">)</span>
<span class="k">return</span> <span class="n">update_ops</span>
</pre></div>
</div>
<p>Users can inherit the Optimizer above to create their own Optimizer with some special logic, such as AdagradOptimizer.</p>
</div>
</div>
</div>
</div>
</div>
<footer>
<hr/>
<div role="contentinfo">
<p>
&copy; Copyright 2016, PaddlePaddle developers.
</p>
</div>
Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/snide/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT:'../',
VERSION:'',
COLLAPSE_INDEX:false,
FILE_SUFFIX:'.html',
HAS_SOURCE: true,
SOURCELINK_SUFFIX: ".txt",
};
</script>
<script type="text/javascript" src="../_static/jquery.js"></script>
<script type="text/javascript" src="../_static/underscore.js"></script>
<script type="text/javascript" src="../_static/doctools.js"></script>
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
<script type="text/javascript" src="../_static/js/theme.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script>
<script src="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/js/perfect-scrollbar.jquery.min.js"></script>
<script src="../_static/js/paddle_doc_init.js"></script>
</body>
</html>
\ No newline at end of file
......@@ -375,6 +375,10 @@
</div>
</div>
</div>
<div class="section" id="optimizer">
<span id="optimizer"></span><h2>Optimizer<a class="headerlink" href="#optimizer" title="Permalink to this headline"></a></h2>
<p><a class="reference internal" href="optimizer.html"><span class="doc">Optimizer Design Doc</span></a></p>
</div>
</div>
......
因为 它太大了无法显示 source diff 。你可以改为 查看blob
## Optimizer Design
### The Problem
A PaddlePaddle program, or a block, is a sequence of operators operating variables. A training program needs to do three kinds of works:
1. the forward pass, which computes intermediate results and the cost(s),
1. the backward pass, which derives gradients from intermediate results and costs, and
1. the optimization pass, which update model parameters to optimize the cost(s).
These works rely on three kinds of operators:
1. forward operators,
1. gradient operators, and
1. optimization operators.
It's true that users should be able to create all these operators manually by calling some low-level API, but it would be much more convenient if they could only describe the forward pass and let PaddlePaddle create the backward and optimization operators automatically.
In this design, we propose a high-level API that automatically derives the optimisation pass and operators from the forward pass.
### High-level Python API to describe the training process
1. User write code to describe the network:
```python
images = layer.data("images")
labels = layer.data("labels")
w1 = pd.var("w1")
b1 = pd.var("b1")
hidden = layer.fc(images, w=w1, b=b1)
cost = layer.mse(hidden, labels)
```
The above code snippet will create forward operators in [Block](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/block.md).
2. Users create a certain kind of Optimizer with some argument.
```python
optimizer = AdagradOptimizer(learing_rate=0.001)
```
3. Users use the optimizer to `minimize` a certain `cost` through updating parameters in parameter_list.
```python
opt_op_list = optimizer.minimize(cost, parameter_list=[w1, b1])
```
The above code snippet will create gradient and optimization operators in Block. The return value of `minimize()` is list of optimization operators that will be run by session.
4. Users use Session/Executor to run this opt_op_list as target to do training.
```python
sess.run(target= opt_op_list, ...)
```
#### Optimizer Python interface:
```python
class Optimizer(object):
"""Optimizer Base class.
"""
def __init__(self):
pass
def create_backward_pass(self, loss, parameter_list=None):
"""
create and add gradient Operators in BlockDesc to Compute gradients of `loss`
for parameters in parameter_list
Args:
loss: an variable generated by cost function.
parameter_list: parameters that need to compute gradient and update to optimize the lost.
Returns:
list of (parameters, gradients) pair.
"""
return None
def create_optimization_pass(self, parameters_and_grads):
"""Add optimization operators to update gradients to variables.
Args:
parameters_and_grads: a list of (variable, gradient) pair to update.
Returns:
optmization_op_list: a list of optimization operator that will update parameter using gradient.
"""
return None
def minimize(self, loss, parameter_list):
"""Add operations to minimize `loss` by updating `parameter_list`.
This method combines interface `create_backward_pass()` and
`create_optimization_pass()` into one.
"""
params_grads = self.create_backward_pass(loss, parameter_list)
update_ops = self.create_optimization_pass(params_grads)
return update_ops
```
Users can inherit the Optimizer above to create their own Optimizer with some special logic, such as AdagradOptimizer.
......@@ -214,3 +214,7 @@ def fc_layer(input, size, ...):
out.writer = op
return out
```
## Optimizer
[Optimizer Design Doc](./optimizer.md)
<!DOCTYPE html>
<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Optimizer Design &mdash; PaddlePaddle 文档</title>
<link rel="stylesheet" href="../_static/css/theme.css" type="text/css" />
<link rel="index" title="索引"
href="../genindex.html"/>
<link rel="search" title="搜索" href="../search.html"/>
<link rel="top" title="PaddlePaddle 文档" href="../index.html"/>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/css/perfect-scrollbar.min.css" type="text/css" />
<link rel="stylesheet" href="../_static/css/override.css" type="text/css" />
<script>
var _hmt = _hmt || [];
(function() {
var hm = document.createElement("script");
hm.src = "//hm.baidu.com/hm.js?b9a314ab40d04d805655aab1deee08ba";
var s = document.getElementsByTagName("script")[0];
s.parentNode.insertBefore(hm, s);
})();
</script>
<script src="../_static/js/modernizr.min.js"></script>
</head>
<body class="wy-body-for-nav" role="document">
<header class="site-header">
<div class="site-logo">
<a href="/"><img src="../_static/images/PP_w.png"></a>
</div>
<div class="site-nav-links">
<div class="site-menu">
<a class="fork-on-github" href="https://github.com/PaddlePaddle/Paddle" target="_blank"><i class="fa fa-github"></i>Fork me on Github</a>
<div class="language-switcher dropdown">
<a type="button" data-toggle="dropdown">
<span>English</span>
<i class="fa fa-angle-up"></i>
<i class="fa fa-angle-down"></i>
</a>
<ul class="dropdown-menu">
<li><a href="/doc_cn">中文</a></li>
<li><a href="/doc">English</a></li>
</ul>
</div>
<ul class="site-page-links">
<li><a href="/">Home</a></li>
</ul>
</div>
<div class="doc-module">
<ul>
<li class="toctree-l1"><a class="reference internal" href="../getstarted/index_cn.html">新手入门</a></li>
<li class="toctree-l1"><a class="reference internal" href="../howto/index_cn.html">进阶指南</a></li>
<li class="toctree-l1"><a class="reference internal" href="../api/index_cn.html">API</a></li>
<li class="toctree-l1"><a class="reference internal" href="../faq/index_cn.html">FAQ</a></li>
</ul>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div>
</div>
</header>
<div class="main-content-wrap">
<nav class="doc-menu-vertical" role="navigation">
<ul>
<li class="toctree-l1"><a class="reference internal" href="../getstarted/index_cn.html">新手入门</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../getstarted/build_and_install/index_cn.html">安装与编译</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../getstarted/build_and_install/docker_install_cn.html">PaddlePaddle的Docker容器使用方式</a></li>
<li class="toctree-l3"><a class="reference internal" href="../getstarted/build_and_install/cmake/build_from_source_cn.html">PaddlePaddle的编译选项</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../getstarted/concepts/use_concepts_cn.html">基本使用概念</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../howto/index_cn.html">进阶指南</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../howto/usage/cmd_parameter/index_cn.html">设置命令行参数</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cmd_parameter/use_case_cn.html">使用案例</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cmd_parameter/arguments_cn.html">参数概述</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cmd_parameter/detail_introduction_cn.html">细节描述</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../howto/usage/cluster/cluster_train_cn.html">运行分布式训练</a></li>
<li class="toctree-l2"><a class="reference internal" href="../howto/usage/k8s/k8s_basis_cn.html">Kubernetes 简介</a></li>
<li class="toctree-l2"><a class="reference internal" href="../howto/usage/k8s/k8s_cn.html">Kubernetes单机训练</a></li>
<li class="toctree-l2"><a class="reference internal" href="../howto/usage/k8s/k8s_distributed_cn.html">Kubernetes分布式训练</a></li>
<li class="toctree-l2"><a class="reference internal" href="../howto/dev/build_cn.html">编译PaddlePaddle和运行单元测试</a></li>
<li class="toctree-l2"><a class="reference internal" href="../howto/dev/write_docs_cn.html">如何贡献/修改文档</a></li>
<li class="toctree-l2"><a class="reference internal" href="../howto/dev/contribute_to_paddle_cn.html">如何贡献代码</a></li>
<li class="toctree-l2"><a class="reference internal" href="../howto/deep_model/rnn/index_cn.html">RNN相关模型</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../howto/deep_model/rnn/rnn_config_cn.html">RNN配置</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/deep_model/rnn/recurrent_group_cn.html">Recurrent Group教程</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/deep_model/rnn/hierarchical_layer_cn.html">支持双层序列作为输入的Layer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/deep_model/rnn/hrnn_rnn_api_compare_cn.html">单双层RNN API对比介绍</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../howto/optimization/gpu_profiling_cn.html">GPU性能分析与调优</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../api/index_cn.html">API</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../api/v2/model_configs.html">模型配置</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/activation.html">Activation</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/layer.html">Layers</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/evaluators.html">Evaluators</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/optimizer.html">Optimizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/pooling.html">Pooling</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/networks.html">Networks</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/attr.html">Parameter Attribute</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../api/v2/data.html">数据访问</a></li>
<li class="toctree-l2"><a class="reference internal" href="../api/v2/run_logic.html">训练与应用</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../faq/index_cn.html">FAQ</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../faq/build_and_install/index_cn.html">编译安装与单元测试</a></li>
<li class="toctree-l2"><a class="reference internal" href="../faq/model/index_cn.html">模型配置</a></li>
<li class="toctree-l2"><a class="reference internal" href="../faq/parameter/index_cn.html">参数设置</a></li>
<li class="toctree-l2"><a class="reference internal" href="../faq/local/index_cn.html">本地训练与预测</a></li>
<li class="toctree-l2"><a class="reference internal" href="../faq/cluster/index_cn.html">集群训练与预测</a></li>
</ul>
</li>
</ul>
</nav>
<section class="doc-content-wrap">
<div role="navigation" aria-label="breadcrumbs navigation">
<ul class="wy-breadcrumbs">
<li>Optimizer Design</li>
</ul>
</div>
<div class="wy-nav-content" id="doc-content">
<div class="rst-content">
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<div class="section" id="optimizer-design">
<span id="optimizer-design"></span><h1>Optimizer Design<a class="headerlink" href="#optimizer-design" title="永久链接至标题"></a></h1>
<div class="section" id="the-problem">
<span id="the-problem"></span><h2>The Problem<a class="headerlink" href="#the-problem" title="永久链接至标题"></a></h2>
<p>A PaddlePaddle program, or a block, is a sequence of operators operating variables. A training program needs to do three kinds of works:</p>
<ol class="simple">
<li>the forward pass, which computes intermediate results and the cost(s),</li>
<li>the backward pass, which derives gradients from intermediate results and costs, and</li>
<li>the optimization pass, which update model parameters to optimize the cost(s).</li>
</ol>
<p>These works rely on three kinds of operators:</p>
<ol class="simple">
<li>forward operators,</li>
<li>gradient operators, and</li>
<li>optimization operators.</li>
</ol>
<p>It&#8217;s true that users should be able to create all these operators manually by calling some low-level API, but it would be much more convenient if they could only describe the forward pass and let PaddlePaddle create the backward and optimization operators automatically.</p>
<p>In this design, we propose a high-level API that automatically derives the optimisation pass and operators from the forward pass.</p>
</div>
<div class="section" id="high-level-python-api-to-describe-the-training-process">
<span id="high-level-python-api-to-describe-the-training-process"></span><h2>High-level Python API to describe the training process<a class="headerlink" href="#high-level-python-api-to-describe-the-training-process" title="永久链接至标题"></a></h2>
<ol>
<li><p class="first">User write code to describe the network:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">images</span> <span class="o">=</span> <span class="n">layer</span><span class="o">.</span><span class="n">data</span><span class="p">(</span><span class="s2">&quot;images&quot;</span><span class="p">)</span>
<span class="n">labels</span> <span class="o">=</span> <span class="n">layer</span><span class="o">.</span><span class="n">data</span><span class="p">(</span><span class="s2">&quot;labels&quot;</span><span class="p">)</span>
<span class="n">w1</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">var</span><span class="p">(</span><span class="s2">&quot;w1&quot;</span><span class="p">)</span>
<span class="n">b1</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">var</span><span class="p">(</span><span class="s2">&quot;b1&quot;</span><span class="p">)</span>
<span class="n">hidden</span> <span class="o">=</span> <span class="n">layer</span><span class="o">.</span><span class="n">fc</span><span class="p">(</span><span class="n">images</span><span class="p">,</span> <span class="n">w</span><span class="o">=</span><span class="n">w1</span><span class="p">,</span> <span class="n">b</span><span class="o">=</span><span class="n">b1</span><span class="p">)</span>
<span class="n">cost</span> <span class="o">=</span> <span class="n">layer</span><span class="o">.</span><span class="n">mse</span><span class="p">(</span><span class="n">hidden</span><span class="p">,</span> <span class="n">labels</span><span class="p">)</span>
</pre></div>
</div>
<p>The above code snippet will create forward operators in <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/block.md">Block</a>.</p>
</li>
</ol>
<ol>
<li><p class="first">Users create a certain kind of Optimizer with some argument.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">optimizer</span> <span class="o">=</span> <span class="n">AdagradOptimizer</span><span class="p">(</span><span class="n">learing_rate</span><span class="o">=</span><span class="mf">0.001</span><span class="p">)</span>
</pre></div>
</div>
</li>
<li><p class="first">Users use the optimizer to <code class="docutils literal"><span class="pre">minimize</span></code> a certain <code class="docutils literal"><span class="pre">cost</span></code> through updating parameters in parameter_list.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">opt_op_list</span> <span class="o">=</span> <span class="n">optimizer</span><span class="o">.</span><span class="n">minimize</span><span class="p">(</span><span class="n">cost</span><span class="p">,</span> <span class="n">parameter_list</span><span class="o">=</span><span class="p">[</span><span class="n">w1</span><span class="p">,</span> <span class="n">b1</span><span class="p">])</span>
</pre></div>
</div>
<p>The above code snippet will create gradient and optimization operators in Block. The return value of <code class="docutils literal"><span class="pre">minimize()</span></code> is list of optimization operators that will be run by session.</p>
</li>
<li><p class="first">Users use Session/Executor to run this opt_op_list as target to do training.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">target</span><span class="o">=</span> <span class="n">opt_op_list</span><span class="p">,</span> <span class="o">...</span><span class="p">)</span>
</pre></div>
</div>
</li>
</ol>
<div class="section" id="optimizer-python-interface">
<span id="optimizer-python-interface"></span><h3>Optimizer Python interface:<a class="headerlink" href="#optimizer-python-interface" title="永久链接至标题"></a></h3>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">Optimizer</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;Optimizer Base class.</span>
<span class="sd"> &quot;&quot;&quot;</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">pass</span>
<span class="k">def</span> <span class="nf">create_backward_pass</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">loss</span><span class="p">,</span> <span class="n">parameter_list</span><span class="o">=</span><span class="bp">None</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;</span>
<span class="sd"> create and add gradient Operators in BlockDesc to Compute gradients of `loss`</span>
<span class="sd"> for parameters in parameter_list</span>
<span class="sd"> Args:</span>
<span class="sd"> loss: an variable generated by cost function.</span>
<span class="sd"> parameter_list: parameters that need to compute gradient and update to optimize the lost.</span>
<span class="sd"> Returns:</span>
<span class="sd"> list of (parameters, gradients) pair.</span>
<span class="sd"> &quot;&quot;&quot;</span>
<span class="k">return</span> <span class="bp">None</span>
<span class="k">def</span> <span class="nf">create_optimization_pass</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">parameters_and_grads</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;Add optimization operators to update gradients to variables.</span>
<span class="sd"> Args:</span>
<span class="sd"> parameters_and_grads: a list of (variable, gradient) pair to update.</span>
<span class="sd"> Returns:</span>
<span class="sd"> optmization_op_list: a list of optimization operator that will update parameter using gradient.</span>
<span class="sd"> &quot;&quot;&quot;</span>
<span class="k">return</span> <span class="bp">None</span>
<span class="k">def</span> <span class="nf">minimize</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">loss</span><span class="p">,</span> <span class="n">parameter_list</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;Add operations to minimize `loss` by updating `parameter_list`.</span>
<span class="sd"> This method combines interface `create_backward_pass()` and</span>
<span class="sd"> `create_optimization_pass()` into one.</span>
<span class="sd"> &quot;&quot;&quot;</span>
<span class="n">params_grads</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">create_backward_pass</span><span class="p">(</span><span class="n">loss</span><span class="p">,</span> <span class="n">parameter_list</span><span class="p">)</span>
<span class="n">update_ops</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">create_optimization_pass</span><span class="p">(</span><span class="n">params_grads</span><span class="p">)</span>
<span class="k">return</span> <span class="n">update_ops</span>
</pre></div>
</div>
<p>Users can inherit the Optimizer above to create their own Optimizer with some special logic, such as AdagradOptimizer.</p>
</div>
</div>
</div>
</div>
</div>
<footer>
<hr/>
<div role="contentinfo">
<p>
&copy; Copyright 2016, PaddlePaddle developers.
</p>
</div>
Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/snide/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT:'../',
VERSION:'',
COLLAPSE_INDEX:false,
FILE_SUFFIX:'.html',
HAS_SOURCE: true,
SOURCELINK_SUFFIX: ".txt",
};
</script>
<script type="text/javascript" src="../_static/jquery.js"></script>
<script type="text/javascript" src="../_static/underscore.js"></script>
<script type="text/javascript" src="../_static/doctools.js"></script>
<script type="text/javascript" src="../_static/translations.js"></script>
<script type="text/javascript" src="https://cdn.bootcss.com/mathjax/2.7.0/MathJax.js"></script>
<script type="text/javascript" src="../_static/js/theme.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script>
<script src="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/js/perfect-scrollbar.jquery.min.js"></script>
<script src="../_static/js/paddle_doc_init.js"></script>
</body>
</html>
\ No newline at end of file
......@@ -389,6 +389,10 @@
</div>
</div>
</div>
<div class="section" id="optimizer">
<span id="optimizer"></span><h2>Optimizer<a class="headerlink" href="#optimizer" title="永久链接至标题"></a></h2>
<p><a class="reference internal" href="optimizer.html"><span class="doc">Optimizer Design Doc</span></a></p>
</div>
</div>
......
此差异已折叠。
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册