Commit 42e29e07 authored by chrisxu2014, committed by GitHub

Merge branch 'gh-pages' into gh-pages

@@ -15,7 +15,6 @@
.norsTitle {font-size: 22px; font-family: Microsoft Yahei; font-weight: normal; color: #333; margin: 35px 0 25px 0; }
</style>
</head>
<body link="#0000cc">
<div id="wrapper_wrapper">
<div id="content_left">
...
@@ -10,7 +10,7 @@ A dataset is a list of files in *RecordIO* format. A RecordIO file consists of c
## Task Queue

As mentioned in [distributed training design doc](./README.md), a *task* is a data shard that the master server assigns to the trainer process to train on. A task consists of one or multiple *chunks* from one or multiple files. The master server maintains *task queues* to track the training progress.

### Task Queue Creation

@@ -21,23 +21,23 @@ As mentioned in [distributed training design doc](./README.md), a *task* is a da
```go
func (m *RPCServer) ReportDataset(Paths []string, dummy *int) error {
}
```
1. The master server will scan through each RecordIO file to generate the *chunk index* and determine how many chunks each file has. A chunk can be referenced by the file path and the index of the chunk within the file. The chunk index is an in-memory data structure that enables fast access to each chunk, and the index of a chunk within its file is an integer starting from 0, representing the n-th chunk in the file.
   The definition of the chunk is:

   ```go
   type Chunk struct {
       Idx   int // index of the chunk within the file
       Path  string
       Index recordio.Index // chunk index
   }
   ```
1. Chunks are grouped into tasks, and tasks are filled into the todo queue. The pending queue and the done queue are initialized empty.
   The definition of the task is:

   ```go
   type Task struct {
       Index  int
       Chunks []Chunk
   }
   ```

...
@@ -55,7 +55,7 @@ The trainer selection process is encapsulated in the C API function:

```c
int paddle_begin_init_params(paddle_pserver_client* client, const char* config_proto);
```

The selected trainer's call to `paddle_begin_init_params` will return 1, and the other trainers' calls to `paddle_begin_init_params` will return 0. `paddle_get_params` will block until initialization is completed. As illustrated below:

<img src="./src/pserver_init.png">
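To make the flow above concrete, here is a minimal trainer-side sketch in C++ (trainers link against the C client library). It is illustrative only, not the actual trainer implementation: the header name is hypothetical, and the `paddle_finish_init_params` and `paddle_get_params` signatures are assumed from the declarations quoted later in this document.

```cpp
// Illustrative sketch only -- not the actual trainer code.
#include <cstdlib>
#include <string>
#include <vector>
#include "paddle_pserver_client.h"  // hypothetical header declaring the C API

void init_parameters(paddle_pserver_client* client,
                     std::vector<paddle_parameter>& params,
                     const std::vector<std::string>& names,
                     const std::vector<std::string>& configs) {
  if (paddle_begin_init_params(client)) {
    // Returned 1: this trainer was selected to initialize the parameters.
    for (size_t i = 0; i < params.size(); ++i) {
      int ok = paddle_init_param(
          client, params[i],
          reinterpret_cast<const unsigned char*>(configs[i].data()),
          static_cast<int>(configs[i].size()));
      // On failure, restart the whole initialization, or simply exit and
      // let the cluster management system restart the trainer.
      if (ok != 0) std::exit(1);
    }
    paddle_finish_init_params(client);
  }
  // Every trainer, selected or not, fetches the parameters here;
  // paddle_get_params blocks until initialization is completed on the
  // parameter servers.
  std::vector<const char*> cnames;
  for (const std::string& n : names) cnames.push_back(n.c_str());
  paddle_get_params(client, cnames.data(), params.data(),
                    static_cast<int>(params.size()));
}
```

Note how the non-selected trainers never touch `paddle_init_param`; the blocking `paddle_get_params` call is what synchronizes them with the selected trainer.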
@@ -89,16 +89,13 @@ void paddle_pserver_client_release(paddle_pserver_client* client);
 *
 * paddle_begin_init_params will be called from multiple trainers,
 * only one trainer will be selected to initialize the parameters on
 * parameter servers. Other trainers need to get the initialized
 * parameters from parameter servers using @paddle_get_params.
 *
 * @return 1 if the trainer is selected to initialize parameter
 * servers, otherwise 0.
 */
int paddle_begin_init_params(paddle_pserver_client* client);
/**
 * @brief paddle_init_param initializes the parameter on parameter
@@ -106,12 +103,13 @@ int paddle_begin_init_params(paddle_pserver_client* client, const char* pserver_
 *
 * @param param the parameter to initialize.
 * @param param_config_proto the configuration for the parameter.
 * @param config_len the length of param_config_proto
 * @return 0 if successful, otherwise -1. On failure, the trainer
 * needs to restart the entire initialization process (starting from
 * @paddle_begin_init_param). Or simply exit the program and wait for
 * the cluster management system to restart the trainer.
 */
int paddle_init_param(paddle_pserver_client* client, paddle_parameter param, const unsigned char* param_config_proto, int config_len);
/**
 * @brief paddle_finish_init_params tells parameter servers client has
@@ -138,6 +136,9 @@ int paddle_send_grads(paddle_pserver_client* client, const paddle_gradient* grad
/**
 * @brief paddle_get_params gets parameters from parameter servers.
 *
 * paddle_get_params will block until parameters are initialized on
 * the parameter servers.
 *
 * @param names the array of names of the parameters to get.
 * @param dst the destination array of parameters to save to.
 * @param len the length of the names array and the paddle_parameter
...
# Design Doc: The C++ Class `Parameters`
`Parameters` is a concept we designed in the Paddle V2 API. `Parameters` is a container of parameters that lets Paddle share parameters between topologies. We described the usage of `Parameters` in [api.md](./api.md).

We previously used Python to implement `Parameters` when designing the V2 API. The current implementation has several defects:

* We just use `memcpy` to share `Parameters` between topologies, which is very inefficient.
* We did not implement sharing `Parameters` during training; we only trigger a `memcpy` when training starts.

It is necessary to implement `Parameters` on the C++ side. However, this amounts to a code refactoring of Paddle, because Paddle was designed to train only one topology before, i.e., each GradientMachine contains its `Parameter` as a data member. In the current Paddle implementation, there are three concepts associated with `Parameters`:

1. `paddle::Parameter`. A `Parameters` is a container for `paddle::Parameter`.
   It is evident that we should use `paddle::Parameter` when developing `Parameters`.
   However, the `Parameter` class contains many functions and does not have a clear interface.
   It contains `create/store Parameter`, `serialize/deserialize`, `optimize (e.g. SGD)`, and `randomize/zero`.
   When developing `Parameters`, we only use the `create/store Parameter` functionality.
   We should extract the functionalities of `Parameter` into separate classes to clean up the Paddle C++ implementation.

2. `paddle::GradientMachine` and its sub-classes, e.g., `paddle::MultiGradientMachine`, `paddle::NeuralNetwork`.
   We should pass `Parameters` to `paddle::GradientMachine` when calling `forward/backward`, to avoid `memcpy` between topologies.
   Also, we should handle multi-GPU/CPU training, because `forward` and `backward` run on multiple GPUs and CPUs.
   `Parameters` should dispatch the parameter value to each device, and gather the parameter gradient from each device.

3. `paddle::ParameterUpdater`. The ParameterUpdater is used to update parameters in Paddle.
   So `Parameters` should be used by `paddle::ParameterUpdater`, and `paddle::ParameterUpdater` should optimize `Parameters` (e.g., by SGD).

The step-by-step approach for implementing `Parameters` in the Paddle C++ core is listed below. Each step should be a separate PR, and the PRs can be merged into Paddle one by one.

1. Clean up the `paddle::Parameter` interface. Extract the functionalities of `paddle::Parameter` to prepare for the implementation of `Parameters`.

2. Implement a `Parameters` class. It simply stores the `paddle::Parameter` objects inside. Make `GradientMachine` use `Parameters` as a class member.

3. Make `Parameters` support multi-CPU and multi-GPU training, to prepare for sharing `Parameter` between topologies.
   Because we need to share `Parameters` between topologies, it is the responsibility of `Parameters` to exchange parameters between GPUs.
   `GradientMachine` should not handle the parameter exchange, because a `GradientMachine` is only used to train one topology and we need to support training many topologies in Paddle, i.e., many GradientMachines could use one `Parameters` object.

   * We should use a global function, not a member function of `Parameters`, to exchange parameters between GPUs. `MultiGradientMachine` invokes this function, which takes `Parameters` as its input.
   * `MultiGradientMachine` contains many functionalities. Extracting the parameter-exchange logic would make `MultiGradientMachine` clearer and simpler.

4. Make `Parameters` an argument of the `forward/backward` functions, not a data member of `GradientMachine`. For example, `forward` could be `forward(const Parameters& params, ...)` and `backward` could be `backward(Parameters* params, ...)`. After this step, Paddle can share `Parameters` between topologies (see the sketch after this list).

5. `ParameterUpdater` is invoked by `GradientMachine` and `Trainer`, but it updates `Parameters`. At the end of this refactoring, we can change `ParameterUpdater` to use `Parameters` directly, which makes `ParameterUpdater`'s implementation clearer.
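To make the target of steps 2 and 4 concrete, here is a minimal C++ sketch under the assumptions above. Everything except the names `paddle::Parameter`, `Parameters`, `forward`, and `backward` is illustrative, not the final interface.

```cpp
// Minimal sketch of steps 2 and 4 -- illustrative, not the final design.
#include <map>
#include <memory>
#include <string>

namespace paddle {

class Parameter { /* stand-in for the existing paddle::Parameter */ };

// Step 2: Parameters is a container of paddle::Parameter, keyed by name.
class Parameters {
 public:
  std::shared_ptr<Parameter> get(const std::string& name) const {
    auto it = params_.find(name);
    return it == params_.end() ? nullptr : it->second;
  }
  void set(const std::string& name, std::shared_ptr<Parameter> param) {
    params_[name] = std::move(param);
  }

 private:
  std::map<std::string, std::shared_ptr<Parameter>> params_;
};

// Step 4: Parameters is an argument of forward/backward rather than a data
// member, so many GradientMachines can train against one Parameters object.
class GradientMachine {
 public:
  virtual ~GradientMachine() = default;
  virtual void forward(const Parameters& params /*, in/out arguments */) = 0;
  virtual void backward(Parameters* params /*, gradient arguments */) = 0;
};

}  // namespace paddle
```

With this shape, two topologies share parameters by calling `machineA->forward(params, ...)` and `machineB->forward(params, ...)` on the same `params` object, with no `memcpy` involved.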
@@ -189,15 +189,15 @@
<h2>Classification<a class="headerlink" href="#classification" title="Permalink to this headline"></a></h2>
<div class="section" id="classification-error">
<h3>classification_error<a class="headerlink" href="#classification-error" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.evaluator.</code><code class="descname">classification_error</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>Classification Error Evaluator. It will print error rate for classification.</p>
<p>The classification error is:</p>
<div class="math">
\[classification\_error = \frac{NumOfWrongPredicts}{NumOfAllSamples}\]</div>
<p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">classification_evaluator</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">prob</span><span class="p">,</span><span class="n">label</span><span class="o">=</span><span class="n">lbl</span><span class="p">)</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
@@ -228,12 +228,12 @@ important this sample is.</li>
</div>
<div class="section" id="auc">
<h3>auc<a class="headerlink" href="#auc" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.evaluator.</code><code class="descname">auc</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>Auc Evaluator which adapts to binary classification.</p>
<p>The simple usage:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">evaluator</span><span class="o">.</span><span class="n">auc</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
@@ -256,12 +256,12 @@ important this sample is.</li>
</div>
<div class="section" id="ctc-error">
<h3>ctc_error<a class="headerlink" href="#ctc-error" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.evaluator.</code><code class="descname">ctc_error</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>This evaluator is to calculate sequence-to-sequence edit distance.</p>
<p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">ctc_evaluator</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="nb">input</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="n">lbl</span><span class="p">)</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
@@ -283,32 +283,68 @@ label for ctc</li>
</div>
<div class="section" id="chunk">
<h3>chunk<a class="headerlink" href="#chunk" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.evaluator.</code><code class="descname">chunk</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>Chunk evaluator is used to evaluate segment labelling accuracy for a
sequence. It calculates precision, recall and F1 scores for the chunk detection.</p>
<p>To use the chunk evaluator, several concepts need to be clarified first.</p>
<ul class="simple">
<li><strong>Chunk type</strong> is the type of the whole chunk, and a chunk consists of one or several words. (For example in NER, ORG for organization name, PER for person name etc.)</li>
<li><strong>Tag type</strong> indicates the position of a word in a chunk. (B for begin, I for inside, E for end, S for single)</li>
</ul>
<p>We can name a label by combining tag type and chunk type. (i.e. B-ORG for the beginning of an organization name)</p>
<p>The construction of the label dictionary should obey the following rules:</p>
<ul class="simple">
<li>Use one of the listed labelling schemes. These schemes differ in how they indicate chunk boundaries.</li>
</ul>
<div class="highlight-text"><div class="highlight"><pre><span></span>Scheme Description
plain Use the same label for the whole chunk.
IOB Two labels for chunk type X, B-X for chunk beginning and I-X for chunk inside.
IOE Two labels for chunk type X, E-X for chunk ending and I-X for chunk inside.
IOBES Four labels for chunk type X, B-X for chunk beginning, I-X for chunk inside, E-X for chunk end and S-X for single word chunk.
</pre></div>
</div>
<p>To make it clear, let&#8217;s illustrate with an NER example.
Assuming that there are three named entity types including ORG, PER and LOC, which are called &#8216;chunk type&#8217; here,
if the &#8216;IOB&#8217; scheme were used, the label set would be extended to a set including B-ORG, I-ORG, B-PER, I-PER, B-LOC, I-LOC and O,
in which B-ORG stands for the beginning of ORG and I-ORG for the inside of ORG.
Prefixes, which are called &#8216;tag type&#8217; here, are added to chunk types, and there are two tag types, B and I.
Of course, the training data should be labeled accordingly.</p>
<ul class="simple">
<li>Mapping is done correctly by the listed equations and assigning protocol.</li>
</ul>
<p>The following equations extract the tag type and chunk type from a label:</p>
<div class="highlight-text"><div class="highlight"><pre><span></span>tagType = label % numTagType
chunkType = label / numTagType
otherChunkType = numChunkTypes
</pre></div>
</div>
<p>The following table shows the mapping rule between tagType and tag type in each scheme:</p>
<div class="highlight-text"><div class="highlight"><pre><span></span>Scheme Begin Inside End Single
plain 0 - - -
IOB 0 1 - -
IOE - 0 1 -
IOBES 0 1 2 3
</pre></div>
</div>
<p>Continuing the NER example, the label dict should look like this to satisfy the above equations:</p>
<div class="highlight-text"><div class="highlight"><pre><span></span>B-ORG 0
I-ORG 1
B-PER 2
I-PER 3
B-LOC 4
I-LOC 5
O 6
</pre></div>
</div>
<p>In this example, chunkType has three values: 0 for ORG, 1 for PER, 2 for LOC, and because the scheme is
&#8220;IOB&#8221;, tagType has two values: 0 for B and 1 for I.
Here we will use I-LOC to explain the above mapping rules in detail.
For I-LOC, the label id is 5, so we can get tagType=1 and chunkType=2, which means I-LOC is a part of NER chunk LOC
and the tag is I.</p>
<p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">evaluator</span><span class="o">.</span><span class="n">chunk</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">label</span><span class="p">,</span> <span class="n">chunk_scheme</span><span class="p">,</span> <span class="n">num_chunk_types</span><span class="p">)</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
@@ -333,9 +369,9 @@ The tag type for each of the scheme is shown as follows:</p>
</div>
<div class="section" id="precision-recall">
<h3>precision_recall<a class="headerlink" href="#precision-recall" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.evaluator.</code><code class="descname">precision_recall</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>An Evaluator to calculate precision and recall, F1-score.
It is adapted to tasks with multiple labels.</p>
<ul class="simple">
@@ -345,7 +381,7 @@ F1-score of all labels.</li>
F1-score of this label.</li>
</ul>
<p>The simple usage:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">precision_evaluator</span><span class="o">.</span><span class="n">recall</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
@@ -372,13 +408,13 @@ F1-score of this label.</li>
<h2>Rank<a class="headerlink" href="#rank" title="Permalink to this headline"></a></h2>
<div class="section" id="pnpair">
<h3>pnpair<a class="headerlink" href="#pnpair" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.evaluator.</code><code class="descname">pnpair</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>Positive-negative pair rate Evaluator which adapts to rank task like
learning to rank. This evaluator must contain at least three layers.</p>
<p>The simple usage:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">evaluator</span><span class="o">.</span><span class="n">pnpair</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">info</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
@@ -405,12 +441,12 @@ learning to rank. This evaluator must contain at least three layers.</p>
<h2>Utils<a class="headerlink" href="#utils" title="Permalink to this headline"></a></h2>
<div class="section" id="sum">
<h3>sum<a class="headerlink" href="#sum" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.evaluator.</code><code class="descname">sum</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>An Evaluator to sum the result of input.</p>
<p>The simple usage:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">evaluator</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
@@ -432,12 +468,12 @@ learning to rank. This evaluator must contain at least three layers.</p>
</div>
<div class="section" id="column-sum">
<h3>column_sum<a class="headerlink" href="#column-sum" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.evaluator.</code><code class="descname">column_sum</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>This Evaluator is used to sum the last column of input.</p>
<p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">column_evaluator</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
@@ -460,12 +496,12 @@ learning to rank. This evaluator must contain at least three layers.</p>
<h2>Print<a class="headerlink" href="#print" title="Permalink to this headline"></a></h2>
<div class="section" id="classification-error-printer">
<h3>classification_error_printer<a class="headerlink" href="#classification-error-printer" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.evaluator.</code><code class="descname">classification_error_printer</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>This Evaluator is used to print the classification error of each sample.</p>
<p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">classification_error_evaluator</span><span class="o">.</span><span class="n">printer</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
@@ -486,13 +522,13 @@ learning to rank. This evaluator must contain at least three layers.</p>
</div>
<div class="section" id="gradient-printer">
<h3>gradient_printer<a class="headerlink" href="#gradient-printer" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.evaluator.</code><code class="descname">gradient_printer</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>This Evaluator is used to print the gradient of input layers. It contains
one or more input layers.</p>
<p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">gradient_evaluator</span><span class="o">.</span><span class="n">printer</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
@@ -512,14 +548,14 @@ one or more input layers.</p>
</div>
<div class="section" id="maxid-printer">
<h3>maxid_printer<a class="headerlink" href="#maxid-printer" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.evaluator.</code><code class="descname">maxid_printer</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>This Evaluator is used to print maximum top k values and their indexes
of each row of input layers. It contains one or more input layers.
k is specified by num_results.</p>
<p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">maxid_evaluator</span><span class="o">.</span><span class="n">printer</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
@@ -541,9 +577,9 @@ It is 1 by default.</li>
</div>
<div class="section" id="maxframe-printer">
<h3>maxframe_printer<a class="headerlink" href="#maxframe-printer" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.evaluator.</code><code class="descname">maxframe_printer</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>This Evaluator is used to print the top k frames of each input layers.
The input layers should contain sequences info or sequences type.
k is specified by num_results.
@@ -553,7 +589,7 @@ It contains one or more input layers.</p>
<p class="last">The width of each frame is 1.</p>
</div>
<p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">maxframe_evaluator</span><span class="o">.</span><span class="n">printer</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
@@ -573,9 +609,9 @@ It contains one or more input layers.</p>
</div>
<div class="section" id="seqtext-printer">
<h3>seqtext_printer<a class="headerlink" href="#seqtext-printer" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.evaluator.</code><code class="descname">seqtext_printer</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>Sequence text printer will print text according to index matrix and a
dictionary. There can be multiple input to this layer:</p>
<p>1. If there is no id_input, the input must be a matrix containing
@@ -607,7 +643,7 @@ the sequence of indices;</p>
<p>Typically SequenceTextPrinter layer takes output of maxid or RecurrentGroup
with maxid (when generating) as an input.</p>
<p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">seqtext_evaluator</span><span class="o">.</span><span class="n">printer</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">maxid</span><span class="p">,</span>
<span class="n">id_input</span><span class="o">=</span><span class="n">sample_id</span><span class="p">,</span>
<span class="n">dict_file</span><span class="o">=</span><span class="n">dict_file</span><span class="p">,</span>
<span class="n">result_file</span><span class="o">=</span><span class="n">result_file</span><span class="p">)</span>
@@ -647,13 +683,13 @@ Default is True. No space is added if set to False.</li>
</div>
<div class="section" id="value-printer">
<h3>value_printer<a class="headerlink" href="#value-printer" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.evaluator.</code><code class="descname">value_printer</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>This Evaluator is used to print the values of input layers. It contains
one or more input layers.</p>
<p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">value_evaluator</span><span class="o">.</span><span class="n">printer</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
...
@@ -189,35 +189,10 @@
<h2>Data layer<a class="headerlink" href="#data-layer" title="Permalink to this headline"></a></h2>
<div class="section" id="data">
<span id="api-v2-layer-data"></span><h3>data<a class="headerlink" href="#data" title="Permalink to this headline"></a></h3>
<dl class="attribute">
<dt>
<code class="descclassname">paddle.v2.layer.</code><code class="descname">data</code></dt>
<dd><p>alias of <code class="xref py py-class docutils literal"><span class="pre">name</span></code></p>
</dd></dl>
</div>
@@ -228,12 +203,12 @@
<span id="api-v2-layer-fc"></span><h3>fc<a class="headerlink" href="#fc" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">fc</code></dt>
<dd><p>Helper for declaring a fully connected layer.</p>
<p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">fc</span> <span class="o">=</span> <span class="n">fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer</span><span class="p">,</span>
<span class="n">size</span><span class="o">=</span><span class="mi">1024</span><span class="p">,</span>
<span class="n">act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">activation</span><span class="o">.</span><span class="n">Linear</span><span class="p">(),</span>
<span class="n">bias_attr</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
</pre></div>
</div>
@@ -250,7 +225,7 @@
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer|list|tuple</em>) &#8211; The input layer. Could be a list/tuple of input layer.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; The layer dimension.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type. Default is tanh.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute|list.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Any</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something not type of paddle.v2.attr.ParameterAttribute. None will get a
@@ -274,13 +249,13 @@ default Bias.</li>
<h3>selective_fc<a class="headerlink" href="#selective-fc" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">selective_fc</code></dt>
<dd><p>Selective fully connected layer. Different from fc, the output
of this layer may be sparse. It requires an additional input to indicate
several selected columns for output. If the selected columns are not
specified, selective_fc acts exactly like fc.</p>
<p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">sel_fc</span> <span class="o">=</span> <span class="n">selective_fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="nb">input</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">128</span><span class="p">,</span> <span class="n">act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">activation</span><span class="o">.</span><span class="n">Tanh</span><span class="p">())</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
...@@ -294,7 +269,7 @@ specified, selective_fc acts exactly like fc.</p> ...@@ -294,7 +269,7 @@ specified, selective_fc acts exactly like fc.</p>
sparse binary matrix, and treat as the mask of selective fc. sparse binary matrix, and treat as the mask of selective fc.
If is None, acts exactly like fc.</li> If is None, acts exactly like fc.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; The layer dimension.</li> <li><strong>size</strong> (<em>int</em>) &#8211; The layer dimension.</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; Activation Type. Default is tanh.</li> <li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type. Default is tanh.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute.</li> <li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Any</em>) &#8211; The Bias Attribute. If no bias, then pass False or <li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Any</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something not type of paddle.v2.attr.ParameterAttribute. None will get a something not type of paddle.v2.attr.ParameterAttribute. None will get a
...@@ -321,7 +296,7 @@ default Bias.</li> ...@@ -321,7 +296,7 @@ default Bias.</li>
<h3>conv_operator<a class="headerlink" href="#conv-operator" title="Permalink to this headline"></a></h3> <h3>conv_operator<a class="headerlink" href="#conv-operator" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">conv_operator</code><span class="sig-paren">(</span><em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">conv_operator</code></dt>
<dd><p>Different from img_conv, conv_op is an Operator, which can be used <dd><p>Different from img_conv, conv_op is an Operator, which can be used
in mixed. And conv_op takes two inputs to perform convolution. in mixed. And conv_op takes two inputs to perform convolution.
The first input is the image and the second is filter kernel. It only The first input is the image and the second is filter kernel. It only
...@@ -369,7 +344,7 @@ the filter&#8217;s shape can be (filter_size, filter_size_y).</li> ...@@ -369,7 +344,7 @@ the filter&#8217;s shape can be (filter_size, filter_size_y).</li>
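<p>A hedged usage sketch follows; since conv_op is an Operator, it is assumed to be consumed by a mixed layer, and the layer names img and filt are illustrative only.</p>
<div class="highlight-python"><div class="highlight"><pre># convolve img with the dynamically supplied kernel filt
op = conv_operator(img=img, filter=filt,
                   filter_size=3,
                   num_filters=64,
                   num_channels=64)
# the mixed layer's size may need to be set explicitly
out = mixed(input=op)
</pre></div>
</div>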
<h3>conv_projection<a class="headerlink" href="#conv-projection" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">conv_projection</code></dt>
<dd><p>Different from img_conv and conv_op, conv_projection is a Projection,
which can be used in mixed and concat. It uses cudnn to implement
conv and only supports GPU mode.</p>
...@@ -417,7 +392,7 @@ the filter&#8217;s shape can be (filter_size, filter_size_y).</li>
<h3>conv_shift<a class="headerlink" href="#conv-shift" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">conv_shift</code></dt>
<dd><dl class="docutils">
<dt>This layer performs cyclic convolution for two inputs. For example:</dt>
<dd><ul class="first last simple">
...@@ -470,7 +445,7 @@ the right size (which is the end of array) to the left.</li>
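<p>A minimal usage sketch, assuming layer1 and layer2 are existing layers and layer2&#8217;s size is odd as this layer requires:</p>
<div class="highlight-python"><div class="highlight"><pre>shifted = conv_shift(a=layer1, b=layer2)
</pre></div>
</div>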
<h3>img_conv<a class="headerlink" href="#img-conv" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">img_conv</code></dt>
<dd><p>Convolution layer for image. Paddle currently supports both square and
non-square input.</p>
<p>For the details of the convolution layer, please refer to UFLDL&#8217;s <a class="reference external" href="http://ufldl.stanford.edu/tutorial/supervised/FeatureExtractionUsingConvolution/">convolution</a>.</p>
...@@ -494,7 +469,7 @@ rest channels will be processed by rest group of filters.</p>
<span class="n">num_channels</span><span class="o">=</span><span class="mi">8</span><span class="p">,</span>
<span class="n">num_filters</span><span class="o">=</span><span class="mi">16</span><span class="p">,</span> <span class="n">stride</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
<span class="n">bias_attr</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
<span class="n">act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">activation</span><span class="o">.</span><span class="n">Relu</span><span class="p">())</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
...@@ -510,7 +485,7 @@ two image dimension.</li>
currently supports rectangular filters, the filter&#8217;s
shape will be (filter_size, filter_size_y).</li>
<li><strong>num_filters</strong> &#8211; Each filter group&#8217;s number of filters</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation type. Default is tanh</li>
<li><strong>groups</strong> (<em>int</em>) &#8211; Group size of filters.</li>
<li><strong>stride</strong> (<em>int|tuple|list</em>) &#8211; The x dimension of the stride. Or input a tuple for two image
dimensions.</li>
...@@ -548,7 +523,7 @@ otherwise layer_type has to be either &#8220;exconv&#8221; or
<span id="api-v2-layer-context-projection"></span><h3>context_projection<a class="headerlink" href="#context-projection" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">context_projection</code></dt>
<dd><p>Context Projection.</p>
<p>It simply reorganizes the input sequence, combining &#8220;context_len&#8221; elements
into one context from context_start. &#8220;context_start&#8221; will be set to
...@@ -591,7 +566,7 @@ parameter attribute is set by this parameter.</li>
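<p>A hedged sketch, assuming a sequence input seq of dimension 256; since context_projection concatenates context_len frames, the enclosing mixed layer&#8217;s size is 3 * 256 here:</p>
<div class="highlight-python"><div class="highlight"><pre># a projection must live inside a mixed layer
context = mixed(size=768,
                input=context_projection(input=seq,
                                         context_len=3,
                                         context_start=-1))
</pre></div>
</div>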
<h3>img_pool<a class="headerlink" href="#img-pool" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">img_pool</code></dt>
<dd><p>Image pooling Layer.</p>
<p>For the details of the pooling layer, please refer to UFLDL&#8217;s <a class="reference external" href="http://ufldl.stanford.edu/tutorial/supervised/Pooling/">pooling</a>.</p>
<ul class="simple">
...@@ -655,7 +630,7 @@ Defalut is True. If set false, Otherwise use floor.</li>
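<p>No usage example accompanies this layer above, so here is a hedged sketch modeled on the other layers; the argument set and paddle.v2.pooling.Max are assumptions to check against the API.</p>
<div class="highlight-python"><div class="highlight"><pre># 3x3 max pooling with stride 2 over a conv output with 8 channels
maxpool = img_pool(input=conv,
                   pool_size=3,
                   num_channels=8,
                   stride=2,
                   padding=1,
                   pool_type=paddle.v2.pooling.Max())
</pre></div>
</div>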
<h3>spp<a class="headerlink" href="#spp" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">spp</code></dt>
<dd><p>Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition.
For details, please refer to
<a class="reference external" href="https://arxiv.org/abs/1406.4729">Kaiming He&#8217;s paper</a>.</p>
...@@ -695,7 +670,7 @@ The details please refer to
<h3>maxout<a class="headerlink" href="#maxout" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">maxout</code></dt>
<dd><dl class="docutils">
<dt>A layer to do max out on conv layer output.</dt>
<dd><ul class="first last simple">
...@@ -752,7 +727,7 @@ automatically from previous output.</li>
<h3>img_cmrnorm<a class="headerlink" href="#img-cmrnorm" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">img_cmrnorm</code></dt>
<dd><p>Response normalization across feature maps.
For details, please refer to
<a class="reference external" href="http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf">Alex&#8217;s paper</a>.</p>
...@@ -791,7 +766,7 @@ num_channels is None, it will be set automatically.</li>
<h3>batch_norm<a class="headerlink" href="#batch-norm" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">batch_norm</code></dt>
<dd><p>Batch Normalization Layer. The notation of this layer is as follows.</p>
<p><span class="math">\(x\)</span> is the input features over a mini-batch.</p>
<div class="math">
...@@ -805,7 +780,7 @@ y_i &amp;\gets \gamma \hat{x_i} + \beta \qquad &amp;//\ scale\ and\ shift\end{sp
<p>For the details of batch normalization, please refer to this
<a class="reference external" href="http://arxiv.org/abs/1502.03167">paper</a>.</p>
<p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">norm</span> <span class="o">=</span> <span class="n">batch_norm</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">net</span><span class="p">,</span> <span class="n">act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">activation</span><span class="o">.</span><span class="n">Relu</span><span class="p">())</span>
</pre></div>
</div>
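<p>A slightly fuller hedged sketch placing batch_norm after a convolution; setting num_channels to the conv layer&#8217;s num_filters is the assumption here:</p>
<div class="highlight-python"><div class="highlight"><pre>conv = img_conv(input=img, filter_size=3, num_channels=8,
                num_filters=16, stride=1,
                act=paddle.v2.activation.Linear())
# normalize first, then apply the non-linearity as recommended above
norm = batch_norm(input=conv, num_channels=16,
                  act=paddle.v2.activation.Relu())
</pre></div>
</div>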
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -825,7 +800,7 @@ automaticly select cudnn_batch_norm for GPU and ...@@ -825,7 +800,7 @@ automaticly select cudnn_batch_norm for GPU and
batch_norm for CPU. Otherwise, select batch norm batch_norm for CPU. Otherwise, select batch norm
type based on the specified type. If you use cudnn_batch_norm, type based on the specified type. If you use cudnn_batch_norm,
we suggested you use latest version, such as v5.1.</li> we suggested you use latest version, such as v5.1.</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; Activation Type. Better be relu. Because batch <li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type. Better be relu. Because batch
normalization will normalize input near zero.</li> normalization will normalize input near zero.</li>
<li><strong>num_channels</strong> (<em>int</em>) &#8211; num of image channels or previous layer&#8217;s number of <li><strong>num_channels</strong> (<em>int</em>) &#8211; num of image channels or previous layer&#8217;s number of
filters. None will automatically get from layer&#8217;s filters. None will automatically get from layer&#8217;s
...@@ -863,7 +838,7 @@ computation, referred to as facotr, ...@@ -863,7 +838,7 @@ computation, referred to as facotr,
<h3>sum_to_one_norm<a class="headerlink" href="#sum-to-one-norm" title="Permalink to this headline"></a></h3> <h3>sum_to_one_norm<a class="headerlink" href="#sum-to-one-norm" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">sum_to_one_norm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">sum_to_one_norm</code></dt>
<dd><p>A layer for sum-to-one normalization, <dd><p>A layer for sum-to-one normalization,
which is used in NEURAL TURING MACHINE.</p> which is used in NEURAL TURING MACHINE.</p>
<div class="math"> <div class="math">
...@@ -900,7 +875,7 @@ and <span class="math">\(out\)</span> is a (batchSize x dataDim) output vector.< ...@@ -900,7 +875,7 @@ and <span class="math">\(out\)</span> is a (batchSize x dataDim) output vector.<
<h3>cross_channel_norm<a class="headerlink" href="#cross-channel-norm" title="Permalink to this headline"></a></h3> <h3>cross_channel_norm<a class="headerlink" href="#cross-channel-norm" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">cross_channel_norm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">cross_channel_norm</code></dt>
<dd><p>Normalize a layer&#8217;s output. This layer is necessary for ssd. <dd><p>Normalize a layer&#8217;s output. This layer is necessary for ssd.
This layer applys normalize across the channels of each sample to This layer applys normalize across the channels of each sample to
a conv layer&#8217;s output and scale the output by a group of trainable a conv layer&#8217;s output and scale the output by a group of trainable
...@@ -931,7 +906,7 @@ factors which dimensions equal to the channel&#8217;s number.</p> ...@@ -931,7 +906,7 @@ factors which dimensions equal to the channel&#8217;s number.</p>
<h3>recurrent<a class="headerlink" href="#recurrent" title="Permalink to this headline"></a></h3> <h3>recurrent<a class="headerlink" href="#recurrent" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">recurrent</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">recurrent</code></dt>
<dd><p>Simple recurrent unit layer. It is just a fully connect layer through both <dd><p>Simple recurrent unit layer. It is just a fully connect layer through both
time and neural network.</p> time and neural network.</p>
<p>For each sequence [start, end] it performs the following computation:</p> <p>For each sequence [start, end] it performs the following computation:</p>
...@@ -948,7 +923,7 @@ out_{i} = act(in_{i} + out_{i+1} * W) \ \ \text{for} \ start &lt;= i &lt; end\en ...@@ -948,7 +923,7 @@ out_{i} = act(in_{i} + out_{i+1} * W) \ \ \text{for} \ start &lt;= i &lt; end\en
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input Layer</li> <li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input Layer</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; activation.</li> <li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; activation.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; bias attribute.</li> <li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; bias attribute.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; parameter attribute.</li> <li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; parameter attribute.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the layer</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the layer</li>
...@@ -971,7 +946,7 @@ out_{i} = act(in_{i} + out_{i+1} * W) \ \ \text{for} \ start &lt;= i &lt; end\en ...@@ -971,7 +946,7 @@ out_{i} = act(in_{i} + out_{i+1} * W) \ \ \text{for} \ start &lt;= i &lt; end\en
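<p>No usage example accompanies this layer above; a hedged minimal sketch, assuming in_layer is a sequence layer whose size already equals the desired hidden size:</p>
<div class="highlight-python"><div class="highlight"><pre>rnn_forward = recurrent(input=in_layer, act=paddle.v2.activation.Tanh())
rnn_backward = recurrent(input=in_layer, reverse=True)
</pre></div>
</div>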
<h3>lstmemory<a class="headerlink" href="#lstmemory" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">lstmemory</code></dt>
<dd><p>Long Short-term Memory Cell.</p>
<p>The memory cell was implemented with the following equations.</p>
<div class="math">
...@@ -995,9 +970,9 @@ more details about LSTM.</p>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The lstmemory layer name.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether the sequence is processed in reverse or not.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; activation type, paddle.v2.activation.Tanh by default. <span class="math">\(h_t\)</span></li>
<li><strong>gate_act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; gate activation type, paddle.v2.activation.Sigmoid by default.</li>
<li><strong>state_act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; state activation type, paddle.v2.activation.Tanh by default.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Bias attribute. None means default bias. False means no
bias.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Parameter Attribute.</li>
...@@ -1020,7 +995,7 @@ bias.</li>
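<p>A hedged usage sketch; the convention that lstmemory&#8217;s input must be a projection of size 4 * size (one slice per gate plus the cell input) is assumed here:</p>
<div class="highlight-python"><div class="highlight"><pre># in_proj is assumed to have size 4 * 128
lstm = lstmemory(input=in_proj,
                 act=paddle.v2.activation.Tanh(),
                 gate_act=paddle.v2.activation.Sigmoid(),
                 state_act=paddle.v2.activation.Tanh())
</pre></div>
</div>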
<h3>grumemory<a class="headerlink" href="#grumemory" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">grumemory</code></dt>
<dd><p>Gated Recurrent Unit Layer.</p>
<p>The memory cell was implemented with the following equations.</p>
<p>1. update gate <span class="math">\(z\)</span>: defines how much of the previous memory to
...@@ -1060,9 +1035,9 @@ Recurrent Neural Networks on Sequence Modeling.</a></p>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The gru layer name.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer.</em>) &#8211; input layer.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; Whether the sequence is processed in reverse or not.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; activation type, paddle.v2.activation.Tanh by default. This activation
affects the <span class="math">\({\tilde{h_t}}\)</span>.</li>
<li><strong>gate_act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; gate activation type, paddle.v2.activation.Sigmoid by default.
This activation affects the <span class="math">\(z_t\)</span> and <span class="math">\(r_t\)</span>. It is the
<span class="math">\(\sigma\)</span> in the above formula.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Bias attribute. None means default bias. False means no
...@@ -1092,7 +1067,7 @@ will get a warning.</li>
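<p>A hedged usage sketch; the convention that grumemory&#8217;s input must be a projection of size 3 * size (update gate, reset gate, and candidate) is assumed here:</p>
<div class="highlight-python"><div class="highlight"><pre># in_proj is assumed to have size 3 * 128
gru = grumemory(input=in_proj,
                act=paddle.v2.activation.Tanh(),
                gate_act=paddle.v2.activation.Sigmoid())
</pre></div>
</div>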
<h3>memory<a class="headerlink" href="#memory" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">memory</code></dt>
<dd><p>The memory layer is a layer that crosses time steps. It references
layer <code class="code docutils literal"><span class="pre">name</span></code>&#8217;s output at the previous time step.</p>
<p>The default memory is zero at the first time step, and the previous time step&#8217;s
...@@ -1101,12 +1076,12 @@ output in the rest time steps.</p>
with activation.</p>
<p>If boot_with_const_id, then the first time step is an IndexSlot, and
Arguments.ids()[0] is this <code class="code docutils literal"><span class="pre">cost_id</span></code>.</p>
<p>If boot is not null, the memory is just the boot&#8217;s output.
Set <code class="code docutils literal"><span class="pre">is_seq</span></code> to true if the boot layer is a sequence.</p>
<p>The layer with the same name in a recurrent group will set the memory at each
time step.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">mem</span> <span class="o">=</span> <span class="n">memory</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s1">&#39;state&#39;</span><span class="p">)</span>
<span class="n">state</span> <span class="o">=</span> <span class="n">fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">mem</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s1">&#39;state&#39;</span><span class="p">)</span>
</pre></div>
</div>
<p>If you do not want to specify the name, you can equivalently use set_input()
...@@ -1122,18 +1097,18 @@ name of the layer which this memory remembers.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; size of memory.</li>
<li><strong>memory_name</strong> (<em>basestring</em>) &#8211; the name of the memory.
It is ignored when name is provided.</li>
<li><strong>is_seq</strong> (<em>bool</em>) &#8211; whether the boot layer is a sequence</li>
<li><strong>boot</strong> (<em>paddle.v2.config_base.Layer|None</em>) &#8211; boot layer of memory.</li>
<li><strong>boot_bias</strong> (<em>paddle.v2.attr.ParameterAttribute|None</em>) &#8211; boot layer&#8217;s bias</li>
<li><strong>boot_bias_active_type</strong> (<em>paddle.v2.activation.Base</em>) &#8211; boot layer&#8217;s activation type.</li>
<li><strong>boot_with_const_id</strong> (<em>int</em>) &#8211; boot layer&#8217;s id.</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">paddle.v2.config_base.Layer object which is a memory.</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p>
</td>
</tr>
</tbody>
...@@ -1153,9 +1128,9 @@ sequence input. This is extremely usefull for attention based model, or
Neural Turing Machine like models.</p>
<p>The basic usage (time steps) is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">step</span><span class="p">(</span><span class="nb">input</span><span class="p">):</span>
    <span class="n">output</span> <span class="o">=</span> <span class="n">fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer</span><span class="p">,</span>
                <span class="n">size</span><span class="o">=</span><span class="mi">1024</span><span class="p">,</span>
                <span class="n">act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">activation</span><span class="o">.</span><span class="n">Linear</span><span class="p">(),</span>
                <span class="n">bias_attr</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">output</span>
...@@ -1165,8 +1140,8 @@ Neural Turning Machine like models.</p>
</div>
<p>You can see the following configs for further usages:</p>
<ul class="simple">
<li>time steps: lstmemory_group, paddle/gserver/tests/sequence_group.conf, demo/seqToseq/seqToseq_net.py</li>
<li>sequence steps: paddle/gserver/tests/sequence_nest_group.conf</li>
</ul>
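<p>To make the memory/step mechanics concrete, here is a hedged sketch of a simple RNN: the fc layer is named &#8216;rnn_state&#8217; and the memory reads that same name, so each step sees the previous step&#8217;s output. All layer names are illustrative.</p>
<div class="highlight-python"><div class="highlight"><pre>def step(x):
    prev = memory(name='rnn_state', size=128)   # previous step's output
    out = fc(input=[x, prev], size=128,
             act=paddle.v2.activation.Tanh(),
             name='rnn_state')
    return out

rnn_out = recurrent_group(step=step, input=seq_input)
</pre></div>
</div>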
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
<col class="field-name" /> <col class="field-name" />
...@@ -1182,24 +1157,24 @@ a time step result. Then gather each time step of output into ...@@ -1182,24 +1157,24 @@ a time step result. Then gather each time step of output into
layer group&#8217;s output.</p> layer group&#8217;s output.</p>
</li> </li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; recurrent_group&#8217;s name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; recurrent_group&#8217;s name.</li>
<li><strong>input</strong> (<em>LayerOutput|StaticInput|SubsequenceInput|list|tuple</em>) &#8211; <p>Input links array.</p> <li><strong>input</strong> (<em>paddle.v2.config_base.Layer|StaticInput|SubsequenceInput|list|tuple</em>) &#8211; <p>Input links array.</p>
<p>LayerOutput will be scattered into time steps. <p>paddle.v2.config_base.Layer will be scattered into time steps.
SubsequenceInput will be scattered into sequence steps. SubsequenceInput will be scattered into sequence steps.
StaticInput will be imported to each time step, and doesn&#8217;t change StaticInput will be imported to each time step, and doesn&#8217;t change
through time. It&#8217;s a mechanism to access layer outside step function.</p> through time. It&#8217;s a mechanism to access layer outside step function.</p>
</li> </li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; If reverse is set true, the recurrent unit will process the <li><strong>reverse</strong> (<em>bool</em>) &#8211; If reverse is set true, the recurrent unit will process the
input sequence in a reverse order.</li> input sequence in a reverse order.</li>
<li><strong>targetInlink</strong> (<em>LayerOutput|SubsequenceInput</em>) &#8211; <p>the input layer which share info with layer group&#8217;s output</p> <li><strong>targetInlink</strong> (<em>paddle.v2.config_base.Layer|SubsequenceInput</em>) &#8211; <p>the input layer which share info with layer group&#8217;s output</p>
<p>Param input specifies multiple input layers. For <p>Param input specifies multiple input layers. For
SubsequenceInput inputs, config should assign one input SubsequenceInput inputs, config should assign one input
layer that share info(the number of sentences and the number layer that share info(the number of sentences and the number
of words in each sentence) with all layer group&#8217;s outputs. of words in each sentence) with all layer group&#8217;s outputs.
targetInlink should be one of the layer group&#8217;s input.</p> targetInlink should be one of the layer group&#8217;s input.</p>
</li> </li>
<li><strong>is_generating</strong> &#8211; If is generating, none of input type should be LayerOutput; <li><strong>is_generating</strong> &#8211; If is generating, none of input type should be paddle.v2.config_base.Layer;
else, for training or testing, one of the input type must else, for training or testing, one of the input type must
be LayerOutput.</li> be paddle.v2.config_base.Layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
...@@ -1210,9 +1185,9 @@ be LayerOutput.</li> ...@@ -1210,9 +1185,9 @@ be LayerOutput.</li>
<col class="field-name" /> <col class="field-name" />
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">LayerOutput object.</td> <tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">paddle.v2.config_base.Layer object.</td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Return type:</th><td class="field-body">LayerOutput</td> <tr class="field-even field"><th class="field-name">Return type:</th><td class="field-body">paddle.v2.config_base.Layer</td>
</tr> </tr>
</tbody> </tbody>
</table> </table>
...@@ -1223,7 +1198,7 @@ be LayerOutput.</li> ...@@ -1223,7 +1198,7 @@ be LayerOutput.</li>
<h3>lstm_step<a class="headerlink" href="#lstm-step" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">lstm_step</code></dt>
<dd><p>LSTM Step Layer. It is used in recurrent_group. The LSTM equations are shown
as follows.</p>
<div class="math">
...@@ -1248,10 +1223,10 @@ output is <span class="math">\(o_t\)</span>, which name is &#8216;state&#8217; a
<code class="code docutils literal"><span class="pre">state.size</span></code>.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer. <span class="math">\(Wx_t + Wh_{t-1}\)</span></li>
<li><strong>state</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; State Layer. <span class="math">\(c_{t-1}\)</span></li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation type. Default is tanh</li>
<li><strong>gate_act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Gate Activation Type. Default is sigmoid, and should
be sigmoid only.</li>
<li><strong>state_act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; State Activation Type. Default is sigmoid, and should
be sigmoid only.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; Bias Attribute.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; layer&#8217;s extra attribute.</li>
...@@ -1273,7 +1248,7 @@ be sigmoid only.</li>
<h3>gru_step<a class="headerlink" href="#gru-step" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">gru_step</code></dt>
<dd><table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
...@@ -1314,7 +1289,7 @@ to maintain tractability.</p>
<p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">rnn_step</span><span class="p">(</span><span class="nb">input</span><span class="p">):</span>
    <span class="n">last_time_step_output</span> <span class="o">=</span> <span class="n">memory</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s1">&#39;rnn&#39;</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">512</span><span class="p">)</span>
    <span class="k">with</span> <span class="n">mixed</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">512</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s1">&#39;rnn&#39;</span><span class="p">)</span> <span class="k">as</span> <span class="n">simple_rnn</span><span class="p">:</span>
        <span class="n">simple_rnn</span> <span class="o">+=</span> <span class="n">full_matrix_projection</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span>
        <span class="n">simple_rnn</span> <span class="o">+=</span> <span class="n">last_time_step_output</span>
    <span class="k">return</span> <span class="n">simple_rnn</span>
...@@ -1376,7 +1351,7 @@ beam size.</li>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">The generated word index.</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">The generated word index.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -1388,7 +1363,7 @@ beam size.</li> ...@@ -1388,7 +1363,7 @@ beam size.</li>
<h3>get_output<a class="headerlink" href="#get-output" title="Permalink to this headline"></a></h3> <h3>get_output<a class="headerlink" href="#get-output" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">get_output</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">get_output</code></dt>
<dd><p>Get layer&#8217;s output by name. In PaddlePaddle, a layer might return multiple <dd><p>Get layer&#8217;s output by name. In PaddlePaddle, a layer might return multiple
values, but returns one layer&#8217;s output. If the user wants to use another values, but returns one layer&#8217;s output. If the user wants to use another
output besides the default one, please use get_output first to get output besides the default one, please use get_output first to get
...@@ -1429,17 +1404,17 @@ multiple outputs.</li> ...@@ -1429,17 +1404,17 @@ multiple outputs.</li>
Each input is a projection or operator.</p>
<p>There are two styles of usage.</p>
<ol class="arabic simple">
<li>When the inputs parameter is not set, use mixed like this:</li>
</ol>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">with</span> <span class="n">mixed</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">)</span> <span class="k">as</span> <span class="n">m</span><span class="p">:</span>
    <span class="n">m</span> <span class="o">+=</span> <span class="n">full_matrix_projection</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer1</span><span class="p">)</span>
    <span class="n">m</span> <span class="o">+=</span> <span class="n">identity_projection</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer2</span><span class="p">)</span>
</pre></div>
</div>
<ol class="arabic simple" start="2">
<li>You can also set all inputs when invoking mixed, as follows:</li>
</ol>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">m</span> <span class="o">=</span> <span class="n">mixed</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span>
          <span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">full_matrix_projection</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer1</span><span class="p">),</span>
                 <span class="n">full_matrix_projection</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer2</span><span class="p">)])</span>
</pre></div>
...@@ -1453,11 +1428,11 @@ Each inputs is a projection or operator.</p>
<li><strong>size</strong> (<em>int</em>) &#8211; layer size.</li>
<li><strong>input</strong> &#8211; input layers. It is an optional parameter. If set,
then this function will just return the layer&#8217;s name.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em><em> or </em><em>bool</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something not type of paddle.v2.attr.ParameterAttribute. None will get a
default Bias.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; The extra layer config. Default is None.</li>
</ul>
</td>
</tr>
...@@ -1476,7 +1451,7 @@ default Bias.</li>
<span id="api-v2-layer-embedding"></span><h3>embedding<a class="headerlink" href="#embedding" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">embedding</code></dt>
<dd><p>Define an embedding Layer.</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
...@@ -1507,7 +1482,7 @@ for details.</li>
<h3>scaling_projection<a class="headerlink" href="#scaling-projection" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">scaling_projection</code></dt>
<dd><p>scaling_projection multiplies the input with a scalar parameter and adds it to
the output.</p>
<div class="math">
@@ -1541,7 +1516,7 @@ the output.</p>
<h3>dotmul_projection<a class="headerlink" href="#dotmul-projection" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">dotmul_projection</code></dt>
<dd><p>DotMulProjection takes a layer as input.
It performs element-wise multiplication with the weight.</p>
<div class="math">
@@ -1576,7 +1551,7 @@ It performs element-wise multiplication with the weight.</p>
<h3>dotmul_operator<a class="headerlink" href="#dotmul-operator" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">dotmul_operator</code></dt>
<dd><p>DotMulOperator takes two inputs and performs element-wise multiplication:</p>
<div class="math">
\[out.row[i] += scale * (a.row[i] .* b.row[i])\]</div>
@@ -1612,7 +1587,7 @@ scale is a config scalar; its default value is one.</p>
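<p>A minimal sketch of feeding this operator into mixed (layer1 and layer2 are assumed upstream layers of equal width; keyword names follow the equation above):</p>
<div class="highlight-python"><div class="highlight"><pre># element-wise product of two equally sized layers, consumed by a mixed layer
op = dotmul_operator(a=layer1, b=layer2, scale=1.0)
m = mixed(size=256, input=[op])
</pre></div></div>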
<h3>full_matrix_projection<a class="headerlink" href="#full-matrix-projection" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">full_matrix_projection</code></dt>
<dd><p>Full Matrix Projection. It performs full matrix multiplication.</p>
<div class="math">
\[out.row[i] += in.row[i] * weight\]</div>
@@ -1658,7 +1633,7 @@ scale is a config scalar; its default value is one.</p>
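<p>A minimal sketch (layer1 is an assumed upstream layer; the parameter name is hypothetical):</p>
<div class="highlight-python"><div class="highlight"><pre># dense projection with a named weight, wrapped in a mixed layer
proj = full_matrix_projection(input=layer1,
                              param_attr=paddle.v2.attr.ParameterAttribute(name='_proj'))
fc = mixed(size=256, input=[proj])
</pre></div></div>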
<h3>identity_projection<a class="headerlink" href="#identity-projection" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">identity_projection</code></dt>
<dd><ol class="arabic simple">
<li>IdentityProjection if offset=None. It performs:</li>
</ol>
@@ -1704,7 +1679,7 @@ It selects dimensions [offset, offset+layer_size) from the input:</p>
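<p>A minimal sketch of both variants (layer1 is an assumed upstream layer):</p>
<div class="highlight-python"><div class="highlight"><pre># pass the input through unchanged
proj = identity_projection(input=layer1)
# select a slice of dimensions starting at offset 10
proj_slice = identity_projection(input=layer1, offset=10)
</pre></div></div>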
<h3>table_projection<a class="headerlink" href="#table-projection" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">table_projection</code></dt>
<dd><p>Table Projection. It selects rows from the parameter where row_id
is in input_ids.</p>
<div class="math">
@@ -1753,7 +1728,7 @@ and <span class="math">\(i\)</span> is row_id.</p>
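<p>A minimal sketch (word_ids is an assumed integer-id input layer; the parameter name is hypothetical):</p>
<div class="highlight-python"><div class="highlight"><pre># look up rows of the table parameter by the ids in word_ids
proj = table_projection(input=word_ids,
                        param_attr=paddle.v2.attr.ParameterAttribute(name='_table'))
emb = mixed(size=128, input=[proj])
</pre></div></div>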
<h3>trans_full_matrix_projection<a class="headerlink" href="#trans-full-matrix-projection" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">trans_full_matrix_projection</code></dt>
<dd><p>Different from full_matrix_projection, this projection performs matrix
multiplication using the transpose of the weight.</p>
<div class="math">
@@ -1821,7 +1796,7 @@ sequence of a nested sequence, <code class="code docutils literal"><span class="
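<p>A minimal sketch (layer1 is an assumed upstream layer; sharing one weight between a projection and its transpose is a common use, and the shared name here is hypothetical):</p>
<div class="highlight-python"><div class="highlight"><pre># like full_matrix_projection, but multiplies by the transposed weight
proj = trans_full_matrix_projection(input=layer1,
                                    param_attr=paddle.v2.attr.ParameterAttribute(name='_w'))
out = mixed(size=256, input=[proj])
</pre></div></div>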
<span id="id1"></span><h3>pooling<a class="headerlink" href="#api-v2-layer-pooling" title="Permalink to this headline"></a></h3> <span id="id1"></span><h3>pooling<a class="headerlink" href="#api-v2-layer-pooling" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">pooling</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">pooling</code></dt>
<dd><p>Pooling layer for sequence inputs, not used for Image.</p> <dd><p>Pooling layer for sequence inputs, not used for Image.</p>
<p>The example usage is:</p> <p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">seq_pool</span> <span class="o">=</span> <span class="n">pooling</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer</span><span class="p">,</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">seq_pool</span> <span class="o">=</span> <span class="n">pooling</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer</span><span class="p">,</span>
...@@ -1860,7 +1835,7 @@ SumPooling, SquareRootNPooling.</li> ...@@ -1860,7 +1835,7 @@ SumPooling, SquareRootNPooling.</li>
<span id="api-v2-layer-last-seq"></span><h3>last_seq<a class="headerlink" href="#last-seq" title="Permalink to this headline"></a></h3> <span id="api-v2-layer-last-seq"></span><h3>last_seq<a class="headerlink" href="#last-seq" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">last_seq</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">last_seq</code></dt>
<dd><p>Get Last Timestamp Activation of a sequence.</p> <dd><p>Get Last Timestamp Activation of a sequence.</p>
<p>If stride &gt; 0, this layer slides a window whose size is determined by stride, <p>If stride &gt; 0, this layer slides a window whose size is determined by stride,
and return the last value of the window as the output. Thus, a long sequence and return the last value of the window as the output. Thus, a long sequence
...@@ -1898,7 +1873,7 @@ of stride is -1.</p> ...@@ -1898,7 +1873,7 @@ of stride is -1.</p>
<span id="api-v2-layer-first-seq"></span><h3>first_seq<a class="headerlink" href="#first-seq" title="Permalink to this headline"></a></h3> <span id="api-v2-layer-first-seq"></span><h3>first_seq<a class="headerlink" href="#first-seq" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">first_seq</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">first_seq</code></dt>
<dd><p>Get First Timestamp Activation of a sequence.</p> <dd><p>Get First Timestamp Activation of a sequence.</p>
<p>If stride &gt; 0, this layer slides a window whose size is determined by stride, <p>If stride &gt; 0, this layer slides a window whose size is determined by stride,
and return the first value of the window as the output. Thus, a long sequence and return the first value of the window as the output. Thus, a long sequence
...@@ -1936,7 +1911,7 @@ of stride is -1.</p> ...@@ -1936,7 +1911,7 @@ of stride is -1.</p>
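<p>A minimal sketch of both layers (seq is an assumed sequence input):</p>
<div class="highlight-python"><div class="highlight"><pre>last = last_seq(input=seq)    # last timestamp of each sequence
first = first_seq(input=seq)  # first timestamp of each sequence
</pre></div></div>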
<h3>concat<a class="headerlink" href="#concat" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">concat</code></dt>
<dd><p>Concat all input vectors into one huge vector.
Inputs can be a list of paddle.v2.config_base.Layer or a list of projections.</p>
<p>The example usage is:</p>
@@ -1950,7 +1925,7 @@ Inputs can be a list of paddle.v2.config_base.Layer or a list of projections.</p>
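<p>A minimal sketch (layer1 and layer2 are assumed upstream layers; the output width is the sum of the input widths):</p>
<div class="highlight-python"><div class="highlight"><pre>concat_out = concat(input=[layer1, layer2])
</pre></div></div>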
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>input</strong> (<em>list|tuple|collections.Sequence</em>) &#8211; input layers or projections</li> <li><strong>input</strong> (<em>list|tuple|collections.Sequence</em>) &#8211; input layers or projections</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; Activation type.</li> <li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation type.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li> <li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
</ul> </ul>
</td> </td>
...@@ -1970,7 +1945,7 @@ Inputs can be list of paddle.v2.config_base.Layer or list of projection.</p> ...@@ -1970,7 +1945,7 @@ Inputs can be list of paddle.v2.config_base.Layer or list of projection.</p>
<h3>seq_concat<a class="headerlink" href="#seq-concat" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">seq_concat</code></dt>
<dd><p>Concat sequence a with sequence b.</p>
<dl class="docutils">
<dt>Inputs:</dt>
@@ -1994,7 +1969,7 @@ Inputs can be a list of paddle.v2.config_base.Layer or a list of projections.</p>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>a</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input sequence layer.</li>
<li><strong>b</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input sequence layer.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation type.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em><em> or </em><em>bool</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something that is not of type paddle.v2.attr.ParameterAttribute. None will get a
@@ -2020,7 +1995,7 @@ default Bias.</li>
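<p>A minimal sketch (seq_a and seq_b are assumed sequence layers of equal width):</p>
<div class="highlight-python"><div class="highlight"><pre># for each pair of sequences, the output is sequence a followed by sequence b
out = seq_concat(a=seq_a, b=seq_b)
</pre></div></div>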
<h3>block_expand<a class="headerlink" href="#block-expand" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">block_expand</code></dt>
<dd><dl class="docutils">
<dt>Expand feature map to minibatch matrix.</dt>
<dd><ul class="first last simple">
@@ -2096,7 +2071,7 @@ sequence of a nested sequence, <code class="code docutils literal"><span class="
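<p>A minimal sketch (conv_out is an assumed convolution output with 128 channels; the block and stride keyword names follow the v1 layer of the same name and are assumptions here):</p>
<div class="highlight-python"><div class="highlight"><pre># slide a 1x3 block over the feature map, emitting one matrix row per position
out = block_expand(input=conv_out,
                   num_channels=128,
                   block_x=1, block_y=3,
                   stride_x=1, stride_y=1)
</pre></div></div>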
<h3>expand<a class="headerlink" href="#expand" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">expand</code></dt>
<dd><p>A layer that expands dense data, or sequence data whose sequences each have
length one, to full sequence data.</p>
<p>The example usage is:</p>
@@ -2135,7 +2110,7 @@ bias.</li>
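<p>A minimal sketch, since the original example is elided by the diff (sent_vec is an assumed per-sequence vector and words an assumed word sequence; the expand_as keyword is an assumption):</p>
<div class="highlight-python"><div class="highlight"><pre># repeat sent_vec once per timestamp of words
out = expand(input=sent_vec, expand_as=words)
</pre></div></div>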
<h3>repeat<a class="headerlink" href="#repeat" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">repeat</code></dt>
<dd><p>A layer for repeating the input num_repeats times. This is equivalent
to applying concat() to num_repeats copies of the same input.</p>
<div class="math">
@@ -2171,7 +2146,7 @@ to applying concat() to num_repeats copies of the same input.</p>
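<p>A minimal sketch (layer1 is an assumed upstream layer; the output width is num_repeats times the input width):</p>
<div class="highlight-python"><div class="highlight"><pre>out = repeat(input=layer1, num_repeats=4)
</pre></div></div>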
<h3>rotate<a class="headerlink" href="#rotate" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">rotate</code></dt>
<dd><p>A layer that rotates each feature channel by 90 degrees (clockwise),
usually used when the input sample is an image or feature map.</p>
<div class="math">
@@ -2210,7 +2185,7 @@ usually used when the input sample is an image or feature map.</p>
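<p>A minimal sketch (feature_map is an assumed conv output; the height and width keywords describing its spatial shape are assumptions):</p>
<div class="highlight-python"><div class="highlight"><pre>out = rotate(input=feature_map, height=32, width=32)
</pre></div></div>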
<h3>seq_reshape<a class="headerlink" href="#seq-reshape" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">seq_reshape</code></dt>
<dd><p>A layer for reshaping a sequence. Assume the input sequence has T instances,
the dimension of each instance is M, and reshape_size is N; then the
output sequence has T*M/N instances, each of dimension N.</p>
@@ -2227,7 +2202,7 @@ output sequence has T*M/N instances, each of dimension N.</p>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer.</li>
<li><strong>reshape_size</strong> (<em>int</em>) &#8211; The size of the reshaped sequence.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation type.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em><em> or </em><em>bool</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something that is not of type paddle.v2.attr.ParameterAttribute. None will get a
@@ -2253,7 +2228,7 @@ default Bias.</li>
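<p>A worked instance of the T*M/N rule above (seq is an assumed sequence layer whose instances have dimension M=8; with reshape_size=4, a sequence of T=3 instances becomes one of 6 instances of dimension 4):</p>
<div class="highlight-python"><div class="highlight"><pre>out = seq_reshape(input=seq, reshape_size=4)
</pre></div></div>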
<h3>addto<a class="headerlink" href="#addto" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">addto</code></dt>
<dd><p>AddtoLayer.</p>
<div class="math">
\[y = f(\sum_{i} x_i + b)\]</div>
@@ -2261,7 +2236,7 @@ default Bias.</li>
and <span class="math">\(f\)</span> is the activation function.</p>
<p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">addto</span> <span class="o">=</span> <span class="n">addto</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">,</span> <span class="n">layer2</span><span class="p">],</span>
<span class="n">act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">activation</span><span class="o">.</span><span class="n">Relu</span><span class="p">(),</span>
<span class="n">bias_attr</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
</pre></div>
</div>
@@ -2283,7 +2258,7 @@ Please refer to dropout for details.</p>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer|list|tuple</em>) &#8211; Input layers. It could be a paddle.v2.config_base.Layer or a list/tuple of
paddle.v2.config_base.Layer.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type, default is tanh.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|bool</em>) &#8211; Bias attribute. If False, no bias is used. None means the default
bias.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer attribute.</li>
@@ -2305,7 +2280,7 @@ bias.</li>
<h3>linear_comb<a class="headerlink" href="#linear-comb" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">linear_comb</code></dt>
<dd><dl class="docutils">
<dt>A layer for computing a weighted sum of vectors; it takes two inputs.</dt>
<dd><ul class="first last simple">
@@ -2368,7 +2343,7 @@ processed in one batch.</p>
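<p>A minimal sketch (the weights/vectors keywords follow the v1 layer of the same name and are assumptions; vectors is an assumed layer holding M vectors of size 64 per sample, laid out as one row, and weights an assumed layer of M weights):</p>
<div class="highlight-python"><div class="highlight"><pre># z = sum_i weights[i] * vectors[i], one output vector of size 64 per sample
out = linear_comb(weights=weights, vectors=vectors, size=64)
</pre></div></div>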
<h3>interpolation<a class="headerlink" href="#interpolation" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">interpolation</code></dt>
<dd><p>This layer performs linear interpolation between two inputs,
as used in the Neural Turing Machine.</p>
<div class="math">
@@ -2407,7 +2382,7 @@ as used in the Neural Turing Machine.</p>
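<p>A minimal sketch (layer1 and layer2 are assumed equally sized layers, and w an assumed one-dimensional layer giving the interpolation coefficient):</p>
<div class="highlight-python"><div class="highlight"><pre># out = w * layer1 + (1 - w) * layer2
out = interpolation(input=[layer1, layer2], weight=w)
</pre></div></div>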
<h3>bilinear_interp<a class="headerlink" href="#bilinear-interp" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">bilinear_interp</code></dt>
<dd><p>This layer implements bilinear interpolation on the output of a conv layer.</p>
<p>Please refer to Wikipedia: <a class="reference external" href="https://en.wikipedia.org/wiki/Bilinear_interpolation">https://en.wikipedia.org/wiki/Bilinear_interpolation</a></p>
<p>The simple usage is:</p>
@@ -2442,7 +2417,7 @@ as used in the Neural Turing Machine.</p>
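<p>A minimal sketch, since the original example is elided by the diff (conv_out is an assumed conv output; the out_size keywords are assumptions):</p>
<div class="highlight-python"><div class="highlight"><pre># resize each feature channel to 64x64 by bilinear interpolation
out = bilinear_interp(input=conv_out, out_size_x=64, out_size_y=64)
</pre></div></div>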
<h3>power<a class="headerlink" href="#power" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">power</code></dt>
<dd><p>This layer applies a power function to a vector element-wise,
as used in the Neural Turing Machine.</p>
<div class="math">
@@ -2480,7 +2455,7 @@ and <span class="math">\(y\)</span> is an output vector.</p>
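<p>A minimal sketch (x is an assumed input vector layer and w an assumed one-dimensional layer holding the exponent):</p>
<div class="highlight-python"><div class="highlight"><pre># y[i] = x[i] ** w, element-wise
out = power(input=x, weight=w)
</pre></div></div>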
<h3>scaling<a class="headerlink" href="#scaling" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">scaling</code></dt>
<dd><p>A layer for multiplying the input vector by a weight scalar.</p>
<div class="math">
\[y = w x\]</div>
@@ -2519,7 +2494,7 @@ processed in one batch.</p>
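<p>A minimal sketch (x is an assumed input layer and w an assumed one-dimensional layer holding the per-sample scale):</p>
<div class="highlight-python"><div class="highlight"><pre># y = w * x, one scalar w per sample
out = scaling(input=x, weight=w)
</pre></div></div>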
<h3>slope_intercept<a class="headerlink" href="#slope-intercept" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">slope_intercept</code></dt>
<dd><p>This layer applies a slope and an intercept to the input
element-wise. There is no activation or weight.</p>
<div class="math">
@@ -2556,7 +2531,7 @@ element-wise. There is no activation or weight.</p>
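<p>A minimal sketch (x is an assumed input layer; this computes y = -x + 1):</p>
<div class="highlight-python"><div class="highlight"><pre>out = slope_intercept(input=x, slope=-1.0, intercept=1.0)
</pre></div></div>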
<h3>tensor<a class="headerlink" href="#tensor" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">tensor</code></dt>
<dd><p>This layer performs a tensor operation on two inputs.
For example, each sample:</p>
<div class="math">
@@ -2585,7 +2560,7 @@ For example, each sample:</p>
<li><strong>a</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer a.</li>
<li><strong>b</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer b.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; The layer dimension.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type. Default is tanh.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Any</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something that is not of type paddle.v2.attr.ParameterAttribute. None will get a
@@ -2609,7 +2584,7 @@ default Bias.</li>
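<p>A minimal sketch (layer1 and layer2 are assumed upstream layers):</p>
<div class="highlight-python"><div class="highlight"><pre># one bilinear form a * T_i * b' per output dimension, 100 outputs in total
out = tensor(a=layer1, b=layer2, size=100)
</pre></div></div>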
<span id="api-v2-layer-cos-sim"></span><h3>cos_sim<a class="headerlink" href="#cos-sim" title="Permalink to this headline"></a></h3> <span id="api-v2-layer-cos-sim"></span><h3>cos_sim<a class="headerlink" href="#cos-sim" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">cos_sim</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">cos_sim</code></dt>
<dd><p>Cosine Similarity Layer. The cosine similarity equation is here.</p> <dd><p>Cosine Similarity Layer. The cosine similarity equation is here.</p>
<div class="math"> <div class="math">
\[similarity = cos(\theta) = {\mathbf{a} \cdot \mathbf{b} \[similarity = cos(\theta) = {\mathbf{a} \cdot \mathbf{b}
...@@ -2652,7 +2627,7 @@ processed in one batch.</p> ...@@ -2652,7 +2627,7 @@ processed in one batch.</p>
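<p>A minimal sketch (layer1 and layer2 are assumed equally sized layers; the output is one similarity score per sample):</p>
<div class="highlight-python"><div class="highlight"><pre>sim = cos_sim(a=layer1, b=layer2)
</pre></div></div>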
<h3>trans<a class="headerlink" href="#trans" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">trans</code></dt>
<dd><p>A layer for transposing a minibatch matrix.</p>
<div class="math">
\[y = x^\mathrm{T}\]</div>
@@ -2690,7 +2665,7 @@ processed in one batch.</p>
<h3>maxid<a class="headerlink" href="#maxid" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">max_id</code></dt>
<dd><p>A layer for finding the id which has the maximal value for each sample.
The result is stored in output.ids.</p>
<p>The example usage is:</p>
@@ -2723,7 +2698,7 @@ The result is stored in output.ids.</p>
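<p>A minimal sketch, since the original example is elided by the diff (prob is an assumed probability-distribution layer, e.g. a softmax output):</p>
<div class="highlight-python"><div class="highlight"><pre>maxid = max_id(input=prob)   # argmax per sample, stored in output.ids
</pre></div></div>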
<h3>sampling_id<a class="headerlink" href="#sampling-id" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">sampling_id</code></dt>
<dd><p>A layer for sampling an id from the multinomial distribution given by the
input layer. One id is sampled per sample.</p>
<p>The simple usage is:</p>
@@ -2759,7 +2734,7 @@ One id is sampled per sample.</p>
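<p>A minimal sketch (prob is an assumed probability-distribution layer):</p>
<div class="highlight-python"><div class="highlight"><pre>sid = sampling_id(input=prob)   # one sampled id per sample
</pre></div></div>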
<h3>pad<a class="headerlink" href="#pad" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">pad</code></dt>
<dd><p>This operation pads zeros to the input data according to pad_c, pad_h
and pad_w, which specify the dimension and size of the padding. The input
data shape is NCHW.</p>
@@ -2828,7 +2803,7 @@ in the width dimension.</p>
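<p>A minimal sketch (img is an assumed NCHW feature layer; each pair gives the padding before and after that dimension):</p>
<div class="highlight-python"><div class="highlight"><pre># pad 2 zero channels on each side, nothing in height, 3 columns in width
out = pad(input=img, pad_c=[2, 2], pad_h=[0, 0], pad_w=[3, 3])
</pre></div></div>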
<h3>cross_entropy_cost<a class="headerlink" href="#cross-entropy-cost" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">cross_entropy_cost</code></dt>
<dd><p>A loss layer for multi-class cross entropy.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">cost</span> <span class="o">=</span> <span class="n">cross_entropy_cost</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="nb">input</span><span class="p">,</span>
<span class="n">label</span><span class="o">=</span><span class="n">label</span><span class="p">)</span>
@@ -2866,7 +2841,7 @@ will not be calculated for weight.</li>
<h3>cross_entropy_with_selfnorm_cost<a class="headerlink" href="#cross-entropy-with-selfnorm-cost" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">cross_entropy_with_selfnorm_cost</code></dt>
<dd><p>A loss layer for multi-class cross entropy with self-normalization.
Input should be a vector of positive numbers, without normalization.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">cost</span> <span class="o">=</span> <span class="n">cross_entropy_with_selfnorm_cost</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="nb">input</span><span class="p">,</span>
@@ -2902,7 +2877,7 @@ Input should be a vector of positive numbers, without normalization.</p>
<h3>multi_binary_label_cross_entropy_cost<a class="headerlink" href="#multi-binary-label-cross-entropy-cost" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">multi_binary_label_cross_entropy_cost</code></dt>
<dd><p>A loss layer for multi-binary-label cross entropy.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">cost</span> <span class="o">=</span> <span class="n">multi_binary_label_cross_entropy_cost</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="nb">input</span><span class="p">,</span>
<span class="n">label</span><span class="o">=</span><span class="n">label</span><span class="p">)</span>
@@ -2936,7 +2911,7 @@ Input should be a vector of positive numbers, without normalization.</p>
<h3>huber_cost<a class="headerlink" href="#huber-cost" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">huber_cost</code></dt>
<dd><p>A loss layer for Huber loss.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">cost</span> <span class="o">=</span> <span class="n">huber_cost</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="nb">input</span><span class="p">,</span>
<span class="n">label</span><span class="o">=</span><span class="n">label</span><span class="p">)</span>
@@ -2970,7 +2945,7 @@ Input should be a vector of positive numbers, without normalization.</p>
<h3>lambda_cost<a class="headerlink" href="#lambda-cost" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">lambda_cost</code></dt>
<dd><p>lambdaCost for the lambdaRank LTR approach.</p>
<p>The simple usage:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">cost</span> <span class="o">=</span> <span class="n">lambda_cost</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="nb">input</span><span class="p">,</span>
@@ -3016,7 +2991,7 @@ entire list to get the gradient.</li>
<h3>mse_cost<a class="headerlink" href="#mse-cost" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">mse_cost</code></dt>
<dd><blockquote>
<div><p>mean squared error cost:</p>
<div class="math">
@@ -3069,7 +3044,7 @@ It is an optional argument.</td>
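<p>A minimal sketch (prediction and label are assumed equally sized layers):</p>
<div class="highlight-python"><div class="highlight"><pre>cost = mse_cost(input=prediction, label=label)
</pre></div></div>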
<h3>rank_cost<a class="headerlink" href="#rank-cost" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">rank_cost</code></dt>
<dd><p>A cost layer for learning to rank using gradient descent. Details can be
found in the <a class="reference external" href="http://research.microsoft.com/en-us/um/people/cburges/papers/ICML_ranking.pdf">paper</a>.
This layer contains at least three inputs. The weight is an optional
@@ -3124,7 +3099,7 @@ It is an optional argument.</li>
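<p>A minimal sketch (left_score and right_score are assumed one-dimensional score layers for the two documents of a pair, and label the assumed pairwise label; the left/right keywords are assumptions):</p>
<div class="highlight-python"><div class="highlight"><pre>cost = rank_cost(left=left_score, right=right_score, label=label)
</pre></div></div>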
<h3>sum_cost<a class="headerlink" href="#sum-cost" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">sum_cost</code></dt>
<dd><p>A loss layer which calculates the sum of the input as loss.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">cost</span> <span class="o">=</span> <span class="n">sum_cost</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="nb">input</span><span class="p">)</span>
</pre></div>
@@ -3155,7 +3130,7 @@ It is an optional argument.</li>
<h3>crf<a class="headerlink" href="#crf" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">crf</code></dt>
<dd><p>A layer for calculating the cost of a sequential conditional random
field model.</p>
<p>The simple usage:</p>
@@ -3196,7 +3171,7 @@ optional argument.</li>
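<p>A minimal sketch, since the original example is elided by the diff (emission is an assumed feature layer whose width equals the tag-set size num_tags, and label the assumed tag sequence):</p>
<div class="highlight-python"><div class="highlight"><pre>crf_cost = crf(input=emission, label=label, size=num_tags)
</pre></div></div>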
<h3>crf_decoding<a class="headerlink" href="#crf-decoding" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">crf_decoding</code></dt>
<dd><p>A layer for calculating the decoding sequence of a sequential conditional
random field model. The decoding sequence is stored in output.ids.
If a second input is provided, it is treated as the ground-truth label, and
@@ -3236,7 +3211,7 @@ decoding or 0 for correct decoding.</p>
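<p>A minimal sketch (emission and num_tags as assumed in the crf example above):</p>
<div class="highlight-python"><div class="highlight"><pre># Viterbi decoding; the best tag sequence is stored in output.ids
decoded = crf_decoding(input=emission, size=num_tags)
</pre></div></div>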
<h3>ctc<a class="headerlink" href="#ctc" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">ctc</code></dt>
<dd><p>Connectionist Temporal Classification (CTC) is designed for temporal
classification tasks, that is, for sequence labeling problems where the
alignment between the inputs and the target labels is unknown.</p>
@@ -3287,7 +3262,7 @@ should also be num_classes + 1.</p>
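<p>A minimal sketch (net_out is an assumed sequence output whose width is num_classes + 1, the extra slot being the CTC blank, as noted above):</p>
<div class="highlight-python"><div class="highlight"><pre>cost = ctc(input=net_out, label=label, size=num_classes + 1)
</pre></div></div>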
<h3>warp_ctc<a class="headerlink" href="#warp-ctc" title="Permalink to this headline"></a></h3> <h3>warp_ctc<a class="headerlink" href="#warp-ctc" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">warp_ctc</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">warp_ctc</code></dt>
<dd><p>A layer integrating the open-source <cite>warp-ctc <dd><p>A layer integrating the open-source <cite>warp-ctc
&lt;https://github.com/baidu-research/warp-ctc&gt;</cite> library, which is used in &lt;https://github.com/baidu-research/warp-ctc&gt;</cite> library, which is used in
<cite>Deep Speech 2: End-to-End Speech Recognition in English and Mandarin <cite>Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
...@@ -3347,7 +3322,7 @@ should be consistent as that used in your labels.</li> ...@@ -3347,7 +3322,7 @@ should be consistent as that used in your labels.</li>
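<p>Assuming the same <code>num_classes + 1</code> blank-label convention applies to warp_ctc, a sketch analogous to the plain <code>ctc</code> one (layer and keyword names again assumed):</p>
<div class="highlight-python"><div class="highlight"><pre>import paddle.v2

# Hedged sketch: same inputs as the ctc example, but backed by the
# open-source warp-ctc implementation.
warp_cost = paddle.v2.layer.warp_ctc(input=log_probs,
                                     label=label_seq,
                                     size=num_classes + 1)
</pre></div>
</div>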
<h3>nce<a class="headerlink" href="#nce" title="Permalink to this headline"></a></h3> <h3>nce<a class="headerlink" href="#nce" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">nce</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">nce</code></dt>
<dd><p>Noise-contrastive estimation. <dd><p>Noise-contrastive estimation.
Implements the method in the following paper: Implements the method in the following paper:
A fast and simple algorithm for training neural probabilistic language models.</p> A fast and simple algorithm for training neural probabilistic language models.</p>
...@@ -3367,7 +3342,7 @@ A fast and simple algorithm for training neural probabilistic language models.</ ...@@ -3367,7 +3342,7 @@ A fast and simple algorithm for training neural probabilistic language models.</
<li><strong>label</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; label layer</li> <li><strong>label</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; label layer</li>
<li><strong>weight</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; weight layer, can be None(default)</li> <li><strong>weight</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; weight layer, can be None(default)</li>
<li><strong>num_classes</strong> (<em>int</em>) &#8211; number of classes.</li> <li><strong>num_classes</strong> (<em>int</em>) &#8211; number of classes.</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; Activation, default is Sigmoid.</li> <li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation, default is Sigmoid.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute|list.</li> <li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute|list.</li>
<li><strong>num_neg_samples</strong> (<em>int</em>) &#8211; number of negative samples. Default is 10.</li> <li><strong>num_neg_samples</strong> (<em>int</em>) &#8211; number of negative samples. Default is 10.</li>
<li><strong>neg_distribution</strong> (<em>list|tuple|collections.Sequence|None</em>) &#8211; The distribution for generating the random negative labels. <li><strong>neg_distribution</strong> (<em>list|tuple|collections.Sequence|None</em>) &#8211; The distribution for generating the random negative labels.
...@@ -3393,7 +3368,7 @@ If not None, its length must be equal to num_classes.</li> ...@@ -3393,7 +3368,7 @@ If not None, its length must be equal to num_classes.</li>
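<p>A sketch using the nce parameters documented above; <code>hidden</code> and <code>next_word</code> are hypothetical layers in a word-prediction setup, and the sizes are illustrative:</p>
<div class="highlight-python"><div class="highlight"><pre>import paddle.v2

# 10 negative samples per example is the documented default.
nce_cost = paddle.v2.layer.nce(input=hidden,
                               label=next_word,
                               num_classes=10000,
                               num_neg_samples=10)
</pre></div>
</div>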
<h3>hsigmoid<a class="headerlink" href="#hsigmoid" title="Permalink to this headline"></a></h3> <h3>hsigmoid<a class="headerlink" href="#hsigmoid" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">hsigmoid</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">hsigmoid</code></dt>
<dd><p>Organizes the classes into a binary tree. At each node, a sigmoid function <dd><p>Organizes the classes into a binary tree. At each node, a sigmoid function
is used to calculate the probability of belonging to the right branch. is used to calculate the probability of belonging to the right branch.
This idea is from &#8220;F. Morin, Y. Bengio (AISTATS 05): This idea is from &#8220;F. Morin, Y. Bengio (AISTATS 05):
...@@ -3435,7 +3410,7 @@ False means no bias.</li> ...@@ -3435,7 +3410,7 @@ False means no bias.</li>
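<p>A sketch of the same word-prediction setup with hierarchical sigmoid instead of NCE; the <code>num_classes</code> keyword is an assumption based on the truncated parameter list:</p>
<div class="highlight-python"><div class="highlight"><pre>import paddle.v2

# Classes are arranged in a binary tree, so one example costs roughly
# log2(num_classes) sigmoid decisions rather than a full softmax.
hs_cost = paddle.v2.layer.hsigmoid(input=hidden,
                                   label=next_word,
                                   num_classes=10000)
</pre></div>
</div>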
<h3>smooth_l1_cost<a class="headerlink" href="#smooth-l1-cost" title="Permalink to this headline"></a></h3> <h3>smooth_l1_cost<a class="headerlink" href="#smooth-l1-cost" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">smooth_l1_cost</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">smooth_l1_cost</code></dt>
<dd><p>This is an L1 loss, but smoother. It requires that the <dd><p>This is an L1 loss, but smoother. It requires that the
sizes of the input and label be equal. The formula is as follows:</p> sizes of the input and label be equal. The formula is as follows:</p>
<div class="math"> <div class="math">
...@@ -3479,7 +3454,7 @@ size of input and label are equal. The formula is as follows,</p> ...@@ -3479,7 +3454,7 @@ size of input and label are equal. The formula is as follows,</p>
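<p>The rendered formula is elided in this diff; the sketch below states the standard smooth-L1 form (assuming this layer follows it) and uses hypothetical <code>pred</code>/<code>target</code> layers of equal size:</p>
<div class="highlight-python"><div class="highlight"><pre>import paddle.v2

# Standard smooth-L1 on each element difference x (assumed form):
#   0.5 * x ** 2      if abs(x) is below 1
#   abs(x) - 0.5      otherwise
cost = paddle.v2.layer.smooth_l1_cost(input=pred, label=target)
</pre></div>
</div>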
<h3>eos<a class="headerlink" href="#eos" title="Permalink to this headline"></a></h3> <h3>eos<a class="headerlink" href="#eos" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">eos</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">eos</code></dt>
<dd><p>A layer for checking EOS for each sample: <dd><p>A layer for checking EOS for each sample:
- output_id = (input_id == conf.eos_id)</p> - output_id = (input_id == conf.eos_id)</p>
<p>The result is stored in output_.ids. <p>The result is stored in output_.ids.
......
...@@ -190,9 +190,9 @@ ...@@ -190,9 +190,9 @@
<h2>NLP<a class="headerlink" href="#nlp" title="Permalink to this headline"></a></h2> <h2>NLP<a class="headerlink" href="#nlp" title="Permalink to this headline"></a></h2>
<div class="section" id="sequence-conv-pool"> <div class="section" id="sequence-conv-pool">
<h3>sequence_conv_pool<a class="headerlink" href="#sequence-conv-pool" title="Permalink to this headline"></a></h3> <h3>sequence_conv_pool<a class="headerlink" href="#sequence-conv-pool" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.networks.</code><code class="descname">sequence_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">sequence_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>A helper for building text convolution and pooling layers.</p> <dd><p>A helper for building text convolution and pooling layers.</p>
<p>Text input =&gt; Context Projection =&gt; FC Layer =&gt; Pooling =&gt; Output.</p> <p>Text input =&gt; Context Projection =&gt; FC Layer =&gt; Pooling =&gt; Output.</p>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -201,34 +201,34 @@ ...@@ -201,34 +201,34 @@
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the output layer (the pooling layer name)</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the output layer (the pooling layer name)</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; name of input layer</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; name of input layer</li>
<li><strong>context_len</strong> (<em>int</em>) &#8211; context projection length. See <li><strong>context_len</strong> (<em>int</em>) &#8211; context projection length. See
context_projection&#8217;s document.</li> context_projection&#8217;s document.</li>
<li><strong>hidden_size</strong> (<em>int</em>) &#8211; FC Layer size.</li> <li><strong>hidden_size</strong> (<em>int</em>) &#8211; FC Layer size.</li>
<li><strong>context_start</strong> (<em>int</em><em> or </em><em>None</em>) &#8211; context projection start position. See <li><strong>context_start</strong> (<em>int</em><em> or </em><em>None</em>) &#8211; context projection start position. See
context_projection&#8217;s context_start.</li> context_projection&#8217;s context_start.</li>
<li><strong>pool_type</strong> (<em>BasePoolingType.</em>) &#8211; pooling layer type. See pooling&#8217;s document.</li> <li><strong>pool_type</strong> (<em>BasePoolingType.</em>) &#8211; pooling layer type. See pooling_layer&#8217;s document.</li>
<li><strong>context_proj_name</strong> (<em>basestring</em>) &#8211; context projection layer name. <li><strong>context_proj_layer_name</strong> (<em>basestring</em>) &#8211; context projection layer name.
None if the user doesn&#8217;t care.</li> None if the user doesn&#8217;t care.</li>
<li><strong>context_proj_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; context projection parameter attribute. <li><strong>context_proj_param_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; context projection parameter attribute.
None if the user doesn&#8217;t care.</li> None if the user doesn&#8217;t care.</li>
<li><strong>fc_name</strong> (<em>basestring</em>) &#8211; fc layer name. None if the user doesn&#8217;t care.</li> <li><strong>fc_layer_name</strong> (<em>basestring</em>) &#8211; fc layer name. None if the user doesn&#8217;t care.</li>
<li><strong>fc_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc layer parameter attribute. None if the user doesn&#8217;t care.</li> <li><strong>fc_param_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc layer parameter attribute. None if the user doesn&#8217;t care.</li>
<li><strong>fc_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc bias parameter attribute. False if no bias, <li><strong>fc_bias_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc bias parameter attribute. False if no bias,
None if the user doesn&#8217;t care.</li> None if the user doesn&#8217;t care.</li>
<li><strong>fc_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; fc layer activation type. None means tanh.</li> <li><strong>fc_act</strong> (<em>BaseActivation</em>) &#8211; fc layer activation type. None means tanh.</li>
<li><strong>pool_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; pooling layer bias attr. None if the user doesn&#8217;t care. <li><strong>pool_bias_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; pooling layer bias attr. None if the user doesn&#8217;t care.
False if no bias.</li> False if no bias.</li>
<li><strong>fc_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; fc layer extra attribute.</li> <li><strong>fc_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; fc layer extra attribute.</li>
<li><strong>context_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; context projection layer extra attribute.</li> <li><strong>context_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; context projection layer extra attribute.</li>
<li><strong>pool_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; pooling layer extra attribute.</li> <li><strong>pool_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; pooling layer extra attribute.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">output layer name.</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">output layer name.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -238,9 +238,9 @@ False if no bias.</li> ...@@ -238,9 +238,9 @@ False if no bias.</li>
</div> </div>
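<p>A sketch of the Text =&gt; Context Projection =&gt; FC Layer =&gt; Pooling pipeline above; <code>emb</code> is a hypothetical word-embedding sequence layer and the sizes are illustrative:</p>
<div class="highlight-python"><div class="highlight"><pre>import paddle.v2

# Pool a variable-length text sequence into a fixed-size vector.
text_vec = paddle.v2.networks.sequence_conv_pool(input=emb,
                                                 context_len=3,
                                                 hidden_size=512)
</pre></div>
</div>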
<div class="section" id="text-conv-pool"> <div class="section" id="text-conv-pool">
<span id="api-trainer-config-helpers-network-text-conv-pool"></span><h3>text_conv_pool<a class="headerlink" href="#text-conv-pool" title="Permalink to this headline"></a></h3> <span id="api-trainer-config-helpers-network-text-conv-pool"></span><h3>text_conv_pool<a class="headerlink" href="#text-conv-pool" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.networks.</code><code class="descname">text_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">text_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>A helper for building text convolution and pooling layers.</p> <dd><p>A helper for building text convolution and pooling layers.</p>
<p>Text input =&gt; Context Projection =&gt; FC Layer =&gt; Pooling =&gt; Output.</p> <p>Text input =&gt; Context Projection =&gt; FC Layer =&gt; Pooling =&gt; Output.</p>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -249,34 +249,34 @@ False if no bias.</li> ...@@ -249,34 +249,34 @@ False if no bias.</li>
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the output layer (the pooling layer name)</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the output layer (the pooling layer name)</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; name of input layer</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; name of input layer</li>
<li><strong>context_len</strong> (<em>int</em>) &#8211; context projection length. See <li><strong>context_len</strong> (<em>int</em>) &#8211; context projection length. See
context_projection&#8217;s document.</li> context_projection&#8217;s document.</li>
<li><strong>hidden_size</strong> (<em>int</em>) &#8211; FC Layer size.</li> <li><strong>hidden_size</strong> (<em>int</em>) &#8211; FC Layer size.</li>
<li><strong>context_start</strong> (<em>int</em><em> or </em><em>None</em>) &#8211; context projection start position. See <li><strong>context_start</strong> (<em>int</em><em> or </em><em>None</em>) &#8211; context projection start position. See
context_projection&#8217;s context_start.</li> context_projection&#8217;s context_start.</li>
<li><strong>pool_type</strong> (<em>BasePoolingType.</em>) &#8211; pooling layer type. See pooling&#8217;s document.</li> <li><strong>pool_type</strong> (<em>BasePoolingType.</em>) &#8211; pooling layer type. See pooling_layer&#8217;s document.</li>
<li><strong>context_proj_name</strong> (<em>basestring</em>) &#8211; context projection layer name. <li><strong>context_proj_layer_name</strong> (<em>basestring</em>) &#8211; context projection layer name.
None if the user doesn&#8217;t care.</li> None if the user doesn&#8217;t care.</li>
<li><strong>context_proj_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; context projection parameter attribute. <li><strong>context_proj_param_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; context projection parameter attribute.
None if the user doesn&#8217;t care.</li> None if the user doesn&#8217;t care.</li>
<li><strong>fc_name</strong> (<em>basestring</em>) &#8211; fc layer name. None if the user doesn&#8217;t care.</li> <li><strong>fc_layer_name</strong> (<em>basestring</em>) &#8211; fc layer name. None if the user doesn&#8217;t care.</li>
<li><strong>fc_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc layer parameter attribute. None if the user doesn&#8217;t care.</li> <li><strong>fc_param_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc layer parameter attribute. None if the user doesn&#8217;t care.</li>
<li><strong>fc_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc bias parameter attribute. False if no bias, <li><strong>fc_bias_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc bias parameter attribute. False if no bias,
None if the user doesn&#8217;t care.</li> None if the user doesn&#8217;t care.</li>
<li><strong>fc_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; fc layer activation type. None means tanh.</li> <li><strong>fc_act</strong> (<em>BaseActivation</em>) &#8211; fc layer activation type. None means tanh.</li>
<li><strong>pool_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; pooling layer bias attr. None if the user doesn&#8217;t care. <li><strong>pool_bias_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; pooling layer bias attr. None if the user doesn&#8217;t care.
False if no bias.</li> False if no bias.</li>
<li><strong>fc_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; fc layer extra attribute.</li> <li><strong>fc_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; fc layer extra attribute.</li>
<li><strong>context_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; context projection layer extra attribute.</li> <li><strong>context_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; context projection layer extra attribute.</li>
<li><strong>pool_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; pooling layer extra attribute.</li> <li><strong>pool_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; pooling layer extra attribute.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">output layer name.</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">output layer name.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -289,9 +289,9 @@ False if no bias.</li> ...@@ -289,9 +289,9 @@ False if no bias.</li>
<h2>Images<a class="headerlink" href="#images" title="Permalink to this headline"></a></h2> <h2>Images<a class="headerlink" href="#images" title="Permalink to this headline"></a></h2>
<div class="section" id="img-conv-bn-pool"> <div class="section" id="img-conv-bn-pool">
<h3>img_conv_bn_pool<a class="headerlink" href="#img-conv-bn-pool" title="Permalink to this headline"></a></h3> <h3>img_conv_bn_pool<a class="headerlink" href="#img-conv-bn-pool" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.networks.</code><code class="descname">img_conv_bn_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">img_conv_bn_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Convolution, batch normalization, pooling group.</p> <dd><p>Convolution, batch normalization, pooling group.</p>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
<col class="field-name" /> <col class="field-name" />
...@@ -299,33 +299,33 @@ False if no bias.</li> ...@@ -299,33 +299,33 @@ False if no bias.</li>
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; group name</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; group name</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; layer&#8217;s input</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; layer&#8217;s input</li>
<li><strong>filter_size</strong> (<em>int</em>) &#8211; see img_conv&#8217;s document</li> <li><strong>filter_size</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document</li>
<li><strong>num_filters</strong> (<em>int</em>) &#8211; see img_conv&#8217;s document</li> <li><strong>num_filters</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document</li>
<li><strong>pool_size</strong> (<em>int</em>) &#8211; see img_pool&#8217;s document.</li> <li><strong>pool_size</strong> (<em>int</em>) &#8211; see img_pool_layer&#8217;s document.</li>
<li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; see img_pool&#8217;s document.</li> <li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; see img_pool_layer&#8217;s document.</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; see batch_norm&#8217;s document.</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; see batch_norm_layer&#8217;s document.</li>
<li><strong>groups</strong> (<em>int</em>) &#8211; see img_conv&#8217;s document</li> <li><strong>groups</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document</li>
<li><strong>conv_stride</strong> (<em>int</em>) &#8211; see img_conv&#8217;s document.</li> <li><strong>conv_stride</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document.</li>
<li><strong>conv_padding</strong> (<em>int</em>) &#8211; see img_conv&#8217;s document.</li> <li><strong>conv_padding</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document.</li>
<li><strong>conv_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; see img_conv&#8217;s document.</li> <li><strong>conv_bias_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer&#8217;s document.</li>
<li><strong>num_channel</strong> (<em>int</em>) &#8211; see img_conv&#8217;s document.</li> <li><strong>num_channel</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document.</li>
<li><strong>conv_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; see img_conv&#8217;s document.</li> <li><strong>conv_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer&#8217;s document.</li>
<li><strong>shared_bias</strong> (<em>bool</em>) &#8211; see img_conv&#8217;s document.</li> <li><strong>shared_bias</strong> (<em>bool</em>) &#8211; see img_conv_layer&#8217;s document.</li>
<li><strong>conv_attr</strong> (<em>Extrapaddle.v2.config_base.Layer</em>) &#8211; see img_conv&#8217;s document.</li> <li><strong>conv_layer_attr</strong> (<em>ExtraLayerOutput</em>) &#8211; see img_conv_layer&#8217;s document.</li>
<li><strong>bn_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute.</em>) &#8211; see batch_norm&#8217;s document.</li> <li><strong>bn_param_attr</strong> (<em>ParameterAttribute.</em>) &#8211; see batch_norm_layer&#8217;s document.</li>
<li><strong>bn_bias_attr</strong> &#8211; see batch_norm&#8217;s document.</li> <li><strong>bn_bias_attr</strong> &#8211; see batch_norm_layer&#8217;s document.</li>
<li><strong>bn_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; extra attribute of the batch normalization layer.</li> <li><strong>bn_layer_attr</strong> (<em>ParameterAttribute</em>) &#8211; extra attribute of the batch normalization layer.</li>
<li><strong>pool_stride</strong> (<em>int</em>) &#8211; see img_pool&#8217;s document.</li> <li><strong>pool_stride</strong> (<em>int</em>) &#8211; see img_pool_layer&#8217;s document.</li>
<li><strong>pool_padding</strong> (<em>int</em>) &#8211; see img_pool&#8217;s document.</li> <li><strong>pool_padding</strong> (<em>int</em>) &#8211; see img_pool_layer&#8217;s document.</li>
<li><strong>pool_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; see img_pool&#8217;s document.</li> <li><strong>pool_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see img_pool_layer&#8217;s document.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">Layer groups output</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">Layer groups output</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -335,9 +335,9 @@ False if no bias.</li> ...@@ -335,9 +335,9 @@ False if no bias.</li>
</div> </div>
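<p>A sketch of one conv =&gt; batch-norm =&gt; pool block built with this helper; <code>img</code> is a hypothetical image input layer, the sizes are illustrative, and <code>paddle.v2.activation.Relu</code> is an assumed activation class name (only the <code>paddle.v2.activation</code> module is confirmed by this page):</p>
<div class="highlight-python"><div class="highlight"><pre>import paddle.v2

block = paddle.v2.networks.img_conv_bn_pool(input=img,
                                            filter_size=3,
                                            num_filters=64,
                                            num_channel=3,
                                            pool_size=2,
                                            pool_stride=2,
                                            act=paddle.v2.activation.Relu())
</pre></div>
</div>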
<div class="section" id="img-conv-group"> <div class="section" id="img-conv-group">
<h3>img_conv_group<a class="headerlink" href="#img-conv-group" title="Permalink to this headline"></a></h3> <h3>img_conv_group<a class="headerlink" href="#img-conv-group" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.networks.</code><code class="descname">img_conv_group</code><span class="sig-paren">(</span><em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">img_conv_group</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Image Convolution Group, Used for vgg net.</p> <dd><p>Image Convolution Group, Used for vgg net.</p>
<p>TODO(yuyang18): Complete docs</p> <p>TODO(yuyang18): Complete docs</p>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -369,9 +369,9 @@ False if no bias.</li> ...@@ -369,9 +369,9 @@ False if no bias.</li>
</div> </div>
<div class="section" id="simple-img-conv-pool"> <div class="section" id="simple-img-conv-pool">
<span id="api-trainer-config-helpers-network-simple-img-conv-pool"></span><h3>simple_img_conv_pool<a class="headerlink" href="#simple-img-conv-pool" title="Permalink to this headline"></a></h3> <span id="api-trainer-config-helpers-network-simple-img-conv-pool"></span><h3>simple_img_conv_pool<a class="headerlink" href="#simple-img-conv-pool" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_img_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_img_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Simple image convolution and pooling group.</p> <dd><p>Simple image convolution and pooling group.</p>
<p>Input =&gt; conv =&gt; pooling</p> <p>Input =&gt; conv =&gt; pooling</p>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -380,30 +380,30 @@ False if no bias.</li> ...@@ -380,30 +380,30 @@ False if no bias.</li>
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; group name</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; group name</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>filter_size</strong> (<em>int</em>) &#8211; see img_conv for details</li> <li><strong>filter_size</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li>
<li><strong>num_filters</strong> (<em>int</em>) &#8211; see img_conv for details</li> <li><strong>num_filters</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li>
<li><strong>pool_size</strong> (<em>int</em>) &#8211; see img_pool for details</li> <li><strong>pool_size</strong> (<em>int</em>) &#8211; see img_pool_layer for details</li>
<li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; see img_pool for details</li> <li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; see img_pool_layer for details</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; see img_conv for details</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; see img_conv_layer for details</li>
<li><strong>groups</strong> (<em>int</em>) &#8211; see img_conv for details</li> <li><strong>groups</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li>
<li><strong>conv_stride</strong> (<em>int</em>) &#8211; see img_conv for details</li> <li><strong>conv_stride</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li>
<li><strong>conv_padding</strong> (<em>int</em>) &#8211; see img_conv for details</li> <li><strong>conv_padding</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; see img_conv for details</li> <li><strong>bias_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer for details</li>
<li><strong>num_channel</strong> (<em>int</em>) &#8211; see img_conv for details</li> <li><strong>num_channel</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; see img_conv for details</li> <li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer for details</li>
<li><strong>shared_bias</strong> (<em>bool</em>) &#8211; see img_conv for details</li> <li><strong>shared_bias</strong> (<em>bool</em>) &#8211; see img_conv_layer for details</li>
<li><strong>conv_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; see img_conv for details</li> <li><strong>conv_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see img_conv_layer for details</li>
<li><strong>pool_stride</strong> (<em>int</em>) &#8211; see img_pool for details</li> <li><strong>pool_stride</strong> (<em>int</em>) &#8211; see img_pool_layer for details</li>
<li><strong>pool_padding</strong> (<em>int</em>) &#8211; see img_pool for details</li> <li><strong>pool_padding</strong> (<em>int</em>) &#8211; see img_pool_layer for details</li>
<li><strong>pool_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; see img_pool for details</li> <li><strong>pool_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see img_pool_layer for details</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">Layer&#8217;s output</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">Layer&#8217;s output</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -416,9 +416,9 @@ False if no bias.</li> ...@@ -416,9 +416,9 @@ False if no bias.</li>
</div> </div>
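<p>A sketch of the Input =&gt; conv =&gt; pooling group on a hypothetical single-channel image (values illustrative, activation class name assumed as above):</p>
<div class="highlight-python"><div class="highlight"><pre>import paddle.v2

conv_pool = paddle.v2.networks.simple_img_conv_pool(input=img,
                                                    filter_size=5,
                                                    num_filters=20,
                                                    num_channel=1,
                                                    pool_size=2,
                                                    pool_stride=2,
                                                    act=paddle.v2.activation.Relu())
</pre></div>
</div>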
<div class="section" id="vgg-16-network"> <div class="section" id="vgg-16-network">
<h3>vgg_16_network<a class="headerlink" href="#vgg-16-network" title="Permalink to this headline"></a></h3> <h3>vgg_16_network<a class="headerlink" href="#vgg-16-network" title="Permalink to this headline"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.networks.</code><code class="descname">vgg_16_network</code><span class="sig-paren">(</span><em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">vgg_16_network</code><span class="sig-paren">(</span><em>input_image</em>, <em>num_channels</em>, <em>num_classes=1000</em><span class="sig-paren">)</span></dt>
<dd><p>Same model from <a class="reference external" href="https://gist.github.com/ksimonyan/211839e770f7b538e2d8">https://gist.github.com/ksimonyan/211839e770f7b538e2d8</a></p> <dd><p>Same model from <a class="reference external" href="https://gist.github.com/ksimonyan/211839e770f7b538e2d8">https://gist.github.com/ksimonyan/211839e770f7b538e2d8</a></p>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
<col class="field-name" /> <col class="field-name" />
...@@ -426,7 +426,7 @@ False if no bias.</li> ...@@ -426,7 +426,7 @@ False if no bias.</li>
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>num_classes</strong> &#8211; </li> <li><strong>num_classes</strong> &#8211; </li>
<li><strong>input_image</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; </li> <li><strong>input_image</strong> (<em>LayerOutput</em>) &#8211; </li>
<li><strong>num_channels</strong> (<em>int</em>) &#8211; </li> <li><strong>num_channels</strong> (<em>int</em>) &#8211; </li>
</ul> </ul>
</td> </td>
...@@ -446,9 +446,9 @@ False if no bias.</li> ...@@ -446,9 +446,9 @@ False if no bias.</li>
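<p>A sketch matching the vgg_16_network signature shown above; <code>img</code> is a hypothetical 224x224 RGB image input layer:</p>
<div class="highlight-python"><div class="highlight"><pre>import paddle.v2

predict = paddle.v2.networks.vgg_16_network(input_image=img,
                                            num_channels=3,
                                            num_classes=1000)
</pre></div>
</div>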
<h3>LSTM<a class="headerlink" href="#lstm" title="Permalink to this headline"></a></h3> <h3>LSTM<a class="headerlink" href="#lstm" title="Permalink to this headline"></a></h3>
<div class="section" id="lstmemory-unit"> <div class="section" id="lstmemory-unit">
<h4>lstmemory_unit<a class="headerlink" href="#lstmemory-unit" title="Permalink to this headline"></a></h4> <h4>lstmemory_unit<a class="headerlink" href="#lstmemory-unit" title="Permalink to this headline"></a></h4>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.networks.</code><code class="descname">lstmemory_unit</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">lstmemory_unit</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Defines the calculations that an LSTM unit performs in a single time step. <dd><p>Defines the calculations that an LSTM unit performs in a single time step.
This function itself is not a recurrent layer, so it cannot be This function itself is not a recurrent layer, so it cannot be
directly applied to sequence input. This function is always used in directly applied to sequence input. This function is always used in
...@@ -462,9 +462,9 @@ for more details about LSTM. The link goes as follows: ...@@ -462,9 +462,9 @@ for more details about LSTM. The link goes as follows:
<p>The example usage is:</p> <p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">lstm_step</span> <span class="o">=</span> <span class="n">lstmemory_unit</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">],</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">lstm_step</span> <span class="o">=</span> <span class="n">lstmemory_unit</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">],</span>
<span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span>
<span class="n">act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">Activation</span><span class="o">.</span><span class="n">Tanh</span><span class="p">(),</span> <span class="n">act</span><span class="o">=</span><span class="n">TanhActivation</span><span class="p">(),</span>
<span class="n">gate_act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">Activation</span><span class="o">.</span><span class="n">Sigmoid</span><span class="p">(),</span> <span class="n">gate_act</span><span class="o">=</span><span class="n">SigmoidActivation</span><span class="p">(),</span>
<span class="n">state_act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">Activation</span><span class="o">.</span><span class="n">Tanh</span><span class="p">())</span> <span class="n">state_act</span><span class="o">=</span><span class="n">TanhActivation</span><span class="p">())</span>
</pre></div> </pre></div>
</div> </div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -472,27 +472,27 @@ for more details about LSTM. The link goes as follows: ...@@ -472,27 +472,27 @@ for more details about LSTM. The link goes as follows:
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; lstmemory unit name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; lstmemory unit name.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstmemory unit size.</li> <li><strong>size</strong> (<em>int</em>) &#8211; lstmemory unit size.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; Parameter config, None if use default.</li> <li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; Parameter config, None if use default.</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; lstm final activation type</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; lstm final activation type</li>
<li><strong>gate_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; lstm gate activation type</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; lstm gate activation type</li>
<li><strong>state_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; lstm state activation type.</li> <li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; lstm state activation type.</li>
<li><strong>mixed_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; bias parameter attribute of mixed layer. <li><strong>mixed_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute of mixed layer.
False means no bias, None means default bias.</li> False means no bias, None means default bias.</li>
<li><strong>lstm_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; bias parameter attribute of lstm layer. <li><strong>lstm_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute of lstm layer.
False means no bias, None means default bias.</li> False means no bias, None means default bias.</li>
<li><strong>mixed_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; mixed layer&#8217;s extra attribute.</li> <li><strong>mixed_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; mixed layer&#8217;s extra attribute.</li>
<li><strong>lstm_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; lstm layer&#8217;s extra attribute.</li> <li><strong>lstm_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; lstm layer&#8217;s extra attribute.</li>
<li><strong>get_output_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; get output layer&#8217;s extra attribute.</li> <li><strong>get_output_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; get output layer&#8217;s extra attribute.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">lstmemory unit name.</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">lstmemory unit name.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -502,9 +502,9 @@ False means no bias, None means default bias.</li> ...@@ -502,9 +502,9 @@ False means no bias, None means default bias.</li>
</div> </div>
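<p>Since the docstring says this function defines a single time step and is used inside a recurrent group, here is a hedged sketch of that wiring; <code>paddle.v2.layer.recurrent_group</code> and its <code>step</code>/<code>input</code> keywords are assumptions, and <code>emb</code> is a hypothetical input sequence:</p>
<div class="highlight-python"><div class="highlight"><pre>import paddle.v2

def lstm_step(current_input):
    # One LSTM time step over the current input of the recurrent group.
    return paddle.v2.networks.lstmemory_unit(input=current_input, size=256)

lstm_seq = paddle.v2.layer.recurrent_group(step=lstm_step, input=emb)
</pre></div>
</div>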
<div class="section" id="lstmemory-group"> <div class="section" id="lstmemory-group">
<h4>lstmemory_group<a class="headerlink" href="#lstmemory-group" title="Permalink to this headline"></a></h4> <h4>lstmemory_group<a class="headerlink" href="#lstmemory-group" title="Permalink to this headline"></a></h4>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.networks.</code><code class="descname">lstmemory_group</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">lstmemory_group</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>lstm_group is a recurrent layer group version of Long Short Term Memory. It <dd><p>lstm_group is a recurrent layer group version of Long Short Term Memory. It
performs exactly the same calculation as the lstmemory layer (see lstmemory in performs exactly the same calculation as the lstmemory layer (see lstmemory in
layers.py for the maths). A promising benefit is that LSTM memory layers.py for the maths). A promising benefit is that LSTM memory
...@@ -517,14 +517,14 @@ lstmemory_group.</p> ...@@ -517,14 +517,14 @@ lstmemory_group.</p>
multiplications: multiplications:
<span class="math">\(W_{xi}x_{t}\)</span> , <span class="math">\(W_{xf}x_{t}\)</span>, <span class="math">\(W_{xi}x_{t}\)</span> , <span class="math">\(W_{xf}x_{t}\)</span>,
<span class="math">\(W_{xc}x_t\)</span>, <span class="math">\(W_{xo}x_{t}\)</span> are not done in lstmemory_unit to <span class="math">\(W_{xc}x_t\)</span>, <span class="math">\(W_{xo}x_{t}\)</span> are not done in lstmemory_unit to
speed up the calculations. Consequently, an additional mixed with speed up the calculations. Consequently, an additional mixed_layer with
full_matrix_projection must be included before lstmemory_unit is called.</p> full_matrix_projection must be included before lstmemory_unit is called.</p>
<p>The example usage is:</p> <p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">lstm_step</span> <span class="o">=</span> <span class="n">lstmemory_group</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">],</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">lstm_step</span> <span class="o">=</span> <span class="n">lstmemory_group</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">],</span>
<span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span>
<span class="n">act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">Activation</span><span class="o">.</span><span class="n">Tanh</span><span class="p">(),</span> <span class="n">act</span><span class="o">=</span><span class="n">TanhActivation</span><span class="p">(),</span>
<span class="n">gate_act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">Activation</span><span class="o">.</span><span class="n">Sigmoid</span><span class="p">(),</span> <span class="n">gate_act</span><span class="o">=</span><span class="n">SigmoidActivation</span><span class="p">(),</span>
<span class="n">state_act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">Activation</span><span class="o">.</span><span class="n">Tanh</span><span class="p">())</span> <span class="n">state_act</span><span class="o">=</span><span class="n">TanhActivation</span><span class="p">())</span>
</pre></div> </pre></div>
</div> </div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -532,28 +532,28 @@ full_matrix_projection must be included before lstmemory_unit is called.</p> ...@@ -532,28 +532,28 @@ full_matrix_projection must be included before lstmemory_unit is called.</p>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; lstmemory group name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; lstmemory group name.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstmemory group size.</li> <li><strong>size</strong> (<em>int</em>) &#8211; lstmemory group size.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; is lstm reversed</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; is lstm reversed</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; Parameter config, None if use default.</li> <li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; Parameter config, None if use default.</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; lstm final activation type</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; lstm final activation type</li>
<li><strong>gate_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; lstm gate activation type</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; lstm gate activation type</li>
<li><strong>state_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; lstm state activation type.</li> <li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; lstm state activation type.</li>
<li><strong>mixed_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; bias parameter attribute of mixed layer. <li><strong>mixed_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute of mixed layer.
False means no bias, None means default bias.</li> False means no bias, None means default bias.</li>
<li><strong>lstm_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; bias parameter attribute of lstm layer. <li><strong>lstm_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute of lstm layer.
False means no bias, None means default bias.</li> False means no bias, None means default bias.</li>
<li><strong>mixed_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; mixed layer&#8217;s extra attribute.</li> <li><strong>mixed_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; mixed layer&#8217;s extra attribute.</li>
<li><strong>lstm_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; lstm layer&#8217;s extra attribute.</li> <li><strong>lstm_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; lstm layer&#8217;s extra attribute.</li>
<li><strong>get_output_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; get output layer&#8217;s extra attribute.</li> <li><strong>get_output_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; get output layer&#8217;s extra attribute.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">the lstmemory group.</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">the lstmemory group.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -563,9 +563,9 @@ False means no bias, None means default bias.</li> ...@@ -563,9 +563,9 @@ False means no bias, None means default bias.</li>
</div> </div>
<div class="section" id="simple-lstm"> <div class="section" id="simple-lstm">
<h4>simple_lstm<a class="headerlink" href="#simple-lstm" title="Permalink to this headline"></a></h4> <h4>simple_lstm<a class="headerlink" href="#simple-lstm" title="Permalink to this headline"></a></h4>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_lstm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_lstm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Simple LSTM Cell.</p> <dd><p>Simple LSTM Cell.</p>
<p>It just combines a mixed layer with full_matrix_projection and a lstmemory <p>It just combines a mixed layer with full_matrix_projection and a lstmemory
layer. The simple LSTM cell is implemented with the following equations.</p> layer. The simple LSTM cell is implemented with the following equations.</p>
...@@ -579,25 +579,25 @@ want to know what lstm is. <a class="reference external" href="http://arxiv.org/ ...@@ -579,25 +579,25 @@ want to know what lstm is. <a class="reference external" href="http://arxiv.org/
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; lstm layer name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; lstm layer name.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstm layer size.</li> <li><strong>size</strong> (<em>int</em>) &#8211; lstm layer size.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li>
<li><strong>mat_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; mixed layer&#8217;s matrix projection parameter attribute.</li> <li><strong>mat_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; mixed layer&#8217;s matrix projection parameter attribute.</li>
<li><strong>bias_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; bias parameter attribute. False means no bias, None <li><strong>bias_param_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute. False means no bias, None
means default bias.</li> means default bias.</li>
<li><strong>inner_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; lstm cell parameter attribute.</li> <li><strong>inner_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; lstm cell parameter attribute.</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; lstm final activation type</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; lstm final activation type</li>
<li><strong>gate_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; lstm gate activation type</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; lstm gate activation type</li>
<li><strong>state_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; lstm state activation type.</li> <li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; lstm state activation type.</li>
<li><strong>mixed_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; mixed layer&#8217;s extra attribute.</li> <li><strong>mixed_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; mixed layer&#8217;s extra attribute.</li>
<li><strong>lstm_cell_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; lstm layer&#8217;s extra attribute.</li> <li><strong>lstm_cell_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; lstm layer&#8217;s extra attribute.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">lstm layer name.</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">lstm layer name.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -607,9 +607,9 @@ means default bias.</li> ...@@ -607,9 +607,9 @@ means default bias.</li>
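<p>For illustration only, a minimal usage sketch under the assumption that a word-id input and an embedding layer (hypothetical names <code class="docutils literal"><span class="pre">data</span></code> and <code class="docutils literal"><span class="pre">emb</span></code>, vocabulary size 10000) are defined with the v2 API:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>import paddle.v2 as paddle

# word ids -> embedding -> LSTM over the sequence
data = paddle.layer.data(name='word',
                         type=paddle.data_type.integer_value_sequence(10000))
emb = paddle.layer.embedding(input=data, size=256)
lstm = paddle.networks.simple_lstm(input=emb, size=128)
</pre></div>
</div>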
</div>
<div class="section" id="bidirectional-lstm">
<h4>bidirectional_lstm<a class="headerlink" href="#bidirectional-lstm" title="Permalink to this headline"></a></h4>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">bidirectional_lstm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>A bidirectional_lstm is a recurrent unit that iterates over the input
sequence both in forward and backward orders, and then concatenates the two
outputs to form a final output. However, concatenation of two outputs
@@ -629,7 +629,7 @@ The link goes as follows:
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; bidirectional lstm layer name.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstm layer size.</li>
<li><strong>return_seq</strong> (<em>bool</em>) &#8211; If set False, outputs of the last time step are
concatenated and returned.
@@ -639,10 +639,10 @@ concatenated and returned.</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">LayerOutput object according to the return_seq.</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td>
</tr>
</tbody>
@@ -655,9 +655,9 @@ concatenated and returned.</li>
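<p>A hedged sketch of the call, reusing the hypothetical <code class="docutils literal"><span class="pre">emb</span></code> layer from the simple_lstm example; with <code class="docutils literal"><span class="pre">return_seq=True</span></code> the whole output sequence is kept rather than only the last step:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># forward and backward LSTM features, combined per time step
bi_lstm = paddle.networks.bidirectional_lstm(input=emb, size=128,
                                             return_seq=True)
</pre></div>
</div>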
<h3>GRU<a class="headerlink" href="#gru" title="Permalink to this headline"></a></h3>
<div class="section" id="gru-unit">
<h4>gru_unit<a class="headerlink" href="#gru-unit" title="Permalink to this headline"></a></h4>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">gru_unit</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Define calculations that a gated recurrent unit performs in a single time
step. This function itself is not a recurrent layer, so it cannot be
directly applied to sequence input. This function is almost always used in
@@ -669,19 +669,19 @@ mechanism.</p>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li> <li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; type of the activation</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; type of the activation</li>
<li><strong>gate_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; type of the gate activation</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; type of the gate activation</li>
<li><strong>gru_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li> <li><strong>gru_layer_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">the gru output layer.</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">the gru output layer.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -691,9 +691,9 @@ mechanism.</p> ...@@ -691,9 +691,9 @@ mechanism.</p>
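<p>Since gru_unit computes a single step, it is typically called from a step function inside a recurrent group. A sketch under that assumption (the step function, its names, and the sizes are illustrative, not part of the API):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>def step(x):
    # project the step input first: the W*x_t multiplication is not
    # computed inside the unit, so the unit's input is 3*size wide
    proj = paddle.layer.fc(input=x, size=256 * 3)
    return paddle.networks.gru_unit(input=proj, size=256)

out = paddle.layer.recurrent_group(step=step, input=emb)
</pre></div>
</div>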
</div>
<div class="section" id="gru-group">
<h4>gru_group<a class="headerlink" href="#gru-group" title="Permalink to this headline"></a></h4>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">gru_group</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>gru_group is a recurrent layer group version of the Gated Recurrent Unit. It
does exactly the same calculation as the grumemory layer does. A promising
benefit is that gru hidden states are accessible to the user. This is
@@ -704,8 +704,8 @@ to use the grumemory, which is relatively faster.</p>
<p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">gru</span> <span class="o">=</span> <span class="n">gru_group</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">],</span>
                <span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span>
                <span class="n">act</span><span class="o">=</span><span class="n">TanhActivation</span><span class="p">(),</span>
                <span class="n">gate_act</span><span class="o">=</span><span class="n">SigmoidActivation</span><span class="p">())</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -713,21 +713,21 @@ to use the grumemory, which is relatively faster.</p> ...@@ -713,21 +713,21 @@ to use the grumemory, which is relatively faster.</p>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li> <li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; type of the activiation</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; type of the activiation</li>
<li><strong>gate_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; type of the gate activiation</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; type of the gate activiation</li>
<li><strong>gru_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li> <li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li>
<li><strong>gru_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li> <li><strong>gru_layer_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">the gru group.</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">the gru group.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -737,23 +737,23 @@ to use the grumemory, which is relatively faster.</p> ...@@ -737,23 +737,23 @@ to use the grumemory, which is relatively faster.</p>
</div>
<div class="section" id="simple-gru">
<h4>simple_gru<a class="headerlink" href="#simple-gru" title="Permalink to this headline"></a></h4>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_gru</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>You may see gru_step_layer and grumemory in layers.py, and gru_unit,
gru_group, and simple_gru in network.py. The reason why there are so many
interfaces is that we have two ways to implement recurrent neural networks.
One way is to use one complete layer to implement the rnn (including simple
rnn, gru and lstm) with multiple time steps, such as recurrent_layer,
lstmemory, and grumemory. But the multiplication operation <span class="math">\(W x_t\)</span> is not
computed in these layers. See details in their interfaces in layers.py.
The other implementation is to use a recurrent group, which can assemble a
series of layers to compute the rnn step by step. This way is flexible for
attention mechanisms or other complex connections.</p>
<ul class="simple">
<li>gru_step_layer: only computes the rnn by one step. It needs a memory as
input and can be used in a recurrent group.</li>
<li>gru_unit: a wrapper of gru_step_layer with memory.</li>
<li>gru_group: a GRU cell implemented by a combination of multiple layers in a
recurrent group.
But <span class="math">\(W x_t\)</span> is not done in the group.</li>
@@ -774,21 +774,21 @@ gru_group, and gru_group is relatively better than simple_gru.</p>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li> <li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; type of the activiation</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; type of the activiation</li>
<li><strong>gate_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; type of the gate activiation</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; type of the gate activiation</li>
<li><strong>gru_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li> <li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li>
<li><strong>gru_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li> <li><strong>gru_layer_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">the gru group.</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">the gru group.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -798,9 +798,9 @@ gru_group, and gru_group is relatively better than simple_gru.</p> ...@@ -798,9 +798,9 @@ gru_group, and gru_group is relatively better than simple_gru.</p>
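<p>In contrast to the lower-level interfaces above, the common case is a single call. A minimal sketch, again assuming the hypothetical <code class="docutils literal"><span class="pre">emb</span></code> input layer from earlier:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># input projection, gating and state updates are all handled internally
gru = paddle.networks.simple_gru(input=emb, size=256)
</pre></div>
</div>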
</div>
<div class="section" id="simple-gru2">
<h4>simple_gru2<a class="headerlink" href="#simple-gru2" title="Permalink to this headline"></a></h4>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_gru2</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>simple_gru2 is the same as simple_gru, but it uses grumemory instead.
Please see grumemory in layers.py for more detail about the maths.
simple_gru2 is faster than simple_gru.</p>
@@ -813,21 +813,21 @@ simple_gru2 is faster than simple_gru.</p>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li> <li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; type of the activiation</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; type of the activiation</li>
<li><strong>gate_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; type of the gate activiation</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; type of the gate activiation</li>
<li><strong>gru_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li> <li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li>
<li><strong>gru_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li> <li><strong>gru_layer_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">the gru group.</p> <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">the gru group.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -837,9 +837,9 @@ simple_gru2 is faster than simple_gru.</p> ...@@ -837,9 +837,9 @@ simple_gru2 is faster than simple_gru.</p>
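<p>The call is the same as for simple_gru; only the underlying implementation differs (a sketch with the same hypothetical <code class="docutils literal"><span class="pre">emb</span></code> input):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>gru = paddle.networks.simple_gru2(input=emb, size=256)
</pre></div>
</div>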
</div>
<div class="section" id="bidirectional-gru">
<h4>bidirectional_gru<a class="headerlink" href="#bidirectional-gru" title="Permalink to this headline"></a></h4>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">bidirectional_gru</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>A bidirectional_gru is a recurrent unit that iterates over the input
sequence both in forward and backward orders, and then concatenates the two
outputs to form a final output. However, concatenation of two outputs
@@ -855,7 +855,7 @@ just add them together.</p>
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; bidirectional gru layer name.</li>
<li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; gru layer size.</li>
<li><strong>return_seq</strong> (<em>bool</em>) &#8211; If set False, outputs of the last time step are
concatenated and returned.
@@ -865,10 +865,10 @@ concatenated and returned.</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">LayerOutput object.</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td>
</tr>
</tbody>
@@ -879,9 +879,9 @@ concatenated and returned.</li>
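<p>A sketch of the last-step variant (hypothetical <code class="docutils literal"><span class="pre">emb</span></code> input; with <code class="docutils literal"><span class="pre">return_seq=False</span></code> only the final outputs of the two directions are combined, e.g. as a sequence-level feature for classification):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>feature = paddle.networks.bidirectional_gru(input=emb, size=128,
                                            return_seq=False)
</pre></div>
</div>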
</div>
<div class="section" id="simple-attention">
<h3>simple_attention<a class="headerlink" href="#simple-attention" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_attention</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Calculate and then return a context vector by the attention mechanism.
The size of the context vector equals the size of the encoded_sequence.</p>
<div class="math">
@@ -905,18 +905,18 @@ Align and Translate</strong> for more details. The link is as follows:
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the attention model.</li>
<li><strong>softmax_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; parameter attribute of the sequence softmax
that is used to produce the attention weight</li>
<li><strong>weight_act</strong> (<em>Activation</em>) &#8211; activation of the attention model</li>
<li><strong>encoded_sequence</strong> (<em>LayerOutput</em>) &#8211; output of the encoder</li>
<li><strong>encoded_proj</strong> (<em>LayerOutput</em>) &#8211; attention weight is computed by a feed-forward neural
network which has two inputs: decoder&#8217;s hidden state
of the previous time step and encoder&#8217;s output.
encoded_proj is the output of the feed-forward network for
encoder&#8217;s output. Here we pre-compute it outside
simple_attention for speed consideration.</li>
<li><strong>decoder_state</strong> (<em>LayerOutput</em>) &#8211; hidden state of the decoder in the previous time step</li>
<li><strong>transform_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; parameter attribute of the feed-forward
network that takes decoder_state as input to
compute the attention weight.</li>
</ul>
@@ -935,9 +935,9 @@ compute attention weight.</li>
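<p>simple_attention is normally called inside the decoder&#8217;s recurrent group, once per decoding step. A sketch under stated assumptions: <code class="docutils literal"><span class="pre">enc</span></code> is a hypothetical encoder output sequence, <code class="docutils literal"><span class="pre">enc_proj</span></code> its pre-computed projection, and <code class="docutils literal"><span class="pre">decoder_state</span></code> the decoder memory from the previous step:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>context = paddle.networks.simple_attention(
    encoded_sequence=enc,         # encoder output, one vector per position
    encoded_proj=enc_proj,        # projection of enc, computed once outside
    decoder_state=decoder_state)  # previous decoder hidden state
</pre></div>
</div>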
<h2>Miscs<a class="headerlink" href="#miscs" title="Permalink to this headline"></a></h2>
<div class="section" id="dropout-layer">
<h3>dropout_layer<a class="headerlink" href="#dropout-layer" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">dropout_layer</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>&#64;TODO(yuyang18): Add comments.</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
......
@@ -185,12 +185,50 @@
<h1>Data Reader Interface and DataSets<a class="headerlink" href="#data-reader-interface-and-datasets" title="Permalink to this headline"></a></h1>
<div class="section" id="datatypes">
<h2>DataTypes<a class="headerlink" href="#datatypes" title="Permalink to this headline"></a></h2>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.data_type.</code><code class="descname">dense_array</code><span class="sig-paren">(</span><em>dim</em>, <em>seq_type=0</em><span class="sig-paren">)</span></dt>
<dd><p>Dense Array. It means the input feature is a dense array of floats.
For example, if the input is an image with 28*28 pixels, the input of the
Paddle neural network could be a dense vector with dimension 784 or a
numpy array with shape (28, 28).</p>
<p>For the 2-D convolution operation, each sample in one mini-batch must
currently have the same size in PaddlePaddle, but the feature dimension may
vary across mini-batches. In the variable-dimension case, the param dim is
not used; the data reader must yield numpy arrays, and the data feeder will
set the data shape correctly.</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>dim</strong> (<em>int</em>) &#8211; dimension of this vector.</li>
<li><strong>seq_type</strong> (<em>int</em>) &#8211; sequence type of input.</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">An input type object.</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">InputType</p>
</td>
</tr>
</tbody>
</table>
</dd></dl>
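<p>A sketch of how the input type pairs with a reader (names are illustrative; this uses the 28*28 image case from above):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>import numpy as np
import paddle.v2 as paddle

# declare the input type: a dense 784-dimensional feature
image_type = paddle.data_type.dense_array(784)

def reader():
    # each sample is a tuple of features; a flat 784-vector here, or a
    # (28, 28) numpy array in the variable-dimension case
    for _ in range(4):
        yield (np.random.rand(784).astype('float32'),)
</pre></div>
</div>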
<dl class="function"> <dl class="function">
<dt> <dt>
<code class="descclassname">paddle.v2.data_type.</code><code class="descname">dense_vector</code><span class="sig-paren">(</span><em>dim</em>, <em>seq_type=0</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.data_type.</code><code class="descname">dense_vector</code><span class="sig-paren">(</span><em>dim</em>, <em>seq_type=0</em><span class="sig-paren">)</span></dt>
<dd><p>Dense Vector. It means the input feature is dense float vector. For example, <dd><p>Dense Array. It means the input feature is dense array with float type.
if the input is an image with 28*28 pixels, the input of Paddle neural For example, if the input is an image with 28*28 pixels, the input of
network should be a dense vector with dimension 784.</p> Paddle neural network could be a dense vector with dimension 784 or a
numpy array with shape (28, 28).</p>
<p>For the 2-D convolution operation, each sample in one mini-batch must have
the similarly size in PaddlePaddle now. But, it supports variable-dimension
feature across mini-batch. For the variable-dimension, the param dim is not
used. While the data reader must yield numpy array and the data feeder will
set the data shape correctly.</p>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
<col class="field-name" /> <col class="field-name" />
<col class="field-body" /> <col class="field-body" />
......
@@ -186,7 +186,7 @@
</div>
<div class="section" id="task-queue">
<span id="task-queue"></span><h2>Task Queue<a class="headerlink" href="#task-queue" title="Permalink to this headline"></a></h2>
<p>As mentioned in <a class="reference internal" href="README.html"><span class="doc">distributed training design doc</span></a>, a <em>task</em> is a data shard that the master server assigns to the trainer process to train on. A task consists of one or multiple <em>chunks</em> from one or multiple files. The master server maintains <em>task queues</em> to track the training progress.</p>
<div class="section" id="task-queue-creation">
<span id="task-queue-creation"></span><h3>Task Queue Creation<a class="headerlink" href="#task-queue-creation" title="Permalink to this headline"></a></h3>
<ol>
@@ -197,21 +197,21 @@
</pre></div>
</div>
</li>
<li><p class="first">The master server will scan through each RecordIO file to generate the <em>block index</em> and know how many blocks does each file have. A block can be referenced by the file path and the index of the block within the file. The block index is in memory data structure that enables fast access to each block, and the index of the block with the file is an integer start from 0, representing the n-th block within the file.</p> <li><p class="first">The master server will scan through each RecordIO file to generate the <em>chunk index</em> and know how many chunks does each file have. A chunk can be referenced by the file path and the index of the chunk within the file. The chunk index is in memory data structure that enables fast access to each chunk, and the index of the chunk with the file is an integer start from 0, representing the n-th chunk within the file.</p>
<p>The definition of the block is:</p> <p>The definition of the chunk is:</p>
<div class="highlight-go"><div class="highlight"><pre><span></span><span class="kd">type</span> <span class="nx">Block</span> <span class="kd">struct</span> <span class="p">{</span> <div class="highlight-go"><div class="highlight"><pre><span></span><span class="kd">type</span> <span class="nx">Chunk</span> <span class="kd">struct</span> <span class="p">{</span>
<span class="nx">Idx</span> <span class="kt">int</span> <span class="c1">// index of the block within the file</span> <span class="nx">Idx</span> <span class="kt">int</span> <span class="c1">// index of the chunk within the file</span>
<span class="nx">Path</span> <span class="kt">string</span> <span class="nx">Path</span> <span class="kt">string</span>
<span class="nx">Index</span> <span class="nx">recordio</span><span class="p">.</span><span class="nx">Index</span> <span class="c1">// block index</span> <span class="nx">Index</span> <span class="nx">recordio</span><span class="p">.</span><span class="nx">Index</span> <span class="c1">// chunk index</span>
<span class="p">}</span> <span class="p">}</span>
</pre></div> </pre></div>
</div> </div>
</li> </li>
<li><p class="first">Blocks are grouped into tasks, and tasks are filled into the todo queue. The pending queue and the done queue are initialized with no element.</p> <li><p class="first">Chunks are grouped into tasks, and tasks are filled into the todo queue. The pending queue and the done queue are initialized with no element.</p>
<p>The definition of the task is:</p> <p>The definition of the task is:</p>
<div class="highlight-go"><div class="highlight"><pre><span></span><span class="kd">type</span> <span class="nx">Task</span> <span class="kd">struct</span> <span class="p">{</span> <div class="highlight-go"><div class="highlight"><pre><span></span><span class="kd">type</span> <span class="nx">Task</span> <span class="kd">struct</span> <span class="p">{</span>
<span class="nx">Index</span> <span class="kt">int</span> <span class="nx">Index</span> <span class="kt">int</span>
<span class="nx">Blocks</span> <span class="p">[]</span><span class="nx">Block</span> <span class="nx">Chunks</span> <span class="p">[]</span><span class="nx">Chunk</span>
<span class="p">}</span> <span class="p">}</span>
</pre></div> </pre></div>
</div> </div>
......
@@ -226,7 +226,7 @@ name:sparse-n-1
<div class="highlight-c"><div class="highlight"><pre><span></span><span class="kt">int</span> <span class="nf">paddle_begin_init_params</span><span class="p">(</span><span class="n">paddle_pserver_client</span><span class="o">*</span> <span class="n">client</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">config_proto</span><span class="p">);</span>
</pre></div>
</div>
<p>The selected trainer&#8217;s call to <code class="docutils literal"><span class="pre">paddle_begin_init_params</span></code> will return 1, and the other trainers&#8217; calls to <code class="docutils literal"><span class="pre">paddle_begin_init_params</span></code> will return 0. <code class="docutils literal"><span class="pre">paddle_get_params</span></code> will block until initialization is completed. As illustrated below:</p>
<p><img src="./src/pserver_init.png"></p>
</div>
</div>
@@ -259,16 +259,13 @@ name:sparse-n-1
<span class="cm"> *</span>
<span class="cm"> * paddle_begin_init_params will be called from multiple trainers,</span>
<span class="cm"> * only one trainer will be selected to initialize the parameters on</span>
<span class="cm"> * parameter servers. Other trainers need to get the initialized</span>
<span class="cm"> * parameters from parameter servers using @paddle_get_params.</span>
<span class="cm"> *</span>
<span class="cm"> * @return 1 if the trainer is selected to initialize parameter</span>
<span class="cm"> * servers, otherwise 0.</span>
<span class="cm"> */</span>
<span class="kt">int</span> <span class="nf">paddle_begin_init_params</span><span class="p">(</span><span class="n">paddle_pserver_client</span><span class="o">*</span> <span class="n">client</span><span class="p">);</span>
<span class="cm">/**</span> <span class="cm">/**</span>
<span class="cm"> * @brief paddle_init_param initializes the parameter on parameter</span> <span class="cm"> * @brief paddle_init_param initializes the parameter on parameter</span>
...@@ -276,12 +273,13 @@ name:sparse-n-1 ...@@ -276,12 +273,13 @@ name:sparse-n-1
<span class="cm"> *</span> <span class="cm"> *</span>
<span class="cm"> * @param param the parameter to initialize.</span> <span class="cm"> * @param param the parameter to initialize.</span>
<span class="cm"> * @param param_config_proto the configuration for the parameter.</span> <span class="cm"> * @param param_config_proto the configuration for the parameter.</span>
<span class="cm"> * @param config_len the length of param_config_proto</span>
<span class="cm"> * @return 0 if successful, otherwise -1. On failure, the trainer</span> <span class="cm"> * @return 0 if successful, otherwise -1. On failure, the trainer</span>
<span class="cm"> * needs to restart the entire initialization process (starting from</span> <span class="cm"> * needs to restart the entire initialization process (starting from</span>
<span class="cm"> * @paddle_begin_init_param). Or simply exit the program and wait for</span> <span class="cm"> * @paddle_begin_init_param). Or simply exit the program and wait for</span>
<span class="cm"> * the cluster management system to restart the trainer.</span> <span class="cm"> * the cluster management system to restart the trainer.</span>
<span class="cm"> */</span> <span class="cm"> */</span>
<span class="kt">int</span> <span class="nf">paddle_init_param</span><span class="p">(</span><span class="n">paddle_pserver_client</span><span class="o">*</span> <span class="n">client</span><span class="p">,</span> <span class="n">paddle_parameter</span> <span class="n">params</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">param_config_proto</span><span class="p">);</span> <span class="kt">int</span> <span class="nf">paddle_init_param</span><span class="p">(</span><span class="n">paddle_pserver_client</span><span class="o">*</span> <span class="n">client</span><span class="p">,</span> <span class="n">paddle_parameter</span> <span class="n">param</span><span class="p">,</span> <span class="k">const</span> <span class="kt">unsigned</span> <span class="kt">char</span><span class="o">*</span> <span class="n">param_config_proto</span><span class="p">,</span> <span class="kt">int</span> <span class="n">config_len</span><span class="p">);</span>
<span class="cm">/**</span> <span class="cm">/**</span>
<span class="cm"> * @brief paddle_finish_init_params tells parameter servers client has</span> <span class="cm"> * @brief paddle_finish_init_params tells parameter servers client has</span>
...@@ -308,6 +306,9 @@ name:sparse-n-1 ...@@ -308,6 +306,9 @@ name:sparse-n-1
<span class="cm">/**</span> <span class="cm">/**</span>
<span class="cm"> * @brief paddle_get_params gets parameters from parameter servers.</span> <span class="cm"> * @brief paddle_get_params gets parameters from parameter servers.</span>
<span class="cm"> *</span> <span class="cm"> *</span>
<span class="cm"> * paddle_get_params will block until parameters are initialized on</span>
<span class="cm"> * the parameter servers.</span>
<span class="cm"> *</span>
<span class="cm"> * @param names the array of names of the parameters to get.</span> <span class="cm"> * @param names the array of names of the parameters to get.</span>
<span class="cm"> * @param dst the destination array of parameters to save to.</span> <span class="cm"> * @param dst the destination array of parameters to save to.</span>
<span class="cm"> * @param len the length of the names array and the paddle_parameter</span> <span class="cm"> * @param len the length of the names array and the paddle_parameter</span>
......
<div class="section" id="design-doc-the-c-class-parameters">
<span id="design-doc-the-c-class-parameters"></span><h1>Design Doc: The C++ Class <code class="docutils literal"><span class="pre">Parameters</span></code><a class="headerlink" href="#design-doc-the-c-class-parameters" title="Permalink to this headline"></a></h1>
<p><code class="docutils literal"><span class="pre">Parameters</span></code> is a concept we designed in Paddle V2 API. <code class="docutils literal"><span class="pre">Parameters</span></code> is a container of parameters, and make Paddle can shared parameter between topologies. We described usages of <code class="docutils literal"><span class="pre">Parameter</span></code> in <a class="reference internal" href="api.html"><span class="doc">api.md</span></a>.</p>
<p>We previously used Python to implement Parameters when designing the V2 API. The current implementation has several defects:</p>
<ul class="simple">
<li>We just use <code class="docutils literal"><span class="pre">memcpy</span></code> to share Parameters between topologies, but this is very inefficient.</li>
<li>We did not implement sharing Parameters during training. We just trigger <code class="docutils literal"><span class="pre">memcpy</span></code> when training starts.</li>
</ul>
<p>It is necessary to implement Parameters on the C++ side. However, this requires some refactoring of Paddle, because Paddle was previously designed to train only one topology, i.e., each GradientMachine contains its Parameter as a data member. In the current Paddle implementation, there are three concepts associated with <code class="docutils literal"><span class="pre">Parameters</span></code>:</p>
<ol class="simple">
<li><code class="docutils literal"><span class="pre">paddle::Parameter</span></code>. A <code class="docutils literal"><span class="pre">Parameters</span></code> is a container for <code class="docutils literal"><span class="pre">paddle::Parameter</span></code>.
It is evident that we should use <code class="docutils literal"><span class="pre">paddle::Parameter</span></code> when developing <code class="docutils literal"><span class="pre">Parameters</span></code>.
However, the <code class="docutils literal"><span class="pre">Parameter</span></code> class contains many functions and does not have a clear interface.
It contains <code class="docutils literal"><span class="pre">create/store</span> <span class="pre">Parameter</span></code>, <code class="docutils literal"><span class="pre">serialize/deserialize</span></code>, <code class="docutils literal"><span class="pre">optimize(i.e</span> <span class="pre">SGD)</span></code>, <code class="docutils literal"><span class="pre">randomize/zero</span></code>.
When we developing <code class="docutils literal"><span class="pre">Parameters</span></code>, we only use <code class="docutils literal"><span class="pre">create/store</span> <span class="pre">Parameter</span></code> functionality.
We should extract functionalities of Parameter into many classes to clean Paddle CPP implementation.</li>
<li><code class="docutils literal"><span class="pre">paddle::GradientMachine</span></code> and its sub-classes, e.g., <code class="docutils literal"><span class="pre">paddle::MultiGradientMachine</span></code>, <code class="docutils literal"><span class="pre">paddle::NeuralNetwork</span></code>.
We should pass <code class="docutils literal"><span class="pre">Parameters</span></code> to <code class="docutils literal"><span class="pre">paddle::GradientMachine</span></code> when <code class="docutils literal"><span class="pre">forward/backward</span></code> to avoid <code class="docutils literal"><span class="pre">memcpy</span></code> between topologies.
Also, we should handle multi-GPU/CPU training, because <code class="docutils literal"><span class="pre">forward</span></code> and <code class="docutils literal"><span class="pre">backward</span></code> would perform on multi-GPUs and multi-CPUs.
<code class="docutils literal"><span class="pre">Parameters</span></code> should dispatch the parameter value to each device, and gather the parameter gradient from each device.</li>
<li><code class="docutils literal"><span class="pre">paddle::ParameterUpdater</span></code>. The ParameterUpdater is used to update parameters in Paddle.
So <code class="docutils literal"><span class="pre">Parameters</span></code> should be used by <code class="docutils literal"><span class="pre">paddle::ParameterUpdater</span></code>, and <code class="docutils literal"><span class="pre">paddle::ParameterUpdater</span></code> should optimize <code class="docutils literal"><span class="pre">Parameters</span></code> (by SGD).</li>
</ol>
<p>The step-by-step approach for implementing Parameters in the Paddle C++ core is listed below. Each step should be a separate PR that can be merged into Paddle one by one.</p>
<ol class="simple">
<li>Clean <code class="docutils literal"><span class="pre">paddle::Parameter</span></code> interface. Extract the functionalities of <code class="docutils literal"><span class="pre">paddle::Parameter</span></code> to prepare for the implementation of Parameters.</li>
<li>Implementation a <code class="docutils literal"><span class="pre">Parameters</span></code> class. It just stores the <code class="docutils literal"><span class="pre">paddle::Parameter</span></code> inside. Make <code class="docutils literal"><span class="pre">GradientMachine</span></code> uses <code class="docutils literal"><span class="pre">Parameters</span></code> as a class member.</li>
<li>Make <code class="docutils literal"><span class="pre">Parameters</span></code> support Multi-CPU and Multi-GPU training to prepare for sharing <code class="docutils literal"><span class="pre">Parameter</span></code> between topologies.
Because we need share <code class="docutils literal"><span class="pre">Parameters</span></code> between topologies, it is <code class="docutils literal"><span class="pre">Parameters</span></code>&#8216;s response to exchange Parameters between GPUs.
<code class="docutils literal"><span class="pre">GradientMachine</span></code> should not handle how to exchange Parameters because <code class="docutils literal"><span class="pre">GradientMachine</span></code> only used to train one topology and we need to support train many topologies in Paddle, i.e., there could be many GradientMachines use one <code class="docutils literal"><span class="pre">Parameters</span></code>.<ul>
<li>We should use a global function to exchange Parameters between GPUs, not a member function in <code class="docutils literal"><span class="pre">Parameters</span></code>. The <code class="docutils literal"><span class="pre">MultiGradientMachine</span></code> invoke this function, which uses <code class="docutils literal"><span class="pre">Parameters</span></code> as this function inputs.</li>
<li>The MultiGradientMachine contains many functionalities. Extracting the Parameters exchanging logic could make MultiGradientMachine clearer and simpler.</li>
</ul>
</li>
<li>Make <code class="docutils literal"><span class="pre">Parameters</span></code> as an argument for <code class="docutils literal"><span class="pre">forward/backward</span></code> function, not a data member for <code class="docutils literal"><span class="pre">GradientMachine</span></code>. For example, <code class="docutils literal"><span class="pre">forward</span></code> could be <code class="docutils literal"><span class="pre">forward(const</span> <span class="pre">Parameters&amp;</span> <span class="pre">params,</span> <span class="pre">...)</span></code> and <code class="docutils literal"><span class="pre">backward</span></code> could be <code class="docutils literal"><span class="pre">backward(Parameters*</span> <span class="pre">params,</span> <span class="pre">...)</span></code>. After this step, Paddle could share <code class="docutils literal"><span class="pre">Parameters</span></code> between topologies.</li>
<li><code class="docutils literal"><span class="pre">ParameterUpdater</span></code> is invoked by <code class="docutils literal"><span class="pre">GradientMachine</span></code> and <code class="docutils literal"><span class="pre">Trainer</span></code>, but it updates <code class="docutils literal"><span class="pre">Parameters</span></code>. In the end of this code refactoring, we could change <code class="docutils literal"><span class="pre">ParameterUpdater</span></code> directly uses <code class="docutils literal"><span class="pre">Parameters</span></code> to make <code class="docutils literal"><span class="pre">ParameterUpdater</span></code>&#8216;s implementation clear.</li>
</ol>
</div>
Source diff not shown because it is too large. You can view the blob instead.
...@@ -55,7 +55,7 @@ The trainer select process is encapsulated in the C API function: ...@@ -55,7 +55,7 @@ The trainer select process is encapsulated in the C API function:
```c ```c
int paddle_begin_init_params(paddle_pserver_client* client, const char* config_proto); int paddle_begin_init_params(paddle_pserver_client* client, const char* config_proto);
``` ```
The selected trainer's call to `paddle_begin_init_params` will return with 1, and the other trainers' call to `paddle_begin_init_params` will block until initialization is done, and return 0. As illustrated below: The selected trainer's call to `paddle_begin_init_params` will return 1, while the other trainers' calls will return 0. `paddle_get_params` will block until initialization is completed. As illustrated below:
<img src="./src/pserver_init.png"> <img src="./src/pserver_init.png">
...@@ -89,16 +89,13 @@ void paddle_pserver_client_release(paddle_pserver_client* client); ...@@ -89,16 +89,13 @@ void paddle_pserver_client_release(paddle_pserver_client* client);
* *
* paddle_begin_init_params will be called from multiple trainers, * paddle_begin_init_params will be called from multiple trainers,
* only one trainer will be selected to initialize the parameters on * only one trainer will be selected to initialize the parameters on
* parameter servers. Other trainers will be blocked until the * parameter servers. Other trainers need to get the initialized
* initialization is done, and they need to get the initialized
* parameters from parameter servers using @paddle_get_params. * parameters from parameter servers using @paddle_get_params.
* *
* @param pserver_config_proto serialized parameter server configuration in
* Protocol Buffers format.
* @return 1 if the trainer is selected to initialize parameter * @return 1 if the trainer is selected to initialize parameter
* servers, otherwise 0. * servers, otherwise 0.
*/ */
int paddle_begin_init_params(paddle_pserver_client* client, const char* pserver_config_proto); int paddle_begin_init_params(paddle_pserver_client* client);
/** /**
* @brief paddle_init_param initializes the parameter on parameter * @brief paddle_init_param initializes the parameter on parameter
...@@ -106,12 +103,13 @@ int paddle_begin_init_params(paddle_pserver_client* client, const char* pserver_ ...@@ -106,12 +103,13 @@ int paddle_begin_init_params(paddle_pserver_client* client, const char* pserver_
* *
* @param param the parameter to initialize. * @param param the parameter to initialize.
* @param param_config_proto the configuration for the parameter. * @param param_config_proto the configuration for the parameter.
* @param config_len the length of param_config_proto
* @return 0 if successful, otherwise -1. On failure, the trainer * @return 0 if successful, otherwise -1. On failure, the trainer
* needs to restart the entire initialization process (starting from * needs to restart the entire initialization process (starting from
* @paddle_begin_init_param). Or simply exit the program and wait for * @paddle_begin_init_param). Or simply exit the program and wait for
* the cluster management system to restart the trainer. * the cluster management system to restart the trainer.
*/ */
int paddle_init_param(paddle_pserver_client* client, paddle_parameter params, const char* param_config_proto); int paddle_init_param(paddle_pserver_client* client, paddle_parameter param, const unsigned char* param_config_proto, int config_len);
/** /**
* @brief paddle_finish_init_params tells parameter servers client has * @brief paddle_finish_init_params tells parameter servers client has
...@@ -138,6 +136,9 @@ int paddle_send_grads(paddle_pserver_client* client, const paddle_gradient* grad ...@@ -138,6 +136,9 @@ int paddle_send_grads(paddle_pserver_client* client, const paddle_gradient* grad
/** /**
* @brief paddle_get_params gets parameters from parameter servers. * @brief paddle_get_params gets parameters from parameter servers.
* *
* paddle_get_params will block until parameters are initialized on
* the parameter servers.
*
* @param names the array of names of the parameters to get. * @param names the array of names of the parameters to get.
* @param dst the destination array of parameters to save to. * @param dst the destination array of parameters to save to.
* @param len the length of the names array and the paddle_parameter * @param len the length of the names array and the paddle_parameter
......
# Design Doc: The C++ Class `Parameters`
`Parameters` is a concept we designed in the Paddle V2 API. `Parameters` is a container of parameters that lets Paddle share parameters between topologies. We described the usage of `Parameter` in [api.md](./api.md).
We originally implemented Parameters in Python when designing the V2 API. The current implementation has several defects:
* We just use `memcpy` to share Parameters between topologies, which is very inefficient.
* We do not share Parameters during training; we only trigger a `memcpy` when training starts.
It is necessary to implement Parameters on the C++ side. However, this requires refactoring Paddle, because Paddle was previously designed to train only one topology at a time, i.e., each GradientMachine contains its Parameter as a data member. In the current Paddle implementation, there are three concepts associated with `Parameters`:
1. `paddle::Parameter`. A `Parameters` is a container for `paddle::Parameter`.
It is evident that we should use `paddle::Parameter` when developing `Parameters`.
However, the `Parameter` class contains many functions and does not have a clear interface.
It contains `create/store Parameter`, `serialize/deserialize`, `optimize (i.e., SGD)` and `randomize/zero`.
When developing `Parameters`, we only use the `create/store Parameter` functionality.
We should extract the functionalities of Parameter into separate classes to clean up Paddle's C++ implementation.
2. `paddle::GradientMachine` and its sub-classes, e.g., `paddle::MultiGradientMachine` and `paddle::NeuralNetwork`.
We should pass `Parameters` to `paddle::GradientMachine` when calling `forward/backward` to avoid a `memcpy` between topologies.
We must also handle multi-GPU/CPU training, because `forward` and `backward` run on multiple GPUs and CPUs.
`Parameters` should dispatch the parameter values to each device and gather the parameter gradients from each device.
3. `paddle::ParameterUpdater`. The ParameterUpdater is used to update parameters in Paddle.
So `Parameters` should be used by `paddle::ParameterUpdater`, and `paddle::ParameterUpdater` should optimize `Parameters` (e.g., by SGD).
The step-by-step approach for implementing Parameters in the Paddle C++ core is listed below. Each step should be a separate PR that can be merged into Paddle one by one.
1. Clean up the `paddle::Parameter` interface. Extract the functionalities of `paddle::Parameter` to prepare for the implementation of Parameters.
2. Implement a `Parameters` class that simply stores the `paddle::Parameter` objects inside. Make `GradientMachine` use `Parameters` as a class member.
3. Make `Parameters` support multi-CPU and multi-GPU training to prepare for sharing `Parameter` between topologies.
Because we need to share `Parameters` between topologies, it is the responsibility of `Parameters` to exchange parameters between GPUs.
`GradientMachine` should not handle the exchange, because a `GradientMachine` trains only one topology, and we need to support training many topologies in Paddle, i.e., many GradientMachines may use one `Parameters`.
* We should use a global function to exchange Parameters between GPUs, not a member function of `Parameters`. `MultiGradientMachine` invokes this function, which takes `Parameters` as its input.
* MultiGradientMachine already contains many functionalities; extracting the parameter-exchange logic makes it clearer and simpler.
4. Make `Parameters` an argument of the `forward/backward` functions rather than a data member of `GradientMachine`. For example, `forward` could be `forward(const Parameters& params, ...)` and `backward` could be `backward(Parameters* params, ...)`. After this step, Paddle can share `Parameters` between topologies (see the sketch after this list).
5. `ParameterUpdater` is invoked by `GradientMachine` and `Trainer`, but it updates `Parameters`. At the end of this refactoring, we can change `ParameterUpdater` to use `Parameters` directly, making its implementation clear.
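To make the target end state concrete, below is a minimal, self-contained C++ sketch. It is an illustration only, not Paddle's actual classes: the member layout and the `create` helper are invented for this example, and only the `forward(const Parameters& params, ...)` / `backward(Parameters* params, ...)` shapes come from step 4 above.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

struct Parameter {  // stripped down to the create/store role from concept 1
  std::string name;
  std::vector<float> value;
  std::vector<float> gradient;
};

class Parameters {  // container of Parameter, shareable between topologies
 public:
  Parameter* create(const std::string& name, std::size_t size) {
    auto& p = params_[name];
    p = std::make_unique<Parameter>();
    p->name = name;
    p->value.resize(size);
    p->gradient.resize(size);
    return p.get();
  }
  Parameter* get(const std::string& name) { return params_.at(name).get(); }

 private:
  std::map<std::string, std::unique_ptr<Parameter>> params_;
};

class GradientMachine {  // no longer owns parameters as a data member
 public:
  // Step 4: parameters are arguments, not data members.
  void forward(const Parameters& params /*, inputs ... */) {}
  void backward(Parameters* params /*, output gradients ... */) {}
};

int main() {
  Parameters shared;
  shared.create("fc.w", 1024);

  GradientMachine topologyA, topologyB;  // two topologies, one Parameters
  topologyA.forward(shared);             // no memcpy between topologies
  topologyB.forward(shared);
  topologyB.backward(&shared);
  return 0;
}
```

The point of the sketch is that `shared` is created once and handed to both machines, so switching between topologies involves no `memcpy` of parameter values.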
...@@ -196,15 +196,15 @@ ...@@ -196,15 +196,15 @@
<h2>Classification<a class="headerlink" href="#classification" title="永久链接至标题"></a></h2> <h2>Classification<a class="headerlink" href="#classification" title="永久链接至标题"></a></h2>
<div class="section" id="classification-error"> <div class="section" id="classification-error">
<h3>classification_error<a class="headerlink" href="#classification-error" title="永久链接至标题"></a></h3> <h3>classification_error<a class="headerlink" href="#classification-error" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.evaluator.</code><code class="descname">classification_error</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.evaluator.</code><code class="descname">classification_error</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>Classification Error Evaluator. It will print error rate for classification.</p> <dd><p>Classification Error Evaluator. It will print error rate for classification.</p>
<p>The classification error is:</p> <p>The classification error is:</p>
<div class="math"> <div class="math">
\[classification\_error = \frac{NumOfWrongPredicts}{NumOfAllSamples}\]</div> \[classification\_error = \frac{NumOfWrongPredicts}{NumOfAllSamples}\]</div>
<p>The simple usage is:</p> <p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">classification_error_evaluator</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">prob</span><span class="p">,</span><span class="n">label</span><span class="o">=</span><span class="n">lbl</span><span class="p">)</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">classification_evaluator</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">prob</span><span class="p">,</span><span class="n">label</span><span class="o">=</span><span class="n">lbl</span><span class="p">)</span>
</pre></div> </pre></div>
</div> </div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -235,12 +235,12 @@ important this sample is.</li> ...@@ -235,12 +235,12 @@ important this sample is.</li>
</div> </div>
<div class="section" id="auc"> <div class="section" id="auc">
<h3>auc<a class="headerlink" href="#auc" title="永久链接至标题"></a></h3> <h3>auc<a class="headerlink" href="#auc" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.evaluator.</code><code class="descname">auc</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.evaluator.</code><code class="descname">auc</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>Auc Evaluator which adapts to binary classification.</p> <dd><p>Auc Evaluator which adapts to binary classification.</p>
<p>The simple usage:</p> <p>The simple usage:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">auc_evaluator</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">evaluator</span><span class="o">.</span><span class="n">auc</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span>
</pre></div> </pre></div>
</div> </div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -263,12 +263,12 @@ important this sample is.</li> ...@@ -263,12 +263,12 @@ important this sample is.</li>
</div> </div>
<div class="section" id="ctc-error"> <div class="section" id="ctc-error">
<h3>ctc_error<a class="headerlink" href="#ctc-error" title="永久链接至标题"></a></h3> <h3>ctc_error<a class="headerlink" href="#ctc-error" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.evaluator.</code><code class="descname">ctc_error</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.evaluator.</code><code class="descname">ctc_error</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>This evaluator is to calculate sequence-to-sequence edit distance.</p> <dd><p>This evaluator is to calculate sequence-to-sequence edit distance.</p>
<p>The simple usage is :</p> <p>The simple usage is :</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">ctc_error_evaluator</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="nb">input</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="n">lbl</span><span class="p">)</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">ctc_evaluator</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="nb">input</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="n">lbl</span><span class="p">)</span>
</pre></div> </pre></div>
</div> </div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -290,32 +290,68 @@ label for ctc</li> ...@@ -290,32 +290,68 @@ label for ctc</li>
</div> </div>
<div class="section" id="chunk"> <div class="section" id="chunk">
<h3>chunk<a class="headerlink" href="#chunk" title="永久链接至标题"></a></h3> <h3>chunk<a class="headerlink" href="#chunk" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.evaluator.</code><code class="descname">chunk</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.evaluator.</code><code class="descname">chunk</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>Chunk evaluator is used to evaluate segment labelling accuracy for a <dd><p>Chunk evaluator is used to evaluate segment labelling accuracy for a
sequence. It calculates the chunk detection F1 score.</p> sequence. It calculates precision, recall and F1 scores for the chunk detection.</p>
<p>A chunk is correctly detected if its beginning, end and type are correct. <p>To use the chunk evaluator, several concepts need to be clarified first.</p>
Other chunk type is ignored.</p> <ul class="simple">
<p>For each label in the label sequence, we have:</p> <li><strong>Chunk type</strong> is the type of the whole chunk and a chunk consists of one or several words. (For example in NER, ORG for organization name, PER for person name etc.)</li>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">tagType</span> <span class="o">=</span> <span class="n">label</span> <span class="o">%</span> <span class="n">numTagType</span> <li><strong>Tag type</strong> indicates the position of a word in a chunk. (B for begin, I for inside, E for end, S for single)</li>
<span class="n">chunkType</span> <span class="o">=</span> <span class="n">label</span> <span class="o">/</span> <span class="n">numTagType</span> </ul>
<span class="n">otherChunkType</span> <span class="o">=</span> <span class="n">numChunkTypes</span> <p>We can name a label by combining tag type and chunk type. (ie. B-ORG for begining of an organization name)</p>
<p>The construction of label dictionary should obey the following rules:</p>
<ul class="simple">
<li>Use one of the labelling schemes listed below. The schemes differ in how they indicate chunk boundaries.</li>
</ul>
<div class="highlight-text"><div class="highlight"><pre><span></span>Scheme Description
plain Use the same label for the whole chunk.
IOB        Two labels for chunk type X, B-X for chunk beginning and I-X for chunk inside.
IOE        Two labels for chunk type X, E-X for chunk ending and I-X for chunk inside.
IOBES      Four labels for chunk type X, B-X for chunk beginning, I-X for chunk inside, E-X for chunk end and S-X for single-word chunk.
</pre></div>
</div>
<p>To make it clear, let&#8217;s illustrate with an NER example.
Assume there are three named entity types, ORG, PER and LOC, which are called &#8216;chunk types&#8217; here.
If the &#8216;IOB&#8217; scheme is used, the label set is extended to B-ORG, I-ORG, B-PER, I-PER, B-LOC, I-LOC and O,
in which B-ORG marks the beginning of ORG and I-ORG the inside of ORG.
The prefixes, called &#8216;tag types&#8217; here, are added to the chunk types, and there are two tag types: B and I.
Of course, the training data should be labeled accordingly.</p>
<ul class="simple">
<li>Assign label ids so that the equations and mapping rules listed below hold.</li>
</ul>
<p>The following equations extract the tag type and chunk type from a label id.</p>
<div class="highlight-text"><div class="highlight"><pre><span></span>tagType = label % numTagType
chunkType = label / numTagType
otherChunkType = numChunkTypes
</pre></div>
</div>
<p>The following table shows the mapping rule between tagType and tag type in each scheme.</p>
<div class="highlight-text"><div class="highlight"><pre><span></span>Scheme Begin Inside End Single
plain 0 - - -
IOB 0 1 - -
IOE - 0 1 -
IOBES 0 1 2 3
</pre></div> </pre></div>
</div> </div>
<p>The total number of different labels is numTagType*numChunkTypes+1. <p>Continuing the NER example, the label dict should look like this to satisfy the above equations:</p>
We support 4 labelling scheme. <div class="highlight-text"><div class="highlight"><pre><span></span>B-ORG 0
The tag type for each of the scheme is shown as follows:</p> I-ORG 1
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">Scheme</span> <span class="n">Begin</span> <span class="n">Inside</span> <span class="n">End</span> <span class="n">Single</span> B-PER 2
<span class="n">plain</span> <span class="mi">0</span> <span class="o">-</span> <span class="o">-</span> <span class="o">-</span> I-PER 3
<span class="n">IOB</span> <span class="mi">0</span> <span class="mi">1</span> <span class="o">-</span> <span class="o">-</span> B-LOC 4
<span class="n">IOE</span> <span class="o">-</span> <span class="mi">0</span> <span class="mi">1</span> <span class="o">-</span> I-LOC 5
<span class="n">IOBES</span> <span class="mi">0</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> O 6
</pre></div> </pre></div>
</div> </div>
<p>&#8216;plain&#8217; means the whole chunk must contain exactly the same chunk label.</p> <p>In this example, chunkType has three values: 0 for ORG, 1 for PER and 2 for LOC, and because the scheme is
&#8220;IOB&#8221;, tagType has two values: 0 for B and 1 for I.
Take I-LOC to walk through the mapping rules in detail:
its label id is 5, so tagType = 5 % 2 = 1 and chunkType = 5 / 2 = 2, which means I-LOC is part of a LOC chunk
and its tag is I.</p>
<p>The simple usage is:</p> <p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">chunk_evaluator</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">label</span><span class="p">,</span> <span class="n">chunk_scheme</span><span class="p">,</span> <span class="n">num_chunk_types</span><span class="p">)</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">evaluator</span><span class="o">.</span><span class="n">chunk</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">label</span><span class="p">,</span> <span class="n">chunk_scheme</span><span class="p">,</span> <span class="n">num_chunk_types</span><span class="p">)</span>
</pre></div> </pre></div>
</div> </div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -340,9 +376,9 @@ The tag type for each of the scheme is shown as follows:</p> ...@@ -340,9 +376,9 @@ The tag type for each of the scheme is shown as follows:</p>
</div> </div>
<div class="section" id="precision-recall"> <div class="section" id="precision-recall">
<h3>precision_recall<a class="headerlink" href="#precision-recall" title="永久链接至标题"></a></h3> <h3>precision_recall<a class="headerlink" href="#precision-recall" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.evaluator.</code><code class="descname">precision_recall</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.evaluator.</code><code class="descname">precision_recall</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>An Evaluator to calculate precision, recall and F1-score. <dd><p>An Evaluator to calculate precision, recall and F1-score.
It is adapted to tasks with multiple labels.</p> It is adapted to tasks with multiple labels.</p>
<ul class="simple"> <ul class="simple">
...@@ -352,7 +388,7 @@ F1-score of all labels.</li> ...@@ -352,7 +388,7 @@ F1-score of all labels.</li>
F1-score of this label.</li> F1-score of this label.</li>
</ul> </ul>
<p>The simple usage:</p> <p>The simple usage:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">precision_recall_evaluator</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">precision_evaluator</span><span class="o">.</span><span class="n">recall</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span>
</pre></div> </pre></div>
</div> </div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -379,13 +415,13 @@ F1-score of this label.</li> ...@@ -379,13 +415,13 @@ F1-score of this label.</li>
<h2>Rank<a class="headerlink" href="#rank" title="永久链接至标题"></a></h2> <h2>Rank<a class="headerlink" href="#rank" title="永久链接至标题"></a></h2>
<div class="section" id="pnpair"> <div class="section" id="pnpair">
<h3>pnpair<a class="headerlink" href="#pnpair" title="永久链接至标题"></a></h3> <h3>pnpair<a class="headerlink" href="#pnpair" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.evaluator.</code><code class="descname">pnpair</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.evaluator.</code><code class="descname">pnpair</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>Positive-negative pair rate Evaluator which adapts to rank task like <dd><p>Positive-negative pair rate Evaluator which adapts to rank task like
learning to rank. This evaluator must contain at least three layers.</p> learning to rank. This evaluator must contain at least three layers.</p>
<p>The simple usage:</p> <p>The simple usage:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">pnpair_evaluator</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">info</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">evaluator</span><span class="o">.</span><span class="n">pnpair</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">info</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span>
</pre></div> </pre></div>
</div> </div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -412,12 +448,12 @@ learning to rank. This evaluator must contain at least three layers.</p> ...@@ -412,12 +448,12 @@ learning to rank. This evaluator must contain at least three layers.</p>
<h2>Utils<a class="headerlink" href="#utils" title="永久链接至标题"></a></h2> <h2>Utils<a class="headerlink" href="#utils" title="永久链接至标题"></a></h2>
<div class="section" id="sum"> <div class="section" id="sum">
<h3>sum<a class="headerlink" href="#sum" title="永久链接至标题"></a></h3> <h3>sum<a class="headerlink" href="#sum" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.evaluator.</code><code class="descname">sum</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.evaluator.</code><code class="descname">sum</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>An Evaluator to sum the result of input.</p> <dd><p>An Evaluator to sum the result of input.</p>
<p>The simple usage:</p> <p>The simple usage:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">sum_evaluator</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">evaluator</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span>
</pre></div> </pre></div>
</div> </div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -439,12 +475,12 @@ learning to rank. This evaluator must contain at least three layers.</p> ...@@ -439,12 +475,12 @@ learning to rank. This evaluator must contain at least three layers.</p>
</div> </div>
<div class="section" id="column-sum"> <div class="section" id="column-sum">
<h3>column_sum<a class="headerlink" href="#column-sum" title="永久链接至标题"></a></h3> <h3>column_sum<a class="headerlink" href="#column-sum" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.evaluator.</code><code class="descname">column_sum</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.evaluator.</code><code class="descname">column_sum</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>This Evaluator is used to sum the last column of input.</p> <dd><p>This Evaluator is used to sum the last column of input.</p>
<p>The simple usage is:</p> <p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">column_sum_evaluator</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">column_evaluator</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span>
</pre></div> </pre></div>
</div> </div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -467,12 +503,12 @@ learning to rank. This evaluator must contain at least three layers.</p> ...@@ -467,12 +503,12 @@ learning to rank. This evaluator must contain at least three layers.</p>
<h2>Print<a class="headerlink" href="#print" title="永久链接至标题"></a></h2> <h2>Print<a class="headerlink" href="#print" title="永久链接至标题"></a></h2>
<div class="section" id="classification-error-printer"> <div class="section" id="classification-error-printer">
<h3>classification_error_printer<a class="headerlink" href="#classification-error-printer" title="永久链接至标题"></a></h3> <h3>classification_error_printer<a class="headerlink" href="#classification-error-printer" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.evaluator.</code><code class="descname">classification_error_printer</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.evaluator.</code><code class="descname">classification_error_printer</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>This Evaluator is used to print the classification error of each sample.</p> <dd><p>This Evaluator is used to print the classification error of each sample.</p>
<p>The simple usage is:</p> <p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">classification_error_printer_evaluator</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">classification_error_evaluator</span><span class="o">.</span><span class="n">printer</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span>
</pre></div> </pre></div>
</div> </div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -493,13 +529,13 @@ learning to rank. This evaluator must contain at least three layers.</p> ...@@ -493,13 +529,13 @@ learning to rank. This evaluator must contain at least three layers.</p>
</div> </div>
<div class="section" id="gradient-printer"> <div class="section" id="gradient-printer">
<h3>gradient_printer<a class="headerlink" href="#gradient-printer" title="永久链接至标题"></a></h3> <h3>gradient_printer<a class="headerlink" href="#gradient-printer" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.evaluator.</code><code class="descname">gradient_printer</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.evaluator.</code><code class="descname">gradient_printer</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>This Evaluator is used to print the gradient of input layers. It contains <dd><p>This Evaluator is used to print the gradient of input layers. It contains
one or more input layers.</p> one or more input layers.</p>
<p>The simple usage is:</p> <p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">gradient_printer_evaluator</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">gradient_evaluator</span><span class="o">.</span><span class="n">printer</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span>
</pre></div> </pre></div>
</div> </div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -519,14 +555,14 @@ one or more input layers.</p> ...@@ -519,14 +555,14 @@ one or more input layers.</p>
</div> </div>
<div class="section" id="maxid-printer"> <div class="section" id="maxid-printer">
<h3>maxid_printer<a class="headerlink" href="#maxid-printer" title="永久链接至标题"></a></h3> <h3>maxid_printer<a class="headerlink" href="#maxid-printer" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.evaluator.</code><code class="descname">maxid_printer</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.evaluator.</code><code class="descname">maxid_printer</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>This Evaluator is used to print maximum top k values and their indexes <dd><p>This Evaluator is used to print maximum top k values and their indexes
of each row of input layers. It contains one or more input layers. of each row of input layers. It contains one or more input layers.
k is specified by num_results.</p> k is specified by num_results.</p>
<p>The simple usage is:</p> <p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">maxid_printer_evaluator</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">maxid_evaluator</span><span class="o">.</span><span class="n">printer</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span>
</pre></div> </pre></div>
</div> </div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -548,9 +584,9 @@ It is 1 by default.</li> ...@@ -548,9 +584,9 @@ It is 1 by default.</li>
</div> </div>
<div class="section" id="maxframe-printer"> <div class="section" id="maxframe-printer">
<h3>maxframe_printer<a class="headerlink" href="#maxframe-printer" title="永久链接至标题"></a></h3> <h3>maxframe_printer<a class="headerlink" href="#maxframe-printer" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.evaluator.</code><code class="descname">maxframe_printer</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.evaluator.</code><code class="descname">maxframe_printer</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>This Evaluator is used to print the top k frames of each input layers. <dd><p>This Evaluator is used to print the top k frames of each input layers.
The input layers should contain sequences info or sequences type. The input layers should contain sequences info or sequences type.
k is specified by num_results. k is specified by num_results.
...@@ -560,7 +596,7 @@ It contains one or more input layers.</p> ...@@ -560,7 +596,7 @@ It contains one or more input layers.</p>
<p class="last">The width of each frame is 1.</p> <p class="last">The width of each frame is 1.</p>
</div> </div>
<p>The simple usage is:</p> <p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">maxframe_printer_evaluator</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">maxframe_evaluator</span><span class="o">.</span><span class="n">printer</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span>
</pre></div> </pre></div>
</div> </div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -580,9 +616,9 @@ It contains one or more input layers.</p> ...@@ -580,9 +616,9 @@ It contains one or more input layers.</p>
</div> </div>
<div class="section" id="seqtext-printer"> <div class="section" id="seqtext-printer">
<h3>seqtext_printer<a class="headerlink" href="#seqtext-printer" title="永久链接至标题"></a></h3> <h3>seqtext_printer<a class="headerlink" href="#seqtext-printer" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.evaluator.</code><code class="descname">seqtext_printer</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.evaluator.</code><code class="descname">seqtext_printer</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>Sequence text printer will print text according to index matrix and a <dd><p>Sequence text printer will print text according to index matrix and a
dictionary. There can be multiple inputs to this layer:</p> dictionary. There can be multiple inputs to this layer:</p>
<p>1. If there is no id_input, the input must be a matrix containing <p>1. If there is no id_input, the input must be a matrix containing
...@@ -614,7 +650,7 @@ the sequence of indices;</p> ...@@ -614,7 +650,7 @@ the sequence of indices;</p>
<p>Typically SequenceTextPrinter layer takes output of maxid or RecurrentGroup <p>Typically SequenceTextPrinter layer takes output of maxid or RecurrentGroup
with maxid (when generating) as an input.</p> with maxid (when generating) as an input.</p>
<p>The simple usage is:</p> <p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">seqtext_printer_evaluator</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">maxid</span><span class="p">,</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">seqtext_evaluator</span><span class="o">.</span><span class="n">printer</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">maxid</span><span class="p">,</span>
<span class="n">id_input</span><span class="o">=</span><span class="n">sample_id</span><span class="p">,</span> <span class="n">id_input</span><span class="o">=</span><span class="n">sample_id</span><span class="p">,</span>
<span class="n">dict_file</span><span class="o">=</span><span class="n">dict_file</span><span class="p">,</span> <span class="n">dict_file</span><span class="o">=</span><span class="n">dict_file</span><span class="p">,</span>
<span class="n">result_file</span><span class="o">=</span><span class="n">result_file</span><span class="p">)</span> <span class="n">result_file</span><span class="o">=</span><span class="n">result_file</span><span class="p">)</span>
...@@ -654,13 +690,13 @@ Default is True. No space is added if set to False.</li> ...@@ -654,13 +690,13 @@ Default is True. No space is added if set to False.</li>
</div> </div>
<div class="section" id="value-printer"> <div class="section" id="value-printer">
<h3>value_printer<a class="headerlink" href="#value-printer" title="永久链接至标题"></a></h3> <h3>value_printer<a class="headerlink" href="#value-printer" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.evaluator.</code><code class="descname">value_printer</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.evaluator.</code><code class="descname">value_printer</code><span class="sig-paren">(</span><em>*args</em>, <em>**xargs</em><span class="sig-paren">)</span></dt>
<dd><p>This Evaluator is used to print the values of input layers. It contains <dd><p>This Evaluator is used to print the values of input layers. It contains
one or more input layers.</p> one or more input layers.</p>
<p>The simple usage is:</p> <p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">value_printer_evaluator</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">eval</span> <span class="o">=</span> <span class="n">value_evaluator</span><span class="o">.</span><span class="n">printer</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span>
</pre></div> </pre></div>
</div> </div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
......
...@@ -196,35 +196,10 @@ ...@@ -196,35 +196,10 @@
<h2>Data layer<a class="headerlink" href="#data-layer" title="永久链接至标题"></a></h2> <h2>Data layer<a class="headerlink" href="#data-layer" title="永久链接至标题"></a></h2>
<div class="section" id="data"> <div class="section" id="data">
<span id="api-v2-layer-data"></span><h3>data<a class="headerlink" href="#data" title="永久链接至标题"></a></h3> <span id="api-v2-layer-data"></span><h3>data<a class="headerlink" href="#data" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="attribute">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">data</code><span class="sig-paren">(</span><em>name</em>, <em>type</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.layer.</code><code class="descname">data</code></dt>
<dd><p>Define DataLayer For NeuralNetwork.</p> <dd><p>alias of <code class="xref py py-class docutils literal"><span class="pre">name</span></code></p>
<p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">data</span> <span class="o">=</span> <span class="n">paddle</span><span class="o">.</span><span class="n">layer</span><span class="o">.</span><span class="n">data</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s2">&quot;input&quot;</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">data_type</span><span class="o">.</span><span class="n">dense_vector</span><span class="p">(</span><span class="mi">1000</span><span class="p">))</span>
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Name of this data layer.</li>
<li><strong>type</strong> &#8211; Data type of this data layer</li>
<li><strong>height</strong> (<em>int|None</em>) &#8211; Height of this data layer, used for image</li>
<li><strong>width</strong> (<em>int|None</em>) &#8211; Width of this data layer, used for image</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">paddle.v2.config_base.Layer object.</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p>
</td>
</tr>
</tbody>
</table>
</dd></dl> </dd></dl>
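<p>A minimal usage sketch, kept from the example in the earlier revision of this page (the layer is declared by name and data type; <code>dense_vector(1000)</code> declares a 1000-dimensional dense input):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>img = paddle.v2.layer.data(name="input",
                           type=paddle.v2.data_type.dense_vector(1000))
</pre></div>
</div>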
</div> </div>
...@@ -235,12 +210,12 @@ ...@@ -235,12 +210,12 @@
<span id="api-v2-layer-fc"></span><h3>fc<a class="headerlink" href="#fc" title="永久链接至标题"></a></h3> <span id="api-v2-layer-fc"></span><h3>fc<a class="headerlink" href="#fc" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">fc</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">fc</code></dt>
<dd><p>Helper for declaring a fully connected layer.</p> <dd><p>Helper for declaring a fully connected layer.</p>
<p>The example usage is:</p> <p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">fc</span> <span class="o">=</span> <span class="n">fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer</span><span class="p">,</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">fc</span> <span class="o">=</span> <span class="n">fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer</span><span class="p">,</span>
<span class="n">size</span><span class="o">=</span><span class="mi">1024</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">1024</span><span class="p">,</span>
<span class="n">act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">Activation</span><span class="o">.</span><span class="n">Linear</span><span class="p">(),</span> <span class="n">act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">activation</span><span class="o">.</span><span class="n">Linear</span><span class="p">(),</span>
<span class="n">bias_attr</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span> <span class="n">bias_attr</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
</pre></div> </pre></div>
</div> </div>
...@@ -257,7 +232,7 @@ ...@@ -257,7 +232,7 @@
<li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; The Layer Name.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer|list|tuple</em>) &#8211; The input layer. Could be a list/tuple of input layer.</li> <li><strong>input</strong> (<em>paddle.v2.config_base.Layer|list|tuple</em>) &#8211; The input layer. Could be a list/tuple of input layer.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; The layer dimension.</li> <li><strong>size</strong> (<em>int</em>) &#8211; The layer dimension.</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; Activation Type. Default is tanh.</li> <li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type. Default is tanh.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute|list.</li> <li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute|list.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Any</em>) &#8211; The Bias Attribute. If no bias, then pass False or <li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Any</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something that is not of type paddle.v2.attr.ParameterAttribute. None will get a something that is not of type paddle.v2.attr.ParameterAttribute. None will get a
...@@ -281,13 +256,13 @@ default Bias.</li> ...@@ -281,13 +256,13 @@ default Bias.</li>
<h3>selective_fc<a class="headerlink" href="#selective-fc" title="永久链接至标题"></a></h3> <h3>selective_fc<a class="headerlink" href="#selective-fc" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">selective_fc</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">selective_fc</code></dt>
<dd><p>Selective fully connected layer. Different from fc, the output <dd><p>Selective fully connected layer. Different from fc, the output
of this layer may be sparse. It requires an additional input to indicate of this layer may be sparse. It requires an additional input to indicate
several selected columns for output. If the selected columns are not several selected columns for output. If the selected columns are not
specified, selective_fc acts exactly like fc.</p> specified, selective_fc acts exactly like fc.</p>
<p>The simple usage is:</p> <p>The simple usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">sel_fc</span> <span class="o">=</span> <span class="n">selective_fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="nb">input</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">128</span><span class="p">,</span> <span class="n">act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">Activation</span><span class="o">.</span><span class="n">Tanh</span><span class="p">())</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">sel_fc</span> <span class="o">=</span> <span class="n">selective_fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="nb">input</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">128</span><span class="p">,</span> <span class="n">act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">activation</span><span class="o">.</span><span class="n">Tanh</span><span class="p">())</span>
</pre></div> </pre></div>
</div> </div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -301,7 +276,7 @@ specified, selective_fc acts exactly like fc.</p> ...@@ -301,7 +276,7 @@ specified, selective_fc acts exactly like fc.</p>
sparse binary matrix, and is treated as the mask of selective fc. sparse binary matrix, and is treated as the mask of selective fc.
If it is None, selective_fc acts exactly like fc.</li> If it is None, selective_fc acts exactly like fc.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; The layer dimension.</li> <li><strong>size</strong> (<em>int</em>) &#8211; The layer dimension.</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; Activation Type. Default is tanh.</li> <li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type. Default is tanh.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute.</li> <li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Any</em>) &#8211; The Bias Attribute. If no bias, then pass False or <li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Any</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something that is not of type paddle.v2.attr.ParameterAttribute. None will get a something that is not of type paddle.v2.attr.ParameterAttribute. None will get a
...@@ -328,7 +303,7 @@ default Bias.</li> ...@@ -328,7 +303,7 @@ default Bias.</li>
<h3>conv_operator<a class="headerlink" href="#conv-operator" title="永久链接至标题"></a></h3> <h3>conv_operator<a class="headerlink" href="#conv-operator" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">conv_operator</code><span class="sig-paren">(</span><em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">conv_operator</code></dt>
<dd><p>Different from img_conv, conv_op is an Operator, which can be used <dd><p>Different from img_conv, conv_op is an Operator, which can be used
in mixed. conv_op takes two inputs to perform convolution. in mixed. conv_op takes two inputs to perform convolution.
The first input is the image and the second is the filter kernel. It only The first input is the image and the second is the filter kernel. It only
...@@ -376,7 +351,7 @@ the filter&#8217;s shape can be (filter_size, filter_size_y).</li> ...@@ -376,7 +351,7 @@ the filter&#8217;s shape can be (filter_size, filter_size_y).</li>
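<p>A hedged sketch of wiring conv_operator into mixed, per the description above: the image and the filter kernel are passed as two separate inputs. The argument names <code>img</code>, <code>filter</code>, <code>num_filters</code> and <code>num_channels</code> are assumptions, not confirmed by this page:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># conv_operator is an Operator, so it must be combined inside a mixed layer.
# image and kernel are placeholder layers assumed to be defined elsewhere.
op = paddle.v2.layer.conv_operator(img=image,          # first input: the image
                                   filter=kernel,      # second input: the filter kernel
                                   filter_size=3,
                                   num_filters=64,     # assumed parameter name
                                   num_channels=64)    # assumed parameter name
conv = paddle.v2.layer.mixed(input=op)
</pre></div>
</div>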
<h3>conv_projection<a class="headerlink" href="#conv-projection" title="永久链接至标题"></a></h3> <h3>conv_projection<a class="headerlink" href="#conv-projection" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">conv_projection</code><span class="sig-paren">(</span><em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">conv_projection</code></dt>
<dd><p>Different from img_conv and conv_op, conv_projection is a Projection, <dd><p>Different from img_conv and conv_op, conv_projection is a Projection,
which can be used in mixed and concat. It uses cudnn to implement which can be used in mixed and concat. It uses cudnn to implement
conv and only supports GPU mode.</p> conv and only supports GPU mode.</p>
...@@ -424,7 +399,7 @@ the filter&#8217;s shape can be (filter_size, filter_size_y).</li> ...@@ -424,7 +399,7 @@ the filter&#8217;s shape can be (filter_size, filter_size_y).</li>
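<p>A hedged sketch of conv_projection inside mixed (cudnn/GPU only, per the description above; all parameter names except <code>input</code> and <code>filter_size</code> are assumptions):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># image is a placeholder layer assumed to be defined elsewhere.
proj = paddle.v2.layer.conv_projection(input=image,
                                       filter_size=3,
                                       num_filters=64,    # assumed parameter name
                                       num_channels=64)   # assumed parameter name
conv = paddle.v2.layer.mixed(input=proj)
</pre></div>
</div>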
<h3>conv_shift<a class="headerlink" href="#conv-shift" title="永久链接至标题"></a></h3> <h3>conv_shift<a class="headerlink" href="#conv-shift" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">conv_shift</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">conv_shift</code></dt>
<dd><dl class="docutils"> <dd><dl class="docutils">
<dt>This layer performs cyclic convolution on two inputs. For example:</dt> <dt>This layer performs cyclic convolution on two inputs. For example:</dt>
<dd><ul class="first last simple"> <dd><ul class="first last simple">
...@@ -477,7 +452,7 @@ the right size (which is the end of array) to the left.</li> ...@@ -477,7 +452,7 @@ the right size (which is the end of array) to the left.</li>
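<p>A hedged sketch, assuming the two inputs are passed as <code>a</code> and <code>b</code> as in the v1 conv_shift_layer (treat the argument names as assumptions here):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># layer1 is the sequence to convolve; layer2 holds the (typically odd-sized) kernel.
shifted = paddle.v2.layer.conv_shift(a=layer1, b=layer2)
</pre></div>
</div>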
<h3>img_conv<a class="headerlink" href="#img-conv" title="永久链接至标题"></a></h3> <h3>img_conv<a class="headerlink" href="#img-conv" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">img_conv</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">img_conv</code></dt>
<dd><p>Convolution layer for images. Paddle currently supports both square and non-square <dd><p>Convolution layer for images. Paddle currently supports both square and non-square
input.</p> input.</p>
<p>For details of the convolution layer, please refer to UFLDL&#8217;s <a class="reference external" href="http://ufldl.stanford.edu/tutorial/supervised/FeatureExtractionUsingConvolution/">convolution</a>.</p> <p>For details of the convolution layer, please refer to UFLDL&#8217;s <a class="reference external" href="http://ufldl.stanford.edu/tutorial/supervised/FeatureExtractionUsingConvolution/">convolution</a>.</p>
...@@ -501,7 +476,7 @@ rest channels will be processed by rest group of filters.</p> ...@@ -501,7 +476,7 @@ rest channels will be processed by rest group of filters.</p>
<span class="n">num_channels</span><span class="o">=</span><span class="mi">8</span><span class="p">,</span> <span class="n">num_channels</span><span class="o">=</span><span class="mi">8</span><span class="p">,</span>
<span class="n">num_filters</span><span class="o">=</span><span class="mi">16</span><span class="p">,</span> <span class="n">stride</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">num_filters</span><span class="o">=</span><span class="mi">16</span><span class="p">,</span> <span class="n">stride</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
<span class="n">bias_attr</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">bias_attr</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
<span class="n">act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">Activation</span><span class="o">.</span><span class="n">Relu</span><span class="p">())</span> <span class="n">act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">activation</span><span class="o">.</span><span class="n">Relu</span><span class="p">())</span>
</pre></div> </pre></div>
</div> </div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -517,7 +492,7 @@ two image dimension.</li> ...@@ -517,7 +492,7 @@ two image dimension.</li>
currently supports rectangular filters, the filter&#8217;s currently supports rectangular filters, the filter&#8217;s
shape will be (filter_size, filter_size_y).</li> shape will be (filter_size, filter_size_y).</li>
<li><strong>num_filters</strong> &#8211; The number of filters in each filter group</li> <li><strong>num_filters</strong> &#8211; The number of filters in each filter group</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; Activation type. Default is tanh</li> <li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation type. Default is tanh</li>
<li><strong>groups</strong> (<em>int</em>) &#8211; Group size of filters.</li> <li><strong>groups</strong> (<em>int</em>) &#8211; Group size of filters.</li>
<li><strong>stride</strong> (<em>int|tuple|list</em>) &#8211; The x dimension of the stride. Or input a tuple for two image <li><strong>stride</strong> (<em>int|tuple|list</em>) &#8211; The x dimension of the stride. Or input a tuple for two image
dimension.</li> dimension.</li>
...@@ -555,7 +530,7 @@ otherwise layer_type has to be either &#8220;exconv&#8221; or ...@@ -555,7 +530,7 @@ otherwise layer_type has to be either &#8220;exconv&#8221; or
<span id="api-v2-layer-context-projection"></span><h3>context_projection<a class="headerlink" href="#context-projection" title="永久链接至标题"></a></h3> <span id="api-v2-layer-context-projection"></span><h3>context_projection<a class="headerlink" href="#context-projection" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">context_projection</code><span class="sig-paren">(</span><em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">context_projection</code></dt>
<dd><p>Context Projection.</p> <dd><p>Context Projection.</p>
<p>It simply reorganizes the input sequence, combining &#8220;context_len&#8221; sequence <p>It simply reorganizes the input sequence, combining &#8220;context_len&#8221; sequence
elements into one context starting from context_start. &#8220;context_start&#8221; will be set to elements into one context starting from context_start. &#8220;context_start&#8221; will be set to
...@@ -598,7 +573,7 @@ parameter attribute is set by this parameter.</li> ...@@ -598,7 +573,7 @@ parameter attribute is set by this parameter.</li>
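<p>A hedged sketch: combine each position with its neighbours into a context window. The parameter names follow the text above; the concrete values and the surrounding mixed layer are illustrative only:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># embedding is a placeholder sequence layer assumed to be defined elsewhere.
context = paddle.v2.layer.context_projection(input=embedding,
                                             context_len=3,
                                             context_start=-1)
proj = paddle.v2.layer.mixed(size=128 * 3, input=context)  # assuming a 128-dim input
</pre></div>
</div>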
<h3>img_pool<a class="headerlink" href="#img-pool" title="永久链接至标题"></a></h3> <h3>img_pool<a class="headerlink" href="#img-pool" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">img_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">img_pool</code></dt>
<dd><p>Image pooling Layer.</p> <dd><p>Image pooling Layer.</p>
<p>For details of the pooling layer, please refer to UFLDL&#8217;s <a class="reference external" href="http://ufldl.stanford.edu/tutorial/supervised/Pooling/">pooling</a>.</p> <p>For details of the pooling layer, please refer to UFLDL&#8217;s <a class="reference external" href="http://ufldl.stanford.edu/tutorial/supervised/Pooling/">pooling</a>.</p>
<ul class="simple"> <ul class="simple">
...@@ -662,7 +637,7 @@ Default is True. If set to False, floor mode is used instead.</li> ...@@ -662,7 +637,7 @@ Default is True. If set to False, floor mode is used instead.</li>
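<p>A hedged sketch of max pooling over a conv output (<code>paddle.v2.pooling.Max</code> is assumed to be the v2 pooling type; <code>conv</code> is a placeholder layer):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>pool = paddle.v2.layer.img_pool(input=conv,
                                pool_size=3,
                                stride=2,
                                num_channels=8,
                                pool_type=paddle.v2.pooling.Max())
</pre></div>
</div>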
<h3>spp<a class="headerlink" href="#spp" title="永久链接至标题"></a></h3> <h3>spp<a class="headerlink" href="#spp" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">spp</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">spp</code></dt>
<dd><p>Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. <dd><p>Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition.
For details, please refer to For details, please refer to
<a class="reference external" href="https://arxiv.org/abs/1406.4729">Kaiming He&#8217;s paper</a>.</p> <a class="reference external" href="https://arxiv.org/abs/1406.4729">Kaiming He&#8217;s paper</a>.</p>
...@@ -702,7 +677,7 @@ The details please refer to ...@@ -702,7 +677,7 @@ The details please refer to
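<p>A hedged sketch (the <code>pyramid_height</code> argument name comes from the v1 spp_layer and is an assumption for v2; <code>conv</code> is a placeholder layer):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>spp = paddle.v2.layer.spp(input=conv,
                          pyramid_height=2,    # assumed parameter name
                          num_channels=16,
                          pool_type=paddle.v2.pooling.Max())
</pre></div>
</div>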
<h3>maxout<a class="headerlink" href="#maxout" title="永久链接至标题"></a></h3> <h3>maxout<a class="headerlink" href="#maxout" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">maxout</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">maxout</code></dt>
<dd><dl class="docutils"> <dd><dl class="docutils">
<dt>A layer that performs maxout on a conv layer&#8217;s output.</dt> <dt>A layer that performs maxout on a conv layer&#8217;s output.</dt>
<dd><ul class="first last simple"> <dd><ul class="first last simple">
...@@ -759,7 +734,7 @@ automatically from previous output.</li> ...@@ -759,7 +734,7 @@ automatically from previous output.</li>
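<p>A hedged sketch: with <code>groups=4</code>, every four consecutive feature maps are reduced to one by an element-wise max (argument names mirror the v1 maxout_layer and are assumptions here):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># conv is a placeholder layer with 128 output channels; maxout yields 128 / 4 = 32 maps.
maxout = paddle.v2.layer.maxout(input=conv,
                                num_channels=128,
                                groups=4)
</pre></div>
</div>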
<h3>img_cmrnorm<a class="headerlink" href="#img-cmrnorm" title="永久链接至标题"></a></h3> <h3>img_cmrnorm<a class="headerlink" href="#img-cmrnorm" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">img_cmrnorm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">img_cmrnorm</code></dt>
<dd><p>Response normalization across feature maps. <dd><p>Response normalization across feature maps.
For details, please refer to For details, please refer to
<a class="reference external" href="http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf">Alex&#8217;s paper</a>.</p> <a class="reference external" href="http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf">Alex&#8217;s paper</a>.</p>
...@@ -798,7 +773,7 @@ num_channels is None, it will be set automatically.</li> ...@@ -798,7 +773,7 @@ num_channels is None, it will be set automatically.</li>
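<p>A hedged sketch (per the note above, <code>num_channels</code> may be omitted and inferred; the <code>size</code> argument, the number of neighbouring maps normalized over, follows the v1 img_cmrnorm_layer and is an assumption here):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>norm = paddle.v2.layer.img_cmrnorm(input=conv, size=5)  # conv: placeholder layer
</pre></div>
</div>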
<h3>batch_norm<a class="headerlink" href="#batch-norm" title="永久链接至标题"></a></h3> <h3>batch_norm<a class="headerlink" href="#batch-norm" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">batch_norm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">batch_norm</code></dt>
<dd><p>Batch Normalization Layer. The notation of this layer is as follows.</p> <dd><p>Batch Normalization Layer. The notation of this layer is as follows.</p>
<p><span class="math">\(x\)</span> is the input features over a mini-batch.</p> <p><span class="math">\(x\)</span> is the input features over a mini-batch.</p>
<div class="math"> <div class="math">
...@@ -812,7 +787,7 @@ y_i &amp;\gets \gamma \hat{x_i} + \beta \qquad &amp;//\ scale\ and\ shift\end{sp ...@@ -812,7 +787,7 @@ y_i &amp;\gets \gamma \hat{x_i} + \beta \qquad &amp;//\ scale\ and\ shift\end{sp
<p>For details of batch normalization, please refer to this <p>For details of batch normalization, please refer to this
<a class="reference external" href="http://arxiv.org/abs/1502.03167">paper</a>.</p> <a class="reference external" href="http://arxiv.org/abs/1502.03167">paper</a>.</p>
<p>The example usage is:</p> <p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">norm</span> <span class="o">=</span> <span class="n">batch_norm</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">net</span><span class="p">,</span> <span class="n">act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">Activation</span><span class="o">.</span><span class="n">Relu</span><span class="p">())</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">norm</span> <span class="o">=</span> <span class="n">batch_norm</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">net</span><span class="p">,</span> <span class="n">act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">activation</span><span class="o">.</span><span class="n">Relu</span><span class="p">())</span>
</pre></div> </pre></div>
</div> </div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -832,7 +807,7 @@ automatically select cudnn_batch_norm for GPU and ...@@ -832,7 +807,7 @@ automatically select cudnn_batch_norm for GPU and
batch_norm for CPU. Otherwise, select batch norm batch_norm for CPU. Otherwise, select batch norm
type based on the specified type. If you use cudnn_batch_norm, type based on the specified type. If you use cudnn_batch_norm,
we suggest you use the latest version, such as v5.1.</li> we suggest you use the latest version, such as v5.1.</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; Activation Type. Relu is preferred, because batch <li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type. Relu is preferred, because batch
normalization will normalize the input near zero.</li> normalization will normalize the input near zero.</li>
<li><strong>num_channels</strong> (<em>int</em>) &#8211; num of image channels or previous layer&#8217;s number of <li><strong>num_channels</strong> (<em>int</em>) &#8211; num of image channels or previous layer&#8217;s number of
filters. None will automatically get from layer&#8217;s filters. None will automatically get from layer&#8217;s
...@@ -870,7 +845,7 @@ computation, referred to as factor, ...@@ -870,7 +845,7 @@ computation, referred to as factor,
<h3>sum_to_one_norm<a class="headerlink" href="#sum-to-one-norm" title="永久链接至标题"></a></h3> <h3>sum_to_one_norm<a class="headerlink" href="#sum-to-one-norm" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">sum_to_one_norm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">sum_to_one_norm</code></dt>
<dd><p>A layer for sum-to-one normalization, <dd><p>A layer for sum-to-one normalization,
which is used in the Neural Turing Machine.</p> which is used in the Neural Turing Machine.</p>
<div class="math"> <div class="math">
...@@ -907,7 +882,7 @@ and <span class="math">\(out\)</span> is a (batchSize x dataDim) output vector.< ...@@ -907,7 +882,7 @@ and <span class="math">\(out\)</span> is a (batchSize x dataDim) output vector.<
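<p>A minimal sketch following the formula above; each row of the (batchSize x dataDim) input is divided by its own sum:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>norm = paddle.v2.layer.sum_to_one_norm(input=layer)  # layer: placeholder input
</pre></div>
</div>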
<h3>cross_channel_norm<a class="headerlink" href="#cross-channel-norm" title="永久链接至标题"></a></h3> <h3>cross_channel_norm<a class="headerlink" href="#cross-channel-norm" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">cross_channel_norm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">cross_channel_norm</code></dt>
<dd><p>Normalize a layer&#8217;s output. This layer is necessary for SSD. <dd><p>Normalize a layer&#8217;s output. This layer is necessary for SSD.
This layer applies normalization across the channels of each sample of This layer applies normalization across the channels of each sample of
a conv layer&#8217;s output and scales the output by a group of trainable a conv layer&#8217;s output and scales the output by a group of trainable
...@@ -938,7 +913,7 @@ factors whose dimension equals the number of channels.</p> ...@@ -938,7 +913,7 @@ factors whose dimension equals the number of channels.</p>
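<p>A hedged sketch (the trainable per-channel scale factors would be configured through a parameter attribute; the exact argument name is not shown on this page):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>norm = paddle.v2.layer.cross_channel_norm(input=conv)  # conv: placeholder layer
</pre></div>
</div>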
<h3>recurrent<a class="headerlink" href="#recurrent" title="永久链接至标题"></a></h3> <h3>recurrent<a class="headerlink" href="#recurrent" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">recurrent</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">recurrent</code></dt>
<dd><p>Simple recurrent unit layer. It is just a fully connected layer applied through both <dd><p>Simple recurrent unit layer. It is just a fully connected layer applied through both
time and the neural network.</p> time and the neural network.</p>
<p>For each sequence [start, end] it performs the following computation:</p> <p>For each sequence [start, end] it performs the following computation:</p>
...@@ -955,7 +930,7 @@ out_{i} = act(in_{i} + out_{i+1} * W) \ \ \text{for} \ start &lt;= i &lt; end\en ...@@ -955,7 +930,7 @@ out_{i} = act(in_{i} + out_{i+1} * W) \ \ \text{for} \ start &lt;= i &lt; end\en
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input Layer</li> <li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input Layer</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; activation.</li> <li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; activation.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; bias attribute.</li> <li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; bias attribute.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; parameter attribute.</li> <li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; parameter attribute.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the layer</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the layer</li>
...@@ -978,7 +953,7 @@ out_{i} = act(in_{i} + out_{i+1} * W) \ \ \text{for} \ start &lt;= i &lt; end\en ...@@ -978,7 +953,7 @@ out_{i} = act(in_{i} + out_{i+1} * W) \ \ \text{for} \ start &lt;= i &lt; end\en
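<p>A minimal sketch using the parameters listed above (the input must be a sequence; <code>seq_input</code> is a placeholder):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>rnn = paddle.v2.layer.recurrent(input=seq_input,
                                act=paddle.v2.activation.Tanh())
</pre></div>
</div>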
<h3>lstmemory<a class="headerlink" href="#lstmemory" title="永久链接至标题"></a></h3> <h3>lstmemory<a class="headerlink" href="#lstmemory" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">lstmemory</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">lstmemory</code></dt>
<dd><p>Long Short-term Memory Cell.</p> <dd><p>Long Short-term Memory Cell.</p>
<p>The memory cell is implemented by the following equations.</p> <p>The memory cell is implemented by the following equations.</p>
<div class="math"> <div class="math">
...@@ -1002,9 +977,9 @@ more details about LSTM.</p> ...@@ -1002,9 +977,9 @@ more details about LSTM.</p>
<li><strong>name</strong> (<em>basestring</em>) &#8211; The lstmemory layer name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; The lstmemory layer name.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; is sequence process reversed or not.</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; is sequence process reversed or not.</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; activation type, paddle.v2.Activation.Tanh by default. <span class="math">\(h_t\)</span></li> <li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; activation type, paddle.v2.activation.Tanh by default. <span class="math">\(h_t\)</span></li>
<li><strong>gate_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; gate activation type, paddle.v2.Activation.Sigmoid by default.</li> <li><strong>gate_act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; gate activation type, paddle.v2.activation.Sigmoid by default.</li>
<li><strong>state_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; state activation type, paddle.v2.Activation.Tanh by default.</li> <li><strong>state_act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; state activation type, paddle.v2.activation.Tanh by default.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Bias attribute. None means default bias. False means no <li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Bias attribute. None means default bias. False means no
bias.</li> bias.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Parameter Attribute.</li> <li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Parameter Attribute.</li>
...@@ -1027,7 +1002,7 @@ bias.</li> ...@@ -1027,7 +1002,7 @@ bias.</li>
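<p>A hedged sketch. In PaddlePaddle the input-to-hidden multiplications are not done inside lstmemory, so the usual pattern (an assumption based on the v1 lstmemory examples) is to feed it a preceding linear projection whose size is four times the LSTM size:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>hidden_dim = 128
# embedding is a placeholder sequence layer assumed to be defined elsewhere.
proj = paddle.v2.layer.fc(input=embedding,
                          size=hidden_dim * 4,   # 4x size: assumption, see lead-in
                          act=paddle.v2.activation.Linear(),
                          bias_attr=False)
lstm = paddle.v2.layer.lstmemory(input=proj,
                                 act=paddle.v2.activation.Tanh(),
                                 gate_act=paddle.v2.activation.Sigmoid())
</pre></div>
</div>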
<h3>grumemory<a class="headerlink" href="#grumemory" title="永久链接至标题"></a></h3> <h3>grumemory<a class="headerlink" href="#grumemory" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">grumemory</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">grumemory</code></dt>
<dd><p>Gated Recurrent Unit Layer.</p> <dd><p>Gated Recurrent Unit Layer.</p>
<p>The memory cell is implemented by the following equations.</p> <p>The memory cell is implemented by the following equations.</p>
<p>1. update gate <span class="math">\(z\)</span>: defines how much of the previous memory to <p>1. update gate <span class="math">\(z\)</span>: defines how much of the previous memory to
...@@ -1067,9 +1042,9 @@ Recurrent Neural Networks on Sequence Modeling.</a></p> ...@@ -1067,9 +1042,9 @@ Recurrent Neural Networks on Sequence Modeling.</a></p>
<li><strong>name</strong> (<em>None|basestring</em>) &#8211; The gru layer name.</li> <li><strong>name</strong> (<em>None|basestring</em>) &#8211; The gru layer name.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer.</em>) &#8211; input layer.</li> <li><strong>input</strong> (<em>paddle.v2.config_base.Layer.</em>) &#8211; input layer.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; Whether sequence process is reversed or not.</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; Whether sequence process is reversed or not.</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; activation type, paddle.v2.Activation.Tanh by default. This activation <li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; activation type, paddle.v2.activation.Tanh by default. This activation
affects the <span class="math">\({\tilde{h_t}}\)</span>.</li> affects the <span class="math">\({\tilde{h_t}}\)</span>.</li>
<li><strong>gate_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; gate activation type, paddle.v2.Activation.Sigmoid by default. <li><strong>gate_act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; gate activation type, paddle.v2.activation.Sigmoid by default.
This activation affects the <span class="math">\(z_t\)</span> and <span class="math">\(r_t\)</span>. It is the This activation affects the <span class="math">\(z_t\)</span> and <span class="math">\(r_t\)</span>. It is the
<span class="math">\(\sigma\)</span> in the above formula.</li> <span class="math">\(\sigma\)</span> in the above formula.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Bias attribute. None means default bias. False means no <li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|False</em>) &#8211; Bias attribute. None means default bias. False means no
...@@ -1099,7 +1074,7 @@ will get a warning.</li> ...@@ -1099,7 +1074,7 @@ will get a warning.</li>
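<p>A hedged sketch, analogous to lstmemory: the input-to-hidden projection is computed outside the layer, and (per the v1 grumemory note, an assumption here) its size must be three times the GRU size:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>hidden_dim = 128
# embedding is a placeholder sequence layer assumed to be defined elsewhere.
proj = paddle.v2.layer.fc(input=embedding,
                          size=hidden_dim * 3,   # 3x size: assumption, see lead-in
                          act=paddle.v2.activation.Linear(),
                          bias_attr=False)
gru = paddle.v2.layer.grumemory(input=proj)
</pre></div>
</div>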
<h3>memory<a class="headerlink" href="#memory" title="永久链接至标题"></a></h3> <h3>memory<a class="headerlink" href="#memory" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">memory</code><span class="sig-paren">(</span><em>name</em>, <em>extra_input=None</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">memory</code></dt>
<dd><p>The memory layer is a layer that crosses time steps. Reference this output <dd><p>The memory layer is a layer that crosses time steps. Reference this output
as the previous time step&#8217;s output of the layer <code class="code docutils literal"><span class="pre">name</span></code>.</p> as the previous time step&#8217;s output of the layer <code class="code docutils literal"><span class="pre">name</span></code>.</p>
<p>The default memory is zero in the first time step, and the previous time step&#8217;s <p>The default memory is zero in the first time step, and the previous time step&#8217;s
...@@ -1108,12 +1083,12 @@ output in the remaining time steps.</p> ...@@ -1108,12 +1083,12 @@ output in the remaining time steps.</p>
with activation.</p> with activation.</p>
<p>If boot_with_const_id is set, then the first time step is an IndexSlot, and <p>If boot_with_const_id is set, then the first time step is an IndexSlot, and
Arguments.ids()[0] is this <code class="code docutils literal"><span class="pre">cost_id</span></code>.</p> Arguments.ids()[0] is this <code class="code docutils literal"><span class="pre">cost_id</span></code>.</p>
<p>If boot_layer is not null, the memory is just the boot_layer&#8217;s output. <p>If boot is not null, the memory is just the boot&#8217;s output.
Set <code class="code docutils literal"><span class="pre">is_seq</span></code> is true boot layer is sequence.</p> Set <code class="code docutils literal"><span class="pre">is_seq</span></code> is true boot layer is sequence.</p>
<p>The layer with the same name in a recurrent group will set the memory at each time <p>The layer with the same name in a recurrent group will set the memory at each time
step.</p> step.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">mem</span> <span class="o">=</span> <span class="n">memory</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s1">&#39;state&#39;</span><span class="p">)</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">mem</span> <span class="o">=</span> <span class="n">memory</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s1">&#39;state&#39;</span><span class="p">)</span>
<span class="n">state</span> <span class="o">=</span> <span class="n">fc_layer</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">mem</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s1">&#39;state&#39;</span><span class="p">)</span> <span class="n">state</span> <span class="o">=</span> <span class="n">fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">mem</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s1">&#39;state&#39;</span><span class="p">)</span>
</pre></div> </pre></div>
</div> </div>
<p>If you do not want to specify the name, you can equivalently use set_input() <p>If you do not want to specify the name, you can equivalently use set_input()
...@@ -1129,18 +1104,18 @@ name of the layer which this memory remembers.</li> ...@@ -1129,18 +1104,18 @@ name of the layer which this memory remembers.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; size of memory.</li> <li><strong>size</strong> (<em>int</em>) &#8211; size of memory.</li>
<li><strong>memory_name</strong> (<em>basestring</em>) &#8211; the name of the memory. <li><strong>memory_name</strong> (<em>basestring</em>) &#8211; the name of the memory.
It is ignored when name is provided.</li> It is ignored when name is provided.</li>
<li><strong>is_seq</strong> (<em>bool</em>) &#8211; whether the boot_layer is a sequence</li> <li><strong>is_seq</strong> (<em>bool</em>) &#8211; whether the boot layer is a sequence</li>
<li><strong>boot_layer</strong> (<em>LayerOutput|None</em>) &#8211; boot layer of memory.</li> <li><strong>boot</strong> (<em>paddle.v2.config_base.Layer|None</em>) &#8211; boot layer of memory.</li>
<li><strong>boot_bias</strong> (<em>ParameterAttribute|None</em>) &#8211; boot layer&#8217;s bias</li> <li><strong>boot_bias</strong> (<em>paddle.v2.attr.ParameterAttribute|None</em>) &#8211; boot layer&#8217;s bias</li>
<li><strong>boot_bias_active_type</strong> (<em>BaseActivation</em>) &#8211; boot layer&#8217;s active type.</li> <li><strong>boot_bias_active_type</strong> (<em>paddle.v2.activation.Base</em>) &#8211; boot layer&#8217;s active type.</li>
<li><strong>boot_with_const_id</strong> (<em>int</em>) &#8211; boot layer&#8217;s id.</li> <li><strong>boot_with_const_id</strong> (<em>int</em>) &#8211; boot layer&#8217;s id.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">LayerOutput object which is a memory.</p> <tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">paddle.v2.config_base.Layer object which is a memory.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">LayerOutput</p> <tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -1160,9 +1135,9 @@ sequence input. This is extremely useful for attention-based models, or ...@@ -1160,9 +1135,9 @@ sequence input. This is extremely useful for attention-based models, or
Neural Turing Machine like models.</p> Neural Turing Machine like models.</p>
<p>The basic usage (time steps) is:</p> <p>The basic usage (time steps) is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">step</span><span class="p">(</span><span class="nb">input</span><span class="p">):</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">step</span><span class="p">(</span><span class="nb">input</span><span class="p">):</span>
<span class="n">output</span> <span class="o">=</span> <span class="n">fc_layer</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer</span><span class="p">,</span> <span class="n">output</span> <span class="o">=</span> <span class="n">fc</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer</span><span class="p">,</span>
<span class="n">size</span><span class="o">=</span><span class="mi">1024</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">1024</span><span class="p">,</span>
<span class="n">act</span><span class="o">=</span><span class="n">LinearActivation</span><span class="p">(),</span> <span class="n">act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">activation</span><span class="o">.</span><span class="n">Linear</span><span class="p">(),</span>
<span class="n">bias_attr</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span> <span class="n">bias_attr</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="k">return</span> <span class="n">output</span> <span class="k">return</span> <span class="n">output</span>
...@@ -1172,8 +1147,8 @@ Neural Turning Machine like models.</p> ...@@ -1172,8 +1147,8 @@ Neural Turning Machine like models.</p>
</div> </div>
<p>You can see following configs for further usages:</p> <p>You can see following configs for further usages:</p>
<ul class="simple"> <ul class="simple">
<li>time steps: lstmemory_group, paddle/gserver/tests/sequence_layer_group.conf, demo/seqToseq/seqToseq_net.py</li> <li>time steps: lstmemory_group, paddle/gserver/tests/sequence_group.conf, demo/seqToseq/seqToseq_net.py</li>
<li>sequence steps: paddle/gserver/tests/sequence_nest_layer_group.conf</li> <li>sequence steps: paddle/gserver/tests/sequence_nest_group.conf</li>
</ul> </ul>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
<col class="field-name" /> <col class="field-name" />
...@@ -1189,24 +1164,24 @@ a time step result. Then gather each time step of output into ...@@ -1189,24 +1164,24 @@ a time step result. Then gather each time step of output into
layer group&#8217;s output.</p> layer group&#8217;s output.</p>
</li> </li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; recurrent_group&#8217;s name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; recurrent_group&#8217;s name.</li>
<li><strong>input</strong> (<em>LayerOutput|StaticInput|SubsequenceInput|list|tuple</em>) &#8211; <p>Input links array.</p> <li><strong>input</strong> (<em>paddle.v2.config_base.Layer|StaticInput|SubsequenceInput|list|tuple</em>) &#8211; <p>Input links array.</p>
<p>LayerOutput will be scattered into time steps. <p>paddle.v2.config_base.Layer will be scattered into time steps.
SubsequenceInput will be scattered into sequence steps. SubsequenceInput will be scattered into sequence steps.
StaticInput will be imported to each time step, and doesn&#8217;t change StaticInput will be imported to each time step, and doesn&#8217;t change
through time. It&#8217;s a mechanism to access a layer outside the step function.</p> through time. It&#8217;s a mechanism to access a layer outside the step function.</p>
</li> </li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; If reverse is set true, the recurrent unit will process the <li><strong>reverse</strong> (<em>bool</em>) &#8211; If reverse is set true, the recurrent unit will process the
input sequence in a reverse order.</li> input sequence in a reverse order.</li>
<li><strong>targetInlink</strong> (<em>LayerOutput|SubsequenceInput</em>) &#8211; <p>the input layer which shares info with the layer group&#8217;s output</p> <li><strong>targetInlink</strong> (<em>paddle.v2.config_base.Layer|SubsequenceInput</em>) &#8211; <p>the input layer which shares info with the layer group&#8217;s output</p>
<p>Param input specifies multiple input layers. For <p>Param input specifies multiple input layers. For
SubsequenceInput inputs, config should assign one input SubsequenceInput inputs, config should assign one input
layer that shares info (the number of sentences and the number layer that shares info (the number of sentences and the number
of words in each sentence) with all layer group&#8217;s outputs. of words in each sentence) with all layer group&#8217;s outputs.
targetInlink should be one of the layer group&#8217;s input.</p> targetInlink should be one of the layer group&#8217;s input.</p>
</li> </li>
<li><strong>is_generating</strong> &#8211; If generating, none of the input types should be LayerOutput; <li><strong>is_generating</strong> &#8211; If generating, none of the input types should be paddle.v2.config_base.Layer;
otherwise, for training or testing, one of the input types must otherwise, for training or testing, one of the input types must
be LayerOutput.</li> be paddle.v2.config_base.Layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
...@@ -1217,9 +1192,9 @@ be LayerOutput.</li> ...@@ -1217,9 +1192,9 @@ be LayerOutput.</li>
<col class="field-name" /> <col class="field-name" />
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">返回:</th><td class="field-body">LayerOutput object.</td> <tr class="field-odd field"><th class="field-name">返回:</th><td class="field-body">paddle.v2.config_base.Layer object.</td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">返回类型:</th><td class="field-body">LayerOutput</td> <tr class="field-even field"><th class="field-name">返回类型:</th><td class="field-body">paddle.v2.config_base.Layer</td>
</tr> </tr>
</tbody> </tbody>
</table> </table>
...@@ -1230,7 +1205,7 @@ be LayerOutput.</li> ...@@ -1230,7 +1205,7 @@ be LayerOutput.</li>
<h3>lstm_step<a class="headerlink" href="#lstm-step" title="永久链接至标题"></a></h3> <h3>lstm_step<a class="headerlink" href="#lstm-step" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">lstm_step</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">lstm_step</code></dt>
<dd><p>LSTM Step Layer. It is used in recurrent_group. The LSTM equations are shown <dd><p>LSTM Step Layer. It is used in recurrent_group. The LSTM equations are shown
as follows.</p> as follows.</p>
<div class="math"> <div class="math">
...@@ -1255,10 +1230,10 @@ output is <span class="math">\(o_t\)</span>, which name is &#8216;state&#8217; a ...@@ -1255,10 +1230,10 @@ output is <span class="math">\(o_t\)</span>, which name is &#8216;state&#8217; a
<code class="code docutils literal"><span class="pre">state.size</span></code>.</li> <code class="code docutils literal"><span class="pre">state.size</span></code>.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer. <span class="math">\(Wx_t + Wh_{t-1}\)</span></li> <li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer. <span class="math">\(Wx_t + Wh_{t-1}\)</span></li>
<li><strong>state</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; State Layer. <span class="math">\(c_{t-1}\)</span></li> <li><strong>state</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; State Layer. <span class="math">\(c_{t-1}\)</span></li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; Activation type. Default is tanh</li> <li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation type. Default is tanh</li>
<li><strong>gate_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; Gate Activation Type. Default is sigmoid, and should <li><strong>gate_act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Gate Activation Type. Default is sigmoid, and should
be sigmoid only.</li> be sigmoid only.</li>
<li><strong>state_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; State Activation Type. Default is sigmoid, and should <li><strong>state_act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; State Activation Type. Default is sigmoid, and should
be sigmoid only.</li> be sigmoid only.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; Bias Attribute.</li> <li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; Bias Attribute.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; layer&#8217;s extra attribute.</li> <li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; layer&#8217;s extra attribute.</li>
...@@ -1280,7 +1255,7 @@ be sigmoid only.</li> ...@@ -1280,7 +1255,7 @@ be sigmoid only.</li>
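<p>A schematic sketch of how lstm_step pairs with memory inside a recurrent_group step function. The wiring of the &#8216;state&#8217; output back into the state memory (via get_output) is omitted, and names such as <code>input_proj</code> are placeholders:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>def lstm_group_step(input_proj):
    # c_{t-1}: the cell state carried across time steps.
    state = paddle.v2.layer.memory(name='lstm_state', size=256)
    hidden = paddle.v2.layer.lstm_step(input=input_proj,   # Wx_t + Wh_{t-1}
                                       state=state,        # c_{t-1}
                                       size=256,
                                       act=paddle.v2.activation.Tanh(),
                                       gate_act=paddle.v2.activation.Sigmoid())
    return hidden
</pre></div>
</div>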
<h3>gru_step<a class="headerlink" href="#gru-step" title="永久链接至标题"></a></h3> <h3>gru_step<a class="headerlink" href="#gru-step" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">gru_step</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">gru_step</code></dt>
<dd><table class="docutils field-list" frame="void" rules="none"> <dd><table class="docutils field-list" frame="void" rules="none">
<col class="field-name" /> <col class="field-name" />
<col class="field-body" /> <col class="field-body" />
...@@ -1321,7 +1296,7 @@ to maintain tractability.</p> ...@@ -1321,7 +1296,7 @@ to maintain tractability.</p>
<p>The example usage is:</p> <p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">rnn_step</span><span class="p">(</span><span class="nb">input</span><span class="p">):</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">rnn_step</span><span class="p">(</span><span class="nb">input</span><span class="p">):</span>
<span class="n">last_time_step_output</span> <span class="o">=</span> <span class="n">memory</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s1">&#39;rnn&#39;</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">512</span><span class="p">)</span> <span class="n">last_time_step_output</span> <span class="o">=</span> <span class="n">memory</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s1">&#39;rnn&#39;</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">512</span><span class="p">)</span>
<span class="k">with</span> <span class="n">mixed_layer</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">512</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s1">&#39;rnn&#39;</span><span class="p">)</span> <span class="k">as</span> <span class="n">simple_rnn</span><span class="p">:</span> <span class="k">with</span> <span class="n">mixed</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">512</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s1">&#39;rnn&#39;</span><span class="p">)</span> <span class="k">as</span> <span class="n">simple_rnn</span><span class="p">:</span>
<span class="n">simple_rnn</span> <span class="o">+=</span> <span class="n">full_matrix_projection</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span> <span class="n">simple_rnn</span> <span class="o">+=</span> <span class="n">full_matrix_projection</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span>
<span class="n">simple_rnn</span> <span class="o">+=</span> <span class="n">last_time_step_output</span> <span class="n">simple_rnn</span> <span class="o">+=</span> <span class="n">last_time_step_output</span>
<span class="k">return</span> <span class="n">simple_rnn</span> <span class="k">return</span> <span class="n">simple_rnn</span>
...@@ -1383,7 +1358,7 @@ beam size.</li> ...@@ -1383,7 +1358,7 @@ beam size.</li>
<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">The generated word index.</p> <tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">The generated word index.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">LayerOutput</p> <tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -1395,7 +1370,7 @@ beam size.</li> ...@@ -1395,7 +1370,7 @@ beam size.</li>
<h3>get_output<a class="headerlink" href="#get-output" title="永久链接至标题"></a></h3> <h3>get_output<a class="headerlink" href="#get-output" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">get_output</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">get_output</code></dt>
<dd><p>Get layer&#8217;s output by name. In PaddlePaddle, a layer might return multiple <dd><p>Get layer&#8217;s output by name. In PaddlePaddle, a layer might return multiple
values, but returns one layer&#8217;s output. If the user wants to use another values, but returns one layer&#8217;s output. If the user wants to use another
output besides the default one, please use get_output first to get output besides the default one, please use get_output first to get
...@@ -1436,17 +1411,17 @@ multiple outputs.</li> ...@@ -1436,17 +1411,17 @@ multiple outputs.</li>
Each inputs is a projection or operator.</p> Each inputs is a projection or operator.</p>
<p>There are two styles of usages.</p> <p>There are two styles of usages.</p>
<ol class="arabic simple"> <ol class="arabic simple">
<li>When not set inputs parameter, use mixed_layer like this:</li> <li>When not set inputs parameter, use mixed like this:</li>
</ol> </ol>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">with</span> <span class="n">mixed_layer</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">)</span> <span class="k">as</span> <span class="n">m</span><span class="p">:</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">with</span> <span class="n">mixed</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">)</span> <span class="k">as</span> <span class="n">m</span><span class="p">:</span>
<span class="n">m</span> <span class="o">+=</span> <span class="n">full_matrix_projection</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer1</span><span class="p">)</span> <span class="n">m</span> <span class="o">+=</span> <span class="n">full_matrix_projection</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer1</span><span class="p">)</span>
<span class="n">m</span> <span class="o">+=</span> <span class="n">identity_projection</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer2</span><span class="p">)</span> <span class="n">m</span> <span class="o">+=</span> <span class="n">identity_projection</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer2</span><span class="p">)</span>
</pre></div> </pre></div>
</div> </div>
<ol class="arabic simple" start="2"> <ol class="arabic simple" start="2">
<li>You can also set all inputs when invoke mixed_layer as follows:</li> <li>You can also set all inputs when invoke mixed as follows:</li>
</ol> </ol>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">m</span> <span class="o">=</span> <span class="n">mixed_layer</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">m</span> <span class="o">=</span> <span class="n">mixed</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span>
<span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">full_matrix_projection</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer1</span><span class="p">),</span> <span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">full_matrix_projection</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer1</span><span class="p">),</span>
<span class="n">full_matrix_projection</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer2</span><span class="p">)])</span> <span class="n">full_matrix_projection</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer2</span><span class="p">)])</span>
</pre></div> </pre></div>
...@@ -1460,11 +1435,11 @@ Each inputs is a projection or operator.</p> ...@@ -1460,11 +1435,11 @@ Each inputs is a projection or operator.</p>
<li><strong>size</strong> (<em>int</em>) &#8211; layer size.</li> <li><strong>size</strong> (<em>int</em>) &#8211; layer size.</li>
<li><strong>input</strong> &#8211; inputs layer. It is an optional parameter. If set, <li><strong>input</strong> &#8211; inputs layer. It is an optional parameter. If set,
then this function will just return layer&#8217;s name.</li> then this function will just return layer&#8217;s name.</li>
<li><strong>act</strong> (<em>BaseActivation</em>) &#8211; Activation Type.</li> <li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type.</li>
<li><strong>bias_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None</em><em> or </em><em>bool</em>) &#8211; The Bias Attribute. If no bias, then pass False or <li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em><em> or </em><em>bool</em>) &#8211; The Bias Attribute. If no bias, then pass False or
something not type of ParameterAttribute. None will get a something not type of paddle.v2.attr.ParameterAttribute. None will get a
default Bias.</li> default Bias.</li>
<li><strong>layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; The extra layer config. Default is None.</li> <li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; The extra layer config. Default is None.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
...@@ -1483,7 +1458,7 @@ default Bias.</li> ...@@ -1483,7 +1458,7 @@ default Bias.</li>
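<p>Putting the parameters above together, a minimal sketch (layer1 and layer2 are assumed to be previously defined layers of sizes compatible with the projections):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># hypothetical layer1/layer2; both usage styles accept act and bias_attr
m = mixed(size=256,
          act=paddle.v2.activation.Relu(),
          bias_attr=False,
          input=[full_matrix_projection(input=layer1),
                 identity_projection(input=layer2)])
</pre></div>
</div>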
<span id="api-v2-layer-embedding"></span><h3>embedding<a class="headerlink" href="#embedding" title="永久链接至标题"></a></h3> <span id="api-v2-layer-embedding"></span><h3>embedding<a class="headerlink" href="#embedding" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">embedding</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">embedding</code></dt>
<dd><p>Define a embedding Layer.</p> <dd><p>Define a embedding Layer.</p>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
<col class="field-name" /> <col class="field-name" />
...@@ -1514,7 +1489,7 @@ for details.</li> ...@@ -1514,7 +1489,7 @@ for details.</li>
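<p>A minimal sketch of the usual pattern, an integer-id input fed into an embedding (dict_size and emb_dim are placeholder values):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># dict_size and emb_dim are illustrative constants
word = paddle.v2.layer.data(name=&#39;word&#39;,
                            type=paddle.v2.data_type.integer_value_sequence(dict_size))
emb = embedding(input=word, size=emb_dim)
</pre></div>
</div>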
<h3>scaling_projection<a class="headerlink" href="#scaling-projection" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">scaling_projection</code></dt>
<dd><p>scaling_projection multiplies the input with a scalar parameter and adds it to
the output.</p>
<div class="math">
...@@ -1548,7 +1523,7 @@ the output.</p>
<h3>dotmul_projection<a class="headerlink" href="#dotmul-projection" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">dotmul_projection</code></dt>
<dd><p>DotMulProjection takes a layer as input.
It performs element-wise multiplication with its weight.</p>
<div class="math">
...@@ -1583,7 +1558,7 @@ It performs element-wise multiplication with its weight.</p>
<h3>dotmul_operator<a class="headerlink" href="#dotmul-operator" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">dotmul_operator</code></dt>
<dd><p>DotMulOperator takes two inputs and performs element-wise multiplication:</p>
<div class="math">
\[out.row[i] += scale * (a.row[i] .* b.row[i])\]</div>
...@@ -1619,7 +1594,7 @@ scale is a config scalar; its default value is one.</p>
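<p>Unlike a projection, an operator takes two inputs. A sketch of dotmul_operator inside mixed (a and b are assumed to be layers of equal size):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>with mixed(size=256) as m:
    m += dotmul_operator(a=a, b=b, scale=1.0)
</pre></div>
</div>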
<h3>full_matrix_projection<a class="headerlink" href="#full-matrix-projection" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">full_matrix_projection</code></dt>
<dd><p>Full Matrix Projection. It performs full matrix multiplication.</p>
<div class="math">
\[out.row[i] += in.row[i] * weight\]</div>
...@@ -1665,7 +1640,7 @@ scale is a config scalar; its default value is one.</p>
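<p>A sketch of full_matrix_projection used inside mixed; the weight matrix maps from the input size to the mixed layer&#8217;s size (the parameter name is illustrative):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>with mixed(size=100) as fc:
    fc += full_matrix_projection(
        input=layer1,
        param_attr=paddle.v2.attr.ParameterAttribute(name=&#39;_proj&#39;))
</pre></div>
</div>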
<h3>identity_projection<a class="headerlink" href="#identity-projection" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">identity_projection</code></dt>
<dd><ol class="arabic simple">
<li>IdentityProjection if offset=None. It performs:</li>
</ol>
...@@ -1711,7 +1686,7 @@ It selects dimensions [offset, offset+layer_size) from the input:</p>
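<p>A sketch of both modes (the offset value is illustrative; with offset set, dimensions [offset, offset+size) of the input are selected):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>with mixed(size=256) as m:
    m += identity_projection(input=layer1)                  # plain copy
with mixed(size=256) as sliced:
    sliced += identity_projection(input=layer2, offset=10)  # slice [10, 266)
</pre></div>
</div>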
<h3>table_projection<a class="headerlink" href="#table-projection" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">table_projection</code></dt>
<dd><p>Table Projection. It selects rows from the parameter where row_id
is in input_ids.</p>
<div class="math">
...@@ -1760,7 +1735,7 @@ and <span class="math">\(i\)</span> is row_id.</p>
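<p>table_projection expects integer ids as its input; a minimal sketch (ids is assumed to be an integer-valued data layer):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># each id in the input selects one row of the parameter table
with mixed(size=128) as m:
    m += table_projection(input=ids)
</pre></div>
</div>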
<h3>trans_full_matrix_projection<a class="headerlink" href="#trans-full-matrix-projection" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">trans_full_matrix_projection</code></dt>
<dd><p>Different from full_matrix_projection, this projection performs matrix
multiplication using the transpose of the weight.</p>
<div class="math">
...@@ -1828,7 +1803,7 @@ sequence of a nested sequence, <code class="code docutils literal"><span class="
<span id="id1"></span><h3>pooling<a class="headerlink" href="#api-v2-layer-pooling" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">pooling</code></dt>
<dd><p>Pooling layer for sequence inputs; not used for images.</p>
<p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">seq_pool</span> <span class="o">=</span> <span class="n">pooling</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">layer</span><span class="p">,</span>
...@@ -1867,7 +1842,7 @@ SumPooling, SquareRootNPooling.</li>
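<p>A complete call, assuming max pooling over each sequence is wanted (the pooling_type module path follows the v2 naming used elsewhere in these docs):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>seq_pool = pooling(input=layer,
                   pooling_type=paddle.v2.pooling.Max())
</pre></div>
</div>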
<span id="api-v2-layer-last-seq"></span><h3>last_seq<a class="headerlink" href="#last-seq" title="永久链接至标题"></a></h3> <span id="api-v2-layer-last-seq"></span><h3>last_seq<a class="headerlink" href="#last-seq" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">last_seq</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">last_seq</code></dt>
<dd><p>Get Last Timestamp Activation of a sequence.</p> <dd><p>Get Last Timestamp Activation of a sequence.</p>
<p>If stride &gt; 0, this layer slides a window whose size is determined by stride, <p>If stride &gt; 0, this layer slides a window whose size is determined by stride,
and return the last value of the window as the output. Thus, a long sequence and return the last value of the window as the output. Thus, a long sequence
...@@ -1905,7 +1880,7 @@ of stride is -1.</p> ...@@ -1905,7 +1880,7 @@ of stride is -1.</p>
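<p>A sketch of the stride window described above (stride=5 is illustrative: the output then keeps the last value of every window of five timesteps):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>seq = last_seq(input=sequence_layer, stride=5)
</pre></div>
</div>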
<span id="api-v2-layer-first-seq"></span><h3>first_seq<a class="headerlink" href="#first-seq" title="永久链接至标题"></a></h3> <span id="api-v2-layer-first-seq"></span><h3>first_seq<a class="headerlink" href="#first-seq" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">first_seq</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">first_seq</code></dt>
<dd><p>Get First Timestamp Activation of a sequence.</p> <dd><p>Get First Timestamp Activation of a sequence.</p>
<p>If stride &gt; 0, this layer slides a window whose size is determined by stride, <p>If stride &gt; 0, this layer slides a window whose size is determined by stride,
and return the first value of the window as the output. Thus, a long sequence and return the first value of the window as the output. Thus, a long sequence
...@@ -1943,7 +1918,7 @@ of stride is -1.</p> ...@@ -1943,7 +1918,7 @@ of stride is -1.</p>
<h3>concat<a class="headerlink" href="#concat" title="永久链接至标题"></a></h3> <h3>concat<a class="headerlink" href="#concat" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="class">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">concat</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">concat</code></dt>
<dd><p>Concat all input vector into one huge vector. <dd><p>Concat all input vector into one huge vector.
Inputs can be list of paddle.v2.config_base.Layer or list of projection.</p> Inputs can be list of paddle.v2.config_base.Layer or list of projection.</p>
<p>The example usage is:</p> <p>The example usage is:</p>
...@@ -1957,7 +1932,7 @@ Inputs can be list of paddle.v2.config_base.Layer or list of projection.</p> ...@@ -1957,7 +1932,7 @@ Inputs can be list of paddle.v2.config_base.Layer or list of projection.</p>
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>input</strong> (<em>list|tuple|collections.Sequence</em>) &#8211; input layers or projections</li> <li><strong>input</strong> (<em>list|tuple|collections.Sequence</em>) &#8211; input layers or projections</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; Activation type.</li> <li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation type.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li> <li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
</ul> </ul>
</td> </td>
...@@ -1977,7 +1952,7 @@ Inputs can be list of paddle.v2.config_base.Layer or list of projection.</p> ...@@ -1977,7 +1952,7 @@ Inputs can be list of paddle.v2.config_base.Layer or list of projection.</p>
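<p>A minimal sketch (layer sizes are illustrative: if layer1 has size 128 and layer2 has size 256, the concatenated output has size 384):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>concatenated = concat(input=[layer1, layer2])
</pre></div>
</div>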
<h3>seq_concat<a class="headerlink" href="#seq-concat" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">seq_concat</code></dt>
<dd><p>Concat sequence a with sequence b.</p>
<dl class="docutils">
<dt>Inputs:</dt>
...@@ -2001,7 +1976,7 @@ Inputs can be a list of paddle.v2.config_base.Layer or a list of projections.</p>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>a</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input sequence layer</li>
<li><strong>b</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input sequence layer</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation type.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer Attribute.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em><em> or </em><em>bool</em>) &#8211; The Bias Attribute. If no bias is used, pass False or
something that is not of type paddle.v2.attr.ParameterAttribute. None will get a
...@@ -2027,7 +2002,7 @@ default Bias.</li>
<h3>block_expand<a class="headerlink" href="#block-expand" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">block_expand</code></dt>
<dd><dl class="docutils">
<dt>Expand feature map to minibatch matrix.</dt>
<dd><ul class="first last simple">
...@@ -2103,7 +2078,7 @@ sequence of a nested sequence, <code class="code docutils literal"><span class="
<h3>expand<a class="headerlink" href="#expand" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">expand</code></dt>
<dd><p>A layer for expanding dense data, or sequence data where the length of each
sequence is one, to sequence data.</p>
<p>The example usage is:</p>
...@@ -2142,7 +2117,7 @@ bias.</li>
<h3>repeat<a class="headerlink" href="#repeat" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">repeat</code></dt>
<dd><p>A layer for repeating the input num_repeats times. This is equivalent
to applying concat() with num_repeats copies of the same input.</p>
<div class="math">
...@@ -2178,7 +2153,7 @@ to applying concat() with num_repeats copies of the same input.</p>
<h3>rotate<a class="headerlink" href="#rotate" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">rotate</code></dt>
<dd><p>A layer for rotating each feature channel by 90 degrees (clockwise),
usually used when the input sample is an image or feature map.</p>
<div class="math">
...@@ -2217,7 +2192,7 @@ usually used when the input sample is an image or feature map.</p>
<h3>seq_reshape<a class="headerlink" href="#seq-reshape" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">seq_reshape</code></dt>
<dd><p>A layer for reshaping the sequence. Assume the input sequence has T instances,
the dimension of each instance is M, and the input reshape_size is N; then the
output sequence has T*M/N instances, each of dimension N.</p>
...@@ -2234,7 +2209,7 @@ output sequence has T*M/N instances, each of dimension N.</p>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer.</li>
<li><strong>reshape_size</strong> (<em>int</em>) &#8211; the size of the reshaped sequence.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation type.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; extra layer attributes.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em><em> or </em><em>bool</em>) &#8211; The Bias Attribute. If no bias is used, pass False or
something that is not of type paddle.v2.attr.ParameterAttribute. None will get a
...@@ -2260,7 +2235,7 @@ default Bias.</li>
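<p>A sketch with the arithmetic spelled out (sizes are illustrative): with T=4 instances of dimension M=32 and reshape_size N=64, the output has T*M/N = 2 instances of dimension 64.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># 4 instances of dimension 32 are regrouped into 2 instances of dimension 64
reshaped = seq_reshape(input=sequence_layer, reshape_size=64)
</pre></div>
</div>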
<h3>addto<a class="headerlink" href="#addto" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">addto</code></dt>
<dd><p>AddtoLayer.</p>
<div class="math">
\[y = f(\sum_{i} x_i + b)\]</div>
...@@ -2268,7 +2243,7 @@ default Bias.</li>
and <span class="math">\(f\)</span> is the activation function.</p>
<p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">addto</span> <span class="o">=</span> <span class="n">addto</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">,</span> <span class="n">layer2</span><span class="p">],</span>
              <span class="n">act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">activation</span><span class="o">.</span><span class="n">Relu</span><span class="p">(),</span>
              <span class="n">bias_attr</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
</pre></div>
</div>
...@@ -2290,7 +2265,7 @@ Please refer to dropout for details.</p>
<li><strong>name</strong> (<em>basestring</em>) &#8211; Layer name.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer|list|tuple</em>) &#8211; Input layers. It could be a paddle.v2.config_base.Layer or a list/tuple of
paddle.v2.config_base.Layer.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type, default is tanh.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|bool</em>) &#8211; Bias attribute. If False, no bias is used. None gives the default
bias.</li>
<li><strong>layer_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; Extra Layer attribute.</li>
...@@ -2312,7 +2287,7 @@ bias.</li>
<h3>linear_comb<a class="headerlink" href="#linear-comb" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">linear_comb</code></dt>
<dd><dl class="docutils">
<dt>A layer for the weighted sum of vectors; it takes two inputs.</dt>
<dd><ul class="first last simple">
...@@ -2375,7 +2350,7 @@ processed in one batch.</p>
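<p>A minimal sketch (layer names are illustrative): if weights has size M and vectors has size M*N, the output has size N, passed as the size argument:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># weights: size M; vectors: size M*N; output: size N
lc = linear_comb(weights=weight_layer, vectors=vector_layer, size=10)
</pre></div>
</div>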
<h3>interpolation<a class="headerlink" href="#interpolation" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">interpolation</code></dt>
<dd><p>This layer performs linear interpolation between two inputs,
which is used in the Neural Turing Machine.</p>
<div class="math">
...@@ -2414,7 +2389,7 @@ which is used in the Neural Turing Machine.</p>
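<p>A minimal sketch (weight_layer is assumed to be a layer of size 1 holding the interpolation weight w):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># out = w .* layer1 + (1 - w) .* layer2
interp = interpolation(input=[layer1, layer2], weight=weight_layer)
</pre></div>
</div>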
<h3>bilinear_interp<a class="headerlink" href="#bilinear-interp" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">bilinear_interp</code></dt>
<dd><p>This layer implements bilinear interpolation on the output of a conv layer.</p>
<p>Please refer to Wikipedia: <a class="reference external" href="https://en.wikipedia.org/wiki/Bilinear_interpolation">https://en.wikipedia.org/wiki/Bilinear_interpolation</a></p>
<p>The simple usage is:</p>
...@@ -2449,7 +2424,7 @@ which is used in the Neural Turing Machine.</p>
<h3>power<a class="headerlink" href="#power" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">power</code></dt>
<dd><p>This layer applies a power function to a vector element-wise,
which is used in the Neural Turing Machine.</p>
<div class="math">
...@@ -2487,7 +2462,7 @@ and <span class="math">\(y\)</span> is an output vector.</p>
<h3>scaling<a class="headerlink" href="#scaling" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">scaling</code></dt>
<dd><p>A layer for multiplying the input vector by a weight scalar.</p>
<div class="math">
\[y = w x\]</div>
...@@ -2526,7 +2501,7 @@ processed in one batch.</p>
<h3>slope_intercept<a class="headerlink" href="#slope-intercept" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">slope_intercept</code></dt>
<dd><p>This layer applies a slope and an intercept to the input
element-wise. There is no activation and no weight.</p>
<div class="math">
...@@ -2563,7 +2538,7 @@ element-wise. There is no activation and no weight.</p>
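<p>A minimal sketch (the constants are illustrative):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># y = 2.0 * x + 1.0, applied element-wise
scaled = slope_intercept(input=layer1, slope=2.0, intercept=1.0)
</pre></div>
</div>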
<h3>tensor<a class="headerlink" href="#tensor" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">tensor</code></dt>
<dd><p>This layer performs a tensor operation on two inputs.
For example, for each sample:</p>
<div class="math">
...@@ -2592,7 +2567,7 @@ For example, for each sample:</p>
<li><strong>a</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer a.</li>
<li><strong>b</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; Input layer b.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; the layer dimension.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; Activation Type. Default is tanh.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; The Parameter Attribute.</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|None|Any</em>) &#8211; The Bias Attribute. If no bias is used, pass False or
something that is not of type paddle.v2.attr.ParameterAttribute. None will get a
...@@ -2616,7 +2591,7 @@ default Bias.</li>
<span id="api-v2-layer-cos-sim"></span><h3>cos_sim<a class="headerlink" href="#cos-sim" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">cos_sim</code></dt>
<dd><p>Cosine Similarity Layer. The cosine similarity equation is as follows.</p>
<div class="math">
\[similarity = cos(\theta) = {\mathbf{a} \cdot \mathbf{b}
...@@ -2659,7 +2634,7 @@ processed in one batch.</p>
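<p>A minimal sketch (layer1 and layer2 are assumed to have the same size):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>similarity = cos_sim(a=layer1, b=layer2)
</pre></div>
</div>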
<h3>trans<a class="headerlink" href="#trans" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">trans</code></dt>
<dd><p>A layer for transposing a minibatch matrix.</p>
<div class="math">
\[y = x^\mathrm{T}\]</div>
...@@ -2697,7 +2672,7 @@ processed in one batch.</p>
<h3>maxid<a class="headerlink" href="#maxid" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">max_id</code></dt>
<dd><p>A layer for finding the id with the maximal value for each sample.
The result is stored in output.ids.</p>
<p>The example usage is:</p>
...@@ -2730,7 +2705,7 @@ The result is stored in output.ids.</p>
<h3>sampling_id<a class="headerlink" href="#sampling-id" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">sampling_id</code></dt>
<dd><p>A layer for sampling an id from the multinomial distribution given by the input layer,
sampling one id per sample.</p>
<p>The simple usage is:</p>
...@@ -2766,7 +2741,7 @@ sampling one id per sample.</p>
<h3>pad<a class="headerlink" href="#pad" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">pad</code></dt>
<dd><p>This operation pads zeros to the input data according to pad_c, pad_h
and pad_w. pad_c, pad_h and pad_w specify in which dimension and with what size
the padding is applied. The input data shape is NCHW.</p>
...@@ -2835,7 +2810,7 @@ in width dimension.</p>
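<p>A sketch with illustrative pad sizes; each pair gives the number of zero planes added before and after along that axis of the NCHW input:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>padded = pad(input=image_layer,
             pad_c=[2, 3],   # channel axis: 2 before, 3 after
             pad_h=[1, 2],   # height axis
             pad_w=[3, 1])   # width axis
</pre></div>
</div>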
<h3>cross_entropy_cost<a class="headerlink" href="#cross-entropy-cost" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">cross_entropy_cost</code></dt>
<dd><p>A loss layer for multi-class cross entropy.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">cost</span> <span class="o">=</span> <span class="n">cross_entropy</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="nb">input</span><span class="p">,</span>
                     <span class="n">label</span><span class="o">=</span><span class="n">label</span><span class="p">)</span>
...@@ -2873,7 +2848,7 @@ will not be calculated for weight.</li>
<h3>cross_entropy_with_selfnorm_cost<a class="headerlink" href="#cross-entropy-with-selfnorm-cost" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">cross_entropy_with_selfnorm_cost</code></dt>
<dd><p>A loss layer for multi-class cross entropy with self-normalization.
The input should be a vector of positive numbers, without normalization.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">cost</span> <span class="o">=</span> <span class="n">cross_entropy_with_selfnorm</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="nb">input</span><span class="p">,</span>
...@@ -2909,7 +2884,7 @@ The input should be a vector of positive numbers, without normalization.</p>
<h3>multi_binary_label_cross_entropy_cost<a class="headerlink" href="#multi-binary-label-cross-entropy-cost" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">multi_binary_label_cross_entropy_cost</code></dt>
<dd><p>A loss layer for multi binary label cross entropy.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">cost</span> <span class="o">=</span> <span class="n">multi_binary_label_cross_entropy</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="nb">input</span><span class="p">,</span>
                                        <span class="n">label</span><span class="o">=</span><span class="n">label</span><span class="p">)</span>
...@@ -2943,7 +2918,7 @@ The input should be a vector of positive numbers, without normalization.</p>
<h3>huber_cost<a class="headerlink" href="#huber-cost" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">huber_cost</code></dt>
<dd><p>A loss layer for Huber loss.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">cost</span> <span class="o">=</span> <span class="n">huber_cost</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="nb">input</span><span class="p">,</span>
                  <span class="n">label</span><span class="o">=</span><span class="n">label</span><span class="p">)</span>
...@@ -2977,7 +2952,7 @@ The input should be a vector of positive numbers, without normalization.</p>
<h3>lambda_cost<a class="headerlink" href="#lambda-cost" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">lambda_cost</code></dt>
<dd><p>lambdaCost for the lambdaRank LTR approach.</p>
<p>The simple usage:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">cost</span> <span class="o">=</span> <span class="n">lambda_cost</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="nb">input</span><span class="p">,</span>
...@@ -3023,7 +2998,7 @@ entire list of get gradient.</li>
<h3>mse_cost<a class="headerlink" href="#mse-cost" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">mse_cost</code></dt>
<dd><blockquote>
<div><p>mean squared error cost:</p>
<div class="math">
...@@ -3076,7 +3051,7 @@ It is an optional argument.</td>
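<p>A minimal sketch (prediction and label are assumed to have the same dimension):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>cost = mse_cost(input=prediction, label=label)
</pre></div>
</div>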
<h3>rank_cost<a class="headerlink" href="#rank-cost" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">rank_cost</code></dt>
<dd><p>A cost layer for learning to rank using gradient descent. Details can be found in
the <a class="reference external" href="http://research.microsoft.com/en-us/um/people/cburges/papers/ICML_ranking.pdf">paper</a>.
This layer contains at least three inputs. The weight is an optional
argument, which affects the cost.</p>
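<p>A minimal sketch (illustrative names; <code>left</code> and <code>right</code> are assumed to be the two score inputs the description mentions):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># sketch: pairwise ranking between a left and a right score
cost = rank_cost(left=left_score, right=right_score, label=label)
</pre></div>
</div>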
<h3>sum_cost<a class="headerlink" href="#sum-cost" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">sum_cost</code></dt>
<dd><p>A loss layer which calculates the sum of the input as loss.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>cost = sum_cost(input=input)
</pre></div>
</div>
<h3>crf<a class="headerlink" href="#crf" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">crf</code></dt>
<dd><p>A layer for calculating the cost of a sequential conditional random
field (CRF) model.</p>
<p>The simple usage:</p>
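<p>The original example is elided by the diff; a minimal sketch built from the documented inputs (<code>emission</code> and <code>label_dim</code> are illustrative names):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># sketch: size is assumed to equal the number of tags (label_dim)
crf_cost = crf(input=emission, label=label, size=label_dim)
</pre></div>
</div>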
<h3>crf_decoding<a class="headerlink" href="#crf-decoding" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">crf_decoding</code></dt>
<dd><p>A layer for calculating the decoding sequence of a sequential conditional
random field model. The decoding sequence is stored in output.ids.
If a second input is provided, it is treated as the ground-truth label, and
this layer will also calculate the error: output.value[i] is 1 for an incorrect
decoding and 0 for a correct one.</p>
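<p>A minimal sketch (illustrative names; <code>size</code> is assumed to be the number of tags):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># sketch: decoded tag ids are stored in output.ids
result = crf_decoding(input=emission, size=label_dim)
</pre></div>
</div>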
<h3>ctc<a class="headerlink" href="#ctc" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">ctc</code></dt>
<dd><p>Connectionist Temporal Classification (CTC) is designed for temporal
classification tasks, i.e., sequence labeling problems where the
alignment between the inputs and the target labels is unknown.</p>
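<p>A hedged sketch of typical usage (names are illustrative; per the note above, <code>size</code> must be num_classes + 1 to account for the blank label):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># sketch: the extra class is the CTC blank label
cost = ctc(input=output, label=label, size=num_classes + 1)
</pre></div>
</div>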
<h3>warp_ctc<a class="headerlink" href="#warp-ctc" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">warp_ctc</code></dt>
<dd><p>A layer integrating the open-source <cite>warp-ctc
&lt;https://github.com/baidu-research/warp-ctc&gt;</cite> library, which is used in
<cite>Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
&lt;https://arxiv.org/abs/1512.02595&gt;</cite>.</p>
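<p>A sketch mirroring the ctc example above (illustrative names; the <code>blank</code> argument is an assumption based on warp-ctc&#8217;s user-chosen blank index):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># sketch: blank index assumed to be the last of the size classes
cost = warp_ctc(input=output, label=label, size=num_classes + 1,
                blank=num_classes)
</pre></div>
</div>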
<h3>nce<a class="headerlink" href="#nce" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">nce</code></dt>
<dd><p>Noise-contrastive estimation (NCE).
Implements the method in the following paper:
A fast and simple algorithm for training neural probabilistic language models.</p>
<li><strong>label</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; label layer</li>
<li><strong>weight</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; weight layer, can be None (default)</li>
<li><strong>num_classes</strong> (<em>int</em>) &#8211; number of classes.</li>
<li><strong>act</strong> (<em>paddle.v2.activation.Base</em>) &#8211; activation type, default is Sigmoid.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; the parameter attribute (or a list of attributes).</li>
<li><strong>num_neg_samples</strong> (<em>int</em>) &#8211; number of negative samples. Default is 10.</li>
<li><strong>neg_distribution</strong> (<em>list|tuple|collections.Sequence|None</em>) &#8211; the distribution for generating the random negative labels.
If not None, its length must be equal to num_classes.</li>
</ul>
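<p>A minimal sketch built only from the parameters listed above (layer names are illustrative):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># sketch: hidden predicts the next word over a dict_size vocabulary
cost = nce(input=hidden, label=next_word,
           num_classes=dict_size, num_neg_samples=10)
</pre></div>
</div>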
<h3>hsigmoid<a class="headerlink" href="#hsigmoid" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">hsigmoid</code></dt>
<dd><p>Organize the classes into a binary tree. At each node, a sigmoid function
is used to calculate the probability of belonging to the right branch.
This idea is from &#8220;F. Morin, Y. Bengio (AISTATS 05):
Hierarchical Probabilistic Neural Network Language Model.&#8221;</p>
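<p>A minimal sketch (illustrative names; assumes <code>num_classes</code> leaf classes in the binary tree):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># sketch: label is the ground-truth class id
cost = hsigmoid(input=feature, label=label, num_classes=num_classes)
</pre></div>
</div>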
<h3>smooth_l1_cost<a class="headerlink" href="#smooth-l1-cost" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">smooth_l1_cost</code></dt>
<dd><p>This is an L1 loss, but smoother. It requires that the
sizes of the input and the label are equal. The formula is as follows:</p>
<div class="math">
\[\mathrm{smooth}_{L1}(x) = \begin{cases} 0.5 x^2, &amp; \text{if } |x| &lt; 1 \\ |x| - 0.5, &amp; \text{otherwise} \end{cases}\]
</div>
<p>where <span class="math">\(x\)</span> is the elementwise difference between the input and the label.</p>
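<p>A minimal sketch (names are illustrative; input and label must have equal size, as required above):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># sketch: elementwise smooth-L1 over (input - label)
cost = smooth_l1_cost(input=input, label=label)
</pre></div>
</div>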
<h3>eos<a class="headerlink" href="#eos" title="Permalink to this headline"></a></h3>
<dl class="class">
<dt>
<em class="property">class </em><code class="descclassname">paddle.v2.layer.</code><code class="descname">eos</code></dt>
<dd><p>A layer for checking the EOS (end of sequence) mark for each sample:
- output_id = (input_id == conf.eos_id)</p>
<p>The result is stored in output_.ids.</p>
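<p>A hedged sketch (the <code>eos_id</code> keyword is an assumption based on the conf.eos_id field mentioned above; names are illustrative):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># sketch: marks positions where input ids equal the assumed end-of-sequence id
eos_check = eos(input=decoder_out, eos_id=end_id)
</pre></div>
</div>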
<h2>NLP<a class="headerlink" href="#nlp" title="Permalink to this headline"></a></h2>
<div class="section" id="sequence-conv-pool">
<h3>sequence_conv_pool<a class="headerlink" href="#sequence-conv-pool" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">sequence_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Text convolution pooling layers helper.</p>
<p>Text input =&gt; Context Projection =&gt; FC Layer =&gt; Pooling =&gt; Output. A usage
sketch follows the parameter table below.</p>
<table class="docutils field-list" frame="void" rules="none">
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of output layer(pooling layer name)</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of output layer(pooling layer name)</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; name of input layer</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; name of input layer</li>
<li><strong>context_len</strong> (<em>int</em>) &#8211; context projection length. See <li><strong>context_len</strong> (<em>int</em>) &#8211; context projection length. See
context_projection&#8217;s document.</li> context_projection&#8217;s document.</li>
<li><strong>hidden_size</strong> (<em>int</em>) &#8211; FC Layer size.</li> <li><strong>hidden_size</strong> (<em>int</em>) &#8211; FC Layer size.</li>
<li><strong>context_start</strong> (<em>int</em><em> or </em><em>None</em>) &#8211; context projection length. See <li><strong>context_start</strong> (<em>int</em><em> or </em><em>None</em>) &#8211; context projection length. See
context_projection&#8217;s context_start.</li> context_projection&#8217;s context_start.</li>
<li><strong>pool_type</strong> (<em>BasePoolingType.</em>) &#8211; pooling layer type. See pooling&#8217;s document.</li> <li><strong>pool_type</strong> (<em>BasePoolingType.</em>) &#8211; pooling layer type. See pooling_layer&#8217;s document.</li>
<li><strong>context_proj_name</strong> (<em>basestring</em>) &#8211; context projection layer name. <li><strong>context_proj_layer_name</strong> (<em>basestring</em>) &#8211; context projection layer name.
None if user don&#8217;t care.</li> None if user don&#8217;t care.</li>
<li><strong>context_proj_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; context projection parameter attribute. <li><strong>context_proj_param_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; context projection parameter attribute.
None if user don&#8217;t care.</li> None if user don&#8217;t care.</li>
<li><strong>fc_name</strong> (<em>basestring</em>) &#8211; fc layer name. None if user don&#8217;t care.</li> <li><strong>fc_layer_name</strong> (<em>basestring</em>) &#8211; fc layer name. None if user don&#8217;t care.</li>
<li><strong>fc_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc layer parameter attribute. None if user don&#8217;t care.</li> <li><strong>fc_param_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc layer parameter attribute. None if user don&#8217;t care.</li>
<li><strong>fc_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc bias parameter attribute. False if no bias, <li><strong>fc_bias_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc bias parameter attribute. False if no bias,
None if user don&#8217;t care.</li> None if user don&#8217;t care.</li>
<li><strong>fc_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; fc layer activation type. None means tanh</li> <li><strong>fc_act</strong> (<em>BaseActivation</em>) &#8211; fc layer activation type. None means tanh</li>
<li><strong>pool_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; pooling layer bias attr. None if don&#8217;t care. <li><strong>pool_bias_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; pooling layer bias attr. None if don&#8217;t care.
False if no bias.</li> False if no bias.</li>
<li><strong>fc_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; fc layer extra attribute.</li> <li><strong>fc_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; fc layer extra attribute.</li>
<li><strong>context_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; context projection layer extra attribute.</li> <li><strong>context_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; context projection layer extra attribute.</li>
<li><strong>pool_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; pooling layer extra attribute.</li> <li><strong>pool_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; pooling layer extra attribute.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">output layer name.</p> <tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">output layer name.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
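<p>A minimal sketch using only the core parameters above (<code>emb</code> is an illustrative embedding sequence):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># sketch: 3-word context projection, 128-d FC layer, then pooling
seq_feat = sequence_conv_pool(input=emb,
                              context_len=3,
                              hidden_size=128)
</pre></div>
</div>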
</div>
<div class="section" id="text-conv-pool">
<span id="api-trainer-config-helpers-network-text-conv-pool"></span><h3>text_conv_pool<a class="headerlink" href="#text-conv-pool" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">text_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Text convolution pooling layers helper.</p>
<p>Text input =&gt; Context Projection =&gt; FC Layer =&gt; Pooling =&gt; Output.</p>
<table class="docutils field-list" frame="void" rules="none">
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of output layer(pooling layer name)</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of output layer(pooling layer name)</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; name of input layer</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; name of input layer</li>
<li><strong>context_len</strong> (<em>int</em>) &#8211; context projection length. See <li><strong>context_len</strong> (<em>int</em>) &#8211; context projection length. See
context_projection&#8217;s document.</li> context_projection&#8217;s document.</li>
<li><strong>hidden_size</strong> (<em>int</em>) &#8211; FC Layer size.</li> <li><strong>hidden_size</strong> (<em>int</em>) &#8211; FC Layer size.</li>
<li><strong>context_start</strong> (<em>int</em><em> or </em><em>None</em>) &#8211; context projection length. See <li><strong>context_start</strong> (<em>int</em><em> or </em><em>None</em>) &#8211; context projection length. See
context_projection&#8217;s context_start.</li> context_projection&#8217;s context_start.</li>
<li><strong>pool_type</strong> (<em>BasePoolingType.</em>) &#8211; pooling layer type. See pooling&#8217;s document.</li> <li><strong>pool_type</strong> (<em>BasePoolingType.</em>) &#8211; pooling layer type. See pooling_layer&#8217;s document.</li>
<li><strong>context_proj_name</strong> (<em>basestring</em>) &#8211; context projection layer name. <li><strong>context_proj_layer_name</strong> (<em>basestring</em>) &#8211; context projection layer name.
None if user don&#8217;t care.</li> None if user don&#8217;t care.</li>
<li><strong>context_proj_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; context projection parameter attribute. <li><strong>context_proj_param_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; context projection parameter attribute.
None if user don&#8217;t care.</li> None if user don&#8217;t care.</li>
<li><strong>fc_name</strong> (<em>basestring</em>) &#8211; fc layer name. None if user don&#8217;t care.</li> <li><strong>fc_layer_name</strong> (<em>basestring</em>) &#8211; fc layer name. None if user don&#8217;t care.</li>
<li><strong>fc_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc layer parameter attribute. None if user don&#8217;t care.</li> <li><strong>fc_param_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc layer parameter attribute. None if user don&#8217;t care.</li>
<li><strong>fc_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc bias parameter attribute. False if no bias, <li><strong>fc_bias_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None</em>) &#8211; fc bias parameter attribute. False if no bias,
None if user don&#8217;t care.</li> None if user don&#8217;t care.</li>
<li><strong>fc_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; fc layer activation type. None means tanh</li> <li><strong>fc_act</strong> (<em>BaseActivation</em>) &#8211; fc layer activation type. None means tanh</li>
<li><strong>pool_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; pooling layer bias attr. None if don&#8217;t care. <li><strong>pool_bias_attr</strong> (<em>ParameterAttribute</em><em> or </em><em>None.</em>) &#8211; pooling layer bias attr. None if don&#8217;t care.
False if no bias.</li> False if no bias.</li>
<li><strong>fc_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; fc layer extra attribute.</li> <li><strong>fc_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; fc layer extra attribute.</li>
<li><strong>context_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; context projection layer extra attribute.</li> <li><strong>context_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; context projection layer extra attribute.</li>
<li><strong>pool_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; pooling layer extra attribute.</li> <li><strong>pool_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; pooling layer extra attribute.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">output layer name.</p> <tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">output layer name.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
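<p>Usage mirrors sequence_conv_pool; a one-line sketch with illustrative names:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># sketch: same parameter set as sequence_conv_pool
text_feat = text_conv_pool(input=emb, context_len=3, hidden_size=128)
</pre></div>
</div>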
<h2>Images<a class="headerlink" href="#images" title="Permalink to this headline"></a></h2>
<div class="section" id="img-conv-bn-pool">
<h3>img_conv_bn_pool<a class="headerlink" href="#img-conv-bn-pool" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">img_conv_bn_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Convolution, batch normalization, pooling group. A usage sketch follows the
parameter table below.</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; group name</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; group name</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; layer&#8217;s input</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; layer&#8217;s input</li>
<li><strong>filter_size</strong> (<em>int</em>) &#8211; see img_conv&#8217;s document</li> <li><strong>filter_size</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document</li>
<li><strong>num_filters</strong> (<em>int</em>) &#8211; see img_conv&#8217;s document</li> <li><strong>num_filters</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document</li>
<li><strong>pool_size</strong> (<em>int</em>) &#8211; see img_pool&#8217;s document.</li> <li><strong>pool_size</strong> (<em>int</em>) &#8211; see img_pool_layer&#8217;s document.</li>
<li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; see img_pool&#8217;s document.</li> <li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; see img_pool_layer&#8217;s document.</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; see batch_norm&#8217;s document.</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; see batch_norm_layer&#8217;s document.</li>
<li><strong>groups</strong> (<em>int</em>) &#8211; see img_conv&#8217;s document</li> <li><strong>groups</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document</li>
<li><strong>conv_stride</strong> (<em>int</em>) &#8211; see img_conv&#8217;s document.</li> <li><strong>conv_stride</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document.</li>
<li><strong>conv_padding</strong> (<em>int</em>) &#8211; see img_conv&#8217;s document.</li> <li><strong>conv_padding</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document.</li>
<li><strong>conv_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; see img_conv&#8217;s document.</li> <li><strong>conv_bias_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer&#8217;s document.</li>
<li><strong>num_channel</strong> (<em>int</em>) &#8211; see img_conv&#8217;s document.</li> <li><strong>num_channel</strong> (<em>int</em>) &#8211; see img_conv_layer&#8217;s document.</li>
<li><strong>conv_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; see img_conv&#8217;s document.</li> <li><strong>conv_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer&#8217;s document.</li>
<li><strong>shared_bias</strong> (<em>bool</em>) &#8211; see img_conv&#8217;s document.</li> <li><strong>shared_bias</strong> (<em>bool</em>) &#8211; see img_conv_layer&#8217;s document.</li>
<li><strong>conv_attr</strong> (<em>Extrapaddle.v2.config_base.Layer</em>) &#8211; see img_conv&#8217;s document.</li> <li><strong>conv_layer_attr</strong> (<em>ExtraLayerOutput</em>) &#8211; see img_conv_layer&#8217;s document.</li>
<li><strong>bn_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute.</em>) &#8211; see batch_norm&#8217;s document.</li> <li><strong>bn_param_attr</strong> (<em>ParameterAttribute.</em>) &#8211; see batch_norm_layer&#8217;s document.</li>
<li><strong>bn_bias_attr</strong> &#8211; see batch_norm&#8217;s document.</li> <li><strong>bn_bias_attr</strong> &#8211; see batch_norm_layer&#8217;s document.</li>
<li><strong>bn_attr</strong> &#8211; paddle.v2.attr.ParameterAttribute.</li> <li><strong>bn_layer_attr</strong> &#8211; ParameterAttribute.</li>
<li><strong>pool_stride</strong> (<em>int</em>) &#8211; see img_pool&#8217;s document.</li> <li><strong>pool_stride</strong> (<em>int</em>) &#8211; see img_pool_layer&#8217;s document.</li>
<li><strong>pool_padding</strong> (<em>int</em>) &#8211; see img_pool&#8217;s document.</li> <li><strong>pool_padding</strong> (<em>int</em>) &#8211; see img_pool_layer&#8217;s document.</li>
<li><strong>pool_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; see img_pool&#8217;s document.</li> <li><strong>pool_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see img_pool_layer&#8217;s document.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">Layer groups output</p> <tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">Layer groups output</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
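<p>A hedged sketch of a common conv-bn-pool configuration (all values and the ReluActivation choice are illustrative assumptions):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># sketch: 3x3 conv with 64 filters, batch norm, then 2x2 pooling
feat = img_conv_bn_pool(input=img,
                        filter_size=3,
                        num_filters=64,
                        pool_size=2,
                        pool_stride=2,
                        act=ReluActivation())
</pre></div>
</div>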
</div>
<div class="section" id="img-conv-group">
<h3>img_conv_group<a class="headerlink" href="#img-conv-group" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">img_conv_group</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Image convolution group, used for VGG-style networks.</p>
<p>TODO(yuyang18): Complete docs</p>
<table class="docutils field-list" frame="void" rules="none">
</div>
<div class="section" id="simple-img-conv-pool">
<span id="api-trainer-config-helpers-network-simple-img-conv-pool"></span><h3>simple_img_conv_pool<a class="headerlink" href="#simple-img-conv-pool" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_img_conv_pool</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Simple image convolution and pooling group.</p>
<p>Input =&gt; conv =&gt; pooling. A usage sketch follows the parameter table below.</p>
<table class="docutils field-list" frame="void" rules="none">
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; group name</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; group name</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>filter_size</strong> (<em>int</em>) &#8211; see img_conv for details</li> <li><strong>filter_size</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li>
<li><strong>num_filters</strong> (<em>int</em>) &#8211; see img_conv for details</li> <li><strong>num_filters</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li>
<li><strong>pool_size</strong> (<em>int</em>) &#8211; see img_pool for details</li> <li><strong>pool_size</strong> (<em>int</em>) &#8211; see img_pool_layer for details</li>
<li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; see img_pool for details</li> <li><strong>pool_type</strong> (<em>BasePoolingType</em>) &#8211; see img_pool_layer for details</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; see img_conv for details</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; see img_conv_layer for details</li>
<li><strong>groups</strong> (<em>int</em>) &#8211; see img_conv for details</li> <li><strong>groups</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li>
<li><strong>conv_stride</strong> (<em>int</em>) &#8211; see img_conv for details</li> <li><strong>conv_stride</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li>
<li><strong>conv_padding</strong> (<em>int</em>) &#8211; see img_conv for details</li> <li><strong>conv_padding</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li>
<li><strong>bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; see img_conv for details</li> <li><strong>bias_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer for details</li>
<li><strong>num_channel</strong> (<em>int</em>) &#8211; see img_conv for details</li> <li><strong>num_channel</strong> (<em>int</em>) &#8211; see img_conv_layer for details</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; see img_conv for details</li> <li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; see img_conv_layer for details</li>
<li><strong>shared_bias</strong> (<em>bool</em>) &#8211; see img_conv for details</li> <li><strong>shared_bias</strong> (<em>bool</em>) &#8211; see img_conv_layer for details</li>
<li><strong>conv_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; see img_conv for details</li> <li><strong>conv_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see img_conv_layer for details</li>
<li><strong>pool_stride</strong> (<em>int</em>) &#8211; see img_pool for details</li> <li><strong>pool_stride</strong> (<em>int</em>) &#8211; see img_pool_layer for details</li>
<li><strong>pool_padding</strong> (<em>int</em>) &#8211; see img_pool for details</li> <li><strong>pool_padding</strong> (<em>int</em>) &#8211; see img_pool_layer for details</li>
<li><strong>pool_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; see img_pool for details</li> <li><strong>pool_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; see img_pool_layer for details</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">Layer&#8217;s output</p> <tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">Layer&#8217;s output</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
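<p>A hedged sketch of a LeNet-style conv-pool block (all values are illustrative assumptions):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># sketch: 5x5 conv over a single-channel image, then 2x2 pooling
conv_pool = simple_img_conv_pool(input=img,
                                 filter_size=5,
                                 num_filters=20,
                                 num_channel=1,
                                 pool_size=2,
                                 pool_stride=2)
</pre></div>
</div>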
</div>
<div class="section" id="vgg-16-network">
<h3>vgg_16_network<a class="headerlink" href="#vgg-16-network" title="Permalink to this headline"></a></h3>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">vgg_16_network</code><span class="sig-paren">(</span><em>input_image</em>, <em>num_channels</em>, <em>num_classes=1000</em><span class="sig-paren">)</span></dt>
<dd><p>Same model as <a class="reference external" href="https://gist.github.com/ksimonyan/211839e770f7b538e2d8">https://gist.github.com/ksimonyan/211839e770f7b538e2d8</a>.</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>num_classes</strong> &#8211; </li> <li><strong>num_classes</strong> &#8211; </li>
<li><strong>input_image</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; </li> <li><strong>input_image</strong> (<em>LayerOutput</em>) &#8211; </li>
<li><strong>num_channels</strong> (<em>int</em>) &#8211; </li> <li><strong>num_channels</strong> (<em>int</em>) &#8211; </li>
</ul> </ul>
</td> </td>
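<p>A sketch following the signature above (<code>img</code> is an illustrative 3-channel image layer):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># sketch: ImageNet-style configuration with the default 1000 classes
predict = vgg_16_network(input_image=img, num_channels=3, num_classes=1000)
</pre></div>
</div>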
<h3>LSTM<a class="headerlink" href="#lstm" title="Permalink to this headline"></a></h3>
<div class="section" id="lstmemory-unit">
<h4>lstmemory_unit<a class="headerlink" href="#lstmemory-unit" title="Permalink to this headline"></a></h4>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">lstmemory_unit</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Define the calculations that an LSTM unit performs in a single time step.
This function itself is not a recurrent layer, so it cannot be
directly applied to sequence input. It is always used in a recurrent_group
(see layers.py for more details).</p>
<p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>lstm_step = lstmemory_unit(input=[layer1],
                           size=256,
                           act=TanhActivation(),
                           gate_act=SigmoidActivation(),
                           state_act=TanhActivation())
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; lstmemory unit name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; lstmemory unit name.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstmemory unit size.</li> <li><strong>size</strong> (<em>int</em>) &#8211; lstmemory unit size.</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; Parameter config, None if use default.</li> <li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; Parameter config, None if use default.</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; lstm final activiation type</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; lstm final activiation type</li>
<li><strong>gate_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; lstm gate activiation type</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; lstm gate activiation type</li>
<li><strong>state_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; lstm state activiation type.</li> <li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; lstm state activiation type.</li>
<li><strong>mixed_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; bias parameter attribute of mixed layer. <li><strong>mixed_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute of mixed layer.
False means no bias, None means default bias.</li> False means no bias, None means default bias.</li>
<li><strong>lstm_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; bias parameter attribute of lstm layer. <li><strong>lstm_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute of lstm layer.
False means no bias, None means default bias.</li> False means no bias, None means default bias.</li>
<li><strong>mixed_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; mixed layer&#8217;s extra attribute.</li> <li><strong>mixed_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; mixed layer&#8217;s extra attribute.</li>
<li><strong>lstm_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; lstm layer&#8217;s extra attribute.</li> <li><strong>lstm_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; lstm layer&#8217;s extra attribute.</li>
<li><strong>get_output_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; get output layer&#8217;s extra attribute.</li> <li><strong>get_output_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; get output layer&#8217;s extra attribute.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">lstmemory unit name.</p> <tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">lstmemory unit name.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
</div>
<div class="section" id="lstmemory-group">
<h4>lstmemory_group<a class="headerlink" href="#lstmemory-group" title="Permalink to this headline"></a></h4>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.networks.</code><code class="descname">lstmemory_group</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>lstm_group is a recurrent-layer-group version of Long Short Term Memory. It
does exactly the same calculation as the lstmemory layer (see lstmemory in
layers.py for the maths). A promising benefit is that the LSTM memory
multiplications:
<span class="math">\(W_{xi}x_{t}\)</span>, <span class="math">\(W_{xf}x_{t}\)</span>,
<span class="math">\(W_{xc}x_t\)</span>, <span class="math">\(W_{xo}x_{t}\)</span> are not done in lstmemory_unit, to
speed up the calculations. Consequently, an additional mixed_layer with
full_matrix_projection must be included before lstmemory_unit is called.</p>
<p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>lstm_step = lstmemory_group(input=[layer1],
                            size=256,
                            act=TanhActivation(),
                            gate_act=SigmoidActivation(),
                            state_act=TanhActivation())
</pre></div>
</div>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; lstmemory group name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; lstmemory group name.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstmemory group size.</li> <li><strong>size</strong> (<em>int</em>) &#8211; lstmemory group size.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; is lstm reversed</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; is lstm reversed</li>
<li><strong>param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; Parameter config, None if use default.</li> <li><strong>param_attr</strong> (<em>ParameterAttribute</em>) &#8211; Parameter config, None if use default.</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; lstm final activiation type</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; lstm final activiation type</li>
<li><strong>gate_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; lstm gate activiation type</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; lstm gate activiation type</li>
<li><strong>state_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; lstm state activiation type.</li> <li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; lstm state activiation type.</li>
<li><strong>mixed_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; bias parameter attribute of mixed layer. <li><strong>mixed_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute of mixed layer.
False means no bias, None means default bias.</li> False means no bias, None means default bias.</li>
<li><strong>lstm_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; bias parameter attribute of lstm layer. <li><strong>lstm_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute of lstm layer.
False means no bias, None means default bias.</li> False means no bias, None means default bias.</li>
<li><strong>mixed_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; mixed layer&#8217;s extra attribute.</li> <li><strong>mixed_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; mixed layer&#8217;s extra attribute.</li>
<li><strong>lstm_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; lstm layer&#8217;s extra attribute.</li> <li><strong>lstm_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; lstm layer&#8217;s extra attribute.</li>
<li><strong>get_output_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; get output layer&#8217;s extra attribute.</li> <li><strong>get_output_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; get output layer&#8217;s extra attribute.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">the lstmemory group.</p> <tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">the lstmemory group.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
</div> </div>
<div class="section" id="simple-lstm"> <div class="section" id="simple-lstm">
<h4>simple_lstm<a class="headerlink" href="#simple-lstm" title="永久链接至标题"></a></h4> <h4>simple_lstm<a class="headerlink" href="#simple-lstm" title="永久链接至标题"></a></h4>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_lstm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_lstm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Simple LSTM Cell.</p> <dd><p>Simple LSTM Cell.</p>
<p>It just combines a mixed layer with full_matrix_projection and an lstmemory <p>It just combines a mixed layer with full_matrix_projection and an lstmemory
layer. The simple lstm cell is implemented by the following equations.</p> layer. The simple lstm cell is implemented by the following equations.</p>
...@@ -586,25 +586,25 @@ want to know what lstm is. <a class="reference external" href="http://arxiv.org/ ...@@ -586,25 +586,25 @@ want to know what lstm is. <a class="reference external" href="http://arxiv.org/
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; lstm layer name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; lstm layer name.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstm layer size.</li> <li><strong>size</strong> (<em>int</em>) &#8211; lstm layer size.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li>
<li><strong>mat_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; mixed layer&#8217;s matrix projection parameter attribute.</li> <li><strong>mat_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; mixed layer&#8217;s matrix projection parameter attribute.</li>
<li><strong>bias_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; bias parameter attribute. False means no bias, None <li><strong>bias_param_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias parameter attribute. False means no bias, None
means default bias.</li> means default bias.</li>
<li><strong>inner_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; lstm cell parameter attribute.</li> <li><strong>inner_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; lstm cell parameter attribute.</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; lstm final activation type</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; lstm final activation type</li>
<li><strong>gate_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; lstm gate activation type</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; lstm gate activation type</li>
<li><strong>state_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; lstm state activation type.</li> <li><strong>state_act</strong> (<em>BaseActivation</em>) &#8211; lstm state activation type.</li>
<li><strong>mixed_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; mixed layer&#8217;s extra attribute.</li> <li><strong>mixed_layer_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; mixed layer&#8217;s extra attribute.</li>
<li><strong>lstm_cell_attr</strong> (<em>paddle.v2.attr.ExtraAttribute</em>) &#8211; lstm layer&#8217;s extra attribute.</li> <li><strong>lstm_cell_attr</strong> (<em>ExtraLayerAttribute</em>) &#8211; lstm layer&#8217;s extra attribute.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">lstm layer name.</p> <tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">lstm layer name.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -614,9 +614,9 @@ means default bias.</li> ...@@ -614,9 +614,9 @@ means default bias.</li>
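<p>A minimal usage sketch of the call documented above (the sequence layer <code class="docutils literal"><span class="pre">input_layer</span></code> is an assumption for illustration):</p>
<div class="highlight-python"><div class="highlight"><pre># a forward LSTM of size 128; reverse=True would process the sequence backwards
lstm = paddle.v2.networks.simple_lstm(input=input_layer,
                                      size=128,
                                      reverse=False)
</pre></div>
</div>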
</div> </div>
<div class="section" id="bidirectional-lstm"> <div class="section" id="bidirectional-lstm">
<h4>bidirectional_lstm<a class="headerlink" href="#bidirectional-lstm" title="永久链接至标题"></a></h4> <h4>bidirectional_lstm<a class="headerlink" href="#bidirectional-lstm" title="永久链接至标题"></a></h4>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.networks.</code><code class="descname">bidirectional_lstm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">bidirectional_lstm</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>A bidirectional_lstm is a recurrent unit that iterates over the input <dd><p>A bidirectional_lstm is a recurrent unit that iterates over the input
sequence both in forward and backward orders, and then concatenates the two sequence both in forward and backward orders, and then concatenates the two
outputs to form a final output. However, concatenation of two outputs outputs to form a final output. However, concatenation of two outputs
...@@ -636,7 +636,7 @@ The link goes as follows: ...@@ -636,7 +636,7 @@ The link goes as follows:
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; bidirectional lstm layer name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; bidirectional lstm layer name.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; lstm layer size.</li> <li><strong>size</strong> (<em>int</em>) &#8211; lstm layer size.</li>
<li><strong>return_seq</strong> (<em>bool</em>) &#8211; If set False, outputs of the last time step are <li><strong>return_seq</strong> (<em>bool</em>) &#8211; If set False, outputs of the last time step are
concatenated and returned. concatenated and returned.
...@@ -646,10 +646,10 @@ concatenated and returned.</li> ...@@ -646,10 +646,10 @@ concatenated and returned.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">paddle.v2.config_base.Layer object accroding to the return_seq.</p> <tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">LayerOutput object accroding to the return_seq.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -662,9 +662,9 @@ concatenated and returned.</li> ...@@ -662,9 +662,9 @@ concatenated and returned.</li>
<h3>GRU<a class="headerlink" href="#gru" title="永久链接至标题"></a></h3> <h3>GRU<a class="headerlink" href="#gru" title="永久链接至标题"></a></h3>
<div class="section" id="gru-unit"> <div class="section" id="gru-unit">
<h4>gru_unit<a class="headerlink" href="#gru-unit" title="永久链接至标题"></a></h4> <h4>gru_unit<a class="headerlink" href="#gru-unit" title="永久链接至标题"></a></h4>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.networks.</code><code class="descname">gru_unit</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">gru_unit</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Define calculations that a gated recurrent unit performs in a single time <dd><p>Define calculations that a gated recurrent unit performs in a single time
step. This function itself is not a recurrent layer, so it cannot be step. This function itself is not a recurrent layer, so it cannot be
directly applied to sequence input. This function is almost always used in directly applied to sequence input. This function is almost always used in
...@@ -676,19 +676,19 @@ mechanism.</p> ...@@ -676,19 +676,19 @@ mechanism.</p>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li> <li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; type of the activation</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; type of the activation</li>
<li><strong>gate_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; type of the gate activation</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; type of the gate activation</li>
<li><strong>gru_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li> <li><strong>gru_layer_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">the gru output layer.</p> <tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">the gru output layer.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -698,9 +698,9 @@ mechanism.</p> ...@@ -698,9 +698,9 @@ mechanism.</p>
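<p>A sketch of the typical pattern described above, with gru_unit used inside the step function of a recurrent group (the step wiring and the layer <code class="docutils literal"><span class="pre">sequence_input</span></code> are assumptions):</p>
<div class="highlight-python"><div class="highlight"><pre># hypothetical sketch: gru_unit driven step by step by a recurrent group
def gru_step(current_input):
    return paddle.v2.networks.gru_unit(input=current_input, size=256)

out = paddle.v2.layer.recurrent_group(step=gru_step, input=sequence_input)
</pre></div>
</div>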
</div> </div>
<div class="section" id="gru-group"> <div class="section" id="gru-group">
<h4>gru_group<a class="headerlink" href="#gru-group" title="永久链接至标题"></a></h4> <h4>gru_group<a class="headerlink" href="#gru-group" title="永久链接至标题"></a></h4>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.networks.</code><code class="descname">gru_group</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">gru_group</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>gru_group is a recurrent layer group version of Gated Recurrent Unit. It <dd><p>gru_group is a recurrent layer group version of Gated Recurrent Unit. It
does exactly the same calculation as the grumemory layer. A promising does exactly the same calculation as the grumemory layer. A promising
benefit is that gru hidden states are accessible to the user. This is benefit is that gru hidden states are accessible to the user. This is
...@@ -711,8 +711,8 @@ to use the grumemory, which is relatively faster.</p> ...@@ -711,8 +711,8 @@ to use the grumemory, which is relatively faster.</p>
<p>The example usage is:</p> <p>The example usage is:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">gru</span> <span class="o">=</span> <span class="n">gur_group</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">],</span> <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">gru</span> <span class="o">=</span> <span class="n">gur_group</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">],</span>
<span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span>
<span class="n">act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">Activation</span><span class="o">.</span><span class="n">Tanh</span><span class="p">(),</span> <span class="n">act</span><span class="o">=</span><span class="n">TanhActivation</span><span class="p">(),</span>
<span class="n">gate_act</span><span class="o">=</span><span class="n">paddle</span><span class="o">.</span><span class="n">v2</span><span class="o">.</span><span class="n">Activation</span><span class="o">.</span><span class="n">Sigmoid</span><span class="p">())</span> <span class="n">gate_act</span><span class="o">=</span><span class="n">SigmoidActivation</span><span class="p">())</span>
</pre></div> </pre></div>
</div> </div>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
...@@ -720,21 +720,21 @@ to use the grumemory, which is relatively faster.</p> ...@@ -720,21 +720,21 @@ to use the grumemory, which is relatively faster.</p>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li> <li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; type of the activation</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; type of the activation</li>
<li><strong>gate_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; type of the gate activation</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; type of the gate activation</li>
<li><strong>gru_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li> <li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li>
<li><strong>gru_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li> <li><strong>gru_layer_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">the gru group.</p> <tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">the gru group.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -744,23 +744,23 @@ to use the grumemory, which is relatively faster.</p> ...@@ -744,23 +744,23 @@ to use the grumemory, which is relatively faster.</p>
</div> </div>
<div class="section" id="simple-gru"> <div class="section" id="simple-gru">
<h4>simple_gru<a class="headerlink" href="#simple-gru" title="永久链接至标题"></a></h4> <h4>simple_gru<a class="headerlink" href="#simple-gru" title="永久链接至标题"></a></h4>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_gru</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_gru</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>You may see gru_step, grumemory in layers.py, gru_unit, gru_group, <dd><p>You may see gru_step_layer, grumemory in layers.py, gru_unit, gru_group,
simple_gru in network.py. The reason why there are so many interfaces is simple_gru in network.py. The reason why there are so many interfaces is
that we have two ways to implement recurrent neural networks. One way is to that we have two ways to implement recurrent neural networks. One way is to
use one complete layer to implement rnn (including simple rnn, gru and lstm) use one complete layer to implement rnn (including simple rnn, gru and lstm)
with multiple time steps, such as recurrent, lstmemory, grumemory. But, with multiple time steps, such as recurrent_layer, lstmemory, grumemory. But,
the multiplication operation <span class="math">\(W x_t\)</span> is not computed in these layers. the multiplication operation <span class="math">\(W x_t\)</span> is not computed in these layers.
See details in their interfaces in layers.py. See details in their interfaces in layers.py.
The other implementation is to use a recurrent group which can assemble a The other implementation is to use a recurrent group which can assemble a
series of layers to compute rnn step by step. This way is flexible for series of layers to compute rnn step by step. This way is flexible for
attention mechanisms or other complex connections.</p> attention mechanisms or other complex connections.</p>
<ul class="simple"> <ul class="simple">
<li>gru_step: computes only one rnn step. It needs a memory as input <li>gru_step_layer: computes only one rnn step. It needs a memory as input
and can be used in a recurrent group.</li> and can be used in a recurrent group.</li>
<li>gru_unit: a wrapper of gru_step with memory.</li> <li>gru_unit: a wrapper of gru_step_layer with memory.</li>
<li>gru_group: a GRU cell implemented by a combination of multiple layers in <li>gru_group: a GRU cell implemented by a combination of multiple layers in
recurrent group. recurrent group.
But <span class="math">\(W x_t\)</span> is not done in group.</li> But <span class="math">\(W x_t\)</span> is not done in group.</li>
...@@ -781,21 +781,21 @@ gru_group, and gru_group is relatively better than simple_gru.</p> ...@@ -781,21 +781,21 @@ gru_group, and gru_group is relatively better than simple_gru.</p>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li> <li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; type of the activation</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; type of the activation</li>
<li><strong>gate_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; type of the gate activation</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; type of the gate activation</li>
<li><strong>gru_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li> <li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li>
<li><strong>gru_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li> <li><strong>gru_layer_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">the gru group.</p> <tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">the gru group.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -805,9 +805,9 @@ gru_group, and gru_group is relatively better than simple_gru.</p> ...@@ -805,9 +805,9 @@ gru_group, and gru_group is relatively better than simple_gru.</p>
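<p>A short sketch of the trade-off described above (the input layer is an assumption; simple_gru2 is documented in the next section):</p>
<div class="highlight-python"><div class="highlight"><pre># recurrent-group based GRU: flexible, but relatively slow
gru_a = paddle.v2.networks.simple_gru(input=input_layer, size=256)
# grumemory based GRU: faster, but less flexible
gru_b = paddle.v2.networks.simple_gru2(input=input_layer, size=256)
</pre></div>
</div>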
</div> </div>
<div class="section" id="simple-gru2"> <div class="section" id="simple-gru2">
<h4>simple_gru2<a class="headerlink" href="#simple-gru2" title="永久链接至标题"></a></h4> <h4>simple_gru2<a class="headerlink" href="#simple-gru2" title="永久链接至标题"></a></h4>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_gru2</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_gru2</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>simple_gru2 is the same as simple_gru, but uses grumemory instead. <dd><p>simple_gru2 is the same as simple_gru, but uses grumemory instead.
Please see grumemory in layers.py for more detail about the maths. Please see grumemory in layers.py for more detail about the maths.
simple_gru2 is faster than simple_gru.</p> simple_gru2 is faster than simple_gru.</p>
...@@ -820,21 +820,21 @@ simple_gru2 is faster than simple_gru.</p> ...@@ -820,21 +820,21 @@ simple_gru2 is faster than simple_gru.</p>
<col class="field-body" /> <col class="field-body" />
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer name.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer name.</li>
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the gru group.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li> <li><strong>size</strong> (<em>int</em>) &#8211; hidden size of the gru.</li>
<li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li> <li><strong>reverse</strong> (<em>bool</em>) &#8211; whether to process the input data in a reverse order</li>
<li><strong>act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; type of the activation</li> <li><strong>act</strong> (<em>BaseActivation</em>) &#8211; type of the activation</li>
<li><strong>gate_act</strong> (<em>paddle.v2.Activation.Base</em>) &#8211; type of the gate activation</li> <li><strong>gate_act</strong> (<em>BaseActivation</em>) &#8211; type of the gate activation</li>
<li><strong>gru_bias_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li> <li><strong>gru_bias_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; bias. False means no bias, None means default bias.</li>
<li><strong>gru_attr</strong> (<em>paddle.v2.attr.ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li> <li><strong>gru_layer_attr</strong> (<em>ParameterAttribute|False</em>) &#8211; Extra parameter attribute of the gru layer.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">the gru group.</p> <tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">the gru group.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -844,9 +844,9 @@ simple_gru2 is faster than simple_gru.</p> ...@@ -844,9 +844,9 @@ simple_gru2 is faster than simple_gru.</p>
</div> </div>
<div class="section" id="bidirectional-gru"> <div class="section" id="bidirectional-gru">
<h4>bidirectional_gru<a class="headerlink" href="#bidirectional-gru" title="永久链接至标题"></a></h4> <h4>bidirectional_gru<a class="headerlink" href="#bidirectional-gru" title="永久链接至标题"></a></h4>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.networks.</code><code class="descname">bidirectional_gru</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">bidirectional_gru</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>A bidirectional_gru is a recurrent unit that iterates over the input <dd><p>A bidirectional_gru is a recurrent unit that iterates over the input
sequence both in forward and backward orders, and then concatenates the two sequence both in forward and backward orders, and then concatenates the two
outputs to form a final output. However, concatenation of two outputs outputs to form a final output. However, concatenation of two outputs
...@@ -862,7 +862,7 @@ just add them together.</p> ...@@ -862,7 +862,7 @@ just add them together.</p>
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; bidirectional gru layer name.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; bidirectional gru layer name.</li>
<li><strong>input</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; input layer.</li> <li><strong>input</strong> (<em>LayerOutput</em>) &#8211; input layer.</li>
<li><strong>size</strong> (<em>int</em>) &#8211; gru layer size.</li> <li><strong>size</strong> (<em>int</em>) &#8211; gru layer size.</li>
<li><strong>return_seq</strong> (<em>bool</em>) &#8211; If set False, outputs of the last time step are <li><strong>return_seq</strong> (<em>bool</em>) &#8211; If set False, outputs of the last time step are
concatenated and returned. concatenated and returned.
...@@ -872,10 +872,10 @@ concatenated and returned.</li> ...@@ -872,10 +872,10 @@ concatenated and returned.</li>
</ul> </ul>
</td> </td>
</tr> </tr>
<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">paddle.v2.config_base.Layer object.</p> <tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">LayerOutput object.</p>
</td> </td>
</tr> </tr>
<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">paddle.v2.config_base.Layer</p> <tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">LayerOutput</p>
</td> </td>
</tr> </tr>
</tbody> </tbody>
...@@ -886,9 +886,9 @@ concatenated and returned.</li> ...@@ -886,9 +886,9 @@ concatenated and returned.</li>
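<p>A sketch of the <code class="docutils literal"><span class="pre">return_seq</span></code> switch (the input layer is an assumption):</p>
<div class="highlight-python"><div class="highlight"><pre># last-step outputs of the two directions, concatenated into one vector
vec = paddle.v2.networks.bidirectional_gru(input=input_layer, size=128,
                                           return_seq=False)
# the whole output sequences of the two directions
seq = paddle.v2.networks.bidirectional_gru(input=input_layer, size=128,
                                           return_seq=True)
</pre></div>
</div>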
</div> </div>
<div class="section" id="simple-attention"> <div class="section" id="simple-attention">
<h3>simple_attention<a class="headerlink" href="#simple-attention" title="永久链接至标题"></a></h3> <h3>simple_attention<a class="headerlink" href="#simple-attention" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_attention</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">simple_attention</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>Calculate and then return a context vector by attention mechanism. <dd><p>Calculate and then return a context vector by attention mechanism.
Size of the context vector equals to size of the encoded_sequence.</p> Size of the context vector equals to size of the encoded_sequence.</p>
<div class="math"> <div class="math">
...@@ -912,18 +912,18 @@ Align and Translate</strong> for more details. The link is as follows: ...@@ -912,18 +912,18 @@ Align and Translate</strong> for more details. The link is as follows:
<tbody valign="top"> <tbody valign="top">
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple"> <tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>name</strong> (<em>basestring</em>) &#8211; name of the attention model.</li> <li><strong>name</strong> (<em>basestring</em>) &#8211; name of the attention model.</li>
<li><strong>softmax_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; parameter attribute of sequence softmax <li><strong>softmax_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; parameter attribute of sequence softmax
that is used to produce attention weight</li> that is used to produce attention weight</li>
<li><strong>weight_act</strong> (<em>Activation</em>) &#8211; activation of the attention model</li> <li><strong>weight_act</strong> (<em>Activation</em>) &#8211; activation of the attention model</li>
<li><strong>encoded_sequence</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; output of the encoder</li> <li><strong>encoded_sequence</strong> (<em>LayerOutput</em>) &#8211; output of the encoder</li>
<li><strong>encoded_proj</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; attention weight is computed by a feed forward neural <li><strong>encoded_proj</strong> (<em>LayerOutput</em>) &#8211; attention weight is computed by a feed forward neural
network which has two inputs: decoder&#8217;s hidden state network which has two inputs: decoder&#8217;s hidden state
of previous time step and encoder&#8217;s output. of previous time step and encoder&#8217;s output.
encoded_proj is output of the feed-forward network for encoded_proj is output of the feed-forward network for
encoder&#8217;s output. Here we pre-compute it outside encoder&#8217;s output. Here we pre-compute it outside
simple_attention for speed consideration.</li> simple_attention for speed consideration.</li>
<li><strong>decoder_state</strong> (<em>paddle.v2.config_base.Layer</em>) &#8211; hidden state of decoder in previous time step</li> <li><strong>decoder_state</strong> (<em>LayerOutput</em>) &#8211; hidden state of decoder in previous time step</li>
<li><strong>transform_param_attr</strong> (<em>paddle.v2.attr.ParameterAttribute</em>) &#8211; parameter attribute of the feed-forward <li><strong>transform_param_attr</strong> (<em>ParameterAttribute</em>) &#8211; parameter attribute of the feed-forward
network that takes decoder_state as inputs to network that takes decoder_state as inputs to
compute attention weight.</li> compute attention weight.</li>
</ul> </ul>
...@@ -942,9 +942,9 @@ compute attention weight.</li> ...@@ -942,9 +942,9 @@ compute attention weight.</li>
<h2>Miscs<a class="headerlink" href="#miscs" title="永久链接至标题"></a></h2> <h2>Miscs<a class="headerlink" href="#miscs" title="永久链接至标题"></a></h2>
<div class="section" id="dropout-layer"> <div class="section" id="dropout-layer">
<h3>dropout_layer<a class="headerlink" href="#dropout-layer" title="永久链接至标题"></a></h3> <h3>dropout_layer<a class="headerlink" href="#dropout-layer" title="永久链接至标题"></a></h3>
<dl class="class"> <dl class="function">
<dt> <dt>
<em class="property">class </em><code class="descclassname">paddle.v2.networks.</code><code class="descname">dropout_layer</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.networks.</code><code class="descname">dropout_layer</code><span class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span></dt>
<dd><p>&#64;TODO(yuyang18): Add comments.</p> <dd><p>&#64;TODO(yuyang18): Add comments.</p>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
<col class="field-name" /> <col class="field-name" />
......
...@@ -192,12 +192,50 @@ ...@@ -192,12 +192,50 @@
<h1>Data Reader Interface and DataSets<a class="headerlink" href="#data-reader-interface-and-datasets" title="永久链接至标题"></a></h1> <h1>Data Reader Interface and DataSets<a class="headerlink" href="#data-reader-interface-and-datasets" title="永久链接至标题"></a></h1>
<div class="section" id="datatypes"> <div class="section" id="datatypes">
<h2>DataTypes<a class="headerlink" href="#datatypes" title="永久链接至标题"></a></h2> <h2>DataTypes<a class="headerlink" href="#datatypes" title="永久链接至标题"></a></h2>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.data_type.</code><code class="descname">dense_array</code><span class="sig-paren">(</span><em>dim</em>, <em>seq_type=0</em><span class="sig-paren">)</span></dt>
<dd><p>Dense Array. It means the input feature is a dense array of float type.
For example, if the input is an image with 28*28 pixels, the input of
Paddle neural network could be a dense vector with dimension 784 or a
numpy array with shape (28, 28).</p>
<p>For the 2-D convolution operation, each sample in one mini-batch must
currently have the same size in PaddlePaddle. However, variable-dimension
features across mini-batches are supported. In that case the param dim is not
used; the data reader must yield numpy arrays, and the data feeder will
set the data shape correctly.</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
<li><strong>dim</strong> (<em>int</em>) &#8211; dimension of this vector.</li>
<li><strong>seq_type</strong> (<em>int</em>) &#8211; sequence type of input.</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">An input type object.</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">InputType</p>
</td>
</tr>
</tbody>
</table>
</dd></dl>
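<p>A small sketch of the variable-dimension case described above (the reader function is an illustrative assumption):</p>
<div class="highlight-python"><div class="highlight"><pre>import numpy as np
import paddle.v2 as paddle

# dim is not used for variable-dimension input, so a placeholder is fine
image = paddle.data_type.dense_array(784)

def reader():
    # yield numpy arrays directly; the data feeder sets the shape
    yield (np.zeros((28, 28), dtype='float32'),)
</pre></div>
</div>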
<dl class="function"> <dl class="function">
<dt> <dt>
<code class="descclassname">paddle.v2.data_type.</code><code class="descname">dense_vector</code><span class="sig-paren">(</span><em>dim</em>, <em>seq_type=0</em><span class="sig-paren">)</span></dt> <code class="descclassname">paddle.v2.data_type.</code><code class="descname">dense_vector</code><span class="sig-paren">(</span><em>dim</em>, <em>seq_type=0</em><span class="sig-paren">)</span></dt>
<dd><p>Dense Vector. It means the input feature is a dense float vector. For example, <dd><p>Dense Array. It means the input feature is a dense array of float type.
if the input is an image with 28*28 pixels, the input of Paddle neural For example, if the input is an image with 28*28 pixels, the input of
network should be a dense vector with dimension 784.</p> Paddle neural network could be a dense vector with dimension 784 or a
numpy array with shape (28, 28).</p>
<p>For the 2-D convolution operation, each sample in one mini-batch must
currently have the same size in PaddlePaddle. However, variable-dimension
features across mini-batches are supported. In that case the param dim is not
used; the data reader must yield numpy arrays, and the data feeder will
set the data shape correctly.</p>
<table class="docutils field-list" frame="void" rules="none"> <table class="docutils field-list" frame="void" rules="none">
<col class="field-name" /> <col class="field-name" />
<col class="field-body" /> <col class="field-body" />
......
...@@ -193,7 +193,7 @@ ...@@ -193,7 +193,7 @@
</div> </div>
<div class="section" id="task-queue"> <div class="section" id="task-queue">
<span id="task-queue"></span><h2>Task Queue<a class="headerlink" href="#task-queue" title="永久链接至标题"></a></h2> <span id="task-queue"></span><h2>Task Queue<a class="headerlink" href="#task-queue" title="永久链接至标题"></a></h2>
<p>As mentioned in <a class="reference internal" href="README.html"><span class="doc">distributed training design doc</span></a>, a <em>task</em> is a data shard that the master server assigns to the trainer process to train on. A task consists of one or multiple <em>blocks</em> from one or multiple files. The master server maintains <em>task queues</em> to track the training progress.</p> <p>As mentioned in <a class="reference internal" href="README.html"><span class="doc">distributed training design doc</span></a>, a <em>task</em> is a data shard that the master server assigns to the trainer process to train on. A task consists of one or multiple <em>chunks</em> from one or multiple files. The master server maintains <em>task queues</em> to track the training progress.</p>
<div class="section" id="task-queue-creation"> <div class="section" id="task-queue-creation">
<span id="task-queue-creation"></span><h3>Task Queue Creation<a class="headerlink" href="#task-queue-creation" title="永久链接至标题"></a></h3> <span id="task-queue-creation"></span><h3>Task Queue Creation<a class="headerlink" href="#task-queue-creation" title="永久链接至标题"></a></h3>
<ol> <ol>
...@@ -204,21 +204,21 @@ ...@@ -204,21 +204,21 @@
</pre></div> </pre></div>
</div> </div>
</li> </li>
<li><p class="first">The master server will scan through each RecordIO file to generate the <em>block index</em> and know how many blocks does each file have. A block can be referenced by the file path and the index of the block within the file. The block index is in memory data structure that enables fast access to each block, and the index of the block with the file is an integer start from 0, representing the n-th block within the file.</p> <li><p class="first">The master server will scan through each RecordIO file to generate the <em>chunk index</em> and know how many chunks does each file have. A chunk can be referenced by the file path and the index of the chunk within the file. The chunk index is in memory data structure that enables fast access to each chunk, and the index of the chunk with the file is an integer start from 0, representing the n-th chunk within the file.</p>
<p>The definition of the block is:</p> <p>The definition of the chunk is:</p>
<div class="highlight-go"><div class="highlight"><pre><span></span><span class="kd">type</span> <span class="nx">Block</span> <span class="kd">struct</span> <span class="p">{</span> <div class="highlight-go"><div class="highlight"><pre><span></span><span class="kd">type</span> <span class="nx">Chunk</span> <span class="kd">struct</span> <span class="p">{</span>
<span class="nx">Idx</span> <span class="kt">int</span> <span class="c1">// index of the block within the file</span> <span class="nx">Idx</span> <span class="kt">int</span> <span class="c1">// index of the chunk within the file</span>
<span class="nx">Path</span> <span class="kt">string</span> <span class="nx">Path</span> <span class="kt">string</span>
<span class="nx">Index</span> <span class="nx">recordio</span><span class="p">.</span><span class="nx">Index</span> <span class="c1">// block index</span> <span class="nx">Index</span> <span class="nx">recordio</span><span class="p">.</span><span class="nx">Index</span> <span class="c1">// chunk index</span>
<span class="p">}</span> <span class="p">}</span>
</pre></div> </pre></div>
</div> </div>
</li> </li>
<li><p class="first">Blocks are grouped into tasks, and tasks are filled into the todo queue. The pending queue and the done queue are initialized with no element.</p> <li><p class="first">Chunks are grouped into tasks, and tasks are filled into the todo queue. The pending queue and the done queue are initialized with no element.</p>
<p>The definition of the task is:</p> <p>The definition of the task is:</p>
<div class="highlight-go"><div class="highlight"><pre><span></span><span class="kd">type</span> <span class="nx">Task</span> <span class="kd">struct</span> <span class="p">{</span> <div class="highlight-go"><div class="highlight"><pre><span></span><span class="kd">type</span> <span class="nx">Task</span> <span class="kd">struct</span> <span class="p">{</span>
<span class="nx">Index</span> <span class="kt">int</span> <span class="nx">Index</span> <span class="kt">int</span>
<span class="nx">Blocks</span> <span class="p">[]</span><span class="nx">Block</span> <span class="nx">Chunks</span> <span class="p">[]</span><span class="nx">Chunk</span>
<span class="p">}</span> <span class="p">}</span>
</pre></div> </pre></div>
</div> </div>
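<p>For illustration only, the grouping step sketched in Python (the Go structs above are the actual definitions; the chunk list and the task size are assumptions):</p>
<div class="highlight-python"><div class="highlight"><pre># hypothetical sketch: pack consecutive chunks into fixed-size tasks
def fill_todo_queue(chunks, chunks_per_task=8):
    todo = []
    for i in range(0, len(chunks), chunks_per_task):
        todo.append({'Index': i // chunks_per_task,
                     'Chunks': chunks[i:i + chunks_per_task]})
    # the pending and done queues start out empty
    return todo
</pre></div>
</div>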
......
...@@ -233,7 +233,7 @@ name:sparse-n-1 ...@@ -233,7 +233,7 @@ name:sparse-n-1
<div class="highlight-c"><div class="highlight"><pre><span></span><span class="kt">int</span> <span class="nf">paddle_begin_init_params</span><span class="p">(</span><span class="n">paddle_pserver_client</span><span class="o">*</span> <span class="n">client</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">config_proto</span><span class="p">);</span> <div class="highlight-c"><div class="highlight"><pre><span></span><span class="kt">int</span> <span class="nf">paddle_begin_init_params</span><span class="p">(</span><span class="n">paddle_pserver_client</span><span class="o">*</span> <span class="n">client</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">config_proto</span><span class="p">);</span>
</pre></div> </pre></div>
</div> </div>
<p>The selected trainer&#8217;s call to <code class="docutils literal"><span class="pre">paddle_begin_init_params</span></code> will return 1, and the other trainers&#8217; call to <code class="docutils literal"><span class="pre">paddle_begin_init_params</span></code> will block until initialization is done, and return 0. As illustrated below:</p> <p>The selected trainer&#8217;s call to <code class="docutils literal"><span class="pre">paddle_begin_init_params</span></code> will return 1, and the other trainers&#8217; call to <code class="docutils literal"><span class="pre">paddle_begin_init_params</span></code> will return 0. <code class="docutils literal"><span class="pre">paddle_get_params</span></code> will be blocked until initialization is completed. As illustrated below:</p>
<p><img src="./src/pserver_init.png"></p> <p><img src="./src/pserver_init.png"></p>
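<p>For illustration, the trainer-side call sequence sketched in Python-style pseudocode (the wrapper names mirror the C functions below and are assumptions, not a real binding):</p>
<div class="highlight-python"><div class="highlight"><pre># hypothetical sketch of the trainer selection protocol
if paddle_begin_init_params(client):   # returns 1 only for the selected trainer
    for param in parameters:
        paddle_init_param(client, param, param.config)
    paddle_finish_init_params(client)
# every trainer blocks here until initialization is completed
paddle_get_params(client, names, dst)
</pre></div>
</div>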
</div> </div>
</div> </div>
...@@ -266,16 +266,13 @@ name:sparse-n-1 ...@@ -266,16 +266,13 @@ name:sparse-n-1
<span class="cm"> *</span> <span class="cm"> *</span>
<span class="cm"> * paddle_begin_init_params will be called from multiple trainers,</span> <span class="cm"> * paddle_begin_init_params will be called from multiple trainers,</span>
<span class="cm"> * only one trainer will be selected to initialize the parameters on</span> <span class="cm"> * only one trainer will be selected to initialize the parameters on</span>
<span class="cm"> * parameter servers. Other trainers will be blocked until the</span> <span class="cm"> * parameter servers. Other trainers need to get the initialized</span>
<span class="cm"> * initialization is done, and they need to get the initialized</span>
<span class="cm"> * parameters from parameter servers using @paddle_get_params.</span> <span class="cm"> * parameters from parameter servers using @paddle_get_params.</span>
<span class="cm"> *</span> <span class="cm"> *</span>
<span class="cm"> * @param pserver_config_proto serialized parameter server configuration in</span>
<span class="cm"> * Protocol Buffers format.</span>
<span class="cm"> * @return 1 if the trainer is selected to initialize parameter</span> <span class="cm"> * @return 1 if the trainer is selected to initialize parameter</span>
<span class="cm"> * servers, otherwise 0.</span> <span class="cm"> * servers, otherwise 0.</span>
<span class="cm"> */</span> <span class="cm"> */</span>
<span class="kt">int</span> <span class="nf">paddle_begin_init_params</span><span class="p">(</span><span class="n">paddle_pserver_client</span><span class="o">*</span> <span class="n">client</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">pserver_config_proto</span><span class="p">);</span> <span class="kt">int</span> <span class="nf">paddle_begin_init_params</span><span class="p">(</span><span class="n">paddle_pserver_client</span><span class="o">*</span> <span class="n">client</span><span class="p">);</span>
<span class="cm">/**</span> <span class="cm">/**</span>
<span class="cm"> * @brief paddle_init_param initializes the parameter on parameter</span> <span class="cm"> * @brief paddle_init_param initializes the parameter on parameter</span>
...@@ -283,12 +280,13 @@ name:sparse-n-1 ...@@ -283,12 +280,13 @@ name:sparse-n-1
<span class="cm"> *</span> <span class="cm"> *</span>
<span class="cm"> * @param param the parameter to initialize.</span> <span class="cm"> * @param param the parameter to initialize.</span>
<span class="cm"> * @param param_config_proto the configuration for the parameter.</span> <span class="cm"> * @param param_config_proto the configuration for the parameter.</span>
<span class="cm"> * @param config_len the length of param_config_proto</span>
<span class="cm"> * @return 0 if successful, otherwise -1. On failure, the trainer</span> <span class="cm"> * @return 0 if successful, otherwise -1. On failure, the trainer</span>
<span class="cm"> * needs to restart the entire initialization process (starting from</span> <span class="cm"> * needs to restart the entire initialization process (starting from</span>
<span class="cm"> * @paddle_begin_init_param). Or simply exit the program and wait for</span> <span class="cm"> * @paddle_begin_init_param). Or simply exit the program and wait for</span>
<span class="cm"> * the cluster management system to restart the trainer.</span> <span class="cm"> * the cluster management system to restart the trainer.</span>
<span class="cm"> */</span> <span class="cm"> */</span>
<span class="kt">int</span> <span class="nf">paddle_init_param</span><span class="p">(</span><span class="n">paddle_pserver_client</span><span class="o">*</span> <span class="n">client</span><span class="p">,</span> <span class="n">paddle_parameter</span> <span class="n">params</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">param_config_proto</span><span class="p">);</span> <span class="kt">int</span> <span class="nf">paddle_init_param</span><span class="p">(</span><span class="n">paddle_pserver_client</span><span class="o">*</span> <span class="n">client</span><span class="p">,</span> <span class="n">paddle_parameter</span> <span class="n">param</span><span class="p">,</span> <span class="k">const</span> <span class="kt">unsigned</span> <span class="kt">char</span><span class="o">*</span> <span class="n">param_config_proto</span><span class="p">,</span> <span class="kt">int</span> <span class="n">config_len</span><span class="p">);</span>
<span class="cm">/**</span> <span class="cm">/**</span>
<span class="cm"> * @brief paddle_finish_init_params tells parameter servers client has</span> <span class="cm"> * @brief paddle_finish_init_params tells parameter servers client has</span>
...@@ -315,6 +313,9 @@ name:sparse-n-1 ...@@ -315,6 +313,9 @@ name:sparse-n-1
<span class="cm">/**</span> <span class="cm">/**</span>
<span class="cm"> * @brief paddle_get_params gets parameters from parameter servers.</span> <span class="cm"> * @brief paddle_get_params gets parameters from parameter servers.</span>
<span class="cm"> *</span> <span class="cm"> *</span>
<span class="cm"> * paddle_get_params will block until parameters are initialized on</span>
<span class="cm"> * the parameter servers.</span>
<span class="cm"> *</span>
<span class="cm"> * @param names the array of names of the parameters to get.</span> <span class="cm"> * @param names the array of names of the parameters to get.</span>
<span class="cm"> * @param dst the destination array of parameters to save to.</span> <span class="cm"> * @param dst the destination array of parameters to save to.</span>
<span class="cm"> * @param len the length of the names array and the paddle_parameter</span> <span class="cm"> * @param len the length of the names array and the paddle_parameter</span>
......
<div class="section" id="design-doc-the-c-class-parameters">
<span id="design-doc-the-c-class-parameters"></span><h1>Design Doc: The C++ Class <code class="docutils literal"><span class="pre">Parameters</span></code><a class="headerlink" href="#design-doc-the-c-class-parameters" title="永久链接至标题"></a></h1>
<p><code class="docutils literal"><span class="pre">Parameters</span></code> is a concept we designed in Paddle V2 API. <code class="docutils literal"><span class="pre">Parameters</span></code> is a container of parameters, and make Paddle can shared parameter between topologies. We described usages of <code class="docutils literal"><span class="pre">Parameter</span></code> in <a class="reference internal" href="api.html"><span class="doc">api.md</span></a>.</p>
<p>We implemented Parameters in Python when designing the V2 API. The current implementation has several defects:</p>
<ul class="simple">
<li>We simply use <code class="docutils literal"><span class="pre">memcpy</span></code> to share Parameters between topologies, which is very inefficient.</li>
<li>Parameters are not shared during training; we only trigger a <code class="docutils literal"><span class="pre">memcpy</span></code> when training starts.</li>
</ul>
<p>It is necessary to implement Parameters on the C++ side. However, this requires refactoring Paddle, because Paddle was originally designed to train only one topology, i.e., each GradientMachine contains its Parameter as a data member. In the current Paddle implementation, there are three concepts associated with <code class="docutils literal"><span class="pre">Parameters</span></code>:</p>
<ol class="simple">
<li><code class="docutils literal"><span class="pre">paddle::Parameter</span></code>. A <code class="docutils literal"><span class="pre">Parameters</span></code> is a container for <code class="docutils literal"><span class="pre">paddle::Parameter</span></code>.
It is evident that we should use <code class="docutils literal"><span class="pre">paddle::Parameter</span></code> when developing <code class="docutils literal"><span class="pre">Parameters</span></code>.
However, the <code class="docutils literal"><span class="pre">Parameter</span></code> class contains many functions and does not have a clear interface.
It contains <code class="docutils literal"><span class="pre">create/store</span> <span class="pre">Parameter</span></code>, <code class="docutils literal"><span class="pre">serialize/deserialize</span></code>, <code class="docutils literal"><span class="pre">optimize(i.e</span> <span class="pre">SGD)</span></code>, <code class="docutils literal"><span class="pre">randomize/zero</span></code>.
When we developing <code class="docutils literal"><span class="pre">Parameters</span></code>, we only use <code class="docutils literal"><span class="pre">create/store</span> <span class="pre">Parameter</span></code> functionality.
We should extract functionalities of Parameter into many classes to clean Paddle CPP implementation.</li>
<li><code class="docutils literal"><span class="pre">paddle::GradientMachine</span></code> and its sub-classes, e.g., <code class="docutils literal"><span class="pre">paddle::MultiGradientMachine</span></code>, <code class="docutils literal"><span class="pre">paddle::NeuralNetwork</span></code>.
We should pass <code class="docutils literal"><span class="pre">Parameters</span></code> to <code class="docutils literal"><span class="pre">paddle::GradientMachine</span></code> when <code class="docutils literal"><span class="pre">forward/backward</span></code> to avoid <code class="docutils literal"><span class="pre">memcpy</span></code> between topologies.
Also, we should handle multi-GPU/CPU training, because <code class="docutils literal"><span class="pre">forward</span></code> and <code class="docutils literal"><span class="pre">backward</span></code> would perform on multi-GPUs and multi-CPUs.
<code class="docutils literal"><span class="pre">Parameters</span></code> should dispatch the parameter value to each device, and gather the parameter gradient from each device.</li>
<li><code class="docutils literal"><span class="pre">paddle::ParameterUpdater</span></code>. The ParameterUpdater is used to update parameters in Paddle.
So <code class="docutils literal"><span class="pre">Parameters</span></code> should be used by <code class="docutils literal"><span class="pre">paddle::ParameterUpdater</span></code>, and <code class="docutils literal"><span class="pre">paddle::ParameterUpdater</span></code> should optimize <code class="docutils literal"><span class="pre">Parameters</span></code> (by SGD).</li>
</ol>
<p>The step-by-step approach for implementing Parameters in the Paddle C++ core is listed below. Each step should be a separate PR that can be merged into Paddle one by one.</p>
<ol class="simple">
<li>Clean <code class="docutils literal"><span class="pre">paddle::Parameter</span></code> interface. Extract the functionalities of <code class="docutils literal"><span class="pre">paddle::Parameter</span></code> to prepare for the implementation of Parameters.</li>
<li>Implementation a <code class="docutils literal"><span class="pre">Parameters</span></code> class. It just stores the <code class="docutils literal"><span class="pre">paddle::Parameter</span></code> inside. Make <code class="docutils literal"><span class="pre">GradientMachine</span></code> uses <code class="docutils literal"><span class="pre">Parameters</span></code> as a class member.</li>
<li>Make <code class="docutils literal"><span class="pre">Parameters</span></code> support Multi-CPU and Multi-GPU training to prepare for sharing <code class="docutils literal"><span class="pre">Parameter</span></code> between topologies.
Because we need share <code class="docutils literal"><span class="pre">Parameters</span></code> between topologies, it is <code class="docutils literal"><span class="pre">Parameters</span></code>&#8216;s response to exchange Parameters between GPUs.
<code class="docutils literal"><span class="pre">GradientMachine</span></code> should not handle how to exchange Parameters because <code class="docutils literal"><span class="pre">GradientMachine</span></code> only used to train one topology and we need to support train many topologies in Paddle, i.e., there could be many GradientMachines use one <code class="docutils literal"><span class="pre">Parameters</span></code>.<ul>
<li>We should use a global function to exchange Parameters between GPUs, not a member function in <code class="docutils literal"><span class="pre">Parameters</span></code>. The <code class="docutils literal"><span class="pre">MultiGradientMachine</span></code> invoke this function, which uses <code class="docutils literal"><span class="pre">Parameters</span></code> as this function inputs.</li>
<li>The MultiGradientMachine contains many functionalities. Extracting the Parameters exchanging logic could make MultiGradientMachine clearer and simpler.</li>
</ul>
</li>
<li>Make <code class="docutils literal"><span class="pre">Parameters</span></code> as an argument for <code class="docutils literal"><span class="pre">forward/backward</span></code> function, not a data member for <code class="docutils literal"><span class="pre">GradientMachine</span></code>. For example, <code class="docutils literal"><span class="pre">forward</span></code> could be <code class="docutils literal"><span class="pre">forward(const</span> <span class="pre">Parameters&amp;</span> <span class="pre">params,</span> <span class="pre">...)</span></code> and <code class="docutils literal"><span class="pre">backward</span></code> could be <code class="docutils literal"><span class="pre">backward(Parameters*</span> <span class="pre">params,</span> <span class="pre">...)</span></code>. After this step, Paddle could share <code class="docutils literal"><span class="pre">Parameters</span></code> between topologies.</li>
<li><code class="docutils literal"><span class="pre">ParameterUpdater</span></code> is invoked by <code class="docutils literal"><span class="pre">GradientMachine</span></code> and <code class="docutils literal"><span class="pre">Trainer</span></code>, but it updates <code class="docutils literal"><span class="pre">Parameters</span></code>. In the end of this code refactoring, we could change <code class="docutils literal"><span class="pre">ParameterUpdater</span></code> directly uses <code class="docutils literal"><span class="pre">Parameters</span></code> to make <code class="docutils literal"><span class="pre">ParameterUpdater</span></code>&#8216;s implementation clear.</li>
</ol>
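<p>The sketch below illustrates steps 3 and 4: a global exchange function rather than a <code class="docutils literal"><span class="pre">Parameters</span></code> member function, and <code class="docutils literal"><span class="pre">forward/backward</span></code> signatures that take <code class="docutils literal"><span class="pre">Parameters</span></code> as an argument. The names and signatures are assumptions made for illustration, not the final API.</p>
<div class="highlight-cpp"><div class="highlight"><pre>
// Illustrative sketch of steps 3-4; names and signatures are assumed.
#include &lt;vector&gt;

namespace paddle {

class Parameters;
struct Argument;  // stands in for a topology's input/output data

// A global function, not a Parameters member function: dispatch parameter
// values to each GPU and gather gradients back. MultiGradientMachine
// invokes it with the shared Parameters as input.
void exchangeParametersBetweenDevices(Parameters* params,
                                      const std::vector&lt;int&gt;&amp; deviceIds);

class GradientMachine {
 public:
  // Parameters is passed per call instead of being a data member, so
  // many GradientMachines (topologies) can share one Parameters object.
  virtual void forward(const Parameters&amp; params,
                       const std::vector&lt;Argument&gt;&amp; inputs,
                       std::vector&lt;Argument&gt;* outputs) = 0;
  virtual void backward(Parameters* params,
                        const std::vector&lt;Argument&gt;&amp; outputGrads) = 0;
  virtual ~GradientMachine() = default;
};

}  // namespace paddle
</pre></div></div>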
</div>