# Design for TensorArray

The `TensorArray` concept is borrowed from TensorFlow;
it is meant to be used with dynamic iteration primitives such as `while_loop` and `map_fn`.
This concept can be used to support our new design of dynamic operations, and to help refactor some existing variable-length-sequence layers,
such as `RecurrentGradientMachine`.
In [our design for dynamic RNN](https://github.com/PaddlePaddle/Paddle/pull/4401),
`TensorArray` is used to segment inputs and store states in all time steps.
By providing methods similar to those of a C++ array,
the definition of state-based dynamic models such as RNNs becomes more natural and highly flexible.
## Dynamic-Related Methods
Some basic methods should be proposed as follows:
### stack()
Pack the values in a `TensorArray` into a tensor with rank one higher than each tensor in `values`.
### unstack(axis=0)
Unpack the given dimension of a rank-`R` tensor into rank-`(R-1)` tensors.
### concat()
Return the values in the `TensorArray` as a concatenated Tensor.
### write(index, value, data_shared=true)
Write `value` at position `index` of the `TensorArray`. If `data_shared` is true, the written entry shares the underlying data of `value` instead of holding a copy.
### read(index)
Read the value at location `index` in the `TensorArray`.
### size()
Return the number of values.
## LoDTensor-related Support
The `RecurrentGradientMachine` in Paddle serves as a flexible RNN layer; it takes variable-length sequences as input.
Because each step of an RNN can only take a tensor-represented batch of data as input,
the inputs need some preprocessing, such as sorting the sentences by length in descending order, then cutting out each time step's words and packing them into new batches.
Such cut-like operations can be embedded into `TensorArray` as general methods called `unpack` and `pack`.
With these two methods, an RNN over variable-length sentences can be implemented as follows:
```c++
// input is the variable-length data
LoDTensor sentence_input(xxx);
TensorArray ta;
Tensor indice_map;
Tensor boot_state = xxx; // to initialize rnn's first state
// split the input into length-sorted per-step batches (see unpack below)
ta.unpack(sentence_input, 0 /*level*/, true /*sort_by_length*/, &indice_map);
TensorArray states;
states.write(0, boot_state);
for (size_t step = 0; step < ta.size(); step++)
  states.write(step + 1, rnn_step(ta.read(step), states.read(step)));
LoDTensor rnn_output = states.pack(0, indice_map); // original order restored
```
The code above shows that by embedding the LoDTensor-related preprocessing operations into `TensorArray`,
the implementation of an RNN that supports variable-length sentences is far more concise than `RecurrentGradientMachine`, because the latter mixes all the code together, which makes it hard to read and extend.
Some details are as follows.
### unpack(level, sort_by_length)
Split the LoDTensor at the given `level` and generate batches; if `sort_by_length` is set, the sequences are sorted by length first.
Returns:
- a new `TensorArray`, whose values are LoDTensors, each representing one batch of data.
- an int32 Tensor, which stores the mapping from the new batches' indices to the original LoDTensor.
### pack(level, indices_map)
Recover the original LoD-arranged LoDTensor from the values in a `TensorArray`, given `level` and `indices_map`.