Deploy to GitHub Pages: 624d22d9

546fa1a8 · Travis CI · ad7f6583 · 546fa1a8 · 546fa1a8 · 546fa1a8
11 changed files
--- a/develop/doc/_sources/api/v2/config/layer.rst.txt
+++ b/develop/doc/_sources/api/v2/config/layer.rst.txt
@@ -87,6 +87,11 @@ roi_pool
 ..  autoclass:: paddle.v2.layer.roi_pool
    :noindex:
+pad
+----
+..  autoclass:: paddle.v2.layer.pad
+    :noindex:
 Norm Layer
 ==========
@@ -133,6 +138,11 @@ grumemory
 ..  autoclass:: paddle.v2.layer.grumemory
    :noindex:
+gated_unit
+-----------
+..  autoclass:: paddle.v2.layer.gated_unit
+    :noindex:
 Recurrent Layer Group
 =====================
@@ -340,6 +350,11 @@ bilinear_interp
 ..  autoclass:: paddle.v2.layer.bilinear_interp
    :noindex:
+dropout
+--------
+..  autoclass:: paddle.v2.layer.dropout
+    :noindex:
 dot_prod
 ---------
 .. autoclass:: paddle.v2.layer.dot_prod
@@ -402,6 +417,11 @@ scale_shift
 ..  autoclass:: paddle.v2.layer.scale_shift
    :noindex:
+factorization_machine
+---------------------
+..  autoclass:: paddle.v2.layer.factorization_machine
+    :noindex:
 Sampling Layers
 ===============
@@ -420,22 +440,6 @@ multiplex
 ..  autoclass:: paddle.v2.layer.multiplex
    :noindex:
-Factorization Machine Layer
-============================
-factorization_machine
---------------------
-..  autoclass:: paddle.v2.layer.factorization_machine
-    :noindex:
-Slicing and Joining Layers
-==========================
-pad
----
-..  autoclass:: paddle.v2.layer.pad
-    :noindex:
 ..  _api_v2.layer_costs:
 Cost Layers
@@ -526,6 +530,11 @@ multibox_loss
 ..  autoclass:: paddle.v2.layer.multibox_loss
    :noindex:
+detection_output
+----------------
+..  autoclass:: paddle.v2.layer.detection_output
+    :noindex:
 Check Layer
 ============
@@ -534,31 +543,10 @@ eos
 ..  autoclass:: paddle.v2.layer.eos
    :noindex:
-Miscs
+Activation
-=====
+==========
-dropout
--------
-..  autoclass:: paddle.v2.layer.dropout
-    :noindex:
-Activation with learnable parameter
-===================================
 prelu
 --------
 ..  autoclass:: paddle.v2.layer.prelu
    :noindex:
-gated_unit
-----------
-..  autoclass:: paddle.v2.layer.gated_unit
-    :noindex:
-Detection output Layer
-======================
-detection_output
----------------
-..  autoclass:: paddle.v2.layer.detection_output
-    :noindex:
--- a/develop/doc/_sources/api/v2/data/dataset.rst.txt
+++ b/develop/doc/_sources/api/v2/data/dataset.rst.txt
@@ -73,3 +73,10 @@ wmt14
 ..  automodule:: paddle.v2.dataset.wmt14
    :members:
    :noindex:
+wmt16
+++++
+..  automodule:: paddle.v2.dataset.wmt16
+    :members:
+    :noindex:
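The `wmt16` module documented by this stanza exposes its data through readers, which the generated page describes as callables. A minimal sketch of that reader-creator pattern in plain Python, assuming nothing about Paddle's internals (`make_reader` and the toy sample tuples are hypothetical and not part of `paddle.v2.dataset.wmt16`):

```python
def make_reader(samples):
    """Return a reader: a zero-argument callable that yields one sample at a time."""

    def reader():
        for sample in samples:
            yield sample

    return reader


# Hypothetical samples: (source ids, target ids, next-word ids).
toy_samples = [
    ([0, 3, 1], [0, 4], [4, 1]),
    ([0, 5, 1], [0, 6], [6, 1]),
]
train_reader = make_reader(toy_samples)
first_pass = list(train_reader())
second_pass = list(train_reader())  # calling the reader again restarts iteration
```

Returning a callable rather than an iterator lets a training loop start a fresh pass over the data for each epoch.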
--- a/develop/doc/api/v2/config/layer.html
+++ b/develop/doc/api/v2/config/layer.html
--- a/develop/doc/api/v2/data/dataset.html
+++ b/develop/doc/api/v2/data/dataset.html
@@ -729,6 +729,191 @@ sequence.</p>
 <dd><p>Converts dataset to recordio format</p>
 </dd></dl>
+</div>
+<div class="section" id="wmt16">
+<h2>wmt16<a class="headerlink" href="#wmt16" title="Permalink to this headline">¶</a></h2>
+<p>ACL2016 Multimodal Machine Translation. Please see this website for more
+details: <a class="reference external" href="http://www.statmt.org/wmt16/multimodal-task.html#task1">http://www.statmt.org/wmt16/multimodal-task.html#task1</a></p>
+<p>If you use the dataset created for your task, please cite the following paper:
+Multi30K: Multilingual English-German Image Descriptions.</p>
+<dl class="docutils">
+<dt>&#64;article{elliott-EtAl:2016:VL16,</dt>
+<dd>author    = {{Elliott}, D. and {Frank}, S. and {Sima&#8217;an}, K. and {Specia}, L.},
+title     = {Multi30K: Multilingual English-German Image Descriptions},
+booktitle = {Proceedings of the 6th Workshop on Vision and Language},
+year      = {2016},
+pages     = {70&#8211;74},
+year      = 2016</dd>
+</dl>
+<p>}</p>
+<dl class="function">
+<dt>
+<code class="descclassname">paddle.v2.dataset.wmt16.</code><code class="descname">train</code><span class="sig-paren">(</span><em>src_dict_size</em>, <em>trg_dict_size</em>, <em>src_lang='en'</em><span class="sig-paren">)</span></dt>
+<dd><p>WMT16 train set reader.</p>
+<p>This function returns the reader for train data. Each sample the reader
+returns is made up of three fields: the source language word index sequence,
+target language word index sequence and next word index sequence.</p>
+<p>NOTE:
+The original link for training data is:
+<a class="reference external" href="http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/training.tar.gz">http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/training.tar.gz</a></p>
+<p>paddle.dataset.wmt16 provides a tokenized version of the original dataset by
+using moses&#8217;s tokenization script:
+<a class="reference external" href="https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl">https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl</a></p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>src_dict_size</strong> (<em>int</em>) &#8211; Size of the source language dictionary. Three
+special tokens will be added into the dictionary:
+&lt;s&gt; for start mark, &lt;e&gt; for end mark, and &lt;unk&gt; for
+unknown word.</li>
+<li><strong>trg_dict_size</strong> (<em>int</em>) &#8211; Size of the target language dictionary. Three
+special tokens will be added into the dictionary:
+&lt;s&gt; for start mark, &lt;e&gt; for end mark, and &lt;unk&gt; for
+unknown word.</li>
+<li><strong>src_lang</strong> (<em>string</em>) &#8211; A string indicating which language is the source
+language. Available options are: &#8220;en&#8221; for English
+and &#8220;de&#8221; for German.</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">The train reader.</p>
+</td>
+</tr>
+<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">callable</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+<dl class="function">
+<dt>
+<code class="descclassname">paddle.v2.dataset.wmt16.</code><code class="descname">test</code><span class="sig-paren">(</span><em>src_dict_size</em>, <em>trg_dict_size</em>, <em>src_lang='en'</em><span class="sig-paren">)</span></dt>
+<dd><p>WMT16 test set reader.</p>
+<p>This function returns the reader for test data. Each sample the reader
+returns is made up of three fields: the source language word index sequence,
+target language word index sequence and next word index sequence.</p>
+<p>NOTE:
+The original link for test data is:
+<a class="reference external" href="http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/mmt16_task1_test.tar.gz">http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/mmt16_task1_test.tar.gz</a></p>
+<p>paddle.dataset.wmt16 provides a tokenized version of the original dataset by
+using moses&#8217;s tokenization script:
+<a class="reference external" href="https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl">https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl</a></p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>src_dict_size</strong> (<em>int</em>) &#8211; Size of the source language dictionary. Three
+special tokens will be added into the dictionary:
+&lt;s&gt; for start mark, &lt;e&gt; for end mark, and &lt;unk&gt; for
+unknown word.</li>
+<li><strong>trg_dict_size</strong> (<em>int</em>) &#8211; Size of the target language dictionary. Three
+special tokens will be added into the dictionary:
+&lt;s&gt; for start mark, &lt;e&gt; for end mark, and &lt;unk&gt; for
+unknown word.</li>
+<li><strong>src_lang</strong> (<em>string</em>) &#8211; A string indicating which language is the source
+language. Available options are: &#8220;en&#8221; for English
+and &#8220;de&#8221; for German.</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">The test reader.</p>
+</td>
+</tr>
+<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">callable</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+<dl class="function">
+<dt>
+<code class="descclassname">paddle.v2.dataset.wmt16.</code><code class="descname">validation</code><span class="sig-paren">(</span><em>src_dict_size</em>, <em>trg_dict_size</em>, <em>src_lang='en'</em><span class="sig-paren">)</span></dt>
+<dd><p>WMT16 validation set reader.</p>
+<p>This function returns the reader for validation data. Each sample the reader
+returns is made up of three fields: the source language word index sequence,
+target language word index sequence and next word index sequence.</p>
+<p>NOTE:
+The original link for validation data is:
+<a class="reference external" href="http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/validation.tar.gz">http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/validation.tar.gz</a></p>
+<p>paddle.dataset.wmt16 provides a tokenized version of the original dataset by
+using moses&#8217;s tokenization script:
+<a class="reference external" href="https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl">https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl</a></p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>src_dict_size</strong> (<em>int</em>) &#8211; Size of the source language dictionary. Three
+special tokens will be added into the dictionary:
+&lt;s&gt; for start mark, &lt;e&gt; for end mark, and &lt;unk&gt; for
+unknown word.</li>
+<li><strong>trg_dict_size</strong> (<em>int</em>) &#8211; Size of the target language dictionary. Three
+special tokens will be added into the dictionary:
+&lt;s&gt; for start mark, &lt;e&gt; for end mark, and &lt;unk&gt; for
+unknown word.</li>
+<li><strong>src_lang</strong> (<em>string</em>) &#8211; A string indicating which language is the source
+language. Available options are: &#8220;en&#8221; for English
+and &#8220;de&#8221; for German.</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">The validation reader.</p>
+</td>
+</tr>
+<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">callable</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+<dl class="function">
+<dt>
+<code class="descclassname">paddle.v2.dataset.wmt16.</code><code class="descname">get_dict</code><span class="sig-paren">(</span><em>lang</em>, <em>dict_size</em>, <em>reverse=False</em><span class="sig-paren">)</span></dt>
+<dd><p>Return the word dictionary for the specified language.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>lang</strong> (<em>string</em>) &#8211; A string indicating which language is the source
+language. Available options are: &#8220;en&#8221; for English
+and &#8220;de&#8221; for German.</li>
+<li><strong>dict_size</strong> (<em>int</em>) &#8211; Size of the specified language dictionary.</li>
+<li><strong>reverse</strong> (<em>bool</em>) &#8211; If reverse is set to False, the returned python
+dictionary will use word as key and use index as value.
+If reverse is set to True, the returned python
+dictionary will use index as key and word as value.</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">The word dictionary for the specified language.</p>
+</td>
+</tr>
+<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">dict</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+<dl class="function">
+<dt>
+<code class="descclassname">paddle.v2.dataset.wmt16.</code><code class="descname">fetch</code><span class="sig-paren">(</span><span class="sig-paren">)</span></dt>
+<dd><p>Download the entire dataset.</p>
+</dd></dl>
+<dl class="function">
+<dt>
+<code class="descclassname">paddle.v2.dataset.wmt16.</code><code class="descname">convert</code><span class="sig-paren">(</span><em>path</em>, <em>src_dict_size</em>, <em>trg_dict_size</em>, <em>src_lang</em><span class="sig-paren">)</span></dt>
+<dd><p>Converts dataset to recordio format.</p>
+</dd></dl>
 </div>
 </div>
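
The generated page above says each sample has three fields (source ids, target ids, next-word ids) and that `<s>`, `<e>` and `<unk>` are added to each dictionary. One way this could work is sketched below in plain Python. The helper names `build_dict` and `encode_pair` are hypothetical, and both the placement of the start/end marks and the choice to count the special tokens inside `dict_size` are assumptions, not confirmed behaviour of `paddle.v2.dataset.wmt16`:

```python
START, END, UNK = "<s>", "<e>", "<unk>"


def build_dict(words, dict_size):
    """Build a word -> index dict of at most dict_size entries.

    Assumption: the three special tokens take the first three indices
    and count towards dict_size.
    """
    vocab = [START, END, UNK]
    for word in words:
        if len(vocab) >= dict_size:
            break
        if word not in vocab:
            vocab.append(word)
    return {word: index for index, word in enumerate(vocab)}


def encode_pair(src_words, trg_words, src_dict, trg_dict):
    """Turn one tokenized sentence pair into three index sequences."""
    # Source: wrapped in start and end marks; unseen words map to <unk>.
    src_ids = (
        [src_dict[START]]
        + [src_dict.get(w, src_dict[UNK]) for w in src_words]
        + [src_dict[END]]
    )
    trg_core = [trg_dict.get(w, trg_dict[UNK]) for w in trg_words]
    # Target input starts with <s>; the next-word sequence is the same
    # words shifted by one position and terminated with <e>.
    trg_ids = [trg_dict[START]] + trg_core
    trg_next = trg_core + [trg_dict[END]]
    return src_ids, trg_ids, trg_next


src_dict = build_dict(["a", "man", "runs"], dict_size=5)   # "runs" does not fit
trg_dict = build_dict(["ein", "mann", "rennt"], dict_size=6)
sample = encode_pair(["a", "man", "runs"], ["ein", "mann", "rennt"],
                     src_dict, trg_dict)
```

With these toy inputs, `"runs"` falls outside the five-entry source dictionary and is encoded as `<unk>`, while the target and next-word sequences differ only by the one-position shift.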

--- a/develop/doc/operators.json
+++ b/develop/doc/operators.json
--- a/develop/doc/searchindex.js
+++ b/develop/doc/searchindex.js
--- a/develop/doc_cn/_sources/api/v2/config/layer.rst.txt
+++ b/develop/doc_cn/_sources/api/v2/config/layer.rst.txt
@@ -87,6 +87,11 @@ roi_pool
 ..  autoclass:: paddle.v2.layer.roi_pool
    :noindex:
+pad
+----
+..  autoclass:: paddle.v2.layer.pad
+    :noindex:
 Norm Layer
 ==========
@@ -133,6 +138,11 @@ grumemory
 ..  autoclass:: paddle.v2.layer.grumemory
    :noindex:
+gated_unit
+-----------
+..  autoclass:: paddle.v2.layer.gated_unit
+    :noindex:
 Recurrent Layer Group
 =====================
@@ -340,6 +350,11 @@ bilinear_interp
 ..  autoclass:: paddle.v2.layer.bilinear_interp
    :noindex:
+dropout
+--------
+..  autoclass:: paddle.v2.layer.dropout
+    :noindex:
 dot_prod
 ---------
 .. autoclass:: paddle.v2.layer.dot_prod
@@ -402,6 +417,11 @@ scale_shift
 ..  autoclass:: paddle.v2.layer.scale_shift
    :noindex:
+factorization_machine
+---------------------
+..  autoclass:: paddle.v2.layer.factorization_machine
+    :noindex:
 Sampling Layers
 ===============
@@ -420,22 +440,6 @@ multiplex
 ..  autoclass:: paddle.v2.layer.multiplex
    :noindex:
-Factorization Machine Layer
-============================
-factorization_machine
---------------------
-..  autoclass:: paddle.v2.layer.factorization_machine
-    :noindex:
-Slicing and Joining Layers
-==========================
-pad
----
-..  autoclass:: paddle.v2.layer.pad
-    :noindex:
 ..  _api_v2.layer_costs:
 Cost Layers
@@ -526,6 +530,11 @@ multibox_loss
 ..  autoclass:: paddle.v2.layer.multibox_loss
    :noindex:
+detection_output
+----------------
+..  autoclass:: paddle.v2.layer.detection_output
+    :noindex:
 Check Layer
 ============
@@ -534,31 +543,10 @@ eos
 ..  autoclass:: paddle.v2.layer.eos
    :noindex:
-Miscs
+Activation
-=====
+==========
-dropout
--------
-..  autoclass:: paddle.v2.layer.dropout
-    :noindex:
-Activation with learnable parameter
-===================================
 prelu
 --------
 ..  autoclass:: paddle.v2.layer.prelu
    :noindex:
-gated_unit
-----------
-..  autoclass:: paddle.v2.layer.gated_unit
-    :noindex:
-Detection output Layer
-======================
-detection_output
----------------
-..  autoclass:: paddle.v2.layer.detection_output
-    :noindex:
--- a/develop/doc_cn/_sources/api/v2/data/dataset.rst.txt
+++ b/develop/doc_cn/_sources/api/v2/data/dataset.rst.txt
@@ -73,3 +73,10 @@ wmt14
 ..  automodule:: paddle.v2.dataset.wmt14
    :members:
    :noindex:
+wmt16
+++++
+..  automodule:: paddle.v2.dataset.wmt16
+    :members:
+    :noindex:
--- a/develop/doc_cn/api/v2/config/layer.html
+++ b/develop/doc_cn/api/v2/config/layer.html
--- a/develop/doc_cn/api/v2/data/dataset.html
+++ b/develop/doc_cn/api/v2/data/dataset.html
@@ -748,6 +748,191 @@ sequence.</p>
 <dd><p>Converts dataset to recordio format</p>
 </dd></dl>
+</div>
+<div class="section" id="wmt16">
+<h2>wmt16<a class="headerlink" href="#wmt16" title="永久链接至标题">¶</a></h2>
+<p>ACL2016 Multimodal Machine Translation. Please see this website for more
+details: <a class="reference external" href="http://www.statmt.org/wmt16/multimodal-task.html#task1">http://www.statmt.org/wmt16/multimodal-task.html#task1</a></p>
+<p>If you use the dataset created for your task, please cite the following paper:
+Multi30K: Multilingual English-German Image Descriptions.</p>
+<dl class="docutils">
+<dt>&#64;article{elliott-EtAl:2016:VL16,</dt>
+<dd>author    = {{Elliott}, D. and {Frank}, S. and {Sima&#8217;an}, K. and {Specia}, L.},
+title     = {Multi30K: Multilingual English-German Image Descriptions},
+booktitle = {Proceedings of the 6th Workshop on Vision and Language},
+year      = {2016},
+pages     = {70&#8211;74},
+year      = 2016</dd>
+</dl>
+<p>}</p>
+<dl class="function">
+<dt>
+<code class="descclassname">paddle.v2.dataset.wmt16.</code><code class="descname">train</code><span class="sig-paren">(</span><em>src_dict_size</em>, <em>trg_dict_size</em>, <em>src_lang='en'</em><span class="sig-paren">)</span></dt>
+<dd><p>WMT16 train set reader.</p>
+<p>This function returns the reader for train data. Each sample the reader
+returns is made up of three fields: the source language word index sequence,
+target language word index sequence and next word index sequence.</p>
+<p>NOTE:
+The original link for training data is:
+<a class="reference external" href="http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/training.tar.gz">http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/training.tar.gz</a></p>
+<p>paddle.dataset.wmt16 provides a tokenized version of the original dataset by
+using moses&#8217;s tokenization script:
+<a class="reference external" href="https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl">https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl</a></p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
+<li><strong>src_dict_size</strong> (<em>int</em>) &#8211; Size of the source language dictionary. Three
+special tokens will be added into the dictionary:
+&lt;s&gt; for start mark, &lt;e&gt; for end mark, and &lt;unk&gt; for
+unknown word.</li>
+<li><strong>trg_dict_size</strong> (<em>int</em>) &#8211; Size of the target language dictionary. Three
+special tokens will be added into the dictionary:
+&lt;s&gt; for start mark, &lt;e&gt; for end mark, and &lt;unk&gt; for
+unknown word.</li>
+<li><strong>src_lang</strong> (<em>string</em>) &#8211; A string indicating which language is the source
+language. Available options are: &#8220;en&#8221; for English
+and &#8220;de&#8221; for German.</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">The train reader.</p>
+</td>
+</tr>
+<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">callable</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+<dl class="function">
+<dt>
+<code class="descclassname">paddle.v2.dataset.wmt16.</code><code class="descname">test</code><span class="sig-paren">(</span><em>src_dict_size</em>, <em>trg_dict_size</em>, <em>src_lang='en'</em><span class="sig-paren">)</span></dt>
+<dd><p>WMT16 test set reader.</p>
+<p>This function returns the reader for test data. Each sample the reader
+returns is made up of three fields: the source language word index sequence,
+target language word index sequence and next word index sequence.</p>
+<p>NOTE:
+The original link for test data is:
+<a class="reference external" href="http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/mmt16_task1_test.tar.gz">http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/mmt16_task1_test.tar.gz</a></p>
+<p>paddle.dataset.wmt16 provides a tokenized version of the original dataset by
+using moses&#8217;s tokenization script:
+<a class="reference external" href="https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl">https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl</a></p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
+<li><strong>src_dict_size</strong> (<em>int</em>) &#8211; Size of the source language dictionary. Three
+special tokens will be added into the dictionary:
+&lt;s&gt; for start mark, &lt;e&gt; for end mark, and &lt;unk&gt; for
+unknown word.</li>
+<li><strong>trg_dict_size</strong> (<em>int</em>) &#8211; Size of the target language dictionary. Three
+special tokens will be added into the dictionary:
+&lt;s&gt; for start mark, &lt;e&gt; for end mark, and &lt;unk&gt; for
+unknown word.</li>
+<li><strong>src_lang</strong> (<em>string</em>) &#8211; A string indicating which language is the source
+language. Available options are: &#8220;en&#8221; for English
+and &#8220;de&#8221; for German.</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">The test reader.</p>
+</td>
+</tr>
+<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">callable</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+<dl class="function">
+<dt>
+<code class="descclassname">paddle.v2.dataset.wmt16.</code><code class="descname">validation</code><span class="sig-paren">(</span><em>src_dict_size</em>, <em>trg_dict_size</em>, <em>src_lang='en'</em><span class="sig-paren">)</span></dt>
+<dd><p>WMT16 validation set reader.</p>
+<p>This function returns the reader for validation data. Each sample the reader
+returns is made up of three fields: the source language word index sequence,
+target language word index sequence and next word index sequence.</p>
+<p>NOTE:
+The original link for validation data is:
+<a class="reference external" href="http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/validation.tar.gz">http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/validation.tar.gz</a></p>
+<p>paddle.dataset.wmt16 provides a tokenized version of the original dataset by
+using moses&#8217;s tokenization script:
+<a class="reference external" href="https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl">https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl</a></p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
+<li><strong>src_dict_size</strong> (<em>int</em>) &#8211; Size of the source language dictionary. Three
+special tokens will be added into the dictionary:
+&lt;s&gt; for start mark, &lt;e&gt; for end mark, and &lt;unk&gt; for
+unknown word.</li>
+<li><strong>trg_dict_size</strong> (<em>int</em>) &#8211; Size of the target language dictionary. Three
+special tokens will be added into the dictionary:
+&lt;s&gt; for start mark, &lt;e&gt; for end mark, and &lt;unk&gt; for
+unknown word.</li>
+<li><strong>src_lang</strong> (<em>string</em>) &#8211; A string indicating which language is the source
+language. Available options are: &#8220;en&#8221; for English
+and &#8220;de&#8221; for German.</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">The validation reader.</p>
+</td>
+</tr>
+<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">callable</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+<dl class="function">
+<dt>
+<code class="descclassname">paddle.v2.dataset.wmt16.</code><code class="descname">get_dict</code><span class="sig-paren">(</span><em>lang</em>, <em>dict_size</em>, <em>reverse=False</em><span class="sig-paren">)</span></dt>
+<dd><p>Return the word dictionary for the specified language.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">参数:</th><td class="field-body"><ul class="first simple">
+<li><strong>lang</strong> (<em>string</em>) &#8211; A string indicating which language is the source
+language. Available options are: &#8220;en&#8221; for English
+and &#8220;de&#8221; for German.</li>
+<li><strong>dict_size</strong> (<em>int</em>) &#8211; Size of the specified language dictionary.</li>
+<li><strong>reverse</strong> (<em>bool</em>) &#8211; If reverse is set to False, the returned python
+dictionary will use word as key and use index as value.
+If reverse is set to True, the returned python
+dictionary will use index as key and word as value.</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">返回:</th><td class="field-body"><p class="first">The word dictionary for the specified language.</p>
+</td>
+</tr>
+<tr class="field-odd field"><th class="field-name">返回类型:</th><td class="field-body"><p class="first last">dict</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+<dl class="function">
+<dt>
+<code class="descclassname">paddle.v2.dataset.wmt16.</code><code class="descname">fetch</code><span class="sig-paren">(</span><span class="sig-paren">)</span></dt>
+<dd><p>Download the entire dataset.</p>
+</dd></dl>
+<dl class="function">
+<dt>
+<code class="descclassname">paddle.v2.dataset.wmt16.</code><code class="descname">convert</code><span class="sig-paren">(</span><em>path</em>, <em>src_dict_size</em>, <em>trg_dict_size</em>, <em>src_lang</em><span class="sig-paren">)</span></dt>
+<dd><p>Converts dataset to recordio format.</p>
+</dd></dl>
 </div>
 </div>
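
The `reverse` flag documented for `get_dict` switches between a word-to-index and an index-to-word mapping. A small sketch of what that inversion enables, using plain dictionaries rather than the real dataset (`reverse_dict`, `decode` and the toy vocabulary are hypothetical names and data, not Paddle API):

```python
def reverse_dict(word_to_index):
    """Invert a word -> index mapping into index -> word."""
    return {index: word for word, index in word_to_index.items()}


def decode(ids, index_to_word, unk="<unk>"):
    """Map a sequence of indices back to words; unknown indices become unk."""
    return [index_to_word.get(i, unk) for i in ids]


# Toy vocabulary standing in for a dictionary returned with reverse=False.
word_to_index = {"<s>": 0, "<e>": 1, "<unk>": 2, "ein": 3, "mann": 4}
index_to_word = reverse_dict(word_to_index)  # analogous to reverse=True
words = decode([0, 3, 4, 9, 1], index_to_word)
```

The index-to-word form is what one would typically use to turn predicted index sequences back into readable text.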

--- a/develop/doc_cn/searchindex.js
+++ b/develop/doc_cn/searchindex.js