diff --git a/doc/api/data_provider/pydataprovider2_en.rst b/doc/api/data_provider/pydataprovider2_en.rst index 083436e2710b4582e11741aaeaf5932d59869473..50e8b0d32923c4fea37f2296a76cf5b44c8364e7 100644 --- a/doc/api/data_provider/pydataprovider2_en.rst +++ b/doc/api/data_provider/pydataprovider2_en.rst @@ -1,4 +1,4 @@ -.. _api_pydataprovider: +.. _api_pydataprovider2_en: PyDataProvider2 =============== @@ -104,6 +104,8 @@ And PaddlePadle will do all of the rest things\: Is this cool? +.. _api_pydataprovider2_en_sequential_model: + DataProvider for the sequential model ------------------------------------- A sequence model takes sequences as its input. A sequence is made up of several diff --git a/doc/api/predict/swig_py_paddle_en.rst b/doc/api/predict/swig_py_paddle_en.rst index 9845cd1607b425dc0a4ddc665aab40d96fa2fbe4..8b145e5b30a88db9f61c63249885dac92dd1fa9c 100644 --- a/doc/api/predict/swig_py_paddle_en.rst +++ b/doc/api/predict/swig_py_paddle_en.rst @@ -23,7 +23,7 @@ python's :code:`help()` function. Let's walk through the above python script: * At the beginning, use :code:`swig_paddle.initPaddle()` to initialize PaddlePaddle with command line arguments, for more about command line arguments - see `Command Line Arguments <../cmd_argument/detail_introduction.html>`_. + see :ref:`cmd_detail_introduction_en` . * Parse the configuration file that is used in training with :code:`parse_config()`. Because data to predict with always have no label, and output of prediction work normally is the output layer rather than the cost layer, so you should modify @@ -36,7 +36,7 @@ python's :code:`help()` function. Let's walk through the above python script: - Note: As swig_paddle can only accept C++ matrices, we offer a utility class DataProviderConverter that can accept the same input data with PyDataProvider2, for more information please refer to document - of `PyDataProvider2 <../data_provider/pydataprovider2.html>`_. + of :ref:`api_pydataprovider2_en` . 
* Do the prediction with :code:`forwardTest()`, which takes the converted input data and outputs the activations of the output layer. diff --git a/doc/api/trainer_config_helpers/layers.rst b/doc/api/trainer_config_helpers/layers.rst index 12a75080d0deab1ecce6b2579b059ba56abf6711..52a6cfb120504d57617f0d777b5ca49cd7d269d7 100644 --- a/doc/api/trainer_config_helpers/layers.rst +++ b/doc/api/trainer_config_helpers/layers.rst @@ -1,3 +1,5 @@ +.. _api_trainer_config_helpers_layers: + ====== Layers ====== diff --git a/doc/getstarted/basic_usage/index_en.rst b/doc/getstarted/basic_usage/index_en.rst index dca7a6b1f4f017b302148c611122806f112564a9..4ffadc68ee53e12e3b3cb56ea27021c52505aebf 100644 --- a/doc/getstarted/basic_usage/index_en.rst +++ b/doc/getstarted/basic_usage/index_en.rst @@ -99,11 +99,3 @@ In PaddlePaddle, training is just to get a collection of model parameters, which Although starts from a random guess, you can see that value of ``w`` changes quickly towards 2 and ``b`` changes quickly towards 0.3. In the end, the predicted line is almost identical with real answer. There, you have recovered the underlying pattern between ``X`` and ``Y`` only from observed data. - - -5. Where to Go from Here -------------------------- - -- `Install and Build <../build_and_install/index.html>`_ -- `Tutorials <../demo/quick_start/index_en.html>`_ -- `Example and Demo <../demo/index.html>`_ diff --git a/doc/howto/cmd_parameter/detail_introduction_en.md b/doc/howto/cmd_parameter/detail_introduction_en.md index 510396b629e398cef2ccda2f1cec474160693219..82136b7d4f65ffcdff60243feb25b31a4a468637 100644 --- a/doc/howto/cmd_parameter/detail_introduction_en.md +++ b/doc/howto/cmd_parameter/detail_introduction_en.md @@ -1,3 +1,7 @@ +```eval_rst +.. 
_cmd_detail_introduction_en: +``` + # Detail Description ## Common diff --git a/doc/howto/deep_model/rnn/rnn_en.rst b/doc/howto/deep_model/rnn/rnn_en.rst index da29b8efadd299fe4fc74a71392cbc9a56e32be3..64f464b1dc0546462cd6b9294e93e5be935e4f46 100644 --- a/doc/howto/deep_model/rnn/rnn_en.rst +++ b/doc/howto/deep_model/rnn/rnn_en.rst @@ -30,7 +30,7 @@ Then at the :code:`process` function, each :code:`yield` function will return th yield src_ids, trg_ids, trg_ids_next -For more details description of how to write a data provider, please refer to `PyDataProvider2 <../../ui/data_provider/index.html>`_. The full data provider file is located at :code:`demo/seqToseq/dataprovider.py`. +For more details description of how to write a data provider, please refer to :ref:`api_pydataprovider2_en` . The full data provider file is located at :code:`demo/seqToseq/dataprovider.py`. =============================================== Configure Recurrent Neural Network Architecture @@ -106,7 +106,7 @@ We will use the sequence to sequence model with attention as an example to demon In this model, the source sequence :math:`S = \{s_1, \dots, s_T\}` is encoded with a bidirectional gated recurrent neural networks. The hidden states of the bidirectional gated recurrent neural network :math:`H_S = \{H_1, \dots, H_T\}` is called *encoder vector* The decoder is a gated recurrent neural network. When decoding each token :math:`y_t`, the gated recurrent neural network generates a set of weights :math:`W_S^t = \{W_1^t, \dots, W_T^t\}`, which are used to compute a weighted sum of the encoder vector. The weighted sum of the encoder vector is utilized to condition the generation of the token :math:`y_t`. -The encoder part of the model is listed below. It calls :code:`grumemory` to represent gated recurrent neural network. It is the recommended way of using recurrent neural network if the network architecture is simple, because it is faster than :code:`recurrent_group`. 
We have implemented most of the commonly used recurrent neural network architectures, you can refer to `Layers <../../ui/api/trainer_config_helpers/layers_index.html>`_ for more details. +The encoder part of the model is listed below. It calls :code:`grumemory` to represent gated recurrent neural network. It is the recommended way of using recurrent neural network if the network architecture is simple, because it is faster than :code:`recurrent_group`. We have implemented most of the commonly used recurrent neural network architectures, you can refer to :ref:`api_trainer_config_helpers_layers` for more details. We also project the encoder vector to :code:`decoder_size` dimensional space, get the first instance of the backward recurrent network, and project it to :code:`decoder_size` dimensional space: @@ -246,6 +246,6 @@ The code is listed below: outputs(beam_gen) -Notice that this generation technique is only useful for decoder like generation process. If you are working on sequence tagging tasks, please refer to `Semantic Role Labeling Demo <../../demo/semantic_role_labeling/index.html>`_ for more details. +Notice that this generation technique is only useful for decoder like generation process. If you are working on sequence tagging tasks, please refer to :ref:`sentiment_analysis_en` for more details. The full configuration file is located at :code:`demo/seqToseq/seqToseq_net.py`. diff --git a/doc/howto/optimization/gpu_profiling_en.rst b/doc/howto/optimization/gpu_profiling_en.rst index 667bf1364e7cd4c9098caba72a127228d78ca38b..40ba698f4e571dfd9370fcfb9382ea50e814ca2e 100644 --- a/doc/howto/optimization/gpu_profiling_en.rst +++ b/doc/howto/optimization/gpu_profiling_en.rst @@ -51,7 +51,7 @@ In this tutorial, we will focus on nvprof and nvvp. :code:`test_GpuProfiler` from :code:`paddle/math/tests` directory will be used to evaluate above profilers. -.. literalinclude:: ../../paddle/math/tests/test_GpuProfiler.cpp +.. 
literalinclude:: ../../../paddle/math/tests/test_GpuProfiler.cpp :language: c++ :lines: 111-124 :linenos: @@ -77,7 +77,7 @@ As a simple example, consider the following: 1. Add :code:`REGISTER_TIMER_INFO` and :code:`printAllStatus` functions (see the emphasize-lines). - .. literalinclude:: ../../paddle/math/tests/test_GpuProfiler.cpp + .. literalinclude:: ../../../paddle/math/tests/test_GpuProfiler.cpp :language: c++ :lines: 111-124 :emphasize-lines: 8-10,13 @@ -124,7 +124,7 @@ To use this command line profiler **nvprof**, you can simply issue the following 1. Add :code:`REGISTER_GPU_PROFILER` function (see the emphasize-lines). - .. literalinclude:: ../../paddle/math/tests/test_GpuProfiler.cpp + .. literalinclude:: ../../../paddle/math/tests/test_GpuProfiler.cpp :language: c++ :lines: 111-124 :emphasize-lines: 6-7 diff --git a/doc/tutorials/embedding_model/index_en.md b/doc/tutorials/embedding_model/index_en.md index 06f3ff1f009e470cdb9687658613a76acbb79751..d793a50f488e464bcd90a2fb506a8dcc3c760433 100644 --- a/doc/tutorials/embedding_model/index_en.md +++ b/doc/tutorials/embedding_model/index_en.md @@ -93,7 +93,7 @@ where `train.sh` is almost the same as `demo/seqToseq/translation/train.sh`, the - `--init_model_path`: path of the initialization model, here is `data/paraphrase_model` - `--load_missing_parameter_strategy`: operations when model file is missing, here use a normal distibution to initialize the other parameters except for the embedding layer -For users who want to understand the dataset format, model architecture and training procedure in detail, please refer to [Text generation Tutorial](../text_generation/text_generation.md). +For users who want to understand the dataset format, model architecture and training procedure in detail, please refer to [Text generation Tutorial](../text_generation/index_en.md). 
## Optional Function ## ### Embedding Parameters Observation diff --git a/doc/tutorials/rec/ml_regression_en.rst b/doc/tutorials/rec/ml_regression_en.rst index ddc00dc706535e1204b033b505ee8bd579f8dea3..6346090a84fad71ab9dff21de0dcc536b5760b83 100644 --- a/doc/tutorials/rec/ml_regression_en.rst +++ b/doc/tutorials/rec/ml_regression_en.rst @@ -264,7 +264,7 @@ In this :code:`dataprovider.py`, we should set\: * use_seq\: Whether this :code:`dataprovider.py` in sequence mode or not. * process\: Return each sample of data to :code:`paddle`. -The data provider details document see :ref:`api_pydataprovider`. +The data provider details document see :ref:`api_pydataprovider2_en`. Train ````` diff --git a/doc/tutorials/semantic_role_labeling/semantic_role_labeling_cn.md b/doc/tutorials/semantic_role_labeling/semantic_role_labeling_cn.md deleted file mode 100644 index f3c855a9fd72b894ab69050b08c750fe9e4aa1a2..0000000000000000000000000000000000000000 --- a/doc/tutorials/semantic_role_labeling/semantic_role_labeling_cn.md +++ /dev/null @@ -1,201 +0,0 @@ -# Semantic Role Labeling Tutorial # - -Semantic role labeling (SRL) is a form of shallow semantic parsing whose goal is to discover the predicate-argument structure of each predicate in a given input sentence. SRL is useful as an intermediate step in many natural language processing tasks, such as information extraction, automatic document categorization, and question answering. An example follows [1]: - - [ A0 He ] [ AM-MOD will ][ AM-NEG not ] [ V accept] [ A1 anything ] from [A2 those things he wrote ]. - -- V: verb -- A0: acceptor -- A1: thing accepted -- A2: accepted-from -- A3: attribute -- AM-MOD: modal -- AM-NEG: negation - -Given the verb "accept", most of the chunks in the sentence play certain semantic roles. Here, the label scheme is from Penn Proposition Bank. - -To date, most successful SRL systems are built on top of some form of parsing results, with pre-defined feature templates applied over the syntactic structure. This tutorial presents an end-to-end system that uses a deep bidirectional long short-term memory (DB-LSTM) model [2] to solve the SRL task, which largely outperforms the previous state-of-the-art systems. The system treats SRL as a sequence labeling problem. - -## Data Description -The related paper [2] uses the data from the CoNLL-2005 and 2012 shared tasks for training and testing. Under the data license, the demo uses the CoNLL-2005 test data set, which can be found on the website. - -Users only need to run the following commands to download and process the raw data: - -```bash -cd data -./get_data.sh -``` -Several new files appear in the `data` directory: -```bash -conll05st-release: the test data set of CoNLL-2005 shared task -test.wsj.words: the Wall Street Journal data sentences -test.wsj.props: the propositional arguments -feature: the extracted features from data set -``` - -## Training -### 
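DB-LSTM -Please refer to the Sentiment Analysis demo to learn more about the long short-term memory unit. - -Unlike the Bidirectional-LSTM used in the Sentiment Analysis demo, the DB-LSTM adopts another way to stack LSTM layers. First, a standard LSTM processes the sequence in the forward direction. The input and output of this LSTM layer are taken as the input of the next LSTM layer, which processes them in the reverse direction. These two standard LSTM layers compose a pair of LSTMs. We then stack LSTM pairs one on top of another to obtain the deep LSTM model. - -The following figure shows a temporally expanded 2-layer DB-LSTM network. -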