@@ -99,11 +99,3 @@ In PaddlePaddle, training is just to get a collection of model parameters, which
Although starts from a random guess, you can see that value of ``w`` changes quickly towards 2 and ``b`` changes quickly towards 0.3. In the end, the predicted line is almost identical with real answer.
There, you have recovered the underlying pattern between ``X`` and ``Y`` only from observed data.
5. Where to Go from Here
-------------------------
- `Install and Build <../build_and_install/index.html>`_
If you already have a local PaddlePaddle repo and have not initialized the submodule, your local submodule folder will be empty. You can simply run the last line of the above codes in your PaddlePaddle home directory to initialize your submodule folder.
If you have already initialized your submodule and you would like to sync with the upstream submodule repo, you can run the following command
```
git submodule update --remote
```
## <span id="requirements">Requirements</span>
To compile the source code, your computer must be equipped with the following dependencies.
@@ -30,7 +30,7 @@ Then at the :code:`process` function, each :code:`yield` function will return th
yield src_ids, trg_ids, trg_ids_next
For more details description of how to write a data provider, please refer to `PyDataProvider2 <../../ui/data_provider/index.html>`_. The full data provider file is located at :code:`demo/seqToseq/dataprovider.py`.
For more details description of how to write a data provider, please refer to :ref:`api_pydataprovider2_en` . The full data provider file is located at :code:`demo/seqToseq/dataprovider.py`.
===============================================
Configure Recurrent Neural Network Architecture
...
...
@@ -106,7 +106,7 @@ We will use the sequence to sequence model with attention as an example to demon
In this model, the source sequence :math:`S = \{s_1, \dots, s_T\}` is encoded with a bidirectional gated recurrent neural networks. The hidden states of the bidirectional gated recurrent neural network :math:`H_S = \{H_1, \dots, H_T\}` is called *encoder vector* The decoder is a gated recurrent neural network. When decoding each token :math:`y_t`, the gated recurrent neural network generates a set of weights :math:`W_S^t = \{W_1^t, \dots, W_T^t\}`, which are used to compute a weighted sum of the encoder vector. The weighted sum of the encoder vector is utilized to condition the generation of the token :math:`y_t`.
The encoder part of the model is listed below. It calls :code:`grumemory` to represent gated recurrent neural network. It is the recommended way of using recurrent neural network if the network architecture is simple, because it is faster than :code:`recurrent_group`. We have implemented most of the commonly used recurrent neural network architectures, you can refer to `Layers <../../ui/api/trainer_config_helpers/layers_index.html>`_ for more details.
The encoder part of the model is listed below. It calls :code:`grumemory` to represent gated recurrent neural network. It is the recommended way of using recurrent neural network if the network architecture is simple, because it is faster than :code:`recurrent_group`. We have implemented most of the commonly used recurrent neural network architectures, you can refer to :ref:`api_trainer_config_helpers_layers` for more details.
We also project the encoder vector to :code:`decoder_size` dimensional space, get the first instance of the backward recurrent network, and project it to :code:`decoder_size` dimensional space:
...
...
@@ -246,6 +246,6 @@ The code is listed below:
outputs(beam_gen)
Notice that this generation technique is only useful for decoder like generation process. If you are working on sequence tagging tasks, please refer to `Semantic Role Labeling Demo <../../demo/semantic_role_labeling/index.html>`_ for more details.
Notice that this generation technique is only useful for decoder like generation process. If you are working on sequence tagging tasks, please refer to :ref:`semantic_role_labeling_en` for more details.
The full configuration file is located at :code:`demo/seqToseq/seqToseq_net.py`.
@@ -93,7 +93,7 @@ where `train.sh` is almost the same as `demo/seqToseq/translation/train.sh`, the
-`--init_model_path`: path of the initialization model, here is `data/paraphrase_model`
-`--load_missing_parameter_strategy`: operations when model file is missing, here use a normal distibution to initialize the other parameters except for the embedding layer
For users who want to understand the dataset format, model architecture and training procedure in detail, please refer to [Text generation Tutorial](../text_generation/text_generation.md).
For users who want to understand the dataset format, model architecture and training procedure in detail, please refer to [Text generation Tutorial](../text_generation/index_en.md).
@@ -12,7 +12,7 @@ This tutorial will teach the basics of deep learning (DL), including how to impl
To get started, please install PaddlePaddle on your computer. Throughout this tutorial, you will learn by implementing different DL models for text classification.
To install PaddlePaddle, please follow the instructions here: <ahref = "../../build/index.html">Build and Install</a>.
To install PaddlePaddle, please follow the instructions here: <ahref = "../../getstarted/build_and_install/index_en.html">Build and Install</a>.
## Overview
For the first step, you will use PaddlePaddle to build a **text classification** system. For example, suppose you run an e-commence website, and you want to analyze the sentiment of user reviews to evaluate product quality.
You can refer to the following link for more detailed examples and data formats: <ahref = "../../ui/data_provider/pydataprovider2.html">PyDataProvider2</a>.
You can refer to the following link for more detailed examples and data formats: <ahref = "../../api/data_provider/pydataprovider2_en.html">PyDataProvider2</a>.
## Network Architecture
You will describe four kinds of network architectures in this section.
<center> </center>
First, you will build a logistic regression model. Later, you will also get chance to build other more powerful network architectures.
For more detailed documentation, you could refer to: <ahref = "../../ui/api/trainer_config_helpers/layers_index.html">Layer documentation</a>。All configuration files are in `demo/quick_start` directory.
For more detailed documentation, you could refer to: <ahref = "../../api/trainer_config_helpers/layers.html">layer documentation</a>. All configuration files are in `demo/quick_start` directory.
### Logistic Regression
The architecture is illustrated in the following picture:
...
...
@@ -366,7 +366,7 @@ You can use single layer LSTM model with Dropout for our text classification pro
<br>
## Optimization Algorithm
<ahref = "../../ui/api/trainer_config_helpers/optimizers.html">Optimization algorithms</a> include Momentum, RMSProp, AdaDelta, AdaGrad, Adam, and Adamax. You can use Adam optimization method here, with L2 regularization and gradient clipping, because Adam has been proved to work very well for training recurrent neural network.
<ahref = "../../api/trainer_config_helpers/optimizers.html">Optimization algorithms</a> include Momentum, RMSProp, AdaDelta, AdaGrad, Adam, and Adamax. You can use Adam optimization method here, with L2 regularization and gradient clipping, because Adam has been proved to work very well for training recurrent neural network.
```python
settings(batch_size=128,
...
...
@@ -391,7 +391,8 @@ paddle train \
--use_gpu=false
```
If you want to install the remote training platform, which enables distributed training on clusters, follow the instructions here: <ahref = "../../cluster/index.html">Platform</a> documentation. We do not provide examples on how to train on clusters. Please refer to other demos or platform training documentation for mode details on training on clusters.
We do not provide examples on how to train on clusters here. If you want to train on clusters, please follow the <ahref = "../../howto/cluster/cluster_train_en.html">distributed training</a> documentation or other demos for more details.
## Inference
You can use the trained model to perform prediction on the dataset with no labels. You can also evaluate the model on dataset with labels to obtain its test accuracy.
<center> </center>
...
...
@@ -406,7 +407,7 @@ paddle train \
--init_model_path=./output/pass-0000x
```
We will give an example of performing prediction using Recurrent model on a dataset with no labels. You can refer to: <ahref = "../../ui/predict/swig_py_paddle_en.html">Python Prediction API</a> tutorial,or other <ahref = "../../demo/index.html">demo</a> for the prediction process using Python. You can also use the following script for inference or evaluation.
We will give an example of performing prediction using Recurrent model on a dataset with no labels. You can refer to<ahref = "../../api/predict/swig_py_paddle_en.html">Python Prediction API</a> tutorial,or other <ahref = "../../tutorials/index_en.html">demo</a> for the prediction process using Python. You can also use the following script for inference or evaluation.
inference script (predict.sh):
...
...
@@ -508,7 +509,7 @@ The scripts of data downloading, network configurations, and training scrips are
*\--config_args:Other configuration arguments.
*\--init_model_path:The path of the initial model parameter.
By default, the trainer will save model every pass. You can also specify `saving_period_by_batches` to set the frequency of batch saving. You can use `show_parameter_stats_period` to print the statistics of the parameters, which are very useful for tuning parameters. Other command line arguments can be found in <ahref = "../../ui/index.html#command-line-argument">command line argument documentation</a>。
By default, the trainer will save model every pass. You can also specify `saving_period_by_batches` to set the frequency of batch saving. You can use `show_parameter_stats_period` to print the statistics of the parameters, which are very useful for tuning parameters. Other command line arguments can be found in <ahref = "../../howto/cmd_parameter/index_en.html">command line argument documentation</a>。
Semantic role labeling (SRL) is a form of shallow semantic parsing whose goal is to discover the predicate-argument structure of each predicate in a given input sentence. SRL is useful as an intermediate step in a wide range of natural language processing tasks, such as information extraction. automatic document categorization and question answering. An instance is as following [1]:
[1] Martha Palmer, Dan Gildea, and Paul Kingsbury. The Proposition Bank: An Annotated Corpus of Semantic Roles , Computational Linguistics, 31(1), 2005.
[2] Zhou, Jie, and Wei Xu. "End-to-end learning of semantic role labeling using recurrent neural networks." Proceedings of the Annual Meeting of the Association for Computational Linguistics. 2015.