diff --git a/demo/sentiment/trainer_config.py b/demo/sentiment/trainer_config.py
index 894070e7c97dcb29e8c0df31437a374be5f5d691..114a9138ebfef054c7d3ba99b4a510a452f8f2cd 100644
--- a/demo/sentiment/trainer_config.py
+++ b/demo/sentiment/trainer_config.py
@@ -29,6 +29,7 @@ settings(
     batch_size=128,
     learning_rate=2e-3,
     learning_method=AdamOptimizer(),
+    average_window=0.5,
     regularization=L2Regularization(8e-4),
     gradient_clipping_threshold=25)
diff --git a/doc/algorithm/rnn/rnn.rst b/doc/algorithm/rnn/rnn.rst
index 399c5da5fffc20dda78b9eefb2629308cabd748e..01d2caefb5cdf4e949511fd0f5bbafe0e604e881 100644
--- a/doc/algorithm/rnn/rnn.rst
+++ b/doc/algorithm/rnn/rnn.rst
@@ -17,7 +17,7 @@ PaddlePaddle does not need any preprocessing to sequence data, such as padding.
 .. code-block:: python
-    settings.slots = [
+    settings.input_types = [
         integer_value_sequence(len(settings.src_dict)),
         integer_value_sequence(len(settings.trg_dict)),
         integer_value_sequence(len(settings.trg_dict))]
diff --git a/doc/demo/sentiment_analysis/sentiment_analysis.md b/doc/demo/sentiment_analysis/sentiment_analysis.md
index 385f49891dcd840c525f7d1c3aaf7f08a7e4903f..c53952c544de9fa88a6318432e34b0d05b149445 100644
--- a/doc/demo/sentiment_analysis/sentiment_analysis.md
+++ b/doc/demo/sentiment_analysis/sentiment_analysis.md
@@ -6,7 +6,7 @@ Sentiment analysis is also used to monitor social media based on large amount of
 On the other hand, grabbing the user comments of products and analyzing their sentiment are useful to understand user preferences for companies, products, even competing products.
-This tutorial will guide you through the process of training a Long Short Term Memory (LSTM) Network to classify the sentiment of sentences from [Large Movie Review Dataset](http://ai.stanford.edu/~amaas/data/sentiment/), sometimes known as the [Internet Movie Database (IMDB)](http://ai.stanford.edu/~amaas/papers/wvSent_acl2011.pdf). This dataset contains movie reviews along with their associated binary sentiment polarity labels, namely positive and negative. So randomly guessing yields 50% accuracy.
+This tutorial will guide you through the process of training a Long Short Term Memory (LSTM) Network to classify the sentiment of sentences from [Large Movie Review Dataset](http://ai.stanford.edu/~amaas/data/sentiment/), sometimes known as the Internet Movie Database (IMDB). This dataset contains movie reviews along with their associated binary sentiment polarity labels, namely positive and negative. So randomly guessing yields 50% accuracy.
 ## Data Preparation
@@ -39,7 +39,7 @@ imdbEr.txt imdb.vocab README test train
 * imdbEr.txt: expected rating for each token in imdb.vocab.
 * README: data documentation.
-Both train and test set directory contains:
+The file in train set directory is as follows. The test set also contains them except `unsup` and `urls_unsup.txt`.
 ```
 labeledBow.feat neg pos unsup unsupBow.feat urls_neg.txt urls_pos.txt urls_unsup.txt
@@ -151,6 +151,7 @@ settings(
     batch_size=128,
     learning_rate=2e-3,
     learning_method=AdamOptimizer(),
+    average_window=0.5,
     regularization=L2Regularization(8e-4),
     gradient_clipping_threshold=25
 )
@@ -163,17 +164,18 @@ stacked_lstm_net(dict_dim, class_dim=class_dim,
 * **Data Definition**:
     * get\_config\_arg(): get arguments setted by `--config_args=xx` in commandline argument.
-    * Define TrainData and TestData provider, here using Python interface (PyDataProviderWrapper) of PaddlePaddle to load data. For details, you can refer to the document of PyDataProvider.
+    * Define data provider, here using Python interface to load data. For details, you can refer to the document of PyDataProvider2.
 * **Algorithm Configuration**:
-    * use sgd algorithm.
-    * use adam optimization.
     * set batch size of 128.
-    * set average sgd window.
     * set global learning rate.
+    * use adam optimization.
+    * set average sgd window.
+    * set L2 regularization.
+    * set gradient clipping threshold.
 * **Network Configuration**:
-    * dict_dim: get dictionary dimension.
-    * class_dim: set category number, IMDB has two label, namely positive and negative label.
+    * dict_dim: dictionary dimension.
+    * class_dim: category number, IMDB has two label, namely positive and negative label.
     * `stacked_lstm_net`: predefined network as shown in Figure 3, use this network by default.
     * `bidirectional_lstm_net`: predefined network as shown in Figure 2.
diff --git a/doc/dev/new_layer/new_layer.rst b/doc/dev/new_layer/new_layer.rst
index 2fa00730486dbe1f2c9585872068a77efa09f004..af8b76a3075194ead9be40d2c943238b2cfadecc 100644
--- a/doc/dev/new_layer/new_layer.rst
+++ b/doc/dev/new_layer/new_layer.rst
@@ -60,7 +60,7 @@ Implement C++ Class
 The C++ class of the layer implements the initialization, forward, and backward part of the layer. The fully connected layer is at :code:`paddle/gserver/layers/FullyConnectedLayer.h` and :code:`paddle/gserver/layers/FullyConnectedLayer.cpp`. We list simplified version of the code below.
-It needs to derive the base class :code:`paddle::BaseLayer`, and it needs to override the following functions:
+It needs to derive the base class :code:`paddle::Layer`, and it needs to override the following functions:
 - constructor and destructor.
 - :code:`init` function. It is used to initialize the parameters and settings.