Merge pull request #289 from hedaoyuan/sentiment_prediction

Prediction for understand sentiment

Merge pull request #289 from hedaoyuan/sentiment_prediction
Prediction for understand sentiment
469c1eaa · hedaoyuan · GitHub · 49442aff · d662c9fe · 469c1eaa
隐藏空白更改
内联并排

Showing with 124 addition and 20 deletion

05.understand_sentiment/README.md 05.understand_sentiment/README.md +62 -10

05.understand_sentiment/index.html 05.understand_sentiment/index.html +62 -10

未找到文件。
--- a/05.understand_sentiment/README.md
+++ b/05.understand_sentiment/README.md
@@ -141,7 +141,8 @@ import paddle.v2 as paddle
 def convolution_net(input_dim,
                    class_dim=2,
                    emb_dim=128,
-                    hid_dim=128):
+                    hid_dim=128,
+                    is_predict=False):
    data = paddle.layer.data("word",
                             paddle.data_type.integer_value_sequence(input_dim))
    emb = paddle.layer.embedding(input=data, size=emb_dim)
@@ -152,9 +153,12 @@ def convolution_net(input_dim,
    output = paddle.layer.fc(input=[conv_3, conv_4],
                             size=class_dim,
                             act=paddle.activation.Softmax())
-    lbl = paddle.layer.data("label", paddle.data_type.integer_value(2))
-    cost = paddle.layer.classification_cost(input=output, label=lbl)
-    return cost
+    if not is_predict:
+        lbl = paddle.layer.data("label", paddle.data_type.integer_value(2))
+        cost = paddle.layer.classification_cost(input=output, label=lbl)
+        return cost
+    else:
+        return output
 ```
 网络的输入`input_dim`表示的是词典的大小，`class_dim`表示类别数。这里，我们使用[`sequence_conv_pool`](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/trainer_config_helpers/networks.py) API实现了卷积和池化操作。

@@ -165,7 +169,8 @@ def stacked_lstm_net(input_dim,
                     class_dim=2,
                     emb_dim=128,
                     hid_dim=512,
-                     stacked_num=3):
+                     stacked_num=3,
+                     is_predict=False):
    """
    A Wrapper for sentiment classification task.
    This network uses bi-directional recurrent network,
@@ -223,9 +228,12 @@ def stacked_lstm_net(input_dim,
                             bias_attr=bias_attr,
                             param_attr=para_attr)

-    lbl = paddle.layer.data("label", paddle.data_type.integer_value(2))
-    cost = paddle.layer.classification_cost(input=output, label=lbl)
-    return cost
+    if not is_predict:
+        lbl = paddle.layer.data("label", paddle.data_type.integer_value(2))
+        cost = paddle.layer.classification_cost(input=output, label=lbl)
+        return cost
+    else:
+        return output
 ```
 网络的输入`stacked_num`表示的是LSTM的层数，需要是奇数，确保最高层LSTM正向。Paddle里面是通过一个fc和一个lstmemory来实现基于LSTM的循环神经网络。

@@ -294,7 +302,7 @@ Paddle中提供了一系列优化算法的API，这里使用Adam优化算法。

 ### 训练

-可以通过`paddle.trainer.SGD`构造一个sgd trainer，并调用`trainer.train`来训练模型。
+可以通过`paddle.trainer.SGD`构造一个sgd trainer，并调用`trainer.train`来训练模型。另外，通过给train函数传递一个`event_handler`来获取每个batch和每个pass结束的状态。
 ```python
    # End batch and end pass event handler
    def event_handler(event):
@@ -309,7 +317,21 @@ Paddle中提供了一系列优化算法的API，这里使用Adam优化算法。
            result = trainer.test(reader=test_reader, feeding=feeding)
            print "\nTest with Pass %d, %s" % (event.pass_id, result.metrics)
 ```
-可以通过给train函数传递一个`event_handler`来获取每个batch和每个pass结束的状态。比如构造如下一个`event_handler`可以在每100个batch结束后输出cost和error；在每个pass结束后调用`trainer.test`计算一遍测试集并获得当前模型在测试集上的error。
+比如，构造如下一个`event_handler`可以在每100个batch结束后输出cost和error；在每个pass结束后调用`trainer.test`计算一遍测试集并获得当前模型在测试集上的error。
+```python
+    from paddle.v2.plot import Ploter
+
+    train_title = "Train cost"
+    cost_ploter = Ploter(train_title)
+    step = 0
+    def event_handler_plot(event):
+        global step
+        if isinstance(event, paddle.event.EndIteration):
+            cost_ploter.append(train_title, step, event.cost)
+            cost_ploter.plot()
+            step += 1
+```
+或者构造一个`event_handler_plot`画出cost曲线。
 ```python
    # create trainer
    trainer = paddle.trainer.SGD(cost=cost,
@@ -331,6 +353,36 @@ Pass 0, Batch 100, Cost 0.294321, {'classification_error_evaluator': 0.1015625}
 Test with Pass 0, {'classification_error_evaluator': 0.11432000249624252}
 ```

+## 应用模型
+
+可以使用训练好的模型对电影评论进行分类，下面程序展示了如何使用`paddle.infer`接口进行推断。
+```python
+    import numpy as np
+
+    # Movie Reviews, from imdb test
+    reviews = [
+        'Read the book, forget the movie!',
+        'This is a great movie.'
+    ]
+    reviews = [c.split() for c in reviews]
+
+    UNK = word_dict['<unk>']
+    input = []
+    for c in reviews:
+        input.append([[word_dict.get(words, UNK) for words in c]])
+
+    # 0 stands for positive sample, 1 stands for negative sample
+    label = {0:'pos', 1:'neg'}
+    # Use the network used by trainer
+    out = convolution_net(dict_dim, class_dim=class_dim, is_predict=True)
+    # out = stacked_lstm_net(dict_dim, class_dim=class_dim, stacked_num=3, is_predict=True)
+    probs = paddle.infer(output_layer=out, parameters=parameters, input=input)
+
+    labs = np.argsort(-probs)
+    for idx, lab in enumerate(labs):
+        print idx, "predicting probability is", probs[idx], "label is", label[lab[0]]
+```
+
 ## 总结

 本章我们以情感分析为例，介绍了使用深度学习的方法进行端对端的短文本分类，并且使用PaddlePaddle完成了全部相关实验。同时，我们简要介绍了两种文本处理模型：卷积神经网络和循环神经网络。在后续的章节中我们会看到这两种基本的深度学习模型在其它任务上的应用。

--- a/05.understand_sentiment/index.html
+++ b/05.understand_sentiment/index.html
@@ -183,7 +183,8 @@ import paddle.v2 as paddle
 def convolution_net(input_dim,
                    class_dim=2,
                    emb_dim=128,
-                    hid_dim=128):
+                    hid_dim=128,
+                    is_predict=False):
    data = paddle.layer.data("word",
                             paddle.data_type.integer_value_sequence(input_dim))
    emb = paddle.layer.embedding(input=data, size=emb_dim)
@@ -194,9 +195,12 @@ def convolution_net(input_dim,
    output = paddle.layer.fc(input=[conv_3, conv_4],
                             size=class_dim,
                             act=paddle.activation.Softmax())
-    lbl = paddle.layer.data("label", paddle.data_type.integer_value(2))
-    cost = paddle.layer.classification_cost(input=output, label=lbl)
-    return cost
+    if not is_predict:
+        lbl = paddle.layer.data("label", paddle.data_type.integer_value(2))
+        cost = paddle.layer.classification_cost(input=output, label=lbl)
+        return cost
+    else:
+        return output
 ```
 网络的输入`input_dim`表示的是词典的大小，`class_dim`表示类别数。这里，我们使用[`sequence_conv_pool`](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/trainer_config_helpers/networks.py) API实现了卷积和池化操作。

@@ -207,7 +211,8 @@ def stacked_lstm_net(input_dim,
                     class_dim=2,
                     emb_dim=128,
                     hid_dim=512,
-                     stacked_num=3):
+                     stacked_num=3,
+                     is_predict=False):
    """
    A Wrapper for sentiment classification task.
    This network uses bi-directional recurrent network,
@@ -265,9 +270,12 @@ def stacked_lstm_net(input_dim,
                             bias_attr=bias_attr,
                             param_attr=para_attr)

-    lbl = paddle.layer.data("label", paddle.data_type.integer_value(2))
-    cost = paddle.layer.classification_cost(input=output, label=lbl)
-    return cost
+    if not is_predict:
+        lbl = paddle.layer.data("label", paddle.data_type.integer_value(2))
+        cost = paddle.layer.classification_cost(input=output, label=lbl)
+        return cost
+    else:
+        return output
 ```
 网络的输入`stacked_num`表示的是LSTM的层数，需要是奇数，确保最高层LSTM正向。Paddle里面是通过一个fc和一个lstmemory来实现基于LSTM的循环神经网络。

@@ -336,7 +344,7 @@ Paddle中提供了一系列优化算法的API，这里使用Adam优化算法。

 ### 训练

-可以通过`paddle.trainer.SGD`构造一个sgd trainer，并调用`trainer.train`来训练模型。
+可以通过`paddle.trainer.SGD`构造一个sgd trainer，并调用`trainer.train`来训练模型。另外，通过给train函数传递一个`event_handler`来获取每个batch和每个pass结束的状态。
 ```python
    # End batch and end pass event handler
    def event_handler(event):
@@ -351,7 +359,21 @@ Paddle中提供了一系列优化算法的API，这里使用Adam优化算法。
            result = trainer.test(reader=test_reader, feeding=feeding)
            print "\nTest with Pass %d, %s" % (event.pass_id, result.metrics)
 ```
-可以通过给train函数传递一个`event_handler`来获取每个batch和每个pass结束的状态。比如构造如下一个`event_handler`可以在每100个batch结束后输出cost和error；在每个pass结束后调用`trainer.test`计算一遍测试集并获得当前模型在测试集上的error。
+比如，构造如下一个`event_handler`可以在每100个batch结束后输出cost和error；在每个pass结束后调用`trainer.test`计算一遍测试集并获得当前模型在测试集上的error。
+```python
+    from paddle.v2.plot import Ploter
+
+    train_title = "Train cost"
+    cost_ploter = Ploter(train_title)
+    step = 0
+    def event_handler_plot(event):
+        global step
+        if isinstance(event, paddle.event.EndIteration):
+            cost_ploter.append(train_title, step, event.cost)
+            cost_ploter.plot()
+            step += 1
+```
+或者构造一个`event_handler_plot`画出cost曲线。
 ```python
    # create trainer
    trainer = paddle.trainer.SGD(cost=cost,
@@ -373,6 +395,36 @@ Pass 0, Batch 100, Cost 0.294321, {'classification_error_evaluator': 0.1015625}
 Test with Pass 0, {'classification_error_evaluator': 0.11432000249624252}
 ```

+## 应用模型
+
+可以使用训练好的模型对电影评论进行分类，下面程序展示了如何使用`paddle.infer`接口进行推断。
+```python
+    import numpy as np
+
+    # Movie Reviews, from imdb test
+    reviews = [
+        'Read the book, forget the movie!',
+        'This is a great movie.'
+    ]
+    reviews = [c.split() for c in reviews]
+
+    UNK = word_dict['<unk>']
+    input = []
+    for c in reviews:
+        input.append([[word_dict.get(words, UNK) for words in c]])
+
+    # 0 stands for positive sample, 1 stands for negative sample
+    label = {0:'pos', 1:'neg'}
+    # Use the network used by trainer
+    out = convolution_net(dict_dim, class_dim=class_dim, is_predict=True)
+    # out = stacked_lstm_net(dict_dim, class_dim=class_dim, stacked_num=3, is_predict=True)
+    probs = paddle.infer(output_layer=out, parameters=parameters, input=input)
+
+    labs = np.argsort(-probs)
+    for idx, lab in enumerate(labs):
+        print idx, "predicting probability is", probs[idx], "label is", label[lab[0]]
+```
+
 ## 总结

 本章我们以情感分析为例，介绍了使用深度学习的方法进行端对端的短文本分类，并且使用PaddlePaddle完成了全部相关实验。同时，我们简要介绍了两种文本处理模型：卷积神经网络和循环神经网络。在后续的章节中我们会看到这两种基本的深度学习模型在其它任务上的应用。