未验证 提交 0dbe8eb0 编写于 作者: S Shan Yi 提交者: GitHub

Merge pull request #13095 from shanyi15/0.14.0

fix_bookdoc_0.14.0
*.pyc
train.log
output
data/cifar-10-batches-py/
data/cifar-10-python.tar.gz
data/*.txt
data/*.list
data/mean.meta
......@@ -10,9 +10,9 @@
.. toctree::
:maxdepth: 2
image_classification/index.md
word2vec/index.md
recommender_system/index.md
understand_sentiment/index.md
label_semantic_roles/index.md
machine_translation/index.md
image_classification/README.cn.md
word2vec/README.cn.md
recommender_system/README.cn.md
understand_sentiment/README.cn.md
label_semantic_roles/README.cn.md
machine_translation/README.cn.md
data/train.list
data/test.*
data/conll05st-release.tar.gz
data/conll05st-release
data/predicate_dict
data/label_dict
data/word_dict
data/emb
data/feature
output
predict.res
train.log
data/wmt14
data/pre-wmt14
pretrained/wmt14_model
gen.log
gen_result
train.log
dataprovider_copy_1.py
*.pyc
multi-bleu.perl
data/aclImdb
data/imdb
data/pre-imdb
data/mosesdecoder-master
*.log
model_output
dataprovider_copy_1.py
model.list
*.pyc
.DS_Store
# 情感分析
本教程源代码目录在[book/understand_sentiment](https://github.com/PaddlePaddle/book/tree/develop/06.understand_sentiment), 初次使用请参考PaddlePaddle[安装教程](https://github.com/PaddlePaddle/book/blob/develop/README.cn.md#运行这本书)
本教程源代码目录在[book/understand_sentiment](https://github.com/PaddlePaddle/book/tree/develop/06.understand_sentiment), 初次使用请参考PaddlePaddle[安装教程](https://github.com/PaddlePaddle/book/blob/develop/README.cn.md#运行这本书),更多内容请参考本教程的[视频课堂](http://bit.baidu.com/course/detail/id/177.html)
## 背景介绍
......@@ -36,8 +36,8 @@
循环神经网络是一种能对序列数据进行精确建模的有力工具。实际上,循环神经网络的理论计算能力是图灵完备的\[[4](#参考文献)\]。自然语言是一种典型的序列数据(词序列),近年来,循环神经网络及其变体(如long short term memory\[[5](#参考文献)\]等)在自然语言处理的多个领域,如语言模型、句法解析、语义角色标注(或一般的序列标注)、语义表示、图文生成、对话、机器翻译等任务上均表现优异甚至成为目前效果最好的方法。
![rnn](./image/rnn.png)
<p align="center">
<img src="image/rnn.png" width = "60%" align="center"/><br/>
图1. 循环神经网络按时间展开的示意图
</p>
......@@ -65,8 +65,8 @@ $$ o_t = \sigma(W_{xo}x_t+W_{ho}h_{t-1}+W_{co}c_{t}+b_o) $$
$$ h_t = o_t\odot tanh(c_t) $$
其中,$i_t, f_t, c_t, o_t$分别表示输入门,遗忘门,记忆单元及输出门的向量值,带角标的$W$及$b$为模型参数,$tanh$为双曲正切函数,$\odot$表示逐元素(elementwise)的乘法操作。输入门控制着新输入进入记忆单元$c$的强度,遗忘门控制着记忆单元维持上一时刻值的强度,输出门控制着输出记忆单元的强度。三种门的计算方式类似,但有着完全不同的参数,它们各自以不同的方式控制着记忆单元$c$,如图2所示:
![lstm](./image/lstm.png)
<p align="center">
<img src="image/lstm.png" width = "65%" align="center"/><br/>
图2. 时刻$t$的LSTM [7]
</p>
......@@ -82,8 +82,8 @@ $$ h_t=Recrurent(x_t,h_{t-1})$$
如图3所示(以三层为例),奇数层LSTM正向,偶数层LSTM反向,高一层的LSTM使用低一层LSTM及之前所有层的信息作为输入,对最高层LSTM序列使用时间维度上的最大池化即可得到文本的定长向量表示(这一表示充分融合了文本的上下文信息,并且对文本进行了深层次抽象),最后我们将文本表示连接至softmax构建分类模型。
![stacked_lstm](./image/stacked_lstm.jpg)
<p align="center">
<img src="image/stacked_lstm.jpg" width=450><br/>
图3. 栈式双向LSTM用于文本分类
</p>
......@@ -94,11 +94,11 @@ $$ h_t=Recrurent(x_t,h_{t-1})$$
```text
aclImdb
|- test
|-- neg
|-- pos
|-- neg
|-- pos
|- train
|-- neg
|-- pos
|-- neg
|-- pos
```
Paddle在`dataset/imdb.py`中提实现了imdb数据集的自动下载和读取,并提供了读取字典、训练数据、测试数据等API。
......@@ -107,6 +107,7 @@ Paddle在`dataset/imdb.py`中提实现了imdb数据集的自动下载和读取
在该示例中,我们实现了两种文本分类算法,分别基于[推荐系统](https://github.com/PaddlePaddle/book/tree/develop/05.recommender_system)一节介绍过的文本卷积神经网络,以及[栈式双向LSTM](#栈式双向LSTM(Stacked Bidirectional LSTM))。我们首先引入要用到的库和定义全局变量:
```python
from __future__ import print_function
import paddle
import paddle.fluid as fluid
from functools import partial
......@@ -115,6 +116,7 @@ import numpy as np
CLASS_DIM = 2
EMB_DIM = 128
HID_DIM = 512
STACKED_NUM = 3
BATCH_SIZE = 128
USE_GPU = False
```
......@@ -126,23 +128,23 @@ USE_GPU = False
```python
def convolution_net(data, input_dim, class_dim, emb_dim, hid_dim):
emb = fluid.layers.embedding(
input=data, size=[input_dim, emb_dim], is_sparse=True)
conv_3 = fluid.nets.sequence_conv_pool(
input=emb,
num_filters=hid_dim,
filter_size=3,
act="tanh",
pool_type="sqrt")
conv_4 = fluid.nets.sequence_conv_pool(
input=emb,
num_filters=hid_dim,
filter_size=4,
act="tanh",
pool_type="sqrt")
prediction = fluid.layers.fc(
input=[conv_3, conv_4], size=class_dim, act="softmax")
return prediction
emb = fluid.layers.embedding(
input=data, size=[input_dim, emb_dim], is_sparse=True)
conv_3 = fluid.nets.sequence_conv_pool(
input=emb,
num_filters=hid_dim,
filter_size=3,
act="tanh",
pool_type="sqrt")
conv_4 = fluid.nets.sequence_conv_pool(
input=emb,
num_filters=hid_dim,
filter_size=4,
act="tanh",
pool_type="sqrt")
prediction = fluid.layers.fc(
input=[conv_3, conv_4], size=class_dim, act="softmax")
return prediction
```
网络的输入`input_dim`表示的是词典的大小,`class_dim`表示类别数。这里,我们使用[`sequence_conv_pool`](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/trainer_config_helpers/networks.py) API实现了卷积和池化操作。
......@@ -154,27 +156,26 @@ return prediction
```python
def stacked_lstm_net(data, input_dim, class_dim, emb_dim, hid_dim, stacked_num):
emb = fluid.layers.embedding(
input=data, size=[input_dim, emb_dim], is_sparse=True)
emb = fluid.layers.embedding(
input=data, size=[input_dim, emb_dim], is_sparse=True)
fc1 = fluid.layers.fc(input=emb, size=hid_dim)
lstm1, cell1 = fluid.layers.dynamic_lstm(input=fc1, size=hid_dim)
fc1 = fluid.layers.fc(input=emb, size=hid_dim)
lstm1, cell1 = fluid.layers.dynamic_lstm(input=fc1, size=hid_dim)
inputs = [fc1, lstm1]
inputs = [fc1, lstm1]
for i in range(2, stacked_num + 1):
fc = fluid.layers.fc(input=inputs, size=hid_dim)
lstm, cell = fluid.layers.dynamic_lstm(
input=fc, size=hid_dim, is_reverse=(i % 2) == 0)
inputs = [fc, lstm]
for i in range(2, stacked_num + 1):
fc = fluid.layers.fc(input=inputs, size=hid_dim)
lstm, cell = fluid.layers.dynamic_lstm(
input=fc, size=hid_dim, is_reverse=(i % 2) == 0)
inputs = [fc, lstm]
fc_last = fluid.layers.sequence_pool(input=inputs[0], pool_type='max')
lstm_last = fluid.layers.sequence_pool(input=inputs[1], pool_type='max')
fc_last = fluid.layers.sequence_pool(input=inputs[0], pool_type='max')
lstm_last = fluid.layers.sequence_pool(input=inputs[1], pool_type='max')
prediction = fluid.layers.fc(input=[fc_last, lstm_last],
size=class_dim,
act='softmax')
return prediction
prediction = fluid.layers.fc(
input=[fc_last, lstm_last], size=class_dim, act='softmax')
return prediction
```
以上的栈式双向LSTM抽象出了高级特征并把其映射到和分类类别数同样大小的向量上。`paddle.activation.Softmax`函数用来计算分类属于某个类别的概率。
......@@ -184,12 +185,13 @@ return prediction
```python
def inference_program(word_dict):
data = fluid.layers.data(
name="words", shape=[1], dtype="int64", lod_level=1)
data = fluid.layers.data(
name="words", shape=[1], dtype="int64", lod_level=1)
dict_dim = len(word_dict)
net = convolution_net(data, dict_dim, CLASS_DIM, EMB_DIM, HID_DIM)
return net
dict_dim = len(word_dict)
net = convolution_net(data, dict_dim, CLASS_DIM, EMB_DIM, HID_DIM)
# net = stacked_lstm_net(data, dict_dim, CLASS_DIM, EMB_DIM, HID_DIM, STACKED_NUM)
return net
```
我们这里定义了`training_program`。它使用了从`inference_program`返回的结果来计算误差。我们同时定义了优化函数`optimizer_func`
......@@ -200,16 +202,16 @@ return net
```python
def train_program(word_dict):
prediction = inference_program(word_dict)
label = fluid.layers.data(name="label", shape=[1], dtype="int64")
cost = fluid.layers.cross_entropy(input=prediction, label=label)
avg_cost = fluid.layers.mean(cost)
accuracy = fluid.layers.accuracy(input=prediction, label=label)
return [avg_cost, accuracy]
prediction = inference_program(word_dict)
label = fluid.layers.data(name="label", shape=[1], dtype="int64")
cost = fluid.layers.cross_entropy(input=prediction, label=label)
avg_cost = fluid.layers.mean(cost)
accuracy = fluid.layers.accuracy(input=prediction, label=label)
return [avg_cost, accuracy]
def optimizer_func():
return fluid.optimizer.Adagrad(learning_rate=0.002)
return fluid.optimizer.Adagrad(learning_rate=0.002)
```
## 训练模型
......@@ -236,9 +238,9 @@ word_dict = paddle.dataset.imdb.word_dict()
print ("Reading training data....")
train_reader = paddle.batch(
paddle.reader.shuffle(
paddle.dataset.imdb.train(word_dict), buf_size=25000),
batch_size=BATCH_SIZE)
paddle.reader.shuffle(
paddle.dataset.imdb.train(word_dict), buf_size=25000),
batch_size=BATCH_SIZE)
```
### 构造训练器(trainer)
......@@ -246,9 +248,9 @@ batch_size=BATCH_SIZE)
```python
trainer = fluid.Trainer(
train_func=partial(train_program, word_dict),
place=place,
optimizer_func=optimizer_func)
train_func=partial(train_program, word_dict),
place=place,
optimizer_func=optimizer_func)
```
### 提供数据
......@@ -268,13 +270,13 @@ feed_order = ['words', 'label']
params_dirname = "understand_sentiment_conv.inference.model"
def event_handler(event):
if isinstance(event, fluid.EndStepEvent):
print("Step {0}, Epoch {1} Metrics {2}".format(
event.step, event.epoch, map(np.array, event.metrics)))
if isinstance(event, fluid.EndStepEvent):
print("Step {0}, Epoch {1} Metrics {2}".format(
event.step, event.epoch, map(np.array, event.metrics)))
if event.step == 10:
trainer.save_params(params_dirname)
trainer.stop()
if event.step == 10:
trainer.save_params(params_dirname)
trainer.stop()
```
### 开始训练
......@@ -283,10 +285,10 @@ trainer.stop()
```python
trainer.train(
num_epochs=1,
event_handler=event_handler,
reader=train_reader,
feed_order=feed_order)
num_epochs=1,
event_handler=event_handler,
reader=train_reader,
feed_order=feed_order)
```
## 应用模型
......@@ -297,7 +299,7 @@ feed_order=feed_order)
```python
inferencer = fluid.Inferencer(
inference_program, param_path=params_dirname, place=place)
infer_func=partial(inference_program, word_dict), param_path=params_dirname, place=place)
```
### 生成测试用输入数据
......@@ -307,14 +309,14 @@ inference_program, param_path=params_dirname, place=place)
```python
reviews_str = [
'read the book forget the movie', 'this is a great movie', 'this is very bad'
'read the book forget the movie', 'this is a great movie', 'this is very bad'
]
reviews = [c.split() for c in reviews_str]
UNK = word_dict['<unk>']
lod = []
for c in reviews:
lod.append([word_dict.get(words, UNK) for words in c])
lod.append([word_dict.get(words, UNK) for words in c])
base_shape = [[len(c) for c in lod]]
......@@ -329,7 +331,7 @@ tensor_words = fluid.create_lod_tensor(lod, base_shape, place)
results = inferencer.infer({'words': tensor_words})
for i, r in enumerate(results[0]):
print("Predict probability of ", r[0], " to be positive and ", r[1], " to be negative for review \'", reviews_str[i], "\'")
print("Predict probability of ", r[0], " to be positive and ", r[1], " to be negative for review \'", reviews_str[i], "\'")
```
......
data/train.list
data/test.list
data/simple-examples*
......@@ -231,11 +231,13 @@ trainer.train(
event_handler=event_handler_plot,
feed_order=feed_order)
```
<p align="center">
<img src = "image/train_and_test.png" width=400><br/>
<img src = "image/train_and_test1.png" width=400><br/>
图3. 训练结果
</p>
## 预测
提供一个`inference_program`和一个`params_dirname`来初始化预测器。`params_dirname`用来存储我们的参数。
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册