Unverified commit 6c782c88, author: lujun, committer: GitHub

Merge pull request #1 from PaddlePaddle/develop

Merge code
...@@ -103,17 +103,9 @@ $$MSE=\frac{1}{n}\sum_{i=1}^{n}{(\hat{Y_i}-Y_i)}^2$$ ...@@ -103,17 +103,9 @@ $$MSE=\frac{1}{n}\sum_{i=1}^{n}{(\hat{Y_i}-Y_i)}^2$$
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import numpy import numpy
import math
import sys
from __future__ import print_function from __future__ import print_function
try:
from paddle.fluid.contrib.trainer import *
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
``` ```
We load the [UCI Housing Data Set](https://archive.ics.uci.edu/ml/datasets/Housing) through the uci_housing module. We load the [UCI Housing Data Set](https://archive.ics.uci.edu/ml/datasets/Housing) through the uci_housing module.
...@@ -123,7 +115,7 @@ except ImportError: ...@@ -123,7 +115,7 @@ except ImportError:
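1. The data download step. The downloaded data is saved to ~/.cache/paddle/dataset/uci_housing/housing.data. 1. The data download step. The downloaded data is saved to ~/.cache/paddle/dataset/uci_housing/housing.data.
2. The [data preprocessing](#数据预处理) step. 2. The [data preprocessing](#数据预处理) step.
Next we define the data readers used for training and testing. A reader reads one batch of `BATCH_SIZE` samples at a time. If the user wants some randomness, she can define both a batch size and a buffer size; the reader then draws a batch-size number of samples at random from the buffer each time. Next we define the data reader used for training. A reader reads one batch of `BATCH_SIZE` samples at a time. If the user wants some randomness, it can define both a batch size and a buffer size; the reader then draws a batch-size number of samples at random from the buffer each time.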
```python ```python
BATCH_SIZE = 20 BATCH_SIZE = 20
...@@ -143,17 +135,15 @@ test_reader = paddle.batch( ...@@ -143,17 +135,15 @@ test_reader = paddle.batch(
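The purpose of the training program is to define the network structure of the model to train. For linear regression, it is simply a fully connected layer from input to output. More complex structures, such as convolutional and recurrent neural networks, are introduced in later chapters. The training program must return the `mean loss` as its first return value, because the backpropagation algorithm uses it later. The purpose of the training program is to define the network structure of the model to train. For linear regression, it is simply a fully connected layer from input to output. More complex structures, such as convolutional and recurrent neural networks, are introduced in later chapters. The training program must return the `mean loss` as its first return value, because the backpropagation algorithm uses it later.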
```python ```python
def train_program(): x = fluid.layers.data(name='x', shape=[13], dtype='float32')
y = fluid.layers.data(name='y', shape=[1], dtype='float32') y = fluid.layers.data(name='y', shape=[1], dtype='float32')
y_predict = fluid.layers.fc(input=x, size=1, act=None)
# feature vector of length 13
x = fluid.layers.data(name='x', shape=[13], dtype='float32')
y_predict = fluid.layers.fc(input=x, size=1, act=None)
loss = fluid.layers.square_error_cost(input=y_predict, label=y) main_program = fluid.default_main_program()
avg_loss = fluid.layers.mean(loss) startup_program = fluid.default_startup_program()
return avg_loss cost = fluid.layers.square_error_cost(input=y_predict, label=y)
avg_loss = fluid.layers.mean(cost)
``` ```
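The network described above, a single-output fully connected layer followed by a mean squared error, can be sketched without PaddlePaddle. This is only an illustration under assumptions: `fc_predict`, `mean_square_error`, and the toy batch are hypothetical names and data, not part of the Fluid API.

```python
def fc_predict(x, weights, bias):
    """Single-output fully connected layer with no activation: y_hat = w . x + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def mean_square_error(predictions, labels):
    """MSE = (1/n) * sum((y_hat_i - y_i)^2), matching the formula in the tutorial."""
    n = len(labels)
    return sum((p - y) ** 2 for p, y in zip(predictions, labels)) / n


# Hypothetical toy batch: 2 samples with 3 features each (the tutorial uses 13).
batch_x = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]
batch_y = [14.0, 3.0]
weights = [1.0, 2.0, 3.0]
bias = 0.0

preds = [fc_predict(x, weights, bias) for x in batch_x]
avg_loss = mean_square_error(preds, batch_y)
print(preds, avg_loss)
```

In the tutorial code the same roles are played by `fluid.layers.fc`, `fluid.layers.square_error_cost`, and `fluid.layers.mean`.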
### Optimizer Function Configuration ### Optimizer Function Configuration
...@@ -161,8 +151,11 @@ def train_program(): ...@@ -161,8 +151,11 @@ def train_program():
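In the `SGD optimizer` below, `learning_rate` is the training step size, which affects how fast the network converges. In the `SGD optimizer` below, `learning_rate` is the training step size, which affects how fast the network converges.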
```python ```python
def optimizer_program(): sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.001)
return fluid.optimizer.SGD(learning_rate=0.001) sgd_optimizer.minimize(avg_loss)
#clone a test_program
test_program = main_program.clone(for_test=True)
``` ```
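To make the role of `learning_rate` concrete, here is a minimal sketch of one gradient descent step on the mean squared error of a linear model. It is plain Python under stated assumptions, not the implementation behind `fluid.optimizer.SGD`; `sgd_step` and the toy data are hypothetical.

```python
def sgd_step(weights, bias, batch_x, batch_y, learning_rate):
    """One gradient descent step on the MSE of y_hat = w . x + b."""
    n = len(batch_y)
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for x, y in zip(batch_x, batch_y):
        err = sum(w * xi for w, xi in zip(weights, x)) + bias - y
        # d/dw_j of (1/n) * sum(err^2) is (2/n) * sum(err * x_j)
        for j, xi in enumerate(x):
            grad_w[j] += 2.0 * err * xi / n
        grad_b += 2.0 * err / n
    new_w = [w - learning_rate * g for w, g in zip(weights, grad_w)]
    new_b = bias - learning_rate * grad_b
    return new_w, new_b


def mse(weights, bias, batch_x, batch_y):
    """Mean squared error of the linear model on one batch."""
    errs = [sum(w * xi for w, xi in zip(weights, x)) + bias - y
            for x, y in zip(batch_x, batch_y)]
    return sum(e * e for e in errs) / len(errs)


# Hypothetical data drawn from y = 2x.
xs = [[1.0], [2.0]]
ys = [2.0, 4.0]
loss_before = mse([0.0], 0.0, xs, ys)
new_w, new_b = sgd_step([0.0], 0.0, xs, ys, learning_rate=0.1)
loss_after = mse(new_w, new_b, xs, ys)
print(loss_before, new_w, new_b, loss_after)
```

A larger `learning_rate` moves the parameters further per step, which can speed up convergence but can also overshoot and diverge.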
### Define the Compute Place ### Define the Compute Place
...@@ -171,113 +164,129 @@ def optimizer_program(): ...@@ -171,113 +164,129 @@ def optimizer_program():
```python ```python
use_cuda = False use_cuda = False
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace() place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
exe = fluid.Executor(place)
``` ```
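### Create the Trainer In addition, the `training progress` can be shown by plotting:
The trainer takes a training program and some other required parameters: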
```python ```python
trainer = Trainer( # Plot data
train_func=train_program, from paddle.utils.plot import Ploter
place=place,
optimizer_func=optimizer_program) train_prompt = "Train cost"
test_prompt = "Test cost"
plot_cost = Ploter(train_prompt, test_prompt)
``` ```
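### Start Feeding Data ### Create the Training Process
PaddlePaddle provides a reader-generator mechanism for reading training data. A reader yields several columns of data at a time, so we need a Python list to define the feed order. Training needs a training program and some required parameters; we also build a function that computes the test error during training.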
```python ```python
feed_order=['x', 'y'] num_epochs = 100
# For training test cost
def train_test(executor, program, reader, feeder, fetch_list):
accumulated = 1 * [0]
count = 0
for data_test in reader():
outs = executor.run(program=program,
feed=feeder.feed(data_test),
fetch_list=fetch_list)
accumulated = [x_c[0] + x_c[1][0] for x_c in zip(accumulated, outs)]
count += 1
return [x_d / count for x_d in accumulated]
``` ```
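The accumulate-then-divide pattern used by `train_test` can be tried in isolation. The sketch below replaces the executor with a hypothetical `fake_run` callable, so it only illustrates the averaging logic, not how Fluid evaluates a program.

```python
def average_over_batches(batches, run_batch):
    """Average each metric over all batches (every batch counts equally)."""
    accumulated = None
    count = 0
    for batch in batches:
        outs = run_batch(batch)  # one value per fetched metric
        if accumulated is None:
            accumulated = [0.0] * len(outs)
        accumulated = [a + o for a, o in zip(accumulated, outs)]
        count += 1
    if count == 0:
        raise ValueError("reader produced no batches")
    return [a / count for a in accumulated]


def fake_run(batch):
    # Hypothetical stand-in for executor.run: mean squared value of the batch.
    return [sum(v * v for v in batch) / len(batch)]


result = average_over_batches([[1.0, 1.0], [2.0, 2.0], [3.0]], fake_run)
print(result)
```

Note the design choice: the result is a mean of per-batch means, so a smaller final batch carries the same weight as a full one.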
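In addition, an event handler can be defined to handle events such as `printing training progress`: ### Main Training Loop
PaddlePaddle provides a reader-generator mechanism for reading training data. A reader yields several columns of data at a time, so we need a Python list to define the feed order. We build a loop that trains until the result is good enough or the number of iterations is large enough.
If training goes well, the trained parameters can be saved to `params_dirname`.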
```python ```python
%matplotlib inline
# Specify the directory to save the parameters # Specify the directory to save the parameters
params_dirname = "fit_a_line.inference.model" params_dirname = "fit_a_line.inference.model"
feeder = fluid.DataFeeder(place=place, feed_list=[x, y])
naive_exe = fluid.Executor(place)
train_title = "Train cost" naive_exe.run(startup_program)
test_title = "Test cost"
step = 0 step = 0
# event_handler prints training and testing info exe_test = fluid.Executor(place)
def event_handler(event):
global step
if isinstance(event, EndStepEvent):
if step % 10 == 0: # record a train cost every 10 batches
print("%s, Step %d, Cost %f" % (train_title, step, event.metrics[0]))
# main train loop.
for pass_id in range(num_epochs):
for data_train in train_reader():
avg_loss_value, = exe.run(main_program,
feed=feeder.feed(data_train),
fetch_list=[avg_loss])
if step % 10 == 0: # record a train cost every 10 batches
plot_cost.append(train_prompt, step, avg_loss_value[0])
plot_cost.plot()
if step % 100 == 0: # record a test cost every 100 batches if step % 100 == 0: # record a test cost every 100 batches
test_metrics = trainer.test( test_metics = train_test(executor=exe_test,
reader=test_reader, feed_order=feed_order) program=test_program,
print("%s, Step %d, Cost %f" % (test_title, step, test_metrics[0])) reader=test_reader,
fetch_list=[avg_loss.name],
if test_metrics[0] < 10.0: feeder=feeder)
plot_cost.append(test_prompt, step, test_metics[0])
plot_cost.plot()
# If the accuracy is good enough, we can stop the training. # If the accuracy is good enough, we can stop the training.
print('loss is less than 10.0, stop') if test_metics[0] < 10.0:
trainer.stop() break
step += 1 step += 1
if isinstance(event, EndEpochEvent): if math.isnan(float(avg_loss_value[0])):
if event.epoch % 10 == 0: sys.exit("got NaN loss, training failed.")
# We can save the trained parameters for the inferences later
if params_dirname is not None: if params_dirname is not None:
trainer.save_params(params_dirname) # We can save the trained parameters for the inferences later
``` fluid.io.save_inference_model(params_dirname, ['x'],
[y_predict], exe)
### Start Training
We can now start training by calling `trainer.train()`.
```python
%matplotlib inline
# The training could take up to a few minutes.
trainer.train(
reader=train_reader,
num_epochs=100,
event_handler=event_handler,
feed_order=feed_order)
``` ```
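The control flow of such a training loop (test periodically, stop once the test loss is low enough, abort on NaN) can be sketched independently of PaddlePaddle. Everything here is an assumption made for illustration: `train_with_early_stop`, the precomputed per-batch losses, and the `evaluate` callable are hypothetical stand-ins for running an executor. Unlike a bare `break`, which leaves only the inner loop, this sketch returns from the function so that both loops end together.

```python
import math


def train_with_early_stop(epochs, evaluate, threshold, eval_every):
    """epochs: list of lists of per-batch train losses (stand-in for real training)."""
    step = 0
    for epoch_id, batch_losses in enumerate(epochs):
        for train_loss in batch_losses:
            if math.isnan(train_loss):
                raise RuntimeError("got NaN loss at step %d" % step)
            if step % eval_every == 0 and evaluate(step) < threshold:
                return epoch_id, step  # leaves both loops at once
            step += 1
    return None, step


# Hypothetical losses; the test loss falls by 10 per step, starting at 45.
epochs = [[50.0, 30.0, 20.0], [12.0, 9.0, 8.0]]
stopped = train_with_early_stop(
    epochs, evaluate=lambda step: 45.0 - 10.0 * step, threshold=10.0, eval_every=2)
print(stopped)
```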
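## Prediction ## Prediction
Provide an `inference_program` and a `params_dirname` to initialize the inferencer. `params_dirname` is where our parameters are stored. We need to build a program that runs inference with the trained parameters, which are located in `params_dirname`.
### Set Up the Inference Program
Similar to `trainer.train`, the inferencer needs an inference program to make predictions. We can slightly modify our training program to include the predicted value.
### Prepare the Inference Environment
Similar to the training process, the inferencer needs an inference program to make predictions. We can slightly modify our training program to include the predicted value.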
```python ```python
def inference_program(): infer_exe = fluid.Executor(place)
x = fluid.layers.data(name='x', shape=[13], dtype='float32') inference_scope = fluid.core.Scope()
y_predict = fluid.layers.fc(input=x, size=1, act=None)
return y_predict
``` ```
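### Prediction ### Prediction
The inferencer loads the trained model from `params_dirname` and predicts on data it has never seen. Through fluid.io.load_inference_model, the inferencer loads the trained model from `params_dirname` and predicts on data it has never seen.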
```python ```python
inferencer = Inferencer( with fluid.scope_guard(inference_scope):
infer_func=inference_program, param_path=params_dirname, place=place) [inference_program, feed_target_names,
fetch_targets] = fluid.io.load_inference_model(params_dirname, infer_exe)
batch_size = 10 batch_size = 10
test_reader = paddle.batch(paddle.dataset.uci_housing.test(),batch_size=batch_size)
test_data = next(test_reader()) infer_reader = paddle.batch(
test_x = numpy.array([data[0] for data in test_data]).astype("float32") paddle.dataset.uci_housing.test(), batch_size=batch_size)
test_y = numpy.array([data[1] for data in test_data]).astype("float32")
infer_data = next(infer_reader())
results = inferencer.infer({'x': test_x}) infer_feat = numpy.array(
[data[0] for data in infer_data]).astype("float32")
print("infer results: (House Price)") infer_label = numpy.array(
for idx, val in enumerate(results[0]): [data[1] for data in infer_data]).astype("float32")
assert feed_target_names[0] == 'x'
results = infer_exe.run(inference_program,
feed={feed_target_names[0]: numpy.array(infer_feat)},
fetch_list=fetch_targets)
print("infer results: (House Price)")
for idx, val in enumerate(results[0]):
print("%d: %.2f" % (idx, val)) print("%d: %.2f" % (idx, val))
print("\nground truth:") print("\nground truth:")
for idx, val in enumerate(test_y): for idx, val in enumerate(infer_label):
print("%d: %.2f" % (idx, val)) print("%d: %.2f" % (idx, val))
``` ```
......
...@@ -145,17 +145,9 @@ $$MSE=\frac{1}{n}\sum_{i=1}^{n}{(\hat{Y_i}-Y_i)}^2$$ ...@@ -145,17 +145,9 @@ $$MSE=\frac{1}{n}\sum_{i=1}^{n}{(\hat{Y_i}-Y_i)}^2$$
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import numpy import numpy
import math
import sys
from __future__ import print_function from __future__ import print_function
try:
from paddle.fluid.contrib.trainer import *
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
``` ```
We load the [UCI Housing Data Set](https://archive.ics.uci.edu/ml/datasets/Housing) through the uci_housing module. We load the [UCI Housing Data Set](https://archive.ics.uci.edu/ml/datasets/Housing) through the uci_housing module.
...@@ -165,7 +157,7 @@ except ImportError: ...@@ -165,7 +157,7 @@ except ImportError:
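1. The data download step. The downloaded data is saved to ~/.cache/paddle/dataset/uci_housing/housing.data. 1. The data download step. The downloaded data is saved to ~/.cache/paddle/dataset/uci_housing/housing.data.
2. The [data preprocessing](#数据预处理) step. 2. The [data preprocessing](#数据预处理) step.
Next we define the data readers used for training and testing. A reader reads one batch of `BATCH_SIZE` samples at a time. If the user wants some randomness, she can define both a batch size and a buffer size; the reader then draws a batch-size number of samples at random from the buffer each time. Next we define the data reader used for training. A reader reads one batch of `BATCH_SIZE` samples at a time. If the user wants some randomness, it can define both a batch size and a buffer size; the reader then draws a batch-size number of samples at random from the buffer each time.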
```python ```python
BATCH_SIZE = 20 BATCH_SIZE = 20
...@@ -185,17 +177,15 @@ test_reader = paddle.batch( ...@@ -185,17 +177,15 @@ test_reader = paddle.batch(
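The purpose of the training program is to define the network structure of the model to train. For linear regression, it is simply a fully connected layer from input to output. More complex structures, such as convolutional and recurrent neural networks, are introduced in later chapters. The training program must return the `mean loss` as its first return value, because the backpropagation algorithm uses it later. The purpose of the training program is to define the network structure of the model to train. For linear regression, it is simply a fully connected layer from input to output. More complex structures, such as convolutional and recurrent neural networks, are introduced in later chapters. The training program must return the `mean loss` as its first return value, because the backpropagation algorithm uses it later.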
```python ```python
def train_program(): x = fluid.layers.data(name='x', shape=[13], dtype='float32')
y = fluid.layers.data(name='y', shape=[1], dtype='float32') y = fluid.layers.data(name='y', shape=[1], dtype='float32')
y_predict = fluid.layers.fc(input=x, size=1, act=None)
# feature vector of length 13
x = fluid.layers.data(name='x', shape=[13], dtype='float32')
y_predict = fluid.layers.fc(input=x, size=1, act=None)
loss = fluid.layers.square_error_cost(input=y_predict, label=y) main_program = fluid.default_main_program()
avg_loss = fluid.layers.mean(loss) startup_program = fluid.default_startup_program()
return avg_loss cost = fluid.layers.square_error_cost(input=y_predict, label=y)
avg_loss = fluid.layers.mean(cost)
``` ```
### Optimizer Function Configuration ### Optimizer Function Configuration
...@@ -203,8 +193,11 @@ def train_program(): ...@@ -203,8 +193,11 @@ def train_program():
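In the `SGD optimizer` below, `learning_rate` is the training step size, which affects how fast the network converges. In the `SGD optimizer` below, `learning_rate` is the training step size, which affects how fast the network converges.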
```python ```python
def optimizer_program(): sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.001)
return fluid.optimizer.SGD(learning_rate=0.001) sgd_optimizer.minimize(avg_loss)
#clone a test_program
test_program = main_program.clone(for_test=True)
``` ```
### Define the Compute Place ### Define the Compute Place
...@@ -213,113 +206,129 @@ def optimizer_program(): ...@@ -213,113 +206,129 @@ def optimizer_program():
```python ```python
use_cuda = False use_cuda = False
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace() place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
exe = fluid.Executor(place)
``` ```
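### Create the Trainer In addition, the `training progress` can be shown by plotting:
The trainer takes a training program and some other required parameters: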
```python ```python
trainer = Trainer( # Plot data
train_func=train_program, from paddle.utils.plot import Ploter
place=place,
optimizer_func=optimizer_program) train_prompt = "Train cost"
test_prompt = "Test cost"
plot_cost = Ploter(train_prompt, test_prompt)
``` ```
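### Start Feeding Data ### Create the Training Process
PaddlePaddle provides a reader-generator mechanism for reading training data. A reader yields several columns of data at a time, so we need a Python list to define the feed order. Training needs a training program and some required parameters; we also build a function that computes the test error during training.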
```python ```python
feed_order=['x', 'y'] num_epochs = 100
# For training test cost
def train_test(executor, program, reader, feeder, fetch_list):
accumulated = 1 * [0]
count = 0
for data_test in reader():
outs = executor.run(program=program,
feed=feeder.feed(data_test),
fetch_list=fetch_list)
accumulated = [x_c[0] + x_c[1][0] for x_c in zip(accumulated, outs)]
count += 1
return [x_d / count for x_d in accumulated]
``` ```
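In addition, an event handler can be defined to handle events such as `printing training progress`: ### Main Training Loop
PaddlePaddle provides a reader-generator mechanism for reading training data. A reader yields several columns of data at a time, so we need a Python list to define the feed order. We build a loop that trains until the result is good enough or the number of iterations is large enough.
If training goes well, the trained parameters can be saved to `params_dirname`.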
```python ```python
%matplotlib inline
# Specify the directory to save the parameters # Specify the directory to save the parameters
params_dirname = "fit_a_line.inference.model" params_dirname = "fit_a_line.inference.model"
feeder = fluid.DataFeeder(place=place, feed_list=[x, y])
naive_exe = fluid.Executor(place)
train_title = "Train cost" naive_exe.run(startup_program)
test_title = "Test cost"
step = 0 step = 0
# event_handler prints training and testing info exe_test = fluid.Executor(place)
def event_handler(event):
global step
if isinstance(event, EndStepEvent):
if step % 10 == 0: # record a train cost every 10 batches
print("%s, Step %d, Cost %f" % (train_title, step, event.metrics[0]))
# main train loop.
for pass_id in range(num_epochs):
for data_train in train_reader():
avg_loss_value, = exe.run(main_program,
feed=feeder.feed(data_train),
fetch_list=[avg_loss])
if step % 10 == 0: # record a train cost every 10 batches
plot_cost.append(train_prompt, step, avg_loss_value[0])
plot_cost.plot()
if step % 100 == 0: # record a test cost every 100 batches if step % 100 == 0: # record a test cost every 100 batches
test_metrics = trainer.test( test_metics = train_test(executor=exe_test,
reader=test_reader, feed_order=feed_order) program=test_program,
print("%s, Step %d, Cost %f" % (test_title, step, test_metrics[0])) reader=test_reader,
fetch_list=[avg_loss.name],
if test_metrics[0] < 10.0: feeder=feeder)
plot_cost.append(test_prompt, step, test_metics[0])
plot_cost.plot()
# If the accuracy is good enough, we can stop the training. # If the accuracy is good enough, we can stop the training.
print('loss is less than 10.0, stop') if test_metics[0] < 10.0:
trainer.stop() break
step += 1 step += 1
if isinstance(event, EndEpochEvent): if math.isnan(float(avg_loss_value[0])):
if event.epoch % 10 == 0: sys.exit("got NaN loss, training failed.")
# We can save the trained parameters for the inferences later
if params_dirname is not None: if params_dirname is not None:
trainer.save_params(params_dirname) # We can save the trained parameters for the inferences later
``` fluid.io.save_inference_model(params_dirname, ['x'],
[y_predict], exe)
### Start Training
We can now start training by calling `trainer.train()`.
```python
%matplotlib inline
# The training could take up to a few minutes.
trainer.train(
reader=train_reader,
num_epochs=100,
event_handler=event_handler,
feed_order=feed_order)
``` ```
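## Prediction ## Prediction
Provide an `inference_program` and a `params_dirname` to initialize the inferencer. `params_dirname` is where our parameters are stored. We need to build a program that runs inference with the trained parameters, which are located in `params_dirname`.
### Set Up the Inference Program
Similar to `trainer.train`, the inferencer needs an inference program to make predictions. We can slightly modify our training program to include the predicted value.
### Prepare the Inference Environment
Similar to the training process, the inferencer needs an inference program to make predictions. We can slightly modify our training program to include the predicted value.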
```python ```python
def inference_program(): infer_exe = fluid.Executor(place)
x = fluid.layers.data(name='x', shape=[13], dtype='float32') inference_scope = fluid.core.Scope()
y_predict = fluid.layers.fc(input=x, size=1, act=None)
return y_predict
``` ```
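### Prediction ### Prediction
The inferencer loads the trained model from `params_dirname` and predicts on data it has never seen. Through fluid.io.load_inference_model, the inferencer loads the trained model from `params_dirname` and predicts on data it has never seen.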
```python ```python
inferencer = Inferencer( with fluid.scope_guard(inference_scope):
infer_func=inference_program, param_path=params_dirname, place=place) [inference_program, feed_target_names,
fetch_targets] = fluid.io.load_inference_model(params_dirname, infer_exe)
batch_size = 10 batch_size = 10
test_reader = paddle.batch(paddle.dataset.uci_housing.test(),batch_size=batch_size)
test_data = next(test_reader()) infer_reader = paddle.batch(
test_x = numpy.array([data[0] for data in test_data]).astype("float32") paddle.dataset.uci_housing.test(), batch_size=batch_size)
test_y = numpy.array([data[1] for data in test_data]).astype("float32")
infer_data = next(infer_reader())
results = inferencer.infer({'x': test_x}) infer_feat = numpy.array(
[data[0] for data in infer_data]).astype("float32")
print("infer results: (House Price)") infer_label = numpy.array(
for idx, val in enumerate(results[0]): [data[1] for data in infer_data]).astype("float32")
assert feed_target_names[0] == 'x'
results = infer_exe.run(inference_program,
feed={feed_target_names[0]: numpy.array(infer_feat)},
fetch_list=fetch_targets)
print("infer results: (House Price)")
for idx, val in enumerate(results[0]):
print("%d: %.2f" % (idx, val)) print("%d: %.2f" % (idx, val))
print("\nground truth:") print("\nground truth:")
for idx, val in enumerate(test_y): for idx, val in enumerate(infer_label):
print("%d: %.2f" % (idx, val)) print("%d: %.2f" % (idx, val))
``` ```
......
...@@ -13,122 +13,134 @@ ...@@ -13,122 +13,134 @@
# limitations under the License. # limitations under the License.
from __future__ import print_function from __future__ import print_function
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import numpy
import math
import sys import sys
try:
from paddle.fluid.contrib.trainer import *
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
import numpy # For training test cost
def train_test(executor, program, reader, feeder, fetch_list):
accumulated = 1 * [0]
count = 0
for data_test in reader():
outs = executor.run(
program=program, feed=feeder.feed(data_test), fetch_list=fetch_list)
accumulated = [x_c[0] + x_c[1][0] for x_c in zip(accumulated, outs)]
count += 1
return [x_d / count for x_d in accumulated]
BATCH_SIZE = 20
train_reader = paddle.batch( def main():
batch_size = 20
train_reader = paddle.batch(
paddle.reader.shuffle(paddle.dataset.uci_housing.train(), buf_size=500), paddle.reader.shuffle(paddle.dataset.uci_housing.train(), buf_size=500),
batch_size=BATCH_SIZE) batch_size=batch_size)
test_reader = paddle.batch(
test_reader = paddle.batch(
paddle.reader.shuffle(paddle.dataset.uci_housing.test(), buf_size=500), paddle.reader.shuffle(paddle.dataset.uci_housing.test(), buf_size=500),
batch_size=BATCH_SIZE) batch_size=batch_size)
def train_program():
y = fluid.layers.data(name='y', shape=[1], dtype='float32')
# feature vector of length 13 # feature vector of length 13
x = fluid.layers.data(name='x', shape=[13], dtype='float32') x = fluid.layers.data(name='x', shape=[13], dtype='float32')
y = fluid.layers.data(name='y', shape=[1], dtype='float32')
y_predict = fluid.layers.fc(input=x, size=1, act=None) y_predict = fluid.layers.fc(input=x, size=1, act=None)
loss = fluid.layers.square_error_cost(input=y_predict, label=y) main_program = fluid.default_main_program()
avg_loss = fluid.layers.mean(loss) startup_program = fluid.default_startup_program()
return avg_loss
def optimizer_program(): cost = fluid.layers.square_error_cost(input=y_predict, label=y)
return fluid.optimizer.SGD(learning_rate=0.001) avg_loss = fluid.layers.mean(cost)
sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.001)
sgd_optimizer.minimize(avg_loss)
# can use CPU or GPU test_program = main_program.clone(for_test=True)
use_cuda = False
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
trainer = Trainer( # can use CPU or GPU
train_func=train_program, place=place, optimizer_func=optimizer_program) use_cuda = False
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
exe = fluid.Executor(place)
feed_order = ['x', 'y'] # Specify the directory to save the parameters
params_dirname = "fit_a_line.inference.model"
num_epochs = 100
# Specify the directory to save the parameters # main train loop.
params_dirname = "fit_a_line.inference.model" feeder = fluid.DataFeeder(place=place, feed_list=[x, y])
exe.run(startup_program)
train_title = "Train cost" train_prompt = "Train cost"
test_title = "Test cost" test_prompt = "Test cost"
step = 0
step = 0 exe_test = fluid.Executor(place)
for pass_id in range(num_epochs):
# event_handler prints training and testing info for data_train in train_reader():
def event_handler(event): avg_loss_value, = exe.run(
global step main_program,
if isinstance(event, EndStepEvent): feed=feeder.feed(data_train),
fetch_list=[avg_loss])
if step % 10 == 0: # record a train cost every 10 batches if step % 10 == 0: # record a train cost every 10 batches
print("%s, Step %d, Cost %f" % print("%s, Step %d, Cost %f" %
(train_title, step, event.metrics[0])) (train_prompt, step, avg_loss_value[0]))
if step % 100 == 0: # record a test cost every 100 batches if step % 100 == 0: # record a test cost every 100 batches
test_metrics = trainer.test( test_metics = train_test(
reader=test_reader, feed_order=feed_order) executor=exe_test,
print("%s, Step %d, Cost %f" % (test_title, step, test_metrics[0])) program=test_program,
if test_metrics[0] < 10.0: reader=test_reader,
fetch_list=[avg_loss],
feeder=feeder)
print("%s, Step %d, Cost %f" %
(test_prompt, step, test_metics[0]))
# If the accuracy is good enough, we can stop the training. # If the accuracy is good enough, we can stop the training.
print('loss is less than 10.0, stop') if test_metics[0] < 10.0:
trainer.stop() break
step += 1 step += 1
if isinstance(event, EndEpochEvent): if math.isnan(float(avg_loss_value[0])):
if event.epoch % 10 == 0: sys.exit("got NaN loss, training failed.")
# We can save the trained parameters for the inferences later
if params_dirname is not None: if params_dirname is not None:
trainer.save_params(params_dirname) # We can save the trained parameters for the inferences later
fluid.io.save_inference_model(params_dirname, ['x'], [y_predict],
exe)
# The training could take up to a few minutes.
trainer.train(
reader=train_reader,
num_epochs=100,
event_handler=event_handler,
feed_order=feed_order)
def inference_program():
x = fluid.layers.data(name='x', shape=[13], dtype='float32')
y_predict = fluid.layers.fc(input=x, size=1, act=None)
return y_predict
infer_exe = fluid.Executor(place)
inference_scope = fluid.core.Scope()
inferencer = Inferencer( # infer
infer_func=inference_program, param_path=params_dirname, place=place) with fluid.scope_guard(inference_scope):
[inference_program, feed_target_names, fetch_targets
] = fluid.io.load_inference_model(params_dirname, infer_exe)
batch_size = 10
batch_size = 10 infer_reader = paddle.batch(
test_reader = paddle.batch(
paddle.dataset.uci_housing.test(), batch_size=batch_size) paddle.dataset.uci_housing.test(), batch_size=batch_size)
test_data = next(test_reader())
test_x = numpy.array([data[0] for data in test_data]).astype("float32")
test_y = numpy.array([data[1] for data in test_data]).astype("float32")
results = inferencer.infer({'x': test_x}) infer_data = next(infer_reader())
infer_feat = numpy.array(
[data[0] for data in infer_data]).astype("float32")
infer_label = numpy.array(
[data[1] for data in infer_data]).astype("float32")
print("infer results: (House Price)") assert feed_target_names[0] == 'x'
for idx, val in enumerate(results[0]): results = infer_exe.run(
inference_program,
feed={feed_target_names[0]: numpy.array(infer_feat)},
fetch_list=fetch_targets)
print("infer results: (House Price)")
for idx, val in enumerate(results[0]):
print("%d: %.2f" % (idx, val)) print("%d: %.2f" % (idx, val))
print("\nground truth:") print("\nground truth:")
for idx, val in enumerate(test_y): for idx, val in enumerate(infer_label):
print("%d: %.2f" % (idx, val)) print("%d: %.2f" % (idx, val))
if __name__ == '__main__':
main()
...@@ -157,18 +157,12 @@ PaddlePaddle在API中提供了自动加载[MNIST](http://yann.lecun.com/exdb/mni ...@@ -157,18 +157,12 @@ PaddlePaddle在API中提供了自动加载[MNIST](http://yann.lecun.com/exdb/mni
Load the PaddlePaddle Fluid API package. Load the PaddlePaddle Fluid API package.
```python ```python
import os
from PIL import Image
import numpy
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
from __future__ import print_function from __future__ import print_function
try:
from paddle.fluid.contrib.trainer import *
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
``` ```
### Program Functions Configuration ### Program Functions Configuration
...@@ -246,8 +240,7 @@ def train_program(): ...@@ -246,8 +240,7 @@ def train_program():
cost = fluid.layers.cross_entropy(input=predict, label=label) cost = fluid.layers.cross_entropy(input=predict, label=label)
avg_cost = fluid.layers.mean(cost) avg_cost = fluid.layers.mean(cost)
acc = fluid.layers.accuracy(input=predict, label=label) acc = fluid.layers.accuracy(input=predict, label=label)
return [avg_cost, acc] return predict, [avg_cost, acc]
``` ```
...@@ -269,18 +262,21 @@ def optimizer_program(): ...@@ -269,18 +262,21 @@ def optimizer_program():
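`batch` is a special decorator: it takes a reader as input and returns a batched reader. In PaddlePaddle, a reader yields one training sample at a time, while a batched reader yields one minibatch at a time. `batch` is a special decorator: it takes a reader as input and returns a batched reader. In PaddlePaddle, a reader yields one training sample at a time, while a batched reader yields one minibatch at a time.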
```python ```python
BATCH_SIZE = 64
train_reader = paddle.batch( train_reader = paddle.batch(
paddle.reader.shuffle( paddle.reader.shuffle(
paddle.dataset.mnist.train(), buf_size=500), paddle.dataset.mnist.train(), buf_size=500),
batch_size=64) batch_size=BATCH_SIZE)
test_reader = paddle.batch( test_reader = paddle.batch(
paddle.dataset.mnist.test(), batch_size=64) paddle.dataset.mnist.test(), batch_size=BATCH_SIZE)
``` ```
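The reader and batched-reader idea can be sketched in plain Python. The `batch` and `shuffle` functions below are simplified, hypothetical reimplementations written for illustration; they are not the source of `paddle.batch` or `paddle.reader.shuffle`, and the `seed` parameter is an assumption added to make the example repeatable.

```python
import random


def batch(reader, batch_size):
    """Wrap a sample-level reader so that it yields lists of batch_size samples."""
    def batched_reader():
        buf = []
        for sample in reader():
            buf.append(sample)
            if len(buf) == batch_size:
                yield buf
                buf = []
        if buf:
            yield buf  # final, possibly smaller, batch
    return batched_reader


def shuffle(reader, buf_size, seed=None):
    """Shuffle samples within consecutive windows of buf_size samples."""
    def shuffled_reader():
        rng = random.Random(seed)
        buf = []
        for sample in reader():
            buf.append(sample)
            if len(buf) >= buf_size:
                rng.shuffle(buf)
                for s in buf:
                    yield s
                buf = []
        rng.shuffle(buf)  # flush whatever is left in the buffer
        for s in buf:
            yield s
    return shuffled_reader


def toy_reader():
    # Hypothetical reader: yields one "sample" at a time.
    for i in range(10):
        yield i


batches = list(batch(toy_reader, 4)())
shuffled = list(shuffle(toy_reader, buf_size=5, seed=0)())
print(batches)
print(shuffled)
```

Because shuffling happens inside a bounded buffer, a larger `buf_size` gives more randomness at the cost of more memory.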
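### Trainer Configuration ### Trainer Configuration
Now we need to configure the `Trainer`. The `Trainer` takes the training program `train_program`, a `place`, and an optimizer `optimizer`. Now we need to build a `Trainer`. The `Trainer` contains a training program `train_program`, a `place`, and an optimizer `optimizer`, and it also covers the training iterations, checking the test error during training, and saving the model parameters needed for inference.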
```python ```python
# The model runs on a single CPU # The model runs on a single CPU
...@@ -293,47 +289,115 @@ trainer = Trainer( ...@@ -293,47 +289,115 @@ trainer = Trainer(
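#### Event Handler Configuration #### Event Handler Configuration
The Fluid API provides a hook for callback functions during training, through which users can monitor training progress. We can monitor training progress by calling a handler function during training.
Here we demonstrate two `event_handler` programs. Feel free to modify the Jupyter notebook and see what changes. Here we demonstrate two `event_handler` programs. Feel free to modify the Jupyter notebook and see what changes.
`event_handler` prints the training results during training. `event_handler` prints the training results during training.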
```python ```python
# Save the parameter into a directory. The Inferencer can load the parameters from it to do infer def event_handler(pass_id, batch_id, cost):
params_dirname = "recognize_digits_network.inference.model" print("Pass %d, Batch %d, Cost %f" % (pass_id,batch_id, cost))
lists = []
def event_handler(event):
if isinstance(event, EndStepEvent):
if event.step % 100 == 0:
# event.metrics maps with train program return arguments.
# event.metrics[0] will yeild avg_cost and event.metrics[1] will yeild acc in this example.
print("Pass %d, Batch %d, Cost %f" % (
event.step, event.epoch, event.metrics[0]))
if isinstance(event, EndEpochEvent):
avg_cost, acc = trainer.test(
reader=test_reader, feed_order=['img', 'label'])
print("Test with Epoch %d, avg_cost: %s, acc: %s" % (event.epoch, avg_cost, acc))
# save parameters
trainer.save_params(params_dirname)
lists.append((event.epoch, avg_cost, acc))
``` ```
```python
from paddle.v2.plot import Ploter
#### Start Training train_prompt = "Train cost"
test_prompt = "Test cost"
cost_ploter = Ploter(train_prompt, test_prompt)
# event_handler to plot a figure
def event_handler_plot(ploter_title, step, cost):
cost_ploter.append(ploter_title, step, cost)
cost_ploter.plot()
```
Now that we have set up the `event_handler` and the `data reader`, we can start training the model. `event_handler_plot` can be used to plot during training, as follows:
![png](./image/train_and_test.png)
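#### Start Training
We can plug in the `event_handler` and the `data reader` we set up, and then start training the model.
Set some parameters needed for the run and configure the data description.
`feed_order` maps the data directory to `train_program`. `feed_order` maps the data directory to `train_program`.
Create a `train_test` function that reports the error during training.
After training, the model parameters are saved to `save_dirname`.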
```python ```python
trainer.train( # The model runs on a single CPU
num_epochs=5, use_cuda = False # set to True if training with GPU
event_handler=event_handler, place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
reader=train_reader,
feed_order=['img', 'label']) prediction, [avg_loss, acc] = train_program()
img = fluid.layers.data(name='img', shape=[1, 28, 28], dtype='float32')
label = fluid.layers.data(name='label', shape=[1], dtype='int64')
feeder = fluid.DataFeeder(feed_list=[img, label], place=place)
optimizer = fluid.optimizer.Adam(learning_rate=0.001)
optimizer.minimize(avg_loss)
PASS_NUM = 5
epochs = [epoch_id for epoch_id in range(PASS_NUM)]
save_dirname = "recognize_digits.inference.model"
def train_test(train_test_program,
train_test_feed, train_test_reader):
acc_set = []
avg_loss_set = []
for test_data in train_test_reader():
acc_np, avg_loss_np = exe.run(
program=train_test_program,
feed=train_test_feed.feed(test_data),
fetch_list=[acc, avg_loss])
acc_set.append(float(acc_np))
avg_loss_set.append(float(avg_loss_np))
# get test acc and loss
acc_val_mean = numpy.array(acc_set).mean()
avg_loss_val_mean = numpy.array(avg_loss_set).mean()
return avg_loss_val_mean, acc_val_mean
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
main_program = fluid.default_main_program()
test_program = fluid.default_main_program().clone(for_test=True)
lists = []
step = 0
for epoch_id in epochs:
for step_id, data in enumerate(train_reader()):
metrics = exe.run(main_program,
feed=feeder.feed(data),
fetch_list=[avg_loss, acc])
if step % 100 == 0:
print("Pass %d, Batch %d, Cost %f" % (epoch_id, step, metrics[0]))
event_handler_plot(train_prompt, step, metrics[0])
step += 1
# test for epoch
avg_loss_val, acc_val = train_test(train_test_program=test_program,
train_test_reader=test_reader,
train_test_feed=feeder)
print("Test with Epoch %d, avg_cost: %s, acc: %s" %(epoch_id, avg_loss_val, acc_val))
event_handler_plot(test_prompt, step, avg_loss_val)
lists.append((epoch_id, avg_loss_val, acc_val))
if save_dirname is not None:
fluid.io.save_inference_model(save_dirname,
["img"], [prediction], exe,
model_filename=None,
params_filename=None)
# find the best pass
best = sorted(lists, key=lambda list: float(list[1]))[0]
print('Best pass is %s, testing Avgcost is %s' % (best[0], best[1]))
print('The classification accuracy is %.2f%%' % (float(best[2]) * 100))
``` ```
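Selecting the best pass from the recorded `(epoch_id, avg_loss, acc)` tuples can be checked on its own. The records below are made-up values for illustration; `min` with a key function picks the same entry as sorting by loss and taking the first element.

```python
# Hypothetical (epoch_id, avg_loss, acc) records, one per epoch.
records = [(0, 0.12, 0.96), (1, 0.05, 0.98), (2, 0.07, 0.975)]

# Pick the record with the lowest test loss.
best = min(records, key=lambda rec: float(rec[1]))
summary = "Best pass is %s, testing Avgcost is %s" % (best[0], best[1])
print(summary)
print("The classification accuracy is %.2f%%" % (float(best[2]) * 100))
```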
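The training process is fully automatic; the log printed by event_handler looks like the following: The training process is fully automatic; the log printed by event_handler looks like the following: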
...@@ -357,52 +421,52 @@ Test with Epoch 0, avg_cost: 0.053097883707459624, acc: 0.9822850318471338 ...@@ -357,52 +421,52 @@ Test with Epoch 0, avg_cost: 0.053097883707459624, acc: 0.9822850318471338
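## Applying the Model ## Applying the Model
The trained model can be used to classify images of handwritten digits. The program below shows how to run inference through the `fluid.contrib.inferencer.Inferencer` interface. The trained model can be used to classify images of handwritten digits. The program below shows how to run inference with the trained model.
### Inference Configuration
`Inference` needs an `infer_func` and a `param_path` to set up the network and the trained parameters.
We can simply plug in the classifier defined earlier.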
```python
inferencer = Inferencer(
# infer_func=softmax_regression, # uncomment for softmax regression
# infer_func=multilayer_perceptron, # uncomment for MLP
infer_func=convolutional_neural_network, # uncomment for LeNet5
param_path=params_dirname,
place=place)
```
### Generating the input data for prediction
`infer_3.png` is a sample image of the digit 3. Convert it into a numpy array to match the data feed format.
```python ```python
# Prepare the test image
import os
import numpy as np
from PIL import Image
def load_image(file): def load_image(file):
im = Image.open(file).convert('L') im = Image.open(file).convert('L')
im = im.resize((28, 28), Image.ANTIALIAS) im = im.resize((28, 28), Image.ANTIALIAS)
im = np.array(im).reshape(1, 1, 28, 28).astype(np.float32) im = numpy.array(im).reshape(1, 1, 28, 28).astype(numpy.float32)
im = im / 255.0 * 2.0 - 1.0 im = im / 255.0 * 2.0 - 1.0
return im return im
cur_dir = cur_dir = os.getcwd() cur_dir = cur_dir = os.getcwd()
img = load_image(cur_dir + '/image/infer_3.png') tensor_img = load_image(cur_dir + '/image/infer_3.png')
``` ```
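The expression `im / 255.0 * 2.0 - 1.0` in `load_image` rescales 8-bit grayscale values from the range [0, 255] to [-1, 1]. A small dependency-free sketch of that mapping follows; `normalize_pixel` is a hypothetical helper name introduced only for this illustration.

```python
def normalize_pixel(value):
    """Map an 8-bit grayscale value in [0, 255] to the range [-1.0, 1.0]."""
    return value / 255.0 * 2.0 - 1.0

# The two extremes and the midpoint of the input range.
darkest = normalize_pixel(0)
brightest = normalize_pixel(255)
middle = normalize_pixel(127.5)

print(darkest, middle, brightest)
```

In `load_image` the same arithmetic is applied to the whole numpy array at once, so every pixel is rescaled in a single vectorized step.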
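### Prediction ### Creating the inference program and predicting
Use `load_inference_model` to set up the network and the trained parameters. We can simply plug in the classifier defined earlier.
Now we are ready to make a prediction.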
```python ```python
results = inferencer.infer({'img': img}) inference_scope = fluid.core.Scope()
lab = np.argsort(results) # probs and lab are the results of one batch data with fluid.scope_guard(inference_scope):
print ("Inference result of image/infer_3.png is: %d" % lab[0][0][-1]) # Use fluid.io.load_inference_model to obtain the inference program desc,
# the feed_target_names (the names of variables that will be fed
# data using feed operators), and the fetch_targets (variables that
# we want to obtain data from using fetch operators).
[inference_program, feed_target_names,
fetch_targets] = fluid.io.load_inference_model(
save_dirname, exe, None, None)
# Construct feed as a dictionary of {feed_target_name: feed_target_data}
# and results will contain a list of data corresponding to fetch_targets.
results = exe.run(inference_program,
feed={feed_target_names[0]: tensor_img},
fetch_list=fetch_targets)
lab = numpy.argsort(results)
print("Inference result of image/infer_3.png is: %d" % lab[0][0][-1])
``` ```
### Prediction result
If everything goes well, the prediction output looks like this:
`Inference result of image/infer_3.png is: 3`
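The index expression `lab[0][0][-1]` works because `numpy.argsort` returns indices in ascending order of value, so the last index belongs to the most probable digit. Below is a dependency-free sketch of the same idea. The probability vector is invented for illustration and stands in for the network output, and the `sorted` call is a pure-Python stand-in for `numpy.argsort`, not the call the tutorial uses.

```python
# Hypothetical softmax output for one image: ten class probabilities.
# These values are invented for illustration.
probs = [0.01, 0.01, 0.02, 0.90, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]

# Pure-Python stand-in for numpy.argsort: indices ordered by ascending value.
order = sorted(range(len(probs)), key=lambda i: probs[i])

# The last index in ascending order is the class with the highest probability.
predicted_digit = order[-1]
print("Inference result is: %d" % predicted_digit)
```

In the tutorial code the extra `[0][0]` indexing is needed because `results` is a list holding one array of shape `(1, 10)`, so the sorted indices are nested two levels deep.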
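## Summary
Softmax regression, the multilayer perceptron, and the convolutional neural network covered in this tutorial are the most basic deep learning models. The more complex networks in later chapters are all derived from them, so these models are a solid foundation for further study. We also observed that moving from the simplest softmax regression to the somewhat more complex convolutional neural network greatly improves recognition accuracy on the MNIST dataset, because convolutional layers have local connectivity and shared weights. When you learn new models later, we hope you will likewise dig into what it is about the new model that improves on the old one. In addition, this tutorial introduced the basic workflow for building a PaddlePaddle model, from writing the dataprovider and constructing the network layers to the final training and prediction. Once you are familiar with this workflow, you can use your own data, define your own network, and carry out your own training and prediction tasks.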
......
...@@ -199,18 +199,12 @@ PaddlePaddle在API中提供了自动加载[MNIST](http://yann.lecun.com/exdb/mni ...@@ -199,18 +199,12 @@ PaddlePaddle在API中提供了自动加载[MNIST](http://yann.lecun.com/exdb/mni
Load the PaddlePaddle Fluid API package.
```python ```python
from __future__ import print_function from __future__ import print_function
import os
from PIL import Image
import numpy
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
try:
from paddle.fluid.contrib.trainer import *
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
``` ```
### Configuring the program functions
...@@ -288,8 +282,7 @@ def train_program(): ...@@ -288,8 +282,7 @@ def train_program():
cost = fluid.layers.cross_entropy(input=predict, label=label) cost = fluid.layers.cross_entropy(input=predict, label=label)
avg_cost = fluid.layers.mean(cost) avg_cost = fluid.layers.mean(cost)
acc = fluid.layers.accuracy(input=predict, label=label) acc = fluid.layers.accuracy(input=predict, label=label)
return [avg_cost, acc] return predict, [avg_cost, acc]
``` ```
...@@ -311,18 +304,21 @@ def optimizer_program(): ...@@ -311,18 +304,21 @@ def optimizer_program():
`batch` is a special decorator: it takes a reader as input and returns a batched reader. In PaddlePaddle, a reader yields one training sample at a time, whereas a batched reader yields one minibatch at a time.
```python ```python
BATCH_SIZE = 64
train_reader = paddle.batch( train_reader = paddle.batch(
paddle.reader.shuffle( paddle.reader.shuffle(
paddle.dataset.mnist.train(), buf_size=500), paddle.dataset.mnist.train(), buf_size=500),
batch_size=64) batch_size=BATCH_SIZE)
test_reader = paddle.batch( test_reader = paddle.batch(
paddle.dataset.mnist.test(), batch_size=64) paddle.dataset.mnist.test(), batch_size=BATCH_SIZE)
``` ```
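To make the reader versus batched reader distinction concrete, here is a simplified, dependency-free sketch of a batching decorator. It is an illustration under stated assumptions, not PaddlePaddle's implementation of `paddle.batch`: `sample_reader` and `batch` are hypothetical names defined only for this example.

```python
def sample_reader():
    """A toy reader: yields one (features, label) sample per iteration."""
    for i in range(5):
        yield ([float(i)], i)


def batch(reader, batch_size):
    """Wrap a reader so that it yields lists of up to `batch_size` samples."""
    def batched_reader():
        minibatch = []
        for sample in reader():
            minibatch.append(sample)
            if len(minibatch) == batch_size:
                yield minibatch
                minibatch = []
        if minibatch:  # emit the final, possibly smaller, minibatch
            yield minibatch
    return batched_reader


# Five samples grouped two at a time: the last minibatch holds one sample.
batches = list(batch(sample_reader, batch_size=2)())
print([len(b) for b in batches])
```

Note that both the reader and the batched reader are functions that return a fresh generator when called, which is why the training loop calls `train_reader()` at the start of every epoch.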
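### Trainer configuration
Now we need to configure a `Trainer`. The `Trainer` takes a training program `train_program`, a `place`, and an optimizer `optimizer`. Now we need to build a `Trainer`. The `Trainer` consists of a training program `train_program`, a `place`, and an optimizer `optimizer`, and it covers the training iterations, checking the test error during training, and saving the model parameters needed for prediction.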
```python ```python
# 该模型运行在单个CPU上 # 该模型运行在单个CPU上
...@@ -335,47 +331,115 @@ trainer = Trainer( ...@@ -335,47 +331,115 @@ trainer = Trainer(
#### Event handler configuration
The Fluid API provides a hook for callback functions during training, which lets users monitor training progress. During training we can monitor progress by calling a handler function.
We demonstrate two `event_handler` programs here. Feel free to modify the Jupyter notebook and see what changes.
`event_handler` prints the training results during training.
```python ```python
# Save the parameter into a directory. The Inferencer can load the parameters from it to do infer def event_handler(pass_id, batch_id, cost):
params_dirname = "recognize_digits_network.inference.model" print("Pass %d, Batch %d, Cost %f" % (pass_id,batch_id, cost))
lists = []
def event_handler(event):
if isinstance(event, EndStepEvent):
if event.step % 100 == 0:
# event.metrics maps with train program return arguments.
# event.metrics[0] will yield avg_cost and event.metrics[1] will yield acc in this example.
print("Pass %d, Batch %d, Cost %f" % (
event.step, event.epoch, event.metrics[0]))
if isinstance(event, EndEpochEvent):
avg_cost, acc = trainer.test(
reader=test_reader, feed_order=['img', 'label'])
print("Test with Epoch %d, avg_cost: %s, acc: %s" % (event.epoch, avg_cost, acc))
# save parameters
trainer.save_params(params_dirname)
lists.append((event.epoch, avg_cost, acc))
``` ```
```python
from paddle.v2.plot import Ploter
#### 开始训练 train_prompt = "Train cost"
test_prompt = "Test cost"
cost_ploter = Ploter(train_prompt, test_prompt)
# event_handler to plot a figure
def event_handler_plot(ploter_title, step, cost):
cost_ploter.append(ploter_title, step, cost)
cost_ploter.plot()
```
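Now that `event_handler` and the `data reader` are set up, we can start training the model. `event_handler_plot` can be used to plot a figure during training, as shown below:
![png](./image/train_and_test.png)
#### Start training
With the `event_handler` and `data reader` we set up in place, we can start training the model.
Set the parameters needed for the run and configure the data description.
`feed_order` is used to map the data directory to `train_program`.
Create a `train_test` function that reports the error during training.
After training finishes, the model parameters are saved in `save_dirname`.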
```python ```python
trainer.train( # 该模型运行在单个CPU上
num_epochs=5, use_cuda = False # set to True if training with GPU
event_handler=event_handler, place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
reader=train_reader,
feed_order=['img', 'label']) prediction, [avg_loss, acc] = train_program()
img = fluid.layers.data(name='img', shape=[1, 28, 28], dtype='float32')
label = fluid.layers.data(name='label', shape=[1], dtype='int64')
feeder = fluid.DataFeeder(feed_list=[img, label], place=place)
optimizer = fluid.optimizer.Adam(learning_rate=0.001)
optimizer.minimize(avg_loss)
PASS_NUM = 5
epochs = [epoch_id for epoch_id in range(PASS_NUM)]
save_dirname = "recognize_digits.inference.model"
def train_test(train_test_program,
train_test_feed, train_test_reader):
acc_set = []
avg_loss_set = []
for test_data in train_test_reader():
acc_np, avg_loss_np = exe.run(
program=train_test_program,
feed=train_test_feed.feed(test_data),
fetch_list=[acc, avg_loss])
acc_set.append(float(acc_np))
avg_loss_set.append(float(avg_loss_np))
# get test acc and loss
acc_val_mean = numpy.array(acc_set).mean()
avg_loss_val_mean = numpy.array(avg_loss_set).mean()
return avg_loss_val_mean, acc_val_mean
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
main_program = fluid.default_main_program()
test_program = fluid.default_main_program().clone(for_test=True)
lists = []
step = 0
for epoch_id in epochs:
for step_id, data in enumerate(train_reader()):
metrics = exe.run(main_program,
feed=feeder.feed(data),
fetch_list=[avg_loss, acc])
if step % 100 == 0:
print("Pass %d, Batch %d, Cost %f" % (epoch_id, step, metrics[0]))
event_handler_plot(train_prompt, step, metrics[0])
step += 1
# test for epoch
avg_loss_val, acc_val = train_test(train_test_program=test_program,
train_test_reader=test_reader,
train_test_feed=feeder)
print("Test with Epoch %d, avg_cost: %s, acc: %s" %(epoch_id, avg_loss_val, acc_val))
event_handler_plot(test_prompt, step, avg_loss_val)
lists.append((epoch_id, avg_loss_val, acc_val))
if save_dirname is not None:
fluid.io.save_inference_model(save_dirname,
["img"], [prediction], exe,
model_filename=None,
params_filename=None)
# find the best pass
best = sorted(lists, key=lambda list: float(list[1]))[0]
print('Best pass is %s, testing Avgcost is %s' % (best[0], best[1]))
print('The classification accuracy is %.2f%%' % (float(best[2]) * 100))
``` ```
Training runs fully automatically; the log printed by `event_handler` looks similar to the following:
...@@ -399,52 +463,52 @@ Test with Epoch 0, avg_cost: 0.053097883707459624, acc: 0.9822850318471338 ...@@ -399,52 +463,52 @@ Test with Epoch 0, avg_cost: 0.053097883707459624, acc: 0.9822850318471338
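## Applying the model
The trained model can be used to classify images of handwritten digits. The program below shows how to run inference through the `fluid.contrib.inferencer.Inferencer` interface. The trained model can be used to classify images of handwritten digits. The program below shows how to run inference with the trained model.
### Inference configuration
`Inference` needs an `infer_func` and a `param_path` to set up the network and the trained parameters.
We can simply plug in the classifier defined earlier.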
```python
inferencer = Inferencer(
# infer_func=softmax_regression, # uncomment for softmax regression
# infer_func=multilayer_perceptron, # uncomment for MLP
infer_func=convolutional_neural_network, # uncomment for LeNet5
param_path=params_dirname,
place=place)
```
### Generating the input data for prediction
`infer_3.png` is a sample image of the digit 3. Convert it into a numpy array to match the data feed format.
```python ```python
# Prepare the test image
import os
import numpy as np
from PIL import Image
def load_image(file): def load_image(file):
im = Image.open(file).convert('L') im = Image.open(file).convert('L')
im = im.resize((28, 28), Image.ANTIALIAS) im = im.resize((28, 28), Image.ANTIALIAS)
im = np.array(im).reshape(1, 1, 28, 28).astype(np.float32) im = numpy.array(im).reshape(1, 1, 28, 28).astype(numpy.float32)
im = im / 255.0 * 2.0 - 1.0 im = im / 255.0 * 2.0 - 1.0
return im return im
cur_dir = cur_dir = os.getcwd() cur_dir = cur_dir = os.getcwd()
img = load_image(cur_dir + '/image/infer_3.png') tensor_img = load_image(cur_dir + '/image/infer_3.png')
``` ```
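### Prediction ### Creating the inference program and predicting
Use `load_inference_model` to set up the network and the trained parameters. We can simply plug in the classifier defined earlier.
Now we are ready to make a prediction.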
```python ```python
results = inferencer.infer({'img': img}) inference_scope = fluid.core.Scope()
lab = np.argsort(results) # probs and lab are the results of one batch data with fluid.scope_guard(inference_scope):
print ("Inference result of image/infer_3.png is: %d" % lab[0][0][-1]) # Use fluid.io.load_inference_model to obtain the inference program desc,
# the feed_target_names (the names of variables that will be fed
# data using feed operators), and the fetch_targets (variables that
# we want to obtain data from using fetch operators).
[inference_program, feed_target_names,
fetch_targets] = fluid.io.load_inference_model(
save_dirname, exe, None, None)
# Construct feed as a dictionary of {feed_target_name: feed_target_data}
# and results will contain a list of data corresponding to fetch_targets.
results = exe.run(inference_program,
feed={feed_target_names[0]: tensor_img},
fetch_list=fetch_targets)
lab = numpy.argsort(results)
print("Inference result of image/infer_3.png is: %d" % lab[0][0][-1])
``` ```
### Prediction result
If everything goes well, the prediction output looks like this:
`Inference result of image/infer_3.png is: 3`
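## Summary
Softmax regression, the multilayer perceptron, and the convolutional neural network covered in this tutorial are the most basic deep learning models. The more complex networks in later chapters are all derived from them, so these models are a solid foundation for further study. We also observed that moving from the simplest softmax regression to the somewhat more complex convolutional neural network greatly improves recognition accuracy on the MNIST dataset, because convolutional layers have local connectivity and shared weights. When you learn new models later, we hope you will likewise dig into what it is about the new model that improves on the old one. In addition, this tutorial introduced the basic workflow for building a PaddlePaddle model, from writing the dataprovider and constructing the network layers to the final training and prediction. Once you are familiar with this workflow, you can use your own data, define your own network, and carry out your own training and prediction tasks.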
......
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import print_function from __future__ import print_function
import os import os
from PIL import Image from PIL import Image
import numpy as np import numpy
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
try: BATCH_SIZE = 64
from paddle.fluid.contrib.trainer import * PASS_NUM = 5
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
def softmax_regression(): def loss_net(hidden, label):
img = fluid.layers.data(name='img', shape=[1, 28, 28], dtype='float32') prediction = fluid.layers.fc(input=hidden, size=10, act='softmax')
predict = fluid.layers.fc(input=img, size=10, act='softmax') loss = fluid.layers.cross_entropy(input=prediction, label=label)
return predict avg_loss = fluid.layers.mean(loss)
acc = fluid.layers.accuracy(input=prediction, label=label)
return prediction, avg_loss, acc
def multilayer_perceptron(): def multilayer_perceptron(img, label):
img = fluid.layers.data(name='img', shape=[1, 28, 28], dtype='float32') img = fluid.layers.fc(input=img, size=200, act='tanh')
# first fully-connected layer, using ReLu as its activation function hidden = fluid.layers.fc(input=img, size=200, act='tanh')
hidden = fluid.layers.fc(input=img, size=128, act='relu') return loss_net(hidden, label)
# second fully-connected layer, using ReLu as its activation function
hidden = fluid.layers.fc(input=hidden, size=64, act='relu')
# The third fully-connected layer, note that the hidden size should be 10,
# which is the number of unique digits
prediction = fluid.layers.fc(input=hidden, size=10, act='softmax')
return prediction
def convolutional_neural_network(): def softmax_regression(img, label):
img = fluid.layers.data(name='img', shape=[1, 28, 28], dtype='float32') return loss_net(img, label)
# first conv pool
def convolutional_neural_network(img, label):
conv_pool_1 = fluid.nets.simple_img_conv_pool( conv_pool_1 = fluid.nets.simple_img_conv_pool(
input=img, input=img,
filter_size=5, filter_size=5,
...@@ -45,7 +51,6 @@ def convolutional_neural_network(): ...@@ -45,7 +51,6 @@ def convolutional_neural_network():
pool_stride=2, pool_stride=2,
act="relu") act="relu")
conv_pool_1 = fluid.layers.batch_norm(conv_pool_1) conv_pool_1 = fluid.layers.batch_norm(conv_pool_1)
# second conv pool
conv_pool_2 = fluid.nets.simple_img_conv_pool( conv_pool_2 = fluid.nets.simple_img_conv_pool(
input=conv_pool_1, input=conv_pool_1,
filter_size=5, filter_size=5,
...@@ -53,99 +58,160 @@ def convolutional_neural_network(): ...@@ -53,99 +58,160 @@ def convolutional_neural_network():
pool_size=2, pool_size=2,
pool_stride=2, pool_stride=2,
act="relu") act="relu")
# output layer with softmax activation function. size = 10 since there are only 10 possible digits. return loss_net(conv_pool_2, label)
prediction = fluid.layers.fc(input=conv_pool_2, size=10, act='softmax')
return prediction
def train_program():
label = fluid.layers.data(name='label', shape=[1], dtype='int64')
# Here we can build the prediction network in different ways. Please def train(nn_type,
# predict = softmax_regression() # uncomment for Softmax use_cuda,
# predict = multilayer_perceptron() # uncomment for MLP save_dirname=None,
predict = convolutional_neural_network() # uncomment for LeNet5 model_filename=None,
params_filename=None):
if use_cuda and not fluid.core.is_compiled_with_cuda():
return
# Calculate the cost from the prediction and label. img = fluid.layers.data(name='img', shape=[1, 28, 28], dtype='float32')
cost = fluid.layers.cross_entropy(input=predict, label=label) label = fluid.layers.data(name='label', shape=[1], dtype='int64')
avg_cost = fluid.layers.mean(cost)
acc = fluid.layers.accuracy(input=predict, label=label)
return [avg_cost, acc]
if nn_type == 'softmax_regression':
net_conf = softmax_regression
elif nn_type == 'multilayer_perceptron':
net_conf = multilayer_perceptron
else:
net_conf = convolutional_neural_network
prediction, avg_loss, acc = net_conf(img, label)
test_program = fluid.default_main_program().clone(for_test=True)
optimizer = fluid.optimizer.Adam(learning_rate=0.001)
optimizer.minimize(avg_loss)
def train_test(train_test_program, train_test_feed, train_test_reader):
acc_set = []
avg_loss_set = []
for test_data in train_test_reader():
acc_np, avg_loss_np = exe.run(
program=train_test_program,
feed=train_test_feed.feed(test_data),
fetch_list=[acc, avg_loss])
acc_set.append(float(acc_np))
avg_loss_set.append(float(avg_loss_np))
# get test acc and loss
acc_val_mean = numpy.array(acc_set).mean()
avg_loss_val_mean = numpy.array(avg_loss_set).mean()
return avg_loss_val_mean, acc_val_mean
def optimizer_program(): place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
return fluid.optimizer.Adam(learning_rate=0.001)
exe = fluid.Executor(place)
def main():
train_reader = paddle.batch( train_reader = paddle.batch(
paddle.reader.shuffle(paddle.dataset.mnist.train(), buf_size=500), paddle.reader.shuffle(paddle.dataset.mnist.train(), buf_size=500),
batch_size=64) batch_size=BATCH_SIZE)
test_reader = paddle.batch(
paddle.dataset.mnist.test(), batch_size=BATCH_SIZE)
feeder = fluid.DataFeeder(feed_list=[img, label], place=place)
test_reader = paddle.batch(paddle.dataset.mnist.test(), batch_size=64) exe.run(fluid.default_startup_program())
main_program = fluid.default_main_program()
use_cuda = False # set to True if training with GPU epochs = [epoch_id for epoch_id in range(PASS_NUM)]
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
trainer = Trainer(
train_func=train_program, place=place, optimizer_func=optimizer_program)
# Save the parameter into a directory. The Inferencer can load the parameters from it to do infer
params_dirname = "recognize_digits_network.inference.model"
lists = [] lists = []
step = 0
def event_handler(event): for epoch_id in epochs:
if isinstance(event, EndStepEvent): for step_id, data in enumerate(train_reader()):
if event.step % 100 == 0: metrics = exe.run(
# event.metrics maps with train program return arguments. main_program,
# event.metrics[0] will yield avg_cost and event.metrics[1] will yield acc in this example. feed=feeder.feed(data),
print("Pass %d, Batch %d, Cost %f" % (event.step, event.epoch, fetch_list=[avg_loss, acc])
event.metrics[0])) if step % 100 == 0:
print("Pass %d, Batch %d, Cost %f" % (epoch_id, step,
if isinstance(event, EndEpochEvent): metrics[0]))
avg_cost, acc = trainer.test( step += 1
reader=test_reader, feed_order=['img', 'label']) # test for epoch
avg_loss_val, acc_val = train_test(
train_test_program=test_program,
train_test_reader=test_reader,
train_test_feed=feeder)
print("Test with Epoch %d, avg_cost: %s, acc: %s" % print("Test with Epoch %d, avg_cost: %s, acc: %s" %
(event.epoch, avg_cost, acc)) (epoch_id, avg_loss_val, acc_val))
lists.append((epoch_id, avg_loss_val, acc_val))
# save parameters if save_dirname is not None:
trainer.save_params(params_dirname) fluid.io.save_inference_model(
lists.append((event.epoch, avg_cost, acc)) save_dirname, ["img"], [prediction],
exe,
# Train the model now model_filename=model_filename,
trainer.train( params_filename=params_filename)
num_epochs=5,
event_handler=event_handler,
reader=train_reader,
feed_order=['img', 'label'])
# find the best pass # find the best pass
best = sorted(lists, key=lambda list: float(list[1]))[0] best = sorted(lists, key=lambda list: float(list[1]))[0]
print('Best pass is %s, testing Avgcost is %s' % (best[0], best[1])) print('Best pass is %s, testing Avgcost is %s' % (best[0], best[1]))
print('The classification accuracy is %.2f%%' % (float(best[2]) * 100)) print('The classification accuracy is %.2f%%' % (float(best[2]) * 100))
def infer(use_cuda,
save_dirname=None,
model_filename=None,
params_filename=None):
if save_dirname is None:
return
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
exe = fluid.Executor(place)
def load_image(file): def load_image(file):
im = Image.open(file).convert('L') im = Image.open(file).convert('L')
im = im.resize((28, 28), Image.ANTIALIAS) im = im.resize((28, 28), Image.ANTIALIAS)
im = np.array(im).reshape(1, 1, 28, 28).astype(np.float32) im = numpy.array(im).reshape(1, 1, 28, 28).astype(numpy.float32)
im = im / 255.0 * 2.0 - 1.0 im = im / 255.0 * 2.0 - 1.0
return im return im
cur_dir = os.path.dirname(os.path.realpath(__file__)) cur_dir = os.path.dirname(os.path.realpath(__file__))
img = load_image(cur_dir + '/image/infer_3.png') tensor_img = load_image(cur_dir + '/image/infer_3.png')
inferencer = Inferencer(
# infer_func=softmax_regression, # uncomment for softmax regression inference_scope = fluid.core.Scope()
# infer_func=multilayer_perceptron, # uncomment for MLP with fluid.scope_guard(inference_scope):
infer_func=convolutional_neural_network, # uncomment for LeNet5 # Use fluid.io.load_inference_model to obtain the inference program desc,
param_path=params_dirname, # the feed_target_names (the names of variables that will be fed
place=place) # data using feed operators), and the fetch_targets (variables that
# we want to obtain data from using fetch operators).
results = inferencer.infer({'img': img}) [inference_program, feed_target_names,
lab = np.argsort(results) # probs and lab are the results of one batch data fetch_targets] = fluid.io.load_inference_model(
save_dirname, exe, model_filename, params_filename)
# Construct feed as a dictionary of {feed_target_name: feed_target_data}
# and results will contain a list of data corresponding to fetch_targets.
results = exe.run(
inference_program,
feed={feed_target_names[0]: tensor_img},
fetch_list=fetch_targets)
lab = numpy.argsort(results)
print("Inference result of image/infer_3.png is: %d" % lab[0][0][-1]) print("Inference result of image/infer_3.png is: %d" % lab[0][0][-1])
def main(use_cuda, nn_type):
model_filename = None
params_filename = None
save_dirname = "recognize_digits_" + nn_type + ".inference.model"
# call train() with is_local argument to run distributed train
train(
nn_type=nn_type,
use_cuda=use_cuda,
save_dirname=save_dirname,
model_filename=model_filename,
params_filename=params_filename)
infer(
use_cuda=use_cuda,
save_dirname=save_dirname,
model_filename=model_filename,
params_filename=params_filename)
if __name__ == '__main__': if __name__ == '__main__':
main() use_cuda = False
# predict = 'softmax_regression' # uncomment for Softmax
# predict = 'multilayer_perceptron' # uncomment for MLP
predict = 'convolutional_neural_network' # uncomment for LeNet5
main(use_cuda=use_cuda, nn_type=predict)
...@@ -169,15 +169,7 @@ import paddle.fluid as fluid ...@@ -169,15 +169,7 @@ import paddle.fluid as fluid
import numpy import numpy
import sys import sys
from __future__ import print_function from __future__ import print_function
try:
from paddle.fluid.contrib.trainer import *
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
``` ```
This tutorial provides configurations for two models, VGG and ResNet.
...@@ -348,19 +340,6 @@ def optimizer_program(): ...@@ -348,19 +340,6 @@ def optimizer_program():
## Training the model
### Trainer configuration
Now we need to configure the `Trainer`. The `Trainer` takes a training program `train_program`, a `place`, and an optimizer `optimizer_func`.
```python
use_cuda = False
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
trainer = Trainer(
train_func=train_program,
optimizer_func=optimizer_program,
place=place)
```
### Data feeders configuration
`cifar.train10()` yields one sample at a time; after shuffling and batching, the samples serve as the training input.
...@@ -379,50 +358,104 @@ test_reader = paddle.batch( ...@@ -379,50 +358,104 @@ test_reader = paddle.batch(
paddle.dataset.cifar.test10(), batch_size=BATCH_SIZE) paddle.dataset.cifar.test10(), batch_size=BATCH_SIZE)
``` ```
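### Event Handler ### Implementing the trainer program
We need to set up a main_program for training and, likewise, a test_program for testing. We also define the training `place` and use the optimizer `optimizer_func` defined earlier.
The `event_handler` callback can be used to observe the training process, run tests, and so on. The callback is set in the `trainer.train` function.
`event_handler` prints a text log during training.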
```python ```python
params_dirname = "image_classification_resnet.inference.model" use_cuda = False
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
# event handler to track training and testing process feed_order = ['pixel', 'label']
def event_handler(event):
if isinstance(event, EndStepEvent): main_program = fluid.default_main_program()
if event.step % 100 == 0: star_program = fluid.default_startup_program()
print("\nPass %d, Batch %d, Cost %f, Acc %f" %
(event.step, event.epoch, event.metrics[0], predict = inference_program()
event.metrics[1])) avg_cost, acc = train_program(predict)
else:
sys.stdout.write('.') # Test program
sys.stdout.flush() test_program = main_program.clone(for_test=True)
optimizer = optimizer_program()
optimizer.minimize(avg_cost)
exe = fluid.Executor(place)
EPOCH_NUM = 2
# For training test cost
def train_test(program, reader):
count = 0
feed_var_list = [
program.global_block().var(var_name) for var_name in feed_order
]
feeder_test = fluid.DataFeeder(
feed_list=feed_var_list, place=place)
test_exe = fluid.Executor(place)
accumulated = len([avg_cost, acc]) * [0]
for tid, test_data in enumerate(reader()):
avg_cost_np = test_exe.run(program=program,
feed=feeder_test.feed(test_data),
fetch_list=[avg_cost, acc])
accumulated = [x[0] + x[1][0] for x in zip(accumulated, avg_cost_np)]
count += 1
return [x / count for x in accumulated]
```
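The `train_test` function above sums the per-batch `[avg_cost, acc]` values and divides by the number of batches. A minimal sketch of that accumulation in plain Python follows; the per-batch numbers are made up for illustration and `batch_metrics` is a hypothetical name.

```python
# Hypothetical per-batch fetch results: [avg_cost, acc] for each test batch.
# The values are illustrative only.
batch_metrics = [
    [1.2, 0.50],
    [1.0, 0.75],
    [0.8, 1.00],
]

accumulated = [0.0, 0.0]
count = 0
for metrics in batch_metrics:
    # Add this batch's cost and accuracy to the running totals.
    accumulated = [total + value for total, value in zip(accumulated, metrics)]
    count += 1

# Divide the totals by the number of batches to get the averages.
avg_cost_test, accuracy_test = [total / count for total in accumulated]
print(avg_cost_test, accuracy_test)
```

This weights every batch equally, so the result is only an approximation of the per-sample average when the final batch is smaller than the others.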
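### Main training loop and progress output
if isinstance(event, EndEpochEvent): In the main training loop that follows, we watch the output to follow training progress, run tests, and so on.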
# Test against with the test dataset to get accuracy.
avg_cost, accuracy = trainer.test(
reader=test_reader, feed_order=['pixel', 'label'])
print('\nTest with Pass {0}, Loss {1:2.2}, Acc {2:2.2}'.format(event.epoch, avg_cost, accuracy)) We can also use `plot` to draw a chart from the callback data:
```python
params_dirname = "image_classification_resnet.inference.model"
from paddle.utils.plot import Ploter
train_prompt = "Train cost"
test_prompt = "Test cost"
plot_cost = Ploter(test_prompt,train_prompt)
# main train loop.
def train_loop():
feed_var_list_loop = [
main_program.global_block().var(var_name) for var_name in feed_order
]
feeder = fluid.DataFeeder(
feed_list=feed_var_list_loop, place=place)
exe.run(star_program)
step = 0
for pass_id in range(EPOCH_NUM):
for step_id, data_train in enumerate(train_reader()):
avg_loss_value = exe.run(main_program,
feed=feeder.feed(data_train),
fetch_list=[avg_cost, acc])
if step % 1 == 0:
plot_cost.append(train_prompt, step, avg_loss_value[0])
plot_cost.plot()
step += 1
avg_cost_test, accuracy_test = train_test(test_program,
reader=test_reader)
plot_cost.append(test_prompt, step, avg_cost_test)
# save parameters # save parameters
if params_dirname is not None: if params_dirname is not None:
trainer.save_params(params_dirname) fluid.io.save_inference_model(params_dirname, ["pixel"],
[predict], exe)
``` ```
### Training
Train with the `trainer.train` function: Train with the `train_loop` function. Here we run only 2 epochs; in practice, training usually runs for a hundred epochs or more.
**Note:** On a CPU, each epoch takes roughly 15 to 20 minutes, so this part may take a while. Feel free to modify the code and run the test on a GPU to speed up training.
```python ```python
trainer.train( train_loop()
reader=train_reader,
num_epochs=2,
event_handler=event_handler,
feed_order=['pixel', 'label'])
``` ```
A sample log from one round of training is shown below. After 1 pass, the average accuracy is 0.59 on the training set and 0.6 on the test set.
...@@ -448,23 +481,22 @@ Test with Pass 0, Loss 1.1, Acc 0.6 ...@@ -448,23 +481,22 @@ Test with Pass 0, Loss 1.1, Acc 0.6
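## Applying the model
The trained model can be used to classify images. The program below shows how to run inference through the `fluid.contrib.inferencer.Inferencer` interface; you can uncomment lines to change which model is loaded. The trained model can be used to classify images. The program below shows how to load the trained network and parameters for inference.
### Generating the input data for prediction
`dog.png` is an example image of a dog. Turn it into a numpy array to match the data feeder format. `dog.png` is a picture of a puppy. We convert it into a `numpy` array to match the `feeder` format.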
```python ```python
# Prepare testing data. # Prepare testing data.
from PIL import Image from PIL import Image
import numpy as np
import os import os
def load_image(file): def load_image(file):
im = Image.open(file) im = Image.open(file)
im = im.resize((32, 32), Image.ANTIALIAS) im = im.resize((32, 32), Image.ANTIALIAS)
im = np.array(im).astype(np.float32) im = numpy.array(im).astype(numpy.float32)
# The storage order of the loaded image is H(height), # The storage order of the loaded image is H(height),
# W(width), C(channel). PaddlePaddle requires # W(width), C(channel). PaddlePaddle requires
# the CHW order, so transpose them. # the CHW order, so transpose them.
...@@ -481,17 +513,48 @@ img = load_image(cur_dir + '/image/dog.png') ...@@ -481,17 +513,48 @@ img = load_image(cur_dir + '/image/dog.png')
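### Inferencer configuration and prediction
The `Inferencer` needs an `infer_func` and a `param_path` to set up the network and the trained parameters. As with training, the inferencer needs a corresponding procedure to be built. We load the network and the trained parameters from `params_dirname`.
We can simply plug in the inference program defined earlier.
Now we are ready to make a prediction.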
```python ```python
inferencer = Inferencer( place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
infer_func=inference_program, param_path=params_dirname, place=place) exe = fluid.Executor(place)
label_list = ["airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship", "truck"] inference_scope = fluid.core.Scope()
# inference
results = inferencer.infer({'pixel': img}) with fluid.scope_guard(inference_scope):
print("infer results: %s" % label_list[np.argmax(results[0])])
[inference_program, feed_target_names,
fetch_targets] = fluid.io.load_inference_model(params_dirname, exe)
# The input's dimension of conv should be 4-D or 5-D.
# Use inference_transpiler to speedup
inference_transpiler_program = inference_program.clone()
t = fluid.transpiler.InferenceTranspiler()
t.transpile(inference_transpiler_program, place)
# Construct feed as a dictionary of {feed_target_name: feed_target_data}
# and results will contain a list of data corresponding to fetch_targets.
results = exe.run(inference_program,
feed={feed_target_names[0]: img},
fetch_list=fetch_targets)
transpiler_results = exe.run(inference_transpiler_program,
feed={feed_target_names[0]: img},
fetch_list=fetch_targets)
assert len(results[0]) == len(transpiler_results[0])
for i in range(len(results[0])):
numpy.testing.assert_almost_equal(
results[0][i], transpiler_results[0][i], decimal=5)
# infer label
label_list = [
"airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse",
"ship", "truck"
]
print("infer results: %s" % label_list[numpy.argmax(results[0])])
``` ```
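The final step of the inference code picks the class name at the index of the largest score. Here is a minimal pure-Python sketch of that step. The score vector is made up for illustration and stands in for `results[0]`, and the helper name `argmax_label` is hypothetical, not part of PaddlePaddle.

```python
label_list = [
    "airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse",
    "ship", "truck"
]


def argmax_label(scores, labels):
    """Return the label whose score is largest (the first one wins on ties)."""
    if len(scores) != len(labels):
        raise ValueError("expected exactly one score per label")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best_index]


# Hypothetical class probabilities for one image; not real model output.
example_scores = [0.01, 0.02, 0.05, 0.10, 0.03, 0.70, 0.04, 0.02, 0.02, 0.01]
print("infer results: %s" % argmax_label(example_scores, label_list))
```

The order of `label_list` has to match the label indices used by the training data, otherwise the printed name will not correspond to the predicted class.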
## Summary ## Summary
......
...@@ -211,15 +211,7 @@ import paddle.fluid as fluid ...@@ -211,15 +211,7 @@ import paddle.fluid as fluid
import numpy import numpy
import sys import sys
from __future__ import print_function from __future__ import print_function
try:
from paddle.fluid.contrib.trainer import *
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
``` ```
This tutorial provides configurations for two models, VGG and ResNet. This tutorial provides configurations for two models, VGG and ResNet.
...@@ -390,19 +382,6 @@ def optimizer_program(): ...@@ -390,19 +382,6 @@ def optimizer_program():
## Training the model ## Training the model
### Trainer configuration
Now we need to configure the `Trainer`. The `Trainer` takes the training program `train_program`, a `place`, and the optimizer `optimizer_func`.
```python
use_cuda = False
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
trainer = Trainer(
train_func=train_program,
optimizer_func=optimizer_program,
place=place)
```
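### Data Feeders configuration ### Data Feeders configuration
`cifar.train10()` yields one sample at a time; after shuffling and batching, the samples become the training input. `cifar.train10()` yields one sample at a time; after shuffling and batching, the samples become the training input.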
...@@ -421,50 +400,104 @@ test_reader = paddle.batch( ...@@ -421,50 +400,104 @@ test_reader = paddle.batch(
paddle.dataset.cifar.test10(), batch_size=BATCH_SIZE) paddle.dataset.cifar.test10(), batch_size=BATCH_SIZE)
``` ```
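The shuffle-then-batch reader pipeline can be pictured in plain Python. The sketch below is an assumption about how a buffered shuffle and a batching wrapper behave in general, not PaddlePaddle's actual implementation. The names `buffered_shuffle`, `batch`, and `toy_reader` are hypothetical and exist only for this illustration.

```python
import random


def buffered_shuffle(reader, buf_size, seed=None):
    """Wrap a reader so samples are shuffled within buffers of buf_size."""
    rng = random.Random(seed)

    def shuffled_reader():
        buf = []
        for sample in reader():
            buf.append(sample)
            if len(buf) >= buf_size:
                rng.shuffle(buf)
                for item in buf:
                    yield item
                buf = []
        # Flush whatever is left in the last, possibly smaller, buffer.
        if buf:
            rng.shuffle(buf)
            for item in buf:
                yield item

    return shuffled_reader


def batch(reader, batch_size):
    """Wrap a reader so it yields lists of up to batch_size samples."""

    def batch_reader():
        current = []
        for sample in reader():
            current.append(sample)
            if len(current) == batch_size:
                yield current
                current = []
        # Keep the final partial batch instead of dropping it.
        if current:
            yield current

    return batch_reader


def toy_reader():
    """Hypothetical dataset: ten (feature, label) pairs."""
    for i in range(10):
        yield (i, i % 2)


train_reader = batch(buffered_shuffle(toy_reader, buf_size=4, seed=0), batch_size=3)
batches = list(train_reader())
print([len(b) for b in batches])
```

A larger buffer gives a more thorough shuffle at the cost of memory, since samples only ever trade places with others in the same buffer.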
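### Event Handler ### Implementing the trainer program
We need to specify a `main_program` for training and, likewise, a `test_program` for testing. We also define the `place` used for training and use the optimizer defined earlier, `optimizer_program`.
The `event_handler` callback can be used to observe the training process or to run tests; the callback is set in the `trainer.train` call.
`event_handler` is used to print text logs during training.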
```python ```python
params_dirname = "image_classification_resnet.inference.model" use_cuda = False
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
# event handler to track training and testing process feed_order = ['pixel', 'label']
def event_handler(event):
if isinstance(event, EndStepEvent): main_program = fluid.default_main_program()
if event.step % 100 == 0: star_program = fluid.default_startup_program()
print("\nPass %d, Batch %d, Cost %f, Acc %f" %
(event.step, event.epoch, event.metrics[0], predict = inference_program()
event.metrics[1])) avg_cost, acc = train_program(predict)
else:
sys.stdout.write('.') # Test program
sys.stdout.flush() test_program = main_program.clone(for_test=True)
optimizer = optimizer_program()
optimizer.minimize(avg_cost)
exe = fluid.Executor(place)
EPOCH_NUM = 2
# For training test cost
def train_test(program, reader):
count = 0
feed_var_list = [
program.global_block().var(var_name) for var_name in feed_order
]
feeder_test = fluid.DataFeeder(
feed_list=feed_var_list, place=place)
test_exe = fluid.Executor(place)
accumulated = len([avg_cost, acc]) * [0]
for tid, test_data in enumerate(reader()):
avg_cost_np = test_exe.run(program=program,
feed=feeder_test.feed(test_data),
fetch_list=[avg_cost, acc])
accumulated = [x[0] + x[1][0] for x in zip(accumulated, avg_cost_np)]
count += 1
return [x / count for x in accumulated]
```
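The accumulation in `train_test` is a per-batch average of each fetched metric. Below is a minimal sketch of that arithmetic using plain lists in place of the arrays returned by `exe.run`. The function name `average_metrics` and the sample values are hypothetical.

```python
def average_metrics(batch_outputs):
    """Average each metric over batches.

    batch_outputs is an iterable of per-batch results, each shaped like
    [[cost], [acc]], mirroring a fetch list of two one-element arrays.
    """
    accumulated = None
    count = 0
    for fetched in batch_outputs:
        if accumulated is None:
            accumulated = [0.0] * len(fetched)
        # Add the first element of every fetched metric to its running total.
        accumulated = [
            total + value[0] for total, value in zip(accumulated, fetched)
        ]
        count += 1
    if count == 0:
        raise ValueError("the reader produced no batches")
    return [total / count for total in accumulated]


# Two made-up batches: (cost, acc) = (2.0, 0.25) and (1.0, 0.75).
fake_outputs = [[[2.0], [0.25]], [[1.0], [0.75]]]
print(average_metrics(fake_outputs))
```

Every batch gets equal weight in this scheme, so a smaller final batch counts as much as a full one. For large test sets the effect is usually negligible.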
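### Main training loop and progress output
    if isinstance(event, EndEpochEvent): In the main training loop that follows, we watch the training process through its printed output and run tests along the way.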
# Test against with the test dataset to get accuracy.
avg_cost, accuracy = trainer.test(
reader=test_reader, feed_order=['pixel', 'label'])
        print('\nTest with Pass {0}, Loss {1:2.2}, Acc {2:2.2}'.format(event.epoch, avg_cost, accuracy)) We can also use `plot` to chart the collected data points:
```python
params_dirname = "image_classification_resnet.inference.model"
from paddle.utils.plot import Ploter
train_prompt = "Train cost"
test_prompt = "Test cost"
plot_cost = Ploter(test_prompt,train_prompt)
# main train loop.
def train_loop():
feed_var_list_loop = [
main_program.global_block().var(var_name) for var_name in feed_order
]
feeder = fluid.DataFeeder(
feed_list=feed_var_list_loop, place=place)
exe.run(star_program)
step = 0
for pass_id in range(EPOCH_NUM):
for step_id, data_train in enumerate(train_reader()):
avg_loss_value = exe.run(main_program,
feed=feeder.feed(data_train),
fetch_list=[avg_cost, acc])
if step % 1 == 0:
plot_cost.append(train_prompt, step, avg_loss_value[0])
plot_cost.plot()
step += 1
avg_cost_test, accuracy_test = train_test(test_program,
reader=test_reader)
plot_cost.append(test_prompt, step, avg_cost_test)
# save parameters # save parameters
if params_dirname is not None: if params_dirname is not None:
trainer.save_params(params_dirname) fluid.io.save_inference_model(params_dirname, ["pixel"],
[predict], exe)
``` ```
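### Training ### Training
Train with the `trainer.train` function: Train with the `train_loop` function. Here we run only 2 epochs; in practice, training usually runs for a hundred epochs or more.
**Note:** On CPU, each epoch takes roughly 15 to 20 minutes, so this part may take a while. Feel free to modify the code to run on a GPU to speed up training. **Note:** On CPU, each epoch takes roughly 15 to 20 minutes, so this part may take a while. Feel free to modify the code to run on a GPU to speed up training.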
```python ```python
trainer.train( train_loop()
reader=train_reader,
num_epochs=2,
event_handler=event_handler,
feed_order=['pixel', 'label'])
``` ```
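A sample log from one round of training is shown below. After 1 pass, the average accuracy is 0.59 on the training set and 0.6 on the test set. A sample log from one round of training is shown below. After 1 pass, the average accuracy is 0.59 on the training set and 0.6 on the test set.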
...@@ -490,23 +523,22 @@ Test with Pass 0, Loss 1.1, Acc 0.6 ...@@ -490,23 +523,22 @@ Test with Pass 0, Loss 1.1, Acc 0.6
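## Applying the model ## Applying the model
The trained model can be used to classify images. The program below shows how to run inference through the `fluid.contrib.inferencer.Inferencer` interface; you can uncomment lines to change which model is loaded. The trained model can be used to classify images. The program below shows how to load the trained network and parameters for inference.
### Generating input data for inference ### Generating input data for inference
`dog.png` is an example image of a dog. Turn it into a numpy array to match the data feeder format. `dog.png` is a picture of a puppy. We convert it into a `numpy` array to match the format the `feeder` expects.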
```python ```python
# Prepare testing data. # Prepare testing data.
from PIL import Image from PIL import Image
import numpy as np
import os import os
def load_image(file): def load_image(file):
im = Image.open(file) im = Image.open(file)
im = im.resize((32, 32), Image.ANTIALIAS) im = im.resize((32, 32), Image.ANTIALIAS)
im = np.array(im).astype(np.float32) im = numpy.array(im).astype(numpy.float32)
    # The storage order of the loaded image is H(height), # The storage order of the loaded image is H(height),
    # W(width), C(channel). PaddlePaddle requires # W(width), C(channel). PaddlePaddle requires
# the CHW order, so transpose them. # the CHW order, so transpose them.
...@@ -523,17 +555,48 @@ img = load_image(cur_dir + '/image/dog.png') ...@@ -523,17 +555,48 @@ img = load_image(cur_dir + '/image/dog.png')
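### Inferencer configuration and prediction ### Inferencer configuration and prediction
The `Inferencer` needs an `infer_func` and a `param_path` to set up the network and the trained parameters. As with training, inference needs its own program to be built. We load the network and the trained parameters from `params_dirname`.
We can simply plug in the inference program defined earlier. We can simply plug in the inference program defined earlier.
Now we are ready to make predictions. Now we are ready to make predictions.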
```python ```python
inferencer = Inferencer( place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
infer_func=inference_program, param_path=params_dirname, place=place) exe = fluid.Executor(place)
label_list = ["airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship", "truck"] inference_scope = fluid.core.Scope()
# inference
results = inferencer.infer({'pixel': img}) with fluid.scope_guard(inference_scope):
print("infer results: %s" % label_list[np.argmax(results[0])])
[inference_program, feed_target_names,
fetch_targets] = fluid.io.load_inference_model(params_dirname, exe)
# The input's dimension of conv should be 4-D or 5-D.
    # Use inference_transpiler to speed up inference
inference_transpiler_program = inference_program.clone()
t = fluid.transpiler.InferenceTranspiler()
t.transpile(inference_transpiler_program, place)
# Construct feed as a dictionary of {feed_target_name: feed_target_data}
# and results will contain a list of data corresponding to fetch_targets.
results = exe.run(inference_program,
feed={feed_target_names[0]: img},
fetch_list=fetch_targets)
transpiler_results = exe.run(inference_transpiler_program,
feed={feed_target_names[0]: img},
fetch_list=fetch_targets)
assert len(results[0]) == len(transpiler_results[0])
for i in range(len(results[0])):
numpy.testing.assert_almost_equal(
results[0][i], transpiler_results[0][i], decimal=5)
# infer label
label_list = [
"airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse",
"ship", "truck"
]
print("infer results: %s" % label_list[numpy.argmax(results[0])])
``` ```
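The inference code checks that the transpiled program produces the same scores as the original one, to 5 decimal places. The sketch below shows that kind of tolerance check in plain Python. It assumes the documented rule for `numpy.testing.assert_almost_equal`, namely `abs(desired - actual) < 1.5 * 10**(-decimal)`. The helper name `check_close` and the sample values are hypothetical.

```python
def check_close(actual, desired, decimal=5):
    """Raise AssertionError unless every pair agrees to `decimal` places."""
    if len(actual) != len(desired):
        raise AssertionError(
            "length mismatch: %d vs %d" % (len(actual), len(desired)))
    tolerance = 1.5 * 10 ** (-decimal)
    for index, (a, d) in enumerate(zip(actual, desired)):
        if not abs(d - a) < tolerance:
            raise AssertionError(
                "element %d differs: %r vs %r" % (index, a, d))


# Made-up scores from the original and the optimized program.
original_scores = [0.1, 0.9]
optimized_scores = [0.100001, 0.9]
check_close(original_scores, optimized_scores)
print("outputs agree to 5 decimal places")
```

A tolerance is used because an optimized program may reorder floating-point operations, so bit-for-bit equality is not a realistic expectation.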
## Summary ## Summary
......
...@@ -15,17 +15,6 @@ ...@@ -15,17 +15,6 @@
from __future__ import print_function from __future__ import print_function
import paddle.fluid as fluid import paddle.fluid as fluid
import sys
try:
from paddle.fluid.contrib.trainer import *
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
__all__ = ['resnet_cifar10'] __all__ = ['resnet_cifar10']
......
...@@ -14,21 +14,11 @@ ...@@ -14,21 +14,11 @@
from __future__ import print_function from __future__ import print_function
import os
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import numpy import numpy
import sys import sys
try:
from paddle.fluid.contrib.trainer import *
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
from vgg import vgg_bn_drop from vgg import vgg_bn_drop
from resnet import resnet_cifar10 from resnet import resnet_cifar10
...@@ -43,8 +33,7 @@ def inference_network(): ...@@ -43,8 +33,7 @@ def inference_network():
return predict return predict
def train_network(): def train_network(predict):
predict = inference_network()
label = fluid.layers.data(name='label', shape=[1], dtype='int64') label = fluid.layers.data(name='label', shape=[1], dtype='int64')
cost = fluid.layers.cross_entropy(input=predict, label=label) cost = fluid.layers.cross_entropy(input=predict, label=label)
avg_cost = fluid.layers.mean(cost) avg_cost = fluid.layers.mean(cost)
...@@ -56,62 +45,101 @@ def optimizer_program(): ...@@ -56,62 +45,101 @@ def optimizer_program():
return fluid.optimizer.Adam(learning_rate=0.001) return fluid.optimizer.Adam(learning_rate=0.001)
def train(use_cuda, train_program, params_dirname): def train(use_cuda, params_dirname):
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
BATCH_SIZE = 128 BATCH_SIZE = 128
EPOCH_NUM = 2
train_reader = paddle.batch( train_reader = paddle.batch(
paddle.reader.shuffle(paddle.dataset.cifar.train10(), buf_size=50000), paddle.reader.shuffle(
paddle.dataset.cifar.train10(), buf_size=128 * 100),
batch_size=BATCH_SIZE) batch_size=BATCH_SIZE)
test_reader = paddle.batch( test_reader = paddle.batch(
paddle.dataset.cifar.test10(), batch_size=BATCH_SIZE) paddle.dataset.cifar.test10(), batch_size=BATCH_SIZE)
def event_handler(event): feed_order = ['pixel', 'label']
if isinstance(event, EndStepEvent):
if event.step % 100 == 0: main_program = fluid.default_main_program()
print("\nPass %d, Batch %d, Cost %f, Acc %f" % star_program = fluid.default_startup_program()
(event.step, event.epoch, event.metrics[0],
event.metrics[1])) predict = inference_network()
avg_cost, acc = train_network(predict)
# Test program
test_program = main_program.clone(for_test=True)
optimizer = optimizer_program()
optimizer.minimize(avg_cost)
exe = fluid.Executor(place)
EPOCH_NUM = 1
# For training test cost
def train_test(program, reader):
count = 0
feed_var_list = [
program.global_block().var(var_name) for var_name in feed_order
]
feeder_test = fluid.DataFeeder(feed_list=feed_var_list, place=place)
test_exe = fluid.Executor(place)
accumulated = len([avg_cost, acc]) * [0]
for tid, test_data in enumerate(reader()):
avg_cost_np = test_exe.run(
program=program,
feed=feeder_test.feed(test_data),
fetch_list=[avg_cost, acc])
accumulated = [
x[0] + x[1][0] for x in zip(accumulated, avg_cost_np)
]
count += 1
return [x / count for x in accumulated]
# main train loop.
def train_loop():
feed_var_list_loop = [
main_program.global_block().var(var_name) for var_name in feed_order
]
feeder = fluid.DataFeeder(feed_list=feed_var_list_loop, place=place)
exe.run(star_program)
step = 0
for pass_id in range(EPOCH_NUM):
for step_id, data_train in enumerate(train_reader()):
avg_loss_value = exe.run(
main_program,
feed=feeder.feed(data_train),
fetch_list=[avg_cost, acc])
if step_id % 100 == 0:
print("\nPass %d, Batch %d, Cost %f, Acc %f" % (
                        pass_id, step_id, avg_loss_value[0], avg_loss_value[1]))
else: else:
sys.stdout.write('.') sys.stdout.write('.')
sys.stdout.flush() sys.stdout.flush()
step += 1
if isinstance(event, EndEpochEvent): avg_cost_test, accuracy_test = train_test(
avg_cost, accuracy = trainer.test( test_program, reader=test_reader)
reader=test_reader, feed_order=['pixel', 'label'])
print('\nTest with Pass {0}, Loss {1:2.2}, Acc {2:2.2}'.format( print('\nTest with Pass {0}, Loss {1:2.2}, Acc {2:2.2}'.format(
event.epoch, avg_cost, accuracy)) pass_id, avg_cost_test, accuracy_test))
if params_dirname is not None:
trainer.save_params(params_dirname)
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace() if params_dirname is not None:
trainer = Trainer( fluid.io.save_inference_model(params_dirname, ["pixel"],
train_func=train_program, optimizer_func=optimizer_program, place=place) [predict], exe)
trainer.train(
reader=train_reader,
num_epochs=EPOCH_NUM,
event_handler=event_handler,
feed_order=['pixel', 'label'])
train_loop()
def infer(use_cuda, inference_program, params_dirname=None):
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
inferencer = Inferencer(
infer_func=inference_program, param_path=params_dirname, place=place)
# Prepare testing data. def infer(use_cuda, params_dirname=None):
from PIL import Image from PIL import Image
import numpy as np place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
import os exe = fluid.Executor(place)
inference_scope = fluid.core.Scope()
def load_image(file): def load_image(infer_file):
im = Image.open(file) im = Image.open(infer_file)
im = im.resize((32, 32), Image.ANTIALIAS) im = im.resize((32, 32), Image.ANTIALIAS)
im = np.array(im).astype(np.float32) im = numpy.array(im).astype(numpy.float32)
        # The storage order of the loaded image is H(height), # The storage order of the loaded image is H(height),
        # W(width), C(channel). PaddlePaddle requires # W(width), C(channel). PaddlePaddle requires
# the CHW order, so transpose them. # the CHW order, so transpose them.
...@@ -125,14 +153,44 @@ def infer(use_cuda, inference_program, params_dirname=None): ...@@ -125,14 +153,44 @@ def infer(use_cuda, inference_program, params_dirname=None):
cur_dir = os.path.dirname(os.path.realpath(__file__)) cur_dir = os.path.dirname(os.path.realpath(__file__))
img = load_image(cur_dir + '/image/dog.png') img = load_image(cur_dir + '/image/dog.png')
# inference with fluid.scope_guard(inference_scope):
results = inferencer.infer({'pixel': img}) # Use fluid.io.load_inference_model to obtain the inference program desc,
        # the feed_target_names (the names of variables that will be fed
# data using feed operators), and the fetch_targets (variables that
# we want to obtain data from using fetch operators).
[inference_program, feed_target_names,
fetch_targets] = fluid.io.load_inference_model(params_dirname, exe)
# The input's dimension of conv should be 4-D or 5-D.
        # Use inference_transpiler to speed up inference
inference_transpiler_program = inference_program.clone()
t = fluid.transpiler.InferenceTranspiler()
t.transpile(inference_transpiler_program, place)
# Construct feed as a dictionary of {feed_target_name: feed_target_data}
# and results will contain a list of data corresponding to fetch_targets.
results = exe.run(
inference_program,
feed={feed_target_names[0]: img},
fetch_list=fetch_targets)
transpiler_results = exe.run(
inference_transpiler_program,
feed={feed_target_names[0]: img},
fetch_list=fetch_targets)
assert len(results[0]) == len(transpiler_results[0])
for i in range(len(results[0])):
numpy.testing.assert_almost_equal(
results[0][i], transpiler_results[0][i], decimal=5)
# infer label
label_list = [ label_list = [
"airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "airplane", "automobile", "bird", "cat", "deer", "dog", "frog",
"ship", "truck" "horse", "ship", "truck"
] ]
print("infer results: %s" % label_list[np.argmax(results[0])])
print("infer results: %s" % label_list[numpy.argmax(results[0])])
def main(use_cuda): def main(use_cuda):
...@@ -140,15 +198,9 @@ def main(use_cuda): ...@@ -140,15 +198,9 @@ def main(use_cuda):
return return
save_path = "image_classification_resnet.inference.model" save_path = "image_classification_resnet.inference.model"
train( train(use_cuda=use_cuda, params_dirname=save_path)
use_cuda=use_cuda,
train_program=train_network,
params_dirname=save_path)
infer( infer(use_cuda=use_cuda, params_dirname=save_path)
use_cuda=use_cuda,
inference_program=inference_network,
params_dirname=save_path)
if __name__ == '__main__': if __name__ == '__main__':
......
...@@ -14,21 +14,7 @@ ...@@ -14,21 +14,7 @@
from __future__ import print_function from __future__ import print_function
import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import sys
try:
from paddle.fluid.contrib.trainer import *
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
__all__ = ['vgg_bn_drop']
def vgg_bn_drop(input): def vgg_bn_drop(input):
......
...@@ -202,40 +202,32 @@ dream that one day <e> ...@@ -202,40 +202,32 @@ dream that one day <e>
First, load the required packages: First, load the required packages:
```python ```python
import paddle
import paddle as paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import six
import numpy import numpy
from functools import partial
import math import math
import os
import six
import sys
from __future__ import print_function from __future__ import print_function
try:
from paddle.fluid.contrib.trainer import *
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
``` ```
Next, define the parameters: Next, define the parameters:
```python ```python
EMBED_SIZE = 32 # word vector dimension EMBED_SIZE = 32
HIDDEN_SIZE = 256 # hidden layer dimension HIDDEN_SIZE = 256
N = 5 # train 5-gram N = 5
BATCH_SIZE = 32 # batch size BATCH_SIZE = 100
PASS_NUM = 100
# can use CPU or GPU use_cuda = False # set to True if training with GPU
use_cuda = os.getenv('WITH_GPU', '0') != '0'
word_dict = paddle.dataset.imikolov.build_dict() word_dict = paddle.dataset.imikolov.build_dict()
dict_size = len(word_dict) dict_size = len(word_dict)
``` ```
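A larger `BATCH_SIZE` makes training converge faster but also uses more memory. Because word-vector computation is large in scale, enable GPU training if your environment allows it to get results sooner.
Unlike the earlier PaddlePaddle v2 release, the new Fluid release no longer requires us to compute word vectors by hand. PaddlePaddle provides a built-in method, `fluid.layers.embedding`, which we can use directly to build the N-gram neural network. Unlike the earlier PaddlePaddle v2 release, the new Fluid release no longer requires us to compute word vectors by hand. PaddlePaddle provides a built-in method, `fluid.layers.embedding`, which we can use directly to build the N-gram neural network.
- Let's define the structure of our N-gram neural network. This structure is used in both training and inference. Because word vectors are sparse, we pass `is_sparse=True`, which speeds up updates to the sparse matrix. - Let's define the structure of our N-gram neural network. This structure is used in both training and inference. Because word vectors are sparse, we pass `is_sparse=True`, which speeds up updates to the sparse matrix.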
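To make the embedding step concrete, here is a small pure-Python sketch of looking up four context words in one shared table and joining the vectors into a single feature vector. It assumes the four embeddings are concatenated before the hidden layer, which is the usual N-gram layout but is not shown in this excerpt. The names `make_embedding_table` and `ngram_features`, the table sizes, and the word ids are all hypothetical.

```python
import random


def make_embedding_table(dict_size, embed_size, seed=0):
    """Create a dict_size x embed_size table of small random values."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(embed_size)]
            for _ in range(dict_size)]


def ngram_features(table, context_ids):
    """Look up each context word in the shared table and concatenate."""
    features = []
    for word_id in context_ids:
        features.extend(table[word_id])
    return features


# Toy sizes for illustration; the tutorial uses EMBED_SIZE = 32.
table = make_embedding_table(dict_size=10, embed_size=4)
features = ngram_features(table, [2, 5, 2, 7])
print(len(features))
```

Because every position reads from the same table, the word id 2 yields the same vector in the first and third slots. That sharing is what a common parameter name such as `shared_w` expresses in the network definition. Only the rows for the words in a batch are touched, which is why sparse updates pay off.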
......
...@@ -244,40 +244,32 @@ dream that one day <e> ...@@ -244,40 +244,32 @@ dream that one day <e>
First, load the required packages: First, load the required packages:
```python ```python
import paddle
import paddle as paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import six
import numpy import numpy
from functools import partial
import math import math
import os
import six
import sys
from __future__ import print_function from __future__ import print_function
try:
from paddle.fluid.contrib.trainer import *
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
``` ```
Next, define the parameters: Next, define the parameters:
```python ```python
EMBED_SIZE = 32 # word vector dimension EMBED_SIZE = 32
HIDDEN_SIZE = 256 # hidden layer dimension HIDDEN_SIZE = 256
N = 5 # train 5-gram N = 5
BATCH_SIZE = 32 # batch size BATCH_SIZE = 100
PASS_NUM = 100
# can use CPU or GPU use_cuda = False # set to True if training with GPU
use_cuda = os.getenv('WITH_GPU', '0') != '0'
word_dict = paddle.dataset.imikolov.build_dict() word_dict = paddle.dataset.imikolov.build_dict()
dict_size = len(word_dict) dict_size = len(word_dict)
``` ```
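A larger `BATCH_SIZE` makes training converge faster but also uses more memory. Because word-vector computation is large in scale, enable GPU training if your environment allows it to get results sooner.
Unlike the earlier PaddlePaddle v2 release, the new Fluid release no longer requires us to compute word vectors by hand. PaddlePaddle provides a built-in method, `fluid.layers.embedding`, which we can use directly to build the N-gram neural network. Unlike the earlier PaddlePaddle v2 release, the new Fluid release no longer requires us to compute word vectors by hand. PaddlePaddle provides a built-in method, `fluid.layers.embedding`, which we can use directly to build the N-gram neural network.
- Let's define the structure of our N-gram neural network. This structure is used in both training and inference. Because word vectors are sparse, we pass `is_sparse=True`, which speeds up updates to the sparse matrix. - Let's define the structure of our N-gram neural network. This structure is used in both training and inference. Because word vectors are sparse, we pass `is_sparse=True`, which speeds up updates to the sparse matrix.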
......
...@@ -15,28 +15,15 @@ from __future__ import print_function ...@@ -15,28 +15,15 @@ from __future__ import print_function
import paddle as paddle import paddle as paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import six import six
import sys
try:
from paddle.fluid.contrib.trainer import *
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
import numpy import numpy
import sys import sys
from functools import partial
import math import math
import os
EMBED_SIZE = 32 EMBED_SIZE = 32
HIDDEN_SIZE = 256 HIDDEN_SIZE = 256
N = 5 N = 5
BATCH_SIZE = 100 BATCH_SIZE = 32
PASS_NUM = 100
use_cuda = False # set to True if training with GPU use_cuda = False # set to True if training with GPU
...@@ -44,32 +31,28 @@ word_dict = paddle.dataset.imikolov.build_dict() ...@@ -44,32 +31,28 @@ word_dict = paddle.dataset.imikolov.build_dict()
dict_size = len(word_dict) dict_size = len(word_dict)
def inference_program(is_sparse): def inference_program(words, is_sparse):
first_word = fluid.layers.data(name='firstw', shape=[1], dtype='int64')
second_word = fluid.layers.data(name='secondw', shape=[1], dtype='int64')
third_word = fluid.layers.data(name='thirdw', shape=[1], dtype='int64')
fourth_word = fluid.layers.data(name='fourthw', shape=[1], dtype='int64')
embed_first = fluid.layers.embedding( embed_first = fluid.layers.embedding(
input=first_word, input=words[0],
size=[dict_size, EMBED_SIZE], size=[dict_size, EMBED_SIZE],
dtype='float32', dtype='float32',
is_sparse=is_sparse, is_sparse=is_sparse,
param_attr='shared_w') param_attr='shared_w')
embed_second = fluid.layers.embedding( embed_second = fluid.layers.embedding(
input=second_word, input=words[1],
size=[dict_size, EMBED_SIZE], size=[dict_size, EMBED_SIZE],
dtype='float32', dtype='float32',
is_sparse=is_sparse, is_sparse=is_sparse,
param_attr='shared_w') param_attr='shared_w')
embed_third = fluid.layers.embedding( embed_third = fluid.layers.embedding(
input=third_word, input=words[2],
size=[dict_size, EMBED_SIZE], size=[dict_size, EMBED_SIZE],
dtype='float32', dtype='float32',
is_sparse=is_sparse, is_sparse=is_sparse,
param_attr='shared_w') param_attr='shared_w')
embed_fourth = fluid.layers.embedding( embed_fourth = fluid.layers.embedding(
input=fourth_word, input=words[3],
size=[dict_size, EMBED_SIZE], size=[dict_size, EMBED_SIZE],
dtype='float32', dtype='float32',
is_sparse=is_sparse, is_sparse=is_sparse,
...@@ -83,11 +66,10 @@ def inference_program(is_sparse): ...@@ -83,11 +66,10 @@ def inference_program(is_sparse):
return predict_word return predict_word
def train_program(is_sparse): def train_program(predict_word):
# The declaration of 'next_word' must be after the invoking of inference_program, # The declaration of 'next_word' must be after the invoking of inference_program,
# or the data input order of train program would be [next_word, firstw, secondw, # or the data input order of train program would be [next_word, firstw, secondw,
# thirdw, fourthw], which is not correct. # thirdw, fourthw], which is not correct.
predict_word = inference_program(is_sparse)
next_word = fluid.layers.data(name='nextw', shape=[1], dtype='int64') next_word = fluid.layers.data(name='nextw', shape=[1], dtype='int64')
cost = fluid.layers.cross_entropy(input=predict_word, label=next_word) cost = fluid.layers.cross_entropy(input=predict_word, label=next_word)
avg_cost = fluid.layers.mean(cost) avg_cost = fluid.layers.mean(cost)
...@@ -100,59 +82,112 @@ def optimizer_func(): ...@@ -100,59 +82,112 @@ def optimizer_func():
regularization=fluid.regularizer.L2DecayRegularizer(8e-4)) regularization=fluid.regularizer.L2DecayRegularizer(8e-4))
def train(use_cuda, train_program, params_dirname): def train(if_use_cuda, params_dirname, is_sparse=True):
place = fluid.CUDAPlace(0) if if_use_cuda else fluid.CPUPlace()
train_reader = paddle.batch( train_reader = paddle.batch(
paddle.dataset.imikolov.train(word_dict, N), BATCH_SIZE) paddle.dataset.imikolov.train(word_dict, N), BATCH_SIZE)
test_reader = paddle.batch( test_reader = paddle.batch(
paddle.dataset.imikolov.test(word_dict, N), BATCH_SIZE) paddle.dataset.imikolov.test(word_dict, N), BATCH_SIZE)
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace() first_word = fluid.layers.data(name='firstw', shape=[1], dtype='int64')
second_word = fluid.layers.data(name='secondw', shape=[1], dtype='int64')
def event_handler(event): third_word = fluid.layers.data(name='thirdw', shape=[1], dtype='int64')
if isinstance(event, EndStepEvent): forth_word = fluid.layers.data(name='fourthw', shape=[1], dtype='int64')
outs = trainer.test( next_word = fluid.layers.data(name='nextw', shape=[1], dtype='int64')
reader=test_reader,
feed_order=['firstw', 'secondw', 'thirdw', 'fourthw', 'nextw'])
avg_cost = outs[0]
if event.step % 10 == 0:
print("Step %d: Average Cost %f" % (event.step, avg_cost))
word_list = [first_word, second_word, third_word, forth_word, next_word]
feed_order = ['firstw', 'secondw', 'thirdw', 'fourthw', 'nextw']
main_program = fluid.default_main_program()
star_program = fluid.default_startup_program()
predict_word = inference_program(word_list, is_sparse)
avg_cost = train_program(predict_word)
test_program = main_program.clone(for_test=True)
sgd_optimizer = optimizer_func()
sgd_optimizer.minimize(avg_cost)
exe = fluid.Executor(place)
def train_test(program, reader):
count = 0
feed_var_list = [
program.global_block().var(var_name) for var_name in feed_order
]
feeder_test = fluid.DataFeeder(feed_list=feed_var_list, place=place)
test_exe = fluid.Executor(place)
accumulated = len([avg_cost]) * [0]
for test_data in reader():
avg_cost_np = test_exe.run(
program=program,
feed=feeder_test.feed(test_data),
fetch_list=[avg_cost])
accumulated = [
x[0] + x[1][0] for x in zip(accumulated, avg_cost_np)
]
count += 1
return [x / count for x in accumulated]
def train_loop():
step = 0
feed_var_list_loop = [
main_program.global_block().var(var_name) for var_name in feed_order
]
feeder = fluid.DataFeeder(feed_list=feed_var_list_loop, place=place)
exe.run(star_program)
for pass_id in range(PASS_NUM):
for data in train_reader():
avg_cost_np = exe.run(
main_program, feed=feeder.feed(data), fetch_list=[avg_cost])
if step % 10 == 0:
#outs = train_test(test_program, test_reader)
# print("Step %d: Average Cost %f" % (step, avg_cost_np[0]))
print("Step %d: Average Cost %f" % (step, avg_cost_np[0]))
# it will take a few hours.
# If average cost is lower than 5.8, we consider the model good enough to stop. # If average cost is lower than 5.8, we consider the model good enough to stop.
# Note 5.8 is a relatively high value. In order to get a better model, one should # Note 5.8 is a relatively high value. In order to get a better model, one should
# aim for avg_cost lower than 3.5. But the training could take longer time. # aim for avg_cost lower than 3.5. But the training could take longer time.
if avg_cost < 5.8: if avg_cost_np[0] < 5.8:
trainer.save_params(params_dirname) if params_dirname is not None:
trainer.stop() fluid.io.save_inference_model(params_dirname, [
'firstw', 'secondw', 'thirdw', 'fourthw'
if math.isnan(avg_cost): ], [predict_word], exe)
return
step += 1
if math.isnan(float(avg_cost_np[0])):
sys.exit("got NaN loss, training failed.") sys.exit("got NaN loss, training failed.")
trainer = Trainer( raise AssertionError("Cost is too large {0:2.2}".format(avg_cost_np[0]))
train_func=train_program,
# optimizer=fluid.optimizer.SGD(learning_rate=0.001),
optimizer_func=optimizer_func,
place=place)
trainer.train( train_loop()
reader=train_reader,
num_epochs=1,
event_handler=event_handler,
feed_order=['firstw', 'secondw', 'thirdw', 'fourthw', 'nextw'])
def infer(use_cuda, inference_program, params_dirname=None): def infer(use_cuda, params_dirname=None):
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace() place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
inferencer = Inferencer(
infer_func=inference_program, param_path=params_dirname, place=place) exe = fluid.Executor(place)
inference_scope = fluid.core.Scope()
with fluid.scope_guard(inference_scope):
# Use fluid.io.load_inference_model to obtain the inference program desc,
        # the feed_target_names (the names of variables that will be fed
# data using feed operators), and the fetch_targets (variables that
# we want to obtain data from using fetch operators).
[inferencer, feed_target_names,
fetch_targets] = fluid.io.load_inference_model(params_dirname, exe)
# Setup inputs by creating 4 LoDTensors representing 4 words. Here each word # Setup inputs by creating 4 LoDTensors representing 4 words. Here each word
# is simply an index to look up for the corresponding word vector and hence # is simply an index to look up for the corresponding word vector and hence
# the shape of word (base_shape) should be [1]. The length-based level of # the shape of word (base_shape) should be [1]. The recursive_sequence_lengths,
# detail (lod) info of each LoDtensor should be [[1]] meaning there is only # which is length-based level of detail (lod) of each LoDTensor, should be [[1]]
# one lod_level and there is only one sequence of one word on this level. # meaning there is only one level of detail and there is only one sequence of
# Note that lod info should be a list of lists. # one word on this level.
# Note that recursive_sequence_lengths should be a list of lists.
data1 = [[211]] # 'among' data1 = [[211]] # 'among'
data2 = [[6]] # 'a' data2 = [[6]] # 'a'
data3 = [[96]] # 'group' data3 = [[96]] # 'group'
...@@ -164,23 +199,36 @@ def infer(use_cuda, inference_program, params_dirname=None): ...@@ -164,23 +199,36 @@ def infer(use_cuda, inference_program, params_dirname=None):
third_word = fluid.create_lod_tensor(data3, lod, place) third_word = fluid.create_lod_tensor(data3, lod, place)
fourth_word = fluid.create_lod_tensor(data4, lod, place) fourth_word = fluid.create_lod_tensor(data4, lod, place)
result = inferencer.infer( assert feed_target_names[0] == 'firstw'
{ assert feed_target_names[1] == 'secondw'
'firstw': first_word, assert feed_target_names[2] == 'thirdw'
'secondw': second_word, assert feed_target_names[3] == 'fourthw'
'thirdw': third_word,
'fourthw': fourth_word # Construct feed as a dictionary of {feed_target_name: feed_target_data}
# and results will contain a list of data corresponding to fetch_targets.
results = exe.run(
inferencer,
feed={
feed_target_names[0]: first_word,
feed_target_names[1]: second_word,
feed_target_names[2]: third_word,
feed_target_names[3]: fourth_word
}, },
fetch_list=fetch_targets,
return_numpy=False) return_numpy=False)
print(numpy.array(result[0])) print(numpy.array(results[0]))
most_possible_word_index = numpy.argmax(result[0]) most_possible_word_index = numpy.argmax(results[0])
print(most_possible_word_index) print(most_possible_word_index)
print([ print([
key for key, value in six.iteritems(word_dict) key for key, value in six.iteritems(word_dict)
if value == most_possible_word_index if value == most_possible_word_index
][0]) ][0])
print(results[0].recursive_sequence_lengths())
np_data = numpy.array(results[0])
print("Inference Shape: ", np_data.shape)
def main(use_cuda, is_sparse): def main(use_cuda, is_sparse):
if use_cuda and not fluid.core.is_compiled_with_cuda(): if use_cuda and not fluid.core.is_compiled_with_cuda():
...@@ -189,14 +237,11 @@ def main(use_cuda, is_sparse): ...@@ -189,14 +237,11 @@ def main(use_cuda, is_sparse):
params_dirname = "word2vec.inference.model" params_dirname = "word2vec.inference.model"
train( train(
use_cuda=use_cuda, if_use_cuda=use_cuda,
train_program=partial(train_program, is_sparse), params_dirname=params_dirname,
params_dirname=params_dirname) is_sparse=is_sparse)
infer( infer(use_cuda=use_cuda, params_dirname=params_dirname)
use_cuda=use_cuda,
inference_program=partial(inference_program, is_sparse),
params_dirname=params_dirname)
if __name__ == '__main__': if __name__ == '__main__':
......
...@@ -225,15 +225,6 @@ import paddle ...@@ -225,15 +225,6 @@ import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import paddle.fluid.layers as layers import paddle.fluid.layers as layers
import paddle.fluid.nets as nets import paddle.fluid.nets as nets
try:
from paddle.fluid.contrib.trainer import *
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
IS_SPARSE = True IS_SPARSE = True
USE_GPU = False USE_GPU = False
...@@ -414,13 +405,8 @@ test_reader = paddle.batch( ...@@ -414,13 +405,8 @@ test_reader = paddle.batch(
paddle.dataset.movielens.test(), batch_size=BATCH_SIZE) paddle.dataset.movielens.test(), batch_size=BATCH_SIZE)
``` ```
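### Construct the trainer ### Construct the training process (trainer)
The trainer needs a training program and an optimizer function. Here we construct a training process, including the optimizer function.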
```python
trainer = Trainer(
train_func=train_program, place=place, optimizer_func=optimizer_func)
```
### Feed data ### Feed data
...@@ -433,56 +419,92 @@ feed_order = [ ...@@ -433,56 +419,92 @@ feed_order = [
] ]
``` ```
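### Event handler ### Build the training program and the test program
The callback `event_handler` is invoked after a predefined event occurs. For example, we can check the error after each training step. Build the training program and the test program separately, and bring in the training optimizer.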
```python ```python
# Specify the directory path to save the parameters main_program = fluid.default_main_program()
params_dirname = "recommender_system.inference.model" star_program = fluid.default_startup_program()
def event_handler(event): [avg_cost, scale_infer] = train_program()
if isinstance(event, EndStepEvent):
test_reader = paddle.batch( test_program = main_program.clone(for_test=True)
paddle.dataset.movielens.test(), batch_size=BATCH_SIZE) sgd_optimizer = optimizer_func()
avg_cost_set = trainer.test( sgd_optimizer.minimize(avg_cost)
reader=test_reader, feed_order=feed_order) exe = fluid.Executor(place)
def train_test(program, reader):
count = 0
feed_var_list = [
program.global_block().var(var_name) for var_name in feed_order
]
feeder_test = fluid.DataFeeder(
feed_list=feed_var_list, place=place)
test_exe = fluid.Executor(place)
accumulated = len([avg_cost, scale_infer]) * [0]
for test_data in reader():
avg_cost_np = test_exe.run(program=program,
feed=feeder_test.feed(test_data),
fetch_list=[avg_cost, scale_infer])
accumulated = [x[0] + x[1][0] for x in zip(accumulated, avg_cost_np)]
count += 1
return [x / count for x in accumulated]
```
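# get avg cost ### Build the main training loop and start training
 avg_cost = np.array(avg_cost_set).mean() We run the training loop using the number of passes defined above (`PASS_NUM`) and a few other parameters, and run a test on every iteration; when the test result is good enough, training exits and the trained parameters are saved.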
print("avg_cost: %s" % avg_cost) ```python
# Specify the directory path to save the parameters
params_dirname = "recommender_system.inference.model"
if float(avg_cost) < 4: # Change this number to adjust accuracy from paddle.utils.plot import Ploter
trainer.save_params(params_dirname) test_prompt = "Test cost"
trainer.stop() plot_cost = Ploter(test_prompt)
def train_loop():
feed_list = [
main_program.global_block().var(var_name) for var_name in feed_order
]
feeder = fluid.DataFeeder(feed_list, place)
exe.run(star_program)
for pass_id in range(PASS_NUM):
for batch_id, data in enumerate(train_reader()):
# train a mini-batch
outs = exe.run(program=main_program,
feed=feeder.feed(data),
fetch_list=[avg_cost])
out = np.array(outs[0])
avg_cost_set = train_test(test_program, test_reader)
# get test avg_cost
test_avg_cost = np.array(avg_cost_set).mean()
plot_cost.append(test_prompt, batch_id, outs[0])
plot_cost.plot()
print("avg_cost: %s" % test_avg_cost)
if batch_id == 20:
if params_dirname is not None:
fluid.io.save_inference_model(params_dirname, [
"user_id", "gender_id", "age_id", "job_id",
"movie_id", "category_id", "movie_title"
], [scale_infer], exe)
return
else: else:
print('BatchID {0}, Test Loss {1:0.2}'.format(event.epoch + 1, print('BatchID {0}, Test Loss {1:0.2}'.format(pass_id + 1,
float(avg_cost))) float(test_avg_cost)))
if math.isnan(float(avg_cost)):
if math.isnan(float(out[0])):
sys.exit("got NaN loss, training failed.") sys.exit("got NaN loss, training failed.")
``` ```
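The `train_test` helper in the new code averages the fetched values over all test batches. The following is a minimal sketch of that accumulation in plain Python, with no Paddle dependency. The function name `average_fetched_metrics` and the toy batches are assumptions made for illustration; each batch stands in for the list returned by `exe.run`, where every fetched item is a sequence whose first element is the metric value.

```python
def average_fetched_metrics(batches):
    """Average the first element of every fetched item across batches.

    `batches` is an iterable of per-batch results; each result is a list
    of sequences (one per fetched variable). Hypothetical helper, not part
    of the Paddle API.
    """
    accumulated = None
    count = 0
    for fetched in batches:
        firsts = [values[0] for values in fetched]
        if accumulated is None:
            accumulated = [0.0] * len(firsts)
        accumulated = [total + value for total, value in zip(accumulated, firsts)]
        count += 1
    if count == 0:
        raise ValueError("reader produced no batches")
    return [total / count for total in accumulated]


# Two toy batches, each with two fetched values (for example cost and a score).
fake_batches = [
    [[4.0], [3.0]],
    [[2.0], [5.0]],
]
averages = average_fetched_metrics(fake_batches)
print(averages)
```

Note that this is an equal-weight mean of per-batch values, so a smaller final batch counts as much as a full one; that matches the accumulation pattern shown in the diff rather than a per-sample average.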
### Start training
Finally, we pass in the number of training epochs (`num_epochs`) and a few other parameters, and call `trainer.train` to start training.
```python ```python
trainer.train( train_loop()
num_epochs=1,
event_handler=event_handler,
reader=train_reader,
feed_order=feed_order)
``` ```
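## Apply the model ## Apply the model
### Build the inferencer ### Generate test data
Pass in `inference_program` and `params_dirname` to initialize an inferencer; `params_dirname` is the directory that stores the parameters saved during training.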
```python
inferencer = Inferencer(
inference_program, param_path=params_dirname, place=place)
```
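### Generate input data for testing
Use the create_lod_tensor(data, lod, place) API to generate a level-of-detail (LoD) tensor. `data` is a list of sequences, each of which is a sequence of index numbers. `lod` is the level-of-detail information associated with `data`. For example, data = [[10, 2, 3], [2, 3]] contains two sequences, of length 3 and 2 respectively. Correspondingly, lod = [[3, 2]] holds one level of detail, indicating that `data` consists of two sequences of length 3 and 2. Use the create_lod_tensor(data, lod, place) API to generate a level-of-detail (LoD) tensor. `data` is a list of sequences, each of which is a sequence of index numbers. `lod` is the level-of-detail information associated with `data`. For example, data = [[10, 2, 3], [2, 3]] contains two sequences, of length 3 and 2 respectively. Correspondingly, lod = [[3, 2]] holds one level of detail, indicating that `data` consists of two sequences of length 3 and 2.
In this example, we try to predict the rating that the user with ID 1 gives to the movie 'Hunchback of Notre Dame'. In this example, we try to predict the rating that the user with ID 1 gives to the movie 'Hunchback of Notre Dame'.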
...@@ -500,13 +522,27 @@ movie_title = fluid.create_lod_tensor([[1069, 4140, 2923, 710, 988]], [[5]], ...@@ -500,13 +522,27 @@ movie_title = fluid.create_lod_tensor([[1069, 4140, 2923, 710, 988]], [[5]],
place) # 'hunchback','of','notre','dame','the' place) # 'hunchback','of','notre','dame','the'
``` ```
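To make the length-based level-of-detail information concrete, here is a small sketch in plain Python (no Paddle required) that derives `[[3, 2]]` from the example data and shows how the lengths map a flat list back to sequences. The helper names `lengths_to_offsets` and `split_sequences` are hypothetical, and the offset form is shown only as an illustration of what the lengths encode.

```python
from itertools import accumulate


def lengths_to_offsets(recursive_sequence_lengths):
    """Turn each level of sequence lengths into cumulative offsets."""
    return [[0] + list(accumulate(level)) for level in recursive_sequence_lengths]


def split_sequences(flat_data, lengths):
    """Cut a flat list back into sequences using one level of lengths."""
    offsets = [0] + list(accumulate(lengths))
    return [flat_data[start:end] for start, end in zip(offsets, offsets[1:])]


data = [[10, 2, 3], [2, 3]]
lod = [[len(seq) for seq in data]]            # one level: lengths of the two sequences
flat = [idx for seq in data for idx in seq]   # the indices stored back to back

print(lod)
print(lengths_to_offsets(lod))
print(split_sequences(flat, lod[0]))
```

For a single index such as `[[1]]` with lengths `[[1]]`, the same helpers describe one sequence of length one, which is the shape used for `user_id` and the other scalar inputs.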
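 ### Build the inference process and test
 As with training, we need to build an inference process. Here `params_dirname` is the directory used earlier to store the parameters saved during training.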
```python
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
exe = fluid.Executor(place)
inference_scope = fluid.core.Scope()
```
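### Test ### Test
Now we can run inference. The `feed_order` we provide should match the one used in training. Now we can run inference. The `feed_order` we provide should match the one used in training.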
```python ```python
results = inferencer.infer( with fluid.scope_guard(inference_scope):
{ [inferencer, feed_target_names,
fetch_targets] = fluid.io.load_inference_model(params_dirname, exe)
results = exe.run(inferencer,
feed={
'user_id': user_id, 'user_id': user_id,
'gender_id': gender_id, 'gender_id': gender_id,
'age_id': age_id, 'age_id': age_id,
...@@ -515,12 +551,13 @@ results = inferencer.infer( ...@@ -515,12 +551,13 @@ results = inferencer.infer(
'category_id': category_id, 'category_id': category_id,
'movie_title': movie_title 'movie_title': movie_title
}, },
fetch_list=fetch_targets,
return_numpy=False) return_numpy=False)
predict_rating = np.array(results[0])
predict_rating = np.array(results[0]) print("Predict Rating of user id 1 on movie \"" + infer_movie_name +
print("Predict Rating of user id 1 on movie \"" + infer_movie_name + "\" is " + str(predict_rating[0][0])) "\" is " + str(predict_rating[0][0]))
print("Actual Rating of user id 1 on movie \"" + infer_movie_name + "\" is 4.") print("Actual Rating of user id 1 on movie \"" + infer_movie_name +
"\" is 4.")
``` ```
## Summary ## Summary
......
...@@ -267,15 +267,6 @@ import paddle ...@@ -267,15 +267,6 @@ import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import paddle.fluid.layers as layers import paddle.fluid.layers as layers
import paddle.fluid.nets as nets import paddle.fluid.nets as nets
try:
from paddle.fluid.contrib.trainer import *
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
IS_SPARSE = True IS_SPARSE = True
USE_GPU = False USE_GPU = False
...@@ -456,13 +447,8 @@ test_reader = paddle.batch( ...@@ -456,13 +447,8 @@ test_reader = paddle.batch(
paddle.dataset.movielens.test(), batch_size=BATCH_SIZE) paddle.dataset.movielens.test(), batch_size=BATCH_SIZE)
``` ```
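### Construct the trainer ### Construct the training process (trainer)
The trainer needs a training program and an optimizer function. Here we construct a training process, including the optimizer function.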
```python
trainer = Trainer(
train_func=train_program, place=place, optimizer_func=optimizer_func)
```
### Feed data ### Feed data
...@@ -475,56 +461,92 @@ feed_order = [ ...@@ -475,56 +461,92 @@ feed_order = [
] ]
``` ```
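### Event handler ### Build the training program and the test program
The callback `event_handler` is invoked after a predefined event occurs. For example, we can check the error after each training step. Build the training program and the test program separately, and bring in the training optimizer.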
```python ```python
# Specify the directory path to save the parameters main_program = fluid.default_main_program()
params_dirname = "recommender_system.inference.model" star_program = fluid.default_startup_program()
def event_handler(event): [avg_cost, scale_infer] = train_program()
if isinstance(event, EndStepEvent):
test_reader = paddle.batch( test_program = main_program.clone(for_test=True)
paddle.dataset.movielens.test(), batch_size=BATCH_SIZE) sgd_optimizer = optimizer_func()
avg_cost_set = trainer.test( sgd_optimizer.minimize(avg_cost)
reader=test_reader, feed_order=feed_order) exe = fluid.Executor(place)
def train_test(program, reader):
count = 0
feed_var_list = [
program.global_block().var(var_name) for var_name in feed_order
]
feeder_test = fluid.DataFeeder(
feed_list=feed_var_list, place=place)
test_exe = fluid.Executor(place)
accumulated = len([avg_cost, scale_infer]) * [0]
for test_data in reader():
avg_cost_np = test_exe.run(program=program,
feed=feeder_test.feed(test_data),
fetch_list=[avg_cost, scale_infer])
accumulated = [x[0] + x[1][0] for x in zip(accumulated, avg_cost_np)]
count += 1
return [x / count for x in accumulated]
```
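# get avg cost ### Build the main training loop and start training
 avg_cost = np.array(avg_cost_set).mean() We run the training loop using the number of passes defined above (`PASS_NUM`) and a few other parameters, and run a test on every iteration; when the test result is good enough, training exits and the trained parameters are saved.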
print("avg_cost: %s" % avg_cost) ```python
# Specify the directory path to save the parameters
params_dirname = "recommender_system.inference.model"
if float(avg_cost) < 4: # Change this number to adjust accuracy from paddle.utils.plot import Ploter
trainer.save_params(params_dirname) test_prompt = "Test cost"
trainer.stop() plot_cost = Ploter(test_prompt)
def train_loop():
feed_list = [
main_program.global_block().var(var_name) for var_name in feed_order
]
feeder = fluid.DataFeeder(feed_list, place)
exe.run(star_program)
for pass_id in range(PASS_NUM):
for batch_id, data in enumerate(train_reader()):
# train a mini-batch
outs = exe.run(program=main_program,
feed=feeder.feed(data),
fetch_list=[avg_cost])
out = np.array(outs[0])
avg_cost_set = train_test(test_program, test_reader)
# get test avg_cost
test_avg_cost = np.array(avg_cost_set).mean()
plot_cost.append(test_prompt, batch_id, outs[0])
plot_cost.plot()
print("avg_cost: %s" % test_avg_cost)
if batch_id == 20:
if params_dirname is not None:
fluid.io.save_inference_model(params_dirname, [
"user_id", "gender_id", "age_id", "job_id",
"movie_id", "category_id", "movie_title"
], [scale_infer], exe)
return
else: else:
print('BatchID {0}, Test Loss {1:0.2}'.format(event.epoch + 1, print('BatchID {0}, Test Loss {1:0.2}'.format(pass_id + 1,
float(avg_cost))) float(test_avg_cost)))
if math.isnan(float(avg_cost)):
if math.isnan(float(out[0])):
sys.exit("got NaN loss, training failed.") sys.exit("got NaN loss, training failed.")
``` ```
### Start training
Finally, we pass in the number of training epochs (`num_epochs`) and a few other parameters, and call `trainer.train` to start training.
```python ```python
trainer.train( train_loop()
num_epochs=1,
event_handler=event_handler,
reader=train_reader,
feed_order=feed_order)
``` ```
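## Apply the model ## Apply the model
### Build the inferencer ### Generate test data
Pass in `inference_program` and `params_dirname` to initialize an inferencer; `params_dirname` is the directory that stores the parameters saved during training.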
```python
inferencer = Inferencer(
inference_program, param_path=params_dirname, place=place)
```
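### Generate input data for testing
Use the create_lod_tensor(data, lod, place) API to generate a level-of-detail (LoD) tensor. `data` is a list of sequences, each of which is a sequence of index numbers. `lod` is the level-of-detail information associated with `data`. For example, data = [[10, 2, 3], [2, 3]] contains two sequences, of length 3 and 2 respectively. Correspondingly, lod = [[3, 2]] holds one level of detail, indicating that `data` consists of two sequences of length 3 and 2. Use the create_lod_tensor(data, lod, place) API to generate a level-of-detail (LoD) tensor. `data` is a list of sequences, each of which is a sequence of index numbers. `lod` is the level-of-detail information associated with `data`. For example, data = [[10, 2, 3], [2, 3]] contains two sequences, of length 3 and 2 respectively. Correspondingly, lod = [[3, 2]] holds one level of detail, indicating that `data` consists of two sequences of length 3 and 2.
In this example, we try to predict the rating that the user with ID 1 gives to the movie 'Hunchback of Notre Dame'. In this example, we try to predict the rating that the user with ID 1 gives to the movie 'Hunchback of Notre Dame'.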
...@@ -542,13 +564,27 @@ movie_title = fluid.create_lod_tensor([[1069, 4140, 2923, 710, 988]], [[5]], ...@@ -542,13 +564,27 @@ movie_title = fluid.create_lod_tensor([[1069, 4140, 2923, 710, 988]], [[5]],
place) # 'hunchback','of','notre','dame','the' place) # 'hunchback','of','notre','dame','the'
``` ```
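 ### Build the inference process and test
 As with training, we need to build an inference process. Here `params_dirname` is the directory used earlier to store the parameters saved during training.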
```python
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
exe = fluid.Executor(place)
inference_scope = fluid.core.Scope()
```
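### Test ### Test
Now we can run inference. The `feed_order` we provide should match the one used in training. Now we can run inference. The `feed_order` we provide should match the one used in training.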
```python ```python
results = inferencer.infer( with fluid.scope_guard(inference_scope):
{ [inferencer, feed_target_names,
fetch_targets] = fluid.io.load_inference_model(params_dirname, exe)
results = exe.run(inferencer,
feed={
'user_id': user_id, 'user_id': user_id,
'gender_id': gender_id, 'gender_id': gender_id,
'age_id': age_id, 'age_id': age_id,
...@@ -557,12 +593,13 @@ results = inferencer.infer( ...@@ -557,12 +593,13 @@ results = inferencer.infer(
'category_id': category_id, 'category_id': category_id,
'movie_title': movie_title 'movie_title': movie_title
}, },
fetch_list=fetch_targets,
return_numpy=False) return_numpy=False)
predict_rating = np.array(results[0])
predict_rating = np.array(results[0]) print("Predict Rating of user id 1 on movie \"" + infer_movie_name +
print("Predict Rating of user id 1 on movie \"" + infer_movie_name + "\" is " + str(predict_rating[0][0])) "\" is " + str(predict_rating[0][0]))
print("Actual Rating of user id 1 on movie \"" + infer_movie_name + "\" is 4.") print("Actual Rating of user id 1 on movie \"" + infer_movie_name +
"\" is 4.")
``` ```
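The predicted rating printed above is the value of `scale_infer`, which the accompanying training script builds as `layers.cos_sim` between the combined user and movie features followed by `layers.scale` with `scale=5.0`. The sketch below reproduces that arithmetic in plain Python on toy vectors. The function names and the feature values are assumptions for illustration only; the real features are learned by the network.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def predicted_rating(user_features, movie_features, scale=5.0):
    """Scale the cosine similarity, mirroring cos_sim followed by scale."""
    return scale * cosine_similarity(user_features, movie_features)


# Toy feature vectors (hypothetical values, not taken from a trained model).
user = [1.0, 0.0]
movie = [1.0, 1.0]
print(predicted_rating(user, movie))
```

Because cosine similarity lies in [-1, 1], the scaled output lies in [-5, 5]; training against the squared error with the true score is what pushes predictions toward the actual rating range.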
## Summary ## Summary
......
...@@ -20,19 +20,11 @@ import paddle ...@@ -20,19 +20,11 @@ import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import paddle.fluid.layers as layers import paddle.fluid.layers as layers
import paddle.fluid.nets as nets import paddle.fluid.nets as nets
try:
from paddle.fluid.contrib.trainer import *
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
IS_SPARSE = True IS_SPARSE = True
USE_GPU = False USE_GPU = False
BATCH_SIZE = 256 BATCH_SIZE = 256
PASS_NUM = 100
def get_usr_combined_features(): def get_usr_combined_features():
...@@ -148,71 +140,101 @@ def inference_program(): ...@@ -148,71 +140,101 @@ def inference_program():
inference = layers.cos_sim(X=usr_combined_features, Y=mov_combined_features) inference = layers.cos_sim(X=usr_combined_features, Y=mov_combined_features)
scale_infer = layers.scale(x=inference, scale=5.0) scale_infer = layers.scale(x=inference, scale=5.0)
return scale_infer
def train_program():
scale_infer = inference_program()
label = layers.data(name='score', shape=[1], dtype='float32') label = layers.data(name='score', shape=[1], dtype='float32')
square_cost = layers.square_error_cost(input=scale_infer, label=label) square_cost = layers.square_error_cost(input=scale_infer, label=label)
avg_cost = layers.mean(square_cost) avg_cost = layers.mean(square_cost)
return [avg_cost, scale_infer] return scale_infer, avg_cost
def optimizer_func(): def optimizer_func():
return fluid.optimizer.SGD(learning_rate=0.2) return fluid.optimizer.SGD(learning_rate=0.2)
def train(use_cuda, train_program, params_dirname): def train(use_cuda, params_dirname):
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace() place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
trainer = Trainer( train_reader = paddle.batch(
train_func=train_program, place=place, optimizer_func=optimizer_func) paddle.reader.shuffle(paddle.dataset.movielens.train(), buf_size=8192),
batch_size=BATCH_SIZE)
test_reader = paddle.batch(
paddle.dataset.movielens.test(), batch_size=BATCH_SIZE)
feed_order = [ feed_order = [
'user_id', 'gender_id', 'age_id', 'job_id', 'movie_id', 'category_id', 'user_id', 'gender_id', 'age_id', 'job_id', 'movie_id', 'category_id',
'movie_title', 'score' 'movie_title', 'score'
] ]
def event_handler(event): main_program = fluid.default_main_program()
if isinstance(event, EndStepEvent): star_program = fluid.default_startup_program()
test_reader = paddle.batch( scale_infer, avg_cost = inference_program()
paddle.dataset.movielens.test(), batch_size=BATCH_SIZE)
avg_cost_set = trainer.test(
reader=test_reader, feed_order=feed_order)
# get avg cost test_program = main_program.clone(for_test=True)
avg_cost = np.array(avg_cost_set).mean() sgd_optimizer = optimizer_func()
sgd_optimizer.minimize(avg_cost)
exe = fluid.Executor(place)
print("avg_cost: %s" % avg_cost) def train_test(program, reader):
count = 0
feed_var_list = [
program.global_block().var(var_name) for var_name in feed_order
]
feeder_test = fluid.DataFeeder(feed_list=feed_var_list, place=place)
test_exe = fluid.Executor(place)
accumulated = len([avg_cost, scale_infer]) * [0]
for test_data in reader():
avg_cost_np = test_exe.run(
program=program,
feed=feeder_test.feed(test_data),
fetch_list=[avg_cost, scale_infer])
accumulated = [
x[0] + x[1][0] for x in zip(accumulated, avg_cost_np)
]
count += 1
return [x / count for x in accumulated]
if float(avg_cost) < 4: # Change this number to adjust accuracy def train_loop():
trainer.save_params(params_dirname) feed_list = [
trainer.stop() main_program.global_block().var(var_name) for var_name in feed_order
]
feeder = fluid.DataFeeder(feed_list, place)
exe.run(star_program)
for pass_id in range(PASS_NUM):
for batch_id, data in enumerate(train_reader()):
# train a mini-batch
outs = exe.run(
program=main_program,
feed=feeder.feed(data),
fetch_list=[avg_cost])
out = np.array(outs[0])
avg_cost_set = train_test(test_program, test_reader)
# get test avg_cost
test_avg_cost = np.array(avg_cost_set).mean()
print("avg_cost: %s" % test_avg_cost)
# if test_avg_cost < 4.0: # Change this number to adjust accuracy
if batch_id == 20:
if params_dirname is not None:
fluid.io.save_inference_model(params_dirname, [
"user_id", "gender_id", "age_id", "job_id",
"movie_id", "category_id", "movie_title"
], [scale_infer], exe)
return
else: else:
print('BatchID {0}, Test Loss {1:0.2}'.format(event.epoch + 1, print('BatchID {0}, Test Loss {1:0.2}'.format(
float(avg_cost))) pass_id + 1, float(test_avg_cost)))
if math.isnan(float(avg_cost)):
sys.exit("got NaN loss, training failed.")
train_reader = paddle.batch( if math.isnan(float(out[0])):
paddle.reader.shuffle(paddle.dataset.movielens.train(), buf_size=8192), sys.exit("got NaN loss, training failed.")
batch_size=BATCH_SIZE)
trainer.train( train_loop()
num_epochs=1,
event_handler=event_handler,
reader=train_reader,
feed_order=feed_order)
def infer(use_cuda, inference_program, params_dirname): def infer(use_cuda, params_dirname):
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace() place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
inferencer = Inferencer(
inference_program, param_path=params_dirname, place=place)
# Use the first data from paddle.dataset.movielens.test() as input. # Use the first data from paddle.dataset.movielens.test() as input.
# Use create_lod_tensor(data, lod, place) API to generate LoD Tensor, # Use create_lod_tensor(data, lod, place) API to generate LoD Tensor,
...@@ -225,27 +247,66 @@ def infer(use_cuda, inference_program, params_dirname): ...@@ -225,27 +247,66 @@ def infer(use_cuda, inference_program, params_dirname):
infer_movie_id = 783 infer_movie_id = 783
infer_movie_name = paddle.dataset.movielens.movie_info()[ infer_movie_name = paddle.dataset.movielens.movie_info()[
infer_movie_id].title infer_movie_id].title
exe = fluid.Executor(place)
inference_scope = fluid.core.Scope()
with fluid.scope_guard(inference_scope):
# Use fluid.io.load_inference_model to obtain the inference program desc,
 # the feed_target_names (the names of variables that will be fed
# data using feed operators), and the fetch_targets (variables that
# we want to obtain data from using fetch operators).
[inferencer, feed_target_names,
fetch_targets] = fluid.io.load_inference_model(params_dirname, exe)
# Use the first data from paddle.dataset.movielens.test() as input
assert feed_target_names[0] == "user_id"
# Use create_lod_tensor(data, recursive_sequence_lengths, place) API
# to generate LoD Tensor where `data` is a list of sequences of index
# numbers, `recursive_sequence_lengths` is the length-based level of detail
# (lod) info associated with `data`.
# For example, data = [[10, 2, 3], [2, 3]] means that it contains
# two sequences of indexes, of length 3 and 2, respectively.
# Correspondingly, recursive_sequence_lengths = [[3, 2]] contains one
# level of detail info, indicating that `data` consists of two sequences
# of length 3 and 2, respectively.
user_id = fluid.create_lod_tensor([[1]], [[1]], place) user_id = fluid.create_lod_tensor([[1]], [[1]], place)
assert feed_target_names[1] == "gender_id"
gender_id = fluid.create_lod_tensor([[1]], [[1]], place) gender_id = fluid.create_lod_tensor([[1]], [[1]], place)
assert feed_target_names[2] == "age_id"
age_id = fluid.create_lod_tensor([[0]], [[1]], place) age_id = fluid.create_lod_tensor([[0]], [[1]], place)
assert feed_target_names[3] == "job_id"
job_id = fluid.create_lod_tensor([[10]], [[1]], place) job_id = fluid.create_lod_tensor([[10]], [[1]], place)
assert feed_target_names[4] == "movie_id"
movie_id = fluid.create_lod_tensor([[783]], [[1]], place) movie_id = fluid.create_lod_tensor([[783]], [[1]], place)
assert feed_target_names[5] == "category_id"
category_id = fluid.create_lod_tensor([[10, 8, 9]], [[3]], place) category_id = fluid.create_lod_tensor([[10, 8, 9]], [[3]], place)
movie_title = fluid.create_lod_tensor([[1069, 4140, 2923, 710, 988]], [[5]],
place) assert feed_target_names[6] == "movie_title"
movie_title = fluid.create_lod_tensor([[1069, 4140, 2923, 710, 988]],
results = inferencer.infer( [[5]], place)
{
'user_id': user_id, # Construct feed as a dictionary of {feed_target_name: feed_target_data}
'gender_id': gender_id, # and results will contain a list of data corresponding to fetch_targets.
'age_id': age_id, results = exe.run(
'job_id': job_id, inferencer,
'movie_id': movie_id, feed={
'category_id': category_id, feed_target_names[0]: user_id,
'movie_title': movie_title feed_target_names[1]: gender_id,
feed_target_names[2]: age_id,
feed_target_names[3]: job_id,
feed_target_names[4]: movie_id,
feed_target_names[5]: category_id,
feed_target_names[6]: movie_title
}, },
fetch_list=fetch_targets,
return_numpy=False) return_numpy=False)
predict_rating = np.array(results[0]) predict_rating = np.array(results[0])
print("Predict Rating of user id 1 on movie \"" + infer_movie_name + print("Predict Rating of user id 1 on movie \"" + infer_movie_name +
"\" is " + str(predict_rating[0][0])) "\" is " + str(predict_rating[0][0]))
...@@ -257,14 +318,8 @@ def main(use_cuda): ...@@ -257,14 +318,8 @@ def main(use_cuda):
if use_cuda and not fluid.core.is_compiled_with_cuda(): if use_cuda and not fluid.core.is_compiled_with_cuda():
return return
params_dirname = "recommender_system.inference.model" params_dirname = "recommender_system.inference.model"
train( train(use_cuda=use_cuda, params_dirname=params_dirname)
use_cuda=use_cuda, infer(use_cuda=use_cuda, params_dirname=params_dirname)
train_program=train_program,
params_dirname=params_dirname)
infer(
use_cuda=use_cuda,
inference_program=inference_program,
params_dirname=params_dirname)
if __name__ == '__main__': if __name__ == '__main__':
......
...@@ -110,24 +110,16 @@ Paddle implements automatic download and reading of the imdb dataset in `dataset/imdb.py` ...@@ -110,24 +110,16 @@ Paddle implements automatic download and reading of the imdb dataset in `dataset/imdb.py`
from __future__ import print_function from __future__ import print_function
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
from functools import partial
import numpy as np import numpy as np
try: import sys
from paddle.fluid.contrib.trainer import * import math
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
CLASS_DIM = 2 CLASS_DIM = 2
EMB_DIM = 128 EMB_DIM = 128
HID_DIM = 512 HID_DIM = 512
STACKED_NUM = 3 STACKED_NUM = 3
BATCH_SIZE = 128 BATCH_SIZE = 128
USE_GPU = False
``` ```
...@@ -212,8 +204,7 @@ def inference_program(word_dict): ...@@ -212,8 +204,7 @@ def inference_program(word_dict):
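During testing, the classifier computes the probability of each output. The first returned value is defined to be the cost. During testing, the classifier computes the probability of each output. The first returned value is defined to be the cost.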
```python ```python
def train_program(word_dict): def train_program(prediction):
prediction = inference_program(word_dict)
label = fluid.layers.data(name="label", shape=[1], dtype="int64") label = fluid.layers.data(name="label", shape=[1], dtype="int64")
cost = fluid.layers.cross_entropy(input=prediction, label=label) cost = fluid.layers.cross_entropy(input=prediction, label=label)
avg_cost = fluid.layers.mean(cost) avg_cost = fluid.layers.mean(cost)
...@@ -258,59 +249,77 @@ train_reader = paddle.batch( ...@@ -258,59 +249,77 @@ train_reader = paddle.batch(
The trainer needs a training program and an optimizer function. The trainer needs a training program and an optimizer function.
```python ```python
trainer = Trainer( exe = fluid.Executor(place)
train_func=partial(train_program, word_dict), prediction = inference_program(word_dict)
place=place, [avg_cost, accuracy] = train_program(prediction)
optimizer_func=optimizer_func) sgd_optimizer = optimizer_func()
sgd_optimizer.minimize(avg_cost)
``` ```
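### Feed data ### Feed data and build the main training loop
`feed_order` defines the mapping between each generated data record and `paddle.layer.data`. For example, the first column produced by `imdb.train` corresponds to the `words` feature. `feed_order` defines the mapping between each generated data record and `paddle.layer.data`. For example, the first column produced by `imdb.train` corresponds to the `words` feature.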
```python ```python
# Specify the directory path to save the parameters
params_dirname = "understand_sentiment_conv.inference.model"
feed_order = ['words', 'label'] feed_order = ['words', 'label']
``` pass_num = 1
### Event handler def train_loop(main_program):
exe.run(fluid.default_startup_program())
feed_var_list_loop = [
main_program.global_block().var(var_name) for var_name in feed_order
]
feeder = fluid.DataFeeder(
feed_list=feed_var_list_loop, place=place)
The callback event_handler is invoked after a predefined event occurs. For example, we can check the error after each training step. test_program = fluid.default_main_program().clone(for_test=True)
```python for epoch_id in range(pass_num):
# Specify the directory path to save the parameters for step_id, data in enumerate(train_reader()):
params_dirname = "understand_sentiment_conv.inference.model" metrics = exe.run(main_program,
feed=feeder.feed(data),
fetch_list=[avg_cost, accuracy])
def event_handler(event): avg_cost_test, acc_test = train_test(test_program, test_reader)
if isinstance(event, EndStepEvent): print('Step {0}, Test Loss {1:0.2}, Acc {2:0.2}'.format(
print("Step {0}, Epoch {1} Metrics {2}".format( step_id, avg_cost_test, acc_test))
event.step, event.epoch, list(map(np.array, event.metrics))))
if event.step == 10: print("Step {0}, Epoch {1} Metrics {2}".format(
trainer.save_params(params_dirname) step_id, epoch_id, list(map(np.array,
trainer.stop() metrics))))
if step_id == 30:
if params_dirname is not None:
fluid.io.save_inference_model(params_dirname, ["words"],
prediction, exe)
return
``` ```
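 ### Monitoring the training process
 We print the output of every step inside the main training loop, so training progress can be observed.
### Start training ### Start training
Finally, we pass in the number of training epochs (num_epochs) and a few other parameters, and call trainer.train to start training. Finally, we launch the main training loop to start training. Training takes a long time; to get results sooner, you can adjust the loss threshold or the number of training steps, which shortens training at the cost of some accuracy.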
```python ```python
trainer.train( train_loop(fluid.default_main_program())
num_epochs=1,
event_handler=event_handler,
reader=train_reader,
feed_order=feed_order)
``` ```
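The new training loop stops once a fixed step is reached, and the text mentions that either the step count or a loss threshold can be tuned to trade accuracy for time. A minimal, framework-free sketch of those two stopping rules follows. The function `run_training`, its parameters, and the toy loss values are hypothetical; a real loop would obtain each loss from `exe.run` and save the model before returning.

```python
def run_training(step_losses, max_steps=30, loss_threshold=None):
    """Walk through per-step losses and report where training would stop.

    Returns (last_step_id, reason). Hypothetical helper for illustration.
    """
    last_step = -1
    for step_id, loss in enumerate(step_losses):
        last_step = step_id
        # Optional rule: stop as soon as the loss is good enough.
        if loss_threshold is not None and loss < loss_threshold:
            return step_id, "loss_threshold"
        # Fixed rule, as in the diff: stop when a given step id is reached.
        if step_id == max_steps:
            return step_id, "max_steps"
    return last_step, "data_exhausted"


losses = [0.9, 0.7, 0.5, 0.3]
print(run_training(losses, max_steps=2))
print(run_training(losses, loss_threshold=0.6))
```

A lower `max_steps` or a looser `loss_threshold` ends training earlier, which is the trade-off the paragraph describes.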
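## Apply the model ## Apply the model
### Build the inferencer ### Build the inferencer
Pass in `inference_program` and `params_dirname` to initialize an inferencer; `params_dirname` is the directory that stores the parameters saved during training. As with training, we need to create an inference process and use the trained model and parameters to run inference; `params_dirname` is the directory that stores the parameters saved during training.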
```python ```python
inferencer = Inferencer( place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
infer_func=partial(inference_program, word_dict), param_path=params_dirname, place=place) exe = fluid.Executor(place)
inference_scope = fluid.core.Scope()
``` ```
### Generate input data for testing ### Generate input data for testing
...@@ -334,15 +343,25 @@ base_shape = [[len(c) for c in lod]] ...@@ -334,15 +343,25 @@ base_shape = [[len(c) for c in lod]]
tensor_words = fluid.create_lod_tensor(lod, base_shape, place) tensor_words = fluid.create_lod_tensor(lod, base_shape, place)
``` ```
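In the snippet above, the variable named `lod` holds the index sequences and `base_shape` holds their lengths, which are then passed to `create_lod_tensor`. The sketch below shows how such inputs could be prepared from raw review strings in plain Python. The helper `encode_reviews`, the toy dictionary, and the `<unk>` fallback token are assumptions for illustration; the tutorial's real `word_dict` comes from the imdb dataset.

```python
def encode_reviews(review_strings, word_dict, unk_token="<unk>"):
    """Map whitespace-separated words to ids and collect sequence lengths.

    Hypothetical helper; unknown words fall back to the id of `unk_token`.
    """
    unk_id = word_dict[unk_token]
    sequences = [
        [word_dict.get(word, unk_id) for word in text.split()]
        for text in review_strings
    ]
    sequence_lengths = [[len(seq) for seq in sequences]]  # one level of detail
    return sequences, sequence_lengths


# Toy vocabulary and reviews (not the real imdb dictionary).
toy_word_dict = {"<unk>": 0, "great": 1, "movie": 2, "boring": 3}
reviews = ["great movie", "boring movie indeed"]

sequences, lengths = encode_reviews(reviews, toy_word_dict)
print(sequences)
print(lengths)
```

The two returned values play the roles of the first and second arguments of `create_lod_tensor` in the tutorial code.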
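## Apply the model ## Apply the model and run inference
Now we can predict whether each review is positive or negative. Now we can predict whether each review is positive or negative.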
```python ```python
results = inferencer.infer({'words': tensor_words}) with fluid.scope_guard(inference_scope):
for i, r in enumerate(results[0]): [inferencer, feed_target_names,
print("Predict probability of ", r[0], " to be positive and ", r[1], " to be negative for review \'", reviews_str[i], "\'") fetch_targets] = fluid.io.load_inference_model(params_dirname, exe)
assert feed_target_names[0] == "words"
 results = exe.run(inferencer,
feed={feed_target_names[0]: tensor_words},
fetch_list=fetch_targets,
return_numpy=False)
np_data = np.array(results[0])
for i, r in enumerate(np_data):
print("Predict probability of ", r[0], " to be positive and ", r[1],
" to be negative for review \'", reviews_str[i], "\'")
``` ```
......
...@@ -152,24 +152,16 @@ Paddle implements automatic download and reading of the imdb dataset in `dataset/imdb.py` ...@@ -152,24 +152,16 @@ Paddle implements automatic download and reading of the imdb dataset in `dataset/imdb.py`
from __future__ import print_function from __future__ import print_function
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
from functools import partial
import numpy as np import numpy as np
try: import sys
from paddle.fluid.contrib.trainer import * import math
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
CLASS_DIM = 2 CLASS_DIM = 2
EMB_DIM = 128 EMB_DIM = 128
HID_DIM = 512 HID_DIM = 512
STACKED_NUM = 3 STACKED_NUM = 3
BATCH_SIZE = 128 BATCH_SIZE = 128
USE_GPU = False
``` ```
...@@ -254,8 +246,7 @@ def inference_program(word_dict): ...@@ -254,8 +246,7 @@ def inference_program(word_dict):
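During testing, the classifier computes the probability of each output. The first returned value is defined to be the cost. During testing, the classifier computes the probability of each output. The first returned value is defined to be the cost.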
```python ```python
def train_program(word_dict): def train_program(prediction):
prediction = inference_program(word_dict)
label = fluid.layers.data(name="label", shape=[1], dtype="int64") label = fluid.layers.data(name="label", shape=[1], dtype="int64")
cost = fluid.layers.cross_entropy(input=prediction, label=label) cost = fluid.layers.cross_entropy(input=prediction, label=label)
avg_cost = fluid.layers.mean(cost) avg_cost = fluid.layers.mean(cost)
...@@ -300,59 +291,77 @@ train_reader = paddle.batch( ...@@ -300,59 +291,77 @@ train_reader = paddle.batch(
The trainer requires a training program and an optimizer function. The trainer requires a training program and an optimizer function.
```python ```python
trainer = Trainer( exe = fluid.Executor(place)
train_func=partial(train_program, word_dict), prediction = inference_program(word_dict)
place=place, [avg_cost, accuracy] = train_program(prediction)
optimizer_func=optimizer_func) sgd_optimizer = optimizer_func()
sgd_optimizer.minimize(avg_cost)
``` ```
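### Providing Data ### Providing Data and Building the Main Training Loop
`feed_order` defines the mapping between each column of data the reader yields and the corresponding `fluid.layers.data` layer. For example, the first column yielded by `imdb.train` corresponds to the `words` feature. `feed_order` defines the mapping between each column of data the reader yields and the corresponding `fluid.layers.data` layer. For example, the first column yielded by `imdb.train` corresponds to the `words` feature.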
```python ```python
# Specify the directory path to save the parameters
params_dirname = "understand_sentiment_conv.inference.model"
feed_order = ['words', 'label'] feed_order = ['words', 'label']
``` pass_num = 1
### Event Handler def train_loop(main_program):
exe.run(fluid.default_startup_program())
feed_var_list_loop = [
main_program.global_block().var(var_name) for var_name in feed_order
]
feeder = fluid.DataFeeder(
feed_list=feed_var_list_loop, place=place)
The callback function event_handler is invoked whenever a predefined event occurs. For example, we can inspect the error at the end of each training step. test_program = fluid.default_main_program().clone(for_test=True)
```python for epoch_id in range(pass_num):
# Specify the directory path to save the parameters for step_id, data in enumerate(train_reader()):
params_dirname = "understand_sentiment_conv.inference.model" metrics = exe.run(main_program,
feed=feeder.feed(data),
fetch_list=[avg_cost, accuracy])
def event_handler(event): avg_cost_test, acc_test = train_test(test_program, test_reader)
if isinstance(event, EndStepEvent): print('Step {0}, Test Loss {1:0.2}, Acc {2:0.2}'.format(
print("Step {0}, Epoch {1} Metrics {2}".format( step_id, avg_cost_test, acc_test))
event.step, event.epoch, list(map(np.array, event.metrics))))
if event.step == 10: print("Step {0}, Epoch {1} Metrics {2}".format(
trainer.save_params(params_dirname) step_id, epoch_id, list(map(np.array,
trainer.stop() metrics))))
if step_id == 30:
if params_dirname is not None:
fluid.io.save_inference_model(params_dirname, ["words"],
prediction, exe)
return
``` ```
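The new training loop calls `train_test(test_program, test_reader)`, whose body does not appear in this markdown hunk. In the accompanying scripts it accumulates the fetched values batch by batch and divides by the number of batches. The following is a framework-free sketch of that averaging only; `average_metrics` and `fake_batches` are hypothetical names, and the nested lists are an assumed stand-in for what `exe.run` returns per batch.

```python
def average_metrics(batch_metrics):
    """Average per-batch metrics with an accumulate-then-divide pattern.

    batch_metrics: iterable of rows; each row holds one single-element
    sequence per fetched variable, e.g. [[cost], [accuracy]].
    This layout is an assumption made for illustration.
    """
    accumulated = None
    count = 0
    for row in batch_metrics:
        if accumulated is None:
            accumulated = [0.0] * len(row)
        # Add the first element of each fetched value to its running total.
        accumulated = [total + value[0] for total, value in zip(accumulated, row)]
        count += 1
    if count == 0:
        raise ValueError("reader produced no batches")
    return [total / count for total in accumulated]


# Made-up numbers for two test batches: [[cost], [accuracy]].
fake_batches = [[[0.5], [0.5]], [[0.25], [0.75]]]
avg_cost_test, acc_test = average_metrics(fake_batches)
print('Test Loss {0:0.2}, Acc {1:0.2}'.format(avg_cost_test, acc_test))
```

Unlike the scripts, the sketch raises on an empty reader instead of dividing by zero; that guard is an addition, not something the diff contains.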
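### Monitoring Training
The main training loop prints the output of every step, so training progress can be observed.
### Start Training ### Start Training
Finally, we pass in the number of epochs (num_epoch) and a few other arguments, and call trainer.train to start training. Finally, we start the main training loop to begin training. Training takes a long time; to get results sooner, you can shorten it by adjusting the target cost range or the number of training steps, at the expense of some accuracy.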
```python ```python
trainer.train( train_loop(fluid.default_main_program())
num_epochs=1,
event_handler=event_handler,
reader=train_reader,
feed_order=feed_order)
``` ```
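## Applying the Model ## Applying the Model
### Building the Inferencer ### Building the Inferencer
Pass in `inference_program` and `params_dirname` to initialize an inferencer; `params_dirname` is the directory holding the parameters saved during training. As in training, we need to create an inference process and run prediction with the trained model and parameters; `params_dirname` is the directory holding the parameters saved during training.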
```python ```python
inferencer = Inferencer( place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
infer_func=partial(inference_program, word_dict), param_path=params_dirname, place=place) exe = fluid.Executor(place)
inference_scope = fluid.core.Scope()
``` ```
### Creating Input Data for Testing ### Creating Input Data for Testing
...@@ -376,15 +385,25 @@ base_shape = [[len(c) for c in lod]] ...@@ -376,15 +385,25 @@ base_shape = [[len(c) for c in lod]]
tensor_words = fluid.create_lod_tensor(lod, base_shape, place) tensor_words = fluid.create_lod_tensor(lod, base_shape, place)
``` ```
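The block above builds `base_shape` from `lod` before calling `fluid.create_lod_tensor`. As a rough illustration of what those two values contain, the sketch below maps the three sample reviews to word ids and derives the per-sentence lengths in plain Python. `toy_word_dict` and `flat_ids` are hypothetical names; the tutorial itself takes its vocabulary from `paddle.dataset.imdb.word_dict()`.

```python
reviews_str = [
    'read the book forget the movie', 'this is a great movie',
    'this is very bad'
]

# Hypothetical toy vocabulary, used only so the sketch runs without Paddle.
toy_word_dict = {'<unk>': 0, 'the': 1, 'movie': 2, 'this': 3, 'is': 4, 'book': 5}
UNK = toy_word_dict['<unk>']

# One list of word ids per review; unknown words fall back to UNK.
reviews = [c.split() for c in reviews_str]
lod = [[toy_word_dict.get(word, UNK) for word in c] for c in reviews]

# One level of length information: the number of words in each review.
base_shape = [[len(c) for c in lod]]

# The ids of all reviews laid end to end; base_shape says where each one stops.
flat_ids = [word_id for c in lod for word_id in c]

print(base_shape)
print(len(flat_ids))
```

The sum of the lengths in `base_shape[0]` equals the number of ids in `flat_ids`, which is why the lengths are enough to recover the sentence boundaries.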
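## Applying the Model ## Applying the Model and Running Inference
Now we can predict whether each review is positive or negative. Now we can predict whether each review is positive or negative.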
```python ```python
results = inferencer.infer({'words': tensor_words}) with fluid.scope_guard(inference_scope):
for i, r in enumerate(results[0]): [inferencer, feed_target_names,
print("Predict probability of ", r[0], " to be positive and ", r[1], " to be negative for review \'", reviews_str[i], "\'") fetch_targets] = fluid.io.load_inference_model(params_dirname, exe)
assert feed_target_names[0] == "words"
results = exe.run(inferencer,
feed={feed_target_names[0]: tensor_words},
fetch_list=fetch_targets,
return_numpy=False)
np_data = np.array(results[0])
for i, r in enumerate(np_data):
print("Predict probability of ", r[0], " to be positive and ", r[1],
" to be negative for review \'", reviews_str[i], "\'")
``` ```
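The print statement above treats `r[0]` as the positive probability and `r[1]` as the negative one. A small sketch of turning such rows into labels follows, assuming that index convention holds. `label_reviews` is a hypothetical helper and `fake_rows` contains made-up probabilities, not model output.

```python
def label_reviews(prob_rows, reviews):
    """Pair each review with 'positive' or 'negative'.

    Assumes each row is [p_positive, p_negative], following the order
    used by the tutorial's print statement.
    """
    labeled = []
    for probs, review in zip(prob_rows, reviews):
        positive, negative = probs[0], probs[1]
        label = "positive" if positive >= negative else "negative"
        labeled.append((review, label))
    return labeled


reviews_str = [
    'read the book forget the movie', 'this is a great movie',
    'this is very bad'
]
# Made-up probabilities for illustration only.
fake_rows = [[0.4, 0.6], [0.9, 0.1], [0.2, 0.8]]

for review, label in label_reviews(fake_rows, reviews_str):
    print("'{0}' looks {1}".format(review, label))
```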
......
...@@ -14,22 +14,11 @@ ...@@ -14,22 +14,11 @@
from __future__ import print_function from __future__ import print_function
import os
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
from functools import partial
import numpy as np import numpy as np
import sys import sys
import math
try:
from paddle.fluid.contrib.trainer import *
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
CLASS_DIM = 2 CLASS_DIM = 2
EMB_DIM = 128 EMB_DIM = 128
...@@ -66,8 +55,7 @@ def inference_program(word_dict): ...@@ -66,8 +55,7 @@ def inference_program(word_dict):
return net return net
def train_program(word_dict): def train_program(prediction):
prediction = inference_program(word_dict)
label = fluid.layers.data(name="label", shape=[1], dtype="int64") label = fluid.layers.data(name="label", shape=[1], dtype="int64")
cost = fluid.layers.cross_entropy(input=prediction, label=label) cost = fluid.layers.cross_entropy(input=prediction, label=label)
avg_cost = fluid.layers.mean(cost) avg_cost = fluid.layers.mean(cost)
...@@ -79,8 +67,9 @@ def optimizer_func(): ...@@ -79,8 +67,9 @@ def optimizer_func():
return fluid.optimizer.Adagrad(learning_rate=0.002) return fluid.optimizer.Adagrad(learning_rate=0.002)
def train(use_cuda, train_program, params_dirname): def train(use_cuda, params_dirname):
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace() place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
print("Loading IMDB word dict....") print("Loading IMDB word dict....")
word_dict = paddle.dataset.imdb.word_dict() word_dict = paddle.dataset.imdb.word_dict()
...@@ -94,44 +83,88 @@ def train(use_cuda, train_program, params_dirname): ...@@ -94,44 +83,88 @@ def train(use_cuda, train_program, params_dirname):
test_reader = paddle.batch( test_reader = paddle.batch(
paddle.dataset.imdb.test(word_dict), batch_size=BATCH_SIZE) paddle.dataset.imdb.test(word_dict), batch_size=BATCH_SIZE)
trainer = Trainer(
train_func=partial(train_program, word_dict),
place=place,
optimizer_func=optimizer_func)
feed_order = ['words', 'label'] feed_order = ['words', 'label']
pass_num = 1
main_program = fluid.default_main_program()
star_program = fluid.default_startup_program()
prediction = inference_program(word_dict)
train_func_outputs = train_program(prediction)
avg_cost = train_func_outputs[0]
test_program = main_program.clone(for_test=True)
# [avg_cost, accuracy] = train_program(prediction)
sgd_optimizer = optimizer_func()
sgd_optimizer.minimize(avg_cost)
exe = fluid.Executor(place)
def event_handler(event): def train_test(program, reader):
if isinstance(event, EndStepEvent): count = 0
if event.step % 10 == 0: feed_var_list = [
avg_cost, acc = trainer.test( program.global_block().var(var_name) for var_name in feed_order
reader=test_reader, feed_order=feed_order) ]
feeder_test = fluid.DataFeeder(feed_list=feed_var_list, place=place)
test_exe = fluid.Executor(place)
accumulated = len(train_func_outputs) * [0]
for test_data in reader():
avg_cost_np = test_exe.run(
program=program,
feed=feeder_test.feed(test_data),
fetch_list=train_func_outputs)
accumulated = [
x[0] + x[1][0] for x in zip(accumulated, avg_cost_np)
]
count += 1
return [x / count for x in accumulated]
def train_loop():
feed_var_list_loop = [
main_program.global_block().var(var_name) for var_name in feed_order
]
feeder = fluid.DataFeeder(feed_list=feed_var_list_loop, place=place)
exe.run(star_program)
for epoch_id in range(pass_num):
for step_id, data in enumerate(train_reader()):
metrics = exe.run(
main_program,
feed=feeder.feed(data),
fetch_list=[var.name for var in train_func_outputs])
print("step: {0}, Metrics {1}".format(
step_id, list(map(np.array, metrics))))
if (step_id + 1) % 10 == 0:
avg_cost_test, acc_test = train_test(test_program,
test_reader)
print('Step {0}, Test Loss {1:0.2}, Acc {2:0.2}'.format( print('Step {0}, Test Loss {1:0.2}, Acc {2:0.2}'.format(
event.step, avg_cost, acc)) step_id, avg_cost_test, acc_test))
print("Step {0}, Epoch {1} Metrics {2}".format( print("Step {0}, Epoch {1} Metrics {2}".format(
event.step, event.epoch, list(map(np.array, step_id, epoch_id, list(map(np.array, metrics))))
event.metrics)))) if math.isnan(float(metrics[0])):
sys.exit("got NaN loss, training failed.")
if params_dirname is not None:
fluid.io.save_inference_model(params_dirname, ["words"],
prediction, exe)
elif isinstance(event, EndEpochEvent): train_loop()
trainer.save_params(params_dirname)
trainer.train(
num_epochs=1,
event_handler=event_handler,
reader=train_reader,
feed_order=feed_order)
def infer(use_cuda, params_dirname=None):
def infer(use_cuda, inference_program, params_dirname=None):
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace() place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
word_dict = paddle.dataset.imdb.word_dict() word_dict = paddle.dataset.imdb.word_dict()
inferencer = Inferencer( exe = fluid.Executor(place)
infer_func=partial(inference_program, word_dict),
param_path=params_dirname, inference_scope = fluid.core.Scope()
place=place) with fluid.scope_guard(inference_scope):
# Use fluid.io.load_inference_model to obtain the inference program desc,
# the feed_target_names (the names of variables that will be fed
# data using feed operators), and the fetch_targets (variables that
# we want to obtain data from using fetch operators).
[inferencer, feed_target_names,
fetch_targets] = fluid.io.load_inference_model(params_dirname, exe)
# Setup input by creating LoDTensor to represent sequence of words. # Setup input by creating LoDTensor to represent sequence of words.
# Here each word is the basic element of the LoDTensor and the shape of # Here each word is the basic element of the LoDTensor and the shape of
...@@ -143,7 +176,6 @@ def infer(use_cuda, inference_program, params_dirname=None): ...@@ -143,7 +176,6 @@ def infer(use_cuda, inference_program, params_dirname=None):
# element (word). Hence the LoDTensor will hold data for three sentences of # element (word). Hence the LoDTensor will hold data for three sentences of
# length 3, 4 and 2, respectively. # length 3, 4 and 2, respectively.
# Note that lod info should be a list of lists. # Note that lod info should be a list of lists.
reviews_str = [ reviews_str = [
'read the book forget the movie', 'this is a great movie', 'read the book forget the movie', 'this is a great movie',
'this is very bad' 'this is very bad'
...@@ -158,9 +190,14 @@ def infer(use_cuda, inference_program, params_dirname=None): ...@@ -158,9 +190,14 @@ def infer(use_cuda, inference_program, params_dirname=None):
base_shape = [[len(c) for c in lod]] base_shape = [[len(c) for c in lod]]
tensor_words = fluid.create_lod_tensor(lod, base_shape, place) tensor_words = fluid.create_lod_tensor(lod, base_shape, place)
results = inferencer.infer({'words': tensor_words}) assert feed_target_names[0] == "words"
results = exe.run(
for i, r in enumerate(results[0]): inferencer,
feed={feed_target_names[0]: tensor_words},
fetch_list=fetch_targets,
return_numpy=False)
np_data = np.array(results[0])
for i, r in enumerate(np_data):
print("Predict probability of ", r[0], " to be positive and ", r[1], print("Predict probability of ", r[0], " to be positive and ", r[1],
" to be negative for review \'", reviews_str[i], "\'") " to be negative for review \'", reviews_str[i], "\'")
...@@ -169,8 +206,8 @@ def main(use_cuda): ...@@ -169,8 +206,8 @@ def main(use_cuda):
if use_cuda and not fluid.core.is_compiled_with_cuda(): if use_cuda and not fluid.core.is_compiled_with_cuda():
return return
params_dirname = "understand_sentiment_conv.inference.model" params_dirname = "understand_sentiment_conv.inference.model"
train(use_cuda, train_program, params_dirname) train(use_cuda, params_dirname)
infer(use_cuda, inference_program, params_dirname) infer(use_cuda, params_dirname)
if __name__ == '__main__': if __name__ == '__main__':
......
...@@ -14,28 +14,16 @@ ...@@ -14,28 +14,16 @@
from __future__ import print_function from __future__ import print_function
import os
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
from functools import partial
import numpy as np import numpy as np
import sys import sys
import math
try:
from paddle.fluid.contrib.trainer import *
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
CLASS_DIM = 2 CLASS_DIM = 2
EMB_DIM = 128 EMB_DIM = 128
BATCH_SIZE = 128 BATCH_SIZE = 128
LSTM_SIZE = 128 LSTM_SIZE = 128
USE_GPU = False
def dynamic_rnn_lstm(data, input_dim, class_dim, emb_dim, lstm_size): def dynamic_rnn_lstm(data, input_dim, class_dim, emb_dim, lstm_size):
...@@ -83,8 +71,7 @@ def inference_program(word_dict): ...@@ -83,8 +71,7 @@ def inference_program(word_dict):
return pred return pred
def train_program(word_dict): def train_program(prediction):
prediction = inference_program(word_dict)
label = fluid.layers.data(name="label", shape=[1], dtype="int64") label = fluid.layers.data(name="label", shape=[1], dtype="int64")
cost = fluid.layers.cross_entropy(input=prediction, label=label) cost = fluid.layers.cross_entropy(input=prediction, label=label)
avg_cost = fluid.layers.mean(cost) avg_cost = fluid.layers.mean(cost)
...@@ -96,7 +83,7 @@ def optimizer_func(): ...@@ -96,7 +83,7 @@ def optimizer_func():
return fluid.optimizer.Adagrad(learning_rate=0.002) return fluid.optimizer.Adagrad(learning_rate=0.002)
def train(use_cuda, train_program, params_dirname): def train(use_cuda, params_dirname):
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace() place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
print("Loading IMDB word dict....") print("Loading IMDB word dict....")
word_dict = paddle.dataset.imdb.word_dict() word_dict = paddle.dataset.imdb.word_dict()
...@@ -111,44 +98,85 @@ def train(use_cuda, train_program, params_dirname): ...@@ -111,44 +98,85 @@ def train(use_cuda, train_program, params_dirname):
test_reader = paddle.batch( test_reader = paddle.batch(
paddle.dataset.imdb.test(word_dict), batch_size=BATCH_SIZE) paddle.dataset.imdb.test(word_dict), batch_size=BATCH_SIZE)
trainer = Trainer(
train_func=partial(train_program, word_dict),
place=place,
optimizer_func=optimizer_func)
feed_order = ['words', 'label'] feed_order = ['words', 'label']
pass_num = 1
def event_handler(event): main_program = fluid.default_main_program()
if isinstance(event, EndStepEvent): star_program = fluid.default_startup_program()
if event.step % 10 == 0: prediction = inference_program(word_dict)
avg_cost, acc = trainer.test( train_func_outputs = train_program(prediction)
reader=test_reader, feed_order=feed_order) avg_cost = train_func_outputs[0]
print('Step {0}, Test Loss {1:0.2}, Acc {2:0.2}'.format( test_program = main_program.clone(for_test=True)
event.step, avg_cost, acc))
print("Step {0}, Epoch {1} Metrics {2}".format( sgd_optimizer = optimizer_func()
event.step, event.epoch, list(map(np.array, sgd_optimizer.minimize(avg_cost)
event.metrics)))) exe = fluid.Executor(place)
elif isinstance(event, EndEpochEvent): def train_test(program, reader):
trainer.save_params(params_dirname) count = 0
feed_var_list = [
program.global_block().var(var_name) for var_name in feed_order
]
feeder_test = fluid.DataFeeder(feed_list=feed_var_list, place=place)
test_exe = fluid.Executor(place)
accumulated = len(train_func_outputs) * [0]
for test_data in reader():
avg_cost_np = test_exe.run(
program=program,
feed=feeder_test.feed(test_data),
fetch_list=train_func_outputs)
accumulated = [
x[0] + x[1][0] for x in zip(accumulated, avg_cost_np)
]
count += 1
return [x / count for x in accumulated]
trainer.train( def train_loop():
num_epochs=1,
event_handler=event_handler, feed_var_list_loop = [
reader=train_reader, main_program.global_block().var(var_name) for var_name in feed_order
feed_order=feed_order) ]
feeder = fluid.DataFeeder(feed_list=feed_var_list_loop, place=place)
exe.run(fluid.default_startup_program())
for epoch_id in range(pass_num):
for step_id, data in enumerate(train_reader()):
metrics = exe.run(
main_program,
feed=feeder.feed(data),
fetch_list=[var.name for var in train_func_outputs])
if (step_id + 1) % 10 == 0:
def infer(use_cuda, inference_program, params_dirname=None): #avg_cost_test, acc_test = train_test(test_program, test_reader)
#print('Step {0}, Test Loss {1:0.2}, Acc {2:0.2}'.format(
# step_id, avg_cost_test, acc_test))
print("Step {0}, Epoch {1} Metrics {2}".format(
step_id, epoch_id, list(map(np.array, metrics))))
if math.isnan(float(metrics[0])):
sys.exit("got NaN loss, training failed.")
if params_dirname is not None:
fluid.io.save_inference_model(params_dirname, ["words"],
prediction, exe)
train_loop()
def infer(use_cuda, params_dirname=None):
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace() place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
word_dict = paddle.dataset.imdb.word_dict() word_dict = paddle.dataset.imdb.word_dict()
inferencer = Inferencer( exe = fluid.Executor(place)
infer_func=partial(inference_program, word_dict),
param_path=params_dirname, inference_scope = fluid.core.Scope()
place=place) with fluid.scope_guard(inference_scope):
# Use fluid.io.load_inference_model to obtain the inference program desc,
# the feed_target_names (the names of variables that will be fed
# data using feed operators), and the fetch_targets (variables that
# we want to obtain data from using fetch operators).
[inferencer, feed_target_names,
fetch_targets] = fluid.io.load_inference_model(params_dirname, exe)
# Setup input by creating LoDTensor to represent sequence of words. # Setup input by creating LoDTensor to represent sequence of words.
# Here each word is the basic element of the LoDTensor and the shape of # Here each word is the basic element of the LoDTensor and the shape of
...@@ -160,7 +188,6 @@ def infer(use_cuda, inference_program, params_dirname=None): ...@@ -160,7 +188,6 @@ def infer(use_cuda, inference_program, params_dirname=None):
# element (word). Hence the LoDTensor will hold data for three sentences of # element (word). Hence the LoDTensor will hold data for three sentences of
# length 3, 4 and 2, respectively. # length 3, 4 and 2, respectively.
# Note that lod info should be a list of lists. # Note that lod info should be a list of lists.
reviews_str = [ reviews_str = [
'read the book forget the movie', 'this is a great movie', 'read the book forget the movie', 'this is a great movie',
'this is very bad' 'this is very bad'
...@@ -175,9 +202,14 @@ def infer(use_cuda, inference_program, params_dirname=None): ...@@ -175,9 +202,14 @@ def infer(use_cuda, inference_program, params_dirname=None):
base_shape = [[len(c) for c in lod]] base_shape = [[len(c) for c in lod]]
tensor_words = fluid.create_lod_tensor(lod, base_shape, place) tensor_words = fluid.create_lod_tensor(lod, base_shape, place)
results = inferencer.infer({'words': tensor_words}) assert feed_target_names[0] == "words"
results = exe.run(
for i, r in enumerate(results[0]): inferencer,
feed={feed_target_names[0]: tensor_words},
fetch_list=fetch_targets,
return_numpy=False)
np_data = np.array(results[0])
for i, r in enumerate(np_data):
print("Predict probability of ", r[0], " to be positive and ", r[1], print("Predict probability of ", r[0], " to be positive and ", r[1],
" to be negative for review \'", reviews_str[i], "\'") " to be negative for review \'", reviews_str[i], "\'")
...@@ -186,8 +218,8 @@ def main(use_cuda): ...@@ -186,8 +218,8 @@ def main(use_cuda):
if use_cuda and not fluid.core.is_compiled_with_cuda(): if use_cuda and not fluid.core.is_compiled_with_cuda():
return return
params_dirname = "understand_sentiment_conv.inference.model" params_dirname = "understand_sentiment_conv.inference.model"
train(use_cuda, train_program, params_dirname) train(use_cuda, params_dirname)
infer(use_cuda, inference_program, params_dirname) infer(use_cuda, params_dirname)
if __name__ == '__main__': if __name__ == '__main__':
......
...@@ -17,19 +17,9 @@ from __future__ import print_function ...@@ -17,19 +17,9 @@ from __future__ import print_function
import os import os
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
from functools import partial
import numpy as np import numpy as np
import sys import sys
import math
try:
from paddle.fluid.contrib.trainer import *
from paddle.fluid.contrib.inferencer import *
except ImportError:
print(
"In the fluid 1.0, the trainer and inferencer are moving to paddle.fluid.contrib",
file=sys.stderr)
from paddle.fluid.trainer import *
from paddle.fluid.inferencer import *
CLASS_DIM = 2 CLASS_DIM = 2
EMB_DIM = 128 EMB_DIM = 128
...@@ -74,8 +64,8 @@ def inference_program(word_dict): ...@@ -74,8 +64,8 @@ def inference_program(word_dict):
return net return net
def train_program(word_dict): def train_program(prediction):
prediction = inference_program(word_dict) # prediction = inference_program(word_dict)
label = fluid.layers.data(name="label", shape=[1], dtype="int64") label = fluid.layers.data(name="label", shape=[1], dtype="int64")
cost = fluid.layers.cross_entropy(input=prediction, label=label) cost = fluid.layers.cross_entropy(input=prediction, label=label)
avg_cost = fluid.layers.mean(cost) avg_cost = fluid.layers.mean(cost)
...@@ -87,8 +77,9 @@ def optimizer_func(): ...@@ -87,8 +77,9 @@ def optimizer_func():
return fluid.optimizer.Adagrad(learning_rate=0.002) return fluid.optimizer.Adagrad(learning_rate=0.002)
def train(use_cuda, train_program, params_dirname): def train(use_cuda, params_dirname):
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace() place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
print("Loading IMDB word dict....") print("Loading IMDB word dict....")
word_dict = paddle.dataset.imdb.word_dict() word_dict = paddle.dataset.imdb.word_dict()
...@@ -102,44 +93,88 @@ def train(use_cuda, train_program, params_dirname): ...@@ -102,44 +93,88 @@ def train(use_cuda, train_program, params_dirname):
test_reader = paddle.batch( test_reader = paddle.batch(
paddle.dataset.imdb.test(word_dict), batch_size=BATCH_SIZE) paddle.dataset.imdb.test(word_dict), batch_size=BATCH_SIZE)
trainer = Trainer(
train_func=partial(train_program, word_dict),
place=place,
optimizer_func=optimizer_func)
feed_order = ['words', 'label'] feed_order = ['words', 'label']
pass_num = 1
main_program = fluid.default_main_program()
star_program = fluid.default_startup_program()
prediction = inference_program(word_dict)
train_func_outputs = train_program(prediction)
avg_cost = train_func_outputs[0]
test_program = main_program.clone(for_test=True)
# [avg_cost, accuracy] = train_program(prediction)
sgd_optimizer = optimizer_func()
sgd_optimizer.minimize(avg_cost)
exe = fluid.Executor(place)
def event_handler(event): def train_test(program, reader):
if isinstance(event, EndStepEvent): count = 0
if event.step % 10 == 0: feed_var_list = [
avg_cost, acc = trainer.test( program.global_block().var(var_name) for var_name in feed_order
reader=test_reader, feed_order=feed_order) ]
feeder_test = fluid.DataFeeder(feed_list=feed_var_list, place=place)
test_exe = fluid.Executor(place)
accumulated = len(train_func_outputs) * [0]
for test_data in reader():
avg_cost_np = test_exe.run(
program=program,
feed=feeder_test.feed(test_data),
fetch_list=train_func_outputs)
accumulated = [
x[0] + x[1][0] for x in zip(accumulated, avg_cost_np)
]
count += 1
return [x / count for x in accumulated]
def train_loop():
feed_var_list_loop = [
main_program.global_block().var(var_name) for var_name in feed_order
]
feeder = fluid.DataFeeder(feed_list=feed_var_list_loop, place=place)
exe.run(fluid.default_startup_program())
for epoch_id in range(pass_num):
for step_id, data in enumerate(train_reader()):
metrics = exe.run(
main_program,
feed=feeder.feed(data),
fetch_list=[var.name for var in train_func_outputs])
print("step: {0}, Metrics {1}".format(
step_id, list(map(np.array, metrics))))
if (step_id + 1) % 10 == 0:
avg_cost_test, acc_test = train_test(test_program,
test_reader)
print('Step {0}, Test Loss {1:0.2}, Acc {2:0.2}'.format( print('Step {0}, Test Loss {1:0.2}, Acc {2:0.2}'.format(
event.step, avg_cost, acc)) step_id, avg_cost_test, acc_test))
print("Step {0}, Epoch {1} Metrics {2}".format( print("Step {0}, Epoch {1} Metrics {2}".format(
event.step, event.epoch, list(map(np.array, step_id, epoch_id, list(map(np.array, metrics))))
event.metrics)))) if math.isnan(float(metrics[0])):
sys.exit("got NaN loss, training failed.")
if params_dirname is not None:
fluid.io.save_inference_model(params_dirname, ["words"],
prediction, exe)
elif isinstance(event, EndEpochEvent): train_loop()
trainer.save_params(params_dirname)
trainer.train(
num_epochs=1,
event_handler=event_handler,
reader=train_reader,
feed_order=feed_order)
def infer(use_cuda, params_dirname=None):
def infer(use_cuda, inference_program, params_dirname=None):
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace() place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
word_dict = paddle.dataset.imdb.word_dict() word_dict = paddle.dataset.imdb.word_dict()
inferencer = Inferencer( exe = fluid.Executor(place)
infer_func=partial(inference_program, word_dict),
param_path=params_dirname, inference_scope = fluid.core.Scope()
place=place) with fluid.scope_guard(inference_scope):
# Use fluid.io.load_inference_model to obtain the inference program desc,
# the feed_target_names (the names of variables that will be fed
# data using feed operators), and the fetch_targets (variables that
# we want to obtain data from using fetch operators).
[inferencer, feed_target_names,
fetch_targets] = fluid.io.load_inference_model(params_dirname, exe)
# Setup input by creating LoDTensor to represent sequence of words. # Setup input by creating LoDTensor to represent sequence of words.
# Here each word is the basic element of the LoDTensor and the shape of # Here each word is the basic element of the LoDTensor and the shape of
...@@ -151,7 +186,6 @@ def infer(use_cuda, inference_program, params_dirname=None): ...@@ -151,7 +186,6 @@ def infer(use_cuda, inference_program, params_dirname=None):
# element (word). Hence the LoDTensor will hold data for three sentences of # element (word). Hence the LoDTensor will hold data for three sentences of
# length 3, 4 and 2, respectively. # length 3, 4 and 2, respectively.
# Note that lod info should be a list of lists. # Note that lod info should be a list of lists.
reviews_str = [ reviews_str = [
'read the book forget the movie', 'this is a great movie', 'read the book forget the movie', 'this is a great movie',
'this is very bad' 'this is very bad'
...@@ -166,9 +200,14 @@ def infer(use_cuda, inference_program, params_dirname=None): ...@@ -166,9 +200,14 @@ def infer(use_cuda, inference_program, params_dirname=None):
base_shape = [[len(c) for c in lod]] base_shape = [[len(c) for c in lod]]
tensor_words = fluid.create_lod_tensor(lod, base_shape, place) tensor_words = fluid.create_lod_tensor(lod, base_shape, place)
results = inferencer.infer({'words': tensor_words}) assert feed_target_names[0] == "words"
results = exe.run(
for i, r in enumerate(results[0]): inferencer,
feed={feed_target_names[0]: tensor_words},
fetch_list=fetch_targets,
return_numpy=False)
np_data = np.array(results[0])
for i, r in enumerate(np_data):
print("Predict probability of ", r[0], " to be positive and ", r[1], print("Predict probability of ", r[0], " to be positive and ", r[1],
" to be negative for review \'", reviews_str[i], "\'") " to be negative for review \'", reviews_str[i], "\'")
...@@ -177,8 +216,8 @@ def main(use_cuda): ...@@ -177,8 +216,8 @@ def main(use_cuda):
if use_cuda and not fluid.core.is_compiled_with_cuda(): if use_cuda and not fluid.core.is_compiled_with_cuda():
return return
params_dirname = "understand_sentiment_stacked_lstm.inference.model" params_dirname = "understand_sentiment_stacked_lstm.inference.model"
train(use_cuda, train_program, params_dirname) train(use_cuda, params_dirname)
infer(use_cuda, inference_program, params_dirname) infer(use_cuda, params_dirname)
if __name__ == '__main__': if __name__ == '__main__':
......