Unverified Commit 63b5ec3f authored by Nicky Chan, committed by GitHub

Merge pull request #561 from PaddlePaddle/high-level-api-branch

Merge High level api branch
......@@ -47,22 +47,6 @@ $$MSE=\frac{1}{n}\sum_{i=1}^{n}{(\hat{Y_i}-Y_i)}^2$$
## Dataset

### Python Dataset Modules

First, load the packages we need:

```python
import paddle.v2 as paddle
import paddle.v2.dataset.uci_housing as uci_housing
```

We load the [UCI Housing Data Set](https://archive.ics.uci.edu/ml/datasets/Housing) through the `uci_housing` module, which encapsulates:

1. the download process, saving the data to ~/.cache/paddle/dataset/uci_housing/housing.data, and
2. the [preprocessing](#数据预处理) process. A usage sketch of the resulting reader follows.
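As a quick illustration, the reader returned by the module can be consumed directly from Python (a minimal sketch, assuming the `paddle` v2 package is installed; `uci_housing.train()` returns a reader creator whose generator yields `(features, price)` pairs):

```python
import paddle.v2.dataset.uci_housing as uci_housing

reader_creator = uci_housing.train()   # a function that returns a generator
first_sample = next(reader_creator())  # one (features, price) pair
features, price = first_sample
print(len(features), price)            # 13 attributes and the median price
```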
### Introduction to the Dataset

The dataset has 506 rows; each row records the attributes of one category of houses in suburban Boston together with the median price of that category. The attributes are:
......@@ -110,157 +94,158 @@ import paddle.v2.dataset.uci_housing as uci_housing
`fit_a_line/trainer.py` demonstrates the overall training process.
### Datafeeder Configuration

We first import the libraries we need:

```python
import paddle
import paddle.fluid as fluid
import numpy
```
We load the [UCI Housing Data Set](https://archive.ics.uci.edu/ml/datasets/Housing) through the `uci_housing` module, which encapsulates:

1. the download process, saving the data to ~/.cache/paddle/dataset/uci_housing/housing.data, and
2. the [数据预处理](#数据预处理) process.
Next we define data providers for training and testing. Each provider reads one batch of `BATCH_SIZE` examples at a time. To add some randomness to the data order, one can specify both a batch size and a buffer size; the provider then draws each batch at random from a shuffled buffer of `buf_size` examples.

```python
BATCH_SIZE = 20

train_reader = paddle.batch(
    paddle.reader.shuffle(
        paddle.dataset.uci_housing.train(), buf_size=500),
    batch_size=BATCH_SIZE)

test_reader = paddle.batch(
    paddle.reader.shuffle(
        paddle.dataset.uci_housing.test(), buf_size=500),
    batch_size=BATCH_SIZE)
```
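To see what the provider yields, one can peek at a single minibatch (a sketch, assuming the readers defined above):

```python
batch = next(train_reader())   # a list of BATCH_SIZE samples
features, price = batch[0]     # each sample is a (13-d features, price) pair
print(len(batch), len(features))
```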
### Configuring the Train Program

The train program defines the network structure of the model to be trained. For linear regression, it is a simple fully-connected layer from input to output. More complex structures such as convolutional and recurrent neural networks are introduced in later chapters. The train program must return the `average loss` as its first return value, because it is used by the backpropagation algorithm.

```python
def train_program():
    y = fluid.layers.data(name='y', shape=[1], dtype='float32')

    # feature vector of length 13
    x = fluid.layers.data(name='x', shape=[13], dtype='float32')
    y_predict = fluid.layers.fc(input=x, size=1, act=None)

    loss = fluid.layers.square_error_cost(input=y_predict, label=y)
    avg_loss = fluid.layers.mean(loss)

    return avg_loss
```
### Specify the Place

We can specify whether computation runs on the CPU or the GPU:

```python
use_cuda = False
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
```
### Create the Trainer

The trainer takes the train program and a few other necessary parameters:

```python
trainer = fluid.Trainer(
    train_func=train_program,
    place=place,
    optimizer_func=fluid.optimizer.SGD(learning_rate=0.001))
```
### Feeding Data

PaddlePaddle reads training data through a reader generator mechanism. A reader provides multiple columns of data at a time, so we need a Python list to define the order in which the columns are fed:

```python
feed_order=['x', 'y']
```
In addition, we can define an event handler to respond to events such as `printing the training progress`:

```python
# Specify the directory path to save the parameters
params_dirname = "fit_a_line.inference.model"

# Plot data
from paddle.v2.plot import Ploter
train_title = "Train cost"
test_title = "Test cost"
plot_cost = Ploter(train_title, test_title)

step = 0

# event_handler to plot training and testing info
def event_handler_plot(event):
    global step
    if isinstance(event, fluid.EndStepEvent):
        if event.step % 10 == 0:   # record a test cost every 10 batches
            test_metrics = trainer.test(
                reader=test_reader, feed_order=feed_order)

            plot_cost.append(test_title, step, test_metrics[0])
            plot_cost.plot()

            if test_metrics[0] < 10.0:
                # If the loss is small enough, we can stop the training.
                print('loss is less than 10.0, stop')
                trainer.stop()

        # We can save the trained parameters for the inferences later
        if params_dirname is not None:
            trainer.save_params(params_dirname)

        step += 1
```
### Start Training

We can now start training by calling `trainer.train()`:

```python
%matplotlib inline

# The training could take up to a few minutes.
trainer.train(
    reader=train_reader,
    num_epochs=100,
    event_handler=event_handler_plot,
    feed_order=feed_order)
```
![png](./image/train_and_test.png)
## Inference

Initialize the inferencer with an `inference_program` and the `params_dirname` that stores our parameters.

### Setting Up the Inference Program

Similar to `trainer.train`, the inferencer needs an inference program to make predictions. We can slightly modify the train program so that it contains only the prediction part:

```python
def inference_program():
    x = fluid.layers.data(name='x', shape=[13], dtype='float32')
    y_predict = fluid.layers.fc(input=x, size=1, act=None)
    return y_predict
```
### Inference

The inferencer loads the trained model from `params_dirname` and uses it to predict on data it has never seen:

```python
inferencer = fluid.Inferencer(
    infer_func=inference_program, param_path=params_dirname, place=place)

batch_size = 10
tensor_x = numpy.random.uniform(0, 10, [batch_size, 13]).astype("float32")

results = inferencer.infer({'x': tensor_x})
print("infer results: ", results[0])
```
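The input above is random, so the outputs only exercise the pipeline. To predict on real samples, one could build the input batch from the test reader instead (a sketch, assuming the imports and readers defined earlier):

```python
real_batch = next(test_reader())
real_x = numpy.array([sample[0] for sample in real_batch]).astype("float32")
real_results = inferencer.infer({'x': real_x})
print("predictions on real test samples: ", real_results[0])
```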
## Summary
......@@ -39,7 +39,7 @@ $$MSE=\frac{1}{n}\sum_{i=1}^{n}{(\hat{Y_i}-Y_i)}^2$$
That is, for a dataset of size $n$, MSE is the average value of the prediction square errors.
### Training Process
After setting up our model, there are several major steps to go through to train it:
1. Initialize the parameters including the weights $\vec{\omega}$ and the bias $b$. For example, we can set their mean values as $0$s, and their standard deviations as $1$s.
......@@ -48,21 +48,6 @@ After setting up our model, there are several major steps to go through to train
4. Repeat steps 2~3 until the loss is below a predefined threshold or the maximum number of epochs is reached (a minimal sketch of this loop follows the list).
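Here is a minimal numpy sketch of this training loop for linear regression with MSE (plain batch gradient descent; the data and learning rate are illustrative stand-ins, independent of PaddlePaddle):

```python
import numpy as np

# Illustrative data: 100 samples, 13 features (as in the housing dataset below)
rng = np.random.RandomState(0)
X = rng.rand(100, 13)
y = rng.rand(100)

w = np.zeros(13)   # step 1: initialize the weights
b = 0.0            # ... and the bias
lr = 0.1           # illustrative learning rate

for epoch in range(100):
    y_hat = X.dot(w) + b                # step 2: forward pass
    err = y_hat - y
    loss = (err ** 2).mean()            # MSE
    grad_w = 2 * X.T.dot(err) / len(y)  # step 3: gradients of MSE
    grad_b = 2 * err.mean()
    w -= lr * grad_w                    # update the parameters
    b -= lr * grad_b
    if loss < 1e-3:                     # step 4: stop once the loss is small
        break
```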
## Dataset
### Python Dataset Modules
Our program starts with importing necessary packages:
```python
import paddle.v2 as paddle
import paddle.v2.dataset.uci_housing as uci_housing
```
We encapsulated the [UCI Housing Data Set](https://archive.ics.uci.edu/ml/datasets/Housing) in our Python module `uci_housing`. This module can
1. download the dataset to `~/.cache/paddle/dataset/uci_housing/housing.data`, if you haven't yet, and
2. [preprocess](#preprocessing) the dataset.
### An Introduction to the Dataset

The UCI housing dataset has 506 instances. Each instance describes the attributes of a house in suburban Boston. The attributes are explained below:
......@@ -116,49 +101,71 @@ When training complex models, we usually have one more split: the validation set
`fit_a_line/trainer.py` demonstrates the training using [PaddlePaddle](http://paddlepaddle.org).
### Datafeeder Configuration

Our program starts with importing the necessary packages:

```python
import paddle
import paddle.fluid as fluid
import numpy
```
We encapsulated the [UCI Housing Data Set](https://archive.ics.uci.edu/ml/datasets/Housing) in our Python module `uci_housing`. This module can

1. download the dataset to `~/.cache/paddle/dataset/uci_housing/housing.data`, if you haven't yet, and
2. [preprocess](#preprocessing) the dataset.

We define data feeders for training and testing. A feeder reads `BATCH_SIZE` examples each time and feeds them to the training/testing process. To add some randomness to the data order, one can specify both a `BATCH_SIZE` and a `buf_size`; the feeder then yields each batch of `BATCH_SIZE` examples out of a shuffle of the first `buf_size` examples.

```python
BATCH_SIZE = 20

train_reader = paddle.batch(
    paddle.reader.shuffle(
        paddle.dataset.uci_housing.train(), buf_size=500),
    batch_size=BATCH_SIZE)

test_reader = paddle.batch(
    paddle.reader.shuffle(
        paddle.dataset.uci_housing.test(), buf_size=500),
    batch_size=BATCH_SIZE)
```
### Train Program Configuration

`train_program` sets up the network structure of the current training model. For linear regression, it is simply a fully connected layer from the input to the output. More complex structures like CNNs and RNNs will be introduced in later chapters. The `train_program` must return an `avg_loss` as its first return value because it is needed by backpropagation.

```python
def train_program():
    y = fluid.layers.data(name='y', shape=[1], dtype='float32')

    # feature vector of length 13
    x = fluid.layers.data(name='x', shape=[13], dtype='float32')
    y_predict = fluid.layers.fc(input=x, size=1, act=None)

    loss = fluid.layers.square_error_cost(input=y_predict, label=y)
    avg_loss = fluid.layers.mean(loss)

    return avg_loss
```
### Specify Place

Specify your training environment: whether the training runs on CPU or GPU.

```python
use_cuda = False
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
```
### Create Trainer

The trainer takes the `train_program` as input.

```python
trainer = fluid.Trainer(
    train_func=train_program,
    place=place,
    optimizer_func=fluid.optimizer.SGD(learning_rate=0.001))
```
### Feeding Data
......@@ -168,105 +175,90 @@ PaddlePaddle provides the
for loading the training data. A reader may return multiple columns, and we need a Python list to specify the order in which the columns are fed to the data layers.

```python
feed_order=['x', 'y']
```
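The names in `feed_order` pair the reader's columns with the data layers of the same names, in order. A quick way to check the pairing (a sketch, assuming the readers defined above):

```python
sample = next(train_reader())[0]
assert len(sample) == len(feed_order)  # one column per name: sample[0] -> 'x', sample[1] -> 'y'
```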
Moreover, an event handler is provided to print the training progress:

```python
# Specify the directory path to save the parameters
params_dirname = "fit_a_line.inference.model"

# Plot data
from paddle.v2.plot import Ploter
train_title = "Train cost"
test_title = "Test cost"
plot_cost = Ploter(train_title, test_title)

step = 0

# event_handler to plot training and testing info
def event_handler_plot(event):
    global step
    if isinstance(event, fluid.EndStepEvent):
        if event.step % 10 == 0:   # record a test cost every 10 batches
            test_metrics = trainer.test(
                reader=test_reader, feed_order=feed_order)

            plot_cost.append(test_title, step, test_metrics[0])
            plot_cost.plot()

            if test_metrics[0] < 10.0:
                # If the loss is small enough, we can stop the training.
                print('loss is less than 10.0, stop')
                trainer.stop()

        # We can save the trained parameters for the inferences later
        if params_dirname is not None:
            trainer.save_params(params_dirname)

        step += 1
```
### Start Training

We can now start training by calling `trainer.train()`.

```python
%matplotlib inline

# The training could take up to a few minutes.
trainer.train(
    reader=train_reader,
    num_epochs=100,
    event_handler=event_handler_plot,
    feed_order=feed_order)
```
![png](./image/train_and_test.png)
## Inference

Initialize the Inferencer with the `inference_program` and `params_dirname`, which is where we saved our parameters.

### Setup the Inference Program

Similar to `trainer.train`, the Inferencer needs an `inference_program` to do inference. We prune the train program so that it only keeps `y_predict`.

```python
def inference_program():
    x = fluid.layers.data(name='x', shape=[13], dtype='float32')
    y_predict = fluid.layers.fc(input=x, size=1, act=None)
    return y_predict
```
### Infer

The Inferencer will load the trained model from `params_dirname` and use it to infer on unseen data.

```python
inferencer = fluid.Inferencer(
    infer_func=inference_program, param_path=params_dirname, place=place)

batch_size = 10
tensor_x = numpy.random.uniform(0, 10, [batch_size, 13]).astype("float32")

results = inferencer.infer({'x': tensor_x})
print("infer results: ", results[0])
```
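Since `tensor_x` is drawn uniformly at random, the numbers printed above only demonstrate the mechanics. A follow-up sketch (assuming the imports and readers defined earlier) compares predictions with true prices on real test samples:

```python
real_batch = next(test_reader())
real_x = numpy.array([sample[0] for sample in real_batch]).astype("float32")
real_y = numpy.array([sample[1] for sample in real_batch])

real_results = inferencer.infer({'x': real_x})
for predicted, label in zip(real_results[0], real_y):
    print("predict=%s, label=%s" % (predicted, label))
```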
## Summary
import paddle.v2 as paddle

# Initialize PaddlePaddle.
paddle.init(use_gpu=False, trainer_count=1)

# Configure the neural network.
x = paddle.layer.data(name='x', type=paddle.data_type.dense_vector(13))
y_predict = paddle.layer.fc(input=x, size=1, act=paddle.activation.Linear())

# Infer using provided test data.
probs = paddle.infer(
    output_layer=y_predict,
    parameters=paddle.dataset.uci_housing.model(),
    input=[item for item in paddle.dataset.uci_housing.test()()])

for i in xrange(len(probs)):
    print 'Predicted price: ${:,.2f}'.format(probs[i][0] * 1000)
import os

import paddle.v2 as paddle
import paddle.v2.dataset.uci_housing as uci_housing

with_gpu = os.getenv('WITH_GPU', '0') != '0'


def main():
    # init
    paddle.init(use_gpu=with_gpu, trainer_count=1)

    # network config
    x = paddle.layer.data(name='x', type=paddle.data_type.dense_vector(13))
    y_predict = paddle.layer.fc(input=x, size=1, act=paddle.activation.Linear())
    y = paddle.layer.data(name='y', type=paddle.data_type.dense_vector(1))
    cost = paddle.layer.square_error_cost(input=y_predict, label=y)

    # Save the inference topology to protobuf.
    inference_topology = paddle.topology.Topology(layers=y_predict)
    with open("inference_topology.pkl", 'wb') as f:
        inference_topology.serialize_for_inference(f)

    # create parameters
    parameters = paddle.parameters.create(cost)

    # create optimizer
    optimizer = paddle.optimizer.Momentum(momentum=0)

    trainer = paddle.trainer.SGD(
        cost=cost, parameters=parameters, update_equation=optimizer)

    feeding = {'x': 0, 'y': 1}

    # event_handler to print training and testing info
    def event_handler(event):
        if isinstance(event, paddle.event.EndIteration):
            if event.batch_id % 100 == 0:
                print "Pass %d, Batch %d, Cost %f" % (
                    event.pass_id, event.batch_id, event.cost)

        if isinstance(event, paddle.event.EndPass):
            if event.pass_id % 10 == 0:
                with open('params_pass_%d.tar' % event.pass_id, 'w') as f:
                    trainer.save_parameter_to_tar(f)
            result = trainer.test(
                reader=paddle.batch(uci_housing.test(), batch_size=2),
                feeding=feeding)
            print "Test %d, Cost %f" % (event.pass_id, result.cost)

    # training
    trainer.train(
        reader=paddle.batch(
            paddle.reader.shuffle(uci_housing.train(), buf_size=500),
            batch_size=2),
        feeding=feeding,
        event_handler=event_handler,
        num_passes=30)

    # inference
    test_data_creator = paddle.dataset.uci_housing.test()
    test_data = []
    test_label = []

    for item in test_data_creator():
        test_data.append((item[0], ))
        test_label.append(item[1])
        if len(test_data) == 5:
            break

    # load parameters from tar file.
    # users can remove the comments and change the model name
    # with open('params_pass_20.tar', 'r') as f:
    #     parameters = paddle.parameters.Parameters.from_tar(f)

    probs = paddle.infer(
        output_layer=y_predict, parameters=parameters, input=test_data)

    for i in xrange(len(probs)):
        print "label=" + str(test_label[i][0]) + ", predict=" + str(probs[i][0])


if __name__ == '__main__':
    main()
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import paddle
import paddle.fluid as fluid
import numpy
BATCH_SIZE = 20

train_reader = paddle.batch(
    paddle.reader.shuffle(paddle.dataset.uci_housing.train(), buf_size=500),
    batch_size=BATCH_SIZE)

test_reader = paddle.batch(
    paddle.reader.shuffle(paddle.dataset.uci_housing.test(), buf_size=500),
    batch_size=BATCH_SIZE)


def train_program():
    y = fluid.layers.data(name='y', shape=[1], dtype='float32')

    # feature vector of length 13
    x = fluid.layers.data(name='x', shape=[13], dtype='float32')
    y_predict = fluid.layers.fc(input=x, size=1, act=None)

    loss = fluid.layers.square_error_cost(input=y_predict, label=y)
    avg_loss = fluid.layers.mean(loss)

    return avg_loss


def optimizer_program():
    return fluid.optimizer.SGD(learning_rate=0.001)


# can use CPU or GPU
use_cuda = False
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()

trainer = fluid.Trainer(
    train_func=train_program,
    place=place,
    optimizer_func=optimizer_program)

feed_order = ['x', 'y']

# Specify the directory path to save the parameters
params_dirname = "fit_a_line.inference.model"

# Plot data
from paddle.v2.plot import Ploter
train_title = "Train cost"
test_title = "Test cost"
plot_cost = Ploter(train_title, test_title)

step = 0


# event_handler to plot training and testing info
def event_handler_plot(event):
    global step
    if isinstance(event, fluid.EndStepEvent):
        if event.step % 10 == 0:  # record a test cost every 10 batches
            test_metrics = trainer.test(
                reader=test_reader, feed_order=feed_order)

            plot_cost.append(test_title, step, test_metrics[0])
            plot_cost.plot()

            if test_metrics[0] < 10.0:
                # If the loss is small enough, we can stop the training.
                print('loss is less than 10.0, stop')
                trainer.stop()

        # We can save the trained parameters for the inferences later
        if params_dirname is not None:
            trainer.save_params(params_dirname)

        step += 1


# The training could take up to a few minutes.
trainer.train(
    reader=train_reader,
    num_epochs=100,
    event_handler=event_handler_plot,
    feed_order=feed_order)


def inference_program():
    x = fluid.layers.data(name='x', shape=[13], dtype='float32')
    y_predict = fluid.layers.fc(input=x, size=1, act=None)
    return y_predict


inferencer = fluid.Inferencer(
    infer_func=inference_program, param_path=params_dirname, place=place)

batch_size = 10
tensor_x = numpy.random.uniform(0, 10, [batch_size, 13]).astype("float32")

results = inferencer.infer({'x': tensor_x})
print("infer results: ", results[0])
......@@ -127,112 +127,188 @@ PaddlePaddle在API中提供了自动加载[MNIST](http://yann.lecun.com/exdb/mni
|t10k-images-idx3-ubyte  | test set images, 10,000 records |
|t10k-labels-idx1-ubyte  | test set labels, 10,000 records |
## Overview of the Fluid API

This demo uses the latest `Fluid API`, the newest PaddlePaddle API. It simplifies model configuration without sacrificing performance. We recommend the Fluid API because it is easier to learn.

Here is a quick overview:

1. `inference_program`: a function that specifies how to compute predictions from the data input. This is where the network flow is specified.
1. `train_program`: a function that specifies how to compute the `loss` from the `inference_program` and the `labels`. This is where the loss computation is specified.
1. `optimizer_func`: a function that specifies the optimizer configuration. The optimizer reduces the loss and drives the training. Paddle supports several different optimizers.
1. `Trainer`: the PaddlePaddle Trainer manages the training process specified by the `train_program` and the `optimizer`. Users can monitor the training progress through the `event_handler` callback.
1. `Inferencer`: the Fluid inferencer loads the `inference_program` and the parameters trained by the Trainer, then infers on data and returns predictions.

In this demo, we will take a closer look at each of these pieces; the sketch below shows how they fit together.
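As orientation before the details, here is a schematic of how the five pieces connect (a sketch only; the functions referenced are defined in the sections that follow):

```python
# trainer = fluid.Trainer(train_func=train_program,        # loss + metrics
#                         place=place,                     # CPU or GPU
#                         optimizer_func=optimizer_program)
# trainer.train(reader=train_reader, num_epochs=5,
#               event_handler=event_handler,               # progress callback
#               feed_order=['img', 'label'])
#
# inferencer = fluid.Inferencer(infer_func=convolutional_neural_network,
#                               param_path=params_dirname, place=place)
# results = inferencer.infer({'img': some_numpy_images})
```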
## Configuration

Load PaddlePaddle's Fluid API package:

```python
import paddle
import paddle.fluid as fluid
```
### Program Functions Configuration

We need to set up the inference program functions. We will use this program to demonstrate three different classifiers, each defined as a Python function. The image data must be fed into the classifier; Paddle provides a special layer, `layer.data`, for reading data, so we create a data layer to read the images and connect it to the classification network.
- Softmax regression: the classification result is obtained from a single fully-connected layer with softmax as the activation function.

```python
def softmax_regression():
    img = fluid.layers.data(name='img', shape=[1, 28, 28], dtype='float32')
    predict = fluid.layers.fc(
        input=img, size=10, act='softmax')
    return predict
```
- Multilayer perceptron: the code below implements an MLP with two hidden (fully-connected) layers. Both hidden layers use ReLU as the activation function, and the output layer uses softmax.

```python
def multilayer_perceptron():
    img = fluid.layers.data(name='img', shape=[1, 28, 28], dtype='float32')
    # first fully-connected layer, with ReLU activation
    hidden = fluid.layers.fc(input=img, size=200, act='relu')
    # second fully-connected layer, with ReLU activation
    hidden = fluid.layers.fc(input=hidden, size=200, act='relu')
    # fully-connected output layer with softmax activation; its size must be 10,
    # the number of digits
    prediction = fluid.layers.fc(input=hidden, size=10, act='softmax')
    return prediction
```
- Convolutional neural network (LeNet-5): the input 2D image first passes through two convolution-pooling layers, then a fully-connected layer, and finally a fully-connected output layer with softmax activation.

```python
def convolutional_neural_network():
    img = fluid.layers.data(name='img', shape=[1, 28, 28], dtype='float32')
    # first convolution-pooling layer
    conv_pool_1 = fluid.nets.simple_img_conv_pool(
        input=img,
        filter_size=5,
        num_filters=20,
        pool_size=2,
        pool_stride=2,
        act="relu")
    conv_pool_1 = fluid.layers.batch_norm(conv_pool_1)
    # second convolution-pooling layer
    conv_pool_2 = fluid.nets.simple_img_conv_pool(
        input=conv_pool_1,
        filter_size=5,
        num_filters=50,
        pool_size=2,
        pool_stride=2,
        act="relu")
    # fully-connected output layer with softmax activation; its size must be 10,
    # the number of digits
    prediction = fluid.layers.fc(input=conv_pool_2, size=10, act='softmax')
    return prediction
```
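To make the layer sizes concrete, here is the shape arithmetic for the two convolution-pooling blocks above (a hand calculation, assuming unpadded 5x5 convolutions with stride 1 and the 2x2/stride-2 pooling configured above):

```python
# Shape arithmetic for the conv-pool stack (valid convs, stride 1; 2x2 pools, stride 2):
size = 28
size = size - 5 + 1   # conv1: 28 -> 24, giving 20 feature maps of 24 x 24
size = size // 2      # pool1: 24 -> 12, giving 20 x 12 x 12
size = size - 5 + 1   # conv2: 12 -> 8,  giving 50 x 8 x 8
size = size // 2      # pool2: 8 -> 4,   giving 50 x 4 x 4
print(size)           # 4; the final fc maps 50*4*4 values to 10 classes
```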
#### Train Program Configuration

Next we need to set up the train program `train_program`. It first makes a prediction with the classifier, and during training computes `avg_cost` from that prediction.

**Note:** the train program should return an array whose first element must be `avg_cost`; the trainer uses it to compute gradients.

Feel free to modify the code and compare the results of the softmax regression `softmax_regression`, `MLP`, and convolutional neural network `convolutional_neural_network` classifiers.

```python
def train_program():
    label = fluid.layers.data(name='label', shape=[1], dtype='int64')

    # predict = softmax_regression() # uncomment for softmax regression
    # predict = multilayer_perceptron() # uncomment for multilayer perceptron
    predict = convolutional_neural_network() # uncomment for LeNet-5

    cost = fluid.layers.cross_entropy(input=predict, label=label)
    avg_cost = fluid.layers.mean(cost)
    acc = fluid.layers.accuracy(input=predict, label=label)
    return [avg_cost, acc]
```
#### Optimizer Function Configuration

In the `Adam` optimizer below, `learning_rate` is the training speed, which is related to how fast the network training converges.

```python
def optimizer_program():
    return fluid.optimizer.Adam(learning_rate=0.001)
```
### Dataset Feeders Configuration

Next we start the training process. `paddle.dataset.mnist.train()` and `paddle.dataset.mnist.test()` provide the training and test datasets, respectively. Each of them returns a reader: in PaddlePaddle, a reader is a Python function that returns a Python yield generator each time it is called.

`shuffle` below is a reader decorator: it takes a reader A and returns another reader B, which reads `buf_size` training examples into a buffer at a time, randomly shuffles their order, and yields them one by one.

`batch` is a special decorator whose input is a reader and whose output is a batched reader: in PaddlePaddle, a reader yields one training example per call, while a batched reader yields one minibatch per call.

```python
train_reader = paddle.batch(
    paddle.reader.shuffle(
        paddle.dataset.mnist.train(), buf_size=500),
    batch_size=64)

test_reader = paddle.batch(
    paddle.dataset.mnist.test(), batch_size=64)
```
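The decorators are independent of MNIST, so their behavior can be seen on a toy reader (a minimal sketch; `toy_reader` is a stand-in defined here, not part of the dataset module):

```python
def toy_reader():
    for i in range(8):
        yield i

batched = paddle.batch(paddle.reader.shuffle(toy_reader, buf_size=4), batch_size=2)
for minibatch in batched():
    print(minibatch)   # e.g. [3, 1]: two shuffled examples per minibatch
```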
### Trainer Configuration

Now we need to configure the `Trainer`. The `Trainer` takes the train program `train_program`, the `place`, and the optimizer `optimizer_program`.

```python
# the model runs on a single CPU by default
use_cuda = False  # set to True if training with GPU
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()

trainer = fluid.Trainer(
    train_func=train_program, place=place, optimizer_func=optimizer_program)
```
#### Event Handler Configuration

The Fluid API provides a hook for callback functions during training. Through this mechanism users can monitor the training progress.

We will demonstrate two `event_handler` programs here. Feel free to modify them in the Jupyter notebook and observe the differences.

`event_handler` prints the training results during training:

```python
# Save the parameters into a directory. The Inferencer can load the parameters from it to do infer
params_dirname = "recognize_digits_network.inference.model"
lists = []

def event_handler(event):
    if isinstance(event, fluid.EndStepEvent):
        if event.step % 100 == 0:
            # event.metrics maps with train program return arguments.
            # event.metrics[0] will yield avg_cost and event.metrics[1] will yield acc in this example.
            print "Pass %d, Batch %d, Cost %f" % (
                event.step, event.epoch, event.metrics[0])

    if isinstance(event, fluid.EndEpochEvent):
        avg_cost, acc = trainer.test(
            reader=test_reader, feed_order=['img', 'label'])

        print("Test with Epoch %d, avg_cost: %s, acc: %s" % (event.epoch, avg_cost, acc))

        # save parameters
        trainer.save_params(params_dirname)
        lists.append((event.epoch, avg_cost, acc))
```
`event_handler_plot` can be used to plot a figure during training, as shown below:
![png](./image/train_and_test.png)
......@@ -242,68 +318,57 @@ from paddle.v2.plot import Ploter
```python
from paddle.v2.plot import Ploter

train_title = "Train cost"
test_title = "Test cost"
cost_ploter = Ploter(train_title, test_title)

step = 0
lists = []

# event_handler to plot a figure
def event_handler_plot(event):
    global step
    if isinstance(event, fluid.EndStepEvent):
        if step % 100 == 0:
            # event.metrics maps with train program return arguments.
            # event.metrics[0] will yield avg_cost and event.metrics[1] will yield acc in this example.
            cost_ploter.append(train_title, step, event.metrics[0])
            cost_ploter.plot()
        step += 1
    if isinstance(event, fluid.EndEpochEvent):
        # save parameters
        trainer.save_params(params_dirname)

        avg_cost, acc = trainer.test(
            reader=test_reader, feed_order=['img', 'label'])
        cost_ploter.append(test_title, step, avg_cost)
        lists.append((event.epoch, avg_cost, acc))
```
#### Start Training

Now that we have set up the `event_handler` and the `data reader`, we can start training the model. `feed_order` is used to map the data sources to the `train_program`.

```python
trainer.train(
    num_epochs=5,
    event_handler=event_handler,
    reader=train_reader,
    feed_order=['img', 'label'])
```
The training process is completely automatic; the logs printed by event_handler look like this:

```
Pass 0, Batch 0, Cost 0.125650
Pass 100, Batch 0, Cost 0.161387
Pass 200, Batch 0, Cost 0.040036
Pass 300, Batch 0, Cost 0.023391
Pass 400, Batch 0, Cost 0.005856
Pass 500, Batch 0, Cost 0.003315
Pass 600, Batch 0, Cost 0.009977
Pass 700, Batch 0, Cost 0.020959
Pass 800, Batch 0, Cost 0.105560
Pass 900, Batch 0, Cost 0.239809
Test with Epoch 0, avg_cost: 0.053097883707459624, acc: 0.9822850318471338
```
After training, check the model's prediction accuracy. When trained on MNIST, the softmax regression model typically reaches a classification accuracy of about 92.34%, the multilayer perceptron 97.66%, and the convolutional neural network 99.20%.
......@@ -311,27 +376,50 @@ trainer.train(
## Applying the Model

The trained model can be used to classify images of handwritten digits. The program below shows how to use the `fluid.Inferencer` interface for inference.

### Inference Configuration

The `Inferencer` takes an `infer_func` and a `param_path` to set up the network and the trained parameters. We can simply plug in the classifier defined earlier.

```python
inferencer = fluid.Inferencer(
    # infer_func=softmax_regression, # uncomment for softmax regression
    # infer_func=multilayer_perceptron, # uncomment for MLP
    infer_func=convolutional_neural_network, # uncomment for LeNet-5
    param_path=params_dirname,
    place=place)
```
### Generating Input Data for Inference

`infer_3.png` is an example image of the digit 3. Turn it into a numpy array to match the data feeding format.

```python
# Prepare the test image
import os
import numpy as np
from PIL import Image

def load_image(file):
    im = Image.open(file).convert('L')
    im = im.resize((28, 28), Image.ANTIALIAS)
    im = np.array(im).reshape(1, 1, 28, 28).astype(np.float32)
    im = im / 255.0 * 2.0 - 1.0
    return im

cur_dir = os.getcwd()
img = load_image(cur_dir + '/image/infer_3.png')
```
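A quick sanity check of the prepared input (a sketch; it assumes `image/infer_3.png` exists in the working directory):

```python
print(img.shape)              # (1, 1, 28, 28): one grayscale 28x28 image
print(img.min(), img.max())   # roughly within [-1, 1] after rescaling
```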
### Inference

Now we are ready to make the prediction.

```python
results = inferencer.infer({'img': img})
lab = np.argsort(results)  # probs and lab are the results of one batch data
print "Label of image/infer_3.png is: %d" % lab[0][0][-1]
```
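Why `lab[0][0][-1]`: `results` holds one array of shape (1, 10) with a probability per digit, and `np.argsort` sorts indices in ascending order, so the last index is the most probable digit. A toy illustration of that indexing:

```python
import numpy as np

toy_probs = np.array([[0.05, 0.10, 0.85]])   # one sample, three classes
order = np.argsort(toy_probs)                # indices from least to most likely
print(order[0][-1])                          # -> 2, the argmax
```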
## Summary
import os
from PIL import Image
import numpy as np
import paddle.v2 as paddle
import paddle
import paddle.fluid as fluid
with_gpu = os.getenv('WITH_GPU', '0') != '0'
def softmax_regression(img):
predict = paddle.layer.fc(
input=img, size=10, act=paddle.activation.Softmax())
def softmax_regression():
img = fluid.layers.data(name='img', shape=[1, 28, 28], dtype='float32')
predict = fluid.layers.fc(input=img, size=10, act='softmax')
return predict
def multilayer_perceptron(img):
# The first fully-connected layer
hidden1 = paddle.layer.fc(input=img, size=128, act=paddle.activation.Relu())
# The second fully-connected layer and the according activation function
hidden2 = paddle.layer.fc(
input=hidden1, size=64, act=paddle.activation.Relu())
def multilayer_perceptron():
img = fluid.layers.data(name='img', shape=[1, 28, 28], dtype='float32')
# first fully-connected layer, using ReLu as its activation function
hidden = fluid.layers.fc(input=img, size=128, act='relu')
# second fully-connected layer, using ReLu as its activation function
hidden = fluid.layers.fc(input=hidden, size=64, act='relu')
# The thrid fully-connected layer, note that the hidden size should be 10,
# which is the number of unique digits
predict = paddle.layer.fc(
input=hidden2, size=10, act=paddle.activation.Softmax())
return predict
prediction = fluid.layers.fc(input=hidden, size=10, act='softmax')
return prediction
def convolutional_neural_network():
    img = fluid.layers.data(name='img', shape=[1, 28, 28], dtype='float32')
    # first conv-pool block
    conv_pool_1 = fluid.nets.simple_img_conv_pool(
        input=img,
        filter_size=5,
        num_filters=20,
        pool_size=2,
        pool_stride=2,
        act="relu")
    conv_pool_1 = fluid.layers.batch_norm(conv_pool_1)
    # second conv-pool block
    conv_pool_2 = fluid.nets.simple_img_conv_pool(
        input=conv_pool_1,
        filter_size=5,
        num_filters=50,
        pool_size=2,
        pool_stride=2,
        act="relu")
    # output layer with softmax activation; size = 10 since there are
    # only 10 possible digits
    prediction = fluid.layers.fc(input=conv_pool_2, size=10, act='softmax')
    return prediction
def train_program():
    label = fluid.layers.data(name='label', shape=[1], dtype='int64')

    # Here we can build the prediction network in different ways. Please
    # choose one by uncommenting the corresponding line.
    # predict = softmax_regression()  # uncomment for Softmax
    # predict = multilayer_perceptron()  # uncomment for MLP
    predict = convolutional_neural_network()  # LeNet5

    # Calculate the cost from the prediction and label.
    cost = fluid.layers.cross_entropy(input=predict, label=label)
    avg_cost = fluid.layers.mean(cost)
    acc = fluid.layers.accuracy(input=predict, label=label)
    return [avg_cost, acc]


def optimizer_program():
    return fluid.optimizer.Adam(learning_rate=0.001)
def main():
    train_reader = paddle.batch(
        paddle.reader.shuffle(paddle.dataset.mnist.train(), buf_size=500),
        batch_size=64)
    test_reader = paddle.batch(paddle.dataset.mnist.test(), batch_size=64)

    use_cuda = os.getenv('WITH_GPU', '0') != '0'
    place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()

    trainer = fluid.Trainer(
        train_func=train_program, place=place, optimizer_func=optimizer_program)

    # Save the parameters into a directory; the Inferencer can later load
    # them from there to run inference.
    params_dirname = "recognize_digits_network.inference.model"
    lists = []

    def event_handler(event):
        if isinstance(event, fluid.EndStepEvent):
            if event.step % 100 == 0:
                # event.metrics maps to the train program's return values:
                # event.metrics[0] yields avg_cost and event.metrics[1]
                # yields acc in this example.
                # (Note: the step count is printed under the 'Pass' label,
                # matching the sample output shown earlier.)
                print("Pass %d, Batch %d, Cost %f" % (event.step, event.epoch,
                                                      event.metrics[0]))
        if isinstance(event, fluid.EndEpochEvent):
            avg_cost, acc = trainer.test(
                reader=test_reader, feed_order=['img', 'label'])
            print("Test with Epoch %d, avg_cost: %s, acc: %s" %
                  (event.epoch, avg_cost, acc))
            # save parameters after each epoch
            trainer.save_params(params_dirname)
            lists.append((event.epoch, avg_cost, acc))

    # Train the model now
    trainer.train(
        num_epochs=5,
        event_handler=event_handler,
        reader=train_reader,
        feed_order=['img', 'label'])

    # find the epoch with the lowest test cost
    best = sorted(lists, key=lambda rec: float(rec[1]))[0]
    print('Best pass is %s, testing Avgcost is %s' % (best[0], best[1]))
    print('The classification accuracy is %.2f%%' % (float(best[2]) * 100))
    def load_image(file):
        im = Image.open(file).convert('L')
        im = im.resize((28, 28), Image.ANTIALIAS)
        # NCHW layout: one image, one channel, 28 x 28 pixels,
        # scaled from [0, 255] to [-1, 1] as during training
        im = np.array(im).reshape(1, 1, 28, 28).astype(np.float32)
        im = im / 255.0 * 2.0 - 1.0
        return im

    cur_dir = os.path.dirname(os.path.realpath(__file__))
    img = load_image(cur_dir + '/image/infer_3.png')

    inferencer = fluid.Inferencer(
        # infer_func=softmax_regression,  # uncomment for softmax regression
        # infer_func=multilayer_perceptron,  # uncomment for MLP
        infer_func=convolutional_neural_network,  # LeNet5
        param_path=params_dirname,
        place=place)

    results = inferencer.infer({'img': img})
    lab = np.argsort(results)  # probs and lab are the results of one batch of data
    print("Label of image/infer_3.png is: %d" % lab[0][0][-1])
if __name__ == '__main__':
......
This diff is collapsed.
This diff is collapsed.
......@@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import paddle.fluid as fluid
__all__ = ['resnet_cifar10']
......@@ -22,37 +22,35 @@ def conv_bn_layer(input,
                  ch_out,
                  filter_size,
                  stride,
                  padding,
                  act='relu',
                  bias_attr=False):
    tmp = fluid.layers.conv2d(
        input=input,
        filter_size=filter_size,
        num_filters=ch_out,
        stride=stride,
        padding=padding,
        act=None,
        bias_attr=bias_attr)
    # batch-normalize the convolution output, then apply the activation
    return fluid.layers.batch_norm(input=tmp, act=act)
def shortcut(input, ch_in, ch_out, stride):
    # project with a 1x1 convolution only when the channel count changes
    if ch_in != ch_out:
        return conv_bn_layer(input, ch_out, 1, stride, 0, None)
    else:
        return input
def basicblock(input, ch_in, ch_out, stride):
    tmp = conv_bn_layer(input, ch_out, 3, stride, 1)
    tmp = conv_bn_layer(tmp, ch_out, 3, 1, 1, act=None, bias_attr=True)
    short = shortcut(input, ch_in, ch_out, stride)
    # residual connection: add the shortcut, then apply ReLU
    return fluid.layers.elementwise_add(x=tmp, y=short, act='relu')
def layer_warp(block_func, input, ch_in, ch_out, count, stride):
    # the first block may downsample or change width; the rest keep the shape
    tmp = block_func(input, ch_in, ch_out, stride)
    for i in range(1, count):
        tmp = block_func(tmp, ch_out, ch_out, 1)
    return tmp
......@@ -63,11 +61,11 @@ def resnet_cifar10(ipt, depth=32):
    assert (depth - 2) % 6 == 0
    # number of residual blocks per stage
    n = (depth - 2) // 6
    conv1 = conv_bn_layer(ipt, ch_out=16, filter_size=3, stride=1, padding=1)
    res1 = layer_warp(basicblock, conv1, 16, 16, n, 1)
    res2 = layer_warp(basicblock, res1, 16, 32, n, 2)
    res3 = layer_warp(basicblock, res2, 32, 64, n, 2)
    pool = fluid.layers.pool2d(
        input=res3, pool_size=8, pool_type='avg', pool_stride=1)
    predict = fluid.layers.fc(input=pool, size=10, act='softmax')
    return predict
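For reference, `resnet_cifar10` is wired into a program the same way as any other fluid network builder; a minimal sketch based on the training script below (the layer name `'pixel'` matches the feed order used there):

```python
import paddle.fluid as fluid
from resnet import resnet_cifar10

# CIFAR-10 images are 3 x 32 x 32 in CHW order
images = fluid.layers.data(name='pixel', shape=[3, 32, 32], dtype='float32')
predict = resnet_cifar10(images, 32)  # 32-layer variant, softmax over 10 classes
```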
......@@ -12,92 +12,87 @@
# See the License for the specific language governing permissions and
# limitations under the License
from __future__ import print_function

import paddle
import paddle.fluid as fluid
import numpy
import sys

from vgg import vgg_bn_drop
from resnet import resnet_cifar10
def inference_network():
    # The image is 32 * 32 with RGB representation.
    data_shape = [3, 32, 32]
    images = fluid.layers.data(name='pixel', shape=data_shape, dtype='float32')

    predict = resnet_cifar10(images, 32)
    # predict = vgg_bn_drop(images)  # uncomment to use the VGG net
    return predict
def train_network():
    predict = inference_network()
    label = fluid.layers.data(name='label', shape=[1], dtype='int64')
    cost = fluid.layers.cross_entropy(input=predict, label=label)
    avg_cost = fluid.layers.mean(cost)
    accuracy = fluid.layers.accuracy(input=predict, label=label)
    return [avg_cost, accuracy]


def optimizer_program():
    return fluid.optimizer.Adam(learning_rate=0.001)
def train(use_cuda, train_program, params_dirname):
    BATCH_SIZE = 128
    EPOCH_NUM = 2

    train_reader = paddle.batch(
        paddle.reader.shuffle(paddle.dataset.cifar.train10(), buf_size=50000),
        batch_size=BATCH_SIZE)

    test_reader = paddle.batch(
        paddle.dataset.cifar.test10(), batch_size=BATCH_SIZE)
    # End-of-step and end-of-epoch event handler
    def event_handler(event):
        if isinstance(event, fluid.EndStepEvent):
            if event.step % 100 == 0:
                # event.metrics[0] is avg_cost, event.metrics[1] is accuracy
                print("\nPass %d, Batch %d, Cost %f, Acc %f" %
                      (event.step, event.epoch, event.metrics[0],
                       event.metrics[1]))
            else:
                sys.stdout.write('.')
                sys.stdout.flush()
        if isinstance(event, fluid.EndEpochEvent):
            avg_cost, accuracy = trainer.test(
                reader=test_reader, feed_order=['pixel', 'label'])
            print('\nTest with Pass {0}, Loss {1:2.2}, Acc {2:2.2}'.format(
                event.epoch, avg_cost, accuracy))
            if params_dirname is not None:
                trainer.save_params(params_dirname)

    place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
    trainer = fluid.Trainer(
        train_func=train_program, optimizer_func=optimizer_program, place=place)
    trainer.train(
        reader=train_reader,
        num_epochs=EPOCH_NUM,
        event_handler=event_handler,
        feed_order=['pixel', 'label'])
def infer(use_cuda, inference_program, params_dirname=None):
    place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
    inferencer = fluid.Inferencer(
        infer_func=inference_program, param_path=params_dirname, place=place)

    # Prepare the testing data.
    from PIL import Image
    import numpy as np
    import os
......@@ -105,32 +100,44 @@ def main():
    def load_image(file):
        im = Image.open(file)
        im = im.resize((32, 32), Image.ANTIALIAS)

        im = np.array(im).astype(np.float32)
        # The storage order of the loaded image is W(width),
        # H(height), C(channel). PaddlePaddle requires
        # the CHW order, so transpose it.
        im = im.transpose((2, 0, 1))  # CHW
        # In the training phase, the channel order of a CIFAR
        # image is B(Blue), G(green), R(Red). But PIL opens
        # images in RGB mode, so the channel order must be swapped.
        im = im[(2, 1, 0), :, :]  # BGR
        im = im.flatten()
        im = im / 255.0
        # Add one dimension to mimic the list format.
        im = numpy.expand_dims(im, axis=0)
        return im

    cur_dir = os.path.dirname(os.path.realpath(__file__))
    img = load_image(cur_dir + '/image/dog.png')

    results = inferencer.infer({'pixel': img})
    print("infer results: ", results)
def main(use_cuda):
    if use_cuda and not fluid.core.is_compiled_with_cuda():
        return
    save_path = "image_classification_resnet.inference.model"

    train(
        use_cuda=use_cuda,
        train_program=train_network,
        params_dirname=save_path)

    infer(
        use_cuda=use_cuda,
        inference_program=inference_network,
        params_dirname=save_path)


if __name__ == '__main__':
    # For demo purposes, the training runs on CPU.
    # Please change accordingly.
    main(use_cuda=False)
......@@ -12,36 +12,35 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import paddle
import paddle.fluid as fluid
__all__ = ['vgg_bn_drop']
def vgg_bn_drop(input):
    def conv_block(ipt, num_filter, groups, dropouts):
        # a group of 3x3 convolutions with batch norm and dropout,
        # followed by 2x2 max pooling
        return fluid.nets.img_conv_group(
            input=ipt,
            pool_size=2,
            pool_stride=2,
            conv_num_filter=[num_filter] * groups,
            conv_filter_size=3,
            conv_act='relu',
            conv_with_batchnorm=True,
            conv_batchnorm_drop_rate=dropouts,
            pool_type='max')

    conv1 = conv_block(input, 64, 2, [0.3, 0])
    conv2 = conv_block(conv1, 128, 2, [0.4, 0])
    conv3 = conv_block(conv2, 256, 3, [0.4, 0.4, 0])
    conv4 = conv_block(conv3, 512, 3, [0.4, 0.4, 0])
    conv5 = conv_block(conv4, 512, 3, [0.4, 0.4, 0])

    drop = fluid.layers.dropout(x=conv5, dropout_prob=0.5)
    fc1 = fluid.layers.fc(input=drop, size=512, act=None)
    bn = fluid.layers.batch_norm(input=fc1, act='relu')
    drop2 = fluid.layers.dropout(x=bn, dropout_prob=0.5)
    fc2 = fluid.layers.fc(input=drop2, size=512, act=None)
    predict = fluid.layers.fc(input=fc2, size=10, act='softmax')
    return predict
\ No newline at end of file
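`vgg_bn_drop` plugs into the same training script as the ResNet variant; a minimal sketch (it mirrors the commented-out option in `inference_network` above):

```python
import paddle.fluid as fluid
from vgg import vgg_bn_drop

images = fluid.layers.data(name='pixel', shape=[3, 32, 32], dtype='float32')
predict = vgg_bn_drop(images)  # softmax over the 10 CIFAR-10 classes
```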
(The remaining file diffs are collapsed.)