Commit 24a84c03 authored by Travis CI

Deploy to GitHub Pages: 51398659

Parent 7a3f6bf7
The tutorials in v1_api_tutorials are using v1_api currently, and will be upgraded to v2_api later.
Thus, v1_api_tutorials is a temporary directory. We decide not to maintain it and will delete it in future.
Please go to [PaddlePaddle/book](https://github.com/PaddlePaddle/book) and
[PaddlePaddle/models](https://github.com/PaddlePaddle/models) to learn PaddlePaddle.
# Chinese Word Embedding Model Tutorial #
----------
This tutorial is to guide you through the process of using a Pretrained Chinese Word Embedding Model in the PaddlePaddle standard format.
We thank @lipeng for the pull request that defined the model schemas and pretrained the models.
## Introduction ##
### Chinese Word Dictionary ###
Our Chinese-word dictionary is created from Baidu ZhiDao and Baidu Baike using an in-house word segmenter. For example, "《红楼梦》" is segmented into "《", "红楼梦", "》", and "《红楼梦》". Our dictionary (in UTF-8 format) has two columns: the word and its frequency. The total word count is 3206326, including four special tokens (a small loading sketch follows the list):
- `<s>`: the start of a sequence
- `<e>`: the end of a sequence
- `PALCEHOLDER_JUST_IGNORE_THE_EMBEDDING`: a placeholder, just ignore it and its embedding
- `<unk>`: a word not included in dictionary
### Pretrained Chinese Word Embedding Model ###
Inspired by the paper [A Neural Probabilistic Language Model](http://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf), our model architecture (**Embedding joint of six words->FullyConnect->SoftMax**) is shown in the following figure. For our dictionary, we pretrain four models with different word vector dimensions, i.e., 32, 64, 128 and 256.
<center>![](./neural-n-gram-model.png)</center>
<center>Figure 1. neural-n-gram-model</center>
### Download and Extract ###
To download and extract our dictionary and pretrained model, run the following commands.
```
cd $PADDLE_ROOT/demo/model_zoo/embedding
./pre_DictAndModel.sh
```
## Chinese Paraphrasing Example ##
We provide a paraphrasing task to show the usage of pretrained Chinese Word Dictionary and Embedding Model.
### Data Preparation and Preprocess ###
First, run the following commands to download and extract the in-house dataset. The dataset (in UTF-8 format) has 20 training samples, 5 testing samples and 2 generating samples.
```
cd $PADDLE_ROOT/demo/seqToseq/data
./paraphrase_data.sh
```
Second, preprocess the data and build the dictionary on the training data by running the following commands; the preprocessed dataset is stored in `$PADDLE_SOURCE_ROOT/demo/seqToseq/data/pre-paraphrase`:
```
cd $PADDLE_ROOT/demo/seqToseq/
python preprocess.py -i data/paraphrase [--mergeDict]
```
- `--mergeDict`: if this option is used, the source and target dictionaries are merged, i.e., the two dictionaries are identical. Since the source and target data here are both Chinese words, this option can be used.
### User Specified Embedding Model ###
The general command for extracting the desired parameters from the pretrained embedding model based on a user dictionary is:
```
cd $PADDLE_ROOT/demo/model_zoo/embedding
python extract_para.py --preModel PREMODEL --preDict PREDICT --usrModel USRMODEL --usrDict USRDICT -d DIM
```
- `--preModel PREMODEL`: the name of the pretrained embedding model
- `--preDict PREDICT`: the name of the pretrained dictionary
- `--usrModel USRMODEL`: the name of the extracted embedding model
- `--usrDict USRDICT`: the name of the user-specified dictionary
- `-d DIM`: the dimension of the parameters
Here, you can simply run the command:
```
cd $PADDLE_ROOT/demo/seqToseq/data/
./paraphrase_model.sh
```
You will then see the following embedding model structure:
```
paraphrase_model
|--- _source_language_embedding
|--- _target_language_embedding
```
### Training Model in PaddlePaddle ###
First, create a model config file; see the example `demo/seqToseq/paraphrase/train.conf`:
```python
from seqToseq_net import *
is_generating = False

################## Data Definition #####################
train_conf = seq_to_seq_data(data_dir = "./data/pre-paraphrase",
                             job_mode = job_mode)

############## Algorithm Configuration ##################
settings(
    learning_method = AdamOptimizer(),
    batch_size = 50,
    learning_rate = 5e-4)

################# Network configure #####################
gru_encoder_decoder(train_conf, is_generating, word_vector_dim = 32)
```
This config is almost the same as `demo/seqToseq/translation/train.conf`.
Then, train the model by running the command:
```
cd $PADDLE_SOURCE_ROOT/demo/seqToseq/paraphrase
./train.sh
```
where `train.sh` is almost the same as `demo/seqToseq/translation/train.sh`; the only differences are the following two command-line arguments:
- `--init_model_path`: path of the initialization model, here is `data/paraphrase_model`
- `--load_missing_parameter_strategy`: what to do when a model file is missing; here a normal distribution is used to initialize all parameters except the embedding layer
For users who want to understand the dataset format, model architecture and training procedure in detail, please refer to [Text generation Tutorial](../text_generation/index_en.md).
## Optional Function ##
### Embedding Parameters Observation
For users who want to observe the embedding parameters, this function can convert a PaddlePaddle binary embedding model to a text model by running the command:
```
cd $PADDLE_ROOT/demo/model_zoo/embedding
python paraconvert.py --b2t -i INPUT -o OUTPUT -d DIM
```
- `-i INPUT`: the name of input binary embedding model
- `-o OUTPUT`: the name of output text embedding model
- `-d DIM`: the dimension of parameter
You will see parameters like these in the output text model:
```
0,4,32156096
-0.7845433,1.1937413,-0.1704215,0.4154715,0.9566584,-0.5558153,-0.2503305, ......
0.0000909,0.0009465,-0.0008813,-0.0008428,0.0007879,0.0000183,0.0001984, ......
......
```
- The 1st line is the **PaddlePaddle format file header**; it has 3 attributes:
  - the version of PaddlePaddle, here 0
  - sizeof(float), here 4
  - the total number of parameters, here 32156096
- The other lines print the parameters (assume `<dim>` = 32):
  - each line prints 32 parameters separated by ','
  - there are 32156096/32 = 1004878 lines, meaning there are 1004878 embedding words (a small parsing sketch follows)
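The sketch below is a minimal example of parsing such a text model into a NumPy matrix; the file name `paraphrase_model.txt` is only a placeholder for a model produced by `paraconvert.py --b2t`.

```python
import numpy as np

# Minimal sketch: parse the text embedding model format described above.
def load_text_embedding(path):
    with open(path, 'r') as f:
        # header: version, sizeof(float), total number of parameters
        version, float_size, num_params = [int(x) for x in f.readline().strip().split(',')]
        rows = [[float(v) for v in line.strip().rstrip(',').split(',')]
                for line in f if line.strip()]
    emb = np.array(rows, dtype=np.float32)  # shape: (#words, dim)
    assert emb.size == num_params
    return emb

embedding = load_text_embedding('paraphrase_model.txt')
print(embedding.shape)
```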
### Embedding Parameters Revision
For users who want to revise the embedding parameters, this function can convert a revised text embedding model to a PaddlePaddle binary model by running the command:
```
cd $PADDLE_ROOT/demo/model_zoo/embedding
python paraconvert.py --t2b -i INPUT -o OUTPUT
```
- `-i INPUT`: the name of input text embedding model.
- `-o OUTPUT`: the name of output binary embedding model
Note that the format of the input text model is as follows:
```
-0.7845433,1.1937413,-0.1704215,0.4154715,0.9566584,-0.5558153,-0.2503305, ......
0.0000909,0.0009465,-0.0008813,-0.0008428,0.0007879,0.0000183,0.0001984, ......
......
```
- there is no file header in the 1st line
- each line stores the parameters of one word, separated by commas ',' (see the sketch below)
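As an illustrative sketch only (the file names and the x0.5 scaling are hypothetical), a revised text model in the format above could be produced like this and then converted back with `paraconvert.py --t2b`:

```python
import numpy as np

# Illustrative sketch: revise embedding values and write them back in the
# text format expected by `paraconvert.py --t2b` (no header, one word per
# line, comma-separated). File names and the 0.5 scaling are hypothetical.
emb = np.loadtxt('embedding_text_model', delimiter=',', dtype=np.float32)
emb *= 0.5  # example revision
with open('embedding_text_model_revised', 'w') as f:
    for row in emb:
        f.write(','.join('%.7f' % v for v in row) + '\n')
```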
# Generative Adversarial Networks (GAN)
This demo implements GAN training described in the original [GAN paper](https://arxiv.org/abs/1406.2661) and deep convolutional generative adversarial networks [DCGAN paper](https://arxiv.org/abs/1511.06434).
The high-level structure of GAN is shown in Figure 1 below. It is composed of two major parts: a generator and a discriminator, both of which are based on neural networks. The generator takes in some kind of noise with a known distribution and transforms it into an image. The discriminator takes in an image and determines whether it was artificially generated by the generator or is a real image. The generator and the discriminator are thus in a competitive game in which the generator tries to generate images that look as real as possible to fool the discriminator, while the discriminator tries to distinguish between real and fake images.
<center>![](./gan.png)</center>
<p align="center">
Figure 1. GAN-Model-Structure
<a href="https://ishmaelbelghazi.github.io/ALI/">figure credit</a>
</p>
The generator and discriminator take turns being trained using SGD. The objective of the generator is to have its generated images classified as real by the discriminator, and the objective of the discriminator is to correctly classify real and fake images. When the GAN model converges to the equilibrium state, the generator transforms the given noise distribution into the distribution of real images, and the discriminator can no longer distinguish between real and fake images.
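These two objectives can be written as the usual cross-entropy losses: the discriminator minimizes -[log D(x) + log(1 - D(G(z)))] and the generator minimizes -log D(G(z)). The NumPy sketch below is illustrative only; the toy probabilities are made up and are not part of the demo code.

```python
import numpy as np

# Illustrative only: toy probabilities standing in for discriminator outputs.
d_real = np.array([0.9, 0.8])   # D(x) on real images
d_fake = np.array([0.3, 0.1])   # D(G(z)) on generated images

# Discriminator loss: classify real images as 1 and fake images as 0.
d_loss = -np.mean(np.log(d_real)) - np.mean(np.log(1.0 - d_fake))
# Generator loss: make the discriminator classify fake images as real.
g_loss = -np.mean(np.log(d_fake))
print(d_loss, g_loss)
```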
## Implementation of GAN Model Structure
Since the GAN model involves multiple neural networks, it requires the PaddlePaddle Python API. So the code walk-through below can also partially serve as an introduction to the usage of the Paddle Python API.
There are three networks defined in gan_conf.py, namely **generator_training**, **discriminator_training** and **generator**. Their relationship to the model structure defined above is that **discriminator_training** is the discriminator, **generator** is the generator, and **generator_training** combines the generator and the discriminator, since training the generator requires the discriminator to provide the loss function. This relationship is described in the following code:
```python
if is_generator_training:
    noise = data_layer(name="noise", size=noise_dim)
    sample = generator(noise)

if is_discriminator_training:
    sample = data_layer(name="sample", size=sample_dim)

if is_generator_training or is_discriminator_training:
    label = data_layer(name="label", size=1)
    prob = discriminator(sample)
    cost = cross_entropy(input=prob, label=label)
    classification_error_evaluator(
        input=prob, label=label, name=mode + '_error')
    outputs(cost)

if is_generator:
    noise = data_layer(name="noise", size=noise_dim)
    outputs(generator(noise))
```
In order to train the networks defined in gan_conf.py, one first needs to initialize a Paddle environment, parse the config, create a GradientMachine from the config, and create a trainer from the GradientMachine, as done in the code chunk below:
```python
import py_paddle.swig_paddle as api

# init paddle environment
api.initPaddle('--use_gpu=' + use_gpu, '--dot_period=10',
               '--log_period=100', '--gpu_id=' + args.gpu_id,
               '--save_dir=' + "./%s_params/" % data_source)

# Parse config
gen_conf = parse_config(conf, "mode=generator_training,data=" + data_source)
dis_conf = parse_config(conf, "mode=discriminator_training,data=" + data_source)
generator_conf = parse_config(conf, "mode=generator,data=" + data_source)

# Create GradientMachine
dis_training_machine = api.GradientMachine.createFromConfigProto(
    dis_conf.model_config)
gen_training_machine = api.GradientMachine.createFromConfigProto(
    gen_conf.model_config)
generator_machine = api.GradientMachine.createFromConfigProto(
    generator_conf.model_config)

# Create trainer
dis_trainer = api.Trainer.create(dis_conf, dis_training_machine)
gen_trainer = api.Trainer.create(gen_conf, gen_training_machine)
```
In order to balance the strength of the generator and the discriminator, we train whichever one is currently performing worse, as judged by comparing their loss values; a minimal sketch of this scheduling follows the code below. The loss value can be calculated by a forward pass through the GradientMachine.
```python
def get_training_loss(training_machine, inputs):
    outputs = api.Arguments.createArguments(0)
    training_machine.forward(inputs, outputs, api.PASS_TEST)
    loss = outputs.getSlotValue(0).copyToNumpyMat()
    return numpy.mean(loss)
```
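The sketch below illustrates that scheduling logic. It is not a verbatim excerpt of gan_trainer.py: `batch_size` is assumed to be set elsewhere, and `prepare_discriminator_data_batch` / `prepare_generator_data_batch` are hypothetical helpers standing in for the batch-preparation code in the demo.

```python
# Minimal sketch of the train-the-weaker-network scheduling. The helper
# functions preparing data batches are hypothetical placeholders.
num_iterations = 1000
for i in range(num_iterations):
    dis_batch = prepare_discriminator_data_batch()   # real + fake samples
    gen_batch = prepare_generator_data_batch()       # noise labeled as real

    dis_loss = get_training_loss(dis_training_machine, dis_batch)
    gen_loss = get_training_loss(gen_training_machine, gen_batch)

    if dis_loss > gen_loss:
        # The discriminator is doing worse, so train it this step.
        dis_trainer.trainOneDataBatch(batch_size, dis_batch)
        copy_shared_parameters(dis_training_machine, gen_training_machine)
    else:
        gen_trainer.trainOneDataBatch(batch_size, gen_batch)
        copy_shared_parameters(gen_training_machine, dis_training_machine)
        copy_shared_parameters(gen_training_machine, generator_machine)
```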
After training one network, one needs to sync the new parameters to the other networks. The code below demonstrates one example of such a use case:
```python
# Train the gen_training
gen_trainer.trainOneDataBatch(batch_size, data_batch_gen)

# Copy the parameters from gen_training to dis_training and generator
copy_shared_parameters(gen_training_machine,
                       dis_training_machine)
copy_shared_parameters(gen_training_machine, generator_machine)
```
## A Toy Example
With the infrastructure explained above, we can now walk you through a toy example of generating a two-dimensional uniform distribution from 10-dimensional Gaussian noise.
The Gaussian noises are generated using the code below:
```python
def get_noise(batch_size, noise_dim):
    return numpy.random.normal(size=(batch_size, noise_dim)).astype('float32')
```
The real samples (2-D uniform) are generated using the code below:
```python
# synthesize 2-D uniform data in gan_trainer.py:114
def load_uniform_data():
    data = numpy.random.rand(1000000, 2).astype('float32')
    return data
```
The generator and discriminator networks are built from fully-connected layers and batch_norm layers, and are defined in gan_conf.py.
To train the GAN model, one can use the command below. The flag `-d` specifies the training data (cifar, mnist or uniform) and the flag `--useGpu` specifies whether to use the GPU for training (0 for CPU, 1 for GPU).
```bash
$python gan_trainer.py -d uniform --useGpu 1
```
The generated samples can be found in ./uniform_samples/ and one example is shown below as Figure 2. One can see that it roughly recovers the 2D uniform distribution.
<center>![](./uniform_sample.png)</center>
<p align="center">
Figure 2. Uniform Sample
</p>
## MNIST Example
### Data preparation
To download the MNIST data, one can use the following commands:
```bash
$cd data/
$./get_mnist_data.sh
```
### Model description
Following the DCGAN paper (https://arxiv.org/abs/1511.06434), we use convolution/convolution-transpose layers in the discriminator/generator networks to better deal with images. The details of the network structures are defined in gan_conf_image.py.
### Training the model
To train the GAN model on mnist data, one can use the following command:
```bash
$python gan_trainer.py -d mnist --useGpu 1
```
The generated sample images can be found at ./mnist_samples/ and one example is shown below as Figure 3.
<center>![](./mnist_sample.png)</center>
<p align="center">
Figure 3. MNIST Sample
</p>
# Model Zoo - ImageNet #
[ImageNet](http://www.image-net.org/) is a popular dataset for generic object classification. This tutorial provides convolutional neural network (CNN) models for ImageNet.
## ResNet Introduction
ResNets from the paper [Deep Residual Learning for Image Recognition](http://arxiv.org/abs/1512.03385) won 1st place on the ILSVRC 2015 classification task. They present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. The residual connections are shown in the following figure. The left building block is used in the 34-layer network, and the right bottleneck building block is used in the 50-, 101- and 152-layer networks.
<center>![resnet_block](./resnet_block.jpg)</center>
<center>Figure 1. ResNet Block</center>
We present three ResNet models converted from the models provided by the authors at <https://github.com/KaimingHe/deep-residual-networks>. The classification errors, tested in PaddlePaddle on the 50,000-image ILSVRC validation set with **BGR** input channel order, single scale (shorter side resized to 256) and single crop, are listed in the following table.
<center>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border">
<colgroup>
<col class="left" />
<col class="left" />
<col class="left" />
</colgroup>
<thead>
<tr>
<th scope="col" class="left">ResNet</th>
<th scope="col" class="left">Top-1</th>
<th scope="col" class="left">Model Size</th>
</tr>
</thead>
<tbody>
<tr>
<td class="left">ResNet-50</td>
<td class="left">24.9%</td>
<td class="left">99M</td>
</tr>
<tr>
<td class="left">ResNet-101</td>
<td class="left">23.7%</td>
<td class="left">173M</td>
</tr>
<tr>
<td class="left">ResNet-152</td>
<td class="left">23.2%</td>
<td class="left">234M</td>
</tr>
</tbody>
</table></center>
<br>
## ResNet Model
See ```demo/model_zoo/resnet/resnet.py```. This config defines networks of 50, 101 and 152 layers. You can specify the layer number by adding a command-line argument like ```--config_args=layer_num=50```.
### Network Visualization
You can get a diagram of the ResNet network by running the following commands. The script generates a dot file and then converts it to a PNG file; graphviz must be installed for the conversion.
```
cd demo/model_zoo/resnet
./net_diagram.sh
```
### Model Download
```
cd demo/model_zoo/resnet
./get_model.sh
```
The above command downloads all models and the mean file and saves them in ```demo/model_zoo/resnet/model``` if the download succeeds.
```
mean_meta_224 resnet_101 resnet_152 resnet_50
```
* resnet_50: model of 50 layers.
* resnet_101: model of 101 layers.
* resnet_152: model of 152 layers.
* mean\_meta\_224: mean file of size 3 x 224 x 224 in **BGR** order. You can also use the three mean values 103.939, 116.779 and 123.68 instead.
### Parameter Info
* **Convolution Layer Weight**
As a batch normalization layer is connected after each convolution layer, the convolution layer has no bias parameter and only one weight.
shape: `(Co, ky, kx, Ci)`
* Co: channel number of the output feature map.
* ky: filter size in the vertical direction.
* kx: filter size in the horizontal direction.
* Ci: channel number of the input feature map.
2-Dim matrix: (Co * ky * kx, Ci), saved in row-major order.
* **Fully connected Layer Weight**
2-Dim matrix: (input layer size, this layer size), saved in row-major order.
* **[Batch Normalization](<http://arxiv.org/abs/1502.03167>) Layer Weight**
There are four parameters in this layer. In fact, only .w0 and .wbias are learned parameters; the other two are the running mean and variance, respectively, which are loaded at test time. The following table shows the parameters of a batch normalization layer, and a small sketch applying them follows the table.
<center>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border">
<colgroup>
<col class="left" />
<col class="left" />
<col class="left" />
</colgroup>
<thead>
<tr>
<th scope="col" class="left">Parameter Name</th>
<th scope="col" class="left">Number</th>
<th scope="col" class="left">Meaning</th>
</tr>
</thead>
<tbody>
<tr>
<td class="left">_res2_1_branch1_bn.w0</td>
<td class="left">256</td>
<td class="left">gamma, scale parameter</td>
</tr>
<tr>
<td class="left">_res2_1_branch1_bn.w1</td>
<td class="left">256</td>
<td class="left">mean value of feature map</td>
</tr>
<tr>
<td class="left">_res2_1_branch1_bn.w2</td>
<td class="left">256</td>
<td class="left">variance of feature map</td>
</tr>
<tr>
<td class="left">_res2_1_branch1_bn.wbias</td>
<td class="left">256</td>
<td class="left">beta, shift parameter</td>
</tr>
</tbody>
</table></center>
<br>
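The sketch below is illustrative only: it assumes the four batch-norm parameters have already been loaded into NumPy arrays (for example with the `load()` helper shown in the next section) and applies the standard test-time batch-normalization transform to a feature map; the epsilon value and the random stand-in arrays are assumptions for the example.

```python
import numpy as np

# Illustrative sketch: apply the four batch-norm parameters (gamma, running
# mean, running variance, beta) to a feature map x of shape (C, H, W) at
# test time. In practice the arrays would come from files such as
# _res2_1_branch1_bn.w0, .w1, .w2 and .wbias; here random values stand in.
def apply_batch_norm(x, gamma, mean, var, beta, eps=1e-5):
    # Broadcast the per-channel statistics over H and W.
    g, m, v, b = (a[:, None, None] for a in (gamma, mean, var, beta))
    return g * (x - m) / np.sqrt(v + eps) + b

x = np.random.rand(256, 56, 56).astype(np.float32)   # stand-in feature map
gamma, mean, var, beta = (np.random.rand(256).astype(np.float32) for _ in range(4))
y = apply_batch_norm(x, gamma, mean, var, beta)
print(y.shape)
```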
### Parameter Observation
Users who want to observe the parameters can read them with Python:
```python
import sys
import numpy as np

def load(file_name):
    with open(file_name, 'rb') as f:
        f.read(16)  # skip header for float type.
        return np.fromfile(f, dtype=np.float32)

if __name__ == '__main__':
    weight = load(sys.argv[1])
```
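As a minimal sketch of how a convolution weight read this way relates to the shapes described in the Parameter Info section, the example below reshapes a flat array back to `(Co, ky, kx, Ci)`; the concrete dimensions (Co=64, ky=kx=7, Ci=3) and the stand-in array are assumptions for illustration.

```python
import numpy as np

# Minimal sketch: a convolution weight is stored as a flat float32 array
# (see load() above) in the 2-D layout (Co*ky*kx, Ci), row-major.
Co, ky, kx, Ci = 64, 7, 7, 3                            # assumed dimensions
flat = np.arange(Co * ky * kx * Ci, dtype=np.float32)  # stands in for load(...)
w2d = flat.reshape(Co * ky * kx, Ci)   # the stored 2-D matrix
w4d = w2d.reshape(Co, ky, kx, Ci)      # back to the 4-D filter shape
print(w2d.shape, w4d.shape)
```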
or simply use the following shell command:
```
od -j 16 -f _res2_1_branch1_bn.w0
```
## Feature Extraction
We provide both C++ and Python interfaces to extract features. The following examples use the data in `demo/model_zoo/resnet/example` to show the extraction process in detail.
### C++ Interface
First, specify image data list in `define_py_data_sources2` in the config, see example `demo/model_zoo/resnet/resnet.py`.
```
train_list = 'train.list' if not is_test else None
# mean.meta is the mean file of the ImageNet dataset.
# mean.meta size : 3 x 224 x 224.
# If you use three mean values, set like:
# "mean_value:103.939,116.779,123.68;"
args = {
    'mean_meta': "model/mean_meta_224/mean.meta",
    'image_size': 224,
    'crop_size': 224,
    'color': True,
    'swap_channel:': [2, 1, 0]}
define_py_data_sources2(train_list,
                        'example/test.list',
                        module="example.image_list_provider",
                        obj="processData",
                        args=args)
```
Second, specify layers to extract features in `Outputs()` of `resnet.py`. For example,
```
Outputs("res5_3_branch2c_conv", "res5_3_branch2c_bn")
```
Third, specify model path and output directory in `extract_fea_c++.sh`, and then run the following commands.
```
cd demo/model_zoo/resnet
./extract_fea_c++.sh
```
If successful, features are saved in `fea_output/rank-00000` as follows. You can use the `load_feature_c` interface in `load_feature.py` to load such a file.
```
-0.115318 -0.108358 ... -0.087884;-1.27664 ... -1.11516 -2.59123;
-0.126383 -0.116248 ... -0.00534909;-1.42593 ... -1.04501 -1.40769;
```
* Each line stores the features of one sample: the first line stores the features of `example/dog.jpg` and the second line those of `example/cat.jpg`.
* Features of different layers are separated by `;`, and their order is consistent with the layer order in `Outputs()`. Here, the left features come from the `res5_3_branch2c_conv` layer and the right features from the `res5_3_branch2c_bn` layer. A small parsing sketch follows.
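The sketch below parses such a file following the format just described; it is a minimal illustration, not the `load_feature_c` implementation itself.

```python
import numpy as np

# Minimal sketch: parse fea_output/rank-00000 as described above.
# Each line is one sample; layers are separated by ';' and values by spaces.
def load_features(path):
    samples = []
    with open(path, 'r') as f:
        for line in f:
            line = line.strip().rstrip(';')
            if not line:
                continue
            layers = [np.array(part.split(), dtype=np.float32)
                      for part in line.split(';')]
            samples.append(layers)
    return samples

feas = load_features('fea_output/rank-00000')
```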
### Python Interface
`demo/model_zoo/resnet/classify.py` is an example showing how to extract features with Python. The following example still uses the data in `./example/test.list`. The command is as follows:
```
cd demo/model_zoo/resnet
./extract_fea_py.sh
```
extract_fea_py.sh:
```
python classify.py \
    --job=extract \
    --conf=resnet.py \
    --use_gpu=1 \
    --mean=model/mean_meta_224/mean.meta \
    --model=model/resnet_50 \
    --data=./example/test.list \
    --output_layer="res5_3_branch2c_conv,res5_3_branch2c_bn" \
    --output_dir=features
```
* \--job=extract: specify the job mode as feature extraction.
* \--conf=resnet.py: network config file.
* \--use_gpu=1: specify GPU mode.
* \--model=model/resnet_50: model path.
* \--data=./example/test.list: data list.
* \--output_layer="xxx,xxx": specify the layers to extract features from.
* \--output_dir=features: output directory.
If it runs successfully, the features are saved in `features/batch_0`; this file is produced with cPickle. You can use the `load_feature_py` interface in `load_feature.py` to open the file, and it returns a dictionary as follows:
```
{
'cat.jpg': {'res5_3_branch2c_conv': array([[-0.12638293, -0.116248 , -0.11883899, ..., -0.00895038, 0.01994277, -0.00534909]], dtype=float32), 'res5_3_branch2c_bn': array([[-1.42593431, -1.28918779, -1.32414699, ..., -1.45933616, -1.04501402, -1.40769434]], dtype=float32)},
'dog.jpg': {'res5_3_branch2c_conv': array([[-0.11531784, -0.10835785, -0.08809858, ...,0.0055237, 0.01505112, -0.08788397]], dtype=float32), 'res5_3_branch2c_bn': array([[-1.27663755, -1.18272924, -0.90937918, ..., -1.25178063, -1.11515927, -2.59122872]], dtype=float32)}
}
```
If you look carefully, these feature values are consistent with the results extracted by the C++ interface above.
## Prediction
`classify.py` can also be used for prediction. We provide an example script `predict.sh` that predicts the data in `example/test.list` using a 50-layer ResNet model.
```
cd demo/model_zoo/resnet
./predict.sh
```
predict.sh calls `classify.py`:
```
python classify.py \
    --job=predict \
    --conf=resnet.py \
    --multi_crop \
    --model=model/resnet_50 \
    --use_gpu=1 \
    --data=./example/test.list
```
* \--job=predict: specify the job mode as prediction.
* \--conf=resnet.py: network config file.
* \--multi_crop: use 10 crops and average the predicted probabilities.
* \--use_gpu=1: specify GPU mode.
* \--model=model/resnet_50: model path.
* \--data=./example/test.list: data list.
If it runs successfully, you will see the following results, where 156 and 282 are the labels of the images.
```
Label of example/dog.jpg is: 156
Label of example/cat.jpg is: 282
```
# Quick Start
This tutorial will teach the basics of deep learning (DL), including how to implement many different models in PaddlePaddle. You will learn how to:
- Prepare data into the standardized format that PaddlePaddle accepts.
- Write data providers that read data into PaddlePaddle.
- Configure neural networks in PaddlePaddle layer by layer.
- Train models.
- Perform inference with trained models.
## Install
To get started, please install PaddlePaddle on your computer. Throughout this tutorial, you will learn by implementing different DL models for text classification.
To install PaddlePaddle, please follow the instructions here: <a href = "../../getstarted/build_and_install/index_en.html" >Build and Install</a>.
## Overview
For the first step, you will use PaddlePaddle to build a **text classification** system. For example, suppose you run an e-commerce website and you want to analyze the sentiment of user reviews to evaluate product quality.
For example, given the input
```
This monitor is fantastic.
```
Your classifier should output “positive”, since this text snippet shows that the user is satisfied with the product. Given this input:
```
The monitor breaks down two months after purchase.
```
the classifier should output “negative”.
To build your text classification system, your code will need to perform five steps:
<center> ![](./src/Pipeline_en.jpg) </center>
- Preprocess data into a standardized format.
- Provide data to the learning model.
- Specify the neural network structure.
- Train the model.
- Inference (make prediction on test examples).
1. Preprocess data into standardized format
- In the text classification example, you will start with a text file with one training example per line. Each line contains the category id (in machine learning, often denoted as the target y), followed by the input text (often denoted as x); the two elements are separated by a Tab. For example: ```positive [tab] This monitor is fantastic```. You will preprocess this raw data into a format that Paddle can use.
2. Provide data to the learning model.
- You can write data providers in Python. For any required data preprocessing step, you can add the preprocessing code to the PyDataProvider Python file.
- In our text classification example, every word or character will be converted into an integer id, as specified in a dictionary file. PyDataProvider performs the dictionary lookup to get the id.
3. Specify neural network structure. (From easy to hard, we provide 4 kinds of network configurations)
- A logistic regression model.
- A word embedding model.
- A convolutional neural network model.
- A sequential recurrent neural network model.
- You will also learn different learning algorithms.
4. Train the model.
5. Perform inference.
## Preprocess data into standardized format
In this example, you are going to use [Amazon electronic product review dataset](http://jmcauley.ucsd.edu/data/amazon/) to build a bunch of deep neural network models for text classification. Each text in this dataset is a product review. This dataset has two categories: “positive” and “negative”. Positive means the reviewer likes the product, while negative means the reviewer does not like the product.
`demo/quick_start` in the [source code](https://github.com/PaddlePaddle/Paddle) provides a script for downloading the preprocessed data, as shown below. (If you want to process the raw data yourself, you can use the script `demo/quick_start/data/proc_from_raw_data/get_data.sh`.)
```bash
cd demo/quick_start
./data/get_data.sh
```
## Transfer Data to Model
### Write Data Provider with Python
The following `dataprovider_bow.py` gives a complete example of writing a data provider in Python. It includes the following parts:
* initializer: defines the additional meta-data of the data provider and the types of the input data.
* process: each `yield` returns a data sample, in this case the text representation and the category id. The order of features in the returned result needs to be consistent with the definition of the input types in `initializer`.
```python
from paddle.trainer.PyDataProvider2 import *

# id of the word not in dictionary
UNK_IDX = 0

# initializer is called by the framework during initialization.
# It allows the user to describe the data types and setup the
# necessary data structure for later use.
# `settings` is an object. initializer needs to properly fill settings.input_types.
# initializer can also store other data structures needed to be used at process().
# In this example, the dictionary is stored in settings.
# `dictionary` and `kwargs` are arguments passed from trainer_config.lr.py
def initializer(settings, dictionary, **kwargs):
    # Put the word dictionary into settings
    settings.word_dict = dictionary

    # settings.input_types specifies what data types the data provider
    # generates.
    settings.input_types = [
        # The first input is a sparse_binary_vector,
        # which means each dimension of the vector is either 0 or 1. It is the
        # bag-of-words (BOW) representation of the texts.
        sparse_binary_vector(len(dictionary)),
        # The second input is an integer. It represents the category id of the
        # sample. 2 means there are two labels in the dataset.
        # (1 for positive and 0 for negative)
        integer_value(2)]

# Declaring a data provider. Its init_hook is the 'initializer' above.
# It will cache the generated data of the first pass in memory, so that
# during later passes, no on-the-fly data generation will be needed.
# `settings` is the same object used by initializer()
# `file_name` is the name of a file listed in the train_list or test_list file
# given to define_py_data_sources2(). See trainer_config.lr.py.
@provider(init_hook=initializer, cache=CacheType.CACHE_PASS_IN_MEM)
def process(settings, file_name):
    # Open the input data file.
    with open(file_name, 'r') as f:
        # Read each line.
        for line in f:
            # Each line contains the label and text of the comment, separated by \t.
            label, comment = line.strip().split('\t')
            # Split the words into a list.
            words = comment.split()
            # Convert the words into a list of ids by looking them up in word_dict.
            word_vector = [settings.word_dict.get(w, UNK_IDX) for w in words]
            # Return the features for the current comment. The first is a list
            # of ids representing a 0-1 binary sparse vector of the text,
            # the second is the integer id of the label.
            yield word_vector, int(label)
```
### Define Python Data Provider in Configuration files.
You need to add a data provider definition `define_py_data_sources2` to the network configuration. This definition specifies:
- The path of the training and testing data (`data/train.list`, `data/test.list`).
- The location of the data provider file (`dataprovider_bow`).
- The function to call to get data (`process`).
- Additional arguments or data. Here it passes the path of the word dictionary.
```python
from paddle.trainer_config_helpers import *

dict_file = "data/dict.txt"
word_dict = dict()
with open(dict_file, 'r') as f:
    for i, line in enumerate(f):
        w = line.strip().split()[0]
        word_dict[w] = i

# define the data sources for the model.
# We need to use different process for training and prediction.
# For training, the input data includes both word IDs and labels.
# For prediction, the input data only includes word IDs.
define_py_data_sources2(train_list='data/train.list',
                        test_list='data/test.list',
                        module="dataprovider_bow",
                        obj="process",
                        args={"dictionary": word_dict})
```
You can refer to the following link for more detailed examples and data formats: <a href = "../../api/v1/data_provider/pydataprovider2_en.html">PyDataProvider2</a>.
## Network Architecture
We will describe four kinds of network architectures in this section.
<center> ![](./src/PipelineNetwork_en.jpg) </center>
First, you will build a logistic regression model. Later, you will also get the chance to build other, more powerful network architectures.
For more detailed documentation, you could refer to: <a href = "../../api/v1/trainer_config_helpers/layers.html">layer documentation</a>. All configuration files are in `demo/quick_start` directory.
### Logistic Regression
The architecture is illustrated in the following picture:
<center> ![](./src/NetLR_en.png) </center>
- You need to define the data for text features. The size of the data layer is the number of words in the dictionary.
```python
word = data_layer(name="word", size=voc_dim)
```
- You also need to define the category id for each example. The size of the data layer is the number of labels.
```python
label = data_layer(name="label", size=label_dim)
```
- The network uses a logistic regression model to classify the vector, and it outputs the classification error during training.
- Each layer has an *input* argument that specifies its input layer. Some layers can have multiple input layers. You can use a list of the input layers as input in that case.
- *size* for each layer means the number of neurons of the layer.
- *act_type* means activation function applied to the output of each neuron independently.
- Some layers can have additional special inputs. For example, `classification_cost` needs ground truth label as input to compute classification loss and error.
```python
# Define a fully connected layer with logistic activation (also called softmax activation).
output = fc_layer(input=word,
                  size=label_dim,
                  act_type=SoftmaxActivation())
# Define cross-entropy classification loss and error.
classification_cost(input=output, label=label)
```
Performance summary: You can refer to the training and testing scripts later. In order to compare different network architectures, the model complexity and test classification error are listed in the following table:
<html>
<center>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border">
<thead>
<tr>
<th scope="col" class="left">Network name</th>
<th scope="col" class="left">Number of parameters</th>
<th scope="col" class="left">Test error</th>
</tr>
</thead>
<tbody>
<tr>
<td class="left">Logistic regression</td>
<td class="left">252 KB</td>
<td class="left">8.652%</td>
</tr>
</tbody>
</table></center>
</html>
<br>
### Word Embedding Model
In order to use the word embedding model, you need to change the data provider a little bit to make the input words a sequence of word IDs. The revised data provider `dataprovider_emb.py` is listed below. You only need to change initializer() for the type of the first input: it is changed from sparse_binary_vector to a sequence of integers. process() remains the same. This data provider can also be used for the later sequence models.
```python
def initializer(settings, dictionary, **kwargs):
    # Put the word dictionary into settings
    settings.word_dict = dictionary
    settings.input_types = [
        # Define the type of the first input as a sequence of integers.
        integer_value_sequence(len(dictionary)),
        # Define the second input for label id
        integer_value(2)]

@provider(init_hook=initializer)
def process(settings, file_name):
    ...
    # omitted, it is same as the data provider for LR model
```
This model is very similar to the logistic regression framework, but it uses word embedding vectors instead of sparse vectors to represent words.
<center> ![](./src/NetContinuous_en.png) </center>
- The embedding layer looks up the dense word embedding vector for each word in the dictionary (the embedding dimension is `word_dim`). The input is a sequence of N words, and the output is a sequence of N `word_dim`-dimensional vectors.
```python
emb = embedding_layer(input=word, dim=word_dim)
```
- It averages all the word embeddings in a sentence to get the sentence representation.
```python
avg = pooling_layer(input=emb, pooling_type=AvgPooling())
```
The other parts of the model are the same as logistic regression network.
The performance is summarized in the following table:
<html>
<center>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border">
<thead>
<tr>
<th scope="col" class="left">Network name</th>
<th scope="col" class="left">Number of parameters</th>
<th scope="col" class="left">Test error</th>
</tr>
</thead>
<tbody>
<tr>
<td class="left">Word embedding model</td>
<td class="left">15 MB</td>
<td class="left">8.484%</td>
</tr>
</tbody>
</table>
</center>
</html>
<br>
### Convolutional Neural Network Model
A convolutional neural network converts a sequence of word embeddings into a sentence representation using temporal convolutions. You will transform the fully connected layer of the word embedding model into 3 new sub-steps.
<center> ![](./src/NetConv_en.png) </center>
Text convolution has 3 steps:
1. Get K nearest neighbor context of each word in a sentence, stack them into a 2D vector representation.
2. Apply temporal convolution to this representation to produce a new hidden_dim dimensional vector.
3. Apply max-pooling to the new vectors at all the time steps in a sentence to get a sentence representation.
```python
# context_len means convolution kernel size.
# context_start means the start of the convolution. It can be negative. In that case, zero padding is applied.
text_conv = sequence_conv_pool(input=emb,
                               context_start=k,
                               context_len=2 * k + 1)
```
The performance is summarized in the following table:
<html>
<center>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border">
<thead>
<tr>
<th scope="col" class="left">Network name</th>
<th scope="col" class="left">Number of parameters</th>
<th scope="col" class="left">Test error</th>
</tr>
</thead>
<tbody>
<tr>
<td class="left">Convolutional model</td>
<td class="left">16 MB</td>
<td class="left">5.628%</td>
</tr>
</tbody>
</table></center>
<br>
### Recurrent Model
<center> ![](./src/NetRNN_en.png) </center>
You can use a recurrent neural network as the time sequence model, including a simple RNN model, a GRU model, and an LSTM model.
- GRU model can be specified via:
```python
gru = simple_gru(input=emb, size=gru_size)
```
- LSTM model can be specified via:
```python
lstm = simple_lstm(input=emb, size=lstm_size)
```
You can use a single-layer LSTM model with dropout for our text classification problem. The performance is summarized in the following table:
<html>
<center>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border">
<thead>
<tr>
<th scope="col" class="left">Network name</th>
<th scope="col" class="left">Number of parameters</th>
<th scope="col" class="left">Test error</th>
</tr>
</thead>
<tbody>
<tr>
<td class="left">Recurrent model</td>
<td class="left">16 MB</td>
<td class="left">4.812%</td>
</tr>
</tbody>
</table></center>
</html>
<br>
## Optimization Algorithm
<a href = "../../api/v1/trainer_config_helpers/optimizers.html">Optimization algorithms</a> include Momentum, RMSProp, AdaDelta, AdaGrad, Adam, and Adamax. You can use Adam optimization method here, with L2 regularization and gradient clipping, because Adam has been proved to work very well for training recurrent neural network.
```python
settings(batch_size=128,
         learning_rate=2e-3,
         learning_method=AdamOptimizer(),
         regularization=L2Regularization(8e-4),
         gradient_clipping_threshold=25)
```
## Training Model
After completing data preparation and network architecture specification, you will run the training script.
<center> ![](./src/PipelineTrain_en.png) </center>
Training script: our training script is `train.sh`. The training arguments are listed below:
```bash
paddle train \
--config=trainer_config.py \
--log_period=20 \
--save_dir=./output \
--num_passes=15 \
--use_gpu=false
```
We do not provide examples on how to train on clusters here. If you want to train on clusters, please follow the <a href = "../../howto/usage/cluster/cluster_train_en.html">distributed training</a> documentation or other demos for more details.
## Inference
You can use the trained model to perform prediction on the dataset with no labels. You can also evaluate the model on dataset with labels to obtain its test accuracy.
<center> ![](./src/PipelineTest_en.png) </center>
The test script is listed below. PaddlePaddle can evaluate a model on the data with labels specified in `test.list`.
```bash
paddle train \
--config=trainer_config.lstm.py \
--use_gpu=false \
--job=test \
--init_model_path=./output/pass-0000x
```
We will give an example of performing prediction using the Recurrent model on a dataset with no labels. You can refer to the <a href = "../../api/v1/predict/swig_py_paddle_en.html">Python Prediction API</a> tutorial, or other <a href = "../../tutorials/index_en.html">demos</a>, for the prediction process using Python. You can also use the following script for inference or evaluation.
inference script (predict.sh):
```bash
model="output/pass-00003"
paddle train \
--config=trainer_config.lstm.py \
--use_gpu=false \
--job=test \
--init_model_path=$model \
--config_args=is_predict=1 \
--predict_output_dir=. \
mv rank-00000 result.txt
```
You can choose the best model based on the training log instead of `output/pass-00003`. There are several differences between the training and inference network configurations.
- You do not need labels during inference.
- Outputs need to be specified as the classification probability layer (the output of the softmax layer) or the id of the maximum probability (`max_id` layer). An example that outputs both the id and the probability is given in the code snippet below.
- batch_size = 1.
- You need to specify the location of `test_list` in the test data.
The results in `result.txt` are as follows; each line is one sample. A small parsing sketch follows the example.
```
predicted_label_id;probability_of_label_0 probability_of_label_1 # the first sample
predicted_label_id;probability_of_label_0 probability_of_label_1 # the second sample
```
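A minimal sketch for parsing `result.txt` in the format shown above (the file name follows the `mv` command in predict.sh):

```python
# Minimal sketch: parse result.txt, where each line is
# "predicted_label_id;probability_of_label_0 probability_of_label_1".
with open('result.txt', 'r') as f:
    for line in f:
        label_part, prob_part = line.strip().split(';')
        label = int(label_part)
        probs = [float(p) for p in prob_part.split()]
        print(label, probs)
```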
```python
is_predict = get_config_arg('is_predict', bool, False)
trn = 'data/train.list' if not is_predict else None
tst = 'data/test.list' if not is_predict else 'data/pred.list'
obj = 'process' if not is_predict else 'process_pre'
batch_size = 128 if not is_predict else 1
if is_predict:
    maxid = maxid_layer(output)
    outputs([maxid, output])
else:
    label = data_layer(name="label", size=2)
    cls = classification_cost(input=output, label=label)
    outputs(cls)
```
## Summary
The data download scripts, network configurations, and training scripts are in `demo/quick_start`. The following table summarizes the performance of our network architectures on the Amazon-Elec dataset (25k):
<center>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border">
<thead>
<tr>
<th scope="col" class="left">Network name</th>
<th scope="col" class="left">Number of parameters</th>
<th scope="col" class="left">Error rate</th>
<th scope="col" class="left">Configuration file name</th>
</tr>
</thead>
<tbody>
<tr>
<td class="left">Logistic regression model(BOW)</td>
<td class="left"> 252KB </td>
<td class="left">8.652%</td>
<td class="left">trainer_config.lr.py</td>
</tr>
<tr>
<td class="left">Word embedding</td>
<td class="left"> 15MB </td>
<td class="left"> 8.484%</td>
<td class="left">trainer_config.emb.py</td>
</tr>
<tr>
<td class="left">Convolution model</td>
<td class="left"> 16MB </td>
<td class="left"> 5.628%</td>
<td class="left">trainer_config.cnn.py</td>
</tr>
<tr>
<td class="left">Time sequence model</td>
<td class="left"> 16MB </td>
<td class="left"> 4.812%</td>
<td class="left">trainer_config.lstm.py</td>
</tr>
</tbody>
</table>
</center>
<br>
## Appendix
### Command Line Argument
* \--config: network architecture path.
* \--save_dir: model save directory.
* \--log_period: the logging period per batch.
* \--num_passes: number of training passes. One pass means the training goes over the whole training dataset once.
* \--config_args: other configuration arguments.
* \--init_model_path: the path of the initial model parameters.
By default, the trainer saves a model every pass. You can also specify `saving_period_by_batches` to set the frequency of saving by batches. You can use `show_parameter_stats_period` to print the statistics of the parameters, which is very useful for tuning parameters. Other command line arguments can be found in the <a href = "../../howto/usage/cmd_parameter/index_en.html">command line argument documentation</a>.
### Log
```
TrainerInternal.cpp:160] Batch=20 samples=2560 AvgCost=0.628761 CurrentCost=0.628761 Eval: classification_error_evaluator=0.304297 CurrentEval: classification_error_evaluator=0.304297
```
During model training, you will see log lines like the example above. The fields are explained in the following table, and a small parsing sketch follows it:
<center>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border">
<thead>
<tr>
<th scope="col" class="left">Name</th>
<th scope="col" class="left">Explanation</th>
</tr>
</thead>
<tbody>
<tr>
<td class="left">Batch=20</td>
<td class="left"> You have trained 20 batches. </td>
</tr>
<tr>
<td class="left">samples=2560</td>
<td class="left"> You have trained 2560 examples. </td>
</tr>
<tr>
<td class="left">AvgCost</td>
<td class="left"> The average cost from the first batch to the current batch. </td>
</tr>
<tr>
<td class="left">CurrentCost</td>
<td class="left"> the average cost of the last log_period batches </td>
</tr>
<tr>
<td class="left">Eval: classification_error_evaluator</td>
<td class="left"> The average classification error from the first batch to the current batch.</td>
</tr>
<tr>
<td class="left">CurrentEval: classification_error_evaluator</td>
<td class="left"> The average error rate of the last log_period batches </td>
</tr>
</tbody>
</table>
</center>
<br>
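A minimal sketch for pulling these fields out of such a log line; the regular expression is an assumption about the `key=value` layout shown above, not part of PaddlePaddle (note that the duplicated `classification_error_evaluator` key keeps only the last occurrence):

```python
import re

# Minimal sketch: parse the key=value pairs from a training log line.
line = ("TrainerInternal.cpp:160] Batch=20 samples=2560 AvgCost=0.628761 "
        "CurrentCost=0.628761 Eval: classification_error_evaluator=0.304297 "
        "CurrentEval: classification_error_evaluator=0.304297")
fields = dict(re.findall(r'(\w+)=([0-9.]+)', line))
print(fields['Batch'], fields['AvgCost'], fields['classification_error_evaluator'])
```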
<li class="toctree-l2"><a class="reference internal" href="../../howto/dev/write_docs_en.html">Contribute Documentation</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/deep_model/rnn/index_en.html">RNN Models</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../howto/deep_model/rnn/rnn_config_en.html">RNN Configuration</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/optimization/gpu_profiling_en.html">Tune GPU Performance</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../api/index_en.html">API</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/model_configs.html">Model Configuration</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/activation.html">Activation</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/layer.html">Layers</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/evaluators.html">Evaluators</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/optimizer.html">Optimizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/pooling.html">Pooling</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/networks.html">Networks</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/attr.html">Parameter Attribute</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/data.html">Data Reader Interface and DataSets</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/data/data_reader.html">Data Reader Interface</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/data/image.html">Image Interface</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/data/dataset.html">Dataset</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/run_logic.html">Training and Inference</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/fluid.html">Fluid</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/layers.html">Layers</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/data_feeder.html">DataFeeder</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/executor.html">Executor</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/initializer.html">Initializer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/evaluator.html">Evaluator</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/nets.html">Nets</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/optimizer.html">Optimizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/param_attr.html">ParamAttr</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/profiler.html">Profiler</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/regularizer.html">Regularizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/io.html">IO</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../mobile/index_en.html">MOBILE</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../mobile/cross_compiling_for_android_en.html">Build PaddlePaddle for Android</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../mobile/cross_compiling_for_ios_en.html">Build PaddlePaddle for iOS</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../mobile/cross_compiling_for_raspberry_en.html">Build PaddlePaddle for Raspberry Pi</a></li>
</ul>
</li>
</ul>
</nav>
<section class="doc-content-wrap">
<div role="navigation" aria-label="breadcrumbs navigation">
<ul class="wy-breadcrumbs">
<li>Chinese Word Embedding Model Tutorial</li>
</ul>
</div>
<div class="wy-nav-content" id="doc-content">
<div class="rst-content">
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<div class="section" id="chinese-word-embedding-model-tutorial">
<span id="chinese-word-embedding-model-tutorial"></span><h1>Chinese Word Embedding Model Tutorial<a class="headerlink" href="#chinese-word-embedding-model-tutorial" title="Permalink to this headline"></a></h1>
<hr class="docutils" />
<p>This tutorial guides you through the process of using a pretrained Chinese word embedding model in the PaddlePaddle standard format.</p>
<p>We thank &#64;lipeng for the pull request that defined the model schemas and pretrained the models.</p>
<div class="section" id="introduction">
<span id="introduction"></span><h2>Introduction<a class="headerlink" href="#introduction" title="Permalink to this headline"></a></h2>
<div class="section" id="chinese-word-dictionary">
<span id="chinese-word-dictionary"></span><h3>Chinese Word Dictionary<a class="headerlink" href="#chinese-word-dictionary" title="Permalink to this headline"></a></h3>
<p>Our Chinese-word dictionary is built from Baidu ZhiDao and Baidu Baike using an in-house word segmenter. For example, the segmentation of &#8220;《红楼梦》&#8221; is &#8220;《&#8221;, &#8220;红楼梦&#8221;, &#8220;》&#8221;, and &#8220;《红楼梦》&#8221;. Our dictionary (in UTF-8 format) has two columns: the word and its frequency. The total word count is 3206326, including 4 special tokens:</p>
<ul class="simple">
<li><code class="docutils literal"><span class="pre">&lt;s&gt;</span></code>: the start of a sequence</li>
<li><code class="docutils literal"><span class="pre">&lt;e&gt;</span></code>: the end of a sequence</li>
<li><code class="docutils literal"><span class="pre">PALCEHOLDER_JUST_IGNORE_THE_EMBEDDING</span></code>: a placeholder, just ignore it and its embedding</li>
<li><code class="docutils literal"><span class="pre">&lt;unk&gt;</span></code>: a word not included in dictionary</li>
</ul>
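<p>For illustration only, a few lines of such a two-column dictionary file might look like the sketch below. The whitespace separator and the frequency values here are assumptions made for illustration, not values taken from the released file:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span>&lt;s&gt;      9999999
&lt;e&gt;      9999999
红楼梦   20453
&lt;unk&gt;    1
</pre></div>
</div>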
</div>
<div class="section" id="pretrained-chinese-word-embedding-model">
<span id="pretrained-chinese-word-embedding-model"></span><h3>Pretrained Chinese Word Embedding Model<a class="headerlink" href="#pretrained-chinese-word-embedding-model" title="Permalink to this headline"></a></h3>
<p>Inspired by the paper <a class="reference external" href="http://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf">A Neural Probabilistic Language Model</a>, our model architecture (<strong>Embedding joint of six words-&gt;FullyConnect-&gt;SoftMax</strong>) is shown in the graph below. For our dictionary, we pretrain four models with different word vector dimensions, i.e. 32, 64, 128, and 256.
<center><img alt="" src="../../_images/neural-n-gram-model.png" /></center>
<center>Figure 1. neural-n-gram-model</center></p>
</div>
<div class="section" id="download-and-extract">
<span id="download-and-extract"></span><h3>Download and Extract<a class="headerlink" href="#download-and-extract" title="Permalink to this headline"></a></h3>
<p>To download and extract our dictionary and pretrained model, run the following commands.</p>
<div class="highlight-default"><div class="highlight"><pre><span></span>cd $PADDLE_ROOT/demo/model_zoo/embedding
./pre_DictAndModel.sh
</pre></div>
</div>
</div>
</div>
<div class="section" id="chinese-paraphrasing-example">
<span id="chinese-paraphrasing-example"></span><h2>Chinese Paraphrasing Example<a class="headerlink" href="#chinese-paraphrasing-example" title="Permalink to this headline"></a></h2>
<p>We provide a paraphrasing task to show the usage of pretrained Chinese Word Dictionary and Embedding Model.</p>
<div class="section" id="data-preparation-and-preprocess">
<span id="data-preparation-and-preprocess"></span><h3>Data Preparation and Preprocess<a class="headerlink" href="#data-preparation-and-preprocess" title="Permalink to this headline"></a></h3>
<p>First, run the following commands to download and extract the in-house dataset. The dataset (in UTF-8 format) has 20 training samples, 5 testing samples, and 2 generation samples.</p>
<div class="highlight-default"><div class="highlight"><pre><span></span>cd $PADDLE_ROOT/demo/seqToseq/data
./paraphrase_data.sh
</pre></div>
</div>
<p>Second, preprocess the data and build the dictionary on the training data by running the following commands; the preprocessed dataset is stored in <code class="docutils literal"><span class="pre">$PADDLE_SOURCE_ROOT/demo/seqToseq/data/pre-paraphrase</span></code>:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span>cd $PADDLE_ROOT/demo/seqToseq/
python preprocess.py -i data/paraphrase [--mergeDict]
</pre></div>
</div>
<ul class="simple">
<li><code class="docutils literal"><span class="pre">--mergeDict</span></code>: if using this option, the source and target dictionary are merged, i.e, two dictionaries have the same context. Here, as source and target data are all chinese words, this option can be used.</li>
</ul>
</div>
<div class="section" id="user-specified-embedding-model">
<span id="user-specified-embedding-model"></span><h3>User Specified Embedding Model<a class="headerlink" href="#user-specified-embedding-model" title="Permalink to this headline"></a></h3>
<p>The general command for extracting the desired parameters from the pretrained embedding model based on a user dictionary is:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span>cd $PADDLE_ROOT/demo/model_zoo/embedding
python extract_para.py --preModel PREMODEL --preDict PREDICT --usrModel USRMODEL --usrDict USRDICT -d DIM
</pre></div>
</div>
<ul class="simple">
<li><code class="docutils literal"><span class="pre">--preModel</span> <span class="pre">PREMODEL</span></code>: the name of pretrained embedding model</li>
<li><code class="docutils literal"><span class="pre">--preDict</span> <span class="pre">PREDICT</span></code>: the name of pretrained dictionary</li>
<li><code class="docutils literal"><span class="pre">--usrModel</span> <span class="pre">USRMODEL</span></code>: the name of extracted embedding model</li>
<li><code class="docutils literal"><span class="pre">--usrDict</span> <span class="pre">USRDICT</span></code>: the name of user specified dictionary</li>
<li><code class="docutils literal"><span class="pre">-d</span> <span class="pre">DIM</span></code>: dimension of parameter</li>
</ul>
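<p>For illustration, a concrete invocation might look like the sketch below. The file names here are hypothetical placeholders, not files shipped with the demo; note that <code class="docutils literal"><span class="pre">-d</span></code> must match the dimension of the chosen pretrained model (32, 64, 128, or 256):</p>
<div class="highlight-default"><div class="highlight"><pre><span></span>cd $PADDLE_ROOT/demo/model_zoo/embedding
# All file names below are hypothetical placeholders for illustration.
python extract_para.py --preModel pretrained_32.emb --preDict pretrained.dict \
    --usrModel usr_model_32.emb --usrDict usr.dict -d 32
</pre></div>
</div>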
<p>Here, you can simply run the command:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span>cd $PADDLE_ROOT/demo/seqToseq/data/
./paraphrase_model.sh
</pre></div>
</div>
<p>And you will see the following embedding model structure:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">paraphrase_model</span>
<span class="o">|---</span> <span class="n">_source_language_embedding</span>
<span class="o">|---</span> <span class="n">_target_language_embedding</span>
</pre></div>
</div>
</div>
<div class="section" id="training-model-in-paddlepaddle">
<span id="training-model-in-paddlepaddle"></span><h3>Training Model in PaddlePaddle<a class="headerlink" href="#training-model-in-paddlepaddle" title="Permalink to this headline"></a></h3>
<p>First, create a model config file; see the example <code class="docutils literal"><span class="pre">demo/seqToseq/paraphrase/train.conf</span></code>:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">seqToseq_net</span> <span class="k">import</span> <span class="o">*</span>
<span class="n">is_generating</span> <span class="o">=</span> <span class="kc">False</span>
<span class="c1">################## Data Definition #####################</span>
<span class="n">train_conf</span> <span class="o">=</span> <span class="n">seq_to_seq_data</span><span class="p">(</span><span class="n">data_dir</span> <span class="o">=</span> <span class="s2">&quot;./data/pre-paraphrase&quot;</span><span class="p">,</span>
<span class="n">job_mode</span> <span class="o">=</span> <span class="n">job_mode</span><span class="p">)</span>
<span class="c1">############## Algorithm Configuration ##################</span>
<span class="n">settings</span><span class="p">(</span>
<span class="n">learning_method</span> <span class="o">=</span> <span class="n">AdamOptimizer</span><span class="p">(),</span>
<span class="n">batch_size</span> <span class="o">=</span> <span class="mi">50</span><span class="p">,</span>
<span class="n">learning_rate</span> <span class="o">=</span> <span class="mf">5e-4</span><span class="p">)</span>
<span class="c1">################# Network configure #####################</span>
<span class="n">gru_encoder_decoder</span><span class="p">(</span><span class="n">train_conf</span><span class="p">,</span> <span class="n">is_generating</span><span class="p">,</span> <span class="n">word_vector_dim</span> <span class="o">=</span> <span class="mi">32</span><span class="p">)</span>
</pre></div>
</div>
<p>This config is almost the same as <code class="docutils literal"><span class="pre">demo/seqToseq/translation/train.conf</span></code>.</p>
<p>Then, train the model by running the command:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span>cd $PADDLE_SOURCE_ROOT/demo/seqToseq/paraphrase
./train.sh
</pre></div>
</div>
<p>where <code class="docutils literal"><span class="pre">train.sh</span></code> is almost the same as <code class="docutils literal"><span class="pre">demo/seqToseq/translation/train.sh</span></code>, the only difference is following two command arguments:</p>
<ul class="simple">
<li><code class="docutils literal"><span class="pre">--init_model_path</span></code>: path of the initialization model, here is <code class="docutils literal"><span class="pre">data/paraphrase_model</span></code></li>
<li><code class="docutils literal"><span class="pre">--load_missing_parameter_strategy</span></code>: operations when model file is missing, here use a normal distibution to initialize the other parameters except for the embedding layer</li>
</ul>
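<p>For illustration only, the two arguments might be passed to the trainer as in the sketch below. This is a hedged sketch rather than the actual <code class="docutils literal"><span class="pre">demo/seqToseq/paraphrase/train.sh</span></code>; the remaining flags of the real script are omitted, and the strategy value <code class="docutils literal"><span class="pre">rand</span></code> as well as the exact paths are assumptions:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span># Hedged sketch, not the actual train.sh; paths are illustrative.
paddle train \
    --config='paraphrase/train.conf' \
    --init_model_path='data/paraphrase_model' \
    --load_missing_parameter_strategy=rand
</pre></div>
</div>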
<p>For users who want to understand the dataset format, model architecture and training procedure in detail, please refer to <a class="reference external" href="v1_api_tutorials/text_generation/index_en.md">Text generation Tutorial</a>.</p>
</div>
</div>
<div class="section" id="optional-function">
<span id="optional-function"></span><h2>Optional Function<a class="headerlink" href="#optional-function" title="Permalink to this headline"></a></h2>
<div class="section" id="embedding-parameters-observation">
<span id="embedding-parameters-observation"></span><h3>Embedding Parameters Observation<a class="headerlink" href="#embedding-parameters-observation" title="Permalink to this headline"></a></h3>
<p>For users who want to observe the embedding parameters, this function can convert a PaddlePaddle binary embedding model to a text model by running the command:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span>cd $PADDLE_ROOT/demo/model_zoo/embedding
python paraconvert.py --b2t -i INPUT -o OUTPUT -d DIM
</pre></div>
</div>
<ul class="simple">
<li><code class="docutils literal"><span class="pre">-i</span> <span class="pre">INPUT</span></code>: the name of input binary embedding model</li>
<li><code class="docutils literal"><span class="pre">-o</span> <span class="pre">OUTPUT</span></code>: the name of output text embedding model</li>
<li><code class="docutils literal"><span class="pre">-d</span> <span class="pre">DIM</span></code>: the dimension of parameter</li>
</ul>
<p>You will see parameters like this in the output text model (a small parsing sketch follows the list below):</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="mi">0</span><span class="p">,</span><span class="mi">4</span><span class="p">,</span><span class="mi">32156096</span>
<span class="o">-</span><span class="mf">0.7845433</span><span class="p">,</span><span class="mf">1.1937413</span><span class="p">,</span><span class="o">-</span><span class="mf">0.1704215</span><span class="p">,</span><span class="mf">0.4154715</span><span class="p">,</span><span class="mf">0.9566584</span><span class="p">,</span><span class="o">-</span><span class="mf">0.5558153</span><span class="p">,</span><span class="o">-</span><span class="mf">0.2503305</span><span class="p">,</span> <span class="o">......</span>
<span class="mf">0.0000909</span><span class="p">,</span><span class="mf">0.0009465</span><span class="p">,</span><span class="o">-</span><span class="mf">0.0008813</span><span class="p">,</span><span class="o">-</span><span class="mf">0.0008428</span><span class="p">,</span><span class="mf">0.0007879</span><span class="p">,</span><span class="mf">0.0000183</span><span class="p">,</span><span class="mf">0.0001984</span><span class="p">,</span> <span class="o">......</span>
<span class="o">......</span>
</pre></div>
</div>
<ul class="simple">
<li>The 1st line is the <strong>PaddlePaddle format file header</strong>; it has 3 attributes:<ul>
<li>the version of PaddlePaddle, here 0</li>
<li>sizeof(float), here 4</li>
<li>the total number of parameters, here 32156096</li>
</ul>
</li>
<li>The other lines print the parameters (assume <code class="docutils literal"><span class="pre">&lt;dim&gt;</span></code> = 32):<ul>
<li>each line prints 32 parameters separated by &#8216;,&#8217;</li>
<li>there are 32156096/32 = 1004878 lines, i.e. 1004878 embedding words</li>
</ul>
</li>
</ul>
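<p>For users who want to inspect these values programmatically, a minimal parsing sketch is given below. It is a hypothetical helper, not part of the demo scripts, and it simply assumes the layout described above (one header line, then one comma-separated row per word):</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>import numpy

def load_text_embedding(path, dim=32):
    """Hypothetical helper: read a text model produced by `paraconvert.py --b2t`
    into a (num_words, dim) numpy array."""
    with open(path) as f:
        # Header line: version, sizeof(float), total number of parameters.
        version, float_size, total = (int(x) for x in f.readline().split(','))
        rows = []
        for line in f:
            # Tolerate a possible trailing comma on each row.
            values = [v for v in line.strip().split(',') if v]
            rows.append([float(v) for v in values])
    emb = numpy.array(rows, dtype='float32')
    assert emb.shape == (total // dim, dim)
    return emb
</pre></div>
</div>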
</div>
<div class="section" id="embedding-parameters-revision">
<span id="embedding-parameters-revision"></span><h3>Embedding Parameters Revision<a class="headerlink" href="#embedding-parameters-revision" title="Permalink to this headline"></a></h3>
<p>For users who want to revise the embedding parameters, this function can convert a revised text embedding model to a PaddlePaddle binary model by running the command:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span>cd $PADDLE_ROOT/demo/model_zoo/embedding
python paraconvert.py --t2b -i INPUT -o OUTPUT
</pre></div>
</div>
<ul class="simple">
<li><code class="docutils literal"><span class="pre">-i</span> <span class="pre">INPUT</span></code>: the name of input text embedding model.</li>
<li><code class="docutils literal"><span class="pre">-o</span> <span class="pre">OUTPUT</span></code>: the name of output binary embedding model</li>
</ul>
<p>Note that the format of the input text model is as follows (a sketch for writing such a file is shown after the list):</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="o">-</span><span class="mf">0.7845433</span><span class="p">,</span><span class="mf">1.1937413</span><span class="p">,</span><span class="o">-</span><span class="mf">0.1704215</span><span class="p">,</span><span class="mf">0.4154715</span><span class="p">,</span><span class="mf">0.9566584</span><span class="p">,</span><span class="o">-</span><span class="mf">0.5558153</span><span class="p">,</span><span class="o">-</span><span class="mf">0.2503305</span><span class="p">,</span> <span class="o">......</span>
<span class="mf">0.0000909</span><span class="p">,</span><span class="mf">0.0009465</span><span class="p">,</span><span class="o">-</span><span class="mf">0.0008813</span><span class="p">,</span><span class="o">-</span><span class="mf">0.0008428</span><span class="p">,</span><span class="mf">0.0007879</span><span class="p">,</span><span class="mf">0.0000183</span><span class="p">,</span><span class="mf">0.0001984</span><span class="p">,</span> <span class="o">......</span>
<span class="o">......</span>
</pre></div>
</div>
<ul class="simple">
<li>there is no file header in the 1st line</li>
<li>each line stores the parameters for one word, separated by commas &#8216;,&#8217;</li>
</ul>
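<p>As a hedged sketch (assuming the revised parameters are held in a numpy array and that plain comma-separated rows as shown above are accepted), such a file could be written with:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span>import numpy

def save_text_embedding(emb, path):
    # Hypothetical helper: write a (num_words, dim) array in the header-less,
    # comma-separated format expected by `paraconvert.py --t2b`.
    numpy.savetxt(path, emb, fmt='%.7f', delimiter=',')
</pre></div>
</div>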
</div>
</div>
</div>
</div>
</div>
<footer>
<hr/>
<div role="contentinfo">
<p>
&copy; Copyright 2016, PaddlePaddle developers.
</p>
</div>
Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/snide/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT:'../../',
VERSION:'',
COLLAPSE_INDEX:false,
FILE_SUFFIX:'.html',
HAS_SOURCE: true,
SOURCELINK_SUFFIX: ".txt",
};
</script>
<script type="text/javascript" src="../../_static/jquery.js"></script>
<script type="text/javascript" src="../../_static/underscore.js"></script>
<script type="text/javascript" src="../../_static/doctools.js"></script>
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
<script type="text/javascript" src="../../_static/js/theme.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script>
<script src="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/js/perfect-scrollbar.jquery.min.js"></script>
<script src="../../_static/js/paddle_doc_init.js"></script>
</body>
</html>
\ No newline at end of file
<!DOCTYPE html>
<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Generative Adversarial Networks (GAN) &mdash; PaddlePaddle documentation</title>
<link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" />
<link rel="index" title="Index"
href="../../genindex.html"/>
<link rel="search" title="Search" href="../../search.html"/>
<link rel="top" title="PaddlePaddle documentation" href="../../index.html"/>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/css/perfect-scrollbar.min.css" type="text/css" />
<link rel="stylesheet" href="../../_static/css/override.css" type="text/css" />
<script>
var _hmt = _hmt || [];
(function() {
var hm = document.createElement("script");
hm.src = "//hm.baidu.com/hm.js?b9a314ab40d04d805655aab1deee08ba";
var s = document.getElementsByTagName("script")[0];
s.parentNode.insertBefore(hm, s);
})();
</script>
<script src="../../_static/js/modernizr.min.js"></script>
</head>
<body class="wy-body-for-nav" role="document">
<header class="site-header">
<div class="site-logo">
<a href="/"><img src="../../_static/images/PP_w.png"></a>
</div>
<div class="site-nav-links">
<div class="site-menu">
<a class="fork-on-github" href="https://github.com/PaddlePaddle/Paddle" target="_blank"><i class="fa fa-github"></i>Fork me on Github</a>
<div class="language-switcher dropdown">
<a type="button" data-toggle="dropdown">
<span>English</span>
<i class="fa fa-angle-up"></i>
<i class="fa fa-angle-down"></i>
</a>
<ul class="dropdown-menu">
<li><a href="/doc_cn">中文</a></li>
<li><a href="/doc">English</a></li>
</ul>
</div>
<ul class="site-page-links">
<li><a href="/">Home</a></li>
</ul>
</div>
<div class="doc-module">
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../getstarted/index_en.html">GET STARTED</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../howto/index_en.html">HOW TO</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../api/index_en.html">API</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../mobile/index_en.html">MOBILE</a></li>
</ul>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div>
</div>
</header>
<div class="main-content-wrap">
<nav class="doc-menu-vertical" role="navigation">
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../getstarted/index_en.html">GET STARTED</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../getstarted/build_and_install/index_en.html">Install and Build</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../getstarted/build_and_install/pip_install_en.html">Install Using pip</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../getstarted/build_and_install/docker_install_en.html">Run in Docker Containers</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/dev/build_en.html">Build using Docker</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../getstarted/build_and_install/build_from_source_en.html">Build from Sources</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../howto/index_en.html">HOW TO</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../howto/usage/cmd_parameter/index_en.html">Set Command-line Parameters</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cmd_parameter/use_case_en.html">Use Case</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cmd_parameter/arguments_en.html">Argument Outline</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cmd_parameter/detail_introduction_en.html">Detail Description</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/usage/cluster/cluster_train_en.html">Distributed Training</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/fabric_en.html">fabric</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/openmpi_en.html">openmpi</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/k8s_en.html">kubernetes</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/k8s_aws_en.html">kubernetes on AWS</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/dev/new_layer_en.html">Write New Layers</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/dev/contribute_to_paddle_en.html">Contribute Code</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/dev/write_docs_en.html">Contribute Documentation</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/deep_model/rnn/index_en.html">RNN Models</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../howto/deep_model/rnn/rnn_config_en.html">RNN Configuration</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/optimization/gpu_profiling_en.html">Tune GPU Performance</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../api/index_en.html">API</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/model_configs.html">Model Configuration</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/activation.html">Activation</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/layer.html">Layers</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/evaluators.html">Evaluators</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/optimizer.html">Optimizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/pooling.html">Pooling</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/networks.html">Networks</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/attr.html">Parameter Attribute</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/data.html">Data Reader Interface and DataSets</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/data/data_reader.html">Data Reader Interface</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/data/image.html">Image Interface</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/data/dataset.html">Dataset</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/run_logic.html">Training and Inference</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/fluid.html">Fluid</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/layers.html">Layers</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/data_feeder.html">DataFeeder</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/executor.html">Executor</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/initializer.html">Initializer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/evaluator.html">Evaluator</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/nets.html">Nets</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/optimizer.html">Optimizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/param_attr.html">ParamAttr</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/profiler.html">Profiler</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/regularizer.html">Regularizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/io.html">IO</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../mobile/index_en.html">MOBILE</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../mobile/cross_compiling_for_android_en.html">Build PaddlePaddle for Android</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../mobile/cross_compiling_for_ios_en.html">Build PaddlePaddle for iOS</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../mobile/cross_compiling_for_raspberry_en.html">Build PaddlePaddle for Raspberry Pi</a></li>
</ul>
</li>
</ul>
</nav>
<section class="doc-content-wrap">
<div role="navigation" aria-label="breadcrumbs navigation">
<ul class="wy-breadcrumbs">
<li>Generative Adversarial Networks (GAN)</li>
</ul>
</div>
<div class="wy-nav-content" id="doc-content">
<div class="rst-content">
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<div class="section" id="generative-adversarial-networks-gan">
<span id="generative-adversarial-networks-gan"></span><h1>Generative Adversarial Networks (GAN)<a class="headerlink" href="#generative-adversarial-networks-gan" title="Permalink to this headline"></a></h1>
<p>This demo implements the GAN training described in the original <a class="reference external" href="https://arxiv.org/abs/1406.2661">GAN paper</a> and the deep convolutional generative adversarial networks (<a class="reference external" href="https://arxiv.org/abs/1511.06434">DCGAN</a>) paper.</p>
<p>The high-level structure of a GAN is shown in Figure 1 below. It is composed of two major parts: a generator and a discriminator, both of which are based on neural networks. The generator takes in noise with a known distribution and transforms it into an image. The discriminator takes in an image and determines whether it was artificially generated by the generator or is a real image. Thus the generator and the discriminator play a competitive game, in which the generator tries to generate images that look as real as possible in order to fool the discriminator, while the discriminator tries to distinguish between real and fake images.</p>
<p><center><img alt="" src="../../_images/gan.png" /></center></p>
<p align="center">
Figure 1. GAN-Model-Structure
<a href="https://ishmaelbelghazi.github.io/ALI/">figure credit</a>
</p><p>The generator and discriminator take turns being trained using SGD. The objective of the generator is for its generated images to be classified as real by the discriminator, while the objective of the discriminator is to correctly classify real and fake images. When the GAN model is trained to converge to the equilibrium state, the generator will transform the given noise distribution into the distribution of real images, and the discriminator will not be able to distinguish between real and fake images at all.</p>
<div class="section" id="implementation-of-gan-model-structure">
<span id="implementation-of-gan-model-structure"></span><h2>Implementation of GAN Model Structure<a class="headerlink" href="#implementation-of-gan-model-structure" title="Permalink to this headline"></a></h2>
<p>Since the GAN model involves multiple neural networks, it requires the use of the PaddlePaddle Python API. So the code walk-through below can also partially serve as an introduction to the usage of the Paddle Python API.</p>
<p>There are three networks defined in gan_conf.py, namely <strong>generator_training</strong>, <strong>discriminator_training</strong> and <strong>generator</strong>. The relationship to the model structure defined above is that <strong>discriminator_training</strong> is the discriminator, <strong>generator</strong> is the generator, and <strong>generator_training</strong> combines the generator and the discriminator, since training the generator requires the discriminator to provide the loss function. This relationship is described in the following code:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">if</span> <span class="n">is_generator_training</span><span class="p">:</span>
<span class="n">noise</span> <span class="o">=</span> <span class="n">data_layer</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s2">&quot;noise&quot;</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">noise_dim</span><span class="p">)</span>
<span class="n">sample</span> <span class="o">=</span> <span class="n">generator</span><span class="p">(</span><span class="n">noise</span><span class="p">)</span>
<span class="k">if</span> <span class="n">is_discriminator_training</span><span class="p">:</span>
<span class="n">sample</span> <span class="o">=</span> <span class="n">data_layer</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s2">&quot;sample&quot;</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">sample_dim</span><span class="p">)</span>
<span class="k">if</span> <span class="n">is_generator_training</span> <span class="ow">or</span> <span class="n">is_discriminator_training</span><span class="p">:</span>
<span class="n">label</span> <span class="o">=</span> <span class="n">data_layer</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s2">&quot;label&quot;</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">prob</span> <span class="o">=</span> <span class="n">discriminator</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span>
<span class="n">cost</span> <span class="o">=</span> <span class="n">cross_entropy</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">prob</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="n">label</span><span class="p">)</span>
<span class="n">classification_error_evaluator</span><span class="p">(</span>
<span class="nb">input</span><span class="o">=</span><span class="n">prob</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="n">label</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="n">mode</span> <span class="o">+</span> <span class="s1">&#39;_error&#39;</span><span class="p">)</span>
<span class="n">outputs</span><span class="p">(</span><span class="n">cost</span><span class="p">)</span>
<span class="k">if</span> <span class="n">is_generator</span><span class="p">:</span>
<span class="n">noise</span> <span class="o">=</span> <span class="n">data_layer</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s2">&quot;noise&quot;</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">noise_dim</span><span class="p">)</span>
<span class="n">outputs</span><span class="p">(</span><span class="n">generator</span><span class="p">(</span><span class="n">noise</span><span class="p">))</span>
</pre></div>
</div>
<p>In order to train the networks defined in gan_conf.py, one first needs to initialize a Paddle environment, parse the config, create a GradientMachine from the config, and create a trainer from the GradientMachine, as done in the code chunk below:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">py_paddle.swig_paddle</span> <span class="kn">as</span> <span class="nn">api</span>
<span class="c1"># init paddle environment</span>
<span class="n">api</span><span class="o">.</span><span class="n">initPaddle</span><span class="p">(</span><span class="s1">&#39;--use_gpu=&#39;</span> <span class="o">+</span> <span class="n">use_gpu</span><span class="p">,</span> <span class="s1">&#39;--dot_period=10&#39;</span><span class="p">,</span>
<span class="s1">&#39;--log_period=100&#39;</span><span class="p">,</span> <span class="s1">&#39;--gpu_id=&#39;</span> <span class="o">+</span> <span class="n">args</span><span class="o">.</span><span class="n">gpu_id</span><span class="p">,</span>
<span class="s1">&#39;--save_dir=&#39;</span> <span class="o">+</span> <span class="s2">&quot;./</span><span class="si">%s</span><span class="s2">_params/&quot;</span> <span class="o">%</span> <span class="n">data_source</span><span class="p">)</span>
<span class="c1"># Parse config</span>
<span class="n">gen_conf</span> <span class="o">=</span> <span class="n">parse_config</span><span class="p">(</span><span class="n">conf</span><span class="p">,</span> <span class="s2">&quot;mode=generator_training,data=&quot;</span> <span class="o">+</span> <span class="n">data_source</span><span class="p">)</span>
<span class="n">dis_conf</span> <span class="o">=</span> <span class="n">parse_config</span><span class="p">(</span><span class="n">conf</span><span class="p">,</span> <span class="s2">&quot;mode=discriminator_training,data=&quot;</span> <span class="o">+</span> <span class="n">data_source</span><span class="p">)</span>
<span class="n">generator_conf</span> <span class="o">=</span> <span class="n">parse_config</span><span class="p">(</span><span class="n">conf</span><span class="p">,</span> <span class="s2">&quot;mode=generator,data=&quot;</span> <span class="o">+</span> <span class="n">data_source</span><span class="p">)</span>
<span class="c1"># Create GradientMachine</span>
<span class="n">dis_training_machine</span> <span class="o">=</span> <span class="n">api</span><span class="o">.</span><span class="n">GradientMachine</span><span class="o">.</span><span class="n">createFromConfigProto</span><span class="p">(</span>
<span class="n">dis_conf</span><span class="o">.</span><span class="n">model_config</span><span class="p">)</span>
<span class="n">gen_training_machine</span> <span class="o">=</span> <span class="n">api</span><span class="o">.</span><span class="n">GradientMachine</span><span class="o">.</span><span class="n">createFromConfigProto</span><span class="p">(</span>
<span class="n">gen_conf</span><span class="o">.</span><span class="n">model_config</span><span class="p">)</span>
<span class="n">generator_machine</span> <span class="o">=</span> <span class="n">api</span><span class="o">.</span><span class="n">GradientMachine</span><span class="o">.</span><span class="n">createFromConfigProto</span><span class="p">(</span>
<span class="n">generator_conf</span><span class="o">.</span><span class="n">model_config</span><span class="p">)</span>
<span class="c1"># Create trainer</span>
<span class="n">dis_trainer</span> <span class="o">=</span> <span class="n">api</span><span class="o">.</span><span class="n">Trainer</span><span class="o">.</span><span class="n">create</span><span class="p">(</span><span class="n">dis_conf</span><span class="p">,</span> <span class="n">dis_training_machine</span><span class="p">)</span>
<span class="n">gen_trainer</span> <span class="o">=</span> <span class="n">api</span><span class="o">.</span><span class="n">Trainer</span><span class="o">.</span><span class="n">create</span><span class="p">(</span><span class="n">gen_conf</span><span class="p">,</span> <span class="n">gen_training_machine</span><span class="p">)</span>
</pre></div>
</div>
<p>In order to balance the strengths of the generator and the discriminator, we schedule training for whichever one is currently performing worse, by comparing their loss values; a hedged sketch of this scheduling is given after the code below. The loss value can be calculated by a forward pass through the GradientMachine.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">get_training_loss</span><span class="p">(</span><span class="n">training_machine</span><span class="p">,</span> <span class="n">inputs</span><span class="p">):</span>
<span class="n">outputs</span> <span class="o">=</span> <span class="n">api</span><span class="o">.</span><span class="n">Arguments</span><span class="o">.</span><span class="n">createArguments</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
<span class="n">training_machine</span><span class="o">.</span><span class="n">forward</span><span class="p">(</span><span class="n">inputs</span><span class="p">,</span> <span class="n">outputs</span><span class="p">,</span> <span class="n">api</span><span class="o">.</span><span class="n">PASS_TEST</span><span class="p">)</span>
<span class="n">loss</span> <span class="o">=</span> <span class="n">outputs</span><span class="o">.</span><span class="n">getSlotValue</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span><span class="o">.</span><span class="n">copyToNumpyMat</span><span class="p">()</span>
<span class="k">return</span> <span class="n">numpy</span><span class="o">.</span><span class="n">mean</span><span class="p">(</span><span class="n">loss</span><span class="p">)</span>
</pre></div>
</div>
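<p>For illustration only, the scheduling logic might look like the sketch below. The variable names and the exact comparison are assumptions made for this sketch, not code taken verbatim from <code class="docutils literal"><span class="pre">gan_trainer.py</span></code>:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># Hypothetical sketch of the alternating schedule: train whichever network
# is currently doing worse, as measured by its loss on the current batch.
curr_dis_loss = get_training_loss(dis_training_machine, data_batch_dis)
curr_gen_loss = get_training_loss(gen_training_machine, data_batch_gen)

if curr_dis_loss &gt; curr_gen_loss:
    dis_trainer.trainOneDataBatch(batch_size, data_batch_dis)
else:
    gen_trainer.trainOneDataBatch(batch_size, data_batch_gen)
</pre></div>
</div>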
<p>After training one network, one needs to sync the new parameters to the other networks. The code below demonstrates one example of such a use case:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># Train the gen_training</span>
<span class="n">gen_trainer</span><span class="o">.</span><span class="n">trainOneDataBatch</span><span class="p">(</span><span class="n">batch_size</span><span class="p">,</span> <span class="n">data_batch_gen</span><span class="p">)</span>
<span class="c1"># Copy the parameters from gen_training to dis_training and generator</span>
<span class="n">copy_shared_parameters</span><span class="p">(</span><span class="n">gen_training_machine</span><span class="p">,</span>
<span class="n">dis_training_machine</span><span class="p">)</span>
<span class="n">copy_shared_parameters</span><span class="p">(</span><span class="n">gen_training_machine</span><span class="p">,</span> <span class="n">generator_machine</span><span class="p">)</span>
</pre></div>
</div>
</div>
<div class="section" id="a-toy-example">
<span id="a-toy-example"></span><h2>A Toy Example<a class="headerlink" href="#a-toy-example" title="Permalink to this headline"></a></h2>
<p>With the infrastructure explained above, we can now walk you through a toy example of generating a two-dimensional uniform distribution from 10-dimensional Gaussian noise.</p>
<p>The Gaussian noise is generated using the code below:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">get_noise</span><span class="p">(</span><span class="n">batch_size</span><span class="p">,</span> <span class="n">noise_dim</span><span class="p">):</span>
<span class="k">return</span> <span class="n">numpy</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">normal</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="p">(</span><span class="n">batch_size</span><span class="p">,</span> <span class="n">noise_dim</span><span class="p">))</span><span class="o">.</span><span class="n">astype</span><span class="p">(</span><span class="s1">&#39;float32&#39;</span><span class="p">)</span>
</pre></div>
</div>
<p>The real samples (2-D uniform) are generated using the code below:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># synthesize 2-D uniform data in gan_trainer.py:114</span>
<span class="k">def</span> <span class="nf">load_uniform_data</span><span class="p">():</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">numpy</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">rand</span><span class="p">(</span><span class="mi">1000000</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span><span class="o">.</span><span class="n">astype</span><span class="p">(</span><span class="s1">&#39;float32&#39;</span><span class="p">)</span>
<span class="k">return</span> <span class="n">data</span>
</pre></div>
</div>
<p>The generator and discriminator networks are built using fully-connected layers and batch_norm layers, and are defined in gan_conf.py; a rough sketch of such a generator is given below.</p>
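<p>As a rough sketch only (this is not the actual <code class="docutils literal"><span class="pre">gan_conf.py</span></code>; the hidden size, layer names, and activations are assumptions), a fully-connected generator with batch normalization might be configured like this in the v1 config API:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span># Hypothetical sketch; hidden_dim is an assumed hyper-parameter and
# sample_dim is the dimension of the generated sample.
def generator(noise):
    hidden = fc_layer(input=noise, name="gen_hidden", size=hidden_dim,
                      act=ReluActivation())
    hidden = batch_norm_layer(input=hidden, act=ReluActivation())
    return fc_layer(input=hidden, name="gen_sample", size=sample_dim,
                    act=LinearActivation())
</pre></div>
</div>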
<p>To train the GAN model, one can use the command below. The flag <code class="docutils literal"><span class="pre">-d</span></code> specifies the training data (cifar, mnist or uniform) and the flag <code class="docutils literal"><span class="pre">--useGpu</span></code> specifies whether to use the GPU for training (0 for CPU, 1 for GPU).</p>
<div class="highlight-bash"><div class="highlight"><pre><span></span><span class="nv">$python</span> gan_trainer.py -d uniform --useGpu <span class="m">1</span>
</pre></div>
</div>
<p>The generated samples can be found in ./uniform_samples/ and one example is shown below as Figure 2. One can see that it roughly recovers the 2D uniform distribution.</p>
<p><center><img alt="" src="../../_images/uniform_sample.png" /></center></p>
<p align="center">
Figure 2. Uniform Sample
</p></div>
<div class="section" id="mnist-example">
<span id="mnist-example"></span><h2>MNIST Example<a class="headerlink" href="#mnist-example" title="Permalink to this headline"></a></h2>
<div class="section" id="data-preparation">
<span id="data-preparation"></span><h3>Data preparation<a class="headerlink" href="#data-preparation" title="Permalink to this headline"></a></h3>
<p>To download the MNIST data, one can use the following commands:</p>
<div class="highlight-bash"><div class="highlight"><pre><span></span><span class="nv">$cd</span> data/
$./get_mnist_data.sh
</pre></div>
</div>
</div>
<div class="section" id="model-description">
<span id="model-description"></span><h3>Model description<a class="headerlink" href="#model-description" title="Permalink to this headline"></a></h3>
<p>Following the DCGAN paper (https://arxiv.org/abs/1511.06434), we use convolution/convolution-transpose layers in the discriminator/generator networks to better deal with images. The details of the network structures are defined in gan_conf_image.py.</p>
</div>
<div class="section" id="training-the-model">
<span id="training-the-model"></span><h3>Training the model<a class="headerlink" href="#training-the-model" title="Permalink to this headline"></a></h3>
<p>To train the GAN model on the MNIST data, one can use the following command:</p>
<div class="highlight-bash"><div class="highlight"><pre><span></span><span class="nv">$python</span> gan_trainer.py -d mnist --useGpu <span class="m">1</span>
</pre></div>
</div>
<p>The generated sample images can be found at ./mnist_samples/ and one example is shown below as Figure 3.
<center><img alt="" src="../../_images/mnist_sample.png" /></center></p>
<p align="center">
Figure 3. MNIST Sample
</p></div>
</div>
</div>
</div>
</div>
<footer>
<hr/>
<div role="contentinfo">
<p>
&copy; Copyright 2016, PaddlePaddle developers.
</p>
</div>
Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/snide/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT:'../../',
VERSION:'',
COLLAPSE_INDEX:false,
FILE_SUFFIX:'.html',
HAS_SOURCE: true,
SOURCELINK_SUFFIX: ".txt",
};
</script>
<script type="text/javascript" src="../../_static/jquery.js"></script>
<script type="text/javascript" src="../../_static/underscore.js"></script>
<script type="text/javascript" src="../../_static/doctools.js"></script>
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
<script type="text/javascript" src="../../_static/js/theme.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script>
<script src="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/js/perfect-scrollbar.jquery.min.js"></script>
<script src="../../_static/js/paddle_doc_init.js"></script>
</body>
</html>
\ No newline at end of file
<!DOCTYPE html>
<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Model Zoo - ImageNet &mdash; PaddlePaddle documentation</title>
<link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" />
<link rel="index" title="Index"
href="../../genindex.html"/>
<link rel="search" title="Search" href="../../search.html"/>
<link rel="top" title="PaddlePaddle documentation" href="../../index.html"/>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/css/perfect-scrollbar.min.css" type="text/css" />
<link rel="stylesheet" href="../../_static/css/override.css" type="text/css" />
<script>
var _hmt = _hmt || [];
(function() {
var hm = document.createElement("script");
hm.src = "//hm.baidu.com/hm.js?b9a314ab40d04d805655aab1deee08ba";
var s = document.getElementsByTagName("script")[0];
s.parentNode.insertBefore(hm, s);
})();
</script>
<script src="../../_static/js/modernizr.min.js"></script>
</head>
<body class="wy-body-for-nav" role="document">
<header class="site-header">
<div class="site-logo">
<a href="/"><img src="../../_static/images/PP_w.png"></a>
</div>
<div class="site-nav-links">
<div class="site-menu">
<a class="fork-on-github" href="https://github.com/PaddlePaddle/Paddle" target="_blank"><i class="fa fa-github"></i>Fork me on Github</a>
<div class="language-switcher dropdown">
<a type="button" data-toggle="dropdown">
<span>English</span>
<i class="fa fa-angle-up"></i>
<i class="fa fa-angle-down"></i>
</a>
<ul class="dropdown-menu">
<li><a href="/doc_cn">中文</a></li>
<li><a href="/doc">English</a></li>
</ul>
</div>
<ul class="site-page-links">
<li><a href="/">Home</a></li>
</ul>
</div>
<div class="doc-module">
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../getstarted/index_en.html">GET STARTED</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../howto/index_en.html">HOW TO</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../api/index_en.html">API</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../mobile/index_en.html">MOBILE</a></li>
</ul>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div>
</div>
</header>
<div class="main-content-wrap">
<section class="doc-content-wrap">
<div class="wy-nav-content" id="doc-content">
<div class="rst-content">
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<div class="section" id="model-zoo-imagenet">
<span id="model-zoo-imagenet"></span><h1>Model Zoo - ImageNet<a class="headerlink" href="#model-zoo-imagenet" title="Permalink to this headline"></a></h1>
<p><a class="reference external" href="http://www.image-net.org/">ImageNet</a> is a popular dataset for generic object classification. This tutorial provides convolutional neural network (CNN) models for ImageNet.</p>
<div class="section" id="resnet-introduction">
<span id="resnet-introduction"></span><h2>ResNet Introduction<a class="headerlink" href="#resnet-introduction" title="Permalink to this headline"></a></h2>
<p>ResNets from the paper <a class="reference external" href="http://arxiv.org/abs/1512.03385">Deep Residual Learning for Image Recognition</a> won 1st place on the ILSVRC 2015 classification task. They present a residual learning framework that eases the training of networks substantially deeper than those used previously. The residual connections are shown in the following figure. The left building block is used in the 34-layer network, and the right bottleneck building block is used in the 50-, 101-, and 152-layer networks.</p>
<p><center><img alt="resnet_block" src="../../_images/resnet_block.jpg" /></center>
<center>Figure 1. ResNet Block</center></p>
<p>We present three ResNet models, converted from the models provided by the authors at <a class="reference external" href="https://github.com/KaimingHe/deep-residual-networks">https://github.com/KaimingHe/deep-residual-networks</a>. The classification errors, measured in PaddlePaddle on the 50,000-image ILSVRC validation set with input images in <strong>BGR</strong> channel order, a single scale with the shorter side of 256, and a single crop, are shown in the following table.
<center></p>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border">
<colgroup>
<col class="left" />
<col class="left" />
<col class="left" />
</colgroup>
<thead>
<tr>
<th scope="col" class="left">ResNet</th>
<th scope="col" class="left">Top-1</th>
<th scope="col" class="left">Model Size</th>
</tr>
</thead><tbody>
<tr>
<td class="left">ResNet-50</td>
<td class="left">24.9%</td>
<td class="left">99M</td>
</tr>
<tr>
<td class="left">ResNet-101</td>
<td class="left">23.7%</td>
<td class="left">173M</td>
</tr>
<tr>
<td class="left">ResNet-152</td>
<td class="left">23.2%</td>
<td class="left">234M</td>
</tr>
</tbody></table></center>
<br></div>
<div class="section" id="resnet-model">
<span id="resnet-model"></span><h2>ResNet Model<a class="headerlink" href="#resnet-model" title="Permalink to this headline"></a></h2>
<p>See <code class="docutils literal"><span class="pre">demo/model_zoo/resnet/resnet.py</span></code>. This config contains the networks of 50, 101 and 152 layers. You can specify the layer number by adding an argument such as <code class="docutils literal"><span class="pre">--config_args=layer_num=50</span></code> on the command line.</p>
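<p>As a rough illustration of how <code class="docutils literal"><span class="pre">--config_args=layer_num=50</span></code> reaches the config: a v1 config file typically reads such arguments with <code class="docutils literal"><span class="pre">get_config_arg</span></code>. The sketch below is illustrative only and is not the actual content of <code class="docutils literal"><span class="pre">resnet.py</span></code>; the variable name and block counts are assumptions for illustration.</p>
<div class="highlight-python"><div class="highlight"><pre>
# Sketch: how --config_args=layer_num=50 might be consumed inside a v1 config file.
# get_config_arg is provided by the v1 trainer config environment; the block counts
# below follow the standard ResNet-50/101/152 definitions and are illustrative.
layer_num = get_config_arg("layer_num", int, 50)

if layer_num == 50:
    block_counts = [3, 4, 6, 3]     # res2, res3, res4, res5 bottleneck blocks
elif layer_num == 101:
    block_counts = [3, 4, 23, 3]
elif layer_num == 152:
    block_counts = [3, 8, 36, 3]
else:
    raise ValueError("layer_num must be 50, 101 or 152")
</pre></div>
</div>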
<div class="section" id="network-visualization">
<span id="network-visualization"></span><h3>Network Visualization<a class="headerlink" href="#network-visualization" title="Permalink to this headline"></a></h3>
<p>You can get a diagram of the ResNet network by running the following commands. The script generates a dot file and then converts it to a PNG file; graphviz must be installed for the conversion.</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">cd</span> <span class="n">demo</span><span class="o">/</span><span class="n">model_zoo</span><span class="o">/</span><span class="n">resnet</span>
<span class="o">./</span><span class="n">net_diagram</span><span class="o">.</span><span class="n">sh</span>
</pre></div>
</div>
</div>
<div class="section" id="model-download">
<span id="model-download"></span><h3>Model Download<a class="headerlink" href="#model-download" title="Permalink to this headline"></a></h3>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">cd</span> <span class="n">demo</span><span class="o">/</span><span class="n">model_zoo</span><span class="o">/</span><span class="n">resnet</span>
<span class="o">./</span><span class="n">get_model</span><span class="o">.</span><span class="n">sh</span>
</pre></div>
</div>
<p>The above command downloads all models and the mean file and saves them in <code class="docutils literal"><span class="pre">demo/model_zoo/resnet/model</span></code> if the download succeeds.</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">mean_meta_224</span> <span class="n">resnet_101</span> <span class="n">resnet_152</span> <span class="n">resnet_50</span>
</pre></div>
</div>
<ul class="simple">
<li>resnet_50: model of 50 layers.</li>
<li>resnet_101: model of 101 layers.</li>
<li>resnet_152: model of 152 layers.</li>
<li>mean_meta_224: mean file of size 3 x 224 x 224 in <strong>BGR</strong> order. You can also use the three per-channel mean values 103.939, 116.779, 123.68 directly (see the sketch after this list).</li>
</ul>
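<p>If you use the three mean values instead of the mean file, the per-channel subtraction could be done as in the following sketch. This is an illustration only, not the code used by the demo scripts; the <code class="docutils literal"><span class="pre">subtract_mean</span></code> helper and the input shape are assumptions.</p>
<div class="highlight-python"><div class="highlight"><pre>
import numpy as np

# Sketch: per-channel mean subtraction with the three BGR mean values listed above.
# `image` is assumed to be a float32 array of shape (3, 224, 224) in B, G, R order.
BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def subtract_mean(image):
    # Broadcast the per-channel means over the spatial dimensions.
    return image - BGR_MEAN.reshape(3, 1, 1)

# Illustrative usage with a random image-shaped array.
image = np.random.rand(3, 224, 224).astype(np.float32) * 255.0
normalized = subtract_mean(image)
</pre></div>
</div>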
</div>
<div class="section" id="parameter-info">
<span id="parameter-info"></span><h3>Parameter Info<a class="headerlink" href="#parameter-info" title="Permalink to this headline"></a></h3>
<ul>
<li><p class="first"><strong>Convolution Layer Weight</strong></p>
<p>Since a batch normalization layer follows each convolution layer, a convolution layer has no bias parameter and only one weight.
shape: <code class="docutils literal"><span class="pre">(Co,</span> <span class="pre">ky,</span> <span class="pre">kx,</span> <span class="pre">Ci)</span></code></p>
<ul class="simple">
<li>Co: channel number of the output feature map.</li>
<li>ky: filter size in the vertical direction.</li>
<li>kx: filter size in the horizontal direction.</li>
<li>Ci: channel number of the input feature map.</li>
</ul>
<p>2-Dim matrix: (Co * ky * kx, Ci), saved in row-major order.</p>
</li>
<li><p class="first"><strong>Fully connected Layer Weight</strong></p>
<p>2-Dim matrix: (input layer size, this layer size), saved in row-major order.</p>
</li>
<li><p class="first"><strong><a class="reference external" href="http://arxiv.org/abs/1502.03167">Batch Normalization</a> Layer Weight</strong></p>
</li>
</ul>
<p>There are four parameters in this layer. In fact, only .w0 and .wbias are learned parameters; the other two are the running mean and variance, respectively, which are loaded at test time. The following table shows the parameters of a batch normalization layer.
<center></p>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border">
<colgroup>
<col class="left" />
<col class="left" />
<col class="left" />
</colgroup>
<thead>
<tr>
<th scope="col" class="left">Parameter Name</th>
<th scope="col" class="left">Number</th>
<th scope="col" class="left">Meaning</th>
</tr>
</thead><tbody>
<tr>
<td class="left">_res2_1_branch1_bn.w0</td>
<td class="left">256</td>
<td class="left">gamma, scale parameter</td>
</tr>
<tr>
<td class="left">_res2_1_branch1_bn.w1</td>
<td class="left">256</td>
<td class="left">mean value of feature map</td>
</tr>
<tr>
<td class="left">_res2_1_branch1_bn.w2</td>
<td class="left">256</td>
<td class="left">variance of feature map</td>
</tr>
<tr>
<td class="left">_res2_1_branch1_bn.wbias</td>
<td class="left">256</td>
<td class="left">beta, shift parameter</td>
</tr>
</tbody></table></center>
<br></div>
<div class="section" id="parameter-observation">
<span id="parameter-observation"></span><h3>Parameter Observation<a class="headerlink" href="#parameter-observation" title="Permalink to this headline"></a></h3>
<p>Users who want to observe the parameters can use Python to read:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
<span class="k">def</span> <span class="nf">load</span><span class="p">(</span><span class="n">file_name</span><span class="p">):</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">file_name</span><span class="p">,</span> <span class="s1">&#39;rb&#39;</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
<span class="n">f</span><span class="o">.</span><span class="n">read</span><span class="p">(</span><span class="mi">16</span><span class="p">)</span> <span class="c1"># skip header for float type.</span>
<span class="k">return</span> <span class="n">np</span><span class="o">.</span><span class="n">fromfile</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">float32</span><span class="p">)</span>
<span class="k">if</span> <span class="vm">__name__</span><span class="o">==</span><span class="s1">&#39;__main__&#39;</span><span class="p">:</span>
<span class="n">weight</span> <span class="o">=</span> <span class="n">load</span><span class="p">(</span><span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
</pre></div>
</div>
<p>or simply use the following shell command:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">od</span> <span class="o">-</span><span class="n">j</span> <span class="mi">16</span> <span class="o">-</span><span class="n">f</span> <span class="n">_res2_1_branch1_bn</span><span class="o">.</span><span class="n">w0</span>
</pre></div>
</div>
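<p>Combining the <code class="docutils literal"><span class="pre">load</span></code> helper above with the shapes described in the Parameter Info section, a flat convolution weight can be viewed as the 2-D matrix or 4-D tensor it represents. This is a sketch only; the concrete sizes below are illustrative assumptions, not values read from a real model file.</p>
<div class="highlight-python"><div class="highlight"><pre>
import numpy as np

# Sketch: reinterpret a flat convolution weight as the row-major (Co * ky * kx, Ci)
# matrix described above, and as the 4-D tensor (Co, ky, kx, Ci).
# The sizes are illustrative assumptions, not read from a real parameter file.
Co, ky, kx, Ci = 64, 3, 3, 64

flat = np.random.rand(Co * ky * kx * Ci).astype(np.float32)  # stands in for load(file_name)
weight_2d = flat.reshape(Co * ky * kx, Ci)                   # row-major 2-D view
weight_4d = flat.reshape(Co, ky, kx, Ci)                     # per-filter 4-D view
print(weight_2d.shape, weight_4d.shape)
</pre></div>
</div>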
</div>
</div>
<div class="section" id="feature-extraction">
<span id="feature-extraction"></span><h2>Feature Extraction<a class="headerlink" href="#feature-extraction" title="Permalink to this headline"></a></h2>
<p>We provide both C++ and Python interfaces to extract features. The following examples use the data in <code class="docutils literal"><span class="pre">demo/model_zoo/resnet/example</span></code> to show the extraction process in detail.</p>
<div class="section" id="c-interface">
<span id="c-interface"></span><h3>C++ Interface<a class="headerlink" href="#c-interface" title="Permalink to this headline"></a></h3>
<p>First, specify image data list in <code class="docutils literal"><span class="pre">define_py_data_sources2</span></code> in the config, see example <code class="docutils literal"><span class="pre">demo/model_zoo/resnet/resnet.py</span></code>.</p>
<div class="highlight-default"><div class="highlight"><pre><span></span> <span class="n">train_list</span> <span class="o">=</span> <span class="s1">&#39;train.list&#39;</span> <span class="k">if</span> <span class="ow">not</span> <span class="n">is_test</span> <span class="k">else</span> <span class="kc">None</span>
<span class="c1"># mean.meta is mean file of ImageNet dataset.</span>
<span class="c1"># mean.meta size : 3 x 224 x 224.</span>
<span class="c1"># If you use three mean value, set like:</span>
<span class="c1"># &quot;mean_value:103.939,116.779,123.68;&quot;</span>
<span class="n">args</span><span class="o">=</span><span class="p">{</span>
<span class="s1">&#39;mean_meta&#39;</span><span class="p">:</span> <span class="s2">&quot;model/mean_meta_224/mean.meta&quot;</span><span class="p">,</span>
<span class="s1">&#39;image_size&#39;</span><span class="p">:</span> <span class="mi">224</span><span class="p">,</span> <span class="s1">&#39;crop_size&#39;</span><span class="p">:</span> <span class="mi">224</span><span class="p">,</span>
<span class="s1">&#39;color&#39;</span><span class="p">:</span> <span class="kc">True</span><span class="p">,</span> <span class="s1">&#39;swap_channel&#39;</span><span class="p">:</span> <span class="p">[</span><span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">]}</span>
<span class="n">define_py_data_sources2</span><span class="p">(</span><span class="n">train_list</span><span class="p">,</span>
<span class="s1">&#39;example/test.list&#39;</span><span class="p">,</span>
<span class="n">module</span><span class="o">=</span><span class="s2">&quot;example.image_list_provider&quot;</span><span class="p">,</span>
<span class="n">obj</span><span class="o">=</span><span class="s2">&quot;processData&quot;</span><span class="p">,</span>
<span class="n">args</span><span class="o">=</span><span class="n">args</span><span class="p">)</span>
</pre></div>
</div>
<p>Second, specify layers to extract features in <code class="docutils literal"><span class="pre">Outputs()</span></code> of <code class="docutils literal"><span class="pre">resnet.py</span></code>. For example,</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">Outputs</span><span class="p">(</span><span class="s2">&quot;res5_3_branch2c_conv&quot;</span><span class="p">,</span> <span class="s2">&quot;res5_3_branch2c_bn&quot;</span><span class="p">)</span>
</pre></div>
</div>
<p>Third, specify model path and output directory in <code class="docutils literal"><span class="pre">extract_fea_c++.sh</span></code>, and then run the following commands.</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">cd</span> <span class="n">demo</span><span class="o">/</span><span class="n">model_zoo</span><span class="o">/</span><span class="n">resnet</span>
<span class="o">./</span><span class="n">extract_fea_c</span><span class="o">++.</span><span class="n">sh</span>
</pre></div>
</div>
<p>If successful, features are saved in <code class="docutils literal"><span class="pre">fea_output/rank-00000</span></code> as follows, and you can use the <code class="docutils literal"><span class="pre">load_feature_c</span></code> interface in <code class="docutils literal"><span class="pre">load_feature.py</span></code> to load such a file.</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="o">-</span><span class="mf">0.115318</span> <span class="o">-</span><span class="mf">0.108358</span> <span class="o">...</span> <span class="o">-</span><span class="mf">0.087884</span><span class="p">;</span><span class="o">-</span><span class="mf">1.27664</span> <span class="o">...</span> <span class="o">-</span><span class="mf">1.11516</span> <span class="o">-</span><span class="mf">2.59123</span><span class="p">;</span>
<span class="o">-</span><span class="mf">0.126383</span> <span class="o">-</span><span class="mf">0.116248</span> <span class="o">...</span> <span class="o">-</span><span class="mf">0.00534909</span><span class="p">;</span><span class="o">-</span><span class="mf">1.42593</span> <span class="o">...</span> <span class="o">-</span><span class="mf">1.04501</span> <span class="o">-</span><span class="mf">1.40769</span><span class="p">;</span>
</pre></div>
</div>
<ul class="simple">
<li>Each line stores the features of one sample. Here, the first line stores the features of <code class="docutils literal"><span class="pre">example/dog.jpg</span></code> and the second line stores the features of <code class="docutils literal"><span class="pre">example/cat.jpg</span></code>.</li>
<li>Features of different layers are separated by <code class="docutils literal"><span class="pre">;</span></code>, and their order is consistent with the layer order in <code class="docutils literal"><span class="pre">Outputs()</span></code>. Here, the features on the left come from the <code class="docutils literal"><span class="pre">res5_3_branch2c_conv</span></code> layer and those on the right from the <code class="docutils literal"><span class="pre">res5_3_branch2c_bn</span></code> layer. A minimal parsing sketch follows this list.</li>
</ul>
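<p>Based on the format described above (one sample per line, layers separated by <code class="docutils literal"><span class="pre">;</span></code>, values separated by spaces), a minimal parser might look like the sketch below. In practice, prefer the <code class="docutils literal"><span class="pre">load_feature_c</span></code> interface in <code class="docutils literal"><span class="pre">load_feature.py</span></code>; the helper name <code class="docutils literal"><span class="pre">parse_feature_file</span></code> here is hypothetical.</p>
<div class="highlight-python"><div class="highlight"><pre>
# Sketch: parse fea_output/rank-00000 as described above (hypothetical helper).
def parse_feature_file(path):
    samples = []
    with open(path) as f:
        for line in f:
            line = line.strip().rstrip(';')
            if not line:
                continue
            # One inner list of floats per layer listed in Outputs().
            layers = [[float(v) for v in part.split()] for part in line.split(';')]
            samples.append(layers)
    return samples

# Usage (after running extract_fea_c++.sh):
# feats = parse_feature_file('fea_output/rank-00000')
# feats[0][0] would be the res5_3_branch2c_conv features of example/dog.jpg
</pre></div>
</div>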
</div>
<div class="section" id="python-interface">
<span id="python-interface"></span><h3>Python Interface<a class="headerlink" href="#python-interface" title="Permalink to this headline"></a></h3>
<p><code class="docutils literal"><span class="pre">demo/model_zoo/resnet/classify.py</span></code> is an example showing how to extract features with Python. The following example still uses the data in <code class="docutils literal"><span class="pre">./example/test.list</span></code>. The command is as follows:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">cd</span> <span class="n">demo</span><span class="o">/</span><span class="n">model_zoo</span><span class="o">/</span><span class="n">resnet</span>
<span class="o">./</span><span class="n">extract_fea_py</span><span class="o">.</span><span class="n">sh</span>
</pre></div>
</div>
<p>extract_fea_py.sh:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">python</span> <span class="n">classify</span><span class="o">.</span><span class="n">py</span> \
<span class="o">--</span><span class="n">job</span><span class="o">=</span><span class="n">extract</span> \
<span class="o">--</span><span class="n">conf</span><span class="o">=</span><span class="n">resnet</span><span class="o">.</span><span class="n">py</span>\
<span class="o">--</span><span class="n">use_gpu</span><span class="o">=</span><span class="mi">1</span> \
<span class="o">--</span><span class="n">mean</span><span class="o">=</span><span class="n">model</span><span class="o">/</span><span class="n">mean_meta_224</span><span class="o">/</span><span class="n">mean</span><span class="o">.</span><span class="n">meta</span> \
<span class="o">--</span><span class="n">model</span><span class="o">=</span><span class="n">model</span><span class="o">/</span><span class="n">resnet_50</span> \
<span class="o">--</span><span class="n">data</span><span class="o">=./</span><span class="n">example</span><span class="o">/</span><span class="n">test</span><span class="o">.</span><span class="n">list</span> \
<span class="o">--</span><span class="n">output_layer</span><span class="o">=</span><span class="s2">&quot;res5_3_branch2c_conv,res5_3_branch2c_bn&quot;</span> \
<span class="o">--</span><span class="n">output_dir</span><span class="o">=</span><span class="n">features</span>
</pre></div>
</div>
<ul class="simple">
<li>--job=extract: specify the job mode as feature extraction.</li>
<li>--conf=resnet.py: network configuration file.</li>
<li>--use_gpu=1: specify GPU mode.</li>
<li>--model=model/resnet_50: model path.</li>
<li>--data=./example/test.list: data list.</li>
<li>--output_layer=&#8221;xxx,xxx&#8221;: specify layers to extract features from.</li>
<li>--output_dir=features: output directory.</li>
</ul>
<p>If it runs successfully, the features are saved in <code class="docutils literal"><span class="pre">features/batch_0</span></code>; this file is produced with cPickle. You can use the <code class="docutils literal"><span class="pre">load_feature_py</span></code> interface in <code class="docutils literal"><span class="pre">load_feature.py</span></code> to open the file, and it returns a dictionary as follows:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="p">{</span>
<span class="s1">&#39;cat.jpg&#39;</span><span class="p">:</span> <span class="p">{</span><span class="s1">&#39;res5_3_branch2c_conv&#39;</span><span class="p">:</span> <span class="n">array</span><span class="p">([[</span><span class="o">-</span><span class="mf">0.12638293</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.116248</span> <span class="p">,</span> <span class="o">-</span><span class="mf">0.11883899</span><span class="p">,</span> <span class="o">...</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.00895038</span><span class="p">,</span> <span class="mf">0.01994277</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.00534909</span><span class="p">]],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">float32</span><span class="p">),</span> <span class="s1">&#39;res5_3_branch2c_bn&#39;</span><span class="p">:</span> <span class="n">array</span><span class="p">([[</span><span class="o">-</span><span class="mf">1.42593431</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.28918779</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.32414699</span><span class="p">,</span> <span class="o">...</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.45933616</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.04501402</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.40769434</span><span class="p">]],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">float32</span><span class="p">)},</span>
<span class="s1">&#39;dog.jpg&#39;</span><span class="p">:</span> <span class="p">{</span><span class="s1">&#39;res5_3_branch2c_conv&#39;</span><span class="p">:</span> <span class="n">array</span><span class="p">([[</span><span class="o">-</span><span class="mf">0.11531784</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.10835785</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.08809858</span><span class="p">,</span> <span class="o">...</span><span class="p">,</span><span class="mf">0.0055237</span><span class="p">,</span> <span class="mf">0.01505112</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.08788397</span><span class="p">]],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">float32</span><span class="p">),</span> <span class="s1">&#39;res5_3_branch2c_bn&#39;</span><span class="p">:</span> <span class="n">array</span><span class="p">([[</span><span class="o">-</span><span class="mf">1.27663755</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.18272924</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.90937918</span><span class="p">,</span> <span class="o">...</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.25178063</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.11515927</span><span class="p">,</span> <span class="o">-</span><span class="mf">2.59122872</span><span class="p">]],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">float32</span><span class="p">)}</span>
<span class="p">}</span>
</pre></div>
</div>
<p>If you look carefully, these feature values are consistent with the results extracted by the C++ interface above.</p>
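<p>Since the file is produced with cPickle, it can also be opened directly, assuming it stores the same dictionary that <code class="docutils literal"><span class="pre">load_feature_py</span></code> returns (an assumption; prefer <code class="docutils literal"><span class="pre">load_feature_py</span></code> in <code class="docutils literal"><span class="pre">load_feature.py</span></code>):</p>
<div class="highlight-python"><div class="highlight"><pre>
# Sketch: load the pickled feature file directly (assumes it stores the dictionary
# shown above; prefer load_feature_py in load_feature.py).
# The demo era used Python 2 / cPickle; on Python 3, pickle.load may additionally
# need encoding='latin1' to read Python-2 numpy pickles.
import pickle

with open('features/batch_0', 'rb') as f:
    features = pickle.load(f)

# features maps an image file name to {layer name: numpy array}.
for name, layer_feats in features.items():
    print(name, layer_feats['res5_3_branch2c_conv'].shape)
</pre></div>
</div>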
</div>
</div>
<div class="section" id="prediction">
<span id="prediction"></span><h2>Prediction<a class="headerlink" href="#prediction" title="Permalink to this headline"></a></h2>
<p><code class="docutils literal"><span class="pre">classify.py</span></code> can also be used for prediction. We provide an example script <code class="docutils literal"><span class="pre">predict.sh</span></code> to predict the data in <code class="docutils literal"><span class="pre">example/test.list</span></code> using a 50-layer ResNet model.</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">cd</span> <span class="n">demo</span><span class="o">/</span><span class="n">model_zoo</span><span class="o">/</span><span class="n">resnet</span>
<span class="o">./</span><span class="n">predict</span><span class="o">.</span><span class="n">sh</span>
</pre></div>
</div>
<p>predict.sh calls <code class="docutils literal"><span class="pre">classify.py</span></code> as follows:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">python</span> <span class="n">classify</span><span class="o">.</span><span class="n">py</span> \
<span class="o">--</span><span class="n">job</span><span class="o">=</span><span class="n">predict</span> \
<span class="o">--</span><span class="n">conf</span><span class="o">=</span><span class="n">resnet</span><span class="o">.</span><span class="n">py</span>\
<span class="o">--</span><span class="n">multi_crop</span> \
<span class="o">--</span><span class="n">model</span><span class="o">=</span><span class="n">model</span><span class="o">/</span><span class="n">resnet_50</span> \
<span class="o">--</span><span class="n">use_gpu</span><span class="o">=</span><span class="mi">1</span> \
<span class="o">--</span><span class="n">data</span><span class="o">=./</span><span class="n">example</span><span class="o">/</span><span class="n">test</span><span class="o">.</span><span class="n">list</span>
</pre></div>
</div>
<ul class="simple">
<li>--job=predict: specify the job mode as prediction.</li>
<li>--conf=resnet.py: network configuration file.</li>
<li>--multi_crop: use 10 crops and average the predicted probabilities (see the sketch after this list).</li>
<li>--use_gpu=1: specify GPU mode.</li>
<li>--model=model/resnet_50: model path.</li>
<li>--data=./example/test.list: data list.</li>
</ul>
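<p>Conceptually, the multi-crop option averages the class probabilities over the 10 crops before taking the argmax. The following sketch only illustrates that averaging step and is not the demo&#8217;s actual implementation; the shapes and names are assumptions.</p>
<div class="highlight-python"><div class="highlight"><pre>
import numpy as np

# Sketch: average class probabilities over 10 crops, then take the argmax.
# crop_probs is assumed to have shape (10, num_classes), one softmax output per crop.
def average_crops(crop_probs):
    mean_probs = crop_probs.mean(axis=0)
    return int(np.argmax(mean_probs)), mean_probs

# Illustrative usage with random probabilities for 1000 ImageNet classes.
crop_probs = np.random.rand(10, 1000).astype(np.float32)
crop_probs /= crop_probs.sum(axis=1, keepdims=True)
label, probs = average_crops(crop_probs)
</pre></div>
</div>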
<p>If it runs successfully, you will see the following results, where 156 and 282 are the labels of the images.</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">Label</span> <span class="n">of</span> <span class="n">example</span><span class="o">/</span><span class="n">dog</span><span class="o">.</span><span class="n">jpg</span> <span class="ow">is</span><span class="p">:</span> <span class="mi">156</span>
<span class="n">Label</span> <span class="n">of</span> <span class="n">example</span><span class="o">/</span><span class="n">cat</span><span class="o">.</span><span class="n">jpg</span> <span class="ow">is</span><span class="p">:</span> <span class="mi">282</span>
</pre></div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</section>
</div>
</body>
</html>
<!DOCTYPE html>
<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Quick Start &mdash; PaddlePaddle documentation</title>
<link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" />
<link rel="index" title="Index"
href="../../genindex.html"/>
<link rel="search" title="Search" href="../../search.html"/>
<link rel="top" title="PaddlePaddle documentation" href="../../index.html"/>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/css/perfect-scrollbar.min.css" type="text/css" />
<link rel="stylesheet" href="../../_static/css/override.css" type="text/css" />
<script src="../../_static/js/modernizr.min.js"></script>
</head>
<body class="wy-body-for-nav" role="document">
<div class="main-content-wrap">
<section class="doc-content-wrap">
<div class="wy-nav-content" id="doc-content">
<div class="rst-content">
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<div class="section" id="quick-start">
<span id="quick-start"></span><h1>Quick Start<a class="headerlink" href="#quick-start" title="Permalink to this headline"></a></h1>
<p>This tutorial will teach the basics of deep learning (DL), including how to implement many different models in PaddlePaddle. You will learn how to:</p>
<ul class="simple">
<li>Prepare data into the standardized format that PaddlePaddle accepts.</li>
<li>Write data providers that read data into PaddlePaddle.</li>
<li>Configure neural networks in PaddlePaddle layer by layer.</li>
<li>Train models.</li>
<li>Perform inference with trained models.</li>
</ul>
<div class="section" id="install">
<span id="install"></span><h2>Install<a class="headerlink" href="#install" title="Permalink to this headline"></a></h2>
<p>To get started, please install PaddlePaddle on your computer. Throughout this tutorial, you will learn by implementing different DL models for text classification.</p>
<p>To install PaddlePaddle, please follow the instructions here: <a href = "../../getstarted/build_and_install/index_en.html" >Build and Install</a>.</p>
</div>
<div class="section" id="overview">
<span id="overview"></span><h2>Overview<a class="headerlink" href="#overview" title="Permalink to this headline"></a></h2>
<p>For the first step, you will use PaddlePaddle to build a <strong>text classification</strong> system. For example, suppose you run an e-commerce website, and you want to analyze the sentiment of user reviews to evaluate product quality.</p>
<p>For example, given the input</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">This</span> <span class="n">monitor</span> <span class="ow">is</span> <span class="n">fantastic</span><span class="o">.</span>
</pre></div>
</div>
<p>Your classifier should output “positive”, since this text snippet shows that the user is satisfied with the product. Given this input:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">The</span> <span class="n">monitor</span> <span class="n">breaks</span> <span class="n">down</span> <span class="n">two</span> <span class="n">months</span> <span class="n">after</span> <span class="n">purchase</span><span class="o">.</span>
</pre></div>
</div>
<p>the classifier should output “negative”.</p>
<p>To build your text classification system, your code will need to perform five steps:
<center> <img alt="" src="../../_images/Pipeline_en.jpg" /> </center></p>
<ul class="simple">
<li>Preprocess data into a standardized format.</li>
<li>Provide data to the learning model.</li>
<li>Specify the neural network structure.</li>
<li>Train the model.</li>
<li>Inference (make prediction on test examples).</li>
</ul>
<ol class="simple">
<li>Preprocess data into standardized format<ul>
<li>In the text classification example, you start with a text file containing one training example per line. Each line contains the category id (in machine learning, often denoted as the target y), followed by the input text (often denoted as x); the two elements are separated by a Tab. For example: <code class="docutils literal"><span class="pre">positive</span> <span class="pre">[tab]</span> <span class="pre">This</span> <span class="pre">monitor</span> <span class="pre">is</span> <span class="pre">fantastic</span></code>. You will preprocess this raw data into a format that Paddle can use (a small parsing sketch follows this list).</li>
</ul>
</li>
<li>Provide data to the learning model.<ul>
<li>You can write data providers in Python. For any required data preprocessing step, you can add the preprocessing code to the PyDataProvider Python file.</li>
<li>In our text classification example, every word or character is converted into an integer id, as specified in a dictionary file. The PyDataProvider performs a dictionary lookup to get the id.</li>
</ul>
</li>
<li>Specify neural network structure. (From easy to hard, we provide 4 kinds of network configurations)<ul>
<li>A logistic regression model.</li>
<li>A word embedding model.</li>
<li>A convolutional neural network model.</li>
<li>A sequential recurrent neural network model.</li>
<li>You will also learn different learning algorithms.</li>
</ul>
</li>
<li>Train the model.</li>
<li>Inference.</li>
</ol>
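<p>As a tiny illustration of the Tab-separated layout mentioned in step 1 (a sketch only; the real preprocessing is done by the scripts under <code class="docutils literal"><span class="pre">demo/quick_start</span></code>):</p>
<div class="highlight-python"><div class="highlight"><pre>
# Sketch: split one raw training line of the form "category[TAB]text".
line = "positive\tThis monitor is fantastic."
label, text = line.split("\t", 1)
print(label)  # positive
print(text)   # This monitor is fantastic.
</pre></div>
</div>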
</div>
<div class="section" id="preprocess-data-into-standardized-format">
<span id="preprocess-data-into-standardized-format"></span><h2>Preprocess data into standardized format<a class="headerlink" href="#preprocess-data-into-standardized-format" title="Permalink to this headline"></a></h2>
<p>In this example, you are going to use <a class="reference external" href="http://jmcauley.ucsd.edu/data/amazon/">Amazon electronic product review dataset</a> to build a bunch of deep neural network models for text classification. Each text in this dataset is a product review. This dataset has two categories: “positive” and “negative”. Positive means the reviewer likes the product, while negative means the reviewer does not like the product.</p>
<p><code class="docutils literal"><span class="pre">demo/quick_start</span></code> in the <a class="reference external" href="https://github.com/PaddlePaddle/Paddle">source code</a> provides a script for downloading the preprocessed data, as shown below. (If you want to process the raw data, you can use the script <code class="docutils literal"><span class="pre">demo/quick_start/data/proc_from_raw_data/get_data.sh</span></code>.)</p>
<div class="highlight-bash"><div class="highlight"><pre><span></span><span class="nb">cd</span> demo/quick_start
./data/get_data.sh
</pre></div>
</div>
</div>
<div class="section" id="transfer-data-to-model">
<span id="transfer-data-to-model"></span><h2>Transfer Data to Model<a class="headerlink" href="#transfer-data-to-model" title="Permalink to this headline"></a></h2>
<div class="section" id="write-data-provider-with-python">
<span id="write-data-provider-with-python"></span><h3>Write Data Provider with Python<a class="headerlink" href="#write-data-provider-with-python" title="Permalink to this headline"></a></h3>
<p>The following <code class="docutils literal"><span class="pre">dataprovider_bow.py</span></code> gives a complete example of writing a data provider with Python. It includes the following parts:</p>
<ul class="simple">
<li>initializer: defines the additional meta-data of the data provider and the types of the input data.</li>
<li>process: Each <code class="docutils literal"><span class="pre">yield</span></code> returns a data sample. In this case, it returns the text representation and the category id. The order of features in the returned result needs to be consistent with the definition of the input types in <code class="docutils literal"><span class="pre">initializer</span></code>.</li>
</ul>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">paddle.trainer.PyDataProvider2</span> <span class="kn">import</span> <span class="o">*</span>
<span class="c1"># id of the word not in dictionary</span>
<span class="n">UNK_IDX</span> <span class="o">=</span> <span class="mi">0</span>
<span class="c1"># initializer is called by the framework during initialization.</span>
<span class="c1"># It allows the user to describe the data types and setup the</span>
<span class="c1"># necessary data structure for later use.</span>
<span class="c1"># `settings` is an object. initializer need to properly fill settings.input_types.</span>
<span class="c1"># initializer can also store other data structures needed to be used at process().</span>
<span class="c1"># In this example, dictionary is stored in settings.</span>
<span class="c1"># `dictionary` and `kwargs` are arguments passed from trainer_config.lr.py</span>
<span class="k">def</span> <span class="nf">initializer</span><span class="p">(</span><span class="n">settings</span><span class="p">,</span> <span class="n">dictionary</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span>
<span class="c1"># Put the word dictionary into settings</span>
<span class="n">settings</span><span class="o">.</span><span class="n">word_dict</span> <span class="o">=</span> <span class="n">dictionary</span>
<span class="c1"># setting.input_types specifies what the data types the data provider</span>
<span class="c1"># generates.</span>
<span class="n">settings</span><span class="o">.</span><span class="n">input_types</span> <span class="o">=</span> <span class="p">[</span>
<span class="c1"># The first input is a sparse_binary_vector,</span>
<span class="c1"># which means each dimension of the vector is either 0 or 1. It is the</span>
<span class="c1"># bag-of-words (BOW) representation of the texts.</span>
<span class="n">sparse_binary_vector</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">dictionary</span><span class="p">)),</span>
<span class="c1"># The second input is an integer. It represents the category id of the</span>
<span class="c1"># sample. 2 means there are two labels in the dataset.</span>
<span class="c1"># (1 for positive and 0 for negative)</span>
<span class="n">integer_value</span><span class="p">(</span><span class="mi">2</span><span class="p">)]</span>
<span class="c1"># Declaring a data provider. It uses the initializer &#39;initializer&#39; defined above.</span>
<span class="c1"># It will cache the generated data of the first pass in memory, so that</span>
<span class="c1"># during later pass, no on-the-fly data generation will be needed.</span>
<span class="c1"># `setting` is the same object used by initializer()</span>
<span class="c1"># `file_name` is the name of a file listed in the train_list or test_list file given</span>
<span class="c1"># to define_py_data_sources2(). See trainer_config.lr.py.</span>
<span class="nd">@provider</span><span class="p">(</span><span class="n">init_hook</span><span class="o">=</span><span class="n">initializer</span><span class="p">,</span> <span class="n">cache</span><span class="o">=</span><span class="n">CacheType</span><span class="o">.</span><span class="n">CACHE_PASS_IN_MEM</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">process</span><span class="p">(</span><span class="n">settings</span><span class="p">,</span> <span class="n">file_name</span><span class="p">):</span>
<span class="c1"># Open the input data file.</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">file_name</span><span class="p">,</span> <span class="s1">&#39;r&#39;</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
<span class="c1"># Read each line.</span>
<span class="k">for</span> <span class="n">line</span> <span class="ow">in</span> <span class="n">f</span><span class="p">:</span>
<span class="c1"># Each line contains the label and text of the comment, separated by \t.</span>
<span class="n">label</span><span class="p">,</span> <span class="n">comment</span> <span class="o">=</span> <span class="n">line</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s1">&#39;</span><span class="se">\t</span><span class="s1">&#39;</span><span class="p">)</span>
<span class="c1"># Split the words into a list.</span>
<span class="n">words</span> <span class="o">=</span> <span class="n">comment</span><span class="o">.</span><span class="n">split</span><span class="p">()</span>
<span class="c1"># convert the words into a list of ids by looking them up in word_dict.</span>
<span class="n">word_vector</span> <span class="o">=</span> <span class="p">[</span><span class="n">settings</span><span class="o">.</span><span class="n">word_dict</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">w</span><span class="p">,</span> <span class="n">UNK_IDX</span><span class="p">)</span> <span class="k">for</span> <span class="n">w</span> <span class="ow">in</span> <span class="n">words</span><span class="p">]</span>
<span class="c1"># Return the features for the current comment. The first is a list</span>
<span class="c1"># of ids representing a 0-1 binary sparse vector of the text,</span>
<span class="c1"># the second is the integer id of the label.</span>
<span class="k">yield</span> <span class="n">word_vector</span><span class="p">,</span> <span class="nb">int</span><span class="p">(</span><span class="n">label</span><span class="p">)</span>
</pre></div>
</div>
</div>
<div class="section" id="define-python-data-provider-in-configuration-files">
<span id="define-python-data-provider-in-configuration-files"></span><h3>Define Python Data Provider in Configuration files.<a class="headerlink" href="#define-python-data-provider-in-configuration-files" title="Permalink to this headline"></a></h3>
<p>You need to add a data provider definition <code class="docutils literal"><span class="pre">define_py_data_sources2</span></code> in our network configuration. This definition specifies:</p>
<ul class="simple">
<li>The path of the training and testing data (<code class="docutils literal"><span class="pre">data/train.list</span></code>, <code class="docutils literal"><span class="pre">data/test.list</span></code>).</li>
<li>The location of the data provider file (<code class="docutils literal"><span class="pre">dataprovider_bow</span></code>).</li>
<li>The function to call to get data. (<code class="docutils literal"><span class="pre">process</span></code>).</li>
<li>Additional arguments or data. Here it passes the path of word dictionary.</li>
</ul>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">paddle.trainer_config_helpers</span> <span class="kn">import</span> <span class="o">*</span>
<span class="n">dict_file</span> <span class="o">=</span> <span class="s2">&quot;data/dict.txt&quot;</span>
<span class="n">word_dict</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">()</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">dict_file</span><span class="p">,</span> <span class="s1">&#39;r&#39;</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
<span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">line</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">f</span><span class="p">):</span>
<span class="n">w</span> <span class="o">=</span> <span class="n">line</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span><span class="o">.</span><span class="n">split</span><span class="p">()[</span><span class="mi">0</span><span class="p">]</span>
<span class="n">word_dict</span><span class="p">[</span><span class="n">w</span><span class="p">]</span> <span class="o">=</span> <span class="n">i</span>
<span class="c1"># define the data sources for the model.</span>
<span class="c1"># We need to use different process for training and prediction.</span>
<span class="c1"># For training, the input data includes both word IDs and labels.</span>
<span class="c1"># For prediction, the input data only includs word Ids.</span>
<span class="n">define_py_data_sources2</span><span class="p">(</span><span class="n">train_list</span><span class="o">=</span><span class="s1">&#39;data/train.list&#39;</span><span class="p">,</span>
<span class="n">test_list</span><span class="o">=</span><span class="s1">&#39;data/test.list&#39;</span><span class="p">,</span>
<span class="n">module</span><span class="o">=</span><span class="s2">&quot;dataprovider_bow&quot;</span><span class="p">,</span>
<span class="n">obj</span><span class="o">=</span><span class="s2">&quot;process&quot;</span><span class="p">,</span>
<span class="n">args</span><span class="o">=</span><span class="p">{</span><span class="s2">&quot;dictionary&quot;</span><span class="p">:</span> <span class="n">word_dict</span><span class="p">})</span>
</pre></div>
</div>
<p>You can refer to the following link for more detailed examples and data formats: <a href = "../../api/v1/data_provider/pydataprovider2_en.html">PyDataProvider2</a>.</p>
</div>
</div>
<div class="section" id="network-architecture">
<span id="network-architecture"></span><h2>Network Architecture<a class="headerlink" href="#network-architecture" title="Permalink to this headline"></a></h2>
<p>We will describe four kinds of network architectures in this section.
<center> <img alt="" src="../../_images/PipelineNetwork_en.jpg" /> </center></p>
<p>First, you will build a logistic regression model. Later, you will also get the chance to build other, more powerful network architectures.
For more detailed documentation, you can refer to the <a href = "../../api/v1/trainer_config_helpers/layers.html">layer documentation</a>. All configuration files are in the <code class="docutils literal"><span class="pre">demo/quick_start</span></code> directory.</p>
<div class="section" id="logistic-regression">
<span id="logistic-regression"></span><h3>Logistic Regression<a class="headerlink" href="#logistic-regression" title="Permalink to this headline"></a></h3>
<p>The architecture is illustrated in the following picture:
<center> <img alt="" src="../../_images/NetLR_en.png" /> </center></p>
<ul class="simple">
<li>You need to define the data layer for the text features. The size of the data layer is the number of words in the dictionary.</li>
</ul>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">word</span> <span class="o">=</span> <span class="n">data_layer</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s2">&quot;word&quot;</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">voc_dim</span><span class="p">)</span>
</pre></div>
</div>
<ul class="simple">
<li>You also need to define the category id for each example. The size of the data layer is the number of labels.</li>
</ul>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">label</span> <span class="o">=</span> <span class="n">data_layer</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s2">&quot;label&quot;</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">label_dim</span><span class="p">)</span>
</pre></div>
</div>
<ul class="simple">
<li>It uses a logistic regression model to classify the vector, and outputs the classification error during training.<ul>
<li>Each layer has an <em>input</em> argument that specifies its input layer. Some layers can have multiple input layers. You can use a list of the input layers as input in that case.</li>
<li><em>size</em> for each layer means the number of neurons of the layer.</li>
<li><em>act_type</em> means activation function applied to the output of each neuron independently.</li>
<li>Some layers can have additional special inputs. For example, <code class="docutils literal"><span class="pre">classification_cost</span></code> needs ground truth label as input to compute classification loss and error.</li>
</ul>
</li>
</ul>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># Define a fully connected layer with logistic activation (also called softmax activation).</span>
<span class="n">output</span> <span class="o">=</span> <span class="n">fc_layer</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">word</span><span class="p">,</span>
<span class="n">size</span><span class="o">=</span><span class="n">label_dim</span><span class="p">,</span>
<span class="n">act_type</span><span class="o">=</span><span class="n">SoftmaxActivation</span><span class="p">())</span>
<span class="c1"># Define cross-entropy classification loss and error.</span>
<span class="n">classification_cost</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">output</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="n">label</span><span class="p">)</span>
</pre></div>
</div>
<p>Performance summary: You can refer to the training and testing scripts later. In order to compare different network architectures, the model complexity and test classification error are listed in the following table:</p>
<p><html>
<center></p>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border"><thead>
<th scope="col" class="left">Network name</th>
<th scope="col" class="left">Number of parameters</th>
<th scope="col" class="left">Test error</th>
</tr>
</thead><tbody>
<tr>
<td class="left">Logistic regression</td>
<td class="left">252 KB</td>
<td class="left">8.652%</td>
</tr></tbody>
</table></center>
</html>
<br></div>
<div class="section" id="word-embedding-model">
<span id="word-embedding-model"></span><h3>Word Embedding Model<a class="headerlink" href="#word-embedding-model" title="Permalink to this headline"></a></h3>
<p>In order to use the word embedding model, you need to change the data provider slightly so that the input words are represented as a sequence of word IDs. The revised data provider <code class="docutils literal"><span class="pre">dataprovider_emb.py</span></code> is listed below. You only need to change the type of the first input in initializer(): it changes from sparse_binary_vector to a sequence of integers. process() remains the same. This data provider can also be used for the later sequence models.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">initializer</span><span class="p">(</span><span class="n">settings</span><span class="p">,</span> <span class="n">dictionary</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span>
<span class="c1"># Put the word dictionary into settings</span>
<span class="n">settings</span><span class="o">.</span><span class="n">word_dict</span> <span class="o">=</span> <span class="n">dictionary</span>
<span class="n">settings</span><span class="o">.</span><span class="n">input_types</span> <span class="o">=</span> <span class="p">[</span>
<span class="c1"># Define the type of the first input as a sequence of integers.</span>
<span class="n">integer_value_sequence</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">dictionary</span><span class="p">)),</span>
<span class="c1"># Define the second input for label id</span>
<span class="n">integer_value</span><span class="p">(</span><span class="mi">2</span><span class="p">)]</span>
<span class="nd">@provider</span><span class="p">(</span><span class="n">init_hook</span><span class="o">=</span><span class="n">initializer</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">process</span><span class="p">(</span><span class="n">settings</span><span class="p">,</span> <span class="n">file_name</span><span class="p">):</span>
<span class="o">...</span>
<span class="c1"># omitted, it is same as the data provider for LR model</span>
</pre></div>
</div>
<p>This model is very similar to the framework of logistic regression, but it uses word embedding vectors instead of sparse vectors to represent words.
<center> <img alt="" src="../../_images/NetContinuous_en.png" /> </center></p>
<ul class="simple">
<li>It looks up the dense word embedding vector for each word in the dictionary (the dimension of each word&#8217;s embedding vector is <code class="docutils literal"><span class="pre">word_dim</span></code>). The input is a sequence of N words; the output is N word_dim-dimensional vectors.</li>
</ul>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">emb</span> <span class="o">=</span> <span class="n">embedding_layer</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">word</span><span class="p">,</span> <span class="n">dim</span><span class="o">=</span><span class="n">word_dim</span><span class="p">)</span>
</pre></div>
</div>
<ul class="simple">
<li>It averages all the word embeddings in a sentence to get its sentence representation.</li>
</ul>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">avg</span> <span class="o">=</span> <span class="n">pooling_layer</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">emb</span><span class="p">,</span> <span class="n">pooling_type</span><span class="o">=</span><span class="n">AvgPooling</span><span class="p">())</span>
</pre></div>
</div>
<p>The other parts of the model are the same as logistic regression network.</p>
<p>The performance is summarized in the following table:</p>
<p><html>
<center></p>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border"><thead>
<th scope="col" class="left">Network name</th>
<th scope="col" class="left">Number of parameters</th>
<th scope="col" class="left">Test error</th>
</tr>
</thead><tbody>
<tr>
<td class="left">Word embedding model</td>
<td class="left">15 MB</td>
<td class="left">8.484%</td>
</tr></tbody>
</table>
</html></center>
<br></div>
<div class="section" id="convolutional-neural-network-model">
<span id="convolutional-neural-network-model"></span><h3>Convolutional Neural Network Model<a class="headerlink" href="#convolutional-neural-network-model" title="Permalink to this headline"></a></h3>
<p>A convolutional neural network converts a sequence of word embeddings into a sentence representation using temporal convolutions. You will transform the fully connected part of the word embedding model into 3 new sub-steps.
<center> <img alt="" src="../../_images/NetConv_en.png" /> </center></p>
<p>Text convolution has 3 steps:</p>
<ol class="simple">
<li>Get the k nearest neighboring words of each word in the sentence as its context, and stack them into a 2D vector representation.</li>
<li>Apply temporal convolution to this representation to produce a new hidden_dim dimensional vector.</li>
<li>Apply max-pooling to the new vectors at all the time steps in a sentence to get a sentence representation.</li>
</ol>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># context_len means convolution kernel size.</span>
<span class="c1"># context_start means the start of the convolution. It can be negative. In that case, zero padding is applied.</span>
<span class="n">text_conv</span> <span class="o">=</span> <span class="n">sequence_conv_pool</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">emb</span><span class="p">,</span>
<span class="n">context_start</span><span class="o">=</span><span class="n">k</span><span class="p">,</span>
<span class="n">context_len</span><span class="o">=</span><span class="mi">2</span> <span class="o">*</span> <span class="n">k</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>
</pre></div>
</div>
<p>The performance is summarized in the following table:</p>
<p><html>
<center></p>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border"><thead>
<th scope="col" class="left">Network name</th>
<th scope="col" class="left">Number of parameters</th>
<th scope="col" class="left">Test error</th>
</tr>
</thead><tbody>
<tr>
<td class="left">Convolutional model</td>
<td class="left">16 MB</td>
<td class="left">5.628%</td>
</tr></tbody>
</table></center>
<br></div>
<div class="section" id="recurrent-model">
<span id="recurrent-model"></span><h3>Recurrent Model<a class="headerlink" href="#recurrent-model" title="Permalink to this headline"></a></h3>
<p><center> <img alt="" src="../../_images/NetRNN_en.png" /> </center></p>
<p>You can use a recurrent neural network as the time sequence model, including the simple RNN model, the GRU model, and the LSTM model.</p>
<ul class="simple">
<li>GRU model can be specified via:</li>
</ul>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">gru</span> <span class="o">=</span> <span class="n">simple_gru</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">emb</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">gru_size</span><span class="p">)</span>
</pre></div>
</div>
<ul class="simple">
<li>LSTM model can be specified via:</li>
</ul>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">lstm</span> <span class="o">=</span> <span class="n">simple_lstm</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">emb</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">lstm_size</span><span class="p">)</span>
</pre></div>
</div>
<p>You can use a single-layer LSTM model with dropout for this text classification problem. The performance is summarized in the following table:</p>
<p><html>
<center></p>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border"><thead>
<th scope="col" class="left">Network name</th>
<th scope="col" class="left">Number of parameters</th>
<th scope="col" class="left">Test error</th>
</tr>
</thead><tbody>
<tr>
<td class="left">Recurrent model</td>
<td class="left">16 MB</td>
<td class="left">4.812%</td>
</tr></tbody>
</table></center>
</html>
<br></div>
</div>
<div class="section" id="optimization-algorithm">
<span id="optimization-algorithm"></span><h2>Optimization Algorithm<a class="headerlink" href="#optimization-algorithm" title="Permalink to this headline"></a></h2>
<p><a href = "../../api/v1/trainer_config_helpers/optimizers.html">Optimization algorithms</a> include Momentum, RMSProp, AdaDelta, AdaGrad, Adam, and Adamax. You can use Adam optimization method here, with L2 regularization and gradient clipping, because Adam has been proved to work very well for training recurrent neural network.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">settings</span><span class="p">(</span><span class="n">batch_size</span><span class="o">=</span><span class="mi">128</span><span class="p">,</span>
<span class="n">learning_rate</span><span class="o">=</span><span class="mf">2e-3</span><span class="p">,</span>
<span class="n">learning_method</span><span class="o">=</span><span class="n">AdamOptimizer</span><span class="p">(),</span>
<span class="n">regularization</span><span class="o">=</span><span class="n">L2Regularization</span><span class="p">(</span><span class="mf">8e-4</span><span class="p">),</span>
<span class="n">gradient_clipping_threshold</span><span class="o">=</span><span class="mi">25</span><span class="p">)</span>
</pre></div>
</div>
</div>
<div class="section" id="training-model">
<span id="training-model"></span><h2>Training Model<a class="headerlink" href="#training-model" title="Permalink to this headline"></a></h2>
<p>After completing data preparation and network architecture specification, you will run the training script.
<center> <img alt="" src="../../_images/PipelineTrain_en.png" /> </center></p>
<p>Training script: our training script is the <code class="docutils literal"><span class="pre">train.sh</span></code> file. The training arguments are listed below:</p>
<div class="highlight-bash"><div class="highlight"><pre><span></span>paddle train <span class="se">\</span>
--config<span class="o">=</span>trainer_config.py <span class="se">\</span>
--log_period<span class="o">=</span><span class="m">20</span> <span class="se">\</span>
--save_dir<span class="o">=</span>./output <span class="se">\</span>
--num_passes<span class="o">=</span><span class="m">15</span> <span class="se">\</span>
--use_gpu<span class="o">=</span><span class="nb">false</span>
</pre></div>
</div>
<p>We do not provide examples on how to train on clusters here. If you want to train on clusters, please follow the <a href = "../../howto/usage/cluster/cluster_train_en.html">distributed training</a> documentation or other demos for more details.</p>
</div>
<div class="section" id="inference">
<span id="inference"></span><h2>Inference<a class="headerlink" href="#inference" title="Permalink to this headline"></a></h2>
<p>You can use the trained model to perform prediction on a dataset with no labels. You can also evaluate the model on a dataset with labels to obtain its test accuracy.
<center> <img alt="" src="../../_images/PipelineTest_en.png" /> </center></p>
<p>The test script is listed below. PaddlePaddle can evaluate a model on the data with labels specified in <code class="docutils literal"><span class="pre">test.list</span></code>.</p>
<div class="highlight-bash"><div class="highlight"><pre><span></span>paddle train <span class="se">\</span>
--config<span class="o">=</span>trainer_config.lstm.py <span class="se">\</span>
--use_gpu<span class="o">=</span><span class="nb">false</span> <span class="se">\</span>
--job<span class="o">=</span><span class="nb">test</span> <span class="se">\</span>
--init_model_path<span class="o">=</span>./output/pass-0000x
</pre></div>
</div>
<p>We will give an example of performing prediction using the Recurrent model on a dataset with no labels. You can refer to the <a href = "../../api/v1/predict/swig_py_paddle_en.html">Python Prediction API</a> tutorial, or other <a href = "../../tutorials/index_en.html">demos</a>, for the prediction process using Python. You can also use the following script for inference or evaluation.</p>
<p>inference script (predict.sh):</p>
<div class="highlight-bash"><div class="highlight"><pre><span></span><span class="nv">model</span><span class="o">=</span><span class="s2">&quot;output/pass-00003&quot;</span>
paddle train <span class="se">\</span>
--config<span class="o">=</span>trainer_config.lstm.py <span class="se">\</span>
--use_gpu<span class="o">=</span><span class="nb">false</span> <span class="se">\</span>
--job<span class="o">=</span><span class="nb">test</span> <span class="se">\</span>
--init_model_path<span class="o">=</span><span class="nv">$model</span> <span class="se">\</span>
--config_args<span class="o">=</span><span class="nv">is_predict</span><span class="o">=</span><span class="m">1</span> <span class="se">\</span>
--predict_output_dir<span class="o">=</span>.
mv rank-00000 result.txt
</pre></div>
</div>
<p>Users can choose the best model based on the training log instead of the model <code class="docutils literal"><span class="pre">output/pass-00003</span></code>. There are several differences between the training and inference network configurations.</p>
<ul class="simple">
<li>You do not need labels during inference.</li>
<li>Outputs need to be set to the classification probability layer (the output of the softmax layer) and/or the id of the maximum probability (the <code class="docutils literal"><span class="pre">max_id</span></code> layer). An example that outputs both the id and the probabilities is given in the code snippet below.</li>
<li>batch_size = 1.</li>
<li>You need to specify the location of <code class="docutils literal"><span class="pre">test_list</span></code> in the test data.</li>
</ul>
<p>The results in <code class="docutils literal"><span class="pre">result.txt</span></code> are as follows; each line is one sample.</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">predicted_label_id</span><span class="p">;</span><span class="n">probability_of_label_0</span> <span class="n">probability_of_label_1</span> <span class="c1"># the first sample</span>
<span class="n">predicted_label_id</span><span class="p">;</span><span class="n">probability_of_label_0</span> <span class="n">probability_of_label_1</span> <span class="c1"># the second sample</span>
</pre></div>
</div>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">is_predict</span> <span class="o">=</span> <span class="n">get_config_arg</span><span class="p">(</span><span class="s1">&#39;is_predict&#39;</span><span class="p">,</span> <span class="nb">bool</span><span class="p">,</span> <span class="bp">False</span><span class="p">)</span>
<span class="n">trn</span> <span class="o">=</span> <span class="s1">&#39;data/train.list&#39;</span> <span class="k">if</span> <span class="ow">not</span> <span class="n">is_predict</span> <span class="k">else</span> <span class="bp">None</span>
<span class="n">tst</span> <span class="o">=</span> <span class="s1">&#39;data/test.list&#39;</span> <span class="k">if</span> <span class="ow">not</span> <span class="n">is_predict</span> <span class="k">else</span> <span class="s1">&#39;data/pred.list&#39;</span>
<span class="n">obj</span> <span class="o">=</span> <span class="s1">&#39;process&#39;</span> <span class="k">if</span> <span class="ow">not</span> <span class="n">is_predict</span> <span class="k">else</span> <span class="s1">&#39;process_pre&#39;</span>
<span class="n">batch_size</span> <span class="o">=</span> <span class="mi">128</span> <span class="k">if</span> <span class="ow">not</span> <span class="n">is_predict</span> <span class="k">else</span> <span class="mi">1</span>
<span class="k">if</span> <span class="n">is_predict</span><span class="p">:</span>
<span class="n">maxid</span> <span class="o">=</span> <span class="n">maxid_layer</span><span class="p">(</span><span class="n">output</span><span class="p">)</span>
<span class="n">outputs</span><span class="p">([</span><span class="n">maxid</span><span class="p">,</span><span class="n">output</span><span class="p">])</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">label</span> <span class="o">=</span> <span class="n">data_layer</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s2">&quot;label&quot;</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
<span class="bp">cls</span> <span class="o">=</span> <span class="n">classification_cost</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">output</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="n">label</span><span class="p">)</span> <span class="n">outputs</span><span class="p">(</span><span class="bp">cls</span><span class="p">)</span>
</pre></div>
</div>
</div>
<div class="section" id="summary">
<span id="summary"></span><h2>Summary<a class="headerlink" href="#summary" title="Permalink to this headline"></a></h2>
<p>The data download scripts, network configurations, and training scripts are in <code class="docutils literal"><span class="pre">/demo/quick_start</span></code>. The following table summarizes the performance of our network architectures on the Amazon-Elec dataset (25k):</p>
<p><center></p>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border"><thead>
<th scope="col" class="left">Network name</th>
<th scope="col" class="left">Number of parameters</th>
<th scope="col" class="left">Error rate</th>
<th scope="col" class="left">Configuration file name</th>
</tr>
</thead><tbody>
<tr>
<td class="left">Logistic regression model(BOW)</td>
<td class="left"> 252KB </td>
<td class="left">8.652%</td>
<td class="left">trainer_config.lr.py</td>
</tr><tr>
<td class="left">Word embedding</td>
<td class="left"> 15MB </td>
<td class="left"> 8.484%</td>
<td class="left">trainer_config.emb.py</td>
</tr><tr>
<td class="left">Convolution model</td>
<td class="left"> 16MB </td>
<td class="left"> 5.628%</td>
<td class="left">trainer_config.cnn.py</td>
</tr><tr>
<td class="left">Time sequence model</td>
<td class="left"> 16MB </td>
<td class="left"> 4.812%</td>
<td class="left">trainer_config.lstm.py</td>
</tr></tbody>
</table>
</center>
<br></div>
<div class="section" id="appendix">
<span id="appendix"></span><h2>Appendix<a class="headerlink" href="#appendix" title="Permalink to this headline"></a></h2>
<div class="section" id="command-line-argument">
<span id="command-line-argument"></span><h3>Command Line Argument<a class="headerlink" href="#command-line-argument" title="Permalink to this headline"></a></h3>
<ul class="simple">
<li>--config: path of the network configuration file.</li>
<li>--save_dir: directory in which to save the model.</li>
<li>--log_period: print a log every this many batches.</li>
<li>--num_passes: number of training passes. One pass means going over the whole training dataset once.</li>
<li>--config_args: other arguments passed into the network configuration.</li>
<li>--init_model_path: the path of the initial model parameters.</li>
</ul>
<p>By default, the trainer will save the model after every pass. You can also specify <code class="docutils literal"><span class="pre">saving_period_by_batches</span></code> to save the model every given number of batches. You can use <code class="docutils literal"><span class="pre">show_parameter_stats_period</span></code> to print the statistics of the parameters, which is very useful for tuning parameters. Other command line arguments can be found in the <a href = "../../howto/usage/cmd_parameter/index_en.html">command line argument documentation</a>.</p>
</div>
<div class="section" id="log">
<span id="log"></span><h3>Log<a class="headerlink" href="#log" title="Permalink to this headline"></a></h3>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">TrainerInternal</span><span class="o">.</span><span class="n">cpp</span><span class="p">:</span><span class="mi">160</span><span class="p">]</span> <span class="n">Batch</span><span class="o">=</span><span class="mi">20</span> <span class="n">samples</span><span class="o">=</span><span class="mi">2560</span> <span class="n">AvgCost</span><span class="o">=</span><span class="mf">0.628761</span> <span class="n">CurrentCost</span><span class="o">=</span><span class="mf">0.628761</span> <span class="n">Eval</span><span class="p">:</span> <span class="n">classification_error_evaluator</span><span class="o">=</span><span class="mf">0.304297</span> <span class="n">CurrentEval</span><span class="p">:</span> <span class="n">classification_error_evaluator</span><span class="o">=</span><span class="mf">0.304297</span>
</pre></div>
</div>
<p>During model training, you will see log messages like the example above. The fields are explained in the following table:
<center></p>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border"><thead>
<th scope="col" class="left">Name</th>
<th scope="col" class="left">Explanation</th>
</tr>
</thead><tr>
<td class="left">Batch=20</td>
<td class="left"> You have trained 20 batches. </td>
</tr><tr>
<td class="left">samples=2560</td>
<td class="left"> You have trained 2560 examples. </td>
</tr><tr>
<td class="left">AvgCost</td>
<td class="left"> The average cost from the first batch to the current batch. </td>
</tr><tr>
<td class="left">CurrentCost</td>
<td class="left"> the average cost of the last log_period batches </td>
</tr><tr>
<td class="left">Eval: classification_error_evaluator</td>
<td class="left"> The average classification error from the first batch to the current batch.</td>
</tr><tr>
<td class="left">CurrentEval: classification_error_evaluator</td>
<td class="left"> The average error rate of the last log_period batches </td>
</tr></tbody>
</table>
</center>
<br></div>
</div>
</div>
</div>
</div>
<footer>
<hr/>
<div role="contentinfo">
<p>
&copy; Copyright 2016, PaddlePaddle developers.
</p>
</div>
Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/snide/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT:'../../',
VERSION:'',
COLLAPSE_INDEX:false,
FILE_SUFFIX:'.html',
HAS_SOURCE: true,
SOURCELINK_SUFFIX: ".txt",
};
</script>
<script type="text/javascript" src="../../_static/jquery.js"></script>
<script type="text/javascript" src="../../_static/underscore.js"></script>
<script type="text/javascript" src="../../_static/doctools.js"></script>
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
<script type="text/javascript" src="../../_static/js/theme.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script>
<script src="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/js/perfect-scrollbar.jquery.min.js"></script>
<script src="../../_static/js/paddle_doc_init.js"></script>
</body>
</html>
The tutorials in v1_api_tutorials are using v1_api currently, and will be upgraded to v2_api later.
Thus, v1_api_tutorials is a temporary directory. We decide not to maintain it and will delete it in future.
Please go to [PaddlePaddle/book](https://github.com/PaddlePaddle/book) and
[PaddlePaddle/models](https://github.com/PaddlePaddle/models) to learn PaddlePaddle.
# 中文词向量模型的使用 #
----------
本文档介绍如何在PaddlePaddle平台上,使用预训练的标准格式词向量模型。
在此感谢 @lipeng 提出的代码需求,并给出的相关模型格式的定义。
## 介绍 ###
### 中文字典 ###
我们的字典使用内部的分词工具对百度知道和百度百科的语料进行分词后产生。分词风格如下: "《红楼梦》"将被分为 "《","红楼梦","》",和 "《红楼梦》"。字典采用UTF8编码,输出有2列:词本身和词频。字典共包含 3206326个词和4个特殊标记:
- `<s>`: 分词序列的开始
- `<e>`: 分词序列的结束
- `PALCEHOLDER_JUST_IGNORE_THE_EMBEDDING`: 占位符,没有实际意义
- `<unk>`: 未知词
### 中文词向量的预训练模型 ###
遵循文章 [A Neural Probabilistic Language Model](http://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf)中介绍的方法,模型采用 n-gram 语言模型,结构如下图:6元上下文作为输入层->全连接层->softmax层 。对应于字典,我们预训练得到4种不同维度的词向量,分别为:32维、64维、128维和256维。
<center>![](./neural-n-gram-model.png)</center>
<center>Figure 1. neural-n-gram-model</center>
### 下载和数据抽取 ###
运行以下的命令下载和获取我们的字典和预训练模型:
cd $PADDLE_ROOT/demo/model_zoo/embedding
./pre_DictAndModel.sh
## 中文短语改写的例子 ##
以下示范如何使用预训练的中文字典和词向量进行短语改写。
### 数据的准备和预处理 ###
首先,运行以下的命令下载数据集。该数据集(utf8编码)包含20个训练样例,5个测试样例和2个生成式样例。
cd $PADDLE_ROOT/demo/seqToseq/data
./paraphrase_data.sh
第二步,将数据处理成规范格式,在训练数集上训练生成词向量字典(数据将保存在 `$PADDLE_SOURCE_ROOT/demo/seqToseq/data/pre-paraphrase`):
cd $PADDLE_ROOT/demo/seqToseq/
python preprocess.py -i data/paraphrase [--mergeDict]
- 其中,如果使用`--mergeDict`选项,源语言短语和目标语言短语的字典将被合并(源语言和目标语言共享相同的编码字典)。本实例中,源语言和目标语言都是相同的语言,因此可以使用该选项。
### 使用用户指定的词向量字典 ###
使用如下命令,从预训练模型中,根据用户指定的字典,抽取对应的词向量构成新的词表:
cd $PADDLE_ROOT/demo/model_zoo/embedding
python extract_para.py --preModel PREMODEL --preDict PREDICT --usrModel USRMODEL--usrDict USRDICT -d DIM
- `--preModel PREMODEL`: 预训练词向量字典模型的路径
- `--preDict PREDICT`: 预训练模型使用的字典的路径
- `--usrModel USRMODEL`: 抽取出的新词表的保存路径
- `--usrDict USRDICT`: 用户指定新的字典的路径,用于构成新的词表
- `-d DIM`: 参数(词向量)的维度
此处,你也可以简单的运行以下的命令:
cd $PADDLE_ROOT/demo/seqToseq/data/
./paraphrase_model.sh
运行成功以后,你将会看到以下的模型结构:
paraphrase_model
|--- _source_language_embedding
|--- _target_language_embedding
### Training Model in PaddlePaddle ###
First, create the model config file; the configuration saved in `demo/seqToseq/paraphrase/train.conf` can be used as a reference:
from seqToseq_net import *
is_generating = False
################## Data Definition #####################
train_conf = seq_to_seq_data(data_dir = "./data/pre-paraphrase",
job_mode = job_mode)
############## Algorithm Configuration ##################
settings(
learning_method = AdamOptimizer(),
batch_size = 50,
learning_rate = 5e-4)
################# Network configure #####################
gru_encoder_decoder(train_conf, is_generating, word_vector_dim = 32)
This configuration is almost the same as `demo/seqToseq/translation/train.conf`.
Then, train the model with the following commands:
cd $PADDLE_SOURCE_ROOT/demo/seqToseq/paraphrase
./train.sh
Here, `train.sh` is almost the same as `demo/seqToseq/translation/train.sh`, with only two different arguments:
- `--init_model_path`: the path of the initialization model, set to `data/paraphrase_model`
- `--load_missing_parameter_strategy`: if a parameter model file is missing, the parameters other than the word embedding model are randomly initialized from a normal distribution
For details about the dataset format, the model architecture and the training procedure, please refer to the [Text generation Tutorial](../text_generation/index_cn.md).
## Optional Functions ##
### Observing the Word Embeddings
PaddlePaddle provides a command to convert the binary word embedding model into a text model, for users who want to observe the embeddings:
cd $PADDLE_ROOT/demo/model_zoo/embedding
python paraconvert.py --b2t -i INPUT -o OUTPUT -d DIM
- `-i INPUT`: the name of the input (binary) word embedding model
- `-o OUTPUT`: the name of the output text model
- `-d DIM`: the dimension of the (word embedding) parameters
After running the command above, you will see the following in the output text model:
0,4,32156096
-0.7845433,1.1937413,-0.1704215,0.4154715,0.9566584,-0.5558153,-0.2503305, ......
0.0000909,0.0009465,-0.0008813,-0.0008428,0.0007879,0.0000183,0.0001984, ......
......
- The first line is the format description of the `PaddlePaddle` output file; it contains three attributes:
  - the version of `PaddlePaddle`, 0 in this example
  - the number of bytes per floating-point value, 4 in this example
  - the total number of parameters, 32,156,096 in this example
- The remaining lines are (word embedding) parameter rows (assuming the embedding dimension is 32):
  - each row holds 32 parameter values separated by ','
  - there are 32,156,096 / 32 = 1,004,878 rows in total, i.e. the model contains 1,004,878 embedded words (see the sanity-check sketch below)
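The header makes it easy to sanity-check a converted file. The following is a minimal Python sketch (not a PaddlePaddle tool; the file name `emb_text_model.txt` and the dimension value are illustrative assumptions) that reads the three header fields and verifies that the number of parameter rows equals the parameter count divided by the dimension:
```
# Minimal sanity check for a text model produced by "paraconvert.py --b2t".
# Assumptions: the output file is named "emb_text_model.txt" and the embedding
# dimension used for the conversion is 32.
dim = 32
with open("emb_text_model.txt") as f:
    header = f.readline().strip()
    version, float_bytes, num_params = (int(x) for x in header.split(","))
    rows = sum(1 for _ in f)  # every remaining line is one embedded word
print("version: %d, bytes per float: %d" % (version, float_bytes))
print("rows: %d, expected rows: %d" % (rows, num_params // dim))
```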
### Modifying the Word Embedding Model
`PaddlePaddle` also provides a command to convert a text word embedding model into a binary model, for users who want to modify the embeddings:
cd $PADDLE_ROOT/demo/model_zoo/embedding
python paraconvert.py --t2b -i INPUT -o OUTPUT
- `-i INPUT`: the name of the input text word embedding model
- `-o OUTPUT`: the name of the output binary word embedding model
Note that the input text must be in the following format:
-0.7845433,1.1937413,-0.1704215,0.4154715,0.9566584,-0.5558153,-0.2503305, ......
0.0000909,0.0009465,-0.0008813,-0.0008428,0.0007879,0.0000183,0.0001984, ......
......
- the input text has no header (format description) line
- each line stores one word's embedding, with the values separated by commas ',' (see the writer sketch below)
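As a small illustration, the sketch below (an assumption-based example, not part of PaddlePaddle; the output file name `my_embeddings.txt` is hypothetical) writes random vectors in exactly this headerless, comma-separated format, ready to be converted back with `paraconvert.py --t2b`:
```
import numpy as np
# Write word vectors in the text format expected by "paraconvert.py --t2b":
# no header line, one word per line, values separated by commas.
dim = 32
num_words = 4
vectors = np.random.rand(num_words, dim).astype("float32")
with open("my_embeddings.txt", "w") as f:
    for vec in vectors:
        f.write(",".join("%.7f" % v for v in vec) + "\n")
```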
# Model Zoo - ImageNet #
[ImageNet](http://www.image-net.org/) is a well-known database for general object classification. This tutorial provides convolutional classification network models for ImageNet.
## Introduction to ResNet
The ResNet architecture proposed in [Deep Residual Learning for Image Recognition](http://arxiv.org/abs/1512.03385) won first place in the classification task of the 2015 ImageNet Large Scale Visual Recognition Challenge (ILSVRC 2015). The authors propose a residual learning framework that eases the training of networks, and the networks they build are substantially deeper than those used previously. The figure below shows the residual connection blocks: the block on the left is used in the 34-layer network, while the bottleneck block on the right is used in the 50-, 101- and 152-layer networks.
<center>![resnet_block](./resnet_block.jpg)</center>
<center>Figure 1. ResNet Block</center>
This tutorial provides three ResNet models, all converted from the original models released by the authors at <https://github.com/KaimingHe/deep-residual-networks>. We tested the classification error rate with PaddlePaddle on the 50,000 images of the ILSVRC validation set, where the input image is in **BGR** channel order, resized so that its short side is 256 while keeping the aspect ratio, and center-cropped to a square region. The classification error rates and the model sizes are listed in the table below.
<center>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border">
<colgroup>
<col class="left" />
<col class="left" />
<col class="left" />
</colgroup>
<thead>
<tr>
<th scope="col" class="left">ResNet</th>
<th scope="col" class="left">Top-1</th>
<th scope="col" class="left">Model Size</th>
</tr>
</thead>
<tbody>
<tr>
<td class="left">ResNet-50</td>
<td class="left">24.9%</td>
<td class="left">99M</td>
</tr>
<tr>
<td class="left">ResNet-101</td>
<td class="left">23.7%</td>
<td class="left">173M</td>
</tr>
<tr>
<td class="left">ResNet-152</td>
<td class="left">23.2%</td>
<td class="left">234M</td>
</tr>
</tbody>
</table></center>
<br>
## ResNet Models
The network config files for the 50-, 101- and 152-layer networks can be found in ```demo/model_zoo/resnet/resnet.py```. You can also specify the number of layers by adding a command line argument such as ```--config_args=layer_num=50```.
### Network Visualization
You can get a visualization diagram of the ResNet network by running the commands below. The script generates a dot file, which can then be converted into an image; graphviz needs to be installed for the conversion.
```
cd demo/model_zoo/resnet
./net_diagram.sh
```
### Model Download
```
cd demo/model_zoo/resnet
./get_model.sh
```
Run the commands above to download all the models and the mean file. If the download succeeds, these files will be saved under ```demo/model_zoo/resnet/model```.
```
mean_meta_224 resnet_101 resnet_152 resnet_50
```
* resnet_50: the 50-layer network model.
* resnet_101: the 101-layer network model.
* resnet_152: the 152-layer network model.
* mean\_meta\_224: the mean image file with size 3 x 224 x 224 and channel order **BGR**. You can also simply use these three channel means instead: 103.939, 116.779, 123.68 (a schematic sketch of the mean subtraction follows below).
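For reference, the snippet below is a schematic numpy sketch of the mean subtraction described above. It assumes the image has already been decoded, resized and center-cropped to 224 x 224 in BGR channel order (the demo's image provider performs these steps for you):
```
import numpy as np
# Per-channel BGR means of the ImageNet training set, as listed above.
bgr_mean = np.array([103.939, 116.779, 123.68], dtype=np.float32)
# Placeholder for a decoded 224 x 224 BGR image (H x W x C, uint8).
img = np.zeros((224, 224, 3), dtype=np.uint8)
img = img.astype(np.float32) - bgr_mean   # subtract the per-channel means
chw = img.transpose(2, 0, 1)              # reorder to 3 x 224 x 224
```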
### Parameter Info
* **Convolutional layer weights**
Since each convolutional layer is followed by a batch normalization layer, the layer has no bias parameter and only one weight.
Shape: `(Co, ky, kx, Ci)`
* Co: number of output feature map channels
* ky: vertical size of the filter kernel
* kx: horizontal size of the filter kernel
* Ci: number of input feature map channels
Stored as a 2D matrix of shape (Co * ky * kx, Ci), in row-major order.
* **Fully connected layer weights**
Stored as a 2D matrix of shape (input layer size, this layer size), in row-major order.
* **[Batch Normalization](<http://arxiv.org/abs/1502.03167>) layer weights**
This layer has four parameters; only .w0 and .wbias are actually learned, while the other two are the moving mean and variance, which are loaded into the model at test time. The table below lists the parameters of a batch normalization layer.
<center>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border">
<colgroup>
<col class="left" />
<col class="left" />
<col class="left" />
</colgroup>
<thead>
<tr>
<th scope="col" class="left">参数名</th>
<th scope="col" class="left">尺寸</th>
<th scope="col" class="left">含义</th>
</tr>
</thead>
<tbody>
<tr>
<td class="left">_res2_1_branch1_bn.w0</td>
<td class="left">256</td>
<td class="left">gamma, 缩放参数</td>
</tr>
<tr>
<td class="left">_res2_1_branch1_bn.w1</td>
<td class="left">256</td>
<td class="left">特征图均值</td>
</tr>
<tr>
<td class="left">_res2_1_branch1_bn.w2</td>
<td class="left">256</td>
<td class="left">特征图方差</td>
</tr>
<tr>
<td class="left">_res2_1_branch1_bn.wbias</td>
<td class="left">256</td>
<td class="left">beta, 偏置参数</td>
</tr>
</tbody>
</table></center>
<br>
### Parameter Reading
You can read the parameter values with the following Python script:
```
import sys
import numpy as np
def load(file_name):
with open(file_name, 'rb') as f:
f.read(16) # skip header for float type.
return np.fromfile(f, dtype=np.float32)
if __name__=='__main__':
weight = load(sys.argv[1])
```
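Building on the shape convention from the Parameter Info section, the sketch below reshapes a loaded convolutional weight. The parameter file name `_conv1.w0` and the shape values (Co=64, ky=kx=7, Ci=3) are illustrative assumptions rather than values read from an actual model:
```
import numpy as np
def load(file_name):
    with open(file_name, 'rb') as f:
        f.read(16)  # skip the 16-byte header for float parameters
        return np.fromfile(f, dtype=np.float32)
# Hypothetical first convolutional layer: 64 filters of size 7 x 7 over 3 channels.
Co, ky, kx, Ci = 64, 7, 7, 3
flat = load('_conv1.w0')                 # hypothetical parameter file name
mat = flat.reshape(Co * ky * kx, Ci)     # the stored row-major 2D matrix
w = mat.reshape(Co, ky, kx, Ci)          # back to the logical 4-D shape
print(w.shape)
```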
Or use the following shell command directly:
```
od -j 16 -f _res2_1_branch1_bn.w0
```
## Feature Extraction
We provide both C++ and Python interfaces to extract features. The following examples use the data in `demo/model_zoo/resnet/example` to show the whole feature extraction process in detail.
### C++ Interface
First, specify the image data list in `define_py_data_sources2` in the configuration file; see `demo/model_zoo/resnet/resnet.py` for a concrete example.
```
train_list = 'train.list' if not is_test else None
# mean.meta is mean file of ImageNet dataset.
# mean.meta size : 3 x 224 x 224.
# If you use three mean value, set like:
# "mean_value:103.939,116.779,123.68;"
args={
'mean_meta': "model/mean_meta_224/mean.meta",
'image_size': 224, 'crop_size': 224,
'color': True,'swap_channel:': [2, 1, 0]}
define_py_data_sources2(train_list,
'example/test.list',
module="example.image_list_provider",
obj="processData",
args=args)
```
Second, specify the names of the layers to extract features from in the `resnet.py` file. For example,
```
Outputs("res5_3_branch2c_conv", "res5_3_branch2c_bn")
```
Third, specify the model path and the output directory in `extract_fea_c++.sh`, then run the following commands.
```
cd demo/model_zoo/resnet
./extract_fea_c++.sh
```
If it runs successfully, the features will be saved in the `fea_output/rank-00000` file as shown below. You can also load this file with the `load_feature_c` interface in `load_feature.py`.
```
-0.115318 -0.108358 ... -0.087884;-1.27664 ... -1.11516 -2.59123;
-0.126383 -0.116248 ... -0.00534909;-1.42593 ... -1.04501 -1.40769;
```
* Each line stores the features of one sample: the first line holds the features of the image `example/dog.jpg` and the second line holds those of `example/cat.jpg`.
* Features of different layers are separated by semicolons `;`, in the same order as the layers specified in `Outputs()`. Here the left part is the feature of the `res5_3_branch2c_conv` layer and the right part is that of the `res5_3_branch2c_bn` layer (a small parsing sketch follows below).
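The following is a minimal hand-rolled parser for this text format, included only as a sketch (the official helper is `load_feature_c` in `load_feature.py`); the layer names and file path are the ones used in this example:
```
# Parse fea_output/rank-00000: one sample per line, layers separated by ';',
# feature values separated by spaces.
layer_names = ["res5_3_branch2c_conv", "res5_3_branch2c_bn"]
features = []
with open("fea_output/rank-00000") as f:
    for line in f:
        parts = line.strip().rstrip(";").split(";")
        sample = dict((name, [float(v) for v in part.split()])
                      for name, part in zip(layer_names, parts))
        features.append(sample)
print(len(features), len(features[0]["res5_3_branch2c_conv"]))
```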
### Python Interface
`demo/model_zoo/resnet/classify.py` shows how to extract features using Python. The following example also uses the data in `./example/test.list`. Run the commands below:
```
cd demo/model_zoo/resnet
./extract_fea_py.sh
```
extract_fea_py.sh:
```
python classify.py \
--job=extract \
--conf=resnet.py\
--use_gpu=1 \
--mean=model/mean_meta_224/mean.meta \
--model=model/resnet_50 \
--data=./example/test.list \
--output_layer="res5_3_branch2c_conv,res5_3_branch2c_bn" \
--output_dir=features
```
* \--job=extract: specify the job mode to extract features.
* \--conf=resnet.py: the network config file.
* \--use_gpu=1: whether to use GPU.
* \--model=model/resnet_50: the model path.
* \--data=./example/test.list: the data list.
* \--output_layer="xxx,xxx": specify the layers to extract features from.
* \--output_dir=features: the output directory.
If it runs successfully, the features are stored in the `features/batch_0` file, which is produced by cPickle. You can open the file with the `load_feature_py` interface in `load_feature.py`, which returns a dictionary like the following:
```
{
'cat.jpg': {'res5_3_branch2c_conv': array([[-0.12638293, -0.116248 , -0.11883899, ..., -0.00895038, 0.01994277, -0.00534909]], dtype=float32), 'res5_3_branch2c_bn': array([[-1.42593431, -1.28918779, -1.32414699, ..., -1.45933616, -1.04501402, -1.40769434]], dtype=float32)},
'dog.jpg': {'res5_3_branch2c_conv': array([[-0.11531784, -0.10835785, -0.08809858, ...,0.0055237, 0.01505112, -0.08788397]], dtype=float32), 'res5_3_branch2c_bn': array([[-1.27663755, -1.18272924, -0.90937918, ..., -1.25178063, -1.11515927, -2.59122872]], dtype=float32)}
}
```
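Since `features/batch_0` is an ordinary cPickle dump, you can also open it directly. The sketch below uses Python 3's `pickle` module; under Python 2, which this demo targets, `cPickle.load(f)` without the `encoding` argument works the same way:
```
import pickle
# features/batch_0 is produced by cPickle (Python 2); when unpickling the numpy
# arrays under Python 3, encoding="latin1" is usually required.
with open("features/batch_0", "rb") as f:
    feats = pickle.load(f, encoding="latin1")
conv_feat = feats["cat.jpg"]["res5_3_branch2c_conv"]
print(conv_feat.shape)
```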
If you look closely, these feature values are consistent with the results extracted through the C++ interface above.
## Prediction
`classify.py` can also be used to predict labels for samples. We provide an example script `predict.sh`, which uses the 50-layer ResNet model to predict the data in `example/test.list`.
```
cd demo/model_zoo/resnet
./predict.sh
```
`predict.sh` calls `classify.py` as follows:
```
python classify.py \
--job=predict \
--conf=resnet.py\
--multi_crop \
--model=model/resnet_50 \
--use_gpu=1 \
--data=./example/test.list
```
* \--job=predict: specify the job mode for prediction.
* \--conf=resnet.py: the network config file.
* \--multi_crop: use 10 crops of each image and average the predicted probabilities.
* \--use_gpu=1: whether to use GPU.
* \--model=model/resnet_50: the model path.
* \--data=./example/test.list: the data list.
If it runs successfully, you will see the following results, where 156 and 282 are the predicted class labels of the images.
```
Label of example/dog.jpg is: 156
Label of example/cat.jpg is: 282
```
====================
Quick Start Tutorial
====================
We will use a `text classification problem <https://en.wikipedia.org/wiki/Document_classification>`_ as an example
to introduce the basic usage of PaddlePaddle.
Installation
============
Please refer to :ref:`install_steps` to install PaddlePaddle.
Overview
========
**Text classification problem**: given a piece of text, choose the category it belongs to from a predefined set of categories.
For example, on a shopping website, we can judge the quality of a product from the reviews that buyers leave for it.
- This monitor is great! (positive review)
- The screen of this monitor broke after two months. (negative review)
With PaddlePaddle, every task workflow can be divided into the following five steps.
.. image:: src/Pipeline_cn.jpg
:align: center
:scale: 80%
1. Prepare the data format
- In this example, each line stores one sample. The category id and the text are separated by a ``Tab``, and the words in the text are separated by spaces (if the text is not segmented into words, the characters are separated by spaces), e.g. ``类别Id '\t' 这 个 显 示 器 很 棒 !``
2. Feed data into the system
- PaddlePaddle can run user-provided Python scripts to read data files in various formats.
- In this example, all characters are converted into ids represented by consecutive integers before being passed to the model.
3. Specify the network architecture and the optimization algorithm
- This example presents four text classification network configurations, from simple to advanced: a logistic regression model, a word embedding model, a convolution model, and a time sequence model.
- Common optimization algorithms include Momentum, RMSProp, AdaDelta, AdaGrad, Adam, Adamax, etc. This example uses the Adam optimizer with L2 regularization and gradient clipping.
4. Train the model
5. Apply the model
Data Format Preparation
-----------------------
Next we show how to use PaddlePaddle to train a text classification model that classifies the `Amazon electronics product review data <http://jmcauley.ucsd.edu/data/amazon/>`_ into positive (good review) and negative (bad review) samples.
The ``demo/quick_start`` directory of the `source code <https://github.com/PaddlePaddle/Paddle>`_ provides scripts to download and preprocess this data. You only need to run the following commands to complete the download and preprocessing.
.. code-block:: bash
cd demo/quick_start
./data/get_data.sh
./preprocess.sh
After the data preprocessing is done, you provide a data-reading script like ``dataprovider_*.py`` and a model configuration script like ``trainer_config.*.py``, and PaddlePaddle is pointed at them through configuration parameters. The next sections explain these two steps in detail; you can also skip the explanation for now, jump to the Training the Model section, start training with ``sh train.sh``,
and then read the contents of ``train.sh`` to understand PaddlePaddle's inner workings in a **bottom-up approach**.
Feed Data into the System
=========================
Read Data with a Python Script
------------------------------
`DataProvider` is the PaddlePaddle module responsible for providing data. Its main job is to load the training data into memory or GPU memory so that the model can be trained and updated. It contains two functions:
* initializer: PaddlePaddle calls the initializer function before calling the Python data-reading script. In the example below, the word dictionary is initialized in the initializer function and then used during the subsequent data reading.
* process: PaddlePaddle calls the process function to read data. Each time a piece of data is read, the process function outputs it with a yield statement so that PaddlePaddle can harvest it.
``dataprovider_bow.py`` gives the complete example:
.. literalinclude:: ../../../demo/quick_start/dataprovider_bow.py
:language: python
:lines: 21-70
:linenos:
:emphasize-lines: 8,33
See :ref:`api_dataprovider` for details.
Data Loading Definition in the Configuration
--------------------------------------------
In the model configuration, the data is loaded through the ``define_py_data_sources2`` interface:
.. literalinclude:: ../../../demo/quick_start/trainer_config.emb.py
:language: python
:lines: 19-35
:linenos:
:emphasize-lines: 12
The data loading above is explained as follows:
- data/train.list, data/test.list: specify the training data and the testing data
- module="dataprovider_bow": the Python script that processes the data
- obj="process": the function that generates the data
- args={"dictionary": word_dict}: extra arguments; here the word dictionary is passed in
For more detailed data formats and use cases, please refer to :ref:`api_pydataprovider2`.
Network Architecture
====================
This section introduces the network architectures.
.. image:: src/PipelineNetwork_cn.jpg
:align: center
:scale: 80%
We start from the most basic logistic regression network and gradually introduce more advanced features. For more detailed network configuration documentation, please refer to :ref:`api_trainer_config_helpers_layers`.
All configurations can be found under the ``demo/quick_start`` directory of the `source code <https://github.com/PaddlePaddle/Paddle>`_.
Logistic Regression Model
-------------------------
The workflow is as follows:
.. image:: src/NetLR_cn.jpg
:align: center
:scale: 80%
- Get each word represented as a `one-hot vector <https://en.wikipedia.org/wiki/One-hot>`_; its dimension is the size of the dictionary
.. code-block:: python
word = data_layer(name="word", size=word_dim)
- Get the category id of the sample; its dimension is the number of categories.
.. code-block:: python
label = data_layer(name="label", size=label_dim)
- Use a logistic regression model to classify the vector; the classification accuracy is computed at the same time
.. code-block:: python
# Define a fully connected layer with logistic activation (also called softmax activation).
output = fc_layer(input=word,
size=label_dim,
act_type=SoftmaxActivation())
# Define cross-entropy classification loss and error.
classification_cost(input=output, label=label)
- input: every layer except the data layers has one or more inputs; multiple inputs are passed in as a list
- size: the number of neurons in this layer
- act_type: the type of the activation function
**Performance summary**: the training and prediction scripts are introduced later. To make it easy to compare different network architectures, we summarize the complexity and accuracy of each network here.
===================== =============================== =================
Network name          Number of parameters            Test error
===================== =============================== =================
Logistic regression   252 KB                          8.652 %
===================== =============================== =================
Word Embedding Model
--------------------
The embedding model only needs a slightly different data provider script, ``dataprovider_emb.py``, which is also used by the
convolution model and the time sequence model. In it, the text input type is defined as the integer sequence type integer_value_sequence.
.. code-block:: python
def initializer(settings, dictionary, **kwargs):
settings.word_dict = dictionary
settings.input_types = [
# Define the type of the first input as sequence of integer.
# The value of the integers range from 0 to len(dictrionary)-1
integer_value_sequence(len(dictionary)),
# Define the second input for label id
integer_value(2)]
@provider(init_hook=initializer)
def process(settings, file_name):
...
# omitted, it is same as the data provider for LR model
This model still uses the framework of the logistic regression classification network; it only replaces the sparse vector representation of the sentence with a dense (continuous) vector representation, i.e. the third step above is replaced. The computation of the sentence representation becomes two steps:
.. image:: src/NetContinuous_cn.jpg
:align: center
:scale: 80%
- Use the word id to look up the corresponding dense vector (of dimension word_dim); for N input words, the output is N word_dim-dimensional vectors
.. code-block:: python
emb = embedding_layer(input=word, size=word_dim)
- Average all the word vectors in the sentence to obtain the sentence representation
.. code-block:: python
avg = pooling_layer(input=emb, pooling_type=AvgPooling())
The rest is the same as the logistic regression network.
**Performance summary:**
===================== =============================== ==================
Network name          Number of parameters            Test error
===================== =============================== ==================
Word embedding model  15 MB                           8.484 %
===================== =============================== ==================
Convolution Model
-----------------
A convolutional network is a particular way of going from word embeddings to a sentence representation; it further evolves the word embedding model into three new steps.
.. image:: src/NetConv_cn.jpg
:align: center
:scale: 80%
Text convolution consists of three steps:
1. First, for each word, take the k neighboring words on its left and on its right, and concatenate them into a new vector;
2. Second, apply a non-linear transformation (e.g. a sigmoid) to this vector, turning it into a new hidden_dim-dimensional vector;
3. Finally, take the maximum of the new vectors over all time steps in each dimension to represent the final sentence.
These three steps are configured as:
.. code-block:: python
text_conv = sequence_conv_pool(input=emb,
context_start=k,
context_len=2 * k + 1)
**Performance summary:**
===================== =============================== ========================
Network name          Number of parameters            Test error
===================== =============================== ========================
Convolution model     16 MB                           5.628 %
===================== =============================== ========================
Time Sequence Model
-------------------
.. image:: src/NetRNN_cn.jpg
:align: center
:scale: 80%
Time sequence models, also called RNN models, include the simple `RNN model <https://en.wikipedia.org/wiki/Recurrent_neural_network>`_, the `GRU model <https://en.wikipedia.org/wiki/Gated_recurrent_unit>`_, the `LSTM model <https://en.wikipedia.org/wiki/Long_short-term_memory>`_, and so on.
- GRU model configuration:
.. code-block:: python
gru = simple_gru(input=emb, size=gru_size)
- LSTM model configuration:
.. code-block:: python
lstm = simple_lstm(input=emb, size=lstm_size)
In this experiment we use a single-layer LSTM model with dropout. **Performance summary:**
===================== =============================== =========================
Network name          Number of parameters            Test error
===================== =============================== =========================
Time sequence model   16 MB                           4.812 %
===================== =============================== =========================
Optimization Algorithm
======================
`Optimization algorithms <http://www.paddlepaddle.org/doc/ui/api/trainer_config_helpers/optimizers_index.html>`_ include
Momentum, RMSProp, AdaDelta, AdaGrad, Adam, Adamax, etc. Here we use the Adam optimizer, together with L2 regularization (L2 Regularization) and gradient clipping (Gradient Clipping).
.. code-block:: python
settings(batch_size=128,
learning_rate=2e-3,
learning_method=AdamOptimizer(),
regularization=L2Regularization(8e-4),
gradient_clipping_threshold=25)
Training the Model
==================
After the data loading and the network configuration are done, we can start training the model.
.. image:: src/PipelineTrain_cn.jpg
:align: center
:scale: 80%
To train the model, we only need to run the ``train.sh`` training script:
.. code-block:: bash
./train.sh
``train.sh`` contains the basic commands for training the model. The main arguments to set are as follows:
.. code-block:: bash
paddle train \
--config=trainer_config.py \
--log_period=20 \
--save_dir=./output \
--num_passes=15 \
--use_gpu=false
Only single-machine training is introduced here; for distributed training, please refer to :ref:`cluster_train`.
Prediction
==========
Once the model is trained, we can use it for prediction.
.. image:: src/PipelineTest_cn.jpg
:align: center
:scale: 80%
The data specified by ``test.list`` in the config file will be tested; here we run prediction directly through the ``predict.sh`` script.
For a more detailed description, please refer to :ref:`api_swig_py_paddle`.
.. code-block:: bash
model="output/pass-00003"
paddle train \
--config=trainer_config.lstm.py \
--use_gpu=false \
--job=test \
--init_model_path=$model \
--config_args=is_predict=1 \
    --predict_output_dir=.
mv rank-00000 result.txt
Here ``output/pass-00003`` is used as an example for prediction; you can choose the model with the best test result from the training log instead.
The prediction results are saved as text in ``result.txt``, one sample per line, in the following format:
.. code-block:: bash
    predicted_label_id;probability_of_label_0 probability_of_label_1
    predicted_label_id;probability_of_label_0 probability_of_label_1
Overall Performance Summary
===========================
All the data, network configurations and training scripts used here can be found under ``/demo/quick_start``.
For the Amazon-Elec test set (25k), the table below shows the performance of the network models described above:
===================== =============================== ============= ==================================
Network name          Number of parameters            Error rate    Configuration file
===================== =============================== ============= ==================================
Logistic regression   252 KB                          8.652%        trainer_config.lr.py
Word embedding        15 MB                           8.484%        trainer_config.emb.py
Convolution model     16 MB                           5.628%        trainer_config.cnn.py
Time sequence model   16 MB                           4.812%        trainer_config.lstm.py
===================== =============================== ============= ==================================
Appendix
========
Command Line Arguments
----------------------
* \--config: network configuration file
* \--save_dir: directory in which to save the model
* \--log_period: print a log every this many batches
* \--num_passes: number of training passes; one pass means going through all the training samples once
* \--config_args: arguments passed from the command line into the network configuration.
* \--init_model_path: path of the initial model; it can be used to specify an initial model for testing or training.
By default the model is saved once per pass; you can also set saving_period_by_batches to save the model every given number of batches.
You can set show_parameter_stats_period to print parameter statistics, and so on.
For other arguments, please refer to the command line argument documentation (link to be added).
Output Log
----------
.. code-block:: bash
TrainerInternal.cpp:160] Batch=20 samples=2560 AvgCost=0.628761 CurrentCost=0.628761 Eval: classification_error_evaluator=0.304297 CurrentEval: classification_error_evaluator=0.304297
During model training you will see log messages similar to the one above. The fields are explained in the following table:
=========================================== ==============================================================
Name                                        Explanation
=========================================== ==============================================================
Batch=20                                    20 batches have been processed in the current pass
samples=2560                                2560 samples have been processed in the current pass
AvgCost                                     average cost over all samples from the 0th batch to the current batch of this pass
CurrentCost                                 average cost over all samples of the last log_period batches
Eval: classification_error_evaluator        average classification error from the 0th batch to the current batch of this pass
CurrentEval: classification_error_evaluator average classification error of the last log_period batches
=========================================== ==============================================================
<!DOCTYPE html>
<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>&lt;no title&gt; &mdash; PaddlePaddle 文档</title>
<link rel="stylesheet" href="../_static/css/theme.css" type="text/css" />
<link rel="index" title="索引"
href="../genindex.html"/>
<link rel="search" title="搜索" href="../search.html"/>
<link rel="top" title="PaddlePaddle 文档" href="../index.html"/>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/css/perfect-scrollbar.min.css" type="text/css" />
<link rel="stylesheet" href="../_static/css/override.css" type="text/css" />
<script>
var _hmt = _hmt || [];
(function() {
var hm = document.createElement("script");
hm.src = "//hm.baidu.com/hm.js?b9a314ab40d04d805655aab1deee08ba";
var s = document.getElementsByTagName("script")[0];
s.parentNode.insertBefore(hm, s);
})();
</script>
<script src="../_static/js/modernizr.min.js"></script>
</head>
<body class="wy-body-for-nav" role="document">
<header class="site-header">
<div class="site-logo">
<a href="/"><img src="../_static/images/PP_w.png"></a>
</div>
<div class="site-nav-links">
<div class="site-menu">
<a class="fork-on-github" href="https://github.com/PaddlePaddle/Paddle" target="_blank"><i class="fa fa-github"></i>Fork me on Github</a>
<div class="language-switcher dropdown">
<a type="button" data-toggle="dropdown">
<span>English</span>
<i class="fa fa-angle-up"></i>
<i class="fa fa-angle-down"></i>
</a>
<ul class="dropdown-menu">
<li><a href="/doc_cn">中文</a></li>
<li><a href="/doc">English</a></li>
</ul>
</div>
<ul class="site-page-links">
<li><a href="/">Home</a></li>
</ul>
</div>
<div class="doc-module">
<ul>
<li class="toctree-l1"><a class="reference internal" href="../getstarted/index_cn.html">新手入门</a></li>
<li class="toctree-l1"><a class="reference internal" href="../howto/index_cn.html">进阶指南</a></li>
<li class="toctree-l1"><a class="reference internal" href="../api/index_cn.html">API</a></li>
<li class="toctree-l1"><a class="reference internal" href="../faq/index_cn.html">FAQ</a></li>
<li class="toctree-l1"><a class="reference internal" href="../mobile/index_cn.html">MOBILE</a></li>
</ul>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div>
</div>
</header>
<div class="main-content-wrap">
<nav class="doc-menu-vertical" role="navigation">
<ul>
<li class="toctree-l1"><a class="reference internal" href="../getstarted/index_cn.html">新手入门</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../getstarted/build_and_install/index_cn.html">安装与编译</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../getstarted/build_and_install/pip_install_cn.html">使用pip安装</a></li>
<li class="toctree-l3"><a class="reference internal" href="../getstarted/build_and_install/docker_install_cn.html">使用Docker安装运行</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/dev/build_cn.html">用Docker编译和测试PaddlePaddle</a></li>
<li class="toctree-l3"><a class="reference internal" href="../getstarted/build_and_install/build_from_source_cn.html">从源码编译</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../getstarted/concepts/use_concepts_cn.html">基本使用概念</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../howto/index_cn.html">进阶指南</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../howto/usage/cmd_parameter/index_cn.html">设置命令行参数</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cmd_parameter/use_case_cn.html">使用案例</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cmd_parameter/arguments_cn.html">参数概述</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cmd_parameter/detail_introduction_cn.html">细节描述</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../howto/usage/cluster/cluster_train_cn.html">分布式训练</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cluster/fabric_cn.html">fabric集群</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cluster/openmpi_cn.html">openmpi集群</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cluster/k8s_cn.html">kubernetes单机</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cluster/k8s_distributed_cn.html">kubernetes distributed分布式</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/cluster/k8s_aws_cn.html">AWS上运行kubernetes集群训练</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../howto/usage/capi/index_cn.html">PaddlePaddle C-API</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/capi/compile_paddle_lib_cn.html">编译 PaddlePaddle 预测库</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/capi/organization_of_the_inputs_cn.html">输入/输出数据组织</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/usage/capi/workflow_of_capi_cn.html">C-API 使用流程</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../howto/dev/contribute_to_paddle_cn.html">如何贡献代码</a></li>
<li class="toctree-l2"><a class="reference internal" href="../howto/dev/write_docs_cn.html">如何贡献/修改文档</a></li>
<li class="toctree-l2"><a class="reference internal" href="../howto/deep_model/rnn/index_cn.html">RNN相关模型</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../howto/deep_model/rnn/rnn_config_cn.html">RNN配置</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/deep_model/rnn/recurrent_group_cn.html">Recurrent Group教程</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/deep_model/rnn/hierarchical_layer_cn.html">支持双层序列作为输入的Layer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../howto/deep_model/rnn/hrnn_rnn_api_compare_cn.html">单双层RNN API对比介绍</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../howto/optimization/gpu_profiling_cn.html">GPU性能分析与调优</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../api/index_cn.html">API</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../api/v2/model_configs.html">模型配置</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/activation.html">Activation</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/layer.html">Layers</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/evaluators.html">Evaluators</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/optimizer.html">Optimizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/pooling.html">Pooling</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/networks.html">Networks</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/config/attr.html">Parameter Attribute</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../api/v2/data.html">数据访问</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/data/data_reader.html">Data Reader Interface</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/data/image.html">Image Interface</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/data/dataset.html">Dataset</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../api/v2/run_logic.html">训练与应用</a></li>
<li class="toctree-l2"><a class="reference internal" href="../api/v2/fluid.html">Fluid</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/layers.html">Layers</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/data_feeder.html">DataFeeder</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/executor.html">Executor</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/initializer.html">Initializer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/evaluator.html">Evaluator</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/nets.html">Nets</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/optimizer.html">Optimizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/param_attr.html">ParamAttr</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/profiler.html">Profiler</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/regularizer.html">Regularizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../api/v2/fluid/io.html">IO</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../faq/index_cn.html">FAQ</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../faq/build_and_install/index_cn.html">编译安装与单元测试</a></li>
<li class="toctree-l2"><a class="reference internal" href="../faq/model/index_cn.html">模型配置</a></li>
<li class="toctree-l2"><a class="reference internal" href="../faq/parameter/index_cn.html">参数设置</a></li>
<li class="toctree-l2"><a class="reference internal" href="../faq/local/index_cn.html">本地训练与预测</a></li>
<li class="toctree-l2"><a class="reference internal" href="../faq/cluster/index_cn.html">集群训练与预测</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../mobile/index_cn.html">MOBILE</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../mobile/cross_compiling_for_android_cn.html">Android平台编译指南</a></li>
<li class="toctree-l2"><a class="reference internal" href="../mobile/cross_compiling_for_ios_cn.html">iOS平台编译指南</a></li>
<li class="toctree-l2"><a class="reference internal" href="../mobile/cross_compiling_for_raspberry_cn.html">Raspberry Pi平台编译指南</a></li>
</ul>
</li>
</ul>
</nav>
<section class="doc-content-wrap">
<div role="navigation" aria-label="breadcrumbs navigation">
<ul class="wy-breadcrumbs">
<li>&lt;no title&gt;</li>
</ul>
</div>
<div class="wy-nav-content" id="doc-content">
<div class="rst-content">
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<p>The tutorials in v1_api_tutorials are using v1_api currently, and will be upgraded to v2_api later.
Thus, v1_api_tutorials is a temporary directory. We decide not to maintain it and will delete it in future.</p>
<p>Please go to <a class="reference external" href="https://github.com/PaddlePaddle/book">PaddlePaddle/book</a> and
<a class="reference external" href="https://github.com/PaddlePaddle/models">PaddlePaddle/models</a> to learn PaddlePaddle.</p>
</div>
</div>
<footer>
<hr/>
<div role="contentinfo">
<p>
&copy; Copyright 2016, PaddlePaddle developers.
</p>
</div>
Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/snide/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT:'../',
VERSION:'',
COLLAPSE_INDEX:false,
FILE_SUFFIX:'.html',
HAS_SOURCE: true,
SOURCELINK_SUFFIX: ".txt",
};
</script>
<script type="text/javascript" src="../_static/jquery.js"></script>
<script type="text/javascript" src="../_static/underscore.js"></script>
<script type="text/javascript" src="../_static/doctools.js"></script>
<script type="text/javascript" src="../_static/translations.js"></script>
<script type="text/javascript" src="https://cdn.bootcss.com/mathjax/2.7.0/MathJax.js"></script>
<script type="text/javascript" src="../_static/js/theme.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script>
<script src="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/js/perfect-scrollbar.jquery.min.js"></script>
<script src="../_static/js/paddle_doc_init.js"></script>
</body>
</html>
<!DOCTYPE html>
<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>中文词向量模型的使用 &mdash; PaddlePaddle 文档</title>
<link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" />
<link rel="index" title="索引"
href="../../genindex.html"/>
<link rel="search" title="搜索" href="../../search.html"/>
<link rel="top" title="PaddlePaddle 文档" href="../../index.html"/>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/css/perfect-scrollbar.min.css" type="text/css" />
<link rel="stylesheet" href="../../_static/css/override.css" type="text/css" />
<script>
var _hmt = _hmt || [];
(function() {
var hm = document.createElement("script");
hm.src = "//hm.baidu.com/hm.js?b9a314ab40d04d805655aab1deee08ba";
var s = document.getElementsByTagName("script")[0];
s.parentNode.insertBefore(hm, s);
})();
</script>
<script src="../../_static/js/modernizr.min.js"></script>
</head>
<body class="wy-body-for-nav" role="document">
<header class="site-header">
<div class="site-logo">
<a href="/"><img src="../../_static/images/PP_w.png"></a>
</div>
<div class="site-nav-links">
<div class="site-menu">
<a class="fork-on-github" href="https://github.com/PaddlePaddle/Paddle" target="_blank"><i class="fa fa-github"></i>Fork me on Github</a>
<div class="language-switcher dropdown">
<a type="button" data-toggle="dropdown">
<span>English</span>
<i class="fa fa-angle-up"></i>
<i class="fa fa-angle-down"></i>
</a>
<ul class="dropdown-menu">
<li><a href="/doc_cn">中文</a></li>
<li><a href="/doc">English</a></li>
</ul>
</div>
<ul class="site-page-links">
<li><a href="/">Home</a></li>
</ul>
</div>
<div class="doc-module">
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../getstarted/index_cn.html">新手入门</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../howto/index_cn.html">进阶指南</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../api/index_cn.html">API</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../faq/index_cn.html">FAQ</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../mobile/index_cn.html">MOBILE</a></li>
</ul>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div>
</div>
</header>
<div class="main-content-wrap">
<nav class="doc-menu-vertical" role="navigation">
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../getstarted/index_cn.html">新手入门</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../getstarted/build_and_install/index_cn.html">安装与编译</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../getstarted/build_and_install/pip_install_cn.html">使用pip安装</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../getstarted/build_and_install/docker_install_cn.html">使用Docker安装运行</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/dev/build_cn.html">用Docker编译和测试PaddlePaddle</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../getstarted/build_and_install/build_from_source_cn.html">从源码编译</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../getstarted/concepts/use_concepts_cn.html">基本使用概念</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../howto/index_cn.html">进阶指南</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../howto/usage/cmd_parameter/index_cn.html">设置命令行参数</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cmd_parameter/use_case_cn.html">使用案例</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cmd_parameter/arguments_cn.html">参数概述</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cmd_parameter/detail_introduction_cn.html">细节描述</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/usage/cluster/cluster_train_cn.html">分布式训练</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/fabric_cn.html">fabric集群</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/openmpi_cn.html">openmpi集群</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/k8s_cn.html">kubernetes单机</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/k8s_distributed_cn.html">kubernetes distributed分布式</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/k8s_aws_cn.html">AWS上运行kubernetes集群训练</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/usage/capi/index_cn.html">PaddlePaddle C-API</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/capi/compile_paddle_lib_cn.html">编译 PaddlePaddle 预测库</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/capi/organization_of_the_inputs_cn.html">输入/输出数据组织</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/capi/workflow_of_capi_cn.html">C-API 使用流程</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/dev/contribute_to_paddle_cn.html">如何贡献代码</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/dev/write_docs_cn.html">如何贡献/修改文档</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/deep_model/rnn/index_cn.html">RNN相关模型</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../howto/deep_model/rnn/rnn_config_cn.html">RNN配置</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/deep_model/rnn/recurrent_group_cn.html">Recurrent Group教程</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/deep_model/rnn/hierarchical_layer_cn.html">支持双层序列作为输入的Layer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/deep_model/rnn/hrnn_rnn_api_compare_cn.html">单双层RNN API对比介绍</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/optimization/gpu_profiling_cn.html">GPU性能分析与调优</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../api/index_cn.html">API</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/model_configs.html">模型配置</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/activation.html">Activation</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/layer.html">Layers</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/evaluators.html">Evaluators</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/optimizer.html">Optimizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/pooling.html">Pooling</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/networks.html">Networks</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/attr.html">Parameter Attribute</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/data.html">数据访问</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/data/data_reader.html">Data Reader Interface</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/data/image.html">Image Interface</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/data/dataset.html">Dataset</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/run_logic.html">训练与应用</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/fluid.html">Fluid</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/layers.html">Layers</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/data_feeder.html">DataFeeder</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/executor.html">Executor</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/initializer.html">Initializer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/evaluator.html">Evaluator</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/nets.html">Nets</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/optimizer.html">Optimizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/param_attr.html">ParamAttr</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/profiler.html">Profiler</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/regularizer.html">Regularizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/io.html">IO</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../faq/index_cn.html">FAQ</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../faq/build_and_install/index_cn.html">编译安装与单元测试</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../faq/model/index_cn.html">模型配置</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../faq/parameter/index_cn.html">参数设置</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../faq/local/index_cn.html">本地训练与预测</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../faq/cluster/index_cn.html">集群训练与预测</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../mobile/index_cn.html">MOBILE</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../mobile/cross_compiling_for_android_cn.html">Android平台编译指南</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../mobile/cross_compiling_for_ios_cn.html">iOS平台编译指南</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../mobile/cross_compiling_for_raspberry_cn.html">Raspberry Pi平台编译指南</a></li>
</ul>
</li>
</ul>
</nav>
<section class="doc-content-wrap">
<div role="navigation" aria-label="breadcrumbs navigation">
<ul class="wy-breadcrumbs">
<li>中文词向量模型的使用</li>
</ul>
</div>
<div class="wy-nav-content" id="doc-content">
<div class="rst-content">
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<div class="section" id="">
<span id="id1"></span><h1>中文词向量模型的使用<a class="headerlink" href="#" title="永久链接至标题"></a></h1>
<hr class="docutils" />
<p>本文档介绍如何在PaddlePaddle平台上,使用预训练的标准格式词向量模型。</p>
<p>在此感谢 &#64;lipeng 提出的代码需求,并给出的相关模型格式的定义。</p>
<div class="section" id="">
<span id="id2"></span><h2>介绍<a class="headerlink" href="#" title="永久链接至标题"></a></h2>
<div class="section" id="">
<span id="id3"></span><h3>中文字典<a class="headerlink" href="#" title="永久链接至标题"></a></h3>
<p>我们的字典使用内部的分词工具对百度知道和百度百科的语料进行分词后产生。分词风格如下: &#8220;《红楼梦》&#8221;将被分为 &#8220;《&#8221;、&#8220;红楼梦&#8221;、&#8220;》&#8221;,以及 &#8220;《红楼梦》&#8221;。字典采用UTF-8编码,共有2列:词本身和词频。字典共包含 3206326个词和4个特殊标记(字典的读取示例见本节末尾):</p>
<ul class="simple">
<li><code class="docutils literal"><span class="pre">&lt;s&gt;</span></code>: 分词序列的开始</li>
<li><code class="docutils literal"><span class="pre">&lt;e&gt;</span></code>: 分词序列的结束</li>
<li><code class="docutils literal"><span class="pre">PALCEHOLDER_JUST_IGNORE_THE_EMBEDDING</span></code>: 占位符,没有实际意义</li>
<li><code class="docutils literal"><span class="pre">&lt;unk&gt;</span></code>: 未知词</li>
</ul>
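<p>下面给出一个读取该字典文件的示意脚本(仅为示意:其中的文件名 <code class="docutils literal"><span class="pre">your_dict_file</span></code> 以及&#8220;两列以空白字符分隔&#8221;均为假设,请以实际下载得到的文件为准):</p>
<div class="highlight-default"><div class="highlight"><pre><span></span># A minimal sketch for loading the dictionary described above.
# Assumptions: the file path "your_dict_file" is a placeholder, and the two
# columns (word, frequency) are whitespace-separated.
def load_dict(path):
    word2freq = {}
    with open(path, 'r') as f:
        for line in f:
            cols = line.rstrip('\n').split()
            if len(cols) &lt; 2:
                continue
            word2freq[cols[0]] = int(cols[1])
    return word2freq

word2freq = load_dict('your_dict_file')
# The four special tokens described above should appear in the dictionary.
for token in ['&lt;s&gt;', '&lt;e&gt;', 'PALCEHOLDER_JUST_IGNORE_THE_EMBEDDING', '&lt;unk&gt;']:
    print('%s %s' % (token, word2freq.get(token)))
print('total words: %d' % len(word2freq))
</pre></div>
</div>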
</div>
<div class="section" id="">
<span id="id4"></span><h3>中文词向量的预训练模型<a class="headerlink" href="#" title="永久链接至标题"></a></h3>
<p>遵循文章 <a class="reference external" href="http://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf">A Neural Probabilistic Language Model</a>中介绍的方法,模型采用 n-gram 语言模型,结构如下图:6元上下文作为输入层-&gt;全连接层-&gt;softmax层 。对应于字典,我们预训练得到4种不同维度的词向量,分别为:32维、64维、128维和256维。
<center><img alt="" src="../../_images/neural-n-gram-model.png" /></center>
<center>Figure 1. neural-n-gram-model</center></p>
</div>
<div class="section" id="">
<span id="id5"></span><h3>下载和数据抽取<a class="headerlink" href="#" title="永久链接至标题"></a></h3>
<p>运行以下的命令下载和获取我们的字典和预训练模型:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span>cd $PADDLE_ROOT/demo/model_zoo/embedding
./pre_DictAndModel.sh
</pre></div>
</div>
</div>
</div>
<div class="section" id="">
<span id="id6"></span><h2>中文短语改写的例子<a class="headerlink" href="#" title="永久链接至标题"></a></h2>
<p>以下示范如何使用预训练的中文字典和词向量进行短语改写。</p>
<div class="section" id="">
<span id="id7"></span><h3>数据的准备和预处理<a class="headerlink" href="#" title="永久链接至标题"></a></h3>
<p>首先,运行以下的命令下载数据集。该数据集(utf8编码)包含20个训练样例,5个测试样例和2个生成式样例。</p>
<div class="highlight-default"><div class="highlight"><pre><span></span>cd $PADDLE_ROOT/demo/seqToseq/data
./paraphrase_data.sh
</pre></div>
</div>
<p>第二步,将数据处理成规范格式,并在训练数据集上构建字典(预处理后的数据将保存在 <code class="docutils literal"><span class="pre">$PADDLE_SOURCE_ROOT/demo/seqToseq/data/pre-paraphrase</span></code>):</p>
<div class="highlight-default"><div class="highlight"><pre><span></span>cd $PADDLE_ROOT/demo/seqToseq/
python preprocess.py -i data/paraphrase [--mergeDict]
</pre></div>
</div>
<ul class="simple">
<li>其中,如果使用<code class="docutils literal"><span class="pre">--mergeDict</span></code>选项,源语言短语和目标语言短语的字典将被合并(源语言和目标语言共享相同的编码字典)。本实例中,源语言和目标语言都是相同的语言,因此可以使用该选项。</li>
</ul>
</div>
<div class="section" id="">
<span id="id8"></span><h3>使用用户指定的词向量字典<a class="headerlink" href="#" title="永久链接至标题"></a></h3>
<p>使用如下命令,从预训练模型中,根据用户指定的字典,抽取对应的词向量构成新的词表:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span>cd $PADDLE_ROOT/demo/model_zoo/embedding
python extract_para.py --preModel PREMODEL --preDict PREDICT --usrModel USRMODEL --usrDict USRDICT -d DIM
</pre></div>
</div>
<ul class="simple">
<li><code class="docutils literal"><span class="pre">--preModel</span> <span class="pre">PREMODEL</span></code>: 预训练词向量字典模型的路径</li>
<li><code class="docutils literal"><span class="pre">--preDict</span> <span class="pre">PREDICT</span></code>: 预训练模型使用的字典的路径</li>
<li><code class="docutils literal"><span class="pre">--usrModel</span> <span class="pre">USRMODEL</span></code>: 抽取出的新词表的保存路径</li>
<li><code class="docutils literal"><span class="pre">--usrDict</span> <span class="pre">USRDICT</span></code>: 用户指定新的字典的路径,用于构成新的词表</li>
<li><code class="docutils literal"><span class="pre">-d</span> <span class="pre">DIM</span></code>: 参数(词向量)的维度</li>
</ul>
<p>此处,你也可以简单地运行以下命令:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span>cd $PADDLE_ROOT/demo/seqToseq/data/
./paraphrase_model.sh
</pre></div>
</div>
<p>运行成功以后,你将会看到以下的模型结构:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">paraphrase_model</span>
<span class="o">|---</span> <span class="n">_source_language_embedding</span>
<span class="o">|---</span> <span class="n">_target_language_embedding</span>
</pre></div>
</div>
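<p>如果想检查抽取出的词向量文件,可以用类似下面的示意脚本统计其中包含的词数(仅为示意:这里假设二进制参数文件以16字节的头部开始、其后为float32数值,这一假设与下文&#8220;观测词向量&#8221;一节给出的头部信息相对应;文件路径和维度32也请按实际情况填写):</p>
<div class="highlight-default"><div class="highlight"><pre><span></span># A minimal sketch: count how many words an extracted embedding file holds.
# Assumption: the binary parameter file starts with a 16-byte header followed
# by float32 values; this matches the header fields (version, float bytes,
# parameter count) described in the text-model section below.
import numpy as np

def count_words(path, dim):
    with open(path, 'rb') as f:
        f.read(16)  # skip the assumed 16-byte header
        values = np.fromfile(f, dtype=np.float32)
    return len(values) // dim

# dim=32 matches the 32-dimensional embedding used later in train.conf;
# adjust the path to where paraphrase_model is actually generated.
print(count_words('paraphrase_model/_source_language_embedding', 32))
</pre></div>
</div>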
</div>
<div class="section" id="paddlepaddle">
<span id="paddlepaddle"></span><h3>在PaddlePaddle平台训练模型<a class="headerlink" href="#paddlepaddle" title="永久链接至标题"></a></h3>
<p>首先,配置模型文件,配置如下(可以参考保存在 <code class="docutils literal"><span class="pre">demo/seqToseq/paraphrase/train.conf</span></code>的配置):</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">seqToseq_net</span> <span class="k">import</span> <span class="o">*</span>
<span class="n">is_generating</span> <span class="o">=</span> <span class="kc">False</span>
<span class="c1">################## Data Definition #####################</span>
<span class="n">train_conf</span> <span class="o">=</span> <span class="n">seq_to_seq_data</span><span class="p">(</span><span class="n">data_dir</span> <span class="o">=</span> <span class="s2">&quot;./data/pre-paraphrase&quot;</span><span class="p">,</span>
<span class="n">job_mode</span> <span class="o">=</span> <span class="n">job_mode</span><span class="p">)</span>
<span class="c1">############## Algorithm Configuration ##################</span>
<span class="n">settings</span><span class="p">(</span>
<span class="n">learning_method</span> <span class="o">=</span> <span class="n">AdamOptimizer</span><span class="p">(),</span>
<span class="n">batch_size</span> <span class="o">=</span> <span class="mi">50</span><span class="p">,</span>
<span class="n">learning_rate</span> <span class="o">=</span> <span class="mf">5e-4</span><span class="p">)</span>
<span class="c1">################# Network configure #####################</span>
<span class="n">gru_encoder_decoder</span><span class="p">(</span><span class="n">train_conf</span><span class="p">,</span> <span class="n">is_generating</span><span class="p">,</span> <span class="n">word_vector_dim</span> <span class="o">=</span> <span class="mi">32</span><span class="p">)</span>
</pre></div>
</div>
<p>这个配置与<code class="docutils literal"><span class="pre">demo/seqToseq/translation/train.conf</span></code> 基本相同。</p>
<p>然后,使用以下命令进行模型训练:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span>cd $PADDLE_SOURCE_ROOT/demo/seqToseq/paraphrase
./train.sh
</pre></div>
</div>
<p>其中,<code class="docutils literal"><span class="pre">train.sh</span></code><code class="docutils literal"><span class="pre">demo/seqToseq/translation/train.sh</span></code> 基本相同,只有2个配置不一样:</p>
<ul class="simple">
<li><code class="docutils literal"><span class="pre">--init_model_path</span></code>: 初始化模型的路径配置为<code class="docutils literal"><span class="pre">data/paraphrase_modeldata/paraphrase_model</span></code></li>
<li><code class="docutils literal"><span class="pre">--load_missing_parameter_strategy</span></code>:如果参数模型文件缺失,除词向量模型外的参数将使用正态分布随机初始化</li>
</ul>
<p>如果用户想要了解详细的数据集的格式、模型的结构和训练过程,请查看 <a class="reference external" href="v1_api_tutorials/text_generation/index_cn.md">Text generation Tutorial</a>.</p>
</div>
</div>
<div class="section" id="">
<span id="id9"></span><h2>可选功能<a class="headerlink" href="#" title="永久链接至标题"></a></h2>
<div class="section" id="">
<span id="id10"></span><h3>观测词向量<a class="headerlink" href="#" title="永久链接至标题"></a></h3>
<p>PaddlePaddle 平台为想观测词向量的用户提供了将二进制词向量模型转换为文本模型的功能:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span>cd $PADDLE_ROOT/demo/model_zoo/embedding
python paraconvert.py --b2t -i INPUT -o OUTPUT -d DIM
</pre></div>
</div>
<ul class="simple">
<li><code class="docutils literal"><span class="pre">-i</span> <span class="pre">INPUT</span></code>: 输入的(二进制)词向量模型名称</li>
<li><code class="docutils literal"><span class="pre">-o</span> <span class="pre">OUTPUT</span></code>: 输出的文本模型名称</li>
<li><code class="docutils literal"><span class="pre">-d</span> <span class="pre">DIM</span></code>: (词向量)参数维度</li>
</ul>
<p>运行完以上命令,用户可以在输出的文本模型中看到:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="mi">0</span><span class="p">,</span><span class="mi">4</span><span class="p">,</span><span class="mi">32156096</span>
<span class="o">-</span><span class="mf">0.7845433</span><span class="p">,</span><span class="mf">1.1937413</span><span class="p">,</span><span class="o">-</span><span class="mf">0.1704215</span><span class="p">,</span><span class="mf">0.4154715</span><span class="p">,</span><span class="mf">0.9566584</span><span class="p">,</span><span class="o">-</span><span class="mf">0.5558153</span><span class="p">,</span><span class="o">-</span><span class="mf">0.2503305</span><span class="p">,</span> <span class="o">......</span>
<span class="mf">0.0000909</span><span class="p">,</span><span class="mf">0.0009465</span><span class="p">,</span><span class="o">-</span><span class="mf">0.0008813</span><span class="p">,</span><span class="o">-</span><span class="mf">0.0008428</span><span class="p">,</span><span class="mf">0.0007879</span><span class="p">,</span><span class="mf">0.0000183</span><span class="p">,</span><span class="mf">0.0001984</span><span class="p">,</span> <span class="o">......</span>
<span class="o">......</span>
</pre></div>
</div>
<ul class="simple">
<li>其中,第一行是<code class="docutils literal"><span class="pre">PaddlePaddle</span></code> 输出文件的格式说明,包含3个属性(本节末尾给出一个解析示例):<ul>
<li><code class="docutils literal"><span class="pre">PaddlePaddle</span></code>的版本号,本例中为0</li>
<li>浮点数占用的字节数,本例中为4</li>
<li>总计的参数个数,本例中为32,156,096</li>
</ul>
</li>
<li>其余行是(词向量)参数行(假设词向量维度为32)<ul>
<li>每行打印32个参数,以&#8217;,&#8217;分隔</li>
<li>共有32,156,096/32 = 1,004,878行,也就是说,模型共包含1,004,878个被向量化的词</li>
</ul>
</li>
</ul>
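<p>下面给出一个解析上述文本模型的示意脚本(仅为示意:文件名 <code class="docutils literal"><span class="pre">text_model</span></code> 为占位符,并假设词向量维度为32):</p>
<div class="highlight-default"><div class="highlight"><pre><span></span># A minimal sketch for parsing the text-format model produced above.
# Assumptions: the file name "text_model" is a placeholder, and the word
# vector dimension is 32 (other dimensions work the same way).
import numpy as np

with open('text_model', 'r') as f:
    header = f.readline().strip().split(',')
    version, float_bytes, total = header[0], header[1], int(header[2])
    rows = []
    for line in f:
        values = [float(v) for v in line.strip().split(',') if v]
        rows.append(values)

embeddings = np.array(rows, dtype='float32')
print('version=%s, float bytes=%s, total params=%d' % (version, float_bytes, total))
print('rows x dim: %d x %d' % (embeddings.shape[0], embeddings.shape[1]))
</pre></div>
</div>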
</div>
<div class="section" id="">
<span id="id11"></span><h3>词向量模型的修正<a class="headerlink" href="#" title="永久链接至标题"></a></h3>
<p><code class="docutils literal"><span class="pre">PaddlePaddle</span></code> 为想修正词向量模型的用户提供了将文本词向量模型转换为二进制模型的命令:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span>cd $PADDLE_ROOT/demo/model_zoo/embedding
python paraconvert.py --t2b -i INPUT -o OUTPUT
</pre></div>
</div>
<ul class="simple">
<li><code class="docutils literal"><span class="pre">-i</span> <span class="pre">INPUT</span></code>: 输入的文本词向量模型名称</li>
<li><code class="docutils literal"><span class="pre">-o</span> <span class="pre">OUTPUT</span></code>: 输出的二进制词向量模型名称</li>
</ul>
<p>请注意,输入的文本格式如下:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="o">-</span><span class="mf">0.7845433</span><span class="p">,</span><span class="mf">1.1937413</span><span class="p">,</span><span class="o">-</span><span class="mf">0.1704215</span><span class="p">,</span><span class="mf">0.4154715</span><span class="p">,</span><span class="mf">0.9566584</span><span class="p">,</span><span class="o">-</span><span class="mf">0.5558153</span><span class="p">,</span><span class="o">-</span><span class="mf">0.2503305</span><span class="p">,</span> <span class="o">......</span>
<span class="mf">0.0000909</span><span class="p">,</span><span class="mf">0.0009465</span><span class="p">,</span><span class="o">-</span><span class="mf">0.0008813</span><span class="p">,</span><span class="o">-</span><span class="mf">0.0008428</span><span class="p">,</span><span class="mf">0.0007879</span><span class="p">,</span><span class="mf">0.0000183</span><span class="p">,</span><span class="mf">0.0001984</span><span class="p">,</span> <span class="o">......</span>
<span class="o">......</span>
</pre></div>
</div>
<ul class="simple">
<li>输入文本中没有头部(格式说明)行</li>
<li>(输入文本)每行存储一个词的全部参数,参数之间以逗号&#8217;,&#8217;分隔(见下方示意)</li>
</ul>
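<p>例如,可以用类似下面的示意脚本按上述格式写出一个文本模型,再用 <code class="docutils literal"><span class="pre">--t2b</span></code> 转回二进制模型(仅为示意:其中的数组内容和输出文件名 <code class="docutils literal"><span class="pre">new_text_model</span></code> 均为假设):</p>
<div class="highlight-default"><div class="highlight"><pre><span></span># A minimal sketch: write an embedding matrix in the text format described
# above (no header line, one word per line, values separated by commas), so
# that it can then be converted back with "python paraconvert.py --t2b".
# The matrix content and the file name "new_text_model" are placeholders.
import numpy as np

embeddings = np.random.rand(1000, 32).astype('float32')  # placeholder data
with open('new_text_model', 'w') as f:
    for row in embeddings:
        f.write(','.join('%.7f' % v for v in row) + '\n')
</pre></div>
</div>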
</div>
</div>
</div>
</div>
</div>
<footer>
<hr/>
<div role="contentinfo">
<p>
&copy; Copyright 2016, PaddlePaddle developers.
</p>
</div>
Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/snide/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT:'../../',
VERSION:'',
COLLAPSE_INDEX:false,
FILE_SUFFIX:'.html',
HAS_SOURCE: true,
SOURCELINK_SUFFIX: ".txt",
};
</script>
<script type="text/javascript" src="../../_static/jquery.js"></script>
<script type="text/javascript" src="../../_static/underscore.js"></script>
<script type="text/javascript" src="../../_static/doctools.js"></script>
<script type="text/javascript" src="../../_static/translations.js"></script>
<script type="text/javascript" src="https://cdn.bootcss.com/mathjax/2.7.0/MathJax.js"></script>
<script type="text/javascript" src="../../_static/js/theme.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script>
<script src="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/js/perfect-scrollbar.jquery.min.js"></script>
<script src="../../_static/js/paddle_doc_init.js"></script>
</body>
</html>
<!DOCTYPE html>
<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Model Zoo - ImageNet &mdash; PaddlePaddle 文档</title>
<link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" />
<link rel="index" title="索引"
href="../../genindex.html"/>
<link rel="search" title="搜索" href="../../search.html"/>
<link rel="top" title="PaddlePaddle 文档" href="../../index.html"/>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/css/perfect-scrollbar.min.css" type="text/css" />
<link rel="stylesheet" href="../../_static/css/override.css" type="text/css" />
<script>
var _hmt = _hmt || [];
(function() {
var hm = document.createElement("script");
hm.src = "//hm.baidu.com/hm.js?b9a314ab40d04d805655aab1deee08ba";
var s = document.getElementsByTagName("script")[0];
s.parentNode.insertBefore(hm, s);
})();
</script>
<script src="../../_static/js/modernizr.min.js"></script>
</head>
<body class="wy-body-for-nav" role="document">
<header class="site-header">
<div class="site-logo">
<a href="/"><img src="../../_static/images/PP_w.png"></a>
</div>
<div class="site-nav-links">
<div class="site-menu">
<a class="fork-on-github" href="https://github.com/PaddlePaddle/Paddle" target="_blank"><i class="fa fa-github"></i>Fork me on Github</a>
<div class="language-switcher dropdown">
<a type="button" data-toggle="dropdown">
<span>English</span>
<i class="fa fa-angle-up"></i>
<i class="fa fa-angle-down"></i>
</a>
<ul class="dropdown-menu">
<li><a href="/doc_cn">中文</a></li>
<li><a href="/doc">English</a></li>
</ul>
</div>
<ul class="site-page-links">
<li><a href="/">Home</a></li>
</ul>
</div>
<div class="doc-module">
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../getstarted/index_cn.html">新手入门</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../howto/index_cn.html">进阶指南</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../api/index_cn.html">API</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../faq/index_cn.html">FAQ</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../mobile/index_cn.html">MOBILE</a></li>
</ul>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div>
</div>
</header>
<div class="main-content-wrap">
<nav class="doc-menu-vertical" role="navigation">
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../getstarted/index_cn.html">新手入门</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../getstarted/build_and_install/index_cn.html">安装与编译</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../getstarted/build_and_install/pip_install_cn.html">使用pip安装</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../getstarted/build_and_install/docker_install_cn.html">使用Docker安装运行</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/dev/build_cn.html">用Docker编译和测试PaddlePaddle</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../getstarted/build_and_install/build_from_source_cn.html">从源码编译</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../getstarted/concepts/use_concepts_cn.html">基本使用概念</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../howto/index_cn.html">进阶指南</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../howto/usage/cmd_parameter/index_cn.html">设置命令行参数</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cmd_parameter/use_case_cn.html">使用案例</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cmd_parameter/arguments_cn.html">参数概述</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cmd_parameter/detail_introduction_cn.html">细节描述</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/usage/cluster/cluster_train_cn.html">分布式训练</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/fabric_cn.html">fabric集群</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/openmpi_cn.html">openmpi集群</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/k8s_cn.html">kubernetes单机</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/k8s_distributed_cn.html">kubernetes distributed分布式</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/k8s_aws_cn.html">AWS上运行kubernetes集群训练</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/usage/capi/index_cn.html">PaddlePaddle C-API</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/capi/compile_paddle_lib_cn.html">编译 PaddlePaddle 预测库</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/capi/organization_of_the_inputs_cn.html">输入/输出数据组织</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/capi/workflow_of_capi_cn.html">C-API 使用流程</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/dev/contribute_to_paddle_cn.html">如何贡献代码</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/dev/write_docs_cn.html">如何贡献/修改文档</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/deep_model/rnn/index_cn.html">RNN相关模型</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../howto/deep_model/rnn/rnn_config_cn.html">RNN配置</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/deep_model/rnn/recurrent_group_cn.html">Recurrent Group教程</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/deep_model/rnn/hierarchical_layer_cn.html">支持双层序列作为输入的Layer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/deep_model/rnn/hrnn_rnn_api_compare_cn.html">单双层RNN API对比介绍</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/optimization/gpu_profiling_cn.html">GPU性能分析与调优</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../api/index_cn.html">API</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/model_configs.html">模型配置</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/activation.html">Activation</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/layer.html">Layers</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/evaluators.html">Evaluators</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/optimizer.html">Optimizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/pooling.html">Pooling</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/networks.html">Networks</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/attr.html">Parameter Attribute</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/data.html">数据访问</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/data/data_reader.html">Data Reader Interface</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/data/image.html">Image Interface</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/data/dataset.html">Dataset</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/run_logic.html">训练与应用</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/fluid.html">Fluid</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/layers.html">Layers</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/data_feeder.html">DataFeeder</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/executor.html">Executor</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/initializer.html">Initializer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/evaluator.html">Evaluator</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/nets.html">Nets</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/optimizer.html">Optimizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/param_attr.html">ParamAttr</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/profiler.html">Profiler</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/regularizer.html">Regularizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/io.html">IO</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../faq/index_cn.html">FAQ</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../faq/build_and_install/index_cn.html">编译安装与单元测试</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../faq/model/index_cn.html">模型配置</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../faq/parameter/index_cn.html">参数设置</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../faq/local/index_cn.html">本地训练与预测</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../faq/cluster/index_cn.html">集群训练与预测</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../mobile/index_cn.html">MOBILE</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../mobile/cross_compiling_for_android_cn.html">Android平台编译指南</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../mobile/cross_compiling_for_ios_cn.html">iOS平台编译指南</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../mobile/cross_compiling_for_raspberry_cn.html">Raspberry Pi平台编译指南</a></li>
</ul>
</li>
</ul>
</nav>
<section class="doc-content-wrap">
<div role="navigation" aria-label="breadcrumbs navigation">
<ul class="wy-breadcrumbs">
<li>Model Zoo - ImageNet</li>
</ul>
</div>
<div class="wy-nav-content" id="doc-content">
<div class="rst-content">
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<div class="section" id="model-zoo-imagenet">
<span id="model-zoo-imagenet"></span><h1>Model Zoo - ImageNet<a class="headerlink" href="#model-zoo-imagenet" title="永久链接至标题"></a></h1>
<p><a class="reference external" href="http://www.image-net.org/">ImageNet</a> 是通用物体分类领域一个众所周知的数据库。本教程提供了一个用于ImageNet上的卷积分类网络模型。</p>
<div class="section" id="resnet">
<span id="resnet"></span><h2>ResNet 介绍<a class="headerlink" href="#resnet" title="永久链接至标题"></a></h2>
<p>论文 <a class="reference external" href="http://arxiv.org/abs/1512.03385">Deep Residual Learning for Image Recognition</a> 中提出的ResNet网络结构在2015年ImageNet大规模视觉识别竞赛(ILSVRC 2015)的分类任务中赢得了第一名。他们提出残差学习的框架来简化网络的训练,所构建网络结构的的深度比之前使用的网络有大幅度的提高。下图展示的是基于残差的连接方式。左图构造网络模块的方式被用于34层的网络中,而右图的瓶颈连接模块用于50层,101层和152层的网络结构中。</p>
<p><center><img alt="resnet_block" src="../../_images/resnet_block.jpg" /></center>
<center>图 1. ResNet 网络模块</center></p>
<p>本教程中我们给出了三个ResNet模型,这些模型都是由原作者提供的模型<a class="reference external" href="https://github.com/KaimingHe/deep-residual-networks">https://github.com/KaimingHe/deep-residual-networks</a>转换过来的。我们使用PaddlePaddle在ILSVRC的验证集共50,000幅图像上测试了模型的分类错误率,其中输入图像的颜色通道顺序为<strong>BGR</strong>,保持宽高比缩放到短边为256,只截取中心方形的图像区域。分类错误率和模型大小由下表给出。
<center></p>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border">
<colgroup>
<col class="left" />
<col class="left" />
<col class="left" />
</colgroup>
<thead>
<tr>
<th scope="col" class="left">ResNet</th>
<th scope="col" class="left">Top-1</th>
<th scope="col" class="left">Model Size</th>
</tr>
</thead><tbody>
<tr>
<td class="left">ResNet-50</td>
<td class="left">24.9%</td>
<td class="left">99M</td>
</tr>
<tr>
<td class="left">ResNet-101</td>
<td class="left">23.7%</td>
<td class="left">173M</td>
</tr>
<tr>
<td class="left">ResNet-152</td>
<td class="left">23.2%</td>
<td class="left">234M</td>
</tr>
</tbody></table></center>
<br></div>
<div class="section" id="resnet">
<span id="id1"></span><h2>ResNet 模型<a class="headerlink" href="#resnet" title="永久链接至标题"></a></h2>
<p>50层,101层和152层的网络配置文件可参照<code class="docutils literal"><span class="pre">demo/model_zoo/resnet/resnet.py</span></code>。你也可以通过在命令行参数中增加一个参数如<code class="docutils literal"><span class="pre">--config_args=layer_num=50</span></code>来指定网络层的数目。</p>
<div class="section" id="">
<span id="id2"></span><h3>网络可视化<a class="headerlink" href="#" title="永久链接至标题"></a></h3>
<p>你可以通过执行下面的命令来得到ResNet网络的结构可视化图。该脚本会生成一个dot文件,然后可以转换为图片。需要安装graphviz来转换dot文件为图片。</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">cd</span> <span class="n">demo</span><span class="o">/</span><span class="n">model_zoo</span><span class="o">/</span><span class="n">resnet</span>
<span class="o">./</span><span class="n">net_diagram</span><span class="o">.</span><span class="n">sh</span>
</pre></div>
</div>
</div>
<div class="section" id="">
<span id="id3"></span><h3>模型下载<a class="headerlink" href="#" title="永久链接至标题"></a></h3>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">cd</span> <span class="n">demo</span><span class="o">/</span><span class="n">model_zoo</span><span class="o">/</span><span class="n">resnet</span>
<span class="o">./</span><span class="n">get_model</span><span class="o">.</span><span class="n">sh</span>
</pre></div>
</div>
<p>你可以执行上述命令来下载所有的模型和均值文件,如果下载成功,这些文件将会被保存在<code class="docutils literal"><span class="pre">demo/model_zoo/resnet/model</span></code>路径下。</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">mean_meta_224</span> <span class="n">resnet_101</span> <span class="n">resnet_152</span> <span class="n">resnet_50</span>
</pre></div>
</div>
<ul class="simple">
<li>resnet_50: 50层网络模型。</li>
<li>resnet_101: 101层网络模型。</li>
<li>resnet_152: 152层网络模型。</li>
<li>mean_meta_224: 均值图像文件,图像大小为3 x 224 x 224,颜色通道顺序为<strong>BGR</strong>。你也可以使用这三个值: 103.939, 116.779, 123.68(去均值的示意脚本见下方)。</li>
</ul>
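<p>例如,下面的示意脚本展示了如何用上述三个均值,对一幅尺寸为3 x 224 x 224、通道顺序为BGR的图像做去均值处理(仅为示意,图像内容用随机数代替):</p>
<div class="highlight-default"><div class="highlight"><pre><span></span># A minimal sketch: subtract the per-channel means listed above
# (BGR order: 103.939, 116.779, 123.68) from a 3 x 224 x 224 image.
# The image content here is random placeholder data.
import numpy as np

image_bgr = np.random.randint(0, 256, size=(3, 224, 224)).astype('float32')
mean_values = np.array([103.939, 116.779, 123.68], dtype='float32')
image_centered = image_bgr - mean_values.reshape(3, 1, 1)
print(image_centered.shape)
</pre></div>
</div>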
</div>
<div class="section" id="">
<span id="id4"></span><h3>参数信息<a class="headerlink" href="#" title="永久链接至标题"></a></h3>
<ul>
<li><p class="first"><strong>卷积层权重</strong></p>
<p>由于每个卷积层后面连接的是batch normalization层,因此该层中没有偏置(bias)参数,并且只有一个权重。
形状: <code class="docutils literal"><span class="pre">(Co,</span> <span class="pre">ky,</span> <span class="pre">kx,</span> <span class="pre">Ci)</span></code></p>
<ul class="simple">
<li>Co: 输出特征图的通道数目</li>
<li>ky: 滤波器核在垂直方向上的尺寸</li>
<li>kx: 滤波器核在水平方向上的尺寸</li>
<li>Ci: 输入特征图的通道数目</li>
</ul>
<p>二维矩阵: (Co * ky * kx, Ci), 行优先次序存储。</p>
</li>
<li><p class="first"><strong>全连接层权重</strong></p>
<p>二维矩阵: (输入层尺寸, 本层尺寸), 行优先次序存储。</p>
</li>
<li><p class="first"><strong><a class="reference external" href="http://arxiv.org/abs/1502.03167">Batch Normalization</a> 层权重</strong></p>
</li>
</ul>
<p>本层有四个参数,实际上只有.w0和.wbias是需要学习的参数,另外两个分别是滑动均值和方差。在测试阶段它们将会被加载到模型中。下表展示了batch normalization层的参数。
<center></p>
<table border="2" cellspacing="0" cellpadding="6" rules="all" frame="border">
<colgroup>
<col class="left" />
<col class="left" />
<col class="left" />
</colgroup>
<thead>
<tr>
<th scope="col" class="left">参数名</th>
<th scope="col" class="left">尺寸</th>
<th scope="col" class="left">含义</th>
</tr>
</thead><tbody>
<tr>
<td class="left">_res2_1_branch1_bn.w0</td>
<td class="left">256</td>
<td class="left">gamma, 缩放参数</td>
</tr>
<tr>
<td class="left">_res2_1_branch1_bn.w1</td>
<td class="left">256</td>
<td class="left">特征图均值</td>
</tr>
<tr>
<td class="left">_res2_1_branch1_bn.w2</td>
<td class="left">256</td>
<td class="left">特征图方差</td>
</tr>
<tr>
<td class="left">_res2_1_branch1_bn.wbias</td>
<td class="left">256</td>
<td class="left">beta, 偏置参数</td>
</tr>
</tbody></table></center>
<br></div>
<div class="section" id="">
<span id="id5"></span><h3>参数读取<a class="headerlink" href="#" title="永久链接至标题"></a></h3>
<p>使用者可以使用下面的Python脚本来读取参数值:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
<span class="k">def</span> <span class="nf">load</span><span class="p">(</span><span class="n">file_name</span><span class="p">):</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">file_name</span><span class="p">,</span> <span class="s1">&#39;rb&#39;</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
<span class="n">f</span><span class="o">.</span><span class="n">read</span><span class="p">(</span><span class="mi">16</span><span class="p">)</span> <span class="c1"># skip header for float type.</span>
<span class="k">return</span> <span class="n">np</span><span class="o">.</span><span class="n">fromfile</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">float32</span><span class="p">)</span>
<span class="k">if</span> <span class="vm">__name__</span><span class="o">==</span><span class="s1">&#39;__main__&#39;</span><span class="p">:</span>
<span class="n">weight</span> <span class="o">=</span> <span class="n">load</span><span class="p">(</span><span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
</pre></div>
</div>
<p>或者直接使用下面的shell命令:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">od</span> <span class="o">-</span><span class="n">j</span> <span class="mi">16</span> <span class="o">-</span><span class="n">f</span> <span class="n">_res2_1_branch1_bn</span><span class="o">.</span><span class="n">w0</span>
</pre></div>
</div>
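<p>结合上文&#8220;参数信息&#8221;一节的描述,可以在 <code class="docutils literal"><span class="pre">load()</span></code> 的基础上把卷积层权重还原成二维矩阵。下面是一个示意脚本(仅为示意:其中的参数文件名以及 Co、ky、kx、Ci 的取值均为假设,请按实际网络层填写):</p>
<div class="highlight-default"><div class="highlight"><pre><span></span># A minimal sketch: reshape a convolution weight, loaded the same way as the
# load() script above, into the 2-D matrix (Co * ky * kx, Ci) described in
# the parameter-info section. The parameter file name and the values of
# Co, ky, kx, Ci below are placeholders.
import numpy as np

def load(file_name):
    with open(file_name, 'rb') as f:
        f.read(16)  # skip header for float type.
        return np.fromfile(f, dtype=np.float32)

Co, ky, kx, Ci = 64, 7, 7, 3              # placeholder layer configuration
weight = load('_conv1_weight.w0')         # placeholder parameter file name
weight_matrix = weight.reshape(Co * ky * kx, Ci)
print(weight_matrix.shape)
</pre></div>
</div>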
</div>
</div>
<div class="section" id="">
<span id="id6"></span><h2>特征提取<a class="headerlink" href="#" title="永久链接至标题"></a></h2>
<p>我们提供了C++和Python接口来提取特征。下面的例子使用了<code class="docutils literal"><span class="pre">demo/model_zoo/resnet/example</span></code>中的数据,详细地展示了整个特征提取的过程。</p>
<div class="section" id="c">
<span id="c"></span><h3>C++接口<a class="headerlink" href="#c" title="永久链接至标题"></a></h3>
<p>首先,在配置文件中的<code class="docutils literal"><span class="pre">define_py_data_sources2</span></code>里指定图像数据列表,具体请参照示例<code class="docutils literal"><span class="pre">demo/model_zoo/resnet/resnet.py</span></code></p>
<div class="highlight-default"><div class="highlight"><pre><span></span> <span class="n">train_list</span> <span class="o">=</span> <span class="s1">&#39;train.list&#39;</span> <span class="k">if</span> <span class="ow">not</span> <span class="n">is_test</span> <span class="k">else</span> <span class="kc">None</span>
<span class="c1"># mean.meta is mean file of ImageNet dataset.</span>
<span class="c1"># mean.meta size : 3 x 224 x 224.</span>
<span class="c1"># If you use three mean value, set like:</span>
<span class="c1"># &quot;mean_value:103.939,116.779,123.68;&quot;</span>
<span class="n">args</span><span class="o">=</span><span class="p">{</span>
<span class="s1">&#39;mean_meta&#39;</span><span class="p">:</span> <span class="s2">&quot;model/mean_meta_224/mean.meta&quot;</span><span class="p">,</span>
<span class="s1">&#39;image_size&#39;</span><span class="p">:</span> <span class="mi">224</span><span class="p">,</span> <span class="s1">&#39;crop_size&#39;</span><span class="p">:</span> <span class="mi">224</span><span class="p">,</span>
<span class="s1">&#39;color&#39;</span><span class="p">:</span> <span class="kc">True</span><span class="p">,</span><span class="s1">&#39;swap_channel:&#39;</span><span class="p">:</span> <span class="p">[</span><span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">]}</span>
<span class="n">define_py_data_sources2</span><span class="p">(</span><span class="n">train_list</span><span class="p">,</span>
<span class="s1">&#39;example/test.list&#39;</span><span class="p">,</span>
<span class="n">module</span><span class="o">=</span><span class="s2">&quot;example.image_list_provider&quot;</span><span class="p">,</span>
<span class="n">obj</span><span class="o">=</span><span class="s2">&quot;processData&quot;</span><span class="p">,</span>
<span class="n">args</span><span class="o">=</span><span class="n">args</span><span class="p">)</span>
</pre></div>
</div>
<p>第二步,在<code class="docutils literal"><span class="pre">resnet.py</span></code>文件中指定要提取特征的网络层的名字。例如,</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">Outputs</span><span class="p">(</span><span class="s2">&quot;res5_3_branch2c_conv&quot;</span><span class="p">,</span> <span class="s2">&quot;res5_3_branch2c_bn&quot;</span><span class="p">)</span>
</pre></div>
</div>
<p>第三步,在<code class="docutils literal"><span class="pre">extract_fea_c++.sh</span></code>文件中指定模型路径和输出的目录,然后执行下面的命令。</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">cd</span> <span class="n">demo</span><span class="o">/</span><span class="n">model_zoo</span><span class="o">/</span><span class="n">resnet</span>
<span class="o">./</span><span class="n">extract_fea_c</span><span class="o">++.</span><span class="n">sh</span>
</pre></div>
</div>
<p>如果执行成功,特征将会存到<code class="docutils literal"><span class="pre">fea_output/rank-00000</span></code>文件中,如下所示。同时你可以使用<code class="docutils literal"><span class="pre">load_feature.py</span></code>文件中的<code class="docutils literal"><span class="pre">load_feature_c</span></code>接口来加载该文件。</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="o">-</span><span class="mf">0.115318</span> <span class="o">-</span><span class="mf">0.108358</span> <span class="o">...</span> <span class="o">-</span><span class="mf">0.087884</span><span class="p">;</span><span class="o">-</span><span class="mf">1.27664</span> <span class="o">...</span> <span class="o">-</span><span class="mf">1.11516</span> <span class="o">-</span><span class="mf">2.59123</span><span class="p">;</span>
<span class="o">-</span><span class="mf">0.126383</span> <span class="o">-</span><span class="mf">0.116248</span> <span class="o">...</span> <span class="o">-</span><span class="mf">0.00534909</span><span class="p">;</span><span class="o">-</span><span class="mf">1.42593</span> <span class="o">...</span> <span class="o">-</span><span class="mf">1.04501</span> <span class="o">-</span><span class="mf">1.40769</span><span class="p">;</span>
</pre></div>
</div>
<ul class="simple">
<li>每行存储的是一个样本的特征。其中,第一行存的是图像<code class="docutils literal"><span class="pre">example/dog.jpg</span></code>的特征,第二行存的是图像<code class="docutils literal"><span class="pre">example/cat.jpg</span></code>的特征。</li>
<li>不同层的特征由分号<code class="docutils literal"><span class="pre">;</span></code>隔开,并且它们的顺序与<code class="docutils literal"><span class="pre">Outputs()</span></code>中指定的层顺序一致。这里,左边是<code class="docutils literal"><span class="pre">res5_3_branch2c_conv</span></code>层的特征,右边是<code class="docutils literal"><span class="pre">res5_3_branch2c_bn</span></code>层特征。</li>
</ul>
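<p>除了使用 <code class="docutils literal"><span class="pre">load_feature_c</span></code> 接口,也可以按上面描述的文本格式手动解析该文件,下面是一个示意脚本(仅为示意):</p>
<div class="highlight-default"><div class="highlight"><pre><span></span># A minimal sketch for parsing the text feature file shown above:
# one sample per line, layers separated by ';', values separated by spaces.
import numpy as np

features = []
with open('fea_output/rank-00000', 'r') as f:
    for line in f:
        segments = [seg for seg in line.strip().split(';') if seg.strip()]
        features.append([np.array(seg.split(), dtype='float32') for seg in segments])

# features[0] holds the two layer features of example/dog.jpg,
# features[1] holds those of example/cat.jpg, in the order given by Outputs().
print('%d samples, %d layers each' % (len(features), len(features[0])))
</pre></div>
</div>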
</div>
<div class="section" id="python">
<span id="python"></span><h3>Python接口<a class="headerlink" href="#python" title="永久链接至标题"></a></h3>
<p>示例<code class="docutils literal"><span class="pre">demo/model_zoo/resnet/classify.py</span></code>中展示了如何使用Python来提取特征。下面的例子同样使用了<code class="docutils literal"><span class="pre">./example/test.list</span></code>中的数据。执行的命令如下:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">cd</span> <span class="n">demo</span><span class="o">/</span><span class="n">model_zoo</span><span class="o">/</span><span class="n">resnet</span>
<span class="o">./</span><span class="n">extract_fea_py</span><span class="o">.</span><span class="n">sh</span>
</pre></div>
</div>
<p>extract_fea_py.sh:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">python</span> <span class="n">classify</span><span class="o">.</span><span class="n">py</span> \
<span class="o">--</span><span class="n">job</span><span class="o">=</span><span class="n">extract</span> \
<span class="o">--</span><span class="n">conf</span><span class="o">=</span><span class="n">resnet</span><span class="o">.</span><span class="n">py</span>\
<span class="o">--</span><span class="n">use_gpu</span><span class="o">=</span><span class="mi">1</span> \
<span class="o">--</span><span class="n">mean</span><span class="o">=</span><span class="n">model</span><span class="o">/</span><span class="n">mean_meta_224</span><span class="o">/</span><span class="n">mean</span><span class="o">.</span><span class="n">meta</span> \
<span class="o">--</span><span class="n">model</span><span class="o">=</span><span class="n">model</span><span class="o">/</span><span class="n">resnet_50</span> \
<span class="o">--</span><span class="n">data</span><span class="o">=./</span><span class="n">example</span><span class="o">/</span><span class="n">test</span><span class="o">.</span><span class="n">list</span> \
<span class="o">--</span><span class="n">output_layer</span><span class="o">=</span><span class="s2">&quot;res5_3_branch2c_conv,res5_3_branch2c_bn&quot;</span> \
<span class="o">--</span><span class="n">output_dir</span><span class="o">=</span><span class="n">features</span>
</pre></div>
</div>
<ul class="simple">
<li>--job=extract: 指定工作模式来提取特征。</li>
<li>--conf=resnet.py: 网络配置文件。</li>
<li>--use_gpu=1: 指定是否使用GPU。</li>
<li>--model=model/resnet_50: 模型路径。</li>
<li>--data=./example/test.list: 数据列表。</li>
<li>--output_layer=&quot;xxx,xxx&quot;: 指定提取特征的层。</li>
<li>--output_dir=features: 输出目录。</li>
</ul>
<p>如果运行成功,你将会看到特征存储在<code class="docutils literal"><span class="pre">features/batch_0</span></code>文件中,该文件是由cPickle产生的。你可以使用<code class="docutils literal"><span class="pre">load_feature.py</span></code>中的<code class="docutils literal"><span class="pre">load_feature_py</span></code>接口来打开该文件,它将返回如下的字典:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="p">{</span>
<span class="s1">&#39;cat.jpg&#39;</span><span class="p">:</span> <span class="p">{</span><span class="s1">&#39;res5_3_branch2c_conv&#39;</span><span class="p">:</span> <span class="n">array</span><span class="p">([[</span><span class="o">-</span><span class="mf">0.12638293</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.116248</span> <span class="p">,</span> <span class="o">-</span><span class="mf">0.11883899</span><span class="p">,</span> <span class="o">...</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.00895038</span><span class="p">,</span> <span class="mf">0.01994277</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.00534909</span><span class="p">]],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">float32</span><span class="p">),</span> <span class="s1">&#39;res5_3_branch2c_bn&#39;</span><span class="p">:</span> <span class="n">array</span><span class="p">([[</span><span class="o">-</span><span class="mf">1.42593431</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.28918779</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.32414699</span><span class="p">,</span> <span class="o">...</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.45933616</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.04501402</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.40769434</span><span class="p">]],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">float32</span><span class="p">)},</span>
<span class="s1">&#39;dog.jpg&#39;</span><span class="p">:</span> <span class="p">{</span><span class="s1">&#39;res5_3_branch2c_conv&#39;</span><span class="p">:</span> <span class="n">array</span><span class="p">([[</span><span class="o">-</span><span class="mf">0.11531784</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.10835785</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.08809858</span><span class="p">,</span> <span class="o">...</span><span class="p">,</span><span class="mf">0.0055237</span><span class="p">,</span> <span class="mf">0.01505112</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.08788397</span><span class="p">]],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">float32</span><span class="p">),</span> <span class="s1">&#39;res5_3_branch2c_bn&#39;</span><span class="p">:</span> <span class="n">array</span><span class="p">([[</span><span class="o">-</span><span class="mf">1.27663755</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.18272924</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.90937918</span><span class="p">,</span> <span class="o">...</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.25178063</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.11515927</span><span class="p">,</span> <span class="o">-</span><span class="mf">2.59122872</span><span class="p">]],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">float32</span><span class="p">)}</span>
<span class="p">}</span>
</pre></div>
</div>
<p>仔细观察,这些特征值与上述使用C++接口提取的结果是一致的。</p>
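<p>下面是一个检查 <code class="docutils literal"><span class="pre">features/batch_0</span></code> 内容的示意脚本(这里假设该文件直接保存了上述字典结构;若有出入,请改用 <code class="docutils literal"><span class="pre">load_feature.py</span></code> 中的 <code class="docutils literal"><span class="pre">load_feature_py</span></code> 接口):</p>
<div class="highlight-python"><div class="highlight"><pre>import cPickle

# features/batch_0 is produced by cPickle; assumed layout:
# {image_name: {layer_name: feature ndarray}}
with open('features/batch_0', 'rb') as f:
    feats = cPickle.load(f)

for name, layer_dict in feats.items():
    for layer, arr in layer_dict.items():
        print '%s %s %s' % (name, layer, arr.shape)
</pre></div>
</div>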
</div>
</div>
<div class="section" id="">
<span id="id7"></span><h2>预测<a class="headerlink" href="#" title="永久链接至标题"></a></h2>
<p><code class="docutils literal"><span class="pre">classify.py</span></code>文件也可以用于对样本进行预测。我们提供了一个示例脚本<code class="docutils literal"><span class="pre">predict.sh</span></code>,它使用50层的ResNet模型来对<code class="docutils literal"><span class="pre">example/test.list</span></code>中的数据进行预测。</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">cd</span> <span class="n">demo</span><span class="o">/</span><span class="n">model_zoo</span><span class="o">/</span><span class="n">resnet</span>
<span class="o">./</span><span class="n">predict</span><span class="o">.</span><span class="n">sh</span>
</pre></div>
</div>
<p>predict.sh调用了<code class="docutils literal"><span class="pre">classify.py</span></code>:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">python</span> <span class="n">classify</span><span class="o">.</span><span class="n">py</span> \
<span class="o">--</span><span class="n">job</span><span class="o">=</span><span class="n">predict</span> \
<span class="o">--</span><span class="n">conf</span><span class="o">=</span><span class="n">resnet</span><span class="o">.</span><span class="n">py</span>\
<span class="o">--</span><span class="n">multi_crop</span> \
<span class="o">--</span><span class="n">model</span><span class="o">=</span><span class="n">model</span><span class="o">/</span><span class="n">resnet_50</span> \
<span class="o">--</span><span class="n">use_gpu</span><span class="o">=</span><span class="mi">1</span> \
<span class="o">--</span><span class="n">data</span><span class="o">=./</span><span class="n">example</span><span class="o">/</span><span class="n">test</span><span class="o">.</span><span class="n">list</span>
</pre></div>
</div>
<ul class="simple">
<li>--job=predict: 指定工作模式为预测。</li>
<li>--conf=resnet.py: 网络配置文件。</li>
<li>--multi_crop: 使用10个裁剪图像块,预测概率取平均。</li>
<li>--use_gpu=1: 指定是否使用GPU。</li>
<li>--model=model/resnet_50: 模型路径。</li>
<li>--data=./example/test.list: 数据列表。</li>
</ul>
<p>如果运行成功,你将会看到如下结果,其中156和282是这些图像的分类标签。</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">Label</span> <span class="n">of</span> <span class="n">example</span><span class="o">/</span><span class="n">dog</span><span class="o">.</span><span class="n">jpg</span> <span class="ow">is</span><span class="p">:</span> <span class="mi">156</span>
<span class="n">Label</span> <span class="n">of</span> <span class="n">example</span><span class="o">/</span><span class="n">cat</span><span class="o">.</span><span class="n">jpg</span> <span class="ow">is</span><span class="p">:</span> <span class="mi">282</span>
</pre></div>
</div>
</div>
</div>
</div>
</div>
<footer>
<hr/>
<div role="contentinfo">
<p>
&copy; Copyright 2016, PaddlePaddle developers.
</p>
</div>
Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/snide/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT:'../../',
VERSION:'',
COLLAPSE_INDEX:false,
FILE_SUFFIX:'.html',
HAS_SOURCE: true,
SOURCELINK_SUFFIX: ".txt",
};
</script>
<script type="text/javascript" src="../../_static/jquery.js"></script>
<script type="text/javascript" src="../../_static/underscore.js"></script>
<script type="text/javascript" src="../../_static/doctools.js"></script>
<script type="text/javascript" src="../../_static/translations.js"></script>
<script type="text/javascript" src="https://cdn.bootcss.com/mathjax/2.7.0/MathJax.js"></script>
<script type="text/javascript" src="../../_static/js/theme.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script>
<script src="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/js/perfect-scrollbar.jquery.min.js"></script>
<script src="../../_static/js/paddle_doc_init.js"></script>
</body>
</html>
<!DOCTYPE html>
<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>快速入门教程 &mdash; PaddlePaddle 文档</title>
<link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" />
<link rel="index" title="索引"
href="../../genindex.html"/>
<link rel="search" title="搜索" href="../../search.html"/>
<link rel="top" title="PaddlePaddle 文档" href="../../index.html"/>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/css/perfect-scrollbar.min.css" type="text/css" />
<link rel="stylesheet" href="../../_static/css/override.css" type="text/css" />
<script>
var _hmt = _hmt || [];
(function() {
var hm = document.createElement("script");
hm.src = "//hm.baidu.com/hm.js?b9a314ab40d04d805655aab1deee08ba";
var s = document.getElementsByTagName("script")[0];
s.parentNode.insertBefore(hm, s);
})();
</script>
<script src="../../_static/js/modernizr.min.js"></script>
</head>
<body class="wy-body-for-nav" role="document">
<header class="site-header">
<div class="site-logo">
<a href="/"><img src="../../_static/images/PP_w.png"></a>
</div>
<div class="site-nav-links">
<div class="site-menu">
<a class="fork-on-github" href="https://github.com/PaddlePaddle/Paddle" target="_blank"><i class="fa fa-github"></i>Fork me on Github</a>
<div class="language-switcher dropdown">
<a type="button" data-toggle="dropdown">
<span>English</span>
<i class="fa fa-angle-up"></i>
<i class="fa fa-angle-down"></i>
</a>
<ul class="dropdown-menu">
<li><a href="/doc_cn">中文</a></li>
<li><a href="/doc">English</a></li>
</ul>
</div>
<ul class="site-page-links">
<li><a href="/">Home</a></li>
</ul>
</div>
<div class="doc-module">
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../getstarted/index_cn.html">新手入门</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../howto/index_cn.html">进阶指南</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../api/index_cn.html">API</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../faq/index_cn.html">FAQ</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../mobile/index_cn.html">MOBILE</a></li>
</ul>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div>
</div>
</header>
<div class="main-content-wrap">
<nav class="doc-menu-vertical" role="navigation">
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../getstarted/index_cn.html">新手入门</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../getstarted/build_and_install/index_cn.html">安装与编译</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../getstarted/build_and_install/pip_install_cn.html">使用pip安装</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../getstarted/build_and_install/docker_install_cn.html">使用Docker安装运行</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/dev/build_cn.html">用Docker编译和测试PaddlePaddle</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../getstarted/build_and_install/build_from_source_cn.html">从源码编译</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../getstarted/concepts/use_concepts_cn.html">基本使用概念</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../howto/index_cn.html">进阶指南</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../howto/usage/cmd_parameter/index_cn.html">设置命令行参数</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cmd_parameter/use_case_cn.html">使用案例</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cmd_parameter/arguments_cn.html">参数概述</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cmd_parameter/detail_introduction_cn.html">细节描述</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/usage/cluster/cluster_train_cn.html">分布式训练</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/fabric_cn.html">fabric集群</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/openmpi_cn.html">openmpi集群</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/k8s_cn.html">kubernetes单机</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/k8s_distributed_cn.html">kubernetes distributed分布式</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/k8s_aws_cn.html">AWS上运行kubernetes集群训练</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/usage/capi/index_cn.html">PaddlePaddle C-API</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/capi/compile_paddle_lib_cn.html">编译 PaddlePaddle 预测库</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/capi/organization_of_the_inputs_cn.html">输入/输出数据组织</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/usage/capi/workflow_of_capi_cn.html">C-API 使用流程</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/dev/contribute_to_paddle_cn.html">如何贡献代码</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/dev/write_docs_cn.html">如何贡献/修改文档</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/deep_model/rnn/index_cn.html">RNN相关模型</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../howto/deep_model/rnn/rnn_config_cn.html">RNN配置</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/deep_model/rnn/recurrent_group_cn.html">Recurrent Group教程</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/deep_model/rnn/hierarchical_layer_cn.html">支持双层序列作为输入的Layer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../howto/deep_model/rnn/hrnn_rnn_api_compare_cn.html">单双层RNN API对比介绍</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../howto/optimization/gpu_profiling_cn.html">GPU性能分析与调优</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../api/index_cn.html">API</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/model_configs.html">模型配置</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/activation.html">Activation</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/layer.html">Layers</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/evaluators.html">Evaluators</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/optimizer.html">Optimizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/pooling.html">Pooling</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/networks.html">Networks</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/attr.html">Parameter Attribute</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/data.html">数据访问</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/data/data_reader.html">Data Reader Interface</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/data/image.html">Image Interface</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/data/dataset.html">Dataset</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/run_logic.html">训练与应用</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../api/v2/fluid.html">Fluid</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/layers.html">Layers</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/data_feeder.html">DataFeeder</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/executor.html">Executor</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/initializer.html">Initializer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/evaluator.html">Evaluator</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/nets.html">Nets</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/optimizer.html">Optimizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/param_attr.html">ParamAttr</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/profiler.html">Profiler</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/regularizer.html">Regularizer</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/io.html">IO</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../faq/index_cn.html">FAQ</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../faq/build_and_install/index_cn.html">编译安装与单元测试</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../faq/model/index_cn.html">模型配置</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../faq/parameter/index_cn.html">参数设置</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../faq/local/index_cn.html">本地训练与预测</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../faq/cluster/index_cn.html">集群训练与预测</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../mobile/index_cn.html">MOBILE</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../mobile/cross_compiling_for_android_cn.html">Android平台编译指南</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../mobile/cross_compiling_for_ios_cn.html">iOS平台编译指南</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../mobile/cross_compiling_for_raspberry_cn.html">Raspberry Pi平台编译指南</a></li>
</ul>
</li>
</ul>
</nav>
<section class="doc-content-wrap">
<div role="navigation" aria-label="breadcrumbs navigation">
<ul class="wy-breadcrumbs">
<li>快速入门教程</li>
</ul>
</div>
<div class="wy-nav-content" id="doc-content">
<div class="rst-content">
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<div class="section" id="id1">
<h1>快速入门教程<a class="headerlink" href="#id1" title="永久链接至标题"></a></h1>
<p>我们将以 <a class="reference external" href="https://en.wikipedia.org/wiki/Document_classification">文本分类问题</a> 为例,
介绍PaddlePaddle的基本使用方法。</p>
<div class="section" id="id3">
<h2>安装<a class="headerlink" href="#id3" title="永久链接至标题"></a></h2>
<p>请参考 <a class="reference internal" href="../../getstarted/build_and_install/index_cn.html#install-steps"><span class="std std-ref">安装流程</span></a> 安装PaddlePaddle。</p>
</div>
<div class="section" id="id4">
<h2>使用概述<a class="headerlink" href="#id4" title="永久链接至标题"></a></h2>
<p><strong>文本分类问题</strong>:对于给定的一条文本,我们从提前给定的类别集合中选择其所属类别。</p>
<p>比如, 在购物网站上,通过查看买家对某个产品的评价反馈, 评估该产品的质量。</p>
<ul class="simple">
<li>这个显示器很棒! (好评)</li>
<li>用了两个月之后这个显示器屏幕碎了。(差评)</li>
</ul>
<p>使用PaddlePaddle, 每一个任务流程都可以被划分为如下五个步骤。</p>
<blockquote>
<div><a class="reference internal image-reference" href="../../_images/Pipeline_cn.jpg"><img alt="../../_images/Pipeline_cn.jpg" class="align-center" src="../../_images/Pipeline_cn.jpg" style="width: 544.8px; height: 44.8px;" /></a>
</div></blockquote>
<ol class="arabic simple">
<li><dl class="first docutils">
<dt>数据格式准备</dt>
<dd><ul class="first last">
<li>本例每行保存一条样本,类别Id和文本信息用 <code class="docutils literal"><span class="pre">Tab</span></code> 间隔,文本中的单词用空格分隔(如果不切词,则字与字之间用空格分隔),例如:<code class="docutils literal"><span class="pre">类别Id</span> <span class="pre">'\t'</span> <span class="pre">这</span> <span class="pre">个</span> <span class="pre">显</span> <span class="pre">示</span> <span class="pre">器</span> <span class="pre">很</span> <span class="pre">棒</span> <span class="pre">!</span></code></li>
</ul>
</dd>
</dl>
</li>
<li><dl class="first docutils">
<dt>向系统传送数据</dt>
<dd><ul class="first last">
<li>PaddlePaddle可以执行用户的python脚本程序来读取各种格式的数据文件。</li>
<li>本例的所有字符都将转换为连续整数表示的Id传给模型。</li>
</ul>
</dd>
</dl>
</li>
<li><dl class="first docutils">
<dt>描述网络结构和优化算法</dt>
<dd><ul class="first last">
<li>本例由易到难展示4种不同的文本分类网络配置:逻辑回归模型,词向量模型,卷积模型,时序模型。</li>
<li>常用优化算法包括Momentum, RMSProp,AdaDelta,AdaGrad,Adam,Adamax等,本例采用Adam优化方法,加了L2正则和梯度截断。</li>
</ul>
</dd>
</dl>
</li>
<li>训练模型</li>
<li>应用模型</li>
</ol>
<div class="section" id="id5">
<h3>数据格式准备<a class="headerlink" href="#id5" title="永久链接至标题"></a></h3>
<p>接下来我们将展示如何用PaddlePaddle训练一个文本分类模型,将 <a class="reference external" href="http://jmcauley.ucsd.edu/data/amazon/">Amazon电子产品评论数据</a> 分为好评(正样本)和差评(负样本)两种类别。
<a class="reference external" href="https://github.com/PaddlePaddle/Paddle">源代码</a><code class="docutils literal"><span class="pre">demo/quick_start</span></code> 目录里提供了该数据的下载脚本和预处理脚本,你只需要在命令行输入以下命令,就能够很方便的完成数据下载和相应的预处理工作。</p>
<div class="highlight-bash"><div class="highlight"><pre><span></span><span class="nb">cd</span> demo/quick_start
./data/get_data.sh
./preprocess.sh
</pre></div>
</div>
<p>数据预处理完成之后,需要编写类似于 <code class="docutils literal"><span class="pre">dataprovider_*.py</span></code> 的数据读取脚本和类似于 <code class="docutils literal"><span class="pre">trainer_config.*.py</span></code> 的训练模型配置,PaddlePaddle将通过设置参数的方式来调用相应的数据读取脚本和训练模型配置。接下来,我们将对这两个步骤给出详细的解释;你也可以先跳过解释环节,直接进入训练模型章节,
使用 <code class="docutils literal"><span class="pre">sh</span> <span class="pre">train.sh</span></code> 开始训练模型,
再查看 <code class="docutils literal"><span class="pre">train.sh</span></code> 的内容,通过 <strong>自底向上法</strong>(bottom-up approach)来理解PaddlePaddle的内部运行机制。</p>
</div>
</div>
<div class="section" id="id7">
<h2>向系统传送数据<a class="headerlink" href="#id7" title="永久链接至标题"></a></h2>
<div class="section" id="python">
<h3>Python脚本读取数据<a class="headerlink" href="#python" title="永久链接至标题"></a></h3>
<p><cite>DataProvider</cite> 是PaddlePaddle负责提供数据的模块,主要职责在于将训练数据传入内存或者显存,让模型能够得到训练更新,其包括两个函数:</p>
<ul class="simple">
<li>initializer:PaddlePaddle会在调用读取数据的Python脚本之前,先调用initializer函数。在下面例子里,我们在initializer函数里初始化词表,并且在随后的读取数据过程中填充词表。</li>
<li>process:PaddlePaddle调用process函数来读取数据。每次读取一条数据后,process函数会用yield语句输出这条数据,从而能够被PaddlePaddle 捕获 (harvest)。</li>
</ul>
<p><code class="docutils literal"><span class="pre">dataprovider_bow.py</span></code> 文件给出了完整例子:</p>
<p>详细内容请参见 <a class="reference internal" href="../../api/v1/data_provider/dataprovider_cn.html#api-dataprovider"><span class="std std-ref">DataProvider的介绍</span></a></p>
</div>
<div class="section" id="id8">
<h3>配置中的数据加载定义<a class="headerlink" href="#id8" title="永久链接至标题"></a></h3>
<p>在模型配置中通过 <code class="docutils literal"><span class="pre">define_py_data_sources2</span></code> 接口来加载数据:</p>
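<p>一个示意性的调用方式如下(假设词典文件为 <code class="docutils literal"><span class="pre">data/dict.txt</span></code>,且每行第一列是单词;完整写法请参考 <code class="docutils literal"><span class="pre">demo/quick_start/trainer_config.lr.py</span></code>):</p>
<div class="highlight-python"><div class="highlight"><pre>from paddle.trainer_config_helpers import *

# Build the word dictionary: word -> id (illustrative sketch).
word_dict = dict()
with open('./data/dict.txt', 'r') as f:
    for i, line in enumerate(f):
        word_dict[line.strip().split()[0]] = i

define_py_data_sources2(
    train_list='data/train.list',    # training data list
    test_list='data/test.list',      # testing data list
    module='dataprovider_bow',       # Python module that processes the data
    obj='process',                   # function that generates the data
    args={'dictionary': word_dict})  # extra arguments passed to the DataProvider
</pre></div>
</div>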
<p>以下是对上述数据加载的解释:</p>
<ul class="simple">
<li>data/train.list,data/test.list: 指定训练数据和测试数据</li>
<li>module=&quot;dataprovider_bow&quot;: 处理数据的Python脚本文件</li>
<li>obj=&quot;process&quot;: 指定生成数据的函数</li>
<li>args={&quot;dictionary&quot;: word_dict}: 额外的参数,这里指定词典</li>
</ul>
<p>更详细数据格式和用例请参考 <a class="reference internal" href="../../api/v1/data_provider/pydataprovider2_cn.html#api-pydataprovider2"><span class="std std-ref">PyDataProvider2的使用</span></a></p>
</div>
</div>
<div class="section" id="id9">
<h2>模型网络结构<a class="headerlink" href="#id9" title="永久链接至标题"></a></h2>
<p>本小节我们将介绍模型网络结构。</p>
<blockquote>
<div><a class="reference internal image-reference" href="../../_images/PipelineNetwork_cn.jpg"><img alt="../../_images/PipelineNetwork_cn.jpg" class="align-center" src="../../_images/PipelineNetwork_cn.jpg" style="width: 544.8px; height: 44.8px;" /></a>
</div></blockquote>
<p>我们将以最基本的逻辑回归网络作为起点,并逐渐展示更加深入的功能。更详细的网络配置说明请参考 <span class="xref std std-ref">api_trainer_config_helpers_layers</span>。
所有配置都能在 <a class="reference external" href="https://github.com/PaddlePaddle/Paddle">源代码</a>的 <code class="docutils literal"><span class="pre">demo/quick_start</span></code> 目录下找到。</p>
<div class="section" id="id11">
<h3>逻辑回归模型<a class="headerlink" href="#id11" title="永久链接至标题"></a></h3>
<p>具体流程如下:</p>
<blockquote>
<div><a class="reference internal image-reference" href="../../_images/NetLR_cn.jpg"><img alt="../../_images/NetLR_cn.jpg" class="align-center" src="../../_images/NetLR_cn.jpg" style="width: 517.6px; height: 152.8px;" /></a>
</div></blockquote>
<ul>
<li><p class="first">获取利用 <a class="reference external" href="https://en.wikipedia.org/wiki/One-hot">one-hot vector</a> 表示的每个单词,维度是词典大小</p>
<blockquote>
<div><div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">word</span> <span class="o">=</span> <span class="n">data_layer</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s2">&quot;word&quot;</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">word_dim</span><span class="p">)</span>
</pre></div>
</div>
</div></blockquote>
</li>
<li><p class="first">获取该条样本类别Id,维度是类别个数。</p>
<blockquote>
<div><div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">label</span> <span class="o">=</span> <span class="n">data_layer</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s2">&quot;label&quot;</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">label_dim</span><span class="p">)</span>
</pre></div>
</div>
</div></blockquote>
</li>
<li><p class="first">利用逻辑回归模型对该向量进行分类,同时会计算分类准确率</p>
<blockquote>
<div><div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># Define a fully connected layer with logistic activation (also called softmax activation).</span>
<span class="n">output</span> <span class="o">=</span> <span class="n">fc_layer</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">word</span><span class="p">,</span>
<span class="n">size</span><span class="o">=</span><span class="n">label_dim</span><span class="p">,</span>
<span class="n">act_type</span><span class="o">=</span><span class="n">SoftmaxActivation</span><span class="p">())</span>
<span class="c1"># Define cross-entropy classification loss and error.</span>
<span class="n">classification_cost</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">output</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="n">label</span><span class="p">)</span>
</pre></div>
</div>
</div></blockquote>
</li>
</ul>
<blockquote>
<div><ul class="simple">
<li>input: 除去data层,每个层都有一个或多个input,多个input以list方式输入</li>
<li>size: 该层神经元个数</li>
<li>act_type: 激活函数类型</li>
</ul>
</div></blockquote>
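<p>把上面几段配置串起来,一个最小的逻辑回归网络配置骨架大致如下(仅作示意:参数名沿用本文示例代码,word_dim、label_dim 的取值由词典和任务决定,数据加载与优化设置见前后章节,完整配置请参考 <code class="docutils literal"><span class="pre">trainer_config.lr.py</span></code>):</p>
<div class="highlight-python"><div class="highlight"><pre>word_dim = len(word_dict)   # dictionary size; word_dict is built as shown above
label_dim = 2               # positive / negative

word = data_layer(name="word", size=word_dim)
label = data_layer(name="label", size=label_dim)

output = fc_layer(input=word,
                  size=label_dim,
                  act_type=SoftmaxActivation())

cls = classification_cost(input=output, label=label)
outputs(cls)
</pre></div>
</div>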
<p><strong>效果总结</strong>:我们将在后面介绍训练和预测流程的脚本。在此为方便对比不同网络结构,我们总结了各个网络的复杂度和效果。</p>
<blockquote>
<div><table border="1" class="docutils">
<colgroup>
<col width="30%" />
<col width="45%" />
<col width="25%" />
</colgroup>
<thead valign="bottom">
<tr class="row-odd"><th class="head">网络名称</th>
<th class="head">参数数量</th>
<th class="head">错误率</th>
</tr>
</thead>
<tbody valign="top">
<tr class="row-even"><td>逻辑回归</td>
<td>252 KB</td>
<td>8.652 %</td>
</tr>
</tbody>
</table>
</div></blockquote>
</div>
<div class="section" id="id12">
<h3>词向量模型<a class="headerlink" href="#id12" title="永久链接至标题"></a></h3>
<p>embedding模型需要稍微改变提供数据的Python脚本,即 <code class="docutils literal"><span class="pre">dataprovider_emb.py</span></code>,词向量模型、
卷积模型、时序模型均使用该脚本。其中文本输入类型定义为整数时序类型integer_value_sequence。</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">initializer</span><span class="p">(</span><span class="n">settings</span><span class="p">,</span> <span class="n">dictionary</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span>
<span class="n">settings</span><span class="o">.</span><span class="n">word_dict</span> <span class="o">=</span> <span class="n">dictionary</span>
<span class="n">settings</span><span class="o">.</span><span class="n">input_types</span> <span class="o">=</span> <span class="p">[</span>
<span class="c1"># Define the type of the first input as sequence of integer.</span>
<span class="c1"># The value of the integers range from 0 to len(dictrionary)-1</span>
<span class="n">integer_value_sequence</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">dictionary</span><span class="p">)),</span>
<span class="c1"># Define the second input for label id</span>
<span class="n">integer_value</span><span class="p">(</span><span class="mi">2</span><span class="p">)]</span>
<span class="nd">@provider</span><span class="p">(</span><span class="n">init_hook</span><span class="o">=</span><span class="n">initializer</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">process</span><span class="p">(</span><span class="n">settings</span><span class="p">,</span> <span class="n">file_name</span><span class="p">):</span>
<span class="o">...</span>
<span class="c1"># omitted, it is same as the data provider for LR model</span>
</pre></div>
</div>
<p>该模型依然使用逻辑回归分类网络的框架,只是将句子的稀疏向量表示替换为连续向量表示,即对第三步进行替换。句子表示的计算更新为两步:</p>
<a class="reference internal image-reference" href="../../_images/NetContinuous_cn.jpg"><img alt="../../_images/NetContinuous_cn.jpg" class="align-center" src="../../_images/NetContinuous_cn.jpg" style="width: 517.6px; height: 195.2px;" /></a>
<ul>
<li><p class="first">利用单词Id查找该单词对应的连续向量(维度为word_dim), 输入N个单词,输出为N个word_dim维度向量</p>
<blockquote>
<div><div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">emb</span> <span class="o">=</span> <span class="n">embedding_layer</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">word</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">word_dim</span><span class="p">)</span>
</pre></div>
</div>
</div></blockquote>
</li>
<li><p class="first">将该句话包含的所有单词向量求平均, 得到句子的表示</p>
<blockquote>
<div><div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">avg</span> <span class="o">=</span> <span class="n">pooling_layer</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">emb</span><span class="p">,</span> <span class="n">pooling_type</span><span class="o">=</span><span class="n">AvgPooling</span><span class="p">())</span>
</pre></div>
</div>
</div></blockquote>
</li>
</ul>
<p>其它部分和逻辑回归网络结构一致。</p>
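<p>与逻辑回归配置相比,只需把词袋输入换成整数序列输入,并在分类层之前插入上面两层即可。下面是一个示意片段(word_dim 在这里表示词向量维度,128 为示意取值;word_dict 按前文方式构建,参数名沿用本文示例代码):</p>
<div class="highlight-python"><div class="highlight"><pre>label_dim = 2     # number of classes
word_dim = 128    # embedding size (illustrative value)

word = data_layer(name="word", size=len(word_dict))         # integer sequence input
label = data_layer(name="label", size=label_dim)

emb = embedding_layer(input=word, size=word_dim)            # look up word vectors
avg = pooling_layer(input=emb, pooling_type=AvgPooling())   # average to get the sentence vector
output = fc_layer(input=avg,
                  size=label_dim,
                  act_type=SoftmaxActivation())
classification_cost(input=output, label=label)
</pre></div>
</div>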
<p><strong>效果总结:</strong></p>
<blockquote>
<div><table border="1" class="docutils">
<colgroup>
<col width="30%" />
<col width="44%" />
<col width="26%" />
</colgroup>
<thead valign="bottom">
<tr class="row-odd"><th class="head">网络名称</th>
<th class="head">参数数量</th>
<th class="head">错误率</th>
</tr>
</thead>
<tbody valign="top">
<tr class="row-even"><td>词向量模型</td>
<td>15 MB</td>
<td>8.484 %</td>
</tr>
</tbody>
</table>
</div></blockquote>
</div>
<div class="section" id="id13">
<h3>卷积模型<a class="headerlink" href="#id13" title="永久链接至标题"></a></h3>
<p>卷积网络是一种特殊的从词向量表示到句子表示的方法, 也就是将词向量模型进一步演化为三个新步骤。</p>
<a class="reference internal image-reference" href="../../_images/NetConv_cn.jpg"><img alt="../../_images/NetConv_cn.jpg" class="align-center" src="../../_images/NetConv_cn.jpg" style="width: 518.4px; height: 256.8px;" /></a>
<p>文本卷积可分为三个步骤:</p>
<ol class="arabic simple">
<li>首先,从每个单词左右两端分别获取k个相邻的单词, 拼接成一个新的向量;</li>
<li>其次,对该向量进行非线性变换(例如Sigmoid变换), 使其转变为维度为hidden_dim的新向量;</li>
<li>最后,对整个新向量集合的每一个维度取最大值来表示最后的句子。</li>
</ol>
<p>这三个步骤可配置为:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">text_conv</span> <span class="o">=</span> <span class="n">sequence_conv_pool</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">emb</span><span class="p">,</span>
<span class="n">context_start</span><span class="o">=</span><span class="n">k</span><span class="p">,</span>
<span class="n">context_len</span><span class="o">=</span><span class="mi">2</span> <span class="o">*</span> <span class="n">k</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>
</pre></div>
</div>
<p><strong>效果总结:</strong></p>
<blockquote>
<div><table border="1" class="docutils">
<colgroup>
<col width="28%" />
<col width="41%" />
<col width="32%" />
</colgroup>
<thead valign="bottom">
<tr class="row-odd"><th class="head">网络名称</th>
<th class="head">参数数量</th>
<th class="head">错误率</th>
</tr>
</thead>
<tbody valign="top">
<tr class="row-even"><td>卷积模型</td>
<td>16 MB</td>
<td>5.628 %</td>
</tr>
</tbody>
</table>
</div></blockquote>
</div>
<div class="section" id="id14">
<h3>时序模型<a class="headerlink" href="#id14" title="永久链接至标题"></a></h3>
<a class="reference internal image-reference" href="../../_images/NetRNN_cn.jpg"><img alt="../../_images/NetRNN_cn.jpg" class="align-center" src="../../_images/NetRNN_cn.jpg" style="width: 518.4px; height: 304.0px;" /></a>
<p>时序模型,也称为RNN模型, 包括简单的 <a class="reference external" href="https://en.wikipedia.org/wiki/Recurrent_neural_network">RNN模型</a>, <a class="reference external" href="https://en.wikipedia.org/wiki/Gated_recurrent_unit">GRU模型</a><a class="reference external" href="https://en.wikipedia.org/wiki/Long_short-term_memory">LSTM模型</a> 等等。</p>
<ul>
<li><p class="first">GRU模型配置:</p>
<blockquote>
<div><div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">gru</span> <span class="o">=</span> <span class="n">simple_gru</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">emb</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">gru_size</span><span class="p">)</span>
</pre></div>
</div>
</div></blockquote>
</li>
<li><p class="first">LSTM模型配置:</p>
<blockquote>
<div><div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">lstm</span> <span class="o">=</span> <span class="n">simple_lstm</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">emb</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">lstm_size</span><span class="p">)</span>
</pre></div>
</div>
</div></blockquote>
</li>
</ul>
<p>本次实验,我们采用单层LSTM模型,并使用了Dropout。<strong>效果总结:</strong></p>
<blockquote>
<div><table border="1" class="docutils">
<colgroup>
<col width="27%" />
<col width="40%" />
<col width="32%" />
</colgroup>
<thead valign="bottom">
<tr class="row-odd"><th class="head">网络名称</th>
<th class="head">参数数量</th>
<th class="head">错误率</th>
</tr>
</thead>
<tbody valign="top">
<tr class="row-even"><td>时序模型</td>
<td>16 MB</td>
<td>4.812 %</td>
</tr>
</tbody>
</table>
</div></blockquote>
</div>
</div>
<div class="section" id="id15">
<h2>优化算法<a class="headerlink" href="#id15" title="永久链接至标题"></a></h2>
<p><a class="reference external" href="http://www.paddlepaddle.org/doc/ui/api/trainer_config_helpers/optimizers_index.html">优化算法</a> 包括
Momentum,RMSProp,AdaDelta,AdaGrad,Adam,Adamax等,这里采用Adam优化方法,同时使用了L2正则(L2 Regularization)和梯度截断(Gradient Clipping)。</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">settings</span><span class="p">(</span><span class="n">batch_size</span><span class="o">=</span><span class="mi">128</span><span class="p">,</span>
<span class="n">learning_rate</span><span class="o">=</span><span class="mf">2e-3</span><span class="p">,</span>
<span class="n">learning_method</span><span class="o">=</span><span class="n">AdamOptimizer</span><span class="p">(),</span>
<span class="n">regularization</span><span class="o">=</span><span class="n">L2Regularization</span><span class="p">(</span><span class="mf">8e-4</span><span class="p">),</span>
<span class="n">gradient_clipping_threshold</span><span class="o">=</span><span class="mi">25</span><span class="p">)</span>
</pre></div>
</div>
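<p>其中,<code class="docutils literal"><span class="pre">L2Regularization(8e-4)</span></code> 大致相当于在原始损失上附加一个系数为 8e-4 的 L2 正则项,<code class="docutils literal"><span class="pre">gradient_clipping_threshold=25</span></code> 则把梯度限制在阈值 25 以内以缓解梯度爆炸。按一般性定义(具体实现细节请以 PaddlePaddle 源码为准):</p>
<div class="math">
\[\tilde{L}(w) = L(w) + \lambda \lVert w \rVert_2^2, \qquad \lambda = 8 \times 10^{-4}\]
</div>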
</div>
<div class="section" id="id17">
<h2>训练模型<a class="headerlink" href="#id17" title="永久链接至标题"></a></h2>
<p>在数据加载和网络配置完成之后, 我们就可以训练模型了。</p>
<a class="reference internal image-reference" href="../../_images/PipelineTrain_cn.jpg"><img alt="../../_images/PipelineTrain_cn.jpg" class="align-center" src="../../_images/PipelineTrain_cn.jpg" style="width: 544.0px; height: 44.8px;" /></a>
<p>训练模型,我们只需要运行 <code class="docutils literal"><span class="pre">train.sh</span></code> 训练脚本:</p>
<blockquote>
<div><div class="highlight-bash"><div class="highlight"><pre><span></span>./train.sh
</pre></div>
</div>
</div></blockquote>
<p><code class="docutils literal"><span class="pre">train.sh</span></code> 中包含了训练模型的基本命令。训练时所需设置的主要参数如下:</p>
<blockquote>
<div><div class="highlight-bash"><div class="highlight"><pre><span></span>paddle train <span class="se">\</span>
--config<span class="o">=</span>trainer_config.py <span class="se">\</span>
--log_period<span class="o">=</span><span class="m">20</span> <span class="se">\</span>
--save_dir<span class="o">=</span>./output <span class="se">\</span>
--num_passes<span class="o">=</span><span class="m">15</span> <span class="se">\</span>
--use_gpu<span class="o">=</span><span class="nb">false</span>
</pre></div>
</div>
</div></blockquote>
<p>这里只简单介绍了单机训练;如何进行分布式训练,请参考 <span class="xref std std-ref">cluster_train</span>。</p>
</div>
<div class="section" id="id18">
<h2>预测<a class="headerlink" href="#id18" title="永久链接至标题"></a></h2>
<p>当模型训练好了之后,我们就可以进行预测了。</p>
<a class="reference internal image-reference" href="../../_images/PipelineTest_cn.jpg"><img alt="../../_images/PipelineTest_cn.jpg" class="align-center" src="../../_images/PipelineTest_cn.jpg" style="width: 544.0px; height: 44.8px;" /></a>
<p>之前配置文件中 <code class="docutils literal"><span class="pre">test.list</span></code> 指定的数据将会被测试,这里直接通过预测脚本 <code class="docutils literal"><span class="pre">predict.sh</span></code> 进行预测,
更详细的说明,请参考 <a class="reference internal" href="../../api/v1/predict/swig_py_paddle_cn.html#api-swig-py-paddle"><span class="std std-ref">基于Python的预测</span></a></p>
<blockquote>
<div><div class="highlight-bash"><div class="highlight"><pre><span></span><span class="nv">model</span><span class="o">=</span><span class="s2">&quot;output/pass-00003&quot;</span>
paddle train <span class="se">\</span>
--config<span class="o">=</span>trainer_config.lstm.py <span class="se">\</span>
--use_gpu<span class="o">=</span><span class="nb">false</span> <span class="se">\</span>
--job<span class="o">=</span><span class="nb">test</span> <span class="se">\</span>
--init_model_path<span class="o">=</span><span class="nv">$model</span> <span class="se">\</span>
--config_args<span class="o">=</span><span class="nv">is_predict</span><span class="o">=</span><span class="m">1</span> <span class="se">\</span>
--predict_output_dir<span class="o">=</span>. <span class="se">\</span>
mv rank-00000 result.txt
</pre></div>
</div>
</div></blockquote>
<p>这里以 <code class="docutils literal"><span class="pre">output/pass-00003</span></code> 为例进行预测,用户可以根据训练日志,选择测试结果最好的模型来预测。</p>
<p>预测结果以文本的形式保存在 <code class="docutils literal"><span class="pre">result.txt</span></code> 中,一行为一个样本,格式如下:</p>
<blockquote>
<div><div class="highlight-bash"><div class="highlight"><pre><span></span>预测ID<span class="p">;</span>ID为0的概率 ID为1的概率
预测ID<span class="p">;</span>ID为0的概率 ID为1的概率
</pre></div>
</div>
</div></blockquote>
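<p>如果需要在脚本中进一步处理预测结果,可以按上述格式解析 <code class="docutils literal"><span class="pre">result.txt</span></code>,下面是一个示意脚本:</p>
<div class="highlight-python"><div class="highlight"><pre># Parse result.txt: each line looks like "pred_id;prob_of_0 prob_of_1".
results = []
with open('result.txt') as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        pred_id, probs = line.split(';')
        prob0, prob1 = [float(x) for x in probs.split()]
        results.append((int(pred_id), prob0, prob1))

print 'total: %d, predicted positive (id 1): %d' % (
    len(results), sum(1 for r in results if r[0] == 1))
</pre></div>
</div>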
</div>
<div class="section" id="id19">
<h2>总体效果总结<a class="headerlink" href="#id19" title="永久链接至标题"></a></h2>
<p><code class="docutils literal"><span class="pre">/demo/quick_start</span></code> 目录下,能够找到这里使用的所有数据, 网络配置, 训练脚本等等。
对于Amazon-Elec测试集(25k), 如下表格,展示了上述网络模型的训练效果:</p>
<blockquote>
<div><table border="1" class="docutils">
<colgroup>
<col width="21%" />
<col width="31%" />
<col width="13%" />
<col width="34%" />
</colgroup>
<thead valign="bottom">
<tr class="row-odd"><th class="head">网络名称</th>
<th class="head">参数数量</th>
<th class="head">错误率</th>
<th class="head">配置文件</th>
</tr>
</thead>
<tbody valign="top">
<tr class="row-even"><td>逻辑回归模型</td>
<td>252 KB</td>
<td>8.652%</td>
<td>trainer_config.lr.py</td>
</tr>
<tr class="row-odd"><td>词向量模型</td>
<td>15 MB</td>
<td>8.484%</td>
<td>trainer_config.emb.py</td>
</tr>
<tr class="row-even"><td>卷积模型</td>
<td>16 MB</td>
<td>5.628%</td>
<td>trainer_config.cnn.py</td>
</tr>
<tr class="row-odd"><td>时序模型</td>
<td>16 MB</td>
<td>4.812%</td>
<td>trainer_config.lstm.py</td>
</tr>
</tbody>
</table>
</div></blockquote>
</div>
<div class="section" id="id20">
<h2>附录<a class="headerlink" href="#id20" title="永久链接至标题"></a></h2>
<div class="section" id="id21">
<h3>命令行参数<a class="headerlink" href="#id21" title="永久链接至标题"></a></h3>
<ul class="simple">
<li>--config:网络配置</li>
<li>--save_dir:模型存储路径</li>
<li>--log_period:每隔多少batch打印一次日志</li>
<li>--num_passes:训练轮次,一个pass表示过一遍所有训练样本</li>
<li>--config_args:命令指定的参数会传入网络配置中。</li>
<li>--init_model_path:指定初始化模型路径,可用在测试或训练时指定初始化模型。</li>
</ul>
<p>默认一个pass保存一次模型,也可以通过saving_period_by_batches设置每隔多少batch保存一次模型。
可以通过show_parameter_stats_period设置打印参数信息等。
其他参数请参考 命令行参数文档(链接待补充)。</p>
</div>
<div class="section" id="id22">
<h3>输出日志<a class="headerlink" href="#id22" title="永久链接至标题"></a></h3>
<div class="highlight-bash"><div class="highlight"><pre><span></span>TrainerInternal.cpp:160<span class="o">]</span> <span class="nv">Batch</span><span class="o">=</span><span class="m">20</span> <span class="nv">samples</span><span class="o">=</span><span class="m">2560</span> <span class="nv">AvgCost</span><span class="o">=</span><span class="m">0</span>.628761 <span class="nv">CurrentCost</span><span class="o">=</span><span class="m">0</span>.628761 Eval: <span class="nv">classification_error_evaluator</span><span class="o">=</span><span class="m">0</span>.304297 CurrentEval: <span class="nv">classification_error_evaluator</span><span class="o">=</span><span class="m">0</span>.304297
</pre></div>
</div>
<p>模型训练会看到类似上面这样的日志信息,详细的参数解释,请参考如下表格:</p>
<blockquote>
<div><table border="1" class="docutils">
<colgroup>
<col width="41%" />
<col width="59%" />
</colgroup>
<thead valign="bottom">
<tr class="row-odd"><th class="head">名称</th>
<th class="head">解释</th>
</tr>
</thead>
<tbody valign="top">
<tr class="row-even"><td>Batch=20</td>
<td>表示过了20个batch</td>
</tr>
<tr class="row-odd"><td>samples=2560</td>
<td>表示过了2560个样本</td>
</tr>
<tr class="row-even"><td>AvgCost</td>
<td>每个pass的第0个batch到当前batch所有样本的平均cost</td>
</tr>
<tr class="row-odd"><td>CurrentCost</td>
<td>当前log_period个batch所有样本的平均cost</td>
</tr>
<tr class="row-even"><td>Eval: classification_error_evaluator</td>
<td>每个pass的第0个batch到当前batch所有样本的平均分类错误率</td>
</tr>
<tr class="row-odd"><td>CurrentEval: classification_error_evaluator</td>
<td>当前log_period个batch所有样本的平均分类错误率</td>
</tr>
</tbody>
</table>
</div></blockquote>
</div>
</div>
</div>
</div>
</div>
<footer>
<hr/>
<div role="contentinfo">
<p>
&copy; Copyright 2016, PaddlePaddle developers.
</p>
</div>
Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/snide/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT:'../../',
VERSION:'',
COLLAPSE_INDEX:false,
FILE_SUFFIX:'.html',
HAS_SOURCE: true,
SOURCELINK_SUFFIX: ".txt",
};
</script>
<script type="text/javascript" src="../../_static/jquery.js"></script>
<script type="text/javascript" src="../../_static/underscore.js"></script>
<script type="text/javascript" src="../../_static/doctools.js"></script>
<script type="text/javascript" src="../../_static/translations.js"></script>
<script type="text/javascript" src="https://cdn.bootcss.com/mathjax/2.7.0/MathJax.js"></script>
<script type="text/javascript" src="../../_static/js/theme.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script>
<script src="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/js/perfect-scrollbar.jquery.min.js"></script>
<script src="../../_static/js/paddle_doc_init.js"></script>
</body>
</html>