Commit f15aa809, authored by dangqingqing, committed by Yu Yang

Update installation document and quick start document.

* Add wheel into installation package.
* Improve quick start doc.

Change-Id: I6151f0569fef8b73248ad9aa3452ef5f0c1bb900
Parent 38d73571
......@@ -37,11 +37,12 @@ PaddlePaddle also support some build options, you have to install related librar
```bash
# necessary
sudo apt-get update
sudo apt-get install -y g++ make cmake build-essential libatlas-base-dev python python-pip libpython-dev m4 libprotobuf-dev protobuf-compiler python-protobuf python-numpy git
sudo apt-get install -y g++ make cmake build-essential libatlas-base-dev python python-pip libpython-dev m4 libprotobuf-dev protobuf-compiler python-protobuf python-numpy git
# optional
sudo apt-get install libgoogle-glog-dev
sudo apt-get install libgflags-dev
sudo apt-get install libgtest-dev
pip install wheel
pushd /usr/src/gtest
cmake .
make
......
......@@ -59,7 +59,7 @@ To build your text classification system, your code will need to perform five st
## Preprocess data into standardized format
In this example, you are going to use [Amazon electronic product review dataset](http://jmcauley.ucsd.edu/data/amazon/) to build a bunch of deep neural network models for text classification. Each text in this dataset is a product review. This dataset has two categories: “positive” and “negative”. Positive means the reviewer likes the product, while negative means the reviewer does not like the product.
`demo/quick_start` provides scripts for downloading and preprocessing the data, as shown below. Data processing takes several minutes (about 3 minutes on our machine).
`demo/quick_start` in the source code provides scripts for downloading and preprocessing the data, as shown below. Data processing takes several minutes (about 3 minutes on our machine).
```bash
cd demo/quick_start
......@@ -423,7 +423,7 @@ paddle train \
mv rank-00000 result.txt
```
There are several differences between training and inference network configurations.
Users can choose the best model based on the training log instead of the model `output/pass-00003`. There are several differences between training and inference network configurations.
- You do not need labels during inference.
- Outputs must be set to the classification probability layer (the output of the softmax layer) or to the id of the maximum probability (`max_id` layer). An example that outputs both the id and the probability is given in the code snippet.
- batch_size = 1.
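
The advice about choosing the best model from the training log, rather than always using `output/pass-00003`, can be sketched roughly as follows. This is only an illustration: the helper name `best_pass_dir` is hypothetical, the error values are made up, and the per-pass test errors are assumed to have been read from the training log by hand or by your own parser. Only the `output/pass-%05d` directory naming follows the text.

```python
def best_pass_dir(test_errors, output_dir="output"):
    """Return the model directory of the pass with the lowest test error.

    test_errors maps a pass number to its test classification error.
    How these values are extracted from the training log is left open here.
    """
    if not test_errors:
        raise ValueError("no test results given")
    # Pick the smallest error; on a tie, prefer the earlier pass.
    best = min(test_errors, key=lambda p: (test_errors[p], p))
    # Pass directories are assumed to be named like output/pass-00003.
    return "%s/pass-%05d" % (output_dir, best)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    errors = {0: 0.21, 1: 0.17, 2: 0.15, 3: 0.16}
    print(best_pass_dir(errors))
```

The returned path is the model directory that would be passed to the inference command in place of `output/pass-00003`.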
......
......@@ -32,7 +32,7 @@
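## Data Preparation
This problem uses the [Amazon electronic product review data](http://jmcauley.ucsd.edu/data/amazon/),
dividing reviews into two classes: positive reviews (positive samples) and negative reviews (negative samples). `demo/quick_start` provides a data download script
dividing reviews into two classes: positive reviews (positive samples) and negative reviews (negative samples). `demo/quick_start` in the source code provides a data download script
and a preprocessing script.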
```bash
......@@ -144,7 +144,7 @@ PyDataProviderWrapper</a>。
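We start from a basic logistic regression network and gradually demonstrate more advanced features. For more detailed network configuration,
see the <a href = "../../../doc/layer.html">Layer documentation</a>.
All configurations are in the `demo/quick_start` directory; the logistic regression network is listed first.
All configurations are in the `demo/quick_start` directory of the source code; the logistic regression network is listed first.
### Logistic Regression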
......@@ -407,7 +407,7 @@ paddle train \
mv rank-00000 result.txt
```
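The differences from the training network configuration are: no label-related layers are needed, outputs is set to the probability layer (the softmax output),
Here `output/pass-00003` is used as the example for prediction; users can pick the model with the best test result according to the training log. The differences from the training network configuration are: no label-related layers are needed, outputs is set to the probability layer (the softmax output),
batch_size is set to 1, no label data needs to be passed in, and the prediction data is given by the location of test_list.
The prediction results are saved as text in `result.txt`, one sample per line, in the following format: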
......