Commit cad67feb, authored by Luo Tao

Merge branch 'develop' into doc2

...@@ -2,8 +2,8 @@ cmake_minimum_required(VERSION 2.8) ...@@ -2,8 +2,8 @@ cmake_minimum_required(VERSION 2.8)
project(paddle CXX C) project(paddle CXX C)
set(PADDLE_MAJOR_VERSION 0) set(PADDLE_MAJOR_VERSION 0)
set(PADDLE_MINOR_VERSION 8) set(PADDLE_MINOR_VERSION 9)
set(PADDLE_PATCH_VERSION 0b3) set(PADDLE_PATCH_VERSION 0a0)
set(PADDLE_VERSION ${PADDLE_MAJOR_VERSION}.${PADDLE_MINOR_VERSION}.${PADDLE_PATCH_VERSION}) set(PADDLE_VERSION ${PADDLE_MAJOR_VERSION}.${PADDLE_MINOR_VERSION}.${PADDLE_PATCH_VERSION})
set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} "${CMAKE_SOURCE_DIR}/cmake") set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} "${CMAKE_SOURCE_DIR}/cmake")
......
...@@ -29,6 +29,7 @@ settings( ...@@ -29,6 +29,7 @@ settings(
batch_size=128, batch_size=128,
learning_rate=2e-3, learning_rate=2e-3,
learning_method=AdamOptimizer(), learning_method=AdamOptimizer(),
average_window=0.5,
regularization=L2Regularization(8e-4), regularization=L2Regularization(8e-4),
gradient_clipping_threshold=25) gradient_clipping_threshold=25)
......
...@@ -17,7 +17,7 @@ PaddlePaddle does not need any preprocessing to sequence data, such as padding. ...@@ -17,7 +17,7 @@ PaddlePaddle does not need any preprocessing to sequence data, such as padding.
.. code-block:: python .. code-block:: python
settings.slots = [ settings.input_types = [
integer_value_sequence(len(settings.src_dict)), integer_value_sequence(len(settings.src_dict)),
integer_value_sequence(len(settings.trg_dict)), integer_value_sequence(len(settings.trg_dict)),
integer_value_sequence(len(settings.trg_dict))] integer_value_sequence(len(settings.trg_dict))]
......
...@@ -6,7 +6,7 @@ Sentiment analysis is also used to monitor social media based on large amount of ...@@ -6,7 +6,7 @@ Sentiment analysis is also used to monitor social media based on large amount of
On the other hand, grabbing the user comments of products and analyzing their sentiment are useful to understand user preferences for companies, products, even competing products. On the other hand, grabbing the user comments of products and analyzing their sentiment are useful to understand user preferences for companies, products, even competing products.
This tutorial will guide you through the process of training a Long Short Term Memory (LSTM) Network to classify the sentiment of sentences from [Large Movie Review Dataset](http://ai.stanford.edu/~amaas/data/sentiment/), sometimes known as the [Internet Movie Database (IMDB)](http://ai.stanford.edu/~amaas/papers/wvSent_acl2011.pdf). This dataset contains movie reviews along with their associated binary sentiment polarity labels, namely positive and negative. So randomly guessing yields 50% accuracy. This tutorial will guide you through the process of training a Long Short Term Memory (LSTM) Network to classify the sentiment of sentences from [Large Movie Review Dataset](http://ai.stanford.edu/~amaas/data/sentiment/), sometimes known as the Internet Movie Database (IMDB). This dataset contains movie reviews along with their associated binary sentiment polarity labels, namely positive and negative. So randomly guessing yields 50% accuracy.
## Data Preparation ## Data Preparation
...@@ -39,7 +39,7 @@ imdbEr.txt imdb.vocab README test train ...@@ -39,7 +39,7 @@ imdbEr.txt imdb.vocab README test train
* imdbEr.txt: expected rating for each token in imdb.vocab. * imdbEr.txt: expected rating for each token in imdb.vocab.
* README: data documentation. * README: data documentation.
Both train and test set directory contains: The file in train set directory is as follows. The test set also contains them except `unsup` and `urls_unsup.txt`.
``` ```
labeledBow.feat neg pos unsup unsupBow.feat urls_neg.txt urls_pos.txt urls_unsup.txt labeledBow.feat neg pos unsup unsupBow.feat urls_neg.txt urls_pos.txt urls_unsup.txt
...@@ -151,6 +151,7 @@ settings( ...@@ -151,6 +151,7 @@ settings(
batch_size=128, batch_size=128,
learning_rate=2e-3, learning_rate=2e-3,
learning_method=AdamOptimizer(), learning_method=AdamOptimizer(),
average_window=0.5,
regularization=L2Regularization(8e-4), regularization=L2Regularization(8e-4),
gradient_clipping_threshold=25 gradient_clipping_threshold=25
) )
...@@ -163,17 +164,18 @@ stacked_lstm_net(dict_dim, class_dim=class_dim, ...@@ -163,17 +164,18 @@ stacked_lstm_net(dict_dim, class_dim=class_dim,
* **Data Definition**: * **Data Definition**:
* get\_config\_arg(): get arguments set by `--config_args=xx` on the command line. * get\_config\_arg(): get arguments set by `--config_args=xx` on the command line.
* Define TrainData and TestData provider, here using Python interface (PyDataProviderWrapper) of PaddlePaddle to load data. For details, you can refer to the document of PyDataProvider. * Define data provider, here using Python interface to load data. For details, you can refer to the document of PyDataProvider2.
* **Algorithm Configuration**: * **Algorithm Configuration**:
* use sgd algorithm.
* use adam optimization.
* set batch size of 128. * set batch size of 128.
* set average sgd window.
* set global learning rate. * set global learning rate.
* use adam optimization.
* set average sgd window.
* set L2 regularization.
* set gradient clipping threshold.
* **Network Configuration**: * **Network Configuration**:
* dict_dim: get dictionary dimension. * dict_dim: dictionary dimension.
* class_dim: set category number, IMDB has two label, namely positive and negative label. * class_dim: category number, IMDB has two label, namely positive and negative label.
* `stacked_lstm_net`: predefined network as shown in Figure 3, use this network by default. * `stacked_lstm_net`: predefined network as shown in Figure 3, use this network by default.
* `bidirectional_lstm_net`: predefined network as shown in Figure 2. * `bidirectional_lstm_net`: predefined network as shown in Figure 2.
......
...@@ -60,7 +60,7 @@ Implement C++ Class ...@@ -60,7 +60,7 @@ Implement C++ Class
The C++ class of the layer implements the initialization, forward, and backward part of the layer. The fully connected layer is at :code:`paddle/gserver/layers/FullyConnectedLayer.h` and :code:`paddle/gserver/layers/FullyConnectedLayer.cpp`. We list simplified version of the code below. The C++ class of the layer implements the initialization, forward, and backward part of the layer. The fully connected layer is at :code:`paddle/gserver/layers/FullyConnectedLayer.h` and :code:`paddle/gserver/layers/FullyConnectedLayer.cpp`. We list simplified version of the code below.
It needs to derive the base class :code:`paddle::BaseLayer`, and it needs to override the following functions: It needs to derive the base class :code:`paddle::Layer`, and it needs to override the following functions:
- constructor and destructor. - constructor and destructor.
- :code:`init` function. It is used to initialize the parameters and settings. - :code:`init` function. It is used to initialize the parameters and settings.
......
#!/bin/bash #!/bin/bash
set -e set -e
apt-get update
apt-get install -y dh-make apt-get install -y dh-make
cd ~ cd ~
mkdir -p ~/dist/gpu mkdir -p ~/dist/gpu
mkdir -p ~/dist/cpu mkdir -p ~/dist/cpu
mkdir -p ~/dist/cpu-noavx mkdir -p ~/dist/cpu-noavx
mkdir -p ~/dist/gpu-noavx mkdir -p ~/dist/gpu-noavx
git clone https://github.com/baidu/Paddle.git paddle
cd paddle cd paddle
mkdir build mkdir build
cd build cd build
......
...@@ -3,6 +3,6 @@ set -e ...@@ -3,6 +3,6 @@ set -e
docker build -t build_paddle_deb . docker build -t build_paddle_deb .
rm -rf dist rm -rf dist
mkdir -p dist mkdir -p dist
docker run -v$PWD/dist:/root/dist --name tmp_build_deb_container build_paddle_deb docker run -v$PWD/dist:/root/dist -v $PWD/../../../..:/root/paddle --name tmp_build_deb_container build_paddle_deb
docker rm tmp_build_deb_container docker rm tmp_build_deb_container
docker rmi build_paddle_deb docker rmi build_paddle_deb
FROM ubuntu:14.04 FROM ubuntu:14.04
MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com> MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com>
COPY build.sh /root/ COPY build.sh /root/
ENV GIT_CHECKOUT=develop ENV GIT_CHECKOUT=v0.9.0a0
ENV WITH_GPU=OFF ENV WITH_GPU=OFF
ENV IS_DEVEL=OFF ENV IS_DEVEL=OFF
ENV WITH_DEMO=OFF ENV WITH_DEMO=OFF
......
FROM ubuntu:14.04 FROM ubuntu:14.04
MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com> MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com>
COPY build.sh /root/ COPY build.sh /root/
ENV GIT_CHECKOUT=develop ENV GIT_CHECKOUT=v0.9.0a0
ENV WITH_GPU=OFF ENV WITH_GPU=OFF
ENV IS_DEVEL=ON ENV IS_DEVEL=ON
ENV WITH_DEMO=ON ENV WITH_DEMO=ON
......
FROM ubuntu:14.04 FROM ubuntu:14.04
MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com> MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com>
COPY build.sh /root/ COPY build.sh /root/
ENV GIT_CHECKOUT=develop ENV GIT_CHECKOUT=v0.9.0a0
ENV WITH_GPU=OFF ENV WITH_GPU=OFF
ENV IS_DEVEL=ON ENV IS_DEVEL=ON
ENV WITH_DEMO=OFF ENV WITH_DEMO=OFF
......
FROM ubuntu:14.04 FROM ubuntu:14.04
MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com> MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com>
COPY build.sh /root/ COPY build.sh /root/
ENV GIT_CHECKOUT=develop ENV GIT_CHECKOUT=v0.9.0a0
ENV WITH_GPU=OFF ENV WITH_GPU=OFF
ENV IS_DEVEL=OFF ENV IS_DEVEL=OFF
ENV WITH_DEMO=OFF ENV WITH_DEMO=OFF
......
FROM ubuntu:14.04 FROM ubuntu:14.04
MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com> MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com>
COPY build.sh /root/ COPY build.sh /root/
ENV GIT_CHECKOUT=develop ENV GIT_CHECKOUT=v0.9.0a0
ENV WITH_GPU=OFF ENV WITH_GPU=OFF
ENV IS_DEVEL=ON ENV IS_DEVEL=ON
ENV WITH_DEMO=ON ENV WITH_DEMO=ON
......
FROM ubuntu:14.04 FROM ubuntu:14.04
MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com> MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com>
COPY build.sh /root/ COPY build.sh /root/
ENV GIT_CHECKOUT=develop ENV GIT_CHECKOUT=v0.9.0a0
ENV WITH_GPU=OFF ENV WITH_GPU=OFF
ENV IS_DEVEL=ON ENV IS_DEVEL=ON
ENV WITH_DEMO=OFF ENV WITH_DEMO=OFF
......
FROM nvidia/cuda:7.5-cudnn5-devel-ubuntu14.04 FROM nvidia/cuda:7.5-cudnn5-devel-ubuntu14.04
MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com> MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com>
COPY build.sh /root/ COPY build.sh /root/
ENV GIT_CHECKOUT=develop ENV GIT_CHECKOUT=v0.9.0a0
ENV WITH_GPU=ON ENV WITH_GPU=ON
ENV IS_DEVEL=OFF ENV IS_DEVEL=OFF
ENV WITH_DEMO=OFF ENV WITH_DEMO=OFF
......
FROM nvidia/cuda:7.5-cudnn5-devel-ubuntu14.04 FROM nvidia/cuda:7.5-cudnn5-devel-ubuntu14.04
MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com> MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com>
COPY build.sh /root/ COPY build.sh /root/
ENV GIT_CHECKOUT=develop ENV GIT_CHECKOUT=v0.9.0a0
ENV WITH_GPU=ON ENV WITH_GPU=ON
ENV IS_DEVEL=ON ENV IS_DEVEL=ON
ENV WITH_DEMO=ON ENV WITH_DEMO=ON
......
FROM nvidia/cuda:7.5-cudnn5-devel-ubuntu14.04 FROM nvidia/cuda:7.5-cudnn5-devel-ubuntu14.04
MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com> MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com>
COPY build.sh /root/ COPY build.sh /root/
ENV GIT_CHECKOUT=develop ENV GIT_CHECKOUT=v0.9.0a0
ENV WITH_GPU=ON ENV WITH_GPU=ON
ENV IS_DEVEL=ON ENV IS_DEVEL=ON
ENV WITH_DEMO=OFF ENV WITH_DEMO=OFF
......
FROM nvidia/cuda:7.5-cudnn5-devel-ubuntu14.04 FROM nvidia/cuda:7.5-cudnn5-devel-ubuntu14.04
MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com> MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com>
COPY build.sh /root/ COPY build.sh /root/
ENV GIT_CHECKOUT=develop ENV GIT_CHECKOUT=v0.9.0a0
ENV WITH_GPU=ON ENV WITH_GPU=ON
ENV IS_DEVEL=OFF ENV IS_DEVEL=OFF
ENV WITH_DEMO=OFF ENV WITH_DEMO=OFF
......
FROM nvidia/cuda:7.5-cudnn5-devel-ubuntu14.04 FROM nvidia/cuda:7.5-cudnn5-devel-ubuntu14.04
MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com> MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com>
COPY build.sh /root/ COPY build.sh /root/
ENV GIT_CHECKOUT=develop ENV GIT_CHECKOUT=v0.9.0a0
ENV WITH_GPU=ON ENV WITH_GPU=ON
ENV IS_DEVEL=ON ENV IS_DEVEL=ON
ENV WITH_DEMO=ON ENV WITH_DEMO=ON
......
FROM nvidia/cuda:7.5-cudnn5-devel-ubuntu14.04 FROM nvidia/cuda:7.5-cudnn5-devel-ubuntu14.04
MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com> MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com>
COPY build.sh /root/ COPY build.sh /root/
ENV GIT_CHECKOUT=develop ENV GIT_CHECKOUT=v0.9.0a0
ENV WITH_GPU=ON ENV WITH_GPU=ON
ENV IS_DEVEL=ON ENV IS_DEVEL=ON
ENV WITH_DEMO=OFF ENV WITH_DEMO=OFF
......
FROM PADDLE_BASE_IMAGE FROM PADDLE_BASE_IMAGE
MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com> MAINTAINER PaddlePaddle Dev Team <paddle-dev@baidu.com>
COPY build.sh /root/ COPY build.sh /root/
ENV GIT_CHECKOUT=develop ENV GIT_CHECKOUT=v0.9.0a0
ENV WITH_GPU=PADDLE_WITH_GPU ENV WITH_GPU=PADDLE_WITH_GPU
ENV IS_DEVEL=PADDLE_IS_DEVEL ENV IS_DEVEL=PADDLE_IS_DEVEL
ENV WITH_DEMO=PADDLE_WITH_DEMO ENV WITH_DEMO=PADDLE_WITH_DEMO
......
...@@ -28,6 +28,34 @@ function version(){ ...@@ -28,6 +28,34 @@ function version(){
echo " with_predict_sdk: @WITH_PREDICT_SDK@" echo " with_predict_sdk: @WITH_PREDICT_SDK@"
} }
function ver2num() {
# convert version to number.
if [ -z "$1" ]; then # empty argument
printf "%03d%03d%03d%03d%03d" 0
else
local VERN=$(echo $1 | sed 's#v##g' | sed 's#\.# #g' \
| sed 's#a# 0 #g' | sed 's#b# 1 #g' | sed 's#rc# 2 #g')
if [ `echo $VERN | wc -w` -eq 3 ] ; then
printf "%03d%03d%03d%03d%03d" $VERN 999 999
else
printf "%03d%03d%03d%03d%03d" $VERN
fi
fi
}
PADDLE_CONF_HOME="$HOME/.config/paddle"
mkdir -p ${PADDLE_CONF_HOME}
if [ -z "${PADDLE_NO_STAT+x}" ]; then
SERVER_VER=`curl -m 5 -X POST --data content="{ \"version\": \"@PADDLE_VERSION@\" }"\
-b ${PADDLE_CONF_HOME}/paddle.cookie \
-c ${PADDLE_CONF_HOME}/paddle.cookie \
http://api.paddlepaddle.org/version 2>/dev/null`
if [ $? -eq 0 ] && [ "$(ver2num @PADDLE_VERSION@)" -lt $(ver2num $SERVER_VER) ]; then
echo "Paddle has released a new version ${SERVER_VER}; you can get the install package at http://www.paddlepaddle.org"
fi
fi
MYDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" MYDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
......
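Note on the `ver2num` helper added in the hunk above: it flattens a version string into a fixed-width, zero-padded number so that a final release sorts after its alpha, beta, and rc builds. A minimal Python sketch of the same mapping follows, for illustration only. It is not part of this commit, and it assumes at most five numeric fields and only the `a`/`b`/`rc` tags that the shell function handles.

```python
def ver2num(version: str) -> str:
    """Map a version such as 'v0.9.0a0' to a 15-digit, zero-padded string."""
    if not version:
        # Empty input (e.g. the version request failed) sorts lowest.
        return "0" * 15
    text = version.replace("v", "")
    # Pre-release tags become a numeric rank: alpha=0, beta=1, rc=2.
    for tag, rank in (("rc", 2), ("a", 0), ("b", 1)):
        text = text.replace(tag, f" {rank} ")
    fields = [int(part) for part in text.replace(".", " ").split()]
    if len(fields) == 3:
        # A plain release outranks every pre-release of the same version.
        fields += [999, 999]
    # Pad missing fields with zeros, as printf does for missing arguments.
    fields += [0] * (5 - len(fields))
    return "".join(f"{field:03d}" for field in fields)


if __name__ == "__main__":
    for v in ("0.8.0b3", "v0.9.0a0", "0.9.0rc1", "0.9.0"):
        print(v, ver2num(v))
```

Because every result has the same width, comparing the strings (or their integer values) gives the same ordering that the script obtains with `-lt`.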