Commit 18e5edf6 authored by gangliao, committed by GitHub

Merge pull request #1976 from luotao1/release/0.10.0

Merge commits from the release/0.10.0 branch into the develop branch
# Release v0.10.0
We are glad to release version 0.10.0. In this version, we are happy to release the new
[Python API](http://research.baidu.com/paddlepaddles-new-api-simplifies-deep-learning-programs/).
- Our old Python API is out of date. It is hard to learn and hard to
  use. To write a PaddlePaddle program using the old API, we had to write
  at least two Python files: a `data provider` and another file that defines
  the network topology. Users start a PaddlePaddle job by running the
  `paddle_trainer` C++ program, which calls the Python interpreter to run the
  network topology configuration script and then starts the training loop,
  which iteratively calls the data provider function to load minibatches.
  This prevents us from writing a Python program in a modern way, e.g., in a
  Jupyter Notebook.
- The new API, which we often refer to as the *v2 API*, allows us to write
  much shorter Python programs that define the network and the data in a single
  .py file. Such a program can also run in a Jupyter Notebook, since the entry
  point is a Python program and PaddlePaddle runs as a shared library loaded
  and invoked by that Python program. A minimal sketch of such a single-file
  program follows this list.
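For illustration, here is a minimal sketch of a single-file v2 program, in the spirit of the Deep Learning 101 examples (the MNIST classifier and its hyperparameters are illustrative, not part of these release notes):

```python
import paddle.v2 as paddle

paddle.init(use_gpu=False, trainer_count=1)

# The topology and the data pipeline live together in one .py file
# (or one notebook cell) -- no separate data provider script.
images = paddle.layer.data(
    name='pixel', type=paddle.data_type.dense_vector(784))
label = paddle.layer.data(
    name='label', type=paddle.data_type.integer_value(10))
predict = paddle.layer.fc(
    input=images, size=10, act=paddle.activation.Softmax())
cost = paddle.layer.classification_cost(input=predict, label=label)

parameters = paddle.parameters.create(cost)
trainer = paddle.trainer.SGD(
    cost=cost,
    parameters=parameters,
    update_equation=paddle.optimizer.Momentum(learning_rate=0.01))

# The entry point is plain Python: PaddlePaddle is a library call away.
trainer.train(
    reader=paddle.batch(paddle.dataset.mnist.train(), batch_size=128),
    num_passes=5)
```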
Based on the new API, we delivered an online interactive
book, [Deep Learning 101](http://book.paddlepaddle.org/index.en.html),
and [its Chinese version](http://book.paddlepaddle.org/).

We also worked on updating our online documentation to describe the new API,
but this is ongoing work; we will release more documentation improvements in
the next version.

We also worked on bringing the new API to distributed model training (via MPI
and Kubernetes). This work is ongoing; we will release more about it in the
next version.
## New Features
* Release the [new Python API](http://research.baidu.com/paddlepaddles-new-api-simplifies-deep-learning-programs/).
* Deep Learning 101 book in [English](http://book.paddlepaddle.org/index.en.html) and [Chinese](http://book.paddlepaddle.org/).
* Support rectangular input for CNNs.
* Support stride pooling for `seqlastin` and `seqfirstin`.
* Expose `seq_concat_layer/seq_reshape_layer` in `trainer_config_helpers`.
* Add dataset package: CIFAR, MNIST, IMDB, WMT14, CONLL05, movielens, imikolov.
* Add PriorBox layer for Single Shot Multibox Detection.
* Add smooth L1 cost (see the note after this list).
* Add data reader creator and data reader decorator for the v2 API (see the sketch after this list).
* Add the CPU implementation of cmrnorm projection.
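For reference, the smooth L1 cost is the usual Huber-style loss from the detection literature (a sketch of the standard definition; the exact scaling in Paddle's implementation may differ):

```latex
\mathrm{smooth}_{L_1}(x) =
\begin{cases}
  0.5\,x^{2} & \text{if } |x| < 1,\\
  |x| - 0.5  & \text{otherwise.}
\end{cases}
```

And a minimal sketch of the reader creator / reader decorator pattern in the v2 API (`make_reader` is an illustrative helper, not a Paddle API; `paddle.reader.shuffle` and `paddle.batch` are the built-in decorators as we understand them):

```python
import paddle.v2 as paddle

# A "reader creator" returns a reader: a zero-argument function that
# yields one training sample per iteration.
def make_reader(samples):
    def reader():
        for features, label in samples:
            yield features, label
    return reader

# "Reader decorators" wrap a reader to add behavior such as shuffling
# and batching without touching the underlying data source.
train_reader = paddle.batch(
    paddle.reader.shuffle(paddle.dataset.mnist.train(), buf_size=8192),
    batch_size=128)
```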
## Improvements
* Support Python virtualenv for `paddle_trainer`.
* Add pre-commit hooks to automatically format our code.
* Upgrade protobuf to version 3.x.
* Add an option to check data types in the Python data provider.
* Speed up the backward pass of the average layer on GPU.
* Documentation refinement.
* Check dead links in documents using Travis-CI.
* Add an example explaining `sparse_vector`.
* Simplify the data processing flow for Quick Start.
* Support cuDNN deconvolution.
* Add a data feeder in the v2 API.
* Support predicting samples from `sys.stdin` in the sentiment demo.
* Provide a multi-process interface for image preprocessing.
* Add a benchmark document for the v1 API.
* Add ReLU in `layer_math.py`.
* Add packages for automatically downloading public datasets.
* Rename `Argument::sumCost` to `Argument::sum`, since class `Argument` has nothing to do with cost.
* Expose `Argument::sum` to Python.
* Add a new `TensorExpression` implementation for matrix-related expression evaluations.
* Add lazy assignment for optimizing the calculation of a batch of multiple expressions.
* Add abstract class `Function` and its implementations:
  * `PadFunc` and `PadGradFunc`.
  * `ContextProjectionForwardFunc` and `ContextProjectionBackwardFunc`.
  * `CosSimForwardFunc` and `CosSimBackwardFunc`.
  * `CrossMapNormalFunc` and `CrossMapNormalGradFunc`.
  * `MulFunc`.
* Add classes `AutoCompare` and `FunctionCompare`, which make it easier to write unit tests comparing the GPU and CPU versions of a function.
* Generate `libpaddle_test_main.a` and remove the main function inside the test files.
* Support dense numpy vectors in `PyDataProvider2`.
* Clean up the code base and remove some copy-and-pasted code snippets:
  * Extract a `RowBuffer` class for `SparseRowMatrix`.
  * Clean the interface of `GradientMachine`.
  * Use the `override` keyword in layers.
  * Simplify `Evaluator::create`; use `ClassRegister` to create `Evaluator`s.
* Check the MD5 checksum when downloading demo datasets.
* Add `paddle::Error`, which intentionally replaces `LOG(FATAL)` in Paddle.
## Bug Fixes
* Check layer input types for `recurrent_group`.
* Don't run `clang-format` on .cu source files.
* Fix bugs in `LogActivation`.
* Fix the bug that ran `test_layerHelpers` multiple times.
* Fix the bug that made the seq2seq demo exceed the protobuf message size limit.
* Fix a bug in the data provider converter in GPU mode.
* Fix a bug in `GatedRecurrentLayer`.
* Fix a bug in `BatchNorm` when testing more than one model.
* Fix the broken unit test of `paramRelu`.
* Fix some compile-time warnings about `CpuSparseMatrix`.
* Fix `MultiGradientMachine` errors when `trainer_count > batch_size`.
* Fix bugs that prevented asynchronous data loading in `PyDataProvider2`.
# Release v0.9.0
......
set(CPACK_PACKAGE_NAME paddle)
-set(CPACK_PACKAGE_DESCRIPTION_SUMMARY "")
set(CPACK_PACKAGE_VERSION_MAJOR ${PADDLE_MAJOR_VERSION})
set(CPACK_PACKAGE_VERSION_MINOR ${PADDLE_MINOR_VERSION})
set(CPACK_PACKAGE_VERSION_PATCH ${PADDLE_PATCH_VERSION})
@@ -10,8 +9,9 @@ set(CPACK_DEBIAN_PACKAGE_ARCHITECTURE amd64)
set(CPACK_DEBIAN_PACKAGE_MAINTAINER PaddlePaddle Dev <paddle-dev@baidu.com>)
set(CPACK_PACKAGE_DESCRIPTION_SUMMARY "Paddle")
set(CPACK_PACKAGE_DESCRIPTION "")
-set(CPACK_DEBIAN_PACKAGE_DEPENDS "libatlas3-base, libgflags2, libgoogle-glog0, libprotobuf8, libpython2.7, libstdc++6, python-numpy, python-pip, python-pip-whl, python-protobuf")
+set(CPACK_DEBIAN_PACKAGE_DEPENDS "libpython2.7-dev, libstdc++6, python-pip, curl, libgfortran3, python-pip-whl")
set(CPACK_DEBIAN_PACKAGE_SECTION Devel)
+set(CPACK_DEBIAN_PACKAGE_VERSION ${PADDLE_VERSION})
set(CPACK_DEBIAN_PACKAGE_CONTROL_EXTRA "${PROJ_ROOT}/paddle/scripts/deb/postinst")
#set(CPACK_GENERATOR "DEB")
# Start cpack
......
@@ -29,7 +29,7 @@ settings(
    batch_size=128,
    learning_rate=2e-3,
    learning_method=AdamOptimizer(),
-   average_window=0.5,
+   model_average=ModelAverage(0.5),
    regularization=L2Regularization(8e-4),
    gradient_clipping_threshold=25)
......
@@ -69,7 +69,8 @@ def gru_encoder_decoder(data_conf,
                        encoder_size=512,
                        decoder_size=512,
                        beam_size=3,
-                       max_length=250):
+                       max_length=250,
+                       error_clipping=50):
    """
    A wrapper for an attention version of GRU Encoder-Decoder network
    is_generating: whether this config is used for generating
@@ -90,9 +91,19 @@ def gru_encoder_decoder(data_conf,
        input=src_word_id,
        size=word_vector_dim,
        param_attr=ParamAttr(name='_source_language_embedding'))
-    src_forward = simple_gru(input=src_embedding, size=encoder_size)
+    src_forward = simple_gru(
+        input=src_embedding,
+        size=encoder_size,
+        naive=True,
+        gru_layer_attr=ExtraLayerAttribute(
+            error_clipping_threshold=error_clipping))
    src_backward = simple_gru(
-        input=src_embedding, size=encoder_size, reverse=True)
+        input=src_embedding,
+        size=encoder_size,
+        reverse=True,
+        naive=True,
+        gru_layer_attr=ExtraLayerAttribute(
+            error_clipping_threshold=error_clipping))
    encoded_vector = concat_layer(input=[src_forward, src_backward])
    with mixed_layer(size=decoder_size) as encoded_proj:
@@ -117,11 +128,13 @@ def gru_encoder_decoder(data_conf,
        decoder_inputs += full_matrix_projection(input=context)
        decoder_inputs += full_matrix_projection(input=current_word)
-        gru_step = gru_step_layer(
+        gru_step = gru_step_naive_layer(
            name='gru_decoder',
            input=decoder_inputs,
            output_mem=decoder_mem,
-            size=decoder_size)
+            size=decoder_size,
+            layer_attr=ExtraLayerAttribute(
+                error_clipping_threshold=error_clipping))
        with mixed_layer(
            size=target_dict_dim, bias_attr=True,
......
@@ -2,7 +2,8 @@
 ============
 .. toctree::
-  :maxdepth: 2
+  :maxdepth: 1
   build_and_install/index_cn.rst
-  basic_usage/index_cn.rst
+- `深度学习入门课程 <http://book.paddlepaddle.org/>`_
@@ -2,7 +2,8 @@ GET STARTED
 ============
 .. toctree::
-  :maxdepth: 2
+  :maxdepth: 1
   build_and_install/index_en.rst
-  basic_usage/index_en.rst
+- `Deep Learning 101 <http://book.paddlepaddle.org/index.en.html>`_
@@ -19,18 +19,18 @@
In PaddlePaddle, the following layers can take a double-level sequence as input and perform the corresponding computation.

-pooling_layer
-==============
+pooling
+========

-An example of using pooling_layer follows; see the :ref:`api_trainer_config_helpers_layers_pooling_layer` configuration API for details.
+An example of using pooling follows; see the :ref:`api_v2.layer_pooling` configuration API for details.

.. code-block:: bash

-    seq_pool = pooling_layer(input=layer,
-                             pooling_type=AvgPooling(),
+    seq_pool = pooling(input=layer,
+                       pooling_type=pooling.Max(),
                       agg_level=AggregateLevel.EACH_SEQUENCE)

-- `pooling_type` currently supports two kinds: MaxPooling() and AvgPooling().
+- `pooling_type` currently supports two kinds: pooling.Max() and pooling.Avg().

- When `agg_level=AggregateLevel.EACH_TIMESTEP` (the default):

@@ -47,7 +47,7 @@
last_seq and first_seq
=====================

-An example of using last_seq follows ( :ref:`api_trainer_config_helpers_layers_first_seq` is similar); see the :ref:`api_trainer_config_helpers_layers_last_seq` configuration API for details.
+An example of using last_seq follows ( :ref:`api_v2.layer_first_seq` is similar); see the :ref:`api_v2.layer_last_seq` configuration API for details.

.. code-block:: bash

@@ -65,14 +65,14 @@
- Input: must be a double-level sequence.
- Output: a single-level sequence, where each element is the last (or first) element of each subsequence in the double-level sequence.

-expand_layer
-============
+expand
+======

-An example of using expand_layer follows; see the :ref:`api_trainer_config_helpers_layers_expand_layer` configuration API for details.
+An example of using expand follows; see the :ref:`api_v2.layer_expand` configuration API for details.

.. code-block:: bash

-    expand = expand_layer(input=layer1,
+    ex = expand(input=layer1,
                expand_as=layer2,
                expand_level=ExpandLevel.FROM_TIMESTEP)
......
@@ -4,7 +4,6 @@ RNN相关模型
 .. toctree::
   :maxdepth: 1
-  rnn_config_cn.rst
   recurrent_group_cn.md
   hierarchical_layer_cn.rst
   hrnn_rnn_api_compare_cn.rst

RNN Models
==========
-.. toctree::
-  :maxdepth: 1
-  rnn_config_en.rst
@@ -5,7 +5,6 @@ PaddlePaddle 文档
   :maxdepth: 1
   getstarted/index_cn.rst
-  tutorials/index_cn.md
   howto/index_cn.rst
   api/index_cn.rst
   faq/index_cn.rst

@@ -5,8 +5,6 @@ PaddlePaddle Documentation
   :maxdepth: 1
   getstarted/index_en.rst
-  tutorials/index_en.md
   howto/index_en.rst
   api/index_en.rst
   about/index_en.rst
\ No newline at end of file
@@ -114,10 +114,7 @@
 </ul>
 </div>
 <ul class="site-page-links">
-  <li><a>Home</a></li>
-  <li><a>Get Started</a></li>
-  <li class="active"><a>Documentation</a></li>
-  <li><a>About Us</a></li>
+  <li><a href="/">Home</a></li>
 </ul>
 </div>
 <div class="doc-module">
@@ -137,7 +134,7 @@
 {{ toctree }}
 {% endblock %}
 </nav>
-{% if toc %}
+{% if False %}
 <nav class="local-toc">{{ toc }}</nav>
 {% endif %}
 <section class="doc-content-wrap">
@@ -168,7 +165,8 @@
 VERSION:'{{ release|e }}',
 COLLAPSE_INDEX:false,
 FILE_SUFFIX:'{{ '' if no_search_suffix else file_suffix }}',
-HAS_SOURCE: {{ has_source|lower }}
+HAS_SOURCE: {{ has_source|lower }},
+SOURCELINK_SUFFIX: ".txt",
 };
 </script>
 {%- for scriptfile in script_files %}
......
@@ -12,7 +12,7 @@ endif()
add_library(paddle_function STATIC ${cpp_files} ${cu_objs})
add_dependencies(paddle_function ${external_project_dependencies})
+add_dependencies(paddle_function gen_proto_cpp)
if(WITH_GPU)
if(WITH_TESTING)
......
@@ -48,8 +48,7 @@ lstm = lstmemory_group(
    size=hidden_dim,
    act=TanhActivation(),
    gate_act=SigmoidActivation(),
-    state_act=TanhActivation(),
-    lstm_layer_attr=ExtraLayerAttribute(error_clipping_threshold=50))
+    state_act=TanhActivation())
lstm_last = last_seq(input=lstm)
......
@@ -51,8 +51,7 @@ def lstm_group(lstm_group_input):
        size=hidden_dim,
        act=TanhActivation(),
        gate_act=SigmoidActivation(),
-        state_act=TanhActivation(),
-        lstm_layer_attr=ExtraLayerAttribute(error_clipping_threshold=50))
+        state_act=TanhActivation())
    return lstm_output
......
+#!/bin/bash
+set -e
+echo "Post install paddle debian package."
+echo "Install some python package used for paddle. You can run "
+echo "    pip install /usr/opt/paddle/share/wheels/*.whl to install them."
+find /usr/ -name '*paddle*.whl' | xargs pip install
@@ -5,13 +5,8 @@ set -e
# Set BASE_IMAGE according to env variables
if [ ${WITH_GPU} == "ON" ]; then
  BASE_IMAGE="nvidia/cuda:8.0-cudnn5-runtime-ubuntu14.04"
-  # additional packages to install when building gpu images
-  GPU_DOCKER_PKG="python-pip python-dev"
else
-  BASE_IMAGE="python:2.7.13-slim"
-  # FIXME: python base image uses different python version than WITH_GPU
-  # need to change PYTHONHOME to /usr/local when using python base image
-  CPU_DOCKER_PYTHON_HOME_ENV="ENV PYTHONHOME /usr/local"
+  BASE_IMAGE="ubuntu:14.04"
fi
DOCKERFILE_GPU_ENV=""
@@ -66,10 +61,7 @@ if [ ${WITH_DOC} == "ON" ]; then
  rm -rf /paddle/build_doc
fi
# generate deb package for current build
-# FIXME(typhoonzero): should we remove paddle/scripts/deb ?
-# FIXME: CPACK_DEBIAN_PACKAGE_DEPENDS removes all dev dependencies, must
-# install them in docker
-cpack -D CPACK_GENERATOR='DEB' -D CPACK_DEBIAN_PACKAGE_DEPENDS="" ..
+cpack -D CPACK_GENERATOR='DEB' ..
if [[ ${WOBOQ:-OFF} == 'ON' ]]; then
  apt-get install -y clang-3.8 llvm-3.8 libclang-3.8-dev
@@ -97,32 +89,30 @@ fi
paddle version
-if [[ -n ${APT_MIRROR} ]]; then
-  MIRROR_UPDATE="sed -i '${APT_MIRROR}' /etc/apt/sources.list && \\"
-else
-  MIRROR_UPDATE="\\"
-fi
cat > /paddle/build/Dockerfile <<EOF
FROM ${BASE_IMAGE}
MAINTAINER PaddlePaddle Authors <paddle-dev@baidu.com>
ENV HOME /root
ENV LANG en_US.UTF-8
# Use Fix locales to en_US.UTF-8
-RUN ${MIRROR_UPDATE}
-    apt-get update && \
-    apt-get install -y libgfortran3 libpython2.7 ${GPU_DOCKER_PKG} && \
-    apt-get clean -y && \
-    pip install --upgrade pip && \
-    pip install -U 'protobuf==3.1.0' requests numpy
+EOF
+
+if [[ -n ${APT_MIRROR} ]]; then
+cat >> /paddle/build/Dockerfile <<EOF
+RUN sed -i '${APT_MIRROR}' /etc/apt/sources.list
+EOF
+fi
+
+cat >> /paddle/build/Dockerfile <<EOF
# Use different deb file when building different type of images
-ADD *.deb /usr/local/opt/paddle/deb/
+ADD *.deb /
# run paddle version to install python packages first
-RUN dpkg -i /usr/local/opt/paddle/deb/*.deb && \
-    rm -f /usr/local/opt/paddle/deb/*.deb && \
-    find /usr/ -name '*paddle-*.whl' | xargs pip install && \
+RUN apt-get update &&\
+    apt-get install -y python-pip && pip install -U pip && \
+    dpkg -i /*.deb ; apt-get install -f -y && \
+    apt-get clean -y && \
+    rm -f /*.deb && \
    paddle version
-${CPU_DOCKER_PYTHON_HOME_ENV}
${DOCKERFILE_CUDNN_DSO}
${DOCKERFILE_GPU_ENV}
# default command shows the paddle version and exit
......
@@ -60,6 +60,7 @@ function deploy_docs() {
deploy_docs "master" "."
deploy_docs "develop" "./develop/"
+deploy_docs "release/0.10.0" "./release/0.10.0/"
# Check is there anything changed.
set +e
......
...@@ -23,7 +23,7 @@ setup(name="py_paddle", ...@@ -23,7 +23,7 @@ setup(name="py_paddle",
install_requires = [ install_requires = [
'nltk>=3.2.2', 'nltk>=3.2.2',
'numpy>=1.8.0', # The numpy is required. 'numpy>=1.8.0', # The numpy is required.
'protobuf>=${PROTOBUF_VERSION}' # The paddle protobuf version 'protobuf==${PROTOBUF_VERSION}' # The paddle protobuf version
], ],
url='http://www.paddlepaddle.org/', url='http://www.paddlepaddle.org/',
license='Apache 2.0', license='Apache 2.0',
......
@@ -208,12 +208,15 @@ class ExtraLayerAttribute(object):
                 drop_rate=None,
                 device=None):
        self.attr = dict()
-        if isinstance(error_clipping_threshold, float):
-            assert error_clipping_threshold > 0
-            self.attr["error_clipping_threshold"] = error_clipping_threshold
-        if isinstance(drop_rate, float):
-            assert drop_rate > 0
+        if error_clipping_threshold is not None:
+            error_clipping_threshold = float(error_clipping_threshold)
+            if error_clipping_threshold < 0:
+                raise ValueError("Error clipping must > 0")
+            self.attr['error_clipping_threshold'] = error_clipping_threshold
+        if drop_rate is not None:
+            drop_rate = float(drop_rate)
+            if drop_rate < 0:
+                raise ValueError("Dropout rate must > 0")
        self.attr["drop_rate"] = drop_rate
        if isinstance(device, int):
......
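As context for the hunk above: with the relaxed type check, error clipping can be configured per layer by passing any non-negative number. A hypothetical usage sketch (the layer names and sizes are illustrative, not from this commit):

```python
from paddle.trainer_config_helpers import *

data = data_layer(name='input', size=128)
# Clip the gradient (error) of this layer's output during the
# backward pass using the per-layer attribute.
fc = fc_layer(
    input=data,
    size=256,
    act=ReluActivation(),
    layer_attr=ExtraLayerAttribute(error_clipping_threshold=10.0))
```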
@@ -84,6 +84,7 @@ __all__ = [
    'GeneratedInput',
    'SubsequenceInput',
    'gru_step_layer',
+    'gru_step_naive_layer',
    'recurrent_layer',
    'BaseGeneratedInput',
    'conv_operator',
@@ -3086,6 +3087,78 @@ def gru_step_layer(input,
        activation=act)

+@wrap_bias_attr_default()
+@wrap_param_attr_default()
+@wrap_act_default(param_names=['gate_act'], act=SigmoidActivation())
+@wrap_act_default(act=TanhActivation())
+@wrap_name_default('gru_step_naive')
+@layer_support(ERROR_CLIPPING, DROPOUT)
+def gru_step_naive_layer(input,
+                         output_mem,
+                         size=None,
+                         name=None,
+                         act=None,
+                         gate_act=None,
+                         bias_attr=None,
+                         param_attr=None,
+                         layer_attr=None):
+    """
+    GRU Step Layer, but using MixedLayer to generate. It support ERROR_CLIPPING
+    and DROPOUT.
+
+    :param input:
+    :param output_mem:
+    :param size:
+    :param name:
+    :param act:
+    :param gate_act:
+    :param bias_attr:
+    :param param_attr:
+    :param layer_attr:
+    :return:
+    """
+    if input.size % 3 != 0:
+        raise ValueError("GruStep input size must be divided by 3")
+    if size is None:
+        size = input.size / 3
+
+    def __gate__(gate_name, offset):
+        with mixed_layer(
+                name=name + "_" + gate_name,
+                size=size,
+                layer_attr=layer_attr,
+                bias_attr=bias_attr,
+                act=gate_act) as gate:
+            gate += identity_projection(input=input, offset=offset)
+            gate += full_matrix_projection(
+                input=output_mem, param_attr=param_attr)
+        return gate
+
+    update_gate = __gate__("update", 0)
+    reset_gate = __gate__("reset", size)
+
+    with mixed_layer(
+            name=name + "_reset_output", bias_attr=False) as reset_output:
+        reset_output += dotmul_operator(a=output_mem, b=reset_gate)
+
+    with mixed_layer(
+            name=name + "_output_candidate",
+            size=size,
+            layer_attr=layer_attr,
+            bias_attr=bias_attr,
+            act=act) as output_candidate:
+        output_candidate += identity_projection(input=input, offset=2 * size)
+        output_candidate += full_matrix_projection(
+            input=reset_output, param_attr=param_attr)
+
+    with mixed_layer(name=name) as output:
+        output += identity_projection(output_mem)
+        output += dotmul_operator(a=output_mem, b=update_gate, scale=-1.0)
+        output += dotmul_operator(a=output_candidate, b=update_gate)
+
+    return output

@wrap_name_default()
@layer_support()
def get_output_layer(input, arg_name, name=None, layer_attr=None):
......
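Composed from `mixed_layer` projections, the `gru_step_naive_layer` added above computes the standard GRU step. As a sketch (assuming the three size-`size` blocks of `input` hold the update, reset, and candidate input projections, and the `U` matrices are the `full_matrix_projection` weights):

```latex
\begin{aligned}
u_t        &= \sigma\left(x_t^{(u)} + U_u h_{t-1} + b_u\right) && \text{update gate}\\
r_t        &= \sigma\left(x_t^{(r)} + U_r h_{t-1} + b_r\right) && \text{reset gate}\\
\tilde h_t &= \tanh\left(x_t^{(c)} + U_c\,(r_t \odot h_{t-1}) + b_c\right) && \text{candidate}\\
h_t        &= (1 - u_t) \odot h_{t-1} + u_t \odot \tilde h_t && \text{output}
\end{aligned}
```

The last line is exactly what the closing `mixed_layer` builds: `identity_projection(output_mem)` contributes h_{t-1}, the `scale=-1.0` dot-mul subtracts u_t ⊙ h_{t-1}, and the final dot-mul adds u_t ⊙ h̃_t.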
@@ -825,7 +825,8 @@ def gru_unit(input,
             gru_param_attr=None,
             act=None,
             gate_act=None,
-             gru_layer_attr=None):
+             gru_layer_attr=None,
+             naive=False):
    """
    Define calculations that a gated recurrent unit performs in a single time
    step. This function itself is not a recurrent layer, so that it can not be
@@ -857,7 +858,12 @@ def gru_unit(input,
    out_mem = memory(name=name, size=size)

-    gru_out = gru_step_layer(
+    if naive:
+        __step__ = gru_step_naive_layer
+    else:
+        __step__ = gru_step_layer
+
+    gru_out = __step__(
        name=name,
        input=input,
        output_mem=out_mem,
@@ -879,7 +885,8 @@ def gru_group(input,
              gru_param_attr=None,
              act=None,
              gate_act=None,
-              gru_layer_attr=None):
+              gru_layer_attr=None,
+              naive=False):
    """
    gru_group is a recurrent layer group version of Gated Recurrent Unit. It
    does exactly the same calculation as the grumemory layer does. A promising
@@ -928,7 +935,8 @@ def gru_group(input,
        gru_param_attr=gru_param_attr,
        act=act,
        gate_act=gate_act,
-        gru_layer_attr=gru_layer_attr)
+        gru_layer_attr=gru_layer_attr,
+        naive=naive)

    return recurrent_group(
        name='%s_recurrent_group' % name,
@@ -949,7 +957,8 @@ def simple_gru(input,
               gru_param_attr=None,
               act=None,
               gate_act=None,
-               gru_layer_attr=None):
+               gru_layer_attr=None,
+               naive=False):
    """
    You maybe see gru_step_layer, grumemory in layers.py, gru_unit, gru_group,
    simple_gru in network.py. The reason why there are so many interfaces is
@@ -1018,7 +1027,8 @@ def simple_gru(input,
        gru_param_attr=gru_param_attr,
        act=act,
        gate_act=gate_act,
-        gru_layer_attr=gru_layer_attr)
+        gru_layer_attr=gru_layer_attr,
+        naive=naive)

@wrap_name_default('simple_gru2')
......
@@ -320,6 +320,7 @@ layers {
  }
  }
  drop_rate: 0.5
+  error_clipping_threshold: 40.0
}
parameters {
  name: "___embedding_0__.w0"
......
@@ -356,6 +356,9 @@ def mixed(size=0,
    return MixedLayerV2(size, input, name, act, bias_attr, layer_attr)

+mixed.__doc__ = conf_helps.mixed_layer.__doc__

class RecurrentLayerInput(Layer):
    def __init__(self, recurrent_name, index, parent_layers):
        parents_len = len(parent_layers)
@@ -404,6 +407,8 @@ data.__name__ = 'data'
AggregateLevel = conf_helps.layers.AggregateLevel
ExpandLevel = conf_helps.layers.ExpandLevel
memory = MemoryV2
+memory.__name__ = 'memory'
+memory.__doc__ = conf_helps.memory.__doc__

def __layer_name_mapping__(inname):
@@ -512,6 +517,9 @@ def recurrent_group(step, input, name=None):
    return retv

+recurrent_group.__doc__ = conf_helps.recurrent_group.__doc__

@wrap_name_default()
def beam_search(step,
                input,
@@ -579,6 +587,8 @@ def beam_search(step,
    return tmp

+beam_search.__doc__ = conf_helps.beam_search.__doc__

__projection_names__ = filter(lambda x: x.endswith('_projection'),
                              dir(conf_helps))
......
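The hunk above copies `__doc__` (and, for `memory`, `__name__`) by hand so that the v2 wrappers inherit the documentation of their `conf_helps` originals. A generic sketch of the same idea using the standard library:

```python
import functools

def reexport(fn):
    # functools.wraps copies __name__, __doc__, and other metadata
    # from fn onto the wrapper -- what the diff above does manually.
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs)
    return wrapper
```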
@@ -15,6 +15,9 @@ setup(name='paddle',
      description='Parallel Distributed Deep Learning',
      install_requires=[
          "requests",
+          "numpy",
+          "protobuf==${PROTOBUF_VERSION}",
+          "matplotlib",
      ],
      packages=packages,
      package_dir={
......