Commit c27e71e2, authored by Liu Yiqun

Merge branch 'develop' into cmake_protobuf

# Release v0.10.0
We are glad to release version 0.10.0. In this version, we are happy to release the new
[Python API](http://research.baidu.com/paddlepaddles-new-api-simplifies-deep-learning-programs/).
- Our old Python API is out of date: it is hard to learn and hard to
use. To write a PaddlePaddle program using the old API, we have to write
at least two Python files: one `data provider` and another that defines
the network topology. Users start a PaddlePaddle job by running the
`paddle_trainer` C++ program, which calls the Python interpreter to run the
network topology configuration script and then starts the training loop,
which iteratively calls the data provider function to load minibatches.
This prevents us from writing a Python program in a modern way, e.g., in a
Jupyter Notebook.
- The new API, which we often refer to as the *v2 API*, allows us to write
much shorter Python programs that define the network and the data in a single
`.py` file. This program can also run in a Jupyter Notebook, since the entry
point is a Python program and PaddlePaddle runs as a shared library loaded
and invoked by that Python program.

Based on the new API, we delivered an online interactive
book, [Deep Learning 101](http://book.paddlepaddle.org/index.en.html)
and [its Chinese version](http://book.paddlepaddle.org/).

We also worked on updating our online documentation to describe the new API.
This work is ongoing; we will release more documentation improvements
in the next version.

We also worked on bringing the new API to distributed model training (via MPI and
Kubernetes). This work is also ongoing; we will release more about it in the next
version.

## New Features
* Release the [new Python API](http://research.baidu.com/paddlepaddles-new-api-simplifies-deep-learning-programs/).
* Deep Learning 101 book in [English](http://book.paddlepaddle.org/index.en.html) and [Chinese](http://book.paddlepaddle.org/).
* Support rectangle input for CNN.
* Support stride pooling for seqlastin and seqfirstin.
* Expose `seq_concat_layer/seq_reshape_layer` in `trainer_config_helpers`.
* Add dataset package: CIFAR, MNIST, IMDB, WMT14, CONLL05, movielens, imikolov.
* Add Priorbox layer for Single Shot Multibox Detection.
* Add smooth L1 cost.
* Add data reader creator and data reader decorator for v2 API.
* Add the CPU implementation of cmrnorm projection.
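
The data reader creator and data reader decorator items in the list above follow a simple functional pattern. The sketch below is a minimal, standalone illustration in plain Python and does not call PaddlePaddle. It assumes that a reader is a zero-argument function returning an iterable of samples, that a reader creator is a function returning a reader, and that a decorator wraps one reader in another. The names `range_reader_creator`, `shuffle`, and `batch` are hypothetical and chosen for illustration only.

```python
import random


def range_reader_creator(n):
    """Reader creator (hypothetical): returns a reader that yields 0..n-1."""

    def reader():
        for i in range(n):
            yield i

    return reader


def shuffle(reader, buf_size, seed=None):
    """Reader decorator (hypothetical): shuffles samples within a buffer."""
    rng = random.Random(seed)

    def shuffled_reader():
        buf = []
        for sample in reader():
            buf.append(sample)
            if len(buf) >= buf_size:
                rng.shuffle(buf)
                for s in buf:
                    yield s
                buf = []
        # Flush whatever is left in the last, partially filled buffer.
        rng.shuffle(buf)
        for s in buf:
            yield s

    return shuffled_reader


def batch(reader, batch_size):
    """Reader decorator (hypothetical): groups samples into lists."""

    def batched_reader():
        current = []
        for sample in reader():
            current.append(sample)
            if len(current) == batch_size:
                yield current
                current = []
        if current:
            yield current

    return batched_reader


# Compose: create a reader, shuffle it, then group it into minibatches.
train_reader = batch(
    shuffle(range_reader_creator(10), buf_size=4, seed=0), batch_size=3)
batches = list(train_reader())
```

Because each decorator takes a reader and returns a reader, decorators can be chained freely, and the training loop only needs to call the final reader.
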
## Improvements
* Support Python virtualenv for `paddle_trainer`.
* Add pre-commit hooks, used to automatically format our code.
* Upgrade protobuf to version 3.x.
* Add an option to check data type in Python data provider.
* Speed up the backward pass of the average layer on GPU.
* Documentation refinement.
* Check dead links in documents using Travis-CI.
* Add an example explaining `sparse_vector`.
* Simplify data processing flow for Quick Start.
* Support CUDNN Deconv.
* Add data feeder in v2 API.
* Support predicting the samples from sys.stdin for sentiment demo.
* Provide multi-process interface for image preprocessing.
* Add benchmark document for v1 API.
* Add ReLU in `layer_math.py`.
* Add packages for automatically downloading public datasets.
* Rename `Argument::sumCost` to `Argument::sum` since class `Argument` has nothing to do with cost.
* Expose `Argument::sum` to Python.
* Add a new `TensorExpression` implementation for matrix-related expression evaluations.
* Add lazy assignment for optimizing the calculation of a batch of multiple expressions.
* Add abstract class `Function` and its implementations:
  * `PadFunc` and `PadGradFunc`.
  * `ContextProjectionForwardFunc` and `ContextProjectionBackwardFunc`.
  * `CosSimBackward` and `CosSimBackwardFunc`.
  * `CrossMapNormalFunc` and `CrossMapNormalGradFunc`.
  * `MulFunc`.
* Add classes `AutoCompare` and `FunctionCompare`, which make it easier to write unit tests that compare the GPU and CPU versions of a function.
* Generate `libpaddle_test_main.a` and remove the main function inside the test file.
* Support dense numpy vector in PyDataProvider2.
* Clean code base, remove some copy-n-pasted code snippets:
  * Extract `RowBuffer` class for `SparseRowMatrix`.
  * Clean the interface of `GradientMachine`.
  * Use `override` keyword in layer.
  * Simplify `Evaluator::create`, use `ClassRegister` to create `Evaluator`s.
* Check MD5 checksum when downloading demo's dataset.
* Add `paddle::Error`, which is intended to replace `LOG(FATAL)` in Paddle.
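
The MD5 check on downloaded demo datasets mentioned in the list above can be sketched with the Python standard library alone. This is an assumed, simplified version and not PaddlePaddle's actual download code; `md5_of_file` and `is_download_intact` are hypothetical helper names.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=4096):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_download_intact(path, expected_md5):
    """Compare the file's MD5 digest with the expected hex string."""
    return md5_of_file(path) == expected_md5.lower()


# Usage: write a small stand-in for a downloaded file and verify it.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    sample_path = tmp.name

expected = hashlib.md5(b"hello").hexdigest()
ok = is_download_intact(sample_path, expected)
bad = is_download_intact(sample_path, "0" * 32)
os.remove(sample_path)
```

A downloader built this way would typically re-fetch the file when the check fails instead of reusing a corrupt cached copy.
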
## Bug Fixes
* Check layer input types for `recurrent_group`.
* Don't run `clang-format` on `.cu` source files.
* Fix bugs with `LogActivation`.
* Fix the bug that runs `test_layerHelpers` multiple times.
* Fix the bug that the seq2seq demo exceeds protobuf message size limit.
* Fix the bug in dataprovider converter in GPU mode.
* Fix a bug in `GatedRecurrentLayer`.
* Fix a bug in `BatchNorm` when testing more than one model.
* Fix broken unit test of paramRelu.
* Fix some compile-time warnings about `CpuSparseMatrix`.
* Fix `MultiGradientMachine` error when `trainer_count > batch_size`.
* Fix bugs that prevent asynchronous data loading in `PyDataProvider2`.
# Release v0.9.0
......
...@@ -44,7 +44,6 @@ if(MKL_INC_DIR AND MKL_CORE_LIB AND MKL_SEQUENTIAL_LIB AND MKL_INTEL_LP64) ...@@ -44,7 +44,6 @@ if(MKL_INC_DIR AND MKL_CORE_LIB AND MKL_SEQUENTIAL_LIB AND MKL_INTEL_LP64)
message(STATUS "Found MKL (include: ${CBLAS_INC_DIR}, library: ${CBLAS_LIBRARIES})") message(STATUS "Found MKL (include: ${CBLAS_INC_DIR}, library: ${CBLAS_LIBRARIES})")
set(CBLAS_FOUND ON) set(CBLAS_FOUND ON)
if(${MKL_LAPACK_INC_DIR}) if(${MKL_LAPACK_INC_DIR})
add_definitions(-DPADDLE_USE_LAPACK)
message(STATUS "Found lapack in MKL (include: ${MKL_LAPACK_INC_DIR})") message(STATUS "Found lapack in MKL (include: ${MKL_LAPACK_INC_DIR})")
endif() endif()
return() # return file. return() # return file.
...@@ -80,7 +79,6 @@ if(ATLAS_INC_DIR AND ATLAS_CBLAS_LIB AND ATLAS_LIB AND NOT CBLAS_FOUND) ...@@ -80,7 +79,6 @@ if(ATLAS_INC_DIR AND ATLAS_CBLAS_LIB AND ATLAS_LIB AND NOT CBLAS_FOUND)
message(STATUS "Found ATLAS (include: ${CBLAS_INC_DIR}, library: ${CBLAS_LIBRARIES})") message(STATUS "Found ATLAS (include: ${CBLAS_INC_DIR}, library: ${CBLAS_LIBRARIES})")
set(CBLAS_FOUND ON) set(CBLAS_FOUND ON)
if(ATLAS_CLAPACK_INC_DIR) if(ATLAS_CLAPACK_INC_DIR)
add_definitions(-DPADDLE_USE_LAPACK)
set(CBLAS_INC_DIR ${CBLAS_INC_DIR} ${ATLAS_CLAPACK_INC_DIR}) set(CBLAS_INC_DIR ${CBLAS_INC_DIR} ${ATLAS_CLAPACK_INC_DIR})
message(STATUS "Found lapack in ATLAS (include: ${ATLAS_CLAPACK_INC_DIR})") message(STATUS "Found lapack in ATLAS (include: ${ATLAS_CLAPACK_INC_DIR})")
endif() endif()
...@@ -115,7 +113,6 @@ if(OPENBLAS_INC_DIR AND OPENBLAS_LIB) ...@@ -115,7 +113,6 @@ if(OPENBLAS_INC_DIR AND OPENBLAS_LIB)
message(STATUS "Found OpenBLAS (include: ${CBLAS_INC_DIR}, library: ${CBLAS_LIBRARIES})") message(STATUS "Found OpenBLAS (include: ${CBLAS_INC_DIR}, library: ${CBLAS_LIBRARIES})")
set(CBLAS_FOUND ON) set(CBLAS_FOUND ON)
if(OPENBLAS_LAPACKE_INC_DIR) if(OPENBLAS_LAPACKE_INC_DIR)
add_definitions(-DPADDLE_USE_LAPACK)
message(STATUS "Found lapack in OpenBLAS (include: ${OPENBLAS_LAPACKE_INC_DIR})") message(STATUS "Found lapack in OpenBLAS (include: ${OPENBLAS_LAPACKE_INC_DIR})")
endif() endif()
return() return()
......
...@@ -24,45 +24,17 @@ IF(NOT ${CBLAS_FOUND}) ...@@ -24,45 +24,17 @@ IF(NOT ${CBLAS_FOUND})
SET(CBLAS_LIBRARIES "${CBLAS_INSTALL_DIR}/lib/${LIBRARY_PREFIX}openblas${STATIC_LIBRARY_SUFFIX}" SET(CBLAS_LIBRARIES "${CBLAS_INSTALL_DIR}/lib/${LIBRARY_PREFIX}openblas${STATIC_LIBRARY_SUFFIX}"
CACHE FILEPATH "openblas library." FORCE) CACHE FILEPATH "openblas library." FORCE)
# check fortran compiler and library SET(COMMON_ARGS CC=${CMAKE_C_COMPILER} NO_LAPACK=1 NO_SHARED=1)
IF(ANDROID) IF(ANDROID)
SET(OPENBLAS_COMMIT "b5c96fcfcdc82945502a2303116a64d89985daf5") SET(OPENBLAS_COMMIT "b5c96fcfcdc82945502a2303116a64d89985daf5")
SET(OPTIONAL_ARGS HOSTCC=${HOST_C_COMPILER} TARGET=ARMV7 ARM_SOFTFP_ABI=1 NOFORTRAN=1 USE_THREAD=0 libs) SET(OPTIONAL_ARGS HOSTCC=${HOST_C_COMPILER} TARGET=ARMV7 ARM_SOFTFP_ABI=1 USE_THREAD=0 libs)
ELSEIF(RPI) ELSEIF(RPI)
SET(OPENBLAS_COMMIT "v0.2.19") SET(OPENBLAS_COMMIT "v0.2.19")
SET(OPTIONAL_ARGS HOSTCC=${HOST_C_COMPILER} TARGET=ARMV7 NOFORTRAN=1 USE_THREAD=0 libs) SET(OPTIONAL_ARGS HOSTCC=${HOST_C_COMPILER} TARGET=ARMV7 USE_THREAD=0 libs)
ELSE() ELSE()
IF(CMAKE_COMPILER_IS_GNUCC)
ENABLE_LANGUAGE(Fortran)
if (NOT CMAKE_Fortran_COMPILER_VERSION)
# cmake < 3.4 cannot get CMAKE_Fortran_COMPILER_VERSION directly.
execute_process(COMMAND ${CMAKE_Fortran_COMPILER} -dumpversion
OUTPUT_VARIABLE CMAKE_Fortran_COMPILER_VERSION)
endif()
string(REGEX MATCHALL "[0-9]+" Fortran_VERSION ${CMAKE_Fortran_COMPILER_VERSION})
list(GET Fortran_VERSION 0 Fortran_MAJOR)
list(GET Fortran_VERSION 1 Fortran_MINOR)
find_library(GFORTRAN_LIBRARY NAMES gfortran PATHS
/lib
/usr/lib
/usr/lib/gcc/x86_64-linux-gnu/${Fortran_MAJOR}.${Fortran_MINOR}/
/usr/lib/gcc/x86_64-linux-gnu/${Fortran_MAJOR}/)
if (NOT GFORTRAN_LIBRARY)
message(FATAL_ERROR "Cannot found gfortran library which it is used by openblas")
endif()
find_package(Threads REQUIRED)
LIST(APPEND CBLAS_LIBRARIES ${GFORTRAN_LIBRARY} ${CMAKE_THREAD_LIBS_INIT})
ENDIF(CMAKE_COMPILER_IS_GNUCC)
IF(NOT CMAKE_Fortran_COMPILER)
MESSAGE(FATAL_ERROR "To build lapack in libopenblas, "
"you need to set gfortran compiler: cmake .. -DCMAKE_Fortran_COMPILER=...")
ENDIF(NOT CMAKE_Fortran_COMPILER)
ADD_DEFINITIONS(-DPADDLE_USE_LAPACK)
SET(OPENBLAS_COMMIT "v0.2.19") SET(OPENBLAS_COMMIT "v0.2.19")
SET(OPENBLAS_ARGS FC=${CMAKE_Fortran_COMPILER} DYNAMIC_ARCH=1 libs netlib) SET(OPENBLAS_ARGS DYNAMIC_ARCH=1 libs)
ENDIF() ENDIF()
ExternalProject_Add( ExternalProject_Add(
...@@ -73,7 +45,7 @@ IF(NOT ${CBLAS_FOUND}) ...@@ -73,7 +45,7 @@ IF(NOT ${CBLAS_FOUND})
PREFIX ${CBLAS_SOURCES_DIR} PREFIX ${CBLAS_SOURCES_DIR}
INSTALL_DIR ${CBLAS_INSTALL_DIR} INSTALL_DIR ${CBLAS_INSTALL_DIR}
BUILD_IN_SOURCE 1 BUILD_IN_SOURCE 1
BUILD_COMMAND ${CMAKE_MAKE_PROGRAM} CC=${CMAKE_C_COMPILER} NO_SHARED=1 ${OPTIONAL_ARGS} BUILD_COMMAND ${CMAKE_MAKE_PROGRAM} ${COMMON_ARGS} ${OPTIONAL_ARGS}
INSTALL_COMMAND ${CMAKE_MAKE_PROGRAM} install NO_SHARED=1 PREFIX=<INSTALL_DIR> INSTALL_COMMAND ${CMAKE_MAKE_PROGRAM} install NO_SHARED=1 PREFIX=<INSTALL_DIR>
UPDATE_COMMAND "" UPDATE_COMMAND ""
CONFIGURE_COMMAND "" CONFIGURE_COMMAND ""
......
set(CPACK_PACKAGE_NAME paddle) set(CPACK_PACKAGE_NAME paddle)
set(CPACK_PACKAGE_DESCRIPTION_SUMMARY "")
set(CPACK_PACKAGE_VERSION_MAJOR ${PADDLE_MAJOR_VERSION}) set(CPACK_PACKAGE_VERSION_MAJOR ${PADDLE_MAJOR_VERSION})
set(CPACK_PACKAGE_VERSION_MINOR ${PADDLE_MINOR_VERSION}) set(CPACK_PACKAGE_VERSION_MINOR ${PADDLE_MINOR_VERSION})
set(CPACK_PACKAGE_VERSION_PATCH ${PADDLE_PATCH_VERSION}) set(CPACK_PACKAGE_VERSION_PATCH ${PADDLE_PATCH_VERSION})
...@@ -10,8 +9,9 @@ set(CPACK_DEBIAN_PACKAGE_ARCHITECTURE amd64) ...@@ -10,8 +9,9 @@ set(CPACK_DEBIAN_PACKAGE_ARCHITECTURE amd64)
set(CPACK_DEBIAN_PACKAGE_MAINTAINER PaddlePaddle Dev <paddle-dev@baidu.com>) set(CPACK_DEBIAN_PACKAGE_MAINTAINER PaddlePaddle Dev <paddle-dev@baidu.com>)
set(CPACK_PACKAGE_DESCRIPTION_SUMMARY "Paddle") set(CPACK_PACKAGE_DESCRIPTION_SUMMARY "Paddle")
set(CPACK_PACKAGE_DESCRIPTION "") set(CPACK_PACKAGE_DESCRIPTION "")
set(CPACK_DEBIAN_PACKAGE_DEPENDS "libatlas3-base, libgflags2, libgoogle-glog0, libprotobuf8, libpython2.7, libstdc++6, python-numpy, python-pip, python-pip-whl, python-protobuf") set(CPACK_DEBIAN_PACKAGE_DEPENDS "libpython2.7-dev, libstdc++6, python-pip, curl, libgfortran3, python-pip-whl")
set(CPACK_DEBIAN_PACKAGE_SECTION Devel) set(CPACK_DEBIAN_PACKAGE_SECTION Devel)
set(CPACK_DEBIAN_PACKAGE_VERSION ${PADDLE_VERSION})
set(CPACK_DEBIAN_PACKAGE_CONTROL_EXTRA "${PROJ_ROOT}/paddle/scripts/deb/postinst") set(CPACK_DEBIAN_PACKAGE_CONTROL_EXTRA "${PROJ_ROOT}/paddle/scripts/deb/postinst")
#set(CPACK_GENERATOR "DEB") #set(CPACK_GENERATOR "DEB")
# Start cpack # Start cpack
......
...@@ -29,7 +29,7 @@ settings( ...@@ -29,7 +29,7 @@ settings(
batch_size=128, batch_size=128,
learning_rate=2e-3, learning_rate=2e-3,
learning_method=AdamOptimizer(), learning_method=AdamOptimizer(),
average_window=0.5, model_average=ModelAverage(0.5),
regularization=L2Regularization(8e-4), regularization=L2Regularization(8e-4),
gradient_clipping_threshold=25) gradient_clipping_threshold=25)
......
...@@ -69,7 +69,8 @@ def gru_encoder_decoder(data_conf, ...@@ -69,7 +69,8 @@ def gru_encoder_decoder(data_conf,
encoder_size=512, encoder_size=512,
decoder_size=512, decoder_size=512,
beam_size=3, beam_size=3,
max_length=250): max_length=250,
error_clipping=50):
""" """
A wrapper for an attention version of GRU Encoder-Decoder network A wrapper for an attention version of GRU Encoder-Decoder network
is_generating: whether this config is used for generating is_generating: whether this config is used for generating
...@@ -90,9 +91,19 @@ def gru_encoder_decoder(data_conf, ...@@ -90,9 +91,19 @@ def gru_encoder_decoder(data_conf,
input=src_word_id, input=src_word_id,
size=word_vector_dim, size=word_vector_dim,
param_attr=ParamAttr(name='_source_language_embedding')) param_attr=ParamAttr(name='_source_language_embedding'))
src_forward = simple_gru(input=src_embedding, size=encoder_size) src_forward = simple_gru(
input=src_embedding,
size=encoder_size,
naive=True,
gru_layer_attr=ExtraLayerAttribute(
error_clipping_threshold=error_clipping))
src_backward = simple_gru( src_backward = simple_gru(
input=src_embedding, size=encoder_size, reverse=True) input=src_embedding,
size=encoder_size,
reverse=True,
naive=True,
gru_layer_attr=ExtraLayerAttribute(
error_clipping_threshold=error_clipping))
encoded_vector = concat_layer(input=[src_forward, src_backward]) encoded_vector = concat_layer(input=[src_forward, src_backward])
with mixed_layer(size=decoder_size) as encoded_proj: with mixed_layer(size=decoder_size) as encoded_proj:
...@@ -117,11 +128,13 @@ def gru_encoder_decoder(data_conf, ...@@ -117,11 +128,13 @@ def gru_encoder_decoder(data_conf,
decoder_inputs += full_matrix_projection(input=context) decoder_inputs += full_matrix_projection(input=context)
decoder_inputs += full_matrix_projection(input=current_word) decoder_inputs += full_matrix_projection(input=current_word)
gru_step = gru_step_layer( gru_step = gru_step_naive_layer(
name='gru_decoder', name='gru_decoder',
input=decoder_inputs, input=decoder_inputs,
output_mem=decoder_mem, output_mem=decoder_mem,
size=decoder_size) size=decoder_size,
layer_attr=ExtraLayerAttribute(
error_clipping_threshold=error_clipping))
with mixed_layer( with mixed_layer(
size=target_dict_dim, bias_attr=True, size=target_dict_dim, bias_attr=True,
......
...@@ -2,7 +2,8 @@ ...@@ -2,7 +2,8 @@
============ ============
.. toctree:: .. toctree::
:maxdepth: 2 :maxdepth: 1
build_and_install/index_cn.rst build_and_install/index_cn.rst
basic_usage/index_cn.rst
- `Introductory Deep Learning Course <http://book.paddlepaddle.org/>`_
...@@ -2,7 +2,8 @@ GET STARTED ...@@ -2,7 +2,8 @@ GET STARTED
============ ============
.. toctree:: .. toctree::
:maxdepth: 2 :maxdepth: 1
build_and_install/index_en.rst build_and_install/index_en.rst
basic_usage/index_en.rst
- `Deep Learning 101 <http://book.paddlepaddle.org/index.en.html>`_
...@@ -19,18 +19,18 @@ ...@@ -19,18 +19,18 @@
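In PaddlePaddle, the following layers can accept a nested (two-level) sequence as input and perform the corresponding computation. In PaddlePaddle, the following layers can accept a nested (two-level) sequence as input and perform the corresponding computation.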
pooling_layer pooling
============== ========
A usage example of pooling_layer follows; for details, see the :ref:`api_trainer_config_helpers_layers_pooling_layer` configuration API. A usage example of pooling follows; for details, see the :ref:`api_v2.layer_pooling` configuration API.
.. code-block:: bash .. code-block:: bash
seq_pool = pooling_layer(input=layer, seq_pool = pooling(input=layer,
pooling_type=AvgPooling(), pooling_type=pooling.Max(),
agg_level=AggregateLevel.EACH_SEQUENCE) agg_level=AggregateLevel.EACH_SEQUENCE)
- `pooling_type` currently supports two types: MaxPooling() and AvgPooling(). - `pooling_type` currently supports two types: pooling.Max() and pooling.Avg().
- When `agg_level=AggregateLevel.EACH_TIMESTEP` (the default): - When `agg_level=AggregateLevel.EACH_TIMESTEP` (the default):
...@@ -47,7 +47,7 @@ A usage example of pooling_layer follows; for details, see :ref:`api_trainer_config_helpers ...@@ -47,7 +47,7 @@ A usage example of pooling_layer follows; for details, see :ref:`api_trainer_config_helpers
last_seq and first_seq last_seq and first_seq
====================== ======================
A usage example of last_seq follows ( :ref:`api_trainer_config_helpers_layers_first_seq` is similar); for details, see the :ref:`api_trainer_config_helpers_layers_last_seq` configuration API. A usage example of last_seq follows ( :ref:`api_v2.layer_first_seq` is similar); for details, see the :ref:`api_v2.layer_last_seq` configuration API.
.. code-block:: bash .. code-block:: bash
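...@@ -65,14 +65,14 @@ A usage example of last_seq follows ( :ref:`api_trainer_config_helpers_layers_first_ ...@@ -65,14 +65,14 @@ A usage example of last_seq follows ( :ref:`api_trainer_config_helpers_layers_first_
- Input: must be a nested (two-level) sequence - Input: must be a nested (two-level) sequence
- Output: a single-level sequence in which each element is the last (or first) element of each subseq of the nested sequence. - Output: a single-level sequence in which each element is the last (or first) element of each subseq of the nested sequence.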
expand_layer expand
============ ======
A usage example of expand_layer follows; for details, see the :ref:`api_trainer_config_helpers_layers_expand_layer` configuration API. A usage example of expand follows; for details, see the :ref:`api_v2.layer_expand` configuration API.
.. code-block:: bash .. code-block:: bash
expand = expand_layer(input=layer1, ex = expand(input=layer1,
expand_as=layer2, expand_as=layer2,
expand_level=ExpandLevel.FROM_TIMESTEP) expand_level=ExpandLevel.FROM_TIMESTEP)
......
...@@ -4,7 +4,6 @@ RNN-Related Models ...@@ -4,7 +4,6 @@ RNN-Related Models
.. toctree:: .. toctree::
:maxdepth: 1 :maxdepth: 1
rnn_config_cn.rst
recurrent_group_cn.md recurrent_group_cn.md
hierarchical_layer_cn.rst hierarchical_layer_cn.rst
hrnn_rnn_api_compare_cn.rst hrnn_rnn_api_compare_cn.rst
RNN Models RNN Models
========== ==========
.. toctree::
:maxdepth: 1
rnn_config_en.rst
...@@ -5,7 +5,6 @@ PaddlePaddle Documentation ...@@ -5,7 +5,6 @@ PaddlePaddle Documentation
:maxdepth: 1 :maxdepth: 1
getstarted/index_cn.rst getstarted/index_cn.rst
tutorials/index_cn.md
howto/index_cn.rst howto/index_cn.rst
api/index_cn.rst api/index_cn.rst
faq/index_cn.rst faq/index_cn.rst
...@@ -5,8 +5,6 @@ PaddlePaddle Documentation ...@@ -5,8 +5,6 @@ PaddlePaddle Documentation
:maxdepth: 1 :maxdepth: 1
getstarted/index_en.rst getstarted/index_en.rst
tutorials/index_en.md
howto/index_en.rst howto/index_en.rst
api/index_en.rst api/index_en.rst
about/index_en.rst about/index_en.rst
\ No newline at end of file
...@@ -114,10 +114,7 @@ ...@@ -114,10 +114,7 @@
</ul> </ul>
</div> </div>
<ul class="site-page-links"> <ul class="site-page-links">
<li><a>Home</a></li> <li><a href="/">Home</a></li>
<li><a>Get Started</a></li>
<li class="active"><a>Documentation</a></li>
<li><a>About Us</a></li>
</ul> </ul>
</div> </div>
<div class="doc-module"> <div class="doc-module">
...@@ -137,7 +134,7 @@ ...@@ -137,7 +134,7 @@
{{ toctree }} {{ toctree }}
{% endblock %} {% endblock %}
</nav> </nav>
{% if toc %} {% if False %}
<nav class="local-toc">{{ toc }}</nav> <nav class="local-toc">{{ toc }}</nav>
{% endif %} {% endif %}
<section class="doc-content-wrap"> <section class="doc-content-wrap">
...@@ -168,7 +165,8 @@ ...@@ -168,7 +165,8 @@
VERSION:'{{ release|e }}', VERSION:'{{ release|e }}',
COLLAPSE_INDEX:false, COLLAPSE_INDEX:false,
FILE_SUFFIX:'{{ '' if no_search_suffix else file_suffix }}', FILE_SUFFIX:'{{ '' if no_search_suffix else file_suffix }}',
HAS_SOURCE: {{ has_source|lower }} HAS_SOURCE: {{ has_source|lower }},
SOURCELINK_SUFFIX: ".txt",
}; };
</script> </script>
{%- for scriptfile in script_files %} {%- for scriptfile in script_files %}
......
...@@ -21,16 +21,13 @@ set(CUDA_CXX_WITH_GPU_SOURCES ...@@ -21,16 +21,13 @@ set(CUDA_CXX_WITH_GPU_SOURCES
if(WITH_GPU) if(WITH_GPU)
set(CUDA_CXX_SOURCES set(CUDA_CXX_SOURCES
src/hl_dso_loader.cc
src/hl_warpctc_wrap.cc src/hl_warpctc_wrap.cc
${CUDA_CXX_WITH_GPU_SOURCES}) ${CUDA_CXX_WITH_GPU_SOURCES})
set_source_files_properties(${CUDA_CXX_SOURCES} set_source_files_properties(${CUDA_CXX_SOURCES}
PROPERTIES COMPILE_FLAGS "-D__NVCC__") PROPERTIES COMPILE_FLAGS "-D__NVCC__")
else() else()
set(CUDA_CXX_SOURCES set(CUDA_CXX_SOURCES src/hl_warpctc_wrap.cc)
src/hl_dso_loader.cc
src/hl_warpctc_wrap.cc)
endif() endif()
set(CUDA_CU_SOURCES set(CUDA_CU_SOURCES
...@@ -47,7 +44,6 @@ set(CUDA_CU_SOURCES ...@@ -47,7 +44,6 @@ set(CUDA_CU_SOURCES
set(CUDA_HEADERS set(CUDA_HEADERS
include/hl_time.h include/hl_time.h
include/hl_dso_loader.h
include/hl_warpctc_wrap.h include/hl_warpctc_wrap.h
include/hl_sequence.h include/hl_sequence.h
include/hl_cuda_cublas.h include/hl_cuda_cublas.h
......
...@@ -40,18 +40,18 @@ public: ...@@ -40,18 +40,18 @@ public:
namespace gpu { namespace gpu {
static __device__ Active<real>::forward forward[] = HPPL_ACTIVE_FUNCTION; static __device__ Active<real>::forward forward[] = HPPL_ACTIVE_FUNCTION;
static __device__ Active<real>::backward backward[] = HPPL_ACTIVE_FUNCTION; static __device__ Active<real>::backward backward[] = HPPL_ACTIVE_FUNCTION;
} } // namespace gpu
#else #else
namespace cpu { namespace cpu {
static Active<real>::forward forward[] = HPPL_ACTIVE_FUNCTION; static Active<real>::forward forward[] = HPPL_ACTIVE_FUNCTION;
static Active<real>::backward backward[] = HPPL_ACTIVE_FUNCTION; static Active<real>::backward backward[] = HPPL_ACTIVE_FUNCTION;
} } // namespace cpu
#ifdef __AVX__ #ifdef __AVX__
namespace avx { namespace avx {
static Active<__m256>::forward forward[] = HPPL_ACTIVE_FUNCTION; static Active<__m256>::forward forward[] = HPPL_ACTIVE_FUNCTION;
static Active<__m256>::backward backward[] = HPPL_ACTIVE_FUNCTION; static Active<__m256>::backward backward[] = HPPL_ACTIVE_FUNCTION;
} } // namespace avx
#endif #endif
#endif #endif
......
...@@ -273,23 +273,23 @@ extern void hl_bilinear_forward(const real* inData, ...@@ -273,23 +273,23 @@ extern void hl_bilinear_forward(const real* inData,
const real ratioW); const real ratioW);
/** /**
* @brief Bilinear interpolation backward. * @brief Bilinear interpolation backward.
* *
* @param[out] inGrad input gradient. * @param[out] inGrad input gradient.
* @param[in] inImgH input image height. * @param[in] inImgH input image height.
* @param[in] inImgW input image width. * @param[in] inImgW input image width.
* @param[in] inputH input batchSize. * @param[in] inputH input batchSize.
* @param[in] inputW input image data dim. * @param[in] inputW input image data dim.
* @param[in] outGrad output gradient. * @param[in] outGrad output gradient.
* @param[in] outImgH output image height. * @param[in] outImgH output image height.
* @param[in] outImgW output image width. * @param[in] outImgW output image width.
* @param[in] outputH output batchSize. * @param[in] outputH output batchSize.
* @param[in] outputW output image data dim. * @param[in] outputW output image data dim.
* @param[in] numChannels number of channels. * @param[in] numChannels number of channels.
* @param[in] ratioH inImgH / outImgH. * @param[in] ratioH inImgH / outImgH.
* @param[in] ratioW inImgW / outImgW. * @param[in] ratioW inImgW / outImgW.
* *
*/ */
extern void hl_bilinear_backward(real* inGrad, extern void hl_bilinear_backward(real* inGrad,
const size_t inImgH, const size_t inImgH,
const size_t inImgW, const size_t inImgW,
......
...@@ -14,10 +14,9 @@ limitations under the License. */ ...@@ -14,10 +14,9 @@ limitations under the License. */
#include "hl_cuda_cublas.h" #include "hl_cuda_cublas.h"
#include <sys/time.h> #include <sys/time.h>
#include <mutex>
#include "hl_cuda.h" #include "hl_cuda.h"
#include "hl_dso_loader.h"
#include "hl_thread.ph" #include "hl_thread.ph"
#include "paddle/utils/DynamicLoader.h"
#include "paddle/utils/Logging.h" #include "paddle/utils/Logging.h"
namespace dynload { namespace dynload {
......
...@@ -15,10 +15,9 @@ limitations under the License. */ ...@@ -15,10 +15,9 @@ limitations under the License. */
#include "hl_cuda_cudnn.h" #include "hl_cuda_cudnn.h"
#include <cudnn.h> #include <cudnn.h>
#include <gflags/gflags.h> #include <gflags/gflags.h>
#include <mutex>
#include "hl_cuda_cudnn.ph" #include "hl_cuda_cudnn.ph"
#include "hl_dso_loader.h"
#include "hl_thread.ph" #include "hl_thread.ph"
#include "paddle/utils/DynamicLoader.h"
#include "paddle/utils/Logging.h" #include "paddle/utils/Logging.h"
DEFINE_int32(cudnn_conv_workspace_limit_in_mb, DEFINE_int32(cudnn_conv_workspace_limit_in_mb,
......
...@@ -21,11 +21,10 @@ limitations under the License. */ ...@@ -21,11 +21,10 @@ limitations under the License. */
#include <sys/syscall.h> #include <sys/syscall.h>
#include <sys/time.h> #include <sys/time.h>
#include <unistd.h> #include <unistd.h>
#include <mutex>
#include "hl_cuda.ph" #include "hl_cuda.ph"
#include "hl_thread.ph" #include "hl_thread.ph"
#include "hl_dso_loader.h"
#include "paddle/utils/Logging.h" #include "paddle/utils/Logging.h"
#include "paddle/utils/DynamicLoader.h"
// clang-format on // clang-format on
namespace dynload { namespace dynload {
......
...@@ -14,7 +14,7 @@ limitations under the License. */ ...@@ -14,7 +14,7 @@ limitations under the License. */
#include "hl_warpctc_wrap.h" #include "hl_warpctc_wrap.h"
#include <mutex> #include <mutex>
#include "hl_dso_loader.h" #include "paddle/utils/DynamicLoader.h"
#include "paddle/utils/Logging.h" #include "paddle/utils/Logging.h"
namespace dynload { namespace dynload {
......
...@@ -12,7 +12,7 @@ endif() ...@@ -12,7 +12,7 @@ endif()
add_library(paddle_function STATIC ${cpp_files} ${cu_objs}) add_library(paddle_function STATIC ${cpp_files} ${cu_objs})
add_dependencies(paddle_function ${external_project_dependencies}) add_dependencies(paddle_function ${external_project_dependencies})
add_dependencies(paddle_function gen_proto_cpp)
if(WITH_GPU) if(WITH_GPU)
if(WITH_TESTING) if(WITH_TESTING)
......
...@@ -21,7 +21,6 @@ limitations under the License. */ ...@@ -21,7 +21,6 @@ limitations under the License. */
#include "MultiGradientMachine.h" #include "MultiGradientMachine.h"
#include "MultiNetwork.h" #include "MultiNetwork.h"
#include "NeuralNetwork.h" #include "NeuralNetwork.h"
#include "NeuralNetwork.h"
#include "ParallelNeuralNetwork.h" #include "ParallelNeuralNetwork.h"
#include "hl_gpu.h" #include "hl_gpu.h"
......
...@@ -637,7 +637,7 @@ void RecurrentGradientMachine::removeBeamSearchStatisticsCallbacks() { ...@@ -637,7 +637,7 @@ void RecurrentGradientMachine::removeBeamSearchStatisticsCallbacks() {
/* create scattered id infomation for all realLayer of inFrameLines one time. /* create scattered id infomation for all realLayer of inFrameLines one time.
* If hasSubseq, will also create scattered sequenceStartPositions infomation * If hasSubseq, will also create scattered sequenceStartPositions infomation
* for all realLayer of inFrameLines one time. * for all realLayer of inFrameLines one time.
*/ */
void RecurrentGradientMachine::createInFrameInfo(int inlinkId, void RecurrentGradientMachine::createInFrameInfo(int inlinkId,
const Argument& input, const Argument& input,
......
...@@ -29,7 +29,7 @@ namespace paddle { ...@@ -29,7 +29,7 @@ namespace paddle {
* *
* The config file api is rotate_layer * The config file api is rotate_layer
* *
*/ */
class RotateLayer : public Layer { class RotateLayer : public Layer {
public: public:
......
...@@ -48,8 +48,7 @@ lstm = lstmemory_group( ...@@ -48,8 +48,7 @@ lstm = lstmemory_group(
size=hidden_dim, size=hidden_dim,
act=TanhActivation(), act=TanhActivation(),
gate_act=SigmoidActivation(), gate_act=SigmoidActivation(),
state_act=TanhActivation(), state_act=TanhActivation())
lstm_layer_attr=ExtraLayerAttribute(error_clipping_threshold=50))
lstm_last = last_seq(input=lstm) lstm_last = last_seq(input=lstm)
......
...@@ -51,8 +51,7 @@ def lstm_group(lstm_group_input): ...@@ -51,8 +51,7 @@ def lstm_group(lstm_group_input):
size=hidden_dim, size=hidden_dim,
act=TanhActivation(), act=TanhActivation(),
gate_act=SigmoidActivation(), gate_act=SigmoidActivation(),
state_act=TanhActivation(), state_act=TanhActivation())
lstm_layer_attr=ExtraLayerAttribute(error_clipping_threshold=50))
return lstm_output return lstm_output
......
...@@ -15,6 +15,54 @@ limitations under the License. */ ...@@ -15,6 +15,54 @@ limitations under the License. */
#include "MathFunctions.h" #include "MathFunctions.h"
#include "hl_matrix_apply.cuh" #include "hl_matrix_apply.cuh"
#include "hl_matrix_ops.cuh" #include "hl_matrix_ops.cuh"
#include "paddle/utils/DynamicLoader.h"
namespace dynload {
std::once_flag lapack_dso_flag;
void* lapack_dso_handle = nullptr;
/**
* The following macro definition can generate structs
* (for each function) to dynamic load lapack routine
* via operator overloading.
*
* note: default dynamic linked libs
*/
#define DYNAMIC_LOAD_LAPACK_WRAP(__name) \
struct DynLoad__##__name { \
template <typename... Args> \
auto operator()(Args... args) -> decltype(__name(args...)) { \
using lapack_func = decltype(__name(args...)) (*)(Args...); \
std::call_once(lapack_dso_flag, GetLapackDsoHandle, &lapack_dso_handle); \
void* p_##__name = dlsym(lapack_dso_handle, #__name); \
return reinterpret_cast<lapack_func>(p_##__name)(args...); \
} \
} __name; // struct DynLoad__##__name
// clang-format off
#ifdef PADDLE_USE_ATLAS
#define PADDLE_SGETRF clapack_sgetrf
#define PADDLE_DGETRF clapack_dgetrf
#define PADDLE_SGETRI clapack_sgetri
#define PADDLE_DGETRI clapack_dgetri
#else
#define PADDLE_SGETRF LAPACKE_sgetrf
#define PADDLE_DGETRF LAPACKE_dgetrf
#define PADDLE_SGETRI LAPACKE_sgetri
#define PADDLE_DGETRI LAPACKE_dgetri
#endif
#define LAPACK_ROUTINE_EACH(__macro) \
__macro(PADDLE_SGETRF) \
__macro(PADDLE_DGETRF) \
__macro(PADDLE_SGETRI) \
__macro(PADDLE_DGETRI)
// clang-format on
LAPACK_ROUTINE_EACH(DYNAMIC_LOAD_LAPACK_WRAP)
} // namespace dynload
namespace paddle { namespace paddle {
...@@ -85,16 +133,7 @@ int getrf<float>(const CBLAS_ORDER order, ...@@ -85,16 +133,7 @@ int getrf<float>(const CBLAS_ORDER order,
float* A, float* A,
const int lda, const int lda,
int* ipiv) { int* ipiv) {
#ifdef PADDLE_USE_LAPACK return dynload::PADDLE_SGETRF(order, M, N, A, lda, ipiv);
#ifdef PADDLE_USE_ATLAS
return clapack_sgetrf(order, M, N, A, lda, ipiv);
#else
return LAPACKE_sgetrf(order, M, N, A, lda, ipiv);
#endif
#else
LOG(FATAL) << "Not implemented";
#endif
return 0;
} }
template <> template <>
...@@ -104,16 +143,7 @@ int getrf<double>(const CBLAS_ORDER order, ...@@ -104,16 +143,7 @@ int getrf<double>(const CBLAS_ORDER order,
double* A, double* A,
const int lda, const int lda,
int* ipiv) { int* ipiv) {
#ifdef PADDLE_USE_LAPACK return dynload::PADDLE_DGETRF(order, M, N, A, lda, ipiv);
#ifdef PADDLE_USE_ATLAS
return clapack_dgetrf(order, M, N, A, lda, ipiv);
#else
return LAPACKE_dgetrf(order, M, N, A, lda, ipiv);
#endif
#else
LOG(FATAL) << "Not implemented";
#endif
return 0;
} }
template <> template <>
...@@ -122,16 +152,7 @@ int getri<float>(const CBLAS_ORDER order, ...@@ -122,16 +152,7 @@ int getri<float>(const CBLAS_ORDER order,
float* A, float* A,
const int lda, const int lda,
const int* ipiv) { const int* ipiv) {
#ifdef PADDLE_USE_LAPACK return dynload::PADDLE_SGETRI(order, N, A, lda, ipiv);
#ifdef PADDLE_USE_ATLAS
return clapack_sgetri(order, N, A, lda, ipiv);
#else
return LAPACKE_sgetri(order, N, A, lda, ipiv);
#endif
#else
LOG(FATAL) << "Not implemented";
#endif
return 0;
} }
template <> template <>
...@@ -140,15 +161,7 @@ int getri<double>(const CBLAS_ORDER order, ...@@ -140,15 +161,7 @@ int getri<double>(const CBLAS_ORDER order,
double* A, double* A,
const int lda, const int lda,
const int* ipiv) { const int* ipiv) {
#ifdef PADDLE_USE_LAPACK return dynload::PADDLE_DGETRI(order, N, A, lda, ipiv);
#ifdef PADDLE_USE_ATLAS
return clapack_dgetri(order, N, A, lda, ipiv);
#else
return LAPACKE_dgetri(order, N, A, lda, ipiv);
#endif
#else
LOG(FATAL) << "Not implemented";
#endif
return 0; return 0;
} }
......
...@@ -17,14 +17,11 @@ limitations under the License. */ ...@@ -17,14 +17,11 @@ limitations under the License. */
#ifdef PADDLE_USE_MKL #ifdef PADDLE_USE_MKL
#include <mkl.h> #include <mkl.h>
#ifdef PADDLE_USE_LAPACK
#include <mkl_lapacke.h> #include <mkl_lapacke.h>
#endif
#else #else
extern "C" { extern "C" {
#include <cblas.h> #include <cblas.h>
} }
#ifdef PADDLE_USE_LAPACK
#ifdef PADDLE_USE_ATLAS #ifdef PADDLE_USE_ATLAS
extern "C" { extern "C" {
#include <clapack.h> #include <clapack.h>
...@@ -33,7 +30,6 @@ extern "C" { ...@@ -33,7 +30,6 @@ extern "C" {
#include <lapacke.h> #include <lapacke.h>
#endif #endif
#endif #endif
#endif
#include <cmath> #include <cmath>
......
...@@ -37,7 +37,7 @@ limitations under the License. */ ...@@ -37,7 +37,7 @@ limitations under the License. */
* *
* AutoCompare test; * AutoCompare test;
* test.cmpWithoutArg<I...>(function, height, width) * test.cmpWithoutArg<I...>(function, height, width)
*/ */
#include <gtest/gtest.h> #include <gtest/gtest.h>
#include "TensorCheck.h" #include "TensorCheck.h"
......
...@@ -21,6 +21,7 @@ limitations under the License. */ ...@@ -21,6 +21,7 @@ limitations under the License. */
#include "paddle/math/Matrix.h" #include "paddle/math/Matrix.h"
#include "paddle/math/SparseMatrix.h" #include "paddle/math/SparseMatrix.h"
#include "paddle/testing/TestUtil.h" #include "paddle/testing/TestUtil.h"
#include "paddle/utils/DynamicLoader.h"
#include "paddle/utils/Stat.h" #include "paddle/utils/Stat.h"
#include "paddle/utils/Util.h" #include "paddle/utils/Util.h"
...@@ -235,10 +236,15 @@ TEST(Matrix, unary) { ...@@ -235,10 +236,15 @@ TEST(Matrix, unary) {
testMatrixTranspose(height, width); testMatrixTranspose(height, width);
testMatrixRotate(height, width); testMatrixRotate(height, width);
} }
// inverse // inverse matrix
#ifdef PADDLE_USE_LAPACK void** dso_handler = nullptr;
GetLapackDsoHandle(dso_handler);
if (nullptr == *dso_handler) {
LOG(WARNING) << "Failed to find liblapack.so, please specify its path "
"using LD_LIBRARY_PATH.";
} else {
testMatrixInverse(height); testMatrixInverse(height);
#endif }
} }
} }
......
...@@ -126,7 +126,7 @@ protected: ...@@ -126,7 +126,7 @@ protected:
/* /*
* AdaDelta Optimization. * AdaDelta Optimization.
* http://www.matthewzeiler.com/pubs/googleTR2012/googleTR2012.pdf * http://www.matthewzeiler.com/pubs/googleTR2012/googleTR2012.pdf
*/ */
class AdaDeltaParameterOptimizer : public ParameterOptimizer { class AdaDeltaParameterOptimizer : public ParameterOptimizer {
public: public:
explicit AdaDeltaParameterOptimizer(const OptimizationConfig& optConfig) explicit AdaDeltaParameterOptimizer(const OptimizationConfig& optConfig)
......
#!/bin/bash
set -e
echo "Post install paddle debian package."
echo "Install some python package used for paddle. You can run "
echo " pip install /usr/opt/paddle/share/wheels/*.whl to install them."
find /usr/ -name '*paddle*.whl' | xargs pip install
...@@ -5,13 +5,8 @@ set -e ...@@ -5,13 +5,8 @@ set -e
# Set BASE_IMAGE according to env variables # Set BASE_IMAGE according to env variables
if [ ${WITH_GPU} == "ON" ]; then if [ ${WITH_GPU} == "ON" ]; then
BASE_IMAGE="nvidia/cuda:8.0-cudnn5-runtime-ubuntu14.04" BASE_IMAGE="nvidia/cuda:8.0-cudnn5-runtime-ubuntu14.04"
# additional packages to install when building gpu images
GPU_DOCKER_PKG="python-pip python-dev"
else else
BASE_IMAGE="python:2.7.13-slim" BASE_IMAGE="ubuntu:14.04"
# FIXME: python base image uses different python version than WITH_GPU
# need to change PYTHONHOME to /usr/local when using python base image
CPU_DOCKER_PYTHON_HOME_ENV="ENV PYTHONHOME /usr/local"
fi fi
DOCKERFILE_GPU_ENV="" DOCKERFILE_GPU_ENV=""
...@@ -66,10 +61,7 @@ if [ ${WITH_DOC} == "ON" ]; then ...@@ -66,10 +61,7 @@ if [ ${WITH_DOC} == "ON" ]; then
rm -rf /paddle/build_doc rm -rf /paddle/build_doc
fi fi
# generate deb package for current build # generate deb package for current build
# FIXME(typhoonzero): should we remove paddle/scripts/deb ? cpack -D CPACK_GENERATOR='DEB' ..
# FIXME: CPACK_DEBIAN_PACKAGE_DEPENDS removes all dev dependencies, must
# install them in docker
cpack -D CPACK_GENERATOR='DEB' -D CPACK_DEBIAN_PACKAGE_DEPENDS="" ..
if [[ ${WOBOQ:-OFF} == 'ON' ]]; then if [[ ${WOBOQ:-OFF} == 'ON' ]]; then
apt-get install -y clang-3.8 llvm-3.8 libclang-3.8-dev apt-get install -y clang-3.8 llvm-3.8 libclang-3.8-dev
...@@ -97,32 +89,30 @@ fi ...@@ -97,32 +89,30 @@ fi
paddle version paddle version
if [[ -n ${APT_MIRROR} ]]; then
MIRROR_UPDATE="sed -i '${APT_MIRROR}' /etc/apt/sources.list && \\"
else
MIRROR_UPDATE="\\"
fi
cat > /paddle/build/Dockerfile <<EOF cat > /paddle/build/Dockerfile <<EOF
FROM ${BASE_IMAGE} FROM ${BASE_IMAGE}
MAINTAINER PaddlePaddle Authors <paddle-dev@baidu.com> MAINTAINER PaddlePaddle Authors <paddle-dev@baidu.com>
ENV HOME /root ENV HOME /root
ENV LANG en_US.UTF-8 ENV LANG en_US.UTF-8
# Use Fix locales to en_US.UTF-8 # Use Fix locales to en_US.UTF-8
RUN ${MIRROR_UPDATE} EOF
apt-get update && \
apt-get install -y libgfortran3 libpython2.7 ${GPU_DOCKER_PKG} && \ if [[ -n ${APT_MIRROR} ]]; then
apt-get clean -y && \ cat >> /paddle/build/Dockerfile <<EOF
pip install --upgrade pip && \ RUN sed -i '${APT_MIRROR}' /etc/apt/sources.list
pip install -U 'protobuf==3.1.0' requests numpy EOF
fi
cat >> /paddle/build/Dockerfile <<EOF
# Use different deb file when building different type of images # Use different deb file when building different type of images
ADD *.deb /usr/local/opt/paddle/deb/ ADD *.deb /
# run paddle version to install python packages first # run paddle version to install python packages first
RUN dpkg -i /usr/local/opt/paddle/deb/*.deb && \ RUN apt-get update &&\
rm -f /usr/local/opt/paddle/deb/*.deb && \ apt-get install -y python-pip && pip install -U pip && \
find /usr/ -name '*paddle-*.whl' | xargs pip install && \ dpkg -i /*.deb ; apt-get install -f -y && \
apt-get clean -y && \
rm -f /*.deb && \
paddle version paddle version
${CPU_DOCKER_PYTHON_HOME_ENV}
${DOCKERFILE_CUDNN_DSO} ${DOCKERFILE_CUDNN_DSO}
${DOCKERFILE_GPU_ENV} ${DOCKERFILE_GPU_ENV}
# default command shows the paddle version and exit # default command shows the paddle version and exit
......
...@@ -60,6 +60,7 @@ function deploy_docs() { ...@@ -60,6 +60,7 @@ function deploy_docs() {
deploy_docs "master" "." deploy_docs "master" "."
deploy_docs "develop" "./develop/" deploy_docs "develop" "./develop/"
deploy_docs "release/0.10.0" "./release/0.10.0/"
# Check is there anything changed. # Check is there anything changed.
set +e set +e
......
...@@ -23,7 +23,7 @@ setup(name="py_paddle", ...@@ -23,7 +23,7 @@ setup(name="py_paddle",
install_requires = [ install_requires = [
'nltk>=3.2.2', 'nltk>=3.2.2',
'numpy>=1.8.0', # The numpy is required. 'numpy>=1.8.0', # The numpy is required.
'protobuf>=${PROTOBUF_VERSION}' # The paddle protobuf version 'protobuf==${PROTOBUF_VERSION}' # The paddle protobuf version
], ],
url='http://www.paddlepaddle.org/', url='http://www.paddlepaddle.org/',
license='Apache 2.0', license='Apache 2.0',
......
...@@ -1059,14 +1059,14 @@ inline bool operator==(const value& x, const value& y) { ...@@ -1059,14 +1059,14 @@ inline bool operator==(const value& x, const value& y) {
} }
inline bool operator!=(const value& x, const value& y) { return !(x == y); } inline bool operator!=(const value& x, const value& y) { return !(x == y); }
} } // namespace picojson
namespace std { namespace std {
template <> template <>
inline void swap(picojson::value& x, picojson::value& y) { inline void swap(picojson::value& x, picojson::value& y) {
x.swap(y); x.swap(y);
} }
} } // namespace std
inline std::istream& operator>>(std::istream& is, picojson::value& x) { inline std::istream& operator>>(std::istream& is, picojson::value& x) {
picojson::set_last_error(std::string()); picojson::set_last_error(std::string());
......
...@@ -12,9 +12,9 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ...@@ -12,9 +12,9 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and See the License for the specific language governing permissions and
limitations under the License. */ limitations under the License. */
#include "hl_dso_loader.h" #include "DynamicLoader.h"
#include <gflags/gflags.h> #include <gflags/gflags.h>
#include "paddle/utils/Logging.h" #include "Logging.h"
DEFINE_string(cudnn_dir, DEFINE_string(cudnn_dir,
"", "",
...@@ -30,6 +30,8 @@ DEFINE_string(cuda_dir, ...@@ -30,6 +30,8 @@ DEFINE_string(cuda_dir,
DEFINE_string(warpctc_dir, "", "Specify path for loading libwarpctc.so."); DEFINE_string(warpctc_dir, "", "Specify path for loading libwarpctc.so.");
DEFINE_string(lapack_dir, "", "Specify path for loading liblapack.so.");
static inline std::string join(const std::string& part1, static inline std::string join(const std::string& part1,
const std::string& part2) { const std::string& part2) {
// directory separator // directory separator
...@@ -160,3 +162,11 @@ void GetWarpCTCDsoHandle(void** dso_handle) { ...@@ -160,3 +162,11 @@ void GetWarpCTCDsoHandle(void** dso_handle) {
GetDsoHandleFromSearchPath(FLAGS_warpctc_dir, "libwarpctc.so", dso_handle); GetDsoHandleFromSearchPath(FLAGS_warpctc_dir, "libwarpctc.so", dso_handle);
#endif #endif
} }
void GetLapackDsoHandle(void** dso_handle) {
#if defined(__APPLE__) || defined(__OSX__)
GetDsoHandleFromSearchPath(FLAGS_lapack_dir, "liblapack.dylib", dso_handle);
#else
GetDsoHandleFromSearchPath(FLAGS_lapack_dir, "liblapack.so", dso_handle);
#endif
}
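The lookup added in `GetLapackDsoHandle` (pick a platform-specific file name, try an optional directory, and treat a missing library as a non-fatal condition) can be sketched in Python with `ctypes`. This is only an illustration of the idea and not PaddlePaddle code: `lapack_file_name` and `try_load` are hypothetical helper names, and `search_dir` stands in for the `--lapack_dir` flag.

```python
import ctypes
import os
import sys


def lapack_file_name(platform=sys.platform):
    """Return the LAPACK shared-library file name for the given platform."""
    # Mirrors the #if in GetLapackDsoHandle: .dylib on macOS, .so elsewhere.
    return "liblapack.dylib" if platform == "darwin" else "liblapack.so"


def try_load(lib_name, search_dir=""):
    """Try to load a shared library; return None instead of raising on failure."""
    # With an empty search_dir the dynamic linker's default search path
    # (for example LD_LIBRARY_PATH on Linux) is used.
    path = os.path.join(search_dir, lib_name) if search_dir else lib_name
    try:
        return ctypes.CDLL(path)
    except OSError:
        return None


if __name__ == "__main__":
    handle = try_load(lapack_file_name())
    if handle is None:
        print("LAPACK not found; matrix-inverse tests would be skipped.")
    else:
        print("LAPACK loaded.")
```

Returning `None` rather than raising lets a caller skip LAPACK-dependent work at run time, which is the behaviour the updated `TEST(Matrix, unary)` aims for.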
...@@ -12,13 +12,13 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. ...@@ -12,13 +12,13 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and See the License for the specific language governing permissions and
limitations under the License. */ limitations under the License. */
#ifndef HL_DSO_LOADER_H_ #ifndef DYNAMIC_LOAD_H_
#define HL_DSO_LOADER_H_ #define DYNAMIC_LOAD_H_
#include <dlfcn.h> #include <dlfcn.h>
#include <memory> #include <memory>
#include <mutex>
#include <string> #include <string>
#include "hl_base.h"
/** /**
* @brief load the DSO of CUBLAS * @brief load the DSO of CUBLAS
...@@ -52,4 +52,12 @@ void GetCurandDsoHandle(void** dso_handle); ...@@ -52,4 +52,12 @@ void GetCurandDsoHandle(void** dso_handle);
*/ */
void GetWarpCTCDsoHandle(void** dso_handle); void GetWarpCTCDsoHandle(void** dso_handle);
#endif // HL_DSO_LOADER_H_ /**
* @brief load the DSO of lapack
*
* @param **dso_handle dso handler
*
*/
void GetLapackDsoHandle(void** dso_handle);
#endif // DYNAMIC_LOAD_H_
...@@ -208,12 +208,15 @@ class ExtraLayerAttribute(object): ...@@ -208,12 +208,15 @@ class ExtraLayerAttribute(object):
drop_rate=None, drop_rate=None,
device=None): device=None):
self.attr = dict() self.attr = dict()
if isinstance(error_clipping_threshold, float): if error_clipping_threshold is not None:
assert error_clipping_threshold > 0 error_clipping_threshold = float(error_clipping_threshold)
self.attr["error_clipping_threshold"] = error_clipping_threshold if error_clipping_threshold < 0:
raise ValueError("Error clipping threshold must be >= 0")
if isinstance(drop_rate, float): self.attr['error_clipping_threshold'] = error_clipping_threshold
assert drop_rate > 0 if drop_rate is not None:
drop_rate = float(drop_rate)
if drop_rate < 0:
raise ValueError("Dropout rate must be >= 0")
self.attr["drop_rate"] = drop_rate self.attr["drop_rate"] = drop_rate
if isinstance(device, int): if isinstance(device, int):
......
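The `ExtraLayerAttribute` change above replaces `isinstance(..., float)` checks with an explicit `None` test followed by `float(...)` coercion, so an integer such as `40` is no longer silently ignored. A minimal standalone sketch of that validation pattern follows; `normalize_attr` is a hypothetical helper name and is not part of the PaddlePaddle API.

```python
def normalize_attr(value, what):
    """Coerce an optional numeric attribute to float and reject negatives."""
    if value is None:
        # Attribute not set: the caller leaves it out of the attr dict.
        return None
    value = float(value)  # accepts ints and numeric strings as well as floats
    if value < 0:
        raise ValueError("%s must be >= 0" % what)
    return value


if __name__ == "__main__":
    attr = {}
    threshold = normalize_attr(40, "error_clipping_threshold")
    if threshold is not None:
        attr["error_clipping_threshold"] = threshold
    print(attr)
```

With the old `isinstance` check, passing the integer `40` would have left `error_clipping_threshold` unset; with coercion it is stored as `40.0`.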
...@@ -84,6 +84,7 @@ __all__ = [ ...@@ -84,6 +84,7 @@ __all__ = [
'GeneratedInput', 'GeneratedInput',
'SubsequenceInput', 'SubsequenceInput',
'gru_step_layer', 'gru_step_layer',
'gru_step_naive_layer',
'recurrent_layer', 'recurrent_layer',
'BaseGeneratedInput', 'BaseGeneratedInput',
'conv_operator', 'conv_operator',
...@@ -3086,6 +3087,78 @@ def gru_step_layer(input, ...@@ -3086,6 +3087,78 @@ def gru_step_layer(input,
activation=act) activation=act)
@wrap_bias_attr_default()
@wrap_param_attr_default()
@wrap_act_default(param_names=['gate_act'], act=SigmoidActivation())
@wrap_act_default(act=TanhActivation())
@wrap_name_default('gru_step_naive')
@layer_support(ERROR_CLIPPING, DROPOUT)
def gru_step_naive_layer(input,
output_mem,
size=None,
name=None,
act=None,
gate_act=None,
bias_attr=None,
param_attr=None,
layer_attr=None):
"""
GRU step layer, implemented with MixedLayer. It supports ERROR_CLIPPING
and DROPOUT.
:param input:
:param output_mem:
:param size:
:param name:
:param act:
:param gate_act:
:param bias_attr:
:param param_attr:
:param layer_attr:
:return:
"""
if input.size % 3 != 0:
raise ValueError("GruStep input size must be divisible by 3")
if size is None:
size = input.size / 3
def __gate__(gate_name, offset):
with mixed_layer(
name=name + "_" + gate_name,
size=size,
layer_attr=layer_attr,
bias_attr=bias_attr,
act=gate_act) as gate:
gate += identity_projection(input=input, offset=offset)
gate += full_matrix_projection(
input=output_mem, param_attr=param_attr)
return gate
update_gate = __gate__("update", 0)
reset_gate = __gate__("reset", size)
with mixed_layer(
name=name + "_reset_output", bias_attr=False) as reset_output:
reset_output += dotmul_operator(a=output_mem, b=reset_gate)
with mixed_layer(
name=name + "_output_candidate",
size=size,
layer_attr=layer_attr,
bias_attr=bias_attr,
act=act) as output_candidate:
output_candidate += identity_projection(input=input, offset=2 * size)
output_candidate += full_matrix_projection(
input=reset_output, param_attr=param_attr)
with mixed_layer(name=name) as output:
output += identity_projection(output_mem)
output += dotmul_operator(a=output_mem, b=update_gate, scale=-1.0)
output += dotmul_operator(a=output_candidate, b=update_gate)
return output
@wrap_name_default() @wrap_name_default()
@layer_support() @layer_support()
def get_output_layer(input, arg_name, name=None, layer_attr=None): def get_output_layer(input, arg_name, name=None, layer_attr=None):
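The arithmetic that `gru_step_naive_layer` assembles from mixed layers can be written out for a single time step. The sketch below is plain Python, not PaddlePaddle code, and makes several assumptions: biases are omitted, the activations are fixed to the defaults (sigmoid for gates, tanh for the candidate), the three recurrent weight matrices are passed explicitly as row-major lists, and the name `gru_step_naive` is hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(w, v):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def gru_step_naive(x, h, w_update, w_reset, w_cand):
    """One GRU step; x has length 3 * len(h), laid out as [update | reset | candidate]."""
    size = len(h)
    if len(x) != 3 * size:
        raise ValueError("input size must be 3 times the hidden size")
    x_u, x_r, x_c = x[:size], x[size:2 * size], x[2 * size:]
    # Gates: input slice (identity projection) plus projected previous state.
    u = [sigmoid(a + b) for a, b in zip(x_u, matvec(w_update, h))]
    r = [sigmoid(a + b) for a, b in zip(x_r, matvec(w_reset, h))]
    # Reset output: element-wise product of previous state and reset gate.
    reset_h = [hi * ri for hi, ri in zip(h, r)]
    # Candidate: input slice plus projected reset output.
    c = [math.tanh(a + b) for a, b in zip(x_c, matvec(w_cand, reset_h))]
    # Output: h - h * u + c * u, the same three terms summed in the final mixed layer.
    return [hi - hi * ui + ci * ui for hi, ui, ci in zip(h, u, c)]


if __name__ == "__main__":
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    print(gru_step_naive([0.0] * 6, [1.0, -1.0], zeros, zeros, zeros))
```

The last line of the function is the usual GRU interpolation `(1 - u) * h + u * c`, expressed as three terms so that each maps onto one projection or operator in the layer definition.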
......
...@@ -825,7 +825,8 @@ def gru_unit(input, ...@@ -825,7 +825,8 @@ def gru_unit(input,
gru_param_attr=None, gru_param_attr=None,
act=None, act=None,
gate_act=None, gate_act=None,
gru_layer_attr=None): gru_layer_attr=None,
naive=False):
""" """
Define calculations that a gated recurrent unit performs in a single time Define calculations that a gated recurrent unit performs in a single time
step. This function itself is not a recurrent layer, so that it can not be step. This function itself is not a recurrent layer, so that it can not be
...@@ -857,7 +858,12 @@ def gru_unit(input, ...@@ -857,7 +858,12 @@ def gru_unit(input,
out_mem = memory(name=name, size=size) out_mem = memory(name=name, size=size)
gru_out = gru_step_layer( if naive:
__step__ = gru_step_naive_layer
else:
__step__ = gru_step_layer
gru_out = __step__(
name=name, name=name,
input=input, input=input,
output_mem=out_mem, output_mem=out_mem,
...@@ -879,7 +885,8 @@ def gru_group(input, ...@@ -879,7 +885,8 @@ def gru_group(input,
gru_param_attr=None, gru_param_attr=None,
act=None, act=None,
gate_act=None, gate_act=None,
gru_layer_attr=None): gru_layer_attr=None,
naive=False):
""" """
gru_group is a recurrent layer group version of Gated Recurrent Unit. It gru_group is a recurrent layer group version of Gated Recurrent Unit. It
does exactly the same calculation as the grumemory layer does. A promising does exactly the same calculation as the grumemory layer does. A promising
...@@ -928,7 +935,8 @@ def gru_group(input, ...@@ -928,7 +935,8 @@ def gru_group(input,
gru_param_attr=gru_param_attr, gru_param_attr=gru_param_attr,
act=act, act=act,
gate_act=gate_act, gate_act=gate_act,
gru_layer_attr=gru_layer_attr) gru_layer_attr=gru_layer_attr,
naive=naive)
return recurrent_group( return recurrent_group(
name='%s_recurrent_group' % name, name='%s_recurrent_group' % name,
...@@ -949,7 +957,8 @@ def simple_gru(input, ...@@ -949,7 +957,8 @@ def simple_gru(input,
gru_param_attr=None, gru_param_attr=None,
act=None, act=None,
gate_act=None, gate_act=None,
gru_layer_attr=None): gru_layer_attr=None,
naive=False):
""" """
You maybe see gru_step_layer, grumemory in layers.py, gru_unit, gru_group, You maybe see gru_step_layer, grumemory in layers.py, gru_unit, gru_group,
simple_gru in network.py. The reason why there are so many interfaces is simple_gru in network.py. The reason why there are so many interfaces is
...@@ -1018,7 +1027,8 @@ def simple_gru(input, ...@@ -1018,7 +1027,8 @@ def simple_gru(input,
gru_param_attr=gru_param_attr, gru_param_attr=gru_param_attr,
act=act, act=act,
gate_act=gate_act, gate_act=gate_act,
gru_layer_attr=gru_layer_attr) gru_layer_attr=gru_layer_attr,
naive=naive)
@wrap_name_default('simple_gru2') @wrap_name_default('simple_gru2')
......
...@@ -320,6 +320,7 @@ layers { ...@@ -320,6 +320,7 @@ layers {
} }
} }
drop_rate: 0.5 drop_rate: 0.5
error_clipping_threshold: 40.0
} }
parameters { parameters {
name: "___embedding_0__.w0" name: "___embedding_0__.w0"
......
...@@ -356,6 +356,9 @@ def mixed(size=0, ...@@ -356,6 +356,9 @@ def mixed(size=0,
return MixedLayerV2(size, input, name, act, bias_attr, layer_attr) return MixedLayerV2(size, input, name, act, bias_attr, layer_attr)
mixed.__doc__ = conf_helps.mixed_layer.__doc__
class RecurrentLayerInput(Layer): class RecurrentLayerInput(Layer):
def __init__(self, recurrent_name, index, parent_layers): def __init__(self, recurrent_name, index, parent_layers):
parents_len = len(parent_layers) parents_len = len(parent_layers)
...@@ -404,6 +407,8 @@ data.__name__ = 'data' ...@@ -404,6 +407,8 @@ data.__name__ = 'data'
AggregateLevel = conf_helps.layers.AggregateLevel AggregateLevel = conf_helps.layers.AggregateLevel
ExpandLevel = conf_helps.layers.ExpandLevel ExpandLevel = conf_helps.layers.ExpandLevel
memory = MemoryV2 memory = MemoryV2
memory.__name__ = 'memory'
memory.__doc__ = conf_helps.memory.__doc__
def __layer_name_mapping__(inname): def __layer_name_mapping__(inname):
...@@ -512,6 +517,9 @@ def recurrent_group(step, input, name=None): ...@@ -512,6 +517,9 @@ def recurrent_group(step, input, name=None):
return retv return retv
recurrent_group.__doc__ = conf_helps.recurrent_group.__doc__
@wrap_name_default() @wrap_name_default()
def beam_search(step, def beam_search(step,
input, input,
...@@ -579,6 +587,8 @@ def beam_search(step, ...@@ -579,6 +587,8 @@ def beam_search(step,
return tmp return tmp
beam_search.__doc__ = conf_helps.beam_search.__doc__
__projection_names__ = filter(lambda x: x.endswith('_projection'), __projection_names__ = filter(lambda x: x.endswith('_projection'),
dir(conf_helps)) dir(conf_helps))
......
...@@ -15,6 +15,9 @@ setup(name='paddle', ...@@ -15,6 +15,9 @@ setup(name='paddle',
description='Parallel Distributed Deep Learning', description='Parallel Distributed Deep Learning',
install_requires=[ install_requires=[
"requests", "requests",
"numpy",
"protobuf==${PROTOBUF_VERSION}",
"matplotlib",
], ],
packages=packages, packages=packages,
package_dir={ package_dir={
......