Commit 9162629b authored by guosheng

Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into add-GRUOp-dev

...@@ -28,3 +28,4 @@ cmake_install.cmake
paddle/.timestamp
python/paddlepaddle.egg-info/
paddle/pybind/pybind.h
python/paddle/v2/framework/tests/tmp/*
...@@ -127,6 +127,7 @@ include(external/warpctc) # download, build, install warpctc
include(external/any) # download libn::any
include(external/eigen) # download eigen3
include(external/pybind11) # download pybind11
include(external/nccl)
include(cudnn) # set cudnn libraries, must before configure
include(configure) # add paddle env configuration
...@@ -159,7 +160,7 @@ set(EXTERNAL_LIBS
if(WITH_GPU)
list(APPEND EXTERNAL_LIBS ${CUDA_LIBRARIES} ${CUDA_rt_LIBRARY})
if(NOT WITH_DSO)
list(APPEND EXTERNAL_LIBS ${CUDNN_LIBRARY} ${CUDA_CUBLAS_LIBRARIES} ${CUDA_curand_LIBRARY} ${NCCL_LIBRARY})
endif(NOT WITH_DSO)
endif(WITH_GPU)
......
./doc/howto/dev/contribute_to_paddle_en.md
# Contribute Code
We sincerely appreciate your contribution. This document explains our workflow and work style.
## Workflow
PaddlePaddle uses this [Git branching model](http://nvie.com/posts/a-successful-git-branching-model/). The following steps guide a typical contribution.
1. Fork
Our development community has been growing fast; it doesn't make sense for everyone to write into the official repo. So, please file Pull Requests from your fork. To make a fork, just head over to the GitHub page and click the ["Fork" button](https://help.github.com/articles/fork-a-repo/).
1. Clone
To make a copy of your fork on your local computer, please run
```bash
git clone https://github.com/your-github-account/paddle
cd paddle
```
1. Create the local feature branch
For daily work such as adding a new feature or fixing a bug, please open a feature branch before coding:
```bash
git checkout -b my-cool-stuff
```
1. Commit
Before issuing your first `git commit` command, please install [`pre-commit`](http://pre-commit.com/) by running the following commands:
```bash
pip install pre-commit
pre-commit install
```
Our pre-commit configuration requires clang-format 3.8 for auto-formatting C/C++ code and yapf for Python code.
Once installed, `pre-commit` checks the style of code and documentation in every commit. You will see something like the following when you run `git commit`:
```
➜ git commit
CRLF end-lines remover...............................(no files to check)Skipped
yapf.................................................(no files to check)Skipped
Check for added large files..............................................Passed
Check for merge conflicts................................................Passed
Check for broken symlinks................................................Passed
Detect Private Key...................................(no files to check)Skipped
Fix End of Files.....................................(no files to check)Skipped
clang-formater.......................................(no files to check)Skipped
[my-cool-stuff c703c041] add test file
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 233
```
1. Build and test
Users can build PaddlePaddle natively on Linux and Mac OS X. But to unify the building environment and to make it easy for debugging, the recommended way is [using Docker](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/howto/dev/build_en.md).
1. Keep pulling
An experienced Git user pulls from the official repo often -- daily or even hourly -- so they notice conflicts with others' work early, when smaller conflicts are easier to resolve.
```bash
git remote add upstream https://github.com/PaddlePaddle/Paddle
git pull upstream develop
```
1. Push and file a pull request
You can "push" your local work into your forked repo:
```bash
git push origin my-cool-stuff
```
The push allows you to create a pull request, requesting owners of this [official repo](https://github.com/PaddlePaddle/Paddle) to pull your change into the official one.
To create a pull request, please follow [these steps](https://help.github.com/articles/creating-a-pull-request/).
If your change is for fixing an issue, please write ["Fixes <issue-URL>"](https://help.github.com/articles/closing-issues-using-keywords/) in the description section of your pull request. GitHub will close the issue when the owners merge your pull request.
Please remember to specify some reviewers for your pull request. If you don't know who the right reviewers are, please follow GitHub's recommendations.
1. Delete local and remote branches
To keep your local workspace and your fork clean, you might want to remove merged branches:
```bash
git push origin :my-cool-stuff
git checkout develop
git pull upstream develop
git branch -d my-cool-stuff
```
### Code Review
- Please feel free to ping your reviewers by sending them the URL of your pull request via IM or email. Please do this after your pull request passes the CI.
- Please reply to every comment from reviewers. If you will follow a suggestion, please write "Done"; otherwise, please give a reason.
- If you don't want your reviewers to get overwhelmed by email notifications, you can reply to their comments [in a batch](https://help.github.com/articles/reviewing-proposed-changes-in-a-pull-request/).
- Avoid unnecessary commits. Some developers commit very often; it is recommended to squash a sequence of small changes into one commit by running `git commit --amend` instead of `git commit`.
## Coding Standard
### Code Style
Our C/C++ code follows the [Google style guide](http://google.github.io/styleguide/cppguide.html).
Our Python code follows the [PEP8 style guide](https://www.python.org/dev/peps/pep-0008/).
Our build process helps to check the code style. In [`build.sh`](https://github.com/PaddlePaddle/Paddle/blob/b84e8226514b8bb4405c3c28e54aa5077193d179/paddle/scripts/docker/build.sh#L42), the entry point of our [builder Docker image](https://github.com/PaddlePaddle/Paddle/blob/b84e8226514b8bb4405c3c28e54aa5077193d179/Dockerfile#L88), the CMake argument `WITH_STYLE_CHECK` is set to `ON` by default, so the style check runs as part of every build.
Please install pre-commit, which automatically reformats the changes to C/C++ and Python code whenever we run `git commit`. To check the whole codebase, we can run the command `pre-commit run -a`, as in the [`check_style.sh` file](https://github.com/PaddlePaddle/Paddle/blob/b84e8226514b8bb4405c3c28e54aa5077193d179/paddle/scripts/travis/check_style.sh#L30), which is invoked by [our Travis CI configuration](https://github.com/PaddlePaddle/Paddle/blob/b84e8226514b8bb4405c3c28e54aa5077193d179/.travis.yml#L43).
### Unit Tests
Please remember to add related unit tests.
- For C/C++ code, please follow [`google-test` Primer](https://github.com/google/googletest/blob/master/googletest/docs/Primer.md).
- For Python code, please use [Python's standard `unittest` package](http://pythontesting.net/framework/unittest/unittest-introduction/).
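For instance, a minimal test case with `unittest` looks like this (a toy example for illustration, not an actual Paddle test):
```python
import unittest

class TestAddition(unittest.TestCase):
    def test_add(self):
        # A toy assertion standing in for a real operator check.
        self.assertEqual(1 + 2, 3)

if __name__ == "__main__":
    unittest.main()
```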
### Writing Logs
We use [glog](https://github.com/google/glog) for logging in our C/C++ code.
For general information, please use `LOG`. For debug information, please use [`VLOG`](http://htmlpreview.github.io/?https://github.com/google/glog/blob/master/doc/glog.html#verbose). The reasoning is explained [here](https://groups.google.com/a/chromium.org/d/msg/chromium-dev/3NDNd1KzXeY/AZKMMx37fdQJ).
`VLOG` requires a *verbose level* parameter. For example:
```c++
VLOG(3) << "Operator FC is taking " << num_inputs << " inputs.";
```
When we run a PaddlePaddle application or test, we can specify a verbose threshold. For example:
```bash
GLOG_vmodule=buddy_allocator=2 \
GLOG_v=10 \
python \
../python/paddle/v2/framework/tests/test_recurrent_op.py
```
This will enable VLOG messages generated by `buddy_allocator.{h,cc}` within the verbose range of 0 to 3, so you will see the example VLOG message above, which is at level 3. This suggests that we should output more general messages at lower verbose levels, so that they are displayed more often. When coding C++, please follow the verbose level convention as follows:
- verbose level 1: [framework](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/framework)
- verbose level 3: [operators](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/operators)
- verbose level 5: [memory](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/memory), [platform](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/platform)
- verbose level 7: [math](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/math)
...@@ -62,11 +62,11 @@ else()
FIND_PACKAGE(CUDA REQUIRED)
if(${CUDA_VERSION_MAJOR} VERSION_LESS 7)
message(FATAL_ERROR "Paddle needs CUDA >= 7.0 to compile")
endif()
if(NOT CUDNN_FOUND)
message(FATAL_ERROR "Paddle needs cudnn to compile")
endif()
set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} "-Xcompiler ${SIMD_FLAG}")
......
...@@ -8,7 +8,7 @@ ExternalProject_Add(
extern_eigen3
${EXTERNAL_PROJECT_LOG_ARGS}
GIT_REPOSITORY "https://github.com/RLovelett/eigen.git"
GIT_TAG 70661066beef694cadf6c304d0d07e0758825c10
PREFIX ${EIGEN_SOURCE_DIR}
UPDATE_COMMAND ""
CONFIGURE_COMMAND ""
......
include(ExternalProject)
set(NCCL_SOURCE_DIR ${THIRD_PARTY_PATH}/nccl)
include_directories(${NCCL_SOURCE_DIR}/src/extern_nccl/src)
if(WITH_DSO)
# If we use DSO, we do not build nccl, just download the dependencies
set(NCCL_BUILD_COMMAND "")
set(NCCL_INSTALL_COMMAND "")
set(NCCL_INSTALL_DIR "")
else()
# otherwise, we build nccl and link it.
set(NCCL_INSTALL_DIR ${THIRD_PARTY_PATH}/install/nccl)
# Note: CUDA 8.0 is needed to build nccl.
# If CUDA is not installed in the system directory, set CUDA_HOME to your CUDA root.
set(NCCL_BUILD_COMMAND "make -j 8")
set(NCCL_INSTALL_COMMAND "make install PREFIX=${NCCL_INSTALL_DIR}")
endif()
ExternalProject_Add(
extern_nccl
${EXTERNAL_PROJECT_LOG_ARGS}
GIT_REPOSITORY "https://github.com/NVIDIA/nccl.git"
GIT_TAG "v1.3.4-1"
PREFIX "${NCCL_SOURCE_DIR}"
UPDATE_COMMAND ""
CONFIGURE_COMMAND ""
BUILD_COMMAND "${NCCL_BUILD_COMMAND}"
INSTALL_COMMAND "${NCCL_INSTALL_COMMAND}"
INSTALL_DIR "${NCCL_INSTALL_DIR}"
TEST_COMMAND ""
)
if(WITH_DSO)
if(${CMAKE_VERSION} VERSION_LESS "3.3.0")
set(dummyfile ${CMAKE_CURRENT_BINARY_DIR}/lib_nccl_dummy.c)
file(WRITE ${dummyfile} "const char * dummy_nccl = \"${dummyfile}\";")
add_library(nccl STATIC ${dummyfile})
else()
add_library(nccl INTERFACE)
endif()
else()
add_library(nccl STATIC IMPORTED GLOBAL)
set_property(TARGET nccl PROPERTY IMPORTED_LOCATION
${NCCL_INSTALL_DIR}/lib/libnccl_static.a)
endif()
add_dependencies(nccl extern_nccl)
## Survey on Graph
Neural network frameworks often provide a symbolic API for users to write network topologies conveniently. This doc mainly focuses on the symbolic APIs of the most popular neural network frameworks, and tries to find out how to parse a symbolic configuration into a portable file, such as protobuf or JSON.
### Mxnet
The core concept of the symbolic API is `Symbol`. Mxnet implements the `Symbol` class in C++ and exports it to Python via the C API. Please refer to the comments in Mxnet:
`Symbol` is a help class used to represent the operator node in a Graph.
`Symbol` acts as an interface for building graphs from different components like Variable, Functor and Group. `Symbol` is also exported to python front-end (while Graph is not) to enable quick test and deployment. Conceptually, symbol is the final operation of a graph and thus including all the information required (the graph) to evaluate its output value.
A simple network topology written with Symbol is as follows:
```python
def get_symbol(num_classes=10, **kwargs):
data = mx.symbol.Variable('data')
data = mx.symbol.Flatten(data=data)
fc1 = mx.symbol.FullyConnected(data = data, name='fc1', num_hidden=128)
act1 = mx.symbol.Activation(data = fc1, name='relu1', act_type="relu")
fc2 = mx.symbol.FullyConnected(data = act1, name = 'fc2', num_hidden = 64)
act2 = mx.symbol.Activation(data = fc2, name='relu2', act_type="relu")
fc3 = mx.symbol.FullyConnected(data = act2, name='fc3', num_hidden=num_classes)
mlp = mx.symbol.SoftmaxOutput(data = fc3, name = 'softmax')
return mlp
```
Variable here is actually a Symbol. Every basic Symbol corresponds to one Node, and every Node has its own `NodeAttr`. There is an `op` field in the `NodeAttr` class; when a Symbol represents a Variable (often input data), the `op` field is null.
Symbol contains a data member `std::vector<NodeEntry> outputs`, and `NodeEntry` contains a pointer to `Node`. We can follow the Node pointers to recover the whole Graph.
A Symbol can also be saved to a JSON file.
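For example, the `mlp` symbol built by the `get_symbol` function above can be serialized to JSON and loaded back (a small sketch; `net.json` is an arbitrary file name):
```python
import mxnet as mx

mlp = get_symbol(num_classes=10)    # the network defined above
print(mlp.tojson())                 # serialize the whole graph to a JSON string
mlp.save('net.json')                # or write the JSON straight to a file
same_mlp = mx.sym.load('net.json')  # rebuild the Symbol from the JSON file
```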
Here is a detailed example:
```
>>> import mxnet as mx
>>> data = mx.symbol.Variable('data')
>>> print data.debug_str()
Variable:data
>>> data = mx.symbol.Flatten(data=data)
>>> print data.debug_str()
Symbol Outputs:
output[0]=flatten0(0)
Variable:data
--------------------
Op:Flatten, Name=flatten0
Inputs:
arg[0]=data(0) version=0
>>> fc1 = mx.symbol.FullyConnected(data = data, name='fc1', num_hidden=128)
>>> print fc1.debug_str()
Symbol Outputs:
output[0]=fc1(0)
Variable:data
--------------------
Op:Flatten, Name=flatten0
Inputs:
arg[0]=data(0) version=0
Variable:fc1_weight
Variable:fc1_bias
--------------------
Op:FullyConnected, Name=fc1
Inputs:
arg[0]=flatten0(0)
arg[1]=fc1_weight(0) version=0
arg[2]=fc1_bias(0) version=0
Attrs:
num_hidden=128
```
### TensorFlow
The core concept of the symbolic API is `Tensor`. TensorFlow defines `Tensor` in Python. Please refer to the comments in TensorFlow:
A `Tensor` is a symbolic handle to one of the outputs of an `Operation`. It does not hold the values of that operation's output, but instead provides a means of computing those values in a TensorFlow [Session](https://www.tensorflow.org/api_docs/python/tf/Session).
A simple example is as follows:
```python
# Build a dataflow graph.
c = tf.constant([[1.0, 2.0], [3.0, 4.0]])
d = tf.constant([[1.0, 1.0], [0.0, 1.0]])
e = tf.matmul(c, d)
# Construct a `Session` to execute the graph.
sess = tf.Session()
# Execute the graph and store the value that `e` represents in `result`.
result = sess.run(e)
```
The main methods of `Tensor` are as follows:
```python
@property
def op(self):
  """The `Operation` that produces this tensor as an output."""
  return self._op

@property
def dtype(self):
  """The `DType` of elements in this tensor."""
  return self._dtype

@property
def graph(self):
  """The `Graph` that contains this tensor."""
  return self._op.graph

@property
def name(self):
  """The string name of this tensor."""
  if not self._op.name:
    raise ValueError("Operation was not named: %s" % self._op)
  return "%s:%d" % (self._op.name, self._value_index)

@property
def device(self):
  """The name of the device on which this tensor will be produced, or None."""
  return self._op.device
```
A Tensor can be used as a run target for a session. A Tensor holds all the information about the Graph and tracks data dependencies.
Here is a detailed example:
```
>>> import tensorflow as tf
>>> c = tf.constant([[1.0, 2.0], [3.0, 4.0]])
>>> print c.graph
<tensorflow.python.framework.ops.Graph object at 0x10f256d50>
>>> d = tf.constant([[1.0, 1.0], [0.0, 1.0]])
>>> print d.graph
<tensorflow.python.framework.ops.Graph object at 0x10f256d50>
>>> e = tf.matmul(c, d)
>>> print e.graph
<tensorflow.python.framework.ops.Graph object at 0x10f256d50>
```
### Dynet
The core concept of the symbolic API is `Expression`; Dynet defines the `Expression` class in C++.
A simple example is as follows:
```cpp
ComputationGraph cg;
Expression W = parameter(cg, pW);
Expression in = input(cg, xs[i]);
Expression label = input(cg, ys[i]);
Expression pred = W * in;
Expression loss = square(pred - label);
```
The input data and parameters are also represented by Expressions. Every basic Expression corresponds to a Node, and input data is also a Node.
Expression has a data member ComputationGraph, which is modified as the user configures the network. An Expression can be a run target, because an Expression contains all of its dependencies.
Here is a detailed example:
Write the topology in C++:
```
ComputationGraph cg;
Expression W = parameter(cg, pW);
cg.print_graphviz();
Expression pred = W * xs[i];
cg.print_graphviz();
Expression loss = square(pred - ys[i]);
cg.print_graphviz();
```
Compile and print:
```
# first print
digraph G {
rankdir=LR;
nodesep=.05;
N0 [label="v0 = parameters({1}) @ 0x7ffe4de00110"];
}
# second print
digraph G {
rankdir=LR;
nodesep=.05;
N0 [label="v0 = parameters({1}) @ 0x7ffe4de00110"];
N1 [label="v1 = v0 * -0.98"];
N0 -> N1;
}
# third print
digraph G {
rankdir=LR;
nodesep=.05;
N0 [label="v0 = parameters({1}) @ 0x7ffe4de00110"];
N1 [label="v1 = v0 * -0.98"];
N0 -> N1;
N2 [label="v2 = -1.88387 - v1"];
N1 -> N2;
N3 [label="v3 = -v2"];
N2 -> N3;
N4 [label="v4 = square(v3)"];
N3 -> N4;
}
```
### Conclusion
Symbol/Tensor/Expression in Mxnet/TensorFlow/Dynet are concepts at the same level. We use the unified name Expression here; this concept has the following features (sketched in code below):
- Users write the topology with a symbolic API, and every return value is an Expression, including input data and parameters.
- Each Expression corresponds to a global Graph, and Expressions can be composed.
- An Expression tracks all of its dependencies and can be used as a run target.
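To make these features concrete, here is a minimal, hypothetical Python sketch of such an Expression; it is not the API of any framework above, just an illustration of the three properties:
```python
# A toy Expression: every construction appends a node to a global graph,
# and any Expression can be evaluated as a run target by following its
# recorded dependencies.
GRAPH = []

class Expression(object):
    def __init__(self, op, inputs, value=None):
        self.op, self.inputs, self.value = op, inputs, value
        GRAPH.append(self)  # every Expression joins a global Graph

    def __add__(self, other):
        return Expression("add", [self, other])  # Expressions compose

    def __mul__(self, other):
        return Expression("mul", [self, other])

    def run(self):
        # An Expression tracks all dependencies, so it can be a run target.
        if self.op == "data":
            return self.value
        vals = [x.run() for x in self.inputs]
        return vals[0] + vals[1] if self.op == "add" else vals[0] * vals[1]

# Input data and parameters are Expressions, too.
x = Expression("data", [], value=3.0)
w = Expression("data", [], value=2.0)
y = w * x + Expression("data", [], value=1.0)
print(y.run())  # 7.0
```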
# Design Doc: Model Format
## Motivation
A model is an output of the training process. One complete model consists of two parts, the **topology** and the **parameters**. In order to support industrial deployment, the model format must be self-contained and must not expose any training source code.
As a result, in PaddlePaddle, the **topology** is represented as a [ProgramDesc](https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/doc/design/program.md), which describes the model structure. The **parameters** contain all the trainable weights in the model. We must support large parameters and efficient serialization/deserialization of parameters.
## Implementation
The topology is saved as plain text in a detailed, self-contained protobuf file.
The parameters are saved as a binary file. A protobuf message has a [64M size limit](https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream#CodedInputStream.SetTotalBytesLimit.details). We have done a [benchmark experiment](https://github.com/PaddlePaddle/Paddle/pull/4610), which shows that protobuf is not fit for this task.
As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md), and has a description information proto of [LoDTensorDesc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L99). We save the DescProto as the byte string header. It contains all the necessary information, such as the `dims` and the `LoD` information in [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/paddle/framework/lod_tensor.md). A tensor stores its values in a contiguous memory buffer; for speed, we dump the raw memory to disk and save it as the byte string content. So the binary format of one tensor is as follows.
The table below shows a tensor's byte view in detail. Note that all the signed values are written in the little-endian format.
| field name | type | description |
| --- | --- | --- |
| version | uint32_t | Version of saved file. Always 0 now. |
| tensor desc length | uint32_t | TensorDesc(Protobuf message) length in bytes. |
| tensor desc | void* | TensorDesc protobuf binary message |
| tensor data | void* | Tensor's data in binary format. The length of `tensor_data` is decided by `TensorDesc.dims()` and `TensorDesc.data_type()` |
| lod_level | uint64_t | Level of LoD |
| length of lod[0] | uint64_t | [Optional] length of lod[0] in bytes. |
| data of lod[0] | uint64_t* | [Optional] lod[0].data() |
| ... | ... | ... |
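As an illustration, a writer for this layout might look like the following Python sketch. `save_tensor` is a hypothetical helper, and `desc_bytes` is a placeholder standing in for a real serialized `TensorDesc` message:
```python
import struct

import numpy as np

def save_tensor(f, tensor, desc_bytes, lod=None):
    # Byte layout from the table above: version, desc length, desc proto,
    # raw tensor data, then the (optional) LoD levels. All little-endian.
    f.write(struct.pack("<I", 0))                # version, always 0 now
    f.write(struct.pack("<I", len(desc_bytes)))  # tensor desc length
    f.write(desc_bytes)                          # TensorDesc protobuf bytes
    f.write(tensor.tobytes())                    # raw memory dump of values
    lod = lod or []
    f.write(struct.pack("<Q", len(lod)))         # lod_level
    for level in lod:
        data = np.asarray(level, dtype=np.uint64).tobytes()
        f.write(struct.pack("<Q", len(data)))    # length of lod[i] in bytes
        f.write(data)                            # lod[i].data()

with open("param.bin", "wb") as f:
    save_tensor(f, np.arange(6, dtype=np.float32).reshape(2, 3),
                desc_bytes=b"\x08\x05")  # placeholder, not a real TensorDesc
```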
## Summary
- We introduce a model format.
- The model represented by its forward-pass computation procedure is saved in a **ProgramDesc** protobuf message.
- A set of binary tensors in the specified format describes the **parameters**.
...@@ -65,20 +65,6 @@ class Optimizer(object):
def __init__(self):
pass
def create_backward_pass(self, loss, parameter_list=None):
"""
create and add gradient Operators in BlockDesc to Compute gradients of `loss`
for parameters in parameter_list
Args:
loss: an variable generated by cost function.
parameter_list: parameters that need to compute gradient and update to optimize the lost.
Returns:
list of (parameters, gradients) pair.
"""
return None
def create_optimization_pass(self, parameters_and_grads):
"""Add optimization operators to update gradients to variables.
...@@ -93,7 +79,7 @@ class Optimizer(object):
def minimize(self, loss, parameter_list):
"""Add operations to minimize `loss` by updating `parameter_list`.
This method combines interface `append_backward_ops()` and
`create_optimization_pass()` into one.
"""
params_grads = self.create_backward_pass(loss, parameter_list) params_grads = self.create_backward_pass(loss, parameter_list)
......
# Regularization in PaddlePaddle
## Introduction to Regularization
A central problem in machine learning is how to design an algorithm that will perform well not just on the training data, but also on new data. A frequently faced problem is **overfitting**, where the model does not make reliable predictions on new, unseen data. **Regularization** is the process of introducing additional information in order to prevent overfitting. This is usually done by adding extra penalties to the loss function that restrict the parameter space that an optimization algorithm can explore.
### Parameter Norm Penalties
Most common regularization approaches in deep learning are based on limiting the capacity of the models by adding a parameter norm penalty to the objective function `J`. This is given as follows:
...@@ -18,52 +18,21 @@ The most commonly used norm penalties are the L2 norm penalty and the L1 norm pe
##### L1 Regularization
<img src="./images/l1_regularization.png" align="center"/><br/>
A much more detailed mathematical background of regularization can be found [here](http://www.deeplearningbook.org/contents/regularization.html).
## Regularization Survey
A detailed survey of regularization in various deep learning frameworks can be found [here](https://github.com/PaddlePaddle/Paddle/wiki/Regularization-Survey).
On surveying existing frameworks like Tensorflow, PyTorch, Caffe, etc., it can be seen that there are two common approaches to regularization:
1. Making regularization a part of the optimizer using an attribute like `weight_decay` that is used to control the scale of the L2 Penalty. This approach is used in PyTorch as follows:
```python
opt = torch.optim.SGD(params, lr=0.2, weight_decay=0.2)
```
At every optimization step, this code will add the gradient of the L2 norm of the params to the gradient of the params with respect to the loss function. This can be seen in the following code snippet:
```python
if weight_decay != 0:
d_p.add_(weight_decay, p.data)
```
This is a very restrictive way of doing regularization and does not give the users enough flexibility.
**Advantages**:
- It is easy to implement for us.
- Faster execution of backward. However, it can be done manually by advanced users too.
**Disadvantages**:
- Not flexible for other regularizations such as L1/L0 regularization.
- Does not allow for different regularization coefficients for different parameters. For example, in most models, only the weight matrices are regularized, while the bias vectors are not.
- Tightly coupled optimizer and regularization implementation.
2. Adding regularization ops to the graph through Python API. This approach is used by Tensorflow and Caffe. Using this approach, we manually add regularization ops to the graph and then add the regularization loss to the final loss function before sending them to the optimizer.
**Advantages**:
- Allows for greater flexibility to the users of Paddle. Using this approach, the users can put different regularization to different parameters and also choose parameters that are not a part of regularization.
- Makes it easy for the users to customize and extend the framework.
**Disadvantages**:
- Implementation requires comprehensive design and time.
## Proposal for Regularization in PaddlePaddle
### Low-Level implementation
In the new design, we propose to create new operations for regularization. For now, we can add 2 ops that correspond to the most frequently used regularizations:
- L2_regularization_op
- L1_regularization_op
These ops can be like any other ops with their own CPU/GPU implementations either using Eigen or separate CPU and GPU kernels. As the initial implementation, we can implement their kernels using Eigen following the abstraction pattern implemented for [Activation Ops](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/accuracy_op.h). This abstraction pattern can make it very easy to implement new regularization schemes other than L1 and L2 norm penalties.
The idea of building ops for regularization is in sync with the refactored Paddle philosophy of using operators to represent any computation unit. The way these ops will be added to the computation graph will be decided by the [layer functions](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/python_api.md#layer-function) in Python API.
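As a rough illustration (hypothetical names, not the actual Paddle API), a layer function could lazily append a regularization op for one of its parameters like this:
```python
# A toy graph: ops are recorded as dicts so the sketch stays self-contained.
graph_ops = []

def append_op(op_type, inputs, attrs=None):
    graph_ops.append({"type": op_type, "inputs": inputs, "attrs": attrs or {}})

def fc_layer(input_name, param_name, regularizer=None, coeff=0.0):
    append_op("mul", [input_name, param_name])
    if regularizer == "l2":
        # Only the parameters the user asks for are regularized; the penalty
        # output would later be summed into the final loss.
        append_op("l2_regularization_op", [param_name], {"coeff": coeff})
    return param_name + "_out"

fc_layer("data", "fc1_w", regularizer="l2", coeff=1e-4)
print(graph_ops)
```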
...@@ -94,7 +63,7 @@ Since we want to create the regularization ops in a lazy manner, the regularizat
#### High-level API
In PaddlePaddle Python API, users will primarily rely on [layer functions](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/python_api.md#layer-function) to create neural network layers. Hence, we also need to provide regularization functionality in layer functions. The design of these APIs can be postponed for later right now. A good reference for these APIs can be found in [Keras](https://keras.io/regularizers/) and also by looking at Tensorflow in [`tf.contrib.layers`](https://www.tensorflow.org/api_guides/python/contrib.layers).
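For example, Keras lets users attach a per-parameter regularizer through the layer function, which is close to the style proposed here:
```python
from keras import regularizers
from keras.layers import Dense

# Only this layer's weight matrix is L2-regularized; the bias is left alone.
layer = Dense(64, kernel_regularizer=regularizers.l2(0.01))
```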
......
# Build the PaddlePaddle Library for Raspberry Pi
There are typically two ways to build a Raspberry Pi version:
1. Build on the Raspberry Pi itself, for example after logging in via ssh. The required development tools and third-party libraries can be found in [`/Dockerfile`](https://github.com/PaddlePaddle/Paddle/blob/develop/Dockerfile).
1. The other way is to cross-compile. This document describes how to cross-compile a PaddlePaddle build suitable for Raspberry Pi on Linux/x64.
## Install the Cross-Compiler
Clone the following GitHub repo:
```bash
git clone https://github.com/raspberrypi/tools.git
```
The cross-compiler arm-linux-gnueabihf-gcc 4.8.3 can then be found in the `./tools/tree/master/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian-x64` directory. Running this toolchain requires a Linux x64 machine and glibc 2.14 or later.
## Configure the Cross-Compilation Arguments
CMake [supports cross-compiling](https://cmake.org/cmake/help/v3.0/manual/cmake-toolchains.7.html#cross-compiling). The configuration for PaddlePaddle for Raspberry Pi lives in [cmake/cross_compiling/raspberry_pi.cmake](https://github.com/PaddlePaddle/Paddle/blob/develop/cmake/cross_compiling/raspberry_pi.cmake).
Some arguments must be set when cross-compiling the Raspberry Pi version of the PaddlePaddle library:
- `CMAKE_SYSTEM_NAME`: the CMake target platform; it must be set to `RPi`. Only after `CMAKE_SYSTEM_NAME=RPi` is set does PaddlePaddle's CMake system know that it is cross-compiling for Raspberry Pi, and it then automatically builds the host-side protoc executable, the target-side protobuf library, and the target-side OpenBLAS library.
- `RPI_TOOLCHAIN`: the absolute path of the cross-compiling toolchain, or its path relative to the build directory. PaddlePaddle's CMake system derives the required cross-compilers from this value; otherwise, users have to set them manually when running cmake. There is no default value.
- `RPI_ARM_NEON`: whether to use NEON instructions. Currently it must be set to `ON`, and the default value is `ON`.
- `HOST_C/CXX_COMPILER`: the C/C++ compiler for the host. It is needed to build the host-side protoc executable and the target-side OpenBLAS library. It defaults to the value of the environment variable `CC`; if `CC` is not set, it falls back to the `cc` compiler.
A commonly used CMake configuration is as follows:
```
cmake -DCMAKE_SYSTEM_NAME=RPi \
...@@ -47,7 +44,9 @@ cmake -DCMAKE_SYSTEM_NAME=RPi \
..
```
Here `WITH_C_API=ON` means that the inference library should be built.
Users can also set other build arguments as needed. For example, to minimize the size of the generated library, set `CMAKE_BUILD_TYPE` to `MinSizeRel`; for the fastest execution speed, set `CMAKE_BUILD_TYPE` to `Release`.
## Build and Install
...@@ -60,6 +59,4 @@ make install
Note: if you have previously built PaddlePaddle for another platform in this source directory, please first remove the `third_party` and `build` directories with `rm -rf`, to make sure all third-party dependencies and the PaddlePaddle code are rebuilt against the new CMake configuration.
After the install command finishes, the `your/path/to/install` directory will contain `include` and `lib` directories, where `include` holds the C-API header files and `lib` holds a Raspberry Pi build of the library.
# Build PaddlePaddle for Raspberry Pi
You may use either of the following two approaches to build the inference library of PaddlePaddle for Raspberry Pi:
1. Build using SSH: Log in to a Raspberry Pi using SSH and build the library. The required development tools and third-party dependencies are listed here: [`/Dockerfile`](https://github.com/PaddlePaddle/Paddle/blob/develop/Dockerfile).
1. Cross-compile: This article describes in more detail how to cross-compile PaddlePaddle for Raspberry Pi on a Linux/x64 machine.
## The Cross-Compiling Toolchain
Step 1. Clone the GitHub repo by running the following command.
```bash
git clone https://github.com/raspberrypi/tools.git
```
Step 2. Use the pre-built cross-compiler found in `./tools/tree/master/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian-x64`. To run it on a Linux computer, glibc version >= 2.14 is needed.
## CMake Arguments
CMake supports [cross-compiling](https://cmake.org/cmake/help/v3.0/manual/cmake-toolchains.7.html#cross-compiling). All CMake configuration arguments required for the cross-compilation for Raspberry Pi can be found in [`cmake/cross_compiling/raspberry_pi.cmake`](https://github.com/PaddlePaddle/Paddle/blob/develop/cmake/cross_compiling/raspberry_pi.cmake).
Some important arguments that need to be set:
- `CMAKE_SYSTEM_NAME`: The target platform. Must be `RPi`.
- `RPI_TOOLCHAIN`: The absolute path of the cross-compiling toolchain.
- `RPI_ARM_NEON`: Use ARM NEON Intrinsics. This is a required argument and defaults to `ON`.
- `HOST_C/CXX_COMPILER`: The C/C++ compiler for the host. It is used to build the build tools that run on the host, for example, protoc.
A commonly-used CMake configuration is as follows:
```
cmake -DCMAKE_SYSTEM_NAME=RPi \
-DRPI_TOOLCHAIN=your/path/to/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian-x64 \
-DRPI_ARM_NEON=ON \
-DCMAKE_INSTALL_PREFIX=your/path/to/install \
-DWITH_GPU=OFF \
-DWITH_C_API=ON \
-DWITH_PYTHON=OFF \
-DWITH_SWIG_PY=OFF \
..
```
To build the inference library, please set the argument `WITH_C_API` to `ON`: `WITH_C_API=ON`.
You can add more arguments. For example, to minimize the size of the generated inference library, you may use `CMAKE_BUILD_TYPE=MinSizeRel`. For performance optimization, you may use `CMAKE_BUILD_TYPE=Release`.
## Build and Install
The following commands build the inference library of PaddlePaddle for Raspberry Pi and third-party dependencies.
```bash
make
make install
```
The intermediate files will be stored in `build`. Third-party libraries will be located in `build/third_party`. If you have already built it for other platforms like Android or iOS, you may want to clear these directories by running the command: `rm -rf build`.
The inference library will be in `your/path/to/install/lib`, with related header files in `your/path/to/install/include`.
# Contribute Code
We sincerely appreciate your contributions. You can use the fork-and-pull-request
workflow to merge your code.
## Code Requirements
- Your code must be fully documented with
[Doxygen](http://www.stack.nl/~dimitri/doxygen/)-style comments.
- Make sure the build option `WITH_STYLE_CHECK` is on and that your code
passes the style check.
- All code must have unit tests.
- Pass all unit tests.
The following tutorial guides you through submitting your contribution.
## [Creating a Fork](https://help.github.com/articles/fork-a-repo/)
Just head over to the GitHub page and click the "Fork" button.
It's just that simple.
## Clone
Clone the remote repository.
```bash
➜ git clone https://github.com/USERNAME/Paddle
cd Paddle
```
## Create a local branch
Paddle is currently using the [Git-flow branching model](http://nvie.com/posts/a-successful-git-branching-model/).
All feature and bug-fix development work should be done on a new branch; generally, create the new branch from the `develop` branch.
```bash
➜ git checkout -b my-cool-stuff
```
Before checking out, make sure the current working directory is clean (this can be inspected with `git status`); otherwise, untracked files will be carried over to the new branch.
## Using `pre-commit` hook
Paddle developers use the [pre-commit](http://pre-commit.com/) tool to manage Git
pre-commit hooks. It helps us format source code (C++, Python) and check some
basic things before committing (e.g., only one EOL at the end of each file, not adding huge files
to Git). The `pre-commit` checks are now part of the unit tests in Travis CI; a
PR that does not pass the hooks cannot be merged into Paddle.
To use [pre-commit](http://pre-commit.com/), you should install it by
`pip install pre-commit`. Currently, Paddle uses `clang-format` to format
C/C++ sources; please make sure clang-format 3.8+ is installed.
Install and run it as follows:
```bash
➜ pip install pre-commit
➜ pre-commit install
```
When you commit your code, the pre-commit hook will check whether the local code contains
anything unsuitable to commit.
## Start to develop
In this tutorial, we delete a line in README.md and create a new file.
We can use `git status` to inspect the changes of the current directory, and `git diff` to see the differences.
```bash
➜ git status
On branch test
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: README.md
Untracked files:
(use "git add <file>..." to include in what will be committed)
test
no changes added to commit (use "git add" and/or "git commit -a")
```
## Build and Test
We package PaddlePaddle's build environment into a Docker image, called the develop image, named `paddle:dev`; it contains all the build tools that PaddlePaddle needs.
If you want to build the develop image, just run:
```bash
➜ docker build -t paddle:dev .
```
Then we can use the develop image to build PaddlePaddle source. For example:
```bash
➜ docker run -v $(pwd):/paddle -e "WITH_GPU=OFF" -e "WITH_AVX=ON" -e "WITH_TEST=ON" paddle:dev
```
The above command will compile PaddlePaddle and create a Dockerfile for building the production image. All the generated files are in the build directory. "WITH_GPU" controls whether the generated production image supports GPU. "WITH_AVX" controls whether the generated production image supports AVX. "WITH_TEST" controls whether the unit tests will be generated.
Then we can generate the production image by copying the compiled PaddlePaddle program into the image by running:
```bash
➜ docker build -t paddle:prod -f build/Dockerfile .
```
Finally, run the unit tests:
```bash
➜ docker run -it -v $(pwd):/paddle paddle:dev bash -c "cd /paddle/build && ctest"
```
For more details, you can read [this doc](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/getstarted/build_and_install/docker_install_en.rst).
## Commit
Next, we discard the changes to the README.md file and then commit our changes with the following commands:
```bash
➜ git checkout -- README.md
➜ git status
On branch test
Untracked files:
(use "git add <file>..." to include in what will be committed)
test
nothing added to commit but untracked files present (use "git add" to track)
➜ git add test
```
We should write a description for each commit via `git commit`, so that others know
what has changed in these files.
```bash
➜ git commit
CRLF end-lines remover...............................(no files to check)Skipped
yapf.................................................(no files to check)Skipped
Check for added large files..............................................Passed
Check for merge conflicts................................................Passed
Check for broken symlinks................................................Passed
Detect Private Key...................................(no files to check)Skipped
Fix End of Files.....................................(no files to check)Skipped
clang-formater.......................................(no files to check)Skipped
[my-cool-stuff c703c041] add test file
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 233
```
## Keeping Fork Up to Date
Before filing your pull request, you should sync your code with the latest PaddlePaddle.
To do this, you'll need to add a remote first:
```bash
➜ git remote add upstream https://github.com/PaddlePaddle/Paddle
➜ git remote
origin
upstream
```
Update your fork with the latest upstream changes:
```bash
➜ git fetch upstream
➜ git pull upstream develop
```
Now, your local branch is up-to-date with everything modified upstream.
## Push to GitHub
```bash
# push to your repository in Github
➜ git push origin my-cool-stuff
```
## Create an issue and a Pull Request
Create an Issue to describe the problem and record its number.
Go to the page for your fork on GitHub, select your development branch,
and click `New pull request`.
<img width="295" alt="screen shot 2017-04-26 at 9 09 28 pm" src="https://cloud.githubusercontent.com/assets/11692045/25436054/a6d98c66-2ac4-11e7-9cb1-18dd13150230.png">
Then select the target branch:
<img width="750" alt="screen shot 2017-04-26 at 9 11 52 pm" src="https://cloud.githubusercontent.com/assets/11692045/25436139/f83b1e6c-2ac4-11e7-8c0e-add499023c46.png">
We can add `resolve #Issue number` in the PR description to close the issue automatically after the PR is merged. More details are in <https://help.github.com/articles/closing-issues-via-commit-messages/>.
Then wait for review; if modifications are needed, follow the steps above to update the corresponding branch in origin.
## Delete origin branch
After the PR is merged into the main repository, we can delete the remote branch on the PR page.
<img width="775" alt="screen shot 2017-04-26 at 9 18 24 pm" src="https://cloud.githubusercontent.com/assets/11692045/25436457/e4cdd472-2ac5-11e7-9272-badc76c4a23e.png">
Or just run:
```bash
➜ git push origin :my-cool-stuff
```
## Delete local branch
Finally, we delete the local branch:
```bash
➜ git checkout develop
# delete my-cool-stuff branch
➜ git branch -D my-cool-stuff
```
../../../CONTRIBUTING.md
\ No newline at end of file
...@@ -21,7 +21,6 @@
dev/build_cn.rst
dev/write_docs_cn.rst
dev/contribute_to_paddle_cn.md
Model Configuration
-------------------
......
vendor/
.glide/
proto/*.go
...@@ -25,9 +25,8 @@ import (
"strings"
"time"
log "github.com/inconshreveable/log15"
"github.com/namsral/flag"
"github.com/PaddlePaddle/Paddle/go/master"
"github.com/PaddlePaddle/Paddle/go/utils/networkhelper"
...@@ -41,16 +40,20 @@ func main() {
taskTimeoutMax := flag.Int("task-timeout-max", 3, "max timtout count for each task before it being declared failed task.")
chunkPerTask := flag.Int("chunk-per-task", 10, "chunk per task.")
logLevel := flag.String("log-level", "info",
"log level, possible values: debug, info, warn, error, crit")
flag.Parse()
lvl, err := log.LvlFromString(*logLevel)
if err != nil {
panic(err)
}
log.Root().SetHandler(
log.LvlFilterHandler(lvl, log.CallerStackHandler("%+v", log.StderrHandler)),
)
if *endpoints == "" {
log.Warn("-endpoints not set, fault tolerance not be enabled.")
}
var store master.Store
...@@ -58,23 +61,25 @@ func main() {
eps := strings.Split(*endpoints, ",")
ip, err := networkhelper.GetExternalIP()
if err != nil {
log.Crit("get external ip error", log.Ctx{"error": err})
panic(err)
}
addr := fmt.Sprintf("%s:%d", ip, *port)
store, err = master.NewEtcdClient(eps, addr, master.DefaultLockPath, master.DefaultAddrPath, master.DefaultStatePath, *ttlSec)
if err != nil {
log.Crit("error creating etcd client.", log.Ctx{"error": err})
panic(err)
}
} else {
store = &master.InMemStore{}
}
shutdown := func() {
log.Info("shutting down gracefully")
err := store.Shutdown()
if err != nil {
log.Error("shutdown error", log.Ctx{"error": err})
}
}
...@@ -86,24 +91,28 @@ func main() {
s, err := master.NewService(store, *chunkPerTask, *taskTimeoutDur, *taskTimeoutMax)
if err != nil {
log.Crit("error creating new service.", log.Ctx{"error": err})
panic(err)
}
err = rpc.Register(s)
if err != nil {
log.Crit("error registering to etcd.", log.Ctx{"error": err})
panic(err)
}
rpc.HandleHTTP()
l, err := net.Listen("tcp", ":"+strconv.Itoa(*port))
if err != nil {
log.Crit("error listing to port", log.Ctx{"error": err, "port": *port})
panic(err)
}
go func() {
err = http.Serve(l, nil)
if err != nil {
log.Crit("error serving HTTP", log.Ctx{"error": err})
panic(err)
}
}()
......
...@@ -27,11 +27,11 @@ import (
"github.com/topicai/candy"
"github.com/PaddlePaddle/Paddle/go/pserver"
log "github.com/inconshreveable/log15"
)
func main() {
port := flag.Int("port", 8001, "port of the pserver")
index := flag.Int("index", -1, "index of the pserver, set to -1 if use etcd for auto pserver index registry")
etcdEndpoint := flag.String("etcd-endpoint", "http://127.0.0.1:2379",
"comma separated endpoint string for pserver to connect to etcd")
...@@ -41,13 +41,17 @@ func main() {
checkpointPath := flag.String("checkpoint-path", "/checkpoints/", "save checkpoint path")
checkpointInterval := flag.Duration("checkpoint-interval", 600*time.Second, "save checkpoint per interval seconds")
logLevel := flag.String("log-level", "info",
"log level, possible values: debug, info, warn, error, crit")
flag.Parse()
lvl, err := log.LvlFromString(*logLevel)
if err != nil {
panic(err)
}
log.Root().SetHandler(
log.LvlFilterHandler(lvl, log.CallerStackHandler("%+v", log.StderrHandler)),
)
var idx int
...@@ -63,7 +67,7 @@ func main() {
cp, err = pserver.LoadCheckpoint(e, idx)
if err != nil {
if err == pserver.ErrCheckpointNotFound {
log.Info("load checkpoint error", "error", err)
} else {
panic(err)
}
...@@ -71,10 +75,10 @@ func main() {
}
shutdown := func() {
log.Info("shutting down gracefully")
sErr := e.Shutdown()
if sErr != nil {
log.Error("error shutting down", log.Ctx{"error": sErr})
}
}
...@@ -95,7 +99,7 @@ func main() {
candy.Must(err)
go func() {
log.Info("serving pserver", log.Ctx{"port": *port})
err = http.Serve(l, nil)
candy.Must(err)
}()
......
hash: 107c058cf5c9163a75d40eef2273a793c36112683c25d72aa8288827fdde3a19
updated: 2017-10-30T03:46:19.137696069Z
imports:
- name: github.com/alecthomas/gometalinter
version: bae2f1293d092fd8167939d5108d1b025eaef9de
...@@ -99,6 +99,8 @@ imports:
version: d2709f9f1f31ebcda9651b03077758c1f3a0018c
- name: github.com/ghodss/yaml
version: 0ca9ea5df5451ffdf184b4428c902747c2c11cd7
- name: github.com/go-stack/stack
version: 817915b46b97fd7bb80e8ab6b69f01a53ac3eebf
- name: github.com/gogo/protobuf
version: 909568be09de550ed094403c2bf8a261b5bb730a
subpackages:
...@@ -120,8 +122,14 @@ imports:
- runtime
- runtime/internal
- utilities
- name: github.com/inconshreveable/log15
version: 0decfc6c20d9ca0ad143b0e89dcaa20f810b4fb3
- name: github.com/jonboulle/clockwork
version: 2eee05ed794112d45db504eb05aa693efd2b8b09
- name: github.com/mattn/go-colorable
version: 5411d3eea5978e6cdc258b30de592b60df6aba96
- name: github.com/mattn/go-isatty
version: 57fdcb988a5c543893cc61bce354a6e24ab70022
- name: github.com/matttproud/golang_protobuf_extensions
version: c12348ce28de40eed0136aa2b644d0ee0650e56c
subpackages:
...@@ -179,11 +187,12 @@ imports:
- lex/httplex
- trace
- name: golang.org/x/sys
version: e48874b42435b4347fc52bdee0424a52abc974d7
repo: https://github.com/golang/sys.git
vcs: git
subpackages:
- unix
- windows
- name: golang.org/x/text
version: 836efe42bb4aa16aaa17b9c155d8813d336ed720
repo: https://github.com/golang/text.git
...@@ -222,4 +231,3 @@ testImports:
version: 05e8a0eda380579888eb53c394909df027f06991
subpackages:
- assert
...@@ -26,3 +26,8 @@ import:
version: v1.1.0
- package: github.com/alecthomas/gometalinter
version: v1.2.1
- package: github.com/inconshreveable/log15
version: v2.13
- package: github.com/go-stack/stack
version: v1.6.0
- package: github.com/golang/protobuf
...@@ -35,13 +35,19 @@ import (
"unsafe"
"github.com/PaddlePaddle/Paddle/go/master"
log "github.com/inconshreveable/log15"
)
var mu sync.Mutex
var handleMap = make(map[C.paddle_master_client]*master.Client)
var curHandle C.paddle_master_client
func init() {
log.Root().SetHandler(
log.LvlFilterHandler(log.LvlWarn, log.CallerStackHandler("%+v", log.StderrHandler)),
)
}
func add(c *master.Client) C.paddle_master_client {
mu.Lock()
defer mu.Unlock()
...@@ -117,7 +123,8 @@ func paddle_set_dataset(client C.paddle_master_client, path **C.char, size C.int
}
err := c.SetDataset(paths)
if err != nil {
log.Error("error set dataset",
log.Ctx{"error": err, "paths": paths})
return C.PADDLE_MASTER_ERROR
}
...@@ -167,7 +174,7 @@ func paddle_request_save_model(client C.paddle_master_client, trainerID string,
c := get(client)
need, err := c.RequestSaveModel(trainerID, time.Duration(blockMS)*time.Millisecond)
if err != nil {
log.Error("error request save model", log.Ctx{"error": err})
return C.PADDLE_MASTER_ERROR
}
......
...@@ -21,7 +21,7 @@ import ( ...@@ -21,7 +21,7 @@ import (
"github.com/PaddlePaddle/Paddle/go/connection" "github.com/PaddlePaddle/Paddle/go/connection"
"github.com/PaddlePaddle/recordio" "github.com/PaddlePaddle/recordio"
"github.com/coreos/etcd/clientv3" "github.com/coreos/etcd/clientv3"
log "github.com/sirupsen/logrus" log "github.com/inconshreveable/log15"
) )
// Client is the client of the master server. // Client is the client of the master server.
...@@ -75,7 +75,7 @@ func WithEtcd(endpoints []string, timeout time.Duration) func(*Client) error { ...@@ -75,7 +75,7 @@ func WithEtcd(endpoints []string, timeout time.Duration) func(*Client) error {
for { for {
err := f() err := f()
if err != nil { if err != nil {
log.Warningln(err) log.Warn("create etcd client error", log.Ctx{"error": err})
} else { } else {
break break
} }
...@@ -121,6 +121,7 @@ func (c *Client) StartGetRecords(passID int) { ...@@ -121,6 +121,7 @@ func (c *Client) StartGetRecords(passID int) {
} }
func (c *Client) getRecords(passID int) { func (c *Client) getRecords(passID int) {
i := 0
for { for {
t, err := c.getTask(passID) t, err := c.getTask(passID)
if err != nil { if err != nil {
...@@ -130,18 +131,26 @@ func (c *Client) getRecords(passID int) { ...@@ -130,18 +131,26 @@ func (c *Client) getRecords(passID int) {
c.ch <- record{nil, err} c.ch <- record{nil, err}
break break
} }
if err.Error() == ErrPassAfter.Error() {
// wait util last pass finishes if i%60 == 0 {
time.Sleep(time.Second * 3) log.Debug("getTask of passID error.",
continue log.Ctx{"error": err, "passID": passID})
i = 0
} }
log.Errorf("getTask error: %s", err)
// if err.Error() == ErrPassAfter.Error(),
// wait until the last pass finishes;
// if it is another error, such as a network error,
// wait to reconnect or for the task to time out
time.Sleep(time.Second * 3)
i += 3
continue
} }
for _, chunk := range t.Chunks { for _, chunk := range t.Chunks {
f, e := os.Open(chunk.Path) f, e := os.Open(chunk.Path)
if e != nil { if e != nil {
log.Errorln(e) log.Error("error open chunk", log.Ctx{"error": e})
continue continue
} }
...@@ -152,12 +161,15 @@ func (c *Client) getRecords(passID int) { ...@@ -152,12 +161,15 @@ func (c *Client) getRecords(passID int) {
if s.Err() != nil { if s.Err() != nil {
c.ch <- record{nil, s.Err()} c.ch <- record{nil, s.Err()}
log.Errorln(err, chunk.Path) log.Error(
"error scan chunk",
log.Ctx{"error": err, "path": chunk.Path},
)
} }
err = f.Close() err = f.Close()
if err != nil { if err != nil {
log.Errorln(err) log.Error("error close record file", log.Ctx{"error": err})
} }
} }
...@@ -166,7 +178,7 @@ func (c *Client) getRecords(passID int) { ...@@ -166,7 +178,7 @@ func (c *Client) getRecords(passID int) {
// correct, but a reasonable approximation. // correct, but a reasonable approximation.
err = c.taskFinished(t.Meta.ID) err = c.taskFinished(t.Meta.ID)
if err != nil { if err != nil {
log.Errorln(err) log.Error("task finish callback error.", log.Ctx{"error": err})
} }
} }
} }
...@@ -179,12 +191,12 @@ func (c *Client) monitorMaster(addrCh <-chan string) { ...@@ -179,12 +191,12 @@ func (c *Client) monitorMaster(addrCh <-chan string) {
if curMaster == "" { if curMaster == "" {
err := c.conn.Close() err := c.conn.Close()
if err != nil { if err != nil {
log.Errorln(err) log.Error("close old master addr error", log.Ctx{"error": err})
} }
} else { } else {
err := c.conn.Connect(curMaster) err := c.conn.Connect(curMaster)
if err != nil { if err != nil {
log.Errorln(err) log.Error("connect to new master addr error", log.Ctx{"error": err})
// connect to addr failed, set // connect to addr failed, set
// to last known addr in order // to last known addr in order
......
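`getRecords` now distinguishes `ErrPassAfter` from transient errors and throttles its logging: it sleeps 3 seconds per retry but emits a line only about once per minute via the `i%60` counter. A self-contained sketch of that rate-limited retry loop, with `doWork` as a hypothetical stand-in for `c.getTask`:

```go
package main

import (
	"errors"
	"time"

	log "github.com/inconshreveable/log15"
)

// retryLoop mirrors the throttling in getRecords: retry every 3 seconds,
// but log only about once per minute.
func retryLoop(doWork func() error) {
	secs := 0
	for {
		err := doWork()
		if err == nil {
			return
		}
		if secs%60 == 0 {
			log.Debug("still retrying", log.Ctx{"error": err})
			secs = 0
		}
		time.Sleep(3 * time.Second)
		secs += 3
	}
}

func main() {
	tries := 0
	retryLoop(func() error {
		tries++
		if tries < 3 {
			return errors.New("not ready")
		}
		return nil
	})
}
```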
...@@ -25,8 +25,6 @@ import ( ...@@ -25,8 +25,6 @@ import (
"testing" "testing"
"time" "time"
log "github.com/sirupsen/logrus"
"github.com/PaddlePaddle/Paddle/go/connection" "github.com/PaddlePaddle/Paddle/go/connection"
"github.com/PaddlePaddle/recordio" "github.com/PaddlePaddle/recordio"
) )
...@@ -36,10 +34,6 @@ const ( ...@@ -36,10 +34,6 @@ const (
chunkPerTask = 10 chunkPerTask = 10
) )
func init() {
log.SetLevel(log.ErrorLevel)
}
func TestGetFinishTask(t *testing.T) { func TestGetFinishTask(t *testing.T) {
const path = "/tmp/master_client_test_0" const path = "/tmp/master_client_test_0"
......
...@@ -117,6 +117,7 @@ func TestNextRecord(t *testing.T) { ...@@ -117,6 +117,7 @@ func TestNextRecord(t *testing.T) {
if e != nil { if e != nil {
panic(e) panic(e)
} }
// test for n passes // test for n passes
for pass := 0; pass < 10; pass++ { for pass := 0; pass < 10; pass++ {
c.StartGetRecords(pass) c.StartGetRecords(pass)
......
...@@ -20,7 +20,7 @@ import ( ...@@ -20,7 +20,7 @@ import (
"github.com/coreos/etcd/clientv3" "github.com/coreos/etcd/clientv3"
"github.com/coreos/etcd/clientv3/concurrency" "github.com/coreos/etcd/clientv3/concurrency"
log "github.com/sirupsen/logrus" log "github.com/inconshreveable/log15"
) )
const ( const (
...@@ -44,7 +44,7 @@ type EtcdClient struct { ...@@ -44,7 +44,7 @@ type EtcdClient struct {
// NewEtcdClient creates a new EtcdClient. // NewEtcdClient creates a new EtcdClient.
func NewEtcdClient(endpoints []string, addr string, lockPath, addrPath, statePath string, ttlSec int) (*EtcdClient, error) { func NewEtcdClient(endpoints []string, addr string, lockPath, addrPath, statePath string, ttlSec int) (*EtcdClient, error) {
log.Debugf("Connecting to etcd at %v", endpoints) log.Debug("Connecting to etcd", log.Ctx{"endpoint": endpoints})
cli, err := clientv3.New(clientv3.Config{ cli, err := clientv3.New(clientv3.Config{
Endpoints: endpoints, Endpoints: endpoints,
DialTimeout: dialTimeout, DialTimeout: dialTimeout,
...@@ -64,12 +64,12 @@ func NewEtcdClient(endpoints []string, addr string, lockPath, addrPath, statePat ...@@ -64,12 +64,12 @@ func NewEtcdClient(endpoints []string, addr string, lockPath, addrPath, statePat
// one master running, but split-brain problem may cause // one master running, but split-brain problem may cause
// multiple master servers running), and the cluster management // multiple master servers running), and the cluster management
// software will kill one of them. // software will kill one of them.
log.Infof("Trying to acquire lock at %s.", lockPath) log.Info("Trying to acquire lock.", log.Ctx{"path": lockPath})
err = lock.Lock(context.TODO()) err = lock.Lock(context.TODO())
if err != nil { if err != nil {
return nil, err return nil, err
} }
log.Infof("Successfully acquired lock at %s.", lockPath) log.Info("Successfully acquired lock at %s.", log.Ctx{"path": lockPath})
put := clientv3.OpPut(addrPath, addr) put := clientv3.OpPut(addrPath, addr)
resp, err := cli.Txn(context.Background()).If(lock.IsOwner()).Then(put).Commit() resp, err := cli.Txn(context.Background()).If(lock.IsOwner()).Then(put).Commit()
...@@ -78,7 +78,8 @@ func NewEtcdClient(endpoints []string, addr string, lockPath, addrPath, statePat ...@@ -78,7 +78,8 @@ func NewEtcdClient(endpoints []string, addr string, lockPath, addrPath, statePat
} }
if !resp.Succeeded { if !resp.Succeeded {
log.Fatal("No longer owns the master lock. Exiting.") log.Crit("No longer owns the master lock. Exiting.")
panic("No longer owns the master lock. Exiting.")
} }
e := &EtcdClient{ e := &EtcdClient{
...@@ -102,7 +103,7 @@ func (e *EtcdClient) Save(state []byte) error { ...@@ -102,7 +103,7 @@ func (e *EtcdClient) Save(state []byte) error {
} }
if !resp.Succeeded { if !resp.Succeeded {
log.Errorln("No longer owns the lock, trying to lock again") log.Error("No longer owns the lock, trying to lock again")
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
err := e.lock.Lock(ctx) err := e.lock.Lock(ctx)
cancel() cancel()
...@@ -116,9 +117,10 @@ func (e *EtcdClient) Save(state []byte) error { ...@@ -116,9 +117,10 @@ func (e *EtcdClient) Save(state []byte) error {
// to kill current master server. The current // to kill current master server. The current
// state is not saved, but the trainer's RPC // state is not saved, but the trainer's RPC
// call will fail, so the trainer will retry. // call will fail, so the trainer will retry.
log.Fatalf("Could not acquire the lock at %s: %v. Exiting.", e.lockPath, err) log.Crit("Could not acquire the lock at %s: %v. Exiting.", log.Ctx{"path": e.lockPath, "error": err})
panic("Could not acquire the lock at %s: %v. Exiting.")
} }
log.Infof("Successfully acquired lock at %s.", e.lockPath) log.Info("Successfully acquired lock at %s.", e.lockPath)
return e.Save(state) return e.Save(state)
} }
...@@ -136,7 +138,7 @@ func (e *EtcdClient) Load() ([]byte, error) { ...@@ -136,7 +138,7 @@ func (e *EtcdClient) Load() ([]byte, error) {
} }
if !resp.Succeeded { if !resp.Succeeded {
log.Errorln("No longer owns the lock, trying to lock and load again.") log.Error("No longer owns the lock, trying to lock and load again.")
err = e.lock.Lock(context.Background()) err = e.lock.Lock(context.Background())
if err != nil { if err != nil {
return nil, err return nil, err
...@@ -163,7 +165,7 @@ func (e *EtcdClient) Shutdown() error { ...@@ -163,7 +165,7 @@ func (e *EtcdClient) Shutdown() error {
if err == nil { if err == nil {
err = newErr err = newErr
} else { } else {
log.Errorln(newErr) log.Error("shutdown error", log.Ctx{"error": newErr})
} }
} }
...@@ -192,7 +194,7 @@ func watchKey(c *clientv3.Client, key string, valChan chan<- string) { ...@@ -192,7 +194,7 @@ func watchKey(c *clientv3.Client, key string, valChan chan<- string) {
for wresp := range rch { for wresp := range rch {
for _, ev := range wresp.Events { for _, ev := range wresp.Events {
// if received event is DELETE, the value will be an empty string // if received event is DELETE, the value will be an empty string
log.Infof("received event %s, %q : %q\n", ev.Type, ev.Kv.Key, ev.Kv.Value) log.Info("received event.", log.Ctx{"type": ev.Type, "key": ev.Kv.Key, "value": ev.Kv.Value})
valChan <- string(ev.Kv.Value) valChan <- string(ev.Kv.Value)
} }
} }
......
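The election logic above acquires a distributed lock and then publishes the master's address through a transaction guarded by `lock.IsOwner()`, so a split-brain master that silently lost the lock cannot overwrite the rightful owner's key; losing ownership is fatal (`log.Crit` plus `panic`, since log15 has no Fatal). A condensed sketch of that pattern, assuming a reachable etcd; `lockPath` and `addrPath` are illustrative:

```go
package election

import (
	"context"

	"github.com/coreos/etcd/clientv3"
	"github.com/coreos/etcd/clientv3/concurrency"
	log "github.com/inconshreveable/log15"
)

// electMaster condenses the pattern above: acquire the distributed lock,
// then publish this master's address through a transaction that only
// commits while the session still owns the lock.
func electMaster(cli *clientv3.Client, lockPath, addrPath, addr string) error {
	sess, err := concurrency.NewSession(cli)
	if err != nil {
		return err
	}
	lock := concurrency.NewMutex(sess, lockPath)
	if err := lock.Lock(context.TODO()); err != nil {
		return err
	}
	resp, err := cli.Txn(context.Background()).
		If(lock.IsOwner()).
		Then(clientv3.OpPut(addrPath, addr)).
		Commit()
	if err != nil {
		return err
	}
	if !resp.Succeeded {
		// log15 has no Fatal, hence the Crit + panic pairing in the diff.
		log.Crit("No longer owns the master lock. Exiting.")
		panic("no longer owns the master lock")
	}
	return nil
}
```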
...@@ -25,7 +25,7 @@ import ( ...@@ -25,7 +25,7 @@ import (
"sync" "sync"
"time" "time"
log "github.com/sirupsen/logrus" log "github.com/inconshreveable/log15"
"github.com/PaddlePaddle/recordio" "github.com/PaddlePaddle/recordio"
) )
...@@ -170,11 +170,11 @@ func (s *Service) recover() (bool, error) { ...@@ -170,11 +170,11 @@ func (s *Service) recover() (bool, error) {
} }
if state == nil { if state == nil {
log.Infoln("No state exists, not recovered.") log.Info("No state exists, not recovered.")
return false, nil return false, nil
} }
log.Infof("Loaded snapshot of size: %d bytes.", len(state)) log.Info("Loaded snapshot.", log.Ctx{"size": len(state)})
gr, err := gzip.NewReader(bytes.NewReader(state)) gr, err := gzip.NewReader(bytes.NewReader(state))
if err != nil { if err != nil {
return false, err return false, err
...@@ -191,11 +191,11 @@ func (s *Service) recover() (bool, error) { ...@@ -191,11 +191,11 @@ func (s *Service) recover() (bool, error) {
if err != nil { if err != nil {
// Only close failed, recover actually succeed, so // Only close failed, recover actually succeed, so
// just log error. // just log error.
log.Errorln(err) log.Error("error close recover file.", log.Ctx{"error": err})
} }
s.state = tqs s.state = tqs
log.WithFields(s.logFields()).Infof("Master recovered from snapshot, scheduling pending task timeout check.") log.Info("Master recovered from snapshot, scheduling pending task timeout check.", s.logCtx())
for _, t := range s.state.Pending { for _, t := range s.state.Pending {
time.AfterFunc(s.timeoutDur, s.checkTimeoutFunc(t.Task.Meta.ID, t.Task.Meta.Epoch)) time.AfterFunc(s.timeoutDur, s.checkTimeoutFunc(t.Task.Meta.ID, t.Task.Meta.Epoch))
} }
...@@ -224,7 +224,7 @@ func (s *Service) snapshot() error { ...@@ -224,7 +224,7 @@ func (s *Service) snapshot() error {
} }
state := buf.Bytes() state := buf.Bytes()
log.Infof("Saving snapshot of size: %d bytes.", len(state)) log.Info("Saving snapshot.", log.Ctx{"size bytes": len(state)})
return s.store.Save(state) return s.store.Save(state)
} }
...@@ -260,7 +260,7 @@ func readChunks(globPaths []string) ([]Chunk, error) { ...@@ -260,7 +260,7 @@ func readChunks(globPaths []string) ([]Chunk, error) {
} }
count := index.NumChunks() count := index.NumChunks()
log.Infof("readChunks: file %s has %d chunks", path, count) log.Info("reading chunks.", log.Ctx{"path": path, "num chunks": count})
for i := 0; i < count; i++ { for i := 0; i < count; i++ {
chunk := Chunk{ chunk := Chunk{
Path: path, Path: path,
...@@ -300,7 +300,7 @@ func (s *Service) SetDataset(globPaths []string, _ *int) error { ...@@ -300,7 +300,7 @@ func (s *Service) SetDataset(globPaths []string, _ *int) error {
err = s.snapshot() err = s.snapshot()
if err != nil { if err != nil {
log.Errorln(err) log.Error("snapshot error", log.Ctx{"error": err})
return err return err
} }
close(s.ready) close(s.ready)
...@@ -320,7 +320,7 @@ func (s *Service) processFailedTask(t taskEntry, epoch int) { ...@@ -320,7 +320,7 @@ func (s *Service) processFailedTask(t taskEntry, epoch int) {
defer func() { defer func() {
err := s.snapshot() err := s.snapshot()
if err != nil { if err != nil {
log.Errorln(err) log.Error("snapshot error", log.Ctx{"error": err})
} }
}() }()
...@@ -328,12 +328,12 @@ func (s *Service) processFailedTask(t taskEntry, epoch int) { ...@@ -328,12 +328,12 @@ func (s *Service) processFailedTask(t taskEntry, epoch int) {
t.NumFailure++ t.NumFailure++
if t.NumFailure > s.failureMax { if t.NumFailure > s.failureMax {
log.Warningf("Task %v failed %d times, discard.", t.Task, t.NumFailure) log.Warn("Task failed to many times, discard.", log.Ctx{"task": t.Task, "num failed": t.NumFailure})
s.state.Failed = append(s.state.Failed, t) s.state.Failed = append(s.state.Failed, t)
return return
} }
log.Warningf("Task %v failed %d times, re-dispatch.", t.Task, t.NumFailure) log.Warn("Task failed, re-dispatch.", log.Ctx{"task": t.Task, "num failed": t.NumFailure})
s.state.Todo = append(s.state.Todo, t) s.state.Todo = append(s.state.Todo, t)
return return
} }
...@@ -353,8 +353,8 @@ func (s *Service) checkTimeoutFunc(taskID int, epoch int) func() { ...@@ -353,8 +353,8 @@ func (s *Service) checkTimeoutFunc(taskID int, epoch int) func() {
} }
// must be called with lock held. // must be called with lock held.
func (s *Service) logFields() log.Fields { func (s *Service) logCtx() log.Ctx {
return log.Fields{ return log.Ctx{
"todoLen": len(s.state.Todo), "todoLen": len(s.state.Todo),
"pendingLen": len(s.state.Pending), "pendingLen": len(s.state.Pending),
"doneLen": len(s.state.Done), "doneLen": len(s.state.Done),
...@@ -383,10 +383,10 @@ func (s *Service) GetTask(passID int, task *Task) error { ...@@ -383,10 +383,10 @@ func (s *Service) GetTask(passID int, task *Task) error {
if len(s.state.Todo) == 0 { if len(s.state.Todo) == 0 {
if len(s.state.Done) == 0 && len(s.state.Pending) == 0 { if len(s.state.Done) == 0 && len(s.state.Pending) == 0 {
log.WithFields(s.logFields()).Warningln("All tasks failed, may start next pass") log.Warn("All tasks failed, may start next pass", s.logCtx())
return ErrAllTaskFailed return ErrAllTaskFailed
} }
log.WithFields(s.logFields()).Warningln("No more available task.") log.Warn("No more available task.", s.logCtx())
return ErrNoMoreAvailable return ErrNoMoreAvailable
} }
...@@ -400,8 +400,9 @@ func (s *Service) GetTask(passID int, task *Task) error { ...@@ -400,8 +400,9 @@ func (s *Service) GetTask(passID int, task *Task) error {
} }
*task = t.Task *task = t.Task
log.WithFields(s.logFields()).Infof("Task #%v dispatched.", t.Task.Meta) ctx := s.logCtx()
ctx["task meta"] = t.Task.Meta
log.Info("Task dispatched.", ctx)
time.AfterFunc(s.timeoutDur, s.checkTimeoutFunc(t.Task.Meta.ID, t.Task.Meta.Epoch)) time.AfterFunc(s.timeoutDur, s.checkTimeoutFunc(t.Task.Meta.ID, t.Task.Meta.Epoch))
return nil return nil
} }
...@@ -417,7 +418,9 @@ func (s *Service) TaskFinished(taskID int, dummy *int) error { ...@@ -417,7 +418,9 @@ func (s *Service) TaskFinished(taskID int, dummy *int) error {
t, ok := s.state.Pending[taskID] t, ok := s.state.Pending[taskID]
if !ok { if !ok {
log.WithFields(s.logFields()).Warningln("Pending task #%d not found.", taskID) ctx := s.logCtx()
ctx["task id"] = taskID
log.Warn("Pending task not found.", ctx)
return nil return nil
} }
...@@ -426,7 +429,9 @@ func (s *Service) TaskFinished(taskID int, dummy *int) error { ...@@ -426,7 +429,9 @@ func (s *Service) TaskFinished(taskID int, dummy *int) error {
s.state.Done = append(s.state.Done, t) s.state.Done = append(s.state.Done, t)
delete(s.state.Pending, taskID) delete(s.state.Pending, taskID)
log.WithFields(s.logFields()).Infof("Task #%d finished.", taskID) ctx := s.logCtx()
ctx["task id"] = taskID
log.Info("Task finished.", ctx)
if len(s.state.Todo) == 0 && len(s.state.Pending) == 0 { if len(s.state.Todo) == 0 && len(s.state.Pending) == 0 {
// increase master side pass count if all tasks finished // increase master side pass count if all tasks finished
s.state.CurPass++ s.state.CurPass++
...@@ -434,12 +439,14 @@ func (s *Service) TaskFinished(taskID int, dummy *int) error { ...@@ -434,12 +439,14 @@ func (s *Service) TaskFinished(taskID int, dummy *int) error {
s.state.Done = []taskEntry{} s.state.Done = []taskEntry{}
// TODO(typhoonzero): deal with failed tasks // TODO(typhoonzero): deal with failed tasks
s.state.Failed = []taskEntry{} s.state.Failed = []taskEntry{}
log.WithFields(s.logFields()).Warningf("all task finished, add new pass data, newpass: %d.", s.state.CurPass) ctx := s.logCtx()
ctx["new pass"] = s.state.CurPass
log.Warn("all task finished, add new pass data.", ctx)
} }
err := s.snapshot() err := s.snapshot()
if err != nil { if err != nil {
log.Errorln(err) log.Error("snapshot error", log.Ctx{"error": err})
} }
return err return err
} }
...@@ -455,7 +462,7 @@ func (s *Service) TaskFailed(meta TaskMeta, dummy *int) error { ...@@ -455,7 +462,7 @@ func (s *Service) TaskFailed(meta TaskMeta, dummy *int) error {
t, ok := s.state.Pending[meta.ID] t, ok := s.state.Pending[meta.ID]
if !ok { if !ok {
log.WithFields(s.logFields()).Warningln("TaskFailed:Pending task #%v not found.", t.Task.Meta) log.Warn("TaskFailed:Pending task not found.", log.Ctx{"task": t.Task.Meta})
return nil return nil
} }
......
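The service replaces logrus's `WithFields` with a shared `logCtx()`: callers take a fresh copy of the base context of queue lengths and attach per-event keys before logging. A minimal runnable sketch of that pattern (the lengths and task id are illustrative):

```go
package main

import (
	log "github.com/inconshreveable/log15"
)

// baseCtx plays the role of s.logCtx(): a fresh map holding the shared
// queue lengths.
func baseCtx() log.Ctx {
	return log.Ctx{"todoLen": 3, "pendingLen": 1, "doneLen": 10}
}

func main() {
	// Copy the base context, then attach per-event fields before logging,
	// as TaskFinished and GetTask do above.
	ctx := baseCtx()
	ctx["task id"] = 42
	log.Info("Task finished.", ctx)
}
```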
# Ignore everything in this directory
*
# Except this file
!.gitignore
...@@ -13,5 +13,5 @@ ...@@ -13,5 +13,5 @@
# limitations under the License. # limitations under the License.
# #
if(WITH_TESTING) if(WITH_TESTING)
go_test(pserver_test DEPS paddle_go_optimizer) go_test(pserver_test DEPS paddle_go_optimizer gen_proto_go)
endif() endif()
...@@ -45,9 +45,15 @@ import ( ...@@ -45,9 +45,15 @@ import (
"github.com/PaddlePaddle/Paddle/go/pserver" "github.com/PaddlePaddle/Paddle/go/pserver"
"github.com/PaddlePaddle/Paddle/go/pserver/client" "github.com/PaddlePaddle/Paddle/go/pserver/client"
log "github.com/sirupsen/logrus" log "github.com/inconshreveable/log15"
) )
func init() {
log.Root().SetHandler(
log.LvlFilterHandler(log.LvlWarn, log.CallerStackHandler("%+v", log.StderrHandler)),
)
}
var mu sync.Mutex var mu sync.Mutex
var handleMap = make(map[C.paddle_pserver_client]*client.Client) var handleMap = make(map[C.paddle_pserver_client]*client.Client)
var curHandle C.paddle_pserver_client var curHandle C.paddle_pserver_client
...@@ -164,10 +170,13 @@ func paddle_init_param(client C.paddle_pserver_client, param C.paddle_parameter, ...@@ -164,10 +170,13 @@ func paddle_init_param(client C.paddle_pserver_client, param C.paddle_parameter,
if err != nil { if err != nil {
if err.Error() == pserver.AlreadyInitialized { if err.Error() == pserver.AlreadyInitialized {
log.Warningf("parameter %s already initialized, treat paddle_init_param as successful.", name) log.Warn(
"parameter already initialized, treat paddle_init_param as successful.",
log.Ctx{"parameter": name},
)
return C.PSERVER_OK return C.PSERVER_OK
} }
log.Errorln(err) log.Error("error init param", log.Ctx{"error": err})
return C.PSERVER_ERROR return C.PSERVER_ERROR
} }
...@@ -180,11 +189,11 @@ func paddle_finish_init_params(client C.paddle_pserver_client) C.int { ...@@ -180,11 +189,11 @@ func paddle_finish_init_params(client C.paddle_pserver_client) C.int {
err := c.FinishInitParams() err := c.FinishInitParams()
if err != nil { if err != nil {
if err.Error() == pserver.AlreadyInitialized { if err.Error() == pserver.AlreadyInitialized {
log.Warningln("parameters already initialized, treat paddle_finish_init_params as successful.") log.Warn("parameters already initialized, treat paddle_finish_init_params as successful.")
return C.PSERVER_OK return C.PSERVER_OK
} }
log.Errorln(err) log.Error("error finish init params", log.Ctx{"error": err})
return C.PSERVER_ERROR return C.PSERVER_ERROR
} }
...@@ -205,7 +214,7 @@ func paddle_send_grads(client C.paddle_pserver_client, grads **C.paddle_gradient ...@@ -205,7 +214,7 @@ func paddle_send_grads(client C.paddle_pserver_client, grads **C.paddle_gradient
c := get(client) c := get(client)
err := c.SendGrads(gs) err := c.SendGrads(gs)
if err != nil { if err != nil {
log.Errorln(err) log.Error("error send grads", log.Ctx{"error": err})
return C.PSERVER_ERROR return C.PSERVER_ERROR
} }
...@@ -222,7 +231,7 @@ func paddle_get_params(client C.paddle_pserver_client, dst **C.paddle_parameter, ...@@ -222,7 +231,7 @@ func paddle_get_params(client C.paddle_pserver_client, dst **C.paddle_parameter,
c := get(client) c := get(client)
ps, err := c.GetParams(ns) ps, err := c.GetParams(ns)
if err != nil { if err != nil {
log.Errorln(err) log.Error("error get params", log.Ctx{"error": err})
return C.PSERVER_ERROR return C.PSERVER_ERROR
} }
...@@ -231,7 +240,13 @@ func paddle_get_params(client C.paddle_pserver_client, dst **C.paddle_parameter, ...@@ -231,7 +240,13 @@ func paddle_get_params(client C.paddle_pserver_client, dst **C.paddle_parameter,
for i, p := range ps { for i, p := range ps {
pn[i] = p.Name pn[i] = p.Name
} }
log.Errorf("pserver returned wrong number of parameters. Requested: %s, returned: %s.", strings.Join(pn, ", "), strings.Join(ns, ", ")) log.Error(
"pserver returned wrong number of parameters.",
log.Ctx{
"Requested": strings.Join(pn, ", "),
"Returned": strings.Join(ns, ", "),
},
)
return C.PSERVER_ERROR return C.PSERVER_ERROR
} }
...@@ -241,7 +256,13 @@ func paddle_get_params(client C.paddle_pserver_client, dst **C.paddle_parameter, ...@@ -241,7 +256,13 @@ func paddle_get_params(client C.paddle_pserver_client, dst **C.paddle_parameter,
for i, p := range ps { for i, p := range ps {
pn[i] = p.Name pn[i] = p.Name
} }
log.Errorf("pserver returned wrong parameters, or not in requested order. Requested: %s, returned: %s.", strings.Join(pn, ", "), strings.Join(ns, ", ")) log.Error(
"pserver returned wrong parameters, or not in requested order.",
log.Ctx{
"Requested": strings.Join(pn, ", "),
"Returned": strings.Join(ns, ", "),
},
)
return C.PSERVER_ERROR return C.PSERVER_ERROR
} }
} }
...@@ -251,13 +272,19 @@ func paddle_get_params(client C.paddle_pserver_client, dst **C.paddle_parameter, ...@@ -251,13 +272,19 @@ func paddle_get_params(client C.paddle_pserver_client, dst **C.paddle_parameter,
param := *(**C.paddle_parameter)(unsafe.Pointer((uintptr(unsafe.Pointer(dst)) + uintptr(i)*unsafe.Sizeof(*dst)))) param := *(**C.paddle_parameter)(unsafe.Pointer((uintptr(unsafe.Pointer(dst)) + uintptr(i)*unsafe.Sizeof(*dst))))
if unsafe.Pointer(param) == nil { if unsafe.Pointer(param) == nil {
log.Errorln("must pre-allocate parameter.") log.Error("must pre-allocate parameter.")
return C.PSERVER_ERROR return C.PSERVER_ERROR
} }
if unsafe.Pointer(param.content) != nil { if unsafe.Pointer(param.content) != nil {
if int(param.content_len) != len(p.Content) { if int(param.content_len) != len(p.Content) {
log.Errorf("the pre-allocated content len does not match parameter content len. Pre-allocated len: %d, returned len: %d", param.content_len, len(p.Content)) log.Error(
"the pre-allocated content len does not match parameter content len.",
log.Ctx{
"Pre-allocated len": param.content_len,
"Returned len": len(p.Content),
},
)
return C.PSERVER_ERROR return C.PSERVER_ERROR
} }
} }
......
...@@ -22,7 +22,7 @@ import ( ...@@ -22,7 +22,7 @@ import (
"github.com/PaddlePaddle/Paddle/go/connection" "github.com/PaddlePaddle/Paddle/go/connection"
"github.com/PaddlePaddle/Paddle/go/pserver" "github.com/PaddlePaddle/Paddle/go/pserver"
log "github.com/sirupsen/logrus" log "github.com/inconshreveable/log15"
) )
// TODO(helin): add RPC call retry logic // TODO(helin): add RPC call retry logic
...@@ -84,7 +84,7 @@ func (c *Client) monitorPservers(l Lister, pserverNum int) { ...@@ -84,7 +84,7 @@ func (c *Client) monitorPservers(l Lister, pserverNum int) {
if curServers[i].Addr == "" { if curServers[i].Addr == "" {
err := c.pservers[i].Close() err := c.pservers[i].Close()
if err != nil { if err != nil {
log.Errorln(err) log.Error("error closing connection to pserver", log.Ctx{"error": err})
} }
continue continue
...@@ -92,7 +92,7 @@ func (c *Client) monitorPservers(l Lister, pserverNum int) { ...@@ -92,7 +92,7 @@ func (c *Client) monitorPservers(l Lister, pserverNum int) {
err := c.pservers[i].Connect(curServers[i].Addr) err := c.pservers[i].Connect(curServers[i].Addr)
if err != nil { if err != nil {
log.Errorln(err) log.Error("error connecting to pserver", log.Ctx{"error": err})
// connect to addr failed, set // connect to addr failed, set
// to last known addr in order // to last known addr in order
......
...@@ -30,7 +30,7 @@ import ( ...@@ -30,7 +30,7 @@ import (
"github.com/PaddlePaddle/Paddle/go/pserver" "github.com/PaddlePaddle/Paddle/go/pserver"
"github.com/PaddlePaddle/Paddle/go/pserver/client" "github.com/PaddlePaddle/Paddle/go/pserver/client"
"github.com/coreos/etcd/clientv3" "github.com/coreos/etcd/clientv3"
log "github.com/sirupsen/logrus" log "github.com/inconshreveable/log15"
) )
const ( const (
...@@ -90,7 +90,7 @@ func initEtcdClient() { ...@@ -90,7 +90,7 @@ func initEtcdClient() {
DialTimeout: time.Second * time.Duration(1), DialTimeout: time.Second * time.Duration(1),
}) })
if err != nil { if err != nil {
log.Errorf("err %v", err) log.Error("error init etcd client", log.Ctx{"error": err})
} }
ctx, cancel := context.WithTimeout(context.Background(), timeout) ctx, cancel := context.WithTimeout(context.Background(), timeout)
_, err = client.Delete(ctx, pserver.PsDesired) _, err = client.Delete(ctx, pserver.PsDesired)
......
...@@ -25,7 +25,7 @@ import ( ...@@ -25,7 +25,7 @@ import (
"github.com/PaddlePaddle/Paddle/go/pserver" "github.com/PaddlePaddle/Paddle/go/pserver"
"github.com/coreos/etcd/clientv3" "github.com/coreos/etcd/clientv3"
"github.com/coreos/etcd/clientv3/concurrency" "github.com/coreos/etcd/clientv3/concurrency"
log "github.com/sirupsen/logrus" log "github.com/inconshreveable/log15"
) )
const ( const (
...@@ -54,26 +54,29 @@ func (e *Etcd) Desired() int { ...@@ -54,26 +54,29 @@ func (e *Etcd) Desired() int {
resp, err := e.client.Get(ctx, pserver.PsDesired) resp, err := e.client.Get(ctx, pserver.PsDesired)
cancel() cancel()
if err != nil { if err != nil {
log.Errorf("Get ps dresire number failed! recnnectiong..., %v", err) log.Error(
"Get ps dresire number failed! reconnecting...",
log.Ctx{"error": err},
)
time.Sleep(e.timeout) time.Sleep(e.timeout)
continue continue
} }
kvs := resp.Kvs kvs := resp.Kvs
if len(kvs) == 0 { if len(kvs) == 0 {
log.Infoln("Waiting for ps desired registered ...") log.Info("Waiting for ps desired registered ...")
time.Sleep(e.timeout) time.Sleep(e.timeout)
continue continue
} }
psDesired, err = strconv.Atoi(string(resp.Kvs[0].Value)) psDesired, err = strconv.Atoi(string(resp.Kvs[0].Value))
if err != nil { if err != nil {
log.Errorf("psDesired %d invalid %v", psDesired, err) log.Error("atoi failed", log.Ctx{"error": err})
time.Sleep(e.timeout) time.Sleep(e.timeout)
continue continue
} }
log.Debugf("Get psDesired number: %d", psDesired) log.Debug("Got psDesired", log.Ctx{"psDesired": psDesired})
break break
} }
return psDesired return psDesired
...@@ -88,17 +91,20 @@ func (e *Etcd) List() []Server { ...@@ -88,17 +91,20 @@ func (e *Etcd) List() []Server {
for i := 0; i < psDesired; i++ { for i := 0; i < psDesired; i++ {
ctx, cancel := context.WithTimeout(context.Background(), e.timeout) ctx, cancel := context.WithTimeout(context.Background(), e.timeout)
psKey := pserver.PsPath + strconv.Itoa(i) psKey := pserver.PsPath + strconv.Itoa(i)
log.Debugf("checking %s", psKey) log.Debug("looking for pserver", log.Ctx{"ps key": psKey})
resp, err := e.client.Get(ctx, psKey) resp, err := e.client.Get(ctx, psKey)
cancel() cancel()
if err != nil { if err != nil {
log.Infof("Get psKey= %s error, %v", psKey, err) log.Info(
"Get psKey error",
log.Ctx{"ps key": psKey, "error": err},
)
time.Sleep(e.timeout) time.Sleep(e.timeout)
continue continue
} }
kvs := resp.Kvs kvs := resp.Kvs
if len(kvs) == 0 { if len(kvs) == 0 {
log.Infof("Waiting for ps addr registered ...") log.Info("Waiting for ps addr registered ...")
time.Sleep(e.timeout) time.Sleep(e.timeout)
continue continue
} }
...@@ -106,11 +112,17 @@ func (e *Etcd) List() []Server { ...@@ -106,11 +112,17 @@ func (e *Etcd) List() []Server {
psAddr := string(resp.Kvs[0].Value) psAddr := string(resp.Kvs[0].Value)
// TODO(Longfei) check the ps address // TODO(Longfei) check the ps address
if psAddr == "" { if psAddr == "" {
log.Infof("Get psKey = %s, psAddr is empty", psKey) log.Info(
"Value under psKey is empty",
log.Ctx{"psKey": psKey},
)
time.Sleep(e.timeout) time.Sleep(e.timeout)
continue continue
} }
log.Debugf("got value (%s) for key: %s", psAddr, psKey) log.Debug(
"got psAddr given psKey",
log.Ctx{"psAddr": psAddr, "psKey": psKey},
)
servers[i].Index = i servers[i].Index = i
servers[i].Addr = psAddr servers[i].Addr = psAddr
} }
...@@ -130,13 +142,13 @@ func NewEtcd(endpoints string) *Etcd { ...@@ -130,13 +142,13 @@ func NewEtcd(endpoints string) *Etcd {
DialTimeout: defaultEtcdTimeout, DialTimeout: defaultEtcdTimeout,
}) })
if err != nil { if err != nil {
log.Errorf("Init etcd connection failed: %v", err) log.Error("Init etcd connection failed", log.Ctx{"error": err})
time.Sleep(defaultEtcdTimeout) time.Sleep(defaultEtcdTimeout)
continue continue
} }
break break
} }
log.Infof("Connected to etcd: %s\n", endpoints) log.Info("Connected to etcd endpoint", log.Ctx{"endpoint": endpoints})
client := &Etcd{ client := &Etcd{
client: cli, client: cli,
timeout: defaultEtcdTimeout, timeout: defaultEtcdTimeout,
...@@ -154,7 +166,7 @@ func (e *Etcd) Select() (bool, error) { ...@@ -154,7 +166,7 @@ func (e *Etcd) Select() (bool, error) {
} }
lock := concurrency.NewMutex(sess, initLockPath) lock := concurrency.NewMutex(sess, initLockPath)
log.Infof("Trying to acquire lock at %s.", initLockPath) log.Info("Trying to acquire lock", log.Ctx{"lock path": initLockPath})
// Do not use timeout context here, since we don't know how // Do not use timeout context here, since we don't know how
// long does it take for other trainers to initialize the // long does it take for other trainers to initialize the
// parameters. // parameters.
...@@ -162,7 +174,7 @@ func (e *Etcd) Select() (bool, error) { ...@@ -162,7 +174,7 @@ func (e *Etcd) Select() (bool, error) {
if err != nil { if err != nil {
return false, err return false, err
} }
log.Infof("Successfully acquired lock at %s.", initLockPath) log.Info("Successfully acquired lock", log.Ctx{"lock path": initLockPath})
get := clientv3.OpGet(initDonePath) get := clientv3.OpGet(initDonePath)
ctx, cancel := context.WithTimeout(context.Background(), e.timeout) ctx, cancel := context.WithTimeout(context.Background(), e.timeout)
...@@ -181,17 +193,17 @@ func (e *Etcd) Select() (bool, error) { ...@@ -181,17 +193,17 @@ func (e *Etcd) Select() (bool, error) {
if len(resp.Kvs) == 0 { if len(resp.Kvs) == 0 {
// Key value not set, select current trainer. // Key value not set, select current trainer.
e.lock = lock e.lock = lock
log.Infoln("Trainer selected.") log.Info("Trainer selected.")
return true, nil return true, nil
} }
if string(resp.Kvs[0].Value) == initDoneVal { if string(resp.Kvs[0].Value) == initDoneVal {
log.Infoln("Initialization is already done.") log.Info("Initialization is already done.")
ctx, cancel = context.WithTimeout(context.Background(), e.timeout) ctx, cancel = context.WithTimeout(context.Background(), e.timeout)
err = lock.Unlock(ctx) err = lock.Unlock(ctx)
cancel() cancel()
if err != nil { if err != nil {
log.Errorln(err) log.Error("error unlocking", log.Ctx{"error": err})
} }
return false, nil return false, nil
} }
...@@ -221,7 +233,7 @@ func (e *Etcd) Done() error { ...@@ -221,7 +233,7 @@ func (e *Etcd) Done() error {
err = e.lock.Unlock(ctx) err = e.lock.Unlock(ctx)
cancel() cancel()
if err != nil { if err != nil {
log.Errorln(err) log.Error("error unlocking", log.Ctx{"error": err})
} else { } else {
e.lock = nil e.lock = nil
} }
...@@ -244,7 +256,7 @@ func (e *Etcd) Close() error { ...@@ -244,7 +256,7 @@ func (e *Etcd) Close() error {
cErr := e.client.Close() cErr := e.client.Close()
if cErr != nil { if cErr != nil {
if err != nil { if err != nil {
log.Errorln(cErr) log.Error("error closing etcd client", log.Ctx{"error": cErr})
return err return err
} }
return cErr return cErr
......
...@@ -24,7 +24,7 @@ import ( ...@@ -24,7 +24,7 @@ import (
"github.com/PaddlePaddle/Paddle/go/utils/networkhelper" "github.com/PaddlePaddle/Paddle/go/utils/networkhelper"
"github.com/coreos/etcd/clientv3" "github.com/coreos/etcd/clientv3"
"github.com/coreos/etcd/clientv3/concurrency" "github.com/coreos/etcd/clientv3/concurrency"
log "github.com/sirupsen/logrus" log "github.com/inconshreveable/log15"
) )
const ( const (
...@@ -82,19 +82,19 @@ func (e *EtcdClient) Register(port int) (int, error) { ...@@ -82,19 +82,19 @@ func (e *EtcdClient) Register(port int) (int, error) {
DialTimeout: e.dialTimeout, DialTimeout: e.dialTimeout,
}) })
if err != nil { if err != nil {
log.Errorf("connect to etcd error: %v", err) log.Error("connect to etcd error", log.Ctx{"error": err})
time.Sleep(retryTimeout) time.Sleep(retryTimeout)
continue continue
} }
e.client = cli e.client = cli
sess, err := concurrency.NewSession(cli, concurrency.WithTTL(e.ttlSec)) sess, err := concurrency.NewSession(cli, concurrency.WithTTL(e.ttlSec))
if err != nil { if err != nil {
log.Errorf("create etcd session error: %v", err) log.Error("create etcd session error", log.Ctx{"error": err})
time.Sleep(retryTimeout) time.Sleep(retryTimeout)
continue continue
} }
e.sess = sess e.sess = sess
log.Debugf("inited client to %s", e.endpoints) log.Debug("connected to etcd", log.Ctx{"endpoint": e.endpoints})
break break
} }
// init /ps_desired using transaction, for multiple pservers may want to write // init /ps_desired using transaction, for multiple pservers may want to write
...@@ -104,7 +104,7 @@ func (e *EtcdClient) Register(port int) (int, error) { ...@@ -104,7 +104,7 @@ func (e *EtcdClient) Register(port int) (int, error) {
_, err := e.initDesiredPservers(ctx, e.numPservers) _, err := e.initDesiredPservers(ctx, e.numPservers)
cancel() cancel()
if err != nil { if err != nil {
log.Warn(err) log.Warn("pserver init error", log.Ctx{"error": err, "num pservers": e.numPservers})
time.Sleep(retryTimeout) time.Sleep(retryTimeout)
continue continue
} }
...@@ -119,14 +119,17 @@ func (e *EtcdClient) Register(port int) (int, error) { ...@@ -119,14 +119,17 @@ func (e *EtcdClient) Register(port int) (int, error) {
resp, err := e.client.Get(ctx, PsDesired) resp, err := e.client.Get(ctx, PsDesired)
cancel() cancel()
if err != nil { if err != nil {
log.Errorf("getting %s error: %v", PsDesired, err) log.Error("get etcd key error", log.Ctx{"key": PsDesired, "error": err})
time.Sleep(retryTimeout) time.Sleep(retryTimeout)
continue continue
} }
if len(resp.Kvs) != 0 { if len(resp.Kvs) != 0 {
e.desired, err = strconv.Atoi(string(resp.Kvs[0].Value)) e.desired, err = strconv.Atoi(string(resp.Kvs[0].Value))
if err != nil { if err != nil {
log.Errorf("value of %s invalid %v\n", PsDesired, err) log.Error(
"psDesired atoi error",
log.Ctx{"error": err, "value": string(resp.Kvs[0].Value)},
)
time.Sleep(retryTimeout) time.Sleep(retryTimeout)
// NOTE: wait until the ps_desired value changes // NOTE: wait until the ps_desired value changes
continue continue
...@@ -143,7 +146,7 @@ func (e *EtcdClient) Register(port int) (int, error) { ...@@ -143,7 +146,7 @@ func (e *EtcdClient) Register(port int) (int, error) {
pserverIdx, err = e.registerPserverEtcd(ctx, port) pserverIdx, err = e.registerPserverEtcd(ctx, port)
cancel() cancel()
if err != nil { if err != nil {
log.Warn(err) log.Warn("register pserver on etcd error", log.Ctx{"error": err})
time.Sleep(retryTimeout) time.Sleep(retryTimeout)
continue continue
} }
...@@ -170,16 +173,17 @@ func (e *EtcdClient) registerPserverEtcd(ctx context.Context, port int) (int, er ...@@ -170,16 +173,17 @@ func (e *EtcdClient) registerPserverEtcd(ctx context.Context, port int) (int, er
registered := false registered := false
for i := 0; i < e.desired; i++ { for i := 0; i < e.desired; i++ {
psKey := PsPath + strconv.Itoa(i) psKey := PsPath + strconv.Itoa(i)
log.Debugf("checking %s", psKey)
ps := c.Get(psKey) ps := c.Get(psKey)
log.Debugf("got value (%s) for key: %s", ps, psKey) log.Debug(
"register pserver got value",
log.Ctx{"value": ps, "key": psKey},
)
if ps == "" { if ps == "" {
// find the first id and write info // find the first id and write info
pserverAddr := e.externalIP + ":" + strconv.Itoa(port) pserverAddr := e.externalIP + ":" + strconv.Itoa(port)
c.Put(psKey, pserverAddr, clientv3.WithLease(e.sess.Lease())) c.Put(psKey, pserverAddr, clientv3.WithLease(e.sess.Lease()))
log.Debugf("set pserver node %s with value %s", psKey, pserverAddr) log.Debug("register finished", log.Ctx{"key": psKey, "value": pserverAddr})
log.Debug("register finished")
idx = i idx = i
registered = true registered = true
break break
...@@ -239,7 +243,7 @@ func (e *EtcdClient) Shutdown() error { ...@@ -239,7 +243,7 @@ func (e *EtcdClient) Shutdown() error {
newErr := e.client.Close() newErr := e.client.Close()
if newErr != nil { if newErr != nil {
if err != nil { if err != nil {
log.Errorln(newErr) log.Error("shutdown error", log.Ctx{"error": newErr})
} else { } else {
err = newErr err = newErr
} }
......
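`registerPserverEtcd` above writes the pserver's address under `/ps/<index>` with the session lease attached, so the key disappears automatically if the pserver dies and its TTL lapses. A minimal sketch of just those lease semantics; the real code performs the Get/Put inside an etcd transaction to claim the first free index, and the `/ps/` prefix and 5-second TTL here are illustrative:

```go
package register

import (
	"context"
	"strconv"

	"github.com/coreos/etcd/clientv3"
	"github.com/coreos/etcd/clientv3/concurrency"
)

// registerEphemeral writes the address with the session's lease attached,
// making the key ephemeral: it expires on its own when the session's TTL
// lapses without renewal.
func registerEphemeral(cli *clientv3.Client, idx int, addr string) error {
	sess, err := concurrency.NewSession(cli, concurrency.WithTTL(5))
	if err != nil {
		return err
	}
	psKey := "/ps/" + strconv.Itoa(idx)
	_, err = cli.Put(context.Background(), psKey, addr, clientv3.WithLease(sess.Lease()))
	return err
}
```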
...@@ -25,7 +25,7 @@ import ( ...@@ -25,7 +25,7 @@ import (
"fmt" "fmt"
"unsafe" "unsafe"
log "github.com/sirupsen/logrus" log "github.com/inconshreveable/log15"
) )
type optimizer struct { type optimizer struct {
...@@ -56,12 +56,12 @@ func newOptimizer(paramWithConfigs ParameterWithConfig, State []byte) *optimizer ...@@ -56,12 +56,12 @@ func newOptimizer(paramWithConfigs ParameterWithConfig, State []byte) *optimizer
c := paramWithConfigs.Config c := paramWithConfigs.Config
s := State s := State
paramBufferSize := C.size_t(len(p.Content)) paramBufferSize := C.size_t(len(p.Content))
log.WithFields(log.Fields{ log.Info("New Optimizer Created with config", log.Ctx{
"ElementType": p.ElementType, "ElementType": p.ElementType,
"ParamSize": paramBufferSize, "ParamSize": paramBufferSize,
"ConfigSize": len(c), "ConfigSize": len(c),
"StateSize": len(s), "StateSize": len(s),
}).Info("New Optimizer Created with config:") })
var cbuffer unsafe.Pointer var cbuffer unsafe.Pointer
cbuffer = C.malloc(paramBufferSize) cbuffer = C.malloc(paramBufferSize)
...@@ -71,22 +71,41 @@ func newOptimizer(paramWithConfigs ParameterWithConfig, State []byte) *optimizer ...@@ -71,22 +71,41 @@ func newOptimizer(paramWithConfigs ParameterWithConfig, State []byte) *optimizer
cstate = unsafe.Pointer(&s[0]) cstate = unsafe.Pointer(&s[0])
} }
var cptr (*C.uchar)
if len(c) > 0 {
cptr = (*C.uchar)(&c[0])
} else {
log.Error("empty config", "param name", paramWithConfigs.Param.Name)
}
o.config = c o.config = c
o.opt = C.paddle_create_optimizer((*C.uchar)(&c[0]), C.int(len(c)), o.opt = C.paddle_create_optimizer(
C.paddle_element_type(p.ElementType), cbuffer, C.int(paramBufferSize), (*C.char)(cstate), C.int(len(s))) cptr,
C.int(len(c)),
C.paddle_element_type(p.ElementType),
cbuffer,
C.int(paramBufferSize),
(*C.char)(cstate),
C.int(len(s)),
)
return o return o
} }
func (o *optimizer) GetWeights() []byte { func (o *optimizer) GetWeights() []byte {
var buffer unsafe.Pointer var buffer unsafe.Pointer
// we do not own the buffer, no need to free later.
bufferLen := C.paddle_optimizer_get_weights(o.opt, &buffer) bufferLen := C.paddle_optimizer_get_weights(o.opt, &buffer)
return cArrayToSlice(buffer, int(bufferLen)*C.sizeof_float) return cArrayToSlice(buffer, int(bufferLen)*C.sizeof_float)
} }
func (o *optimizer) GetStates() []byte { func (o *optimizer) GetStates() []byte {
var cbuffer *C.char var cbuffer *C.char
// we own the state buffer and need to free it later.
cbufferLen := C.paddle_optimizer_get_state(o.opt, &cbuffer) cbufferLen := C.paddle_optimizer_get_state(o.opt, &cbuffer)
return cArrayToSlice(unsafe.Pointer(cbuffer), int(cbufferLen)) buf := cArrayToSlice(unsafe.Pointer(cbuffer), int(cbufferLen))
cpy := make([]byte, len(buf))
copy(cpy, buf)
C.free(unsafe.Pointer(cbuffer))
return cpy
} }
func (o *optimizer) UpdateParameter(g Gradient) error { func (o *optimizer) UpdateParameter(g Gradient) error {
......
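`GetStates` now copies the state out of the C buffer and frees it, instead of returning a slice that aliases memory C still owns. A self-contained cgo sketch of that copy-then-free pattern; the toy `get_state` function is a stand-in for `paddle_optimizer_get_state`, and the slice view plays the role of the repo's `cArrayToSlice` helper:

```go
package main

/*
#include <stdlib.h>
#include <string.h>

// Toy C function that hands back a malloc'd buffer the caller must free,
// standing in for paddle_optimizer_get_state.
static int get_state(char **out) {
	const char *s = "opaque-state";
	*out = (char *)malloc(12);
	memcpy(*out, s, 12);
	return 12;
}
*/
import "C"

import (
	"fmt"
	"unsafe"
)

func main() {
	var cbuf *C.char
	n := int(C.get_state(&cbuf))

	// View the C memory as a Go slice without copying (the role of the
	// repo's cArrayToSlice helper)...
	view := (*[1 << 30]byte)(unsafe.Pointer(cbuf))[:n:n]

	// ...then copy into Go-managed memory before freeing, so the result
	// stays valid after C.free, as GetStates above now does.
	cpy := make([]byte, len(view))
	copy(cpy, view)
	C.free(unsafe.Pointer(cbuf))

	fmt.Println(string(cpy)) // opaque-state
}
```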
...@@ -15,8 +15,12 @@ ...@@ -15,8 +15,12 @@
package pserver package pserver
import ( import (
"encoding/binary"
"io/ioutil" "io/ioutil"
"math"
"testing" "testing"
"github.com/stretchr/testify/assert"
) )
func TestOptimizerCreateRelease(t *testing.T) { func TestOptimizerCreateRelease(t *testing.T) {
...@@ -36,3 +40,39 @@ func TestOptimizerCreateRelease(t *testing.T) { ...@@ -36,3 +40,39 @@ func TestOptimizerCreateRelease(t *testing.T) {
o := newOptimizer(param, nil) o := newOptimizer(param, nil)
o.Cleanup() o.Cleanup()
} }
func float32Bytes(float float32) []byte {
bits := math.Float32bits(float)
bytes := make([]byte, 4)
binary.LittleEndian.PutUint32(bytes, bits)
return bytes
}
func TestOptimizerState(t *testing.T) {
p := Parameter{
Name: "a",
ElementType: Int32,
}
weights := float32Bytes(100)
p.Content = weights
config, err := ioutil.ReadFile("./client/c/test/testdata/optimizer.pb")
if err != nil {
t.Fatalf("read optimizer proto failed")
}
param := ParameterWithConfig{
Param: p,
Config: config,
}
o := newOptimizer(param, nil)
s := o.GetStates()
// clear param content and check if the state is restored.
param.Param.Content = float32Bytes(300)
o1 := newOptimizer(param, s)
s1 := o1.GetStates()
assert.Equal(t, s, s1)
assert.Equal(t, weights, o.GetWeights())
assert.Equal(t, weights, o1.GetWeights())
o.Cleanup()
o1.Cleanup()
}
...@@ -17,22 +17,26 @@ package pserver ...@@ -17,22 +17,26 @@ package pserver
import ( import (
"bufio" "bufio"
"bytes" "bytes"
"crypto/md5" "encoding/binary"
"encoding/gob" "encoding/gob"
"encoding/hex"
"encoding/json" "encoding/json"
"errors" "errors"
"fmt" "fmt"
"hash/crc32"
"io/ioutil" "io/ioutil"
"os" "os"
"path" "path"
"strconv" "strconv"
"strings"
"sync" "sync"
"time" "time"
"github.com/golang/protobuf/proto"
uuid "github.com/satori/go.uuid" uuid "github.com/satori/go.uuid"
log "github.com/sirupsen/logrus" pb "github.com/PaddlePaddle/Paddle/go/proto"
log "github.com/inconshreveable/log15"
) )
// ElementType is the type of elements of a Parameter. // ElementType is the type of elements of a Parameter.
...@@ -40,7 +44,7 @@ type ElementType int ...@@ -40,7 +44,7 @@ type ElementType int
// ErrCheckpointNotFound indicates that the pserver checkpoint could // ErrCheckpointNotFound indicates that the pserver checkpoint could
// not be found. // not be found.
var ErrCheckpointNotFound = errors.New("checkpoint not found") var ErrCheckpointNotFound = errors.New("checkpoint not found in etcd")
// RPC error message. // RPC error message.
const ( const (
...@@ -66,6 +70,46 @@ type Parameter struct { ...@@ -66,6 +70,46 @@ type Parameter struct {
Content []byte Content []byte
} }
func float32ToString(b []byte) string {
f := make([]float32, len(b)/4)
buf := bytes.NewReader(b)
err := binary.Read(buf, binary.LittleEndian, &f)
if err != nil {
return ""
}
return fmt.Sprintf("%v", f)
}
func float32ByteToString(c []byte) string {
var a []byte
var b []byte
if len(c) <= 80 {
a = c
} else {
a = c[0:40]
b = c[len(c)-40:]
}
var s string
s = float32ToString(a)
if b == nil {
return s
}
s = strings.Replace(s, "]", "", -1) + "..." + strings.Replace(float32ToString(b), "[", "", -1)
return s
}
func (p Parameter) String() string {
if p.ElementType != Float32 {
return fmt.Sprintf("name:%v ElementType:%v",
p.Name, p.ElementType)
}
return float32ByteToString(p.Content)
}
// ParameterWithConfig contains the parameter and the configuration. // ParameterWithConfig contains the parameter and the configuration.
type ParameterWithConfig struct { type ParameterWithConfig struct {
Param Parameter Param Parameter
...@@ -76,7 +120,7 @@ type ParameterWithConfig struct { ...@@ -76,7 +120,7 @@ type ParameterWithConfig struct {
type checkpointMeta struct { type checkpointMeta struct {
UUID string `json:"uuid"` UUID string `json:"uuid"`
Path string `json:"path"` Path string `json:"path"`
MD5 string `json:"md5"` CRC32 uint32 `json:"crc32"`
Timestamp int64 `json:"timestamp"` Timestamp int64 `json:"timestamp"`
} }
...@@ -92,7 +136,7 @@ type Service struct { ...@@ -92,7 +136,7 @@ type Service struct {
idx int idx int
checkpointInterval time.Duration checkpointInterval time.Duration
checkpointPath string checkpointPath string
client *EtcdClient client KVStore
mu sync.Mutex mu sync.Mutex
optMap map[string]*optimizer optMap map[string]*optimizer
...@@ -104,7 +148,12 @@ type parameterCheckpoint struct { ...@@ -104,7 +148,12 @@ type parameterCheckpoint struct {
State []byte State []byte
} }
func loadMeta(e *EtcdClient, idx int) (meta checkpointMeta, err error) { type KVStore interface {
GetKey(key string, timeout time.Duration) ([]byte, error)
PutKey(key string, value []byte, timeout time.Duration, withLease bool) error
}
func loadMeta(e KVStore, idx int) (meta checkpointMeta, err error) {
v, err := e.GetKey(PsCheckpoint+strconv.Itoa(idx), 3*time.Second) v, err := e.GetKey(PsCheckpoint+strconv.Itoa(idx), 3*time.Second)
if err != nil { if err != nil {
return return
...@@ -123,7 +172,10 @@ func loadMeta(e *EtcdClient, idx int) (meta checkpointMeta, err error) { ...@@ -123,7 +172,10 @@ func loadMeta(e *EtcdClient, idx int) (meta checkpointMeta, err error) {
} }
// LoadCheckpoint loads checkpoint from file. // LoadCheckpoint loads checkpoint from file.
func LoadCheckpoint(e *EtcdClient, idx int) (Checkpoint, error) { func LoadCheckpoint(e KVStore, idx int) (Checkpoint, error) {
log.Info("Loading checkpoint", "pserver index", idx)
defer traceTime(time.Now(), "load checkpoint")
cpMeta, err := loadMeta(e, idx) cpMeta, err := loadMeta(e, idx)
if err != nil { if err != nil {
return nil, err return nil, err
...@@ -134,11 +186,8 @@ func LoadCheckpoint(e *EtcdClient, idx int) (Checkpoint, error) { ...@@ -134,11 +186,8 @@ func LoadCheckpoint(e *EtcdClient, idx int) (Checkpoint, error) {
return nil, err return nil, err
} }
// TODO(helin): change MD5 to CRC since CRC is better for file crc32 := crc32.ChecksumIEEE(content)
// checksum in our use case (emphasize speed over security). if crc32 != cpMeta.CRC32 {
h := md5.New()
md5 := hex.EncodeToString(h.Sum(content))
if md5 != cpMeta.MD5 {
return nil, errors.New(WrongChecksum) return nil, errors.New(WrongChecksum)
} }
...@@ -147,12 +196,13 @@ func LoadCheckpoint(e *EtcdClient, idx int) (Checkpoint, error) { ...@@ -147,12 +196,13 @@ func LoadCheckpoint(e *EtcdClient, idx int) (Checkpoint, error) {
if err = dec.Decode(&cp); err != nil { if err = dec.Decode(&cp); err != nil {
return nil, err return nil, err
} }
return cp, nil return cp, nil
} }
// NewService creates a new service, will bypass etcd registration if no // NewService creates a new service, will bypass etcd registration if no
// endpoints specified. It will recover from the checkpoint file if a specified checkpoint exists. // endpoints specified. It will recover from the checkpoint file if a specified checkpoint exists.
func NewService(idx int, interval time.Duration, path string, client *EtcdClient, cp Checkpoint) (*Service, error) { func NewService(idx int, interval time.Duration, path string, client KVStore, cp Checkpoint) (*Service, error) {
s := &Service{ s := &Service{
idx: idx, idx: idx,
checkpointInterval: interval, checkpointInterval: interval,
...@@ -170,6 +220,7 @@ func NewService(idx int, interval time.Duration, path string, client *EtcdClient ...@@ -170,6 +220,7 @@ func NewService(idx int, interval time.Duration, path string, client *EtcdClient
} }
s.optMap[p.Param.Name] = newOptimizer(p, item.State) s.optMap[p.Param.Name] = newOptimizer(p, item.State)
} }
close(s.initialized)
} }
return s, nil return s, nil
} }
...@@ -178,11 +229,14 @@ func NewService(idx int, interval time.Duration, path string, client *EtcdClient ...@@ -178,11 +229,14 @@ func NewService(idx int, interval time.Duration, path string, client *EtcdClient
func (s *Service) InitParam(paramWithConfigs ParameterWithConfig, _ *int) error { func (s *Service) InitParam(paramWithConfigs ParameterWithConfig, _ *int) error {
select { select {
case <-s.initialized: case <-s.initialized:
log.Warn("init param called but parameters already initialized.")
return errors.New(AlreadyInitialized) return errors.New(AlreadyInitialized)
default: default:
} }
// TODO(helin): parse parameter config c := &pb.OptimizerConfig{}
if err := proto.Unmarshal(paramWithConfigs.Config, c); err != nil {
log.Error("error unmarshaling optimizer config", log.Ctx{"error": err})
}
log.Debug(fmt.Sprintf("OptimizerConfig:%v", c))
s.mu.Lock() s.mu.Lock()
defer s.mu.Unlock() defer s.mu.Unlock()
...@@ -191,6 +245,13 @@ func (s *Service) InitParam(paramWithConfigs ParameterWithConfig, _ *int) error ...@@ -191,6 +245,13 @@ func (s *Service) InitParam(paramWithConfigs ParameterWithConfig, _ *int) error
// properly memory aligned, if not, make copy to a memory // properly memory aligned, if not, make copy to a memory
// aligned region. // aligned region.
s.optMap[paramWithConfigs.Param.Name] = newOptimizer(paramWithConfigs, nil) s.optMap[paramWithConfigs.Param.Name] = newOptimizer(paramWithConfigs, nil)
log.Info(
"init parameter",
"name", paramWithConfigs.Param.Name,
"config len", len(paramWithConfigs.Config),
"param len", len(paramWithConfigs.Param.Content),
"type", paramWithConfigs.Param.ElementType,
)
return nil return nil
} }
...@@ -199,6 +260,7 @@ func (s *Service) InitParam(paramWithConfigs ParameterWithConfig, _ *int) error ...@@ -199,6 +260,7 @@ func (s *Service) InitParam(paramWithConfigs ParameterWithConfig, _ *int) error
func (s *Service) FinishInitParams(_ int, _ *int) error { func (s *Service) FinishInitParams(_ int, _ *int) error {
select { select {
case <-s.initialized: case <-s.initialized:
log.Warn("finished init param called but parameters already initialized.")
return errors.New(AlreadyInitialized) return errors.New(AlreadyInitialized)
default: default:
} }
...@@ -209,10 +271,12 @@ func (s *Service) FinishInitParams(_ int, _ *int) error { ...@@ -209,10 +271,12 @@ func (s *Service) FinishInitParams(_ int, _ *int) error {
for range t { for range t {
err := s.checkpoint() err := s.checkpoint()
if err != nil { if err != nil {
log.Errorln(err) log.Error("checkpoint error", log.Ctx{"error": err})
} }
} }
}() }()
log.Info("init parameter finished.")
return nil return nil
} }
...@@ -222,6 +286,8 @@ func (s *Service) SendGrad(g Gradient, _ *int) error { ...@@ -222,6 +286,8 @@ func (s *Service) SendGrad(g Gradient, _ *int) error {
select { select {
case <-s.initialized: case <-s.initialized:
default: default:
log.Warn("received gradient before initialization.",
"name", g.Name, "size", len(g.Content), "type", g.ElementType)
return errors.New(Uninitialized) return errors.New(Uninitialized)
} }
...@@ -230,9 +296,14 @@ func (s *Service) SendGrad(g Gradient, _ *int) error { ...@@ -230,9 +296,14 @@ func (s *Service) SendGrad(g Gradient, _ *int) error {
o, ok := s.optMap[g.Name] o, ok := s.optMap[g.Name]
if !ok { if !ok {
log.Warn("received gradient but can't find name.",
"name", g.Name, "size", len(g.Content), "type", g.ElementType)
return fmt.Errorf("parameter: %s does not exist", g.Name) return fmt.Errorf("parameter: %s does not exist", g.Name)
} }
log.Debug(Parameter(g).String())
log.Info("received gradient from trainer, updating gradient.",
"name", g.Name, "size", len(g.Content), "type", g.ElementType)
return o.UpdateParameter(g) return o.UpdateParameter(g)
} }
...@@ -244,6 +315,7 @@ func (s *Service) GetParam(name string, parameter *Parameter) error { ...@@ -244,6 +315,7 @@ func (s *Service) GetParam(name string, parameter *Parameter) error {
opt, ok := s.optMap[name] opt, ok := s.optMap[name]
if !ok { if !ok {
log.Warn("trainer wants to get a parameter that does not exist.", "name", name)
return fmt.Errorf("parameter: %s does not exist", name) return fmt.Errorf("parameter: %s does not exist", name)
} }
...@@ -257,12 +329,14 @@ func (s *Service) GetParam(name string, parameter *Parameter) error { ...@@ -257,12 +329,14 @@ func (s *Service) GetParam(name string, parameter *Parameter) error {
parameter.Name = name parameter.Name = name
parameter.ElementType = opt.elementType parameter.ElementType = opt.elementType
parameter.Content = opt.GetWeights() parameter.Content = opt.GetWeights()
log.Debug(parameter.String())
log.Info("sending parameter to the trainer", "name", parameter.Name, "size", len(parameter.Content), "type", parameter.ElementType)
return nil return nil
} }
func traceTime(start time.Time, name string) { func traceTime(start time.Time, name string) {
elapsed := time.Since(start) elapsed := time.Since(start)
log.Infof("%s took %v", name, elapsed) log.Info("time elapsed", log.Ctx{"name": name, "elapsed": elapsed})
} }
// checkpoint saves checkpoint to disk. // checkpoint saves checkpoint to disk.
...@@ -270,7 +344,7 @@ func traceTime(start time.Time, name string) { ...@@ -270,7 +344,7 @@ func traceTime(start time.Time, name string) {
// checkpoint should be only called after the parameters are // checkpoint should be only called after the parameters are
// initialized. // initialized.
func (s *Service) checkpoint() (err error) { func (s *Service) checkpoint() (err error) {
log.Infoln("Begin save checkpoint.") log.Info("Begin save checkpoint.")
defer traceTime(time.Now(), "save checkpoint") defer traceTime(time.Now(), "save checkpoint")
s.mu.Lock() s.mu.Lock()
...@@ -297,6 +371,13 @@ func (s *Service) checkpoint() (err error) { ...@@ -297,6 +371,13 @@ func (s *Service) checkpoint() (err error) {
return return
} }
if _, err = os.Stat(s.checkpointPath); os.IsNotExist(err) {
err = os.MkdirAll(s.checkpointPath, os.ModePerm)
if err != nil {
return
}
}
id := uuid.NewV4().String() id := uuid.NewV4().String()
p := path.Join(s.checkpointPath, id) p := path.Join(s.checkpointPath, id)
f, err := os.Create(p) f, err := os.Create(p)
...@@ -308,7 +389,7 @@ func (s *Service) checkpoint() (err error) { ...@@ -308,7 +389,7 @@ func (s *Service) checkpoint() (err error) {
closeErr := f.Close() closeErr := f.Close()
if closeErr != nil { if closeErr != nil {
if err != nil { if err != nil {
log.Errorln(closeErr) log.Error("error close checkpoint file", log.Ctx{"error": closeErr})
} else { } else {
// Set closeErr as return value. // Set closeErr as return value.
err = closeErr err = closeErr
...@@ -329,20 +410,29 @@ func (s *Service) checkpoint() (err error) { ...@@ -329,20 +410,29 @@ func (s *Service) checkpoint() (err error) {
oldMeta, err := loadMeta(s.client, s.idx) oldMeta, err := loadMeta(s.client, s.idx)
if err == ErrCheckpointNotFound { if err == ErrCheckpointNotFound {
log.Infoln("Do not have existing checkpoint.") log.Info("old meta not found, skip removing old meta")
err = nil err = nil
} else if err == nil {
log.Info("removing old meta")
if oldMeta.Path != "" {
rmErr := os.Remove(oldMeta.Path)
if rmErr != nil {
// log error, but still treat checkpoint as
// successful.
log.Error("remove old meta file error", log.Ctx{"error": rmErr})
}
}
} }
if err != nil { if err != nil {
return return
} }
h := md5.New() crc32 := crc32.ChecksumIEEE(buf.Bytes())
md5 := hex.EncodeToString(h.Sum(buf.Bytes()))
cpMeta := checkpointMeta{ cpMeta := checkpointMeta{
UUID: id, UUID: id,
Timestamp: time.Now().UnixNano(), Timestamp: time.Now().UnixNano(),
MD5: md5, CRC32: crc32,
Path: p, Path: p,
} }
...@@ -356,14 +446,5 @@ func (s *Service) checkpoint() (err error) { ...@@ -356,14 +446,5 @@ func (s *Service) checkpoint() (err error) {
return return
} }
if oldMeta.Path != "" {
rmErr := os.Remove(oldMeta.Path)
if rmErr != nil {
// log error, but still treat checkpoint as
// successful.
log.Errorln(rmErr)
}
}
return return
} }
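The checkpoint metadata now stores a CRC32 of the serialized state instead of an MD5, resolving the old TODO: CRC trades cryptographic strength for speed, which is all a file checksum needs here. A runnable sketch of writing and verifying the checksum around the `checkpointMeta` struct from the diff; the UUID, path, and payload are illustrative:

```go
package main

import (
	"encoding/json"
	"fmt"
	"hash/crc32"
	"time"
)

// checkpointMeta mirrors the struct in the diff: a CRC32 of the saved
// state replaces the old MD5 field.
type checkpointMeta struct {
	UUID      string `json:"uuid"`
	Path      string `json:"path"`
	CRC32     uint32 `json:"crc32"`
	Timestamp int64  `json:"timestamp"`
}

func main() {
	state := []byte("gob-encoded parameter checkpoint") // illustrative payload

	// Save side: record the checksum next to the file path.
	meta := checkpointMeta{
		UUID:      "5f3a...", // illustrative UUID
		Path:      "/tmp/checkpoints/5f3a...",
		CRC32:     crc32.ChecksumIEEE(state),
		Timestamp: time.Now().UnixNano(),
	}
	b, err := json.Marshal(meta)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))

	// Load side: recompute and compare, as LoadCheckpoint above does.
	if crc32.ChecksumIEEE(state) != meta.CRC32 {
		panic("wrong checkpoint checksum")
	}
}
```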
package pserver
import (
"bytes"
"encoding/binary"
"fmt"
"testing"
"time"
"github.com/stretchr/testify/assert"
)
const testDir = "./test_data"
type myKV struct {
m map[string][]byte
}
func (m *myKV) GetKey(key string, timeout time.Duration) ([]byte, error) {
if m.m == nil {
m.m = make(map[string][]byte)
}
return m.m[key], nil
}
func (m *myKV) PutKey(key string, value []byte, timeout time.Duration, withLease bool) error {
if m.m == nil {
m.m = make(map[string][]byte)
}
m.m[key] = value
return nil
}
func TestCheckpoint(t *testing.T) {
kv := &myKV{}
s, err := NewService(0, time.Hour, testDir, kv, nil)
assert.Nil(t, err)
err = s.checkpoint()
assert.Nil(t, err)
_, err = LoadCheckpoint(kv, 0)
assert.Nil(t, err)
}
func float32ToByte(f float32) []byte {
var buf bytes.Buffer
err := binary.Write(&buf, binary.LittleEndian, f)
if err != nil {
fmt.Println("binary.Write failed:", err)
}
return buf.Bytes()
}
func TestCheckpointWithData(t *testing.T) {
kv := &myKV{}
s, err := NewService(0, time.Hour, testDir, kv, nil)
assert.Nil(t, err)
var content []byte
for i := 0; i < 50000; i++ {
content = append(content, float32ToByte(float32(i))...)
}
p1 := Parameter{Name: "p1", ElementType: 1, Content: content}
err = s.InitParam(ParameterWithConfig{Param: p1}, nil)
assert.Nil(t, err)
err = s.FinishInitParams(0, nil)
assert.Nil(t, err)
var p2 Parameter
err = s.GetParam(p1.Name, &p2)
assert.Nil(t, err)
assert.Equal(t, p1, p2)
err = s.checkpoint()
assert.Nil(t, err)
cp, err := LoadCheckpoint(kv, 0)
assert.Nil(t, err)
s1, err := NewService(0, time.Hour, testDir, kv, cp)
assert.Nil(t, err)
var p3 Parameter
err = s1.GetParam(p1.Name, &p3)
assert.Nil(t, err)
assert.Equal(t, p1, p3)
}
...@@ -15,6 +15,7 @@ ...@@ -15,6 +15,7 @@
package pserver_test package pserver_test
import ( import (
"fmt"
"io/ioutil" "io/ioutil"
"reflect" "reflect"
"sync" "sync"
...@@ -179,6 +180,32 @@ func TestBlockUntilInitialized(t *testing.T) { ...@@ -179,6 +180,32 @@ func TestBlockUntilInitialized(t *testing.T) {
wg.Wait() wg.Wait()
} }
func TestCheckpointSpeed(t *testing.T) { func TestGradientString(t *testing.T) {
//TODO(zhihong): test speed g := pserver.Parameter{}
g.ElementType = pserver.Float32
g.Content = []byte{0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40, 0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40}
if g.String() != "[3.3702806e+12 2.142699 3.3702806e+12 2.142699]" {
t.Fatal("get float data error!")
}
g.Content = []byte{0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40}
if g.String() != "[3.3702806e+12 2.142699 3.3702806e+12 2.142699 3.3702806e+12 2.142699 3.3702806e+12 2.142699 3.3702806e+12 2.142699...3.3702806e+12 2.142699 3.3702806e+12 2.142699 3.3702806e+12 2.142699 3.3702806e+12 2.142699 3.3702806e+12 2.142699]" {
t.Fatal("get float data error!", g.String())
}
fmt.Println(g)
} }
...@@ -64,12 +64,18 @@ paddle_error paddle_gradient_machine_create_for_inference_with_parameters( ...@@ -64,12 +64,18 @@ paddle_error paddle_gradient_machine_create_for_inference_with_parameters(
modelConfigProtobuf.resize(modelConfigSize); modelConfigProtobuf.resize(modelConfigSize);
is.read(&modelConfigProtobuf[0], modelConfigSize); is.read(&modelConfigProtobuf[0], modelConfigSize);
paddle::TrainerConfig config; paddle::TrainerConfig config;
paddle::ModelConfig modelConfig;
if (!config.ParseFromString(modelConfigProtobuf) || !config.IsInitialized()) { if (!config.ParseFromString(modelConfigProtobuf) || !config.IsInitialized()) {
return kPD_PROTOBUF_ERROR; if (!modelConfig.ParseFromString(modelConfigProtobuf) ||
!modelConfig.IsInitialized()) {
return kPD_PROTOBUF_ERROR;
}
} else {
modelConfig = config.model_config();
} }
auto ptr = new paddle::capi::CGradientMachine(); auto ptr = new paddle::capi::CGradientMachine();
ptr->machine.reset(paddle::GradientMachine::create( ptr->machine.reset(paddle::GradientMachine::create(
config.model_config(), CREATE_MODE_TESTING, {paddle::PARAMETER_VALUE})); modelConfig, CREATE_MODE_TESTING, {paddle::PARAMETER_VALUE}));
std::vector<paddle::ParameterPtr>& parameters = ptr->machine->getParameters(); std::vector<paddle::ParameterPtr>& parameters = ptr->machine->getParameters();
for (auto& para : parameters) { for (auto& para : parameters) {
para->load(is); para->load(is);
......
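The change above makes the C API accept either a serialized `TrainerConfig` or a bare `ModelConfig`: it first tries the wrapper message and, only if that parse fails, falls back to the inner one. Below is a minimal standalone sketch of this fallback strategy; `ParseTrainerConfig` and `ParseModelConfig` are hypothetical stand-ins for the protobuf `ParseFromString` calls.

```cpp
#include <iostream>
#include <optional>
#include <string>

struct ModelConfig { std::string body; };
struct TrainerConfig { ModelConfig model; };

// Hypothetical parsers: succeed only when the buffer has the right prefix.
std::optional<TrainerConfig> ParseTrainerConfig(const std::string& buf) {
  if (buf.rfind("trainer:", 0) == 0) return TrainerConfig{{buf.substr(8)}};
  return std::nullopt;
}

std::optional<ModelConfig> ParseModelConfig(const std::string& buf) {
  if (buf.rfind("model:", 0) == 0) return ModelConfig{buf.substr(6)};
  return std::nullopt;
}

// Returns the model config regardless of which serialization was given:
// try the wrapper first, then fall back to the inner message.
std::optional<ModelConfig> LoadModel(const std::string& buf) {
  if (auto tc = ParseTrainerConfig(buf)) return tc->model;
  return ParseModelConfig(buf);
}

int main() {
  std::cout << LoadModel("model:mnist")->body << "\n";  // prints "mnist"
}
```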
# ddim lib # ddim lib
proto_library(framework_proto SRCS framework.proto)
cc_library(ddim SRCS ddim.cc DEPS eigen3) cc_library(ddim SRCS ddim.cc DEPS eigen3)
cc_test(ddim_test SRCS ddim_test.cc DEPS ddim) cc_test(ddim_test SRCS ddim_test.cc DEPS ddim)
nv_test(dim_test SRCS dim_test.cu DEPS ddim) nv_test(dim_test SRCS dim_test.cu DEPS ddim)
...@@ -7,25 +9,25 @@ cc_library(tensor SRCS tensor.cc DEPS ddim place paddle_memory device_context) ...@@ -7,25 +9,25 @@ cc_library(tensor SRCS tensor.cc DEPS ddim place paddle_memory device_context)
cc_test(tensor_test SRCS tensor_test.cc DEPS tensor) cc_test(tensor_test SRCS tensor_test.cc DEPS tensor)
cc_test(eigen_test SRCS eigen_test.cc DEPS tensor) cc_test(eigen_test SRCS eigen_test.cc DEPS tensor)
cc_library(lod_tensor SRCS lod_tensor.cc DEPS ddim place tensor) cc_library(lod_tensor SRCS lod_tensor.cc DEPS ddim place tensor framework_proto)
cc_test(lod_tensor_test SRCS lod_tensor_test.cc DEPS lod_tensor) cc_test(lod_tensor_test SRCS lod_tensor_test.cc DEPS lod_tensor paddle_memory)
nv_test(lod_tensor_gpu_test SRCS lod_tensor_test.cu DEPS lod_tensor) nv_test(lod_tensor_gpu_test SRCS lod_tensor_test.cu DEPS lod_tensor)
cc_test(variable_test SRCS variable_test.cc) cc_test(variable_test SRCS variable_test.cc)
cc_library(scope SRCS scope.cc) cc_library(scope SRCS scope.cc DEPS glog)
cc_test(scope_test SRCS scope_test.cc DEPS scope) cc_test(scope_test SRCS scope_test.cc DEPS scope)
proto_library(framework_proto SRCS framework.proto)
cc_library(attribute SRCS attribute.cc DEPS framework_proto) cc_library(attribute SRCS attribute.cc DEPS framework_proto)
cc_test(program_desc_test SRCS program_desc_test.cc DEPS proto_desc) cc_test(program_desc_test SRCS program_desc_test.cc DEPS proto_desc)
cc_library(op_proto_maker SRCS op_proto_maker.cc DEPS framework_proto attribute) cc_library(op_proto_maker SRCS op_proto_maker.cc DEPS framework_proto attribute)
cc_test(op_proto_maker_test SRCS op_proto_maker_test.cc DEPS op_proto_maker) cc_test(op_proto_maker_test SRCS op_proto_maker_test.cc DEPS op_proto_maker)
cc_library(op_info SRCS op_info.cc DEPS attribute framework_proto) cc_library(op_info SRCS op_info.cc DEPS attribute framework_proto)
cc_library(operator SRCS operator.cc DEPS op_info device_context tensor scope glog) cc_library(shape_inference SRCS shape_inference.cc DEPS ddim attribute)
cc_library(operator SRCS operator.cc DEPS op_info device_context tensor scope glog shape_inference)
cc_test(operator_test SRCS operator_test.cc DEPS operator op_registry) cc_test(operator_test SRCS operator_test.cc DEPS operator op_registry)
cc_library(proto_desc SRCS var_desc.cc op_desc.cc block_desc.cc program_desc.cc DEPS attribute ddim op_info operator) cc_library(proto_desc SRCS var_desc.cc op_desc.cc block_desc.cc program_desc.cc DEPS shape_inference op_info operator glog)
cc_library(op_registry SRCS op_registry.cc DEPS op_proto_maker op_info operator glog proto_desc) cc_library(op_registry SRCS op_registry.cc DEPS op_proto_maker op_info operator glog proto_desc)
cc_test(op_registry_test SRCS op_registry_test.cc DEPS op_registry) cc_test(op_registry_test SRCS op_registry_test.cc DEPS op_registry)
...@@ -41,7 +43,7 @@ add_custom_command(TARGET framework_py_proto POST_BUILD ...@@ -41,7 +43,7 @@ add_custom_command(TARGET framework_py_proto POST_BUILD
WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}) WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR})
cc_library(backward SRCS backward.cc DEPS net_op) cc_library(backward SRCS backward.cc DEPS net_op)
cc_test(backward_test SRCS backward_test.cc DEPS backward recurrent_op device_context) cc_test(backward_test SRCS backward_test.cc DEPS backward recurrent_op device_context fill_constant_op)
cc_library(executor SRCS executor.cc DEPS op_registry device_context scope framework_proto backward glog) cc_library(executor SRCS executor.cc DEPS op_registry device_context scope framework_proto backward glog)
......
...@@ -315,6 +315,7 @@ static void CreateGradVarInBlock( ...@@ -315,6 +315,7 @@ static void CreateGradVarInBlock(
return false; /* not break */ return false; /* not break */
}); });
if (need_infer_shape) { if (need_infer_shape) {
ops[op_index]->InferVarType(block_desc);
ops[op_index]->InferShape(*block_desc); ops[op_index]->InferShape(*block_desc);
} }
} }
...@@ -452,11 +453,16 @@ ParamGradInfoMap AppendBackward( ...@@ -452,11 +453,16 @@ ParamGradInfoMap AppendBackward(
std::transform(target_shape_desc.begin(), target_shape_desc.end(), std::transform(target_shape_desc.begin(), target_shape_desc.end(),
std::back_inserter(target_shape), std::back_inserter(target_shape),
[](int64_t dim) { return static_cast<int>(dim); }); [](int64_t dim) { return static_cast<int>(dim); });
VLOG(3) << "backward from loss=" << target.Name()
<< " data_type=" << target.GetDataType();
std::unique_ptr<OpDescBind> fill_one_op( std::unique_ptr<OpDescBind> fill_one_op(
new OpDescBind("fill_constant", {}, {{"Out", {fill_one_op_out}}}, new OpDescBind("fill_constant", {}, {{"Out", {fill_one_op_out}}},
{{"shape", target_shape}, {{"shape", target_shape},
{"value", static_cast<float>(1.0)}, {"value", static_cast<float>(1.0)},
{"data_type", framework::DataType::FP32}})); {"data_type", target.GetDataType()}}));
// infer var type of fill_one_op
fill_one_op->InferVarType(root_block);
root_block->AppendAllocatedOp(std::move(fill_one_op)); root_block->AppendAllocatedOp(std::move(fill_one_op));
size_t forward_op_num = root_block->OpSize(); size_t forward_op_num = root_block->OpSize();
size_t forward_block_num = program_desc.Size(); size_t forward_block_num = program_desc.Size();
...@@ -475,8 +481,7 @@ ParamGradInfoMap AppendBackward( ...@@ -475,8 +481,7 @@ ParamGradInfoMap AppendBackward(
std::unordered_map<std::string, GradVarInfo> retv; std::unordered_map<std::string, GradVarInfo> retv;
auto var = root_block->Var(fill_one_op_out); auto var = root_block->Var(fill_one_op_out);
// FIXME(qiao) infer the data type var->SetDataType(target.GetDataType());
var->SetDataType(framework::DataType::FP32);
var->SetShape(target.Shape()); var->SetShape(target.Shape());
auto& target_grad = retv[target.Name()]; auto& target_grad = retv[target.Name()];
target_grad.name_ = fill_one_op_out; target_grad.name_ = fill_one_op_out;
......
...@@ -21,6 +21,8 @@ ...@@ -21,6 +21,8 @@
#include "paddle/framework/var_desc.h" #include "paddle/framework/var_desc.h"
#include "paddle/operators/net_op.h" #include "paddle/operators/net_op.h"
USE_OP(fill_constant);
namespace paddle { namespace paddle {
namespace framework { namespace framework {
......
...@@ -120,6 +120,17 @@ BlockDesc *BlockDescBind::Proto() { ...@@ -120,6 +120,17 @@ BlockDesc *BlockDescBind::Proto() {
Flush(); Flush();
return desc_; return desc_;
} }
BlockDescBind::BlockDescBind(ProgramDescBind *prog, BlockDesc *desc)
: prog_(prog), desc_(desc), need_update_(false) {
for (const VarDesc &var_desc : desc_->vars()) {
vars_[var_desc.name()].reset(new VarDescBind(var_desc));
}
for (const OpDesc &op_desc : desc_->ops()) {
ops_.emplace_back(new OpDescBind(op_desc, prog));
}
}
BlockDescBind::BlockDescBind(const BlockDescBind &other, BlockDesc *desc, BlockDescBind::BlockDescBind(const BlockDescBind &other, BlockDesc *desc,
ProgramDescBind *prog) ProgramDescBind *prog)
: prog_(prog), desc_(desc) { : prog_(prog), desc_(desc) {
......
...@@ -36,8 +36,7 @@ class ProgramDescBind; ...@@ -36,8 +36,7 @@ class ProgramDescBind;
class BlockDescBind { class BlockDescBind {
public: public:
BlockDescBind(ProgramDescBind *prog, BlockDesc *desc) BlockDescBind(ProgramDescBind *prog, BlockDesc *desc);
: prog_(prog), desc_(desc), need_update_(false) {}
BlockDescBind(const BlockDescBind &other, BlockDesc *desc, BlockDescBind(const BlockDescBind &other, BlockDesc *desc,
ProgramDescBind *prog); ProgramDescBind *prog);
......
...@@ -15,6 +15,7 @@ ...@@ -15,6 +15,7 @@
#pragma once #pragma once
#include <typeindex> #include <typeindex>
#include "paddle/framework/framework.pb.h" #include "paddle/framework/framework.pb.h"
#include "paddle/platform/enforce.h"
namespace paddle { namespace paddle {
namespace framework { namespace framework {
...@@ -33,5 +34,25 @@ inline DataType ToDataType(std::type_index type) { ...@@ -33,5 +34,25 @@ inline DataType ToDataType(std::type_index type) {
} }
} }
template <typename Visitor>
inline void VisitDataType(DataType type, Visitor visitor) {
switch (type) {
case DataType::FP32:
visitor.template operator()<float>();
break;
case DataType::FP64:
visitor.template operator()<double>();
break;
case DataType::INT32:
visitor.template operator()<int>();
break;
case DataType::INT64:
visitor.template operator()<int64_t>();
break;
default:
PADDLE_THROW("Not supported");
}
}
} // namespace framework } // namespace framework
} // namespace paddle } // namespace paddle
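The new `VisitDataType` dispatches a templated functor on a runtime `DataType` tag, so callers can write type-generic code once. A minimal, runnable sketch of the same pattern, with a stand-in enum instead of the framework's `DataType`:

```cpp
#include <cstdint>
#include <iostream>
#include <stdexcept>

enum class DataType { FP32, FP64, INT32, INT64 };

template <typename Visitor>
void VisitDataType(DataType type, Visitor visitor) {
  switch (type) {
    case DataType::FP32:  visitor.template operator()<float>(); break;
    case DataType::FP64:  visitor.template operator()<double>(); break;
    case DataType::INT32: visitor.template operator()<int>(); break;
    case DataType::INT64: visitor.template operator()<int64_t>(); break;
    default: throw std::runtime_error("Not supported");
  }
}

// Example visitor: reports the byte width of the dispatched type.
struct SizeVisitor {
  template <typename T>
  void operator()() { std::cout << sizeof(T) << " bytes\n"; }
};

int main() {
  VisitDataType(DataType::INT64, SizeVisitor());  // prints "8 bytes"
}
```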
...@@ -195,6 +195,14 @@ std::vector<int64_t> vectorize(const DDim& ddim) { ...@@ -195,6 +195,14 @@ std::vector<int64_t> vectorize(const DDim& ddim) {
return result; return result;
} }
// NOTE: framework::vectorize converts dims to type int64_t,
// which cuDNN inputs (expecting plain int) cannot take.
std::vector<int> vectorize2int(const DDim& ddim) {
std::vector<int64_t> temp = vectorize(ddim);
std::vector<int> result(temp.begin(), temp.end());
return result;
}
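The narrowing itself is just the `std::vector<int>` range constructor converting each `int64_t` element; a small standalone illustration (the dim values here are made up):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

int main() {
  // 64-bit dims as produced by vectorize(); cuDNN-style APIs want int.
  std::vector<int64_t> dims64 = {32, 3, 224, 224};
  std::vector<int> dims32(dims64.begin(), dims64.end());
  assert(dims32.size() == 4 && dims32[3] == 224);
}
```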
struct ProductVisitor : public boost::static_visitor<int64_t> { struct ProductVisitor : public boost::static_visitor<int64_t> {
template <int D> template <int D>
int64_t operator()(const Dim<D>& dim) { int64_t operator()(const Dim<D>& dim) {
......
...@@ -93,6 +93,7 @@ int64_t get(const DDim& dim, int idx); ...@@ -93,6 +93,7 @@ int64_t get(const DDim& dim, int idx);
void set(DDim& dim, int idx, int val); void set(DDim& dim, int idx, int val);
std::vector<int64_t> vectorize(const DDim& ddim); std::vector<int64_t> vectorize(const DDim& ddim);
std::vector<int> vectorize2int(const DDim& ddim);
int64_t product(const DDim& ddim); int64_t product(const DDim& ddim);
......
...@@ -28,7 +28,8 @@ enum OpInfoFillType { ...@@ -28,7 +28,8 @@ enum OpInfoFillType {
kOperator = 0, kOperator = 0,
kOpProtoAndCheckerMaker = 1, kOpProtoAndCheckerMaker = 1,
kGradOpDescMaker = 2, kGradOpDescMaker = 2,
kVarTypeInference = 3 kVarTypeInference = 3,
kShapeInference = 4
}; };
template <typename T> template <typename T>
...@@ -42,7 +43,10 @@ struct OpInfoFillTypeID { ...@@ -42,7 +43,10 @@ struct OpInfoFillTypeID {
? kGradOpDescMaker ? kGradOpDescMaker
: (std::is_base_of<VarTypeInference, T>::value : (std::is_base_of<VarTypeInference, T>::value
? kVarTypeInference ? kVarTypeInference
: static_cast<OpInfoFillType>(-1)))); : (std::is_base_of<InferShapeBase, T>::value
? kShapeInference
: static_cast<OpInfoFillType>(
-1)))));
} }
}; };
...@@ -121,6 +125,16 @@ struct OpInfoFiller<T, kVarTypeInference> { ...@@ -121,6 +125,16 @@ struct OpInfoFiller<T, kVarTypeInference> {
} }
}; };
template <typename T>
struct OpInfoFiller<T, kShapeInference> {
void operator()(const char* op_type, OpInfo* info) const {
info->infer_shape_ = [](InferShapeContext* ctx) {
T inference;
inference(ctx);
};
}
};
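The new `OpInfoFiller<T, kShapeInference>` specialization type-erases a stateless `InferShapeBase` functor into the `infer_shape_` callback stored in `OpInfo`. A minimal sketch of that registration pattern follows; all names here are stand-ins.

```cpp
#include <functional>
#include <iostream>

struct InferShapeContext { int dim = 0; };

// Stand-in for an InferShapeBase-style functor.
struct MyInferShape {
  void operator()(InferShapeContext* ctx) const { ctx->dim = 42; }
};

struct OpInfo { std::function<void(InferShapeContext*)> infer_shape_; };

// Mirrors OpInfoFiller<T, kShapeInference>: default-construct T per call.
template <typename T>
void Fill(OpInfo* info) {
  info->infer_shape_ = [](InferShapeContext* ctx) {
    T inference;
    inference(ctx);
  };
}

int main() {
  OpInfo info;
  Fill<MyInferShape>(&info);
  InferShapeContext ctx;
  info.infer_shape_(&ctx);
  std::cout << ctx.dim << "\n";  // prints 42
}
```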
} // namespace details } // namespace details
} // namespace framework } // namespace framework
......
...@@ -20,6 +20,7 @@ limitations under the License. */ ...@@ -20,6 +20,7 @@ limitations under the License. */
#include <set> #include <set>
#include <vector> #include <vector>
#include "paddle/framework/feed_fetch_type.h"
#include "paddle/framework/lod_tensor.h" #include "paddle/framework/lod_tensor.h"
#include "paddle/framework/op_registry.h" #include "paddle/framework/op_registry.h"
#include "paddle/framework/scope.h" #include "paddle/framework/scope.h"
...@@ -56,6 +57,22 @@ Executor::~Executor() { ...@@ -56,6 +57,22 @@ Executor::~Executor() {
} }
} }
static void CreateTensor(Variable* var, VarDesc::VarType var_type) {
if (var_type == VarDesc::LOD_TENSOR) {
var->GetMutable<LoDTensor>();
} else if (var_type == VarDesc::SELECTED_ROWS) {
var->GetMutable<SelectedRows>();
} else if (var_type == VarDesc::FEED_MINIBATCH) {
var->GetMutable<FeedFetchList>();
} else if (var_type == VarDesc::FETCH_LIST) {
var->GetMutable<FeedFetchList>();
} else {
PADDLE_THROW(
"Variable type must be "
"LoDTensor/SelectedRows/FEED_MINIBATCH/FETCH_LIST.");
}
}
void Executor::Run(const ProgramDesc& pdesc, Scope* scope, int block_id) { void Executor::Run(const ProgramDesc& pdesc, Scope* scope, int block_id) {
// TODO(tonyyang-svail): // TODO(tonyyang-svail):
// - only runs on the first device (i.e. no interdevice communication) // - only runs on the first device (i.e. no interdevice communication)
...@@ -69,10 +86,12 @@ void Executor::Run(const ProgramDesc& pdesc, Scope* scope, int block_id) { ...@@ -69,10 +86,12 @@ void Executor::Run(const ProgramDesc& pdesc, Scope* scope, int block_id) {
for (auto& var : block.vars()) { for (auto& var : block.vars()) {
if (var.persistable()) { if (var.persistable()) {
auto* ptr = scope->Var(var.name()); auto* ptr = scope->Var(var.name());
CreateTensor(ptr, var.type());
VLOG(3) << "Create Variable " << var.name() VLOG(3) << "Create Variable " << var.name()
<< " global, which pointer is " << ptr; << " global, which pointer is " << ptr;
} else { } else {
auto* ptr = local_scope.Var(var.name()); auto* ptr = local_scope.Var(var.name());
CreateTensor(ptr, var.type());
VLOG(3) << "Create Variable " << var.name() VLOG(3) << "Create Variable " << var.name()
<< " locally, which pointer is " << ptr; << " locally, which pointer is " << ptr;
} }
......
...@@ -115,6 +115,7 @@ message VarDesc { ...@@ -115,6 +115,7 @@ message VarDesc {
SELECTED_ROWS = 2; SELECTED_ROWS = 2;
FEED_MINIBATCH = 3; FEED_MINIBATCH = 3;
FETCH_LIST = 4; FETCH_LIST = 4;
STEP_SCOPES = 5;
} }
required string name = 1; required string name = 1;
required VarType type = 2; required VarType type = 2;
......
...@@ -14,6 +14,14 @@ ...@@ -14,6 +14,14 @@
#include "paddle/framework/lod_tensor.h" #include "paddle/framework/lod_tensor.h"
#include "paddle/memory/memcpy.h"
#include "paddle/memory/memory.h"
#include <stdint.h>
#include <string.h>
#include <algorithm>
#include <iterator>
#include <glog/logging.h> #include <glog/logging.h>
namespace paddle { namespace paddle {
...@@ -97,6 +105,15 @@ size_t LoDTensor::NumElements(size_t level, size_t idx) const { ...@@ -97,6 +105,15 @@ size_t LoDTensor::NumElements(size_t level, size_t idx) const {
return lod_[level][idx + 1] - lod_[level][idx]; return lod_[level][idx + 1] - lod_[level][idx];
} }
size_t LoDTensor::NumInstancesInElement(size_t level, size_t idx) const {
PADDLE_ENFORCE_LT(level, NumLevels());
PADDLE_ENFORCE_LT(idx, NumElements(level));
auto abs_lod = ToAbsOffset(lod());
size_t begin = abs_lod[level][idx];
size_t end = abs_lod[level][idx + 1];
return end - begin;
}
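The new `NumInstancesInElement` converts the (possibly relative) LoD to absolute offsets and returns the width of one element at a given level. A standalone sketch of that offset arithmetic, with a made-up absolute LoD:

```cpp
#include <cassert>
#include <vector>

int main() {
  // abs_lod[0] = {0, 5, 12}: element 0 spans instances [0, 5),
  // element 1 spans instances [5, 12) -> 7 instances.
  std::vector<std::vector<size_t>> abs_lod = {{0, 5, 12}};
  size_t level = 0, idx = 1;
  size_t begin = abs_lod[level][idx];
  size_t end = abs_lod[level][idx + 1];
  assert(end - begin == 7);
}
```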
void LoDTensor::ShrinkLevels(size_t level_begin, size_t level_end) { void LoDTensor::ShrinkLevels(size_t level_begin, size_t level_end) {
auto new_lod = framework::SliceLevels(lod_, level_begin, level_end); auto new_lod = framework::SliceLevels(lod_, level_begin, level_end);
lod_ = new_lod; lod_ = new_lod;
...@@ -108,9 +125,15 @@ void LoDTensor::ShrinkInLevel(size_t level, size_t elem_begin, ...@@ -108,9 +125,15 @@ void LoDTensor::ShrinkInLevel(size_t level, size_t elem_begin,
PADDLE_ENFORCE_LT(elem_begin, NumElements(level)); PADDLE_ENFORCE_LT(elem_begin, NumElements(level));
PADDLE_ENFORCE_LT(elem_end, NumElements(level) + 1); PADDLE_ENFORCE_LT(elem_end, NumElements(level) + 1);
auto abs_lod = framework::ToAbsOffset(lod());
auto new_lod = framework::SliceInLevel(lod_, level, elem_begin, elem_end); auto new_lod = framework::SliceInLevel(lod_, level, elem_begin, elem_end);
lod_ = new_lod; lod_ = new_lod;
}
// slice the underlying tensor
size_t begin = abs_lod[level][elem_begin];
size_t end = abs_lod[level][elem_end];
PADDLE_ENFORCE_LT(begin, end, "Cannot shrink, the result tensor is empty.");
ShareDataWith(Slice(begin, end));
}
} // namespace framework } // namespace framework
} // namespace paddle } // namespace paddle
...@@ -25,6 +25,7 @@ ...@@ -25,6 +25,7 @@
#include "paddle/framework/ddim.h" #include "paddle/framework/ddim.h"
#include "paddle/framework/tensor.h" #include "paddle/framework/tensor.h"
#include "paddle/platform/enforce.h" #include "paddle/platform/enforce.h"
#include "paddle/platform/place.h"
namespace paddle { namespace paddle {
namespace framework { namespace framework {
...@@ -84,7 +85,9 @@ class LoDTensor : public Tensor { ...@@ -84,7 +85,9 @@ class LoDTensor : public Tensor {
void set_lod(const LoD& lod) { lod_ = lod; } void set_lod(const LoD& lod) { lod_ = lod; }
LoD lod() const { return lod_; } const LoD& lod() const { return lod_; }
LoD* mutable_lod() { return &lod_; }
/* /*
* Get the start offset and end offset of an element from LoD. * Get the start offset and end offset of an element from LoD.
...@@ -121,6 +124,12 @@ class LoDTensor : public Tensor { ...@@ -121,6 +124,12 @@ class LoDTensor : public Tensor {
*/ */
size_t NumElements(size_t level, size_t idx) const; size_t NumElements(size_t level, size_t idx) const;
/*
* Get the number of instances in the underlying tensor in the `idx`-th
* element.
*/
size_t NumInstancesInElement(size_t level, size_t idx) const;
/* /*
* Shrink levels[level_begin:level_end] * Shrink levels[level_begin:level_end]
*/ */
...@@ -135,5 +144,42 @@ class LoDTensor : public Tensor { ...@@ -135,5 +144,42 @@ class LoDTensor : public Tensor {
private: private:
LoD lod_; LoD lod_;
}; };
/*
* Expand the `source` to fit the LoD of `lod`. For example, a `source`
* LoDTensor is
* - LoD: [0, 2]
* - tensor: [a0, a1]
* a `lod` is
* - LoD: [0 3 5]
* returns a new LoDTensor
* - [a0 a0 a0 a1 a1]
*/
template <typename T>
LoDTensor LodExpand(const LoDTensor& source, const LoD& lod, size_t level,
const platform::Place& place) {
LoD abs_lod = ToAbsOffset(lod);
const auto& lod_level = lod[level];
size_t num_instances = source.dims()[0];
// new tensor
LoDTensor tensor;
tensor.set_lod(lod);
auto dims = source.dims();
dims[0] = lod_level.back();
tensor.Resize(dims);
tensor.mutable_data<T>(place);
PADDLE_ENFORCE_EQ(num_instances, lod_level.size() - 1);
for (size_t ins = 0; ins < num_instances; ins++) {
for (size_t elem = lod_level[ins]; elem < lod_level[ins + 1]; elem++) {
tensor.Slice(elem, elem + 1)
.CopyFrom(source.Slice(ins, ins + 1), platform::CPUPlace(),
platform::CPUDeviceContext());
}
}
return tensor;
}
} // namespace framework } // namespace framework
} // namespace paddle } // namespace paddle
...@@ -17,10 +17,13 @@ ...@@ -17,10 +17,13 @@
#include <gtest/gtest.h> #include <gtest/gtest.h>
#include <algorithm> #include <algorithm>
#include <memory> #include <memory>
#include <vector>
namespace paddle { namespace paddle {
namespace framework { namespace framework {
const int kLodTensorSize = 20 * 128;
class LoDTensorTester : public ::testing::Test { class LoDTensorTester : public ::testing::Test {
public: public:
virtual void SetUp() override { virtual void SetUp() override {
...@@ -38,7 +41,10 @@ class LoDTensorTester : public ::testing::Test { ...@@ -38,7 +41,10 @@ class LoDTensorTester : public ::testing::Test {
lod_tensor_.Resize({20 /*batch size*/, 128 /*dim*/}); lod_tensor_.Resize({20 /*batch size*/, 128 /*dim*/});
// malloc memory // malloc memory
lod_tensor_.mutable_data<float>(place); float* dst_ptr = lod_tensor_.mutable_data<float>(place);
for (int i = 0; i < kLodTensorSize; ++i) {
dst_ptr[i] = i;
}
lod_tensor_.set_lod(lod); lod_tensor_.set_lod(lod);
} }
...@@ -86,11 +92,14 @@ TEST_F(LoDTensorTester, ShrinkInLevel) { ...@@ -86,11 +92,14 @@ TEST_F(LoDTensorTester, ShrinkInLevel) {
size_t level = 0; size_t level = 0;
LoDTensor new_lod_tensor = lod_tensor_; LoDTensor new_lod_tensor = lod_tensor_;
new_lod_tensor.ShrinkInLevel(level, 0, 1); new_lod_tensor.ShrinkInLevel(level, 0, 1);
EXPECT_EQ(new_lod_tensor.NumLevels(), 3UL); ASSERT_EQ(new_lod_tensor.NumLevels(), 3UL);
EXPECT_EQ(new_lod_tensor.NumElements(0), 1UL); ASSERT_EQ(new_lod_tensor.NumElements(0), 1UL);
EXPECT_EQ(new_lod_tensor.NumElements(1), 2UL); ASSERT_EQ(new_lod_tensor.NumElements(1), 2UL);
EXPECT_EQ(new_lod_tensor.NumElements(2), 5UL); ASSERT_EQ(new_lod_tensor.NumElements(2), 5UL);
ASSERT_EQ(new_lod_tensor.data<float>(), lod_tensor_.data<float>()); ASSERT_EQ(new_lod_tensor.dims()[0], 12);
for (int i = 0; i < 12 * 128; i++) {
ASSERT_EQ(new_lod_tensor.data<float>()[i], i);
}
level = 1; level = 1;
new_lod_tensor = lod_tensor_; new_lod_tensor = lod_tensor_;
...@@ -98,7 +107,41 @@ TEST_F(LoDTensorTester, ShrinkInLevel) { ...@@ -98,7 +107,41 @@ TEST_F(LoDTensorTester, ShrinkInLevel) {
ASSERT_EQ(new_lod_tensor.NumLevels(), 2UL); ASSERT_EQ(new_lod_tensor.NumLevels(), 2UL);
ASSERT_EQ(new_lod_tensor.NumElements(0), 1UL); ASSERT_EQ(new_lod_tensor.NumElements(0), 1UL);
ASSERT_EQ(new_lod_tensor.NumElements(1), 3UL); ASSERT_EQ(new_lod_tensor.NumElements(1), 3UL);
ASSERT_EQ(new_lod_tensor.data<float>(), lod_tensor_.data<float>()); ASSERT_EQ(new_lod_tensor.dims()[0], 7);
for (int i = 5 * 128; i < 12 * 128; i++) {
ASSERT_EQ(new_lod_tensor.data<float>()[i - 5 * 128], i);
}
LoDTensor t1;
t1.set_lod(lod_tensor_.lod());
t1.ShareDataWith(lod_tensor_);
LoDTensor t2;
t2.set_lod(lod_tensor_.lod());
t2.ShareDataWith(lod_tensor_);
t1.ShrinkInLevel(0, 1, 2);
t2.ShrinkInLevel(0, 0, 1);
EXPECT_NE(t1.data<float>(), t2.data<float>());
EXPECT_NE(t1.data<float>(), lod_tensor_.data<float>());
}
TEST(LodExpand, test) {
LoD lod{{0, 2}};
LoDTensor tensor;
tensor.set_lod(lod);
tensor.Resize({2, 1});
tensor.mutable_data<float>(platform::CPUPlace());
tensor.data<float>()[0] = 0;
tensor.data<float>()[1] = 1;
LoD target;
target.emplace_back(std::vector<size_t>{0, 3, 5});
auto new_tensor = LodExpand<float>(tensor, target, 0UL, platform::CPUPlace());
std::vector<int> result{{0, 0, 0, 1, 1}};
for (size_t i = 0; i < 5; i++) {
ASSERT_EQ(new_tensor.data<float>()[i], result[i]);
}
} }
} // namespace framework } // namespace framework
......
...@@ -47,4 +47,4 @@ TEST(LoDTensor, LoDInGPU) { ...@@ -47,4 +47,4 @@ TEST(LoDTensor, LoDInGPU) {
for (size_t i = 0; i < src_lod[0].size(); ++i) { for (size_t i = 0; i < src_lod[0].size(); ++i) {
CHECK_EQ(lod[0].data()[i], src_lod[0].data()[i] * 2); CHECK_EQ(lod[0].data()[i], src_lod[0].data()[i] * 2);
} }
} }
\ No newline at end of file
...@@ -14,26 +14,97 @@ limitations under the License. */ ...@@ -14,26 +14,97 @@ limitations under the License. */
#include "paddle/framework/op_desc.h" #include "paddle/framework/op_desc.h"
#include <functional> #include <functional>
#include <mutex>
#include <unordered_map> #include <unordered_map>
#include "glog/logging.h"
#include "paddle/framework/block_desc.h" #include "paddle/framework/block_desc.h"
#include "paddle/framework/operator.h" #include "paddle/framework/operator.h"
#include "paddle/framework/program_desc.h"
#include "paddle/framework/shape_inference.h"
namespace paddle { namespace paddle {
namespace framework { namespace framework {
class OpDescBind;
class BlockDescBind;
class CompileTimeInferShapeContext : public InferShapeContext {
public:
CompileTimeInferShapeContext(const OpDescBind &op,
const BlockDescBind &block);
bool HasInput(const std::string &name) const override;
bool HasOutput(const std::string &name) const override;
bool HasInputs(const std::string &name) const override;
bool HasOutputs(const std::string &name) const override;
DDim GetInputDim(const std::string &name) const override;
void SetOutputDim(const std::string &name, const DDim &dim) override;
AttrReader Attrs() const override;
const std::vector<std::string> &Inputs(
const std::string &name) const override;
const std::vector<std::string> &Outputs(
const std::string &name) const override;
private:
DDim GetDim(const std::string &name) const override;
void SetDim(const std::string &name, const DDim &dim) override;
const OpDescBind &op_;
const BlockDescBind &block_;
};
OpDescBind::OpDescBind(const std::string &type, const VariableNameMap &inputs, OpDescBind::OpDescBind(const std::string &type, const VariableNameMap &inputs,
const VariableNameMap &outputs, const VariableNameMap &outputs,
const AttributeMap &attrs) { const AttributeMap &attrs) {
op_desc_.set_type(type); desc_.set_type(type);
inputs_ = inputs; inputs_ = inputs;
outputs_ = outputs; outputs_ = outputs;
attrs_ = attrs; attrs_ = attrs;
need_update_ = true; need_update_ = true;
} }
OpDescBind::OpDescBind(const OpDesc &desc, ProgramDescBind *prog)
: desc_(desc), need_update_(false) {
// restore inputs_
int input_size = desc_.inputs_size();
for (int i = 0; i < input_size; ++i) {
const OpDesc::Var &var = desc_.inputs(i);
std::vector<std::string> &args = inputs_[var.parameter()];
int argu_size = var.arguments_size();
args.reserve(argu_size);
for (int j = 0; j < argu_size; ++j) {
args.push_back(var.arguments(j));
}
}
// restore outputs_
int output_size = desc_.outputs_size();
for (int i = 0; i < output_size; ++i) {
const OpDesc::Var &var = desc_.outputs(i);
std::vector<std::string> &args = outputs_[var.parameter()];
int argu_size = var.arguments_size();
args.reserve(argu_size);
for (int j = 0; j < argu_size; ++j) {
args.push_back(var.arguments(j));
}
}
// restore attrs_
for (const OpDesc::Attr &attr : desc_.attrs()) {
std::string attr_name = attr.name();
attrs_[attr_name] = GetAttrValue(attr, prog->Proto());
}
}
OpDesc *OpDescBind::Proto() { OpDesc *OpDescBind::Proto() {
Flush(); Flush();
return &op_desc_; return &desc_;
} }
const std::vector<std::string> &OpDescBind::Input( const std::vector<std::string> &OpDescBind::Input(
...@@ -167,23 +238,23 @@ struct SetAttrDescVisitor : public boost::static_visitor<void> { ...@@ -167,23 +238,23 @@ struct SetAttrDescVisitor : public boost::static_visitor<void> {
void OpDescBind::Flush() { void OpDescBind::Flush() {
if (need_update_) { if (need_update_) {
this->op_desc_.mutable_inputs()->Clear(); this->desc_.mutable_inputs()->Clear();
for (auto &ipt : inputs_) { for (auto &ipt : inputs_) {
auto *input = op_desc_.add_inputs(); auto *input = desc_.add_inputs();
input->set_parameter(ipt.first); input->set_parameter(ipt.first);
VectorToRepeated(ipt.second, input->mutable_arguments()); VectorToRepeated(ipt.second, input->mutable_arguments());
} }
this->op_desc_.mutable_outputs()->Clear(); this->desc_.mutable_outputs()->Clear();
for (auto &opt : outputs_) { for (auto &opt : outputs_) {
auto *output = op_desc_.add_outputs(); auto *output = desc_.add_outputs();
output->set_parameter(opt.first); output->set_parameter(opt.first);
VectorToRepeated(opt.second, output->mutable_arguments()); VectorToRepeated(opt.second, output->mutable_arguments());
} }
this->op_desc_.mutable_attrs()->Clear(); this->desc_.mutable_attrs()->Clear();
for (auto &attr : attrs_) { for (auto &attr : attrs_) {
auto *attr_desc = op_desc_.add_attrs(); auto *attr_desc = desc_.add_attrs();
attr_desc->set_name(attr.first); attr_desc->set_name(attr.first);
attr_desc->set_type( attr_desc->set_type(
static_cast<framework::AttrType>(attr.second.which() - 1)); static_cast<framework::AttrType>(attr.second.which() - 1));
...@@ -195,26 +266,26 @@ void OpDescBind::Flush() { ...@@ -195,26 +266,26 @@ void OpDescBind::Flush() {
} }
} }
using InferShapeFuncMap = static std::once_flag init_infer_shape_funcs;
std::unordered_map<std::string /*op_type*/,
std::function<void(InferShapeContext *)>>;
static InferShapeFuncMap &InferShapeFuncs() { static void InitInferShapeFuncs() {
static InferShapeFuncMap *g_map = nullptr; std::call_once(init_infer_shape_funcs, [] {
if (g_map == nullptr) { auto &map = OpInfoMap::Instance();
g_map = new InferShapeFuncMap(); auto &info_map = *map.mutable_map();
auto &info_map = OpInfoMap::Instance();
// all registered kernels for (auto &kern_pair : OperatorWithKernel::AllOpKernels()) {
for (auto &pair : OperatorWithKernel::AllOpKernels()) { auto op_type = kern_pair.first;
auto &info = info_map.Get(pair.first); auto &op_info = info_map.at(op_type);
// use empty type here to avoid runtime checks.
auto op = auto op =
static_cast<OperatorWithKernel *>(info.Creator()("", {}, {}, {})); static_cast<OperatorWithKernel *>(op_info.Creator()("", {}, {}, {}));
g_map->insert( if (op_info.infer_shape_) { // infer_shape has been registered.
{pair.first, [op](InferShapeContext *ctx) { op->InferShape(ctx); }}); continue;
}
op_info.infer_shape_ = [op](InferShapeContext *ctx) {
op->InferShape(ctx);
};
} }
} });
return *g_map;
} }
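The rewrite replaces the lazily built local map with `std::call_once`, so the `infer_shape_` slots in the global `OpInfoMap` are populated exactly once even under concurrent callers. A standalone sketch of the idiom:

```cpp
#include <iostream>
#include <map>
#include <mutex>
#include <string>

static std::once_flag init_flag;
static std::map<std::string, int> registry;

void InitRegistry() {
  // The lambda body runs exactly once, no matter how many threads call this.
  std::call_once(init_flag, [] {
    registry["fill_constant"] = 1;  // hypothetical entry
    std::cout << "registry initialized\n";
  });
}

int main() {
  InitRegistry();
  InitRegistry();  // no-op: the second call prints nothing
}
```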
void OpDescBind::CheckAttrs() { void OpDescBind::CheckAttrs() {
...@@ -230,13 +301,13 @@ void OpDescBind::CheckAttrs() { ...@@ -230,13 +301,13 @@ void OpDescBind::CheckAttrs() {
} }
void OpDescBind::InferShape(const BlockDescBind &block) const { void OpDescBind::InferShape(const BlockDescBind &block) const {
auto &funcs = InferShapeFuncs(); VLOG(3) << "CompileTime infer shape on " << Type();
auto it = funcs.find(this->Type()); InitInferShapeFuncs();
if (it == funcs.end()) { auto &infer_shape = OpInfoMap::Instance().Get(this->Type()).infer_shape_;
PADDLE_THROW("Operator %s has not been registered", this->Type()); PADDLE_ENFORCE(static_cast<bool>(infer_shape),
} "%s's infer_shape has not been registered", this->Type());
CompileTimeInferShapeContext ctx(*this, block); CompileTimeInferShapeContext ctx(*this, block);
it->second(&ctx); infer_shape(&ctx);
} }
void OpDescBind::InferVarType(BlockDescBind *block) const { void OpDescBind::InferVarType(BlockDescBind *block) const {
...@@ -253,5 +324,97 @@ void OpDescBind::InferVarType(BlockDescBind *block) const { ...@@ -253,5 +324,97 @@ void OpDescBind::InferVarType(BlockDescBind *block) const {
} }
} }
CompileTimeInferShapeContext::CompileTimeInferShapeContext(
const OpDescBind &op, const BlockDescBind &block)
: op_(op), block_(block) {}
bool CompileTimeInferShapeContext::HasInput(const std::string &name) const {
const std::vector<std::string> &input_names = op_.Input(name);
auto length = input_names.size();
if (length == 0) {
return false;
}
PADDLE_ENFORCE_EQ(length, 1UL,
"Input(%s) should have only one value, "
"but it have %d now",
name, length);
return block_.HasVarRecursive(input_names[0]);
}
bool CompileTimeInferShapeContext::HasOutput(const std::string &name) const {
const std::vector<std::string> &output_names = op_.Output(name);
auto length = output_names.size();
if (length == 0) {
return false;
}
PADDLE_ENFORCE_EQ(length, 1UL,
"Output(%s) should have only one value, "
"but it have %d now",
name, length);
return block_.HasVarRecursive(output_names[0]);
}
bool CompileTimeInferShapeContext::HasInputs(const std::string &name) const {
const std::vector<std::string> &input_names = op_.Input(name);
if (input_names.empty()) {
return false;
}
for (auto &input : input_names) {
if (!block_.HasVarRecursive(input)) return false;
}
return true;
}
bool CompileTimeInferShapeContext::HasOutputs(const std::string &name) const {
const std::vector<std::string> &output_names = op_.Output(name);
if (output_names.empty()) {
return false;
}
for (auto &output : output_names) {
if (!block_.HasVarRecursive(output)) return false;
}
return true;
}
DDim CompileTimeInferShapeContext::GetInputDim(const std::string &name) const {
std::vector<DDim> ddims = GetInputsDim(name);
auto length = ddims.size();
PADDLE_ENFORCE_EQ(length, 1UL,
"Input(%s) should have 1 value, "
"but it has %d now",
name, length);
return ddims[0];
}
void CompileTimeInferShapeContext::SetOutputDim(const std::string &name,
const DDim &dim) {
SetOutputsDim(name, {dim});
}
AttrReader CompileTimeInferShapeContext::Attrs() const {
return AttrReader(op_.GetAttrMap());
}
const std::vector<std::string> &CompileTimeInferShapeContext::Inputs(
const std::string &name) const {
return op_.Input(name);
}
const std::vector<std::string> &CompileTimeInferShapeContext::Outputs(
const std::string &name) const {
return op_.Output(name);
}
DDim CompileTimeInferShapeContext::GetDim(const std::string &name) const {
auto var = block_.FindVarRecursive(name);
PADDLE_ENFORCE(var != nullptr, "Cannot find variable %s", name);
return framework::make_ddim(var->Shape());
}
void CompileTimeInferShapeContext::SetDim(const std::string &name,
const DDim &dim) {
block_.FindVarRecursive(name)->SetShape(framework::vectorize(dim));
}
} // namespace framework } // namespace framework
} // namespace paddle } // namespace paddle
...@@ -24,6 +24,7 @@ namespace paddle { ...@@ -24,6 +24,7 @@ namespace paddle {
namespace framework { namespace framework {
class BlockDescBind; class BlockDescBind;
class ProgramDescBind;
class OpDescBind { class OpDescBind {
public: public:
...@@ -32,11 +33,13 @@ class OpDescBind { ...@@ -32,11 +33,13 @@ class OpDescBind {
OpDescBind(const std::string &type, const VariableNameMap &inputs, OpDescBind(const std::string &type, const VariableNameMap &inputs,
const VariableNameMap &outputs, const AttributeMap &attrs); const VariableNameMap &outputs, const AttributeMap &attrs);
OpDescBind(const OpDesc &desc, ProgramDescBind *prog);
OpDesc *Proto(); OpDesc *Proto();
std::string Type() const { return op_desc_.type(); } std::string Type() const { return desc_.type(); }
void SetType(const std::string &type) { op_desc_.set_type(type); } void SetType(const std::string &type) { desc_.set_type(type); }
const std::vector<std::string> &Input(const std::string &name) const; const std::vector<std::string> &Input(const std::string &name) const;
...@@ -104,6 +107,8 @@ class OpDescBind { ...@@ -104,6 +107,8 @@ class OpDescBind {
void InferVarType(BlockDescBind *block) const; void InferVarType(BlockDescBind *block) const;
void MarkAsTarget() { desc_.set_is_target(true); }
void Flush(); void Flush();
private: private:
...@@ -117,7 +122,7 @@ class OpDescBind { ...@@ -117,7 +122,7 @@ class OpDescBind {
return ret_val; return ret_val;
} }
OpDesc op_desc_; OpDesc desc_;
VariableNameMap inputs_; VariableNameMap inputs_;
VariableNameMap outputs_; VariableNameMap outputs_;
AttributeMap attrs_; AttributeMap attrs_;
......
...@@ -25,12 +25,19 @@ ...@@ -25,12 +25,19 @@
namespace paddle { namespace paddle {
namespace framework { namespace framework {
class InferShapeBase {
public:
virtual ~InferShapeBase() = default;
virtual void operator()(InferShapeContext*) const = 0;
};
struct OpInfo { struct OpInfo {
OpCreator creator_; OpCreator creator_;
GradOpMakerFN grad_op_maker_; GradOpMakerFN grad_op_maker_;
OpProto* proto_{nullptr}; OpProto* proto_{nullptr};
OpAttrChecker* checker_{nullptr}; OpAttrChecker* checker_{nullptr};
InferVarTypeFN infer_var_type_; InferVarTypeFN infer_var_type_;
InferShapeFN infer_shape_;
bool HasOpProtoAndChecker() const { bool HasOpProtoAndChecker() const {
return proto_ != nullptr && checker_ != nullptr; return proto_ != nullptr && checker_ != nullptr;
...@@ -87,16 +94,13 @@ class OpInfoMap { ...@@ -87,16 +94,13 @@ class OpInfoMap {
} }
} }
template <typename Callback> const std::unordered_map<std::string, OpInfo>& map() const { return map_; }
void IterAllInfo(Callback callback) {
for (auto& it : map_) { std::unordered_map<std::string, OpInfo>* mutable_map() { return &map_; }
callback(it.first, it.second);
}
}
private: private:
OpInfoMap() = default; OpInfoMap() = default;
std::unordered_map<std::string, const OpInfo> map_; std::unordered_map<std::string, OpInfo> map_;
DISABLE_COPY_AND_ASSIGN(OpInfoMap); DISABLE_COPY_AND_ASSIGN(OpInfoMap);
}; };
......
...@@ -29,6 +29,7 @@ limitations under the License. */ ...@@ -29,6 +29,7 @@ limitations under the License. */
#include "paddle/framework/op_desc.h" #include "paddle/framework/op_desc.h"
#include "paddle/framework/operator.h" #include "paddle/framework/operator.h"
#include "paddle/framework/scope.h" #include "paddle/framework/scope.h"
#include "paddle/framework/shape_inference.h"
namespace paddle { namespace paddle {
namespace framework { namespace framework {
...@@ -161,6 +162,10 @@ class OpKernelRegistrar : public Registrar { ...@@ -161,6 +162,10 @@ class OpKernelRegistrar : public Registrar {
REGISTER_OPERATOR(op_type, op_class, _GradOpDescMaker_##grad_op_type##_, \ REGISTER_OPERATOR(op_type, op_class, _GradOpDescMaker_##grad_op_type##_, \
op_maker_class); op_maker_class);
#define REGISTER_OP_WITH_KERNEL(op_type, ...) \
REGISTER_OPERATOR(op_type, ::paddle::framework::OperatorWithKernel, \
##__VA_ARGS__)
#define REGISTER_OP_WITHOUT_GRADIENT(op_type, op_class, op_maker_class) \ #define REGISTER_OP_WITHOUT_GRADIENT(op_type, op_class, op_maker_class) \
REGISTER_OPERATOR(op_type, op_class, op_maker_class) REGISTER_OPERATOR(op_type, op_class, op_maker_class)
...@@ -223,6 +228,10 @@ class OpKernelRegistrar : public Registrar { ...@@ -223,6 +228,10 @@ class OpKernelRegistrar : public Registrar {
USE_OP_ITSELF(op_type); \ USE_OP_ITSELF(op_type); \
USE_OP_DEVICE_KERNEL(op_type, CPU); USE_OP_DEVICE_KERNEL(op_type, CPU);
#define USE_GPU_ONLY_OP(op_type) \
USE_OP_ITSELF(op_type); \
USE_OP_DEVICE_KERNEL(op_type, GPU)
#define USE_OP(op_type) \ #define USE_OP(op_type) \
USE_OP_ITSELF(op_type); \ USE_OP_ITSELF(op_type); \
USE_OP_KERNEL(op_type) USE_OP_KERNEL(op_type)
......
...@@ -15,6 +15,7 @@ limitations under the License. */ ...@@ -15,6 +15,7 @@ limitations under the License. */
#include "paddle/framework/operator.h" #include "paddle/framework/operator.h"
#include <algorithm> #include <algorithm>
#include <atomic> #include <atomic>
#include "paddle/framework/shape_inference.h"
namespace paddle { namespace paddle {
namespace framework { namespace framework {
...@@ -33,24 +34,6 @@ ExecutionContext::GetEigenDevice<platform::GPUPlace, Eigen::GpuDevice>() const { ...@@ -33,24 +34,6 @@ ExecutionContext::GetEigenDevice<platform::GPUPlace, Eigen::GpuDevice>() const {
} }
#endif #endif
const Tensor* GetTensorFromVar(const Variable* var) {
if (var->IsType<LoDTensor>()) {
return &var->Get<LoDTensor>();
}
PADDLE_ENFORCE(var->IsType<Tensor>(),
"The Input must be LoDTensor or Tensor.");
return &var->Get<Tensor>();
}
Tensor* GetTensorFromVar(Variable* var) {
if (var->IsType<LoDTensor>()) {
return var->GetMutable<LoDTensor>();
}
PADDLE_ENFORCE(var->IsType<Tensor>(),
"The Input must be LoDTensor or Tensor.");
return var->GetMutable<Tensor>();
}
std::string OperatorBase::Input(const std::string& name) const { std::string OperatorBase::Input(const std::string& name) const {
auto& ins = Inputs(name); auto& ins = Inputs(name);
PADDLE_ENFORCE_LE(ins.size(), 1UL, PADDLE_ENFORCE_LE(ins.size(), 1UL,
...@@ -204,6 +187,30 @@ void OperatorBase::GenerateTemporaryNames() { ...@@ -204,6 +187,30 @@ void OperatorBase::GenerateTemporaryNames() {
} }
} }
static const Tensor* GetTensorFromVar(const Variable* var) {
const Tensor* t = nullptr;
if (var->IsType<LoDTensor>()) {
t = &(var->Get<LoDTensor>());
} else if (var->IsType<SelectedRows>()) {
t = &(var->Get<SelectedRows>().value());
} else {
PADDLE_THROW("Variable type must be LoDTensor/SelectedRows.");
}
return t;
}
static Tensor* GetMutableTensorFromVar(Variable* var) {
Tensor* t = nullptr;
if (var->IsType<LoDTensor>()) {
t = var->GetMutable<LoDTensor>();
} else if (var->IsType<SelectedRows>()) {
t = var->GetMutable<SelectedRows>()->mutable_value();
} else {
PADDLE_THROW("Variable type must be LoDTensor/SelectedRows.");
}
return t;
}
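Both helpers dispatch on the concrete type held by a `Variable` and surface a tensor view either way (`SelectedRows` exposes its `value()`). A rough standalone sketch of the same dispatch, using `std::variant` as a stand-in for the framework's type-erased `Variable`:

```cpp
#include <iostream>
#include <stdexcept>
#include <variant>

struct LoDTensor { int rows = 0; };
struct SelectedRows { LoDTensor value; };

// Stand-in for the framework's type-erased Variable.
using Variable = std::variant<LoDTensor, SelectedRows>;

const LoDTensor* GetTensorFromVar(const Variable& var) {
  if (auto* t = std::get_if<LoDTensor>(&var)) return t;
  if (auto* sr = std::get_if<SelectedRows>(&var)) return &sr->value;
  throw std::runtime_error("Variable type must be LoDTensor/SelectedRows.");
}

int main() {
  Variable v = SelectedRows{LoDTensor{8}};
  std::cout << GetTensorFromVar(v)->rows << "\n";  // prints 8
}
```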
template <> template <>
const Tensor* ExecutionContext::Input<Tensor>(const std::string& name) const { const Tensor* ExecutionContext::Input<Tensor>(const std::string& name) const {
auto* var = InputVar(name); auto* var = InputVar(name);
...@@ -227,7 +234,7 @@ const std::vector<const Tensor*> ExecutionContext::MultiInput<Tensor>( ...@@ -227,7 +234,7 @@ const std::vector<const Tensor*> ExecutionContext::MultiInput<Tensor>(
template <> template <>
Tensor* ExecutionContext::Output<Tensor>(const std::string& name) const { Tensor* ExecutionContext::Output<Tensor>(const std::string& name) const {
auto var = OutputVar(name); auto var = OutputVar(name);
return var == nullptr ? nullptr : var->GetMutable<LoDTensor>(); return var == nullptr ? nullptr : GetMutableTensorFromVar(var);
} }
template <> template <>
...@@ -240,7 +247,7 @@ std::vector<Tensor*> ExecutionContext::MultiOutput<Tensor>( ...@@ -240,7 +247,7 @@ std::vector<Tensor*> ExecutionContext::MultiOutput<Tensor>(
[&](const std::string& sub_name) { [&](const std::string& sub_name) {
auto var = scope_.FindVar(sub_name); auto var = scope_.FindVar(sub_name);
return var == nullptr ? nullptr return var == nullptr ? nullptr
: var->GetMutable<LoDTensor>(); : GetMutableTensorFromVar(var);
}); });
return res; return res;
} }
...@@ -267,5 +274,137 @@ bool OpSupportGPU(const std::string& op_type) { ...@@ -267,5 +274,137 @@ bool OpSupportGPU(const std::string& op_type) {
return false; return false;
} }
class RuntimeInferShapeContext : public InferShapeContext {
public:
RuntimeInferShapeContext(const OperatorBase& op, const Scope& scope)
: op_(op), scope_(scope) {}
bool HasInput(const std::string& name) const override {
auto& ins = Inputs(name);
size_t length = ins.size();
if (length == 0) {
return false;
}
PADDLE_ENFORCE_EQ(length, 1UL, "Input %s should have more than one inputs",
name);
auto ipt = ins[0];
auto* var = ipt == kEmptyVarName ? nullptr : scope_.FindVar(ipt);
return var != nullptr;
}
bool HasOutput(const std::string& name) const override {
auto& outs = Outputs(name);
size_t length = outs.size();
if (length == 0) {
return false;
}
PADDLE_ENFORCE_EQ(length, 1UL, "Output %s should have more than one inputs",
name);
auto ipt = outs[0];
auto* var = ipt == kEmptyVarName ? nullptr : scope_.FindVar(ipt);
return var != nullptr;
}
bool HasInputs(const std::string& name) const override {
auto inputs = op_.Inputs(name);
if (inputs.empty()) {
return false;
}
for (auto& input : inputs) {
if (scope_.FindVar(input) == nullptr) {
return false;
}
}
return true;
}
bool HasOutputs(const std::string& name) const override {
auto outputs = op_.Outputs(name);
if (outputs.empty()) {
return false;
}
for (auto& output : outputs) {
if (scope_.FindVar(output) == nullptr) {
return false;
}
}
return true;
}
DDim GetInputDim(const std::string& name) const override {
return GetDim(op_.Input(name));
}
void SetOutputDim(const std::string& name, const DDim& dim) override {
SetDim(op_.Output(name), dim);
}
AttrReader Attrs() const override { return AttrReader(op_.Attrs()); }
const std::vector<std::string>& Inputs(
const std::string& name) const override {
return op_.Inputs(name);
}
const std::vector<std::string>& Outputs(
const std::string& name) const override {
return op_.Outputs(name);
}
private:
DDim GetDim(const std::string& name) const override {
Variable* var = scope_.FindVar(name);
if (var->IsType<LoDTensor>()) {
return var->Get<LoDTensor>().dims();
} else if (var->IsType<SelectedRows>()) {
return var->Get<SelectedRows>().GetCompleteDims();
} else {
PADDLE_THROW("Variable type must be LoDTensor/SelectedRows.");
}
}
void SetDim(const std::string& name, const DDim& dim) override {
Variable* var = scope_.FindVar(name);
if (var->IsType<LoDTensor>()) {
var->GetMutable<LoDTensor>()->Resize(dim);
} else if (var->IsType<SelectedRows>()) {
var->GetMutable<SelectedRows>()->set_height(dim[0]);
} else {
PADDLE_THROW("Variable type must be LoDTensor/SelectedRows.");
}
}
const OperatorBase& op_;
const Scope& scope_;
};
void OperatorWithKernel::Run(const Scope& scope,
const platform::DeviceContext& dev_ctx) const {
VLOG(3) << "Running operator " << this->Type();
RuntimeInferShapeContext infer_shape_ctx(*this, scope);
this->InferShape(&infer_shape_ctx);
ExecutionContext ctx(*this, scope, dev_ctx);
// check if op[type] has kernel registered.
auto& all_op_kernels = AllOpKernels();
auto kernels_iter = all_op_kernels.find(type_);
if (kernels_iter == all_op_kernels.end()) {
PADDLE_THROW(
"There are no kernels which are registered in the %s operator.", type_);
}
// check if op[type] have kernel for kernel_key
OpKernelMap& kernels = kernels_iter->second;
auto kernel_key = OpKernelKey(IndicateDataType(ctx), dev_ctx);
auto kernel_iter = kernels.find(kernel_key);
if (kernel_iter == kernels.end()) {
PADDLE_THROW("The operator %s does not support %s", type_, kernel_key);
}
kernel_iter->second->Compute(ctx);
}
} // namespace framework } // namespace framework
} // namespace paddle } // namespace paddle
...@@ -28,7 +28,7 @@ limitations under the License. */ ...@@ -28,7 +28,7 @@ limitations under the License. */
#include "paddle/framework/lod_tensor.h" #include "paddle/framework/lod_tensor.h"
#include "paddle/framework/op_info.h" #include "paddle/framework/op_info.h"
#include "paddle/framework/scope.h" #include "paddle/framework/scope.h"
#include "paddle/framework/shape_inference.h" #include "paddle/framework/selected_rows.h"
#include "paddle/framework/tensor.h" #include "paddle/framework/tensor.h"
#include "paddle/platform/device_context.h" #include "paddle/platform/device_context.h"
#include "paddle/platform/place.h" #include "paddle/platform/place.h"
...@@ -60,9 +60,6 @@ inline std::string GradVarName(const std::string& var_name) { ...@@ -60,9 +60,6 @@ inline std::string GradVarName(const std::string& var_name) {
class OperatorBase; class OperatorBase;
class ExecutionContext; class ExecutionContext;
extern const Tensor* GetTensorFromVar(const Variable* var);
extern Tensor* GetTensorFromVar(Variable* var);
/** /**
* OperatorBase has the basic element that Net will call to do computation. * OperatorBase has the basic element that Net will call to do computation.
* Only CreateOperator from OpRegistry will new Operator directly. User * Only CreateOperator from OpRegistry will new Operator directly. User
...@@ -125,7 +122,7 @@ class OperatorBase { ...@@ -125,7 +122,7 @@ class OperatorBase {
protected: protected:
std::string type_; std::string type_;
// NOTE: in case of OpGrad, inputs_ contains: // NOTE: in case of OpGrad, inputs_ contains:
// I (Inputs)opear // I (Inputs)
// O (Outputs) // O (Outputs)
// OG (Output Gradients) // OG (Output Gradients)
VariableNameMap inputs_; VariableNameMap inputs_;
...@@ -290,6 +287,16 @@ class ExecutionContext { ...@@ -290,6 +287,16 @@ class ExecutionContext {
return device_context_; return device_context_;
} }
//! Get actual name vector for this input.
const std::vector<std::string>& Inputs(const std::string& name) const {
return op_.Inputs(name);
}
//! Get actual name vector for this output.
const std::vector<std::string>& Outputs(const std::string& name) const {
return op_.Outputs(name);
}
#ifdef PADDLE_WITH_CUDA #ifdef PADDLE_WITH_CUDA
const platform::CUDADeviceContext& cuda_device_context() const { const platform::CUDADeviceContext& cuda_device_context() const {
PADDLE_ENFORCE(platform::is_gpu_place(device_context_.GetPlace())); PADDLE_ENFORCE(platform::is_gpu_place(device_context_.GetPlace()));
...@@ -319,226 +326,6 @@ template <> ...@@ -319,226 +326,6 @@ template <>
std::vector<Tensor*> ExecutionContext::MultiOutput<Tensor>( std::vector<Tensor*> ExecutionContext::MultiOutput<Tensor>(
const std::string& name) const; const std::string& name) const;
class CompileTimeInferShapeContext : public InferShapeContext {
public:
CompileTimeInferShapeContext(const OpDescBind& op, const BlockDescBind& block)
: op_(op), block_(block) {}
bool HasInput(const std::string& name) const override {
const std::vector<std::string>& input_names = op_.Input(name);
auto length = input_names.size();
if (length == 0) {
return false;
}
PADDLE_ENFORCE_EQ(length, 1UL,
"Input(%s) should have only one value, "
"but it have %d now",
name, length);
return block_.HasVarRecursive(input_names[0]);
}
bool HasOutput(const std::string& name) const override {
const std::vector<std::string>& output_names = op_.Output(name);
auto length = output_names.size();
if (length == 0) {
return false;
}
PADDLE_ENFORCE_EQ(length, 1UL,
"Output(%s) should have only one value, "
"but it have %d now",
name, length);
return block_.HasVarRecursive(output_names[0]);
}
bool HasInputs(const std::string& name) const override {
const std::vector<std::string>& input_names = op_.Input(name);
if (input_names.empty()) {
return false;
}
for (auto& input : input_names) {
if (!block_.HasVarRecursive(input)) return false;
}
return true;
}
bool HasOutputs(const std::string& name) const override {
const std::vector<std::string>& output_names = op_.Output(name);
if (output_names.empty()) {
return false;
}
for (auto& output : output_names) {
if (!block_.HasVarRecursive(output)) return false;
}
return true;
}
DDim GetInputDim(const std::string& name) const override {
std::vector<DDim> ddims = GetInputsDim(name);
auto length = ddims.size();
PADDLE_ENFORCE_EQ(length, 1UL,
"Input(%s) should have 1 value, "
"but it has %d now",
name, length);
return ddims[0];
}
void SetInputDim(const std::string& name, const DDim& dim) override {
SetInputsDim(name, {dim});
}
DDim GetOutputDim(const std::string& name) const override {
std::vector<DDim> ddims = GetOutputsDim(name);
auto length = ddims.size();
PADDLE_ENFORCE_EQ(length, 1UL,
"Output(%s) should have 1 value, "
"but it has %d now",
name, length);
return ddims[0];
}
void SetOutputDim(const std::string& name, const DDim& dim) override {
SetOutputsDim(name, {dim});
}
AttrReader Attrs() const override { return AttrReader(op_.GetAttrMap()); }
const std::vector<std::string>& Inputs(
const std::string& name) const override {
return op_.Input(name);
}
const std::vector<std::string>& Outputs(
const std::string& name) const override {
return op_.Output(name);
}
private:
DDim GetDim(const std::string& name) const override {
return framework::make_ddim(block_.FindVarRecursive(name)->Shape());
}
void SetDim(const std::string& name, const DDim& dim) override {
block_.FindVarRecursive(name)->SetShape(framework::vectorize(dim));
}
const OpDescBind& op_;
const BlockDescBind& block_;
};
class RuntimeInferShapeContext : public InferShapeContext {
public:
RuntimeInferShapeContext(const OperatorBase& op, const Scope& scope)
: op_(op), scope_(scope) {}
bool HasInput(const std::string& name) const override {
auto& ins = Inputs(name);
size_t length = ins.size();
if (length == 0) {
return false;
}
PADDLE_ENFORCE_EQ(length, 1UL, "Input %s should have more than one inputs",
name);
auto ipt = ins[0];
auto* var = ipt == kEmptyVarName ? nullptr : scope_.FindVar(ipt);
return var != nullptr;
}
bool HasOutput(const std::string& name) const override {
auto& outs = Outputs(name);
size_t length = outs.size();
if (length == 0) {
return false;
}
PADDLE_ENFORCE_EQ(length, 1UL, "Output %s should have more than one inputs",
name);
auto ipt = outs[0];
auto* var = ipt == kEmptyVarName ? nullptr : scope_.FindVar(ipt);
return var != nullptr;
}
bool HasInputs(const std::string& name) const override {
auto inputs = op_.Inputs(name);
if (inputs.empty()) {
return false;
}
for (auto& input : inputs) {
if (scope_.FindVar(input) == nullptr) {
return false;
}
}
return true;
}
bool HasOutputs(const std::string& name) const override {
auto outputs = op_.Outputs(name);
if (outputs.empty()) {
return false;
}
for (auto& output : outputs) {
if (scope_.FindVar(output) == nullptr) {
return false;
}
}
return true;
}
DDim GetInputDim(const std::string& name) const override {
return GetDim(op_.Input(name));
}
void SetInputDim(const std::string& name, const DDim& dim) override {
SetDim(op_.Input(name), dim);
}
DDim GetOutputDim(const std::string& name) const override {
return GetDim(op_.Output(name));
}
void SetOutputDim(const std::string& name, const DDim& dim) override {
SetDim(op_.Output(name), dim);
}
AttrReader Attrs() const override { return AttrReader(op_.Attrs()); }
const std::vector<std::string>& Inputs(
const std::string& name) const override {
return op_.Inputs(name);
}
const std::vector<std::string>& Outputs(
const std::string& name) const override {
return op_.Outputs(name);
}
private:
template <bool Allocate>
Tensor* GetTensor(const std::string& name) const {
Tensor* t = nullptr;
auto* var = scope_.FindVar(name);
if (!var->IsType<LoDTensor>() && !var->IsType<Tensor>()) {
if (Allocate) {
t = var->GetMutable<LoDTensor>();
} else {
PADDLE_THROW("Variable(%s) should be tensor", name);
}
} else {
t = GetTensorFromVar(scope_.FindVar(name));
}
return t;
}
DDim GetDim(const std::string& name) const override {
return GetTensor<false>(name)->dims();
}
void SetDim(const std::string& name, const DDim& dim) override {
GetTensor<true>(name)->Resize(dim);
}
const OperatorBase& op_;
const Scope& scope_;
};
class OpKernelBase { class OpKernelBase {
public: public:
/** /**
...@@ -597,32 +384,7 @@ class OperatorWithKernel : public OperatorBase { ...@@ -597,32 +384,7 @@ class OperatorWithKernel : public OperatorBase {
: OperatorBase(type, inputs, outputs, attrs) {} : OperatorBase(type, inputs, outputs, attrs) {}
void Run(const Scope& scope, void Run(const Scope& scope,
const platform::DeviceContext& dev_ctx) const final { const platform::DeviceContext& dev_ctx) const final;
VLOG(3) << "Running operator " << this->Type();
RuntimeInferShapeContext infer_shape_ctx(*this, scope);
this->InferShape(&infer_shape_ctx);
ExecutionContext ctx(*this, scope, dev_ctx);
// check if op[type] has kernel registered.
auto& all_op_kernels = AllOpKernels();
auto kernels_iter = all_op_kernels.find(type_);
if (kernels_iter == all_op_kernels.end()) {
PADDLE_THROW("op[%s] has no kernel", type_);
}
// check if op[type] have kernel for kernel_key
OpKernelMap& kernels = kernels_iter->second;
auto kernel_key = OpKernelKey(IndicateDataType(ctx), dev_ctx);
auto kernel_iter = kernels.find(kernel_key);
if (kernel_iter == kernels.end()) {
PADDLE_THROW("op[%s] has no kernel with kernel_key[%s]", type_,
kernel_key);
}
kernel_iter->second->Compute(ctx);
}
static std::unordered_map<std::string /* op_type */, OpKernelMap>& static std::unordered_map<std::string /* op_type */, OpKernelMap>&
AllOpKernels() { AllOpKernels() {
...@@ -638,12 +400,15 @@ class OperatorWithKernel : public OperatorBase { ...@@ -638,12 +400,15 @@ class OperatorWithKernel : public OperatorBase {
}); });
} }
virtual void InferShape(InferShapeContext* ctx) const = 0; virtual void InferShape(InferShapeContext* ctx) const {
OpInfoMap::Instance().Get(Type()).infer_shape_(ctx);
}
protected: protected:
// indicate kernel DataType by input data. By default all input data must be // indicate kernel DataType by input data. By default all input data must be
// the same. // the same.
virtual DataType IndicateDataType(const ExecutionContext& ctx) const { virtual DataType IndicateDataType(const ExecutionContext& ctx) const {
VLOG(3) << "Default IndicateDataType " << this->Type();
auto& scope = ctx.scope(); auto& scope = ctx.scope();
int data_type = -1; int data_type = -1;
for (auto& input : this->inputs_) { for (auto& input : this->inputs_) {
...@@ -655,11 +420,14 @@ class OperatorWithKernel : public OperatorBase { ...@@ -655,11 +420,14 @@ class OperatorWithKernel : public OperatorBase {
t = &var->Get<Tensor>(); t = &var->Get<Tensor>();
} else if (var->IsType<LoDTensor>()) { } else if (var->IsType<LoDTensor>()) {
t = &var->Get<LoDTensor>(); t = &var->Get<LoDTensor>();
} else if (var->IsType<SelectedRows>()) {
t = &(var->Get<SelectedRows>().value());
} }
if (t != nullptr) { if (t != nullptr) {
int tmp = static_cast<int>(ToDataType(t->type())); int tmp = static_cast<int>(ToDataType(t->type()));
VLOG(3) << "Input " << ipt_name << " with data_type " << tmp;
PADDLE_ENFORCE(tmp == data_type || data_type == -1, PADDLE_ENFORCE(tmp == data_type || data_type == -1,
"DataType of Paddle Op must be same."); "DataType of Paddle Op %s must be same.", Type());
data_type = tmp; data_type = tmp;
} }
} }
......
...@@ -237,12 +237,12 @@ TEST(OpKernel, multi_inputs) { ...@@ -237,12 +237,12 @@ TEST(OpKernel, multi_inputs) {
paddle::platform::CPUDeviceContext cpu_device_context; paddle::platform::CPUDeviceContext cpu_device_context;
paddle::framework::Scope scope; paddle::framework::Scope scope;
scope.Var("x0")->GetMutable<Tensor>(); scope.Var("x0")->GetMutable<LoDTensor>();
scope.Var("x1")->GetMutable<Tensor>(); scope.Var("x1")->GetMutable<LoDTensor>();
scope.Var("x2")->GetMutable<Tensor>(); scope.Var("x2")->GetMutable<LoDTensor>();
scope.Var("k0")->GetMutable<Tensor>(); scope.Var("k0")->GetMutable<LoDTensor>();
scope.Var("y0")->GetMutable<Tensor>(); scope.Var("y0")->GetMutable<LoDTensor>();
scope.Var("y1")->GetMutable<Tensor>(); scope.Var("y1")->GetMutable<LoDTensor>();
auto op = paddle::framework::OpRegistry::CreateOp(op_desc, nullptr); auto op = paddle::framework::OpRegistry::CreateOp(op_desc, nullptr);
op->Run(scope, cpu_device_context); op->Run(scope, cpu_device_context);
......
...@@ -19,9 +19,9 @@ namespace paddle { ...@@ -19,9 +19,9 @@ namespace paddle {
namespace framework { namespace framework {
BlockDescBind *ProgramDescBind::AppendBlock(const BlockDescBind &parent) { BlockDescBind *ProgramDescBind::AppendBlock(const BlockDescBind &parent) {
auto *b = prog_.add_blocks(); auto *b = desc_.add_blocks();
b->set_parent_idx(parent.ID()); b->set_parent_idx(parent.ID());
b->set_idx(prog_.blocks_size() - 1); b->set_idx(desc_.blocks_size() - 1);
blocks_.emplace_back(new BlockDescBind(this, b)); blocks_.emplace_back(new BlockDescBind(this, b));
return blocks_.back().get(); return blocks_.back().get();
} }
...@@ -30,23 +30,39 @@ ProgramDesc *ProgramDescBind::Proto() { ...@@ -30,23 +30,39 @@ ProgramDesc *ProgramDescBind::Proto() {
for (auto &block : blocks_) { for (auto &block : blocks_) {
block->Flush(); block->Flush();
} }
return &prog_; return &desc_;
} }
ProgramDescBind::ProgramDescBind() { ProgramDescBind::ProgramDescBind() {
auto *block = prog_.mutable_blocks()->Add(); auto *block = desc_.mutable_blocks()->Add();
block->set_idx(kRootBlockIndex); block->set_idx(kRootBlockIndex);
block->set_parent_idx(kNoneBlockIndex); block->set_parent_idx(kNoneBlockIndex);
blocks_.emplace_back(new BlockDescBind(this, block)); blocks_.emplace_back(new BlockDescBind(this, block));
} }
ProgramDescBind::ProgramDescBind(const ProgramDescBind &o) { ProgramDescBind::ProgramDescBind(const ProgramDescBind &o) {
prog_ = o.prog_; desc_ = o.desc_;
for (int i = 0; i < prog_.blocks_size(); ++i) { for (int i = 0; i < desc_.blocks_size(); ++i) {
auto *block = prog_.mutable_blocks(i); auto *block = desc_.mutable_blocks(i);
blocks_.emplace_back(new BlockDescBind(*o.blocks_[i], block, this)); blocks_.emplace_back(new BlockDescBind(*o.blocks_[i], block, this));
} }
} }
ProgramDescBind::ProgramDescBind(const ProgramDesc &desc) {
desc_ = desc;
for (auto &block_desc : *desc_.mutable_blocks()) {
blocks_.emplace_back(new BlockDescBind(this, &block_desc));
}
}
ProgramDescBind::ProgramDescBind(const std::string &binary_str) {
PADDLE_ENFORCE(desc_.ParseFromString(binary_str),
"Fail to parse program_desc from binary string.");
for (auto &block_desc : *desc_.mutable_blocks()) {
blocks_.emplace_back(new BlockDescBind(this, &block_desc));
}
}
} // namespace framework } // namespace framework
} // namespace paddle } // namespace paddle
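The two constructors added above make a serialize/restore round trip possible; a minimal usage sketch, assuming `paddle/framework/program_desc.h` (this helper function is illustrative, not part of the diff):

```cpp
#include <string>
#include "paddle/framework/program_desc.h"

// Round-trip a program description through its binary protobuf form.
void RoundTripExample() {
  paddle::framework::ProgramDescBind program;           // root block is created
  std::string binary;
  program.Proto()->SerializeToString(&binary);          // Proto() flushes blocks first
  paddle::framework::ProgramDescBind restored(binary);  // parses and rebuilds blocks
  // restored.Block(0) now mirrors program.Block(0).
}
```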
...@@ -29,8 +29,12 @@ class ProgramDescBind { ...@@ -29,8 +29,12 @@ class ProgramDescBind {
public: public:
ProgramDescBind(); ProgramDescBind();
explicit ProgramDescBind(const ProgramDesc &desc);
ProgramDescBind(const ProgramDescBind &o); ProgramDescBind(const ProgramDescBind &o);
explicit ProgramDescBind(const std::string &binary_str);
BlockDescBind *AppendBlock(const BlockDescBind &parent); BlockDescBind *AppendBlock(const BlockDescBind &parent);
BlockDescBind *Block(size_t idx) { return blocks_[idx].get(); } BlockDescBind *Block(size_t idx) { return blocks_[idx].get(); }
...@@ -40,7 +44,7 @@ class ProgramDescBind { ...@@ -40,7 +44,7 @@ class ProgramDescBind {
ProgramDesc *Proto(); ProgramDesc *Proto();
private: private:
ProgramDesc prog_; ProgramDesc desc_;
std::vector<std::unique_ptr<BlockDescBind>> blocks_; std::vector<std::unique_ptr<BlockDescBind>> blocks_;
}; };
......
...@@ -59,7 +59,7 @@ TEST(ProgramDesc, copy_ctor) { ...@@ -59,7 +59,7 @@ TEST(ProgramDesc, copy_ctor) {
}; };
ASSERT_EQ(global_block->LocalVarNames(), global_block_copy->LocalVarNames()); ASSERT_EQ(global_block->LocalVarNames(), global_block_copy->LocalVarNames());
ASSERT_EQ(3, global_block_copy->LocalVarNames().size()); ASSERT_EQ(3UL, global_block_copy->LocalVarNames().size());
assert_same_var("X", x); assert_same_var("X", x);
assert_same_var("Y", y); assert_same_var("Y", y);
assert_same_var("Out", out); assert_same_var("Out", out);
...@@ -79,5 +79,67 @@ TEST(ProgramDesc, copy_ctor) { ...@@ -79,5 +79,67 @@ TEST(ProgramDesc, copy_ctor) {
// We do not check that the blocks' protostrs are identical, because the order // We do not check that the blocks' protostrs are identical, because the order
// of vars could differ while still being correct. // of vars could differ while still being correct.
} }
TEST(ProgramDescBind, serialize_and_deserialize) {
ProgramDescBind program_origin;
auto* global_block = program_origin.Block(0);
auto* x = global_block->Var("X");
x->SetType(VarDesc_VarType_LOD_TENSOR);
x->SetLoDLevel(0);
x->SetDataType(FP32);
x->SetShape({1000, 784});
auto* y = global_block->Var("Y");
y->SetType(VarDesc_VarType_LOD_TENSOR);
y->SetLoDLevel(0);
y->SetDataType(FP32);
y->SetShape({784, 100});
auto* op = global_block->AppendOp();
op->SetType("mul");
op->SetInput("X", {x->Name()});
op->SetInput("Y", {y->Name()});
auto* out = global_block->Var("Out");
out->SetType(VarDesc_VarType_LOD_TENSOR);
op->SetOutput("Y", {out->Name()});
std::string binary_str;
program_origin.Proto()->SerializeToString(&binary_str);
ProgramDescBind program_restored(binary_str);
auto* global_block_restored = program_restored.Block(0);
ASSERT_NE(global_block, global_block_restored);
auto assert_same_var = [&](const std::string& name, VarDescBind* var_before) {
ASSERT_TRUE(global_block_restored->HasVar(name));
auto* restored = global_block_restored->Var(name);
ASSERT_NE(restored, var_before);
ASSERT_EQ(restored->Name(), var_before->Name());
ASSERT_EQ(restored->GetType(), var_before->GetType());
ASSERT_EQ(restored->Shape(), var_before->Shape());
ASSERT_EQ(restored->Proto()->SerializeAsString(),
var_before->Proto()->SerializeAsString());
};
ASSERT_EQ(global_block->LocalVarNames(),
global_block_restored->LocalVarNames());
ASSERT_EQ(3UL, global_block_restored->LocalVarNames().size());
assert_same_var("X", x);
assert_same_var("Y", y);
assert_same_var("Out", out);
for (size_t i = 0; i < global_block->OpSize(); ++i) {
auto op_origin = global_block->Op(i);
auto op_restored = global_block_restored->Op(i);
ASSERT_EQ(op_origin->Type(), op_restored->Type());
ASSERT_EQ(op_origin->Inputs(), op_restored->Inputs());
ASSERT_EQ(op_origin->Outputs(), op_restored->Outputs());
ASSERT_EQ(op_restored->Proto()->SerializeAsString(),
op_origin->Proto()->SerializeAsString());
}
}
} // namespace framework } // namespace framework
} // namespace paddle } // namespace paddle
...@@ -46,7 +46,7 @@ bool IsTarget(const OpDesc& op_desc) { ...@@ -46,7 +46,7 @@ bool IsTarget(const OpDesc& op_desc) {
return false; return false;
} }
void prune_impl(const ProgramDesc& input, ProgramDesc& output, int block_id) { void prune_impl(const ProgramDesc& input, ProgramDesc* output, int block_id) {
// TODO(tonyyang-svail): // TODO(tonyyang-svail):
// - will change to use multiple blocks for RNN op and Cond Op // - will change to use multiple blocks for RNN op and Cond Op
...@@ -91,8 +91,8 @@ void prune_impl(const ProgramDesc& input, ProgramDesc& output, int block_id) { ...@@ -91,8 +91,8 @@ void prune_impl(const ProgramDesc& input, ProgramDesc& output, int block_id) {
// we reverse the should_run vector // we reverse the should_run vector
std::reverse(should_run.begin(), should_run.end()); std::reverse(should_run.begin(), should_run.end());
output = input; *output = input;
auto* op_field = output.mutable_blocks(block_id)->mutable_ops(); auto* op_field = output->mutable_blocks(block_id)->mutable_ops();
op_field->Clear(); op_field->Clear();
for (size_t i = 0; i < should_run.size(); ++i) { for (size_t i = 0; i < should_run.size(); ++i) {
if (should_run[i]) { if (should_run[i]) {
...@@ -101,7 +101,8 @@ void prune_impl(const ProgramDesc& input, ProgramDesc& output, int block_id) { ...@@ -101,7 +101,8 @@ void prune_impl(const ProgramDesc& input, ProgramDesc& output, int block_id) {
} }
} }
void Prune(const ProgramDesc& input, ProgramDesc& output) { // TODO(fengjiayi): Prune() could be done in place to avoid unnecessary copies
void Prune(const ProgramDesc& input, ProgramDesc* output) {
prune_impl(input, output, 0); prune_impl(input, output, 0);
} }
......
...@@ -20,7 +20,7 @@ limitations under the License. */ ...@@ -20,7 +20,7 @@ limitations under the License. */
namespace paddle { namespace paddle {
namespace framework { namespace framework {
void Prune(const ProgramDesc& input, ProgramDesc& output); void Prune(const ProgramDesc& input, ProgramDesc* output);
} // namespace framework } // namespace framework
} // namespace paddle } // namespace paddle
...@@ -59,11 +59,11 @@ TEST(Prune, one_operator) { ...@@ -59,11 +59,11 @@ TEST(Prune, one_operator) {
f::ProgramDesc *pdesc = program.Proto(); f::ProgramDesc *pdesc = program.Proto();
f::ProgramDesc pruned; f::ProgramDesc pruned;
Prune(*pdesc, pruned); Prune(*pdesc, &pruned);
PADDLE_ENFORCE_EQ(pruned.blocks(0).ops_size(), 0); PADDLE_ENFORCE_EQ(pruned.blocks(0).ops_size(), 0);
pdesc->mutable_blocks(0)->mutable_ops(0)->set_is_target(true); pdesc->mutable_blocks(0)->mutable_ops(0)->set_is_target(true);
Prune(*pdesc, pruned); Prune(*pdesc, &pruned);
PADDLE_ENFORCE_EQ(pruned.blocks(0).ops_size(), 1); PADDLE_ENFORCE_EQ(pruned.blocks(0).ops_size(), 1);
} }
...@@ -81,7 +81,7 @@ TEST(Prune, forward) { ...@@ -81,7 +81,7 @@ TEST(Prune, forward) {
for (int i = 0; i < pdesc->blocks(0).ops_size(); ++i) { for (int i = 0; i < pdesc->blocks(0).ops_size(); ++i) {
f::ProgramDesc pruned; f::ProgramDesc pruned;
pdesc->mutable_blocks(0)->mutable_ops(i)->set_is_target(true); pdesc->mutable_blocks(0)->mutable_ops(i)->set_is_target(true);
Prune(*pdesc, pruned); Prune(*pdesc, &pruned);
PADDLE_ENFORCE_EQ(pruned.blocks(0).ops_size(), i + 1); PADDLE_ENFORCE_EQ(pruned.blocks(0).ops_size(), i + 1);
} }
} }
...@@ -100,7 +100,7 @@ TEST(Prune, multi_input_op) { ...@@ -100,7 +100,7 @@ TEST(Prune, multi_input_op) {
pdesc->mutable_blocks(0)->mutable_ops(3)->set_is_target(true); pdesc->mutable_blocks(0)->mutable_ops(3)->set_is_target(true);
f::ProgramDesc pruned; f::ProgramDesc pruned;
Prune(*pdesc, pruned); Prune(*pdesc, &pruned);
PADDLE_ENFORCE_EQ(pruned.blocks(0).ops_size(), 4); PADDLE_ENFORCE_EQ(pruned.blocks(0).ops_size(), 4);
} }
...@@ -116,7 +116,7 @@ TEST(Prune, multi_output_op) { ...@@ -116,7 +116,7 @@ TEST(Prune, multi_output_op) {
pdesc->mutable_blocks(0)->mutable_ops(2)->set_is_target(true); pdesc->mutable_blocks(0)->mutable_ops(2)->set_is_target(true);
f::ProgramDesc pruned; f::ProgramDesc pruned;
Prune(*pdesc, pruned); Prune(*pdesc, &pruned);
PADDLE_ENFORCE_EQ(pruned.blocks(0).ops_size(), 2); PADDLE_ENFORCE_EQ(pruned.blocks(0).ops_size(), 2);
} }
...@@ -133,6 +133,6 @@ TEST(Prune, multi_target) { ...@@ -133,6 +133,6 @@ TEST(Prune, multi_target) {
pdesc->mutable_blocks(0)->mutable_ops(2)->set_is_target(true); pdesc->mutable_blocks(0)->mutable_ops(2)->set_is_target(true);
f::ProgramDesc pruned; f::ProgramDesc pruned;
Prune(*pdesc, pruned); Prune(*pdesc, &pruned);
PADDLE_ENFORCE_EQ(pruned.blocks(0).ops_size(), 3); PADDLE_ENFORCE_EQ(pruned.blocks(0).ops_size(), 3);
} }
...@@ -16,6 +16,7 @@ limitations under the License. */ ...@@ -16,6 +16,7 @@ limitations under the License. */
#include <memory> // for unique_ptr #include <memory> // for unique_ptr
#include <mutex> // for call_once #include <mutex> // for call_once
#include "glog/logging.h"
#include "paddle/string/printf.h" #include "paddle/string/printf.h"
namespace paddle { namespace paddle {
...@@ -23,7 +24,10 @@ namespace framework { ...@@ -23,7 +24,10 @@ namespace framework {
Scope::~Scope() { Scope::~Scope() {
DropKids(); DropKids();
for (auto& kv : vars_) delete kv.second; for (auto& kv : vars_) {
VLOG(3) << "Destroy variable " << kv.first;
delete kv.second;
}
} }
Scope& Scope::NewScope() const { Scope& Scope::NewScope() const {
...@@ -38,6 +42,7 @@ Variable* Scope::Var(const std::string& name) { ...@@ -38,6 +42,7 @@ Variable* Scope::Var(const std::string& name) {
} }
Variable* v = new Variable(); Variable* v = new Variable();
vars_[name] = v; vars_[name] = v;
VLOG(3) << "Create variable " << name << " on scope";
v->name_ = &(vars_.find(name)->first); v->name_ = &(vars_.find(name)->first);
return v; return v;
} }
...@@ -65,6 +70,23 @@ void Scope::DropKids() { ...@@ -65,6 +70,23 @@ void Scope::DropKids() {
kids_.clear(); kids_.clear();
} }
std::vector<std::string> Scope::GetAllNames(bool recursive) const {
std::vector<std::string> known_vars;
known_vars.reserve(vars_.size());
if (recursive) {
for (auto& kid : kids_) {
auto kid_vars = kid->GetAllNames(recursive);
for (auto& p : kid_vars) {
known_vars.emplace_back(p);
}
}
}
for (auto& p : vars_) {
known_vars.emplace_back(p.first);
}
return known_vars;
}
void Scope::DeleteScope(Scope* scope) { void Scope::DeleteScope(Scope* scope) {
auto it = std::find(this->kids_.begin(), this->kids_.end(), scope); auto it = std::find(this->kids_.begin(), this->kids_.end(), scope);
PADDLE_ENFORCE(it != this->kids_.end(), "Cannot find %p as kid scope", scope); PADDLE_ENFORCE(it != this->kids_.end(), "Cannot find %p as kid scope", scope);
......
...@@ -17,6 +17,7 @@ limitations under the License. */ ...@@ -17,6 +17,7 @@ limitations under the License. */
#include <list> #include <list>
#include <string> #include <string>
#include <unordered_map> #include <unordered_map>
#include <vector>
#include "paddle/framework/variable.h" #include "paddle/framework/variable.h"
#include "paddle/platform/macros.h" #include "paddle/platform/macros.h"
...@@ -64,6 +65,9 @@ class Scope { ...@@ -64,6 +65,9 @@ class Scope {
/// Drop all kids scopes belonged to this scope. /// Drop all kids scopes belonged to this scope.
void DropKids(); void DropKids();
// Enumerate all the variables that the current scope contains.
std::vector<std::string> GetAllNames(bool recursive = false) const;
private: private:
// Call Scope::NewScope for a sub-scope. // Call Scope::NewScope for a sub-scope.
explicit Scope(Scope const* parent) : parent_(parent) {} explicit Scope(Scope const* parent) : parent_(parent) {}
......
...@@ -13,6 +13,7 @@ See the License for the specific language governing permissions and ...@@ -13,6 +13,7 @@ See the License for the specific language governing permissions and
limitations under the License. */ limitations under the License. */
#include "paddle/framework/scope.h" #include "paddle/framework/scope.h"
#include "glog/logging.h"
#include "gtest/gtest.h" #include "gtest/gtest.h"
using paddle::framework::Scope; using paddle::framework::Scope;
...@@ -54,3 +55,17 @@ TEST(Scope, FindScope) { ...@@ -54,3 +55,17 @@ TEST(Scope, FindScope) {
EXPECT_EQ(&s, s.FindScope(v)); EXPECT_EQ(&s, s.FindScope(v));
EXPECT_EQ(&s, ss.FindScope(v)); EXPECT_EQ(&s, ss.FindScope(v));
} }
TEST(Scope, GetAllNames) {
Scope s;
Variable* v = s.Var("a");
EXPECT_EQ(&s, s.FindScope(v));
std::vector<std::string> ans = s.GetAllNames();
std::string str;
for (auto& var : ans) {
str += var;
}
EXPECT_STREQ("a", str.c_str());
}
...@@ -23,7 +23,10 @@ class SelectedRows { ...@@ -23,7 +23,10 @@ class SelectedRows {
value_.reset(new Tensor()); value_.reset(new Tensor());
} }
SelectedRows() { value_.reset(new Tensor()); } SelectedRows() {
height_ = 0;
value_.reset(new Tensor());
}
platform::Place place() const { return value_->place(); } platform::Place place() const { return value_->place(); }
...@@ -37,6 +40,8 @@ class SelectedRows { ...@@ -37,6 +40,8 @@ class SelectedRows {
const Vector<int64_t>& rows() const { return rows_; } const Vector<int64_t>& rows() const { return rows_; }
Vector<int64_t>* mutable_rows() { return &rows_; }
void set_rows(const Vector<int64_t>& rows) { rows_ = rows; } void set_rows(const Vector<int64_t>& rows) { rows_ = rows; }
DDim GetCompleteDims() const { DDim GetCompleteDims() const {
......
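A short illustrative sketch of the zero-initialized default constructor and the new `mutable_rows()` accessor (assumes `paddle/framework/selected_rows.h`; the helper function is hypothetical):

```cpp
#include "paddle/framework/selected_rows.h"

void SelectedRowsExample() {
  paddle::framework::SelectedRows sr;  // height_ now starts at 0
  // Append row indices in place instead of rebuilding the list via set_rows().
  sr.mutable_rows()->push_back(0);
  sr.mutable_rows()->push_back(42);
}
```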
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "paddle/framework/shape_inference.h"
namespace paddle {
namespace framework {
std::vector<framework::DDim> InferShapeContext::GetInputsDim(
const std::string &name) const {
const std::vector<std::string> &names = Inputs(name);
return GetDims(names);
}
void InferShapeContext::SetOutputsDim(
const std::string &name, const std::vector<framework::DDim> &dims) {
auto &names = Outputs(name);
SetDims(names, dims);
}
void InferShapeContext::ShareLoD(const std::string &in, const std::string &out,
size_t i, size_t j) const {}
std::vector<framework::DDim> InferShapeContext::GetDims(
const std::vector<std::string> &names) const {
std::vector<framework::DDim> ret;
ret.reserve(names.size());
std::transform(
names.begin(), names.end(), std::back_inserter(ret),
[this](const std::string &name) { return this->GetDim(name); });
return ret;
}
void InferShapeContext::SetDims(const std::vector<std::string> &names,
const std::vector<framework::DDim> &dims) {
size_t length = names.size();
PADDLE_ENFORCE_EQ(length, dims.size());
for (size_t i = 0; i < length; ++i) {
SetDim(names[i], dims[i]);
}
}
} // namespace framework
} // namespace paddle
...@@ -14,6 +14,7 @@ limitations under the License. */ ...@@ -14,6 +14,7 @@ limitations under the License. */
#pragma once #pragma once
#include "paddle/framework/attribute.h"
#include "paddle/framework/ddim.h" #include "paddle/framework/ddim.h"
namespace paddle { namespace paddle {
...@@ -21,7 +22,7 @@ namespace framework { ...@@ -21,7 +22,7 @@ namespace framework {
class InferShapeContext { class InferShapeContext {
public: public:
virtual ~InferShapeContext() {} virtual ~InferShapeContext() = default;
virtual bool HasInput(const std::string &name) const = 0; virtual bool HasInput(const std::string &name) const = 0;
virtual bool HasOutput(const std::string &name) const = 0; virtual bool HasOutput(const std::string &name) const = 0;
...@@ -29,57 +30,32 @@ class InferShapeContext { ...@@ -29,57 +30,32 @@ class InferShapeContext {
virtual bool HasOutputs(const std::string &name) const = 0; virtual bool HasOutputs(const std::string &name) const = 0;
virtual framework::DDim GetInputDim(const std::string &name) const = 0; virtual framework::DDim GetInputDim(const std::string &name) const = 0;
std::vector<framework::DDim> GetInputsDim(const std::string &name) const {
const std::vector<std::string> &names = Inputs(name); std::vector<framework::DDim> GetInputsDim(const std::string &name) const;
return GetDims(names);
}
virtual void SetInputDim(const std::string &name,
const framework::DDim &dim) = 0;
void SetInputsDim(const std::string &name,
const std::vector<framework::DDim> &dims) {
auto &names = Inputs(name);
SetDims(names, dims);
}
virtual framework::DDim GetOutputDim(const std::string &name) const = 0;
std::vector<framework::DDim> GetOutputsDim(const std::string &name) const {
const std::vector<std::string> &names = Outputs(name);
return GetDims(names);
}
virtual void SetOutputDim(const std::string &name, const DDim &dim) = 0; virtual void SetOutputDim(const std::string &name, const DDim &dim) = 0;
void SetOutputsDim(const std::string &name, void SetOutputsDim(const std::string &name,
const std::vector<framework::DDim> &dims) { const std::vector<framework::DDim> &dims);
auto &names = Outputs(name);
SetDims(names, dims);
}
virtual AttrReader Attrs() const = 0; virtual AttrReader Attrs() const = 0;
virtual const std::vector<std::string> &Inputs( virtual const std::vector<std::string> &Inputs(
const std::string &name) const = 0; const std::string &name) const = 0;
virtual const std::vector<std::string> &Outputs( virtual const std::vector<std::string> &Outputs(
const std::string &name) const = 0; const std::string &name) const = 0;
// TODO(qiao) implement this function // TODO(qiao) implement this function
void ShareLoD(const std::string &in, const std::string &out, size_t i = 0, void ShareLoD(const std::string &in, const std::string &out, size_t i = 0,
size_t j = 0) const {} size_t j = 0) const;
protected: protected:
virtual framework::DDim GetDim(const std::string &name) const = 0; virtual framework::DDim GetDim(const std::string &name) const = 0;
virtual void SetDim(const std::string &name, const framework::DDim &dim) = 0; virtual void SetDim(const std::string &name, const framework::DDim &dim) = 0;
std::vector<framework::DDim> GetDims( std::vector<framework::DDim> GetDims(
const std::vector<std::string> &names) const { const std::vector<std::string> &names) const;
std::vector<framework::DDim> ret;
ret.reserve(names.size());
std::transform(
names.begin(), names.end(), std::back_inserter(ret),
[this](const std::string &name) { return this->GetDim(name); });
return ret;
}
void SetDims(const std::vector<std::string> &names, void SetDims(const std::vector<std::string> &names,
const std::vector<framework::DDim> &dims) { const std::vector<framework::DDim> &dims);
size_t length = names.size();
PADDLE_ENFORCE_EQ(length, dims.size());
for (size_t i = 0; i < length; ++i) {
SetDim(names[i], dims[i]);
}
}
}; };
} // namespace framework } // namespace framework
......
...@@ -31,6 +31,8 @@ namespace paddle { ...@@ -31,6 +31,8 @@ namespace paddle {
namespace framework { namespace framework {
class LoDTensor;
class Tensor { class Tensor {
public: public:
template <typename T, size_t D, int MajorType, typename IndexType> template <typename T, size_t D, int MajorType, typename IndexType>
...@@ -124,16 +126,25 @@ class Tensor { ...@@ -124,16 +126,25 @@ class Tensor {
inline Tensor Slice(const int& begin_idx, const int& end_idx) const; inline Tensor Slice(const int& begin_idx, const int& end_idx) const;
platform::Place place() const { platform::Place place() const {
PADDLE_ENFORCE_NOT_NULL(holder_, "Tensor get place() must contains holder"); PADDLE_ENFORCE_NOT_NULL(
holder_, "Tensor not initialized yet when Tensor::place() is called.");
return holder_->place(); return holder_->place();
} }
std::type_index type() const { return holder_->type(); } std::type_index type() const {
PADDLE_ENFORCE_NOT_NULL(
holder_, "Tensor not initialized yet when Tensor::type() is called.");
return holder_->type();
}
size_t memory_size() const;
private: private:
inline void check_memory_size() const; inline void check_memory_size() const;
private: private:
friend class LoDTensor;
/** /**
* @note Placeholder hides type T, so it doesn't appear as a template * @note Placeholder hides type T, so it doesn't appear as a template
* parameter of Variable. * parameter of Variable.
...@@ -181,7 +192,12 @@ class Tensor { ...@@ -181,7 +192,12 @@ class Tensor {
/*! holds the memory block if allocated. */ /*! holds the memory block if allocated. */
std::shared_ptr<Placeholder> holder_; std::shared_ptr<Placeholder> holder_;
/*! points to dimensions of memory block. */ /**
* @brief points to elements dimensions.
*
* @note dims_ do not indicate the memory block size.
*/
DDim dims_; DDim dims_;
/** /**
......
...@@ -20,6 +20,8 @@ ...@@ -20,6 +20,8 @@
#include <algorithm> #include <algorithm>
#include <limits> #include <limits>
#include "paddle/framework/eigen.h"
namespace paddle { namespace paddle {
namespace framework { namespace framework {
...@@ -104,10 +106,10 @@ void TensorArray::Write(size_t index, const LoDTensor& value) { ...@@ -104,10 +106,10 @@ void TensorArray::Write(size_t index, const LoDTensor& value) {
values_.resize(index + 1); values_.resize(index + 1);
} }
values_[index].set_lod(value.lod());
values_[index].Resize(value.dims()); values_[index].Resize(value.dims());
values_[index].mutable_data<value_type>(platform::CPUPlace()); values_[index].mutable_data<value_type>(value.place());
values_[index].CopyFrom(value, platform::CPUPlace(), values_[index].CopyFrom(value, value.place(), platform::CPUDeviceContext());
platform::CPUDeviceContext());
} }
void TensorArray::WriteShared(size_t index, const LoDTensor& value) { void TensorArray::WriteShared(size_t index, const LoDTensor& value) {
...@@ -116,6 +118,7 @@ void TensorArray::WriteShared(size_t index, const LoDTensor& value) { ...@@ -116,6 +118,7 @@ void TensorArray::WriteShared(size_t index, const LoDTensor& value) {
values_.resize(index + 1); values_.resize(index + 1);
} }
values_[index].set_lod(value.lod());
values_[index].ShareDataWith(value); values_[index].ShareDataWith(value);
} }
...@@ -144,6 +147,155 @@ DySeqMetaBatch TensorArray::Unpack(const LoDTensor& source, int level, ...@@ -144,6 +147,155 @@ DySeqMetaBatch TensorArray::Unpack(const LoDTensor& source, int level,
return unpacker.meta; return unpacker.meta;
} }
LoDTensor TensorArray::LodPack(size_t level) const {
PADDLE_ENFORCE_GT(size(), 0UL, "no time step exists");
// the levels should be no less than 2
LoDTensor merged;
const LoDTensor *pre, *cur;
pre = &Read(0);
for (size_t step = 1; step < size(); step++) {
cur = &Read(step);
PADDLE_ENFORCE_GT(cur->NumLevels(), 0);
PADDLE_ENFORCE_GT(pre->NumLevels(), 0);
PADDLE_ENFORCE_EQ(pre->NumLevels(), cur->NumLevels());
PADDLE_ENFORCE_EQ(pre->NumElements(level), cur->NumElements(level));
merged = LodPackTwo(*pre, *cur, level);
pre = &merged;
}
return merged;
}
/*
* NOTE currently, only the lowest level supports packing.
* The lowest LoD will be changed, while the relative offsets in levels above
* stay unchanged.
*
* previous step : [0] [1] [3]
* current step: [0 1 2] [2 3] []
* packed to
* [0 0] [0 1] [0 2] [1 2] [1 3] [3]
*/
LoDTensor TensorArray::LodPackTwo(const LoDTensor& pre, const LoDTensor& cur,
size_t level) const {
PADDLE_ENFORCE_EQ(pre.NumLevels(), cur.NumLevels());
PADDLE_ENFORCE_EQ(pre.NumLevels(), level + 1,
"Only the lowest LoD level supports pack temporarily.");
// calculate the result tensor's shape first
size_t num_instances = 0;
for (size_t elem = 0; elem < pre.NumElements(level); elem++) {
size_t prefix_size = pre.NumElements(level, elem);
size_t num_candidates = cur.NumElements(level, elem);
if (num_candidates > 0) {
num_instances += num_candidates * (prefix_size + 1);
} else {
num_instances += prefix_size;
}
}
auto res_dims = pre.dims();
res_dims[0] = num_instances;
LoDTensor result;
result.Resize(res_dims);
result.mutable_data<value_type>(cur.place());
Vector<size_t> last_lod_level;
// copy data
size_t index = 0;
last_lod_level.push_back(index);
for (size_t elem = 0; elem < pre.NumElements(level); elem++) {
size_t prefix_size = pre.NumElements(level, elem);
size_t num_candidates = cur.NumElements(level, elem);
// slice the prefix Tensor
LoDTensor prefix = pre;
prefix.ShrinkInLevel(level, elem, elem + 1);
LoDTensor candidate = cur;
if (num_candidates > 0) {
candidate.ShrinkInLevel(level, elem, elem + 1);
} else { // just push prefix
result.Slice(index, index + prefix_size)
.CopyFrom(prefix, result.place(), platform::CPUDeviceContext());
index += prefix_size;
last_lod_level.push_back(index);
}
for (size_t candi = 0; candi < num_candidates; candi++) {
// TODO(superjom) support GPU
result.Slice(index, index + prefix_size)
.CopyFrom(prefix, result.place(), platform::CPUDeviceContext());
index += prefix_size;
// copy candidate record
result.Slice(index, index + 1)
.CopyFrom(candidate.Slice(candi, candi + 1), result.place(),
platform::CPUDeviceContext());
index++;
last_lod_level.push_back(index);
}
}
// update lod
auto lod = cur.lod();
lod.back() = last_lod_level;
result.set_lod(lod);
return result;
}
/*
 * source [0 1 2] [3 4] [5 6 7] will be transformed to a list of LoDTensors such
* as
* [0 3 5] [1 4 6] [2 7] with 1-level LoDs:
* - [0 1 2 3]
* - [0 1 2 3]
* - [0 1 1 2], the [1,1) here means the second sequence is empty
*
* NOTE Unpack a LoDTensor in this approach may result in a big LoD.
*/
void TensorArray::LodUnpack(const LoDTensor& source, size_t level) {
PADDLE_ENFORCE_EQ(level, source.NumLevels() - 1,
"only the lowest LoD level supports unpack.");
const size_t non_empty_instances = source.dims()[0];
size_t index = 0;
Vector<size_t> lowest_lod_level;
lowest_lod_level.push_back(index);
for (size_t step = 0; step < non_empty_instances; step++) {
size_t num_instances = 0;
for (size_t id = 0; id < source.NumElements(level); id++) {
auto instance = source;
instance.ShrinkInLevel(level, id, id + 1);
if (static_cast<size_t>(instance.dims()[0]) > step) {
num_instances++;
index++;
}
lowest_lod_level.push_back(index);
}
// create tensor for this time step
LoDTensor tensor;
auto dims = source.dims();
dims[0] = num_instances;
// set lod
auto lod = source.lod();
lod.back() = lowest_lod_level;
tensor.set_lod(lod);
index = 0;
for (size_t id = 0; id < source.NumElements(level); id++) {
auto instance = source;
instance.ShrinkInLevel(level, id, id + 1);
if (static_cast<size_t>(instance.dims()[0]) > step) {
// copy this instance
tensor.Slice(index, index + 1)
.CopyFrom(instance.Slice(step, step + 1), tensor.place(),
platform::CPUDeviceContext());
index++;
}
}
Write(step, tensor);
}
}
LoDTensor TensorArray::Stack() const { LoDTensor TensorArray::Stack() const {
LoDTensor result; LoDTensor result;
if (size() == 0) return result; if (size() == 0) return result;
......
...@@ -86,6 +86,16 @@ class TensorArray { ...@@ -86,6 +86,16 @@ class TensorArray {
*/ */
DySeqMetaBatch Unpack(const LoDTensor &source, int level, bool length_desend); DySeqMetaBatch Unpack(const LoDTensor &source, int level, bool length_desend);
/*
* Pack an array of LoDTensors to a LoDTensor.
*/
LoDTensor LodPack(size_t level) const;
/*
* Unpack a LoDTensor to an array of LoDTensors.
*/
void LodUnpack(const LoDTensor &source, size_t level);
/* /*
* Pack the values into a tensor with rank one higher than each tensor in * Pack the values into a tensor with rank one higher than each tensor in
* values. * values.
...@@ -111,6 +121,9 @@ class TensorArray { ...@@ -111,6 +121,9 @@ class TensorArray {
protected: protected:
void Unstack(const LoDTensor &source, bool data_shared) const; void Unstack(const LoDTensor &source, bool data_shared) const;
LoDTensor LodPackTwo(const LoDTensor &pre, const LoDTensor &cur,
size_t level) const;
private: private:
mutable std::vector<LoDTensor> values_; mutable std::vector<LoDTensor> values_;
}; // class TensorArray }; // class TensorArray
......
...@@ -126,5 +126,57 @@ TEST_F(TensorArrayTester, size) { ...@@ -126,5 +126,57 @@ TEST_F(TensorArrayTester, size) {
ASSERT_EQ(ta.size(), static_cast<size_t>(batch_size)); ASSERT_EQ(ta.size(), static_cast<size_t>(batch_size));
} }
TEST(TensorArray, LodPack) {
// three time steps; each step stores a LoDTensor
// - [0] [1]
// - [2 3], [4 5]
// - [6 7] [] [8], [9, 10]
// try to get a LoDTensor with content:
// - [0 2 6]
// - [0 2 7]
// - [0 3]
// - [1 4 8]
// - [1 5 9]
// - [1 5 10]
std::array<LoDTensor, 3> tensors;
tensors[0].Resize(make_ddim({2, 1}));
tensors[1].Resize(make_ddim({4, 1}));
tensors[2].Resize(make_ddim({5, 1}));
int index = 0;
for (auto& t : tensors) {
t.mutable_data<int>(platform::CPUPlace());
for (int i = 0; i < t.dims()[0]; i++) {
t.data<int>()[i] = index;
index++;
}
}
std::array<LoD, 3> lods;
std::vector<std::vector<size_t>> levels{
{0, 1, 2}, {0, 2, 4}, {0, 2, 2, 3, 5}};
for (int i = 0; i < 3; i++) {
lods[i].emplace_back(levels[i].begin(), levels[i].end());
}
TensorArray ta;
for (int i = 0; i < 3; i++) {
tensors[i].set_lod(lods[i]);
ta.Write(i, tensors[i]);
}
auto merged = ta.LodPack(0);
std::vector<int> target_tensor_data{{0, 2, 6, // 0
0, 2, 7, // 1
0, 3, // 2
1, 4, 8, // 3
1, 5, 9, // 4
1, 5, 10}}; // 5
EXPECT_EQ(merged.dims()[0], (int)target_tensor_data.size());
for (size_t i = 0; i < target_tensor_data.size(); i++) {
EXPECT_EQ(target_tensor_data[i], merged.data<int>()[i]);
}
}
} // namespace framework } // namespace framework
} // namespace paddle } // namespace paddle
...@@ -62,12 +62,16 @@ inline void Tensor::check_memory_size() const { ...@@ -62,12 +62,16 @@ inline void Tensor::check_memory_size() const {
PADDLE_ENFORCE_NOT_NULL( PADDLE_ENFORCE_NOT_NULL(
holder_, "Tensor holds no memory. Call Tensor::mutable_data first."); holder_, "Tensor holds no memory. Call Tensor::mutable_data first.");
PADDLE_ENFORCE_GE( PADDLE_ENFORCE_GE(
holder_->size(), numel() * SizeOfType(type()) + offset_, holder_->size(), memory_size() + offset_,
"Tensor's dims_ is out of bound. Call Tensor::mutable_data " "Tensor's dims_ is out of bound. Call Tensor::mutable_data "
"first to re-allocate memory.\n" "first to re-allocate memory.\n"
"or maybe the required data-type mismatches the data already stored."); "or maybe the required data-type mismatches the data already stored.");
} }
inline size_t Tensor::memory_size() const {
return holder_ == nullptr ? 0UL : numel() * SizeOfType(type());
}
template <typename T> template <typename T>
inline const T* Tensor::data() const { inline const T* Tensor::data() const {
check_memory_size(); check_memory_size();
......
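The contract of the new `memory_size()` accessor follows directly from the implementation above: it is safe to call on an unallocated tensor. A hedged sketch:

```cpp
#include <cassert>
#include "paddle/framework/tensor.h"

void MemorySizeExample() {
  paddle::framework::Tensor t;
  assert(t.memory_size() == 0UL);  // holder_ is null, so no allocation yet
  t.Resize(paddle::framework::make_ddim({2, 3}));
  t.mutable_data<float>(paddle::platform::CPUPlace());
  assert(t.memory_size() == 6 * sizeof(float));  // numel() * SizeOfType(type())
}
```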
...@@ -28,6 +28,8 @@ class OperatorBase; ...@@ -28,6 +28,8 @@ class OperatorBase;
class OpDescBind; class OpDescBind;
class BlockDescBind; class BlockDescBind;
class BlockDesc; class BlockDesc;
class InferShapeContext;
using VariableNameMap = std::map<std::string, std::vector<std::string>>; using VariableNameMap = std::map<std::string, std::vector<std::string>>;
// The order should be as same as framework.proto // The order should be as same as framework.proto
...@@ -49,5 +51,7 @@ using GradOpMakerFN = std::function<std::vector<std::unique_ptr<OpDescBind>>( ...@@ -49,5 +51,7 @@ using GradOpMakerFN = std::function<std::vector<std::unique_ptr<OpDescBind>>(
using InferVarTypeFN = std::function<void(const OpDescBind& /*op_desc*/, using InferVarTypeFN = std::function<void(const OpDescBind& /*op_desc*/,
BlockDescBind* /*block*/)>; BlockDescBind* /*block*/)>;
using InferShapeFN = std::function<void(InferShapeContext*)>;
} // namespace framework } // namespace framework
} // namespace paddle } // namespace paddle
...@@ -18,6 +18,10 @@ limitations under the License. */ ...@@ -18,6 +18,10 @@ limitations under the License. */
namespace paddle { namespace paddle {
namespace framework { namespace framework {
VarDesc::VarType VarDescBind::GetType() const { return desc_.type(); }
void VarDescBind::SetType(VarDesc::VarType type) { desc_.set_type(type); }
void VarDescBind::SetShape(const std::vector<int64_t> &dims) { void VarDescBind::SetShape(const std::vector<int64_t> &dims) {
VectorToRepeated(dims, mutable_tensor_desc()->mutable_dims()); VectorToRepeated(dims, mutable_tensor_desc()->mutable_dims());
} }
......
...@@ -59,6 +59,8 @@ class VarDescBind { ...@@ -59,6 +59,8 @@ class VarDescBind {
desc_.set_type(VarDesc::LOD_TENSOR); desc_.set_type(VarDesc::LOD_TENSOR);
} }
explicit VarDescBind(const VarDesc &desc) : desc_(desc) {}
VarDesc *Proto() { return &desc_; } VarDesc *Proto() { return &desc_; }
std::string Name() const { return desc_.name(); } std::string Name() const { return desc_.name(); }
...@@ -75,9 +77,9 @@ class VarDescBind { ...@@ -75,9 +77,9 @@ class VarDescBind {
int32_t GetLodLevel() const; int32_t GetLodLevel() const;
VarDesc::VarType GetType() const { return desc_.type(); } VarDesc::VarType GetType() const;
void SetType(VarDesc::VarType type) { desc_.set_type(type); } void SetType(VarDesc::VarType type);
bool Persistable() const { return desc_.persistable(); } bool Persistable() const { return desc_.persistable(); }
......
...@@ -46,6 +46,8 @@ class Variable { ...@@ -46,6 +46,8 @@ class Variable {
std::type_index(typeid(T)) == std::type_index(holder_->Type()); std::type_index(typeid(T)) == std::type_index(holder_->Type());
} }
void Clear() { holder_.reset(); }
private: private:
struct Placeholder { struct Placeholder {
virtual ~Placeholder() {} virtual ~Placeholder() {}
......
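A small sketch of the new `Variable::Clear()` (illustrative; assumes `paddle/framework/variable.h` and `lod_tensor.h`):

```cpp
#include "paddle/framework/lod_tensor.h"
#include "paddle/framework/variable.h"

void ClearExample() {
  paddle::framework::Variable var;
  var.GetMutable<paddle::framework::LoDTensor>();  // var now holds a LoDTensor
  var.Clear();  // resets holder_, releasing the held value
}
```

Note that after `Clear()` the variable holds nothing, so `IsType<T>()` must not be called until a new value is set.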
/* Copyright (c) 2017 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#pragma once
#include "MKLDNNLayer.h"
#include "mkldnn.hpp"
namespace paddle {
typedef mkldnn::batch_normalization_forward bn_fwd;
typedef mkldnn::batch_normalization_backward bn_bwd;
/**
 * @brief A subclass of MKLDNNLayer for the batch normalization layer.
*
* The config file api is mkldnn_batch_norm
*/
class MKLDNNBatchNormLayer : public MKLDNNLayer {
protected:
// save the forward primitive_desc, which can be reused in the backward pass
std::shared_ptr<bn_fwd::primitive_desc> fwdPD_;
// Epsilon value used in the batch normalization formula.
static const real EPS;
// weight and bias in paddle
std::unique_ptr<Weight> weight_;
std::unique_ptr<Weight> biases_;
// mkldnn uses one large buffer to store both scale and shift,
// which correspond to the weight and bias in paddle.
MatrixPtr valueScaleShift_;
MatrixPtr gradScaleShift_;
// Moving average of mean.
std::unique_ptr<Weight> movingMean_;
// Moving average of variance.
std::unique_ptr<Weight> movingVar_;
// if useGlobalStats_ is true, the loaded mean and variance are used;
// otherwise, mean and variance are calculated for every mini-batch.
bool useGlobalStats_;
// used in MKLDNN primitive desc
unsigned flags_;
// used to compute the moving mean and variance.
real movingAvgFraction_;
// whether the weight has been initialized
bool hasInitedWgt_;
// local mean and variance
// when useGlobalStats_ is true, they are loaded from the moving mean and variance;
// otherwise, they are calculated from this mini-batch
MKLDNNMatrixPtr mean_;
MKLDNNMatrixPtr var_;
public:
explicit MKLDNNBatchNormLayer(const LayerConfig& config)
: MKLDNNLayer(config), useGlobalStats_(true), hasInitedWgt_(false) {}
~MKLDNNBatchNormLayer() {}
bool init(const LayerMap& layerMap,
const ParameterMap& parameterMap) override;
void forward(PassType passType) override;
void reshape(
int& bs, int& ic, int& ih, int& iw, int oc, int& oh, int& ow) override;
void resetFwd(std::vector<mkldnn::primitive>& pipeline,
MKLDNNMatrixPtr& in,
MKLDNNMatrixPtr& wgt,
MKLDNNMatrixPtr& bias,
MKLDNNMatrixPtr& out) override;
void resetBwd(std::vector<mkldnn::primitive>& pipeline,
MKLDNNMatrixPtr& in,
MKLDNNMatrixPtr& wgt,
MKLDNNMatrixPtr& bias,
MKLDNNMatrixPtr& out) override;
void updateWeights(const UpdateCallback& callback) override;
void convertWeightsFromPaddle() override;
protected:
void initWeight();
/**
 * calculate the moving mean and variance:
* moving = moving * AvgFraction + local * (1 - AvgFraction)
*/
void calMovingMeanAndVar();
/**
* Forward functions: reset buffers(input, weight, output),
* reset primitive descriptor,
* reset pipeline.
*/
void resetFwdBuffers(MKLDNNMatrixPtr& in,
MKLDNNMatrixPtr& wgt,
MKLDNNMatrixPtr& out);
void resetFwdPD(std::shared_ptr<bn_fwd::primitive_desc>& pd,
MKLDNNMatrixPtr in,
MKLDNNMatrixPtr wgt,
MKLDNNMatrixPtr out);
void resetFwdPipeline(std::vector<mkldnn::primitive>& pipeline,
std::shared_ptr<bn_fwd::primitive_desc>& pd,
MKLDNNMatrixPtr& in,
MKLDNNMatrixPtr& wgt,
MKLDNNMatrixPtr& out);
/**
* Backward functions: reset buffers(input, weight, output),
* reset primitive descriptor,
* reset pipeline.
*/
void resetBwdBuffers(MKLDNNMatrixPtr& in,
MKLDNNMatrixPtr& wgt,
MKLDNNMatrixPtr& out);
void resetBwdPD(std::shared_ptr<bn_bwd::primitive_desc>& pd,
MKLDNNMatrixPtr& in,
MKLDNNMatrixPtr& wgt,
MKLDNNMatrixPtr& out);
void resetBwdPipeline(std::vector<mkldnn::primitive>& pipeline,
std::shared_ptr<bn_bwd::primitive_desc>& pd,
MKLDNNMatrixPtr& in,
MKLDNNMatrixPtr& wgt,
MKLDNNMatrixPtr& out);
};
} // namespace paddle
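The update documented for `calMovingMeanAndVar()` is an exponential moving average; a standalone sketch of the formula (hypothetical helper, not part of the layer):

```cpp
// moving = moving * AvgFraction + local * (1 - AvgFraction)
inline float UpdateMovingStat(float moving, float local, float avg_fraction) {
  return moving * avg_fraction + local * (1.0f - avg_fraction);
}
```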
...@@ -60,7 +60,7 @@ public: ...@@ -60,7 +60,7 @@ public:
*/ */
inline real* get(int row) const { inline real* get(int row) const {
if (preallocatedBuf_) { if (preallocatedBuf_) {
CHECK_LE((row + 1) * width_ * sizeof(real), preallocatedBuf_->getSize()); CHECK_LE((row)*width_ * sizeof(real), preallocatedBuf_->getSize());
return reinterpret_cast<real*>(preallocatedBuf_->getBuf()) + row * width_; return reinterpret_cast<real*>(preallocatedBuf_->getBuf()) + row * width_;
} else { } else {
CHECK_LE((row + 1) * width_, rowStore_.size()); CHECK_LE((row + 1) * width_, rowStore_.size());
......
add_subdirectory(detail) add_subdirectory(detail)
cc_library(memory SRCS memory.cc) cc_library(memory SRCS memory.cc DEPS place)
cc_library(memcpy SRCS memcpy.cc) cc_library(memcpy SRCS memcpy.cc)
cc_library(paddle_memory cc_library(paddle_memory
......