diff --git a/doc/about/index_cn.md b/doc/about/index_cn.md
deleted file mode 100644
index 3bf030004d4de8c6f3cb773c6e78c09f40878c5f..0000000000000000000000000000000000000000
--- a/doc/about/index_cn.md
+++ /dev/null
@@ -1,11 +0,0 @@
-关于PaddlePaddle
-================
-
-PaddlePaddle是一个最早由百度科学家和工程师共同研发的并行分布式深度学习平台，兼备易用性、高效性、灵活性和可扩展性，目前已被百度内部多个产品线广泛使用。
-PaddlePaddle目前已经开放源码, 但是远未完善，我们希望能在这个基础上不断的改进、扩展和延伸。
-同时我们希望广大开发者积极提供反馈和贡献源代码，建立一个活跃的开源社区。
-
-致谢
---------
-
-在此，特别感谢PaddlePaddle的[所有贡献者](https://github.com/PaddlePaddle/Paddle/graphs/contributors)。
diff --git a/doc/about/index_en.rst b/doc/about/index_en.rst
deleted file mode 100644
index 065c430cdea802ed3c9f487cd00255b85a5598a5..0000000000000000000000000000000000000000
--- a/doc/about/index_en.rst
+++ /dev/null
@@ -1,14 +0,0 @@
-ABOUT
-=======
-
-PaddlPaddle is an easy-to-use, efficient, flexible and scalable deep learning platform,
-which is originally developed by Baidu scientists and engineers for the purpose of applying deep learning to many products at Baidu.
-
-PaddlePaddle is now open source but far from complete, which is intended to be built upon, improved, scaled, and extended.
-We hope to build an active open source community both by providing feedback and by actively contributing to the source code.
-
-
-Credits
---------
-
-We owe many thanks to `all contributors and developers `_ of PaddlePaddle!
diff --git a/doc/index_en.rst b/doc/index_en.rst
index 168c7667c61da677905585d6c4b5037ce80b3765..64684b8b9b27e245c6b32ea28809d3bbce22fab9 100644
--- a/doc/index_en.rst
+++ b/doc/index_en.rst
@@ -7,4 +7,3 @@ PaddlePaddle Documentation
   getstarted/index_en.rst
   howto/index_en.rst
   api/index_en.rst
-  about/index_en.rst
diff --git a/paddle/framework/backward.cc b/paddle/framework/backward.cc
index bfda18724cc8ed23a40e0626ff07a290d26aa9d2..6b4c612cd8d9263258e3987914c44002e7bca92c 100644
--- a/paddle/framework/backward.cc
+++ b/paddle/framework/backward.cc
@@ -124,6 +124,9 @@ static std::unique_ptr<OperatorBase> BackwardRecursive(
   std::list insert_position;
   for (auto& dup_output_op : dup_output_ops) {
     const std::string& name = dup_output_op.first;
+    // duplicate @Empty@ outputs don't need to be added
+    if (name == kEmptyVarName) continue;
+
     auto& dup_op = dup_output_op.second;
     // no duplicate output
     if (dup_op.size() == 1) continue;
@@ -209,7 +212,7 @@ std::unique_ptr<OperatorBase> Backward(
     const OperatorBase& forwardOp,
     const std::unordered_set<std::string>& no_grad_vars) {
   std::unordered_set<std::string> no_grad_names;
-  no_grad_names.reserve(no_grad_vars.size());
+  no_grad_names.reserve(no_grad_vars.size() + 1);
 
   no_grad_names.insert(std::string(kEmptyVarName) + kGradVarSuffix);
diff --git a/paddle/framework/backward.md b/paddle/framework/backward.md
index c8fa3fefe5632a36d9044b4bccfd3dbb7c64dbf6..8aa6728a95bc464ab8884986f0cec6c817d3303b 100644
--- a/paddle/framework/backward.md
+++ b/paddle/framework/backward.md
@@ -1,23 +1,53 @@
-## Operator/expression 's Backward
+# Operator/expression's Backward
 
-### Motivation
+## Motivation
 
-In Neural Network, the backpropagation algorithm follows the chain rule, so we need to compound the fundmental gradient operators/expressions together with chain rule . Every forward network need a backward network to construct the full computation lineage, the operator/ expression's Backward feature will generate the backward pass respect to forward pass.
+In Neural Network, the backpropagation algorithm follows the chain rule, so we need to compound the fundamental gradient operators/expressions together with the chain rule.
+Every forward network needs a backward network to construct the full computation graph; the operator/expression's backward pass will be generated with respect to the forward pass.
+
+## Backward Operator Registry
 
-### Implement : gradient operator registry
+A backward network is built up with several backward operators. Backward operators take the forward operators' inputs, outputs, and output gradients, and then calculate the input gradients.
 
-| | forward operator | backward operator |
-| ---------------------- | ---------------- | -------------------------------- |
-| **Operator::inputs_**  | Inputs           | Inputs, Outputs, OutputGradients |
-| **Operator::outputs_** | Outputs          | InputGradients                   |
+|                        | forward operator | backward operator                |
+| ---------------------- | ---------------- | -------------------------------- |
+| **Operator::inputs_**  | Inputs           | Inputs, Outputs, OutputGradients |
+| **Operator::outputs_** | Outputs          | InputGradients                   |
 
-Inputs/Outputs means the input/output of the operator, InputGradients/OutputGradients is the gradient respect to forward opeartor. Forward operator and Backward operator are isomorphic, save their corresponding needs into member attribute.
+In most cases, there is a one-to-one correspondence between forward and backward operators. These correspondences are recorded by a global hash map (`OpInfoMap`). To follow the philosophy of minimum core and make operators pluggable, the registry mechanism is introduced.
 
-We use a global hash map record the gradient operators available, follow the philosophy of minimum core, make operator pluggable unit. Each gradient is an operator and it needs to regist itself.
+For example, given a `mul_op`, we can register its information and its corresponding backward operator with the following macro:
 
-grad_op_builder(fengjiayi)
+```cpp
+REGISTER_OP(mul, MulOp, MulOpMaker, mul_grad, MulOpGrad);
+```
 
-### Implement : Backward network
+`mul` is the operator's type. `MulOp` and `MulOpMaker` are the operator class and the operator maker class respectively.
+
+`mul_grad` is the type of the backward operator, and `MulOpGrad` is its class name.
+
+## Backward Operator Creating
+
+Given a certain forward operator, we can get its corresponding backward operator by calling:
+
+```cpp
+OperatorBase* bwd_op = BuildGradOp(fwd_op);  // fwd_op is a const OperatorBase*
+```
+
+The function `BuildGradOp` will sequentially execute the following steps:
+
+1. Get the `type_` of the given forward operator, and then look up the corresponding backward operator's type in the `OpInfoMap`.
+
+2. Build two maps named `inputs` and `outputs` to temporarily store the backward operator's inputs and outputs. Copy the forward operator's `inputs_` and `outputs_` into map `inputs`, except for the variables that are not necessary for gradient computing.
+
+3. Add the forward inputs' gradient variables into map `outputs`, and add the forward outputs' gradient variables into map `inputs`.
+
+4. Build the backward operator with `inputs`, `outputs` and the forward operator's attributes.
+
+## Backward Network Building
+
+A backward network is a series of backward operators. The main idea of building a backward network is to create backward operators in the reversed order and put them together.
+
+In our design, the network itself is also a kind of operator, so the operators contained by a big network may themselves be small networks. Given a forward network, we generate its backward network. We only care about the gradients—`OutputGradients` and `InputGradients`.
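Note (reviewer sketch, not part of the patch): to make the registry lookup and reverse-order construction described in `backward.md` above concrete, here is a minimal, self-contained C++ sketch. All names in it (`ToyOp`, `kGradOpRegistry`, `BuildToyGradOp`) are simplified stand-ins invented for illustration; they are not Paddle's real `OpInfoMap`/`BuildGradOp` API.

```cpp
#include <iostream>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Simplified stand-in for an operator: only its type name matters here.
struct ToyOp {
  std::string type;
};

// Simplified stand-in for OpInfoMap: forward op type -> backward op type.
const std::unordered_map<std::string, std::string> kGradOpRegistry = {
    {"mul", "mul_grad"}, {"add", "add_grad"}};

// Simplified stand-in for BuildGradOp: look up the grad op type of fwd_op.
std::unique_ptr<ToyOp> BuildToyGradOp(const ToyOp& fwd_op) {
  auto it = kGradOpRegistry.find(fwd_op.type);
  if (it == kGradOpRegistry.end()) return nullptr;  // no grad op registered
  return std::unique_ptr<ToyOp>(new ToyOp{it->second});
}

int main() {
  // A "forward network" is just an ordered list of operators here.
  std::vector<ToyOp> forward = {{"mul"}, {"add"}};

  // Build the backward network by visiting forward ops in reverse order.
  std::vector<std::unique_ptr<ToyOp>> backward;
  for (auto it = forward.rbegin(); it != forward.rend(); ++it) {
    if (auto grad = BuildToyGradOp(*it)) backward.push_back(std::move(grad));
  }

  for (const auto& op : backward) std::cout << op->type << "\n";  // add_grad, mul_grad
  return 0;
}
```

Paddle's real implementation additionally handles shared variables, nested `NetOp`s and no-gradient variables, all of which this sketch omits.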
diff --git a/paddle/operators/net_op.cc b/paddle/operators/net_op.cc
index 44d925f0b0cc5ff20d52e548816f118c2027343a..78b5e2767842312722fac3509e843a05fe194559 100644
--- a/paddle/operators/net_op.cc
+++ b/paddle/operators/net_op.cc
@@ -31,10 +31,13 @@ void NetOp::CompleteAddOp(bool calc) {
   for (auto& op : ops_) {
     for (auto& ipt : op->Inputs()) {
       for (auto& var_name : ipt.second) {
-        if (!Contains(output_set, var_name)) {  // Not other op's output
-          input_set.insert(var_name);
-        } else {
+        // If an input variable is already in the output set, it will be
+        // added into intermediate_outputs_. Otherwise, it will be
+        // added into the input set.
+        if (Contains(output_set, var_name)) {
           intermediate_outputs_.insert(var_name);
+        } else {
+          input_set.insert(var_name);
         }
       }
     }
diff --git a/python/paddle/v2/parameters.py b/python/paddle/v2/parameters.py
index b8af5abaeada49a3e8951c21c9065aaf4d1ab851..4cfd91882e2d5f0098d27b8897359152ddd94dda 100644
--- a/python/paddle/v2/parameters.py
+++ b/python/paddle/v2/parameters.py
@@ -14,6 +14,7 @@
 import numpy as np
 from paddle.proto.ParameterConfig_pb2 import ParameterConfig
+from collections import OrderedDict
 import paddle.trainer.config_parser as cp
 import struct
 import tarfile
@@ -42,9 +43,25 @@ def create(layers):
 
 class Parameters(object):
     """
-    Parameters is a dictionary contains Paddle's parameter. The key of
-    Parameters is the name of parameter. The value of Parameters is a plain
-    :code:`numpy.ndarry` .
+    `Parameters` manages all the learnable parameters in a neural network.
+    It stores parameters' information in an OrderedDict. The key is
+    the name of a parameter, and the value is a parameter's configuration (in
+    protobuf format), such as its initialization mean and std, its size,
+    whether it is a static parameter, and so on.
+
+    :param __param_conf__: stores the configurations of the learnable
+        parameters in the network in an OrderedDict. Parameters are added one
+        by one into the dict following their creation order in the network:
+        parameters of the earlier layers in a network are created first. You
+        can visit the parameters from bottom to top by iterating over this dict.
+    :type __param_conf__: OrderedDict
+    :param __gradient_machines__: all of the parameters in a neural network are
+        appended to a PaddlePaddle gradient machine, which is used internally to
+        copy parameter values between the C++ and Python ends.
+    :type __gradient_machines__: list
+    :param __tmp_params__: a dict to store dummy parameters when no
+        __gradient_machines__ has been appended to `Parameters`.
+    :type __tmp_params__: dict
 
     Basically usage is
 
@@ -62,7 +79,7 @@ class Parameters(object):
     """
 
     def __init__(self):
-        self.__param_conf__ = dict()
+        self.__param_conf__ = OrderedDict()
         self.__gradient_machines__ = []
         self.__tmp_params__ = dict()
 
@@ -231,6 +248,9 @@ class Parameters(object):
         :rtype: np.ndarray
         """
         import py_paddle.swig_paddle as api
+        if self.__param_conf__[key].is_static:
+            return np.zeros(self.__param_conf__[key].size, dtype=np.float32)
+
         return self.__getter_inner(key, api.PARAMETER_GRADIENT)
 
     def set(self, parameter_name, value):
@@ -250,7 +270,7 @@ class Parameters(object):
         append gradient machine to parameters. This method is used internally in
         Trainer.train.
 
-        :param gradient_machine: Paddle C++ GradientMachine object.
+        :param gradient_machine: PaddlePaddle C++ GradientMachine object.
         :type gradient_machine: api.GradientMachine
         :return:
         """
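Note (reviewer sketch, not part of the patch): the `NetOp::CompleteAddOp` change above classifies every input variable by whether an earlier operator in the net has already produced it. The following minimal, standalone C++ sketch replays that classification with invented types (`ToyOp`, plain string sets) in place of Paddle's real classes.

```cpp
#include <iostream>
#include <string>
#include <unordered_set>
#include <vector>

// Simplified stand-in for an operator: flat lists of input/output variable names.
struct ToyOp {
  std::vector<std::string> inputs;
  std::vector<std::string> outputs;
};

int main() {
  // Two chained ops: the second consumes "h", which the first one produces.
  std::vector<ToyOp> ops = {{{"x", "W"}, {"h"}}, {{"h", "b"}, {"y"}}};

  std::unordered_set<std::string> output_set, input_set, intermediate_outputs;
  for (const auto& op : ops) {
    for (const auto& var : op.inputs) {
      // If an input was already produced by a previous op, it is an
      // intermediate output of the whole net; otherwise it is an external input.
      if (output_set.count(var)) {
        intermediate_outputs.insert(var);
      } else {
        input_set.insert(var);
      }
    }
    for (const auto& var : op.outputs) output_set.insert(var);
  }

  std::cout << "intermediate outputs:";
  for (const auto& v : intermediate_outputs) std::cout << " " << v;  // h
  std::cout << "\nexternal inputs:";
  for (const auto& v : input_set) std::cout << " " << v;  // x W b (any order)
  std::cout << "\n";
  return 0;
}
```

Only the classification step from the hunk above is shown; the surrounding bookkeeping that `CompleteAddOp` performs on the net's own inputs and outputs is omitted.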