Commit fb2aa717 authored by: kexinzhao   Committed by: Yi Wang

Polish Operators Docs (r) (#5377)

* polish r operator docs

* fix on naming convention
Parent 30a85204
...@@ -44,17 +44,21 @@ public:
    AddOutput("Out", "(Tensor) Accumulated output tensor");
    AddAttr<float>("gamma", "(float, default 1.0) Accumulation multiplier").SetDefault(1.0f);
    AddComment(R"DOC(
Accumulate Operator.

This operator accumulates the input tensor to the output tensor. If the
output tensor already has the right size, we add to it; otherwise, we first
initialize the output tensor to all zeros, and then do accumulation. Any
further calls to the operator, given that no one else fiddles with the output
in the interim, will do simple accumulations.

Accumulation is done as follows:

Out = 1*X + gamma*Out

where X is the input tensor, Out is the output tensor and gamma is the multiplier
argument.

)DOC");
  }
};
...
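The accumulation rule above is simple enough to check numerically. A minimal Python sketch follows, with plain lists standing in for tensors; `accumulate` is a hypothetical helper for illustration, not the Paddle kernel:

```python
def accumulate(x, out, gamma=1.0):
    """Out = 1*X + gamma*Out, element-wise.

    If out is None, the output is first initialized to all zeros,
    so the first call simply copies x.
    """
    if out is None:
        out = [0.0] * len(x)
    return [1.0 * xi + gamma * oi for xi, oi in zip(x, out)]

# First call: output starts at zeros, so the result equals x.
out = accumulate([1.0, 2.0], None, gamma=0.5)
# Second call: Out = X + 0.5 * Out, i.e. [1.5, 3.0].
out = accumulate([1.0, 2.0], out, gamma=0.5)
```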
...@@ -26,9 +26,9 @@ class RankLossOp : public framework::OperatorWithKernel {
  void InferShape(framework::InferShapeContext *ctx) const override {
    // input check
    PADDLE_ENFORCE(ctx->HasInput("Label"), "Input(Label) shouldn't be null.");
    PADDLE_ENFORCE(ctx->HasInput("Left"), "Input(Left) shouldn't be null.");
    PADDLE_ENFORCE(ctx->HasInput("Right"), "Input(Right) shouldn't be null.");
    auto label_dims = ctx->GetInputDim("Label");
    auto left_dims = ctx->GetInputDim("Left");
...@@ -50,32 +50,32 @@ class RankLossOpMaker : public framework::OpProtoAndCheckerMaker {
    AddInput("Label",
             "The label indicating A ranked higher than B or not, row vector.");
    AddInput("Left", "The output of RankNet for doc A, vector.");
    AddInput("Right", "The output of RankNet for doc B, vector.");
    AddOutput("Out", "The output loss of RankLoss operator, vector.");
    AddComment(R"DOC(
RankLoss Operator.

RankLoss operator for RankNet
(http://icml.cc/2015/wp-content/uploads/2015/06/icml_ranking.pdf).
RankNet is a pairwise ranking model with
one training sample consisting of a pair of docs A and B, and the label P
indicating that A is ranked higher than B or not:

P = {0, 1} or {0, 0.5, 1}, where 0.5 means no information about the rank of
the input pair.

The RankLoss operator takes three inputs: Left (o_i), Right (o_j) and Label
(P_{i,j}), which represent the output of RankNet for the two docs and the label,
respectively, and yields the rank loss C_{i,j} using the following equation:

$$
C_{i,j} = -\tilde{P_{ij}} * o_{i,j} + \log(1 + e^{o_{i,j}}) \\
o_{i,j} = o_i - o_j \\
\tilde{P_{i,j}} = \left \{0, 0.5, 1 \right \} \ or \ \left \{0, 1 \right \}
$$

The operator can take inputs of one sample or in batch.

)DOC");
  }
};
...
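The loss formula above can be verified with a small Python sketch for scalar scores; `rank_loss` is an illustrative helper, not part of the Paddle API:

```python
import math

def rank_loss(o_i, o_j, label):
    # o_{i,j} = o_i - o_j
    o = o_i - o_j
    # C_{i,j} = -P~_{i,j} * o_{i,j} + log(1 + e^{o_{i,j}})
    return -label * o + math.log(1.0 + math.exp(o))

# With no rank information (label 0.5) and equal scores, the loss is log(2).
loss = rank_loss(1.0, 1.0, 0.5)
```

As expected, when the label says A ranks higher (label 1), the loss is small if o_i > o_j and large if o_i < o_j.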
...@@ -509,14 +509,14 @@ class RecurrentOpProtoMaker : public framework::OpProtoAndCheckerMaker {
    AddInput(kInitialStates, "rnn initial states").AsDuplicable();
    AddInput(kParameters,
             "Parameters are used by step block as its input. However, the "
             "input is not a sequence tensor. Every time step, each operator "
             "in step block just uses the parameter directly.")
        .AsDuplicable();
    AddOutput(kOutputs,
              "The output sequence of RNN. The sequence length must be the same.")
        .AsDuplicable();
    AddOutput(kStepScopes,
              "StepScopes contain all local variables in each time step.");
    AddAttr<std::vector<std::string>>(kExStates,
                                      string::Sprintf(
                                          R"DOC(The ex-state variable names.
...@@ -556,10 +556,12 @@ if reverse is True
      o     o     o     o
)DOC").SetDefault(false);
    AddAttr<bool>(kIsTrain, "").SetDefault(true);
    AddComment(R"DOC(
Static Length Recurrent Operator.

The static length recurrent operator can only operate on fixed-size sequence
data, i.e. in each mini-batch, the sequence lengths of all inputs are the same.

)DOC");
  }
};
...
...@@ -80,24 +80,27 @@ class ReduceOpMaker : public framework::OpProtoAndCheckerMaker {
 public:
  ReduceOpMaker(framework::OpProto *proto, framework::OpAttrChecker *op_checker)
      : OpProtoAndCheckerMaker(proto, op_checker) {
    AddInput("X",
             "(Tensor) The input tensor. Tensors with rank at most 6 are "
             "supported.");
    AddOutput("Out", "(Tensor) The result tensor.");
    AddAttr<int>(
        "dim",
        "(int, default 0) The dimension to reduce. "
        "Must be in the range [-rank(input), rank(input)). "
        "If `dim < 0`, the dim to reduce is `rank + dim`. "
        "Note that reducing on the first dim will make the LoD info lost.")
        .SetDefault(0);
    AddAttr<bool>("keep_dim",
                  "(bool, default false) "
                  "If true, retain the reduced dimension with length 1.")
        .SetDefault(false);
    comment_ = R"DOC(
{ReduceOp} Operator.

This operator computes the {reduce} of input tensor along the given dimension.
The result tensor has 1 fewer dimension than the input unless keep_dim is true.

)DOC";
    AddComment(comment_);
  }
...
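The wrapping rule for the `dim` attribute (negative dims count from the end) can be sketched as follows; `normalize_dim` is a hypothetical helper that mirrors the documented range check:

```python
def normalize_dim(dim, rank):
    # dim must be in [-rank(input), rank(input));
    # if dim < 0, the dim to reduce is rank + dim.
    if not -rank <= dim < rank:
        raise ValueError("dim out of range")
    return rank + dim if dim < 0 else dim

# For a rank-3 tensor, dim = -1 reduces the last dimension (index 2).
d = normalize_dim(-1, 3)
```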
...@@ -71,8 +71,11 @@ class ReshapeOpMaker : public framework::OpProtoAndCheckerMaker {
      : OpProtoAndCheckerMaker(proto, op_checker) {
    AddInput("X", "The input tensor of reshape operator.");
    AddOutput("Out", "The output tensor of reshape operator.");
    AddAttr<std::vector<int>>("shape",
                              "(vector<int>) "
                              "Target shape of reshape operator.");
    AddComment(R"DOC(
Reshape Operator.

Reshape Input(X) into the shape specified by Attr(shape).

...@@ -81,7 +84,7 @@ Given a 2-D tensor X with 2 rows and 2 columns

    [[1, 2], [3, 4]]

and target shape = [1, 4], the reshape operator will transform
the tensor X into a 2-D tensor:

    [[1, 2, 3, 4]]
...
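The example above follows row-major order: elements are flattened then regrouped to the target shape. A minimal Python sketch for the 2-D case (`reshape2d` is an illustrative helper, not the Paddle operator):

```python
def reshape2d(x, rows, cols):
    # Flatten row-major, then regroup to the target 2-D shape.
    flat = [v for row in x for v in row]
    assert len(flat) == rows * cols, "shape must preserve the element count"
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

# [[1, 2], [3, 4]] with target shape [1, 4] gives [[1, 2, 3, 4]].
y = reshape2d([[1, 2], [3, 4]], 1, 4)
```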
...@@ -68,22 +68,22 @@ class RmspropOpMaker : public framework::OpProtoAndCheckerMaker {
      : OpProtoAndCheckerMaker(proto, op_checker) {
    AddInput("Param",
             "(Tensor, default Tensor<float>) "
             "Input parameter value that has to be updated.");
    AddInput("MeanSquare",
             "(Tensor, default Tensor<float>)"
             " The mean square value that gets updated.");
    AddInput("LearningRate",
             "(Tensor, default Tensor<float>) "
             "The learning rate should be a tensor of size 1.");
    AddInput("Grad",
             "(Tensor, default Tensor<float>) "
             "Input gradient of the parameter.");
    AddInput("Moment",
             "(Tensor, default Tensor<float>) The moment that gets updated.");
    AddOutput("ParamOut", "(Tensor) Output updated parameter value.");
    AddOutput("MomentOut", "(Tensor) Output updated moment.");
    AddOutput("MeanSquareOut", "(Tensor) Output mean squared updated value.");
    AddAttr<float>("epsilon",
                   "(float, default 1e-10) Constant "
...@@ -93,18 +93,19 @@ class RmspropOpMaker : public framework::OpProtoAndCheckerMaker {
                   "(float, default 0.9) "
                   "Discounting factor for coming gradient.")
        .SetDefault(0.9f);
    AddAttr<float>("momentum", "(float, default 0.0) Constant value.")
        .SetDefault(0.0f);
    AddComment(R"DOC(
Rmsprop Optimizer.

$$
MeanSquareOut = decay * MeanSquare + (1 - decay) * Grad * Grad \\
MomentOut = momentum * Moment +
            \frac{LearningRate * Grad}{\sqrt{MeanSquareOut + epsilon}} \\
ParamOut = Param - MomentOut
$$

The original slides that proposed Rmsprop: Slide 29 of
http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf
)DOC");
...
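The three updates above chain per step: the new mean square feeds the moment, which feeds the parameter. A scalar Python sketch (names mirror the doc comment; this is illustrative, not Paddle's API):

```python
import math

def rmsprop_step(param, grad, mean_square, moment,
                 lr, decay=0.9, momentum=0.0, epsilon=1e-10):
    # MeanSquareOut = decay * MeanSquare + (1 - decay) * Grad * Grad
    mean_square = decay * mean_square + (1.0 - decay) * grad * grad
    # MomentOut = momentum * Moment + LearningRate * Grad / sqrt(MeanSquareOut + epsilon)
    moment = momentum * moment + lr * grad / math.sqrt(mean_square + epsilon)
    # ParamOut = Param - MomentOut
    param = param - moment
    return param, mean_square, moment
```

Note that with momentum = 0 this reduces to the plain RMSprop rule from the slides cited above.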