Unverified commit 1d04b19c authored by Abhinav Arora and committed by GitHub

Fix the rendering of latex equation for adamax op (#6294)

* Using latex fraction syntax in sigmoid and logsigmoid op

* Fixing the rendering of the latex equations in adamax operator
Parent 161128ba
@@ -44,9 +44,9 @@ class SigmoidOpMaker : public framework::OpProtoAndCheckerMaker {
AddInput("X", "Input of Sigmoid operator");
AddOutput("Y", "Output of Sigmoid operator");
AddComment(R"DOC(
-Sigmoid Activation Operator.
+Sigmoid Activation Operator
-$y = 1 / (1 + e^{-x})$
+$$y = \frac{1}{1 + e^{-x}}$$
)DOC");
}
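As a quick sanity check on the documented formula, the following standalone C++ snippet (purely illustrative, not part of this patch or of the PaddlePaddle kernel) evaluates the sigmoid exactly as the DOC string states, y = 1 / (1 + e^{-x}):

```cpp
#include <cmath>
#include <cstdio>

// Sigmoid as documented in SigmoidOpMaker: y = 1 / (1 + e^{-x}).
double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

int main() {
  const double xs[] = {-2.0, 0.0, 2.0};
  for (double x : xs) {
    std::printf("sigmoid(%+.1f) = %.6f\n", x, sigmoid(x));
  }
  return 0;
}
```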
@@ -60,9 +60,9 @@ class LogSigmoidOpMaker : public framework::OpProtoAndCheckerMaker {
AddInput("X", "Input of LogSigmoid operator");
AddOutput("Y", "Output of LogSigmoid operator");
AddComment(R"DOC(
-Logsigmoid Activation Operator.
+Logsigmoid Activation Operator
-$y = \log(1 / (1 + e^{-x}))$
+$$y = \log \frac{1}{1 + e^{-x}}$$
)DOC");
}
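Likewise, the documented logsigmoid y = log(1 / (1 + e^{-x})) simplifies to -log(1 + e^{-x}). The sketch below (again illustrative only, not the operator's actual implementation) evaluates it in that form using std::log1p:

```cpp
#include <cmath>
#include <cstdio>

// LogSigmoid as documented in LogSigmoidOpMaker:
// y = log(1 / (1 + e^{-x})), which simplifies to -log(1 + e^{-x}).
// std::log1p keeps the result accurate when e^{-x} is close to zero.
double logsigmoid(double x) { return -std::log1p(std::exp(-x)); }

int main() {
  const double xs[] = {-2.0, 0.0, 2.0};
  for (double x : xs) {
    std::printf("logsigmoid(%+.1f) = %.6f\n", x, logsigmoid(x));
  }
  return 0;
}
```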
@@ -107,10 +107,12 @@ Adam algorithm based on the infinity norm.
Adamax updates:
-$$momentOut = \beta_1 * moment + (1 - \beta_1) * grad \break
-infNormOut = max(\beta_2 * infNorm + \epsilon, |grad|) \break
-learningRate = learningRate /(1 - \beta_1_{pow}) \break
-paramOut = param - learningRate * momentPut / infNormOut$$
+$$
+momentOut = \beta_{1} * moment + (1 - \beta_{1}) * grad \\
+infNormOut = max(\beta_{2} * infNorm + \epsilon, |grad|) \\
+learningRate = \frac{learningRate}{1 - \beta_{1}^{Beta1Pow}} \\
+paramOut = param - learningRate * \frac{momentOut}{infNormOut}
+$$
The original paper does not have an epsilon attribute.
However, it is added here for numerical stability to prevent the
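For reference, here is a minimal C++ sketch of a single Adamax step that follows the corrected equations in the hunk above. The function and variable names (AdamaxStep, param, grad, moment, inf_norm, beta1_pow) are illustrative assumptions and do not mirror the actual PaddlePaddle kernel; beta1_pow is assumed to hold beta1^t, matching the Beta1Pow term in the fixed formula.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

// One Adamax update step, written to mirror the corrected documentation:
//   momentOut    = beta1 * moment + (1 - beta1) * grad
//   infNormOut   = max(beta2 * infNorm + epsilon, |grad|)
//   learningRate = learningRate / (1 - beta1^t)
//   paramOut     = param - learningRate * momentOut / infNormOut
void AdamaxStep(std::vector<double>* param, const std::vector<double>& grad,
                std::vector<double>* moment, std::vector<double>* inf_norm,
                double learning_rate, double beta1, double beta2,
                double epsilon, double beta1_pow) {
  // beta1_pow is assumed to be beta1^t, accumulated by the caller.
  const double lr = learning_rate / (1.0 - beta1_pow);
  for (size_t i = 0; i < param->size(); ++i) {
    (*moment)[i] = beta1 * (*moment)[i] + (1.0 - beta1) * grad[i];
    (*inf_norm)[i] =
        std::max(beta2 * (*inf_norm)[i] + epsilon, std::fabs(grad[i]));
    (*param)[i] -= lr * (*moment)[i] / (*inf_norm)[i];
  }
}

int main() {
  std::vector<double> param = {1.0, -0.5}, grad = {0.1, -0.2};
  std::vector<double> moment = {0.0, 0.0}, inf_norm = {0.0, 0.0};
  // First step (t = 1), so beta1_pow = beta1.
  AdamaxStep(&param, grad, &moment, &inf_norm, /*learning_rate=*/0.002,
             /*beta1=*/0.9, /*beta2=*/0.999, /*epsilon=*/1e-8,
             /*beta1_pow=*/0.9);
  std::printf("param = {%f, %f}\n", param[0], param[1]);
  return 0;
}
```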