Unverified commit 1d04b19c · Author: Abhinav Arora · Committer: GitHub

Fix the rendering of latex equation for adamax op (#6294)

* Using latex fraction syntax in sigmoid and logsigmoid op

* Fixing the rendering of the latex equations in adamax operator
Parent 161128ba
@@ -44,9 +44,9 @@ class SigmoidOpMaker : public framework::OpProtoAndCheckerMaker {
     AddInput("X", "Input of Sigmoid operator");
     AddOutput("Y", "Output of Sigmoid operator");
     AddComment(R"DOC(
-Sigmoid Activation Operator.
+Sigmoid Activation Operator
-$y = 1 / (1 + e^{-x})$
+$$y = \frac{1}{1 + e^{-x}}$$
 )DOC");
   }
@@ -60,9 +60,9 @@ class LogSigmoidOpMaker : public framework::OpProtoAndCheckerMaker {
     AddInput("X", "Input of LogSigmoid operator");
     AddOutput("Y", "Output of LogSigmoid operator");
     AddComment(R"DOC(
-Logsigmoid Activation Operator.
+Logsigmoid Activation Operator
-$y = \log(1 / (1 + e^{-x}))$
+$$y = \log \frac{1}{1 + e^{-x}}$$
 )DOC");
   }
......
@@ -107,10 +107,12 @@ Adam algorithm based on the infinity norm.
 Adamax updates:
-$$momentOut = \beta_1 * moment + (1 - \beta_1) * grad \break
-infNormOut = max(\beta_2 * infNorm + \epsilon, |grad|) \break
-learningRate = learningRate /(1 - \beta_1_{pow}) \break
-paramOut = param - learningRate * momentPut / infNormOut$$
+$$
+momentOut = \beta_{1} * moment + (1 - \beta_{1}) * grad \\
+infNormOut = max(\beta_{2} * infNorm + \epsilon, |grad|) \\
+learningRate = \frac{learningRate}{1 - \beta_{1}^{Beta1Pow}} \\
+paramOut = param - learningRate * \frac{momentOut}{infNormOut}
+$$
 The original paper does not have an epsilon attribute.
 However, it is added here for numerical stability to prevent the
......
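
For readers who want to see the Adamax updates from the last hunk as code, here is a minimal C++ sketch of one update step. It is not the operator's actual kernel: `AdamaxState`, `AdamaxStep`, and every parameter name are hypothetical and chosen only to mirror the symbols in the equations. It also assumes that `beta1_pow` already holds the accumulated power of beta1 for the current step, which the doc comment does not spell out.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-parameter optimizer state; the field names mirror the
// symbols in the doc comment, not the operator's real tensors.
struct AdamaxState {
  std::vector<double> moment;    // first-moment estimate (moment)
  std::vector<double> inf_norm;  // exponentially weighted infinity norm (infNorm)
};

// Applies one Adamax update in place, following the four equations above.
// Assumption: beta1_pow is beta1 raised to the current step count.
void AdamaxStep(std::vector<double>& param, const std::vector<double>& grad,
                AdamaxState& state, double learning_rate, double beta1,
                double beta2, double epsilon, double beta1_pow) {
  // learningRate = learningRate / (1 - beta1_pow): bias-corrected step size.
  const double lr = learning_rate / (1.0 - beta1_pow);
  for (std::size_t i = 0; i < param.size(); ++i) {
    // momentOut = beta1 * moment + (1 - beta1) * grad
    state.moment[i] = beta1 * state.moment[i] + (1.0 - beta1) * grad[i];
    // infNormOut = max(beta2 * infNorm + epsilon, |grad|)
    state.inf_norm[i] =
        std::max(beta2 * state.inf_norm[i] + epsilon, std::fabs(grad[i]));
    // paramOut = param - learningRate * momentOut / infNormOut
    param[i] -= lr * state.moment[i] / state.inf_norm[i];
  }
}
```

Adding epsilon inside the `max` keeps `inf_norm` strictly positive even when both the running norm and the gradient are zero, so the final division stays well defined.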