Commit 81ba077e authored by Kexin Zhao

small fix

Parent 81c7dbc5
# Design Doc: float16
## Why float16
Half precision (float16) is a binary floating-point format that occupies 16 bits / 2 bytes in memory. float16 is half the size of the traditional 32-bit single-precision format (float) and has lower precision and a smaller range.

When high-precision computation is not required, using the float16 data type could potentially

- reduce storage space, memory bandwidth, and power usage;
- increase the chance of data fitting into smaller, lower-latency caches;
- provide arithmetic speedup if supported by hardware.
A brief survey of float16 support on different hardware can be found [here](https://github.com/PaddlePaddle/Paddle/issues/4853). A brief survey of existing float16 implementations can be found [here](https://github.com/Xreki/Xreki.github.io/blob/master/multi_data_types_in_dl_framework/ppt/float16_and_quantized_type.md).

There are various natively supported float16 implementations across hardware and linear algebra libraries, including `half` on CUDA, `__fp16`/`float16_t` on ARM processors, and `Eigen::half` in Eigen.

The goal of the float16 class is to serve as a key for the executor to find and run the operator kernel compute method specialized for float16. It should be compatible with `half` on CUDA, `__fp16` on ARM, and `Eigen::half` in Eigen to make writing customized float16 kernels easier.
## Implementation
The float16 class internally holds a single 2-byte `uint16_t`:
```
struct float16 {
  uint16_t x;  // raw bits of the half-precision value
};
```
float16 supports the following features:

- constructors / assignment operators that take input from primitive data types including bool, integers of various lengths, float, and double.
- constructors / assignment operators that take input from `half` on CUDA, `__fp16` on ARM, and `Eigen::half` in Eigen.
- conversion operators to primitive data types and the half-precision data types on CUDA, ARM, and Eigen.
- overloaded arithmetic operators (e.g., +, -, *, /) for CUDA, ARM, and non-ARM CPU, respectively. These operators will take advantage of the CUDA and ARM intrinsics on the corresponding hardware (see the usage sketch after this list).
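With these features in place, element-wise kernel code can be written generically over the element type. The following is a usage sketch (illustrative only; the template and function names are not part of this design) showing how the overloaded operators let one kernel body serve float, double, and float16 alike:

```
// Illustrative sketch: a templated element-wise kernel body that works
// unchanged for float, double, and float16, because float16 overloads
// the arithmetic operators.
template <typename T>
void AddKernel(const T* x, const T* y, T* out, int n) {
  for (int i = 0; i < n; ++i) {
    out[i] = x[i] + y[i];  // for T = float16, this resolves to the
                           // overloaded float16 operator+
  }
}
```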
To support the above features, two fundamental conversion functions are provided:
```
float16 float_to_half_rn(float f);  // convert to half precision in round-to-nearest-even mode
float half_to_float(float16 h);     // convert half precision back to single precision
```
which provide one-to-one conversion between float32 and float16. These two functions use different conversion routines depending on the current hardware: CUDA/ARM intrinsics will be used when the corresponding hardware is available, and on non-ARM CPUs the conversion falls back to software emulation. A sketch of the software-emulated conversion is given below.
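The following is a minimal sketch of the software-emulation path for `float_to_half_rn` (an illustration for this doc, not the final implementation; it returns the raw `uint16_t` bits that a float16 would hold, and the `_sw` suffix is made up here):

```
#include <cassert>
#include <cstdint>
#include <cstring>

// Software-emulated float32 -> float16 conversion in round-to-nearest-even
// mode, for the non-ARM CPU path; hardware builds would call the CUDA/ARM
// intrinsics instead.
uint16_t float_to_half_rn_sw(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));  // view the float32 bit pattern

  uint16_t sign = static_cast<uint16_t>((bits >> 16) & 0x8000u);
  int32_t exp = static_cast<int32_t>((bits >> 23) & 0xFFu) - 127 + 15;
  uint32_t mant = bits & 0x7FFFFFu;

  if (exp >= 31) {  // overflow, Inf, or NaN
    uint16_t nan_bit = (((bits >> 23) & 0xFFu) == 0xFFu && mant) ? 0x200 : 0;
    return static_cast<uint16_t>(sign | 0x7C00 | nan_bit);
  }
  if (exp <= 0) {  // result is a float16 subnormal or zero
    if (exp < -10) return sign;  // too small even for a subnormal
    mant |= 0x800000u;           // restore the implicit leading 1
    uint32_t shift = static_cast<uint32_t>(14 - exp);
    uint16_t half_mant = static_cast<uint16_t>(mant >> shift);
    uint32_t rem = mant & ((1u << shift) - 1u);
    uint32_t halfway = 1u << (shift - 1);
    if (rem > halfway || (rem == halfway && (half_mant & 1))) {
      ++half_mant;  // round to nearest, ties to even
    }
    return static_cast<uint16_t>(sign | half_mant);
  }
  // Normalized result: keep the top 10 mantissa bits and round off the rest.
  uint16_t h = static_cast<uint16_t>(sign | (exp << 10) | (mant >> 13));
  uint32_t rem = mant & 0x1FFFu;
  if (rem > 0x1000u || (rem == 0x1000u && (h & 1))) {
    ++h;  // a carry here correctly bumps the exponent (possibly to Inf)
  }
  return h;
}

int main() {
  assert(float_to_half_rn_sw(1.0f) == 0x3C00);   // 1.0 in binary16
  assert(float_to_half_rn_sw(-2.0f) == 0xC000);  // -2.0 in binary16
  return 0;
}
```

`half_to_float` reverses the process by widening the 10-bit mantissa and re-biasing the exponent; that direction is exact, since every float16 value is representable in float32.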
## To do
After the float16 class is available, some future work items are:

- Update pybind/tensor_py.h to bind the C++ float16 type to numpy float16.
- Modify the `IndicateDataType()` method in `framework/operator.h` to make it compatible with float16.
- Create a type-casting operator that converts tensor data between float16 and other types.
@@ -43,7 +43,12 @@ class SigmoidOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of Sigmoid operator");
     AddOutput("Y", "Output of Sigmoid operator");
-    AddComment("Sigmoid activation operator, sigmoid = 1 / (1 + exp(-x))");
+    AddComment(R"DOC(
+Sigmoid activation operator.
+
+$y = 1 / (1 + e^{-x})$
+
+)DOC");
   }
 };
@@ -54,8 +59,12 @@ class LogSigmoidOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of LogSigmoid operator");
     AddOutput("Y", "Output of LogSigmoid operator");
-    AddComment(
-        "Logsigmoid activation operator, logsigmoid = log (1 / (1 + exp(-x)))");
+    AddComment(R"DOC(
+Logsigmoid activation operator.
+
+$y = \log(1 / (1 + e^{-x}))$
+
+)DOC");
   }
 };
@@ -65,7 +74,12 @@ class ExpOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of Exp operator");
     AddOutput("Y", "Output of Exp operator");
-    AddComment("Exp activation operator, exp(x) = e^x");
+    AddComment(R"DOC(
+Exp activation operator.
+
+$y = e^x$
+
+)DOC");
   }
 };
@@ -75,7 +89,12 @@ class ReluOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of Relu operator");
     AddOutput("Y", "Output of Relu operator");
-    AddComment("Relu activation operator, relu(x) = max(x, 0)");
+    AddComment(R"DOC(
+Relu activation operator.
+
+$y = \max(x, 0)$
+
+)DOC");
   }
 };
@@ -87,11 +106,14 @@ class LeakyReluOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of LeakyRelu operator");
     AddOutput("Y", "Output of LeakyRelu operator");
-    AddComment(
-        "LeakyRelu activation operator, "
-        "leaky_relu = max(x, alpha * x)");
     AddAttr<AttrType>("alpha", "The small negative slope")
         .SetDefault(static_cast<AttrType>(0.02f));
+    AddComment(R"DOC(
+LeakyRelu activation operator.
+
+$y = \max(x, \alpha * x)$
+
+)DOC");
   }
 };
@@ -103,12 +125,20 @@ class SoftShrinkOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of Softshrink operator");
     AddOutput("Y", "Output of Softshrink operator");
-    AddComment(
-        "Softshrink activation operator, "
-        "softshrink = x - lambda, if x > lambda;"
-        " x + lambda, if x < lambda; 0 otherwise");
     AddAttr<AttrType>("lambda", "non-negative offset")
         .SetDefault(static_cast<AttrType>(0.5f));
+    AddComment(R"DOC(
+Softshrink activation operator.
+
+$$
+y = \begin{cases}
+    x - \lambda, \text{if } x > \lambda \\
+    x + \lambda, \text{if } x < -\lambda \\
+    0,  \text{otherwise}
+    \end{cases}
+$$
+
+)DOC");
   }
 };
@@ -118,9 +148,12 @@ class TanhOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of Tanh operator");
     AddOutput("Y", "Output of Tanh operator");
-    AddComment(
-        "Tanh activation operator, tanh = (exp(x) - exp(-x)) / (exp(x) + "
-        "exp(-x))");
+    AddComment(R"DOC(
+Tanh activation operator.
+
+$$y = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$
+
+)DOC");
   }
 };
@@ -131,7 +164,12 @@ class TanhShrinkOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of TanhShrink operator");
     AddOutput("Y", "Output of TanhShrink operator");
-    AddComment("TanhShrink activation operator, tanhshrink(x) = x - tanh(x)");
+    AddComment(R"DOC(
+TanhShrink activation operator.
+
+$$y = x - \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$
+
+)DOC");
   }
 };
@@ -143,13 +181,20 @@ class HardShrinkOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of HardShrink operator");
     AddOutput("Y", "Output of HardShrink operator");
-    AddComment(
-        "HardShrink activation operator, "
-        "hard_shrink(x) = x if x > lambda"
-        "hard_shrink(x) = x if x < -lambda"
-        "hard_shrink(x) = 0 otherwise");
     AddAttr<AttrType>("threshold", "The value of threshold for HardShrink")
         .SetDefault(static_cast<AttrType>(0.5));
+    AddComment(R"DOC(
+HardShrink activation operator.
+
+$$
+y = \begin{cases}
+    x, \text{if } x > \lambda \\
+    x, \text{if } x < -\lambda \\
+    0,  \text{otherwise}
+    \end{cases}
+$$
+
+)DOC");
   }
 };
@@ -159,7 +204,12 @@ class SqrtOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of Sqrt operator");
     AddOutput("Y", "Output of Sqrt operator");
-    AddComment("Sqrt activation operator, sqrt(x) = x^(1/2)");
+    AddComment(R"DOC(
+Sqrt activation operator.
+
+$y = \sqrt{x}$
+
+)DOC");
   }
 };
@@ -169,7 +219,12 @@ class AbsOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of Abs operator");
     AddOutput("Y", "Output of Abs operator");
-    AddComment("Abs activation operator, abs(x) = |x|");
+    AddComment(R"DOC(
+Abs activation operator.
+
+$y = |x|$
+
+)DOC");
   }
 };
@@ -180,7 +235,12 @@ class ReciprocalOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of Reciprocal operator");
     AddOutput("Y", "Output of Reciprocal operator");
-    AddComment("Reciprocal activation operator, reciprocal(x) = 1 / x");
+    AddComment(R"DOC(
+Reciprocal activation operator.
+
+$$y = \frac{1}{x}$$
+
+)DOC");
   }
 };
@@ -190,7 +250,14 @@ class LogOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of Log operator");
     AddOutput("Y", "Output of Log operator");
-    AddComment("Log activation operator, log(x) = natural logarithm of x");
+    AddComment(R"DOC(
+Log activation operator.
+
+$y = \ln(x)$
+
+Natural logarithm of x.
+
+)DOC");
   }
 };
@@ -200,7 +267,12 @@ class SquareOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of Square operator");
     AddOutput("Y", "Output of Square operator");
-    AddComment("Square activation operator, square(x) = x^2");
+    AddComment(R"DOC(
+Square activation operator.
+
+$y = x^2$
+
+)DOC");
   }
 };
@@ -211,7 +283,12 @@ class SoftplusOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of Softplus operator");
     AddOutput("Y", "Output of Softplus operator");
-    AddComment("Softplus activation operator, softplus(x) = log(1 + exp(x))");
+    AddComment(R"DOC(
+Softplus activation operator.
+
+$y = \ln(1 + e^{x})$
+
+)DOC");
   }
 };
@@ -222,7 +299,12 @@ class SoftsignOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of Softsign operator");
     AddOutput("Y", "Output of Softsign operator");
-    AddComment("Softsign activation operator, softsign(x) = x / (1 + |x|)");
+    AddComment(R"DOC(
+Softsign activation operator.
+
+$$y = \frac{x}{1 + |x|}$$
+
+)DOC");
   }
 };
@@ -233,11 +315,16 @@ class BReluOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of BRelu operator");
     AddOutput("Y", "Output of BRelu operator");
-    AddComment("BRelu activation operator, brelu = max(min(x, t_min), t_max)");
     AddAttr<AttrType>("t_min", "The min marginal value of BRelu")
         .SetDefault(static_cast<AttrType>(0));
     AddAttr<AttrType>("t_max", "The max marginal value of BRelu")
         .SetDefault(static_cast<AttrType>(24));
+    AddComment(R"DOC(
+BRelu activation operator.
+
+$y = \max(\min(x, t_{min}), t_{max})$
+
+)DOC");
   }
 };
@@ -249,11 +336,14 @@ class SoftReluOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of SoftRelu operator");
     AddOutput("Y", "Output of SoftRelu operator");
-    AddComment(
-        "SoftRelu activation operator, soft_relu = log(1 + exp(max(min(x, "
-        "threshold), threshold)))");
     AddAttr<AttrType>("threshold", "The threshold value of SoftRelu")
         .SetDefault(static_cast<AttrType>(40));
+    AddComment(R"DOC(
+SoftRelu activation operator.
+
+$y = \ln(1 + \exp(\max(\min(x, threshold), threshold)))$
+
+)DOC");
   }
 };
@@ -262,19 +352,19 @@ class ELUOpMaker : public framework::OpProtoAndCheckerMaker {
  public:
   ELUOpMaker(framework::OpProto *proto, framework::OpAttrChecker *op_checker)
       : OpProtoAndCheckerMaker(proto, op_checker) {
-    AddInput("X",
-             "(Tensor) The input of ELU operator, it shouldn't be empty. Input "
-             "is flattened and treated as a 1D array.");
-    AddOutput("Y",
-              "(Tensor) The output of ELU operator. It has the same shape as "
-              "the input.");
-    AddAttr<AttrType>(
-        "alpha", "(float, default 1.0) Alpha value in the elu formulation.")
-        .SetDefault(static_cast<AttrType>(1.));
+    AddInput("X", "Input of ELU operator");
+    AddOutput("Y", "Output of ELU operator");
+    AddAttr<AttrType>("alpha", "The alpha value of ELU")
+        .SetDefault(static_cast<AttrType>(1.0f));
     AddComment(R"DOC(
-ELU activation operator. It applies this element-wise computation on
-the input: f(x) = max(0, x) + min(0, alpha * (exp(x) - 1)).
-Check .. _Link: https://arxiv.org/abs/1511.07289 for more details.)DOC");
+ELU activation operator.
+
+Applies the following element-wise computation on the input according to
+https://arxiv.org/abs/1511.07289.
+
+$y = \max(0, x) + \min(0, \alpha * (e^x - 1))$
+
+)DOC");
   }
 };
@@ -285,9 +375,14 @@ class Relu6OpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of Relu6 operator");
     AddOutput("Y", "Output of Relu6 operator");
-    AddComment("Relu6 activation operator, relu6 = min(max(0, x), 6)");
     AddAttr<AttrType>("threshold", "The threshold value of Relu6")
         .SetDefault(static_cast<AttrType>(6));
+    AddComment(R"DOC(
+Relu6 activation operator.
+
+$y = \min(\max(0, x), 6)$
+
+)DOC");
   }
 };
@@ -298,9 +393,14 @@ class PowOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of Pow operator");
     AddOutput("Y", "Output of Pow operator");
-    AddComment("Pow activation operator, pow(x, factor) = x^factor");
     AddAttr<AttrType>("factor", "The exponential factor of Pow")
         .SetDefault(static_cast<AttrType>(1));
+    AddComment(R"DOC(
+Pow activation operator.
+
+$y = x^{factor}$
+
+)DOC");
   }
 };
@@ -311,11 +411,16 @@ class STanhOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of STanh operator");
     AddOutput("Y", "Output of STanh operator");
-    AddComment("STanh activation operator, stanh = b * tanh(a * x)");
     AddAttr<AttrType>("scale_a", "The scale parameter of a for the input")
         .SetDefault(static_cast<AttrType>(2 / 3));
     AddAttr<AttrType>("scale_b", "The scale parameter of b for the input")
         .SetDefault(static_cast<AttrType>(1.7159));
+    AddComment(R"DOC(
+STanh activation operator.
+
+$$y = b * \frac{e^{a * x} - e^{-a * x}}{e^{a * x} + e^{-a * x}}$$
+
+)DOC");
   }
 };
@@ -327,12 +432,19 @@ class ThresholdedReluOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of ThresholdedRelu operator");
     AddOutput("Y", "Output of ThresholdedRelu operator");
-    AddComment(
-        "ThresholdedRelu activation operator, "
-        "thresholded_relu = x for x > threshold, "
-        "thresholded_relu = 0 otherwise.");
     AddAttr<AttrType>("threshold", "The threshold location of activation")
         .SetDefault(static_cast<AttrType>(1.0));
+    AddComment(R"DOC(
+ThresholdedRelu activation operator.
+
+$$
+y = \begin{cases}
+    x, \text{if } x > threshold \\
+    0,  \text{otherwise}
+    \end{cases}
+$$
+
+)DOC");
   }
 };
@@ -344,27 +456,23 @@ class HardSigmoidOpMaker : public framework::OpProtoAndCheckerMaker {
       : OpProtoAndCheckerMaker(proto, op_checker) {
     AddInput("X", "Input of HardSigmoid operator");
     AddOutput("Y", "Output of HardSigmoid operator");
+    AddAttr<AttrType>("slope", "Slope for linear approximation of sigmoid")
+        .SetDefault(static_cast<AttrType>(0.2));
+    AddAttr<AttrType>("offset", "Offset for linear approximation of sigmoid")
+        .SetDefault(static_cast<AttrType>(0.5));
     AddComment(R"DOC(
-Hard Sigmoid activation operator.
+HardSigmoid activation operator.
 
-Segment-wise linear approximation of sigmoid[1].
-This is much faster than sigmoid.
+Segment-wise linear approximation of sigmoid(https://arxiv.org/abs/1603.00391),
+which is much faster than sigmoid.
 
-hard_sigmoid = max(0, min(1, slope * x + shift))
+$y = \max(0, \min(1, slope * x + shift))$
 
 The slope should be positive. The offset can be either positive or negative.
-The default slope and shift are set from [1].
+The default slope and shift are set according to the above reference.
 It is recommended to use the defaults for this activation.
 
-References:
-  [1] Noisy Activation Functions
-      (https://arxiv.org/abs/1603.00391)
-
 )DOC");
-    AddAttr<AttrType>("slope", "Slope for linear approximation of sigmoid")
-        .SetDefault(static_cast<AttrType>(0.2));
-    AddAttr<AttrType>("offset", "Offset for linear approximation of sigmoid")
-        .SetDefault(static_cast<AttrType>(0.5));
   }
 };
@@ -232,7 +232,7 @@ struct HardShrinkGradFunctor : public BaseActivationFunctor<T> {
   }
 };
 
-// softshrink(x) = x - lambda, if x > lambda; x + lambda, if x < lambda; 0
+// softshrink(x) = x - lambda, if x > lambda; x + lambda, if x < -lambda; 0
 // otherwise
 template <typename T>
 struct SoftShrinkFunctor : public BaseActivationFunctor<T> {