diff --git a/paddle/fluid/operators/sigmoid_cross_entropy_with_logits_op.cc b/paddle/fluid/operators/sigmoid_cross_entropy_with_logits_op.cc
index 135e2a6f7f877c9ef159a4542b834d5627649e81..c3b0fe32098cb4b41ccc155db58809ef9f1bf46b 100644
--- a/paddle/fluid/operators/sigmoid_cross_entropy_with_logits_op.cc
+++ b/paddle/fluid/operators/sigmoid_cross_entropy_with_logits_op.cc
@@ -113,14 +113,14 @@
 The logistic loss is given as follows:
 
     $$loss = -Labels * \log(\sigma(X)) - (1 - Labels) * \log(1 - \sigma(X))$$
 
-We know that $$\sigma(X) = (1 / (1 + \exp(-X)))$$. By substituting this we get:
+We know that $$\sigma(X) = \\frac{1}{1 + \exp(-X)}$$. By substituting this we get:
 
     $$loss = X - X * Labels + \log(1 + \exp(-X))$$
 
 For stability and to prevent overflow of $$\exp(-X)$$ when X < 0,
 we reformulate the loss as follows:
 
-    $$loss = \max(X, 0) - X * Labels + \log(1 + \exp(-|X|))$$
+    $$loss = \max(X, 0) - X * Labels + \log(1 + \exp(-\|X\|))$$
 
 Both the input `X` and `Labels` can carry the LoD (Level of Details) information. However the output only shares the LoD with input `X`.
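
For anyone reviewing the math in this doc change: the reformulated loss is algebraically identical to the naive substitution, but it only ever exponentiates a non-positive argument, so `exp` cannot overflow for large-magnitude negative logits. Below is a minimal, self-contained C++ sketch of the stable formulation. It is an illustration only, not Paddle's actual kernel; the function name and the `main` driver are invented for this example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>

// Illustrative sketch (not Paddle's kernel): numerically stable
// sigmoid cross-entropy with logits for a single element.
//
//   loss = max(x, 0) - x * label + log(1 + exp(-|x|))
//
// This is algebraically equal to
//   -label * log(sigmoid(x)) - (1 - label) * log(1 - sigmoid(x)),
// but since -|x| <= 0, exp() never overflows for large |x|.
double StableSigmoidCrossEntropy(double x, double label) {
  return std::max(x, 0.0) - x * label + std::log1p(std::exp(-std::abs(x)));
}

int main() {
  // The naive form x - x*label + log(1 + exp(-x)) would compute
  // exp(1000) here and overflow to inf; the stable form returns ~1000,
  // matching -log(sigmoid(-1000)).
  std::printf("%f\n", StableSigmoidCrossEntropy(-1000.0, 1.0));
  // Sanity check: x = 0, label = 0.5 gives log(2) ~= 0.693147.
  std::printf("%f\n", StableSigmoidCrossEntropy(0.0, 0.5));
  return 0;
}
```

The equivalence splits by sign: for x >= 0 the formula reduces directly to x - x * label + log(1 + exp(-x)), and for x < 0 it becomes -x * label + log(1 + exp(x)), which equals the same expression after factoring exp(x) out of the logarithm. Using `std::log1p` instead of `std::log(1 + ...)` also avoids precision loss when exp(-|x|) is tiny.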