Skip to content
体验新版
项目
组织
正在加载...
登录
切换导航
打开侧边栏
机器未来
Paddle
提交
1b0b253d
P
Paddle
项目概览
机器未来
/
Paddle
与 Fork 源项目一致
Fork自
PaddlePaddle / Paddle
通知
1
Star
1
Fork
0
代码
文件
提交
分支
Tags
贡献者
分支图
Diff
Issue
1
列表
看板
标记
里程碑
合并请求
0
Wiki
0
Wiki
分析
仓库
DevOps
项目成员
Pages
P
Paddle
项目概览
项目概览
详情
发布
仓库
仓库
文件
提交
分支
标签
贡献者
分支图
比较
Issue
1
Issue
1
列表
看板
标记
里程碑
合并请求
0
合并请求
0
Pages
分析
分析
仓库分析
DevOps
Wiki
0
Wiki
成员
成员
收起侧边栏
关闭侧边栏
动态
分支图
创建新Issue
提交
Issue看板
“2d16d69b5f06cbd00fafe42f71d47328c9b8a7f4”上不存在“paddle/phi/kernels/cpu/searchsorted_kernel.cc”
未验证
提交
1b0b253d
编写于
8月 31, 2022
作者:
G
Guanghua Yu
提交者:
GitHub
8月 31, 2022
浏览文件
操作
浏览文件
下载
电子邮件补丁
差异文件
fix softmax_with_cross_entropy en docs (#45527)
上级
fe2bfe15
变更
1
显示空白变更内容
内联
并排
Showing
1 changed file
with
90 addition
and
13 deletion
+90
-13
python/paddle/nn/functional/loss.py
python/paddle/nn/functional/loss.py
+90
-13
未找到文件。
python/paddle/nn/functional/loss.py
浏览文件 @
1b0b253d
...
@@ -184,27 +184,19 @@ def fluid_softmax_with_cross_entropy(logits,
...
@@ -184,27 +184,19 @@ def fluid_softmax_with_cross_entropy(logits,
1) Hard label (one-hot label, so every sample has exactly one class)
1) Hard label (one-hot label, so every sample has exactly one class)
.. math::
.. math::
\\loss_j=-\text{logits}_{label_j} +\log\left(\sum_{i=0}^{K}\exp(\text{logits}_i)\right), j = 1,..., K
loss_j = -\\text{logits}_{label_j} +
\\log\\left(\\sum_{i=0}^{K}\\exp(\\text{logits}_i)\\right), j = 1,..., K
2) Soft label (each sample can have a distribution over all classes)
2) Soft label (each sample can have a distribution over all classes)
.. math::
.. math::
\\loss_j= -\sum_{i=0}^{K}\text{label}_i\left(\text{logits}_i - \log\left(\sum_{i=0}^{K}\exp(\text{logits}_i)\right)\right), j = 1,...,K
loss_j = -\\sum_{i=0}^{K}\\text{label}_i
\\left(\\text{logits}_i - \\log\\left(\\sum_{i=0}^{K}
\\exp(\\text{logits}_i)\\right)\\right), j = 1,...,K
3) If :attr:`numeric_stable_mode` is :attr:`True`, softmax is calculated first by:
3) If :attr:`numeric_stable_mode` is :attr:`True`, softmax is calculated first by:
.. math::
.. math::
\\max_j&=\max_{i=0}^{K}{\text{logits}_i} \\
max_j &= \\max_{i=0}^{K}{\\text{logits}_i}
log\_max\_sum_j &= \log\sum_{i=0}^{K}\exp(logits_i - max_j)\\
softmax_j &= \exp(logits_j - max_j - {log\_max\_sum}_j)
log\\_max\\_sum_j &= \\log\\sum_{i=0}^{K}\\exp(logits_i - max_j)
softmax_j &= \\exp(logits_j - max_j - {log\\_max\\_sum}_j)
and then cross entropy loss is calculated by softmax and label.
and then cross entropy loss is calculated by softmax and label.
...
@@ -2030,6 +2022,91 @@ def softmax_with_cross_entropy(logits,
...
@@ -2030,6 +2022,91 @@ def softmax_with_cross_entropy(logits,
numeric_stable_mode
=
True
,
numeric_stable_mode
=
True
,
return_softmax
=
False
,
return_softmax
=
False
,
axis
=-
1
):
axis
=-
1
):
r
"""
This operator implements the cross entropy loss function with softmax. This function
combines the calculation of the softmax operation and the cross entropy loss function
to provide a more numerically stable gradient.
Because this operator performs a softmax on logits internally, it expects
unscaled logits. This operator should not be used with the output of
softmax operator since that would produce incorrect results.
When the attribute :attr:`soft_label` is set :attr:`False`, this operators
expects mutually exclusive hard labels, each sample in a batch is in exactly
one class with a probability of 1.0. Each sample in the batch will have a
single label.
The equation is as follows:
1) Hard label (one-hot label, so every sample has exactly one class)
.. math::
\\loss_j=-\text{logits}_{label_j} +\log\left(\sum_{i=0}^{K}\exp(\text{logits}_i)\right), j = 1,..., K
2) Soft label (each sample can have a distribution over all classes)
.. math::
\\loss_j= -\sum_{i=0}^{K}\text{label}_i\left(\text{logits}_i - \log\left(\sum_{i=0}^{K}\exp(\text{logits}_i)\right)\right), j = 1,...,K
3) If :attr:`numeric_stable_mode` is :attr:`True`, softmax is calculated first by:
.. math::
\\max_j&=\max_{i=0}^{K}{\text{logits}_i} \\
log\_max\_sum_j &= \log\sum_{i=0}^{K}\exp(logits_i - max_j)\\
softmax_j &= \exp(logits_j - max_j - {log\_max\_sum}_j)
and then cross entropy loss is calculated by softmax and label.
Args:
logits (Tensor): A multi-dimension ``Tensor`` , and the data type is float32 or float64. The input tensor of unscaled log probabilities.
label (Tensor): The ground truth ``Tensor`` , data type is the same
as the ``logits`` . If :attr:`soft_label` is set to :attr:`True`,
Label is a ``Tensor`` in the same shape with :attr:`logits`.
If :attr:`soft_label` is set to :attr:`True`, Label is a ``Tensor``
in the same shape with :attr:`logits` expect shape in dimension :attr:`axis` as 1.
soft_label (bool, optional): A flag to indicate whether to interpretant the given
labels as soft labels. Default False.
ignore_index (int, optional): Specifies a target value that is ignored and does
not contribute to the input gradient. Only valid
if :attr:`soft_label` is set to :attr:`False`.
Default: kIgnoreIndex(-100).
numeric_stable_mode (bool, optional): A flag to indicate whether to use a more
numerically stable algorithm. Only valid
when :attr:`soft_label` is :attr:`False`
and GPU is used. When :attr:`soft_label`
is :attr:`True` or CPU is used, the
algorithm is always numerically stable.
Note that the speed may be slower when use
stable algorithm. Default: True.
return_softmax (bool, optional): A flag indicating whether to return the softmax
along with the cross entropy loss. Default: False.
axis (int, optional): The index of dimension to perform softmax calculations. It
should be in range :math:`[-1, rank - 1]`, while :math:`rank`
is the rank of input :attr:`logits`. Default: -1.
Returns:
``Tensor`` or Tuple of two ``Tensor`` : Return the cross entropy loss if \
`return_softmax` is False, otherwise the tuple \
(loss, softmax), softmax is in the same shape \
with input logits and cross entropy loss is in \
the same shape with input logits except shape \
in dimension :attr:`axis` as 1.
Examples:
.. code-block:: python
import paddle
import numpy as np
data = np.random.rand(128).astype("float32")
label = np.random.rand(1).astype("int64")
data = paddle.to_tensor(data)
label = paddle.to_tensor(label)
linear = paddle.nn.Linear(128, 100)
x = linear(data)
out = paddle.nn.functional.softmax_with_cross_entropy(logits=x, label=label)
print(out)
"""
return
fluid_softmax_with_cross_entropy
(
logits
,
label
,
soft_label
,
return
fluid_softmax_with_cross_entropy
(
logits
,
label
,
soft_label
,
ignore_index
,
numeric_stable_mode
,
ignore_index
,
numeric_stable_mode
,
return_softmax
,
axis
)
return_softmax
,
axis
)
...
...
编辑
预览
Markdown
is supported
0%
请重试
或
添加新附件
.
添加附件
取消
You are about to add
0
people
to the discussion. Proceed with caution.
先完成此消息的编辑!
取消
想要评论请
注册
或
登录