Commit d7b20584 authored by Cao Ying, committed by GitHub

Merge pull request #3845 from lcy-seso/rename_mse_to_square_error

Rename mse_cost to square_error_cost.
@@ -434,9 +434,9 @@ lambda_cost
 .. autoclass:: paddle.v2.layer.lambda_cost
     :noindex:
-mse_cost
---------
-.. autoclass:: paddle.v2.layer.mse_cost
+square_error_cost
+-----------------
+.. autoclass:: paddle.v2.layer.square_error_cost
     :noindex:
 rank_cost
......
@@ -55,7 +55,7 @@ PaddlePaddle is a deep learning platform that originated at Baidu. This brief introduction
     # Linear layer: ȳ = wx + b
     ȳ = fc_layer(input=x, param_attr=ParamAttr(name='w'), size=1, act=LinearActivation(), bias_attr=ParamAttr(name='b'))
     # Cost function: the distance between ȳ and the true y
-    cost = mse_cost(input=ȳ, label=y)
+    cost = square_error_cost(input=ȳ, label=y)
     outputs(cost)
@@ -69,7 +69,7 @@ PaddlePaddle is a deep learning platform that originated at Baidu. This brief introduction
 - **Data layers**: a data layer `data_layer` is the entry point of the network; it reads data in and passes it on to the layers that follow. There are two data layers here, corresponding to the variables `x` and `y`.
 - **Fully connected layer**: the fully connected layer `fc_layer` is the basic computation unit; here it models the linear relationship between the variables. Computation units are the core of a neural network. PaddlePaddle supports a large number of computation units and arbitrarily deep network connections, so it can fit arbitrary functions and learn complex data relationships.
-- **Regression cost layer**: the regression cost layer `mse_cost` is one of many cost-function layers. During training these layers serve as the network's exit: they compute the model error and provide the objective function for optimizing the model parameters.
+- **Regression cost layer**: the regression cost layer `square_error_cost` is one of many cost-function layers. During training these layers serve as the network's exit: they compute the model error and provide the objective function for optimizing the model parameters.
 After defining the network and saving it as `trainer_config.py`, run the following training command:
......
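For intuition, the three layer types described in the bullets above compose into a single scalar training objective. Below is a rough plain-numpy sketch of what the pipeline computes for one sample; it is not PaddlePaddle code, and the values of `w` and `b` are made up.

```python
# Minimal numpy sketch of the data / fc / cost pipeline; hypothetical values.
import numpy as np

w, b = np.array([[0.5]]), np.array([0.2])  # fc_layer parameters 'w' and 'b'
x = np.array([1.0])                        # data_layer 'x'
y = np.array([0.7])                        # data_layer 'y' (the label)

y_hat = x @ w + b                          # fc_layer: ȳ = wx + b
cost = np.sum((y_hat - y) ** 2)            # square_error_cost: Σ(ȳ - y)²
```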
@@ -49,7 +49,7 @@ To recover this relationship between ``X`` and ``Y``, we use a neural network wi
     x = data_layer(name='x', size=1)
     y = data_layer(name='y', size=1)
     y_predict = fc_layer(input=x, param_attr=ParamAttr(name='w'), size=1, act=LinearActivation(), bias_attr=ParamAttr(name='b'))
-    cost = mse_cost(input=y_predict, label=y)
+    cost = square_error_cost(input=y_predict, label=y)
     outputs(cost)
 
 Some of the most fundamental usages of PaddlePaddle are demonstrated:
......
@@ -8,7 +8,7 @@ paddle.init(use_gpu=False)
 x = paddle.layer.data(name='x', type=paddle.data_type.dense_vector(2))
 y_predict = paddle.layer.fc(input=x, size=1, act=paddle.activation.Linear())
 y = paddle.layer.data(name='y', type=paddle.data_type.dense_vector(1))
-cost = paddle.layer.mse_cost(input=y_predict, label=y)
+cost = paddle.layer.square_error_cost(input=y_predict, label=y)
 # create parameters
 parameters = paddle.parameters.create(cost)
......
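For context, the snippet above typically continues by building a trainer around `cost` and `parameters`. The following is a hedged sketch assuming the v2 trainer API of this era; the optimizer choice, the toy reader, and the feeding map are illustrative assumptions, not part of this commit.

```python
# Sketch only: optimizer, reader, and batch size are illustrative assumptions.
import paddle.v2 as paddle

optimizer = paddle.optimizer.Momentum(momentum=0)
trainer = paddle.trainer.SGD(
    cost=cost, parameters=parameters, update_equation=optimizer)

def reader():  # hypothetical 2-feature regression samples
    yield [0.0, 1.0], [0.5]
    yield [1.0, 0.0], [0.3]

trainer.train(
    reader=paddle.batch(reader, batch_size=2),
    feeding={'x': 0, 'y': 1},  # map reader columns onto the data layers
    num_passes=1)
```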
@@ -81,9 +81,9 @@ PaddlePaddle supports different types of input data, mainly including four kinds, and
 .. code-block:: python
 
     y_predict = paddle.layer.fc(input=x, size=1, act=paddle.activation.Linear())
-    cost = paddle.layer.mse_cost(input=y_predict, label=y)
+    cost = paddle.layer.square_error_cost(input=y_predict, label=y)
 
 Here x and y are the input layers described earlier; y_predict takes x as input and attaches a fully connected layer, while cost takes y_predict and y as inputs and attaches the square error layer.
 The final layer, cost, records the complete topology of the network: by composing different layers, we can build up the whole neural network.
@@ -147,4 +147,4 @@ PaddlePaddle supports different types of input data, mainly including four kinds, and
 .. literalinclude:: src/train.py
     :linenos:
 
 For a real-world application of linear regression, see `Chapter 1 <http://book.paddlepaddle.org/index.html>`_ of the PaddlePaddle book.
\ No newline at end of file
@@ -213,7 +213,7 @@ I1116 09:10:17.123440 50 Util.cpp:130] Calling runInitFunctions
 I1116 09:10:17.123764 50 Util.cpp:143] Call runInitFunctions done.
 [WARNING 2016-11-16 09:10:17,227 default_decorators.py:40] please use keyword arguments in paddle config.
 [INFO 2016-11-16 09:10:17,239 networks.py:1282] The input order is [movie_id, title, genres, user_id, gender, age, occupation, rating]
-[INFO 2016-11-16 09:10:17,239 networks.py:1289] The output order is [__mse_cost_0__]
+[INFO 2016-11-16 09:10:17,239 networks.py:1289] The output order is [__square_error_cost_0__]
 I1116 09:10:17.392917 50 Trainer.cpp:170] trainer mode: Normal
 I1116 09:10:17.613910 50 PyDataProvider2.cpp:257] loading dataprovider dataprovider::process
 I1116 09:10:17.680917 50 PyDataProvider2.cpp:257] loading dataprovider dataprovider::process
......
@@ -53,7 +53,7 @@ __all__ = [
     'cos_sim',
     'hsigmoid',
     'conv_projection',
-    'mse_cost',
+    'square_error_cost',
     'regression_cost',
     'classification_cost',
     'LayerOutput',
@@ -4238,13 +4238,18 @@ def __cost_input__(input, label, weight=None):
 @wrap_name_default()
 @layer_support()
-def mse_cost(input, label, weight=None, name=None, coeff=1.0, layer_attr=None):
+def square_error_cost(input,
+                      label,
+                      weight=None,
+                      name=None,
+                      coeff=1.0,
+                      layer_attr=None):
     """
-    mean squared error cost:
+    sum of square error cost:
 
     .. math::
 
-        \\frac{1}{N}\sum_{i=1}^N(t_i-y_i)^2
+        cost = \\sum_{i=1}^N(t_i-y_i)^2
 
     :param name: layer name.
     :type name: basestring
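The docstring change above is substantive, not just cosmetic: the layer is documented as a sum of squared errors with no 1/N factor, which is what made the old name mse_cost misleading. A quick numpy illustration of the difference (illustration only, not the layer implementation):

```python
import numpy as np

t = np.array([1.0, 2.0, 3.0])  # labels t_i
y = np.array([1.5, 1.5, 2.0])  # predictions y_i
sum_sq = np.sum((t - y) ** 2)  # 1.5 -- what square_error_cost documents
mse = sum_sq / len(t)          # 0.5 -- what the old name "mse_cost" implied
```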
@@ -4273,7 +4278,7 @@ def mse_cost(input, label, weight=None, name=None, coeff=1.0, layer_attr=None):
     return LayerOutput(name, LayerType.COST, parents=parents, size=1)
 
-regression_cost = mse_cost
+regression_cost = square_error_cost
 
 @wrap_name_default("cost")
@@ -5798,9 +5803,9 @@ def huber_regression_cost(input,
                           coeff=1.0,
                           layer_attr=None):
     """
     In statistics, the Huber loss is a loss function used in robust regression;
     it is less sensitive to outliers in data than the squared error loss.
     Given a prediction f(x), a label y and :math:`\delta`, the loss function
     is defined as:
 
     .. math:
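The body of the `.. math:` block is collapsed in this diff and is left untouched here. For reference only, the standard textbook Huber loss that this prose describes is quadratic for small residuals and linear for large ones; a sketch under the assumption that the layer follows the textbook form:

```python
# Standard textbook Huber loss; reference sketch, not the layer's implementation.
import numpy as np

def huber(y, f, delta=1.0):
    r = np.abs(y - f)
    return np.where(r <= delta,
                    0.5 * r ** 2,               # squared-error region near zero
                    delta * (r - 0.5 * delta))  # linear region, robust to outliers
```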
@@ -5848,13 +5853,13 @@ def huber_classification_cost(input,
                               coeff=1.0,
                               layer_attr=None):
     """
     For classification purposes, a variant of the Huber loss called modified Huber
     is sometimes used. Given a prediction f(x) (a real-valued classifier score) and
     a true binary class label :math:`y\in \left \{-1, 1 \right \}`, the modified
     Huber loss is defined as:
 
     .. math:
 
        loss = \max \left ( 0, 1-yf(x) \right )^2, \quad yf(x)\geq -1
 
        loss = -4yf(x), \quad \text{otherwise}
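A plain-numpy sketch of the modified Huber loss written out above (illustrative only, not the layer's implementation); note that the two branches meet continuously at yf(x) = -1:

```python
# Modified Huber loss for y in {-1, 1} and a real-valued score f; sketch only.
import numpy as np

def modified_huber(y, f):
    z = y * f  # the margin
    return np.where(z >= -1.0,
                    np.maximum(0.0, 1.0 - z) ** 2,  # squared-hinge region
                    -4.0 * z)                       # linear penalty for severe errors
```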
     The example usage is:
......
@@ -45,7 +45,7 @@ layers {
   coeff: 1.0
 }
 layers {
-  name: "__mse_cost_0__"
+  name: "__square_error_cost_0__"
   type: "square_error"
   size: 1
   active_type: ""
@@ -130,7 +130,7 @@ input_layer_names: "label"
 input_layer_names: "weight"
 input_layer_names: "multi_class_label"
 output_layer_names: "__cost_0__"
-output_layer_names: "__mse_cost_0__"
+output_layer_names: "__square_error_cost_0__"
 output_layer_names: "__nce_layer_0__"
 evaluators {
   name: "classification_error_evaluator"
@@ -146,7 +146,7 @@ sub_models {
   layer_names: "weight"
   layer_names: "__fc_layer_0__"
   layer_names: "__cost_0__"
-  layer_names: "__mse_cost_0__"
+  layer_names: "__square_error_cost_0__"
   layer_names: "multi_class_label"
   layer_names: "__nce_layer_0__"
   input_layer_names: "input"
@@ -154,7 +154,7 @@ sub_models {
   input_layer_names: "weight"
   input_layer_names: "multi_class_label"
   output_layer_names: "__cost_0__"
-  output_layer_names: "__mse_cost_0__"
+  output_layer_names: "__square_error_cost_0__"
   output_layer_names: "__nce_layer_0__"
   evaluator_names: "classification_error_evaluator"
   is_recurrent_layer_group: false
......
@@ -10,7 +10,7 @@ fc = fc_layer(input=data, size=10, act=SoftmaxActivation())
 outputs(
     classification_cost(
         input=fc, label=lbl, weight=wt),
-    mse_cost(
+    square_error_cost(
         input=fc, label=lbl, weight=wt),
     nce_layer(
         input=fc,
......
@@ -134,8 +134,9 @@ class CostLayerTest(unittest.TestCase):
         cost3 = layer.cross_entropy_cost(input=inference, label=label)
         cost4 = layer.cross_entropy_with_selfnorm_cost(
             input=inference, label=label)
-        cost5 = layer.mse_cost(input=inference, label=label)
-        cost6 = layer.mse_cost(input=inference, label=label, weight=weight)
+        cost5 = layer.square_error_cost(input=inference, label=label)
+        cost6 = layer.square_error_cost(
+            input=inference, label=label, weight=weight)
         cost7 = layer.multi_binary_label_cross_entropy_cost(
             input=inference, label=label)
         cost8 = layer.rank_cost(left=score, right=score, label=score)
......