Unverified · Commit 643a097d · Author: Aurelius84 · Committer: GitHub

add name decription of weight, bias (#1395)

* add name of weight, bias

* refine doc

* refine op and layers
Parent fe478f5d
......@@ -73,7 +73,11 @@ In Fluid, a :code:`Variable` can hold values of any type, in most
Name
=========
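In Fluid, some Operators contain a :code:`name` parameter, such as :ref:`cn_api_fluid_layers_fc` . This parameter is commonly used to label the network layer corresponding to the OP, helping developers quickly locate the source of each layer's output data when printing debugging information. If :code:`name` is not specified in the OP, it defaults to None, and when the layer is printed Fluid automatically generates a unique identifier of the form ``OPName_number.tmp_number`` to name the layer, where the numbers are incremented automatically to distinguish different layers under the same OP; if :code:`name` is specified, the layer is named with the unique identifier ``nameValue_number.tmp_number`` .
In Fluid, some network layers take a :code:`name` parameter, such as :ref:`cn_api_fluid_layers_fc` . This :code:`name` is generally used as the prefix identifier for the layer's outputs and weights. The specific rules are as follows:
* Prefix identifier for layer outputs. If :code:`name` is specified in the layer, Fluid names the layer output with the unique identifier ``nameValue_number.tmp_number`` ; if :code:`name` is not specified, the output is named ``OPName_number.tmp_number`` instead, where the numbers are incremented automatically to distinguish different layers under the same OP.
* Prefix identifier for weight or bias variables. If a layer creates weight or bias variables through ``param_attr`` and ``bias_attr`` , as :ref:`cn_api_fluid_layers_embedding` and :ref:`cn_api_fluid_layers_fc` do, Fluid automatically generates ``prefix.w_number`` or ``prefix.b_number`` as the unique identifier for each, where ``prefix`` is the user-specified :code:`name` or the automatically generated ``OPName_number`` . If a :code:`name` is specified in ``param_attr`` or ``bias_attr`` , that :code:`name` is used and no name is generated automatically. Refer to the sample code for details.
In addition, multiple network layers can share weights by specifying the :code:`name` parameter in :ref:`cn_api_fluid_ParamAttr` .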
......@@ -85,13 +89,14 @@ In Fluid, some Operators contain a :code:`name` parameter, such as :ref:`cn_api_fluid_la
import numpy as np
x = fluid.layers.data(name='x', shape=[1], dtype='int64', lod_level=1)
emb = fluid.layers.embedding(input=x, size=(128, 100))
emb = fluid.layers.embedding(input=x, size=(128, 100)) # embedding_0.w_0
emb = fluid.layers.Print(emb) # Tensor[embedding_0.tmp_0]
# default name
fc_none = fluid.layers.fc(input=emb, size=1)
fc_none = fluid.layers.fc(input=emb, size=1) # fc_0.w_0, fc_0.b_0
fc_none = fluid.layers.Print(fc_none) # Tensor[fc_0.tmp_1]
fc_none1 = fluid.layers.fc(input=emb, size=1)
fc_none1 = fluid.layers.fc(input=emb, size=1) # fc_1.w_0, fc_1.b_0
fc_none1 = fluid.layers.Print(fc_none1) # Tensor[fc_1.tmp_1]
# name in ParamAttr
......@@ -99,10 +104,10 @@ In Fluid, some Operators contain a :code:`name` parameter, such as :ref:`cn_api_fluid_la
print(w_param_attrs.name) # fc_weight
# name == 'my_fc'
my_fc1 = fluid.layers.fc(input=emb, size=1, name='my_fc', param_attr=w_param_attrs)
my_fc1 = fluid.layers.fc(input=emb, size=1, name='my_fc', param_attr=w_param_attrs) # fc_weight, my_fc.b_0
my_fc1 = fluid.layers.Print(my_fc1) # Tensor[my_fc.tmp_1]
my_fc2 = fluid.layers.fc(input=emb, size=1, name='my_fc', param_attr=w_param_attrs)
my_fc2 = fluid.layers.fc(input=emb, size=1, name='my_fc', param_attr=w_param_attrs) # fc_weight, my_fc.b_1
my_fc2 = fluid.layers.Print(my_fc2) # Tensor[my_fc.tmp_3]
place = fluid.CPUPlace()
......@@ -113,9 +118,11 @@ In Fluid, some Operators contain a :code:`name` parameter, such as :ref:`cn_api_fluid_la
ret = exe.run(feed={'x': x_lodTensor}, fetch_list=[fc_none, fc_none1, my_fc1, my_fc2], return_numpy=False)
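In the above example, there are four fully connected layers in total. Neither ``fc_none`` nor ``fc_none1`` specifies the :code:`name` parameter, so they are named ``fc_0.tmp_1`` and ``fc_1.tmp_1`` in the form ``OPName_number.tmp_number`` , where the numbers in ``fc_1`` and ``fc_0`` are incremented automatically to distinguish the two fully connected layers. The other two fully connected layers, ``my_fc1`` and ``my_fc2`` , both specify the :code:`name` parameter, but with the same value. Fluid appends the suffix ``tmp_number`` to the layer name in code order to distinguish them, so the network layer names are ``my_fc.tmp_1`` and ``my_fc.tmp_3`` .
In the above example, neither ``fc_none`` nor ``fc_none1`` specifies the :code:`name` parameter, so the outputs of these OPs are named ``fc_0.tmp_1`` and ``fc_1.tmp_1`` in the form ``OPName_number.tmp_number`` , where the numbers in ``fc_0`` and ``fc_1`` are incremented automatically to distinguish the two fully connected layers. ``my_fc1`` and ``my_fc2`` both specify the :code:`name` parameter, but with the same value. Fluid distinguishes them by the suffix ``tmp_number`` , giving ``my_fc.tmp_1`` and ``my_fc.tmp_3`` .
For variables created inside the layers, the ``emb`` layer and the ``fc_none`` and ``fc_none1`` layers all name their weight or bias variables with the default prefix ``OPName_number`` , such as ``embedding_0.w_0`` , ``fc_0.w_0`` and ``fc_0.b_0`` . This prefix is consistent with the prefix of the OP output. The ``my_fc1`` and ``my_fc2`` layers preferentially name the shared weight ``fc_weight`` , as specified in ``ParamAttr`` . The bias variables ``my_fc.b_0`` and ``my_fc.b_1`` fall back to :code:`name` as their prefix.
In addition, in the above example, the two fully connected layers ``my_fc1`` and ``my_fc2`` share their weight parameters by constructing a ``ParamAttr`` and specifying its :code:`name` parameter.
In the above example, the two fully connected layers ``my_fc1`` and ``my_fc2`` share their weight variables by constructing a ``ParamAttr`` and specifying its :code:`name` parameter.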
.. _api_guide_ParamAttr:
......
......@@ -72,7 +72,11 @@ All the learnable parameters in the model are kept in the memory space in form o
Name
=========
In Fluid, some operators contain the parameter :code:`name` , such as :ref:`api_fluid_layers_fc` . This parameter is often used to name the network layer corresponding to the OP, which can help developers quickly locate the source of the output data from each network layer when printing debugging information. If the :code:`name` parameter is not specified in the OP, the default value is None. When printing the network layer, Fluid will automatically generate a unique identifier such as ``OPName_number.tmp_number`` to name the layer. The numbers are automatically incremented to distinguish different network layers under the same OP. If the :code:`name` parameter is specified, the network layer is named with ``nameValue_number.tmp_number`` as the unique identifier.
In Fluid, some layers take a :code:`name` parameter, such as :ref:`api_fluid_layers_fc` . This :code:`name` is generally used as the prefix identifier for the layer's outputs and weights. The specific rules are as follows:
* Prefix identifier for layer outputs. If :code:`name` is specified in the layer, Fluid names the output ``nameValue_number.tmp_number`` . If :code:`name` is not specified, the output is named ``OPName_number.tmp_number`` instead. The numbers are incremented automatically to distinguish different network layers created by the same operator.
* Prefix identifier for weight or bias variables. If a layer creates weight or bias variables through ``param_attr`` and ``bias_attr`` , as :ref:`api_fluid_layers_embedding` and :ref:`api_fluid_layers_fc` do, Fluid generates ``prefix.w_number`` or ``prefix.b_number`` as the unique identifier for each, where ``prefix`` is the :code:`name` specified by the user or the ``OPName_number`` generated by default. If a :code:`name` is specified in ``param_attr`` or ``bias_attr`` , that :code:`name` is used and no name is generated automatically. Refer to the sample code for details.
In addition, multiple network layers can share weights by specifying the :code:`name` parameter in :ref:`api_fluid_ParamAttr`.
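The parameter-naming rules above can be sketched in plain Python. This is only a toy model under stated assumptions, not Fluid's actual implementation: the class ``NameScope`` and its methods ``layer_prefix`` and ``param_name`` are hypothetical names introduced for illustration, and the sketch covers only weight and bias names, not the ``tmp_number`` output names.

```python
from collections import defaultdict


class NameScope:
    """Toy model of the naming rules (hypothetical, not Fluid's real code)."""

    def __init__(self):
        # One independent counter per key, each starting at 0.
        self._counters = defaultdict(int)

    def _next(self, key):
        idx = self._counters[key]
        self._counters[key] += 1
        return idx

    def layer_prefix(self, op_type, name=None):
        # Assumption: a user-specified name is used as the prefix unchanged;
        # otherwise the prefix is OPName_number with an incrementing number.
        if name is not None:
            return name
        return f"{op_type}_{self._next(op_type)}"

    def param_name(self, prefix, kind, attr_name=None):
        # A name given in ParamAttr takes priority over any generated name.
        if attr_name is not None:
            return attr_name
        key = f"{prefix}.{kind}"  # kind is "w" for weights, "b" for biases
        return f"{key}_{self._next(key)}"


scope = NameScope()
emb_weight = scope.param_name(scope.layer_prefix("embedding"), "w")
fc0 = scope.layer_prefix("fc")
fc1 = scope.layer_prefix("fc")
my_fc = scope.layer_prefix("fc", name="my_fc")
shared_weight = scope.param_name(my_fc, "w", attr_name="fc_weight")
first_bias = scope.param_name(my_fc, "b")
second_bias = scope.param_name(my_fc, "b")
print(emb_weight, fc0, fc1, shared_weight, first_bias, second_bias)
```

Keeping a separate counter per prefix is a design choice of this sketch; it is what lets two layers with the same user-specified :code:`name` still receive distinct bias names.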
......@@ -84,13 +88,14 @@ Sample Code:
import numpy as np
x = fluid.layers.data(name='x', shape=[1], dtype='int64', lod_level=1)
emb = fluid.layers.embedding(input=x, size=(128, 100))
emb = fluid.layers.embedding(input=x, size=(128, 100)) # embedding_0.w_0
emb = fluid.layers.Print(emb) # Tensor[embedding_0.tmp_0]
# default name
fc_none = fluid.layers.fc(input=emb, size=1)
fc_none = fluid.layers.fc(input=emb, size=1) # fc_0.w_0, fc_0.b_0
fc_none = fluid.layers.Print(fc_none) # Tensor[fc_0.tmp_1]
fc_none1 = fluid.layers.fc(input=emb, size=1)
fc_none1 = fluid.layers.fc(input=emb, size=1) # fc_1.w_0, fc_1.b_0
fc_none1 = fluid.layers.Print(fc_none1) # Tensor[fc_1.tmp_1]
# name in ParamAttr
......@@ -98,10 +103,10 @@ Sample Code:
print(w_param_attrs.name) # fc_weight
# name == 'my_fc'
my_fc1 = fluid.layers.fc(input=emb, size=1, name='my_fc', param_attr=w_param_attrs)
my_fc1 = fluid.layers.fc(input=emb, size=1, name='my_fc', param_attr=w_param_attrs) # fc_weight, my_fc.b_0
my_fc1 = fluid.layers.Print(my_fc1) # Tensor[my_fc.tmp_1]
my_fc2 = fluid.layers.fc(input=emb, size=1, name='my_fc', param_attr=w_param_attrs)
my_fc2 = fluid.layers.fc(input=emb, size=1, name='my_fc', param_attr=w_param_attrs) # fc_weight, my_fc.b_1
my_fc2 = fluid.layers.Print(my_fc2) # Tensor[my_fc.tmp_3]
place = fluid.CPUPlace()
......@@ -112,9 +117,11 @@ Sample Code:
ret = exe.run(feed={'x': x_lodTensor}, fetch_list=[fc_none, fc_none1, my_fc1, my_fc2], return_numpy=False)
In the above example, there are four fully connected layers. Neither ``fc_none`` nor ``fc_none1`` specifies the :code:`name` parameter, so these two layers are named ``fc_0.tmp_1`` and ``fc_1.tmp_1`` in the form ``OPName_number.tmp_number`` , where the numbers in ``fc_1`` and ``fc_0`` are incremented automatically to distinguish the two fully connected layers. The other two fully connected layers, ``my_fc1`` and ``my_fc2`` , both specify the :code:`name` parameter, but with the same value. Fluid appends the suffix ``tmp_number`` to the name in code order to distinguish the two layers, so the network layer names are ``my_fc.tmp_1`` and ``my_fc.tmp_3`` .
In the above example, neither ``fc_none`` nor ``fc_none1`` specifies the :code:`name` parameter, so the outputs of these two layers are named ``fc_0.tmp_1`` and ``fc_1.tmp_1`` in the form ``OPName_number.tmp_number`` , where the numbers in ``fc_0`` and ``fc_1`` are incremented automatically to distinguish the two fully connected layers. The other two fully connected layers, ``my_fc1`` and ``my_fc2`` , both specify the :code:`name` parameter with the same value. Fluid distinguishes the two layers by the suffix ``tmp_number`` , giving ``my_fc.tmp_1`` and ``my_fc.tmp_3`` .
Variables created in the ``emb`` , ``fc_none`` and ``fc_none1`` layers are named with the default prefix ``OPName_number`` , such as ``embedding_0.w_0`` , ``fc_0.w_0`` and ``fc_0.b_0`` . This prefix is consistent with the prefix of the layer output. The ``my_fc1`` and ``my_fc2`` layers preferentially name the shared weight ``fc_weight`` , as specified in ``ParamAttr`` . The bias variables ``my_fc.b_0`` and ``my_fc.b_1`` fall back to the :code:`name` given in the operator as their prefix.
In addition, in the above example, the two fully connected layers ``my_fc1`` and ``my_fc2`` share their weight parameters by constructing a ``ParamAttr`` and specifying its :code:`name` parameter.
In the above example, the two fully connected layers ``my_fc1`` and ``my_fc2`` share their weight parameters by constructing a ``ParamAttr`` and specifying its :code:`name` parameter.
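Why a repeated :code:`name` leads to sharing can be illustrated with a name-keyed registry. The following is a minimal sketch, assuming that parameters are looked up by name and reused when the name already exists; ``ParamStore`` and ``get_or_create`` are hypothetical names, not part of the Fluid API, and the shapes are illustrative values only.

```python
class ParamStore:
    """Toy parameter registry keyed by name (hypothetical, not Fluid's API)."""

    def __init__(self):
        self._params = {}

    def get_or_create(self, name, shape):
        # Reuse the existing entry when the name is already registered,
        # so two layers asking for the same name get the same object.
        if name not in self._params:
            self._params[name] = {"name": name, "shape": tuple(shape)}
        return self._params[name]

    def __len__(self):
        return len(self._params)


store = ParamStore()
# Both layers request the weight name given in ParamAttr ...
w1 = store.get_or_create("fc_weight", (100, 1))
w2 = store.get_or_create("fc_weight", (100, 1))
# ... while each bias gets its own generated name.
b1 = store.get_or_create("my_fc.b_0", (1,))
b2 = store.get_or_create("my_fc.b_1", (1,))
print(w1 is w2, b1 is b2, len(store))
```

In this sketch the two weight handles refer to one entry while the biases stay separate, which mirrors the ``fc_weight`` , ``my_fc.b_0`` and ``my_fc.b_1`` names in the sample code.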
.. _api_guide_ParamAttr:
......