Commit 8e957df4 authored by luotao1, committed by emailweixu

fix bug in dotmul_operator's api and annotation (#99)

* fix bug in dotmul_operator's api and annotation
* update rnn document
* remove redundant info of projection and operator in layers.py
Parent 98bc889c
@@ -142,12 +142,15 @@ We also project the encoder vector to :code:`decoder_size` dimensional space, ge
 The decoder uses :code:`recurrent_group` to define the recurrent neural network. The step and output functions are defined in :code:`gru_decoder_with_attention`:
 .. code-block:: python
+    group_inputs=[StaticInput(input=encoded_vector,is_seq=True),
+                  StaticInput(input=encoded_proj,is_seq=True)]
     trg_embedding = embedding_layer(
         input=data_layer(name='target_language_word',
                          size=target_dict_dim),
         size=word_vector_dim,
         param_attr=ParamAttr(name='_target_language_embedding'))
+    group_inputs.append(trg_embedding)
     # For decoder equipped with attention mechanism, in training,
     # target embedding (the ground truth) is the data input,
     # while encoded source sequence is accessed as an unbounded memory.
@@ -156,13 +159,7 @@ The decoder uses :code:`recurrent_group` to define the recurrent neural network.
     # All sequence inputs should have the same length.
     decoder = recurrent_group(name=decoder_group_name,
                               step=gru_decoder_with_attention,
-                              input=[
-                                  StaticInput(input=encoded_vector,
-                                              is_seq=True),
-                                  StaticInput(input=encoded_proj,
-                                              is_seq=True),
-                                  trg_embedding
-                              ])
+                              input=group_inputs)
 The implementation of the step function is listed below. First, it defines the **memory** of the decoder network. Then it defines attention, the gated recurrent unit step function, and the output function:
@@ -217,10 +214,8 @@ The code is listed below:
 .. code-block:: python
-    gen_inputs = [StaticInput(input=encoded_vector,
-                              is_seq=True),
-                  StaticInput(input=encoded_proj,
-                              is_seq=True), ]
+    group_inputs=[StaticInput(input=encoded_vector,is_seq=True),
+                  StaticInput(input=encoded_proj,is_seq=True)]
     # In generation, decoder predicts a next target word based on
     # the encoded source sequence and the last generated target word.
     # The encoded source sequence (encoder's output) must be specified by
@@ -231,10 +226,10 @@ The code is listed below:
                                  size=target_dict_dim,
                                  embedding_name='_target_language_embedding',
                                  embedding_size=word_vector_dim)
-    gen_inputs.append(trg_embedding)
+    group_inputs.append(trg_embedding)
     beam_gen = beam_search(name=decoder_group_name,
                            step=gru_decoder_with_attention,
-                           input=gen_inputs,
+                           input=group_inputs,
                            id_input=data_layer(name="sent_id",
                                                size=1),
                            dict_file=trg_dict_path,
......
@@ -169,6 +169,12 @@ dotmul_projection
     :members: dotmul_projection
     :noindex:
+
+dotmul_operator
+---------------
+..  automodule:: paddle.trainer_config_helpers.layers
+    :members: dotmul_operator
+    :noindex:
 full_matrix_projection
 ----------------------
 ..  automodule:: paddle.trainer_config_helpers.layers
......
@@ -2464,11 +2464,11 @@ class MixedLayer(LayerBase):
             if size != 0:
                 self.set_layer_size(size)
             else:
-                size = operator.calc_output_size(operator_conf.input_sizes)
-                if size != 0:
-                    config_assert(size == self.config.size,
+                sz = operator.calc_output_size(operator_conf.input_sizes)
+                if sz != 0:
+                    config_assert(sz == self.config.size,
                                   "different inputs have different size: %s vs. %s" %
-                                  (size, self.config.size))
+                                  (sz, self.config.size))
         for input_index in xrange(len(self.inputs)):
             input_layer = self.get_input_layer(input_index)
             input = self.inputs[input_index]
......
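The one-letter rename above matters: the `else` branch rebound `size`, the very variable the surrounding `MixedLayer` code still relies on, so later checks could compare against a stale value. A minimal plain-Python sketch of the intended consistency check, assuming a hypothetical `check_operator_size` helper (not Paddle code):

```python
def check_operator_size(config_size, computed_sizes):
    # Each operator's computed output size (cf. operator.calc_output_size)
    # must equal the layer's configured size; 0 means "unknown" and is
    # skipped. The local is deliberately named `sz`, not `size`, echoing
    # the commit's fix for the variable-shadowing bug.
    for sz in computed_sizes:
        if sz != 0 and sz != config_size:
            raise ValueError("different inputs have different size: %s vs. %s"
                             % (sz, config_size))
    return config_size

print(check_operator_size(128, [128, 0, 128]))  # -> 128
```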
@@ -286,7 +286,6 @@ def full_matrix_projection(input, size=0, param_attr=None):
                                 size=size,
                                 **param_attr.attr)
     proj.origin = input
-    proj.origin.projection = "matrix"
     return proj
@@ -333,7 +332,6 @@ def table_projection(input, size=0, param_attr=None):
                                 size=size,
                                 **param_attr.attr)
     proj.origin = input
-    proj.origin.projection = "table"
     return proj
@@ -377,17 +375,15 @@ def identity_projection(input, offset=None):
     if offset is None:
         proj = IdentityProjection(input_layer_name=input.name)
         proj.origin = input
-        proj.origin.projection = 'identity'
     else:
         proj = IdentityOffsetProjection(input_layer_name=input.name,
                                         offset=offset)
         proj.origin = input
-        proj.origin.projection = 'identity_offset'
     return proj
 
 @wrap_param_attr_default()
-def dotmul_projection(input, param_attr=None, scale=1):
+def dotmul_projection(input, param_attr=None):
     """
     DotMulProjection with a layer as input.
     It performs element-wise multiplication with weight.
@@ -407,30 +403,35 @@ def dotmul_projection(input, param_attr=None, scale=1):
     :type input: LayerOutput
     :param param_attr: Parameter config, None if use default.
     :type param_attr: ParameterAttribute
-    :param scale: config scalar, default value is one.
-    :type scale: float
     :return: A DotMulProjection Object.
     :rtype: DotMulProjection
     """
     proj = DotMulProjection(input_layer_name=input.name,
                             size=input.size,
                             **param_attr.attr)
     proj.origin = input
     return proj
 
 def dotmul_operator(x, y, scale=1):
     """
     DotMulOperator takes two inputs and performs element-wise multiplication:
     .. math::
-       out.row[i] += scale * (in1.row[i] .* in2.row[i])
+       out.row[i] += scale * (x.row[i] .* y.row[i])
     where :math:`.*` means element-wise multiplication, and
     scale is a config scalar, its default value is one.
     The example usage is:
     .. code-block:: python
-       op = dotmul_operator(x, y,
-                            scale=1)
-    :param input: Input layer
-    :type input: LayerOutput
+       op = dotmul_operator(x=layer1, y=layer2, scale=0.5)
+    :param x: Input layer1
+    :type x: LayerOutput
+    :param y: Input layer2
+    :type y: LayerOutput
     :param scale: config scalar, default value is one.
     :type scale: float
     :return: A DotMulOperator Object.
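The corrected formula can be sanity-checked with a few lines of plain Python; `dotmul_rows` below is a hypothetical stand-in for the per-row computation, not part of Paddle's API:

```python
def dotmul_rows(x_row, y_row, scale=1.0):
    # out.row[i] += scale * (x.row[i] .* y.row[i]):
    # element-wise product of two equal-length rows, times a scalar.
    assert len(x_row) == len(y_row), "rows must have the same size"
    return [scale * a * b for a, b in zip(x_row, y_row)]

print(dotmul_rows([1, 2, 3], [4, 5, 6], scale=0.5))  # -> [2.0, 5.0, 9.0]
```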
@@ -487,7 +488,6 @@ def context_projection(input, context_len, context_start=None,
                              trainable_padding=trainable,
                              **extra_dict)
     proj.origin = input
-    proj.origin.projection = 'context'
     return proj
@@ -2728,7 +2728,6 @@ def conv_operator(img, filter, filter_size, num_filters,
                         stride_y=stride_y,
                         groups=groups))
     op.origin = [img, filter]
-    op.origin.operator = "conv_op"
     return op
......