Commit 45fd32c8 authored by Hongkun Yu, committed by A. Unique TensorFlower

Internal change

PiperOrigin-RevId: 477574174
Parent 902e59d0
@@ -335,7 +335,9 @@ class TransformerScaffold(tf.keras.layers.Layer):
           training=training)
       layer_output += source_attention_output
     else:
-      # if not norm_first, assume that the feedforwad does apply layer norm
+      # Attention: if not norm_first, assume that the feedforward block does
+      # apply layer norm. The feedforward block also applies the residual
+      # connection. See `GatedFeedforward` for a concrete example.
       layer_output = self._feedforward_block(attention_output,
                                              training=training)
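For context, a minimal sketch of a feedforward block that satisfies this contract might look like the following. `PostNormFeedforward` is a hypothetical name introduced here for illustration; `TransformerScaffold` and `GatedFeedforward` are the real layers in `official.nlp.modeling.layers`, and the sketch only assumes standard `tf.keras` APIs.

```python
import tensorflow as tf


class PostNormFeedforward(tf.keras.layers.Layer):
  """Hypothetical feedforward block for the post-norm (not norm_first) path.

  Per the comment above, the block itself must apply both the residual
  connection and the layer norm, as `GatedFeedforward` does internally.
  """

  def __init__(self, hidden_size, intermediate_size, dropout_rate=0.1,
               **kwargs):
    super().__init__(**kwargs)
    self._intermediate = tf.keras.layers.Dense(
        intermediate_size, activation="gelu")
    self._output = tf.keras.layers.Dense(hidden_size)
    self._dropout = tf.keras.layers.Dropout(dropout_rate)
    self._layer_norm = tf.keras.layers.LayerNormalization(
        axis=-1, epsilon=1e-12)

  def call(self, inputs, training=None):
    x = self._intermediate(inputs)
    x = self._output(x)
    x = self._dropout(x, training=training)
    # The block adds the residual and applies layer norm itself, so the
    # caller's post-norm branch does neither.
    return self._layer_norm(x + inputs)
```

A block like this would be plugged into `TransformerScaffold` via its `feedforward_cls` argument; note that `hidden_size` must match the input's last dimension so the residual addition `x + inputs` is well defined.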