机器未来 / Paddle — commit 4d3eefbb (fork of PaddlePaddle / Paddle)
Unverified commit 4d3eefbb, authored on Sep 30, 2020 by xiemoyuan, committed via GitHub on Sep 30, 2020.
Modify the docs for Transformer's APIs. test=document_fix (#27729)
Parent: ab85a891
Showing 1 changed file with 10 additions and 10 deletions (+10 −10).
python/paddle/nn/layer/transformer.py (+10 −10) @ 4d3eefbb
@@ -644,7 +644,7 @@ class TransformerDecoderLayer(Layer):
             `weight_attr` to create parameters. Default: None, which means the
             default weight parameter property is used. See usage for details
             in :ref:`api_fluid_ParamAttr` .
-        bias_attr (ParamAttr|tuple, optional): To specify the bias parameter property.
+        bias_attr (ParamAttr|tuple|bool, optional): To specify the bias parameter property.
             If it is a tuple, `bias_attr[0]` would be used as `bias_attr` for
             self attention, `bias_attr[1]` would be used as `bias_attr` for
             cross attention, and `bias_attr[2]` would be used as `bias_attr`
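The tuple semantics documented in the hunk above can be illustrated with a small standalone helper. This is a hypothetical sketch, not a Paddle API: it mirrors only the documented 3-tuple case (`bias_attr[0]` for self attention, `bias_attr[1]` for cross attention, `bias_attr[2]` for the feed-forward sublayer), and assumes a single scalar value (such as the newly documented `bool`) is shared by all three sublayers.

```python
# Hypothetical helper (not part of Paddle) illustrating the documented
# 3-tuple semantics of `bias_attr` in TransformerDecoderLayer.
def split_bias_attr(bias_attr):
    """Return per-sublayer bias settings as (self_attn, cross_attn, ffn)."""
    if isinstance(bias_attr, (list, tuple)):
        if len(bias_attr) != 3:
            raise ValueError("expected 3 entries: (self_attn, cross_attn, ffn)")
        return tuple(bias_attr)
    # A scalar (e.g. a bool or a single ParamAttr) applies to all sublayers.
    return (bias_attr, bias_attr, bias_attr)

print(split_bias_attr(False))             # one bool shared by all three sublayers
print(split_bias_attr((True, False, True)))
```

Passing `False` for a slot corresponds to the documented behavior that the respective layer would have no trainable bias parameter.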
@@ -982,12 +982,12 @@ class Transformer(Layer):
         applies another layer normalization on the output of last encoder/decoder layer.

     Parameters:
-        d_model (int): The expected feature size in the encoder/decoder input
-            and output.
-        nhead (int): The number of heads in multi-head attention(MHA).
-        num_encoder_layers (int): The number of layers in encoder.
-        num_encoder_layers (int): The number of layers in decoder.
-        dim_feedforward (int): The hidden layer size in the feedforward network(FFN).
+        d_model (int, optional): The expected feature size in the encoder/decoder input
+            and output. Default 512
+        nhead (int, optional): The number of heads in multi-head attention(MHA). Default 8
+        num_encoder_layers (int, optional): The number of layers in encoder. Default 6
+        num_decoder_layers (int, optional): The number of layers in decoder. Default 6
+        dim_feedforward (int, optional): The hidden layer size in the feedforward network(FFN). Default 2048
         dropout (float, optional): The dropout probability used in pre-process
             and post-precess of MHA and FFN sub-layer. Default 0.1
         activation (str, optional): The activation function in the feedforward
@@ -1015,7 +1015,7 @@ class Transformer(Layer):
             Default: None, which means the default weight parameter property is used.
             See usage for details
             in :code:`ParamAttr` .
-        bias_attr (ParamAttr|tuple, optional): To specify the bias parameter property.
+        bias_attr (ParamAttr|tuple|bool, optional): To specify the bias parameter property.
             If it is a tuple, the length of `bias_attr` could be 1, 2 or 3. If it is 3,
             `bias_attr[0]` would be used as `bias_attr` for self attention, `bias_attr[1]`
             would be used as `bias_attr` for cross attention of `TransformerDecoder`,
@@ -1028,9 +1028,9 @@ class Transformer(Layer):
         The `False` value means the corresponding layer would not have trainable
         bias parameter. See usage for details in :code:`ParamAttr` .
         Default: None,which means the default bias parameter property is used.
-        custom_encoder (Layer): If custom encoder is provided, use it as the encoder.
+        custom_encoder (Layer, optional): If custom encoder is provided, use it as the encoder.
+            Default None
-        custom_decoder (Layer): If custom decoder is provided, use it as the decoder.
+        custom_decoder (Layer, optional): If custom decoder is provided, use it as the decoder.
+            Default None

     Examples:
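The net effect of this commit is that the `Transformer` docstring now records the constructor defaults. The defaults can be restated as a plain-Python sketch — a hypothetical mirror of the documented values, not Paddle code, and it does not import Paddle:

```python
from dataclasses import dataclass

# Hypothetical mirror of the defaults documented for paddle.nn.Transformer
# in this commit; not an actual Paddle class.
@dataclass
class TransformerDefaults:
    d_model: int = 512            # feature size of encoder/decoder input and output
    nhead: int = 8                # number of heads in multi-head attention (MHA)
    num_encoder_layers: int = 6
    num_decoder_layers: int = 6
    dim_feedforward: int = 2048   # hidden size of the feed-forward network (FFN)
    dropout: float = 0.1          # used in pre-/post-process of MHA and FFN

cfg = TransformerDefaults()
print(cfg.d_model, cfg.nhead, cfg.dim_feedforward)
```

Overriding any field corresponds to passing that keyword to the real constructor; the commit only documents these values, it does not change them.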