Commit 3349094f, authored by chengduozh

reduce the layer number of transformer

test=develop
Parent 06809ebb
...
@@ -65,7 +65,9 @@ class ModelHyperParams(object):
     # number of head used in multi-head attention.
     n_head = 8
     # number of sub-layers to be stacked in the encoder and decoder.
-    n_layer = 6
+    # NOTE(zcd): the origin number of layer is 6, to make this unit test faster,
+    # we should reduce the layer number to 4.
+    n_layer = 4
     # dropout rate used by all dropout layers.
     dropout = 0.1
......
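The commit edits `n_layer` in place. As a rough illustration of the trade-off, the sketch below keeps the original value in `ModelHyperParams` and overrides it in a test-only subclass. `FastTestHyperParams`, `stacked_layer_count`, and `relative_depth` are hypothetical names introduced here, not part of the commit, and the sketch assumes the encoder and decoder each stack `n_layer` layers, as the comment in the hunk suggests.

```python
class ModelHyperParams(object):
    """Values taken from the hunk above, with the pre-change n_layer."""
    # number of heads used in multi-head attention.
    n_head = 8
    # number of sub-layers stacked in the encoder and in the decoder.
    n_layer = 6
    # dropout rate used by all dropout layers.
    dropout = 0.1


class FastTestHyperParams(ModelHyperParams):
    """Hypothetical test-only override; the commit edits n_layer in place instead."""
    n_layer = 4


def stacked_layer_count(hp):
    # Assumption: encoder and decoder each stack hp.n_layer layers.
    return 2 * hp.n_layer


def relative_depth(hp, baseline=ModelHyperParams):
    # Fraction of the baseline's stacked layers that remain after the override.
    return stacked_layer_count(hp) / stacked_layer_count(baseline)


if __name__ == "__main__":
    print(stacked_layer_count(ModelHyperParams))     # layers with n_layer = 6
    print(stacked_layer_count(FastTestHyperParams))  # layers with n_layer = 4
    print(relative_depth(FastTestHyperParams))       # remaining fraction of depth
```

Going from 6 to 4 removes a third of the stacked layers. The actual speedup of the unit test depends on how much of its runtime the layer stack accounts for, which this sketch does not measure.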