Unverified commit 485bc6a0, authored by Tao Luo, committed by GitHub

Merge pull request #16868 from chengduoZH/speedup_test_parallel_executor_transformer

Reduce the layer number of transformer model
...
@@ -65,7 +65,9 @@ class ModelHyperParams(object):
     # number of head used in multi-head attention.
     n_head = 8
     # number of sub-layers to be stacked in the encoder and decoder.
-    n_layer = 6
+    # NOTE(zcd): the origin number of layer is 6, to make this unit test faster,
+    # we should reduce the layer number to 4.
+    n_layer = 4
     # dropout rate used by all dropout layers.
     dropout = 0.1
...