Unverified commit dce498ce, authored by liu zhengxi, committed by GitHub

alter the default value for vocab (#5062)

Parent: 448ef165
@@ -36,14 +36,14 @@ class Vocab(object):
  between tokens and indices to be used. If provided, adjust the tokens
  and indices mapping according to it. If None, counter must be provided.
  Default: None.
- unk_token (str): special token for unknown token. If no need, it also
- could be None. Default: '<unk>'.
- pad_token (str): special token for padding token. If no need, it also
- could be None. Default: '<pad>'.
- bos_token (str): special token for bos token. If no need, it also
- could be None. Default: '<bos>'.
- eos_token (str): special token for eos token. If no need, it also
- could be None. Default: '<eos>'.
+ unk_token (str): special token for unknown token '<unk>'. If no need, it also
+ could be None. Default: None.
+ pad_token (str): special token for padding token '<pad>'. If no need, it also
+ could be None. Default: None.
+ bos_token (str): special token for bos token '<bos>'. If no need, it also
+ could be None. Default: None.
+ eos_token (str): special token for eos token '<eos>'. If no need, it also
+ could be None. Default: None.
  **kwargs (dict): Keyword arguments ending with `_token`. It can be used
  to specify further special tokens that will be exposed as attribute
  of the vocabulary and associated with an index.
@@ -54,10 +54,10 @@ class Vocab(object):
  max_size=None,
  min_freq=1,
  token_to_idx=None,
- unk_token='<unk>',
- pad_token='<pad>',
- bos_token='<bos>',
- eos_token='<eos>',
+ unk_token=None,
+ pad_token=None,
+ bos_token=None,
+ eos_token=None,
  **kwargs):
  # Handle special tokens
  combs = (('unk_token', unk_token), ('pad_token', pad_token),
@@ -317,10 +317,10 @@ class Vocab(object):
  max_size=None,
  min_freq=1,
  token_to_idx=None,
- unk_token='<unk>',
- pad_token='<pad>',
- bos_token='<bos>',
- eos_token='<eos>',
+ unk_token=None,
+ pad_token=None,
+ bos_token=None,
+ eos_token=None,
  **kwargs):
  """
  Building vocab according to given iterator and other information. Iterate
@@ -333,14 +333,14 @@ class Vocab(object):
  between tokens and indices to be used. If provided, adjust the tokens
  and indices mapping according to it. If None, counter must be provided.
  Default: None.
- unk_token (str): special token for unknown token. If no need, it also
- could be None. Default: '<unk>'.
- pad_token (str): special token for padding token. If no need, it also
- could be None. Default: '<pad>'.
- bos_token (str): special token for bos token. If no need, it also
- could be None. Default: '<bos>'.
- eos_token (str): special token for eos token. If no need, it also
- could be None. Default: '<eos>'.
+ unk_token (str): special token for unknown token '<unk>'. If no need, it also
+ could be None. Default: None.
+ pad_token (str): special token for padding token '<pad>'. If no need, it also
+ could be None. Default: None.
+ bos_token (str): special token for bos token '<bos>'. If no need, it also
+ could be None. Default: None.
+ eos_token (str): special token for eos token '<eos>'. If no need, it also
+ could be None. Default: None.
  **kwargs (dict): Keyword arguments ending with `_token`. It can be used
  to specify further special tokens that will be exposed as attribute
  of the vocabulary and associated with an index.
......
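
Note on the effect of this change: with every special token now defaulting to None, a vocabulary built without explicit arguments contains only the counted tokens, and code that relied on the old '<unk>'/'<pad>'/'<bos>'/'<eos>' defaults has to pass them explicitly. The sketch below illustrates that behaviour with a hypothetical `TinyVocab` class written for this note. It is not the `Vocab` class from the diff, and its ordering rules and error handling are assumptions.

```python
from collections import Counter


class TinyVocab:
    """Hypothetical stand-in for illustration; NOT the Vocab class in the diff."""

    def __init__(self, counter, unk_token=None, pad_token=None,
                 bos_token=None, eos_token=None, **kwargs):
        # Named special tokens first, then any extra `*_token` keyword arguments.
        specials = {
            'unk_token': unk_token,
            'pad_token': pad_token,
            'bos_token': bos_token,
            'eos_token': eos_token,
        }
        for name, value in kwargs.items():
            # Assumption: reject keywords that do not end with `_token`.
            if not name.endswith('_token'):
                raise ValueError(f"unexpected keyword argument: {name!r}")
            specials[name] = value

        self.idx_to_token = []
        self.token_to_idx = {}
        for name, token in specials.items():
            # Every special token is exposed as an attribute, even when None,
            # but only non-None tokens receive an index.
            setattr(self, name, token)
            if token is not None:
                self._add(token)

        # Assumption: regular tokens follow, most frequent first, ties by token.
        for token, _ in sorted(counter.items(), key=lambda kv: (-kv[1], kv[0])):
            self._add(token)

    def _add(self, token):
        if token not in self.token_to_idx:
            self.token_to_idx[token] = len(self.idx_to_token)
            self.idx_to_token.append(token)

    def __len__(self):
        return len(self.idx_to_token)


counter = Counter("the cat sat on the mat".split())

# New defaults: no special tokens are added unless requested.
plain = TinyVocab(counter)

# Callers that relied on the old defaults now pass them explicitly.
explicit = TinyVocab(counter, unk_token='<unk>', pad_token='<pad>',
                     bos_token='<bos>', eos_token='<eos>')

# Extra special tokens can still be supplied through `*_token` keywords.
masked = TinyVocab(counter, mask_token='<mask>')
```

In this sketch `plain` holds only the five distinct words, while `explicit` reserves the first four indices for the special tokens, which is the layout the old defaults produced implicitly.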