Unverified · Commit b8ba348c · Author: Jacob Devlin · Committer: GitHub

Merge pull request #266 from msramalho/patch-1

Tokenization code simplification
@@ -84,10 +84,7 @@ def load_vocab(vocab_file):
 def convert_by_vocab(vocab, items):
   """Converts a sequence of [tokens|ids] using the vocab."""
-  output = []
-  for item in items:
-    output.append(vocab[item])
-  return output
+  return [vocab[item] for item in items]
 def convert_tokens_to_ids(vocab, tokens):
......
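The change replaces an explicit accumulate-and-append loop with a list comprehension. The sketch below is a minimal check that the two forms return the same result. The name `convert_by_vocab_loop` and the toy vocabularies are hypothetical, introduced only for illustration; they are not part of the commit or of any real vocab file.

```python
import collections


def convert_by_vocab_loop(vocab, items):
  """Pre-change form: build the result with an explicit loop."""
  output = []
  for item in items:
    output.append(vocab[item])
  return output


def convert_by_vocab(vocab, items):
  """Post-change form: the same lookup written as a list comprehension."""
  return [vocab[item] for item in items]


# Toy vocabulary (assumption: illustrative only, not from a real vocab file).
toy_vocab = collections.OrderedDict([("[UNK]", 0), ("hello", 1), ("world", 2)])
# Inverse mapping, so the same function can convert ids back to tokens.
toy_inv_vocab = {idx: tok for tok, idx in toy_vocab.items()}

tokens = ["hello", "world", "[UNK]"]
ids = convert_by_vocab(toy_vocab, tokens)

# Both forms agree, and the mapping round-trips through the inverse vocab.
assert ids == convert_by_vocab_loop(toy_vocab, tokens)
assert convert_by_vocab(toy_inv_vocab, ids) == tokens
```

Both forms raise `KeyError` when an item is missing from `vocab`, so the simplification does not change error behavior.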