Committed by Jack Zhou
* Add TokenEmbedding
* download corpus embedding data
* load embedding data by specifying corpus name
* extend the vocab of tokenizer from corpus embedding data
* add unk token setting
* modify tokenizer
* add extend vocab
* move jieba tokenizer and rename corpus_name->embedding_name
* use bos url instead of localhost
* add log when loading data
* add token dot computation; add __repr__ of TokenEmbedding
* add color logging
* use paddlenlp.utils.log
* adjust repr
* update pretrained embedding table
* fix padding idx
e59f15a1
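The changes above describe the TokenEmbedding workflow: a pretrained embedding table is downloaded and loaded by its embedding name, the tokenizer's vocab can be built from that table, and token-level dot-product similarity plus a `__repr__` summary are exposed. A minimal usage sketch follows, assuming the PaddleNLP `TokenEmbedding` / `JiebaTokenizer` API; the embedding name and similarity calls shown here are illustrative and may differ from this exact commit.

```python
# Sketch of the TokenEmbedding usage this commit enables (assumed API, for illustration only).
from paddlenlp.embeddings import TokenEmbedding
from paddlenlp.data import JiebaTokenizer

# Load a pretrained embedding table by embedding_name (renamed from corpus_name in this commit);
# the data is downloaded from the BOS URL on first use and a log message is printed while loading.
embedding = TokenEmbedding(embedding_name="w2v.baidu_encyclopedia.target.word-word.dim300")

# __repr__ added in this commit summarizes the embedding (vocab size, dimension, unk/padding setup).
print(embedding)

# Token dot-product similarity added in this commit.
score = embedding.dot("中国", "北京")
print(score)

# Jieba tokenizer whose vocab is taken from the embedding table.
tokenizer = JiebaTokenizer(vocab=embedding.vocab)
tokens = tokenizer.cut("自然语言处理")
print(tokens)
```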