Committed by Jack Zhou
* Add TokenEmbedding
* download corpus embedding data
* load embedding data by specifying corpus name
* extend the vocab of tokenizer from corpus embedding data
* add unk token setting
* modify tokenizer
* add extend vocab
* move jieba tokenizer and rename corpus_name->embedding_name
* use bos url instead of localhost
* add log when loading data
* add token dot computation; add __repr__ of TokenEmbedding
* add color logging
* use paddlenlp.utils.log
* adjust repr
* update pretrained embedding table
* fix padding idx
e59f15a1
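The changes above describe the TokenEmbedding workflow: a pretrained embedding table is downloaded and loaded by its embedding name, the tokenizer's vocab can be built from that table, and token-level dot-product similarity plus a `__repr__` summary are exposed. A minimal usage sketch follows, assuming the PaddleNLP `TokenEmbedding` / `JiebaTokenizer` API; the embedding name and similarity calls shown here are illustrative and may differ from this exact commit.

```python
# Sketch of the TokenEmbedding usage this commit enables (assumed API, for illustration only).
from paddlenlp.embeddings import TokenEmbedding
from paddlenlp.data import JiebaTokenizer

# Load a pretrained embedding table by embedding_name (renamed from corpus_name in this commit);
# the data is downloaded from the BOS URL on first use and a log message is printed while loading.
embedding = TokenEmbedding(embedding_name="w2v.baidu_encyclopedia.target.word-word.dim300")

# __repr__ added in this commit summarizes the embedding (vocab size, dimension, unk/padding setup).
print(embedding)

# Token dot-product similarity added in this commit.
score = embedding.dot("中国", "北京")
print(score)

# Jieba tokenizer whose vocab is taken from the embedding table.
tokenizer = JiebaTokenizer(vocab=embedding.vocab)
tokens = tokenizer.cut("自然语言处理")
print(tokens)
```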