1. 14 Dec, 2020 1 commit
    • Add more embedding and sample for the TokenEmbedding · ec17d938
      Jack Zhou committed
      * add all wiki embedding and part of baidu encyclopedia embedding.
      
      * add embedding example
      
      * add people_daily, weibo, sougou pretrained embedding
      
      * add zhihu, financial, literature embedding
      
      * Add embedding model readme; add embedding train example and readme
      
      * fix README example
      
      * fix embedding doc
  2. 12 Dec, 2020 1 commit
  3. 10 Dec, 2020 1 commit
    • Add TokenEmbedding (#4983) · e59f15a1
      Jack Zhou committed
      * Add TokenEmbedding
      
      * download corpus embedding data
      * load embedding data by specifying corpus name
      * extend the vocab of tokenizer from corpus embedding data
      
      * add unk token setting
      
      * modify tokenizer
      
      * add extend vocab
      
      * move jieba tokenizer and rename corpus_name->embedding_name
      
      * use bos url instead of localhost
      
      * add log when loading data
      
      * add token dot computation; add __repr__ of TokenEmbedding
      
      * add color logging
      
      * use paddlenlp.utils.log
      
      * adjust repr
      
      * update pretrained embedding table
      
      * fix padding idx
  4. 07 Dec, 2020 1 commit