1. 15 Dec, 2020 2 commits
    • [Transformer] Simplify transformer reader and fix TranslationDataset (#5035) · 30ccfc67
      liu zhengxi committed
      * fix translation dataset and simplify transformer reader
    • Update seq2seq example (#5016) · 7fae3401
      LiuChiachi committed
      * update seq2seq, using paddlenlp
      
      * Using new paddlenlp API
      
      * update seq2seq README
      
      * wrap dev ds
      
      * delete useless comments
      
      * update predict.py
      
      * using paddlenlp.bleu
      
      * remove shard
      
      * update README, using bleu perl
      
      * delete cand
      
      * Remove tokens that make sentences longer than max_len
      
      * remove pdb
      
      * remove useless code.
      
      * update url and dataset name of vae dataset (ptb and yahoo)
      
      * update seq2seq and vae, data and README
  2. 12 Dec, 2020 1 commit
  3. 10 Dec, 2020 1 commit
    • Add TokenEmbedding (#4983) · e59f15a1
      Jack Zhou committed
      * Add TokenEmbedding
      
      * download corpus embedding data
      * load embedding data by specifying corpus name
      * extend the vocab of tokenizer from corpus embedding data
      
      * add unk token setting
      
      * modify tokenizer
      
      * add extend vocab
      
      * move jieba tokenizer and rename corpus_name->embedding_name
      
      * use bos url instead of localhost
      
      * add log when loading data
      
      * add token dot computation; add __repr__ of TokenEmbedding
      
      * add color logging
      
      * use paddlenlp.utils.log
      
      * adjust repr
      
      * update pretrained embedding table
      
      * fix padding idx
  4. 08 Dec, 2020 2 commits
  5. 07 Dec, 2020 2 commits