1. 01 Feb 2019, 1 commit
    • Speed up Transformer inference (#1476) · 00f3b76e
      Committed by Guo Sheng
      * Add py-reader and parallel-executor support in Transformer inference
      
      * Add static k, v cache for encoder output in Transformer inference (see the sketch after this list)
      
      * Replace the cache from compute_qkv with the cache from split_heads in Transformer inference
      
      * Fuse k, q, v projection in Transformer (illustrated in the same sketch below)
      
      * Revert the fused k, q, v projection in Transformer to remain compatible with previously saved models
      
      * Use gather_op to replace sequence_expand_op in Transformer inference (see the beam-search sketch below)
      
      * Add fluid_transformer.md
      
      * Refine README for released models and data in Transformer
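
      The two cache bullets and the fused projection above carry the main mechanics of the speed-up: the encoder-derived k, v never change while decoding a sentence, so they can be projected and head-split once and reused at every step, and the three self-attention input projections can share a single matmul. Below is a minimal NumPy sketch of these ideas, not the actual Fluid implementation; all names, shapes, and sizes are illustrative assumptions, and the fused w_qkv shown is the variant the commit later reverted so existing checkpoints stay loadable.

      ```python
      import numpy as np

      n_head, d_key = 8, 64                  # illustrative sizes
      d_model = n_head * d_key

      def split_heads(x):
          # [batch, seq, d_model] -> [batch, n_head, seq, d_key]
          b, s, _ = x.shape
          return x.reshape(b, s, n_head, d_key).transpose(0, 2, 1, 3)

      def attend(q, k, v):
          # plain scaled dot-product attention over head-split tensors
          scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(d_key)
          w = np.exp(scores - scores.max(-1, keepdims=True))
          return (w / w.sum(-1, keepdims=True)) @ v

      rng = np.random.default_rng(0)
      w_qkv = rng.standard_normal((d_model, 3 * d_model)) * 0.02  # fused q,k,v (later reverted)
      w_q_x = rng.standard_normal((d_model, d_model)) * 0.02      # cross-attention query
      w_k_enc = rng.standard_normal((d_model, d_model)) * 0.02
      w_v_enc = rng.standard_normal((d_model, d_model)) * 0.02

      enc_out = rng.standard_normal((1, 10, d_model))  # encoder output, src len 10
      # Static cache: projected and head-split ONCE per sentence. Storing the
      # split_heads result rather than the compute_qkv result drops the
      # per-step reshape/transpose from the decode loop.
      static_k = split_heads(enc_out @ w_k_enc)
      static_v = split_heads(enc_out @ w_v_enc)

      # Dynamic cache for decoder self-attention, grown one step at a time.
      self_k = np.zeros((1, n_head, 0, d_key))
      self_v = np.zeros((1, n_head, 0, d_key))

      tok = rng.standard_normal((1, 1, d_model))
      for step in range(4):                            # stands in for decode steps
          q, k, v = np.split(tok @ w_qkv, 3, axis=-1)  # one fused matmul for q, k, v
          self_k = np.concatenate([self_k, split_heads(k)], axis=2)
          self_v = np.concatenate([self_v, split_heads(v)], axis=2)
          h = attend(split_heads(q), self_k, self_v)   # self-attention
          h = h.transpose(0, 2, 1, 3).reshape(1, 1, d_model)
          x = attend(split_heads(h @ w_q_x), static_k, static_v)  # static cache reused
          tok = x.transpose(0, 2, 1, 3).reshape(1, 1, d_model)
      print(tok.shape)  # (1, 1, 512)
      ```

      Caching the split_heads output means the [batch, n_head, seq, d_key] layout is materialized once per sentence instead of being re-derived from a [batch, seq, d_model] cache at every decode step.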
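
      During beam search, each step must also re-align the decoder's per-beam self-attention caches so every surviving hypothesis continues from its parent's history. Here is a hedged NumPy sketch of that re-alignment, the job gather_op takes over from sequence_expand_op in this commit; beam_size, parent_idx, and the cache shape are made-up values:

      ```python
      import numpy as np

      beam_size, n_head, cur_len, d_key = 4, 8, 5, 64  # made-up shapes
      rng = np.random.default_rng(1)
      cache_k = rng.standard_normal((beam_size, n_head, cur_len, d_key))

      # After pruning, each surviving hypothesis records which old beam it
      # extends; e.g. new beams 0 and 1 both grew out of old beam 2.
      parent_idx = np.array([2, 2, 0, 3])

      # One indexed read along the beam axis re-aligns the whole cache;
      # this is the reordering gather_op performs in the Fluid graph.
      cache_k = np.take(cache_k, parent_idx, axis=0)
      print(cache_k.shape)  # (4, 8, 5, 64)
      ```

      The presumable win is that a gather is a single indexed read along the beam axis, while sequence_expand duplicates rows through LoD machinery that is more general than this "new beam i extends old beam parent_idx[i]" mapping needs.
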
  2. 31 Jan 2019, 39 commits