Adding fused_embedding_fc_lstm op !13553

!13553 Merged Sep 24, 2018, created by saxon_zh@saxon_zh

Created by: jczaja

This PR introduces fused_embedding_fc_lstm_op along with the corresponding pass.

The general idea is to replace the lookup table of embeddings with a lookup table of W_x * embeddings, where W_x is the LSTM input weight matrix. That way the W_x * x term in the LSTM equations does not have to be computed at run time; it is replaced by a lookup. W_x * embeddings is computed only once, in the relevant pass, using SGEMM.
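The equivalence behind this idea can be illustrated with a small sketch. It is not the operator's implementation: the sizes, the helper names `matvec` and `build_fused_table`, and the orientation of W_x as a (4 * hidden_size) x embedding_size matrix stacking the four gate projections are all assumptions made for illustration, and plain Python lists stand in for the SGEMM call.

```python
import random


def matvec(matrix, vec):
    """Multiply a matrix (stored as a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def build_fused_table(embeddings, w_x):
    """Precompute W_x * e for every embedding row e (done once, up front)."""
    return [matvec(w_x, row) for row in embeddings]


random.seed(0)
# Hypothetical sizes, chosen only to keep the example small.
vocab_size, embedding_size, hidden_size = 6, 3, 2

embeddings = [[random.uniform(-1, 1) for _ in range(embedding_size)]
              for _ in range(vocab_size)]
# Assumed layout: one row per gate output, four gates stacked.
w_x = [[random.uniform(-1, 1) for _ in range(embedding_size)]
       for _ in range(4 * hidden_size)]

fused_table = build_fused_table(embeddings, w_x)

word_ids = [4, 1, 1, 5]
# Unfused path: embedding lookup followed by W_x * x at every time step.
unfused = [matvec(w_x, embeddings[i]) for i in word_ids]
# Fused path: a single lookup per time step, no multiplication.
fused = [fused_table[i] for i in word_ids]
```

Both paths yield the same per-step input projection; the fused path simply moves the multiplication out of the time loop and pays for it with wider table rows.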

Performance and functional testing was done using the Test_Text_classification C-API test with the Senta model and data.txt as input. Accuracy on the provided input is the same as with fusion_lstm.

The observed performance gain is an execution time about 10% lower than with fusion_lstm, i.e. fused_embedding_fc_lstm takes ~90% of the time of fusion_lstm.

Memory consumption is higher, by a factor of (4 * hidden_size / embedding_size). For the benchmark used, peak memory consumption was four times higher.
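The factor above can be worked through with a short calculation. This is a sketch under assumptions: the function names are hypothetical, the sizes are made up (the model's actual sizes are not stated here), and it only accounts for the lookup table itself, assuming 4-byte float values.

```python
def fused_table_growth(embedding_size, hidden_size):
    """Ratio of a fused-table row (4 * hidden_size values) to an embedding row."""
    return 4 * hidden_size / embedding_size


def table_bytes(vocab_size, row_width, bytes_per_value=4):
    """Size of a lookup table with vocab_size rows of row_width float32 values."""
    return vocab_size * row_width * bytes_per_value


# Hypothetical sizes: equal hidden and embedding sizes give a factor of 4.
growth_equal = fused_table_growth(embedding_size=128, hidden_size=128)
# A hidden size twice the embedding size gives a factor of 8.
growth_wide = fused_table_growth(embedding_size=128, hidden_size=256)

original_size = table_bytes(vocab_size=10000, row_width=128)
fused_size = table_bytes(vocab_size=10000, row_width=4 * 128)
```

A factor of four corresponds to hidden_size being equal to embedding_size; models with a hidden size larger than the embedding size would see a proportionally larger table.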

Notes & missing items:

  • Unit tests will be implemented in the next PR.
  • This PR does not invalidate fusion_lstm, which should still be used for models that do not have an embedding layer, e.g. CRNN CTC, DeepSpeech, etc.
  • The BatchCompute path is enabled but was not validated (Test_Text_Classification requires a batch size of 1).
  • The implementation of the operator is heavily based on fusion_lstm_op. It would be good to factor out the common part in the future, to avoid redundant code.
  • Currently the lookup table attributes is_sparse and is_distributed are not supported (the new op is not created when the lookup op has either of these attributes set).
  • A similar concept could be implemented for GRU ops.
Reference: paddlepaddle/Paddle!13553
Source branch: github/fork/jczaja/prv-fused_embedding_fc_lstm_op