PaddlePaddle / Paddle

Why does the GRU layer not include weighting of the input?

Created by: wojtuss

The dynamic GRU algorithm described in the documentation (http://paddlepaddle.org/docs/0.14.0/api/fluid/en/layers.html#dynamic-gru) multiplies the input x_t by the weight matrices W_x, but PaddlePaddle's implementation of the operator does not perform this multiplication.
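To make the discrepancy concrete, here is a minimal NumPy sketch, not PaddlePaddle's actual kernel. It assumes the usual GRU formulation with update gate u, reset gate r and candidate c, the gate order [u, r, c] in the projected input, and the update h_t = (1 - u) * h_{t-1} + u * c. All function and variable names are hypothetical. It shows one step that weights x_t internally, as documented, and one step that expects x_t to be already projected by a preceding FC/MUL, as the operator appears to do.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step_documented(x_t, h_prev, W_u, W_r, W_c, U_u, U_r, U_c, b_u, b_r, b_c):
    """One GRU step that multiplies x_t by the input weights itself."""
    u = sigmoid(x_t @ W_u + h_prev @ U_u + b_u)            # update gate
    r = sigmoid(x_t @ W_r + h_prev @ U_r + b_r)            # reset gate
    c = np.tanh(x_t @ W_c + (r * h_prev) @ U_c + b_c)      # candidate state
    return (1.0 - u) * h_prev + u * c

def gru_step_preprojected(xp_t, h_prev, U_u, U_r, U_c, b_u, b_r, b_c):
    """One GRU step whose input xp_t (size 3*D) already equals x_t @ W_x."""
    D = h_prev.shape[-1]
    # Assumed layout of the projected input: [update | reset | candidate].
    xu, xr, xc = xp_t[..., :D], xp_t[..., D:2 * D], xp_t[..., 2 * D:]
    u = sigmoid(xu + h_prev @ U_u + b_u)
    r = sigmoid(xr + h_prev @ U_r + b_r)
    c = np.tanh(xc + (r * h_prev) @ U_c + b_c)
    return (1.0 - u) * h_prev + u * c

rng = np.random.default_rng(0)
M, D = 4, 3                                   # input size, hidden size
x_t = rng.standard_normal(M)
h_prev = rng.standard_normal(D)
W_u, W_r, W_c = (rng.standard_normal((M, D)) for _ in range(3))
U_u, U_r, U_c = (rng.standard_normal((D, D)) for _ in range(3))
b_u, b_r, b_c = (rng.standard_normal(D) for _ in range(3))

# The preceding FC/MUL corresponds to one (M, 3*D) weight matrix.
W_x = np.concatenate([W_u, W_r, W_c], axis=1)

h_doc = gru_step_documented(x_t, h_prev, W_u, W_r, W_c, U_u, U_r, U_c, b_u, b_r, b_c)
h_pre = gru_step_preprojected(x_t @ W_x, h_prev, U_u, U_r, U_c, b_u, b_r, b_c)
steps_match = np.allclose(h_doc, h_pre)
```

Under these assumptions the two steps agree numerically, so the input weighting can live either in a separate FC/MUL or inside the GRU op.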

Here are my questions:

1. Why is that?
2. Could the usual FC layer (actually the MUL operator) preceding the dynamic GRU op be incorporated into GRU in nn.py?
3. Could the combination of two FC layers with two GRUs for the two opposite directions (like in the CRNN-CTC model) be joined in nn.py into a single bidirectional variant of the GRU op?

The changes proposed in questions 2 and 3 would allow significant optimization by using a single MKL-DNN GRU operator.
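As a rough illustration of the joined bidirectional variant, the NumPy sketch below compares two separate FC + GRU pairs with a single fused function that does both input projections in one matrix multiplication. This is only a sketch under stated assumptions: it is neither the nn.py code nor the MKL-DNN API, biases are omitted, the initial state is zero, the gate order is [u, r, c], and the two directions are assumed to be concatenated along the feature axis. All names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_recurrence(xp, U_u, U_r, U_c, reverse=False):
    """Run a GRU over pre-projected inputs xp of shape (T, 3*D); returns (T, D)."""
    T, D = xp.shape[0], U_u.shape[0]
    h = np.zeros(D)                       # assumed zero initial state
    out = np.zeros((T, D))
    order = range(T - 1, -1, -1) if reverse else range(T)
    for t in order:
        xu, xr, xc = xp[t, :D], xp[t, D:2 * D], xp[t, 2 * D:]
        u = sigmoid(xu + h @ U_u)         # update gate
        r = sigmoid(xr + h @ U_r)         # reset gate
        c = np.tanh(xc + (r * h) @ U_c)   # candidate state
        h = (1.0 - u) * h + u * c
        out[t] = h
    return out

def fc_then_gru(x, W_x, U, reverse):
    """Current pattern: a separate projection followed by the GRU recurrence."""
    return gru_recurrence(x @ W_x, *U, reverse=reverse)

def fused_bidirectional_gru(x, W_fwd, W_bwd, U_fwd, U_bwd):
    """Hypothetical fused variant: one projection feeding both directions."""
    D3 = W_fwd.shape[1]
    xp = x @ np.concatenate([W_fwd, W_bwd], axis=1)   # shape (T, 6*D)
    fwd = gru_recurrence(xp[:, :D3], *U_fwd)
    bwd = gru_recurrence(xp[:, D3:], *U_bwd, reverse=True)
    return np.concatenate([fwd, bwd], axis=1)

rng = np.random.default_rng(0)
T, M, D = 5, 4, 3                         # sequence length, input size, hidden size
x = rng.standard_normal((T, M))
W_fwd = rng.standard_normal((M, 3 * D))
W_bwd = rng.standard_normal((M, 3 * D))
U_fwd = tuple(rng.standard_normal((D, D)) for _ in range(3))
U_bwd = tuple(rng.standard_normal((D, D)) for _ in range(3))

separate = np.concatenate(
    [fc_then_gru(x, W_fwd, U_fwd, False), fc_then_gru(x, W_bwd, U_bwd, True)],
    axis=1,
)
fused = fused_bidirectional_gru(x, W_fwd, W_bwd, U_fwd, U_bwd)
outputs_match = np.allclose(separate, fused)
```

The fused function produces the same output as the two separate pairs, which suggests the rewrite would be a pure graph-level change that leaves room for a single optimized kernel.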