Summarize operators used in ConvS2S
Created by: lcy-seso
Here I summarize the operators that will be used in ConvS2S:
- positional embedding
  - look_up_table, which needs to support padding_idx: https://github.com/PaddlePaddle/Paddle/issues/7309
  - addition
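The positional embedding above is just two table lookups plus an elementwise addition. A minimal numpy sketch (not the Paddle operator itself), assuming padding_idx support means zeroing out the rows of padding tokens:

```python
import numpy as np

def embed(table, ids, padding_idx=None):
    # Plain lookup: rows of `table` indexed by token ids.
    out = table[ids]
    if padding_idx is not None:
        # Rows for padding tokens are forced to zero, which is what
        # the requested padding_idx support amounts to.
        out[ids == padding_idx] = 0.0
    return out

vocab_size, max_len, emb_dim = 100, 50, 8
word_table = np.random.randn(vocab_size, emb_dim)
pos_table = np.random.randn(max_len, emb_dim)

token_ids = np.array([[4, 7, 0, 0]])  # 0 is the padding id in this sketch
pos_ids = np.array([[0, 1, 2, 3]])

# Positional embedding = word lookup + position lookup (addition).
x = embed(word_table, token_ids, padding_idx=0) + embed(pos_table, pos_ids)
```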
- convolution block structure: a one-dimensional convolution followed by a GLU.
  Whether the GLU should be fused into a single operator for better time efficiency can be decided later.
  - sequence convolution
    - 2D convolution: sequence_conv_op
  - GLU
    - offset operator ?? (to be determined later)
    - sigmoid
    - element-wise multiplication
  - addition
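The block above can be sketched in numpy to show how the pieces fit together: the 1-D convolution doubles the channel count, GLU splits the result and gates one half with the sigmoid of the other (the element-wise multiplication), and the final addition is the residual connection. This is an illustration of the structure, not the Paddle operators; the 'same' padding and shapes are simplifying assumptions.

```python
import numpy as np

def glu(y, axis=-1):
    # Split into halves A, B; GLU(y) = A * sigmoid(B).
    a, b = np.split(y, 2, axis=axis)
    return a * (1.0 / (1.0 + np.exp(-b)))

def conv_block(x, w, b):
    # x: (T, C), w: (k, C, 2C), b: (2C,) -- a 1-D sequence convolution
    # over time with 'same' padding, producing 2C channels for the GLU.
    T, C = x.shape
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    y = np.stack([
        sum(xp[t + i] @ w[i] for i in range(k)) for t in range(T)
    ]) + b
    # Residual addition, as in the ConvS2S block.
    return glu(y) + x

T, C, k = 5, 4, 3
x = np.random.randn(T, C)
w = np.random.randn(k, C, 2 * C)
b = np.zeros(2 * C)
out = conv_block(x, w, b)  # shape (T, C), same as the input
```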
- attention
  - matmul_op (the batched matrix multiplication)
  - softmax along a specified axis
  - reshape op
  - softmax
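The attention computation is two batched matrix multiplications with a softmax in between, which is why matmul_op and an axis-aware softmax are listed. A hedged numpy sketch of the dot-product attention (shapes and names are assumptions for illustration):

```python
import numpy as np

def softmax(x, axis=-1):
    # Softmax along a specified axis, numerically stabilized.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dot_attention(d, z):
    # d: decoder states (B, Tq, C); z: encoder states (B, Tk, C).
    # Batched matmul gives scores (B, Tq, Tk); softmax along the
    # last axis gives attention weights; a second batched matmul
    # mixes the encoder states into a context vector per query.
    scores = d @ z.transpose(0, 2, 1)
    alpha = softmax(scores, axis=-1)
    return alpha @ z, alpha

B, Tq, Tk, C = 2, 3, 4, 8
d = np.random.randn(B, Tq, C)
z = np.random.randn(B, Tk, C)
ctx, alpha = dot_attention(d, z)  # ctx: (B, Tq, C), alpha: (B, Tq, Tk)
```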
- weight normalization
  - sequence convolution
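Weight normalization reparameterizes a weight as w = g * v / ||v||, so the direction comes from v and the magnitude from the scalar g. A minimal sketch for a 2-D weight, assuming the norm is taken per output row (the exact axis convention is an implementation choice):

```python
import numpy as np

def weight_norm(v, g):
    # w = g * v / ||v||: direction from v, magnitude from g.
    # The norm is taken per output row here, a simplifying assumption.
    norm = np.linalg.norm(v, axis=1, keepdims=True)
    return g[:, None] * v / norm

out_dim, in_dim = 4, 6
v = np.random.randn(out_dim, in_dim)
g = np.ones(out_dim)
w = weight_norm(v, g)
```

With g set to ones, every row of w has unit norm; training then adjusts g and v independently.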
The missing operator:
- weight normalization, related to https://github.com/PaddlePaddle/Paddle/issues/6914
Needs to be enhanced (this enhancement is required by both Transformer and ConvS2S):
- look_up_table : related to https://github.com/PaddlePaddle/Paddle/issues/7309