Input size in GRU operator
Created by: tpatejko
I'm working on integrating the MKLDNN GRU primitive into PaddlePaddle, and I would like to clarify my understanding of the format and size of the Input tensor.
In the comment below: https://github.com/PaddlePaddle/Paddle/blob/83c85f34e84d5bae22d374374408de780d10ae21/paddle/fluid/operators/gru_op.cc#L75-L79
Is the "total time step" mentioned there the length of the longest sequence in the batch, or is it the sum of the lengths of all the sequences in the batch?
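
To make the two readings concrete, here is a minimal sketch of the first dimension each one would give. The batch (three sequences of lengths 2, 5 and 3), the hidden size of 4, and the assumption that the second dimension is 3 times the hidden size are illustrative choices of mine, not values taken from the PaddlePaddle source.

```python
# Hypothetical mini-batch: three sequences of lengths 2, 5 and 3.
seq_lengths = [2, 5, 3]
hidden_size = 4  # illustrative value only

# Reading A: T is the length of the longest sequence
# (shorter sequences would have to be padded).
t_longest = max(seq_lengths)

# Reading B: T is the sum of the lengths of all sequences
# (sequences stored back to back, with no padding).
t_total = sum(seq_lengths)

# Assumption: the second dimension is 3 * hidden_size.
shape_if_longest = (t_longest, 3 * hidden_size)
shape_if_total = (t_total, 3 * hidden_size)

print("Reading A (longest sequence):", shape_if_longest)
print("Reading B (sum of lengths):  ", shape_if_total)
```

The two readings only agree when the batch holds a single sequence, so the answer determines how the Input tensor has to be laid out before it is passed to the MKLDNN primitive.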