diff --git a/doc/fluid/design/mkldnn/gru/gru.md b/doc/fluid/design/mkldnn/gru/gru.md
index e6305bdd4c7f55e8404801b751f41be030f85112..19a503734e4cf2a4d8ace7f24c9db3cb71d983bb 100644
--- a/doc/fluid/design/mkldnn/gru/gru.md
+++ b/doc/fluid/design/mkldnn/gru/gru.md
@@ -112,4 +112,4 @@ After execution of oneDNN GRU primitive, output tensor has to be converted back
 Generally, `mkldnn_placement_pass` is called at the beginning of all passes to set `use_mkldnn` attribute to true in all supported operator. However, standard PP `gru` operator does not have `use_mkldnn` attribute so it is not set and later, when fused to `fusion_gru` it also does not have it set. Current solution for that problem is to call `mkldnn_placement_pass` once again somewhere after pass fusing gru. Note: calling placement pass at the end can break other oparators that are conditionally used with oneDNN (`FC` for example) so be aware of that.
 
 ## Bidirectional GRU
-oneDNN kernel supports `bidirectional` calculations with `sum` or `concat`. It means that primitive calculates both directions `right2left` and `left2right` and then sums/concatenates both outputs. It was implemented in PP as a PoC but had some harsh limitations. Both directions were calculated on the same input, therefore padding input with 0s yield to wrong results. The only scenario when this worked fine were if all sentences in a batch were of equal length or simply `BatchSize==1`. It happened to be so rare scenario that development of bidirectional gru has been postponed.
\ No newline at end of file
+The oneDNN kernel supports `bidirectional` calculations with `sum` or `concat`. This means that the primitive calculates both directions, `right2left` and `left2right`, and then sums or concatenates the two outputs. It was implemented in PP as a PoC but had some severe limitations. Both directions were calculated on the same input, so padding the input with 0s yields wrong results.
The only scenario in which this worked correctly was when all sentences in a batch were of equal length, or simply when `BatchSize==1`. This turned out to be such a rare scenario that development of bidirectional GRU has been postponed.
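
The padding problem described above can be illustrated with a small, self-contained sketch. It is not the oneDNN or PP implementation: it assumes a scalar GRU cell with one common gate convention, and the weights in `params` and the helper names `gru_step` and `run` are hypothetical, chosen only for illustration. It shows that zeros appended to a sequence leave the `left2right` outputs of the real tokens untouched, while the `right2left` pass consumes the padding first and therefore starts the real tokens from a non-zero state.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x, h, p):
    """One step of a scalar GRU cell (toy weights in p, assumed convention)."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])          # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])          # reset gate
    c = math.tanh(p["wc"] * x + p["uc"] * (r * h) + p["bc"])  # candidate state
    return (1.0 - z) * h + z * c


def run(seq, p, reverse=False):
    """Return the hidden state at every position, aligned with the input."""
    order = range(len(seq) - 1, -1, -1) if reverse else range(len(seq))
    out = [0.0] * len(seq)
    h = 0.0  # initial hidden state
    for t in order:
        h = gru_step(seq[t], h, p)
        out[t] = h
    return out


# Hypothetical toy parameters; the non-zero biases make zero inputs change the state.
params = {"wz": 0.7, "uz": 0.3, "bz": 0.1,
          "wr": -0.4, "ur": 0.5, "br": 0.2,
          "wc": 0.9, "uc": -0.6, "bc": 0.1}

plain = [0.5, -1.0]          # a sentence of length 2
padded = plain + [0.0, 0.0]  # the same sentence padded to length 4

l2r_plain, l2r_padded = run(plain, params), run(padded, params)
r2l_plain = run(plain, params, reverse=True)
r2l_padded = run(padded, params, reverse=True)

print(l2r_padded[:2] == l2r_plain)  # True: trailing padding comes after the real tokens
print(r2l_padded[:2] == r2l_plain)  # False: the reverse pass sees the padding first
```

With equal-length sentences or `BatchSize==1` no padding is needed, so both directions see exactly the real tokens, which matches the only working scenario mentioned above.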