Cherry pick/fix transformer (#16620)
* Imperative deep-first backward process (#16605)
* Fix bug of gradient interface
* shrink transformer
* Right transformer
* Change from width-first backward to deep-first backward process test=develop
* Reverse iterator op's input test=develop
* Polish code
* Change the iteration direction in ingrads' map slots test=develop
* Polish code test=develop
* test=develop, cherry-pick fix for transformer in dygraph
* test=develop, fix transformer in dygraph