    Cherry pick/fix transformer (#16620) · 29f34416
    Jiabin Yang committed
    * Imperative deep-first backward process (#16605)
    
    * Fix bug of gradient interface
    
    * shrink transformer
    
    * Right transformer
    
    * Change from breadth-first backward to depth-first backward process
    
    test=develop
    
    * Reverse iterator op's input
    
    test=develop
    
    * Polish code
    
    * Change the iteration direction in ingrads' map slots
    
    test=develop
    
    * Polish code
    
    test=develop
    
    * test=develop, cherry-pick fix for transformer in dygraph
    
    * test=develop, fix transformer in dygraph
layer.cc 14.0 KB