    Cherry pick/fix transformer (#16620) · 29f34416
    Jiabin Yang committed
    * Imperative depth-first backward process (#16605)
    
    * Fix bug of gradient interface
    
    * shrink transformer
    
    * Right transformer
    
    * Change from breadth-first backward to depth-first backward process
    
    test=develop
    
    * Reverse-iterate the op's inputs
    
    test=develop
    
    * Polish code
    
    * Change the iteration direction in ingrads' map slots
    
    test=develop
    
    * Polish code
    
    test=develop
    
    * test=develop, cherry-pick fix for transformer in dygraph
    
    * test=develop, fix transformer in dygraph
test_imperative_transformer.py 43.0 KB