    Forward recompute3 (#19913) · 9901f696
    mapingshuo committed on
    * add recompute-based checkpoint methods for large-batch training
    test=develop
    
    * add append_backward_with_forward_recomputation
    test=develop
    
    * refine optimizer
    test=develop
    
    * update backward and optimizer
    test=develop
    
    * make Variable usable
    test=develop
    
    * add recompute code
    
    * refine optimizer
    test=develop
    
    * refine addup _append_backward_ops_with_checkpoints_
    1) for the recompute part, only cache the grad_op_desc without appending it to the block
    2) before appending grad_op_desc to the backward part, addup_repetitive_vars and remove unused branches
    test=develop
    
    * make method private
    
    * add recompute strategy into DistributedStrategy
    test=develop
    
    * checkpoint version3
    test=develop
    
    * remove some print information
    test=develop
    
    * remove unused sumop
    test=develop
    
    * try to fix recompute with graph building modules
    
    * add input names to the vars that should be held
    
    * add memory debug tool
    
    * backup backward
    
    * Fix bugs
    
    * add backward desc for op not in any segments
    
    * add exception info for sub_block
    
    test=develop
    
    * modify code style
    
    test=develop
    
    * modify code style
    
    test=develop
    
    * remove print functions
    
    test=develop
    
    * add API spec
    
    test=develop
    test=document_preview
    
    * make Recompute a child class of Optimizer
    
    test=develop
    test=document_preview
    
    * add API spec
    
    test=develop
    test=document_preview
    
    * modify API spec
    
    test=develop
    test=document_preview
    
    * add document for Recompute
    
    test=develop
    test=document_preview
    
    * change API doc of Recompute
    
    test=develop
    test=document_preview
    
    * code cleaning
    
    test=develop
    test=document_preview
    
    * modify API spec
    
    * fix bugs when segments hold no element
    
    * add testcase for Recompute Optimizer
    
    test=develop
    test=document_preview
    
    * add test for apply_gradient, and code cleaning
    
    test=develop
    test=document_preview
    
    * add test case for load function
    
    * enable CI
    
    test=develop
    test=document
    
    * add test case
    
    test=develop
    test=document_preview
    
    * add sample code for 4 functions of recompute optimizer
    
    test=develop
    test=document_preview
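
For readers unfamiliar with the idea behind this change, the following is a minimal pure-Python sketch of recompute-based checkpointing on a chain of scalar functions. It is not the code added by this commit and does not use any Paddle API: `LAYERS`, `forward_with_checkpoints`, `backward_with_recompute`, and `backward_full` are hypothetical names chosen for illustration. The forward pass keeps only the input of each segment, and the backward pass recomputes the remaining activations segment by segment before applying the chain rule.

```python
import math

# Each layer is a (forward, derivative) pair on scalars; purely illustrative.
LAYERS = [
    (lambda x: x * 2.0, lambda x: 2.0),
    (math.sin, math.cos),
    (lambda x: x * x, lambda x: 2.0 * x),
    (math.tanh, lambda x: 1.0 - math.tanh(x) ** 2),
]


def forward_with_checkpoints(x, segment_size):
    """Run the chain, keeping only the input of each segment."""
    checkpoints = []
    for i, (f, _) in enumerate(LAYERS):
        if i % segment_size == 0:
            checkpoints.append(x)  # the only activations held in memory
        x = f(x)
    return x, checkpoints


def backward_with_recompute(checkpoints, segment_size, grad_out=1.0):
    """Backpropagate segment by segment, recomputing activations on demand."""
    grad = grad_out
    for seg in reversed(range(len(checkpoints))):
        start = seg * segment_size
        layers = LAYERS[start:start + segment_size]
        # Recompute the input of every layer in this segment from its checkpoint.
        inputs = [checkpoints[seg]]
        for f, _ in layers[:-1]:
            inputs.append(f(inputs[-1]))
        # Apply the chain rule in reverse over the recomputed inputs.
        for (_, df), a in zip(reversed(layers), reversed(inputs)):
            grad *= df(a)
    return grad


def backward_full(x):
    """Reference: keep every activation, no recomputation."""
    inputs = []
    for f, _ in LAYERS:
        inputs.append(x)
        x = f(x)
    grad = 1.0
    for (_, df), a in zip(reversed(LAYERS), reversed(inputs)):
        grad *= df(a)
    return grad
```

With a segment size of 2, only two activations are stored for the four layers, and the recomputed gradient matches the one obtained by keeping every activation. The trade-off is one extra forward pass per segment in exchange for lower peak memory, which is what allows larger batches.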
test_optimizer.py 30.2 KB