• [AMP]Master grad in static graph (#53362) · 972581d8
    shaojie_wang committed
    * add master gradients on static graph
    
    * add unit test for bf16 master grad static graph
    
    * use float16 as v100 test dtype
    
    * only skip GPUs that do not support bf16
    
    * use linear layer to test master grad
    
    * 1. push master grad creation before all optimizer ops; 2. remove useless unit test; 3. use a function to create master grad states
adamw.py 24.6 KB