    [Cherry-Pick] Support pure fp16 training for AMP API. (#29544) (#30241) · d8dfef54
    Committed by Zhen Wang
    * Support pure fp16 training for AMP API. (#29544)
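
    As a rough usage sketch (not taken from the PR itself): pure fp16 training in the static-graph AMP API is switched on when decorating the optimizer and completed by `amp_init` after the startup program runs. Names such as `paddle.static.amp.decorate`, `use_pure_fp16`, and the `amp_init` arguments are assumed from the Paddle 2.0-era interface and may differ across releases.

    ```python
    import paddle

    paddle.enable_static()

    main_prog = paddle.static.Program()
    startup_prog = paddle.static.Program()
    with paddle.static.program_guard(main_prog, startup_prog):
        x = paddle.static.data(name='x', shape=[None, 784], dtype='float32')
        hidden = paddle.static.nn.fc(x, size=256, activation='relu')
        loss = paddle.mean(paddle.static.nn.fc(hidden, size=10))

        optimizer = paddle.optimizer.Momentum(learning_rate=0.01, momentum=0.9)
        # Decorate the optimizer so the net runs in fp16 with dynamic loss
        # scaling; use_pure_fp16 is the switch this PR adds.
        optimizer = paddle.static.amp.decorate(
            optimizer,
            init_loss_scaling=128.0,
            use_dynamic_loss_scaling=True,
            use_pure_fp16=True)
        optimizer.minimize(loss)

    place = paddle.CUDAPlace(0)
    exe = paddle.static.Executor(place)
    exe.run(startup_prog)
    # amp_init casts the fp32 parameters created at startup to fp16,
    # keeping fp32 copies where the optimizer needs them.
    optimizer.amp_init(place, scope=paddle.static.global_scope())
    ```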
    
    * add cast ops before and after unsupported fp16 ops.
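
    The cast-insertion idea in one hypothetical sketch: an op without an fp16 kernel gets its fp16 inputs cast up to fp32 before it runs and its fp32 outputs cast back to fp16 afterwards. The op objects, `has_fp16_kernel`, and `insert_cast` below are illustrative stand-ins, not Paddle's internal pass.

    ```python
    def rewrite_for_fp16(ops, has_fp16_kernel, insert_cast):
        """Wrap ops without fp16 kernels in pre/post cast ops (illustrative only)."""
        rewritten = []
        for op in ops:
            if has_fp16_kernel(op):
                rewritten.append(op)        # runs directly in fp16
                continue
            # Unsupported op: cast fp16 inputs up to fp32 ...
            for var in op.inputs:
                if var.dtype == 'float16':
                    rewritten.append(insert_cast(var, 'float16', 'float32'))
            rewritten.append(op)            # the op itself computes in fp32
            # ... and cast its fp32 outputs back down to fp16.
            for var in op.outputs:
                if var.dtype == 'float32':
                    rewritten.append(insert_cast(var, 'float32', 'float16'))
        return rewritten
    ```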
    
    * Keep partial net in FP32 pattern.
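
    One way to read this item: ops built outside an fp16 guard keep the FP32 pattern even in pure fp16 mode. A hedged sketch, assuming the `paddle.static.amp.fp16_guard` context manager and the `use_fp16_guard` flag on `decorate`; both names are taken from the Paddle 2.0-era interface and may differ by release.

    ```python
    import paddle

    paddle.enable_static()
    x = paddle.static.data(name='x', shape=[None, 784], dtype='float32')

    with paddle.static.amp.fp16_guard():
        # Ops built inside the guard are eligible for fp16 execution.
        hidden = paddle.static.nn.fc(x, size=256, activation='relu')

    # Ops built outside the guard (e.g. the loss head) stay in FP32.
    loss = paddle.mean(paddle.static.nn.fc(hidden, size=10))

    # Pairs with: paddle.static.amp.decorate(opt, use_pure_fp16=True,
    #                                        use_fp16_guard=True)
    ```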
    
    * Support check_finite_and_unscale and update_loss_scaling for FP16 calculation mode.
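
    These two ops carry the dynamic loss scaling logic. A conceptual numpy sketch of what they compute (constants and names are illustrative, not the actual kernels): unscale the gradients while checking for inf/nan, then grow the loss scale after enough clean steps or shrink it right after an overflow.

    ```python
    import numpy as np

    def check_finite_and_unscale(grads, loss_scale):
        """Divide grads by the loss scale and report whether any value is inf/nan."""
        found_inf = False
        unscaled = []
        for g in grads:
            g = g / loss_scale
            if not np.all(np.isfinite(g)):
                found_inf = True
            unscaled.append(g)
        return unscaled, found_inf

    def update_loss_scaling(loss_scale, found_inf, good_steps,
                            incr_every_n=1000, incr_ratio=2.0, decr_ratio=0.5):
        """Grow the scale after enough finite steps; shrink it after an overflow."""
        if found_inf:
            return max(loss_scale * decr_ratio, 1.0), 0
        good_steps += 1
        if good_steps >= incr_every_n:
            return loss_scale * incr_ratio, 0
        return loss_scale, good_steps
    ```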
    
    * Add fp16 support for adam op.
    
    * add the multi_precision attr for adam.
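
    The multi_precision attr refers to the master-weight pattern: parameters and gradients stay in fp16, while Adam keeps an fp32 master copy (and fp32 moments) and performs the update in fp32 before casting back. A small numpy illustration of that pattern, not the adam op's actual kernel:

    ```python
    import numpy as np

    def adam_step_multi_precision(param_fp16, grad_fp16, master_fp32, m, v, t,
                                  lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        """One Adam step with fp16 params/grads and an fp32 master copy."""
        g = grad_fp16.astype(np.float32)              # compute in fp32
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        master_fp32 = master_fp32 - lr * m_hat / (np.sqrt(v_hat) + eps)
        param_fp16 = master_fp32.astype(np.float16)   # cast back for fwd/bwd
        return param_fp16, master_fp32, m, v
    ```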
    
    * Fix the bug of test_multi_precision_fp16_train UT.
    
    * Code format for CI.
    
    * Fix the redefinition error of MPTypeTrait on Windows.
    
    * Fix bugs in the _create_accumulators func of Momentum.
    
    * Fix a bug when inserting the post cast op.
    
    * Add the update_loss_scaling op in allow_set of UnusedVarCheck.
    
    * Update for ci coverage.
    
    * Add some doc for OptimizerWithMixedPrecision.
    
    * Fix the code style.
    
    * Improve the doc of `amp_init`.
    
    * Change fp16 testing for the case where users define the inference program separately; see the sketch below.
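
    For that case, the intent (hedged; the `test_program` and `use_fp16_test` argument names are assumed from the Paddle 2.0-era `amp_init` signature) is to hand the separately built inference program to `amp_init` so its casting stays consistent with training:

    ```python
    # test_program is assumed to be an inference Program built separately
    # from the training program; place and optimizer as in the sketch above.
    optimizer.amp_init(
        place,
        scope=paddle.static.global_scope(),
        test_program=test_program,
        use_fp16_test=True)
    ```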
    
    * Remove tensor copy in the update_loss_scaling op. (#29426)
    
    * remove tensor copy in the update_loss_scaling op
    
    * Do not use thrust.
    
    * fix some cuda memory access error.
update_loss_scaling_op.cu 3.6 KB