    Support async checkpoint in Orbit trainer/controller. · 2b4fe39d
    A. Unique TensorFlower committed
    This CL adds a field to the Orbit trainer/controller indicating whether async checkpointing is enabled for checkpoint saving. By default this value is False, which preserves the existing behavior.
    
    In addition, a sync barrier is added at the end of training (in the controller) so that user code cannot prematurely access the checkpoint file/state while an async checkpoint save is still in progress.
    
    PiperOrigin-RevId: 529300903
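The behavior described in the commit message can be illustrated with a minimal, standard-library-only sketch. It is not Orbit's implementation: `AsyncCheckpointer`, `enable_async_checkpointing`, `sync`, and `train` are hypothetical names chosen for illustration, and a background thread stands in for the real asynchronous save. The flag defaults to False (synchronous saves), and the training loop ends with a barrier that waits for any in-flight save.

```python
import threading
import time


class AsyncCheckpointer:
    """Toy stand-in for a checkpoint manager (hypothetical, not Orbit's API)."""

    def __init__(self, enable_async_checkpointing=False):
        # Default False mirrors the commit message: sync saves unless opted in.
        self.enable_async_checkpointing = enable_async_checkpointing
        self.saved_steps = []
        self._pending = None  # thread of the in-flight save, if any

    def _write(self, step):
        time.sleep(0.05)  # simulate slow checkpoint I/O
        self.saved_steps.append(step)

    def save(self, step):
        # Wait for the previous save so at most one write is in flight.
        self.sync()
        if self.enable_async_checkpointing:
            self._pending = threading.Thread(target=self._write, args=(step,))
            self._pending.start()  # returns immediately; training continues
        else:
            self._write(step)  # blocks until the checkpoint is written

    def sync(self):
        # Barrier: block until the in-flight save (if any) has finished.
        if self._pending is not None:
            self._pending.join()
            self._pending = None


def train(checkpointer, total_steps, checkpoint_interval):
    for step in range(1, total_steps + 1):
        # ... one training step would run here ...
        if step % checkpoint_interval == 0:
            checkpointer.save(step)
    # Sync barrier at the end of training, so the caller never sees a
    # checkpoint that is still being written.
    checkpointer.sync()
    return checkpointer.saved_steps
```

Without the final `sync()` call, `train` could return while the last background write is still running, which is the premature access the commit guards against.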
controller.py 24.1 KB