    Checkpoint reshaping (#1953) · 80d0a32f
    Olatunji Ruwase committed
    * unit test, remove exception, add notes
    
    * Move param_shapes to model files
    
    * Remove hard-coded constants
    
    * Conditioned to zero optimizer
    
    * Add zero checkpoint merging
    
    * Print checkpoint version
    
    * Reshape zero_* ckpt files
    
    * Merge zero* files contraction
    
    * Utils for 3D contraction reshaping
    
    * Remove bogus import
    
    * Support bf16_zero ckpts
    
    * Add param slice mappings
    
    * Load universal checkpoints
    
    * Per group mappings from Stas
    
    * Hack to load bf16 zero files
    
    * Param attributes
    
    * WIP
    
    * Fix api bug
    
    * Update lp with local/remote hp
    
    * Disable vocab padding handling
    
    * Update z2 checkpoint
    
    * Remove debug prints
    
    * Remove debug prints; Rebase unit test
    
    * Add reshape assert
    
    * Padding
    
    * Typo
    
    * Catch nonexistent checkpoint path
    
    * Cleanup
    
    * Restore checkpoint state comparisons
    
    * Add torch version guards
    
    * More precise avoidance of false positives.
    Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
stage_1_and_2.py 105.0 KB