    Improve z3 trace management (#1916) · 673cb608
    Olatunji Ruwase committed
    * Fix OOM and type mismatch
    
    * Toggle prefetching
    
    * Disable z3 prefetching for inference (temp workaround)
    
    * Fix zero3 tracing issues
    
    * Remove debug prints
    
    * Enable prefetch for inference
    
    * Code clarity
    
    * Invalidate trace cache
    
    * Trace cache invalidation when needed
    Separate nvme prefetch from all-gather prefetch
    
    * Track last used step id
    
    * Use debug name in error message
    
    * Construct param trace from module trace
    Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
partitioned_param_coordinator.py 20.8 KB