    Fix memory leak in "git rev-list --objects" · 91b452cb
    Linus Torvalds committed
    Martin Langhoff points out that "git repack -a" ends up using up a lot of
    memory for big archives, and that git cvsimport probably should do only
    incremental repacks in order to avoid having repacking flush all the
    caches.
    
    The vast majority of the memory usage of repacking is from git rev-list
    tracking all objects, and this patch should go a long way in avoiding the
    excessive memory usage: the bulk of it was due to the object names being
    leaked from the tree parser.
    
    For the historic Linux kernel archive, this simple patch gives:
    
    Before:
    	/usr/bin/time git-rev-list --all --objects > /dev/null
    
    	72.45user 0.82system 1:13.55elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
    	0inputs+0outputs (0major+125376minor)pagefaults 0swaps
    
    After:
    	/usr/bin/time git-rev-list --all --objects > /dev/null
    
    	75.22user 0.48system 1:16.34elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
    	0inputs+0outputs (0major+43921minor)pagefaults 0swaps
    
    where we do end up wasting a bit of time on some extra strdup()s (which
    could be avoided, but that would require tracking where the pathnames came
    from), but we avoid a lot of memory usage.
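
    A minimal sketch of the ownership pattern this trade-off implies, assuming
    the consumer takes a private strdup() copy of every path name so that it
    never has to know where the name came from and the producer's copy can be
    released. The struct and function names are hypothetical; this is not the
    code in builtin-rev-list.c.

```c
#define _POSIX_C_SOURCE 200809L	/* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list node; not a structure from the git sources. */
struct name_entry_sketch {
	char *name;			/* private copy, owned by this node */
	struct name_entry_sketch *next;
};

/*
 * Push a private copy of `name` onto the list.
 * Returns the new head, or NULL on allocation failure (list unchanged).
 */
static struct name_entry_sketch *push_name(struct name_entry_sketch *head,
					   const char *name)
{
	struct name_entry_sketch *e;

	assert(name != NULL);
	e = malloc(sizeof(*e));
	if (!e)
		return NULL;
	e->name = strdup(name);		/* the extra copy the message mentions */
	if (!e->name) {
		free(e);
		return NULL;
	}
	e->next = head;
	return e;
}

/* Release every node and its name; returns how many nodes were freed. */
static size_t free_names(struct name_entry_sketch *head)
{
	size_t n = 0;

	while (head) {
		struct name_entry_sketch *next = head->next;

		free(head->name);
		free(head);
		head = next;
		n++;
	}
	return n;
}
```

    Because each node owns its copy, the caller is free to reuse or release
    its own buffer immediately after push_name() returns, at the cost of one
    allocation per name.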
    
    Minor page faults track maximum RSS very closely (each page fault maps
    one page into memory), so the reduction from 125376 page faults to 43921
    means a rough reduction of VM footprint from almost half a gigabyte to
    about a third of that. Those numbers were also double-checked by looking
    at "top" while the process was running.
    
    (Side note: at least part of the remaining VM footprint is the mapping of
    the 177MB pack-file, so the remaining memory use is at least partly "well
    behaved" from a project caching perspective).
    
    For the current git archive itself, the memory usage for a "--all
    --objects" rev-list invocation dropped from 7128 pages to 2318 (27MB to
    9MB), so the reduction seems to hold for much smaller projects too.
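
    The page-count conversions quoted above can be checked with a small helper.
    This is a sketch that assumes a 4096-byte page size, which the message does
    not state; pages_to_mib() is a hypothetical name.

```c
#include <assert.h>

/*
 * Convert a page count to whole MiB, truncating any fraction.
 * `page_size` is in bytes; 4096 is an assumption, not taken from the message.
 */
static unsigned long pages_to_mib(unsigned long pages, unsigned long page_size)
{
	unsigned long long bytes;

	assert(page_size > 0);
	bytes = (unsigned long long)pages * page_size;
	return (unsigned long)(bytes / (1024ULL * 1024ULL));
}
```

    With 4096-byte pages, 125376 and 43921 pages come to 489 and 171 MiB for
    the kernel archive, and 7128 and 2318 pages come to 27 and 9 MiB for git
    itself.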
    
    For regular "git-rev-list" usage (i.e. without the "--objects" flag) this
    patch has no impact.
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
builtin-rev-list.c 8.2 KB