    improve diff-delta with sparse and/or repetitive data · 06a9f920
    Committed by Nicolas Pitre
    It is useless to preserve multiple hash entries for consecutive blocks
    with the same hash.  Keeping only the first one will allow for matching
    the longest string of identical bytes while subsequent blocks will only
    allow for shorter matches.  The backward matching code will match the
    end of it as necessary.
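
The idea above can be sketched roughly as follows. This is a minimal illustration, not the code from diff-delta.c: `BLOCK_SIZE`, `struct index_entry`, `block_hash` (an FNV-1a stand-in) and `build_index` are all hypothetical names, and the front-to-back scan over non-overlapping blocks is an assumption made for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 16  /* hypothetical block width, not taken from diff-delta.c */

struct index_entry {
    const unsigned char *ptr;  /* start of the indexed block */
    uint32_t hash;             /* hash of the BLOCK_SIZE bytes at ptr */
};

/* FNV-1a over one block; a stand-in for whatever hash the real code uses. */
static uint32_t block_hash(const unsigned char *p)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < BLOCK_SIZE; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Index the non-overlapping blocks of buf from front to back.
 * A block whose hash equals the hash of the block immediately before it
 * is skipped, so only the first block of a run of identical hashes is kept.
 * Returns the number of entries written (at most max_entries).
 */
static size_t build_index(const unsigned char *buf, size_t len,
                          struct index_entry *out, size_t max_entries)
{
    size_t n = 0;
    uint32_t prev = 0;
    int have_prev = 0;

    assert(buf != NULL && out != NULL);

    for (size_t off = 0; off + BLOCK_SIZE <= len && n < max_entries;
         off += BLOCK_SIZE) {
        uint32_t h = block_hash(buf + off);

        if (have_prev && h == prev)
            continue;  /* same hash as the previous block: keep only the first */

        out[n].ptr = buf + off;
        out[n].hash = h;
        n++;
        prev = h;
        have_prev = 1;
    }
    return n;
}
```

With a buffer made of four zero-filled blocks, one block of `'a'` bytes and two more zero-filled blocks, this sketch would record three entries instead of seven: the first block of each run. A matcher that starts from the first block of a run can extend forward over the whole run, which is why the later duplicates add nothing.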
    
    This improves both performance (no repeated string comparisons over long
    runs of identical bytes, or even small groups of bytes) and compression
    (random hash bucket entry culling is less likely to be needed),
    especially with sparse files.
    
    With well-behaved data sets this patch doesn't change much.
    Signed-off-by: Nicolas Pitre <nico@cam.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
diff-delta.c 13.9 KB