    find_pack_entry: replace last_found_pack with MRU cache · a73cdd21
    Committed by Jeff King
    Each pack has an index for looking up entries in O(log n)
    time, but if we have multiple packs, we have to scan through
    them linearly. This can produce a measurable overhead for
    some operations.
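
    To make the shape of the problem concrete, here is a
    self-contained sketch of the lookup pattern described above
    (illustrative types and names only, not the actual git
    code): each pack's sorted index gives an O(log n) binary
    search, but the packs themselves are still tried one by one.

        #include <stddef.h>
        #include <string.h>

        /* Illustrative stand-ins for git's real pack structures. */
        struct pack {
            const unsigned char (*oids)[20]; /* sorted object ids (the index) */
            size_t nr_objects;
            struct pack *next;
        };

        /* O(log n) binary search within a single pack's index. */
        static int pack_contains(const struct pack *p, const unsigned char *oid)
        {
            size_t lo = 0, hi = p->nr_objects;

            while (lo < hi) {
                size_t mid = lo + (hi - lo) / 2;
                int cmp = memcmp(oid, p->oids[mid], 20);

                if (!cmp)
                    return 1;
                if (cmp < 0)
                    hi = mid;
                else
                    lo = mid + 1;
            }
            return 0;
        }

        /* The expensive part: a linear walk over every pack. */
        static const struct pack *find_pack(const struct pack *packs,
                                            const unsigned char *oid)
        {
            const struct pack *p;

            for (p = packs; p; p = p->next)
                if (pack_contains(p, oid))
                    return p;
            return NULL;
        }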
    
    We dealt with this long ago in f7c22cc6 (always start looking
    up objects in the last used pack first, 2007-05-30), which
    keeps what is essentially a 1-element most-recently-used
    cache. In theory, we should be able to do better by keeping
    a similar but longer cache that is the same length as the
    pack-list itself.
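
    In those terms, the f7c22cc6 optimization amounts to
    something like the following (again only a sketch, reusing
    the illustrative find_pack()/pack_contains() helpers from
    above rather than the real last_found_pack code):

        /* 1-element MRU cache in the spirit of last_found_pack. */
        static const struct pack *last_found;

        static const struct pack *find_pack_cached(const struct pack *packs,
                                                   const unsigned char *oid)
        {
            const struct pack *p;

            /* Try the most recently successful pack first... */
            if (last_found && pack_contains(last_found, oid))
                return last_found;

            /* ...then fall back to the full scan, remembering any hit. */
            for (p = packs; p; p = p->next) {
                if (p == last_found)
                    continue;
                if (pack_contains(p, oid)) {
                    last_found = p;
                    return p;
                }
            }
            return NULL;
        }

    The longer cache discussed here generalizes this from one
    remembered pack to a full most-recently-used ordering of the
    pack list.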
    
    Since we now have a convenient generic MRU structure, we can
    plug it in and measure. Here are the numbers for running
    p5303 against linux.git:
    
    Test                      HEAD^                HEAD
    ------------------------------------------------------------------------
    5303.3: rev-list (1)      31.56(31.28+0.27)    31.30(31.08+0.20) -0.8%
    5303.4: repack (1)        40.62(39.35+2.36)    40.60(39.27+2.44) -0.0%
    5303.6: rev-list (50)     31.31(31.06+0.23)    31.23(31.00+0.22) -0.3%
    5303.7: repack (50)       58.65(69.12+1.94)    58.27(68.64+2.05) -0.6%
    5303.9: rev-list (1000)   38.74(38.40+0.33)    31.87(31.62+0.24) -17.7%
    5303.10: repack (1000)    367.20(441.80+4.62)  342.00(414.04+3.72) -6.9%
    
    The main numbers of interest here are the rev-list ones
    (since that is exercising the normal object lookup code
    path).  The single-pack case shouldn't improve at all; the
    260ms speedup there is just part of the run-to-run noise
    (but it's important to note that we didn't make anything
    worse with the overhead of maintaining our cache). In the
    50-pack case, we see similar results. There may be a slight
    improvement, but it's mostly within the noise.
    
    The 1000-pack case does show a big improvement, though. That
    carries over to the repack case, as well. Even though we
    haven't touched its pack-search loop yet, it does still do a
    lot of normal object lookups (e.g., for the internal
    revision walk), and so improves.
    
    As a point of reference, I also ran the 1000-pack test
    against a version of HEAD^ with the last_found_pack
    optimization disabled. It takes ~60s, so that gives an
    indication of how much even the single-element cache is
    helping.
    
    For comparison, here's a smaller repository, git.git:
    
    Test                      HEAD^               HEAD
    ---------------------------------------------------------------------
    5303.3: rev-list (1)      1.56(1.54+0.01)    1.54(1.51+0.02) -1.3%
    5303.4: repack (1)        1.84(1.80+0.10)    1.82(1.80+0.09) -1.1%
    5303.6: rev-list (50)     1.58(1.55+0.02)    1.59(1.57+0.01) +0.6%
    5303.7: repack (50)       2.50(3.18+0.04)    2.50(3.14+0.04) +0.0%
    5303.9: rev-list (1000)   2.76(2.71+0.04)    2.24(2.21+0.02) -18.8%
    5303.10: repack (1000)    13.21(19.56+0.25)  11.66(18.01+0.21) -11.7%
    
    You can see that the percentage improvement is similar.
    That's because the lookup we are optimizing is roughly
    O(nr_objects * nr_packs). Since the number of packs is
    constant in both tests, we'd expect the improvement to be
    linear in the number of objects. But the whole process is
    also linear in the number of objects, so the relative
    improvement works out to roughly the same constant factor,
    regardless of repository size.
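
    Spelled out as a rough cost model (just restating the
    argument above, not something that was measured directly):

        lookup time ~ nr_objects * nr_packs * c_pack
        total time  ~ nr_objects * (c_other + nr_packs * c_pack)

    Both the time saved and the total runtime scale with
    nr_objects, so the ratio between them (the percentage shown
    in the tables) depends only on nr_packs and the per-step
    constants, not on the size of the repository.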
    
    The exact improvement does also depend on the contents of
    the packs. In p5303, the extra packs all have 5 first-parent
    commits in them, which is a reasonable simulation of a
    pushed-to repository. But it also means that only 250
    first-parent commits are in those packs (compared to almost
    50,000 total in linux.git), and the rest are in the huge
    "base" pack. So once we start looking at history in taht big
    pack, that's where we'll find most everything, and even the
    1-element cache gets close to 100% cache hits.  You could
    almost certainly show better numbers with a more
    pathological case (e.g., distributing the objects more
    evenly across the packs). But that's simply not that
    realistic a scenario, so it makes more sense to focus on
    these numbers.
    
    The implementation itself is a straightforward application
    of the MRU code. We provide an MRU-ordered list of packs
    that shadows the packed_git list. This is easy to do because
    we only create and revise the pack list in one place. The
    "reprepare" code path actually drops the whole MRU and
    replaces it for simplicity. It would be more efficient to
    just add new entries, but there's not much point in
    optimizing here; repreparing happens rarely, and only after
    doing a lot of other expensive work.  The key things to keep
    optimized are traversal (which is just a normal linked list,
    albeit with one extra level of indirection over the regular
    packed_git list), and marking (which is a constant number of
    pointer assignments, slightly more than the old
    last_found_pack needed, but not enough to create a
    measurable slowdown).
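
    For reference, the shape of the resulting code is roughly as
    follows (a sketch based on the description above, assuming
    the mru_append()/mru_mark()/mru_clear() API of the mru.h
    introduced earlier in this series; details in the actual
    patch may differ):

        /* MRU-ordered list of packs, shadowing the packed_git list. */
        static struct mru packed_git_mru_storage;
        struct mru *packed_git_mru = &packed_git_mru_storage;

        /*
         * Rebuilt whenever the pack list is (re)prepared; the
         * reprepare path simply drops the whole list and re-adds
         * every pack.
         */
        static void prepare_packed_git_mru(void)
        {
            struct packed_git *p;

            mru_clear(packed_git_mru);
            for (p = packed_git; p; p = p->next)
                mru_append(packed_git_mru, p);
        }

        /* Walk packs in MRU order; promote whichever pack hits. */
        static int find_pack_entry(const unsigned char *sha1, struct pack_entry *e)
        {
            struct mru_entry *p;

            prepare_packed_git();
            if (!packed_git)
                return 0;

            for (p = packed_git_mru->head; p; p = p->next) {
                if (fill_pack_entry(sha1, e, p->item)) {
                    mru_mark(packed_git_mru, p); /* constant-time promotion */
                    return 1;
                }
            }
            return 0;
        }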

    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>