1. 31 Dec 2013: 3 commits
    • pack-objects: implement bitmap writing · 7cc8f971
      Vicent Marti authored
      This commit further extends the functionality of `pack-objects` by allowing
      it to write out a `.bitmap` index next to any written packs, together
      with the `.idx` index that currently gets written.
      
      If bitmap writing is enabled for a given repository (either by calling
      `pack-objects` with the `--write-bitmap-index` flag or by having
      `pack.writebitmaps` set to `true` in the config) and pack-objects is
      writing a packfile that would normally be indexed (i.e. not piping to
      stdout), we will attempt to write the corresponding bitmap index for the
      packfile.
      
      Bitmap index writing happens after the packfile and its index have been
      successfully written to disk (`finish_tmp_packfile`). The process is
      performed in several steps (a code sketch of the whole sequence follows
      the list):
      
          1. `bitmap_writer_set_checksum`: this call stores the partial
             checksum for the packfile being written; the checksum will be
             written in the resulting bitmap index to verify its integrity
      
          2. `bitmap_writer_build_type_index`: this call uses the array of
             `struct object_entry` that has just been sorted when writing out
             the actual packfile index to disk to generate 4 type-index bitmaps
             (one for each object type).
      
             These bitmaps have their nth bit set if the given object is of
             the bitmap's type. E.g. the nth bit of the Commits bitmap will be
             1 if the nth object in the packfile index is a commit.
      
             This is a very cheap operation because the bitmap writing code has
             access to the metadata stored in the `struct object_entry` array,
             and hence the real type for each object in the packfile.
      
          3. `bitmap_writer_reuse_bitmaps`: if a bitmap index already exists
             for one of the packfiles we're trying to repack, this call
             will efficiently rebuild the existing bitmaps so they can be
             reused on the new index. All the existing bitmaps will be stored
             in a `reuse` hash table, and the commit selection phase will
             prioritize these when selecting, as they can be written directly
             to the new index without having to perform a revision walk to
             fill the bitmap. This can greatly speed up the repack of a
             repository that already has bitmaps.
      
          4. `bitmap_writer_select_commits`: if bitmap writing is enabled for
             a given `pack-objects` run, the sequence of commits generated
             during the Counting Objects phase will be stored in an array.
      
             We then use that array to build up the list of selected commits.
             Writing a bitmap in the index for each object in the repository
             would be cost-prohibitive, so we use a simple heuristic to pick
             the commits that will be indexed with bitmaps.
      
             The current heuristics are a simplified version of JGit's
             original implementation. We select a higher density of commits
             depending on their age: the 100 most recent commits are always
             selected, after that we pick 1 commit out of every 100, and the
             gap increases as the commits grow older. On top of that, we make
             sure that every branch that has not been merged (all the tips
             that would be required from a clone) gets its own bitmap, and
             when selecting commits within a gap, we tend to prioritize the
             commit with the most parents.
      
             Do note that there is no right/wrong way to perform commit
             selection; different selection algorithms will result in
             different commits being selected, but there's no such thing as
             "missing a commit". The bitmap walker algorithm implemented in
             `prepare_bitmap_walk` is able to adapt to missing bitmaps by
             performing manual walks that complete the bitmap: the ideal
             selection algorithm, however, would select the commits that are
             more likely to be used as roots for a walk in the future (e.g.
             the tips of each branch, and so on) to ensure a bitmap for them
             is always available.
      
          5. `bitmap_writer_build`: this is the computationally expensive part
             of bitmap generation. Based on the list of commits that were
             selected in the previous step, we perform several incremental
             walks to generate the bitmap for each commit.
      
             The walks begin from the oldest commit, and are built up
             incrementally for each branch. E.g. consider this dag where A, B,
             C, D, E, F are the selected commits, and a, b, c, e are a chunk
             of simplified history that will not receive bitmaps.
      
                  A---a---B--b--C--c--D
                           \
                            E--e--F
      
             We start by building the bitmap for A, using A as the root for a
             revision walk and marking all the objects that are reachable
             until the walk is over. Once this bitmap is stored, we reuse the
             bitmap walker to perform the walk for B, assuming that once we
             reach A again, the walk will be terminated because A has already
             been SEEN on the previous walk.
      
             This process is repeated for C and D, but when we try to
             generate the bitmap for E, we can reuse neither the current walk
             nor the bitmap we have generated so far.
      
             What we do now is reset the walk, clear the bitmap, and perform
             the walk from scratch using E as the origin. This new walk,
             however, does not need to be completed.
             Once we hit B, we can look up the bitmap we have already stored
             for that commit and OR it with the existing bitmap we've composed
             so far, allowing us to limit the walk early.
      
             After all the bitmaps have been generated, another iteration
             through the list of commits is performed to find the best XOR
             offsets for compression before writing them to disk. Because of
             the incremental nature of these bitmaps, XORing one of them with
             its predecessor results in a minimal "bitmap delta" most of the
             time. We can write this delta to the on-disk bitmap index, and
             then re-compose the original bitmaps by XORing them again when
             loaded.
      
             This phase is very similar to pack-objects' `find_delta` (using
             bitmaps instead of objects, of course), except the heuristics
             have been greatly simplified: we only check the 10 bitmaps before
             any given one to find the best compressing one. This gives good
             results in practice, because there is locality in the ordering of
             the objects (and therefore bitmaps) in the packfile.
      
          6. `bitmap_writer_finish`: the last step in the process is
             serializing to disk all the bitmap data that has been generated
             in the two previous steps.

             The bitmap is written to a tmp file and then moved atomically to
             its final destination, using the same process as
             `pack-write.c:write_idx_file`.
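
      Putting the steps together, the hook in pack-objects amounts to a short
      sequence of calls into the bitmap writer. The sketch below is only an
      approximation: the six function names come from the list above, but the
      parameters (the written object list, the selected commits, the output
      path, the `to_pack` bookkeeping) are assumptions about the surrounding
      code, not a verbatim copy of it.

          /*
           * Approximate write-out sequence; names such as written_list,
           * nr_written and indexed_commits stand in for whatever
           * bookkeeping pack-objects keeps and are assumptions.
           */
          bitmap_writer_set_checksum(pack_checksum);                 /* step 1 */
          bitmap_writer_build_type_index(written_list, nr_written);  /* step 2 */
          bitmap_writer_reuse_bitmaps(&to_pack);                     /* step 3 */
          bitmap_writer_select_commits(indexed_commits,              /* step 4 */
                                       indexed_commits_nr, -1);
          bitmap_writer_build(&to_pack);                             /* step 5 */
          bitmap_writer_finish(written_list, nr_written,             /* step 6 */
                               bitmap_file_name, 0);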
      Signed-off-by: Vicent Marti <tanoku@gmail.com>
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      7cc8f971
    • pack-objects: use bitmaps when packing objects · 6b8fda2d
      Vicent Marti authored
      In this patch, we use the bitmap API to perform the `Counting Objects`
      phase in pack-objects, rather than a traditional walk through the object
      graph. For a reasonably-packed large repo, the time to fetch and clone
      is often dominated by the full-object revision walk during the Counting
      Objects phase. Using bitmaps can reduce the CPU time required on the
      server (and therefore start sending the actual pack data with less
      delay).
      
      For bitmaps to be used, the following must be true:
      
        1. We must be packing to stdout (as a normal `pack-objects` from
           `upload-pack` would do).
      
        2. There must be a .bitmap index containing at least one of the
           "have" objects that the client is asking for.
      
        3. Bitmaps must be enabled (they are enabled by default, but can be
           disabled by setting `pack.usebitmaps` to false, or by using
           `--no-use-bitmap-index` on the command-line).
      
      If any of these is not true, we fall back to doing a normal walk of the
      object graph.
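
      In code, the decision boils down to "try the bitmaps first, otherwise
      walk". The following is a hedged sketch: prepare_bitmap_walk() and
      traverse_commit_list() are entry points named elsewhere in this series,
      while the helper name, traverse_bitmap_commit_list() and the
      add_object_from_bitmap callback are assumed names for illustration.

          /* hypothetical helper showing the fallback logic */
          static void enumerate_objects(struct rev_info *revs)
          {
                  /* bitmaps enabled and the request covered by a .bitmap? */
                  if (use_bitmap_index && !prepare_bitmap_walk(revs)) {
                          traverse_bitmap_commit_list(add_object_from_bitmap);
                          return;
                  }
                  /* otherwise, do the traditional Counting Objects walk */
                  traverse_commit_list(revs, show_commit, show_object, NULL);
          }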
      
      Here are some sample timings from a full pack of `torvalds/linux` (i.e.
      something very similar to what would be generated for a clone of the
      repository) that show the speedup produced by various
      methods:
      
          [existing graph traversal]
          $ time git pack-objects --all --stdout --no-use-bitmap-index \
      			    </dev/null >/dev/null
          Counting objects: 3237103, done.
          Compressing objects: 100% (508752/508752), done.
          Total 3237103 (delta 2699584), reused 3237103 (delta 2699584)
      
          real    0m44.111s
          user    0m42.396s
          sys     0m3.544s
      
          [bitmaps only, without partial pack reuse; note that
           pack reuse is automatic, so timing this required a
           patch to disable it]
          $ time git pack-objects --all --stdout </dev/null >/dev/null
          Counting objects: 3237103, done.
          Compressing objects: 100% (508752/508752), done.
          Total 3237103 (delta 2699584), reused 3237103 (delta 2699584)
      
          real    0m5.413s
          user    0m5.604s
          sys     0m1.804s
      
          [bitmaps with pack reuse (what you get with this patch)]
          $ time git pack-objects --all --stdout </dev/null >/dev/null
          Reusing existing pack: 3237103, done.
          Total 3237103 (delta 0), reused 0 (delta 0)
      
          real    0m1.636s
          user    0m1.460s
          sys     0m0.172s
      Signed-off-by: Vicent Marti <tanoku@gmail.com>
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      6b8fda2d
    • pack-objects: split add_object_entry · ce2bc424
      Jeff King authored
      This function actually does three things:
      
        1. Check whether we've already added the object to our
           packing list.
      
        2. Check whether the object meets our criteria for adding.
      
        3. Actually add the object to our packing list.
      
      It's a little hard to see these three phases, because they
      happen linearly in the rather long function. Instead, this
      patch breaks them up into three separate helper functions.
      
      The result is a little easier to follow, though it
      unfortunately suffers from some optimization
      interdependencies between the stages (e.g., during step 3 we
      use the packing list index from step 1 and the packfile
      information from step 2).
      
      More importantly, though, the various parts can be
      composed differently, as they will be in the next patch.
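
      The resulting shape of add_object_entry() is roughly the following; the
      three helper names match the split described above, but the parameter
      lists are abridged and should be read as a sketch rather than the exact
      patch.

          static int add_object_entry(const unsigned char *sha1,
                                      enum object_type type,
                                      const char *name, int exclude)
          {
                  struct packed_git *found_pack = NULL;
                  off_t found_offset = 0;
                  uint32_t index_pos;

                  /* 1. already on the packing list? */
                  if (have_duplicate_entry(sha1, exclude, &index_pos))
                          return 0;

                  /* 2. does it meet our criteria for adding? */
                  if (!want_object_in_pack(sha1, exclude, &found_pack, &found_offset))
                          return 0;

                  /* 3. actually add it, reusing data from steps 1 and 2 */
                  create_object_entry(sha1, type, pack_name_hash(name), exclude,
                                      index_pos, found_pack, found_offset);
                  return 1;
          }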
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      ce2bc424
  2. 25 Oct 2013: 2 commits
    • pack-objects: factor out name_hash · 68fb36eb
      Vicent Marti authored
      As the pack-objects system grows beyond the single
      pack-objects.c file, more parts (like the soon-to-exist
      bitmap code) will need to compute hashes for matching
      deltas. Factor out name_hash to make it available to other
      files.
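
      The helper itself is tiny. The version below is a close paraphrase,
      written as a standalone function so it compiles on its own; treat the
      exact name and the header it ends up in as details of the series rather
      than of this summary.

          #include <ctype.h>
          #include <stdint.h>

          /*
           * Derive a sortable 32-bit number from the tail of a path name, so
           * that files with the same basename/extension hash near each other
           * and become good delta candidates.
           */
          static uint32_t pack_name_hash(const char *name)
          {
                  uint32_t c, hash = 0;

                  if (!name)
                          return 0;
                  while ((c = (unsigned char)*name++) != 0) {
                          if (isspace(c))
                                  continue;
                          hash = (hash >> 2) + (c << 24); /* last chars count most */
                  }
                  return hash;
          }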
      Signed-off-by: Vicent Marti <tanoku@gmail.com>
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      68fb36eb
    • pack-objects: refactor the packing list · 2834bc27
      Vicent Marti authored
      The hash table that stores the packing list for a given `pack-objects`
      run was tightly coupled to the pack-objects code.
      
      In this commit, we refactor the hash table and the underlying storage
      array into a `packing_data` struct. The functionality for accessing and
      adding entries to the packing list is hence accessible from other parts
      of Git besides the `pack-objects` builtin.
      
      This refactoring is a requirement for further patches in this series
      that will require accessing the commit packing list from outside of
      `pack-objects`.
      
      The hash table implementation has been minimally altered: we now
      use table sizes which are always a power of two, to ensure a uniform
      index distribution in the array.
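
      A hedged sketch of the extracted type and its accessors follows; the
      field names and exact signatures are an approximation of what this
      series appears to define, not a verbatim copy.

          struct packing_data {
                  struct object_entry *objects;   /* the packing list itself */
                  uint32_t nr_objects, nr_alloc;  /* used / allocated entries */

                  uint32_t *index;                /* hash table of positions ... */
                  uint32_t index_size;            /* ... always a power of two */
          };

          /* look up an entry by object name, or allocate a new one */
          struct object_entry *packlist_find(struct packing_data *pdata,
                                             const unsigned char *sha1,
                                             uint32_t *index_pos);
          struct object_entry *packlist_alloc(struct packing_data *pdata,
                                              const unsigned char *sha1,
                                              uint32_t index_pos);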
      Signed-off-by: Vicent Marti <tanoku@gmail.com>
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      2834bc27
  3. 29 Aug 2013: 1 commit
  4. 03 Aug 2013: 1 commit
    • Don't close pack fd when free'ing pack windows · 7c3ecb32
      Brandon Casey authored
      Now that close_one_pack() has been introduced to handle file
      descriptor pressure, it is not strictly necessary to close the
      pack file descriptor in unuse_one_window() when we're under memory
      pressure.
      
      Jeff King provided a justification for leaving the pack file open:
      
         If you close packfile descriptors, you can run into racy situations
         where somebody else is repacking and deleting packs, and they go away
         while you are trying to access them. If you keep a descriptor open,
         you're fine; they last to the end of the process. If you don't, then
         they disappear from under you.
      
         For normal object access, this isn't that big a deal; we just rescan
         the packs and retry. But if you are packing yourself (e.g., because
         you are a pack-objects started by upload-pack for a clone or fetch),
         it's much harder to recover (and we print some warnings).
      
      Let's do so (or uh, not do so).
      Signed-off-by: Brandon Casey <drafnel@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      7c3ecb32
  5. 05 Feb 2013: 1 commit
  6. 05 Oct 2012: 1 commit
    • peel_ref: do not return a null sha1 · e6dbffa6
      Jeff King authored
      The idea of the peel_ref function is to dereference tag
      objects recursively until we hit a non-tag, and return the
      sha1. Conceptually, it should return 0 if it is successful
      (and fill in the sha1), or -1 if there was nothing to peel.
      
      However, the current behavior is much more confusing. For a
      regular loose ref, the behavior is as described above. But
      there is an optimization to reuse the peeled-ref value for a
      ref that came from a packed-refs file. If we have such a
      ref, we return its peeled value, even if that peeled value
      is null (indicating that we know the ref definitely does
      _not_ peel).
      
      It might seem like such information is useful to the caller,
      who would then know not to bother loading and trying to peel
      the object. Except that they should not bother loading and
      trying to peel the object _anyway_, because that fallback is
      already handled by peel_ref. In other words, the whole point
      of calling this function is that it handles those details
      internally, and you either get a sha1, or you know that it
      is not peel-able.
      
      This patch catches the null sha1 case internally and
      converts it into a -1 return value (i.e., there is nothing
      to peel). This simplifies callers, which do not need to
      bother checking themselves.
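
      A minimal usage sketch under that convention (the ref name here is
      hypothetical):

          unsigned char peeled[20];

          if (peel_ref("refs/tags/v1.0", peeled) < 0) {
                  /* nothing to peel: not a tag, or it does not resolve */
          } else {
                  /* peeled holds the sha1 of the underlying non-tag object */
          }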
      
      Two callers are worth noting:
      
        - in pack-objects, a comment indicates that there is a
          difference between non-peelable tags and unannotated
          tags. But that is not the case (before or after this
          patch). Whether you get a null sha1 has to do with
          internal details of how peel_ref operated.
      
        - in show-ref, if peel_ref returns a failure, the caller
          tries to decide whether to try peeling manually based on
          whether the REF_ISPACKED flag is set. But this doesn't
          make any sense. If the flag is set, that does not
          necessarily mean the ref came from a packed-refs file
          with the "peeled" extension. But it doesn't matter,
          because even if it didn't, there's no point in trying to
          peel it ourselves, as peel_ref would already have done
          so. In other words, the fallback peeling is guaranteed
          to fail.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      e6dbffa6
  7. 21 Aug 2012: 1 commit
  8. 10 Jul 2012: 1 commit
  9. 30 May 2012: 1 commit
  10. 19 May 2012: 2 commits
  11. 12 Apr 2012: 1 commit
    • gc: do not explode objects which will be immediately pruned · 7e52f566
      Jeff King authored
      When we pack everything into one big pack with "git repack
      -Ad", any unreferenced objects in to-be-deleted packs are
      exploded into loose objects, with the intent that they will
      be examined and possibly cleaned up by the next run of "git
      prune".
      
      Since the exploded objects will receive the mtime of the
      pack from which they come, if the source pack is old, those
      loose objects will end up pruned immediately. In that case,
      it is much more efficient to skip the exploding step
      entirely for these objects.
      
      This patch teaches pack-objects to receive the expiration
      information and avoid writing these objects out. It also
      teaches "git gc" to pass the value of gc.pruneexpire to
      repack (which in turn learns to pass it along to
      pack-objects) so that this optimization happens
      automatically during "git gc" and "git gc --auto".
      Signed-off-by: Jeff King <peff@peff.net>
      Acked-by: Nicolas Pitre <nico@fluxnic.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      7e52f566
  12. 27 Feb 2012: 1 commit
  13. 02 Feb 2012: 3 commits
  14. 13 Jan 2012: 1 commit
    • thin-pack: try harder to use preferred base objects as base · 15f07e06
      Jeff King authored
      When creating a pack using objects that reside in existing packs, we try
      to avoid recomputing a futile delta between an object (trg) and a candidate
      for its base object (src) if they are stored in the same packfile, and trg
      is not already recorded as a delta. This heuristic makes sense because it
      is likely that we tried to express trg as a delta based on src but it did
      not produce a good delta when we created the existing pack.
      
      As the pack heuristics prefer producing deltas that remove data, and Linus's
      law dictates that the size of a file grows over time, we tend to record
      the newest version of the file as inflated, and older ones as deltas
      against it.
      
      When creating a thin-pack to transfer recent history, it is likely that we
      will try to send an object that is recorded in full, as it is newer.  But
      the heuristic to avoid recomputing futile deltas effectively forbids us
      from attempting to express such an object as a delta based on another
      object. Sending an object in full is often more expensive than sending a
      suboptimal delta based on other objects, and it is even more so if we
      could use an object we know the receiving end already has (i.e. preferred
      base object) as the delta base.
      
      Tweak the recomputation avoidance logic, so that we do not punt on
      computing delta against a preferred base object.
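
      Roughly, the early-return check in the delta-candidate loop gains one
      extra condition; the sketch below shows the intended shape, with the
      preferred_base test being the new part. The surrounding field names are
      assumptions based on the description above, not the literal patch.

          if (reuse_delta && trg_entry->in_pack &&
              trg_entry->in_pack == src_entry->in_pack &&
              !src_entry->preferred_base &&          /* new: never punt on these */
              trg_entry->in_pack_type != OBJ_REF_DELTA &&
              trg_entry->in_pack_type != OBJ_OFS_DELTA)
                  return 0;                          /* skip the futile recomputation */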
      
      The effect of this change can be seen on two simulated upload-pack
      workloads. The first is based on 44 reflog entries from my git.git
      origin/master reflog, and represents the packs that kernel.org sent me git
      updates for the past month or two. The second workload represents much
      larger fetches, going from git's v1.0.0 tag to v1.1.0, then v1.1.0 to
      v1.2.0, and so on.
      
      The table below shows the average generated pack size and the average CPU
      time consumed for each dataset, both before and after the patch:
      
                        dataset
                  | reflog | tags
      ---------------------------------
           before | 53358  | 2750977
      size  after | 32398  | 2668479
           change |   -39% |      -3%
      ---------------------------------
           before |  0.18  | 1.12
      CPU   after |  0.18  | 1.15
           change |    +0% |      +3%
      
      This patch makes a much bigger difference for packs with a shorter slice
      of history (since its effect is seen at the boundaries of the pack) though
      it has some benefit even for larger packs.
      Signed-off-by: Jeff King <peff@peff.net>
      Acked-by: Nicolas Pitre <nico@fluxnic.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      15f07e06
  15. 02 Dec 2011: 1 commit
  16. 17 Nov 2011: 1 commit
    • pack-object: tolerate broken packs that have duplicated objects · f63c79db
      Junio C Hamano authored
      When --reuse-delta is in effect (which is the default), and an existing
      pack in the repository has the same object registered twice (e.g. one copy
      in a non-delta format and the other copy in a delta against some other
      object), an attempt to repack the repository can result in a cyclic delta
      dependency, causing the write_one() function to recurse into
      itself infinitely.
      
      Detect such a case and break the loopy dependency by writing out an object
      that is involved in such a loop in the non-delta format.
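
      A hedged sketch of the idea (not the literal patch): flag an entry
      while its delta base is being written out, and if the recursion comes
      back to a flagged entry, the chain is cyclic, so drop the delta and
      emit that object in full. Field names and the return convention below
      are illustrative.

          static int write_one(struct sha1file *f, struct object_entry *e,
                               off_t *offset)
          {
                  if (e->being_written)
                          return -1;               /* delta chain loops back to us */

                  if (e->delta && !e->delta->offset) {   /* base not written yet */
                          e->being_written = 1;
                          if (write_one(f, e->delta, offset) < 0)
                                  e->delta = NULL; /* break the cycle: write in full */
                          e->being_written = 0;
                  }
                  /* ... write e itself (as a delta if e->delta survived, else full) ... */
                  return 0;
          }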
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      f63c79db
  17. 29 Oct 2011: 3 commits
  18. 28 Oct 2011: 1 commit
  19. 21 Oct 2011: 1 commit
    • pack-objects: don't traverse objects unnecessarily · 38d4debb
      Dan McGee authored
      This brings back some of the performance lost in optimizing recency
      order inside pack objects. We were doing extreme amounts of object
      re-traversal: for the 2.14 million objects in the Linux kernel
      repository, we were calling add_to_write_order() over 1.03 billion times
      (a 0.2% hit rate, making 99.8% of these calls extraneous).
      
      Two optimizations take place here: we can start our objects array
      iteration from a known point where we left off before we started trying
      to find our tags, and we don't need to do the deep dives required by
      add_family_to_write_order() if the object has already been marked as
      filled.
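
      The second optimization is essentially a one-line guard; roughly (a
      sketch based on the description above):

          static void add_to_write_order(struct object_entry **wo,
                                         unsigned int *endp,
                                         struct object_entry *e)
          {
                  if (e->filled)
                          return;         /* already placed: skip the re-traversal */
                  wo[(*endp)++] = e;
                  e->filled = 1;
          }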
      
      These two optimizations bring some pretty spectacular results via `perf
      stat`:
      
      task-clock:   83373 ms        --> 43800 ms         (50% faster)
      cycles:       221,633,461,676 --> 116,307,209,986  (47% fewer)
      instructions: 149,299,179,939 --> 122,998,800,184  (18% fewer)
      
      Helped-by: Ramsay Jones (format string fix in "die" message)
      Signed-off-by: Dan McGee <dpmcgee@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      38d4debb
  20. 18 Oct 2011: 3 commits
  21. 15 Oct 2011: 2 commits
    • downgrade "packfile cannot be accessed" errors to warnings · 58a6a9cc
      Jeff King authored
      These can happen if another process simultaneously prunes a
      pack. But that is not usually an error condition, because a
      properly-running prune should have repacked the object into
      a new pack. So we will notice that the pack has disappeared
      unexpectedly, print a message, try other packs (possibly
      after re-scanning the list of packs), and find it in the new
      pack.
      Acked-by: Nicolas Pitre <nico@fluxnic.net>
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      58a6a9cc
    • pack-objects: protect against disappearing packs · 4c080182
      Jeff King authored
      It's possible that while pack-objects is running, a
      simultaneously running prune process might delete a pack
      that we are interested in. Because we load the pack indices
      early on, we know that the pack contains our item, but by
      the time we try to open and map it, it is gone.
      
      Since c715f783, we already protect against this in the normal
      object access code path, but pack-objects accesses the packs
      at a lower level.  In the normal access path, we call
      find_pack_entry, which will call find_pack_entry_one on each
      pack index, which does the actual lookup. If it gets a hit,
      we will actually open and verify the validity of the
      matching packfile (using c715f783's is_pack_valid). If we
      can't open it, we'll issue a warning and pretend that we
      didn't find it, causing us to go on to the next pack (or on
      to loose objects).
      
      Furthermore, we will cache the descriptor to the opened
      packfile. Which means that later, when we actually try to
      access the object, we are likely to still have that packfile
      opened, and won't care if it has been unlinked from the
      filesystem.
      
      Notice the "likely" above. If there is another pack access
      in the interim, and we run out of descriptors, we could
      close the pack. And then a later attempt to access the
      closed pack could fail (we'll try to re-open it, of course,
      but it may have been deleted). In practice, this doesn't
      happen because we tend to look up items and then access them
      immediately.
      
      Pack-objects does not follow this code path. Instead, it
      accesses the packs at a much lower level, using
      find_pack_entry_one directly. This means we skip the
      is_pack_valid check, and may end up with the name of a
      packfile, but no open descriptor.
      
      We can add the same is_pack_valid check here. Unfortunately,
      the access patterns of pack-objects are not quite as nice
      for keeping lookup and object access together. We look up
      each object as we find out about it, and only later, when
      writing the packfile, do we necessarily access it. Which
      means that the opened packfile may be closed in the interim.
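
      The added check then looks roughly like this inside the pack lookup
      loop (surrounding code abridged; `p` iterates over the known packs, and
      the found_pack/found_offset bookkeeping is a simplification):

          off_t offset = find_pack_entry_one(sha1, p);
          if (offset) {
                  if (!found_pack) {
                          if (!is_pack_valid(p)) {
                                  warning("packfile %s cannot be accessed",
                                          p->pack_name);
                                  continue;   /* pretend we did not find it */
                          }
                          found_offset = offset;
                          found_pack = p;
                  }
                  /* ... exclude/kept-pack handling elided ... */
          }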
      
      In practice, however, adding this check still has value, for
      three reasons.
      
        1. If you have a reasonable number of packs and/or a
           reasonable file descriptor limit, you can keep all of
           your packs open simultaneously. If this is the case,
           then the race is impossible to trigger.
      
        2. Even if you can't keep all packs open at once, you
           may end up keeping the deleted one open (i.e., you may
           get lucky).
      
        3. The race window is shortened. You may notice early that
           the pack is gone, and not try to access it. Triggering
           the problem without this check means deleting the pack
           any time after we read the list of index files, but
           before we access the looked-up objects.  Triggering it
           with this check means deleting the pack after we do a lookup
           (and successfully access the packfile), but before we access
           the object. Which
           is a smaller window.
      Acked-by: Nicolas Pitre <nico@fluxnic.net>
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      4c080182
  22. 02 Sep 2011: 1 commit
    • list-objects: pass callback data to show_objects() · 49473672
      Junio C Hamano authored
      The traverse_commit_list() API takes two callback functions, one to show
      commit objects, and the other to show other kinds of objects. Even though
      the former has a callback data parameter, so that the callback does not
      have to rely on global state, the latter does not.
      
      Give the show_objects() callback the same callback data parameter.
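
      The resulting callback signatures look roughly like this (reproduced
      approximately from list-objects.h of this era; exact typedef names may
      differ):

          typedef void (*show_commit_fn)(struct commit *, void *);
          typedef void (*show_object_fn)(struct object *,
                                         const struct name_path *,
                                         const char *,
                                         void *);       /* new: caller's data */

          void traverse_commit_list(struct rev_info *revs,
                                    show_commit_fn show_commit,
                                    show_object_fn show_object,
                                    void *data);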
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      49473672
  23. 05 Aug 2011: 1 commit
  24. 09 Jul 2011: 1 commit
    • pack-objects: optimize "recency order" · 1b4bb16b
      Junio C Hamano authored
      This optimizes the "recency order" (see pack-heuristics.txt in
      Documentation/technical/ directory) used to order objects within a
      packfile in three ways:
      
       - Commits at the tip of tags are written together, in the hope that
         revision traversal done in incremental fetch (which starts by
         putting them in a revision queue marked as UNINTERESTING) will see a
         better locality of these objects;
      
       - In the original recency order, trees and blobs are intermixed. Write
         trees together before blobs, in the hope that this will improve
         locality when running pathspec-limited revision traversal, i.e.
         "git log paths...";
      
       - When writing blob objects out, write the whole family of blobs that use
         the same delta base object together, by starting from the root of the
         delta chain, and writing its immediate children in a width-first
         manner, in the hope that this will again improve locality when reading
         blobs that belong to the same path, which are likely to be deltified
         against each other (a rough sketch follows this list).
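
      For the last point, the family-emitting helper boils down to walking up
      to the root of the delta chain first and then emitting the root's
      descendants level by level. The sketch below uses helper names the
      patch appears to introduce, with bodies abridged.

          static void add_family_to_write_order(struct object_entry **wo,
                                                unsigned int *endp,
                                                struct object_entry *e)
          {
                  struct object_entry *root;

                  for (root = e; root->delta; root = root->delta)
                          ;       /* find the base of the whole delta chain */
                  /* emit root and its children width-first */
                  add_descendants_to_write_order(wo, endp, root);
          }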
      
      I tried various workloads in the Linux kernel repositories (HEAD at
      v3.0-rc6-71-g4dd1b49) packed with v1.7.6 and with this patch, counting how
      large seeks are needed between adjacent accesses to objects in the pack,
      and the result looks promising.  The history has 2072052 objects, weighing
      some 490MiB.
      
       * Simple commit-only log.
      
         $ git log >/dev/null
      
         There are 254656 commits in total.
      
                                        v1.7.6  with patch
         Total number of access :      258,031     258,032
                0.0% percentile :           12          12
               10.0% percentile :          259         259
               20.0% percentile :          294         294
               30.0% percentile :          326         326
               40.0% percentile :          363         363
               50.0% percentile :          415         415
               60.0% percentile :          513         513
               70.0% percentile :          857         858
               80.0% percentile :       10,434      10,441
               90.0% percentile :       91,985      91,996
               95.0% percentile :      260,852     260,885
               99.0% percentile :    1,150,680   1,152,811
               99.9% percentile :    3,148,435   3,148,435
             Less than 2MiB seek:       99.70%      99.69%
      
         95% of the pack accesses look at data that is no further than 260kB
         from the previous location we accessed. The patch does not change the
         order of commit objects very much, and the result is very similar.
      
       * Pathspec-limited log.
      
         $ git log drivers/net >/dev/null
      
         The path is touched by 26551 commits and merges (among 254656 total).
      
                                        v1.7.6  with patch
         Total number of access :      559,511     558,663
                0.0% percentile :            0           0
               10.0% percentile :          182         167
               20.0% percentile :          259         233
               30.0% percentile :          357         304
               40.0% percentile :          714         485
               50.0% percentile :        5,046       3,976
               60.0% percentile :      688,671     443,578
               70.0% percentile :  319,574,732 110,370,100
               80.0% percentile :  361,647,599 123,707,229
               90.0% percentile :  393,195,669 128,947,636
               95.0% percentile :  405,496,875 131,609,321
               99.0% percentile :  412,942,470 133,078,115
               99.5% percentile :  413,172,266 133,163,349
               99.9% percentile :  413,354,356 133,240,445
             Less than 2MiB seek:       61.71%      62.87%
      
         With the current pack heuristics, more than 30% of accesses have to
         seek further than 300MB; the updated pack heuristics ensure that less
         than 0.1% of accesses have to seek further than 135MB. This is largely
         because the updated heuristics do not mix blobs and
         trees together.
      
       * Blame.
      
         $ git blame drivers/net/ne.c >/dev/null
      
         The path is touched by 34 commits and merges.
      
                                        v1.7.6  with patch
         Total number of access :      178,147     178,166
                0.0% percentile :            0           0
               10.0% percentile :          142         139
               20.0% percentile :          222         194
               30.0% percentile :          373         300
               40.0% percentile :        1,168         837
               50.0% percentile :       11,248       7,334
               60.0% percentile :  305,121,284 106,850,130
               70.0% percentile :  361,427,854 123,709,715
               80.0% percentile :  388,127,343 128,171,047
               90.0% percentile :  399,987,762 130,200,707
               95.0% percentile :  408,230,673 132,174,308
               99.0% percentile :  412,947,017 133,181,160
               99.5% percentile :  413,312,798 133,220,425
               99.9% percentile :  413,352,366 133,269,051
             Less than 2MiB seek:       56.47%      56.83%
      
         The result is very similar to the pathspec-limited log above, which
         only looks at the tree objects.
      
       * Packing recent history.
      
         $ (git for-each-ref --format='^%(refname)' refs/tags; echo HEAD) |
           git pack-objects --revs --stdout >/dev/null
      
         This should pack data worth 71 commits.
      
                                        v1.7.6  with patch
         Total number of access :       11,511      11,514
                0.0% percentile :            0           0
               10.0% percentile :           48          47
               20.0% percentile :          134          98
               30.0% percentile :          332         178
               40.0% percentile :        1,386         293
               50.0% percentile :        8,030         478
               60.0% percentile :       33,676       1,195
               70.0% percentile :      147,268      26,216
               80.0% percentile :    9,178,662     464,598
               90.0% percentile :   67,922,665     965,782
               95.0% percentile :   87,773,251   1,226,102
               99.0% percentile :   98,011,763   1,932,377
               99.5% percentile :  100,074,427  33,642,128
               99.9% percentile :  105,336,398 275,772,650
             Less than 2MiB seek:       77.09%      99.04%
      
          The long-tail part of the result looks worse with the patch, but
          the change helps the majority of accesses. 99.04% of the accesses
          need less than 2MiB of seeking, compared to 77.09% with the current
          packing heuristics.
      
       * Index pack.
      
         $ git index-pack -v .git/objects/pack/pack*.pack
      
                                        v1.7.6  with patch
         Total number of access :    2,791,228   2,788,802
                0.0% percentile :            9           9
               10.0% percentile :          140          89
               20.0% percentile :          233         167
               30.0% percentile :          322         235
               40.0% percentile :          464         310
               50.0% percentile :          862         423
               60.0% percentile :        2,566         686
               70.0% percentile :       25,827       1,498
               80.0% percentile :    1,317,862       4,971
               90.0% percentile :   11,926,385     119,398
               95.0% percentile :   41,304,149     952,519
               99.0% percentile :  227,613,070   6,709,650
               99.5% percentile :  321,265,121  11,734,871
               99.9% percentile :  382,919,785  33,155,191
             Less than 2MiB seek:       81.73%      96.92%
      
         As the index-pack command already walks objects in the delta chain
         order, writing the blobs out in the delta chain order seems to
         drastically improve the locality of access.
      
      Note that a half-a-gigabyte packfile comfortably fits in the buffer cache,
      and you are unlikely to see much performance difference on a modern and
      reasonably beefy machine with enough memory and local disks. Benchmarking
      with cold cache (or over NFS) would be interesting.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      1b4bb16b
  25. 11 Jun 2011: 3 commits
    • zlib: zlib can only process 4GB at a time · ef49a7a0
      Junio C Hamano authored
      The size of objects we read from the repository and data we try to put
      into the repository are represented in "unsigned long", so that on larger
      architectures we can handle objects that weigh more than 4GB.
      
      But the interface defined in zlib.h to communicate with inflate/deflate
      limits avail_in (how many bytes of input are we calling zlib with) and
      avail_out (how many bytes of output from zlib are we ready to accept)
      fields effectively to 4GB by defining their type to be uInt.
      
      In many places in our code, we allocate a large buffer (e.g. mmap'ing a
      large loose object file) and tell zlib its size by assigning the size to
      avail_in field of the stream, but that will truncate the high octets of
      the real size. The worst part of this story is that we often pass around
      z_stream (the state object used by zlib) to keep track of the number of
      used bytes in input/output buffer by inspecting these two fields, which
      practically limits our callchain to the same 4GB limit.
      
      Wrap z_stream in another structure git_zstream that can express avail_in
      and avail_out in unsigned long. For now, just die() when the caller gives
      a size that cannot be given to a single zlib call. In later patches in the
      series, we would make git_inflate() and git_deflate() internally loop to
      give callers an illusion that our "improved" version of zlib interface can
      operate on a buffer larger than 4GB in one go.
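
      The wrapper type then looks approximately like this; the field list is
      reconstructed from the description above, and the internal looping is
      added by the later patches in the series.

          typedef struct git_zstream {
                  z_stream z;                 /* the wrapped zlib state */
                  unsigned long avail_in;     /* full-size input budget */
                  unsigned long avail_out;    /* full-size output budget */
                  unsigned long total_in;
                  unsigned long total_out;
                  unsigned char *next_in;
                  unsigned char *next_out;
          } git_zstream;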
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      ef49a7a0
    • zlib: wrap deflateBound() too · 225a6f10
      Junio C Hamano authored
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      225a6f10
    • zlib: wrap deflate side of the API · 55bb5c91
      Junio C Hamano authored
      Wrap deflateInit, deflate, and deflateEnd for everybody, and the sole use
      of deflateInit2 in remote-curl.c to tell the library to use gzip header
      and trailer in git_deflate_init_gzip().
      
      There is only one caller that cares about the status from deflateEnd().
      Introduce git_deflate_end_gently() to let that sole caller retrieve the
      status and act on it (i.e. die) for now, but we would probably want to
      make inflate_end/deflate_end die when they ran out of memory and get
      rid of the _gently() kind.
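
      A minimal usage sketch of the wrapped API follows; `data` and `size`
      are assumed to hold the input buffer, output draining and error
      handling are abridged, and the single-call form assumes the input fits
      in one zlib invocation, as this patch still requires.

          git_zstream stream;
          unsigned char out[4096];

          memset(&stream, 0, sizeof(stream));
          git_deflate_init(&stream, Z_DEFAULT_COMPRESSION);
          stream.next_in = (unsigned char *)data;
          stream.avail_in = size;
          stream.next_out = out;
          stream.avail_out = sizeof(out);
          if (git_deflate(&stream, Z_FINISH) != Z_STREAM_END)
                  die("deflate failed");  /* a real caller would loop and grow out */
          git_deflate_end(&stream);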
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      55bb5c91
  26. 06 Apr 2011: 1 commit
  27. 28 Feb 2011: 1 commit