1. 02 May, 2013 (20 commits)
  2. 18 Mar, 2013 (1 commit)
    • pack-refs: add fully-peeled trait · c29c46fa
      Authored by Michael Haggerty
      Older versions of pack-refs did not write peel lines for
      refs outside of refs/tags. This meant that on reading the
      pack-refs file, we might set the REF_KNOWS_PEELED flag for
      such a ref, even though we do not know anything about its
      peeled value.
      
      The previous commit updated the writer to always peel, no
      matter what the ref is. That means that packed-refs files
      written by newer versions of git are fine to be read by both
      old and new versions of git. However, we still have the
      problem of reading packed-refs files written by older
      versions of git, or by other implementations which have not
      yet learned the same trick.
      
      The simplest fix would be to always unset the
      REF_KNOWS_PEELED flag for refs outside of refs/tags that do
      not have a peel line (if it has a peel line, we know it is
      valid, but we cannot assume a missing peel line means
      anything). But that loses an important optimization, as
      upload-pack should not need to load the object pointed to by
      refs/heads/foo to determine that it is not a tag.
      
      Instead, we add a "fully-peeled" trait to the packed-refs
      file. If it is set, we know that we can trust a missing peel
      line to mean that a ref cannot be peeled. Otherwise, we fall
      back to assuming nothing.
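
      For illustration, a packed-refs file written by a fully-peeling
      version of git might start like this (placeholder hashes; the "^"
      line records the peeled value of the tag above it):

        # pack-refs with: peeled fully-peeled
        <commit sha1> refs/heads/master
        <tag sha1> refs/tags/v1.0
        ^<sha1 the tag peels to>

      With "fully-peeled" present, the absence of a "^" line under
      refs/heads/master means the ref is known not to peel; without the
      trait, a reader must not draw that conclusion.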
      
      [commit message and tests by Jeff King <peff@peff.net>]
      Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  3. 22 Dec, 2012 (1 commit)
    • refs: do not use cached refs in repack_without_ref · b3f1280e
      Authored by Jeff King
      When we delete a ref that is packed, we rewrite the whole
      packed-refs file and simply omit the ref that no longer
      exists. However, we base the rewrite on whatever happens to
      be in our refs cache, not what is necessarily on disk. That
      opens us up to a race condition if another process is
      simultaneously packing the refs, as we will overwrite their
      newly-made pack-refs file with our potentially stale data,
      losing commits.
      
      You can demonstrate the race like this:
      
        # setup some repositories
        git init --bare parent &&
        (cd parent && git config core.logallrefupdates true) &&
        git clone parent child &&
        (cd child && git commit --allow-empty -m base)
      
        # in one terminal, repack the refs repeatedly
        cd parent &&
        while true; do
      	git pack-refs --all
        done
      
        # in another terminal, simultaneously push updates to
        # master, and create and delete an unrelated ref
        cd child &&
        while true; do
      	git push origin HEAD:newbranch &&
      	git commit --allow-empty -m foo &&
      	us=`git rev-parse master` &&
      	git push origin master &&
      	git push origin :newbranch &&
      	them=`git --git-dir=../parent rev-parse master` &&
      	if test "$them" != "$us"; then
      		echo >&2 "$them != $us"
      		exit 1
      	fi
        done
      
      In many cases the two processes will conflict over locking
      the packed-refs file, and the deletion of newbranch will
      simply fail.  But eventually you will hit the race, which
      happens like this:
      
        1. We push a new commit to master. It is already packed
           (from the looping pack-refs call). We write the new
           value (let us call it B) to $GIT_DIR/refs/heads/master,
           but the old value (call it A) remains in the
           packed-refs file.
      
        2. We push the deletion of newbranch, spawning a
           receive-pack process. Receive-pack advertises all refs
           to the client, causing it to iterate over each ref; it
           caches the packed refs in memory, which points at the
           stale value A.
      
        3. Meanwhile, a separate pack-refs process is running. It
           runs to completion, updating the packed-refs file to
           point master at B, and deleting $GIT_DIR/refs/heads/master
           which also pointed at B.
      
        4. Back in the receive-pack process, we get the
           instruction to delete :newbranch. We take a lock on
           packed-refs (which works, as the other pack-refs
           process has already finished). We then rewrite the
           contents using the cached refs, which contain the stale
           value A.
      
      The resulting packed-refs file points master once again at
      A. The loose ref which would override it to point at B was
      deleted (rightfully) in step 3. As a result, master now
      points at A. The only trace that B ever existed in the
      parent is in the reflog: the final entry will show master
      moving from A to B, even though the ref still points at A
      (so you can detect this race after the fact, because the
      next reflog entry will move from A to C).
      
      We can fix this by invalidating the packed-refs cache after
      we have taken the lock. This means that we will re-read the
      packed-refs file, and since we have the lock, we will be
      sure that what we read will be atomically up-to-date when we
      write (it may be out of date with respect to loose refs, but
      that is OK, as loose refs take precedence).
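
      A minimal sketch of that ordering; lock_packed_refs(),
      clear_packed_ref_cache() and rewrite_packed_refs_without() are
      hypothetical stand-ins for the refs.c helpers of the time, not the
      exact functions touched by this patch:

        static int repack_without_ref(const char *refname)
        {
                /* take the packed-refs lock before trusting any cached data */
                if (lock_packed_refs() < 0)
                        return error("cannot lock packed-refs file");

                /*
                 * Drop whatever was read before the lock was held; the next
                 * access re-reads packed-refs, which can no longer change
                 * under us while we hold the lock.
                 */
                clear_packed_ref_cache();

                /* rewrite the freshly read contents without refname, then commit */
                return rewrite_packed_refs_without(refname);
        }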
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  4. 22 Oct, 2012 (1 commit)
    • Fix failure to delete a packed ref through a symref · b274a714
      Authored by Johan Herland
      When deleting a ref through a symref (e.g. using 'git update-ref -d HEAD'
      to delete refs/heads/master), we would remove the loose ref, but a packed
      version of the same ref would remain, the end result being that instead of
      deleting refs/heads/master we would appear to reset it to its state as of
      the last repack.
      
      This patch fixes the issue, by making sure we pass the correct ref name
      when invoking repack_without_ref() from within delete_ref().
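
      Sketched as a one-line change inside delete_ref() (the ref_lock
      field name is approximate, so treat this as an illustration rather
      than the exact patch):

        /* was: repack_without_ref(refname), where "refname" may be "HEAD" */
        ret |= repack_without_ref(lock->ref_name);   /* the resolved ref name */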
      Signed-off-by: Johan Herland <johan@herland.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  5. 17 Oct, 2012 (1 commit)
  6. 05 Oct, 2012 (3 commits)
    • peel_ref: check object type before loading · 6c4a060d
      Authored by Jeff King
      The point of peel_ref is to dereference tags; if the base
      object is not a tag, then we can return early without even
      loading the object into memory.
      
      This patch accomplishes that by checking sha1_object_info
      for the type. For a packed object, we can get away with just
      looking in the pack index. For a loose object, we only need
      to inflate the first couple of header bytes.
      
      This is a bit of a gamble; if we do find a tag object, then
      we will end up loading the content anyway, and the extra
      lookup will have been wasteful. However, if it is not a tag
      object, then we save loading the object entirely. Depending
      on the ratio of non-tags to tags in the input, this can be a
      minor win or minor loss.
      
      However, it does give us one potential major win: if a ref
      points to a large blob (e.g., via an unannotated tag), then
      we can avoid looking at it entirely.
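
      A sketch of the early-out, assuming the sha1_object_info()
      interface of the time (it consults the pack index for packed
      objects and only the inflated header for loose ones):

        enum object_type type = sha1_object_info(base_sha1, NULL);
        if (type < 0)
                return -1;      /* missing object: nothing to peel */
        if (type != OBJ_TAG)
                return -1;      /* not a tag: done, without ever loading it */
        /* only now fall through to the slower tag-parsing path */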
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • peel_ref: do not return a null sha1 · e6dbffa6
      Authored by Jeff King
      The idea of the peel_ref function is to dereference tag
      objects recursively until we hit a non-tag, and return the
      sha1. Conceptually, it should return 0 if it is successful
      (and fill in the sha1), or -1 if there was nothing to peel.
      
      However, the current behavior is much more confusing. For a
      regular loose ref, the behavior is as described above. But
      there is an optimization to reuse the peeled-ref value for a
      ref that came from a packed-refs file. If we have such a
      ref, we return its peeled value, even if that peeled value
      is null (indicating that we know the ref definitely does
      _not_ peel).
      
      It might seem like such information is useful to the caller,
      who would then know not to bother loading and trying to peel
      the object. Except that they should not bother loading and
      trying to peel the object _anyway_, because that fallback is
      already handled by peel_ref. In other words, the whole point
      of calling this function is that it handles those details
      internally, and you either get a sha1, or you know that it
      is not peel-able.
      
      This patch catches the null sha1 case internally and
      converts it into a -1 return value (i.e., there is nothing
      to peel). This simplifies callers, which do not need to
      bother checking themselves.
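
      With that contract, a hypothetical caller only needs to check the
      return value (signature as in the refs API of the time):

        unsigned char peeled[20];

        if (!peel_ref(refname, peeled))
                printf("%s^{} %s\n", refname, sha1_to_hex(peeled));
        /* on -1 there is nothing to peel; no need to load the object
         * and retry the peeling by hand */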
      
      Two callers are worth noting:
      
        - in pack-objects, a comment indicates that there is a
          difference between non-peelable tags and unannotated
          tags. But that is not the case (before or after this
          patch). Whether you get a null sha1 has to do with
          internal details of how peel_ref operated.
      
        - in show-ref, if peel_ref returns a failure, the caller
          tries to decide whether to try peeling manually based on
          whether the REF_ISPACKED flag is set. But this doesn't
          make any sense. If the flag is set, that does not
          necessarily mean the ref came from a packed-refs file
          with the "peeled" extension. But it doesn't matter,
          because even if it didn't, there's no point in trying to
          peel it ourselves, as peel_ref would already have done
          so. In other words, the fallback peeling is guaranteed
          to fail.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • peel_ref: use faster deref_tag_noverify · 44da6f69
      Authored by Jeff King
      When we are asked to peel a ref to a sha1, we internally call
      deref_tag, which will recursively parse each tagged object
      until we reach a non-tag. This has the benefit that we will
      verify our ability to load and parse the pointed-to object.
      
      However, there is a performance downside: we may not need to
      load that object at all (e.g., if we are simply listing
      peeled refs), or it may be a large object
      that should follow a streaming code path (e.g., an annotated
      tag of a large blob).
      
      It makes more sense for peel_ref to choose the fast thing
      rather than performing the extra check, for three reasons:
      
        1. We will already sometimes short-circuit the tag parsing
           in favor of a peeled entry from a packed-refs file. So
           we are already favoring speed in some cases, and it is
           not wise for a caller to rely on peel_ref to detect
           corruption.
      
        2. We already silently ignore much larger corruptions,
           like a ref that points to a non-existent object, or a
           tag object that exists but is corrupted.
      
        3. peel_ref is not the right place to check for such
           database corruption. It is returning only the sha1
           anyway, not the actual object. Any callers which use
           that sha1 to load an object will soon discover the
           corruption anyway, so we are really just pushing back
           the discovery to later in the program.
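
      In code terms the change is roughly the following swap inside
      peel_ref() (a sketch, not the exact diff):

        struct object *o = parse_object(base_sha1);

        if (!o || o->type != OBJ_TAG)
                return -1;      /* nothing to peel */

        /* was: o = deref_tag(o, refname, 0), which parses and verifies
         * every object along the tag chain */
        o = deref_tag_noverify(o);
        if (!o)
                return -1;      /* broken tag chain */
        hashcpy(peeled_sha1, o->sha1);
        return 0;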
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  7. 25 May, 2012 (1 commit)
    • Avoid sorting if references are added to ref_cache in order · 654ad400
      Authored by Michael Haggerty
      The old code allowed many references to be efficiently added to a
      single directory, because it just appended the references to the
      containing directory unsorted without doing any searching (and
      therefore without requiring any intermediate sorting).  But the old
      code was inefficient when a large number of subdirectories were added
      to a directory, because the directory always had to be searched to see
      if the new subdirectory already existed, and this search required the
      directory to be sorted first.  The same was repeated for every new
      subdirectory, so the time scaled like O(N^2), where N is the number of
      subdirectories within a single directory.
      
      In practice, references are often added to the ref_cache in
      lexicographic order, for example when reading the packed-refs file.
      So build some intelligence into add_entry_to_dir() to optimize for the
      case of references and/or subdirectories being added in lexicographic
      order: if the existing entries were already sorted, and the new entry
      comes after the last existing entry, then adjust ref_dir::sorted to
      reflect the fact that the ref_dir is still sorted.
      
      Thanks to Peff for pointing out the performance regression that
      inspired this change.
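
      A sketch of the resulting fast path in add_entry_to_dir() (field
      and helper names follow the ref_dir/ref_entry structures of that
      era; treat them as approximate):

        static void add_entry_to_dir(struct ref_dir *dir, struct ref_entry *entry)
        {
                ALLOC_GROW(dir->entries, dir->nr + 1, dir->alloc);
                dir->entries[dir->nr++] = entry;

                /* optimize for the case that entries arrive in sorted order */
                if (dir->nr == 1 ||
                    (dir->nr == dir->sorted + 1 &&
                     strcmp(dir->entries[dir->nr - 2]->name,
                            dir->entries[dir->nr - 1]->name) < 0))
                        dir->sorted = dir->nr;
        }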
      Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  8. 23 May, 2012 (4 commits)
  9. 21 May, 2012 (1 commit)
  10. 05 May, 2012 (1 commit)
    • refs: fix find_containing_dir() regression · 663c1295
      Authored by Junio C Hamano
      The function used to return NULL when asked to find the containing
      directory for a ref that does not exist, allowing the caller to
      omit iteration altogether. But a misconversion in an earlier change
      "refs.c: extract function search_for_subdir()" started returning the
      top-level directory entry, forcing callers to walk everything.
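
      Seen from a hypothetical caller (names approximate refs.c of the
      time), the restored behaviour is:

        struct ref_dir *dir = find_containing_dir(refs, "refs/remotes/", 0);
        if (!dir)
                return 0;       /* no refs under that prefix: skip iteration */
        /* otherwise iterate only over the entries below that directory */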
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  11. 04 May, 2012 (6 commits)