1. 26 Sep 2015, 1 commit
    • use xsnprintf for generating git object headers · ef1286d3
      Authored by Jeff King
      We generally use 32-byte buffers to format git's "type size"
      header fields. These should not generally overflow unless
      you can produce some truly gigantic objects (and our types
      come from our internal array of constant strings). But it is
      a good idea to use xsnprintf to make sure this is the case.
      
      Note that we slightly modify the interface to
      write_sha1_file_prepare, which now uses "hdrlen" as an "in"
      parameter as well as an "out" (on the way in it stores the
      allocated size of the header, and on the way out it returns
      the ultimate size of the header).
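
      A minimal sketch of the in/out "hdrlen" convention described
      above (illustrative; the hashing details are elided):

          /* "*hdrlen" carries the buffer's allocated size on the way in,
           * and the formatted header's length (including NUL) on the way
           * out.  xsnprintf() dies instead of silently truncating. */
          static void write_sha1_file_prepare(const void *buf, unsigned long len,
                                              const char *type, unsigned char *sha1,
                                              char *hdr, int *hdrlen)
          {
              *hdrlen = xsnprintf(hdr, *hdrlen, "%s %lu", type, len) + 1;
              /* ... hash hdr and buf into sha1 ... */
          }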
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  2. 05 Sep 2015, 1 commit
    • read_info_alternates: handle paths larger than PATH_MAX · 5015f01c
      Authored by Jeff King
      This function assumes that the relative_base path passed
      into it is no larger than PATH_MAX, and writes into a
      fixed-size buffer. However, this path may not have actually
      come from the filesystem; for example, add_submodule_odb
      generates a path using a strbuf and passes it in. This is
      hard to trigger in practice, though, because the long
      submodule directory would have to exist on disk before we
      would try to open its info/alternates file.
      
      We can easily avoid the bug, though, by simply creating the
      filename on the heap.
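
      A sketch of the fix, assuming git's xstrfmt() helper (the
      fixed-size path buffer goes away entirely):

          char *path = xstrfmt("%s/info/alternates", relative_base);
          int fd = git_open_noatime(path);
          free(path);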
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  3. 13 Aug 2015, 1 commit
    • git_open_noatime: return with errno=0 on success · dff6f280
      Authored by Clemens Buchacher
      In read_sha1_file_extended we die if read_object fails with a fatal
      error. We detect a fatal error if errno is non-zero and is not
      ENOENT. If the object could not be read because it does not exist,
      this is not considered a fatal error and we want to return NULL.
      
      Somewhere down the line, read_object calls git_open_noatime to open
      a pack index file, for example. We first try to open with O_NOATIME.
      If that fails with EPERM, we retry without O_NOATIME. When the
      second open succeeds, however, errno is still set to EPERM from the
      first attempt. When we finally determine that the object does not
      exist, read_object returns NULL and read_sha1_file_extended dies
      with a fatal error:
      
          fatal: failed to read object <sha1>: Operation not permitted
      
      Fix this by resetting errno to zero before we call open again.
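
      A sketch of the fixed git_open_noatime() (simplified; git defines
      O_NOATIME to 0 on systems that lack it):

          static int git_open_noatime(const char *name)
          {
              static int sha1_file_open_flag = O_NOATIME;

              for (;;) {
                  int fd;

                  errno = 0;   /* the fix: clear stale EPERM before retrying */
                  fd = open(name, O_RDONLY | sha1_file_open_flag);
                  if (fd >= 0)
                      return fd;

                  /* Might the failure be due to O_NOATIME? Retry without. */
                  if (errno == EPERM && sha1_file_open_flag) {
                      sha1_file_open_flag = 0;
                      continue;
                  }
                  return -1;
              }
          }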
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Clemens Buchacher <clemens.buchacher@intel.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  4. 11 Aug 2015, 2 commits
    • add_to_alternates_file: don't add duplicate entries · 77b9b1d1
      Authored by Jeff King
      The add_to_alternates_file function blindly uses
      hold_lock_file_for_append to copy the existing contents, and
      then adds the new line to it. This has two minor problems:
      
        1. We might add duplicate entries, which are ugly and
           inefficient.
      
        2. We do not check that the file ends with a newline, in
           which case we would bogusly append to the final line.
           This is quite unlikely in practice, though, as we call
           this function only from git-clone, so presumably we are
           the only writers of the file (and we always add a
           newline).
      
      Instead of using hold_lock_file_for_append, let's copy the
      file line by line, which ensures all records are properly
      terminated. If we see the line we are about to add already
      present, we can simply abort the update (there is no point in
      even copying the rest, as we know that it would be identical
      to the original).
      
      As a bonus, we also get rid of some calls to the
      static-buffer mkpath and git_path functions.
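
      A sketch of the copy loop, assuming the strbuf_getline() of this
      era (terminator passed as a char) and a lock-file-backed stream
      "out"; "alts_path" and the surrounding locking are elided:

          struct strbuf line = STRBUF_INIT;
          FILE *in = fopen(alts_path, "r");
          int found = 0;

          while (in && strbuf_getline(&line, in, '\n') != EOF) {
              if (!strcmp(line.buf, reference)) {
                  found = 1;                  /* duplicate: abort the update */
                  break;
              }
              fprintf(out, "%s\n", line.buf); /* every record re-terminated */
          }
          if (!found)
              fprintf(out, "%s\n", reference);
          strbuf_release(&line);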
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • sha1_file.c: rename move_temp_to_file() to finalize_object_file() · cb5add58
      Authored by Junio C Hamano
      Since 5a688fe4 ("core.sharedrepository = 0mode" should set, not
      loosen, 2009-03-25), we kept reminding ourselves:
      
          NEEDSWORK: this should be renamed to finalize_temp_file() as
          "moving" is only a part of what it does, when no patch between
          master to pu changes the call sites of this function.
      
      without doing anything about it.  Let's do so.
      
      The purpose of this function was not to move but to finalize.  The
      detail of the primary implementation of finalizing was to link the
      temporary file to its final name and then unlink the temporary,
      which wasn't even "moving".  The alternative implementation did
      "move" by calling rename(2), which is a fun tangent.
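
      An illustrative sketch of the two strategies the message contrasts
      (simplified; permission fixups and error reporting elided):

          static int finalize_object_file(const char *tmpfile, const char *filename)
          {
              if (!link(tmpfile, filename)) {
                  unlink(tmpfile);   /* primary path: link, then unlink */
                  return 0;
              }
              /* alternative path: an actual "move" */
              return rename(tmpfile, filename);
          }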
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  5. 09 Jul 2015, 1 commit
    • check_and_freshen_file: fix reversed success-check · 3096b2ec
      Authored by Jeff King
      When we want to write out a loose object file, we have
      always first made sure we don't already have the object
      somewhere. Since 33d4221c (write_sha1_file: freshen existing
      objects, 2014-10-15), we also update the timestamp on the
      file, so that a simultaneous prune knows somebody is
      likely to reference it soon.
      
      If our utime() call fails, we treat this the same as not
      having the object in the first place; the safe thing to do
      is write out another copy. However, the loose-object check
      accidentally inverts the utime() check; it returns failure
      _only_ when the utime() call actually succeeded. Thus it was
      failing to protect us there, and in the normal case where
      utime() succeeds, it caused us to pointlessly write out and
      link the object.
      
      This passed our freshening tests, because writing out the
      new object is certainly _one_ way of updating its utime. So
      the normal case was inefficient, but not wrong.
      
      While we're here, let's also add a comment in front of the
      check_and_freshen functions, making a note of their return
      type (since it is not our usual "0 for success, -1 for
      error").
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  6. 23 Jun 2015, 1 commit
    • for_each_packed_object: automatically open pack index · f813e9ea
      Authored by Jeff King
      When for_each_packed_object is called, we call
      prepare_packed_git() to make sure we have the actual list of
      packs. But the latter does not actually open the pack
      indices, meaning that pack->nr_objects may simply be 0 if
      the pack has not otherwise been used since the program
      started.
      
      In practice, this didn't come up for the current callers,
      because they iterate the packed objects only after iterating
      all reachable objects (so for it to matter you would have to
      have a pack consisting only of unreachable objects). But it
      is a dangerous and confusing interface that should be fixed
      for future callers.
      
      Note that we do not end the iteration when a pack cannot be
      opened, but we do return an error. That lets you complete
      the iteration even in an actively-repacked repository where
      an .idx file may racily go away, but it also lets callers know
      that they may not have gotten the complete list (which the
      current reachability-check caller does care about).
      
      We have to tweak one of the prune tests due to the changed
      return value; an earlier test creates bogus .idx files and
      does not clean them up. Having to make this tweak is a good
      thing; it means we will not prune in a broken repository,
      and the test confirms that we do not negatively impact a
      more lenient caller, count-objects.
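
      A sketch of the iteration with this error handling; the per-object
      inner loop is condensed into a hypothetical helper:

          int for_each_packed_object(each_packed_object_fn cb, void *data)
          {
              struct packed_git *p;
              int r = 0;
              int pack_errors = 0;

              prepare_packed_git();
              for (p = packed_git; p; p = p->next) {
                  if (open_pack_index(p)) {
                      pack_errors = 1;   /* note the gap, keep iterating */
                      continue;
                  }
                  r = iterate_pack_objects(p, cb, data);  /* hypothetical */
                  if (r)
                      break;
              }
              return r ? r : pack_errors;
          }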
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  7. 10 Jun 2015, 1 commit
    • index-pack: avoid excessive re-reading of pack directory · 0eeb077b
      Authored by Jeff King
      Since 45e8a748 (has_sha1_file: re-check pack directory before
      giving up, 2013-08-30), we spend extra effort for
      has_sha1_file to give the right answer when somebody else is
      repacking. Usually this effort does not matter, because
      after finding that the object does not exist, the next step
      is usually to die().
      
      However, some code paths make a large number of
      has_sha1_file checks which are _not_ expected to return 1.
      The collision test in index-pack.c is such a case. On a
      local system, this can cause a performance slowdown of
      around 5%. But on a system with high-latency system calls
      (like NFS), it can be much worse.
      
      This patch introduces a "quick" flag to has_sha1_file which
      callers can use when they would prefer high performance at
      the cost of false negatives during repacks. There may be
      other code paths that can use this, but the index-pack one
      is the most obviously critical, so we'll start with
      switching that one.
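
      A sketch of the flag's effect (names follow this series; the real
      lookup has more steps):

          int has_sha1_file_with_flags(const unsigned char *sha1, int flags)
          {
              struct pack_entry e;

              if (find_pack_entry(sha1, &e) || has_loose_object(sha1))
                  return 1;
              if (flags & HAS_SHA1_QUICK)
                  return 0;            /* accept a racy false negative */
              reprepare_packed_git();  /* re-scan the pack directory */
              return find_pack_entry(sha1, &e);
          }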
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  8. 29 May 2015, 2 commits
    • xmmap(): drop "Out of memory?" · 9ca0aaf6
      Authored by Junio C Hamano
      We show that message with die_errno(), but the OS ought to know
      why mmap(2) failed much better than we do.  There is no reason for
      us to say "Out of memory?" here.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • config.c: avoid xmmap error messages · 1570856b
      Authored by Jeff King
      The config-writing code uses xmmap to map the existing
      config file, which will die if the map fails. This has two
      downsides:
      
        1. The error message is not very helpful, as it lacks any
           context about the file we are mapping:
      
             $ mkdir foo
             $ git config --file=foo some.key value
             fatal: Out of memory? mmap failed: No such device
      
        2. We normally do not die in this code path; instead, we'd
           rather report the error and return an appropriate exit
           status (which is part of the public interface
           documented in git-config.1).
      
      This patch introduces a "gentle" form of xmmap which lets us
      produce our own error message. We do not want to use mmap
      directly, because we would like to use the other
      compatibility elements of xmmap (e.g., handling 0-length
      maps portably).
      
      The end result is:
      
          $ git.compile config --file=foo some.key value
          error: unable to mmap 'foo': No such device
          $ echo $?
          3
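
      A sketch of the "gentle" split (simplified; the real xmmap also
      tries to free pack memory and retry before giving up):

          void *xmmap_gentle(void *start, size_t length, int prot,
                             int flags, int fd, off_t offset)
          {
              void *ret = mmap(start, length, prot, flags, fd, offset);
              if (ret == MAP_FAILED && !length)
                  ret = NULL;          /* handle 0-length maps portably */
              return ret;
          }

          void *xmmap(void *start, size_t length, int prot,
                      int flags, int fd, off_t offset)
          {
              void *ret = xmmap_gentle(start, length, prot, flags, fd, offset);
              if (ret == MAP_FAILED)
                  die_errno("mmap failed");
              return ret;
          }

      The config-writing code can then call the gentle variant and turn
      MAP_FAILED into its own error() and exit status.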
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  9. 19 May 2015, 1 commit
    • sha1_file: pass empty buffer to index empty file · f6a1e1e2
      Authored by Jim Hill
      `git add` of an empty file with a filter pops complaints from
      `copy_fd` about a bad file descriptor.
      
      This traces back to these lines in sha1_file.c:index_core:
      
      	if (!size) {
      		ret = index_mem(sha1, NULL, size, type, path, flags);
      
      The problem here is that content to be added to the index can be
      supplied from an fd, or from a memory buffer, or from a pathname. This
      call is supplying a NULL buffer pointer and a zero size.
      
      Downstream logic takes the complete absence of a buffer to mean the
      data is to be found elsewhere -- for instance, these, from convert.c:
      
      	if (params->src) {
      		write_err = (write_in_full(child_process.in, params->src, params->size) < 0);
      	} else {
      		write_err = copy_fd(params->fd, child_process.in);
      	}
      
      If there's a buffer, write from that; otherwise the data must be
      coming from an open fd.
      
      Perfectly reasonable logic in a routine that's going to write from
      either a buffer or an fd.
      
      So change `index_core` to supply an empty buffer when indexing an empty
      file.
      
      There's a patch out there that instead changes the logic quoted above to
      take a `-1` fd to mean "use the buffer", but it seems to me that the
      distinction between a missing buffer and an empty one carries intrinsic
      semantics, where the logic change is adapting the code to handle
      incorrect arguments.
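
      The fix in miniature: hand index_mem() a valid (empty) buffer so
      downstream code never falls back to the fd for a zero-length file:

          if (!size) {
              ret = index_mem(sha1, "", size, type, path, flags);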
      Signed-off-by: Jim Hill <gjthill@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  10. 07 May 2015, 1 commit
  11. 06 May 2015, 2 commits
    • write_sha1_file(): do not use a separate sha1[] array · 1427a7ff
      Authored by Junio C Hamano
      In the beginning, write_sha1_file() did not have a way to tell the
      caller the name of the object it wrote.  This was changed in
      d6d3f9d0 (This implements the new "recursive tree" write-tree.,
      2005-04-09) by adding the "returnsha1" parameter to the function
      so that the callers who are interested in the value can
      optionally pass a pointer to receive it.
      
      It turns out that all callers do want to know the name of the
      object that was just written.  Nobody passes a NULL to this
      parameter, hence it is not necessary to use a separate sha1[]
      array to receive the result from write_sha1_file_prepare(), and
      copy the result to the returnsha1 supplied by the caller.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • hash-object --literally: fix buffer overrun with extra-long object type · 0c3db67c
      Authored by Eric Sunshine
      "hash-object" learned in 5ba9a93b (hash-object: add --literally
      option, 2014-09-11) to allow crafting a corrupt/broken object of
      unknown type.
      
      When the user-provided type is particularly long, however, it can
      overflow the relatively small stack-based character array handed to
      write_sha1_file_prepare() by hash_sha1_file() and write_sha1_file(),
      leading to stack corruption (and crash).  Introduce a custom helper
      to allow arbitrarily long typenames just for "hash-object --literally".
      
      [jc: Eric's original used a strbuf in the more common codepaths, and
      I rewrote it to avoid penalizing the non-literally code. Bugs are mine]
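
      One way to picture the fix (a sketch, not the exact code): the
      --literally path builds the header on the heap, so the length of
      the user-supplied type no longer matters:

          struct strbuf header = STRBUF_INIT;
          strbuf_addf(&header, "%s %lu", type, (unsigned long)len);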
      Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  12. 21 Apr 2015, 3 commits
    • sha1_file: only freshen packs once per run · ee1c6c34
      Authored by Jeff King
      Since 33d4221c (write_sha1_file: freshen existing objects,
      2014-10-15), we update the mtime of existing objects that we
      would have written out (had they not existed). For the
      common case in which many objects are packed, we may update
      the mtime on a single packfile repeatedly. This can result
      in a noticeable performance problem if calling utime() is
      expensive (e.g., because your storage is on NFS).
      
      We can fix this by keeping a per-pack flag that lets us
      freshen only once per program invocation.
      
      An alternative would be to keep the packed_git.mtime flag up
      to date as we freshen, and freshen only once every N
      seconds. In practice, it's not worth the complexity. We are
      racing against prune expiration times here, which inherently
      must be set to accommodate reasonable program running times
      (because they really care about the time between an object
      being written and it becoming referenced, and the latter is
      typically the last step a program takes).
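
      A sketch of the per-pack flag (assuming a "freshened" bit on
      struct packed_git):

          static int freshen_packed_object(const unsigned char *sha1)
          {
              struct pack_entry e;

              if (!find_pack_entry(sha1, &e))
                  return 0;
              if (e.p->freshened)
                  return 1;           /* already bumped this run */
              if (!freshen_file(e.p->pack_name))
                  return 0;
              e.p->freshened = 1;
              return 1;
          }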
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • sha1_file: freshen pack objects before loose · b5f52f37
      Authored by Jeff King
      When writing out an object file, we first check whether it
      already exists and if so optimize out the write. Prior to
      33d4221c, we did this by calling has_sha1_file(), which will
      check for packed objects followed by loose. Since that
      commit, we check loose objects first.
      
      For the common case of a repository whose objects are mostly
      packed, this means we will make a lot of extra access()
      system calls checking for loose objects. We should follow
      the same packed-then-loose order that all of our other
      lookups use.
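
      The reordering in miniature (sketch of the check in
      write_sha1_file()):

          /* probe packs first; only then hit the filesystem for loose */
          if (freshen_packed_object(sha1) || freshen_loose_object(sha1))
              return 0;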
      Reported-by: Stefan Saasen <ssaasen@atlassian.com>
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • reachable: only mark local objects as recent · 1385bb7b
      Authored by Jeff King
      When pruning and repacking a repository that has an
      alternate object store configured, we may traverse a large
      number of objects in the alternate. This serves no purpose,
      and may be expensive to do. A longer explanation is below.
      
      Commits d3038d22 and abcb8655 taught prune and pack-objects
      (respectively) to treat "recent" objects as tips for
      reachability, so that we keep whole chunks of history. They
      built on the object traversal in 660c889e (sha1_file: add
      for_each iterators for loose and packed objects,
      2014-10-15), which covers both local and alternate objects.
      
      In both cases, covering alternate objects is unnecessary, as
      both commands can only drop objects from the local
      repository. In the case of prune, we traverse only the local
      object directory. And in the case of repacking, while we may
      or may not include local objects in our pack, we will never
      reach into the alternate with "repack -d". The "-l" option
      is only a question of whether we are migrating objects from
      the alternate into our repository, or leaving them
      untouched.
      
      It is possible that we may drop an object that is depended
      upon by another object in the alternate. For example,
      imagine two repositories, A and B, with A pointing to B as
      an alternate. Now imagine a commit that is in B which
      references a tree that is only in A. Traversing from recent
      objects in B might prevent A from dropping that tree. But
      this case isn't worth covering. Repo B should take
      responsibility for its own objects. It would never have had
      the commit in the first place if it did not also have the
      tree, and assuming it is using the same "keep recent chunks
      of history" scheme, then it would itself keep the tree, as
      well.
      
      So checking the alternate objects is not worth doing, and
      comes with a significant performance impact. In both cases,
      we skip any recent objects that have already been marked
      SEEN (i.e., that we know are already reachable for prune, or
      included in the pack for a repack). So there is a slight
      waste of time in opening the alternate packs at all, only to
      notice that we have already considered each object. But much
      worse, the alternate repository may have a large number of
      objects that are not reachable from the local repository at
      all, and we end up adding them to the traversal.
      
      We can fix this by considering only local unseen objects.
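
      In the packed iteration this is a one-line skip (flag name per
      this series; simplified):

          if ((flags & FOR_EACH_OBJECT_LOCAL_ONLY) && !p->pack_local)
              continue;   /* pack lives in an alternate: ignore it */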
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  13. 31 Mar 2015, 1 commit
    • sha1_file: squelch "packfile cannot be accessed" warnings · 319b678a
      Authored by Jeff King
      When we find an object in a packfile index, we make sure we
      can still open the packfile itself (or that it is already
      open), as it might have been deleted by a simultaneous
      repack. If we can't access the packfile, we print a warning
      for the user and tell the caller that we don't have the
      object (we can then look in other packfiles, or find a loose
      version, before giving up).
      
      The warning we print to the user isn't really accomplishing
      anything, and it is potentially confusing to users. In the
      normal case, it is complete noise; we find the object
      elsewhere, and the user does not have to care that we racily
      saw a packfile index that became stale. It didn't affect the
      operation at all.
      
      A possibly more interesting case is when we later can't find
      the object, and report failure to the user. In this case the
      warning could be considered a clue toward that ultimate
      failure. But it's not really a useful clue in practice. We
      wouldn't even print it consistently (since we are racing
      with another process, we might not even see the .idx file,
      or we might win the race and open the packfile, completing
      the operation).
      
      This patch drops the warning entirely (not only from the
      fill_pack_entry site, but also from an identical use in
      pack-objects). If we did find the warning interesting in the
      error case, we could stuff it away and reveal it to the user
      when we later die() due to the broken object. But that
      complexity just isn't worth it.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  14. 06 Mar 2015, 1 commit
  15. 10 Feb 2015, 2 commits
    • sha1_file: fix iterating loose alternate objects · b0a42642
      Authored by Jonathon Mah
      The string in 'base' contains a path suffix to a specific object;
      when its value is used, the suffix must either be filled (as in
      stat_sha1_file, open_sha1_file, check_and_freshen_nonlocal) or
      cleared (as in prepare_packed_git) to avoid junk at the end.
      
      660c889e (sha1_file: add for_each iterators for loose and packed
      objects, 2014-10-15) introduced loose_from_alt_odb(), but this did
      neither and treated 'base' as a complete path to the "base" object
      directory, instead of a pointer to the "base" of the full path
      string.
      
      The trailing path after 'base' is still initialized to NUL, hiding
      the bug in some common cases.  Additionally, the descendant
      for_each_file_in_obj_subdir() function swallows ENOENT, so an error
      only shows if the alternate's path was last filled with a valid
      object (where statting /path/to/existing/00/0bjectfile/00 fails).
      Signed-off-by: Jonathon Mah <me@JonathonMah.com>
      Helped-by: Kyle J. McKay <mackyle@gmail.com>
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • for_each_loose_file_in_objdir: take an optional strbuf path · e6f875e0
      Authored by Jeff King
      We feed a root "objdir" path to this iterator function,
      which then copies the result into a strbuf, so that it can
      repeatedly append the object sub-directories to it. Let's
      make it easy for callers to just pass us a strbuf in the
      first place.
      
      We leave the original interface as a convenience for callers
      who want to just pass a const string like the result of
      get_object_directory().
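
      A sketch of the convenience wrapper layered over the strbuf-taking
      variant (the cruft/subdir callbacks are condensed away here):

          int for_each_loose_file_in_objdir(const char *path,
                                            each_loose_object_fn obj_cb,
                                            void *data)
          {
              struct strbuf buf = STRBUF_INIT;
              int r;

              strbuf_addstr(&buf, path);
              r = for_each_loose_file_in_objdir_buf(&buf, obj_cb, data);
              strbuf_release(&buf);
              return r;
          }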
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  16. 02 Dec 2014, 1 commit
    • path.c: make get_pathname() call sites return const char * · dcf69262
      Authored by Nguyễn Thái Ngọc Duy
      Before the previous commit, get_pathname() returned an array of
      PATH_MAX length. Even if git_path() and similar functions do not
      use the whole array, a git_path() caller can, in theory.
      
      After the commit, get_pathname() may return a buffer that has just
      enough room for the returned string, and a git_path() caller should
      never write beyond that.
      
      Make git_path(), mkpath() and git_path_submodule() return a const
      buffer to make sure callers do not write to it at all.
      
      This could have been part of the previous commit, but the "const"
      conversion is too much distraction from the core changes in path.c.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  17. 26 Nov 2014, 1 commit
  18. 17 Oct 2014, 4 commits
    • write_sha1_file: freshen existing objects · 33d4221c
      Authored by Jeff King
      When we try to write a loose object file, we first check
      whether that object already exists. If so, we skip the
      write as an optimization. However, this can interfere with
      prune's strategy of using mtimes to mark files in progress.
      
      For example, if a branch contains a particular tree object
      and is deleted, that tree object may become unreachable, and
      have an old mtime. If a new operation then tries to write
      the same tree, this ends up as a noop; we notice we
      already have the object and do nothing. A prune running
      simultaneously with this operation will see the object as
      old, and may delete it.
      
      We can solve this by "freshening" objects that we avoid
      writing by updating their mtime. The algorithm for doing so
      is essentially the same as that of has_sha1_file. Therefore
      we provide a new (static) interface "check_and_freshen",
      which finds and optionally freshens the object. It's trivial
      to implement freshening and simple checking by tweaking a
      single parameter.
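
      A sketch of the helper pair (per the message, the lookup order
      mirrors has_sha1_file):

          static int check_and_freshen(const unsigned char *sha1, int freshen)
          {
              return check_and_freshen_local(sha1, freshen) ||
                     check_and_freshen_nonlocal(sha1, freshen);
          }

          /* plain existence checking is just the freshen=0 case */
          static int has_loose_object(const unsigned char *sha1)
          {
              return check_and_freshen(sha1, 0);
          }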
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • sha1_file: add for_each iterators for loose and packed objects · 660c889e
      Authored by Jeff King
      We typically iterate over the reachable objects in a
      repository by starting at the tips and walking the graph.
      There's no easy way to iterate over all of the objects,
      including unreachable ones. Let's provide a way of doing so.
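
      The iterator shapes this introduces (a sketch of the declarations;
      the exact signatures live in the cache.h of this era):

          typedef int each_loose_object_fn(const unsigned char *sha1,
                                           const char *path, void *data);
          typedef int each_packed_object_fn(const unsigned char *sha1,
                                            struct packed_git *pack,
                                            uint32_t pos, void *data);

          int for_each_loose_object(each_loose_object_fn, void *data);
          int for_each_packed_object(each_packed_object_fn, void *data);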
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • prune: factor out loose-object directory traversal · 27e1e22d
      Authored by Jeff King
      Prune has to walk $GIT_DIR/objects/?? in order to find the
      set of loose objects to prune. Other parts of the code
      (e.g., count-objects) want to do the same. Let's factor it
      out into a reusable for_each-style function.
      
      Note that this is not quite a straight code movement. The
      original code had strange behavior when it found a file of
      the form "[0-9a-f]{2}/.{38}" that did _not_ contain all hex
      digits. It executed a "break" from the loop, meaning that we
      stopped pruning in that directory (but still pruned other
      directories!). This was probably a bug; we do not want to
      process the file as an object, but we should keep going
      otherwise (and that is how the new code handles it).
      
      We are also a little more careful with loose object
      directories which fail to open. The original code silently
      ignored any failures, but the new code will complain about
      any problems besides ENOENT.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • foreach_alt_odb: propagate return value from callback · fe1b2268
      Authored by Jeff King
      We check the return value of the callback and stop iterating
      if it is non-zero. However, we do not make the non-zero
      return value available to the caller, so they have no way of
      knowing whether the operation succeeded or not (technically
      they can keep their own error flag in the callback data, but
      that is unlike our other for_each functions).
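
      The change in miniature (sketch): stop on a non-zero result and
      hand it back:

          int foreach_alt_odb(alt_odb_fn fn, void *cb)
          {
              struct alternate_object_database *ent;
              int r = 0;

              prepare_alt_odb();
              for (ent = alt_odb_list; ent; ent = ent->next) {
                  r = fn(ent, cb);
                  if (r)
                      break;
              }
              return r;
          }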
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  19. 02 Oct 2014, 1 commit
  20. 23 Sep 2014, 1 commit
  21. 29 Aug 2014, 2 commits
    • convert: stream from fd to required clean filter to reduce used address space · 9035d75a
      Authored by Steffen Prohaska
      The data is streamed to the filter process anyway.  Better avoid mapping
      the file if possible.  This is especially useful if a clean filter
      reduces the size, for example if it computes a sha1 for binary data,
      like git media.  The file size that the previous implementation could
      handle was limited by the available address space; large files for
      example could not be handled with (32-bit) msysgit.  The new
      implementation can filter files of any size as long as the filter output
      is small enough.
      
      The new code path is only taken if the filter is required.  The filter
      consumes data directly from the fd.  If it fails, the original data is
      not immediately available.  The condition can easily be handled as
      a fatal error, which is expected for a required filter anyway.
      
      If the filter was not required, the condition would need to be handled
      in a different way, like seeking to 0 and reading the data.  But this
      would require more restructuring of the code and is probably not worth
      it.  The obvious approach of falling back to reading all data would not
      help achieving the main purpose of this patch, which is to handle large
      files with limited address space.  If reading all data is an option, we
      can simply take the old code path right away and mmap the entire file.
      
      The environment variable GIT_MMAP_LIMIT, which has been introduced
      in a previous commit, is used to test that the expected code path
      is taken.  A related test that exercises required filters is
      modified to verify that the data actually has been modified on its
      way from the file system to the object store.
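
      The dispatch in miniature (a sketch; the predicate and helper
      names follow this series, and the surrounding conditions are
      elided):

          /* stream straight from the fd when a required filter can
           * consume it, instead of mmapping the whole file */
          if (would_convert_to_git_filter_fd(path))
              return index_stream_convert_blob(sha1, fd, path, flags);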
      Signed-off-by: Steffen Prohaska <prohaska@zib.de>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • mmap_limit: introduce GIT_MMAP_LIMIT to allow testing expected mmap size · 02710228
      Authored by Steffen Prohaska
      In order to test expectations about mmap in a way similar to testing
      expectations about malloc with GIT_ALLOC_LIMIT introduced by
      d41489a6 (Add more large blob test cases, 2012-03-07), introduce a
      new environment variable GIT_MMAP_LIMIT to limit the largest allowed
      mmap length.
      
      xmmap() is modified to check the size of the requested region and
      fail it if it is beyond the limit.  Together with GIT_ALLOC_LIMIT
      tests can now confirm expectations about memory consumption.
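
      A sketch of the check in xmmap() (the parsing of the variable is
      an assumption here; treat units and helpers as illustrative):

          static void mmap_limit_check(size_t length)
          {
              static size_t limit;

              if (!limit) {
                  const char *env = getenv("GIT_MMAP_LIMIT");
                  limit = env ? strtoul(env, NULL, 10) : SIZE_MAX;
                  if (!limit)
                      limit = SIZE_MAX;   /* unset or 0: no limit */
              }
              if (length > limit)
                  die("attempting to mmap %" PRIuMAX " over limit %" PRIuMAX,
                      (uintmax_t)length, (uintmax_t)limit);
          }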
      Signed-off-by: Steffen Prohaska <prohaska@zib.de>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  22. 22 Aug 2014, 1 commit
  23. 19 Aug 2014, 1 commit
  24. 16 Jul 2014, 1 commit
  25. 14 Jul 2014, 1 commit
  26. 02 Jul 2014, 1 commit
  27. 01 Jul 2014, 3 commits
    • prepare_packed_git_one: refactor duplicate-pack check · 47bf4b0f
      Authored by Jeff King
      When we are reloading the list of packs, we check whether a
      particular pack has been loaded. This is slightly tricky,
      because we load packs based on the presence of their ".idx"
      files, but record the name of the matching ".pack" file.
      Therefore we want to compare their bases.
      
      The existing code stripped off ".idx" from a file we found,
      then compared that whole base length to strings containing
      the ".pack" version. This meant we could end up comparing
      bytes past what the ".pack" string contained, if the ".idx"
      file name was much longer.
      
      In practice, it worked OK because memcmp would end up seeing
      a difference in the two strings and would return early
      before hitting the full length. However, memcmp may
      sometimes read extra bytes past a difference (e.g., because
      it is comparing 64-bit words), or is even free to compare in
      reverse order.
      
      Furthermore, our memcmp made no guarantees that we matched
      the whole pack name, up to ".pack". So "foo.idx" would match
      "foo-bar.pack", which is wrong (but does not typically
      happen, because our pack names have a fixed size).
      
      We can fix both issues, avoid magic numbers, and document
      that we expect to compare against a string with ".pack" by
      using strip_suffix.
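
      The strip_suffix version in miniature (sketch; "path" stands for
      the full "<objdir>/pack/<name>.idx" just found on disk):

          size_t base_len, pack_len;

          if (strip_suffix(path, ".idx", &base_len)) {
              struct packed_git *p;

              for (p = packed_git; p; p = p->next)
                  if (strip_suffix(p->pack_name, ".pack", &pack_len) &&
                      pack_len == base_len &&
                      !memcmp(p->pack_name, path, base_len))
                      break;      /* already in our list */
          }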
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • replace has_extension with ends_with · 2975c770
      Authored by Jeff King
      These two are almost the same function, with the exception
      that has_extension only matches if there is content before
      the suffix. So ends_with(".exe", ".exe") is true, but
      has_extension would not be.
      
      This distinction does not matter to any of the callers,
      though, and we can just replace uses of has_extension with
      ends_with. We prefer the "ends_with" name because it is more
      generic, and there is nothing about the function that
      requires it to be used for file extensions.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • sha1_file: replace PATH_MAX buffer with strbuf in prepare_packed_git_one() · 880fb8de
      Authored by René Scharfe
      Instead of using strbuf to create a message string in case a path is
      too long for our fixed-size buffer, replace that buffer with a strbuf
      and thus get rid of the limitation.
      Helped-by: Duy Nguyen <pclouds@gmail.com>
      Signed-off-by: Rene Scharfe <l.s.r@web.de>
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  28. 16 May 2014, 1 commit
    • open_sha1_file: report "most interesting" errno · d6c8a05b
      Authored by Jeff King
      When we try to open a loose object file, we first attempt to
      open it in the local object database, and then try any
      alternates. This means that the errno value when we return
      will be from the last place we looked (and due to the way
      the code is structured, simply ENOENT if we do not have
      any alternates).
      
      This can cause confusing error messages, as read_sha1_file
      checks for ENOENT when reporting a missing object. If errno
      is something else, we report that. If it is ENOENT, but
      has_loose_object reports that we have it, then we claim the
      object is corrupted. For example:
      
          $ chmod 0 .git/objects/??/*
          $ git rev-list --all
          fatal: loose object b2d6fab18b92d49eac46dc3c5a0bcafabda20131 (stored in .git/objects/b2/d6fab18b92d49eac46dc3c5a0bcafabda20131) is corrupt
      
      This patch instead keeps track of the "most interesting"
      errno we receive during our search. We consider ENOENT to be
      the least interesting of all, and otherwise report the first
      error found (so problems in the object database take
      precedence over ones in alternates). Here it is with this
      patch:
      
          $ git rev-list --all
          fatal: failed to read object b2d6fab18b92d49eac46dc3c5a0bcafabda20131: Permission denied
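
      A sketch of the bookkeeping ("alt_path" is a hypothetical stand-in
      for building the alternate's object path):

          static int open_sha1_file(const unsigned char *sha1)
          {
              int fd;
              struct alternate_object_database *alt;
              int most_interesting_errno;

              fd = git_open_noatime(sha1_file_name(sha1));
              if (fd >= 0)
                  return fd;
              most_interesting_errno = errno;   /* local odb errors win */

              prepare_alt_odb();
              for (alt = alt_odb_list; alt; alt = alt->next) {
                  fd = git_open_noatime(alt_path(alt, sha1));
                  if (fd >= 0)
                      return fd;
                  if (most_interesting_errno == ENOENT)
                      most_interesting_errno = errno;  /* upgrade ENOENT */
              }
              errno = most_interesting_errno;
              return -1;
          }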
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>