1. 04 Sep, 2006 1 commit
  2. 03 Sep, 2006 1 commit
    • pack-objects: re-validate data we copy from elsewhere. · df6d6101
      Junio C Hamano committed
      When reusing data from an existing pack or from new-style
      loose objects, we used to just copy it straight into the
      resulting pack.  Instead, make sure the data is not corrupt, but
      do so only when we are not streaming to stdout; in that case
      the receiving end will do the validation either by unpacking
      the stream or by constructing the .idx file.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  3. 24 Aug, 2006 1 commit
    • Convert memcpy(a,b,20) to hashcpy(a,b). · e702496e
      Shawn Pearce committed
      This abstracts away the size of the hash values when copying them
      from memory location to memory location, much as the introduction
      of hashcmp abstracted away hash value comparison.
      
      A few call sites were using char* rather than unsigned char* so
      I added the cast rather than open hashcpy to be void*.  This is a
      reasonable tradeoff as most call sites already use unsigned char*
      and the existing hashcmp is also declared to be unsigned char*.
      
      [jc: Split the patch into the "master" part, to be followed by a
       patch for merge-recursive.c which is not in "master" yet.
      
       Fixed the cast in the latter hunk to combine-diff.c which was
       wrong in the original.
      
       Also converted ones left-over in combine-diff.c, diff-lib.c and
       upload-pack.c ]
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
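
The abstraction described in this commit can be sketched as a pair of thin wrappers. This is a minimal illustration, not the code from the commit: the constant name `HASH_RAWSZ` is hypothetical, and the 20-byte size is taken from the `memcpy(a,b,20)` in the subject line.

```c
#include <string.h>

/* Hypothetical constant: raw size of a hash value (20, per the commit subject). */
#define HASH_RAWSZ 20

/* Copy one hash value; call sites no longer spell out the size. */
static inline void hashcpy(unsigned char *dst, const unsigned char *src)
{
	memcpy(dst, src, HASH_RAWSZ);
}

/* Compare two hash values; returns 0 when they are equal. */
static inline int hashcmp(const unsigned char *a, const unsigned char *b)
{
	return memcmp(a, b, HASH_RAWSZ);
}
```

A call site that holds the hash in a `char *` buffer would cast at the call, for example `hashcpy((unsigned char *)buf, sha1)`, which matches the tradeoff the message describes: the cast stays at the few odd call sites and the helper keeps its `unsigned char *` signature.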
  4. 18 Aug, 2006 1 commit
  5. 16 Aug, 2006 1 commit
  6. 04 Aug, 2006 1 commit
  7. 26 Jul, 2006 1 commit
  8. 24 Jul, 2006 1 commit
  9. 10 Jul, 2006 1 commit
  10. 01 Jul, 2006 1 commit
  11. 30 Jun, 2006 2 commits
  12. 21 Jun, 2006 1 commit
  13. 20 Jun, 2006 1 commit
  14. 06 Jun, 2006 1 commit
    • pack-objects: improve path grouping heuristics. · ce0bd642
      Linus Torvalds committed
      This trivial patch not only simplifies the name hashing, it actually
      improves packing for both git and the kernel.
      
      The git archive pack shrinks from 6824090->6622627 bytes (a 3%
      improvement), and the kernel pack shrinks from 108756213 to 108219021 (a
      mere 0.5% improvement, but still, it's an improvement from making the
      hashing much simpler!)
      
      We just create a 32-bit hash, where we "age" previous characters by two
      bits, so the last characters in a filename count most. So when we then
      compare the hashes in the sort routine, filenames that end the same way
      sort the same way.
      
      It takes the subdirectory into account (unless the filename is > 16
      characters), but files with the same name within the same subdirectory
      will obviously sort closer than files in different subdirectories.
      
      And, incidentally (which is why I tried the hash change in the first
      place, of course) builtin-rev-list.c will sort fairly close to rev-list.c.
      
      And no, it's not a "good hash" in the sense of being secure or unique, but
      that's not what we're looking for. The whole "hash" thing is misnamed
      here. It's not so much a hash as a "sorting number".
      
      [jc: rolled in a simplification of the sorting number
       computation for thin pack base objects]
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
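
The "aging" idea from this commit message can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `sort_number` is hypothetical, and the sketch ignores any special handling of directories or whitespace that the real code may have. Each new character lands in the top byte and everything seen earlier is shifted down by two bits, so only the last 16 characters can influence the 32-bit result.

```c
#include <stdint.h>

/*
 * Hypothetical sketch of a 32-bit "sorting number" for a path name.
 * Earlier characters are aged by two bits per step, so the characters
 * at the end of the name dominate the value.
 */
static uint32_t sort_number(const char *name)
{
	uint32_t hash = 0;
	unsigned char c;

	while ((c = (unsigned char)*name++) != 0)
		hash = (hash >> 2) + ((uint32_t)c << 24);
	return hash;
}
```

Because names with the same ending share their high-order bits, `sort_number("builtin-rev-list.c")` and `sort_number("rev-list.c")` differ only in the low bits, which is why the two files end up near each other when sorted by this value.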
  15. 31 May, 2006 1 commit
    • tree_entry(): new tree-walking helper function · 4c068a98
      Linus Torvalds committed
      This adds a "tree_entry()" function that combines the common operation of
      doing a "tree_entry_extract()" + "update_tree_entry()".
      
      It also has a simplified calling convention, designed for simple loops
      that traverse over a whole tree: the arguments are pointers to the tree
      descriptor and a name_entry structure to fill in, and it returns a boolean
      "true" if there was an entry left to be gotten in the tree.
      
      This allows tree traversal with
      
      	struct tree_desc desc;
      	struct name_entry entry;
      
      	desc.buf = tree->buffer;
      	desc.size = tree->size;
      	while (tree_entry(&desc, &entry)) {
      		... use "entry.{path, sha1, mode, pathlen}" ...
      	}
      
      which is not only shorter than writing it out in full, it's hopefully less
      error prone too.
      
      [ It's actually a tad faster too - we don't need to recalculate the entry
        pathlength in both extract and update, but need to do it only once.
        Also, some callers can avoid doing a "strlen()" on the result, since
        it's returned as part of the name_entry structure.
      
        However, by now we're talking just 1% speedup on "git-rev-list --objects
        --all", and we're definitely at the point where tree walking is no
        longer the issue any more. ]
      
      NOTE! Not everybody wants to use this new helper function, since some of
      the tree walkers very much on purpose do the descriptor update separately
      from the entry extraction. So the "extract + update" sequence still
      remains as the core sequence, this is just a simplified interface.
      
      We should probably add a silly two-line inline helper function for
      initializing the descriptor from the "struct tree" too, just to cut down
      on the noise from that common "desc" initializer.
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  16. 17 May, 2006 1 commit
  17. 16 May, 2006 3 commits
    • Fix pack-index issue on 64-bit platforms a bit more portably. · 1b9bc5a7
      Junio C Hamano committed
      Apparently <stdint.h> is not enough for uint32_t on OpenBSD; use
      "unsigned int" -- hopefully that would stay 32-bit on every
      platform we care about, at least until we update the pack-index
      file format.
      
      Our architecture-optimized sha1 routines use uint32_t and
      expect '#include <stdint.h>' to be enough, so OpenBSD on arm or
      ppc might have similar issues down the road, I dunno.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
    • pack-object: slightly more efficient · ff45715c
      Nicolas Pitre committed
      Avoid creating a delta index for objects with maximum depth since they
      are not going to be used as delta bases anyway.  This also reduces peak
      memory usage slightly as the current object's delta index is not useful
      until the next object in the loop is considered for deltification. This
      saves a bit more than 1% on CPU usage.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
    • simple heuristic for further free packing improvements · 4e8da195
      Nicolas Pitre committed
      Given that the early eviction of objects with maximum delta depth
      may exhibit bad packing on its own, why not consider a bias against
      deep base objects in try_delta() to mitigate that bad behavior?
      
      This patch adjusts the MAX_size allowed for a delta based on the depth of
      the base object, and also enables the early eviction of max-depth
      objects from the object window.  When used separately, those two things
      produce slightly better and much worse results respectively.  But their
      combined effect is a surprisingly significant packing improvement.
      
      With this really simple patch the GIT repo gets nearly 15% smaller, and
      the Linux kernel repo about 5% smaller, with no significantly measurable
      CPU usage difference.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
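
One way to picture a "bias against deep base objects" is to shrink the size budget for a delta as the base gets deeper. The sketch below is an assumption for illustration only: the message does not give the formula, so the linear scaling and the helper name `depth_biased_max_size` are hypothetical.

```c
/*
 * Hypothetical helper: scale the allowed delta size down linearly with
 * the depth of the candidate base. A base at depth 0 keeps the full
 * budget; a base already at max_depth gets none.
 * Note: max_size * (max_depth - base_depth) must fit in unsigned long.
 */
static unsigned long depth_biased_max_size(unsigned long max_size,
					   unsigned int base_depth,
					   unsigned int max_depth)
{
	if (max_depth == 0 || base_depth >= max_depth)
		return 0;
	return max_size * (max_depth - base_depth) / max_depth;
}
```

With a budget of 1000 bytes and a maximum depth of 10, a base at depth 5 would be allowed a delta of at most 500 bytes under this scheme, so shallow bases win unless a deep base produces a much smaller delta.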
  18. 15 May, 2006 1 commit
  19. 14 May, 2006 1 commit
    • Fix git-pack-objects for 64-bit platforms · 66561f5a
      Dennis Stosberg committed
      The offset of an object in the pack is recorded as a 4-byte integer
      in the index file.  When reading the offset from the mmap'ed index
      in prepare_pack_revindex(), the address is dereferenced as a long*.
      This works fine as long as the long type is four bytes wide.  On
      NetBSD/sparc64, however, a long is 8 bytes wide and so dereferencing
      the offset produces garbage.
      
      [jc: taking suggestion by Linus to use uint32_t]
      Signed-off-by: Dennis Stosberg <dennis@stosberg.net>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
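
The bug class described here goes away when the 4-byte field is read with an explicit width rather than through a `long *`. The commit itself switched to `uint32_t`; the sketch below shows a related, alignment-safe alternative that assembles the value byte by byte. The helper name `read_be32` is hypothetical, and the big-endian (network) byte order is an assumption about the index layout.

```c
#include <stdint.h>

/*
 * Hypothetical helper: read a 4-byte big-endian integer from a buffer.
 * The result does not depend on sizeof(long), host byte order, or the
 * alignment of the pointer.
 */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) |
	       ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |
	        (uint32_t)p[3];
}
```

On a platform where `long` is 8 bytes, `*(long *)p` would pull in the four bytes that follow the field as well, which is the garbage the message refers to; `read_be32(p)` always consumes exactly four.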
  20. 06 May, 2006 1 commit
    • pack-object: squelch eye-candy on non-tty · 86118bcb
      Junio C Hamano committed
      One of my post-update scripts runs a git-fetch into a separate
      repository and sends the results back to me (2>&1); I end up
      getting this in the mail:
      
          Generating pack...
          Done counting 180 objects.
          Result has 131 objects.
          Deltifying 131 objects.
             0% (0/131) done^M   1% (2/131) done^M...
      
      By default, this skips the progress report when not on a tty.
      
      You could give --progress to force the progress report, but
      let's not bother even documenting it nor mentioning it in the
      usage string.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
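
The default described in this message can be sketched with POSIX `isatty()`. The function name `want_progress` and its parameters are hypothetical; the sketch only assumes that an explicit `--progress` sets a flag and that the progress output goes to some file descriptor (stderr in the quoted mail, given the `2>&1`).

```c
#include <unistd.h>

/*
 * Hypothetical sketch: show progress if it was forced on the command
 * line, otherwise only when the output descriptor is a terminal.
 */
static int want_progress(int forced, int fd)
{
	return forced || isatty(fd);
}
```

A caller would evaluate something like `want_progress(progress_flag, 2)` once at startup, so a run from a hook or cron job with stderr redirected to a pipe or file stays quiet.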
  21. 28 Apr, 2006 1 commit
  22. 27 Apr, 2006 1 commit
  23. 21 Apr, 2006 2 commits
  24. 17 Apr, 2006 1 commit
  25. 07 Apr, 2006 1 commit
    • Thin pack generation: optimization. · 5379a5c5
      Junio C Hamano committed
      Jens Axboe noticed that recent "git push" has become very slow
      since we made --thin transfer the default.
      
      Thin pack generation to push a handful of revisions that touch a
      relatively small number of paths out of a huge tree was stupid; it
      registered _everything_ from the excluded revisions.  As a
      result, the "Counting objects" phase was unnecessarily expensive.
      
      This changes the logic to register the blobs and trees from
      excluded revisions only for paths we are actually going to send
      to the other end.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  26. 04 Apr, 2006 2 commits
  27. 03 Apr, 2006 2 commits
  28. 30 Mar, 2006 1 commit
    • tree/diff header cleanup. · 1b0c7174
      Junio C Hamano committed
      Introduce tree-walk.[ch] and move "struct tree_desc" and
      associated functions from various places.
      
      Rename DIFF_FILE_CANON_MODE(mode) macro to canon_mode(mode) and
      move it to cache.h.  This macro returns the canonicalized
      st_mode value in the host byte order for files, symlinks and
      directories -- to be compared with a tree_desc entry.
      create_ce_mode(mode) in cache.h is similar but is intended to be
      used for index entries (so it does not work for directories) and
      returns the value in the network byte order.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  29. 06 Mar, 2006 1 commit
    • pack-objects: simplify "thin" pack. · 70ca1a3f
      Junio C Hamano committed
      There was misguided logic that overly preferred objects that
      we are not going to pack as the base object.  This was
      unnecessary.  It does not matter to the unpacking side where the
      base object is -- it matters more to make the resulting delta
      smaller.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  30. 02 Mar, 2006 2 commits
    • diff-delta: allow reusing of the reference buffer index · 38fd0721
      Nicolas Pitre committed
      When a reference buffer is used multiple times then its index can be
      computed only once and reused multiple times.  This patch adds an extra
      pointer to a pointer argument (from_index) to diff_delta() for this.
      
      If from_index is NULL then everything is like before.
      
      If from_index is non NULL and *from_index is NULL then the index is
      created and its location stored to *from_index.  In this case the caller
      has the responsibility to free the memory pointed to by *from_index.
      
      If from_index and *from_index are non NULL then the index is reused as
      is.
      
      This currently saves about 10% of CPU time when repacking the git archive.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
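
The three `from_index` cases in this message follow a common pointer-to-pointer caching pattern, sketched below. Everything here is illustrative: `struct ref_index`, `build_index`, `diff_delta_sketch` and the `index_builds` counter are hypothetical stand-ins, and the actual delta computation is left out.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the index built over a reference buffer. */
struct ref_index {
	const void *buf;
	unsigned long size;
};

static int index_builds; /* counts index constructions, for illustration */

static struct ref_index *build_index(const void *buf, unsigned long size)
{
	struct ref_index *idx = malloc(sizeof(*idx));
	if (!idx)
		return NULL;
	idx->buf = buf;
	idx->size = size;
	index_builds++;
	return idx;
}

/* Returns 0 on success, -1 if the index could not be allocated. */
static int diff_delta_sketch(const void *from_buf, unsigned long from_size,
			     struct ref_index **from_index)
{
	struct ref_index *idx;

	if (from_index && *from_index) {
		idx = *from_index;		/* case 3: reuse as is */
	} else {
		idx = build_index(from_buf, from_size);
		if (!idx)
			return -1;
		if (from_index)
			*from_index = idx;	/* case 2: caller now owns it */
	}

	/* ... the delta computation using idx would go here ... */

	if (!from_index)
		free(idx);			/* case 1: private, short-lived index */
	return 0;
}
```

A caller that compares one reference buffer against many targets would keep a `struct ref_index *cache = NULL;`, pass `&cache` on every call, and `free(cache)` once the reference buffer is no longer needed; the index is then built only on the first call.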
    • Re-fix compilation warnings. · 2b74cffa
      Luck, Tony committed
      Commit 8fcf1ad9 has a
      combination of double cast and Andreas' switch to using
      unsigned long ... just the latter is sufficient (and a lot less
      ugly than using the double cast).
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  31. 27 Feb, 2006 1 commit
  32. 25 Feb, 2006 1 commit
    • fix warning from pack-objects.c · 8fcf1ad9
      Luck, Tony committed
      When compiling on ia64 I get this warning (from gcc 3.4.3):
      
      gcc -o pack-objects.o -c -g -O2 -Wall -DSHA1_HEADER='<openssl/sha.h>'  pack-objects.c
      pack-objects.c: In function `pack_revindex_ix':
      pack-objects.c:94: warning: cast from pointer to integer of different size
      
      A double cast (first to long, then to int) shuts gcc up, but is there
      a better way?
      
      [jc: Andreas Ericsson suggests using ulong instead. ]
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  33. 24 Feb, 2006 1 commit