  1. 01 Mar, 2008 2 commits
  2. 07 Jun, 2007 1 commit
  3. 27 May, 2007 1 commit
    • Lazily open pack index files on demand · d079837e
      Shawn O. Pearce committed
      In some repository configurations the user may have many packfiles,
      but all of the recent commits/trees/tags/blobs are likely to
      be in the most recent packfile (the one with the newest mtime).
      It is therefore common to be able to complete an entire operation
      by accessing only one packfile, even if there are 25 packfiles
      available to the repository.
      
      Rather than opening and mmaping the corresponding .idx file for
      every pack found, we now only open and map the .idx when we suspect
      there might be an object of interest in there.
      
      Of course we cannot know in advance which packfile contains an
      object, so we still need to scan the entire packed_git list to
      locate anything.  But odds are users want to access objects in the
      most recently created packfiles first, and that may be all they
      ever need for the current operation.
      
      Junio observed in b867092f that placing recent packfiles before
      older ones can slightly improve access times for recent objects,
      without degrading it for historical object access.
      
      This change improves upon Junio's observations by trying even harder
      to avoid the .idx files that we won't need.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
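The lazy-open idea in the commit above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the code from the commit: `struct demo_pack`, `demo_open_index()` and `demo_find()` are hypothetical names, the "index" is a plain string with one character per object, and the real open() and mmap() of the .idx file is replaced by a pointer assignment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a pack list entry. */
struct demo_pack {
    struct demo_pack *next;  /* list is ordered newest pack first */
    const char *index_data;  /* NULL until the index has been loaded */
    const char *objects;     /* pretend .idx contents: one char per object */
    int loads;               /* how many times the index was loaded */
};

/* Load the index on first use only; later calls do nothing. */
static int demo_open_index(struct demo_pack *p)
{
    assert(p != NULL);
    if (p->index_data)
        return 0;
    /* Real code would open and map the .idx file here. */
    p->index_data = p->objects;
    p->loads++;
    return 0;
}

/* Scan packs newest first, loading each index only when it is reached. */
static struct demo_pack *demo_find(struct demo_pack *list, char id)
{
    struct demo_pack *p;

    assert(id != '\0');
    for (p = list; p; p = p->next) {
        if (demo_open_index(p))
            continue;
        if (strchr(p->index_data, id))
            return p;
    }
    return NULL;
}
```

With a two-entry list, a lookup that is satisfied by the newest pack leaves the older pack's `index_data` at NULL, which is the saving the commit message describes.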
  4. 26 May, 2007 1 commit
  5. 11 Apr, 2007 1 commit
    • get rid of num_packed_objects() · 57059091
      Nicolas Pitre committed
      The coming index format change doesn't allow for the number of objects
      to be determined from the size of the index file directly.  Instead, let's
      initialize a field in the packed_git structure with the object count when
      the index is validated since the count is always known at that point.
      
      While at it let's reorder some struct packed_git fields to avoid padding
      due to needed 64-bit alignment for some of them.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  6. 06 Apr, 2007 1 commit
  7. 17 Mar, 2007 1 commit
    • [PATCH] clean up pack index handling a bit · 42873078
      Nicolas Pitre committed
      Especially with the new index format to come, it is more appropriate
      to encapsulate more into check_packed_git_idx() and assume less of the
      index format in struct packed_git.
      
      To that effect, index_base is renamed to index_data with void * type
      so that it is not used directly; other pointers are initialized from it
      instead.  This allows the removal of a couple of pointer casts, and
      provides a better generic name to grep for when adding support for new
      index versions or formats.
      
      And index_data is declared const too while at it.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  8. 08 Mar, 2007 2 commits
    • Use off_t when we really mean a file offset. · c4001d92
      Shawn O. Pearce committed
      Not all platforms have declared 'unsigned long' to be a 64 bit value,
      but we want to support a 64 bit packfile (or close enough anyway)
      in the near future as some projects are getting large enough that
      their packed size exceeds 4 GiB.
      
      By using off_t, the POSIX type that is declared to mean an offset
      within a file, we support whatever maximum file size the underlying
      operating system will handle.  For most modern systems this is up
      around 2^60 or higher.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
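To make the 4 GiB limit in the commit above concrete, here is a small sketch of what happens when a large file offset is squeezed into a 32-bit integer. The helper names are hypothetical and not from git. The width of `off_t` itself depends on the platform and on build settings (for example `_FILE_OFFSET_BITS` on glibc), so the second helper only reports what the current build supports.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>

/* What survives when a 64-bit file offset is stored in a 32-bit integer. */
static uint32_t demo_store_in_32_bits(int64_t offset)
{
    assert(offset >= 0);
    return (uint32_t)offset; /* keeps only the low 32 bits */
}

/* Whether this build's off_t can represent the offset without loss. */
static int demo_fits_in_off_t(int64_t offset)
{
    return (int64_t)(off_t)offset == offset;
}
```

An offset of 4 GiB plus 10 bytes stored this way comes back as 10, which would silently point at the wrong place in the packfile.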
    • Use uint32_t for all packed object counts. · 326bf396
      Shawn O. Pearce committed
      As we permit up to 2^32-1 objects in a single packfile we cannot
      use a signed int to represent the object offset within a packfile;
      after 2^31-1 objects we will start seeing negative indexes and
      error out or compute bad addresses within the mmap'd index.
      
      This is a minor cleanup that does not introduce any significant
      logic changes.  It is roach free.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
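The risk described in the commit above can be illustrated with a short sketch. It assumes the original pack index layout, a 256-entry fan-out table of 4-byte counts followed by 24-byte entries; that layout and all `demo_` names are assumptions made for the example, not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_FANOUT_BYTES (256 * 4) /* assumed fan-out table size */
#define DEMO_ENTRY_BYTES  24        /* assumed 4-byte offset + 20-byte SHA-1 */

/* A signed 32-bit int can only number objects up to 2^31-1. */
static int demo_fits_in_signed_int(uint32_t nth)
{
    return nth <= (uint32_t)INT32_MAX;
}

/* Byte position of the nth entry, computed in 64 bits to avoid wrapping. */
static uint64_t demo_entry_offset(uint32_t nth)
{
    return DEMO_FANOUT_BYTES + (uint64_t)nth * DEMO_ENTRY_BYTES;
}
```

Object number 2^31 no longer fits in a signed int, while the unsigned count combined with 64-bit arithmetic still yields a valid position.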
  9. 27 Feb, 2007 1 commit
    • convert object type handling from a string to a number · 21666f1a
      Nicolas Pitre committed
      We currently have two parallel notations for dealing with object types
      in the code: a string and a numerical value.  One of them is obviously
      redundant, and the most used one requires more stack space and a bunch
      of strcmp() calls all over the place.
      
      This is an initial step for the removal of the version using a char array
      found in object reading code paths.  The patch is unfortunately large but
      there is no sane way to split it in smaller parts without breaking the
      system.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
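A minimal sketch of the numeric representation the commit above moves toward might look like the following. The `demo_` identifiers are hypothetical; the values 1 to 4 are assumed to match the type codes used for commit, tree, blob and tag in the pack format. Strings are converted once at the boundary, and the rest of the code compares integers instead of calling strcmp().

```c
#include <assert.h>
#include <string.h>

/* Numeric object types (hypothetical names for illustration). */
enum demo_object_type {
    DEMO_OBJ_BAD = -1,
    DEMO_OBJ_COMMIT = 1,
    DEMO_OBJ_TREE = 2,
    DEMO_OBJ_BLOB = 3,
    DEMO_OBJ_TAG = 4
};

/* Index 0 is unused so that the array index equals the type code. */
static const char *const demo_type_names[] = {
    NULL, "commit", "tree", "blob", "tag"
};

/* Convert a type name to its number once, at the parsing boundary. */
static enum demo_object_type demo_type_from_string(const char *name)
{
    int i;

    assert(name != NULL);
    for (i = DEMO_OBJ_COMMIT; i <= DEMO_OBJ_TAG; i++)
        if (!strcmp(name, demo_type_names[i]))
            return (enum demo_object_type)i;
    return DEMO_OBJ_BAD;
}

/* Convert back to a name only when text output is needed. */
static const char *demo_type_name(enum demo_object_type type)
{
    if (type < DEMO_OBJ_COMMIT || type > DEMO_OBJ_TAG)
        return NULL;
    return demo_type_names[type];
}
```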
  10. 05 Jan, 2007 1 commit
    • pack-check.c::verify_packfile(): don't run SHA-1 update on huge data · 8977c110
      Junio C Hamano committed
      Running the SHA1_Update() on the whole packfile in a single call
      revealed an overflow problem we had in the SHA-1 implementation
      on POWER architecture some time ago, which was fixed with commit
      b47f509b (June 19, 2006).  Other SHA-1 implementations may have
      a similar problem.
      
      The sliding mmap() series already makes chunked calls to
      SHA1_Update(), so this patch itself will become moot when it
      graduates to "master", but in the meantime, run the hash
      function in smaller chunks to prevent possible future problems.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
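The chunking approach from the commit above is sketched below. To stay self-contained it does not use a real SHA-1 library: `demo_hash_update()` is a stand-in streaming hash (32-bit FNV-1a), and every `demo_` name is hypothetical. The point it illustrates is that feeding a streaming hash in bounded pieces gives the same result as one huge call, while never passing a very large length to the update function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in streaming hash state (FNV-1a, 32 bit); not SHA-1. */
struct demo_hash_ctx {
    uint32_t state;
};

static void demo_hash_init(struct demo_hash_ctx *c)
{
    c->state = 2166136261u; /* FNV offset basis */
}

static void demo_hash_update(struct demo_hash_ctx *c, const void *data, size_t len)
{
    const unsigned char *p = data;

    while (len--) {
        c->state ^= *p++;
        c->state *= 16777619u; /* FNV prime */
    }
}

/* Feed the data in pieces of at most 'chunk' bytes. */
static void demo_hash_update_chunked(struct demo_hash_ctx *c, const void *data,
                                     size_t len, size_t chunk)
{
    const unsigned char *p = data;

    assert(chunk > 0);
    while (len) {
        size_t n = len < chunk ? len : chunk;

        demo_hash_update(c, p, n);
        p += n;
        len -= n;
    }
}
```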
  11. 30 Dec, 2006 4 commits
    • Loop over pack_windows when inflating/accessing data. · 079afb18
      Shawn O. Pearce committed
      When multiple mmaps start getting used for all pack file access it
      is not possible to get all data associated with a specific object
      in one contiguous memory region.  This limitation prevents simply
      passing a single address and length to SHA1_Update or to inflate.
      
      Instead we need to loop until we have processed all data of interest.
      
      As we loop over the data we are always interested in reusing the same
      window 'cursor', as the prior window will no longer be of any use
      to us.  This allows the use_pack() call to automatically decrement
      the use count of the prior window before setting up access for us
      to the next window.
      
      Within each loop we need to make use of the available length output
      parameter of use_pack() to tell us how many bytes are available in
      the current memory region, as we cannot tell otherwise.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
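The loop described in the commit above can be sketched as follows. This is a simulation under stated assumptions rather than git's implementation: the pack is an in-memory byte array, there is a single fixed-size window per pack, and `demo_use_pack()` and `demo_unuse_pack()` are hypothetical stand-ins that only mimic the cursor and "bytes available" behaviour the message describes.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_WINDOW_SIZE 4 /* tiny window so the loop has to iterate */

struct demo_window {
    size_t base;   /* pack offset of the first byte in the window */
    int inuse_cnt; /* number of cursors currently holding the window */
};

struct demo_pack {
    const unsigned char *data;
    size_t size;
    struct demo_window win; /* one window per pack in this sketch */
};

/* Release the cursor's old window, position a window at 'offset', and
 * report through 'left' how many bytes are readable from there. */
static const unsigned char *demo_use_pack(struct demo_pack *p,
                                          struct demo_window **w_curs,
                                          size_t offset, size_t *left)
{
    size_t end;

    assert(offset < p->size);
    if (*w_curs)
        (*w_curs)->inuse_cnt--;
    p->win.base = offset - offset % DEMO_WINDOW_SIZE;
    p->win.inuse_cnt++;
    *w_curs = &p->win;
    end = p->win.base + DEMO_WINDOW_SIZE;
    if (end > p->size)
        end = p->size;
    if (left)
        *left = end - offset;
    return p->data + offset;
}

static void demo_unuse_pack(struct demo_window **w_curs)
{
    if (*w_curs) {
        (*w_curs)->inuse_cnt--;
        *w_curs = NULL;
    }
}

/* Process 'len' bytes starting at 'offset', one window at a time.
 * The caller must ensure offset + len does not exceed the pack size. */
static unsigned long demo_sum_range(struct demo_pack *p, size_t offset, size_t len)
{
    struct demo_window *w_curs = NULL;
    unsigned long sum = 0;

    while (len) {
        size_t avail, i;
        const unsigned char *in = demo_use_pack(p, &w_curs, offset, &avail);

        if (avail > len)
            avail = len;
        for (i = 0; i < avail; i++)
            sum += in[i];
        offset += avail;
        len -= avail;
    }
    demo_unuse_pack(&w_curs);
    return sum;
}
```

The same cursor is passed to every call, so the previous window's use count drops before the next one is taken, and the final `demo_unuse_pack()` leaves the count at zero.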
    • Replace use_packed_git with window cursors. · 03e79c88
      Shawn O. Pearce committed
      Part of the implementation concept of the sliding mmap window for
      pack access is to permit multiple windows per pack to be mapped
      independently.  Since the inuse_cnt is associated with the mmap and
      not with the file, this value is in struct pack_window and needs to
      be incremented/decremented for each pack_window accessed by any code.
      
      To facilitate that implementation we need to replace all uses of
      use_packed_git() and unuse_packed_git() with a different API that
      follows struct pack_window objects rather than struct packed_git.
      
      The way this works is that when we need to start accessing a pack for
      the first time we should set up a new window 'cursor' by declaring
      a local and setting it to NULL:
      
        struct pack_window *w_curs = NULL;
      
      To obtain the memory region which contains a specific section of
      the pack file we invoke use_pack(), supplying the address of our
      current window cursor:
      
        unsigned int len;
        unsigned char *addr = use_pack(p, &w_curs, offset, &len);
      
      The returned address `addr` will be the first byte at `offset`
      within the pack file.  The optional variable len will also be
      updated with the number of bytes remaining following the address.
      
      Multiple calls to use_pack() with the same window cursor will
      update the window cursor, moving it from one window to another
      when necessary.  In this way each window cursor variable maintains
      only one struct pack_window inuse at a time.
      
      Finally before exiting the scope which originally declared the window
      cursor we must invoke unuse_pack() to unuse the current window (which
      may be different from the one that was first obtained from use_pack):
      
        unuse_pack(&w_curs);
      
      This implementation is still not complete with regard to multiple
      windows, as only one window per pack file is supported right now.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
    • Refactor packed_git to prepare for sliding mmap windows. · c41ee586
      Shawn O. Pearce committed
      The idea behind the sliding mmap window pack reader implementation
      is to have multiple mmap regions active against the same pack file,
      thereby allowing the process to mmap in only the active/hot sections
      of the pack and reduce overall virtual address space usage.
      
      To implement this we need to refactor the mmap related data
      (pack_base, pack_use_cnt) out of struct packed_git and move them
      into a new struct pack_window.
      
      We are refactoring the code to support a single struct pack_window
      per packfile, thereby emulating the prior behavior of mmap'ing the
      entire pack file.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
    • Replace unpack_entry_gently with unpack_entry. · 4d703a1a
      Shawn O. Pearce committed
      The unpack_entry_gently function currently has only two callers:
      the delta base resolution in sha1_file.c and the main loop of
      pack-check.c.  Both of these must change to using unpack_entry
      directly when we implement sliding window mmap logic, so I'm doing
      it earlier to help break down the change set.
      
      This may cause a slight performance decrease for delta base
      resolution as well as for pack-check.c's verify_packfile(), as
      the pack use counter will be incremented and decremented for every
      object that is unpacked.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  12. 23 Sep, 2006 1 commit
    • many cleanups to sha1_file.c · 43057304
      Nicolas Pitre committed
      Those cleanups are mainly to set the table for the support of deltas
      with base objects referenced by offsets instead of sha1.  This means
      that many pack lookup functions are converted to take a pack/offset
      tuple instead of a sha1.
      
      This eliminates many struct pack_entry usages, since this structure
      carried redundant information in many cases and needlessly increased
      the stack footprint of a couple of recursively called functions that
      used to declare a local copy of it at every recursion level.
      
      In the process, packed_object_info_detail() has been reorganized as well
      so as to look much saner and more amenable to deltas with offset support.
      
      Finally, the appropriate adjustments have been made to functions that
      depend on the above changes.  There are no functional changes yet,
      simply some code refactoring at this point.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  13. 18 Aug, 2006 1 commit
  14. 20 Jun, 2006 1 commit
  15. 28 Apr, 2006 1 commit
  16. 06 Mar, 2006 1 commit
  17. 16 Feb, 2006 1 commit
  18. 10 Feb, 2006 1 commit
    • remove delta-against-self bit · d60fc1c8
      Nicolas Pitre committed
      After experimenting with code to add the ability to encode a delta
      against part of the deltified file, it turns out that resulting packs
      are _bigger_ than when this ability is not used.  The raw delta output
      might be smaller, but it doesn't compress as well using gzip with a
      negative net saving on average.
      
      Said bit would in fact be more useful to allow for encoding the copying
      of chunks larger than 64KB, providing more savings with large files.
      This will correspond to pack version 3.
      
      While the current code still produces version 2 packs, it is made
      future proof so that pack versions 2 and 3 are both accepted.  Any
      version 2 pack is compatible with version 3 since the redefined bit was
      never used before.  When enough time has passed, code to use that bit to
      produce version 3 packs could be added.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  19. 10 Aug, 2005 1 commit
    • [PATCH] -Werror fixes · 4ec99bf0
      Timo Sirainen committed
      GCC's format __attribute__ is good for checking errors, especially
      with the -Wformat=2 parameter.  This fixes most of the problems reported
      against the 2005-08-09 snapshot.
  20. 08 Jul, 2005 1 commit
  21. 01 Jul, 2005 2 commits
    • [PATCH] Show more details of packfile with verify-pack -v. · ad8c80a5
      Junio C Hamano committed
      This implements the show_pack_info() function, used by the verify-pack
      command when the -v flag is given, to obtain something like what
      unpack-objects used to give when it was first written.
      
      It shows the following for each non-deltified object found in
      the pack:
      
          SHA1 type size offset
      
      For deltified objects, it shows this instead:
      
          SHA1 type size offset depth base_sha1
      
      In order to get the output in the order in which objects appear in the
      pack file for debugging purposes, you can do this:
      
          $ git-verify-pack -v packfile | sort -n -k 4,4
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] verify-pack updates. · f3bf9224
      Junio C Hamano committed
      Nico pointed out that having verify_pack.c and verify-pack.c was
      confusing.  Rename verify_pack.c to pack-check.c as suggested,
      and enhance the verification done quite a bit.
      
       - Built-in sha1_file unpacking knows that a base object of a
         deltified object _must_ be in the same pack, and takes
         advantage of that fact.
      
       - The earlier verify-pack command only checked the SHA1 sum for the
         entire pack file and did not look into its contents.  It now
         checks that everything the idx file claims to have unpacks correctly.
      
       - It now has a hook to give more detailed information for
         objects contained in the pack under the -v flag.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  22. 30 Jun, 2005 1 commit
    • [PATCH] Add git-verify-pack command. · f9253394
      Junio C Hamano committed
      Given a list of <pack>.idx files, this command validates the
      index file and the corresponding .pack file for consistency.
      
      This patch also uses the same validation mechanism in fsck-cache
      when the --full flag is used.
      
      During normal operation, sha1_file.c verifies that a given .idx
      file matches the .pack file by comparing the SHA1 checksum
      stored in .idx file and .pack file as a minimum sanity check.
      We may further want to check the pack signature and version when
      we map the pack, but that would be a separate patch.
      
      Earlier, errors mapping a pack file were not flagged as fatal but led
      to a random fatal error later.  This version explicitly die()s
      when such an error is detected.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>