1. 23 Feb, 2013 1 commit
  2. 16 Sep, 2012 1 commit
  3. 07 Aug, 2012 1 commit
  4. 30 Jul, 2012 1 commit
    • do not write null sha1s to on-disk index · 4337b585
      Jeff King authored
      We should never need to write the null sha1 into an index
      entry (short of the 1 in 2^160 chance that somebody actually
      has content that hashes to it). If we attempt to do so, it
      is much more likely that it is a bug, since we use the null
      sha1 as a sentinel value to mean "not valid".
      
      The presence of null sha1s in the index (which can come
      from, among other things, "update-index --cacheinfo", or by
      reading a corrupted tree) can cause problems for later
      readers, because they cannot distinguish the literal null
      sha1 from its use as a sentinel value.  For example, "git
      diff-files" on such an entry would make it appear as if it
      is stat-dirty, and until recently, the diff code assumed
      such an entry meant that we should be diffing a working tree
      file rather than a blob.
      
      Ideally, we would stop such entries from entering even our
      in-core index. However, we do sometimes legitimately add
      entries with null sha1s in order to represent these sentinel
      situations; simply forbidding them in add_index_entry breaks
      a lot of the existing code. However, we can at least make
      sure that our in-core sentinel representation never makes it
      to disk.
      
      To be thorough, we will test an attempt to add both a blob
      and a submodule entry. In the former case, we might run into
      problems anyway because we will be missing the blob object.
      But in the latter case, we do not enforce connectivity
      across gitlink entries, making this our only point of
      enforcement. The current implementation does not care which
      type of entry we are seeing, but testing both cases helps
      future-proof the test suite in case that changes.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
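A minimal sketch of the guard this commit message describes, assuming a simplified entry type. The names `struct entry`, `is_null_sha1_sketch`, and `check_entries_for_write` are hypothetical and chosen for illustration; they are not git's actual code.

```c
#include <stddef.h>

#define SHA1_RAWSZ 20

/* Hypothetical, simplified stand-in for an index entry. */
struct entry {
	unsigned char sha1[SHA1_RAWSZ];
	const char *name;
};

/* Return 1 if every byte of the hash is zero (the sentinel value). */
static int is_null_sha1_sketch(const unsigned char *sha1)
{
	size_t i;

	for (i = 0; i < SHA1_RAWSZ; i++)
		if (sha1[i])
			return 0;
	return 1;
}

/*
 * Check entries before they would be written to disk.
 * Returns -1 (refuse to write) if any entry carries the null sha1,
 * 0 otherwise. The entry type (blob or gitlink) is not inspected.
 */
static int check_entries_for_write(const struct entry *entries, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (is_null_sha1_sketch(entries[i].sha1))
			return -1;
	return 0;
}
```

The in-core representation is left alone in this sketch; only the write path rejects the sentinel, which matches the compromise described in the message.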
  5. 12 Jul, 2012 2 commits
    • Strip namelen out of ce_flags into a ce_namelen field · b60e188c
      Thomas Gummerer authored
      Strip the name length from the ce_flags field and move it
      into its own ce_namelen field in struct cache_entry. This
      will both give us a tiny bit of a performance enhancement
      when working with long pathnames and is a refactoring for
      more readability of the code.
      
      It enhances readability by making it clearer what is a flag and
      where the length is stored, and by making it clear which
      functions use stages in comparisons and which only use the
      length.
      
      It also makes CE_NAMEMASK private, so that users don't
      mistakenly write the name length in the flags.
      Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • cache_name_compare(): do not truncate while comparing paths · d5f53338
      Junio C Hamano authored
      We failed to use the ce_namelen() equivalent and instead compared
      only up to CE_NAMEMASK bytes by mistake.  As a result, adding an
      overlong path that shares a common prefix with an existing entry
      in the index did not add a new entry, but instead replaced the
      existing one.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
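The truncation bug can be illustrated with a small sketch. It assumes the name-length mask is 12 bits wide (0x0fff); the function names and `NAMEMASK_SKETCH` are hypothetical, not the identifiers used in read-cache.c.

```c
#include <stddef.h>
#include <string.h>

/* Assumed width of the name-length field kept in the flags. */
#define NAMEMASK_SKETCH 0x0fff

/* Compare two paths over their full lengths; a shorter prefix sorts first. */
static int name_compare_full(const char *a, size_t alen,
			     const char *b, size_t blen)
{
	size_t min = alen < blen ? alen : blen;
	int cmp = memcmp(a, b, min);

	if (cmp)
		return cmp;
	if (alen < blen)
		return -1;
	if (alen > blen)
		return 1;
	return 0;
}

/* The buggy variant: lengths are capped at the mask before comparing. */
static int name_compare_truncated(const char *a, size_t alen,
				  const char *b, size_t blen)
{
	if (alen > NAMEMASK_SKETCH)
		alen = NAMEMASK_SKETCH;
	if (blen > NAMEMASK_SKETCH)
		blen = NAMEMASK_SKETCH;
	return name_compare_full(a, alen, b, blen);
}
```

Two distinct paths that agree on their first 4095 bytes compare as equal under the truncated variant, which is how a new overlong path could replace an existing entry instead of being added next to it.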
  6. 09 Jul, 2012 1 commit
  7. 05 Apr, 2012 1 commit
  8. 04 Apr, 2012 7 commits
  9. 24 Mar, 2012 1 commit
  10. 18 Feb, 2012 1 commit
    • refresh_index: do not show unmerged path that is outside pathspec · 3d1f148c
      Junio C Hamano authored
      When running "git add --refresh <pathspec>", we incorrectly showed the
      path that is unmerged even if it is outside the specified pathspec, even
      though we did honor pathspec and refreshed only the paths that matched.
      
      Note that this change does not affect "git update-index --refresh"; for
      hysterical raisins, it does not take a pathspec (it takes real paths) and
      more importantly its command line options are parsed and executed one by
      one as they are encountered, so "git update-index --refresh foo" means
      "first refresh the index, and then update the entry 'foo' by hashing the
      contents in file 'foo'", not "refresh only entry 'foo'".
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  11. 19 Nov, 2011 3 commits
    • refresh_index: make porcelain output more specific · 73b7eae6
      Jeff King authored
      If you have a deleted file and a porcelain refreshes the
      cache, we print:
      
        Unstaged changes after reset:
        M	file
      
      This is technically correct, in that the file is modified,
      but it's friendlier to the user if we further differentiate
      the case of a deleted file (especially because this output
      looks a lot like "diff --name-status", which would also make
      the distinction).
      
      Similarly, we can distinguish typechanges ("T") and
      intent-to-add files ("A"), both of which appear as just "M"
      in the current output.
      
      The plumbing output for all cases remains "needs update" for
      historical compatibility.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
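As a rough sketch of the distinction drawn in this message, the porcelain letter can depend on what kind of change was found while the plumbing message stays fixed. The enum and function names are hypothetical, and the 'D' letter for deletions is an assumption borrowed from "diff --name-status".

```c
/* Hypothetical classification of what a refresh found for one entry. */
enum change_kind {
	CHANGE_MODIFIED,
	CHANGE_DELETED,
	CHANGE_TYPECHANGED,
	CHANGE_INTENT_TO_ADD
};

/* Porcelain status letter, in the style of "diff --name-status". */
static char porcelain_letter(enum change_kind kind)
{
	switch (kind) {
	case CHANGE_DELETED:
		return 'D';
	case CHANGE_TYPECHANGED:
		return 'T';
	case CHANGE_INTENT_TO_ADD:
		return 'A';
	default:
		return 'M';
	}
}

/* Plumbing output stays the same for every kind of change. */
static const char *plumbing_message(enum change_kind kind)
{
	(void)kind;
	return "needs update";
}
```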
    • refresh_index: rename format variables · 4bd4e730
      Jeff King authored
      When refreshing the index, for modified (or unmerged) files we will print
      "needs update" (or "needs merge") for plumbing, or a line similar to the
      output from "diff --name-status" for porcelain.
      
      The variables holding which type of message to show are named after the
      plumbing messages. However, as we begin to differentiate more cases at the
      porcelain level (with the plumbing message staying the same), that naming
      scheme will become awkward.
      
      Instead, name the variables after which case we found (modified or
      unmerged), not what we will output.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • read-cache: let refresh_cache_ent pass up changed flags · d05e6970
      Jeff King authored
      This will enable refresh_cache to differentiate more cases
      of modification (such as typechange) when telling the user
      what isn't fresh.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  12. 27 Oct, 2011 2 commits
    • read-cache.c: allocate index entries individually · debed2a6
      René Scharfe authored
      The code to estimate the in-memory size of the index based on its on-disk
      representation is subtly wrong for certain architecture-dependent struct
      layouts.  Instead of fixing it, replace the code to keep the index entries
      in a single large block of memory and allocate each entry separately
      instead.  This is both simpler and more flexible, as individual entries
      can now be freed.  Actually using that added flexibility is left for a
      later patch.
      Suggested-by: Junio C Hamano <gitster@pobox.com>
      Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • read-cache.c: fix index memory allocation · 8f41c07f
      René Scharfe authored
      estimate_cache_size() tries to guess how much memory is needed for the
      in-memory representation of an index file.  It does that by using the
      file size, the number of entries and the difference of the sizes of the
      on-disk and in-memory structs -- without having to check the length of
      the name of each entry, which varies for each entry, but their sums are
      the same no matter the representation.
      
      Except there can be a difference.  First of all, the size is really
      calculated by ce_size and ondisk_ce_size based on offsetof(..., name),
      not sizeof, which can be different.  And entries are padded with 1 to 8
      NULs at the end (after the variable name) to make their total length a
      multiple of eight.
      
      So in order to allocate enough memory to hold the index, change the
      delta calculation to be based on offsetof(..., name) and round up to
      the next multiple of eight.
      
      On a 32-bit Linux, this delta was used before:
      
      	sizeof(struct cache_entry)        == 72
      	sizeof(struct ondisk_cache_entry) == 64
      	                                    ---
      	                                      8
      
      The actual difference for an entry with a filename length of one was,
      however (the definitions are in cache.h):
      
      	offsetof(struct cache_entry, name)        == 72
      	offsetof(struct ondisk_cache_entry, name) == 62
      
      	ce_size        == (72 + 1 + 8) & ~7 == 80
      	ondisk_ce_size == (62 + 1 + 8) & ~7 == 64
      	                                      ---
      	                                       16
      
      So eight bytes less had been allocated for such entries.  The new
      formula yields the correct delta:
      
      	(72 - 62 + 7) & ~7 == 16
      Reported-by: John Hsing <tsyj2007@gmail.com>
      Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
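The arithmetic in this message can be checked with a short sketch. The helper names are hypothetical; the offsets 72 and 62 are the 32-bit Linux values quoted above, passed in as plain numbers rather than taken from the real structs.

```c
#include <stddef.h>

/*
 * Size of one entry: the offset of the name field plus the name itself,
 * padded with 1 to 8 NULs so the total is a multiple of eight.
 */
static size_t padded_entry_size(size_t name_offset, size_t name_len)
{
	return (name_offset + name_len + 8) & ~(size_t)7;
}

/*
 * Per-entry allowance for the difference between the in-memory and
 * on-disk layouts, rounded up to the next multiple of eight.
 */
static size_t entry_size_delta(size_t mem_offset, size_t disk_offset)
{
	return (mem_offset - disk_offset + 7) & ~(size_t)7;
}
```

With the quoted offsets, `padded_entry_size(72, 1) - padded_entry_size(62, 1)` and `entry_size_delta(72, 62)` both come to 16, whereas the old sizeof-based delta was 8.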
  13. 26 Aug, 2011 1 commit
  14. 09 Jun, 2011 1 commit
  15. 08 Jun, 2011 1 commit
  16. 28 May, 2011 1 commit
  17. 10 May, 2011 1 commit
  18. 22 Mar, 2011 2 commits
    • update $GIT_INDEX_FILE when there are racily clean entries · 483fbe2b
      Junio C Hamano authored
      Traditional "opportunistic index update" done by read-only "diff" and
      "status" was about updating cached lstat(2) information in the index for
      the next round.  We missed another obvious optimization opportunity: when
      there are racily clean entries that will cease to be racily clean by
      updating $GIT_INDEX_FILE.  Detect that case and write $GIT_INDEX_FILE out
      to give it a newer timestamp.
      
      Noticed by Lasse Makholm by stracing "git status" in a fresh checkout and
      counting the number of open(2) calls.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
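A rough sketch of the detection step, under two stated assumptions: an entry counts as racily clean when its mtime is not older than the index file's timestamp, and timestamps have whole-second granularity. The function names are hypothetical and the real check in git may differ in detail.

```c
#include <stddef.h>
#include <time.h>

/* Assumed definition: the entry was modified at or after the index was written. */
static int is_racily_clean(time_t index_mtime, time_t entry_mtime)
{
	return index_mtime <= entry_mtime;
}

/*
 * Return 1 if at least one entry is racily clean now but would stop
 * being so if the index file were rewritten with timestamp "now".
 */
static int rewrite_would_help(time_t index_mtime, time_t now,
			      const time_t *entry_mtimes, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (is_racily_clean(index_mtime, entry_mtimes[i]) &&
		    !is_racily_clean(now, entry_mtimes[i]))
			return 1;
	return 0;
}
```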
    • diff/status: refactor opportunistic index update · ccdc4ec3
      Junio C Hamano authored
      When we had to refresh the index internally before running diff or status,
      we opportunistically updated the $GIT_INDEX_FILE so that later invocation
      of git can use the lstat(2) we already did in this invocation.
      
      Make them share a helper function to do so.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  19. 23 Feb, 2011 1 commit
  20. 08 Feb, 2011 1 commit
  21. 04 Feb, 2011 2 commits
  22. 07 Oct, 2010 1 commit
    • Support case folding for git add when core.ignorecase=true · dc1ae704
      Joshua Jensen authored
      When MyDir/ABC/filea.txt is added to Git, the disk directory MyDir/ABC/
      is renamed to mydir/aBc/, and then mydir/aBc/fileb.txt is added, the
      index will contain MyDir/ABC/filea.txt and mydir/aBc/fileb.txt. Although
      the earlier portions of this patch series account for those differences
      in case, this patch makes the pathing consistent by folding the case of
      newly added files against the first file added with that path.
      
      In read-cache.c's add_to_index(), the index_name_exists() support used
      for git status's case insensitive directory lookups is used to find the
      proper directory case according to what the user already checked in.
      That is, MyDir/ABC/'s case is used to alter the stored path for
      fileb.txt to MyDir/ABC/fileb.txt (instead of mydir/aBc/fileb.txt).
      
      This is especially important when cloning a repository to a case
      sensitive file system. MyDir/ABC/ and mydir/aBc/ exist in the same
      directory on a Windows machine, but on Linux, the files exist in two
      separate directories. The update to add_to_index(), in effect, treats a
      Windows file system as case sensitive by making path case consistent.
      Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
      Signed-off-by: Johannes Sixt <j6t@kdbg.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
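The folding step can be sketched as follows, assuming ASCII-only paths and a single already-indexed path to compare against. `fold_dir_case` and `same_ignoring_case` are hypothetical helpers for illustration; the actual patch relies on index_name_exists() inside add_to_index().

```c
#include <ctype.h>
#include <string.h>

/* Case-insensitive comparison of the first n bytes (ASCII only). */
static int same_ignoring_case(const char *a, const char *b, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
			return 0;
	return 1;
}

/*
 * If the directory part of "path" matches the directory part of an
 * already-indexed path except for case, rewrite it in place to use
 * the indexed spelling. Returns 1 if the path was changed.
 */
static int fold_dir_case(char *path, const char *indexed)
{
	const char *slash = strrchr(path, '/');
	size_t dirlen;

	if (!slash)
		return 0;
	dirlen = (size_t)(slash - path) + 1;
	if (strlen(indexed) < dirlen)
		return 0;
	if (!same_ignoring_case(path, indexed, dirlen))
		return 0;
	if (!memcmp(path, indexed, dirlen))
		return 0; /* already spelled the same way */
	memcpy(path, indexed, dirlen);
	return 1;
}
```

Applied to the example in the message, a new path mydir/aBc/fileb.txt checked against the indexed MyDir/ABC/filea.txt is stored as MyDir/ABC/fileb.txt.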
  23. 12 Aug, 2010 1 commit
  24. 03 Feb, 2010 1 commit
  25. 24 Jan, 2010 1 commit
    • Make ce_uptodate() trustworthy again · 125fd984
      Junio C Hamano authored
      The rule has always been that a cache entry that is ce_uptodate(ce)
      means that we already have checked the work tree entity and we know
      there is no change in the work tree compared to the index, and nobody
      should have to double check.  Note that false ce_uptodate(ce) does not
      mean it is known to be dirty---it only means we don't know if it is
      clean.
      
      There are a few codepaths (refresh-index and preload-index are among
      them) that mark a cache entry as up-to-date based solely on the return
      value from ie_match_stat(); this function uses lstat() to see if the
      work tree entity has been touched, and for a submodule entry, if its
      HEAD points at the same commit as the commit recorded in the index of
      the superproject (a submodule that is not even cloned is considered
      clean).
      
      A submodule is no longer considered unmodified merely because its HEAD
      matches the index of the superproject these days, in order to prevent
      people from forgetting to commit in the submodule and updating the
      superproject index with the new submodule commit, before committing the
      state in the superproject.  However, the patch to do so didn't update
      the codepath that marks cache entries up-to-date to follow the updated
      definition, and instead worked around it by saying "we don't trust the
      return value of ce_uptodate() for submodules."
      
      This makes ce_uptodate() trustworthy again by not marking submodule
      entries up-to-date.
      
      The next step _could_ be to introduce a few "in-core" flag bits to
      cache_entry structure to record "this entry is _known_ to be dirty",
      call is_submodule_modified() from ie_match_stat(), and use these new
      bits to avoid running this rather expensive check more than once, but
      that can be a separate patch.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
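A minimal sketch of the rule this commit settles on: a stat match is enough to mark an ordinary entry up-to-date, but never a submodule entry. The struct, the flag value, and the function names are hypothetical; the mode constants assume the gitlink file type bits are 0160000.

```c
/* Assumed mode bits: file-type mask and the gitlink (submodule) type. */
#define MODE_TYPE_MASK_SKETCH 0170000u
#define MODE_GITLINK_SKETCH   0160000u

/* Hypothetical in-core flag meaning "known to be clean". */
#define FLAG_UPTODATE_SKETCH 0x1u

/* Hypothetical, simplified stand-in for a cache entry. */
struct entry_sketch {
	unsigned int mode;
	unsigned int flags;
};

static int is_gitlink_sketch(unsigned int mode)
{
	return (mode & MODE_TYPE_MASK_SKETCH) == MODE_GITLINK_SKETCH;
}

/*
 * Mark the entry up-to-date only when the stat data matched and the
 * entry is not a submodule; an unset flag means "unknown", not "dirty".
 */
static void mark_uptodate_if_safe(struct entry_sketch *e, int stat_matches)
{
	if (stat_matches && !is_gitlink_sketch(e->mode))
		e->flags |= FLAG_UPTODATE_SKETCH;
}
```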
  26. 22 Jan, 2010 1 commit
    • Remove diff machinery dependency from read-cache · fb7d3f32
      Linus Torvalds authored
      Exal Sibeaz pointed out that some git files are way too big, and that
      add_files_to_cache() brings in all the diff machinery to any git binary
      that needs the basic git SHA1 object operations from read-cache.c. Which
      is pretty much all of them.
      
      It's doubly silly, since add_files_to_cache() is only used by builtin
      programs (add, checkout and commit), so it's fairly easily fixed by just
      moving the thing to builtin-add.c, and avoiding the dependency entirely.
      
      I initially argued to Exal that it would probably be best to try to depend
      on smart compilers and linkers, but after spending some time trying to
      make -ffunction-sections work and giving up, I think Exal was right, and
      the fix is to just do some trivial cleanups like this.
      
      This trivial cleanup results in pretty stunning file size differences.
      The diff machinery really is mostly used by just the builtin programs, and
      you have things like these trivial before-and-after numbers:
      
        -rwxr-xr-x 1 torvalds torvalds 1727420 2010-01-21 10:53 git-hash-object
        -rwxrwxr-x 1 torvalds torvalds  940265 2010-01-21 11:16 git-hash-object
      
      Now, I'm not saying that 940kB is good either, but that's mostly all the
      debug information - you can see the real code with 'size':
      
         text	   data	    bss	    dec	    hex	filename
       418675	   3920	 127408	 550003	  86473	git-hash-object (before)
       230650	   2288	 111728	 344666	  5425a	git-hash-object (after)
      
      ie we have a nice 24% size reduction from this trivial cleanup.
      
      It's not just that one file either. I get:
      
      	[torvalds@nehalem git]$ du -s /home/torvalds/libexec/git-core
      	45640	/home/torvalds/libexec/git-core (before)
      	33508	/home/torvalds/libexec/git-core (after)
      
      so we're talking 12MB of diskspace here.
      
      (Of course, stripping all the binaries brings the 33MB down to 9MB, so the
      whole debug information thing is still the bulk of it all, but that's a
      separate issue entirely)
      
      Now, I'm sure there are other things we should do, and changing our
      compiler flags from -O2 to -Os would bring the text size down by an
      additional almost 20%, but this thing Exal pointed out seems to be some
      good low-hanging fruit.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  27. 12 Jan, 2010 1 commit
  28. 04 Jan, 2010 1 commit
    • "reset --merge": fix unmerged case · e11d7b59
      Junio C Hamano authored
      Commit 9e8eceab (Add 'merge' mode to 'git reset', 2008-12-01) disallowed
      "git reset --merge" when there were unmerged entries.  But its intent was
      for unmerged entries to be reset as if --hard (instead of --merge) had
      been used.  This makes sense because all "mergy" operations make sure that
      any path involved in the merge does not have local modifications before
      starting, so resetting such a path away won't lose any information.
      
      The previous commit changed the behavior of --merge to accept resetting
      unmerged entries if they are reset to a different state than HEAD, but it
      did not reset the changes in the work tree, leaving the conflict markers
      in the resulting file in the work tree.
      
      Fix it by doing three things:
      
       - Update the documentation to match the wish of original "reset --merge"
         better, namely, "An unmerged entry is a sign that the path didn't have
         any local modification and can be safely resetted to whatever the new
         HEAD records";
      
       - Update read_index_unmerged(), which reads the index file into the cache
         while dropping any higher-stage entries down to stage #0, not to copy
         the object name from the higher stage entry.  The code used to take the
         object name from a stage entry ("base" if you happened to have
         stage #1, or "ours" if both sides added, etc.), which essentially meant
         that you are getting random results depending on what the merge did.
      
         The _only_ reason we want to keep a previously unmerged entry in the
         index at stage #0 is so that we don't forget the fact that we have
         corresponding file in the work tree in order to be able to remove it
         when the tree we are resetting to does not have the path.  In order to
         differentiate such an entry from ordinary cache entry, the cache entry
         added by read_index_unmerged() is marked as CE_CONFLICTED.
      
       - Update merged_entry() and deleted_entry() so that they pay attention to
         cache entries marked as CE_CONFLICTED.  They are previously unmerged
         entries, and the files in the work tree that correspond to them are
         resetted away by oneway_merge() to the version from the tree we are
         resetting to.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>