1. 03 Dec, 2017 1 commit
    • diffcore-rename: make diff-tree -l0 mean -l<large> · 89973554
      Committed by Jonathan Tan
      In the documentation of diff-tree, it is stated that the -l option
      "prevents rename/copy detection from running if the number of
      rename/copy targets exceeds the specified number". The documentation
      does not mention any special handling for the number 0, but the
      implementation before commit 9f7e4bfa ("diff: remove silent clamp of
      renameLimit", 2017-11-13) treated 0 as a special value indicating that
      the rename limit is to be a very large number instead.
      
      The commit 9f7e4bfa changed that behavior, treating 0 as 0. Revert
      this behavior to what it was previously. This allows existing scripts
      and tools that use "-l0" to continue working. The alternative (to have
      "-l0" suppress rename detection) is probably much less useful, since
      users can just refrain from specifying -M and/or -C to have the same
      effect.
      Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
      Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
      Reviewed-by: Elijah Newren <newren@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
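The handling of "-l0" described in this commit message can be pictured with a small sketch. The helper name and the constant below are hypothetical stand-ins chosen for illustration, not git's actual code; the only behavior taken from the message is that 0 maps to "a very large number" rather than to 0.

```c
/* Assumed stand-in for "a very large number"; the real value may differ. */
#define SKETCH_LARGE_RENAME_LIMIT 32767

/*
 * Hypothetical helper: translate the value given to -l into the limit
 * that rename detection would actually use. Zero (or a negative value)
 * is treated as "effectively unlimited" instead of "no candidates".
 */
int sketch_effective_rename_limit(int requested)
{
	if (requested <= 0)
		return SKETCH_LARGE_RENAME_LIMIT;
	return requested;
}
```

With this mapping, a script that passes "-l0" keeps getting rename detection, while any positive value is used unchanged.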
  2. 15 Nov, 2017 2 commits
    • diff: remove silent clamp of renameLimit · 9f7e4bfa
      Committed by Elijah Newren
      In commit 0024a549 (Fix the rename detection limit checking; 2007-09-14),
      the renameLimit was clamped to 32767.  This appears to have been to simply
      avoid integer overflow in the following computation:
      
         num_create * num_src <= rename_limit * rename_limit
      
      although it also could be viewed as a hardcoded bound on the amount of CPU
      time we're willing to allow users to tell git to spend on handling
      renames.  An upper bound may make sense, but unfortunately this upper
      bound was neither communicated to the users, nor documented anywhere.
      
      Although large limits can make things slow, we have users who would be
      ecstatic to have a small five file change be correctly cherry picked even
      if they have to manually specify a large limit and wait ten minutes for
      the renames to be detected.
      Signed-off-by: Elijah Newren <newren@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
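One way to evaluate the comparison quoted in this message without clamping the limit is to widen both products to 64 bits before multiplying. The function below is a sketch under that assumption; its name and signature are hypothetical, and it assumes all three inputs are non-negative.

```c
#include <stdint.h>

/*
 * Hypothetical check: returns 1 when num_create * num_src fits within
 * rename_limit * rename_limit. Each operand is converted to uint64_t
 * first, so the products cannot wrap around the way int products would
 * once the limit grows past 32767.
 */
int sketch_within_rename_limit(int num_create, int num_src, int rename_limit)
{
	uint64_t work = (uint64_t)num_create * (uint64_t)num_src;
	uint64_t budget = (uint64_t)rename_limit * (uint64_t)rename_limit;

	return work <= budget;
}
```

For example, a limit of 100000 gives a budget of 10^10, which does not fit in 32 bits but is handled without trouble in 64.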
    • progress: fix progress meters when dealing with lots of work · d6861d02
      Committed by Elijah Newren
      The possibility of setting merge.renameLimit beyond 2^16 raises the
      possibility that the values passed to progress can exceed 2^32.
      Use uint64_t, because it "ought to be enough for anybody".  :-)
      Signed-off-by: Elijah Newren <newren@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  3. 01 Nov, 2017 2 commits
    • diff: make struct diff_flags members lowercase · 0d1e0e78
      Committed by Brandon Williams
      Now that the flags stored in struct diff_flags are being accessed
      directly and not through macros, change all struct members from being
      uppercase to lowercase.
      This conversion is done using the following semantic patch:
      
      	@@
      	expression E;
      	@@
      	- E.RECURSIVE
      	+ E.recursive
      
      	@@
      	expression E;
      	@@
      	- E.TREE_IN_RECURSIVE
      	+ E.tree_in_recursive
      
      	@@
      	expression E;
      	@@
      	- E.BINARY
      	+ E.binary
      
      	@@
      	expression E;
      	@@
      	- E.TEXT
      	+ E.text
      
      	@@
      	expression E;
      	@@
      	- E.FULL_INDEX
      	+ E.full_index
      
      	@@
      	expression E;
      	@@
      	- E.SILENT_ON_REMOVE
      	+ E.silent_on_remove
      
      	@@
      	expression E;
      	@@
      	- E.FIND_COPIES_HARDER
      	+ E.find_copies_harder
      
      	@@
      	expression E;
      	@@
      	- E.FOLLOW_RENAMES
      	+ E.follow_renames
      
      	@@
      	expression E;
      	@@
      	- E.RENAME_EMPTY
      	+ E.rename_empty
      
      	@@
      	expression E;
      	@@
      	- E.HAS_CHANGES
      	+ E.has_changes
      
      	@@
      	expression E;
      	@@
      	- E.QUICK
      	+ E.quick
      
      	@@
      	expression E;
      	@@
      	- E.NO_INDEX
      	+ E.no_index
      
      	@@
      	expression E;
      	@@
      	- E.ALLOW_EXTERNAL
      	+ E.allow_external
      
      	@@
      	expression E;
      	@@
      	- E.EXIT_WITH_STATUS
      	+ E.exit_with_status
      
      	@@
      	expression E;
      	@@
      	- E.REVERSE_DIFF
      	+ E.reverse_diff
      
      	@@
      	expression E;
      	@@
      	- E.CHECK_FAILED
      	+ E.check_failed
      
      	@@
      	expression E;
      	@@
      	- E.RELATIVE_NAME
      	+ E.relative_name
      
      	@@
      	expression E;
      	@@
      	- E.IGNORE_SUBMODULES
      	+ E.ignore_submodules
      
      	@@
      	expression E;
      	@@
      	- E.DIRSTAT_CUMULATIVE
      	+ E.dirstat_cumulative
      
      	@@
      	expression E;
      	@@
      	- E.DIRSTAT_BY_FILE
      	+ E.dirstat_by_file
      
      	@@
      	expression E;
      	@@
      	- E.ALLOW_TEXTCONV
      	+ E.allow_textconv
      
      	@@
      	expression E;
      	@@
      	- E.TEXTCONV_SET_VIA_CMDLINE
      	+ E.textconv_set_via_cmdline
      
      	@@
      	expression E;
      	@@
      	- E.DIFF_FROM_CONTENTS
      	+ E.diff_from_contents
      
      	@@
      	expression E;
      	@@
      	- E.DIRTY_SUBMODULES
      	+ E.dirty_submodules
      
      	@@
      	expression E;
      	@@
      	- E.IGNORE_UNTRACKED_IN_SUBMODULES
      	+ E.ignore_untracked_in_submodules
      
      	@@
      	expression E;
      	@@
      	- E.IGNORE_DIRTY_SUBMODULES
      	+ E.ignore_dirty_submodules
      
      	@@
      	expression E;
      	@@
      	- E.OVERRIDE_SUBMODULE_CONFIG
      	+ E.override_submodule_config
      
      	@@
      	expression E;
      	@@
      	- E.DIRSTAT_BY_LINE
      	+ E.dirstat_by_line
      
      	@@
      	expression E;
      	@@
      	- E.FUNCCONTEXT
      	+ E.funccontext
      
      	@@
      	expression E;
      	@@
      	- E.PICKAXE_IGNORE_CASE
      	+ E.pickaxe_ignore_case
      
      	@@
      	expression E;
      	@@
      	- E.DEFAULT_FOLLOW_RENAMES
      	+ E.default_follow_renames
      Signed-off-by: Brandon Williams <bmwill@google.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • diff: remove DIFF_OPT_TST macro · 3b69daed
      Committed by Brandon Williams
      Remove the `DIFF_OPT_TST` macro and instead access the flags directly.
      This conversion is done using the following semantic patch:
      
      	@@
      	expression E;
      	identifier fld;
      	@@
      	- DIFF_OPT_TST(&E, fld)
      	+ E.flags.fld
      
      	@@
      	type T;
      	T *ptr;
      	identifier fld;
      	@@
      	- DIFF_OPT_TST(ptr, fld)
      	+ ptr->flags.fld
      Signed-off-by: Brandon Williams <bmwill@google.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
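The effect of the two semantic patches above on calling code can be illustrated with a cut-down sketch. The struct and macro names here are hypothetical miniatures (prefixed with "sketch_"), carrying only two of the flags named in the patches; the macro body is an assumed shape, not a copy of the removed definition.

```c
/* Miniature stand-in for struct diff_flags: one bit per option. */
struct sketch_diff_flags {
	unsigned recursive:1;
	unsigned find_copies_harder:1;
};

/* Miniature stand-in for the options struct that embeds the flags. */
struct sketch_diff_options {
	struct sketch_diff_flags flags;
};

/* Old style: a test macro that hides the member access (assumed shape). */
#define SKETCH_DIFF_OPT_TST(opts, fld) ((opts)->flags.fld)

/* New style: callers read the lowercase member directly. */
int sketch_wants_copies(const struct sketch_diff_options *opts)
{
	return opts->flags.find_copies_harder;
}
```

Both forms read the same bit; the direct form simply removes one layer of indirection for the reader.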
  4. 20 Aug, 2017 1 commit
    • progress: simplify "delayed" progress API · 8aade107
      Committed by Junio C Hamano
      We used to expose the full power of the delayed progress API to the
      callers, so that they could specify not just the message to show and
      the expected total amount of work that is used to compute the
      percentage of work performed so far, but also the percent-threshold
      parameter P and the delay-seconds parameter N.  The progress meter
      starts to show at N seconds into the operation only if we have not
      yet completed P percent of the total work.
      
      Most callers used either (0%, 2s) or (50%, 1s) as (P, N), but there
      are oddballs that chose more random-looking values like 95%.
      
      For a smoother workload, (50%, 1s) would allow us to start showing
      the progress meter earlier than (0%, 2s), while keeping the chance
      of not showing progress meter for long running operation the same as
      the latter.  For a task that would take 2s or more to complete, it
      is likely that less than half of it would complete within the first
      second, if the workload is smooth.  But for a spiky workload whose
      earlier part is easier, such a setting is likely to fail to show the
      progress meter entirely and (0%, 2s) is more appropriate.
      
      But that is merely a theory.  Realistically, it is of dubious value
      to ask each codepath to carefully consider smoothness of their
      workload and specify their own setting by passing two extra
      parameters.  Let's simplify the API by dropping both parameters and
      have everybody use (0%, 2s).
      
      Oh, by the way, the percent-threshold parameter and the structure
      member were consistently misspelled, which also is now fixed ;-)
      Helped-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
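The (P, N) rule described in this message can be written down as a small predicate. This is a sketch, not the progress code itself: the function name and parameters are hypothetical, and treating a run that sits exactly at P percent as still eligible for display is an assumption about the boundary case.

```c
/*
 * Hypothetical predicate for a delayed progress meter.
 *   elapsed_sec        seconds since the operation started
 *   done, total        work completed so far and expected total
 *   percent_threshold  the P parameter
 *   delay_sec          the N parameter
 * Returns 1 when the meter should start showing.
 */
int sketch_should_show_progress(unsigned elapsed_sec,
				unsigned long done, unsigned long total,
				unsigned percent_threshold,
				unsigned delay_sec)
{
	unsigned long percent;

	if (elapsed_sec < delay_sec)
		return 0;	/* still inside the initial quiet period */
	if (!total)
		return 1;	/* no total known: nothing to compare against */

	percent = done * 100 / total;
	/* Assumption: exactly P percent still counts as "not past P". */
	return percent <= percent_threshold;
}
```

With (0%, 2s) the meter appears at the two-second mark only when essentially no work has been completed, while (50%, 1s) lets it appear a second earlier as long as the operation is not yet past the halfway point.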
  5. 01 Jul, 2017 1 commit
    • hashmap.h: compare function has access to a data field · 7663cdc8
      Committed by Stefan Beller
      When using the hashmap a common need is to have access to caller provided
      data in the compare function. A couple of times we abuse the keydata field
      to pass in the data needed. This happens for example in patch-ids.c.
      
      This patch changes the function signature of the compare function
      to have one more void pointer available. The pointer given for each
      invocation of the compare function must be defined in the init function
      of the hashmap and is just passed through.
      
      Documentation of this new feature is deferred to a later patch.
      This is a rather mechanical conversion, just adding the new pass-through
      parameter.  However while at it improve the naming of the fields of all
      compare functions used by hashmaps by ensuring unused parameters are
      prefixed with 'unused_' and naming the parameters what they are (instead
      of 'unused' make it 'unused_keydata').
      Signed-off-by: Stefan Beller <sbeller@google.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
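A compare function with an extra pass-through pointer, as described in this message, might look like the sketch below. The typedef, the config struct and the function are hypothetical illustrations (the parameter order is an assumption); the point shown is that per-map settings reach the comparison without being smuggled through keydata.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical signature: cmp_data is fixed when the map is initialised. */
typedef int (*sketch_cmp_fn)(const void *cmp_data, const void *entry,
			     const void *entry_or_key, const void *keydata);

/* Caller-provided data: here, just a case-sensitivity switch. */
struct sketch_cmp_config {
	int ignore_case;
};

/* Returns 0 when the two strings are considered equal, non-zero otherwise. */
int sketch_string_cmp(const void *cmp_data, const void *entry,
		      const void *entry_or_key, const void *unused_keydata)
{
	const struct sketch_cmp_config *cfg = cmp_data;
	const unsigned char *a = entry, *b = entry_or_key;

	(void)unused_keydata;	/* not needed by this comparison */
	for (;; a++, b++) {
		int ca = *a, cb = *b;

		if (cfg && cfg->ignore_case) {
			ca = tolower(ca);
			cb = tolower(cb);
		}
		if (ca != cb)
			return ca < cb ? -1 : 1;
		if (!ca)
			return 0;	/* both strings ended together */
	}
}
```

The same function can then serve a case-sensitive and a case-insensitive map, with the difference carried entirely by the data pointer each map was initialised with.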
  6. 17 Jun, 2017 1 commit
  7. 05 Jun, 2017 1 commit
  8. 02 Jun, 2017 1 commit
  9. 15 Nov, 2016 1 commit
  10. 30 Sep, 2016 1 commit
  11. 02 Aug, 2016 1 commit
  12. 29 Jun, 2016 2 commits
  13. 31 Mar, 2016 1 commit
    • diffcore: fix iteration order of identical files during rename detection · ca4e3ca0
      Committed by SZEDER Gábor
      If the two paths 'dir/A/file' and 'dir/B/file' have identical content
      and the parent directory is renamed, e.g. 'git mv dir other-dir', then
      diffcore reports the following exact renames:
      
          renamed:    dir/B/file -> other-dir/A/file
          renamed:    dir/A/file -> other-dir/B/file
      
      While technically not wrong, this is confusing not only for the user,
      but also for git commands that make decisions based on rename
      information, e.g. 'git log --follow other-dir/A/file' follows
      'dir/B/file' past the rename.
      
      This behavior is a side effect of commit v2.0.0-rc4~8^2~14
      (diffcore-rename.c: simplify finding exact renames, 2013-11-14): the
      hashmap storing sources returns entries from the same bucket, i.e.
      sources matching the current destination, in LIFO order.  Thus the
      iteration first examines 'other-dir/A/file' and 'dir/B/file' and, upon
      finding identical content and basename, reports an exact rename.
      
      Other hashmap users are apparently happy with the current iteration
      order over the entries of a bucket.  Changing the iteration order
      would risk upsetting other hashmap users and would increase the memory
      footprint of each bucket by a pointer to the tail element.
      
      Fill the hashmap with source entries in reverse order to restore the
      original exact rename detection behavior.
      Reported-by: Bill Okara <billokara@gmail.com>
      Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
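The LIFO effect and the reverse-fill workaround described in this message can be reproduced with a plain singly linked bucket chain. The types and function names are hypothetical; this is a sketch of the ordering argument only, not of the hashmap implementation.

```c
#include <stddef.h>

/* One entry in a bucket chain. */
struct sketch_node {
	const char *path;
	struct sketch_node *next;
};

/* Insert at the head: the most recently added entry is visited first. */
void sketch_bucket_push(struct sketch_node **head, struct sketch_node *n)
{
	n->next = *head;
	*head = n;
}

/*
 * Add the sources last-to-first. Because the chain is LIFO, iterating
 * it afterwards yields the sources in their original first-to-last
 * order, without changing the chain itself or storing a tail pointer.
 */
void sketch_fill_reversed(struct sketch_node **head,
			  struct sketch_node *nodes, size_t count)
{
	size_t i;

	for (i = count; i-- > 0; )
		sketch_bucket_push(head, &nodes[i]);
}
```

Filling with 'dir/A/file' then 'dir/B/file' in reverse leaves 'dir/A/file' at the head of the chain, so it is the first source examined.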
  14. 23 Feb, 2016 1 commit
  15. 28 Feb, 2015 2 commits
    • diffcore-rename: avoid processing duplicate destinations · 4d6be03b
      Committed by Jeff King
      The rename code cannot handle an input where we have
      duplicate destinations (i.e., more than one diff_filepair in
      the queue with the same string in its pair->two->path). We
      end up allocating only one slot in the rename_dst mapping.
      If we fill in the diff_filepair for that slot, when we
      re-queue the results, we may queue that filepair multiple
      times. When the diff is finally flushed, the filepair is
      processed and free()d multiple times, leading to heap
      corruption.
      
      This situation should only happen when a tree diff sees
      duplicates in one of the trees (see the added test for a
      detailed example). Rather than handle it, the sanest thing
      is just to turn off rename detection altogether for the
      diff.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • diffcore-rename: split locate_rename_dst into two functions · f98c2f7e
      Committed by Jeff King
      This function manages the mapping of destination pathnames
      to filepairs, and it handles both insertion and lookup. This
      makes the return value a bit confusing, as we return a newly
      created entry (even though no caller cares), and have no
      room to indicate to the caller that an entry already
      existed.
      
      Instead, let's break this up into two distinct functions,
      both backed by a common binary search. The binary search
      will use our normal "return the index if we found something,
      or negative index minus one to show where it would have
      gone" semantics.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
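The return convention mentioned in this message, "the index if we found something, or negative index minus one to show where it would have gone", can be sketched over a sorted array of path strings. The function name and the use of plain strings are assumptions for illustration; the real code operates on its own entry type.

```c
#include <string.h>

/*
 * Hypothetical binary search over a sorted array of paths.
 * Returns the index of `path` when present; otherwise returns
 * -1 - pos, where pos is the index at which `path` would have to be
 * inserted to keep the array sorted. The result is negative exactly
 * when the path is absent, even when pos is 0.
 */
int sketch_find_path(const char **paths, int count, const char *path)
{
	int lo = 0, hi = count;

	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;
		int cmp = strcmp(path, paths[mid]);

		if (!cmp)
			return mid;
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return -1 - lo;
}
```

A lookup function can return "found" for any non-negative result, while an insertion function can recover the slot as -1 - result and report that an entry already existed when the result is non-negative.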
  16. 19 Aug, 2014 1 commit
  17. 08 Jul, 2014 2 commits
    • hashmap: add simplified hashmap_get_from_hash() API · ab73a9d1
      Committed by Karsten Blees
      Hashmap entries are typically looked up by just a key. The hashmap_get()
      API expects an initialized entry structure instead, to support compound
      keys. This flexibility is currently only needed by find_dir_entry() in
      name-hash.c (and compat/win32/fscache.c in the msysgit fork). All other
      (currently five) call sites of hashmap_get() have to set up a near empty
      entry structure, resulting in duplicate code like this:
      
        struct hashmap_entry keyentry;
        hashmap_entry_init(&keyentry, hash(key));
        return hashmap_get(map, &keyentry, key);
      
      Add a hashmap_get_from_hash() API that allows hashmap lookups by just
      specifying the key and its hash code, i.e.:
      
        return hashmap_get_from_hash(map, hash(key), key);
      Signed-off-by: Karsten Blees <blees@dcon.de>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • hashmap: factor out getting a hash code from a SHA1 · 039dc71a
      Committed by Karsten Blees
      Copying the first bytes of a SHA1 is duplicated in six places,
      however, the implications (the actual value would depend on the
      endianness of the platform) are documented only once.
      
      Add a properly documented API for this.
      Signed-off-by: Karsten Blees <blees@dcon.de>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
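A helper of the kind this message describes could be as small as the sketch below. The name is hypothetical and the body is an assumed implementation: it copies the leading bytes of a 20-byte SHA-1 into an unsigned int, which is why the resulting number differs between little-endian and big-endian platforms.

```c
#include <string.h>

/*
 * Hypothetical helper: derive a hash code from the first bytes of a
 * SHA-1. SHA-1 output is already well distributed, so no further
 * mixing is applied. memcpy is used instead of a pointer cast to avoid
 * unaligned reads. The numeric value depends on the platform's byte
 * order, so it is only suitable for in-memory hash tables and must not
 * be stored or exchanged between machines.
 */
unsigned int sketch_sha1_hash(const unsigned char *sha1)
{
	unsigned int hash;

	memcpy(&hash, sha1, sizeof(hash));
	return hash;
}
```

Two object names that share their leading bytes land in the same bucket, which is harmless because the compare function still checks the full SHA-1.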
  18. 04 Mar, 2014 1 commit
  19. 25 Feb, 2014 1 commit
  20. 19 Nov, 2013 3 commits
  21. 17 Mar, 2013 1 commit
  22. 30 Jul, 2012 1 commit
    • diff: do not use null sha1 as a sentinel value · e5450100
      Committed by Jeff King
      The diff code represents paths using the diff_filespec
      struct. This struct has a sha1 to represent the sha1 of the
      content at that path, as well as a sha1_valid member which
      indicates whether its sha1 field is actually useful. If
      sha1_valid is not true, then the filespec represents a
      working tree file (e.g., for the no-index case, or for when
      the index is not up-to-date).
      
      The diff_filespec is only used internally, though. At the
      interfaces to the diff subsystem, callers feed the sha1
      directly, and we create a diff_filespec from it. It's at
      that point that we look at the sha1 and decide whether it is
      valid or not; callers may pass the null sha1 as a sentinel
      value to indicate that it is not.
      
      We should not typically see the null sha1 coming from any
      other source (e.g., in the index itself, or from a tree).
      However, a corrupt tree might have a null sha1, which would
      cause "diff --patch" to accidentally diff the working tree
      version of a file instead of treating it as a blob.
      
      This patch extends the edges of the diff interface to accept
      a "sha1_valid" flag whenever we accept a sha1, and to use
      that flag when creating a filespec. In some cases, this
      means passing the flag through several layers, making the
      code change larger than would be desirable.
      
      One alternative would be to simply die() upon seeing
      corrupted trees with null sha1s. However, this fix more
      directly addresses the problem (while bogus sha1s in a tree
      are probably a bad thing, it is really the sentinel
      confusion sending us down the wrong code path that is what
      makes it devastating). And it means that git is more capable
      of examining and debugging these corrupted trees. For
      example, you can still "diff --raw" such a tree to find out
      when the bogus entry was introduced; you just cannot do a
      "--patch" diff (just as you could not with any other
      corrupted tree, as we do not have any content to diff).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
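The change of interface described in this message, passing an explicit "sha1_valid" flag instead of inferring validity from an all-zero sha1, can be sketched as follows. The struct and function are hypothetical miniatures that borrow only the member names mentioned in the message.

```c
#include <string.h>

/* Miniature stand-in for the filespec fields discussed above. */
struct sketch_filespec {
	unsigned char sha1[20];
	unsigned sha1_valid:1;
};

/*
 * Hypothetical fill function: the caller states explicitly whether the
 * sha1 is usable. An all-zero sha1 read from a corrupt tree therefore
 * stays "valid but bogus" instead of being mistaken for "look at the
 * working tree file".
 */
void sketch_fill_filespec(struct sketch_filespec *spec,
			  const unsigned char *sha1, int sha1_valid)
{
	memcpy(spec->sha1, sha1, sizeof(spec->sha1));
	spec->sha1_valid = sha1_valid ? 1 : 0;
}
```

The cost is the extra argument that has to be threaded through each layer that accepts a sha1, which matches the message's remark that the code change is larger than would be desirable.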
  23. 24 Mar, 2012 1 commit
    • teach diffcore-rename to optionally ignore empty content · 90d43b07
      Committed by Jeff King
      Our rename detection is a heuristic, matching pairs of
      removed and added files with similar or identical content.
      It's unlikely to be wrong when there is actual content to
      compare, and we already take care not to do inexact rename
      detection when there is not enough content to produce good
      results.
      
      However, we always do exact rename detection, even when the
      blob is tiny or empty. It's easy to get false positives with
      an empty blob, simply because it is an obvious content to
      use as a boilerplate (e.g., when telling git that an empty
      directory is worth tracking via an empty .gitignore).
      
      This patch lets callers specify whether or not they are
      interested in using empty files as rename sources and
      destinations. The default is "yes", keeping the original
      behavior. It works by detecting the empty-blob sha1 for
      rename sources and destinations.
      
      One more flexible alternative would be to allow the caller
      to specify a minimum size for a blob to be "interesting" for
      rename detection. But that would catch small boilerplate
      files, not large ones (e.g., if you had the GPL COPYING file
      in many directories).
      
      A better alternative would be to allow a "-rename"
      gitattribute to allow boilerplate files to be marked as
      such. I'll leave the complexity of that solution until such
      time as somebody actually wants it. The complaints we've
      seen so far revolve around empty files, so let's start with
      the simple thing.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
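Detecting the empty-blob sha1, as this message describes, amounts to a fixed 20-byte comparison. The sketch below uses hypothetical names; the constant is the well-known SHA-1 of the empty blob, e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, and the rename_empty parameter mirrors the caller-controlled switch mentioned in the message.

```c
#include <string.h>

/* SHA-1 of the empty blob: e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 */
const unsigned char sketch_empty_blob_sha1[20] = {
	0xe6, 0x9d, 0xe2, 0x9b, 0xb2, 0xd1, 0xd6, 0x43, 0x4b, 0x8b,
	0x29, 0xae, 0x77, 0x5a, 0xd8, 0xc2, 0xe4, 0x8c, 0x53, 0x91
};

/*
 * Hypothetical filter: returns 1 when a file should be left out of the
 * rename source/destination candidates. When rename_empty is set (the
 * default behavior described above), nothing is skipped.
 */
int sketch_skip_for_rename(const unsigned char *sha1, int rename_empty)
{
	if (rename_empty)
		return 0;
	return !memcmp(sha1, sketch_empty_blob_sha1, 20);
}
```

Because the check is on the content hash alone, it catches every empty file regardless of path, but it does nothing for non-empty boilerplate such as a repeated license file.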
  24. 02 Jun, 2011 1 commit
  25. 29 Apr, 2011 1 commit
  26. 24 Mar, 2011 1 commit
    • diffcore-rename: don't consider unmerged path as source · d7c9bf22
      Committed by Martin von Zweigbergk
      Since e9c84099 (diff-index --cached --raw: show tree entry on the LHS for
      unmerged entries., 2007-01-05), an unmerged entry should be detected by
      using DIFF_PAIR_UNMERGED(p), not by noticing both one and two sides of
      the filepair records mode=0 entries. However, it forgot to update some
      parts of the rename detection logic.
      
      This only makes a difference in the "diff --cached" codepath where an
      unmerged filepair carries information on the entries that came from the
      tree.  It probably hasn't been noticed for a long time because nobody
      would run "diff -M" during a conflict resolution, but "git status" uses
      rename detection when it internally runs "diff-index" and "diff-files"
      and gives nonsense results.
      
      In an unmerged pair, "one" side can have a valid filespec to record the
      tree entry (e.g. what's in HEAD) when running "diff --cached". This can
      be used as a rename source to other paths in the index that are not
      unmerged. The path that is unmerged by definition does not have the
      final content yet (i.e. "two" side cannot have a valid filespec), so it
      can never be a rename destination.
      
      Use the DIFF_PAIR_UNMERGED() to detect unmerged filepair correctly, and
      allow the valid "one" side of an unmerged filepair to be considered a
      potential rename source, but never to be considered a rename destination.
      
      Commit message and first two test cases by Junio, the rest by Martin.
      Signed-off-by: Martin von Zweigbergk <martin.von.zweigbergk@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  27. 23 Mar, 2011 3 commits
  28. 22 Feb, 2011 2 commits
  29. 19 Feb, 2011 2 commits