1. 12 12月, 2017 1 次提交
  2. 08 11月, 2017 1 次提交
    • J
      diff: --ignore-cr-at-eol · e9282f02
      Junio C Hamano 提交于
      A new option --ignore-cr-at-eol tells the diff machinery to treat a
      carriage-return at the end of a (complete) line as if it does not
      exist.
      
      Just like other "--ignore-*" options to ignore various kinds of
      whitespace differences, this will help reviewing the real changes
      you made without getting distracted by spurious CRLF<->LF conversion
      made by your editor program.
      Helped-by: NJohannes Schindelin <Johannes.Schindelin@gmx.de>
      [jch: squashed in command line completion by Dscho]
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      e9282f02
  3. 01 11月, 2017 6 次提交
    • B
      diff: make struct diff_flags members lowercase · 0d1e0e78
      Brandon Williams 提交于
      Now that the flags stored in struct diff_flags are being accessed
      directly and not through macros, change all struct members from being
      uppercase to lowercase.
      This conversion is done using the following semantic patch:
      
      	@@
      	expression E;
      	@@
      	- E.RECURSIVE
      	+ E.recursive
      
      	@@
      	expression E;
      	@@
      	- E.TREE_IN_RECURSIVE
      	+ E.tree_in_recursive
      
      	@@
      	expression E;
      	@@
      	- E.BINARY
      	+ E.binary
      
      	@@
      	expression E;
      	@@
      	- E.TEXT
      	+ E.text
      
      	@@
      	expression E;
      	@@
      	- E.FULL_INDEX
      	+ E.full_index
      
      	@@
      	expression E;
      	@@
      	- E.SILENT_ON_REMOVE
      	+ E.silent_on_remove
      
      	@@
      	expression E;
      	@@
      	- E.FIND_COPIES_HARDER
      	+ E.find_copies_harder
      
      	@@
      	expression E;
      	@@
      	- E.FOLLOW_RENAMES
      	+ E.follow_renames
      
      	@@
      	expression E;
      	@@
      	- E.RENAME_EMPTY
      	+ E.rename_empty
      
      	@@
      	expression E;
      	@@
      	- E.HAS_CHANGES
      	+ E.has_changes
      
      	@@
      	expression E;
      	@@
      	- E.QUICK
      	+ E.quick
      
      	@@
      	expression E;
      	@@
      	- E.NO_INDEX
      	+ E.no_index
      
      	@@
      	expression E;
      	@@
      	- E.ALLOW_EXTERNAL
      	+ E.allow_external
      
      	@@
      	expression E;
      	@@
      	- E.EXIT_WITH_STATUS
      	+ E.exit_with_status
      
      	@@
      	expression E;
      	@@
      	- E.REVERSE_DIFF
      	+ E.reverse_diff
      
      	@@
      	expression E;
      	@@
      	- E.CHECK_FAILED
      	+ E.check_failed
      
      	@@
      	expression E;
      	@@
      	- E.RELATIVE_NAME
      	+ E.relative_name
      
      	@@
      	expression E;
      	@@
      	- E.IGNORE_SUBMODULES
      	+ E.ignore_submodules
      
      	@@
      	expression E;
      	@@
      	- E.DIRSTAT_CUMULATIVE
      	+ E.dirstat_cumulative
      
      	@@
      	expression E;
      	@@
      	- E.DIRSTAT_BY_FILE
      	+ E.dirstat_by_file
      
      	@@
      	expression E;
      	@@
      	- E.ALLOW_TEXTCONV
      	+ E.allow_textconv
      
      	@@
      	expression E;
      	@@
      	- E.TEXTCONV_SET_VIA_CMDLINE
      	+ E.textconv_set_via_cmdline
      
      	@@
      	expression E;
      	@@
      	- E.DIFF_FROM_CONTENTS
      	+ E.diff_from_contents
      
      	@@
      	expression E;
      	@@
      	- E.DIRTY_SUBMODULES
      	+ E.dirty_submodules
      
      	@@
      	expression E;
      	@@
      	- E.IGNORE_UNTRACKED_IN_SUBMODULES
      	+ E.ignore_untracked_in_submodules
      
      	@@
      	expression E;
      	@@
      	- E.IGNORE_DIRTY_SUBMODULES
      	+ E.ignore_dirty_submodules
      
      	@@
      	expression E;
      	@@
      	- E.OVERRIDE_SUBMODULE_CONFIG
      	+ E.override_submodule_config
      
      	@@
      	expression E;
      	@@
      	- E.DIRSTAT_BY_LINE
      	+ E.dirstat_by_line
      
      	@@
      	expression E;
      	@@
      	- E.FUNCCONTEXT
      	+ E.funccontext
      
      	@@
      	expression E;
      	@@
      	- E.PICKAXE_IGNORE_CASE
      	+ E.pickaxe_ignore_case
      
      	@@
      	expression E;
      	@@
      	- E.DEFAULT_FOLLOW_RENAMES
      	+ E.default_follow_renames
      Signed-off-by: NBrandon Williams <bmwill@google.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      0d1e0e78
    • B
      diff: remove DIFF_OPT_CLR macro · b2100e52
      Brandon Williams 提交于
      Remove the `DIFF_OPT_CLR` macro and instead set the flags directly.
      This conversion is done using the following semantic patch:
      
      	@@
      	expression E;
      	identifier fld;
      	@@
      	- DIFF_OPT_CLR(&E, fld)
      	+ E.flags.fld = 0
      
      	@@
      	type T;
      	T *ptr;
      	identifier fld;
      	@@
      	- DIFF_OPT_CLR(ptr, fld)
      	+ ptr->flags.fld = 0
      Signed-off-by: NBrandon Williams <bmwill@google.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      b2100e52
    • B
      diff: remove DIFF_OPT_SET macro · 23dcf77f
      Brandon Williams 提交于
      Remove the `DIFF_OPT_SET` macro and instead set the flags directly.
      This conversion is done using the following semantic patch:
      
      	@@
      	expression E;
      	identifier fld;
      	@@
      	- DIFF_OPT_SET(&E, fld)
      	+ E.flags.fld = 1
      
      	@@
      	type T;
      	T *ptr;
      	identifier fld;
      	@@
      	- DIFF_OPT_SET(ptr, fld)
      	+ ptr->flags.fld = 1
      Signed-off-by: NBrandon Williams <bmwill@google.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      23dcf77f
    • B
      diff: remove DIFF_OPT_TST macro · 3b69daed
      Brandon Williams 提交于
      Remove the `DIFF_OPT_TST` macro and instead access the flags directly.
      This conversion is done using the following semantic patch:
      
      	@@
      	expression E;
      	identifier fld;
      	@@
      	- DIFF_OPT_TST(&E, fld)
      	+ E.flags.fld
      
      	@@
      	type T;
      	T *ptr;
      	identifier fld;
      	@@
      	- DIFF_OPT_TST(ptr, fld)
      	+ ptr->flags.fld
      Signed-off-by: NBrandon Williams <bmwill@google.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      3b69daed
    • B
      diff: add flag to indicate textconv was set via cmdline · afa73c53
      Brandon Williams 提交于
      git-show is unique in that it wants to use textconv by default except
      for when it is showing blobs.  When asked to show a blob, show doesn't
      want to use textconv unless the user explicitly requested that it be
      used by providing the command line flag '--textconv'.
      
      Currently this is done by using a parallel set of 'touched' flags which
      get set every time a particular flag is set or cleared.  In a future
      patch we want to eliminate this parallel set of flags so instead of
      relying on if the textconv flag has been touched, add a new flag
      'TEXTCONV_SET_VIA_CMDLINE' which is only set if textconv is set to true
      via the command line.
      Signed-off-by: NBrandon Williams <bmwill@google.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      afa73c53
    • B
      diff: convert flags to be stored in bitfields · 02f2f56b
      Brandon Williams 提交于
      We cannot add many more flags to the diff machinery due to the
      limitations of the number of flags that can be stored in a single
      unsigned int.  In order to allow for more flags to be added to the diff
      machinery in the future this patch converts the flags to be stored in
      bitfields in 'struct diff_flags'.
      Signed-off-by: NBrandon Williams <bmwill@google.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      02f2f56b
  4. 29 10月, 2017 1 次提交
  5. 27 10月, 2017 1 次提交
    • J
      xdiff: reassign xpparm_t.flags bits · 446d12cb
      Junio C Hamano 提交于
      We have packed the bits too tightly in such a way that it is not
      easy to add a new type of whitespace ignoring option, a new type
      of LCS algorithm, or a new type of post-cleanup heuristics.
      
      Reorder bits a bit to give room for these three classes of options
      to grow.  Also make use of XDF_WHITESPACE_FLAGS macro where we check
      any of these bits are on, instead of using DIFF_XDL_TST() macro on
      individual possibilities.  That way, the "is any of the bits on?"
      code does not have to change when we add more ways to ignore
      whitespaces.
      
      While at it, add a comment in front of the bit definitions to
      clarify in which structure these defined bits may appear.
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      446d12cb
  6. 26 10月, 2017 1 次提交
  7. 21 10月, 2017 2 次提交
    • J
      diff: handle NULs in get_string_hash() · b66b5072
      Jeff King 提交于
      For computing moved lines, we feed the characters of each
      line into a hash. When we've been asked to ignore
      whitespace, then we pick each character using next_byte(),
      which returns -1 on end-of-string, which it determines using
      the start/end pointers we feed it.
      
      However our check of its return value treats "0" the same as
      "-1", meaning we'd quit if the string has an embedded NUL.
      This is unlikely to ever come up in practice since our line
      boundaries generally come from calling strlen() in the first
      place.
      
      But it was a bit surprising to me as a reader of the
      next_byte() code. And it's possible that we may one day feed
      this function with more exotic input, which otherwise works
      with arbitrary ptr/len pairs.
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      b66b5072
    • J
      diff: fix whitespace-skipping with --color-moved · da58318e
      Jeff King 提交于
      The code for handling whitespace with --color-moved
      represents partial strings as a pair of pointers. There are
      two possible conventions for the end pointer:
      
        1. It points to the byte right after the end of the
           string.
      
        2. It points to the final byte of the string.
      
      But we seem to use both conventions in the code:
      
        a. we assign the initial pointers from the NUL-terminated
           string using (1)
      
        b. we eat trailing whitespace by checking the second
           pointer for isspace(), which needs (2)
      
        c. the next_byte() function checks for end-of-string with
           "if (cp > endp)", which is (2)
      
        d. in next_byte() we skip past internal whitespace with
           "while (cp < end)", which is (1)
      
      This creates fewer bugs than you might think, because there
      are some subtle interactions. Because of (a) and (c), we
      always return the NUL-terminator from next_byte(). But all
      of the callers of next_byte() happen to handle that
      gracefully.
      
      Because of the mismatch between (d) and (c), next_byte()
      could accidentally return a whitespace character right at
      endp. But because of the interaction of (a) and (b), we fail
      to actually chomp trailing whitespace, meaning our endp
      _always_ points to a NUL, canceling out the problem.
      
      But that does leave (b) as a real bug: when ignoring
      whitespace only at the end-of-line, we don't correctly trim
      it, and fail to match up lines.
      
      We can fix the whole thing by moving consistently to one
      convention. Since convention (1) is idiomatic in our code
      base, we'll pick that one.
      
      The existing "-w" and "-b" tests continue to pass, and a new
      "--ignore-space-at-eol" shows off the breakage we're fixing.
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      da58318e
  8. 17 10月, 2017 1 次提交
    • J
      Revert "color: check color.ui in git_default_config()" · 33c643bb
      Jeff King 提交于
      This reverts commit 136c8c8b.
      
      That commit was trying to address a bug caused by 4c7f1819
      (make color.ui default to 'auto', 2013-06-10), in which
      plumbing like diff-tree defaulted to "auto" color, but did
      not respect a "color.ui" directive to disable it.
      
      But it also meant that we started respecting "color.ui" set
      to "always". This was a known problem, but 4c7f1819 argued
      that nobody ought to be doing that. However, that turned out
      to be wrong, and we got a number of bug reports related to
      "add -p" regressing in v2.14.2.
      
      Let's revert 136c8c8b, fixing the regression to "add -p".
      This leaves the problem from 4c7f1819 unfixed, but:
      
        1. It's a pretty obscure problem in the first place. I
           only noticed it while working on the color code, and we
           haven't got a single bug report or complaint about it.
      
        2. We can make a more moderate fix on top by respecting
           "never" but not "always" for plumbing commands. This
           is just the minimal fix to go back to the working state
           we had before v2.14.2.
      
      Note that this isn't a pure revert. We now have a test in
      t3701 which shows off the "add -p" regression. This can be
      flipped to success.
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      33c643bb
  9. 16 10月, 2017 1 次提交
    • J
      diff: fix infinite loop with --color-moved --ignore-space-change · fa5ba2c1
      Jeff King 提交于
      The --color-moved code uses next_byte() to advance through
      the blob contents. When the user has asked to ignore
      whitespace changes, we try to collapse any whitespace change
      down to a single space.
      
      However, we enter the conditional block whenever we see the
      IGNORE_WHITESPACE_CHANGE flag, even if the next byte isn't
      whitespace.
      
      This means that the combination of "--color-moved and
      --ignore-space-change" was completely broken. Worse, because
      we return from next_byte() without having advanced our
      pointer, the function makes no forward progress in the
      buffer and loops infinitely.
      
      Fix this by entering the conditional only when we actually
      see whitespace. We can apply this also to the
      IGNORE_WHITESPACE change. That code path isn't buggy
      (because it falls through to returning the next
      non-whitespace byte), but it makes the logic more clear if
      we only bother to look at whitespace flags after seeing that
      the next byte is whitespace.
      Reported-by: NOrgad Shaneh <orgads@gmail.com>
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      fa5ba2c1
  10. 28 9月, 2017 1 次提交
  11. 22 9月, 2017 1 次提交
  12. 14 9月, 2017 1 次提交
    • J
      avoid "write_in_full(fd, buf, len) != len" pattern · 06f46f23
      Jeff King 提交于
      The return value of write_in_full() is either "-1", or the
      requested number of bytes[1]. If we make a partial write
      before seeing an error, we still return -1, not a partial
      value. This goes back to f6aa66cb (write_in_full: really
      write in full or return error on disk full., 2007-01-11).
      
      So checking anything except "was the return value negative"
      is pointless. And there are a couple of reasons not to do
      so:
      
        1. It can do a funny signed/unsigned comparison. If your
           "len" is signed (e.g., a size_t) then the compiler will
           promote the "-1" to its unsigned variant.
      
           This works out for "!= len" (unless you really were
           trying to write the maximum size_t bytes), but is a
           bug if you check "< len" (an example of which was fixed
           recently in config.c).
      
           We should avoid promoting the mental model that you
           need to check the length at all, so that new sites are
           not tempted to copy us.
      
        2. Checking for a negative value is shorter to type,
           especially when the length is an expression.
      
        3. Linus says so. In d34cf19b (Clean up write_in_full()
           users, 2007-01-11), right after the write_in_full()
           semantics were changed, he wrote:
      
             I really wish every "write_in_full()" user would just
             check against "<0" now, but this fixes the nasty and
             stupid ones.
      
           Appeals to authority aside, this makes it clear that
           writing it this way does not have an intentional
           benefit. It's a historical curiosity that we never
           bothered to clean up (and which was undoubtedly
           cargo-culted into new sites).
      
      So let's convert these obviously-correct cases (this
      includes write_str_in_full(), which is just a wrapper for
      write_in_full()).
      
      [1] A careful reader may notice there is one way that
          write_in_full() can return a different value. If we ask
          write() to write N bytes and get a return value that is
          _larger_ than N, we could return a larger total. But
          besides the fact that this would imply a totally broken
          version of write(), it would already invoke undefined
          behavior. Our internal remaining counter is an unsigned
          size_t, which means that subtracting too many byte will
          wrap it around to a very large number. So we'll instantly
          begin reading off the end of the buffer, trying to write
          gigabytes (or petabytes) of data.
      Signed-off-by: NJeff King <peff@peff.net>
      Reviewed-by: NJonathan Nieder <jrnieder@gmail.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      06f46f23
  13. 07 9月, 2017 3 次提交
  14. 06 9月, 2017 3 次提交
    • J
      tempfile: auto-allocate tempfiles on heap · 076aa2cb
      Jeff King 提交于
      The previous commit taught the tempfile code to give up
      ownership over tempfiles that have been renamed or deleted.
      That makes it possible to use a stack variable like this:
      
        struct tempfile t;
      
        create_tempfile(&t, ...);
        ...
        if (!err)
                rename_tempfile(&t, ...);
        else
                delete_tempfile(&t);
      
      But doing it this way has a high potential for creating
      memory errors. The tempfile we pass to create_tempfile()
      ends up on a global linked list, and it's not safe for it to
      go out of scope until we've called one of those two
      deactivation functions.
      
      Imagine that we add an early return from the function that
      forgets to call delete_tempfile(). With a static or heap
      tempfile variable, the worst case is that the tempfile hangs
      around until the program exits (and some functions like
      setup_shallow_temporary rely on this intentionally, creating
      a tempfile and then leaving it for later cleanup).
      
      But with a stack variable as above, this is a serious memory
      error: the variable goes out of scope and may be filled with
      garbage by the time the tempfile code looks at it.  Let's
      see if we can make it harder to get this wrong.
      
      Since many callers need to allocate arbitrary numbers of
      tempfiles, we can't rely on static storage as a general
      solution. So we need to turn to the heap. We could just ask
      all callers to pass us a heap variable, but that puts the
      burden on them to call free() at the right time.
      
      Instead, let's have the tempfile code handle the heap
      allocation _and_ the deallocation (when the tempfile is
      deactivated and removed from the list).
      
      This changes the return value of all of the creation
      functions. For the cleanup functions (delete and rename),
      we'll add one extra bit of safety: instead of taking a
      tempfile pointer, we'll take a pointer-to-pointer and set it
      to NULL after freeing the object. This makes it safe to
      double-call functions like delete_tempfile(), as the second
      call treats the NULL input as a noop. Several callsites
      follow this pattern.
      
      The resulting patch does have a fair bit of noise, as each
      caller needs to be converted to handle:
      
        1. Storing a pointer instead of the struct itself.
      
        2. Passing the pointer instead of taking the struct
           address.
      
        3. Handling a "struct tempfile *" return instead of a file
           descriptor.
      
      We could play games to make this less noisy. For example, by
      defining the tempfile like this:
      
        struct tempfile {
      	struct heap_allocated_part_of_tempfile {
                      int fd;
                      ...etc
              } *actual_data;
        }
      
      Callers would continue to have a "struct tempfile", and it
      would be "active" only when the inner pointer was non-NULL.
      But that just makes things more awkward in the long run.
      There aren't that many callers, so we can simply bite
      the bullet and adjust all of them. And the compiler makes it
      easy for us to find them all.
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      076aa2cb
    • J
      tempfile: do not delete tempfile on failed close · 49bd0fc2
      Jeff King 提交于
      When close_tempfile() fails, we delete the tempfile and
      reset the fields of the tempfile struct. This makes it
      easier for callers to return without cleaning up, but it
      also makes this common pattern:
      
        if (close_tempfile(tempfile))
      	return error_errno("error closing %s", tempfile->filename.buf);
      
      wrong, because the "filename" field has been reset after the
      failed close. And it's not easy to fix, as in many cases we
      don't have another copy of the filename (e.g., if it was
      created via one of the mks_tempfile functions, and we just
      have the original template string).
      
      Let's drop the feature that a failed close automatically
      deletes the file. This puts the burden on the caller to do
      the deletion themselves, but this isn't that big a deal.
      Callers which do:
      
        if (write(...) || close_tempfile(...)) {
      	delete_tempfile(...);
      	return -1;
        }
      
      already had to call delete when the write() failed, and so
      aren't affected. Likewise, any caller which just calls die()
      in the error path is OK; we'll delete the tempfile during
      the atexit handler.
      
      Because this patch changes the semantics of close_tempfile()
      without changing its signature, all callers need to be
      manually checked and converted to the new scheme. This patch
      covers all in-tree callers, but there may be others for
      not-yet-merged topics. To catch these, we rename the
      function to close_tempfile_gently(), which will attract
      compile-time attention to new callers. (Technically the
      original could be considered "gentle" already in that it
      didn't die() on errors, but this one is even more so).
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      49bd0fc2
    • J
      always check return value of close_tempfile · 45c6b1ed
      Jeff King 提交于
      If close_tempfile() encounters an error, then it deletes the
      tempfile and resets the "struct tempfile". But many code
      paths ignore the return value and continue to use the
      tempfile. Instead, we should generally treat this the same
      as a write() error.
      
      Note that in the postimage of some of these cases our error
      message will be bogus after a failed close because we look
      at tempfile->filename (either directly or via get_tempfile_path).
      But after the failed close resets the tempfile object, this
      is guaranteed to be the empty string. That will be addressed
      in a future patch (because there are many more cases of the
      same problem than just these instances).
      
      Note also in the hunk in gpg-interface.c that it's fine to
      call delete_tempfile() in the error path, even if
      close_tempfile() failed and already deleted the file. The
      tempfile code is smart enough to know the second deletion is
      a noop.
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      45c6b1ed
  15. 24 8月, 2017 1 次提交
  16. 21 8月, 2017 1 次提交
  17. 18 8月, 2017 1 次提交
    • J
      diff: retire sane_truncate_fn · 08a8509e
      Junio C Hamano 提交于
      Long time ago, 23707811 ("diff: do not chomp hunk-header in the
      middle of a character", 2008-01-02) introduced sane_truncate_line()
      helper function to trim the "function header" line that is shown at
      the end of the hunk header line, in order to avoid chomping it in
      the middle of a single UTF-8 character.  It also added a facility to
      define a custom callback function to make it possible to extend it
      to non UTF-8 encodings.
      
      During the following 8 1/2 years, nobody found need for this custom
      callback facility.
      
      A custom callback function is a wrong design to use here anyway---if
      your contents need support for non UTF-8 encoding, you shouldn't
      have to write a custom function and recompile Git to plumb it in.  A
      better approach would be to extend sane_truncate_line() function and
      have a new member in emit_callback to conditionally trigger it.
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      08a8509e
  18. 17 8月, 2017 2 次提交
    • J
      diff: define block by number of alphanumeric chars · f0b8fb6e
      Jonathan Tan 提交于
      The existing behavior of diff --color-moved=zebra does not define the
      minimum size of a block at all, instead relying on a heuristic applied
      later to filter out sets of adjacent moved lines that are shorter than 3
      lines long. This can be confusing, because a block could thus be colored
      as moved at the source but not at the destination (or vice versa),
      depending on its neighbors.
      
      Instead, teach diff that the minimum size of a block is 20 alphanumeric
      characters, the same heuristic used by "git blame". This allows diff to
      still exclude uninteresting lines appearing on their own (such as those
      solely consisting of one or a few closing braces), as was the intention
      of the adjacent-moved-line heuristic.
      
      This requires a change in some tests in that some of their lines are no
      longer considered to be part of a block, because they are too short.
      Signed-off-by: NJonathan Tan <jonathantanmy@google.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      f0b8fb6e
    • J
      diff: respect MIN_BLOCK_LENGTH for last block · 09153277
      Jonathan Tan 提交于
      Currently, MIN_BLOCK_LENGTH is only checked when diff encounters a line
      that does not belong to the current block. In particular, this means
      that MIN_BLOCK_LENGTH is not checked after all lines are encountered.
      
      Perform that check.
      Signed-off-by: NJonathan Tan <jonathantanmy@google.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      09153277
  19. 15 8月, 2017 1 次提交
  20. 04 8月, 2017 1 次提交
  21. 14 7月, 2017 1 次提交
    • J
      color: check color.ui in git_default_config() · 136c8c8b
      Jeff King 提交于
      Back in prehistoric times, our decision on whether or not to
      show color by default relied on using a config callback that
      either did or didn't load color config like color.diff.
      When we introduced color.ui, we put it in the same boat:
      commands had to manually respect it by using git_color_config()
      or its git_color_default_config() convenience wrapper.
      
      But in 4c7f1819 (make color.ui default to 'auto',
      2013-06-10), that changed. Since then, we default color.ui
      to auto in all programs, meaning that even plumbing commands
      like "git diff-tree --pretty" might colorize the output.
      Nobody seems to have complained in the intervening years,
      presumably because the "is stdout a tty" check does a good
      job of catching the right cases.
      
      But that leaves an interesting curiosity: color.ui defaults
      to auto even in plumbing, but you can't actually _disable_
      the color via config. So if you really hate color and set
      "color.ui" to false, diff-tree will still show color (but
      porcelain like git-diff won't).  Nobody noticed that either,
      probably because very few people disable color.
      
      One could argue that the plumbing should _always_ disable
      color unless an explicit --color option is given on the
      command line. But in practice, this creates a lot of
      complications for scripts which do want plumbing to show
      user-visible output. They can't just pass "--color" blindly;
      they need to check the user's config and decide what to
      send.
      
      Given that nobody has complained about the current behavior,
      let's assume it's a good path, and follow it to its
      conclusion: supporting color.ui everywhere.
      
      Note that you can create havoc by setting color.ui=always in
      your config, but that's more or less already the case. We
      could disallow it entirely, but it is handy for one-offs
      like:
      
        git -c color.ui=always foo >not-a-tty
      
      when "foo" does not take a --color option itself.
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      136c8c8b
  22. 11 7月, 2017 1 次提交
  23. 01 7月, 2017 7 次提交
    • S
      diff.c: add dimming to moved line detection · 86b452e2
      Stefan Beller 提交于
      Any lines inside a moved block of code are not interesting. Boundaries
      of blocks are only interesting if they are next to another block of moved
      code.
      Signed-off-by: NStefan Beller <sbeller@google.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      86b452e2
    • S
      diff.c: color moved lines differently, plain mode · 176841f0
      Stefan Beller 提交于
      Add the 'plain' mode for move detection of code. This omits the checking
      for adjacent blocks, so it is not as useful. If you have a lot of the
      same blocks moved in the same patch, the 'Zebra' would end up slow as it
      is O(n^2) (n is number of same blocks). So this may be useful there and
      is generally easy to add. Instead be very literal at the move detection,
      do not skip over short blocks here.
      Signed-off-by: NStefan Beller <sbeller@google.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      176841f0
    • S
      diff.c: color moved lines differently · 2e2d5ac1
      Stefan Beller 提交于
      When a patch consists mostly of moving blocks of code around, it can
      be quite tedious to ensure that the blocks are moved verbatim, and not
      undesirably modified in the move. To that end, color blocks that are
      moved within the same patch differently. For example (OM, del, add,
      and NM are different colors):
      
          [OM]  -void sensitive_stuff(void)
          [OM]  -{
          [OM]  -        if (!is_authorized_user())
          [OM]  -                die("unauthorized");
          [OM]  -        sensitive_stuff(spanning,
          [OM]  -                        multiple,
          [OM]  -                        lines);
          [OM]  -}
      
                 void another_function()
                 {
          [del] -        printf("foo");
          [add] +        printf("bar");
                 }
      
          [NM]  +void sensitive_stuff(void)
          [NM]  +{
          [NM]  +        if (!is_authorized_user())
          [NM]  +                die("unauthorized");
          [NM]  +        sensitive_stuff(spanning,
          [NM]  +                        multiple,
          [NM]  +                        lines);
          [NM]  +}
      
      However adjacent blocks may be problematic. For example, in this
      potentially malicious patch, the swapping of blocks can be spotted:
      
          [OM]  -void sensitive_stuff(void)
          [OM]  -{
          [OMA] -        if (!is_authorized_user())
          [OMA] -                die("unauthorized");
          [OM]  -        sensitive_stuff(spanning,
          [OM]  -                        multiple,
          [OM]  -                        lines);
          [OMA] -}
      
                 void another_function()
                 {
          [del] -        printf("foo");
          [add] +        printf("bar");
                 }
      
          [NM]  +void sensitive_stuff(void)
          [NM]  +{
          [NMA] +        sensitive_stuff(spanning,
          [NMA] +                        multiple,
          [NMA] +                        lines);
          [NM]  +        if (!is_authorized_user())
          [NM]  +                die("unauthorized");
          [NMA] +}
      
      If the moved code is larger, it is easier to hide some permutation in the
      code, which is why some alternative coloring is needed.
      
      This patch implements the first mode:
      * basic alternating 'Zebra' mode
        This conveys all information needed to the user.  Defer customization to
        later patches.
      
      First I implemented an alternative design, which would try to fingerprint
      a line by its neighbors to detect if we are in a block or at the boundary.
      This idea iss error prone as it inspected each line and its neighboring
      lines to determine if the line was (a) moved and (b) if was deep inside
      a hunk by having matching neighboring lines. This is unreliable as the
      we can construct hunks which have equal neighbors that just exceed the
      number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter
      as a line, that is permutated to AXYZCXYZBXYZD..').
      
      Instead this provides a dynamic programming greedy algorithm that finds
      the largest moved hunk and then has several modes on highlighting bounds.
      
      A note on the options '--submodule=diff' and '--color-words/--word-diff':
      In the conversion to use emit_line in the prior patches both submodules
      as well as word diff output carefully chose to call emit_line with sign=0.
      All output with sign=0 is ignored for move detection purposes in this
      patch, such that no weird looking output will be generated for these
      cases. This leads to another thought: We could pass on '--color-moved' to
      submodules such that they color up moved lines for themselves. If we'd do
      so only line moves within a repository boundary are marked up.
      
      It is useful to have moved lines colored, but there are annoying corner
      cases, such as a single line moved, that is very common. For example
      in a typical patch of C code, we have closing braces that end statement
      blocks or functions.
      
      While it is technically true that these lines are moved as they show up
      elsewhere, it is harmful for the review as the reviewers attention is
      drawn to such a minor side annoyance.
      
      For now let's have a simple solution of hardcoding the number of
      moved lines to be at least 3 before coloring them. Note, that the
      length is applied across all blocks to find the 'lonely' blocks
      that pollute new code, but do not interfere with a permutated
      block where each permutation has less lines than 3.
      Helped-by: NJonathan Tan <jonathantanmy@google.com>
      Signed-off-by: NStefan Beller <sbeller@google.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      2e2d5ac1
    • S
      diff.c: buffer all output if asked to · e6e045f8
      Stefan Beller 提交于
      Introduce a new option 'emitted_symbols' in the struct diff_options which
      controls whether all output is buffered up until all output is available.
      It is set internally in diff.c when necessary.
      
      We'll have a new struct 'emitted_string' in diff.c which will be used to
      buffer each line.  The emitted_string will duplicate the memory of the
      line to buffer as that is easiest to reason about for now. In a future
      patch we may want to decrease the memory usage by not duplicating all
      output for buffering but rather we may want to store offsets into the
      file or in case of hunk descriptions such as the similarity score, we
      could just store the relevant number and reproduce the text later on.
      
      This approach was chosen as a first step because it is quite simple
      compared to the alternative with less memory footprint.
      
      emit_diff_symbol factors out the emission part and depending on the
      diff_options->emitted_symbols the emission will be performed directly
      when calling emit_diff_symbol or after the whole process is done, i.e.
      by buffering we have add the possibility for a second pass over the
      whole output before doing the actual output.
      
      In 6440d341 (2012-03-14, diff: tweak a _copy_ of diff_options with
      word-diff) we introduced a duplicate diff options struct for word
      emissions as we may have different regex settings in there.
      When buffering the output, we need to operate on just one buffer,
      so we have to copy back the emissions of the word buffer into the
      main buffer.
      
      Unconditionally enable output via buffer in this patch as it yields
      a great opportunity for testing, i.e. all the diff tests from the
      test suite pass without having reordering issues (i.e. only parts
      of the output got buffered, and we forgot to buffer other parts).
      The test suite passes, which gives confidence that we converted all
      functions to use emit_string for output.
      Signed-off-by: NStefan Beller <sbeller@google.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      e6e045f8
    • S
      146fdb0d
    • S
      30b7e1e7
    • S
      diff.c: convert word diffing to use emit_diff_symbol · bd033291
      Stefan Beller 提交于
      The word diffing is not line oriented and would need some serious
      effort to be transformed into a line oriented approach, so
      just go with a symbol DIFF_SYMBOL_WORD_DIFF that is a partial line.
      Signed-off-by: NStefan Beller <sbeller@google.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      bd033291