提交 · 948cbe6703d0d1e3ba65fd10037bafd1b8b87696 · 李少辉-开发者 / git

12 12月, 2017 1 次提交

diff: use skip_to_optional_arg() · 948cbe67

由 Christian Couder 提交于 12月 09, 2017

Let's simplify diff option parsing using skip_to_optional_arg().
Signed-off-by: NChristian Couder <chriscool@tuxfamily.org>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

948cbe67

08 11月, 2017 1 次提交

diff: --ignore-cr-at-eol · e9282f02

由 Junio C Hamano 提交于 10月 26, 2017

A new option --ignore-cr-at-eol tells the diff machinery to treat a
carriage-return at the end of a (complete) line as if it does not
exist.

Just like other "--ignore-*" options to ignore various kinds of
whitespace differences, this will help reviewing the real changes
you made without getting distracted by spurious CRLF<->LF conversion
made by your editor program.
Helped-by: NJohannes Schindelin <Johannes.Schindelin@gmx.de>
[jch: squashed in command line completion by Dscho]
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

e9282f02

01 11月, 2017 6 次提交

diff: make struct diff_flags members lowercase · 0d1e0e78

由 Brandon Williams 提交于 10月 31, 2017

Now that the flags stored in struct diff_flags are being accessed
directly and not through macros, change all struct members from being
uppercase to lowercase.
This conversion is done using the following semantic patch:

	@@
	expression E;
	@@
	- E.RECURSIVE
	+ E.recursive

	@@
	expression E;
	@@
	- E.TREE_IN_RECURSIVE
	+ E.tree_in_recursive

	@@
	expression E;
	@@
	- E.BINARY
	+ E.binary

	@@
	expression E;
	@@
	- E.TEXT
	+ E.text

	@@
	expression E;
	@@
	- E.FULL_INDEX
	+ E.full_index

	@@
	expression E;
	@@
	- E.SILENT_ON_REMOVE
	+ E.silent_on_remove

	@@
	expression E;
	@@
	- E.FIND_COPIES_HARDER
	+ E.find_copies_harder

	@@
	expression E;
	@@
	- E.FOLLOW_RENAMES
	+ E.follow_renames

	@@
	expression E;
	@@
	- E.RENAME_EMPTY
	+ E.rename_empty

	@@
	expression E;
	@@
	- E.HAS_CHANGES
	+ E.has_changes

	@@
	expression E;
	@@
	- E.QUICK
	+ E.quick

	@@
	expression E;
	@@
	- E.NO_INDEX
	+ E.no_index

	@@
	expression E;
	@@
	- E.ALLOW_EXTERNAL
	+ E.allow_external

	@@
	expression E;
	@@
	- E.EXIT_WITH_STATUS
	+ E.exit_with_status

	@@
	expression E;
	@@
	- E.REVERSE_DIFF
	+ E.reverse_diff

	@@
	expression E;
	@@
	- E.CHECK_FAILED
	+ E.check_failed

	@@
	expression E;
	@@
	- E.RELATIVE_NAME
	+ E.relative_name

	@@
	expression E;
	@@
	- E.IGNORE_SUBMODULES
	+ E.ignore_submodules

	@@
	expression E;
	@@
	- E.DIRSTAT_CUMULATIVE
	+ E.dirstat_cumulative

	@@
	expression E;
	@@
	- E.DIRSTAT_BY_FILE
	+ E.dirstat_by_file

	@@
	expression E;
	@@
	- E.ALLOW_TEXTCONV
	+ E.allow_textconv

	@@
	expression E;
	@@
	- E.TEXTCONV_SET_VIA_CMDLINE
	+ E.textconv_set_via_cmdline

	@@
	expression E;
	@@
	- E.DIFF_FROM_CONTENTS
	+ E.diff_from_contents

	@@
	expression E;
	@@
	- E.DIRTY_SUBMODULES
	+ E.dirty_submodules

	@@
	expression E;
	@@
	- E.IGNORE_UNTRACKED_IN_SUBMODULES
	+ E.ignore_untracked_in_submodules

	@@
	expression E;
	@@
	- E.IGNORE_DIRTY_SUBMODULES
	+ E.ignore_dirty_submodules

	@@
	expression E;
	@@
	- E.OVERRIDE_SUBMODULE_CONFIG
	+ E.override_submodule_config

	@@
	expression E;
	@@
	- E.DIRSTAT_BY_LINE
	+ E.dirstat_by_line

	@@
	expression E;
	@@
	- E.FUNCCONTEXT
	+ E.funccontext

	@@
	expression E;
	@@
	- E.PICKAXE_IGNORE_CASE
	+ E.pickaxe_ignore_case

	@@
	expression E;
	@@
	- E.DEFAULT_FOLLOW_RENAMES
	+ E.default_follow_renames
Signed-off-by: NBrandon Williams <bmwill@google.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

0d1e0e78

diff: remove DIFF_OPT_CLR macro · b2100e52

由 Brandon Williams 提交于 10月 31, 2017

Remove the `DIFF_OPT_CLR` macro and instead set the flags directly.
This conversion is done using the following semantic patch:

	@@
	expression E;
	identifier fld;
	@@
	- DIFF_OPT_CLR(&E, fld)
	+ E.flags.fld = 0

	@@
	type T;
	T *ptr;
	identifier fld;
	@@
	- DIFF_OPT_CLR(ptr, fld)
	+ ptr->flags.fld = 0
Signed-off-by: NBrandon Williams <bmwill@google.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

b2100e52

diff: remove DIFF_OPT_SET macro · 23dcf77f

由 Brandon Williams 提交于 10月 31, 2017

Remove the `DIFF_OPT_SET` macro and instead set the flags directly.
This conversion is done using the following semantic patch:

	@@
	expression E;
	identifier fld;
	@@
	- DIFF_OPT_SET(&E, fld)
	+ E.flags.fld = 1

	@@
	type T;
	T *ptr;
	identifier fld;
	@@
	- DIFF_OPT_SET(ptr, fld)
	+ ptr->flags.fld = 1
Signed-off-by: NBrandon Williams <bmwill@google.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

23dcf77f

diff: remove DIFF_OPT_TST macro · 3b69daed

由 Brandon Williams 提交于 10月 31, 2017

Remove the `DIFF_OPT_TST` macro and instead access the flags directly.
This conversion is done using the following semantic patch:

	@@
	expression E;
	identifier fld;
	@@
	- DIFF_OPT_TST(&E, fld)
	+ E.flags.fld

	@@
	type T;
	T *ptr;
	identifier fld;
	@@
	- DIFF_OPT_TST(ptr, fld)
	+ ptr->flags.fld
Signed-off-by: NBrandon Williams <bmwill@google.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

3b69daed

diff: add flag to indicate textconv was set via cmdline · afa73c53

由 Brandon Williams 提交于 10月 31, 2017

git-show is unique in that it wants to use textconv by default except
for when it is showing blobs.  When asked to show a blob, show doesn't
want to use textconv unless the user explicitly requested that it be
used by providing the command line flag '--textconv'.

Currently this is done by using a parallel set of 'touched' flags which
get set every time a particular flag is set or cleared.  In a future
patch we want to eliminate this parallel set of flags so instead of
relying on if the textconv flag has been touched, add a new flag
'TEXTCONV_SET_VIA_CMDLINE' which is only set if textconv is set to true
via the command line.
Signed-off-by: NBrandon Williams <bmwill@google.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

afa73c53

diff: convert flags to be stored in bitfields · 02f2f56b

由 Brandon Williams 提交于 10月 31, 2017

We cannot add many more flags to the diff machinery due to the
limitations of the number of flags that can be stored in a single
unsigned int.  In order to allow for more flags to be added to the diff
machinery in the future this patch converts the flags to be stored in
bitfields in 'struct diff_flags'.
Signed-off-by: NBrandon Williams <bmwill@google.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

02f2f56b

29 10月, 2017 1 次提交

diff: fix lstat() error handling in diff_populate_filespec() · 10e0ca84

由 Andrey Okoshkin 提交于 10月 27, 2017

Add lstat() error handling not only for ENOENT case.
Otherwise uninitialised 'struct stat st' variable is used later in case of
lstat() non-ENOENT failure which leads to processing of rubbish values of
file mode ('S_ISLNK(st.st_mode)' check) or size ('xsize_t(st.st_size)').
Signed-off-by: NAndrey Okoshkin <a.okoshkin@samsung.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

10e0ca84

27 10月, 2017 1 次提交

xdiff: reassign xpparm_t.flags bits · 446d12cb

由 Junio C Hamano 提交于 10月 26, 2017

We have packed the bits too tightly in such a way that it is not
easy to add a new type of whitespace ignoring option, a new type
of LCS algorithm, or a new type of post-cleanup heuristics.

Reorder bits a bit to give room for these three classes of options
to grow.  Also make use of XDF_WHITESPACE_FLAGS macro where we check
any of these bits are on, instead of using DIFF_XDL_TST() macro on
individual possibilities.  That way, the "is any of the bits on?"
code does not have to change when we add more ways to ignore
whitespaces.

While at it, add a comment in front of the bit definitions to
clarify in which structure these defined bits may appear.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

446d12cb

26 10月, 2017 1 次提交

diff.c: get rid of duplicate implementation · 01be97c2

由 Stefan Beller 提交于 10月 25, 2017

The implementations in diff.c to detect moved lines needs to compare
strings and hash strings, which is implemented in that file, as well
as in the xdiff library.

Remove the rather recent implementation in diff.c and rely on the well
exercised code in the xdiff lib.

With this change the hash used for bucketing the strings for the moved
line detection changes from FNV32 (that is provided via the hashmaps
memhash) to DJB2 (which is used internally in xdiff).  Benchmarks found
on the web[1] do not indicate that these hashes are different in
performance for readable strings.

[1] https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speedSigned-off-by: NStefan Beller <sbeller@google.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

01be97c2

21 10月, 2017 2 次提交

diff: handle NULs in get_string_hash() · b66b5072

由 Jeff King 提交于 10月 19, 2017

For computing moved lines, we feed the characters of each
line into a hash. When we've been asked to ignore
whitespace, then we pick each character using next_byte(),
which returns -1 on end-of-string, which it determines using
the start/end pointers we feed it.

However our check of its return value treats "0" the same as
"-1", meaning we'd quit if the string has an embedded NUL.
This is unlikely to ever come up in practice since our line
boundaries generally come from calling strlen() in the first
place.

But it was a bit surprising to me as a reader of the
next_byte() code. And it's possible that we may one day feed
this function with more exotic input, which otherwise works
with arbitrary ptr/len pairs.
Signed-off-by: NJeff King <peff@peff.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

b66b5072

diff: fix whitespace-skipping with --color-moved · da58318e

由 Jeff King 提交于 10月 19, 2017

The code for handling whitespace with --color-moved
represents partial strings as a pair of pointers. There are
two possible conventions for the end pointer:

  1. It points to the byte right after the end of the
     string.

  2. It points to the final byte of the string.

But we seem to use both conventions in the code:

  a. we assign the initial pointers from the NUL-terminated
     string using (1)

  b. we eat trailing whitespace by checking the second
     pointer for isspace(), which needs (2)

  c. the next_byte() function checks for end-of-string with
     "if (cp > endp)", which is (2)

  d. in next_byte() we skip past internal whitespace with
     "while (cp < end)", which is (1)

This creates fewer bugs than you might think, because there
are some subtle interactions. Because of (a) and (c), we
always return the NUL-terminator from next_byte(). But all
of the callers of next_byte() happen to handle that
gracefully.

Because of the mismatch between (d) and (c), next_byte()
could accidentally return a whitespace character right at
endp. But because of the interaction of (a) and (b), we fail
to actually chomp trailing whitespace, meaning our endp
_always_ points to a NUL, canceling out the problem.

But that does leave (b) as a real bug: when ignoring
whitespace only at the end-of-line, we don't correctly trim
it, and fail to match up lines.

We can fix the whole thing by moving consistently to one
convention. Since convention (1) is idiomatic in our code
base, we'll pick that one.

The existing "-w" and "-b" tests continue to pass, and a new
"--ignore-space-at-eol" shows off the breakage we're fixing.
Signed-off-by: NJeff King <peff@peff.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

da58318e

17 10月, 2017 1 次提交

Revert "color: check color.ui in git_default_config()" · 33c643bb

由 Jeff King 提交于 10月 13, 2017

This reverts commit 136c8c8b.

That commit was trying to address a bug caused by 4c7f1819
(make color.ui default to 'auto', 2013-06-10), in which
plumbing like diff-tree defaulted to "auto" color, but did
not respect a "color.ui" directive to disable it.

But it also meant that we started respecting "color.ui" set
to "always". This was a known problem, but 4c7f1819 argued
that nobody ought to be doing that. However, that turned out
to be wrong, and we got a number of bug reports related to
"add -p" regressing in v2.14.2.

Let's revert 136c8c8b, fixing the regression to "add -p".
This leaves the problem from 4c7f1819 unfixed, but:

  1. It's a pretty obscure problem in the first place. I
     only noticed it while working on the color code, and we
     haven't got a single bug report or complaint about it.

  2. We can make a more moderate fix on top by respecting
     "never" but not "always" for plumbing commands. This
     is just the minimal fix to go back to the working state
     we had before v2.14.2.

Note that this isn't a pure revert. We now have a test in
t3701 which shows off the "add -p" regression. This can be
flipped to success.
Signed-off-by: NJeff King <peff@peff.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

33c643bb

16 10月, 2017 1 次提交

diff: fix infinite loop with --color-moved --ignore-space-change · fa5ba2c1

由 Jeff King 提交于 10月 12, 2017

The --color-moved code uses next_byte() to advance through
the blob contents. When the user has asked to ignore
whitespace changes, we try to collapse any whitespace change
down to a single space.

However, we enter the conditional block whenever we see the
IGNORE_WHITESPACE_CHANGE flag, even if the next byte isn't
whitespace.

This means that the combination of "--color-moved and
--ignore-space-change" was completely broken. Worse, because
we return from next_byte() without having advanced our
pointer, the function makes no forward progress in the
buffer and loops infinitely.

Fix this by entering the conditional only when we actually
see whitespace. We can apply this also to the
IGNORE_WHITESPACE change. That code path isn't buggy
(because it falls through to returning the next
non-whitespace byte), but it makes the logic more clear if
we only bother to look at whitespace flags after seeing that
the next byte is whitespace.
Reported-by: NOrgad Shaneh <orgads@gmail.com>
Signed-off-by: NJeff King <peff@peff.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

fa5ba2c1

28 9月, 2017 1 次提交

diff: correct newline in summary for renamed files · 58aaced4

由 Stefan Beller 提交于 9月 27, 2017

In 146fdb0d (diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARY,
2017-06-29), the conversion from direct printing to the symbol emission
dropped the new line character for renamed, copied and rewritten files.

Add the emission of a newline, add a test for this case.
Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
Helped-by: NJeff King <peff@peff.net>
Signed-off-by: NStefan Beller <sbeller@google.com>
Reviewed-by: NJonathan Nieder <jrnieder@gmail.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

58aaced4

22 9月, 2017 1 次提交

ALLOC_GROW: avoid -Wsign-compare warnings · 071bcaab

由 Ramsay Jones 提交于 9月 21, 2017

Signed-off-by: NRamsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

071bcaab

14 9月, 2017 1 次提交

avoid "write_in_full(fd, buf, len) != len" pattern · 06f46f23

由 Jeff King 提交于 9月 13, 2017

The return value of write_in_full() is either "-1", or the
requested number of bytes[1]. If we make a partial write
before seeing an error, we still return -1, not a partial
value. This goes back to f6aa66cb (write_in_full: really
write in full or return error on disk full., 2007-01-11).

So checking anything except "was the return value negative"
is pointless. And there are a couple of reasons not to do
so:

  1. It can do a funny signed/unsigned comparison. If your
     "len" is signed (e.g., a size_t) then the compiler will
     promote the "-1" to its unsigned variant.

     This works out for "!= len" (unless you really were
     trying to write the maximum size_t bytes), but is a
     bug if you check "< len" (an example of which was fixed
     recently in config.c).

     We should avoid promoting the mental model that you
     need to check the length at all, so that new sites are
     not tempted to copy us.

  2. Checking for a negative value is shorter to type,
     especially when the length is an expression.

  3. Linus says so. In d34cf19b (Clean up write_in_full()
     users, 2007-01-11), right after the write_in_full()
     semantics were changed, he wrote:

       I really wish every "write_in_full()" user would just
       check against "<0" now, but this fixes the nasty and
       stupid ones.

     Appeals to authority aside, this makes it clear that
     writing it this way does not have an intentional
     benefit. It's a historical curiosity that we never
     bothered to clean up (and which was undoubtedly
     cargo-culted into new sites).

So let's convert these obviously-correct cases (this
includes write_str_in_full(), which is just a wrapper for
write_in_full()).

[1] A careful reader may notice there is one way that
    write_in_full() can return a different value. If we ask
    write() to write N bytes and get a return value that is
    _larger_ than N, we could return a larger total. But
    besides the fact that this would imply a totally broken
    version of write(), it would already invoke undefined
    behavior. Our internal remaining counter is an unsigned
    size_t, which means that subtracting too many byte will
    wrap it around to a very large number. So we'll instantly
    begin reading off the end of the buffer, trying to write
    gigabytes (or petabytes) of data.
Signed-off-by: NJeff King <peff@peff.net>
Reviewed-by: NJonathan Nieder <jrnieder@gmail.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

06f46f23

07 9月, 2017 3 次提交
- R
  diff: release strbuf after use in show_stats() · 5a612017
  由 Rene Scharfe 提交于 8月 30, 2017
```
Signed-off-by: NRene Scharfe <l.s.r@web.de>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>
```
  5a612017
- R
  diff: release strbuf after use in show_rename_copy() · 348eda24
  由 Rene Scharfe 提交于 8月 30, 2017
```
Signed-off-by: NRene Scharfe <l.s.r@web.de>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>
```
  348eda24
- R
  diff: release strbuf after use in diff_summary() · fa842d84
  由 Rene Scharfe 提交于 8月 30, 2017
```
Signed-off-by: NRene Scharfe <l.s.r@web.de>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>
```
  fa842d84
06 9月, 2017 3 次提交

tempfile: auto-allocate tempfiles on heap · 076aa2cb

由 Jeff King 提交于 9月 05, 2017

The previous commit taught the tempfile code to give up
ownership over tempfiles that have been renamed or deleted.
That makes it possible to use a stack variable like this:

  struct tempfile t;

  create_tempfile(&t, ...);
  ...
  if (!err)
          rename_tempfile(&t, ...);
  else
          delete_tempfile(&t);

But doing it this way has a high potential for creating
memory errors. The tempfile we pass to create_tempfile()
ends up on a global linked list, and it's not safe for it to
go out of scope until we've called one of those two
deactivation functions.

Imagine that we add an early return from the function that
forgets to call delete_tempfile(). With a static or heap
tempfile variable, the worst case is that the tempfile hangs
around until the program exits (and some functions like
setup_shallow_temporary rely on this intentionally, creating
a tempfile and then leaving it for later cleanup).

But with a stack variable as above, this is a serious memory
error: the variable goes out of scope and may be filled with
garbage by the time the tempfile code looks at it.  Let's
see if we can make it harder to get this wrong.

Since many callers need to allocate arbitrary numbers of
tempfiles, we can't rely on static storage as a general
solution. So we need to turn to the heap. We could just ask
all callers to pass us a heap variable, but that puts the
burden on them to call free() at the right time.

Instead, let's have the tempfile code handle the heap
allocation _and_ the deallocation (when the tempfile is
deactivated and removed from the list).

This changes the return value of all of the creation
functions. For the cleanup functions (delete and rename),
we'll add one extra bit of safety: instead of taking a
tempfile pointer, we'll take a pointer-to-pointer and set it
to NULL after freeing the object. This makes it safe to
double-call functions like delete_tempfile(), as the second
call treats the NULL input as a noop. Several callsites
follow this pattern.

The resulting patch does have a fair bit of noise, as each
caller needs to be converted to handle:

  1. Storing a pointer instead of the struct itself.

  2. Passing the pointer instead of taking the struct
     address.

  3. Handling a "struct tempfile *" return instead of a file
     descriptor.

We could play games to make this less noisy. For example, by
defining the tempfile like this:

  struct tempfile {
	struct heap_allocated_part_of_tempfile {
                int fd;
                ...etc
        } *actual_data;
  }

Callers would continue to have a "struct tempfile", and it
would be "active" only when the inner pointer was non-NULL.
But that just makes things more awkward in the long run.
There aren't that many callers, so we can simply bite
the bullet and adjust all of them. And the compiler makes it
easy for us to find them all.
Signed-off-by: NJeff King <peff@peff.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

076aa2cb

tempfile: do not delete tempfile on failed close · 49bd0fc2

由 Jeff King 提交于 9月 05, 2017

When close_tempfile() fails, we delete the tempfile and
reset the fields of the tempfile struct. This makes it
easier for callers to return without cleaning up, but it
also makes this common pattern:

  if (close_tempfile(tempfile))
	return error_errno("error closing %s", tempfile->filename.buf);

wrong, because the "filename" field has been reset after the
failed close. And it's not easy to fix, as in many cases we
don't have another copy of the filename (e.g., if it was
created via one of the mks_tempfile functions, and we just
have the original template string).

Let's drop the feature that a failed close automatically
deletes the file. This puts the burden on the caller to do
the deletion themselves, but this isn't that big a deal.
Callers which do:

  if (write(...) || close_tempfile(...)) {
	delete_tempfile(...);
	return -1;
  }

already had to call delete when the write() failed, and so
aren't affected. Likewise, any caller which just calls die()
in the error path is OK; we'll delete the tempfile during
the atexit handler.

Because this patch changes the semantics of close_tempfile()
without changing its signature, all callers need to be
manually checked and converted to the new scheme. This patch
covers all in-tree callers, but there may be others for
not-yet-merged topics. To catch these, we rename the
function to close_tempfile_gently(), which will attract
compile-time attention to new callers. (Technically the
original could be considered "gentle" already in that it
didn't die() on errors, but this one is even more so).
Signed-off-by: NJeff King <peff@peff.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

49bd0fc2

always check return value of close_tempfile · 45c6b1ed

由 Jeff King 提交于 9月 05, 2017

If close_tempfile() encounters an error, then it deletes the
tempfile and resets the "struct tempfile". But many code
paths ignore the return value and continue to use the
tempfile. Instead, we should generally treat this the same
as a write() error.

Note that in the postimage of some of these cases our error
message will be bogus after a failed close because we look
at tempfile->filename (either directly or via get_tempfile_path).
But after the failed close resets the tempfile object, this
is guaranteed to be the empty string. That will be addressed
in a future patch (because there are many more cases of the
same problem than just these instances).

Note also in the hunk in gpg-interface.c that it's fine to
call delete_tempfile() in the error path, even if
close_tempfile() failed and already deleted the file. The
tempfile code is smart enough to know the second deletion is
a noop.
Signed-off-by: NJeff King <peff@peff.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

45c6b1ed

24 8月, 2017 1 次提交

pack: move has_sha1_pack() · 150e3001

由 Jonathan Tan 提交于 8月 18, 2017

Signed-off-by: NJonathan Tan <jonathantanmy@google.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

150e3001

21 8月, 2017 1 次提交

sha1_file: convert index_path to struct object_id · 98e019b0

由 Patryk Obara 提交于 8月 20, 2017

Convert all remaining callers as well.
Signed-off-by: NPatryk Obara <patryk.obara@gmail.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

98e019b0

18 8月, 2017 1 次提交

diff: retire sane_truncate_fn · 08a8509e

由 Junio C Hamano 提交于 8月 17, 2017

Long time ago, 23707811 ("diff: do not chomp hunk-header in the
middle of a character", 2008-01-02) introduced sane_truncate_line()
helper function to trim the "function header" line that is shown at
the end of the hunk header line, in order to avoid chomping it in
the middle of a single UTF-8 character.  It also added a facility to
define a custom callback function to make it possible to extend it
to non UTF-8 encodings.

During the following 8 1/2 years, nobody found need for this custom
callback facility.

A custom callback function is a wrong design to use here anyway---if
your contents need support for non UTF-8 encoding, you shouldn't
have to write a custom function and recompile Git to plumb it in.  A
better approach would be to extend sane_truncate_line() function and
have a new member in emit_callback to conditionally trigger it.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

08a8509e

17 8月, 2017 2 次提交

diff: define block by number of alphanumeric chars · f0b8fb6e

由 Jonathan Tan 提交于 8月 15, 2017

The existing behavior of diff --color-moved=zebra does not define the
minimum size of a block at all, instead relying on a heuristic applied
later to filter out sets of adjacent moved lines that are shorter than 3
lines long. This can be confusing, because a block could thus be colored
as moved at the source but not at the destination (or vice versa),
depending on its neighbors.

Instead, teach diff that the minimum size of a block is 20 alphanumeric
characters, the same heuristic used by "git blame". This allows diff to
still exclude uninteresting lines appearing on their own (such as those
solely consisting of one or a few closing braces), as was the intention
of the adjacent-moved-line heuristic.

This requires a change in some tests in that some of their lines are no
longer considered to be part of a block, because they are too short.
Signed-off-by: NJonathan Tan <jonathantanmy@google.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

f0b8fb6e

diff: respect MIN_BLOCK_LENGTH for last block · 09153277

由 Jonathan Tan 提交于 8月 15, 2017

Currently, MIN_BLOCK_LENGTH is only checked when diff encounters a line
that does not belong to the current block. In particular, this means
that MIN_BLOCK_LENGTH is not checked after all lines are encountered.

Perform that check.
Signed-off-by: NJonathan Tan <jonathantanmy@google.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

09153277

15 8月, 2017 1 次提交

diff: avoid redundantly clearing a flag · 23b65f95

由 Jonathan Tan 提交于 8月 11, 2017

No code in diff.c sets DIFF_SYMBOL_MOVED_LINE except in
mark_color_as_moved(), so it is redundant to clear it for the current
line. Therefore, clear it only for previous lines.

This makes a refactoring in a subsequent patch easier.
Signed-off-by: NJonathan Tan <jonathantanmy@google.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

23b65f95

04 8月, 2017 1 次提交

diff: stop allowing diff to have submodules configured in .git/config · 078b75e9

由 Brandon Williams 提交于 8月 03, 2017

Traditionally a submodule is comprised of a gitlink as well as a
corresponding entry in the .gitmodules file.  Diff doesn't follow this
paradigm as its config callback routine falls back to populating the
submodule-config if a config entry starts with 'submodule.'.

Remove this behavior in order to be consistent with how the
submodule-config is populated, via calling 'gitmodules_config()' or
'repo_read_gitmodules()'.
Signed-off-by: NBrandon Williams <bmwill@google.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

078b75e9

14 7月, 2017 1 次提交

color: check color.ui in git_default_config() · 136c8c8b

由 Jeff King 提交于 7月 13, 2017

Back in prehistoric times, our decision on whether or not to
show color by default relied on using a config callback that
either did or didn't load color config like color.diff.
When we introduced color.ui, we put it in the same boat:
commands had to manually respect it by using git_color_config()
or its git_color_default_config() convenience wrapper.

But in 4c7f1819 (make color.ui default to 'auto',
2013-06-10), that changed. Since then, we default color.ui
to auto in all programs, meaning that even plumbing commands
like "git diff-tree --pretty" might colorize the output.
Nobody seems to have complained in the intervening years,
presumably because the "is stdout a tty" check does a good
job of catching the right cases.

But that leaves an interesting curiosity: color.ui defaults
to auto even in plumbing, but you can't actually _disable_
the color via config. So if you really hate color and set
"color.ui" to false, diff-tree will still show color (but
porcelain like git-diff won't).  Nobody noticed that either,
probably because very few people disable color.

One could argue that the plumbing should _always_ disable
color unless an explicit --color option is given on the
command line. But in practice, this creates a lot of
complications for scripts which do want plumbing to show
user-visible output. They can't just pass "--color" blindly;
they need to check the user's config and decide what to
send.

Given that nobody has complained about the current behavior,
let's assume it's a good path, and follow it to its
conclusion: supporting color.ui everywhere.

Note that you can create havoc by setting color.ui=always in
your config, but that's more or less already the case. We
could disallow it entirely, but it is handy for one-offs
like:

  git -c color.ui=always foo >not-a-tty

when "foo" does not take a --color option itself.
Signed-off-by: NJeff King <peff@peff.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

136c8c8b

11 7月, 2017 1 次提交

use DIV_ROUND_UP · 42c78a21

由 René Scharfe 提交于 7月 08, 2017

Convert code that divides and rounds up to use DIV_ROUND_UP to make the
intent clearer and reduce the number of magic constants.
Signed-off-by: NRene Scharfe <l.s.r@web.de>
Reviewed-by: NJeff King <peff@peff.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

42c78a21

01 7月, 2017 7 次提交

diff.c: add dimming to moved line detection · 86b452e2

由 Stefan Beller 提交于 6月 30, 2017

Any lines inside a moved block of code are not interesting. Boundaries
of blocks are only interesting if they are next to another block of moved
code.
Signed-off-by: NStefan Beller <sbeller@google.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

86b452e2

diff.c: color moved lines differently, plain mode · 176841f0

由 Stefan Beller 提交于 6月 30, 2017

Add the 'plain' mode for move detection of code. This omits the checking
for adjacent blocks, so it is not as useful. If you have a lot of the
same blocks moved in the same patch, the 'Zebra' would end up slow as it
is O(n^2) (n is number of same blocks). So this may be useful there and
is generally easy to add. Instead be very literal at the move detection,
do not skip over short blocks here.
Signed-off-by: NStefan Beller <sbeller@google.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

176841f0

diff.c: color moved lines differently · 2e2d5ac1

由 Stefan Beller 提交于 6月 30, 2017

When a patch consists mostly of moving blocks of code around, it can
be quite tedious to ensure that the blocks are moved verbatim, and not
undesirably modified in the move. To that end, color blocks that are
moved within the same patch differently. For example (OM, del, add,
and NM are different colors):

    [OM]  -void sensitive_stuff(void)
    [OM]  -{
    [OM]  -        if (!is_authorized_user())
    [OM]  -                die("unauthorized");
    [OM]  -        sensitive_stuff(spanning,
    [OM]  -                        multiple,
    [OM]  -                        lines);
    [OM]  -}

           void another_function()
           {
    [del] -        printf("foo");
    [add] +        printf("bar");
           }

    [NM]  +void sensitive_stuff(void)
    [NM]  +{
    [NM]  +        if (!is_authorized_user())
    [NM]  +                die("unauthorized");
    [NM]  +        sensitive_stuff(spanning,
    [NM]  +                        multiple,
    [NM]  +                        lines);
    [NM]  +}

However adjacent blocks may be problematic. For example, in this
potentially malicious patch, the swapping of blocks can be spotted:

    [OM]  -void sensitive_stuff(void)
    [OM]  -{
    [OMA] -        if (!is_authorized_user())
    [OMA] -                die("unauthorized");
    [OM]  -        sensitive_stuff(spanning,
    [OM]  -                        multiple,
    [OM]  -                        lines);
    [OMA] -}

           void another_function()
           {
    [del] -        printf("foo");
    [add] +        printf("bar");
           }

    [NM]  +void sensitive_stuff(void)
    [NM]  +{
    [NMA] +        sensitive_stuff(spanning,
    [NMA] +                        multiple,
    [NMA] +                        lines);
    [NM]  +        if (!is_authorized_user())
    [NM]  +                die("unauthorized");
    [NMA] +}

If the moved code is larger, it is easier to hide some permutation in the
code, which is why some alternative coloring is needed.

This patch implements the first mode:
* basic alternating 'Zebra' mode
  This conveys all information needed to the user.  Defer customization to
  later patches.

First I implemented an alternative design, which would try to fingerprint
a line by its neighbors to detect if we are in a block or at the boundary.
This idea iss error prone as it inspected each line and its neighboring
lines to determine if the line was (a) moved and (b) if was deep inside
a hunk by having matching neighboring lines. This is unreliable as the
we can construct hunks which have equal neighbors that just exceed the
number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter
as a line, that is permutated to AXYZCXYZBXYZD..').

Instead this provides a dynamic programming greedy algorithm that finds
the largest moved hunk and then has several modes on highlighting bounds.

A note on the options '--submodule=diff' and '--color-words/--word-diff':
In the conversion to use emit_line in the prior patches both submodules
as well as word diff output carefully chose to call emit_line with sign=0.
All output with sign=0 is ignored for move detection purposes in this
patch, such that no weird looking output will be generated for these
cases. This leads to another thought: We could pass on '--color-moved' to
submodules such that they color up moved lines for themselves. If we'd do
so only line moves within a repository boundary are marked up.

It is useful to have moved lines colored, but there are annoying corner
cases, such as a single line moved, that is very common. For example
in a typical patch of C code, we have closing braces that end statement
blocks or functions.

While it is technically true that these lines are moved as they show up
elsewhere, it is harmful for the review as the reviewers attention is
drawn to such a minor side annoyance.

For now let's have a simple solution of hardcoding the number of
moved lines to be at least 3 before coloring them. Note, that the
length is applied across all blocks to find the 'lonely' blocks
that pollute new code, but do not interfere with a permutated
block where each permutation has less lines than 3.
Helped-by: NJonathan Tan <jonathantanmy@google.com>
Signed-off-by: NStefan Beller <sbeller@google.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

2e2d5ac1

diff.c: buffer all output if asked to · e6e045f8

由 Stefan Beller 提交于 6月 29, 2017

Introduce a new option 'emitted_symbols' in the struct diff_options which
controls whether all output is buffered up until all output is available.
It is set internally in diff.c when necessary.

We'll have a new struct 'emitted_string' in diff.c which will be used to
buffer each line. The emitted_string will duplicate the memory of the
line to buffer as that is easiest to reason about for now. In a future
patch we may want to decrease the memory usage by not duplicating all
output for buffering but rather we may want to store offsets into the
file or in case of hunk descriptions such as the similarity score, we
could just store the relevant number and reproduce the text later on.

This approach was chosen as a first step because it is quite simple
compared to the alternative with less memory footprint.

emit_diff_symbol factors out the emission part and depending on the
diff_options->emitted_symbols the emission will be performed directly
when calling emit_diff_symbol or after the whole process is done, i.e.
by buffering we have add the possibility for a second pass over the
whole output before doing the actual output.

In 6440d341 (2012-03-14, diff: tweak a _copy_ of diff_options with
word-diff) we introduced a duplicate diff options struct for word
emissions as we may have different regex settings in there.
When buffering the output, we need to operate on just one buffer,
so we have to copy back the emissions of the word buffer into the
main buffer.

Unconditionally enable output via buffer in this patch as it yields
a great opportunity for testing, i.e. all the diff tests from the
test suite pass without having reordering issues (i.e. only parts
of the output got buffered, and we forgot to buffer other parts).
The test suite passes, which gives confidence that we converted all
functions to use emit_string for output.
Signed-off-by: NStefan Beller <sbeller@google.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

e6e045f8

S
diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARY · 146fdb0d
由 Stefan Beller 提交于 6月 29, 2017
```
Signed-off-by: NStefan Beller <sbeller@google.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>
```
146fdb0d
S
diff.c: emit_diff_symbol learns about DIFF_SYMBOL_STAT_SEP · 30b7e1e7
由 Stefan Beller 提交于 6月 29, 2017
```
Signed-off-by: NStefan Beller <sbeller@google.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>
```
30b7e1e7

diff.c: convert word diffing to use emit_diff_symbol · bd033291

由 Stefan Beller 提交于 6月 29, 2017

The word diffing is not line oriented and would need some serious
effort to be transformed into a line oriented approach, so
just go with a symbol DIFF_SYMBOL_WORD_DIFF that is a partial line.
Signed-off-by: NStefan Beller <sbeller@google.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

bd033291

李少辉-开发者 / git 与 Fork 源项目一致

李少辉-开发者 / git
与 Fork 源项目一致