提交 · 678c5741118729dc9072182bd368e0a3402dd4f1 · 李少辉-开发者 / git

19 4月, 2012 1 次提交

Prevent graph_width of stat width from falling below min · 678c5741

由 Lucian Poston 提交于 4月 18, 2012

Update tests in t4052 fixed by this change.
Signed-off-by: NLucian Poston <lucian.poston@gmail.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

678c5741

17 4月, 2012 1 次提交

Adjust stat width calculations to take --graph output into account · 3f145132

由 Lucian Poston 提交于 4月 16, 2012

The recent change to compute the width of diff --stat did not take into
consideration the output from --graph. The consequence is that when both
options are used, e.g. in 'log --stat --graph', the lines are too long.

Adjust stat width calculations to take --graph output into account.
Signed-off-by: NLucian Poston <lucian.poston@gmail.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

3f145132

02 3月, 2012 5 次提交

diff -p: squelch "diff --git" header for stat-dirty paths · b3f01ff2

由 Junio C Hamano 提交于 2月 29, 2012

The plumbing "diff" commands look at the working tree files without
refreshing the index themselves for performance reasons (the calling
script is expected to do that upfront just once, before calling one or
more of them). In the early days of git, they showed the "diff --git"
header before they actually ask the xdiff machinery to produce patches,
and ended up showing only these headers if the real contents are the same
and the difference they noticed was only because the stat info cached in
the index did not match that of the working tree. It was too late for the
implementation to take the header that it already emitted back.

But 3e97c7c6 (No diff -b/-w output for all-whitespace changes, 2009-11-19)
introduced necessary logic to keep the meta-information headers in a
strbuf and delay their output until the xdiff machinery noticed actual
changes. This was primarily in order to generate patches that ignore
whitespaces. When operating under "-w" mode, we wouldn't know if the
header is needed until we actually look at the resulting patch, so it was
a sensible thing to do, but we did not realize that the same reasoning
applies to stat-dirty paths.

Later, 296c6bb2 (diff: fix "git show -C -C" output when renaming a binary
file, 2010-05-26) generalized this machinery and added must_show_header
toggle. This is turned on when the header must be shown even when there
is no patch to be produced, e.g. only the mode was changed, or the path
was renamed, without changing the contents. However, when it did so, it
still kept the special case for the "-w" mode, which meant that the
plumbing would keep showing these phantom changes.

This corrects this historical inconsistency by allowing the plumbing to
omit paths that are only stat-dirty from its output in the same way as it
handles whitespace only changes under "-w" option.

The change in the behaviour can be seen in the updated test.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

b3f01ff2

diff --stat: add config option to limit graph width · df44483a

由 Zbigniew Jędrzejewski-Szmek 提交于 3月 01, 2012

Config option diff.statGraphWidth=<width> is equivalent to
--stat-graph-width=<width>, except that the config option is ignored
by format-patch.

For the graph-width limiting to be usable, it should happen
'automatically' once configured, hence the config option.
Nevertheless, graph width limiting only makes sense when used on a
wide terminal, so it should not influence the output of format-patch,
which adheres to the 80-column standard.
Signed-off-by: NZbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

df44483a

diff --stat: enable limiting of the graph part · 969fe57b

由 Zbigniew Jędrzejewski-Szmek 提交于 3月 01, 2012

A new option --stat-graph-width=<width> can be used to limit the width
of the graph part even is more space is available. Up to <width>
columns will be used for the graph.

If commits changing a lot of lines are displayed in a wide terminal
window (200 or more columns), and the +- graph uses the full width,
the output can be hard to comfortably scan with a horizontal movement
of human eyes. Messages wrapped to about 80 columns would be
interspersed with very long +- lines. It makes sense to limit the
width of the graph part to a fixed value (e.g. 70 columns), even if
more columns are available.
Signed-off-by: NZbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

969fe57b

diff --stat: use a maximum of 5/8 for the filename part · 1b058bc3

由 Zbigniew Jędrzejewski-Szmek 提交于 3月 01, 2012

The way that available columns are divided between the filename part
and the graph part is modified to use as many columns as necessary for
the filenames and the rest for the graph.

If there isn't enough columns to print both the filename and the
graph, at least 5/8 of available space is devoted to filenames. On a
standard 80 column terminal, or if not connected to a terminal and
using the default of 80 columns, this gives the same partition as
before.

The effect of this change is visible in the patch to the test vector
in t4052; with a small change with long filename, it stops truncating
the name part too short, and also allocates a bit more columns to the
graph for larger changes.
Signed-off-by: NZbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

1b058bc3

diff --stat: use the full terminal width · af9fedc1

由 Zbigniew Jędrzejewski-Szmek 提交于 3月 01, 2012

Default to the real terminal width for diff --stat output, instead
of the hard-coded 80 columns.

Some projects (especially in Java), have long filename paths, with
nested directories or long individual filenames. When files are
renamed, the filename part in stat output can be almost useless. If
the middle part between { and } is long (because the file was moved to
a completely different directory), then most of the path would be
truncated.

It makes sense to detect and use the full terminal width and display
full filenames if possible.

The are commands like diff, show, and log, which can adapt the output
to the terminal width. There are also commands like format-patch,
whose output should be independent of the terminal width. Since it is
safer to use the 80-column default, the real terminal width is only
used if requested by the calling code by setting diffopts.stat_width=-1.
Normally this value is 0, and can be set by the user only to a
non-negative value, so -1 is safe to use internally.

This patch only changes the diff builtin to use the full terminal width.
Signed-off-by: NZbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

af9fedc1

15 2月, 2012 1 次提交

diff --stat: show bars of same length for paths with same amount of changes · 2eeeef24

由 Junio C Hamano 提交于 2月 14, 2012

When commit 3ed74e60 (diff --stat: ensure at least one '-' for deletions,
and one '+' for additions, 2006-09-28) improved the output for files with
tiny modifications, we accidentally broke the logic to ensure that two
equal sized changes are shown with the bars of the same length, even when
rounding errors exist.

Compute the length of the graph bars, using the same "non-zero changes is
shown with at least one column" scaling logic, but by scaling the sum of
additions and deletions to come up with the total length of the bar (this
ensures that two equal sized changes result in bars of the same length),
and then scaling the smaller of the additions or deletions. The other side
is computed as the difference between the two.

This makes the apportioning between additions and deletions less accurate
due to rounding errors, but it is much less noticeable than two files with
the same amount of change showing bars of different length.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

2eeeef24

08 2月, 2012 1 次提交

drop odd return value semantics from userdiff_config · 6680a087

由 Jeff King 提交于 2月 07, 2012

When the userdiff_config function was introduced in be58e70d
(diff: unify external diff and funcname parsing code,
2008-10-05), it used a return value convention unlike any
other config callback. Like other callbacks, it used "-1" to
signal error. But it returned "1" to indicate that it found
something, and "0" otherwise; other callbacks simply
returned "0" to indicate that no error occurred.

This distinction was necessary at the time, because the
userdiff namespace overlapped slightly with the color
configuration namespace. So "diff.color.foo" could mean "the
'foo' slot of diff coloring" or "the 'foo' component of the
"color" userdiff driver". Because the color-parsing code
would die on an unknown color slot, we needed the userdiff
code to indicate that it had matched the variable, letting
us bypass the color-parsing code entirely.

Later, in 8b8e8624 (ignore unknown color configuration,
2009-12-12), the color-parsing code learned to silently
ignore unknown slots. This means we no longer need to
protect userdiff-matched variables from reaching the
color-parsing code.

We can therefore change the userdiff_config calling
convention to a more normal one. This drops some code from
each caller, which is nice. But more importantly, it reduces
the cognitive load for readers who may wonder why
userdiff_config is unlike every other config callback.

There's no need to add a new test confirming that this
works; t4020 already contains a test that sets
diff.color.external.
Signed-off-by: NJeff King <peff@peff.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

6680a087

04 2月, 2012 1 次提交

Use correct grammar in diffstat summary line · 7f814632

由 Nguyễn Thái Ngọc Duy 提交于 2月 01, 2012

"git diff --stat" and "git apply --stat" now learn to print the line
"%d files changed, %d insertions(+), %d deletions(-)" in singular form
whenever applicable. "0 insertions" and "0 deletions" are also omitted
unless they are both zero.

This matches how versions of "diffstat" that are not prehistoric produced
their output, and also makes this line translatable.

[jc: with help from Thomas Dickey in archaeology of "diffstat"]
[jc: squashed Jonathan's updates to illustrations in tutorials and a test]
Signed-off-by: NNguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: NJonathan Nieder <jrnieder@gmail.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

7f814632

13 1月, 2012 1 次提交

word-diff: ignore '\ No newline at eof' marker · c7c2bc0a

由 Thomas Rast 提交于 1月 12, 2012

The word-diff logic accumulates + and - lines until another line type
appears (normally [ @\]), at which point it generates the word diff.
This is usually correct, but it breaks when the preimage does not have
a newline at EOF:

  $ printf "%s" "a a a" >a
  $ printf "%s\n" "a ab a" >b
  $ git diff --no-index --word-diff a b
  diff --git 1/a 2/b
  index 9f68e94..6a7c02f 100644
  --- 1/a
  +++ 2/b
  @@ -1 +1 @@
  [-a a a-]
   No newline at end of file
  {+a ab a+}

Because of the order of the lines in a unified diff

  @@ -1 +1 @@
  -a a a
  \ No newline at end of file
  +a ab a

the '\' line flushed the buffers, and the - and + lines were never
matched with each other.

A proper fix would defer such markers until the end of the hunk.
However, word-diff is inherently whitespace-ignoring, so as a cheap
fix simply ignore the marker (and hide it from the output).

We use a prefix match for '\ ' to parallel the logic in
apply.c:parse_fragment().  We currently do not localize this string
(just accept other variants of it in git-apply), but this should be
future-proof.
Noticed-by: NIvan Shirokoff <shirokoff@yandex-team.ru>
Signed-off-by: NThomas Rast <trast@student.ethz.ch>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

c7c2bc0a

11 10月, 2011 1 次提交

diff: add option to show whole functions as context · 14937c2c

由 René Scharfe 提交于 10月 09, 2011

Add the option -W/--function-context to git diff.  It is similar to
the same option of git grep and expands the context of change hunks
so that the whole surrounding function is shown.  This "natural"
context can allow changes to be understood better.

Note: GNU patch doesn't like diffs generated with the new option;
it seems to expect context lines to be the same before and after
changes.  git apply doesn't complain.

This implementation has the same shortcoming as the one in grep,
namely that there is no way to explicitly find the end of a
function.  That means that a few lines of extra context are shown,
right up to the next recognized function begins.  It's already
useful in its current form, though.

The function get_func_line() in xdiff/xemit.c is extended to work
forward as well as backward to find post-context as well as
pre-context.  It returns the position of the first found matching
line.  The func_line parameter is made optional, as we don't need
it for -W.

The enhanced function is then used in xdl_emit_diff() to extend
the context as needed.  If the added context overlaps with the
next change, it is merged into the current hunk.
Signed-off-by: NRene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

14937c2c

04 10月, 2011 1 次提交

diff: resurrect XDF_NEED_MINIMAL with --minimal · 81b568c8

由 Junio C Hamano 提交于 10月 01, 2011

Earlier, 582aa00b (git diff too slow for a file, 2010-05-02)
unconditionally dropped XDF_NEED_MINIMAL option from the internal xdiff
invocation to help performance on pathological cases, while hinting that a
follow-up patch could reintroduce it with "--minimal" option from the
command line.

Make it so.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

81b568c8

23 9月, 2011 1 次提交

diff: teach --stat/--numstat to honor -U$num · f01cae91

由 Junio C Hamano 提交于 9月 22, 2011

"git diff -p" piped to external diffstat and "git diff --stat" may see
different patch text (both are valid and describe the same change
correctly) when counting the number of added and deleted lines, arriving
at different results to confuse the users, as --stat/--numstat codepath
always uses the hardcoded -U0 as the context length.

Make --stat/--numstat codepath to honor the context length the same way
as the textual patch codepath does to avoid this problem.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

f01cae91

20 8月, 2011 2 次提交

diff: don't load color config in plumbing · 3e1dd17a

由 Jeff King 提交于 8月 17, 2011

The diff config callback is split into two functions: one
which loads "ui" config, and one which loads "basic" config.
The former chains to the latter, as the diff UI config is a
superset of the plumbing config.

The color.diff variable is only loaded in the UI config.
However, the basic config actually chains to
git_color_default_config, which loads color.ui. This doesn't
actually cause any bugs, because the plumbing diff code does
not actually look at the value of color.ui.

However, it is somewhat nonsensical, and it makes it
difficult to refactor the color code. It probably came about
because there is no git_color_config to load only color
config, but rather just git_color_default_config, which
loads color config and chains to git_default_config.

This patch splits out the color-specific portion of
git_color_default_config so that the diff UI config can call
it directly. This is perhaps better explained by the
chaining of callbacks. Before we had:

  git_diff_ui_config
    -> git_diff_basic_config
      -> git_color_default_config
        -> git_default_config

Now we have:

  git_diff_ui_config
    -> git_color_config
    -> git_diff_basic_config
      -> git_default_config
Signed-off-by: NJeff King <peff@peff.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

3e1dd17a

color: delay auto-color decision until point of use · daa0c3d9

由 Jeff King 提交于 8月 17, 2011

When we read a color value either from a config file or from
the command line, we use git_config_colorbool to convert it
from the tristate always/never/auto into a single yes/no
boolean value.

This has some timing implications with respect to starting
a pager.

If we start (or decide not to start) the pager before
checking the colorbool, everything is fine. Either isatty(1)
will give us the right information, or we will properly
check for pager_in_use().

However, if we decide to start a pager after we have checked
the colorbool, things are not so simple. If stdout is a tty,
then we will have already decided to use color. However, the
user may also have configured color.pager not to use color
with the pager. In this case, we need to actually turn off
color. Unfortunately, the pager code has no idea which color
variables were turned on (and there are many of them
throughout the code, and they may even have been manipulated
after the colorbool selection by something like "--color" on
the command line).

This bug can be seen any time a pager is started after
config and command line options are checked. This has
affected "git diff" since 89d07f75 (diff: don't run pager if
user asked for a diff style exit code, 2007-08-12). It has
also affect the log family since 1fda91b5 (Fix 'git log'
early pager startup error case, 2010-08-24).

This patch splits the notion of parsing a colorbool and
actually checking the configuration. The "use_color"
variables now have an additional possible value,
GIT_COLOR_AUTO. Users of the variable should use the new
"want_color()" wrapper, which will lazily determine and
cache the auto-color decision.
Signed-off-by: NJeff King <peff@peff.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

daa0c3d9

19 8月, 2011 2 次提交

git_config_colorbool: refactor stdout_is_tty handling · e269eb79

由 Jeff King 提交于 8月 17, 2011

Usually this function figures out for itself whether stdout
is a tty. However, it has an extra parameter just to allow
git-config to override the auto-detection for its
--get-colorbool option.

Instead of an extra parameter, let's just use a global
variable. This makes calling easier in the common case, and
will make refactoring the colorbool code much simpler.
Signed-off-by: NJeff King <peff@peff.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

e269eb79

diff: refactor COLOR_DIFF from a flag into an int · f1c96261

由 Jeff King 提交于 8月 17, 2011

This lets us store more than just a bit flag for whether we
want color; we can also store whether we want automatic
colors. This can be useful for making the automatic-color
decision closer to the point of use.

This mostly just involves replacing DIFF_OPT_* calls with
manipulations of the flag. The biggest exception is that
calls to DIFF_OPT_TST must check for "o->use_color > 0",
which lets an "unknown" value (i.e., the default) stay at
"no color". In the previous code, a value of "-1" was not
propagated at all.
Signed-off-by: NJeff King <peff@peff.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

f1c96261

13 7月, 2011 1 次提交

teach --histogram to diff · 8c912eea

由 Tay Ray Chuan 提交于 7月 12, 2011

Port JGit's HistogramDiff algorithm over to C. Rough numbers (TODO) show
that it is faster than its --patience cousin, as well as the default
Meyers algorithm.

The implementation has been reworked to use structs and pointers,
instead of bitmasks, thus doing away with JGit's 2^28 line limit.

We also use xdiff's default hash table implementation (xdl_hash_bits()
with XDL_HASHLONG()) for convenience.
Signed-off-by: NTay Ray Chuan <rctay89@gmail.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

8c912eea

11 6月, 2011 3 次提交

zlib: zlib can only process 4GB at a time · ef49a7a0

由 Junio C Hamano 提交于 6月 10, 2011

The size of objects we read from the repository and data we try to put
into the repository are represented in "unsigned long", so that on larger
architectures we can handle objects that weigh more than 4GB.

But the interface defined in zlib.h to communicate with inflate/deflate
limits avail_in (how many bytes of input are we calling zlib with) and
avail_out (how many bytes of output from zlib are we ready to accept)
fields effectively to 4GB by defining their type to be uInt.

In many places in our code, we allocate a large buffer (e.g. mmap'ing a
large loose object file) and tell zlib its size by assigning the size to
avail_in field of the stream, but that will truncate the high octets of
the real size. The worst part of this story is that we often pass around
z_stream (the state object used by zlib) to keep track of the number of
used bytes in input/output buffer by inspecting these two fields, which
practically limits our callchain to the same 4GB limit.

Wrap z_stream in another structure git_zstream that can express avail_in
and avail_out in unsigned long. For now, just die() when the caller gives
a size that cannot be given to a single zlib call. In later patches in the
series, we would make git_inflate() and git_deflate() internally loop to
give callers an illusion that our "improved" version of zlib interface can
operate on a buffer larger than 4GB in one go.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

ef49a7a0

J
zlib: wrap deflateBound() too · 225a6f10
由 Junio C Hamano 提交于 6月 10, 2011
```
Signed-off-by: NJunio C Hamano <gitster@pobox.com>
```
225a6f10

zlib: wrap deflate side of the API · 55bb5c91

由 Junio C Hamano 提交于 6月 10, 2011

Wrap deflateInit, deflate, and deflateEnd for everybody, and the sole use
of deflateInit2 in remote-curl.c to tell the library to use gzip header
and trailer in git_deflate_init_gzip().

There is only one caller that cares about the status from deflateEnd().
Introduce git_deflate_end_gently() to let that sole caller retrieve the
status and act on it (i.e. die) for now, but we would probably want to
make inflate_end/deflate_end die when they ran out of memory and get
rid of the _gently() kind.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

55bb5c91

01 6月, 2011 1 次提交

diff: futureproof "stop feeding the backend early" logic · 28b9264d

由 Junio C Hamano 提交于 5月 31, 2011

Refactor the "do not stop feeding the backend early" logic into a small
helper function and use it in both run_diff_files() and diff_tree() that
has the stop-early optimization. We may later add other types of diffcore
transformation that require to look at the whole result like diff-filter
does, and having the logic in a single place is essential for longer term
maintainability.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

28b9264d

28 5月, 2011 3 次提交

J
diff --stat-count: finishing touches · e5f85df8
由 Junio C Hamano 提交于 5月 27, 2011
```
Signed-off-by: NJunio C Hamano <gitster@pobox.com>
```
e5f85df8

diff: introduce --stat-lines to limit the stat lines · 808e1db2

由 Michael J Gruber 提交于 5月 27, 2011

Often one is interested in the full --stat output only for commits which
change a few files, but not others, because larger restructuring gives a
--stat which fills a few screens.

Introduce a new option --stat-count=<count> which limits the --stat output
to the first <count> lines, followed by a "..." line. It can
also be given as the third parameter in
--stat=<width>,<name-width>,<count>.

Also, the unstuck form is supported analogous to the other two stat
parameters.
Signed-off-by: NMichael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

808e1db2

diff.c: omit hidden entries from namelen calculation with --stat · 358e460e

由 Michael J Gruber 提交于 5月 27, 2011

Currently, --stat calculates the longest name from all items but then
drops some (mode changes) from the output later on.

Instead, drop them from the namelen generation and calculation.
Signed-off-by: NMichael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

358e460e

24 5月, 2011 1 次提交

refactor get_textconv to not require diff_filespec · 3813e690

由 Jeff King 提交于 5月 23, 2011

This function actually does two things:

  1. Load the userdiff driver for the filespec.

  2. Decide whether the driver has a textconv component, and
     initialize the textconv cache if applicable.

Only part (1) requires the filespec object, and some callers
may not have a filespec at all. So let's split them it into
two functions, and put part (2) with the userdiff code,
which is a better fit.
Signed-off-by: NJeff King <peff@peff.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

3813e690

21 5月, 2011 1 次提交

do not read beyond end of malloc'd buffer · 42536dd9

由 Jim Meyering 提交于 5月 20, 2011

With diff.suppress-blank-empty=true, "git diff --word-diff" would
output data that had been read from uninitialized heap memory.
The problem was that fn_out_consume did not account for the
possibility of a line with length 1, i.e., the empty context line
that diff.suppress-blank-empty=true converts from " \n" to "\n".
Since it assumed there would always be a prefix character (the space),
it decremented "len" unconditionally, thus passing len=0 to emit_line,
which would then blindly call emit_line_0 with len=-1 which would
pass that value on to fwrite as SIZE_MAX.  Boom.
Signed-off-by: NJim Meyering <meyering@redhat.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

42536dd9

30 4月, 2011 7 次提交

Mark dirstat error messages for translation · 7478ac57

由 Johan Herland 提交于 4月 29, 2011

Suggested-by: NJunio C Hamano <gitster@pobox.com>
Signed-off-by: NJohan Herland <johan@herland.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

7478ac57

Improve error handling when parsing dirstat parameters · 51670fc8

由 Johan Herland 提交于 4月 29, 2011

When encountering errors or unknown tokens while parsing parameters to the
--dirstat option, it makes sense to die() with an error message informing
the user of which parameter did not make sense. However, when parsing the
diff.dirstat config variable, we cannot simply die(), but should instead
(after warning the user) ignore the erroneous or unrecognized parameter.
After all, future Git versions might add more dirstat parameters, and
using two different Git versions on the same repo should not cripple the
older Git version just because of a parameter that is only understood by
a more recent Git version.

This patch fixes the issue by refactoring the dirstat parameter parsing
so that parse_dirstat_params() keeps on parsing parameters, even if an
earlier parameter was not recognized. When parsing has finished, it returns
zero if all parameters were successfully parsed, and non-zero if one or
more parameters were not recognized (with appropriate error messages
appended to the 'errmsg' argument).

The parse_dirstat_params() callers then decide (based on the return value
from parse_dirstat_params()) whether to warn and ignore (in case of
diff.dirstat), or to warn and die (in case of --dirstat).

The patch also adds a couple of tests verifying the correct behavior of
--dirstat and diff.dirstat in the face of unknown (possibly future) dirstat
parameters.
Suggested-by: NJunio C Hamano <gitster@pobox.com>
Improved-by: NJunio C Hamano <gitster@pobox.com>
Signed-off-by: NJohan Herland <johan@herland.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

51670fc8

New --dirstat=lines mode, doing dirstat analysis based on diffstat · 1c57a627

由 Johan Herland 提交于 4月 29, 2011

This patch adds an alternative implementation of show_dirstat(), called
show_dirstat_by_line(), which uses the more expensive diffstat analysis
(as opposed to show_dirstat()'s own (relatively inexpensive) analysis)
to derive the numbers from which the --dirstat output is computed.

The alternative implementation is controlled by the new "lines" parameter
to the --dirstat option (or the diff.dirstat config variable).

For binary files, the diffstat analysis counts bytes instead of lines,
so to prevent binary files from dominating the dirstat results, the
byte counts for binary files are divided by 64 before being compared to
their textual/line-based counterparts. This is a stupid and ugly - but
very cheap - heuristic.

In linux-2.6.git, running the three different --dirstat modes:

  time git diff v2.6.20..v2.6.30 --dirstat=changes > /dev/null
vs.
  time git diff v2.6.20..v2.6.30 --dirstat=lines > /dev/null
vs.
  time git diff v2.6.20..v2.6.30 --dirstat=files > /dev/null

yields the following average runtimes on my machine:

 - "changes" (default): ~6.0 s
 - "lines":             ~9.6 s
 - "files":             ~0.1 s

So, as expected, there's a considerable performance hit (~60%) by going
through the full diffstat analysis as compared to the default "changes"
analysis (obviously, "files" is much faster than both). As such, the
"lines" mode is probably only useful if you really need the --dirstat
numbers to be consistent with the numbers returned from the other
--*stat options.

The patch also includes documentation and tests for the new dirstat mode.
Improved-by: NJunio C Hamano <gitster@pobox.com>
Signed-off-by: NJohan Herland <johan@herland.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

1c57a627

Allow specifying --dirstat cut-off percentage as a floating point number · 712d2c7d

由 Johan Herland 提交于 4月 29, 2011

Only the first digit after the decimal point is kept, as the dirstat
calculations all happen in permille.

Selftests verifying floating-point percentage input has been added.
Improved-by: NJunio C Hamano <gitster@pobox.com>
Improved-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NJohan Herland <johan@herland.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

712d2c7d

Add config variable for specifying default --dirstat behavior · 2d174951

由 Johan Herland 提交于 4月 29, 2011

The new diff.dirstat config variable takes the same arguments as
'--dirstat=<args>', and specifies the default arguments for --dirstat.
The config is obviously overridden by --dirstat arguments passed on the
command line.

When not specified, the --dirstat defaults are 'changes,noncumulative,3'.

The patch also adds several tests verifying the interaction between the
diff.dirstat config variable, and the --dirstat command line option.
Improved-by: NJunio C Hamano <gitster@pobox.com>
Signed-off-by: NJohan Herland <johan@herland.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

2d174951

Refactor --dirstat parsing; deprecate --cumulative and --dirstat-by-file · 333f3fb0

由 Johan Herland 提交于 4月 29, 2011

Instead of having multiple interconnected dirstat-related options, teach
the --dirstat option itself to accept all behavior modifiers as parameters.

 - Preserve the current --dirstat=<limit> (where <limit> is an integer
   specifying a cut-off percentage)
 - Add --dirstat=cumulative, replacing --cumulative
 - Add --dirstat=files, replacing --dirstat-by-file
 - Also add --dirstat=changes and --dirstat=noncumulative for specifying the
   current default behavior. These allow the user to reset other --dirstat
   parameters (e.g. 'cumulative' and 'files') occuring earlier on the
   command line.

The deprecated options (--cumulative and --dirstat-by-file) are still
functional, although they have been removed from the documentation.

Allow multiple parameters to be separated by commas, e.g.:
  --dirstat=files,10,cumulative

Update the documentation accordingly, and add testcases verifying the
behavior of the new syntax.
Improved-by: NJunio C Hamano <gitster@pobox.com>
Signed-off-by: NJohan Herland <johan@herland.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

333f3fb0

Make --dirstat=0 output directories that contribute < 0.1% of changes · 58a8756a

由 Johan Herland 提交于 4月 29, 2011

The expected output from --dirstat=0, is to include any directory with
changes, even if those changes contribute a minuscule portion of the total
changes. However, currently, directories that contribute less than 0.1% are
not included, since their 'permille' value is 0, and there is an
'if (permille)' check in gather_dirstat() that causes them to be ignored.

This test is obviously intended to exclude directories that contribute no
changes whatsoever, but in this case, it hits too broadly. The correct
check is against 'this_dir' from which the permille is calculated. Only if
this value is 0 does the directory truly contribute no changes, and should
be skipped from the output.

This patches fixes this issue, and updates corresponding testcases to
expect the new behvaior.
Signed-off-by: NJohan Herland <johan@herland.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

58a8756a

24 4月, 2011 2 次提交

diff: remove often unused parameters from diff_unmerge() · fa7b2908

由 Junio C Hamano 提交于 4月 22, 2011

e9c84099 (diff-index --cached --raw: show tree entry on the LHS for
unmerged entries., 2007-01-05) added a <mode, object name> pair as
parameters to this function, to store them in the pre-image side of an
unmerged file pair. Now the function is fixed to return the filepair it
queued, we can make the caller on the special case codepath to do so.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

fa7b2908

diff.c: return filepair from diff_unmerge() · 76399c01

由 Junio C Hamano 提交于 4月 22, 2011

The underlying diff_queue() returns diff_filepair so that the caller can
further add information to it, and the helper function diff_unmerge()
utilizes the feature itself, but does not expose it to its callers, which
was kind of selfish.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

76399c01

13 4月, 2011 1 次提交

--dirstat: In case of renames, use target filename instead of source filename · 2ca86714

由 Johan Herland 提交于 4月 12, 2011

This changes --dirstat analysis to count "damage" toward the target filename,
rather than the source filename. For renames within a directory, this won't
matter to the final output, but when moving files between diretories, the
output now lists the target directory rather than the source directory.
Signed-off-by: NJohan Herland <johan@herland.net>
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

2ca86714

12 4月, 2011 2 次提交

Teach --dirstat not to completely ignore rearranged lines within a file · 2ff3a803

由 Johan Herland 提交于 4月 11, 2011

Currently, the --dirstat analysis ignores when lines within a file are
rearranged, because the "damage" calculated by show_dirstat() is 0.
However, if the object name has changed, we already know that there is
some damage, and it is unintuitive to claim there is _no_ damage.

Teach show_dirstat() to assign a minimum amount of damage (== 1) to
entries for which the analysis otherwise yields zero damage, to still
represent that these files are changed, instead of saying that there
is no change.

Also, skip --dirstat analysis when the object names are the same (e.g. for
a pure file rename).
Signed-off-by: NJohan Herland <johan@herland.net>
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

2ff3a803

--dirstat-by-file: Make it faster and more correct · 0133dab7

由 Johan Herland 提交于 4月 11, 2011

Currently, when using --dirstat-by-file, it first does the full --dirstat
analysis (using diffcore_count_changes()), and then resets 'damage' to 1,
if any damage was found by diffcore_count_changes().

But --dirstat-by-file is not interested in the file damage per se. It only
cares if the file changed at all. In that sense it only cares if the blob
object for a file has changed. We therefore only need to compare the
object names of each file pair in the diff queue and we can skip the
entire --dirstat analysis and simply set 'damage' to 1 for each entry
where the object name has changed.

This makes --dirstat-by-file faster, and also bypasses --dirstat's practice
of ignoring rearranged lines within a file.

The patch also contains an added testcase verifying that --dirstat-by-file
now detects changes that only rearrange lines within a file.
Signed-off-by: NJohan Herland <johan@herland.net>
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

0133dab7

李少辉-开发者 / git 与 Fork 源项目一致

李少辉-开发者 / git
与 Fork 源项目一致