Commits · 72441af7c4e3bde33cdf7edafcf09c227d5d5296 · 李少辉-开发者 / git

08 Apr, 2014 1 commit

tree-diff: rework diff_tree() to generate diffs for multiparent cases as well · 72441af7

Committed by Kirill Smelkov on Apr 07, 2014

Previously diff_tree(), which is now named ll_diff_tree_sha1(), was
generating diff_filepair(s) for two trees t1 and t2, and that was
usually used for a commit as t1=HEAD~, and t2=HEAD - i.e. to see changes
a commit introduces.

In Git, however, we have fundamentally built flexibility in that a
commit can have many parents - 1 for a plain commit, 2 for a simple merge,
but also more than 2 for merging several heads at once.

For merges there is a so called combine-diff, which shows diff, a merge
introduces by itself, omitting changes done by any parent. That works
through first finding paths, that are different to all parents, and then
showing generalized diff, with separate columns for +/- for each parent.
The code lives in combine-diff.c .

There is an impedance mismatch, however, in that a commit could
generally have any number of parents, and that while diffing trees, we
divide cases for 2-tree diffs and more-than-2-tree diffs. I mean there
is no special casing for multiple parents commits in e.g.
revision-walker .

That impedance mismatch *hurts* *performance* *badly* for generating
combined diffs - in "combine-diff: optimize combine_diff_path
sets intersection" I've already removed some slowness from it, but from
the timings provided there, it could be seen, that combined diffs still
cost more than an order of magnitude more cpu time, compared to diff for
usual commits, and that would only be an optimistic estimate, if we take
into account that for e.g. linux.git there is only one merge for several
dozens of plain commits.

That slowness comes from the fact that currently, while generating
combined diff, a lot of time is spent computing diff(commit,commit^2)
just to only then intersect that huge diff to almost small set of files
from diff(commit,commit^1).

That's because at present, to compute combine-diff, for first finding
paths, that "every parent touches", we use the following combine-diff
property/definition:

D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn)      (w.r.t. paths)

where

D(A,P1...Pn) is combined diff between commit A, and parents Pi

and

D(A,Pi) is usual two-tree diff Pi..A

So if any of that D(A,Pi) is huge, treating 1 n-parent combine-diff as n
1-parent diffs and intersecting results will be slow.
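
To make the definition above concrete, here is a minimal Python sketch that builds the combined path set by intersecting per-parent diffs. It assumes trees are flat {path: hash} dicts; the names two_tree_diff and combined_paths are invented for this illustration and are not Git functions.

```python
def two_tree_diff(parent, commit):
    """Paths whose entry differs between two flat {path: hash} snapshots."""
    return {p for p in parent.keys() | commit.keys()
            if parent.get(p) != commit.get(p)}

def combined_paths(commit, parents):
    """D(A,P1...Pn) w.r.t. paths: intersection of all D(A,Pi)."""
    return set.intersection(*(two_tree_diff(p, commit) for p in parents))

A = {"Makefile": "m2", "fs/a.c": "a2", "net/b.c": "b2"}
P1 = {"Makefile": "m1", "fs/a.c": "a1", "net/b.c": "b2"}
P2 = {"Makefile": "m0", "fs/a.c": "a2", "net/b.c": "b0"}

result = combined_paths(A, [P1, P2])
print(result)
```

Each two_tree_diff() call walks every path of both snapshots, which is exactly the cost that grows when one D(A,Pi) is huge.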

And usually, for linux.git and other topic-based workflows, that
D(A,P2) is huge, because, if merge-base of A and P2, is several dozens
of merges (from A, via first parent) below, that D(A,P2) will be diffing
sum of merges from several subsystems to 1 subsystem.

The solution is to avoid computing n 1-parent diffs, and to find
changed-to-all-parents paths via scanning A's and all Pi's trees
simultaneously, at each step comparing their entries, and based on that
comparison, populate paths result, and deduce we could *skip*
*recursing* into subdirectories, if at least for 1 parent, sha1 of that
dir tree is the same as in A. That would save us from doing significant
amount of needless work.

Such approach is very similar to what diff_tree() does, only there we
deal with scanning only 2 trees simultaneously, and for n+1 tree, the
logic is a bit more complex:

D(T,P1...Pn) calculation scheme
-------------------------------

D(T,P1...Pn) = D(T,P1) ^ ... ^ D(T,Pn)	(regarding resulting paths set)

    D(T,Pj)		- diff between T..Pj
    D(T,P1...Pn)	- combined diff from T to parents P1,...,Pn

We start from all trees, which are sorted, and compare their entries in
lock-step:

     T     P1       Pn
     -     -        -
    |t|   |p1|     |pn|
    |-|   |--| ... |--|      imin = argmin(p1...pn)
    | |   |  |     |  |
    |-|   |--|     |--|
    |.|   |. |     |. |
     .     .        .
     .     .        .

at any time there could be 3 cases:

    1)  t < p[imin];
    2)  t > p[imin];
    3)  t = p[imin].

Schematic deduction of what every case means, and what to do, follows:

1)  t < p[imin]  ->  ∀j t ∉ Pj  ->  "+t" ∈ D(T,Pj)  ->  D += "+t";  t↓

2)  t > p[imin]

    2.1) ∃j: pj > p[imin]  ->  "-p[imin]" ∉ D(T,Pj)  ->  D += ø;  ∀ pi=p[imin]  pi↓
    2.2) ∀i  pi = p[imin]  ->  pi ∉ T  ->  "-pi" ∈ D(T,Pi)  ->  D += "-p[imin]";  ∀i pi↓

3)  t = p[imin]

    3.1) ∃j: pj > p[imin]  ->  "+t" ∈ D(T,Pj)  ->  only pi=p[imin] remains to investigate
    3.2) pi = p[imin]  ->  investigate δ(t,pi)
     |
     |
     v

    3.1+3.2) looking at δ(t,pi) ∀i: pi=p[imin] - if all != ø  ->

                      ⎧δ(t,pi)  - if pi=p[imin]
             ->  D += ⎨
                      ⎩"+t"     - if pi>p[imin]

    in any case t↓  ∀ pi=p[imin]  pi↓
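
The three cases above can be sketched in Python as follows. This is a simplified illustration, not the code from the patch: trees are flat, sorted lists of (name, hash) pairs, there is no recursion into subdirectories, names are compared as plain strings, and the function name multiparent_diff is made up.

```python
def multiparent_diff(t, parents):
    """Walk sorted (name, hash) lists in lock-step.

    Returns (name, new_hash, [hash in each parent]) for every path that
    differs from all parents; None marks "absent".
    """
    ti = 0
    pi = [0] * len(parents)
    result = []
    while True:
        tname = t[ti][0] if ti < len(t) else None
        heads = [p[i] if i < len(p) else None for p, i in zip(parents, pi)]
        names = [h[0] for h in heads if h is not None]
        if tname is None and not names:
            break
        pmin = min(names) if names else None
        at_min = [h is not None and h[0] == pmin for h in heads]

        if pmin is None or (tname is not None and tname < pmin):
            # case 1: t is in no parent, so "+t" is in every D(T,Pj)
            result.append((tname, t[ti][1], [None] * len(parents)))
            ti += 1
            continue
        if tname is None or tname > pmin:
            # case 2: p[imin] is not in T; emit a removal only if
            # every parent has that entry (2.2), otherwise nothing (2.1)
            if all(at_min):
                result.append((pmin, None, [h[1] for h in heads]))
        else:
            # case 3: t == p[imin]; parents not at the minimum lack t,
            # which counts as "+t" for them
            phashes = [h[1] if m else None for h, m in zip(heads, at_min)]
            if all(ph != t[ti][1] for ph in phashes):
                result.append((tname, t[ti][1], phashes))
            ti += 1
        # advance every parent that sits at the minimum
        pi = [i + 1 if m else i for i, m in zip(pi, at_min)]
    return result

T = [("a", "1"), ("b", "2"), ("c", "3")]
P1 = [("a", "1"), ("b", "9"), ("d", "4")]
P2 = [("b", "8"), ("d", "4")]
paths = multiparent_diff(T, [P1, P2])
```

Here "a" is dropped because it is unchanged relative to P1, while "b" (changed everywhere), "c" (added everywhere) and "d" (removed from all parents) are reported.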

~

For comparison, here is how diff_tree() works:

D(A,B) calculation scheme
-------------------------

    A     B
    -     -
   |a|   |b|    a < b   ->  a ∉ B   ->   D(A,B) +=  +a    a↓
   |-|   |-|    a > b   ->  b ∉ A   ->   D(A,B) +=  -b    b↓
   | |   | |    a = b   ->  investigate δ(a,b)            a↓ b↓
   |-|   |-|
   |.|   |.|
    .     .
    .     .

~~~~~~~~

This patch generalizes diff tree-walker to work with arbitrary number of
parents as described above - i.e. now there is a resulting tree t, and
some parents trees tp[i] i=[0..nparent). The generalization builds on
the fact that usual diff

D(A,B)

is by definition the same as combined diff

D(A,[B]),

so if we could rework the code for common case and make it be not slower
for nparent=1 case, usual diff(t1,t2) generation will not be slower, and
multiparent diff tree-walker would greatly benefit generating
combine-diff.

What we do is as follows:

1) diff tree-walker ll_diff_tree_sha1() is internally reworked to be
   a paths generator (new name diff_tree_paths()), with each generated path
   being `struct combine_diff_path` with info for path, new sha1,mode and for
   every parent which sha1,mode it was in it.

2) From that info, we can still generate usual diff queue with
   struct diff_filepairs, via "exporting" generated
   combine_diff_path, if we know we run for nparent=1 case.
   (see emit_diff() which is now named emit_diff_first_parent_only())

3) In order for diff_can_quit_early(), which checks

       DIFF_OPT_TST(opt, HAS_CHANGES)

   to work, that exporting has to happen not in bulk, but
   incrementally, one diff path at a time.

   For such consumers, there is a new callback in diff_options
   introduced:

       ->pathchange(opt, struct combine_diff_path *)

   which, if set to !NULL, is called for every generated path.

   (see new compat ll_diff_tree_sha1() wrapper around new paths
    generator for setup)

4) The paths generation itself, is reworked from previous
   ll_diff_tree_sha1() code according to "D(A,P1...Pn) calculation
   scheme" provided above:

   On the start we allocate [nparent] arrays in place what was
   earlier just for one parent tree.

   then we just generalize loops, and comparison according to the
   algorithm.
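
The incremental export in item 3 can be pictured with a small Python sketch. It is only a model of the idea: the Opts class, its quick and has_changes fields, and walk() are assumptions made for this example, standing in for diff_options, its flags and the paths generator.

```python
class Opts:
    """Toy stand-in for diff_options with a per-path callback."""
    def __init__(self, quick=False, pathchange=None):
        self.quick = quick              # caller only wants to know "any change?"
        self.has_changes = False        # plays the role of HAS_CHANGES
        self.pathchange = pathchange    # called for every generated path

def can_quit_early(opts):
    return opts.quick and opts.has_changes

def walk(changed_paths, opts):
    """Generate paths one at a time; return how many were visited."""
    visited = 0
    for path in changed_paths:
        if can_quit_early(opts):
            break
        visited += 1
        if opts.pathchange is not None:
            opts.pathchange(opts, path)
    return visited

seen = []

def record(opts, path):
    seen.append(path)
    opts.has_changes = True

opts = Opts(quick=True, pathchange=record)
visited = walk(["a.c", "b.c", "c.c"], opts)
```

Because the callback runs after each path, the early-quit check sees the first change immediately; a bulk export at the end would only set the flag after the whole walk.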

Some notes(*):

1) alloca(), for small arrays, is used for "runs not slower for
   nparent=1 case than before" goal - if we change it to xmalloc()/free()
   the timings get ~1% worse. For alloca() we use just-introduced
   xalloca/xalloca_free compatibility wrappers, so it should not be a
   portability problem.

2) For every parent tree, we need to keep a tag, whether entry from that
   parent equals to entry from minimal parent. For performance reasons I'm
   keeping that tag in entry's mode field in unused bit - see S_IFXMIN_NEQ.
   Not doing so, we'd need to alloca another [nparent] array, which hurts
   performance.

3) For emitted paths, memory could be reused, if we know the path was
   processed via callback and will not be needed later. We use efficient
   hand-made realloc-style path_appendnew(), that saves us from ~1-1.5%
   of potential additional slowdown.

4) goto(s) are used in several places, as the code executes a little bit
   faster with lowered register pressure.
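
Note 2 describes keeping a per-parent tag in an otherwise unused bit of the mode field. A minimal Python sketch of that trick follows; the numeric value chosen for S_IFXMIN_NEQ here is an assumption for illustration, as are the helper names.

```python
# Assumed value: a bit well above the ones ordinary file modes use.
S_IFXMIN_NEQ = 0x80000000

def mark_neq(mode):
    """Tag an entry as 'not equal to the minimal parent entry'."""
    return mode | S_IFXMIN_NEQ

def is_neq(mode):
    return bool(mode & S_IFXMIN_NEQ)

def real_mode(mode):
    """Strip the tag to get the file mode back."""
    return mode & ~S_IFXMIN_NEQ

mode = 0o100644
tagged = mark_neq(mode)
```

The tag travels with the entry, so no extra [nparent] array is needed to remember it.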

Also

- we should now check for FIND_COPIES_HARDER not only when two entries
  names are the same, and their hashes are equal, but also for a case,
  when a path was removed from some of all parents having it.

  The reason is, if we don't, that path won't be emitted at all (see
  "a > xi" case), and we'll just skip it, and FIND_COPIES_HARDER wants
  all paths - with diff or without - to be emitted, to be later analyzed
  for being copies sources.

  The new check is only necessary for nparent >1, as for nparent=1 case
  xmin_eqtotal always =1 =nparent, and a path is always added to diff as
  removal.

~~~~~~~~

Timings for

    # without -c, i.e. testing only nparent=1 case
    `git log --raw --no-abbrev --no-renames`

before and after the patch are as follows:

                navy.git        linux.git v3.10..v3.11

    before      0.611s          1.889s
    after       0.619s          1.907s
    slowdown    1.3%            0.9%

These timings show we did no harm to usual diff(tree1,tree2) generation.
From the table we can see that we actually did ~1% slowdown, but I think
I've "earned" that 1% in the previous patch ("tree-diff: reuse base
str(buf) memory on sub-tree recursion", HEAD~~) so for nparent=1 case,
net timings stay approximately the same.

The output also stayed the same.

(*) If we revert 1)-4) to more usual techniques, for nparent=1 case,
    we'll get ~2-2.5% of additional slowdown, which I've tried to avoid, as
   "do no harm for nparent=1 case" rule.

For linux.git, combined diff will run an order of magnitude faster and
appropriate timings will be provided in the next commit, as we'll be
taking advantage of the new diff tree-walker for combined-diff
generation there.

P.S. and combined diff is not some exotic/for-play-only stuff - for
example for a program I write to represent Git archives as readonly
filesystem, there is initial scan with

    `git log --reverse --raw --no-abbrev --no-renames -c`

to extract log of what was created/changed when, as a result building a
map

    {}  sha1    ->  in which commit (and date) a content was added

that `-c` means also show combined diff for merges, and without them, if
a merge is non-trivial (merges changes from two parents with both having
separate changes to a file), or an evil one, the map will not be full,
i.e. some valid sha1 would be absent from it.
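
A rough Python sketch of building such a map from the log output is shown below. It assumes the raw layout with one leading colon per parent, followed by the modes, the object names, the status letters, a tab and the path; the function names and the sample lines (with fake, repeated-letter object names) are made up for the example.

```python
NULL_SHA1 = "0" * 40

def parse_raw_line(line):
    """Return (result_sha1, path) for a raw diff line, else None."""
    nparent = len(line) - len(line.lstrip(":"))
    if nparent == 0:
        return None
    meta, _, path = line.partition("\t")
    fields = meta[nparent:].split()
    # nparent+1 modes come first, then nparent+1 object names;
    # the last object name is the one in the resulting commit.
    return fields[2 * nparent + 1], path

def build_map(log_lines):
    """Map sha1 -> first commit (oldest first) in which it shows up."""
    added_in = {}
    commit = None
    for line in log_lines:
        if line.startswith("commit "):
            commit = line.split()[1]
            continue
        parsed = parse_raw_line(line)
        if parsed is not None and parsed[0] != NULL_SHA1:
            added_in.setdefault(parsed[0], commit)
    return added_in

log = [
    "commit c1",
    ":000000 100644 " + NULL_SHA1 + " " + "a" * 40 + " A\tREADME",
    "commit c2",
    "::100644 100644 100644 " + "b" * 40 + " " + "c" * 40 + " " + "d" * 40 + " MM\tREADME",
    ":100644 000000 " + "a" * 40 + " " + NULL_SHA1 + " D\told.txt",
]
added_in = build_map(log)
```

The entry for the merge result ("d" repeated) only exists because the two-colon combined line was parsed; dropping those lines is what would leave the map incomplete.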

That case was my initial motivation for combined diffs speedup.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

27 Mar, 2014 1 commit

tree-diff: diff_tree() should now be static · ad6f3cc7

Committed by Kirill Smelkov on Feb 24, 2014

We reworked all its users to use the functionality through
diff_tree_sha1 variant in recent patches (see "tree-diff: allow
diff_tree_sha1 to accept NULL sha1" and what comes next).

diff_tree() is now not used outside tree-diff.c - make it static.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

25 Feb, 2014 1 commit

combine-diff: combine_diff_path.len is not needed anymore · af82c788

Committed by Kirill Smelkov on Jan 20, 2014

The field was used in order to speed-up name comparison and also to
mark removed paths by setting it to 0.

Because the updated code does significantly less strcmp and also
just removes paths from the list and free right after we know a path
will not be needed, it is not needed anymore.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

13 Dec, 2013 1 commit

diff: move no-index detection to builtin/diff.c · 470faf96

Committed by Thomas Gummerer on Dec 11, 2013

Currently the --no-index option is parsed in diff_no_index().  Move the
detection if a no-index diff should be executed to builtin/diff.c, where
we can use it for executing diff_no_index() conditionally.  This will
also allow us to execute other operations conditionally, which will be
done in the next patch.

There are no functional changes.
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

07 Dec, 2013 1 commit

difftool: display the number of files in the diff queue in the prompt · ee7fb0b1

Committed by Zoltan Klinger on Dec 06, 2013

When --prompt option is set, git-difftool displays a prompt for each
modified file to be viewed in an external diff program.  At that
point, it could be useful to display a counter and the total number
of files in the diff queue.

Below is the current difftool prompt for the first of 5 modified files:

    Viewing: 'diff.c'
    Launch 'vimdiff' [Y/n]:

Consider the modified prompt:

    Viewing (1/5): 'diff.c'
    Launch 'vimdiff' [Y/n]:

The current GIT_EXTERNAL_DIFF mechanism does not tell the number of
paths in the diff queue nor the current counter.  Make this
"counter/total" info available for GIT_EXTERNAL_DIFF programs
without breaking existing ones by doing the following:

 - Keep track of the number of paths shown so far in diff_options;

 - Export two new environment variables from run_external_diff() to
   show the total number of paths (from diff_queue_struct) and the
   current value of the counter (from diff_options); and

 - Update git-difftool--helper to use these two environment variables.
Signed-off-by: Zoltan Klinger <zoltan.klinger@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

01 Nov, 2013 1 commit

Use the word 'stuck' instead of 'sticked' · b0d12fc9

Committed by Nicolas Vigier on Oct 31, 2013

The past participle of 'stick' is 'stuck'.
Signed-off-by: Nicolas Vigier <boklm@mars-attacks.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

20 Jul, 2013 1 commit

diff: deprecate -q option to diff-files · 95a7c546

Committed by Junio C Hamano on Jul 17, 2013

This reimplements the ancient "-q" option to "git diff-files" that
was inherited from "show-diff -q" in terms of "--diff-filter=d".  We
will be deprecating the "-q" option, so let's issue a warning when
we do so.

Incidentally this also tentatively fixes "git diff --no-index" to
honor "-q" and hide deletions; the user will get the same warning.

We should remove the support for "-q" in a future version but it is
not that urgent.
Signed-off-by: Junio C Hamano <gitster@pobox.com>

18 Jul, 2013 1 commit

diff: preparse --diff-filter string argument · 1ecc1cbd

Committed by Junio C Hamano on Jul 17, 2013

Instead of running strchr() on the list of status characters over
and over again, parse the --diff-filter option into bitfields and
use the bits to see if the change to the filepair matches the status
requested.
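
The idea of parsing the option once into bits can be sketched in Python like this. The letter set, the bit assignment and the function names are assumptions for the illustration; the sketch also leaves out any special characters the real option may accept.

```python
# Assumed set of status letters; each gets its own bit.
STATUS_LETTERS = "ACDMRTUXB"
FILTER_BIT = {letter: 1 << i for i, letter in enumerate(STATUS_LETTERS)}

def parse_diff_filter(arg):
    """Turn a filter string such as "AM" into a bitmask, once."""
    bits = 0
    for letter in arg:
        if letter not in FILTER_BIT:
            raise ValueError("unknown status letter: %r" % letter)
        bits |= FILTER_BIT[letter]
    return bits

def filter_matches(bits, status):
    """Constant-time check per filepair instead of scanning the string."""
    return bool(bits & FILTER_BIT[status])

wanted = parse_diff_filter("AM")
```

After parsing, every filepair costs one mask test rather than a character search through the option string.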
Signed-off-by: Junio C Hamano <gitster@pobox.com>

16 Jul, 2013 2 commits

remove diff_tree_{setup,release}_paths · bd1928df

Committed by Nguyễn Thái Ngọc Duy on Jul 14, 2013

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

move struct pathspec and related functions to pathspec.[ch] · 64acde94

Committed by Nguyễn Thái Ngọc Duy on Jul 14, 2013

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

11 May, 2013 1 commit

diff_opt: track whether flags have been set explicitly · 6c374008

Committed by Junio C Hamano on May 10, 2013

The diff_opt infrastructure sets flags based on defaults and command
line options.  It is impossible to tell whether a flag has been set
as a default or on explicit request.  Update the structure so that
this detection is possible:

 * Add an extra "opt->touched_flags" that keeps track of all the
   fields that have been touched by DIFF_OPT_SET and DIFF_OPT_CLR.

 * You may continue setting the default values to the flags, like
   commands in the "log" family do in cmd_log_init_defaults(), but
   after you finished setting the defaults, you clear the
   touched_flags field;

 * And then you let the usual callchain call diff_opt_parse(),
   allowing the opt->flags be set or unset, while keeping track of
   which bits the user touched;

 * There is an optional callback "opt->set_default" that is called
   at the very beginning to let you inspect touched_flags and update
   opt->flags appropriately, before the remainder of the diffcore
   machinery is set up, taking the opt->flags value into account.
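
The bookkeeping described in the list above can be modelled with a short Python sketch. The class, its methods and the two flag names are hypothetical; they only mirror the roles of opt->flags, opt->touched_flags, DIFF_OPT_SET and DIFF_OPT_CLR.

```python
# Hypothetical flag bits for the sketch.
RECURSIVE = 1 << 0
TEXT = 1 << 1

class DiffOptions:
    def __init__(self):
        self.flags = 0
        self.touched_flags = 0

    def set(self, bit):            # role of DIFF_OPT_SET
        self.flags |= bit
        self.touched_flags |= bit

    def clr(self, bit):            # role of DIFF_OPT_CLR
        self.flags &= ~bit
        self.touched_flags |= bit

    def touched(self, bit):
        return bool(self.touched_flags & bit)

opt = DiffOptions()
opt.set(RECURSIVE)        # a command-level default
opt.touched_flags = 0     # defaults are done: forget what was touched
opt.clr(TEXT)             # an explicit request from the command line
```

After this sequence a later hook can tell that RECURSIVE is on only by default, while TEXT was decided explicitly even though its bit is off.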
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

13 Feb, 2013 1 commit

diff: add diff_line_prefix function · f1922234

Committed by John Keeping on Feb 07, 2013

This is a helper function to call the diff output_prefix function and
return its value as a C string, allowing us to greatly simplify
everywhere that needs to get the output prefix.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

17 Jan, 2013 1 commit

diff: Introduce --diff-algorithm command line option · 07924d4d

Committed by Michal Privoznik on Jan 16, 2013

Since command line options have higher priority than config file
variables and taking previous commit into account, we need a way
to specify the myers algorithm on the command line. However,
inventing `--myers` is not the right answer. We need a far more
general option, and that is `--diff-algorithm`.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

29 Oct, 2012 1 commit

Move setup_diff_pager to libgit.a · 4914c962

Committed by Nguyễn Thái Ngọc Duy on Oct 26, 2012

This is used by diff-no-index.c, part of libgit.a while it stays in
builtin/diff.c. Move it to diff.c so that we won't get undefined
reference if a program that uses libgit.a happens to pull it in.

While at it, move check_pager from git.c to pager.c. It makes more
sense there and pager.c is also part of libgit.a
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>

16 Sep, 2012 1 commit

diff.c: mark a private file-scope symbol as static · d2aea137

Committed by Junio C Hamano on Sep 15, 2012

Signed-off-by: Junio C Hamano <gitster@pobox.com>

04 Aug, 2012 1 commit

diff_setup_done(): return void · 28452655

Committed by Thomas Rast on Aug 03, 2012

diff_setup_done() has historically returned an error code, but lost
the last nonzero return in 943d5b73 (allow diff.renamelimit to be set
regardless of -M/-C, 2006-08-09).  The callers were in a pretty
confused state: some actually checked for the return code, and some
did not.

Let it return void, and patch all callers to take this into account.
This conveniently also gets rid of a handful of different(!) error
messages that could never be triggered anyway.

Note that the function can still die().
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

30 Jul, 2012 1 commit

diff: do not use null sha1 as a sentinel value · e5450100

Committed by Jeff King on Jul 28, 2012

The diff code represents paths using the diff_filespec
struct. This struct has a sha1 to represent the sha1 of the
content at that path, as well as a sha1_valid member which
indicates whether its sha1 field is actually useful. If
sha1_valid is not true, then the filespec represents a
working tree file (e.g., for the no-index case, or for when
the index is not up-to-date).

The diff_filespec is only used internally, though. At the
interfaces to the diff subsystem, callers feed the sha1
directly, and we create a diff_filespec from it. It's at
that point that we look at the sha1 and decide whether it is
valid or not; callers may pass the null sha1 as a sentinel
value to indicate that it is not.

We should not typically see the null sha1 coming from any
other source (e.g., in the index itself, or from a tree).
However, a corrupt tree might have a null sha1, which would
cause "diff --patch" to accidentally diff the working tree
version of a file instead of treating it as a blob.

This patch extends the edges of the diff interface to accept
a "sha1_valid" flag whenever we accept a sha1, and to use
that flag when creating a filespec. In some cases, this
means passing the flag through several layers, making the
code change larger than would be desirable.

One alternative would be to simply die() upon seeing
corrupted trees with null sha1s. However, this fix more
directly addresses the problem (while bogus sha1s in a tree
are probably a bad thing, it is really the sentinel
confusion sending us down the wrong code path that is what
makes it devastating). And it means that git is more capable
of examining and debugging these corrupted trees. For
example, you can still "diff --raw" such a tree to find out
when the bogus entry was introduced; you just cannot do a
"--patch" diff (just as you could not with any other
corrupted tree, as we do not have any content to diff).
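
The sentinel confusion can be illustrated with a small Python sketch. The Filespec class and the two helper functions are invented for this example; they only contrast "infer validity from the value" with "take validity as an explicit flag".

```python
from dataclasses import dataclass

NULL_SHA1 = "0" * 40

@dataclass
class Filespec:
    path: str
    sha1: str
    sha1_valid: bool   # False means "look at the working tree file"

def fill_with_sentinel(path, sha1):
    """Old style: a null sha1 silently means 'not valid'."""
    return Filespec(path, sha1, sha1 != NULL_SHA1)

def fill_with_flag(path, sha1, sha1_valid):
    """New style: the caller states validity explicitly."""
    return Filespec(path, sha1, sha1_valid)

# An entry read from a corrupt tree that happens to carry the null sha1.
old_style = fill_with_sentinel("file.c", NULL_SHA1)
new_style = fill_with_flag("file.c", NULL_SHA1, True)
```

With the sentinel, the corrupt tree entry is mistaken for a working tree file; with the flag, it stays a (bogus) blob reference and can still be reported as such.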
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

17 Apr, 2012 1 commit

Add output_prefix_length to diff_options · 5e71a84a

Committed by Lucian Poston on Apr 16, 2012

Add output_prefix_length to diff_options. Initialize the value to 0 and only
set it when graph.c:diff_output_prefix_callback() is called.
Signed-off-by: Lucian Poston <lucian.poston@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

24 Mar, 2012 1 commit

teach diffcore-rename to optionally ignore empty content · 90d43b07

Committed by Jeff King on Mar 22, 2012

Our rename detection is a heuristic, matching pairs of
removed and added files with similar or identical content.
It's unlikely to be wrong when there is actual content to
compare, and we already take care not to do inexact rename
detection when there is not enough content to produce good
results.

However, we always do exact rename detection, even when the
blob is tiny or empty. It's easy to get false positives with
an empty blob, simply because it is an obvious content to
use as a boilerplate (e.g., when telling git that an empty
directory is worth tracking via an empty .gitignore).

This patch lets callers specify whether or not they are
interested in using empty files as rename sources and
destinations. The default is "yes", keeping the original
behavior. It works by detecting the empty-blob sha1 for
rename sources and destinations.
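
As an illustration only, the following Python sketch pairs removed and added paths by identical object name and optionally skips the empty blob. The function name and the ignore_empty parameter are assumptions; the empty-blob name is computed from the blob header rather than hard-coded.

```python
import hashlib

# Object name of the empty blob: SHA-1 over the header "blob 0\0" and no content.
EMPTY_BLOB_SHA1 = hashlib.sha1(b"blob 0\0").hexdigest()

def exact_renames(removed, added, ignore_empty=False):
    """Map each added path to a removed path with the same object name."""
    sources = {}
    for path, sha1 in removed.items():
        if ignore_empty and sha1 == EMPTY_BLOB_SHA1:
            continue                      # empty files are not rename sources
        sources.setdefault(sha1, path)
    return {dst: sources[sha1] for dst, sha1 in added.items() if sha1 in sources}

removed = {"old/.gitignore": EMPTY_BLOB_SHA1, "a.c": "1" * 40}
added = {"new/.gitkeep": EMPTY_BLOB_SHA1, "b.c": "1" * 40}

default_pairs = exact_renames(removed, added)
strict_pairs = exact_renames(removed, added, ignore_empty=True)
```

With the default, the two unrelated empty files are reported as a rename; with ignore_empty=True only the pair with real content remains.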

One more flexible alternative would be to allow the caller
to specify a minimum size for a blob to be "interesting" for
rename detection. But that would catch small boilerplate
files, not large ones (e.g., if you had the GPL COPYING file
in many directories).

A better alternative would be to allow a "-rename"
gitattribute to allow boilerplate files to be marked as
such. I'll leave the complexity of that solution until such
time as somebody actually wants it. The complaints we've
seen so far revolve around empty files, so let's start with
the simple thing.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

02 Mar, 2012 1 commit

diff --stat: enable limiting of the graph part · 969fe57b

Committed by Zbigniew Jędrzejewski-Szmek on Mar 01, 2012

A new option --stat-graph-width=<width> can be used to limit the width
of the graph part even if more space is available. Up to <width>
columns will be used for the graph.

If commits changing a lot of lines are displayed in a wide terminal
window (200 or more columns), and the +- graph uses the full width,
the output can be hard to comfortably scan with a horizontal movement
of human eyes. Messages wrapped to about 80 columns would be
interspersed with very long +- lines. It makes sense to limit the
width of the graph part to a fixed value (e.g. 70 columns), even if
more columns are available.
Signed-off-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

29 Feb, 2012 1 commit

pickaxe: allow -i to search in patch case-insensitively · accccde4

Committed by Junio C Hamano on Feb 21, 2012

"git log -S<string>" is a useful way to find the last commit in the
codebase that touched the <string>. As it was designed to be used by a
porcelain script to dig the history starting from a block of text that
appears in the starting commit, it never had to look for anything but an
exact match.

When used by an end user who wants to look for the last commit that
removed a string (e.g. name of a variable) that he vaguely remembers,
however, it is useful to support case insensitive match.

When given the "--regexp-ignore-case" (or "-i") option, which originally
was designed to affect case sensitivity of the search done in the commit
log part, e.g. "log --grep", the matches made with -S/-G pickaxe search are
done case insensitively now.
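
A minimal Python sketch of the -S style check with an optional case-insensitive mode follows. It relies on the documented behaviour that -S looks for a change in the number of occurrences of the string; the function names and the ignore_case parameter are made up for the example.

```python
def count_occurrences(text, needle, ignore_case=False):
    if ignore_case:
        text, needle = text.lower(), needle.lower()
    return text.count(needle)

def pickaxe_hit(old_text, new_text, needle, ignore_case=False):
    """True if the number of occurrences of needle changed."""
    return (count_occurrences(old_text, needle, ignore_case)
            != count_occurrences(new_text, needle, ignore_case))

old = "int Frotz = 0;\n"
new = "int nitfol = 0;\n"

exact = pickaxe_hit(old, new, "frotz")
relaxed = pickaxe_hit(old, new, "frotz", ignore_case=True)
```

The exact search misses the removal of "Frotz" because of the capital letter, while the case-insensitive one finds it.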
Signed-off-by: Junio C Hamano <gitster@pobox.com>

20 Feb, 2012 1 commit

xdiff: PATIENCE/HISTOGRAM are not independent option bits · 307ab20b

Committed by Junio C Hamano on Feb 19, 2012

Because the default Myers, patience and histogram algorithms cannot be in
effect at the same time, XDL_PATIENCE_DIFF and XDL_HISTOGRAM_DIFF are not
independent bits.  Instead of wasting one bit per algorithm, define a few
macros to access the few bits they occupy and update the code that accesses
them.
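
The "few bits accessed through macros" idea can be sketched in Python as a masked field. The bit positions, the mask name and the helper functions are assumptions for the illustration, not the values from xdiff.

```python
# Assumed bit positions; only their grouping into one field matters here.
XDL_PATIENCE_DIFF = 1 << 14
XDL_HISTOGRAM_DIFF = 1 << 15
ALGORITHM_MASK = XDL_PATIENCE_DIFF | XDL_HISTOGRAM_DIFF   # hypothetical name
OTHER_FLAG = 1 << 3                                       # some unrelated option

def diff_algorithm(flags):
    """Read the algorithm field; 0 stands for the default (Myers)."""
    return flags & ALGORITHM_MASK

def with_algorithm(flags, algorithm):
    """Replace the whole field so two algorithms are never set together."""
    return (flags & ~ALGORITHM_MASK) | algorithm

flags = OTHER_FLAG
flags = with_algorithm(flags, XDL_PATIENCE_DIFF)
flags = with_algorithm(flags, XDL_HISTOGRAM_DIFF)
```

Writing through the mask clears the previous choice, which plain bit-setting on independent flags would not do.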
Signed-off-by: Junio C Hamano <gitster@pobox.com>

04 Feb, 2012 1 commit

Use correct grammar in diffstat summary line · 7f814632

Committed by Nguyễn Thái Ngọc Duy on Feb 01, 2012

"git diff --stat" and "git apply --stat" now learn to print the line
"%d files changed, %d insertions(+), %d deletions(-)" in singular form
whenever applicable. "0 insertions" and "0 deletions" are also omitted
unless they are both zero.
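
The rule in the paragraph above can be written down as a short Python sketch. The function name is made up, and the sketch ignores translation and leading whitespace of the real output.

```python
def stat_summary(files, insertions, deletions):
    """Build the summary line with singular/plural forms."""
    parts = ["%d file%s changed" % (files, "" if files == 1 else "s")]
    # A zero count is shown only when the other count is zero as well.
    if insertions or not deletions:
        parts.append("%d insertion%s(+)"
                     % (insertions, "" if insertions == 1 else "s"))
    if deletions or not insertions:
        parts.append("%d deletion%s(-)"
                     % (deletions, "" if deletions == 1 else "s"))
    return ", ".join(parts)

print(stat_summary(1, 1, 0))
print(stat_summary(3, 0, 2))
print(stat_summary(2, 0, 0))
```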

This matches how versions of "diffstat" that are not prehistoric produced
their output, and also makes this line translatable.

[jc: with help from Thomas Dickey in archaeology of "diffstat"]
[jc: squashed Jonathan's updates to illustrations in tutorials and a test]
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

18 Dec, 2011 2 commits

pass struct commit to diff_tree_combined_merge() · 82889295

Committed by René Scharfe on Dec 17, 2011

Instead of passing the hash of a commit and then searching that
same commit in the single caller, simply pass the commit directly.
Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

use struct sha1_array in diff_tree_combined() · 0041f09d

Committed by René Scharfe on Dec 17, 2011

Maintaining an array of hashes is easier using sha1_array than
open-coding it.  This patch also fixes a leak of the SHA1 array
in  diff_tree_combined_merge().
Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

11 Oct, 2011 1 commit

diff: add option to show whole functions as context · 14937c2c

Committed by René Scharfe on Oct 09, 2011

Add the option -W/--function-context to git diff.  It is similar to
the same option of git grep and expands the context of change hunks
so that the whole surrounding function is shown.  This "natural"
context can allow changes to be understood better.

Note: GNU patch doesn't like diffs generated with the new option;
it seems to expect context lines to be the same before and after
changes.  git apply doesn't complain.

This implementation has the same shortcoming as the one in grep,
namely that there is no way to explicitly find the end of a
function.  That means that a few lines of extra context are shown,
right up to where the next recognized function begins.  It's already
useful in its current form, though.

The function get_func_line() in xdiff/xemit.c is extended to work
forward as well as backward to find post-context as well as
pre-context.  It returns the position of the first found matching
line.  The func_line parameter is made optional, as we don't need
it for -W.

The enhanced function is then used in xdl_emit_diff() to extend
the context as needed.  If the added context overlaps with the
next change, it is merged into the current hunk.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

19 Aug, 2011 1 commit

diff: refactor COLOR_DIFF from a flag into an int · f1c96261

Committed by Jeff King on Aug 17, 2011

This lets us store more than just a bit flag for whether we
want color; we can also store whether we want automatic
colors. This can be useful for making the automatic-color
decision closer to the point of use.

This mostly just involves replacing DIFF_OPT_* calls with
manipulations of the flag. The biggest exception is that
calls to DIFF_OPT_TST must check for "o->use_color > 0",
which lets an "unknown" value (i.e., the default) stay at
"no color". In the previous code, a value of "-1" was not
propagated at all.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

01 Jun, 2011 1 commit

diff: futureproof "stop feeding the backend early" logic · 28b9264d

Committed by Junio C Hamano on May 31, 2011

Refactor the "do not stop feeding the backend early" logic into a small
helper function and use it in both run_diff_files() and diff_tree() that
have the stop-early optimization. We may later add other types of diffcore
transformation that need to look at the whole result like diff-filter
does, and having the logic in a single place is essential for longer term
maintainability.
Signed-off-by: Junio C Hamano <gitster@pobox.com>

28 May, 2011 1 commit

diff: introduce --stat-lines to limit the stat lines · 808e1db2

Committed by Michael J Gruber on May 27, 2011

Often one is interested in the full --stat output only for commits which
change a few files, but not others, because larger restructuring gives a
--stat which fills a few screens.

Introduce a new option --stat-count=<count> which limits the --stat output
to the first <count> lines, followed by a "..." line. It can
also be given as the third parameter in
--stat=<width>,<name-width>,<count>.

Also, the unstuck form is supported analogous to the other two stat
parameters.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

30 Apr, 2011 2 commits

New --dirstat=lines mode, doing dirstat analysis based on diffstat · 1c57a627

Committed by Johan Herland on Apr 29, 2011

This patch adds an alternative implementation of show_dirstat(), called
show_dirstat_by_line(), which uses the more expensive diffstat analysis
(as opposed to show_dirstat()'s own (relatively inexpensive) analysis)
to derive the numbers from which the --dirstat output is computed.

The alternative implementation is controlled by the new "lines" parameter
to the --dirstat option (or the diff.dirstat config variable).

For binary files, the diffstat analysis counts bytes instead of lines,
so to prevent binary files from dominating the dirstat results, the
byte counts for binary files are divided by 64 before being compared to
their textual/line-based counterparts. This is a stupid and ugly - but
very cheap - heuristic.
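
The binary-file heuristic can be sketched in Python as below. The function names, the tuple layout and the use of floor division are assumptions for this illustration; the message only says the byte counts are divided by 64.

```python
def dirstat_damage(added, deleted, is_binary):
    """Lines changed for text files; bytes changed / 64 for binary files."""
    total = added + deleted
    # Floor division is an assumption of this sketch.
    return total // 64 if is_binary else total

def dirstat_shares(changes):
    """Percentage of total damage per top-level directory."""
    per_dir = {}
    for path, added, deleted, is_binary in changes:
        top = path.split("/", 1)[0]
        per_dir[top] = per_dir.get(top, 0) + dirstat_damage(added, deleted, is_binary)
    total = sum(per_dir.values())
    if total == 0:
        return {}
    return {name: 100.0 * value / total for name, value in per_dir.items()}

changes = [
    ("src/main.c", 60, 40, False),      # 100 changed lines
    ("img/logo.png", 6400, 0, True),    # 6400 changed bytes
]
shares = dirstat_shares(changes)
```

Without the division the image would account for almost all of the damage; with it, both directories end up with an equal share in this example.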

In linux-2.6.git, running the three different --dirstat modes:

  time git diff v2.6.20..v2.6.30 --dirstat=changes > /dev/null
vs.
  time git diff v2.6.20..v2.6.30 --dirstat=lines > /dev/null
vs.
  time git diff v2.6.20..v2.6.30 --dirstat=files > /dev/null

yields the following average runtimes on my machine:

 - "changes" (default): ~6.0 s
 - "lines":             ~9.6 s
 - "files":             ~0.1 s

So, as expected, there's a considerable performance hit (~60%) by going
through the full diffstat analysis as compared to the default "changes"
analysis (obviously, "files" is much faster than both). As such, the
"lines" mode is probably only useful if you really need the --dirstat
numbers to be consistent with the numbers returned from the other
--*stat options.

The patch also includes documentation and tests for the new dirstat mode.
Improved-by: NJunio C Hamano <gitster@pobox.com>
Signed-off-by: NJohan Herland <johan@herland.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

1c57a627

Allow specifying --dirstat cut-off percentage as a floating point number · 712d2c7d

Authored by Johan Herland on Apr 29, 2011

Only the first digit after the decimal point is kept, as the dirstat
calculations all happen in permille.

Selftests verifying floating-point percentage input have been added.
Improved-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

712d2c7d
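
As a rough illustration of the permille rule in 712d2c7d, the sketch below converts a percentage string to an integer permille value and keeps only the first digit after the decimal point. The function name and the error handling are hypothetical; git's own parser is written in C and may accept or reject different inputs.

```python
def percent_to_permille(text):
    """Convert a percentage such as "7.5" to integer permille (75).

    Only the first digit after the decimal point is kept; any further
    digits are dropped rather than rounded.
    """
    whole, _, frac = text.partition(".")
    valid_whole = whole.isascii() and whole.isdigit()
    valid_frac = frac == "" or (frac.isascii() and frac.isdigit())
    if not (valid_whole and valid_frac):
        raise ValueError("not a plain decimal percentage: %r" % text)
    permille = int(whole) * 10
    if frac:
        permille += int(frac[0])
    return permille
```

Under this rule `12.345` and `12.3` denote the same cut-off, since both become 123 permille.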

24 Apr, 2011 2 commits

diff: remove often unused parameters from diff_unmerge() · fa7b2908

Authored by Junio C Hamano on Apr 22, 2011

e9c84099 (diff-index --cached --raw: show tree entry on the LHS for
unmerged entries., 2007-01-05) added a <mode, object name> pair as
parameters to this function, to store them in the pre-image side of an
unmerged file pair. Now that the function returns the filepair it
queued, the caller on the special-case codepath can set them itself.
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fa7b2908

diff.c: return filepair from diff_unmerge() · 76399c01

Authored by Junio C Hamano on Apr 22, 2011

The underlying diff_queue() returns diff_filepair so that the caller can
further add information to it, and the helper function diff_unmerge()
utilizes the feature itself, but does not expose it to its callers, which
was kind of selfish.
Signed-off-by: Junio C Hamano <gitster@pobox.com>

76399c01

03 Apr, 2011 1 commit

git diff -D: omit the preimage of deletes · 467ddc14

Authored by Junio C Hamano on Feb 28, 2011

When reviewing a patch while concentrating primarily on the text after
the change, wading through pages of deleted text involves a cognitive
burden.

Introduce the -D option that omits the preimage text from the patch output
for deleted files.  When used with -B (represent total rewrite as a single
wholesale deletion followed by a single wholesale addition), the preimage
text is also omitted.

To prevent such a patch from being applied by mistake, the output is
designed not to be usable by "git apply" (or GNU "patch"); it is strictly
for human consumption.

It of course is possible to "apply" such a patch by hand, as a human can
read the intention out of such a patch.  It however is impossible to apply
such a patch even manually in reverse, as the whole point of this option
is to omit the information necessary to do so from the output.

Initial request by Mart Sõmermaa, documentation and tests helped by
Michael J Gruber.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

467ddc14

23 Mar, 2011 1 commit

diffcore-rename: fall back to -C when -C -C busts the rename limit · f31027c9

Authored by Junio C Hamano on Jan 06, 2011

When there are too many paths in the project, the number of rename source
candidates "git diff -C -C" finds will exceed the rename detection limit,
and no inexact rename detection is performed. We however could fall back
to "git diff -C" if the number of modified paths is sufficiently small.
Signed-off-by: Junio C Hamano <gitster@pobox.com>

f31027c9
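
The fallback described in f31027c9 amounts to a two-step decision, sketched below. The helper names, the returned labels, and in particular the size check (product of destinations and sources compared against the square of the limit) are assumptions for illustration only; the commit message does not spell out the exact test git applies.

```python
def too_many_candidates(num_dest, num_src, limit):
    """Hypothetical size check: is the dest x source matrix too large?"""
    return num_dest * num_src > limit * limit


def pick_copy_detection(num_dest, num_all_paths, num_modified, limit):
    """Choose how hard to look for copies.

    "-C -C" treats every path in the project as a possible copy source,
    while "-C" only considers paths modified in the same change.
    """
    if not too_many_candidates(num_dest, num_all_paths, limit):
        return "-C -C"
    if not too_many_candidates(num_dest, num_modified, limit):
        # Fall back instead of giving up on inexact detection entirely.
        return "-C"
    return "none"
```

For a change that adds 10 files and modifies 5 in a project with 50,000 paths, a limit of 100 rules out `-C -C` but still allows `-C`.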

22 Feb, 2011 2 commits

add inexact rename detection progress infrastructure · 3ac942d4

Authored by Jeff King on Feb 20, 2011

We might spend many seconds doing inexact rename detection
with no output.  It's nice to let the user know that
something is actually happening.

This patch adds the infrastructure, but no callers actually
turn on progress reporting.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

3ac942d4

merge: improve inexact rename limit warning · bf0ab10f

Authored by Jeff King on Feb 19, 2011

The warning is generated deep in the diffcore code, which
means that it will come first, followed possibly by a spew
of conflicts, making it hard to see.

Instead, let's have diffcore pass back the information about
how big the rename limit would have needed to be, and then
the caller can provide a more appropriate message (and at a
more appropriate time).

No refactoring of other non-merge callers is necessary,
because nobody else was even using the warn_on_rename_limit
feature.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

bf0ab10f

04 Feb, 2011 1 commit

Convert struct diff_options to use struct pathspec · 66f13625

Authored by Nguyễn Thái Ngọc Duy on Dec 15, 2010

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

66f13625

30 Sep, 2010 1 commit

merge-recursive: option to specify rename threshold · 10ae7526

Authored by Kevin Ballard on Sep 27, 2010

The recursive merge strategy turns on rename detection but leaves the
rename threshold at the default. Add a strategy option to allow the user
to specify a rename threshold to use.
Signed-off-by: Kevin Ballard <kevin@sb.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

10ae7526

01 Sep, 2010 1 commit

git log/diff: add -G<regexp> that greps in the patch text · f506b8e8

Authored by Junio C Hamano on Aug 23, 2010

Teach "-G<regexp>" that is similar to "-S<regexp> --pickaxe-regexp" to the
"git diff" family of commands.  This limits the diff queue to filepairs
whose patch text actually has an added or a deleted line that matches the
given regexp.  Unlike "-S<regexp>", changing other parts of the line that
has a substring that matches the given regexp IS counted as a change, as
such a change would appear as one deletion followed by one addition in a
patch text.

Unlike -S (pickaxe) that is intended to be used to quickly detect a commit
that changes the number of occurrences of hits between the preimage and
the postimage to serve as a part of larger toolchain, this is meant to be
used as the top-level Porcelain feature.

The implementation unfortunately has to run "diff" twice if you are
running "log" family of commands to produce patches in the final output
(e.g. "git log -p" or "git format-patch").  I think we _could_ cache the
result in-core if we wanted to, but that would require larger surgery to
the diffcore machinery (i.e. adding an extra pointer in the filepair
structure to keep a pointer to a strbuf around, stuff the textual diff to
the strbuf inside diffgrep_consume(), and make use of it in later stages
when it is available) and it may not be worth it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>

f506b8e8
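
The difference between -S and -G described in f506b8e8 can be shown with a small model. This is a sketch, not git's implementation: it uses Python's `difflib` in place of git's diff engine, treats the -S argument as a plain string, and the function names are hypothetical.

```python
import re
from difflib import SequenceMatcher


def pickaxe_S(pre, post, needle):
    """-S style: hit only when the number of occurrences changes."""
    return pre.count(needle) != post.count(needle)


def pickaxe_G(pre, post, pattern):
    """-G style: hit when any added or deleted line matches the regexp."""
    old, new = pre.splitlines(), post.splitlines()
    regex = re.compile(pattern)
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # context lines are not part of the change
        # old[i1:i2] are the deleted lines, new[j1:j2] the added ones
        if any(regex.search(line) for line in old[i1:i2] + new[j1:j2]):
            return True
    return False
```

Changing `int frotz(int x);` to `int frotz(long x);` leaves the number of `frotz` occurrences at one, so the -S style check reports nothing, while the -G style check reports a hit because the line appears as one deletion followed by one addition.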
