Commits · 95b567c7c3cf6b85d74b79424cdfbd40a7dee7c9 · 李少辉-开发者 / git

21 Jun, 2014 2 commits

use skip_prefix to avoid magic numbers · ae021d87

Committed by Jeff King on Jun 18, 2014

It's a common idiom to match a prefix and then skip past it
with a magic number, like:

  if (starts_with(foo, "bar"))
	  foo += 3;

This is easy to get wrong, since you have to count the
prefix string yourself, and there's no compiler check if the
string changes.  We can use skip_prefix to avoid the magic
numbers here.

Note that some of these conversions could be much shorter.
For example:

  if (starts_with(arg, "--foo=")) {
	  bar = arg + 6;
	  continue;
  }

could become:

  if (skip_prefix(arg, "--foo=", &bar))
	  continue;

However, I have left it as:

  if (skip_prefix(arg, "--foo=", &v)) {
	  bar = v;
	  continue;
  }

to visually match nearby cases which need to actually
process the string. Like:

  if (skip_prefix(arg, "--foo=", &v)) {
	  bar = atoi(v);
	  continue;
  }
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ae021d87
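A minimal sketch of the pattern described in ae021d87, assuming a helper with the three-argument shape used in the examples above. The names skip_prefix_sketch and parse_depth, and the "--depth=" option, are hypothetical and are not taken from Git's source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sketch only: if str begins with prefix, store a pointer to the
 * remainder in *out and return 1. Otherwise return 0 and leave
 * *out untouched.
 */
static int skip_prefix_sketch(const char *str, const char *prefix,
			      const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/* Hypothetical option parser; "--depth=" is illustrative only. */
static int parse_depth(const char *arg, int *depth)
{
	const char *v;

	if (!skip_prefix_sketch(arg, "--depth=", &v))
		return 0;
	*depth = atoi(v);	/* no hand-counted "arg + 8" needed */
	return 1;
}
```

Because the prefix length comes from strlen(), changing the option string cannot leave a stale offset behind.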

fast-import: fix read of uninitialized argv memory · ff45c0d4

Committed by Jeff King on Jun 18, 2014

Fast-import shares code between its command-line parser and
the "option" command. To do so, it strips the "--" from any
command-line options and passes them to the option parser.
However, it does not confirm that the option even begins
with "--" before blindly passing "arg + 2".

It does confirm that the option starts with "-", so the only
affected case was:

  git fast-import -

which would read uninitialized memory after the argument. We
can fix it by using skip_prefix and checking the result. As
a bonus, this gets rid of some magic numbers.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ff45c0d4
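The check described in ff45c0d4 might look roughly like the following. strip_double_dash is a hypothetical helper written for illustration; it is not fast-import's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Sketch only: return the option name with its leading "--"
 * removed, or NULL when the argument does not start with "--"
 * (for example a lone "-"), so the caller never reads past the
 * end of a one-character argument.
 */
static const char *strip_double_dash(const char *arg)
{
	if (strncmp(arg, "--", 2))
		return NULL;
	return arg + 2;
}
```

A caller would treat a NULL result as an unknown option rather than passing "arg + 2" along unconditionally.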

22 Apr, 2014 1 commit

fast-import: add support to delete refs · 4ee1b225

Committed by Felipe Contreras on Apr 20, 2014

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

4ee1b225

10 Mar, 2014 1 commit

use strchrnul() in place of strchr() and strlen() · 2c5495f7

Committed by Rohit Mani on Mar 07, 2014

Avoid scanning strings twice, once with strchr() and then with
strlen(), by using strchrnul().
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Rohit Mani <rohit.mani@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2c5495f7
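To illustrate the single-scan idea in 2c5495f7: strchrnul() is a GNU extension, so this sketch defines a portable stand-in. The names strchrnul_sketch and first_line_len are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Portable stand-in for GNU strchrnul(): return a pointer to the
 * first occurrence of c in s, or to the terminating NUL if c does
 * not occur.
 */
static const char *strchrnul_sketch(const char *s, int c)
{
	while (*s && *s != (char)c)
		s++;
	return s;
}

/*
 * Length of the first line of buf, not counting the newline.
 * One pass over the string replaces the two-pass
 * "strchr(), else strlen()" idiom.
 */
static size_t first_line_len(const char *buf)
{
	return (size_t)(strchrnul_sketch(buf, '\n') - buf);
}
```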

06 Dec, 2013 1 commit

replace {pre,suf}fixcmp() with {starts,ends}_with() · 59556548

Committed by Christian Couder on Nov 30, 2013

Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.

The change can be recreated by mechanically applying this:

    $ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
      grep -v strbuf\\.c |
      xargs perl -pi -e '
        s|!prefixcmp\(|starts_with\(|g;
        s|prefixcmp\(|!starts_with\(|g;
        s|!suffixcmp\(|ends_with\(|g;
        s|suffixcmp\(|!ends_with\(|g;
      '

on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

59556548
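The sense inversion performed by the perl substitutions in 59556548 can be seen in a small sketch. The three functions below are simplified, hypothetical versions for illustration and are not copied from Git's strbuf code.

```c
#include <assert.h>
#include <string.h>

/* Old style: returns 0 on a match, like strcmp(). */
static int prefixcmp_sketch(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix));
}

/* New style: returns non-zero on a match, so it reads as a predicate. */
static int starts_with_sketch(const char *str, const char *prefix)
{
	return !strncmp(str, prefix, strlen(prefix));
}

static int ends_with_sketch(const char *str, const char *suffix)
{
	size_t len = strlen(str), suflen = strlen(suffix);

	if (len < suflen)
		return 0;
	return !strcmp(str + len - suflen, suffix);
}
```

This is why "!prefixcmp(" maps to "starts_with(" and a bare "prefixcmp(" maps to "!starts_with(" in the mechanical rewrite.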

05 Sep, 2013 2 commits

use 'commit-ish' instead of 'committish' · a8a5406a

Committed by Richard Hansen on Sep 04, 2013

Replace 'committish' in documentation and comments with 'commit-ish'
to match gitglossary(7) and to be consistent with 'tree-ish'.

The only remaining instances of 'committish' are:
  * variable, function, and macro names
  * "(also committish)" in the definition of commit-ish in
    gitglossary[7]
Signed-off-by: Richard Hansen <rhansen@bbn.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

a8a5406a

use 'tree-ish' instead of 'treeish' · bb8040f9

Committed by Richard Hansen on Sep 04, 2013

Replace 'treeish' in documentation and comments with 'tree-ish' to
match gitglossary(7).

The only remaining instances of 'treeish' are:
  * variable, function, and macro names
  * "(also treeish)" in the definition of tree-ish in gitglossary(7)
Signed-off-by: Richard Hansen <rhansen@bbn.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

bb8040f9

31 Aug, 2013 1 commit

refs: report ref type from lock_any_ref_for_update · 9bbb0fa1

Committed by Brad King on Aug 30, 2013

Expose lock_ref_sha1_basic's type_p argument to callers of
lock_any_ref_for_update.  Update all call sites to ignore it by passing
NULL for now.
Signed-off-by: Brad King <brad.king@kitware.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

9bbb0fa1

24 Jun, 2013 3 commits

fast-import: allow moving the root tree · 62bfa11c

Committed by John Keeping on Jun 23, 2013

Because fast-import.c::tree_content_remove does not check for the empty
path, it is not possible to move the root tree to a subdirectory.
Instead the error "Path  not in branch" is produced (note the double
space where the empty path has been inserted).

Fix this by explicitly checking for the empty path and handling it.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

62bfa11c

fast-import: allow ls or filecopy of the root tree · e0eb6b97

Committed by John Keeping on Jun 23, 2013

Commit 178e1dea (fast-import: don't allow 'ls' of path with empty
components, 2012-03-09) restricted paths which:

    . contain an empty directory component (e.g. foo//bar is invalid),
    . end with a directory separator (e.g. foo/ is invalid),
    . start with a directory separator (e.g. /foo is invalid).

However, the implementation also caught the empty path, which should
represent the root tree.  Relax this restriction so that the empty path
is explicitly allowed and refers to the root tree.
Reported-by: Dave Abrahams <dave@boostpro.com>
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

e0eb6b97
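The three path restrictions, together with the empty-path exception introduced by e0eb6b97, can be sketched as a single predicate. is_canonical_path_sketch is a hypothetical name and the code is an illustration of the rules as stated, not fast-import's parser.

```c
#include <assert.h>

/*
 * Sketch only: return 1 if path is acceptable, 0 otherwise.
 * The empty path is allowed and stands for the root tree.
 */
static int is_canonical_path_sketch(const char *path)
{
	const char *p;

	if (!*path)
		return 1;	/* root tree */
	if (path[0] == '/')
		return 0;	/* starts with a directory separator */
	for (p = path; *p; p++) {
		/* "//" is an empty component; "/" at the end is a trailing separator */
		if (p[0] == '/' && (p[1] == '/' || p[1] == '\0'))
			return 0;
	}
	return 1;
}
```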

fast-import: set valid mode on root tree in "ls" · adefdba5

Committed by John Keeping on Jun 23, 2013

This prevents a failure later when we lift the restriction on ls with
the empty path.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

adefdba5

08 May, 2013 1 commit

fast-{import,export}: use get_sha1_hex() to read from marks file · 45c5d4a5

Committed by Felipe Contreras on May 05, 2013

It's wrong to call get_sha1() if they should be SHA-1s, plus
inefficient.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

45c5d4a5

29 Apr, 2013 1 commit

sparse: Fix mingw_main() argument number/type errors · 84d32bf7

Committed by Ramsay Jones on Apr 27, 2013

Sparse issues 68 errors (two errors for each main() function) such
as the following:

      SP git.c
  git.c:510:5: error: too many arguments for function mingw_main
  git.c:510:5: error: symbol 'mingw_main' redeclared with different type \
    (originally declared at git.c:510) - different argument counts

The errors are caused by the 'main' macro used by the MinGW build
to provide a replacement main() function. The original main function
is effectively renamed to 'mingw_main' and is called from the new
main function. The replacement main is used to execute certain actions
common to all git programs on MinGW (e.g. ensure the standard I/O
streams are in binary mode).

In order to suppress the errors, we change the macro to include the
parameters in the declaration of the mingw_main function.

Unfortunately, this change provokes both sparse and gcc to complain
about 9 calls to mingw_main(), such as the following:

      CC git.o
  git.c: In function 'main':
  git.c:510: warning: passing argument 2 of 'mingw_main' from \
    incompatible pointer type
  git.c:510: note: expected 'const char **' but argument is of \
    type 'char **'

In order to suppress these warnings, since both of the main
functions need to be declared with the same prototype, we
change the declaration of the 9 main functions, thus:

    int main(int argc, char **argv)
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

84d32bf7

30 Mar, 2013 1 commit

fast-import: Fix an gcc -Wuninitialized warning · 0a34594c

Committed by Ramsay Jones on Mar 26, 2013

Commit cbfd5e1c ("drop some obsolete "x = x" compiler warning hacks",
21-03-2013) removed a gcc hack that suppressed a "might be used
uninitialized" warning issued by older versions of gcc.

However, commit 3aa99df8 ('fast-import: clarify "inline" logic in
file_change_m', 21-03-2013) addresses an (almost) identical issue
(with very similar code), but includes additional code in its
resolution. The solution used by this commit, unlike that used by
commit cbfd5e1c, also suppresses the -Wuninitialized warning on
older versions of gcc.

In order to suppress the warning (against the 'oe' symbol) in the
note_change_n() function, we adopt the same solution used by commit
3aa99df8.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

0a34594c

22 Mar, 2013 3 commits

fast-import: clarify "inline" logic in file_change_m · 3aa99df8

Committed by Jeff King on Mar 21, 2013

When we read a fast-import line like:

  M 100644 :1 foo.c

we point the local object_entry variable "oe" to the object
named by the mark ":1". When the input uses the "inline"
construct, however, we do not have such an object_entry.

The current code is careful not to access "oe" in the inline
case, but we can make the assumption even more obvious (and
catch violations of it) by setting oe to NULL and adding a
comment. As a bonus, this also squelches an over-zealous gcc
-Wuninitialized warning, which means we can drop the "oe =
oe" initialization hack.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

3aa99df8

drop some obsolete "x = x" compiler warning hacks · cbfd5e1c

Committed by Jeff King on Mar 21, 2013

In cases where the setting and access of a variable are
protected by the same conditional flag, older versions of
gcc would generate a "might be used uninitialized" warning. We
silence the warning by initializing the variable to itself,
a hack that gcc recognizes.

Modern versions of gcc are smart enough to get this right,
going back to at least version 4.3.5. gcc 4.1 does get it
wrong in both cases, but is sufficiently old that we
probably don't need to care about it anymore.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

cbfd5e1c

fast-import: use pointer-to-pointer to keep list tail · 4db34cc1

Committed by Jeff King on Mar 21, 2013

This is shorter, idiomatic, and it means the compiler does
not get confused about whether our "e" pointer is valid,
letting us drop the "e = e" hack.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

4db34cc1
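The pointer-to-pointer tail idiom named in 4db34cc1 can be sketched on a generic singly linked list. struct node and link_in_order are hypothetical and unrelated to fast-import's own data structures.

```c
#include <assert.h>
#include <stddef.h>

struct node {
	int value;
	struct node *next;
};

/* Link the n elements of nodes[] in array order and return the head. */
static struct node *link_in_order(struct node *nodes, size_t n)
{
	struct node *head = NULL;
	struct node **tail = &head;	/* always the link to fill next */
	size_t i;

	for (i = 0; i < n; i++) {
		nodes[i].next = NULL;
		*tail = &nodes[i];	/* works for the first node too */
		tail = &nodes[i].next;
	}
	return head;
}
```

Since tail always points at a valid link (first &head, then some node's next field), there is no separate "last element" pointer that starts out unset, which is the kind of variable that tends to trigger "might be used uninitialized" warnings.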

28 Aug, 2012 1 commit

in_merge_bases(): support only one "other" commit · a20efee9

Committed by Junio C Hamano on Aug 27, 2012

In early days of its life, I planned to make it possible to compute
"is a commit contained in all of these other commits?" with this
function, but it turned out that no caller needed it.

Just make it take two commit objects and add a comment to say what
these two functions do.
Signed-off-by: Junio C Hamano <gitster@pobox.com>

a20efee9

11 Apr, 2012 1 commit

fast-import: tighten parsing of datarefs · 06454cb9

Committed by Pete Wyckoff on Apr 07, 2012

The syntax for the use of mark references in fast-import
demands either a SP (space) or LF (end-of-line) after
a mark reference.  Fast-import does not complain when garbage
appears after a mark reference in some cases.

Factor out parsing of mark references and complain if
errant characters are found.  Also be a little more careful
when parsing "inline" and SHA1s, complaining if extra
characters appear or if the form of the dataref is unrecognized.

Buggy input can cause fast-import to produce the wrong output,
silently, without error.  This makes it difficult to track
down buggy generators of fast-import streams.  An example is
seen in the last line of this commit command:

    commit refs/heads/S2
    committer Name <name@example.com> 1112912893 -0400
    data <<COMMIT
    commit message
    COMMIT
    from :1M 100644 :103 hello.c

It is missing a newline and should be:

    [...]
    from :1
    M 100644 :103 hello.c

What fast-import does is to produce a commit with the same
contents for hello.c as in refs/heads/S2^.  What the buggy
program was expecting was the contents of blob :103.  While
the resulting commit graph looked correct, the contents in
some commits were wrong.
Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

06454cb9
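A rough sketch of the stricter mark-reference check described in 06454cb9. It assumes the parser sees the raw stream, so a mark must be followed by SP or LF; parse_mark_ref_sketch is a hypothetical name and the code is not fast-import's implementation.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stddef.h>

/*
 * Sketch only: parse ":<idnum>" at p. On success store the number
 * in *mark, point *end at the terminating SP or LF, and return 0.
 * Return -1 if the text is not a mark or is followed by anything
 * other than SP or LF.
 */
static int parse_mark_ref_sketch(const char *p, uintmax_t *mark,
				 const char **end)
{
	char *e;
	uintmax_t v;

	/* Require a digit so strtoumax() cannot skip spaces or a sign. */
	if (p[0] != ':' || p[1] < '0' || p[1] > '9')
		return -1;
	errno = 0;
	v = strtoumax(p + 1, &e, 10);
	if (errno || (*e != ' ' && *e != '\n'))
		return -1;	/* overflow, or garbage after the mark */
	*mark = v;
	*end = e;
	return 0;
}
```

With this check, the buggy line "from :1M 100644 :103 hello.c" is rejected instead of being read as "from :1" with the rest ignored.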

10 Mar, 2012 2 commits

fast-import: don't allow 'ls' of path with empty components · 178e1dea

Committed by Jonathan Nieder on Mar 09, 2012

As the fast-import manual explains:

	The value of <path> must be in canonical form. That is it must
	not:
	. contain an empty directory component (e.g. foo//bar is invalid),
	. end with a directory separator (e.g. foo/ is invalid),
	. start with a directory separator (e.g. /foo is invalid),

Unfortunately the "ls" command accepts these invalid syntaxes and
responds by declaring that the indicated path is missing.  This is too
subtle and causes importers to silently misbehave; better to error out
so the operator knows what's happening.

The C, R, and M commands already error out for such paths.
Reported-by: Andrew Sayers <andrew-git@pileofstuff.org>
Analysis-by: David Barr <davidbarr@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>

178e1dea

fast-import: leakfix for 'ls' of dirty trees · c27e559d

Committed by Jonathan Nieder on Mar 09, 2012

When the chosen directory has changed since it was last written to
pack, "tree_content_get" makes a deep copy of its content to scribble
on while computing the tree name, which we forgot to free.

This leak has been present since the 'ls' command was introduced in
v1.7.5-rc0~3^2~33 (fast-import: add 'ls' command, 2010-12-02).
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>

c27e559d

06 Mar, 2012 1 commit

fast-import: zero all of 'struct tag' to silence valgrind · a8ea1b7a

Committed by Thomas Rast on Mar 05, 2012

When running t9300, valgrind (correctly) complains about an
uninitialized value in write_crash_report:

  ==2971== Use of uninitialised value of size 8
  ==2971==    at 0x4164F4: sha1_to_hex (hex.c:70)
  ==2971==    by 0x4073E4: die_nicely (fast-import.c:468)
  ==2971==    by 0x43284C: die (usage.c:86)
  ==2971==    by 0x40420D: main (fast-import.c:2731)
  ==2971==  Uninitialised value was created by a heap allocation
  ==2971==    at 0x4C29B3D: malloc (vg_replace_malloc.c:263)
  ==2971==    by 0x433645: xmalloc (wrapper.c:35)
  ==2971==    by 0x405DF5: pool_alloc (fast-import.c:619)
  ==2971==    by 0x407755: pool_calloc.constprop.14 (fast-import.c:634)
  ==2971==    by 0x403F33: main (fast-import.c:3324)

Fix this by zeroing all of the 'struct tag'.  We would only need to
zero out the 'sha1' field, but this way seems more future-proof.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

a8ea1b7a

22 Dec, 2011 1 commit

Appease Sun Studio by renaming "tmpfile" · ab1900a3

Committed by Ævar Arnfjörð Bjarmason on Dec 21, 2011

On Solaris the system headers define the "tmpfile" name, which'll
cause Git compiled with Sun Studio 12 Update 1 to whine about us
redefining the name:

"pack-write.c", line 76: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC)
"sha1_file.c", line 2455: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC)
"fast-import.c", line 858: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC)
"builtin/index-pack.c", line 175: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC)

Just renaming the "tmpfile" variable to "tmp_file" in the relevant
places is the easiest way to fix this.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ab1900a3

06 Dec, 2011 1 commit

i18n: add infrastructure for translating Git with gettext · 5e9637c6

Committed by Ævar Arnfjörð Bjarmason on Nov 18, 2011

Change the skeleton implementation of i18n in Git to one that can show
localized strings to users for our C, Shell and Perl programs using
either GNU libintl or the Solaris gettext implementation.

This new internationalization support is enabled by default. If
gettext isn't available, or if Git is compiled with
NO_GETTEXT=YesPlease, Git falls back on its current behavior of
showing interface messages in English. When using the autoconf script
we'll auto-detect if the gettext libraries are installed and act
appropriately.

This change is somewhat large because as well as adding a C, Shell and
Perl i18n interface we're adding a lot of tests for them, and for
those tests to work we need a skeleton PO file to actually test
translations. A minimal Icelandic translation is included for this
purpose. Icelandic includes multi-byte characters which makes it easy
to test various edge cases, and it's a language I happen to
understand.

The rest of the commit message goes into detail about various
sub-parts of this commit.

= Installation

Gettext .mo files will be installed and looked for in the standard
$(prefix)/share/locale path. GIT_TEXTDOMAINDIR can also be set to
override that, but that's only intended to be used to test Git itself.

= Perl

Perl code that's to be localized should use the new Git::I18n
module. It imports a __ function into the caller's package by default.

Instead of using the high level Locale::TextDomain interface I've
opted to use the low-level (equivalent to the C interface)
Locale::Messages module, which Locale::TextDomain itself uses.

Locale::TextDomain does a lot of redundant work we don't need, and
some of it would potentially introduce bugs. It tries to set the
$TEXTDOMAIN based on package of the caller, and has its own
hardcoded paths where it'll search for messages.

I found it easier just to completely avoid it rather than try to
circumvent its behavior. In any case, this is an issue wholly
internal to Git::I18N. Its guts can be changed later if that's deemed
necessary.

See <AANLkTilYD_NyIZMyj9dHtVk-ylVBfvyxpCC7982LWnVd@mail.gmail.com> for
a further elaboration on this topic.

= Shell

Shell code that's to be localized should use the git-sh-i18n
library. It's basically just a wrapper for the system's gettext.sh.

If gettext.sh isn't available we'll fall back on gettext(1) if it's
available. The latter is available without the former on Solaris,
which has its own non-GNU gettext implementation. We also need to
emulate eval_gettext() there.

If neither are present we'll use a dumb printf(1) fall-through
wrapper.

= About libcharset.h and langinfo.h

We use libcharset to query the character set of the current locale if
it's available. I.e. we'll use it instead of nl_langinfo if
HAVE_LIBCHARSET_H is set.

The GNU gettext manual recommends using langinfo.h's
nl_langinfo(CODESET) to acquire the current character set, but on
systems that have libcharset.h's locale_charset() using the latter is
either saner, or the only option on those systems.

GNU and Solaris have a nl_langinfo(CODESET), FreeBSD can use either,
but MinGW and some others need to use libcharset.h's locale_charset()
instead.

= Credits

This patch is based on work by Jeff Epler <jepler@unpythonic.net> who
did the initial Makefile / C work, and a lot of comments from the Git
mailing list, including Jonathan Nieder, Jakub Narebski, Johannes
Sixt, Erik Faye-Lund, Peter Krefting, Junio C Hamano, Thomas Rast and
others.

[jc: squashed a small Makefile fix from Ramsay]
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

5e9637c6

01 Dec, 2011 1 commit

csum-file: introduce sha1file_checkpoint · 6c526148

Committed by Junio C Hamano on Nov 17, 2011

It is useful to be able to rewind a check-summed file to a certain
previous state after writing data into it using sha1write() API. The
fast-import command does this after streaming a blob data to the packfile
being generated and then noticing that the same blob has already been
written, and it does this with a private code truncate_pack() that is
commented as "Yes, this is a layering violation".

Introduce two API functions, sha1file_checkpoint(), that allows the caller
to save a state of a sha1file, and then later revert it to the saved state.
Use it to reimplement truncate_pack().
Signed-off-by: Junio C Hamano <gitster@pobox.com>

6c526148

29 Nov, 2011 1 commit

fast-import: Fix incorrect fanout level when modifying existing notes refs · 18386857

Committed by Johan Herland on Nov 25, 2011

This fixes the bug uncovered by the tests added in the previous two patches.

When an existing notes ref was loaded into the fast-import machinery, the
num_notes counter associated with that ref remained == 0, even though the
true number of notes in the loaded ref was higher. This caused a fanout
level of 0 to be used, although the actual fanout of the tree could be > 0.
Manipulating the notes tree at an incorrect fanout level causes removals to
silently fail, and modifications of existing notes to instead produce an
additional note (leaving the old object in place at a different fanout level).

This patch fixes the bug by explicitly counting the number of notes in the
notes tree whenever it looks like the num_notes counter could be wrong (when
num_notes == 0). There may be false positives (i.e. triggering the counting
when the notes tree is truly empty), but in those cases, the counting should
not take long.
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

18386857
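To see why a stale num_notes matters in 18386857, here is a sketch under a stated assumption: that the fanout level grows by one for each factor of 256 in the note count, and that each fanout level splits two hex digits off into a directory. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed heuristic: one extra fanout level per factor of 256 notes. */
static unsigned char fanout_for_note_count(uintmax_t num_notes)
{
	unsigned char fanout = 0;

	while ((num_notes >>= 8))
		fanout++;
	return fanout;
}

/*
 * Write the 40-character hex name as a path with `fanout`
 * two-character directory levels. out must hold at least
 * 40 + fanout + 1 bytes, and fanout must be below 20.
 */
static void note_path_sketch(const char *hex, unsigned char fanout, char *out)
{
	while (fanout--) {
		*out++ = *hex++;
		*out++ = *hex++;
		*out++ = '/';
	}
	strcpy(out, hex);
}
```

With num_notes stuck at 0 the computed level is 0, so a lookup uses the flat 40-character name even when the existing tree stores the note under "01/2345...", and the removal or modification misses the existing entry.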

06 Oct, 2011 1 commit

Change check_ref_format() to take a flags argument · 8d9c5010

Committed by Michael Haggerty on Sep 15, 2011

Change check_ref_format() to take a flags argument that indicates what
is acceptable in the reference name (analogous to "git
check-ref-format"'s "--allow-onelevel" and "--refspec-pattern"). This
is more convenient for callers and also fixes a failure in the test
suite (and likely elsewhere in the code) by enabling "onelevel" and
"refspec-pattern" to be allowed independently of each other.

Also rename check_ref_format() to check_refname_format() to make it
obvious that it deals with refnames rather than references themselves.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

8d9c5010

23 Sep, 2011 2 commits

fast-import: don't allow to note on empty branch · 0bc69881

Committed by Dmitry Ivankov on Sep 23, 2011

The 'reset' command makes fast-import start a branch from scratch. Its name
is kept in the lookup table, but its sha1 is null_sha1 (a special value).
The 'notemodify' command can be used to add a note on a branch head given its
name. lookup_branch() is used in that case, and it doesn't check for
null_sha1. So fast-import writes a note for the null_sha1 object instead of
giving an error.

Add a check to deny adding a note on empty branch and add a corresponding
test.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

0bc69881

fast-import: don't allow to tag empty branch · 2c9c8ee2

Committed by Dmitry Ivankov on Sep 23, 2011

The 'reset' command makes fast-import start a branch from scratch. Its name
is kept in the lookup table, but its sha1 is null_sha1 (a special value).
The 'tag' command can be used to tag a branch by its name. lookup_branch()
is used in that case, and it doesn't check for null_sha1. So fast-import
writes a tag for the null_sha1 object instead of giving an error.

Add a check to deny tagging an empty branch and add a corresponding test.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2c9c8ee2
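The guard added by the two commits above (0bc69881 and 2c9c8ee2) amounts to refusing a branch whose object name is all zeros. A minimal sketch, with a hypothetical struct and function names that are not fast-import's own:

```c
#include <assert.h>
#include <string.h>

#define SHA1_RAWSZ_SKETCH 20	/* raw SHA-1 size in bytes */

/* Hypothetical stand-in for a branch table entry. */
struct branch_sketch {
	const char *name;
	unsigned char sha1[SHA1_RAWSZ_SKETCH];
};

static int is_null_sha1_sketch(const unsigned char *sha1)
{
	/* static storage is zero-initialized */
	static const unsigned char null_sha1[SHA1_RAWSZ_SKETCH];

	return !memcmp(sha1, null_sha1, SHA1_RAWSZ_SKETCH);
}

/* Return 1 if the branch points at a real object, 0 if it is empty. */
static int branch_has_commit(const struct branch_sketch *b)
{
	return !is_null_sha1_sketch(b->sha1);
}
```

A 'tag' or 'notemodify' handler would call such a check after the branch lookup and report an error when it returns 0.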

24 Aug, 2011 2 commits

fast-import: allow to tag newly created objects · 6c447f63

Committed by Dmitry Ivankov on Aug 22, 2011

fast-import allows to tag objects by sha1 and to query sha1 of objects
being imported. So it should allow to tag these objects, make it do so.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

6c447f63

fast-import: add tests for tagging blobs · 2efe38e7

Committed by Dmitry Ivankov on Aug 22, 2011

fast-import allows to create an annotated tag that annotates a blob,
via mark or direct sha1 specification.

For mark it works, for sha1 it tries to read the object. It tries to
do so via read_sha1_file, and then checks the size to be at least 46.

That's weird, let's just allow to (annotated) tag any object referenced
by sha1. If the object originates from our packfile, we still fail though.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2efe38e7

23 Aug, 2011 2 commits

fast-import: treat cat-blob as a delta base hint for next blob · a7e9c341

Committed by Dmitry Ivankov on Aug 21, 2011

Delta base for blobs is chosen as a previously saved blob. If we
treat cat-blob's blob as a delta base for the next blob, nothing
is likely to become worse.

For fast-import stream producer like svn-fe cat-blob is used like
following:
- svn-fe reads file delta in svn format
- to apply it, svn-fe asks cat-blob 'svn delta base'
- applies 'svn delta' to the response
- produces a blob command to store the result

Currently there is no way for svn-fe to give fast-import a hint on
object delta base. While what's requested in cat-blob is most of
the time a best delta base possible. Of course, it could be not a
good delta base, but we don't know any better one anyway.

So do treat cat-blob's result as a delta base for next blob. The
profit is nice: 2x to 7x reduction in pack size AND 1.2x to 3x
time speedup due to diff_delta being faster on good deltas. git gc
--aggressive can compress it even more, by 10% to 70%, utilizing
more cpu time, real time and 3 cpu cores.

Tested on 213M and 2.7G fast-import streams, resulting packs are 22M
and 113M, import time is 7s and 60s, both streams are produced by
svn-fe, sniffed and then used as raw input for fast-import.

For git-fast-export produced streams there is no change as it doesn't
use cat-blob and doesn't try to reorder blobs in some smart way to
make successive deltas small.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Acked-by: David Barr <davidbarr@google.com>
Acked-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

a7e9c341

fast-import: count and report # of calls to diff_delta in stats · 94c3b482

Committed by Dmitry Ivankov on Aug 21, 2011

It's an interesting number, how often do we try to deltify each type of
objects and how often do we succeed. So do add it to stats.

Success doesn't mean much gain in pack size though. As we allow delta to
be as big as (data.len - 20). And delta close to data.len gains nothing
compared to no delta at all even after zlib compression (delta is pretty
much the same as data, just with few modifications).

We should try to make less attempts that result in huge deltas as these
consume more cpu than trivial small deltas. Either by choosing a better
delta base or reducing delta size upper bound or doing less delta attempts
at all.

Currently, delta base for blobs is a waste literally. Each blob delta
base is chosen as a previously stored blob. Disabling deltas for blobs
doesn't increase pack size and reduce import time, or at least doesn't
increase time for all fast-import streams I've tried.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Acked-by: David Barr <davidbarr@google.com>
Acked-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

94c3b482

15 Aug, 2011 1 commit

fast-import: prevent producing bad delta · 8fb3ad76

Committed by Dmitry Ivankov on Aug 15, 2011

To produce deltas for tree objects fast-import tracks two versions
of tree's entries - base and current one. Base version stands both
for a delta base of this tree, and for a entry inside a delta base
of a parent tree. So care should be taken to keep it in sync.

tree_content_set cuts away a whole subtree and replaces it with a
new one (or NULL for lazy load of a tree with known sha1). It
keeps a base sha1 for this subtree (needed for parent tree). And
here is the problem, 'subtree' tree root doesn't have the implied
base version entries.

Adjusting the subtree to include them would mean a deep rewrite of
subtree. Invalidating the subtree base version would mean recursive
invalidation of parents' base versions. So just mark this tree as
do-not-delta me. Abuse setuid bit for this purpose.

tree_content_replace is the same as tree_content_set except that it
is used to replace the root, so just clearing base sha1 here (instead
of setting the bit) is fine.

[di: log message]
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

8fb3ad76

12 Aug, 2011 2 commits

fast-import: check committer name more strictly · 4b4963c0

Committed by Dmitry Ivankov on Aug 11, 2011

The documentation declares following identity format:
(<name> SP)? LT <email> GT
where name is any string without LF and LT characters.
But fast-import just accepts any string up to first GT
instead of checking the whole format, and moreover just
writes it as is to the commit object.

git-fsck checks for [^<\n]* <[^<>\n]*> format. Note that the
space is mandatory. And the space quirk is already handled via
extending the string to the left when needed.

Modify fast-import input identity format to a slightly stricter
one - deny LF, LT and GT in both <name> and <email>. And check
for it.

This is stricter than git-fsck as fsck accepts "Name> <email>"
currently, but soon fsck check will be adjusted likewise.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

4b4963c0
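The stricter identity format from 4b4963c0 can be sketched as a small validator. It assumes the caller has already split off whatever follows GT (such as the timestamp), so the string must end at the closing '>'. ident_is_valid_sketch is a hypothetical name, not the function fast-import uses.

```c
#include <assert.h>
#include <string.h>

/*
 * Sketch only: return 1 if ident matches
 *     (<name> SP)? LT <email> GT
 * with no LF, '<' or '>' inside <name> or <email>, else 0.
 */
static int ident_is_valid_sketch(const char *ident)
{
	const char *lt = strchr(ident, '<');
	const char *gt;
	size_t name_len;

	if (!lt)
		return 0;
	name_len = (size_t)(lt - ident);
	/* A non-empty name part must end with SP right before LT. */
	if (name_len && lt[-1] != ' ')
		return 0;
	/* The name holds no '<' by construction; reject LF and '>'. */
	if (memchr(ident, '\n', name_len) || memchr(ident, '>', name_len))
		return 0;
	/* The email runs up to the first '<', '>' or LF; it must stop at '>'. */
	gt = lt + 1 + strcspn(lt + 1, "<>\n");
	if (*gt != '>')
		return 0;
	return gt[1] == '\0';
}
```

Under these rules "Name> <email>" is rejected because of the '>' in the name, which is the case the commit message calls out as stricter than git-fsck.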

fast-import: don't fail on omitted committer name · 17fb0072

Committed by Dmitry Ivankov on Aug 11, 2011

fast-import format declares 'committer_name SP' to be optional in
'committer_name SP LT email GT'. But for a (commit) object SP is
obligatory while zero length committer_name is ok. git-fsck checks
that SP is present, so fast-import must prepend it if the name SP
part is omitted. It doesn't do so and thus for "LT email GT" ident
it writes a bad object.

Name cannot contain LT or GT, ident always comes after SP in fast-import.
So if ident starts with LT reuse the SP as if a valid 'SP LT email GT'
ident was passed.

This fixes an ident parsing bug for a well-formed fast-import input.
Though the parsing is still loose and can accept an ill-formed input.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

17fb0072

20 Jul, 2011 1 commit

fast-import: introduce 'done' command · be56862f

Committed by Sverre Rabbelier on Jul 16, 2011

Add a 'done' command that causes fast-import to stop reading from the
stream and exit.

If the new --done command line flag was passed on the command line
(or a "feature done" declaration included at the start of the stream),
make the 'done' command mandatory.  So "git fast-import --done"'s
input format will be prefix-free, making errors easier to detect when
they show up as early termination at some convenient time of the
upstream of a pipe writing to fast-import.

Another possible application of the 'done' command would be to allow a
fast-import stream that is only a small part of a larger encapsulating
stream to be easily parsed, leaving the file offset after the "done\n"
so the other application can pick up from there.  This patch does not
teach fast-import to do that --- fast-import still uses buffered input
(stdio).
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

be56862f

11 Jun, 2011 3 commits

zlib: zlib can only process 4GB at a time · ef49a7a0

Committed by Junio C Hamano on Jun 10, 2011

The size of objects we read from the repository and data we try to put
into the repository are represented in "unsigned long", so that on larger
architectures we can handle objects that weigh more than 4GB.

But the interface defined in zlib.h to communicate with inflate/deflate
limits avail_in (how many bytes of input are we calling zlib with) and
avail_out (how many bytes of output from zlib are we ready to accept)
fields effectively to 4GB by defining their type to be uInt.

In many places in our code, we allocate a large buffer (e.g. mmap'ing a
large loose object file) and tell zlib its size by assigning the size to
avail_in field of the stream, but that will truncate the high octets of
the real size. The worst part of this story is that we often pass around
z_stream (the state object used by zlib) to keep track of the number of
used bytes in input/output buffer by inspecting these two fields, which
practically limits our callchain to the same 4GB limit.

Wrap z_stream in another structure git_zstream that can express avail_in
and avail_out in unsigned long. For now, just die() when the caller gives
a size that cannot be given to a single zlib call. In later patches in the
series, we would make git_inflate() and git_deflate() internally loop to
give callers an illusion that our "improved" version of zlib interface can
operate on a buffer larger than 4GB in one go.
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ef49a7a0
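ef49a7a0 itself only die()s on oversized input; the looping it anticipates for later patches could be sketched as below. This illustration does not call zlib at all: unsigned int stands in for zlib's uInt fields, and cap_chunk and calls_needed are hypothetical names.

```c
#include <assert.h>
#include <limits.h>

/*
 * Largest byte count a single call may be handed. unsigned int
 * stands in here for the range of zlib's avail_in and avail_out.
 */
static unsigned int cap_chunk(unsigned long remaining)
{
	return remaining > UINT_MAX ? UINT_MAX : (unsigned int)remaining;
}

/* Count how many capped calls are needed to consume total bytes. */
static unsigned long calls_needed(unsigned long total)
{
	unsigned long calls = 0;

	while (total) {
		total -= cap_chunk(total);	/* one capped call */
		calls++;
	}
	return calls;
}
```

On platforms where unsigned long is wider than unsigned int, a buffer one byte larger than UINT_MAX needs two calls; where both types have the same width the cap never triggers.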

zlib: wrap deflateBound() too · 225a6f10

Committed by Junio C Hamano on Jun 10, 2011

Signed-off-by: Junio C Hamano <gitster@pobox.com>

225a6f10

zlib: wrap deflate side of the API · 55bb5c91

Committed by Junio C Hamano on Jun 10, 2011

Wrap deflateInit, deflate, and deflateEnd for everybody, and the sole use
of deflateInit2 in remote-curl.c to tell the library to use gzip header
and trailer in git_deflate_init_gzip().

There is only one caller that cares about the status from deflateEnd().
Introduce git_deflate_end_gently() to let that sole caller retrieve the
status and act on it (i.e. die) for now, but we would probably want to
make inflate_end/deflate_end die when they ran out of memory and get
rid of the _gently() kind.
Signed-off-by: Junio C Hamano <gitster@pobox.com>

55bb5c91
