提交 · e1afca4fd3e7cb4000874e991277f10119de4ad2 · 李少辉-开发者 / git

24 2月, 2009 1 次提交

write_index(): update index_state->timestamp after flushing to disk · e1afca4f

由 Kjetil Barvik 提交于 2月 23, 2009

Since this timestamp is used to check for racy-clean files, it is
important to keep it uptodate.

For the 'git checkout' command without the '-q' option, this make a
huge difference.  Before, each and every file which was updated, was
racy-clean after the call to unpack_trees() and write_index() but
before the GIT process ended.

And because of the call to show_local_changes() in builtin-checkout.c,
we ended up reading those files back into memory, doing a SHA1 to
check if the files was really different from the index.  And, of
course, no file was different.

With this fix, 'git checkout' without the '-q' option should now be
almost as fast as with the '-q' option, but not quite, as we still do
some few lstat(2) calls more without the '-q' option.

Below is some average numbers for 10 checkout's to v2.6.27 and 10 to
v2.6.25 of the Linux kernel, to show the difference:

before (git version 1.6.2.rc1.256.g58a87):
 7.860 user  2.427 sys  19.465 real  52.8% CPU  faults: 0 major 95331 minor
after:
 6.184 user  2.160 sys  17.619 real  47.4% CPU  faults: 0 major 38994 minor
Signed-off-by: NKjetil Barvik <barvik@broadpark.no>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

e1afca4f

20 2月, 2009 1 次提交

make USE_NSEC work as expected · fba2f38a

由 Kjetil Barvik 提交于 2月 19, 2009

Since the filesystem ext4 is now defined as stable in Linux v2.6.28,
and ext4 supports nanonsecond resolution timestamps natively, it is
time to make USE_NSEC work as expected.

This will make racy git situations less likely to happen. For 'git
checkout' this means it will be less likely that we have to open, read
the contents of the file into RAM, and check if file is really
modified or not. The result sould be a litle less used CPU time, less
pagefaults and a litle faster program, at least for 'git checkout'.

Since the number of possible racy git situations would increase when
disks gets faster, this patch would be more and more helpfull as times
go by. For a fast Solid State Disk, this patch should be helpfull.

Note that, when file operations starts to take less than 1 nanosecond,
one would again start to get more racy git situations.

For more info on racy git, see Documentation/technical/racy-git.txt
For more info on ext4, see http://kernelnewbies.org/Ext4Signed-off-by: NKjetil Barvik <barvik@broadpark.no>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

fba2f38a

19 2月, 2009 1 次提交

check_updates(): effective removal of cache entries marked CE_REMOVE · 36419c8e

由 Kjetil Barvik 提交于 2月 18, 2009

Below is oprofile output from GIT command 'git chekcout -q my-v2.6.25'
(move from tag v2.6.27 to tag v2.6.25 of the Linux kernel):

CPU: Core 2, speed 1999.95 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
                         mask of 0x00 (Unhalted core cycles) count 20000
Counted INST_RETIRED_ANY_P events (number of instructions retired) with a
                           unit mask of 0x00 (No unit mask) count 20000
CPU_CLK_UNHALT...|INST_RETIRED:2...|
  samples|      %|  samples|      %|
------------------------------------
   409247 100.000    342878 100.000 git
        CPU_CLK_UNHALT...|INST_RETIRED:2...|
          samples|      %|  samples|      %|
        ------------------------------------
           260476 63.6476    257843 75.1996 libz.so.1.2.3
           100876 24.6492     64378 18.7758 kernel-2.6.28.4_2.vmlinux
            30850  7.5382      7874  2.2964 libc-2.9.so
            14775  3.6103      8390  2.4469 git
             2020  0.4936      4325  1.2614 libcrypto.so.0.9.8
              191  0.0467        32  0.0093 libpthread-2.9.so
               58  0.0142        36  0.0105 ld-2.9.so
                1 2.4e-04         0       0 libldap-2.3.so.0.2.31

Detail list of the top 20 function entries (libz counted in one blob):

CPU_CLK_UNHALTED  INST_RETIRED_ANY_P
samples  %        samples  %        image name               symbol name
260476   63.6862  257843   75.2725  libz.so.1.2.3            /lib/libz.so.1.2.3
16587     4.0555  3636      1.0615  libc-2.9.so              memcpy
7710      1.8851  277       0.0809  libc-2.9.so              memmove
3679      0.8995  1108      0.3235  kernel-2.6.28.4_2.vmlinux d_validate
3546      0.8670  2607      0.7611  kernel-2.6.28.4_2.vmlinux __getblk
3174      0.7760  1813      0.5293  libc-2.9.so              _int_malloc
2396      0.5858  3681      1.0746  kernel-2.6.28.4_2.vmlinux copy_to_user
2270      0.5550  2528      0.7380  kernel-2.6.28.4_2.vmlinux __link_path_walk
2205      0.5391  1797      0.5246  kernel-2.6.28.4_2.vmlinux ext4_mark_iloc_dirty
2103      0.5142  1203      0.3512  kernel-2.6.28.4_2.vmlinux find_first_zero_bit
2077      0.5078  997       0.2911  kernel-2.6.28.4_2.vmlinux do_get_write_access
2070      0.5061  514       0.1501  git                      cache_name_compare
2043      0.4995  1501      0.4382  kernel-2.6.28.4_2.vmlinux rcu_irq_exit
2022      0.4944  1732      0.5056  kernel-2.6.28.4_2.vmlinux __ext4_get_inode_loc
2020      0.4939  4325      1.2626  libcrypto.so.0.9.8       /usr/lib/libcrypto.so.0.9.8
1965      0.4804  1384      0.4040  git                      patch_delta
1708      0.4176  984       0.2873  kernel-2.6.28.4_2.vmlinux rcu_sched_grace_period
1682      0.4112  727       0.2122  kernel-2.6.28.4_2.vmlinux sysfs_slab_alias
1659      0.4056  290       0.0847  git                      find_pack_entry_one
1480      0.3619  1307      0.3816  kernel-2.6.28.4_2.vmlinux ext4_writepage_trans_blocks

Notice the memmove line, where the CPU did 7710 / 277 = 27.8 cycles
per instruction, and compared to the total cycles spent inside the
source code of GIT for this command, all the memmove() calls
translates to (7710 * 100) / 14775 = 52.2% of this.

Retesting with a GIT program compiled for gcov usage, I found out that
the memmove() calls came from remove_index_entry_at() in read-cache.c,
where we have:

        memmove(istate->cache + pos,
                istate->cache + pos + 1,
                (istate->cache_nr - pos) * sizeof(struct cache_entry *));

remove_index_entry_at() is called 4902 times from check_updates() in
unpack-trees.c, and each time called we move each cache_entry pointers
(from the removed one) one step to the left.

Since we have 28828 entries in the cache this time, and if we on
average move half of them each time, we in total move approximately
4902 * 0.5 * 28828 * 4 = 282 629 712 bytes, or twice this amount if
each pointer is 8 bytes (64 bit).

OK, is seems that the function check_updates() is called 28 times, so
the estimated guess above had been more correct if check_updates() had
been called only once, but the point is: we get lots of bytes moved.

To fix this, and use an O(N) algorithm instead, where N is the number
of cache_entries, we delete/remove all entries in one loop through all
entries.

From a retest, the new remove_marked_cache_entries() from the patch
below, ended up with the following output line from oprofile:

46        0.0105  15        0.0041  git                      remove_marked_cache_entries

If we can trust the numbers from oprofile in this case, we saved
approximately ((7710 - 46) * 20000) / (2 * 1000 * 1000 * 1000) = 0.077
seconds CPU time with this fix for this particular test.  And notice
that now the CPU did only 46 / 15 = 3.1 cycles/instruction.
Signed-off-by: NKjetil Barvik <barvik@broadpark.no>
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

36419c8e

10 2月, 2009 2 次提交

unlink_entry(): introduce schedule_dir_for_removal() · 78478927

由 Kjetil Barvik 提交于 2月 09, 2009

Currently inside unlink_entry() if we get a successful removal of one
file with unlink(), we try to remove the leading directories each and
every time.  So if one directory containing 200 files is moved to an
other location we get 199 failed calls to rmdir() and 1 successful
call.

To fix this and avoid some unnecessary calls to rmdir(), we schedule
each directory for removal and wait much longer before we do the real
call to rmdir().

Since the unlink_entry() function is called with alphabetically sorted
names, this new function end up being very effective to avoid
unnecessary calls to rmdir().  In some cases over 95% of all calls to
rmdir() is removed with this patch.
Signed-off-by: NKjetil Barvik <barvik@broadpark.no>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

78478927

lstat_cache(): swap func(length, string) into func(string, length) · 57199892

由 Kjetil Barvik 提交于 2月 09, 2009

Swap function argument pair (length, string) into (string, length) to
conform with the commonly used order inside the GIT source code.

Also, add a note about this fact into the coding guidelines.
Signed-off-by: NKjetil Barvik <barvik@broadpark.no>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

57199892

19 1月, 2009 4 次提交

lstat_cache(): introduce clear_lstat_cache() function · bda6eb0d

由 Kjetil Barvik 提交于 1月 18, 2009

If you want to completely clear the contents of the lstat_cache(), then
call this new function.
Signed-off-by: NKjetil Barvik <barvik@broadpark.no>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

bda6eb0d

lstat_cache(): introduce invalidate_lstat_cache() function · aeabab5c

由 Kjetil Barvik 提交于 1月 18, 2009

In some cases it could maybe be necessary to say to the cache that
"Hey, I deleted/changed the type of this pathname and if you currently
have it inside your cache, you should deleted it".

This patch introduce a function which support this.
Signed-off-by: NKjetil Barvik <barvik@broadpark.no>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

aeabab5c

lstat_cache(): introduce has_dirs_only_path() function · bad4a54f

由 Kjetil Barvik 提交于 1月 18, 2009

The create_directories() function in entry.c currently calls stat()
or lstat() for each path component of the pathname 'path' each and every
time.  For the 'git checkout' command, this function is called on each
file for which we must do an update (ce->ce_flags & CE_UPDATE), so we get
lots and lots of calls.

To fix this, we make a new wrapper to the lstat_cache() function, and
call the wrapper function instead of the calls to the stat() or the
lstat() functions.  Since the paths given to the create_directories()
function, is sorted alphabetically, the new wrapper would be very
cache effective in this situation.

To support it we must update the lstat_cache() function to be able to
say that "please test the complete length of 'name'", and also to give
it the length of a prefix, where the cache should use the stat()
function instead of the lstat() function to test each path component.

Thanks to Junio C Hamano, Linus Torvalds and Rene Scharfe for valuable
comments to this patch!
Signed-off-by: NKjetil Barvik <barvik@broadpark.no>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

bad4a54f

lstat_cache(): introduce has_symlink_or_noent_leading_path() function · 09c93066

由 Kjetil Barvik 提交于 1月 18, 2009

In some cases, especially inside the unpack-trees.c file, and inside
the verify_absent() function, we can avoid some unnecessary calls to
lstat(), if the lstat_cache() function can also be told to keep track
of non-existing directories.

So we update the lstat_cache() function to handle this new fact,
introduce a new wrapper function, and the result is that we save lots
of lstat() calls for a removed directory which previously contained
lots of files, when we call this new wrapper of lstat_cache() instead
of the old one.

We do similar changes inside the unlink_entry() function, since if we
can already say that the leading directory component of a pathname
does not exist, it is not necessary to try to remove a pathname below
it!

Thanks to Junio C Hamano, Linus Torvalds and Rene Scharfe for valuable
comments to this patch!
Signed-off-by: NKjetil Barvik <barvik@broadpark.no>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

09c93066

18 1月, 2009 1 次提交

checkout: implement "@{-N}" shortcut name for N-th last branch · ae5a6c36

由 Junio C Hamano 提交于 1月 17, 2009

Implement a shortcut @{-N} for the N-th last branch checked out, that
works by parsing the reflog for the message added by previous
git-checkout invocations.  We expand the @{-N} to the branch name, so
that you end up on an attached HEAD on that branch.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

ae5a6c36

15 1月, 2009 1 次提交

remove pathspec_match, use match_pathspec instead · 0b50922a

由 Clemens Buchacher 提交于 1月 14, 2009

Both versions have the same functionality. This removes any
redundancy.

This also adds makes two extensions to match_pathspec:

- If pathspec is NULL, return 1. This reflects the behavior of git
  commands, for which no paths usually means "match all paths".

- If seen is NULL, do not use it.
Signed-off-by: NClemens Buchacher <drizzd@aon.at>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

0b50922a

13 1月, 2009 1 次提交

sha1_file: make "read_object" static · c2c5b270

由 Christian Couder 提交于 1月 12, 2009

This function is only used from "sha1_file.c".

And as we want to add a "replace_object" hook in "read_sha1_file",
we must not let people bypass the hook using something other than
"read_sha1_file".
Signed-off-by: NChristian Couder <chriscool@tuxfamily.org>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

c2c5b270

11 1月, 2009 1 次提交

Wrap inflate and other zlib routines for better error reporting · 39c68542

由 Linus Torvalds 提交于 1月 07, 2009

R. Tyler Ballance reported a mysterious transient repository corruption;
after much digging, it turns out that we were not catching and reporting
memory allocation errors from some calls we make to zlib.

This one _just_ wraps things; it doesn't do the "retry on low memory
error" part, at least not yet. It is an independent issue from the
reporting. Some of the errors are expected and passed back to the caller,
but we die when zlib reports it failed to allocate memory for now.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

39c68542

21 12月, 2008 1 次提交

Introduce commit notes · 879ef248

由 Johannes Schindelin 提交于 12月 20, 2008

Commit notes are blobs which are shown together with the commit
message. These blobs are taken from the notes ref, which you can
configure by the config variable core.notesRef, which in turn can
be overridden by the environment variable GIT_NOTES_REF.

The notes ref is a branch which contains "files" whose names are
the names of the corresponding commits (i.e. the SHA-1).

The rationale for putting this information into a ref is this: we
want to be able to fetch and possibly union-merge the notes,
maybe even look at the date when a note was introduced, and we
want to store them efficiently together with the other objects.
Signed-off-by: NJohannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

879ef248

11 12月, 2008 1 次提交

make sure packs to be replaced are closed beforehand · c74faea1

由 Nicolas Pitre 提交于 12月 09, 2008

Especially on Windows where an opened file cannot be replaced, make
sure pack-objects always close packs it is about to replace. Even on
non Windows systems, this could save potential bad results if ever
objects were to be read from the new pack file using offset from the old
index.

This should fix t5303 on Windows.
Signed-off-by: NNicolas Pitre <nico@cam.org>
Tested-by: Johannes Sixt <j6t@kdbg.org> (MinGW)
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

c74faea1

29 11月, 2008 1 次提交

git add --intent-to-add: fix removal of cached emptiness · 388b2acd

由 Junio C Hamano 提交于 11月 28, 2008

This uses the extended index flag mechanism introduced earlier to mark
the entries added to the index via "git add -N" with CE_INTENT_TO_ADD.

The logic to detect an "intent to add" entry for the purpose of allowing
"git rm --cached $path" is tightened to check not just for a staged empty
blob, but with the CE_INTENT_TO_ADD bit.  This protects an empty blob that
was explicitly added and then modified in the work tree from being dropped
with this sequence:

	$ >empty
	$ git add empty
	$ echo "non empty" >empty
	$ git rm --cached empty
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

388b2acd

15 11月, 2008 1 次提交

Add cache preload facility · 671c9b7e

由 Linus Torvalds 提交于 11月 13, 2008

This can do the lstat() storm in parallel, giving potentially much
improved performance for cold-cache cases or things like NFS that have
weak metadata caching.

Just use "read_cache_preload()" instead of "read_cache()" to force an
optimistic preload of the index stat data.  The function takes a
pathspec as its argument, allowing us to preload only the relevant
portion of the index.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

671c9b7e

13 11月, 2008 4 次提交

checkout: Fix "initial checkout" detection · fa7b3c2f

由 Junio C Hamano 提交于 11月 12, 2008

Earlier commit 55218834 (checkout: do not lose staged removal, 2008-09-07)
tightened the rule to prevent switching branches from losing local
changes, so that staged removal of paths can be protected, while
attempting to keep a loophole to still allow a special case of switching
out of an un-checked-out state.

However, the loophole was made a bit too tight, and did not allow
switching from one branch (in an un-checked-out state) to check out
another branch.

The change to builtin-checkout.c in this commit loosens it to allow this,
by not insisting the original commit and the new commit to be the same.

It also introduces a new function, is_index_unborn (and an associated
macro, is_cache_unborn), to check if the repository is truly in an
un-checked-out state more reliably, by making sure that $GIT_INDEX_FILE
did not exist when populating the in-core index structure. A few places
the earlier commit 55218834 added the check for the initial checkout
condition are updated to use this function.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

fa7b3c2f

define empty tree sha1 as a macro · 14d9c578

由 Jeff King 提交于 11月 12, 2008

This can potentially be used in a few places, so let's make
it available to all parts of the code.
Signed-off-by: NJeff King <peff@peff.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

14d9c578

B
sha1_file.c: split has_loose_object() into local and non-local counterparts · 0f4dc14a
由 Brandon Casey 提交于 11月 09, 2008
```
Signed-off-by: NBrandon Casey <drafnel@gmail.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>
```
0f4dc14a

packed_git: convert pack_local flag into a bitfield and add pack_keep · 8d25931d

由 Brandon Casey 提交于 11月 12, 2008

pack_keep will be set when a pack file has an associated .keep file.
Signed-off-by: NBrandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

8d25931d

03 11月, 2008 2 次提交

make unpack_object_header() non fatal · 09ded04b

由 Nicolas Pitre 提交于 10月 29, 2008

It is possible to have pack corruption in the object header.  Currently
unpack_object_header() simply die() on them instead of letting the caller
deal with that gracefully.

So let's have unpack_object_header() return an error instead, and find
a better name for unpack_object_header_gently() in that context.  All
callers of unpack_object_header() are ready for it.
Signed-off-by: NNicolas Pitre <nico@cam.org>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

09ded04b

close another possibility for propagating pack corruption · 0e8189e2

由 Nicolas Pitre 提交于 10月 31, 2008

Abstract
--------

With index v2 we have a per object CRC to allow quick and safe reuse of
pack data when repacking.  This, however, doesn't currently prevent a
stealth corruption from being propagated into a new pack when _not_
reusing pack data as demonstrated by the modification to t5302 included
here.

The Context
-----------

The Git database is all checksummed with SHA1 hashes.  Any kind of
corruption can be confirmed by verifying this per object hash against
corresponding data.  However this can be costly to perform systematically
and therefore this check is often not performed at run time when
accessing the object database.

First, the loose object format is entirely compressed with zlib which
already provide a CRC verification of its own when inflating data.  Any
disk corruption would be caught already in this case.

Then, packed objects are also compressed with zlib but only for their
actual payload.  The object headers and delta base references are not
deflated for obvious performance reasons, however this leave them
vulnerable to potentially undetected disk corruptions.  Object types
are often validated against the expected type when they're requested,
and deflated size must always match the size recorded in the object header,
so those cases are pretty much covered as well.

Where corruptions could go unnoticed is in the delta base reference.
Of course, in the OBJ_REF_DELTA case,  the odds for a SHA1 reference to
get corrupted so it actually matches the SHA1 of another object with the
same size (the delta header stores the expected size of the base object
to apply against) are virtually zero.  In the OBJ_OFS_DELTA case, the
reference is a pack offset which would have to match the start boundary
of a different base object but still with the same size, and although this
is relatively much more "probable" than in the OBJ_REF_DELTA case, the
probability is also about zero in absolute terms.  Still, the possibility
exists as demonstrated in t5302 and is certainly greater than a SHA1
collision, especially in the OBJ_OFS_DELTA case which is now the default
when repacking.

Again, repacking by reusing existing pack data is OK since the per object
CRC provided by index v2 guards against any such corruptions. What t5302
failed to test is a full repack in such case.

The Solution
------------

As unlikely as this kind of stealth corruption can be in practice, it
certainly isn't acceptable to propagate it into a freshly created pack.
But, because this is so unlikely, we don't want to pay the run time cost
associated with extra validation checks all the time either.  Furthermore,
consequences of such corruption in anything but repacking should be rather
visible, and even if it could be quite unpleasant, it still has far less
severe consequences than actively creating bad packs.

So the best compromize is to check packed object CRC when unpacking
objects, and only during the compression/writing phase of a repack, and
only when not streaming the result.  The cost of this is minimal (less
than 1% CPU time), and visible only with a full repack.

Someone with a stats background could provide an objective evaluation of
this, but I suspect that it's bad RAM that has more potential for data
corruptions at this point, even in those cases where this extra check
is not performed.  Still, it is best to prevent a known hole for
corruption when recreating object data into a new pack.

What about the streamed pack case?  Well, any client receiving a pack
must always consider that pack as untrusty and perform full validation
anyway, hence no such stealth corruption could be propagated to remote
repositoryes already.  It is therefore worthless doing local validation
in that case.
Signed-off-by: NNicolas Pitre <nico@cam.org>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

0e8189e2

31 10月, 2008 2 次提交

A
git_pathdup: returns xstrdup-ed copy of the formatted path · aba13e7c
由 Alex Riesen 提交于 10月 27, 2008
```
Signed-off-by: NJunio C Hamano <gitster@pobox.com>
```
aba13e7c

Add git_snpath: a .git path formatting routine with output buffer · fe2d7776

由 Alex Riesen 提交于 10月 27, 2008

The function's purpose is to replace git_path where the buffer of
formatted path may not be reused by subsequent calls of the function
or will be copied anyway.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

fe2d7776

27 10月, 2008 2 次提交

Add mksnpath which allows you to specify the output buffer · 108bebea

由 Alex Riesen 提交于 10月 26, 2008

This is just vsnprintf's but additionally calls cleanup_path() on the
result. To be used as alternatives to mkpath() where the buffer for the
created path may not be reused by subsequent calls of the same formatting
function.
Signed-off-by: NAlex Riesen <raa.lkml@gmail.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

108bebea

Fix git branch -m for symrefs. · eca35a25

由 Miklos Vajna 提交于 10月 26, 2008

This had two problems with symrefs. First, it copied the actual sha1
instead of the "pointer", second it failed to remove the old ref after a
successful rename.

Given that till now delete_ref() always dereferenced symrefs, a new
parameters has been introduced to delete_ref() to allow deleting refs
without a dereference.
Signed-off-by: NMiklos Vajna <vmiklos@frugalware.org>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

eca35a25

23 10月, 2008 1 次提交

rm: loosen safety valve for empty files · f55527f8

由 Jeff King 提交于 10月 21, 2008

If a file is different between the working tree copy, the index, and the
HEAD, then we do not allow it to be deleted without --force.

However, this is overly tight in the face of "git add --intent-to-add":

  $ git add --intent-to-add file
  $ : oops, I don't actually want to stage that yet
  $ git rm --cached file
  error: 'empty' has staged content different from both the
  file and the HEAD (use -f to force removal)
  $ git rm -f --cached file

Unfortunately, there is currently no way to distinguish between an empty
file that has been added and an "intent to add" file. The ideal behavior
would be to disallow the former while allowing the latter.

This patch loosens the safety valve to allow the deletion only if we are
deleting the cached entry and the cached content is empty.  This covers
the intent-to-add situation, and assumes there is little harm in not
protecting users who have legitimately added an empty file.  In many
cases, the file will still be empty, in which case the safety valve does
not trigger anyway (since the content remains untouched in the working
tree). Otherwise, we do remove the fact that no content was staged, but
given that the content is by definition empty, it is not terribly
difficult for a user to recreate it.

However, we still document the desired behavior in the form of two
tests. One checks the correct removal of an intent-to-add file. The other
checks that we still disallow removal of empty files, but is marked as
expect_failure to indicate this compromise. If the intent-to-add feature
is ever extended to differentiate between normal empty files and
intent-to-add files, then the safety valve can be re-tightened.
Signed-off-by: NJeff King <peff@peff.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

f55527f8

20 10月, 2008 1 次提交

Enhance hold_lock_file_for_{update,append}() API · acd3b9ec

由 Junio C Hamano 提交于 10月 17, 2008

This changes the "die_on_error" boolean parameter to a mere "flags", and
changes the existing callers of hold_lock_file_for_update/append()
functions to pass LOCK_DIE_ON_ERROR.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

acd3b9ec

18 10月, 2008 1 次提交

refactor handling of "other" files in ls-files and status · 98fa4738

由 Jeff King 提交于 10月 16, 2008

When the "git status" display code was originally converted
to C, we copied the code from ls-files to discover whether a
pathname returned by read_directory was an "other", or
untracked, file.

Much later, 5698454e updated the code in ls-files to handle
some new cases caused by gitlinks.  This left the code in
wt-status.c broken: it would display submodule directories
as untracked directories. Nobody noticed until now, however,
because unless status.showUntrackedFiles was set to "all",
submodule directories were not actually reported by
read_directory. So the bug was only triggered in the
presence of a submodule _and_ this config option.

This patch pulls the ls-files code into a new function,
cache_name_is_other, and uses it in both places. This should
leave the ls-files functionality the same and fix the bug
in status.
Signed-off-by: NJeff King <peff@peff.net>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

98fa4738

13 10月, 2008 1 次提交

Extend index to save more flags · 06aaaa0b

由 Nguyễn Thái Ngọc Duy 提交于 10月 01, 2008

The on-disk format of index only saves 16 bit flags, nearly all have
been used. The last bit (CE_EXTENDED) is used to for future extension.

This patch extends index entry format to save more flags in future.
The new entry format will be used when CE_EXTENDED bit is 1.

Because older implementation may not understand CE_EXTENDED bit and
misread the new format, if there is any extended entry in index, index
header version will turn 3, which makes it incompatible for older git.
If there is none, header version will return to 2 again.
Signed-off-by: NNguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: NShawn O. Pearce <spearce@spearce.org>

06aaaa0b

03 10月, 2008 2 次提交

fix openssl headers conflicting with custom SHA1 implementations · 9126f009

由 Nicolas Pitre 提交于 10月 01, 2008

On ARM I have the following compilation errors:

    CC fast-import.o
In file included from cache.h:8,
                 from builtin.h:6,
                 from fast-import.c:142:
arm/sha1.h:14: error: conflicting types for 'SHA_CTX'
/usr/include/openssl/sha.h:105: error: previous declaration of 'SHA_CTX' was here
arm/sha1.h:16: error: conflicting types for 'SHA1_Init'
/usr/include/openssl/sha.h:115: error: previous declaration of 'SHA1_Init' was here
arm/sha1.h:17: error: conflicting types for 'SHA1_Update'
/usr/include/openssl/sha.h:116: error: previous declaration of 'SHA1_Update' was here
arm/sha1.h:18: error: conflicting types for 'SHA1_Final'
/usr/include/openssl/sha.h:117: error: previous declaration of 'SHA1_Final' was here
make: *** [fast-import.o] Error 1

This is because openssl header files are always included in
git-compat-util.h since commit 684ec6c6 whenever NO_OPENSSL is not
set, which somehow brings in <openssl/sha1.h> clashing with the custom
ARM version.  Compilation of git is probably broken on PPC too for the
same reason.

Turns out that the only file requiring openssl/ssl.h and openssl/err.h
is imap-send.c.  But only moving those problematic includes there
doesn't solve the issue as it also includes cache.h which brings in the
conflicting local SHA1 header file.

As suggested by Jeff King, the best solution is to rename our references
to SHA1 functions and structure to something git specific, and define those
according to the implementation used.
Signed-off-by: NNicolas Pitre <nico@cam.org>
Signed-off-by: NShawn O. Pearce <spearce@spearce.org>

9126f009

config.c: make git_parse_long() static · 0433bcd9

由 Nanako Shiraishi 提交于 10月 02, 2008

This function is not used in any other file.
Signed-off-by: NNanako Shiraishi <nanako3@lavabit.com>
Signed-off-by: NShawn O. Pearce <spearce@spearce.org>

0433bcd9

01 10月, 2008 1 次提交

add have_git_dir() function · d2b0708e

由 Dmitry Potapov 提交于 9月 27, 2008

This function is used to learn whether git_dir is already set up or not.
It is necessary, because we want to read configuration in compat/cygwin.c
Signed-off-by: NDmitry Potapov <dpotapov@gmail.com>
Signed-off-by: NShawn O. Pearce <spearce@spearce.org>

d2b0708e

10 9月, 2008 3 次提交

push: receiver end advertises refs from alternate repositories · d79796bc

由 Junio C Hamano 提交于 9月 09, 2008

Earlier, when pushing into a repository that borrows from alternate object
stores, we followed the longstanding design decision not to trust refs in
the alternate repository that houses the object store we are borrowing
from. If your public repository is borrowing from Linus's public
repository, you pushed into it long time ago, and now when you try to push
your updated history that is in sync with more recent history from Linus,
you will end up sending not just your own development, but also the
changes you acquired through Linus's tree, even though the objects needed
for the latter already exists at the receiving end. This is because the
receiving end does not advertise that the objects only reachable from the
borrowed repository (i.e. Linus's) are already available there.

This solves the issue by making the receiving end advertise refs from
borrowed repositories. They are not sent with their true names but with a
phoney name ".have" to make sure that the old senders will safely ignore
them (otherwise, the old senders will misbehave, trying to push matching
refs, and mirror push that deletes refs that only exist at the receiving
end).
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

d79796bc

push: prepare sender to receive extended ref information from the receiver · 40c155ff

由 Junio C Hamano 提交于 9月 09, 2008

"git push" enhancement allows the receiving end to report not only its own
refs but refs in repositories it borrows from via the alternate object
store mechanism. By telling the sender that objects reachable from these
extra refs are already complete in the receiving end, the number of
objects that need to be transfered can be cut down.

These entries are sent over the wire with string ".have", instead of the
actual names of the refs. This string was chosen so that they are ignored
by older programs at the sending end. If we sent some random but valid
looking refnames for these entries, "matching refs" rule (triggered when
running "git push" without explicit refspecs, where the sender learns what
refs the receiver has, and updates only the ones with the names of the
refs the sender also has) and "delete missing" rule (triggered when "git
push --mirror" is used, where the sender tells the receiver to delete the
refs it itself does not have) would try to update/delete them, which is
not what we want.

This prepares the send-pack (and "push" that runs native protocol) to
accept extended existing ref information and make use of it. The ".have"
entries are excluded from ref matching rules, and are exempt from deletion
rule while pushing with --mirror option, but are still used for pack
generation purposes by providing more "bottom" range commits.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

40c155ff

is_directory(): a generic helper function · 90b4a71c

由 Junio C Hamano 提交于 9月 09, 2008

A simple "grep -e stat --and -e S_ISDIR" revealed there are many
open-coded implementations of this function.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

90b4a71c

01 9月, 2008 1 次提交

git-add --intent-to-add (-N) · 39425819

由 Junio C Hamano 提交于 8月 21, 2008

This adds "--intent-to-add" option to "git add".  This is to let the
system know that you will tell it the final contents to be staged later,
iow, just be aware of the presense of the path with the type of the blob
for now.  It is implemented by staging an empty blob as the content.

With this sequence:

    $ git reset --hard
    $ edit newfile
    $ git add -N newfile
    $ edit newfile oldfile
    $ git diff

the diff will show all changes relative to the current commit.  Then you
can do:

    $ git commit -a ;# commit everything

or

    $ git commit oldfile ;# only oldfile, newfile not yet added

to pretend you are working with an index-free system like CVS.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

39425819

24 8月, 2008 1 次提交

unpack_trees(): protect the handcrafted in-core index from read_cache() · 913e0e99

由 Junio C Hamano 提交于 8月 23, 2008

unpack_trees() rebuilds the in-core index from scratch by allocating a new
structure and finishing it off by copying the built one to the final
index.

The resulting in-core index is Ok for most use, but read_cache() does not
recognize it as such. The function is meant to be no-op if you already
have loaded the index, until you call discard_cache().

This change the way read_cache() detects an already initialized in-core
index, by introducing an extra bit, and marks the handcrafted in-core
index as initialized, to avoid this problem.

A better fix in the longer term would be to change the read_cache() API so
that it will always discard and re-read from the on-disk index to avoid
confusion. But there are higher level API that have relied on the current
semantics, and they and their users all need to get converted, which is
outside the scope of 'maint' track.

An example of such a higher level API is write_cache_as_tree(), which is
used by git-write-tree as well as later Porcelains like git-merge, revert
and cherry-pick. In the longer term, we should remove read_cache() from
there and add one to cmd_write_tree(); other callers expect that the
in-core index they prepared is what gets written as a tree so no other
change is necessary for this particular codepath.

The original version of this patch marked the index by pointing an
otherwise wasted malloc'ed memory with o->result.alloc, but this version
uses Linus's idea to use a new "initialized" bit, which is conceptually
much cleaner.
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

913e0e99

23 8月, 2008 1 次提交

Extend "checkout --track" DWIM to support more cases · 9188ed89

由 Alex Riesen 提交于 8月 21, 2008

The code handles additionally "refs/remotes/<something>/name",
"remotes/<something>/name", and "refs/<namespace>/name".
Signed-off-by: NAlex Riesen <raa.lkml@gmail.com>
Signed-off-by: NJunio C Hamano <gitster@pobox.com>

9188ed89

李少辉-开发者 / git 与 Fork 源项目一致

李少辉-开发者 / git
与 Fork 源项目一致