1. 12 Mar 2014 (1 commit)
  2. 28 Feb 2014 (1 commit)
    • shallow: automatically clean up shallow tempfiles · 0179c945
      Authored by Jeff King
      We sometimes write tempfiles of the form "shallow_XXXXXX"
      during fetch/push operations with shallow repositories.
      Under normal circumstances, we clean up the result when we
      are done. However, we do not take steps to clean up after
      ourselves when we exit due to die() or signal death.
      
      This patch teaches the tempfile creation code to register
      handlers to clean up after ourselves. To handle this, we
      change the ownership semantics of the filename returned by
      setup_temporary_shallow. It now keeps a copy of the filename
      itself, and returns only a const pointer to it.
      
      We can also do away with explicit tempfile removal in the
      callers. They all exit not long after finishing with the
      file, so they can rely on the auto-cleanup, simplifying the
      code.
      
      Note that we keep things simple and maintain only a single
      filename to be cleaned. This is sufficient for the current
      caller, but we future-proof it with a die("BUG").
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
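      For illustration only, a minimal sketch of the pattern this describes, using made-up helper names rather than git's actual setup_temporary_shallow() code: remember a single tempfile, register atexit/signal handlers that remove it, and treat a second registration as a BUG:

        #include <limits.h>
        #include <signal.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <unistd.h>

        /* Hypothetical names; git's real implementation differs in detail. */
        static char shallow_tempfile[PATH_MAX];

        static void remove_shallow_tempfile(void)
        {
            if (*shallow_tempfile)
                unlink(shallow_tempfile);
        }

        static void remove_shallow_tempfile_on_signal(int signo)
        {
            remove_shallow_tempfile();
            signal(signo, SIG_DFL);
            raise(signo);
        }

        /* Remember one tempfile and arrange for it to be removed on exit/death. */
        static const char *register_shallow_tempfile(const char *path)
        {
            if (*shallow_tempfile) {
                fprintf(stderr, "BUG: only one shallow tempfile is supported\n");
                exit(1);
            }
            snprintf(shallow_tempfile, sizeof(shallow_tempfile), "%s", path);
            atexit(remove_shallow_tempfile);
            signal(SIGINT, remove_shallow_tempfile_on_signal);
            signal(SIGTERM, remove_shallow_tempfile_on_signal);
            return shallow_tempfile; /* callers get a const pointer, not ownership */
        }

        int main(void)
        {
            char path[] = "shallow_XXXXXX";
            int fd = mkstemp(path);

            if (fd < 0)
                return 1;
            register_shallow_tempfile(path);
            /* ... use the file; it is removed even if we exit early ... */
            close(fd);
            return 0;
        }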
  3. 21 Feb 2014 (1 commit)
  4. 11 Dec 2013 (4 commits)
  5. 06 Dec 2013 (1 commit)
    • replace {pre,suf}fixcmp() with {starts,ends}_with() · 59556548
      Authored by Christian Couder
      Leaving only the function definitions and declarations so that any
      new topic in flight can still make use of the old functions, replace
      existing uses of the prefixcmp() and suffixcmp() with new API
      functions.
      
      The change can be recreated by mechanically applying this:
      
          $ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
            grep -v strbuf\\.c |
            xargs perl -pi -e '
              s|!prefixcmp\(|starts_with\(|g;
              s|prefixcmp\(|!starts_with\(|g;
              s|!suffixcmp\(|ends_with\(|g;
              s|suffixcmp\(|!ends_with\(|g;
            '
      
      on the result of preparatory changes in this series.
      Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
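      As a rough illustration of what the conversion means at a call site, here is a self-contained sketch; the stand-in definitions below are simplified and are not git's own implementations:

        #include <stdio.h>
        #include <string.h>

        /* Simplified stand-ins for git's helpers. */
        static int starts_with(const char *str, const char *prefix)
        {
            return !strncmp(str, prefix, strlen(prefix));
        }

        static int ends_with(const char *str, const char *suffix)
        {
            size_t len = strlen(str), slen = strlen(suffix);
            return len >= slen && !strcmp(str + len - slen, suffix);
        }

        int main(void)
        {
            const char *ref = "refs/heads/master";

            /* old style: if (!prefixcmp(ref, "refs/heads/")) */
            if (starts_with(ref, "refs/heads/"))
                printf("branch: %s\n", ref + strlen("refs/heads/"));

            /* old style: if (!suffixcmp(ref, "/master")) */
            if (ends_with(ref, "/master"))
                printf("points at master\n");
            return 0;
        }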
  6. 19 Nov 2013 (1 commit)
  7. 25 Oct 2013 (1 commit)
    • use parse_commit_or_die instead of custom message · 367068e0
      Authored by Jeff King
      Many calls to parse_commit detect errors and die. In some
      cases, the custom error messages are more useful than what
      parse_commit_or_die could produce, because they give some
      context, like which ref the commit came from. Some, however,
      just say "invalid commit". Let's convert the latter to use
      parse_commit_or_die; its message is slightly more informative,
      and it makes the error more consistent throughout git.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  8. 18 Sep 2013 (3 commits)
  9. 10 Sep 2013 (2 commits)
    • upload-pack: bump keepalive default to 5 seconds · 115dedd7
      Authored by Jeff King
      There is no reason not to turn on keepalives by default.
      They take very little bandwidth, and significantly less than
      the progress reporting they are replacing. And in the case
      that progress reporting is on, we should never need to send
      a keepalive anyway, as we will constantly be showing
      progress and resetting the keepalive timer.
      
      We do not necessarily know what the client's idea of a
      reasonable timeout is, so let's keep this on the low side of
      5 seconds. That is high enough that we will always prefer
      our normal 1-second progress reports to sending a keepalive
      packet, but low enough that no sane client should consider
      the connection hung.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • upload-pack: send keepalive packets during pack computation · 05e95155
      Authored by Jeff King
      When upload-pack has started pack-objects, there may be a quiet
      period while pack-objects prepares the pack (i.e., counting objects
      and delta compression). Normally we would see (and send to the
      client) progress information, but if "--quiet" is in effect,
      pack-objects will produce nothing at all until the pack data is
      ready. On a large repository, this can take tens of seconds (or even
      minutes if the system is loaded or the repository is badly packed).
      Clients or intermediate proxies can sometimes give up in this
      situation, assuming that the server or connection has hung.
      
      This patch introduces a "keepalive" option; if upload-pack sees no
      data from pack-objects for a certain number of seconds, it will send
      an empty sideband data packet to let the other side know that we are
      still working on it.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
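      A sketch of the idea under assumed names (the real loop in upload-pack also multiplexes stderr and sideband-wraps the data before forwarding it): poll the pipe from pack-objects with a timeout, and when the timeout expires send an empty sideband data packet, i.e. a pkt-line whose payload is just the stream-code byte:

        #include <poll.h>
        #include <stdio.h>
        #include <unistd.h>

        /* Hypothetical relay loop, much simplified from what upload-pack does. */
        static int relay_with_keepalive(int in_fd, int out_fd, int keepalive_secs)
        {
            char buf[8192];

            for (;;) {
                struct pollfd pfd = { in_fd, POLLIN, 0 };
                int ret = poll(&pfd, 1, keepalive_secs * 1000);
                ssize_t n;

                if (ret < 0)
                    return -1;
                if (ret == 0) {
                    /* nothing from pack-objects yet: empty sideband-1 packet */
                    if (write(out_fd, "0005\1", 5) != 5)
                        return -1;
                    continue;
                }
                n = read(in_fd, buf, sizeof(buf));
                if (n <= 0)
                    return n ? -1 : 0;      /* error or EOF */
                /* the real code sideband-wraps this before writing it out */
                if (write(out_fd, buf, (size_t)n) != n)
                    return -1;
            }
        }

        int main(void)
        {
            /* demo: relay stdin to stdout, keepalive every 5 seconds */
            return relay_with_keepalive(0, 1, 5) ? 1 : 0;
        }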
  10. 29 Aug 2013 (1 commit)
    • upload-pack: delegate rev walking in shallow fetch to pack-objects · cdab4858
      Authored by Nguyễn Thái Ngọc Duy
      upload-pack has a special revision walking code for shallow
      recipients. It works almost like the similar code in pack-objects
      except:
      
      1. in upload-pack, graft points could be added for deepening;
      
      2. also when the repository is deepened, the shallow point will be
         moved further away from the tip, but the old shallow point will be
         marked as edge to produce more efficient packs. See 6523078b (make
         shallow repository deepening more network efficient - 2009-09-03).
      
      Pass the file to pack-objects via --shallow-file. This will override
      $GIT_DIR/shallow and give pack-objects the exact repository shape
      that upload-pack has.
      
      Edge commits are marked via revision command arguments. Even if old shallow
      points are passed as "--not" revisions as in this patch, they will not
      be picked up by mark_edges_uninteresting() because this function looks
      up to parents for edges, while in this case the edge is the children,
      in the opposite direction. This will be fixed in a later patch when
      all given uninteresting commits are marked as edges.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  11. 09 Jul 2013 (1 commit)
    • cache.h: move remote/connect API out of it · 47a59185
      Authored by Junio C Hamano
      The definition of "struct ref" in "cache.h", a header file so
      central to the system, always confused me.  This structure is not
      about the local ref used by sha1-name API to name local objects.
      
      It is what refspecs are expanded into, after finding out what refs
      the other side has, to define what refs are updated after object
      transfer succeeds to what values.  It belongs to "remote.h" together
      with "struct refspec".
      
      While we are at it, also move the types and functions related to the
      Git transport connection to a new header file connect.h
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  12. 29 Apr 2013 (1 commit)
    • upload-pack: ignore 'shallow' lines with unknown obj-ids · af04fa2a
      Authored by Michael Heemskerk
      When the client sends a 'shallow' line for an object that the server does
      not have, the server currently dies with the error: "did not find object
      for shallow <obj-id>".  The client may have truncated the history at
      the commit by fetching shallowly from a different server, or the commit
      may have been garbage collected by the server. In either case, this
      unknown commit is not relevant for calculating the pack that is to be
      sent and can be safely ignored, and it is not used when recomputing where
      the updated history of the client is cauterised.
      
      The documentation in technical/pack-protocol.txt has been updated to
      remove the restriction that "Clients MUST NOT mention an obj-id which it
      does not know exists on the server". This requirement is not realistic
      because clients cannot know whether an object has been garbage collected
      by the server.
      Signed-off-by: Michael Heemskerk <mheemskerk@atlassian.com>
      Reviewed-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  13. 17 Mar 2013 (3 commits)
    • upload-pack: load non-tip "want" objects from disk · f59de5d1
      Authored by Jeff King
      It is a long-time security feature that upload-pack will not
      serve any "want" lines that do not correspond to the tip of
      one of our refs. Traditionally, this was enforced by
      checking the objects in the in-memory hash; they should have
      been loaded and received the OUR_REF flag during the
      advertisement.
      
      The stateless-rpc mode, however, has a race condition here:
      one process advertises, and another receives the want lines,
      so the refs may have changed in the interim.  To address
      this, commit 051e4005 added a new verification mode; if the
      object is not OUR_REF, we set a "has_non_tip" flag, and then
      later verify that the requested objects are reachable from
      our current tips.
      
      However, we still die immediately when the object is not in
      our in-memory hash, and at this point we should only have
      loaded our tip objects. So the check_non_tip code path does
      not ever actually trigger, as any non-tip objects would
      have already caused us to die.
      
      We can fix that by using parse_object instead of
      lookup_object, which will load the object from disk if it
      has not already been loaded.
      
      We still need to check that parse_object does not return
      NULL, though, as it is possible we do not have the object
      at all. A more appropriate error message would be "no such
      object" rather than "not our ref"; however, we do not want
      to leak information about what objects are or are not in
      the object database, so we continue to use the same "not
      our ref" message that would be produced by an unreachable
      object.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • upload-pack: make sure "want" objects are parsed · 06f15bf1
      Authored by Jeff King
      When upload-pack receives a "want" line from the client, it
      adds it to an object array. We call lookup_object to find
      the actual object, which will only check for objects already
      in memory. This works because we are expecting to find
      objects that we already loaded during the ref advertisement.
      
      We use the resulting object structs for a variety of
      purposes. Some of them care only about the object flags, but
      others care about the type of the object (e.g.,
      ok_to_give_up), or even feed them to the revision parser
      (when --depth is used), which assumes that objects it
      receives are fully parsed.
      
      Once upon a time, this was OK; any object we loaded into
      memory would also have been parsed. But since 435c8332
      (upload-pack: use peel_ref for ref advertisements,
      2012-10-04), we try to avoid parsing objects during the ref
      advertisement. This means that lookup_object may return an
      object with a type of OBJ_NONE. The resulting mess depends
      on the exact set of objects, but can include the revision
      parser barfing, or the shallow code sending the wrong set of
      objects.
      
      This patch teaches upload-pack to parse each "want" object
      as we receive it. We do not replace the lookup_object call
      with parse_object, as the current code is careful not to let
      just any object appear on a "want" line, but rather only one
      we have previously advertised (whereas parse_object would
      actually load any arbitrary object from disk).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • upload-pack: drop lookup-before-parse optimization · a6eec126
      Authored by Jeff King
      When we receive a "have" line from the client, we want to
      load the object pointed to by the sha1. However, we are
      careful to do:
      
        o = lookup_object(sha1);
        if (!o || !o->parsed)
      	  o = parse_object(sha1);
      
      to avoid loading the object from disk if we have already
      seen it.  However, since ccdc6037 (parse_object: try internal
      cache before reading object db), parse_object already does
      this optimization internally. We can just call parse_object
      directly.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  14. 21 Feb 2013 (6 commits)
    • pkt-line: provide a LARGE_PACKET_MAX static buffer · 74543a04
      Authored by Jeff King
      Most of the callers of packet_read_line just read into a
      static 1000-byte buffer (callers which handle arbitrary
      binary data already use LARGE_PACKET_MAX). This works fine
      in practice, because:
      
        1. The only variable-sized data in these lines is a ref
           name, and refs tend to be a lot shorter than 1000
           characters.
      
        2. When sending ref lines, git-core always limits itself
           to 1000 byte packets.
      
      However, the only limit given in the protocol specification
      in Documentation/technical/protocol-common.txt is
      LARGE_PACKET_MAX; the 1000 byte limit is mentioned only in
      pack-protocol.txt, and then only describing what we write,
      not as a specific limit for readers.
      
      This patch lets us bump the 1000-byte limit to
      LARGE_PACKET_MAX. Even though git-core will never write a
      packet where this makes a difference, there are two good
      reasons to do this:
      
        1. Other git implementations may have followed
           protocol-common.txt and used a larger maximum size. We
           don't bump into it in practice because it would involve
           very long ref names.
      
        2. We may want to increase the 1000-byte limit one day.
           Since packets are transferred before any capabilities,
           it's difficult to do this in a backwards-compatible
           way. But if we bump the size of buffer the readers can
           handle, eventually older versions of git will be
           obsolete enough that we can justify bumping the
           writers, as well. We don't have plans to do this
           anytime soon, but there is no reason not to start the
           clock ticking now.
      
      Just bumping all of the reading bufs to LARGE_PACKET_MAX
      would waste memory. Instead, since most readers just read
      into a temporary buffer anyway, let's provide a single
      static buffer that all callers can use. We can further wrap
      this detail away by having the packet_read_line wrapper just
      use the buffer transparently and return a pointer to the
      static storage.  That covers most of the cases, and the
      remaining ones already read into their own LARGE_PACKET_MAX
      buffers.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • pkt-line: teach packet_read_line to chomp newlines · 819b929d
      Authored by Jeff King
      The packets sent during ref negotiation are all terminated
      by newline; even though the code to chomp these newlines is
      short, we end up doing it in a lot of places.
      
      This patch teaches packet_read_line to auto-chomp the
      trailing newline; this lets us get rid of a lot of inline
      chomping code.
      
      As a result, some call-sites which are not reading
      line-oriented data (e.g., when reading chunks of packfiles
      alongside sideband) transition away from packet_read_line to
      the generic packet_read interface. This patch converts all
      of the existing callsites.
      
      Since the function signature of packet_read_line does not
      change (but its behavior does), there is a possibility of
      new callsites being introduced in later commits, silently
      introducing an incompatibility.  However, since a later
      patch in this series will change the signature, such a
      commit would have to be merged directly into this commit,
      not to the tip of the series; we can therefore ignore the
      issue.
      
      This is an internal cleanup and should produce no change of
      behavior in the normal case. However, there is one corner
      case to note. Callers of packet_read_line have never been
      able to tell the difference between a flush packet ("0000")
      and an empty packet ("0004"), as both cause packet_read_line
      to return a length of 0. Readers treat them identically,
      even though Documentation/technical/protocol-common.txt says
      we must not; it also says that implementations should not
      send an empty pkt-line.
      
      By stripping out the newline before the result gets to the
      caller, we will now treat the newline-only packet ("0005\n")
      the same as an empty packet, which in turn gets treated like
      a flush packet. In practice this doesn't matter, as neither
      empty nor newline-only packets are part of git's protocols
      (at least not for the line-oriented bits, and readers who
      are not expecting line-oriented packets will be calling
      packet_read directly, anyway). But even if we do decide to
      care about the distinction later, it is orthogonal to this
      patch.  The right place to tighten would be to stop treating
      empty packets as flush packets, and this change does not
      make doing so any harder.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
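      A simplified, self-contained sketch of the combined effect of this change and the LARGE_PACKET_MAX change above (illustrative names; the real packet_read()/packet_read_line() live in pkt-line.c and read from a file descriptor): one static buffer shared by line-oriented callers, with the trailing newline chomped before the pointer is handed back:

        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        #define LARGE_PACKET_MAX 65520

        static char packet_buffer[LARGE_PACKET_MAX];

        /* Read one pkt-line into the shared static buffer, strip a trailing
         * newline, and return a pointer to the static storage; NULL on a
         * flush packet ("0000"), EOF, or a malformed length. Sketch only. */
        static char *packet_read_line_demo(FILE *in, int *len_out)
        {
            char hex[5] = "";
            long len;

            if (fread(hex, 1, 4, in) != 4)
                return NULL;                      /* EOF */
            len = strtol(hex, NULL, 16);
            if (len == 0)
                return NULL;                      /* flush packet "0000" */
            len -= 4;                             /* payload length */
            if (len < 0 || len >= LARGE_PACKET_MAX)
                return NULL;                      /* malformed or oversized */
            if (fread(packet_buffer, 1, (size_t)len, in) != (size_t)len)
                return NULL;
            if (len && packet_buffer[len - 1] == '\n')
                len--;                            /* auto-chomp the newline */
            packet_buffer[len] = '\0';
            if (len_out)
                *len_out = (int)len;
            return packet_buffer;                 /* points into the static buffer */
        }

        int main(void)
        {
            int len;
            char *line;

            while ((line = packet_read_line_demo(stdin, &len)))
                printf("%d: %s\n", len, line);
            return 0;
        }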
    • pkt-line: drop safe_write function · cdf4fb8e
      Authored by Jeff King
      This is just write_or_die by another name. The one
      distinction is that write_or_die will treat EPIPE specially
      by suppressing error messages. That's fine, as we die by
      SIGPIPE anyway (and in the off chance that it is disabled,
      write_or_die will simulate it).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • upload-pack: remove packet debugging harness · 97a83fa8
      Authored by Jeff King
      If you set the GIT_DEBUG_SEND_PACK environment variable,
      upload-pack will dump lines it receives in the receive_needs
      phase to a descriptor. This debugging harness is a strict
      subset of what GIT_TRACE_PACKET can do. Let's just drop it
      in favor of that.
      
      A few tests used GIT_DEBUG_SEND_PACK to confirm which
      objects get sent; we have to adapt them to the new output
      format.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • upload-pack: do not add duplicate objects to shallow list · e58e57e4
      Authored by Jeff King
      When the client tells us it has a shallow object via
      "shallow <sha1>", we make sure we have the object, mark it
      with a flag, then add it to a dynamic array of shallow
      objects. This means that a client can get us to allocate
      arbitrary amounts of memory just by flooding us with shallow
      lines (whether they have the objects or not). You can
      demonstrate it easily with:
      
        yes '0035shallow e83c5163' |
        git-upload-pack git.git
      
      We already protect against duplicates in want lines by
      checking if our flag is already set; let's do the same thing
      here. Note that a client can still get us to allocate some
      amount of memory by marking every object in the repo as
      "shallow" (or "want"). But this at least bounds it with the
      number of objects in the repository, which is not under the
      control of an upload-pack client.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
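      A generic sketch of the duplicate check, using made-up types rather than upload-pack's actual structures: the first "shallow" line for an object sets a flag bit, and any repeat is skipped instead of growing the array again:

        #include <stdio.h>
        #include <stdlib.h>

        #define SHALLOW_FLAG 0x01

        /* Stand-ins for upload-pack's object and object-array handling. */
        struct obj { unsigned flags; };

        struct obj_array {
            size_t nr, alloc;
            struct obj **items;
        };

        static void add_shallow(struct obj_array *shallows, struct obj *o)
        {
            if (o->flags & SHALLOW_FLAG)
                return;                 /* already seen: do not grow the array */
            o->flags |= SHALLOW_FLAG;
            if (shallows->nr == shallows->alloc) {
                shallows->alloc = shallows->alloc ? 2 * shallows->alloc : 16;
                shallows->items = realloc(shallows->items,
                                          shallows->alloc * sizeof(*shallows->items));
                if (!shallows->items)
                    exit(1);
            }
            shallows->items[shallows->nr++] = o;
        }

        int main(void)
        {
            struct obj_array shallows = { 0, 0, NULL };
            struct obj o = { 0 };
            int i;

            /* a client repeating the same shallow line only ever costs one slot */
            for (i = 0; i < 1000000; i++)
                add_shallow(&shallows, &o);
            printf("array entries: %lu\n", (unsigned long)shallows.nr);
            free(shallows.items);
            return 0;
        }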
    • upload-pack: use get_sha1_hex to parse "shallow" lines · b7b02170
      Authored by Jeff King
      When we receive a line like "shallow <sha1>" from the
      client, we feed the <sha1> part to get_sha1. This is a
      mistake, as the argument on a shallow line is defined by
      Documentation/technical/pack-protocol.txt to contain an
      "obj-id".  This is never defined in the BNF, but it is clear
      from the text and from the other uses that it is meant to be
      a hex sha1, not an arbitrary identifier (and that is what
      fetch-pack has always sent).
      
      We should be using get_sha1_hex instead, which doesn't allow
      the client to request arbitrary junk like "HEAD@{yesterday}".
      Because this is just marking shallow objects, the client
      couldn't actually do anything interesting (like fetching
      objects from unreachable reflog entries), but we should keep
      our parsing tight to be on the safe side.
      
      Because get_sha1 is for the most part a superset of
      get_sha1_hex, in theory the only behavior change should be
      disallowing non-hex object references. However, there is
      one interesting exception: get_sha1 will only parse
      a 40-character hex sha1 if the string has exactly 40
      characters, whereas get_sha1_hex will just eat the first 40
      characters, leaving the rest. That means that current
      versions of git-upload-pack will not accept a "shallow"
      packet that has a trailing newline, even though the protocol
      documentation is clear that newlines are allowed (even
      encouraged) in non-binary parts of the protocol.
      
      This never mattered in practice, though, because fetch-pack,
      contrary to the protocol documentation, does not include a
      newline in its shallow lines. JGit follows its lead (though
      it correctly is strict on the parsing end about wanting a
      hex object id).
      
      We do not adjust fetch-pack to send newlines here, as it
      would break communication with older versions of git (and
      there is no actual benefit to doing so, except for
      consistency with other parts of the protocol).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
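      A standalone sketch of the distinction (hypothetical helper; git's real get_sha1_hex() differs in detail): consume exactly 40 hex digits, ignore whatever follows, and reject anything that is not plain hex, such as "HEAD@{yesterday}":

        #include <stdio.h>

        static int hexval(int c)
        {
            if (c >= '0' && c <= '9') return c - '0';
            if (c >= 'a' && c <= 'f') return c - 'a' + 10;
            if (c >= 'A' && c <= 'F') return c - 'A' + 10;
            return -1;
        }

        /* Roughly what a strict hex parser does; not git's actual code. */
        static int parse_sha1_hex(const char *hex, unsigned char *sha1)
        {
            int i;

            for (i = 0; i < 20; i++) {
                int hi = hexval((unsigned char)hex[0]);
                int lo = hexval((unsigned char)hex[1]);
                if (hi < 0 || lo < 0)
                    return -1;
                sha1[i] = (unsigned char)((hi << 4) | lo);
                hex += 2;               /* anything after 40 digits is ignored */
            }
            return 0;
        }

        int main(void)
        {
            unsigned char sha1[20];

            /* a trailing newline after the 40 hex digits is accepted ... */
            printf("hex:  %d\n",
                   parse_sha1_hex("0123456789abcdef0123456789abcdef01234567\n", sha1));
            /* ... but extended names like reflog syntax are rejected */
            printf("junk: %d\n", parse_sha1_hex("HEAD@{yesterday}", sha1));
            return 0;
        }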
  15. 08 Feb 2013 (2 commits)
    • upload-pack: optionally allow fetching from the tips of hidden refs · 390eb36b
      Authored by Junio C Hamano
      With uploadpack.allowtipsha1inwant configuration option set, future
      versions of "git fetch" that allow an exact object name (likely to
      have been obtained out of band) on the LHS of the fetch refspec can
      make a request with a "want" line that names an object that may not
      have been advertised due to transfer.hiderefs configuration.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • upload/receive-pack: allow hiding ref hierarchies · daebaa78
      Authored by Junio C Hamano
      A repository may have refs that are only used for its internal
      bookkeeping purposes that should not be exposed to the others that
      come over the network.
      
      Teach upload-pack to omit some refs from its initial advertisement
      by paying attention to the uploadpack.hiderefs multi-valued
      configuration variable.  Do the same to receive-pack via the
      receive.hiderefs variable.  As a convenient short-hand, allow using
      transfer.hiderefs to set the value to both of these variables.
      
      Any ref that is under the hierarchies listed on the value of these
      variable is excluded from responses to requests made by "ls-remote",
      "fetch", etc. (for upload-pack) and "push" (for receive-pack).
      
      Because these hidden refs do not count as OUR_REF, an attempt to
      fetch objects at the tip of them will be rejected, and because these
      refs do not get advertised, "git push :" will not see local branches
      that have the same name as them as "matching" ones to be sent.
      
      An attempt to update/delete these hidden refs with an explicit
      refspec, e.g. "git push origin :refs/hidden/22", is rejected.  This
      is not a new restriction.  To the pusher, it would appear that there
      is no such ref, so its push request will conclude with "Now that I
      sent you all the data, it is time for you to update the refs.  I saw
      that the ref did not exist when I started pushing, and I want the
      result to point at this commit".  The receiving end will apply the
      compare-and-swap rule to this request and rejects the push with
      "Well, your update request conflicts with somebody else; I see there
      is such a ref.", which is the right thing to do. Otherwise a push to
      a hidden ref will always be "the last one wins", which is not a good
      default.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
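      A self-contained sketch of the matching rule under assumed names (not git's actual implementation): a ref counts as hidden when it is equal to, or lies under, one of the configured hierarchies:

        #include <stdio.h>
        #include <string.h>

        /* Sketch: "hidden" lists entries such as those configured via
         * uploadpack.hiderefs / receive.hiderefs / transfer.hiderefs. */
        static int ref_is_hidden_demo(const char *refname,
                                      const char **hidden, size_t nr_hidden)
        {
            size_t i;

            for (i = 0; i < nr_hidden; i++) {
                size_t len = strlen(hidden[i]);
                if (strncmp(refname, hidden[i], len))
                    continue;
                if (refname[len] == '\0' || refname[len] == '/')
                    return 1;
            }
            return 0;
        }

        int main(void)
        {
            const char *hidden[] = { "refs/hidden", "refs/changes" };
            const char *refs[] = {
                "refs/heads/master",
                "refs/hidden/22",
                "refs/changes/40/12340/1",
            };
            size_t i;

            for (i = 0; i < sizeof(refs) / sizeof(refs[0]); i++)
                printf("%-26s %s\n", refs[i],
                       ref_is_hidden_demo(refs[i], hidden, 2) ? "hidden" : "advertised");
            return 0;
        }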
  16. 29 Jan 2013 (1 commit)
    • upload-pack: simplify request validation · 3f1da57f
      Authored by Junio C Hamano
      A long time ago, we used to punt on a large (read: asking for more
      than 256 refs) fetch request and instead sent a full pack, because
      we couldn't fit many refs on the command line of rev-list we run
      internally to enumerate the objects to be sent.  To fix this,
      565ebbf7 (upload-pack: tighten request validation., 2005-10-24),
      added a check to count the number of refs in the request and matched it
      with the number of refs we advertised, and changed the invocation of
      rev-list to pass "--all" to it, still keeping us under the command
      line argument limit.
      
      However, these days we feed the list of objects requested and the
      list of objects the other end is known to have via standard input,
      so there is no longer a valid reason to special case a full clone
      request.  Remove the code associated with "create_full_pack" to
      simplify the logic.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  17. 19 Jan 2013 (1 commit)
    • upload-pack: share more code · cbbe50db
      Authored by Junio C Hamano
      We mark the objects pointed at by our refs with the "OUR_REF" flag in two
      functions (mark_our_ref() and send_ref()), but we can just use the
      former as a helper for the latter.
      
      Update the way mark_our_ref() prepares in-core object to use
      lookup_unknown_object() to delay reading the actual object data,
      just like we did in 435c8332 (upload-pack: use peel_ref for ref
      advertisements, 2012-10-04).
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  18. 12 Jan 2013 (1 commit)
    • fetch: add --unshallow for turning shallow repo into complete one · 4dcb167f
      Authored by Nguyễn Thái Ngọc Duy
      The user can do --depth=2147483647 (*) for restoring complete repo
      now. But it's hard to remember. Any other numbers larger than the
      longest commit chain in the repository would also do, but some
      guessing may be involved. Make easy-to-remember --unshallow an alias
      for --depth=2147483647.
      
      Make upload-pack recognize this special number as infinite depth. The
      effect is essentially the same as before, except that upload-pack is
      more efficient because it does not have to traverse to the bottom
      anymore.
      
      The chance of a user actually wanting exactly 2147483647 commits
      depth, not infinite, on a repository with a history that long, is
      probably too small to consider. The client can learn to add or
      subtract one commit to avoid the special treatment when that actually
      happens.
      
      (*) This is the largest positive number a 32-bit signed integer can
          contain. JGit and older C Git store depth as "int" so both are OK
          with this number. Dulwich does not support shallow clone.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  19. 09 Jan 2013 (1 commit)
  20. 05 Oct 2012 (1 commit)
    • upload-pack: use peel_ref for ref advertisements · 435c8332
      Authored by Jeff King
      When upload-pack advertises refs, we attempt to peel tags
      and advertise the peeled version. We currently hand-roll the
      tag dereferencing, and use as many optimizations as we can
      to avoid loading non-tag objects into memory.
      
      Not only has peel_ref recently learned these optimizations,
      too, but it also contains an even more important one: it
      has access to the "peeled" data from the pack-refs file.
      That means we can avoid not only loading annotated tags
      entirely, but also avoid doing any kind of object lookup at
      all.
      
      This cut the CPU time to advertise refs by 50% in the
      linux-2.6 repo, as measured by:
      
        echo 0000 | git-upload-pack . >/dev/null
      
      best-of-five, warm cache, objects and refs fully packed:
      
        [before]             [after]
        real    0m0.026s     real    0m0.013s
        user    0m0.024s     user    0m0.008s
        sys     0m0.000s     sys     0m0.000s
      
      Those numbers are irrelevantly small compared to an actual
      fetch. Here's a larger repo (400K refs, of which 12K are
      unique, and of which only 107 are unique annotated tags):
      
        [before]             [after]
        real    0m0.704s     real    0m0.596s
        user    0m0.600s     user    0m0.496s
        sys     0m0.096s     sys     0m0.092s
      
      This shows only a 15% speedup (mostly because it has fewer
      actual tags to parse), but a larger absolute value (100ms,
      which isn't a lot compared to a real fetch, but this
      advertisement happens on every fetch, even if the client is
      just finding out they are completely up to date).
      
      In truly pathological cases, where you have a large number
      of unique annotated tags, it can make an even bigger
      difference. Here are the numbers for a linux-2.6 repository
      that has had every seventh commit tagged (so about 50K
      tags):
      
        [before]             [after]
        real    0m0.443s     real    0m0.097s
        user    0m0.416s     user    0m0.080s
        sys     0m0.024s     sys     0m0.012s
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  21. 04 Aug 2012 (1 commit)
    • include agent identifier in capability string · ff5effdf
      Authored by Jeff King
      Instead of having the client advertise a particular version
      number in the git protocol, we have managed extensions and
      backwards compatibility by having clients and servers
      advertise capabilities that they support. This is far more
      robust than having each side consult a table of
      known versions, and provides sufficient information for the
      protocol interaction to complete.
      
      However, it does not allow servers to keep statistics on
      which client versions are being used. This information is
      not necessary to complete the network request (the
      capabilities provide enough information for that), but it
      may be helpful to conduct a general survey of client
      versions in use.
      
      We already send the client version in the user-agent header
      for http requests; adding it here allows us to gather
      similar statistics for non-http requests.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  22. 09 Jan 2012 (1 commit)
    • server_supports(): parse feature list more carefully · f47182c8
      Authored by Junio C Hamano
      We have been carefully choosing feature names used in the protocol
      extensions so that the vocabulary does not contain a word that is a
      substring of another word, so it is not a real problem, but we have
      recently added "quiet" feature word, which would mean we cannot later
      add some other word with "quiet" (e.g. "quiet-push"), which is awkward.
      
      Let's make sure that we will eventually be able to do so by teaching the
      clients and servers that feature words consist of non-whitespace
      letters. This parser also allows us to later add features with parameters,
      e.g. "feature=1.5" (parameter values need to be quoted if they contain
      whitespace, but we will worry about the details when we do introduce them).
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      Signed-off-by: Clemens Buchacher <drizzd@aon.at>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
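      A minimal sketch of the stricter matching described here, with a hypothetical helper name rather than the real server_supports() code: a feature matches only as a whole whitespace-delimited word, optionally followed by "=value", so "quiet" would no longer match inside a future "quiet-push":

        #include <stdio.h>
        #include <string.h>

        /* Return a pointer just past "feature" in feature_list if it appears
         * as a whole word (delimited by whitespace or end of string, optionally
         * followed by "=value"); otherwise NULL. Sketch only. */
        static const char *parse_feature_value_demo(const char *feature_list,
                                                    const char *feature)
        {
            size_t len = strlen(feature);
            const char *p = feature_list;

            while ((p = strstr(p, feature)) != NULL) {
                int starts_word = (p == feature_list || p[-1] == ' ' || p[-1] == '\n');
                char end = p[len];

                if (starts_word &&
                    (end == '\0' || end == ' ' || end == '\n' || end == '='))
                    return p + len;
                p += len;
            }
            return NULL;
        }

        int main(void)
        {
            const char *caps = "multi_ack side-band-64k quiet-push agent=git/1.7.9";

            printf("quiet:      %s\n", parse_feature_value_demo(caps, "quiet") ? "yes" : "no");
            printf("quiet-push: %s\n", parse_feature_value_demo(caps, "quiet-push") ? "yes" : "no");
            printf("agent:      %s\n", parse_feature_value_demo(caps, "agent") ? "yes" : "no");
            return 0;
        }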
  23. 07 Jan 2012 (2 commits)
    • upload-pack: avoid parsing tag destinations · 90108a24
      Authored by Jeff King
      When upload-pack advertises refs, it dereferences any tags
      it sees, and shows the resulting sha1 to the client. It does
      this by calling deref_tag. That function must load and parse
      each tag object to find the sha1 of the tagged object.
      However, it also ends up parsing the tagged object itself,
      which is not strictly necessary for upload-pack's use.
      
      Each tag produces two object loads (assuming it is not a
      recursive tag), when it could get away with only a single
      one. Dropping the second load halves the effort we spend.
      
      The downside is that we are no longer verifying the
      resulting object by loading it. In particular:
      
        1. We never cross-check the "type" field given in the tag
           object with the type of the pointed-to object.  If the
           tag says it points to a tag but doesn't, then we will
           keep peeling and realize the error.  If the tag says it
           points to a non-tag but actually points to a tag, we
           will stop peeling and just advertise the pointed-to
           tag.
      
        2. If we are missing the pointed-to object, we will not
           realize (because we never even look it up in the object
           db).
      
      However, both of these are errors in the object database,
      and both will be detected if a client actually requests the
      broken objects in question. So we are simply pushing the
      verification away from the advertising stage, and down to
      the actual fetching stage.
      
      On my test repo with 120K refs, this drops the time to
      advertise the refs from ~3.2s to ~2.0s.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • upload-pack: avoid parsing objects during ref advertisement · 926f1dd9
      Authored by Jeff King
      When we advertise a ref, the first thing we do is parse the
      pointed-to object. This gives us two things:
      
        1. a "struct object" we can use to store flags
      
        2. the type of the object, so we know whether we need to
           dereference it as a tag
      
      Instead, we can just use lookup_unknown_object to get an
      object struct, and then fill in just the type field using
      sha1_object_info (which, in the case of packed files, can
      find the information without actually inflating the object
      data).
      
      This can save time if you have a large number of refs, and
      the client isn't actually going to request those refs (e.g.,
      because most of them are already up-to-date).
      
      The downside is that we are no longer verifying objects that
      we advertise by fully parsing them (however, we do still
      know we actually have them, because sha1_object_info must
      find them to get the type). While we might fail to detect a
      corrupt object here, if the client actually fetches the
      object, we will parse (and verify) it then.
      
      On a repository with 120K refs, the advertisement portion of
      upload-pack goes from ~3.4s to 3.2s (the failure to speed up
      more is largely due to the fact that most of these refs are
      tags, which need to be dereferenced to find the tag destination
      anyway).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  24. 06 Dec 2011 (1 commit)
    • i18n: add infrastructure for translating Git with gettext · 5e9637c6
      Authored by Ævar Arnfjörð Bjarmason
      Change the skeleton implementation of i18n in Git to one that can show
      localized strings to users for our C, Shell and Perl programs using
      either GNU libintl or the Solaris gettext implementation.
      
      This new internationalization support is enabled by default. If
      gettext isn't available, or if Git is compiled with
      NO_GETTEXT=YesPlease, Git falls back on its current behavior of
      showing interface messages in English. When using the autoconf script
      we'll auto-detect if the gettext libraries are installed and act
      appropriately.
      
      This change is somewhat large because as well as adding a C, Shell and
      Perl i18n interface we're adding a lot of tests for them, and for
      those tests to work we need a skeleton PO file to actually test
      translations. A minimal Icelandic translation is included for this
      purpose. Icelandic includes multi-byte characters which makes it easy
      to test various edge cases, and it's a language I happen to
      understand.
      
      The rest of the commit message goes into detail about various
      sub-parts of this commit.
      
      = Installation
      
      Gettext .mo files will be installed and looked for in the standard
      $(prefix)/share/locale path. GIT_TEXTDOMAINDIR can also be set to
      override that, but that's only intended to be used to test Git itself.
      
      = Perl
      
      Perl code that's to be localized should use the new Git::I18n
      module. It imports a __ function into the caller's package by default.
      
      Instead of using the high level Locale::TextDomain interface I've
      opted to use the low-level (equivalent to the C interface)
      Locale::Messages module, which Locale::TextDomain itself uses.
      
      Locale::TextDomain does a lot of redundant work we don't need, and
      some of it would potentially introduce bugs. It tries to set the
      $TEXTDOMAIN based on package of the caller, and has its own
      hardcoded paths where it'll search for messages.
      
      I found it easier just to completely avoid it rather than try to
      circumvent its behavior. In any case, this is an issue wholly
      internal to Git::I18N. Its guts can be changed later if that's deemed
      necessary.
      
      See <AANLkTilYD_NyIZMyj9dHtVk-ylVBfvyxpCC7982LWnVd@mail.gmail.com> for
      a further elaboration on this topic.
      
      = Shell
      
      Shell code that's to be localized should use the git-sh-i18n
      library. It's basically just a wrapper for the system's gettext.sh.
      
      If gettext.sh isn't available we'll fall back on gettext(1) if it's
      available. The latter is available without the former on Solaris,
      which has its own non-GNU gettext implementation. We also need to
      emulate eval_gettext() there.
      
      If neither are present we'll use a dumb printf(1) fall-through
      wrapper.
      
      = About libcharset.h and langinfo.h
      
      We use libcharset to query the character set of the current locale if
      it's available. I.e. we'll use it instead of nl_langinfo if
      HAVE_LIBCHARSET_H is set.
      
      The GNU gettext manual recommends using langinfo.h's
      nl_langinfo(CODESET) to acquire the current character set, but on
      systems that have libcharset.h's locale_charset() using the latter is
      either saner, or the only option on those systems.
      
      GNU and Solaris have a nl_langinfo(CODESET), FreeBSD can use either,
      but MinGW and some others need to use libcharset.h's locale_charset()
      instead.
      
      = Credits
      
      This patch is based on work by Jeff Epler <jepler@unpythonic.net> who
      did the initial Makefile / C work, and a lot of comments from the Git
      mailing list, including Jonathan Nieder, Jakub Narebski, Johannes
      Sixt, Erik Faye-Lund, Peter Krefting, Junio C Hamano, Thomas Rast and
      others.
      
      [jc: squashed a small Makefile fix from Ramsay]
      Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
      Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
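      By way of illustration, a tiny C program in the style this infrastructure enables, assuming GNU libintl (or a compatible gettext) is installed; the _() macro, the "git" text domain, and the locale directory below are stand-ins for what git wires up itself:

        #include <libintl.h>
        #include <locale.h>
        #include <stdio.h>

        #define _(msgid) gettext(msgid)

        int main(void)
        {
            /* pick up the user's locale and any installed .mo catalogues */
            setlocale(LC_ALL, "");
            bindtextdomain("git", "/usr/share/locale");  /* illustrative path */
            textdomain("git");

            printf("%s\n", _("Hello, world"));
            return 0;
        }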
  25. 02 Sep 2011 (1 commit)
    • list-objects: pass callback data to show_objects() · 49473672
      Authored by Junio C Hamano
      The traverse_commit_list() API takes two callback functions, one to show
      commit objects, and the other to show other kinds of objects. Even though
      the former has a callback data parameter, so that the callback does not
      have to rely on global state, the latter does not.
      
      Give the show_objects() callback the same callback data parameter.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
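      A generic sketch of the API shape being described, with illustrative names rather than the actual traverse_commit_list() signature: both callbacks receive the same opaque data pointer, so neither has to rely on global state:

        #include <stdio.h>

        struct commit { const char *name; };
        struct object { const char *name; };

        typedef void (*show_commit_fn)(struct commit *, void *data);
        typedef void (*show_object_fn)(struct object *, void *data);

        /* Illustrative traversal: every item is handed to the appropriate
         * callback together with the caller's data pointer. */
        static void traverse_demo(struct commit *commits, int nr_commits,
                                  struct object *objects, int nr_objects,
                                  show_commit_fn show_commit,
                                  show_object_fn show_object,
                                  void *data)
        {
            int i;

            for (i = 0; i < nr_commits; i++)
                show_commit(&commits[i], data);
            for (i = 0; i < nr_objects; i++)
                show_object(&objects[i], data);
        }

        struct counts { int commits, objects; };

        static void count_commit(struct commit *c, void *data)
        {
            (void)c;
            ((struct counts *)data)->commits++;
        }

        static void count_object(struct object *o, void *data)
        {
            (void)o;
            ((struct counts *)data)->objects++;
        }

        int main(void)
        {
            struct commit commits[] = { { "c1" }, { "c2" } };
            struct object objects[] = { { "tree" }, { "blob" } };
            struct counts n = { 0, 0 };

            traverse_demo(commits, 2, objects, 2, count_commit, count_object, &n);
            printf("%d commits, %d other objects\n", n.commits, n.objects);
            return 0;
        }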