1. 12 Mar 2014 (1 commit)
  2. 28 Feb 2014 (1 commit)
    • shallow: automatically clean up shallow tempfiles · 0179c945
      Authored by Jeff King
      We sometimes write tempfiles of the form "shallow_XXXXXX"
      during fetch/push operations with shallow repositories.
      Under normal circumstances, we clean up the result when we
      are done. However, we do not take steps to clean up after
      ourselves when we exit due to die() or signal death.
      
      This patch teaches the tempfile creation code to register
      handlers to clean up after ourselves. To handle this, we
      change the ownership semantics of the filename returned by
      setup_temporary_shallow. It now keeps a copy of the filename
      itself, and returns only a const pointer to it.
      
      We can also do away with explicit tempfile removal in the
      callers. They all exit not long after finishing with the
      file, so they can rely on the auto-cleanup, simplifying the
      code.
      
      Note that we keep things simple and maintain only a single
      filename to be cleaned. This is sufficient for the current
      caller, but we future-proof it with a die("BUG").
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
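      For illustration only, a minimal sketch of the pattern this describes, using made-up helper names rather than git's actual setup_temporary_shallow() code: remember a single tempfile, register atexit/signal handlers that remove it, and treat a second registration as a BUG:

        #include <limits.h>
        #include <signal.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <unistd.h>

        /* Hypothetical names; git's real implementation differs in detail. */
        static char shallow_tempfile[PATH_MAX];

        static void remove_shallow_tempfile(void)
        {
            if (*shallow_tempfile)
                unlink(shallow_tempfile);
        }

        static void remove_shallow_tempfile_on_signal(int signo)
        {
            remove_shallow_tempfile();
            signal(signo, SIG_DFL);
            raise(signo);
        }

        /* Remember one tempfile and arrange for it to be removed on exit/death. */
        static const char *register_shallow_tempfile(const char *path)
        {
            if (*shallow_tempfile) {
                fprintf(stderr, "BUG: only one shallow tempfile is supported\n");
                exit(1);
            }
            snprintf(shallow_tempfile, sizeof(shallow_tempfile), "%s", path);
            atexit(remove_shallow_tempfile);
            signal(SIGINT, remove_shallow_tempfile_on_signal);
            signal(SIGTERM, remove_shallow_tempfile_on_signal);
            return shallow_tempfile; /* callers get a const pointer, not ownership */
        }

        int main(void)
        {
            char path[] = "shallow_XXXXXX";
            int fd = mkstemp(path);

            if (fd < 0)
                return 1;
            register_shallow_tempfile(path);
            /* ... use the file; it is removed even if we exit early ... */
            close(fd);
            return 0;
        }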
  3. 21 Feb 2014 (1 commit)
  4. 11 Dec 2013 (4 commits)
  5. 06 Dec 2013 (1 commit)
    • replace {pre,suf}fixcmp() with {starts,ends}_with() · 59556548
      Authored by Christian Couder
      Leaving only the function definitions and declarations so that any
      new topic in flight can still make use of the old functions, replace
      existing uses of the prefixcmp() and suffixcmp() with new API
      functions.
      
      The change can be recreated by mechanically applying this:
      
          $ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
            grep -v strbuf\\.c |
            xargs perl -pi -e '
              s|!prefixcmp\(|starts_with\(|g;
              s|prefixcmp\(|!starts_with\(|g;
              s|!suffixcmp\(|ends_with\(|g;
              s|suffixcmp\(|!ends_with\(|g;
            '
      
      on the result of preparatory changes in this series.
      Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
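      As a rough illustration of what the conversion means at a call site, here is a self-contained sketch; the stand-in definitions below are simplified and are not git's own implementations:

        #include <stdio.h>
        #include <string.h>

        /* Simplified stand-ins for git's helpers. */
        static int starts_with(const char *str, const char *prefix)
        {
            return !strncmp(str, prefix, strlen(prefix));
        }

        static int ends_with(const char *str, const char *suffix)
        {
            size_t len = strlen(str), slen = strlen(suffix);
            return len >= slen && !strcmp(str + len - slen, suffix);
        }

        int main(void)
        {
            const char *ref = "refs/heads/master";

            /* old style: if (!prefixcmp(ref, "refs/heads/")) */
            if (starts_with(ref, "refs/heads/"))
                printf("branch: %s\n", ref + strlen("refs/heads/"));

            /* old style: if (!suffixcmp(ref, "/master")) */
            if (ends_with(ref, "/master"))
                printf("points at master\n");
            return 0;
        }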
  6. 19 Nov 2013 (1 commit)
  7. 25 Oct 2013 (1 commit)
    • use parse_commit_or_die instead of custom message · 367068e0
      Authored by Jeff King
      Many calls to parse_commit detect errors and die. In some
      cases, the custom error messages are more useful than what
      parse_commit_or_die could produce, because they give some
      context, like which ref the commit came from. Some, however,
      just say "invalid commit". Let's convert the latter to use
      parse_commit_or_die; its message is slightly more informative,
      and it makes the error more consistent throughout git.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  8. 18 Sep 2013 (3 commits)
  9. 10 Sep 2013 (2 commits)
    • upload-pack: bump keepalive default to 5 seconds · 115dedd7
      Authored by Jeff King
      There is no reason not to turn on keepalives by default.
      They take very little bandwidth, and significantly less than
      the progress reporting they are replacing. And in the case
      that progress reporting is on, we should never need to send
      a keepalive anyway, as we will constantly be showing
      progress and resetting the keepalive timer.
      
      We do not necessarily know what the client's idea of a
      reasonable timeout is, so let's keep this on the low side of
      5 seconds. That is high enough that we will always prefer
      our normal 1-second progress reports to sending a keepalive
      packet, but low enough that no sane client should consider
      the connection hung.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • upload-pack: send keepalive packets during pack computation · 05e95155
      Authored by Jeff King
      When upload-pack has started pack-objects, there may be a quiet
      period while pack-objects prepares the pack (i.e., counting objects
      and delta compression). Normally we would see (and send to the
      client) progress information, but if "--quiet" is in effect,
      pack-objects will produce nothing at all until the pack data is
      ready. On a large repository, this can take tens of seconds (or even
      minutes if the system is loaded or the repository is badly packed).
      Clients or intermediate proxies can sometimes give up in this
      situation, assuming that the server or connection has hung.
      
      This patch introduces a "keepalive" option; if upload-pack sees no
      data from pack-objects for a certain number of seconds, it will send
      an empty sideband data packet to let the other side know that we are
      still working on it.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
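      A sketch of the idea under assumed names (the real loop in upload-pack also multiplexes stderr and sideband-wraps the data before forwarding it): poll the pipe from pack-objects with a timeout, and when the timeout expires send an empty sideband data packet, i.e. a pkt-line whose payload is just the stream-code byte:

        #include <poll.h>
        #include <stdio.h>
        #include <unistd.h>

        /* Hypothetical relay loop, much simplified from what upload-pack does. */
        static int relay_with_keepalive(int in_fd, int out_fd, int keepalive_secs)
        {
            char buf[8192];

            for (;;) {
                struct pollfd pfd = { in_fd, POLLIN, 0 };
                int ret = poll(&pfd, 1, keepalive_secs * 1000);
                ssize_t n;

                if (ret < 0)
                    return -1;
                if (ret == 0) {
                    /* nothing from pack-objects yet: empty sideband-1 packet */
                    if (write(out_fd, "0005\1", 5) != 5)
                        return -1;
                    continue;
                }
                n = read(in_fd, buf, sizeof(buf));
                if (n <= 0)
                    return n ? -1 : 0;      /* error or EOF */
                /* the real code sideband-wraps this before writing it out */
                if (write(out_fd, buf, (size_t)n) != n)
                    return -1;
            }
        }

        int main(void)
        {
            /* demo: relay stdin to stdout, keepalive every 5 seconds */
            return relay_with_keepalive(0, 1, 5) ? 1 : 0;
        }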
  10. 29 Aug 2013 (1 commit)
    • upload-pack: delegate rev walking in shallow fetch to pack-objects · cdab4858
      Authored by Nguyễn Thái Ngọc Duy
      upload-pack has a special revision walking code for shallow
      recipients. It works almost like the similar code in pack-objects
      except:
      
      1. in upload-pack, graft points could be added for deepening;
      
      2. also when the repository is deepened, the shallow point will be
         moved further away from the tip, but the old shallow point will be
         marked as edge to produce more efficient packs. See 6523078b (make
         shallow repository deepening more network efficient - 2009-09-03).
      
      Pass the file to pack-objects via --shallow-file. This will override
      $GIT_DIR/shallow and give pack-objects the exact repository shape
      that upload-pack has.
      
      Edge commits are marked via revision command arguments. Even if old shallow
      points are passed as "--not" revisions as in this patch, they will not
      be picked up by mark_edges_uninteresting() because this function looks
      up to parents for edges, while in this case the edge is the children,
      in the opposite direction. This will be fixed in a later patch when
      all given uninteresting commits are marked as edges.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  11. 09 Jul 2013 (1 commit)
    • cache.h: move remote/connect API out of it · 47a59185
      Authored by Junio C Hamano
      The definition of "struct ref" in "cache.h", a header file so
      central to the system, always confused me.  This structure is not
      about the local ref used by sha1-name API to name local objects.
      
      It is what refspecs are expanded into, after finding out what refs
      the other side has, to define what refs are updated after object
      transfer succeeds to what values.  It belongs to "remote.h" together
      with "struct refspec".
      
      While we are at it, also move the types and functions related to the
      Git transport connection to a new header file connect.h
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  12. 29 Apr 2013 (1 commit)
    • upload-pack: ignore 'shallow' lines with unknown obj-ids · af04fa2a
      Authored by Michael Heemskerk
      When the client sends a 'shallow' line for an object that the server does
      not have, the server currently dies with the error: "did not find object
      for shallow <obj-id>".  The client may have truncated the history at
      the commit by fetching shallowly from a different server, or the commit
      may have been garbage collected by the server. In either case, this
      unknown commit is not relevant for calculating the pack that is to be
      sent and can be safely ignored, and it is not used when recomputing where
      the updated history of the client is cauterised.
      
      The documentation in technical/pack-protocol.txt has been updated to
      remove the restriction that "Clients MUST NOT mention an obj-id which it
      does not know exists on the server". This requirement is not realistic
      because clients cannot know whether an object has been garbage collected
      by the server.
      Signed-off-by: Michael Heemskerk <mheemskerk@atlassian.com>
      Reviewed-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  13. 17 Mar 2013 (3 commits)
    • upload-pack: load non-tip "want" objects from disk · f59de5d1
      Authored by Jeff King
      It is a long-time security feature that upload-pack will not
      serve any "want" lines that do not correspond to the tip of
      one of our refs. Traditionally, this was enforced by
      checking the objects in the in-memory hash; they should have
      been loaded and received the OUR_REF flag during the
      advertisement.
      
      The stateless-rpc mode, however, has a race condition here:
      one process advertises, and another receives the want lines,
      so the refs may have changed in the interim.  To address
      this, commit 051e4005 added a new verification mode; if the
      object is not OUR_REF, we set a "has_non_tip" flag, and then
      later verify that the requested objects are reachable from
      our current tips.
      
      However, we still die immediately when the object is not in
      our in-memory hash, and at this point we should only have
      loaded our tip objects. So the check_non_tip code path does
      not ever actually trigger, as any non-tip objects would
      have already caused us to die.
      
      We can fix that by using parse_object instead of
      lookup_object, which will load the object from disk if it
      has not already been loaded.
      
      We still need to check that parse_object does not return
      NULL, though, as it is possible we do not have the object
      at all. A more appropriate error message would be "no such
      object" rather than "not our ref"; however, we do not want
      to leak information about what objects are or are not in
      the object database, so we continue to use the same "not
      our ref" message that would be produced by an unreachable
      object.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • upload-pack: make sure "want" objects are parsed · 06f15bf1
      Authored by Jeff King
      When upload-pack receives a "want" line from the client, it
      adds it to an object array. We call lookup_object to find
      the actual object, which will only check for objects already
      in memory. This works because we are expecting to find
      objects that we already loaded during the ref advertisement.
      
      We use the resulting object structs for a variety of
      purposes. Some of them care only about the object flags, but
      others care about the type of the object (e.g.,
      ok_to_give_up), or even feed them to the revision parser
      (when --depth is used), which assumes that objects it
      receives are fully parsed.
      
      Once upon a time, this was OK; any object we loaded into
      memory would also have been parsed. But since 435c8332
      (upload-pack: use peel_ref for ref advertisements,
      2012-10-04), we try to avoid parsing objects during the ref
      advertisement. This means that lookup_object may return an
      object with a type of OBJ_NONE. The resulting mess depends
      on the exact set of objects, but can include the revision
      parser barfing, or the shallow code sending the wrong set of
      objects.
      
      This patch teaches upload-pack to parse each "want" object
      as we receive it. We do not replace the lookup_object call
      with parse_object, as the current code is careful not to let
      just any object appear on a "want" line, but rather only one
      we have previously advertised (whereas parse_object would
      actually load any arbitrary object from disk).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • upload-pack: drop lookup-before-parse optimization · a6eec126
      Authored by Jeff King
      When we receive a "have" line from the client, we want to
      load the object pointed to by the sha1. However, we are
      careful to do:
      
        o = lookup_object(sha1);
        if (!o || !o->parsed)
      	  o = parse_object(sha1);
      
      to avoid loading the object from disk if we have already
      seen it.  However, since ccdc6037 (parse_object: try internal
      cache before reading object db), parse_object already does
      this optimization internally. We can just call parse_object
      directly.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  14. 21 Feb 2013 (6 commits)
    • pkt-line: provide a LARGE_PACKET_MAX static buffer · 74543a04
      Authored by Jeff King
      Most of the callers of packet_read_line just read into a
      static 1000-byte buffer (callers which handle arbitrary
      binary data already use LARGE_PACKET_MAX). This works fine
      in practice, because:
      
        1. The only variable-sized data in these lines is a ref
           name, and refs tend to be a lot shorter than 1000
           characters.
      
        2. When sending ref lines, git-core always limits itself
           to 1000 byte packets.
      
      However, the only limit given in the protocol specification
      in Documentation/technical/protocol-common.txt is
      LARGE_PACKET_MAX; the 1000 byte limit is mentioned only in
      pack-protocol.txt, and then only describing what we write,
      not as a specific limit for readers.
      
      This patch lets us bump the 1000-byte limit to
      LARGE_PACKET_MAX. Even though git-core will never write a
      packet where this makes a difference, there are two good
      reasons to do this:
      
        1. Other git implementations may have followed
           protocol-common.txt and used a larger maximum size. We
           don't bump into it in practice because it would involve
           very long ref names.
      
        2. We may want to increase the 1000-byte limit one day.
           Since packets are transferred before any capabilities,
           it's difficult to do this in a backwards-compatible
           way. But if we bump the size of buffer the readers can
           handle, eventually older versions of git will be
           obsolete enough that we can justify bumping the
           writers, as well. We don't have plans to do this
           anytime soon, but there is no reason not to start the
           clock ticking now.
      
      Just bumping all of the reading bufs to LARGE_PACKET_MAX
      would waste memory. Instead, since most readers just read
      into a temporary buffer anyway, let's provide a single
      static buffer that all callers can use. We can further wrap
      this detail away by having the packet_read_line wrapper just
      use the buffer transparently and return a pointer to the
      static storage.  That covers most of the cases, and the
      remaining ones already read into their own LARGE_PACKET_MAX
      buffers.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • pkt-line: teach packet_read_line to chomp newlines · 819b929d
      Authored by Jeff King
      The packets sent during ref negotiation are all terminated
      by newline; even though the code to chomp these newlines is
      short, we end up doing it in a lot of places.
      
      This patch teaches packet_read_line to auto-chomp the
      trailing newline; this lets us get rid of a lot of inline
      chomping code.
      
      As a result, some call-sites which are not reading
      line-oriented data (e.g., when reading chunks of packfiles
      alongside sideband) transition away from packet_read_line to
      the generic packet_read interface. This patch converts all
      of the existing callsites.
      
      Since the function signature of packet_read_line does not
      change (but its behavior does), there is a possibility of
      new callsites being introduced in later commits, silently
      introducing an incompatibility.  However, since a later
      patch in this series will change the signature, such a
      commit would have to be merged directly into this commit,
      not to the tip of the series; we can therefore ignore the
      issue.
      
      This is an internal cleanup and should produce no change of
      behavior in the normal case. However, there is one corner
      case to note. Callers of packet_read_line have never been
      able to tell the difference between a flush packet ("0000")
      and an empty packet ("0004"), as both cause packet_read_line
      to return a length of 0. Readers treat them identically,
      even though Documentation/technical/protocol-common.txt says
      we must not; it also says that implementations should not
      send an empty pkt-line.
      
      By stripping out the newline before the result gets to the
      caller, we will now treat the newline-only packet ("0005\n")
      the same as an empty packet, which in turn gets treated like
      a flush packet. In practice this doesn't matter, as neither
      empty nor newline-only packets are part of git's protocols
      (at least not for the line-oriented bits, and readers who
      are not expecting line-oriented packets will be calling
      packet_read directly, anyway). But even if we do decide to
      care about the distinction later, it is orthogonal to this
      patch.  The right place to tighten would be to stop treating
      empty packets as flush packets, and this change does not
      make doing so any harder.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
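      A simplified, self-contained sketch of the combined effect of this change and the LARGE_PACKET_MAX change above (illustrative names; the real packet_read()/packet_read_line() live in pkt-line.c and read from a file descriptor): one static buffer shared by line-oriented callers, with the trailing newline chomped before the pointer is handed back:

        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        #define LARGE_PACKET_MAX 65520

        static char packet_buffer[LARGE_PACKET_MAX];

        /* Read one pkt-line into the shared static buffer, strip a trailing
         * newline, and return a pointer to the static storage; NULL on a
         * flush packet ("0000"), EOF, or a malformed length. Sketch only. */
        static char *packet_read_line_demo(FILE *in, int *len_out)
        {
            char hex[5] = "";
            long len;

            if (fread(hex, 1, 4, in) != 4)
                return NULL;                      /* EOF */
            len = strtol(hex, NULL, 16);
            if (len == 0)
                return NULL;                      /* flush packet "0000" */
            len -= 4;                             /* payload length */
            if (len < 0 || len >= LARGE_PACKET_MAX)
                return NULL;                      /* malformed or oversized */
            if (fread(packet_buffer, 1, (size_t)len, in) != (size_t)len)
                return NULL;
            if (len && packet_buffer[len - 1] == '\n')
                len--;                            /* auto-chomp the newline */
            packet_buffer[len] = '\0';
            if (len_out)
                *len_out = (int)len;
            return packet_buffer;                 /* points into the static buffer */
        }

        int main(void)
        {
            int len;
            char *line;

            while ((line = packet_read_line_demo(stdin, &len)))
                printf("%d: %s\n", len, line);
            return 0;
        }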
    • pkt-line: drop safe_write function · cdf4fb8e
      Authored by Jeff King
      This is just write_or_die by another name. The one
      distinction is that write_or_die will treat EPIPE specially
      by suppressing error messages. That's fine, as we die by
      SIGPIPE anyway (and in the off chance that it is disabled,
      write_or_die will simulate it).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • upload-pack: remove packet debugging harness · 97a83fa8
      Authored by Jeff King
      If you set the GIT_DEBUG_SEND_PACK environment variable,
      upload-pack will dump lines it receives in the receive_needs
      phase to a descriptor. This debugging harness is a strict
      subset of what GIT_TRACE_PACKET can do. Let's just drop it
      in favor of that.
      
      A few tests used GIT_DEBUG_SEND_PACK to confirm which
      objects get sent; we have to adapt them to the new output
      format.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • upload-pack: do not add duplicate objects to shallow list · e58e57e4
      Authored by Jeff King
      When the client tells us it has a shallow object via
      "shallow <sha1>", we make sure we have the object, mark it
      with a flag, then add it to a dynamic array of shallow
      objects. This means that a client can get us to allocate
      arbitrary amounts of memory just by flooding us with shallow
      lines (whether they have the objects or not). You can
      demonstrate it easily with:
      
        yes '0035shallow e83c5163' |
        git-upload-pack git.git
      
      We already protect against duplicates in want lines by
      checking if our flag is already set; let's do the same thing
      here. Note that a client can still get us to allocate some
      amount of memory by marking every object in the repo as
      "shallow" (or "want"). But this at least bounds it with the
      number of objects in the repository, which is not under the
      control of an upload-pack client.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
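      A generic sketch of the duplicate check, using made-up types rather than upload-pack's actual structures: the first "shallow" line for an object sets a flag bit, and any repeat is skipped instead of growing the array again:

        #include <stdio.h>
        #include <stdlib.h>

        #define SHALLOW_FLAG 0x01

        /* Stand-ins for upload-pack's object and object-array handling. */
        struct obj { unsigned flags; };

        struct obj_array {
            size_t nr, alloc;
            struct obj **items;
        };

        static void add_shallow(struct obj_array *shallows, struct obj *o)
        {
            if (o->flags & SHALLOW_FLAG)
                return;                 /* already seen: do not grow the array */
            o->flags |= SHALLOW_FLAG;
            if (shallows->nr == shallows->alloc) {
                shallows->alloc = shallows->alloc ? 2 * shallows->alloc : 16;
                shallows->items = realloc(shallows->items,
                                          shallows->alloc * sizeof(*shallows->items));
                if (!shallows->items)
                    exit(1);
            }
            shallows->items[shallows->nr++] = o;
        }

        int main(void)
        {
            struct obj_array shallows = { 0, 0, NULL };
            struct obj o = { 0 };
            int i;

            /* a client repeating the same shallow line only ever costs one slot */
            for (i = 0; i < 1000000; i++)
                add_shallow(&shallows, &o);
            printf("array entries: %lu\n", (unsigned long)shallows.nr);
            free(shallows.items);
            return 0;
        }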
    • upload-pack: use get_sha1_hex to parse "shallow" lines · b7b02170
      Authored by Jeff King
      When we receive a line like "shallow <sha1>" from the
      client, we feed the <sha1> part to get_sha1. This is a
      mistake, as the argument on a shallow line is defined by
      Documentation/technical/pack-protocol.txt to contain an
      "obj-id".  This is never defined in the BNF, but it is clear
      from the text and from the other uses that it is meant to be
      a hex sha1, not an arbitrary identifier (and that is what
      fetch-pack has always sent).
      
      We should be using get_sha1_hex instead, which doesn't allow
      the client to request arbitrary junk like "HEAD@{yesterday}".
      Because this is just marking shallow objects, the client
      couldn't actually do anything interesting (like fetching
      objects from unreachable reflog entries), but we should keep
      our parsing tight to be on the safe side.
      
      Because get_sha1 is for the most part a superset of
      get_sha1_hex, in theory the only behavior change should be
      disallowing non-hex object references. However, there is
      one interesting exception: get_sha1 will only parse
      a 40-character hex sha1 if the string has exactly 40
      characters, whereas get_sha1_hex will just eat the first 40
      characters, leaving the rest. That means that current
      versions of git-upload-pack will not accept a "shallow"
      packet that has a trailing newline, even though the protocol
      documentation is clear that newlines are allowed (even
      encouraged) in non-binary parts of the protocol.
      
      This never mattered in practice, though, because fetch-pack,
      contrary to the protocol documentation, does not include a
      newline in its shallow lines. JGit follows its lead (though
      it correctly is strict on the parsing end about wanting a
      hex object id).
      
      We do not adjust fetch-pack to send newlines here, as it
      would break communication with older versions of git (and
      there is no actual benefit to doing so, except for
      consistency with other parts of the protocol).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
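      A standalone sketch of the distinction (hypothetical helper; git's real get_sha1_hex() differs in detail): consume exactly 40 hex digits, ignore whatever follows, and reject anything that is not plain hex, such as "HEAD@{yesterday}":

        #include <stdio.h>

        static int hexval(int c)
        {
            if (c >= '0' && c <= '9') return c - '0';
            if (c >= 'a' && c <= 'f') return c - 'a' + 10;
            if (c >= 'A' && c <= 'F') return c - 'A' + 10;
            return -1;
        }

        /* Roughly what a strict hex parser does; not git's actual code. */
        static int parse_sha1_hex(const char *hex, unsigned char *sha1)
        {
            int i;

            for (i = 0; i < 20; i++) {
                int hi = hexval((unsigned char)hex[0]);
                int lo = hexval((unsigned char)hex[1]);
                if (hi < 0 || lo < 0)
                    return -1;
                sha1[i] = (unsigned char)((hi << 4) | lo);
                hex += 2;               /* anything after 40 digits is ignored */
            }
            return 0;
        }

        int main(void)
        {
            unsigned char sha1[20];

            /* a trailing newline after the 40 hex digits is accepted ... */
            printf("hex:  %d\n",
                   parse_sha1_hex("0123456789abcdef0123456789abcdef01234567\n", sha1));
            /* ... but extended names like reflog syntax are rejected */
            printf("junk: %d\n", parse_sha1_hex("HEAD@{yesterday}", sha1));
            return 0;
        }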
  15. 08 Feb 2013 (2 commits)
    • upload-pack: optionally allow fetching from the tips of hidden refs · 390eb36b
      Authored by Junio C Hamano
      With uploadpack.allowtipsha1inwant configuration option set, future
      versions of "git fetch" that allow an exact object name (likely to
      have been obtained out of band) on the LHS of the fetch refspec can
      make a request with a "want" line that names an object that may not
      have been advertised due to transfer.hiderefs configuration.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • upload/receive-pack: allow hiding ref hierarchies · daebaa78
      Authored by Junio C Hamano
      A repository may have refs that are only used for its internal
      bookkeeping purposes that should not be exposed to the others that
      come over the network.
      
      Teach upload-pack to omit some refs from its initial advertisement
      by paying attention to the uploadpack.hiderefs multi-valued
      configuration variable.  Do the same to receive-pack via the
      receive.hiderefs variable.  As a convenient short-hand, allow using
      transfer.hiderefs to set the value to both of these variables.
      
      Any ref that is under the hierarchies listed on the value of these
      variable is excluded from responses to requests made by "ls-remote",
      "fetch", etc. (for upload-pack) and "push" (for receive-pack).
      
      Because these hidden refs do not count as OUR_REF, an attempt to
      fetch objects at the tip of them will be rejected, and because these
      refs do not get advertised, "git push :" will not see local branches
      that have the same name as them as "matching" ones to be sent.
      
      An attempt to update/delete these hidden refs with an explicit
      refspec, e.g. "git push origin :refs/hidden/22", is rejected.  This
      is not a new restriction.  To the pusher, it would appear that there
      is no such ref, so its push request will conclude with "Now that I
      sent you all the data, it is time for you to update the refs.  I saw
      that the ref did not exist when I started pushing, and I want the
      result to point at this commit".  The receiving end will apply the
      compare-and-swap rule to this request and rejects the push with
      "Well, your update request conflicts with somebody else; I see there
      is such a ref.", which is the right thing to do. Otherwise a push to
      a hidden ref will always be "the last one wins", which is not a good
      default.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
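      A self-contained sketch of the matching rule under assumed names (not git's actual implementation): a ref counts as hidden when it is equal to, or lies under, one of the configured hierarchies:

        #include <stdio.h>
        #include <string.h>

        /* Sketch: "hidden" lists entries such as those configured via
         * uploadpack.hiderefs / receive.hiderefs / transfer.hiderefs. */
        static int ref_is_hidden_demo(const char *refname,
                                      const char **hidden, size_t nr_hidden)
        {
            size_t i;

            for (i = 0; i < nr_hidden; i++) {
                size_t len = strlen(hidden[i]);
                if (strncmp(refname, hidden[i], len))
                    continue;
                if (refname[len] == '\0' || refname[len] == '/')
                    return 1;
            }
            return 0;
        }

        int main(void)
        {
            const char *hidden[] = { "refs/hidden", "refs/changes" };
            const char *refs[] = {
                "refs/heads/master",
                "refs/hidden/22",
                "refs/changes/40/12340/1",
            };
            size_t i;

            for (i = 0; i < sizeof(refs) / sizeof(refs[0]); i++)
                printf("%-26s %s\n", refs[i],
                       ref_is_hidden_demo(refs[i], hidden, 2) ? "hidden" : "advertised");
            return 0;
        }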
  16. 29 Jan 2013 (1 commit)
    • upload-pack: simplify request validation · 3f1da57f
      Authored by Junio C Hamano
      A long time ago, we used to punt on a large (read: asking for more
      than 256 refs) fetch request and instead sent a full pack, because
      we couldn't fit many refs on the command line of rev-list we run
      internally to enumerate the objects to be sent.  To fix this,
      565ebbf7 (upload-pack: tighten request validation., 2005-10-24),
      added a check to count the number of refs in the request and matched it
      with the number of refs we advertised, and changed the invocation of
      rev-list to pass "--all" to it, still keeping us under the command
      line argument limit.
      
      However, these days we feed the list of objects requested and the
      list of objects the other end is known to have via standard input,
      so there is no longer a valid reason to special case a full clone
      request.  Remove the code associated with "create_full_pack" to
      simplify the logic.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  17. 19 Jan 2013 (1 commit)
    • upload-pack: share more code · cbbe50db
      Authored by Junio C Hamano
      We mark the objects pointed at by our refs with the "OUR_REF" flag in two
      functions (mark_our_ref() and send_ref()), but we can just use the
      former as a helper for the latter.
      
      Update the way mark_our_ref() prepares in-core object to use
      lookup_unknown_object() to delay reading the actual object data,
      just like we did in 435c8332 (upload-pack: use peel_ref for ref
      advertisements, 2012-10-04).
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  18. 12 Jan 2013 (1 commit)
    • fetch: add --unshallow for turning shallow repo into complete one · 4dcb167f
      Authored by Nguyễn Thái Ngọc Duy
      The user can do --depth=2147483647 (*) for restoring complete repo
      now. But it's hard to remember. Any other numbers larger than the
      longest commit chain in the repository would also do, but some
      guessing may be involved. Make easy-to-remember --unshallow an alias
      for --depth=2147483647.
      
      Make upload-pack recognize this special number as infinite depth. The
      effect is essentially the same as before, except that upload-pack is
      more efficient because it does not have to traverse to the bottom
      anymore.
      
      The chance of a user actually wanting exactly 2147483647 commits
      depth, not infinite, on a repository with a history that long, is
      probably too small to consider. The client can learn to add or
      subtract one commit to avoid the special treatment when that actually
      happens.
      
      (*) This is the largest positive number a 32-bit signed integer can
          contain. JGit and older C Git store depth as "int" so both are OK
          with this number. Dulwich does not support shallow clone.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  19. 09 Jan 2013 (1 commit)
  20. 05 Oct 2012 (1 commit)
    • upload-pack: use peel_ref for ref advertisements · 435c8332
      Authored by Jeff King
      When upload-pack advertises refs, we attempt to peel tags
      and advertise the peeled version. We currently hand-roll the
      tag dereferencing, and use as many optimizations as we can
      to avoid loading non-tag objects into memory.
      
      Not only has peel_ref recently learned these optimizations,
      too, but it also contains an even more important one: it
      has access to the "peeled" data from the pack-refs file.
      That means we can avoid not only loading annotated tags
      entirely, but also avoid doing any kind of object lookup at
      all.
      
      This cut the CPU time to advertise refs by 50% in the
      linux-2.6 repo, as measured by:
      
        echo 0000 | git-upload-pack . >/dev/null
      
      best-of-five, warm cache, objects and refs fully packed:
      
        [before]             [after]
        real    0m0.026s     real    0m0.013s
        user    0m0.024s     user    0m0.008s
        sys     0m0.000s     sys     0m0.000s
      
      Those numbers are irrelevantly small compared to an actual
      fetch. Here's a larger repo (400K refs, of which 12K are
      unique, and of which only 107 are unique annotated tags):
      
        [before]             [after]
        real    0m0.704s     real    0m0.596s
        user    0m0.600s     user    0m0.496s
        sys     0m0.096s     sys     0m0.092s
      
      This shows only a 15% speedup (mostly because it has fewer
      actual tags to parse), but a larger absolute value (100ms,
      which isn't a lot compared to a real fetch, but this
      advertisement happens on every fetch, even if the client is
      just finding out they are completely up to date).
      
      In truly pathological cases, where you have a large number
      of unique annotated tags, it can make an even bigger
      difference. Here are the numbers for a linux-2.6 repository
      that has had every seventh commit tagged (so about 50K
      tags):
      
        [before]             [after]
        real    0m0.443s     real    0m0.097s
        user    0m0.416s     user    0m0.080s
        sys     0m0.024s     sys     0m0.012s
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  21. 04 Aug 2012 (1 commit)
    • include agent identifier in capability string · ff5effdf
      Authored by Jeff King
      Instead of having the client advertise a particular version
      number in the git protocol, we have managed extensions and
      backwards compatibility by having clients and servers
      advertise capabilities that they support. This is far more
      robust than having each side consult a table of
      known versions, and provides sufficient information for the
      protocol interaction to complete.
      
      However, it does not allow servers to keep statistics on
      which client versions are being used. This information is
      not necessary to complete the network request (the
      capabilities provide enough information for that), but it
      may be helpful to conduct a general survey of client
      versions in use.
      
      We already send the client version in the user-agent header
      for http requests; adding it here allows us to gather
      similar statistics for non-http requests.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  22. 09 Jan 2012 (1 commit)
    • server_supports(): parse feature list more carefully · f47182c8
      Authored by Junio C Hamano
      We have been carefully choosing feature names used in the protocol
      extensions so that the vocabulary does not contain a word that is a
      substring of another word, so it is not a real problem, but we have
      recently added "quiet" feature word, which would mean we cannot later
      add some other word with "quiet" (e.g. "quiet-push"), which is awkward.
      
      Let's make sure that we will eventually be able to do so by teaching the
      clients and servers that feature words consist of non-whitespace
      letters. This parser also allows us to later add features with parameters,
      e.g. "feature=1.5" (parameter values need to be quoted if they contain
      whitespace, but we will worry about the details when we do introduce them).
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      Signed-off-by: Clemens Buchacher <drizzd@aon.at>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
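      A minimal sketch of the stricter matching described here, with a hypothetical helper name rather than the real server_supports() code: a feature matches only as a whole whitespace-delimited word, optionally followed by "=value", so "quiet" would no longer match inside a future "quiet-push":

        #include <stdio.h>
        #include <string.h>

        /* Return a pointer just past "feature" in feature_list if it appears
         * as a whole word (delimited by whitespace or end of string, optionally
         * followed by "=value"); otherwise NULL. Sketch only. */
        static const char *parse_feature_value_demo(const char *feature_list,
                                                    const char *feature)
        {
            size_t len = strlen(feature);
            const char *p = feature_list;

            while ((p = strstr(p, feature)) != NULL) {
                int starts_word = (p == feature_list || p[-1] == ' ' || p[-1] == '\n');
                char end = p[len];

                if (starts_word &&
                    (end == '\0' || end == ' ' || end == '\n' || end == '='))
                    return p + len;
                p += len;
            }
            return NULL;
        }

        int main(void)
        {
            const char *caps = "multi_ack side-band-64k quiet-push agent=git/1.7.9";

            printf("quiet:      %s\n", parse_feature_value_demo(caps, "quiet") ? "yes" : "no");
            printf("quiet-push: %s\n", parse_feature_value_demo(caps, "quiet-push") ? "yes" : "no");
            printf("agent:      %s\n", parse_feature_value_demo(caps, "agent") ? "yes" : "no");
            return 0;
        }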
  23. 07 Jan 2012 (2 commits)
    • upload-pack: avoid parsing tag destinations · 90108a24
      Authored by Jeff King
      When upload-pack advertises refs, it dereferences any tags
      it sees, and shows the resulting sha1 to the client. It does
      this by calling deref_tag. That function must load and parse
      each tag object to find the sha1 of the tagged object.
      However, it also ends up parsing the tagged object itself,
      which is not strictly necessary for upload-pack's use.
      
      Each tag produces two object loads (assuming it is not a
      recursive tag), when it could get away with only a single
      one. Dropping the second load halves the effort we spend.
      
      The downside is that we are no longer verifying the
      resulting object by loading it. In particular:
      
        1. We never cross-check the "type" field given in the tag
           object with the type of the pointed-to object.  If the
           tag says it points to a tag but doesn't, then we will
           keep peeling and realize the error.  If the tag says it
           points to a non-tag but actually points to a tag, we
           will stop peeling and just advertise the pointed-to
           tag.
      
        2. If we are missing the pointed-to object, we will not
           realize (because we never even look it up in the object
           db).
      
      However, both of these are errors in the object database,
      and both will be detected if a client actually requests the
      broken objects in question. So we are simply pushing the
      verification away from the advertising stage, and down to
      the actual fetching stage.
      
      On my test repo with 120K refs, this drops the time to
      advertise the refs from ~3.2s to ~2.0s.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • upload-pack: avoid parsing objects during ref advertisement · 926f1dd9
      Authored by Jeff King
      When we advertise a ref, the first thing we do is parse the
      pointed-to object. This gives us two things:
      
        1. a "struct object" we can use to store flags
      
        2. the type of the object, so we know whether we need to
           dereference it as a tag
      
      Instead, we can just use lookup_unknown_object to get an
      object struct, and then fill in just the type field using
      sha1_object_info (which, in the case of packed files, can
      find the information without actually inflating the object
      data).
      
      This can save time if you have a large number of refs, and
      the client isn't actually going to request those refs (e.g.,
      because most of them are already up-to-date).
      
      The downside is that we are no longer verifying objects that
      we advertise by fully parsing them (however, we do still
      know we actually have them, because sha1_object_info must
      find them to get the type). While we might fail to detect a
      corrupt object here, if the client actually fetches the
      object, we will parse (and verify) it then.
      
      On a repository with 120K refs, the advertisement portion of
      upload-pack goes from ~3.4s to 3.2s (the failure to speed up
      more is largely due to the fact that most of these refs are
      tags, which need to be dereferenced to find the tag destination
      anyway).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  24. 06 Dec 2011 (1 commit)
    • i18n: add infrastructure for translating Git with gettext · 5e9637c6
      Authored by Ævar Arnfjörð Bjarmason
      Change the skeleton implementation of i18n in Git to one that can show
      localized strings to users for our C, Shell and Perl programs using
      either GNU libintl or the Solaris gettext implementation.
      
      This new internationalization support is enabled by default. If
      gettext isn't available, or if Git is compiled with
      NO_GETTEXT=YesPlease, Git falls back on its current behavior of
      showing interface messages in English. When using the autoconf script
      we'll auto-detect if the gettext libraries are installed and act
      appropriately.
      
      This change is somewhat large because as well as adding a C, Shell and
      Perl i18n interface we're adding a lot of tests for them, and for
      those tests to work we need a skeleton PO file to actually test
      translations. A minimal Icelandic translation is included for this
      purpose. Icelandic includes multi-byte characters which makes it easy
      to test various edge cases, and it's a language I happen to
      understand.
      
      The rest of the commit message goes into detail about various
      sub-parts of this commit.
      
      = Installation
      
      Gettext .mo files will be installed and looked for in the standard
      $(prefix)/share/locale path. GIT_TEXTDOMAINDIR can also be set to
      override that, but that's only intended to be used to test Git itself.
      
      = Perl
      
      Perl code that's to be localized should use the new Git::I18n
      module. It imports a __ function into the caller's package by default.
      
      Instead of using the high level Locale::TextDomain interface I've
      opted to use the low-level (equivalent to the C interface)
      Locale::Messages module, which Locale::TextDomain itself uses.
      
      Locale::TextDomain does a lot of redundant work we don't need, and
      some of it would potentially introduce bugs. It tries to set the
      $TEXTDOMAIN based on package of the caller, and has its own
      hardcoded paths where it'll search for messages.
      
      I found it easier just to completely avoid it rather than try to
      circumvent its behavior. In any case, this is an issue wholly
      internal to Git::I18N. Its guts can be changed later if that's deemed
      necessary.
      
      See <AANLkTilYD_NyIZMyj9dHtVk-ylVBfvyxpCC7982LWnVd@mail.gmail.com> for
      a further elaboration on this topic.
      
      = Shell
      
      Shell code that's to be localized should use the git-sh-i18n
      library. It's basically just a wrapper for the system's gettext.sh.
      
      If gettext.sh isn't available we'll fall back on gettext(1) if it's
      available. The latter is available without the former on Solaris,
      which has its own non-GNU gettext implementation. We also need to
      emulate eval_gettext() there.
      
      If neither are present we'll use a dumb printf(1) fall-through
      wrapper.
      
      = About libcharset.h and langinfo.h
      
      We use libcharset to query the character set of the current locale if
      it's available. I.e. we'll use it instead of nl_langinfo if
      HAVE_LIBCHARSET_H is set.
      
      The GNU gettext manual recommends using langinfo.h's
      nl_langinfo(CODESET) to acquire the current character set, but on
      systems that have libcharset.h's locale_charset() using the latter is
      either saner, or the only option on those systems.
      
      GNU and Solaris have a nl_langinfo(CODESET), FreeBSD can use either,
      but MinGW and some others need to use libcharset.h's locale_charset()
      instead.
      
      = Credits
      
      This patch is based on work by Jeff Epler <jepler@unpythonic.net> who
      did the initial Makefile / C work, and a lot of comments from the Git
      mailing list, including Jonathan Nieder, Jakub Narebski, Johannes
      Sixt, Erik Faye-Lund, Peter Krefting, Junio C Hamano, Thomas Rast and
      others.
      
      [jc: squashed a small Makefile fix from Ramsay]
      Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
      Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
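      By way of illustration, a tiny C program in the style this infrastructure enables, assuming GNU libintl (or a compatible gettext) is installed; the _() macro, the "git" text domain, and the locale directory below are stand-ins for what git wires up itself:

        #include <libintl.h>
        #include <locale.h>
        #include <stdio.h>

        #define _(msgid) gettext(msgid)

        int main(void)
        {
            /* pick up the user's locale and any installed .mo catalogues */
            setlocale(LC_ALL, "");
            bindtextdomain("git", "/usr/share/locale");  /* illustrative path */
            textdomain("git");

            printf("%s\n", _("Hello, world"));
            return 0;
        }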
  25. 02 Sep 2011 (1 commit)
    • list-objects: pass callback data to show_objects() · 49473672
      Authored by Junio C Hamano
      The traverse_commit_list() API takes two callback functions, one to show
      commit objects, and the other to show other kinds of objects. Even though
      the former has a callback data parameter, so that the callback does not
      have to rely on global state, the latter does not.
      
      Give the show_objects() callback the same callback data parameter.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
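      A generic sketch of the API shape being described, with illustrative names rather than the actual traverse_commit_list() signature: both callbacks receive the same opaque data pointer, so neither has to rely on global state:

        #include <stdio.h>

        struct commit { const char *name; };
        struct object { const char *name; };

        typedef void (*show_commit_fn)(struct commit *, void *data);
        typedef void (*show_object_fn)(struct object *, void *data);

        /* Illustrative traversal: every item is handed to the appropriate
         * callback together with the caller's data pointer. */
        static void traverse_demo(struct commit *commits, int nr_commits,
                                  struct object *objects, int nr_objects,
                                  show_commit_fn show_commit,
                                  show_object_fn show_object,
                                  void *data)
        {
            int i;

            for (i = 0; i < nr_commits; i++)
                show_commit(&commits[i], data);
            for (i = 0; i < nr_objects; i++)
                show_object(&objects[i], data);
        }

        struct counts { int commits, objects; };

        static void count_commit(struct commit *c, void *data)
        {
            (void)c;
            ((struct counts *)data)->commits++;
        }

        static void count_object(struct object *o, void *data)
        {
            (void)o;
            ((struct counts *)data)->objects++;
        }

        int main(void)
        {
            struct commit commits[] = { { "c1" }, { "c2" } };
            struct object objects[] = { { "tree" }, { "blob" } };
            struct counts n = { 0, 0 };

            traverse_demo(commits, 2, objects, 2, count_commit, count_object, &n);
            printf("%d commits, %d other objects\n", n.commits, n.objects);
            return 0;
        }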