1. 15 2月, 2019 2 次提交
  2. 07 2月, 2019 2 次提交
    • J
      remote-curl: tighten "version 2" check for smart-http · cbdb8d14
      Jeff King 提交于
      In a v2 smart-http conversation, the server should reply to our initial
      request with a pkt-line saying "version 2". We check that with
      starts_with(), but really that should be the only thing in that packet.
      A response of "version 20" should not match.
      
      Let's tighten this check to use strcmp(). Note that we don't need to
      worry about a trailing newline here, because the ptk-line code will have
      chomped it for us already.
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      cbdb8d14
    • J
      remote-curl: refactor smart-http discovery · 8ee3e120
      Jeff King 提交于
      After making initial contact with an http server, we have to decide if
      the server supports smart-http, and if so, which version. Our rules are
      a bit inconsistent:
      
        1. For v0, we require that the content-type indicates a smart-http
           response. We also require the response to look vaguely like a
           pkt-line starting with "#". If one of those does not match, we fall
           back to dumb-http.
      
           But according to our http protocol spec[1]:
      
             Dumb servers MUST NOT return a return type starting with
             `application/x-git-`.
      
           If we see the expected content-type, we should consider it
           smart-http. At that point we can parse the pkt-line for real, and
           complain if it is not syntactically valid.
      
        2. For v2, we do not actually check the content-type. Our v2 protocol
           spec says[2]:
      
             When using the http:// or https:// transport a client makes a
             "smart" info/refs request as described in `http-protocol.txt`[...]
      
           and the http spec is clear that for a smart-http response[3]:
      
             The Content-Type MUST be `application/x-$servicename-advertisement`.
      
           So it is required according to the spec.
      
      These inconsistencies were easy to miss because of the way the original
      code was written as an inline conditional. Let's pull it out into its
      own function for readability, and improve a few things:
      
       - we now predicate the smart/dumb decision entirely on the presence of
         the correct content-type
      
       - we do a real pkt-line parse before deciding how to proceed (and die
         if it isn't valid)
      
       - use skip_prefix() for comparing service strings, instead of
         constructing expected output in a strbuf; this avoids dealing with
         memory cleanup
      
      Note that this _is_ tightening what the client will allow. It's all
      according to the spec, but it's possible that other implementations
      might violate these. However, violating these particular rules seems
      like an odd choice for a server to make.
      
      [1] Documentation/technical/http-protocol.txt, l. 166-167
      [2] Documentation/technical/protocol-v2.txt, l. 63-64
      [3] Documentation/technical/http-protocol.txt, l. 247
      Helped-by: NJosh Steadmon <steadmon@google.com>
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      8ee3e120
  3. 11 1月, 2019 3 次提交
  4. 03 1月, 2019 2 次提交
    • M
      pack-protocol.txt: accept error packets in any context · 2d103c31
      Masaya Suzuki 提交于
      In the Git pack protocol definition, an error packet may appear only in
      a certain context. However, servers can face a runtime error (e.g. I/O
      error) at an arbitrary timing. This patch changes the protocol to allow
      an error packet to be sent instead of any packet.
      
      Without this protocol spec change, when a server cannot process a
      request, there's no way to tell that to a client. Since the server
      cannot produce a valid response, it would be forced to cut a connection
      without telling why. With this protocol spec change, the server can be
      more gentle in this situation. An old client may see these error packets
      as an unexpected packet, but this is not worse than having an unexpected
      EOF.
      
      Following this protocol spec change, the error packet handling code is
      moved to pkt-line.c. Implementation wise, this implementation uses
      pkt-line to communicate with a subprocess. Since this is not a part of
      Git protocol, it's possible that a packet that is not supposed to be an
      error packet is mistakenly parsed as an error packet. This error packet
      handling is enabled only for the Git pack protocol parsing code
      considering this.
      Signed-off-by: NMasaya Suzuki <masayasuzuki@google.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      2d103c31
    • M
      Use packet_reader instead of packet_read_line · 01f9ec64
      Masaya Suzuki 提交于
      By using and sharing a packet_reader while handling a Git pack protocol
      request, the same reader option is used throughout the code. This makes
      it easy to set a reader option to the request parsing code.
      Signed-off-by: NMasaya Suzuki <masayasuzuki@google.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      01f9ec64
  5. 10 12月, 2018 1 次提交
  6. 12 11月, 2018 1 次提交
    • T
      remote-curl.c: xcurl_off_t is not portable (on 32 bit platfoms) · cb8010bb
      Torsten Bögershausen 提交于
      When  setting
      DEVELOPER = 1
      DEVOPTS = extra-all
      
      "gcc (Raspbian 6.3.0-18+rpi1+deb9u1) 6.3.0 20170516" errors out with
      "comparison is always false due to limited range of data type"
      "[-Werror=type-limits]"
      
      It turns out that the function xcurl_off_t() has 2 flavours:
      
      - It gives a warning 32 bit systems, like Linux
      - It takes the signed ssize_t as a paramter, but the only caller is using
        a size_t (which is typically unsigned these days)
      
      The original motivation of this function is to make sure that sizes > 2GiB
      are handled correctly. The curl documentation says:
      "For any given platform/compiler curl_off_t must be typedef'ed to a 64-bit
       wide signed integral data type"
      On a 32 bit system "size_t" can be promoted into a 64 bit signed value
      without loss of data, and therefore we may see the
      "comparison is always false" warning.
      On a 64 bit system it may happen, at least in theory, that size_t is > 2^63,
      and then the promotion from an unsigned "size_t" into a signed "curl_off_t"
      may be a problem.
      
      One solution to suppress a possible compiler warning could be to remove
      the function xcurl_off_t().
      
      However, to be on the very safe side, we keep it and improve it:
      
      - The len parameter is changed from ssize_t to size_t
      - A temporally variable "size" is used, promoted int uintmax_t and the compared
        with "maximum_signed_value_of_type(curl_off_t)".
        Thanks to Junio C Hamano for this hint.
      Signed-off-by: NTorsten Bögershausen <tboegi@web.de>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      cb8010bb
  7. 06 9月, 2018 1 次提交
  8. 09 8月, 2018 1 次提交
  9. 23 5月, 2018 2 次提交
  10. 24 4月, 2018 1 次提交
  11. 11 4月, 2018 1 次提交
  12. 16 3月, 2018 6 次提交
  13. 15 3月, 2018 2 次提交
  14. 21 2月, 2018 1 次提交
    • J
      remote-curl: unquote incoming push-options · 90dce21e
      Jeff King 提交于
      The transport-helper protocol c-style quotes the value of
      any options passed to the helper via the "option <key> <value>"
      directive. However, remote-curl doesn't actually unquote the
      push-option values, meaning that we will send the quoted
      version to the other side (whereas git-over-ssh would send
      the raw value).
      
      The pack-protocol.txt documentation defines the push-options
      as a series of VCHARs, which excludes most characters that
      would need quoting. But:
      
        1. You can still see the bug with a valid push-option that
           starts with a double-quote (since that triggers
           quoting).
      
        2. We do currently handle any non-NUL characters correctly
           in git-over-ssh. So even though the spec does not say
           that we need to handle most quoted characters, it's
           nice if our behavior is consistent between protocols.
      
      There are two new tests: the "direct" one shows that this
      already works in the non-http case, and the http one covers
      this bugfix.
      Reported-by: NJon Simons <jon@jonsimons.org>
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      90dce21e
  15. 09 2月, 2018 1 次提交
  16. 09 12月, 2017 1 次提交
  17. 06 12月, 2017 1 次提交
    • J
      introduce fetch-object: fetch one promisor object · 88e2f9ed
      Jonathan Tan 提交于
      Introduce fetch-object, providing the ability to fetch one object from a
      promisor remote.
      
      This uses fetch-pack. To do this, the transport mechanism has been
      updated with 2 flags, "from-promisor" to indicate that the resulting
      pack comes from a promisor remote (and thus should be annotated as such
      by index-pack), and "no-dependents" to indicate that only the objects
      themselves need to be fetched (but fetching additional objects is
      nevertheless safe).
      
      Whenever "no-dependents" is used, fetch-pack will refrain from using any
      object flags, because it is most likely invoked as part of a dynamic
      object fetch by another Git command (which may itself use object flags).
      An alternative to this is to leave fetch-pack alone, and instead update
      the allocation of flags so that fetch-pack's flags never overlap with
      any others, but this will end up shrinking the number of flags available
      to nearly every other Git command (that is, every Git command that
      accesses objects), so the approach in this commit was used instead.
      
      This will be tested in a subsequent commit.
      Signed-off-by: NJonathan Tan <jonathantanmy@google.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      88e2f9ed
  18. 16 6月, 2017 1 次提交
  19. 14 4月, 2017 1 次提交
  20. 31 3月, 2017 1 次提交
    • B
      Rename sha1_array to oid_array · 910650d2
      brian m. carlson 提交于
      Since this structure handles an array of object IDs, rename it to struct
      oid_array.  Also rename the accessor functions and the initialization
      constant.
      
      This commit was produced mechanically by providing non-Documentation
      files to the following Perl one-liners:
      
          perl -pi -E 's/struct sha1_array/struct oid_array/g'
          perl -pi -E 's/\bsha1_array_/oid_array_/g'
          perl -pi -E 's/SHA1_ARRAY_INIT/OID_ARRAY_INIT/g'
      Signed-off-by: Nbrian m. carlson <sandals@crustytoothpaste.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      910650d2
  21. 29 3月, 2017 1 次提交
  22. 23 3月, 2017 1 次提交
  23. 07 12月, 2016 2 次提交
    • J
      http: make redirects more obvious · 50d34137
      Jeff King 提交于
      We instruct curl to always follow HTTP redirects. This is
      convenient, but it creates opportunities for malicious
      servers to create confusing situations. For instance,
      imagine Alice is a git user with access to a private
      repository on Bob's server. Mallory runs her own server and
      wants to access objects from Bob's repository.
      
      Mallory may try a few tricks that involve asking Alice to
      clone from her, build on top, and then push the result:
      
        1. Mallory may simply redirect all fetch requests to Bob's
           server. Git will transparently follow those redirects
           and fetch Bob's history, which Alice may believe she
           got from Mallory. The subsequent push seems like it is
           just feeding Mallory back her own objects, but is
           actually leaking Bob's objects. There is nothing in
           git's output to indicate that Bob's repository was
           involved at all.
      
           The downside (for Mallory) of this attack is that Alice
           will have received Bob's entire repository, and is
           likely to notice that when building on top of it.
      
        2. If Mallory happens to know the sha1 of some object X in
           Bob's repository, she can instead build her own history
           that references that object. She then runs a dumb http
           server, and Alice's client will fetch each object
           individually. When it asks for X, Mallory redirects her
           to Bob's server. The end result is that Alice obtains
           objects from Bob, but they may be buried deep in
           history. Alice is less likely to notice.
      
      Both of these attacks are fairly hard to pull off. There's a
      social component in getting Mallory to convince Alice to
      work with her. Alice may be prompted for credentials in
      accessing Bob's repository (but not always, if she is using
      a credential helper that caches). Attack (1) requires a
      certain amount of obliviousness on Alice's part while making
      a new commit. Attack (2) requires that Mallory knows a sha1
      in Bob's repository, that Bob's server supports dumb http,
      and that the object in question is loose on Bob's server.
      
      But we can probably make things a bit more obvious without
      any loss of functionality. This patch does two things to
      that end.
      
      First, when we encounter a whole-repo redirect during the
      initial ref discovery, we now inform the user on stderr,
      making attack (1) much more obvious.
      
      Second, the decision to follow redirects is now
      configurable. The truly paranoid can set the new
      http.followRedirects to false to avoid any redirection
      entirely. But for a more practical default, we will disallow
      redirects only after the initial ref discovery. This is
      enough to thwart attacks similar to (2), while still
      allowing the common use of redirects at the repository
      level. Since c93c92f3 (http: update base URLs when we see
      redirects, 2013-09-28) we re-root all further requests from
      the redirect destination, which should generally mean that
      no further redirection is necessary.
      
      As an escape hatch, in case there really is a server that
      needs to redirect individual requests, the user can set
      http.followRedirects to "true" (and this can be done on a
      per-server basis via http.*.followRedirects config).
      Reported-by: NJann Horn <jannh@google.com>
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      50d34137
    • J
      remote-curl: rename shadowed options variable · fcaa6e64
      Jeff King 提交于
      The discover_refs() function has a local "options" variable
      to hold the http_get_options we pass to http_get_strbuf().
      But this shadows the global "struct options" that holds our
      program-level options, which cannot be accessed from this
      function.
      
      Let's give the local one a more descriptive name so we can
      tell the two apart.
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      fcaa6e64
  24. 19 11月, 2016 1 次提交
    • D
      remote-curl: don't hang when a server dies before any output · 296b847c
      David Turner 提交于
      In the event that a HTTP server closes the connection after giving a
      200 but before giving any packets, we don't want to hang forever
      waiting for a response that will never come.  Instead, we should die
      immediately.
      
      One case where this happens is when attempting to fetch a dangling
      object by its object name.  In this case, the server dies before
      sending any data.  Prior to this patch, fetch-pack would wait for
      data from the server, and remote-curl would wait for fetch-pack,
      causing a deadlock.
      
      Despite this patch, there is other possible malformed input that could
      cause the same deadlock (e.g. a half-finished pktline, or a pktline but
      no trailing flush).  There are a few possible solutions to this:
      
      1. Allowing remote-curl to tell fetch-pack about the EOF (so that
      fetch-pack could know that no more data is coming until it says
      something else).  This is tricky because an out-of-band signal would
      be required, or the http response would have to be re-framed inside
      another layer of pkt-line or something.
      
      2. Make remote-curl understand some of the protocol.  It turns out
      that in addition to understanding pkt-line, it would need to watch for
      ack/nak.  This is somewhat fragile, as information about the protocol
      would end up in two places.  Also, pkt-lines which are already at the
      length limit would need special handling.
      
      Both of these solutions would require a fair amount of work, whereas
      this hack is easy and solves at least some of the problem.
      
      Still to do: it would be good to give a better error message
      than "fatal: The remote end hung up unexpectedly".
      Signed-off-by: NDavid Turner <dturner@twosigma.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      296b847c
  25. 02 7月, 2016 3 次提交
    • J
      common-main: call git_setup_gettext() · 5ce5f5fa
      Jeff King 提交于
      This should be part of every program, as otherwise users do
      not get translated error messages. However, some external
      commands forgot to do so (e.g., git-credential-store). This
      fixes them, and eliminates the repeated code in programs
      that did remember to use it.
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      5ce5f5fa
    • J
      common-main: call git_extract_argv0_path() · 650c4492
      Jeff King 提交于
      Every program which links against libgit.a must call this
      function, or risk hitting an assert() in system_path() that
      checks whether we have configured argv0_path (though only
      when RUNTIME_PREFIX is defined, so essentially only on
      Windows).
      
      Looking at the diff, you can see that putting it into the
      common main() saves us having to do it individually in each
      of the external commands. But what you can't see are the
      cases where we _should_ have been doing so, but weren't
      (e.g., git-credential-store, and all of the t/helper test
      programs).
      
      This has been an accident-waiting-to-happen for a long time,
      but wasn't triggered until recently because it involves one
      of those programs actually calling system_path(). That
      happened with git-credential-store in v2.8.0 with ae5f6776
      (lazily load core.sharedrepository, 2016-03-11). The
      program:
      
        - takes a lock file, which...
      
        - opens a tempfile, which...
      
        - calls adjust_shared_perm to fix permissions, which...
      
        - lazy-loads the config (as of ae5f6776), which...
      
        - calls system_path() to find the location of
          /etc/gitconfig
      
      On systems with RUNTIME_PREFIX, this means credential-store
      reliably hits that assert() and cannot be used.
      
      We never noticed in the test suite, because we set
      GIT_CONFIG_NOSYSTEM there, which skips the system_path()
      lookup entirely.  But if we were to tweak git_config() to
      find /etc/gitconfig even when we aren't going to open it,
      then the test suite shows multiple failures (for
      credential-store, and for some other test helpers). I didn't
      include that tweak here because it's way too specific to
      this particular call to be worth carrying around what is
      essentially dead code.
      
      The implementation is fairly straightforward, with one
      exception: there is exactly one caller (git.c) that actually
      cares about the result of the function, and not the
      side-effect of setting up argv0_path. We can accommodate
      that by simply replacing the value of argv[0] in the array
      we hand down to cmd_main().
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      650c4492
    • J
      add an extra level of indirection to main() · 3f2e2297
      Jeff King 提交于
      There are certain startup tasks that we expect every git
      process to do. In some cases this is just to improve the
      quality of the program (e.g., setting up gettext()). In
      others it is a requirement for using certain functions in
      libgit.a (e.g., system_path() expects that you have called
      git_extract_argv0_path()).
      
      Most commands are builtins and are covered by the git.c
      version of main(). However, there are still a few external
      commands that use their own main(). Each of these has to
      remember to include the correct startup sequence, and we are
      not always consistent.
      
      Rather than just fix the inconsistencies, let's make this
      harder to get wrong by providing a common main() that can
      run this standard startup.
      
      We basically have two options to do this:
      
       - the compat/mingw.h file already does something like this by
         adding a #define that replaces the definition of main with a
         wrapper that calls mingw_startup().
      
         The upside is that the code in each program doesn't need
         to be changed at all; it's rewritten on the fly by the
         preprocessor.
      
         The downside is that it may make debugging of the startup
         sequence a bit more confusing, as the preprocessor is
         quietly inserting new code.
      
       - the builtin functions are all of the form cmd_foo(),
         and git.c's main() calls them.
      
         This is much more explicit, which may make things more
         obvious to somebody reading the code. It's also more
         flexible (because of course we have to figure out _which_
         cmd_foo() to call).
      
         The downside is that each of the builtins must define
         cmd_foo(), instead of just main().
      
      This patch chooses the latter option, preferring the more
      explicit approach, even though it is more invasive. We
      introduce a new file common-main.c, with the "real" main. It
      expects to call cmd_main() from whatever other objects it is
      linked against.
      
      We link common-main.o against anything that links against
      libgit.a, since we know that such programs will need to do
      this setup. Note that common-main.o can't actually go inside
      libgit.a, as the linker would not pick up its main()
      function automatically (it has no callers).
      
      The rest of the patch is just adjusting all of the various
      external programs (mostly in t/helper) to use cmd_main().
      I've provided a global declaration for cmd_main(), which
      means that all of the programs also need to match its
      signature. In particular, many functions need to switch to
      "const char **" instead of "char **" for argv. This effect
      ripples out to a few other variables and functions, as well.
      
      This makes the patch even more invasive, but the end result
      is much better. We should be treating argv strings as const
      anyway, and now all programs conform to the same signature
      (which also matches the way builtins are defined).
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      3f2e2297