1. 23 7月, 2016 1 次提交
  2. 06 3月, 2016 1 次提交
    • J
      strbuf_getwholeline: NUL-terminate getdelim buffer on error · b7090430
      Jeff King 提交于
      Commit 0cc30e0e (strbuf_getwholeline: use getdelim if it is
      available, 2015-04-16) tries to clean up after getdelim()
      returns EOF, but gets one case wrong, which can lead in some
      obscure cases to us reading uninitialized memory.
      
      After getdelim() returns -1, we re-initialize the strbuf
      only if sb->buf is NULL. The thinking was that either:
      
        1. We fed an existing allocated buffer to getdelim(), and
           at most it would have realloc'd, leaving our NUL in
           place.
      
        2. We didn't have a buffer to feed, so we gave getdelim()
           NULL; sb->buf will remain NULL, and we just want to
           restore the empty slopbuf.
      
      But that second case isn't quite right. getdelim() may
      allocate a buffer, write nothing into it, and then return
      EOF. The resulting strbuf rightfully has sb->len set to "0",
      but is missing the NUL terminator in the first byte.
      
      Most call-sites are fine with this. They see the EOF and
      don't bother looking at the strbuf. Or they notice that
      sb->len is empty, and don't look at the contents. But
      there's at least one case that does neither, and relies on
      parsing the resulting (possibly zero-length) string:
      fast-import. You can see this in action with the new test
      (though we probably only notice failure there when run with
      --valgrind or ASAN).
      
      We can fix this by unconditionally resetting the strbuf when
      we have a buffer after getdelim(). That fixes case 2 above.
      Case 1 is probably already fine in practice, but it does not
      hurt for us to re-assert our invariants (especially because
      we are relying on whatever getdelim() happens to do, which
      may vary from platform to platform). Our fix covers that
      case, too.
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      b7090430
  3. 02 3月, 2016 1 次提交
  4. 23 2月, 2016 1 次提交
    • J
      use xmallocz to avoid size arithmetic · 3733e694
      Jeff King 提交于
      We frequently allocate strings as xmalloc(len + 1), where
      the extra 1 is for the NUL terminator. This can be done more
      simply with xmallocz, which also checks for integer
      overflow.
      
      There's no case where switching xmalloc(n+1) to xmallocz(n)
      is wrong; the result is the same length, and malloc made no
      guarantees about what was in the buffer anyway. But in some
      cases, we can stop manually placing NUL at the end of the
      allocated buffer. But that's only safe if it's clear that
      the contents will always fill the buffer.
      
      In each case where this patch does so, I manually examined
      the control flow, and I tried to err on the side of caution.
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      3733e694
  5. 16 1月, 2016 2 次提交
    • J
      strbuf: give strbuf_getline() to the "most text friendly" variant · 1a0c8dfd
      Junio C Hamano 提交于
      Now there is no direct caller to strbuf_getline(), we can demote it
      to file-scope static that is private to strbuf.c and rename it to
      strbuf_getdelim().  Rename strbuf_getline_crlf(), which is designed
      to be the most "text friendly" variant, and allow it to take over
      this simplest name, strbuf_getline(), so we can add more uses of it
      without having to type _crlf over and over again in the coming
      steps.
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      1a0c8dfd
    • J
      strbuf: introduce strbuf_getline_{lf,nul}() · 8f309aeb
      Junio C Hamano 提交于
      The strbuf_getline() interface allows a byte other than LF or NUL as
      the line terminator, but this is only because I wrote these
      codepaths anticipating that there might be a value other than NUL
      and LF that could be useful when I introduced line_termination long
      time ago.  No useful caller that uses other value has emerged.
      
      By now, it is clear that the interface is overly broad without a
      good reason.  Many codepaths have hardcoded preference to read
      either LF terminated or NUL terminated records from their input, and
      then call strbuf_getline() with LF or NUL as the third parameter.
      
      This step introduces two thin wrappers around strbuf_getline(),
      namely, strbuf_getline_lf() and strbuf_getline_nul(), and
      mechanically rewrites these call sites to call either one of
      them.  The changes contained in this patch are:
      
       * introduction of these two functions in strbuf.[ch]
      
       * mechanical conversion of all callers to strbuf_getline() with
         either '\n' or '\0' as the third parameter to instead call the
         respective thin wrapper.
      
      After this step, output from "git grep 'strbuf_getline('" would
      become a lot smaller.  An interim goal of this series is to make
      this an empty set, so that we can have strbuf_getline_crlf() take
      over the shorter name strbuf_getline().
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      8f309aeb
  6. 15 1月, 2016 2 次提交
    • J
      strbuf: make strbuf_getline_crlf() global · c8aa9fdf
      Junio C Hamano 提交于
      Often we read "text" files that are supplied by the end user
      (e.g. commit log message that was edited with $GIT_EDITOR upon 'git
      commit -e'), and in some environments lines in a text file are
      terminated with CRLF.  Existing strbuf_getline() knows to read a
      single line and then strip the terminating byte from the result, but
      it is handy to have a version that is more tailored for a "text"
      input that takes both '\n' and '\r\n' as line terminator (aka
      <newline> in POSIX lingo) and returns the body of the line after
      stripping <newline>.
      
      Recently reimplemented "git am" uses such a function implemented
      privately; move it to strbuf.[ch] and make it available for others.
      
      Note that we do not blindly replace calls to strbuf_getline() that
      uses LF as the line terminator with calls to strbuf_getline_crlf()
      and this is very much deliberate.  Some callers may want to treat an
      incoming line that ends with CR (and terminated with LF) to have a
      payload that includes the final CR, and such a blind replacement
      will result in misconversion when done without code audit.
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      c8aa9fdf
    • J
      strbuf: miniscule style fix · dce80bd1
      Junio C Hamano 提交于
      We write one SP on each side of an operator, even inside an [] pair
      that computes the array index.
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      dce80bd1
  7. 17 12月, 2015 1 次提交
  8. 17 10月, 2015 1 次提交
  9. 26 9月, 2015 2 次提交
    • J
      convert trivial sprintf / strcpy calls to xsnprintf · 5096d490
      Jeff King 提交于
      We sometimes sprintf into fixed-size buffers when we know
      that the buffer is large enough to fit the input (either
      because it's a constant, or because it's numeric input that
      is bounded in size). Likewise with strcpy of constant
      strings.
      
      However, these sites make it hard to audit sprintf and
      strcpy calls for buffer overflows, as a reader has to
      cross-reference the size of the array with the input. Let's
      use xsnprintf instead, which communicates to a reader that
      we don't expect this to overflow (and catches the mistake in
      case we do).
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      5096d490
    • J
      add reentrant variants of sha1_to_hex and find_unique_abbrev · af49c6d0
      Jeff King 提交于
      The sha1_to_hex and find_unique_abbrev functions always
      write into reusable static buffers. There are a few problems
      with this:
      
        - future calls overwrite our result. This is especially
          annoying with find_unique_abbrev, which does not have a
          ring of buffers, so you cannot even printf() a result
          that has two abbreviated sha1s.
      
        - if you want to put the result into another buffer, we
          often strcpy, which looks suspicious when auditing for
          overflows.
      
      This patch introduces sha1_to_hex_r and find_unique_abbrev_r,
      which write into a user-provided buffer. Of course this is
      just punting on the overflow-auditing, as the buffer
      obviously needs to be GIT_SHA1_HEXSZ + 1 bytes. But it is
      much easier to audit, since that is a well-known size.
      
      We retain the non-reentrant forms, which just become thin
      wrappers around the reentrant ones. This patch also adds a
      strbuf variant of find_unique_abbrev, which will be handy in
      later patches.
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      af49c6d0
  10. 11 8月, 2015 1 次提交
    • J
      strbuf_read(): skip unnecessary strbuf_grow() at eof · 3ebbd00c
      Jim Hill 提交于
      The loop in strbuf_read() uses xread() repeatedly while extending
      the strbuf until the call returns zero.  If the buffer is
      sufficiently large to begin with, this results in xread()
      returning the remainder of the file to the end (returning
      non-zero), the loop extending the strbuf, and then making another
      call to xread() to have it return zero.
      
      By using read_in_full(), we can tell when the read reached the end
      of file: when it returns less than was requested, it's eof.  This
      way we can avoid an extra iteration that allocates an extra 8kB
      that is never used.
      Signed-off-by: NJim Hill <gjthill@gmail.com>
      Reviewed-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      3ebbd00c
  11. 21 7月, 2015 1 次提交
    • J
      strbuf: make strbuf_addftime more robust · e4f031e3
      Jeff King 提交于
      The return value of strftime is poorly designed; when it
      returns 0, the caller cannot tell if the buffer was not
      large enough, or if the output was actually 0 bytes. In the
      original implementation of strbuf_addftime, we simply punted
      and guessed that our 128-byte hint would be large enough.
      
      We can do better, though, if we're willing to treat strftime
      like less of a black box. We can munge the incoming format
      to make sure that it never produces 0-length output, and
      then "fix" the resulting output.  That lets us reliably grow
      the buffer based on strftime's return value.
      Clever-idea-by: NEric Sunshine <sunshine@sunshineco.com>
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      e4f031e3
  12. 04 7月, 2015 1 次提交
  13. 30 6月, 2015 1 次提交
  14. 16 4月, 2015 4 次提交
    • J
      strbuf_getwholeline: use getdelim if it is available · 0cc30e0e
      Jeff King 提交于
      We spend a lot of time in strbuf_getwholeline in a tight
      loop reading characters from a stdio handle into a buffer.
      The libc getdelim() function can do this for us with less
      overhead. It's in POSIX.1-2008, and was a GNU extension
      before that. Therefore we can't rely on it, but can fall
      back to the existing getc loop when it is not available.
      
      The HAVE_GETDELIM knob is turned on automatically for Linux,
      where we have glibc. We don't need to set any new
      feature-test macros, because we already define _GNU_SOURCE.
      Other systems that implement getdelim may need to other
      macros (probably _POSIX_C_SOURCE >= 200809L), but we can
      address that along with setting the Makefile knob after
      testing the feature on those systems.
      
      Running "git rev-parse refs/heads/does-not-exist" on a repo
      with an extremely large (1.6GB) packed-refs file went from
      (best-of-5):
      
        real    0m8.601s
        user    0m8.084s
        sys     0m0.524s
      
      to:
      
        real    0m6.768s
        user    0m6.340s
        sys     0m0.432s
      
      for a wall-clock speedup of 21%.
      
      Based on a patch from Rasmus Villemoes <rv@rasmusvillemoes.dk>.
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      0cc30e0e
    • J
      strbuf_getwholeline: avoid calling strbuf_grow · f80c153b
      Jeff King 提交于
      As with the recent speedup to strbuf_addch, we can avoid
      calling strbuf_grow() in a tight loop of single-character
      adds by instead checking strbuf_avail.
      
      Note that we would instead call strbuf_addch directly here,
      but it does more work than necessary: it will NUL-terminate
      the result for each character read. Instead, in this loop we
      read the characters one by one and then add the terminator
      manually at the end.
      
      Running "git rev-parse refs/heads/does-not-exist" on a repo
      with an extremely large (1.6GB) packed-refs file went from
      (best-of-5):
      
        real    0m10.948s
        user    0m10.548s
        sys     0m0.412s
      
      to:
      
        real    0m8.601s
        user    0m8.084s
        sys     0m0.524s
      
      for a wall-clock speedup of 21%.
      Helped-by: NEric Sunshine <sunshine@sunshineco.com>
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      f80c153b
    • J
      strbuf_getwholeline: use getc_unlocked · 82912d1d
      Jeff King 提交于
      strbuf_getwholeline calls getc in a tight loop. On modern
      libc implementations, the stdio code locks the handle for
      every operation, which means we are paying a significant
      overhead.  We can get around this by locking the handle for
      the whole loop and using the unlocked variant.
      
      Running "git rev-parse refs/heads/does-not-exist" on a repo
      with an extremely large (1.6GB) packed-refs file went from:
      
        real    0m18.900s
        user    0m18.472s
        sys     0m0.448s
      
      to:
      
        real    0m10.953s
        user    0m10.384s
        sys     0m0.580s
      
      for a wall-clock speedup of 42%. All times are best-of-3,
      and done on a glibc 2.19 system.
      
      Note that we call into strbuf_grow while holding the lock.
      It's possible for that function to call other stdio
      functions (e.g., printing to stderr when dying due to malloc
      error); however, the POSIX.1-2001 definition of flockfile
      makes it clear that the locks are per-handle, so we are fine
      unless somebody else tries to read from our same handle.
      This doesn't ever happen in the current code, and is
      unlikely to be added in the future (we would have to do
      something exotic like add a die_routine that tried to read
      from stdin).
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      82912d1d
    • J
      strbuf_getwholeline: use getc macro · 3446a59b
      Jeff King 提交于
      strbuf_getwholeline calls fgetc in a tight loop. Using the
      getc form, which can be implemented as a macro, should be
      faster (and we do not care about it evaluating our argument
      twice, as we just have a plain variable).
      
      On my glibc system, running "git rev-parse
      refs/heads/does-not-exist" on a file with an extremely large
      (1.6GB) packed-refs file went from (best of 3 runs):
      
        real    0m19.383s
        user    0m18.876s
        sys     0m0.528s
      
      to:
      
        real    0m18.900s
        user    0m18.472s
        sys     0m0.448s
      
      for a wall-clock speedup of 2.5%.
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      3446a59b
  15. 28 10月, 2014 1 次提交
    • J
      strbuf_add_commented_lines(): avoid SP-HT sequence in commented lines · d55aeb76
      Junio C Hamano 提交于
      The strbuf_add_commented_lines() function passes a pair of prefixes,
      one to be used for a non-empty line, and the other for an empty
      line, to underlying add_lines().  The former is set to a comment
      char followed by a SP, while the latter is set to just the comment
      char.  This is designed to give a SP after the comment character,
      e.g. "# <user text>\n", on a line with some text, and to avoid
      emitting an unsightly "# \n" for an empty line.
      
      Teach this machinery to also use the latter space-less prefix when
      the payload line begins with a tab, to show e.g. "#\t<user text>\n";
      otherwise we will end up showing "# \t<user text>\n" which is
      similarly unsightly.
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      d55aeb76
  16. 09 9月, 2014 1 次提交
  17. 27 8月, 2014 1 次提交
  18. 29 7月, 2014 1 次提交
  19. 01 7月, 2014 1 次提交
    • J
      implement ends_with via strip_suffix · f52a35fd
      Jeff King 提交于
      The ends_with function is essentially a simplified version
      of strip_suffix, in which we throw away the stripped length.
      Implementing it as an inline on top of strip_suffix has two
      advantages:
      
        1. We save a bit of duplicated code.
      
        2. The suffix is typically a string literal, and we call
           strlen on it. By making the function inline, many
           compilers can replace the strlen call with a constant.
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      f52a35fd
  20. 20 6月, 2014 1 次提交
    • J
      strbuf: add xstrfmt helper · 30a0ddb7
      Jeff King 提交于
      You can use a strbuf to build up a string from parts, and
      then detach it. In the general case, you might use multiple
      strbuf_add* functions to do the building. However, in many
      cases, a single strbuf_addf is sufficient, and we end up
      with:
      
        struct strbuf buf = STRBUF_INIT;
        ...
        strbuf_addf(&buf, fmt, some, args);
        str = strbuf_detach(&buf, NULL);
      
      We can make this much more readable (and avoid introducing
      an extra variable, which can clutter the code) by
      introducing a convenience function:
      
        str = xstrfmt(fmt, some, args);
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      30a0ddb7
  21. 28 5月, 2014 1 次提交
  22. 24 5月, 2014 2 次提交
  23. 07 5月, 2014 1 次提交
  24. 06 12月, 2013 2 次提交
  25. 11 4月, 2013 1 次提交
    • A
      strbuf: create strbuf_humanise_bytes() to show byte sizes · 079b546a
      Antoine Pelisse 提交于
      Humanization of downloaded size is done in the same function as text
      formatting in 'process.c'. The code cannot be reused easily elsewhere.
      
      Separate text formatting from size simplification and make the
      function public in strbuf so that it can easily be used by other
      callers.
      
      We now can use strbuf_humanise_bytes() for both downloaded size and
      download speed calculation. One of the drawbacks is that speed will
      now look like this when download is stalled: "0 bytes/s" instead of
      "0 KiB/s".
      Signed-off-by: NAntoine Pelisse <apelisse@gmail.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      079b546a
  26. 17 1月, 2013 1 次提交
    • J
      Allow custom "comment char" · eff80a9f
      Junio C Hamano 提交于
      Some users do want to write a line that begin with a pound sign, #,
      in their commit log message.  Many tracking system recognise
      a token of #<bugid> form, for example.
      
      The support we offer these use cases is not very friendly to the end
      users.  They have a choice between
      
       - Don't do it.  Avoid such a line by rewrapping or indenting; and
      
       - Use --cleanup=whitespace but remove all the hint lines we add.
      
      Give them a way to set a custom comment char, e.g.
      
          $ git -c core.commentchar="%" commit
      
      so that they do not have to do either of the two workarounds.
      
      [jc: although I started the topic, all the tests and documentation
      updates, many of the call sites of the new strbuf_add_commented_*()
      functions, and the change to git-submodule.sh scripted Porcelain are
      from Ralf.]
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      Signed-off-by: NRalf Thielow <ralf.thielow@gmail.com>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      eff80a9f
  27. 27 11月, 2012 1 次提交
  28. 04 11月, 2012 3 次提交
  29. 19 10月, 2012 1 次提交
    • J
      strbuf: always return a non-NULL value from strbuf_detach · 08ad56f3
      Jeff King 提交于
      The current behavior is to return NULL when strbuf did not
      actually allocate a string. This can be quite surprising to
      callers, though, who may feed the strbuf from arbitrary data
      and expect to always get a valid value.
      
      In most cases, it does not make a difference because calling
      any strbuf function will cause an allocation (even if the
      function ends up not inserting any data). But if the code is
      structured like:
      
        struct strbuf buf = STRBUF_INIT;
        if (some_condition)
      	  strbuf_addstr(&buf, some_string);
        return strbuf_detach(&buf, NULL);
      
      then you may or may not return NULL, depending on the
      condition. This can cause us to segfault in http-push
      (when fed an empty URL) and in http-backend (when an empty
      parameter like "foo=bar&&" is in the $QUERY_STRING).
      
      This patch forces strbuf_detach to allocate an empty
      NUL-terminated string when it is called on a strbuf that has
      not been allocated.
      
      I investigated all call-sites of strbuf_detach. The majority
      are either not affected by the change (because they call a
      strbuf_* function unconditionally), or can handle the empty
      string just as easily as NULL.
      Signed-off-by: NJeff King <peff@peff.net>
      Signed-off-by: NJunio C Hamano <gitster@pobox.com>
      08ad56f3
  30. 16 9月, 2012 1 次提交