1. 09 Jan, 2019 2 commits
    • sha1-file: modernize loose header/stream functions · 00a7760e
      Jeff King committed
      As with the open/map/close functions for loose objects that were
      recently converted, the functions for parsing the loose object stream
      use the name "sha1" and a bare "unsigned char *". Let's fix that so that
      unpack_sha1_header() becomes unpack_loose_header(), etc.
      
      These conversions are less clear-cut than the file access functions.
      You could argue that they are parsing Git's canonical object format
      (i.e., "type size\0contents", over which we compute the hash), which is
      not strictly tied to loose storage. But in practice these functions are
      used only for loose objects, and using the term "loose_header" (instead
      of "object_header") distinguishes it from the object header found in
      packfiles (which contains the same information in a different format).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
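The canonical "type size\0contents" layout mentioned above can be illustrated with a small parser. This is a minimal sketch, not Git's unpack_loose_header(): the function name parse_header_sketch and its signature are hypothetical, it assumes the header has already been inflated into memory, and it does not guard against overflow of the size field.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: parse an already-inflated "type size\0" header.
 * Returns the header length including the NUL, or -1 if malformed.
 */
static long parse_header_sketch(const char *buf, size_t len,
				char *type, size_t type_len,
				unsigned long *size)
{
	const char *nul = memchr(buf, '\0', len);
	const char *sp;
	unsigned long n = 0;
	size_t tlen;

	if (!nul)
		return -1;	/* header is not NUL-terminated within buf */
	sp = memchr(buf, ' ', (size_t)(nul - buf));
	if (!sp || sp == buf)
		return -1;	/* no type name before the space */
	tlen = (size_t)(sp - buf);
	if (tlen >= type_len)
		return -1;	/* type name does not fit in the caller's buffer */
	memcpy(type, buf, tlen);
	type[tlen] = '\0';

	if (sp + 1 == nul)
		return -1;	/* empty size field */
	for (sp++; sp < nul; sp++) {
		if (*sp < '0' || *sp > '9')
			return -1;	/* size must be decimal digits only */
		n = n * 10 + (unsigned long)(*sp - '0');
	}
	*size = n;
	return (long)(nul - buf) + 1;
}
```

The object contents start at the returned offset, so a caller would hash or copy from buf + offset onward.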
    • sha1-file: modernize loose object file functions · 514c5fdd
      Jeff King committed
      The loose object access code in sha1-file.c is some of the oldest in
      Git, and could use some modernizing. It mostly uses "unsigned char *"
      for object ids, which these days should be "struct object_id".
      
      It also uses the term "sha1_file" in many functions, which is confusing.
      The term "loose_objects" is much better. It clearly distinguishes
      them from packed objects (which didn't even exist back when the name
      "sha1_file" came into being). And it also distinguishes it from the
      checksummed-file concept in csum-file.c (which until recently was
      actually called "struct sha1file"!).
      
      This patch converts the functions {open,close,map,stat}_sha1_file() into
      open_loose_object(), etc, and switches their sha1 arguments for
      object_id structs. Similarly, path functions like fill_sha1_path()
      become fill_loose_path() and use object_ids.
      
      The function sha1_loose_object_info() already says "loose", so we can
      just drop the "sha1" (and teach it to use object_id).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  2. 31 Oct, 2018 1 commit
    • read_istream_pack_non_delta(): document input handling · 0afbe3e8
      Jeff King committed
      Twice now we have scratched our heads about why the loose streaming code
      needs the protection added by 692f0bc7 (avoid infinite loop in
      read_istream_loose, 2013-03-25), but the similar code in its pack
      counterpart does not.
      
      The short answer is that use_pack() will die before it lets us run out
      of bytes. Note that this could mean reading garbage (including the
      trailing hash) from the packfile in some cases of corruption, but that's
      OK. zlib will notice and complain (and if not, certainly the end result
      will not match the object hash we expect).
      
      Let's leave a comment this time to document our findings.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  3. 26 Apr, 2018 1 commit
  4. 12 Apr, 2018 2 commits
  5. 27 Mar, 2018 1 commit
  6. 15 Mar, 2018 6 commits
  7. 14 Sep, 2017 1 commit
  8. 24 Aug, 2017 1 commit
  9. 27 Sep, 2016 1 commit
  10. 08 Sep, 2016 1 commit
  11. 12 Aug, 2016 1 commit
    • provide an initializer for "struct object_info" · 27b5c1a0
      Jeff King committed
      An all-zero initializer is fine for this struct, but because
      the first element is a pointer, call sites need to know to
      use "NULL" instead of "0". Otherwise some static checkers
      like "sparse" will complain; see d099b717 (Fix some sparse
      warnings, 2013-07-18) for example.  So let's provide an
      initializer to make this easier to get right.
      
      But let's also comment that memset() to zero is explicitly
      OK[1]. One of the callers embeds object_info in another
      struct which is initialized via memset (expand_data in
      builtin/cat-file.c). Since our subset of C doesn't allow
      assignment from a compound literal, handling this in any
      other way is awkward, so we'd like to keep the ability to
      initialize by memset(). By documenting this property, it
      should make anybody who wants to change the initializer
      think twice before doing so.
      
      There's one other caller of interest. In parse_sha1_header(),
      we did not initialize the struct fully in the first place.
      This turned out not to be a bug because the sub-function it
      calls does not look at any other fields except the ones we
      did initialize. But that assumption might not hold in the
      future, so it's a dangerous construct. This patch switches
      it to initializing the whole struct, which protects us
      against unexpected reads of the other fields.
      
      [1] Obviously using memset() to initialize a pointer
          violates the C standard, but we long ago decided that it
          was an acceptable tradeoff in the real world.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
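The two initialization styles discussed above can be compared side by side. The following is a sketch under stated assumptions: the struct names, the macro name OBJECT_INFO_SKETCH_INIT, and the helper functions are hypothetical stand-ins, not Git's actual definitions, and the memset() path relies on all-zero bytes representing a null pointer, which the footnote above accepts as a real-world tradeoff.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for "struct object_info". */
struct object_info_sketch {
	int *typep;		/* first member is a pointer, hence NULL, not 0 */
	unsigned long *sizep;
};

/* Initializer macro in the spirit of the one this commit provides. */
#define OBJECT_INFO_SKETCH_INIT { NULL }

/* A caller that embeds the struct and clears everything via memset(). */
struct expand_data_sketch {
	int mark;
	struct object_info_sketch info;
};

static int all_queries_off(const struct object_info_sketch *oi)
{
	return !oi->typep && !oi->sizep;
}

/* Returns 1 if both initialization styles leave every query disabled. */
static int init_demo(void)
{
	struct object_info_sketch a = OBJECT_INFO_SKETCH_INIT;
	struct expand_data_sketch d;

	memset(&d, 0, sizeof(d));
	return all_queries_off(&a) && all_queries_off(&d.info);
}
```

Keeping both paths valid means an embedding struct does not need to know the layout of the struct it contains.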
  12. 01 Apr, 2015 1 commit
    • streaming.c: fix a memleak · 9ce4ad3e
      John Keeping committed
      When stream_blob_to_fd() opens an input stream with a filter, the
      filter gets discarded upon calling close_istream() before the
      function returns in the normal case.  However, when we fail to open
      the stream, we failed to discard the filter.
      
      By discarding the filter in the failure case, give a consistent
      life-time rule of the filter to the callers; otherwise the callers
      need to conditionally discard the filter themselves, and this
      function does not give enough hint for the caller to do so
      correctly.
      Signed-off-by: John Keeping <john@keeping.me.uk>
      Signed-off-by: Stefan Beller <sbeller@google.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  13. 19 Feb, 2014 1 commit
  14. 18 Jan, 2014 1 commit
  15. 13 Dec, 2013 1 commit
  16. 24 Jul, 2013 1 commit
  17. 19 Jul, 2013 1 commit
  18. 13 Jul, 2013 1 commit
    • sha1_object_info_extended: make type calculation optional · 5b086407
      Jeff King committed
      Each caller of sha1_object_info_extended sets up an
      object_info struct to tell the function which elements of
      the object it wants to get. Until now, getting the type of
      the object has always been required (and it is returned via
      the return type rather than a pointer in object_info).
      
      This can involve actually opening a loose object file to
      determine its type, or following delta chains to determine a
      packed file's base type. These effects produce a measurable
      slow-down when doing a "cat-file --batch-check" that does
      not include %(objecttype).
      
      This patch adds a "typep" query to struct object_info, so
      that it can be optionally queried just like size and
      disk_size. As a result, the return type of the function is
      no longer the object type, but rather 0/-1 for success/error.
      
      As there are only three callers total, we just fix up each
      caller rather than keep a compatibility wrapper:
      
        1. The simpler sha1_object_info wrapper continues to
           always ask for and return the type field.
      
        2. The istream_source function wants to know the type, and
           so always asks for it.
      
        3. The cat-file batch code asks for the type only when
           %(objecttype) is part of the format string.
      
      On linux.git, the best-of-five for running:
      
        $ git rev-list --objects --all >objects
        $ time git cat-file --batch-check='%(objectsize:disk)' <objects
      
      on a fully packed repository goes from:
      
        real    0m8.680s
        user    0m8.160s
        sys     0m0.512s
      
      to:
      
        real    0m7.205s
        user    0m6.580s
        sys     0m0.608s
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
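The "optional query" pattern described above, where a NULL pointer means "do not compute this field" and the return value is only 0 or -1, might look roughly like this. Everything here is a hypothetical illustration: the struct, the function names, the counter, and the constant values are invented for the sketch and do not reflect Git's real lookup code.

```c
#include <stddef.h>

/* Hypothetical query struct: a NULL member means "caller does not care". */
struct query_sketch {
	int *typep;
	unsigned long *sizep;
};

static int type_lookups;	/* counts how often the costly path runs */

/* Stand-in for opening a loose object or walking a delta chain. */
static int expensive_type_lookup(void)
{
	type_lookups++;
	return 3;	/* arbitrary placeholder type code */
}

/* Returns 0 on success, -1 on error; never returns the type itself. */
static int object_info_sketch(int object_exists, struct query_sketch *q)
{
	if (!object_exists)
		return -1;
	if (q->sizep)
		*q->sizep = 42;	/* placeholder size */
	if (q->typep)
		*q->typep = expensive_type_lookup();
	return 0;
}
```

A caller that only wants the size leaves typep as NULL and so never pays for the type lookup.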
  19. 08 Jul, 2013 1 commit
    • zero-initialize object_info structs · 7c07385d
      Jeff King committed
      The sha1_object_info_extended function expects the caller to
      provide a "struct object_info" which contains pointers to
      "query" items that will be filled in. The purpose of
      providing pointers rather than storing the response directly
      in the struct is so that callers can choose not to incur the
      expense in finding particular fields that they do not care
      about.
      
      Right now the only query item is "sizep", and all callers
      set it explicitly to choose whether or not to query it; they
      can then leave the rest of the struct uninitialized.
      
      However, as we add new query items, each caller will have to
      be updated to explicitly turn off the new ones (by setting
      them to NULL).  Instead, let's teach each caller to
      zero-initialize the struct, so that they do not have to
      learn about each new query item added.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  20. 28 Mar, 2013 3 commits
    • avoid infinite loop in read_istream_loose · 692f0bc7
      Jeff King committed
      The read_istream_loose function loops on inflating a chunk of data
      from an mmap'd loose object. We end the loop when we run out
      of space in our output buffer, or if we see a zlib error.
      
      We need to treat Z_BUF_ERROR specially, though, as it is not
      fatal; it is just zlib's way of telling us that we need to
      either feed it more input or give it more output space. It
      is perfectly normal for us to hit this when we are at the
      end of our buffer.
      
      However, we may also get Z_BUF_ERROR because we have run out
      of input. In a well-formed object, this should not happen,
      because we have fed the whole mmap'd contents to zlib. But
      if the object is truncated or corrupt, we will loop forever,
      never giving zlib any more data, but continuing to ask it to
      inflate.
      
      We can fix this by considering it an error when zlib returns
      Z_BUF_ERROR but we still have output space left (which means
      it must want more input, which we know is a truncation
      error). It would not be sufficient to just check whether
      zlib had consumed all the input at the start of the loop, as
      it might still want to generate output from what is in its
      internal state.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
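The loop condition described above can be modeled without zlib. The sketch below is an assumption-heavy toy: sk_inflate is a made-up decoder that merely copies bytes and reports a "buffer error" when it cannot make progress, and the status codes and struct fields only imitate the shape of zlib's interface. It is meant to show why "buffer error with output space left" must be treated as truncation.

```c
#include <stddef.h>
#include <string.h>

enum { SK_OK = 0, SK_STREAM_END = 1, SK_BUF_ERROR = -5 };

/* Toy stand-in for a z_stream; "decoding" is a plain byte copy. */
struct sk_stream {
	const unsigned char *next_in;
	size_t avail_in;
	unsigned char *next_out;
	size_t avail_out;
	size_t remaining;	/* bytes the stream still has to produce */
};

static int sk_inflate(struct sk_stream *s)
{
	size_t n = s->avail_in < s->avail_out ? s->avail_in : s->avail_out;

	if (!s->remaining)
		return SK_STREAM_END;
	if (n > s->remaining)
		n = s->remaining;
	if (!n)
		return SK_BUF_ERROR;	/* needs more input or more output space */
	memcpy(s->next_out, s->next_in, n);
	s->next_in += n;
	s->avail_in -= n;
	s->next_out += n;
	s->avail_out -= n;
	s->remaining -= n;
	return s->remaining ? SK_OK : SK_STREAM_END;
}

/* Returns bytes produced, or -1 if the input turns out to be truncated. */
static long read_loose_sketch(struct sk_stream *s, unsigned char *buf, size_t sz)
{
	size_t total = 0;

	while (total < sz) {
		int status;

		s->next_out = buf + total;
		s->avail_out = sz - total;
		status = sk_inflate(s);
		total = sz - s->avail_out;
		if (status == SK_STREAM_END)
			break;
		/* Output space left, so the decoder must want input we lack. */
		if (status == SK_BUF_ERROR && s->avail_out)
			return -1;
		if (status != SK_OK && status != SK_BUF_ERROR)
			return -1;
	}
	return (long)total;
}
```

Without the avail_out check, a truncated input would make this loop spin forever, since no later iteration can supply more input.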
    • read_istream_filtered: propagate read error from upstream · 42e7e2a5
      Jeff King committed
      The filter istream pulls data from an "upstream" stream,
      running it through a filter function. However, we did not
      properly notice when the upstream filter yielded an error,
      and just returned what we had read. Instead, we should
      propagate the error.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • stream_blob_to_fd: detect errors reading from stream · 45d4bdae
      Jeff King committed
      We call read_istream, but never check its return value for
      errors. This can lead to us looping infinitely, as we just
      keep trying to write "-1" bytes (and we do not notice the
      error, as we simply check that write_in_full reports the
      same number of bytes we fed it, which of course is also -1).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
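The missing check described above amounts to testing the reader's return value before using it as a byte count. A minimal sketch, assuming a hypothetical in-memory source (sk_source, sk_read) and an in-memory destination instead of Git's istream and file descriptor:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical source: yields data in chunks, or fails when "fail" is set. */
struct sk_source {
	const char *data;
	size_t left;
	int fail;
};

/* Returns bytes read, 0 at end of data, or -1 on error. */
static long sk_read(struct sk_source *s, char *buf, size_t len)
{
	size_t n;

	if (s->fail)
		return -1;
	n = s->left < len ? s->left : len;
	memcpy(buf, s->data, n);
	s->data += n;
	s->left -= n;
	return (long)n;
}

/* Returns total bytes copied into dst, or -1 on read error or overflow. */
static long copy_all_sketch(struct sk_source *s, char *dst, size_t cap)
{
	char buf[4];
	size_t total = 0;

	for (;;) {
		long got = sk_read(s, buf, sizeof(buf));

		if (got < 0)
			return -1;	/* stop instead of "writing -1 bytes" */
		if (!got)
			break;		/* end of stream */
		if (total + (size_t)got > cap)
			return -1;	/* destination too small */
		memcpy(dst + total, buf, (size_t)got);
		total += (size_t)got;
	}
	return (long)total;
}
```

The early return on a negative count is what keeps an error value from being passed on as a length.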
  21. 19 May, 2012 1 commit
  22. 04 May, 2012 1 commit
  23. 08 Mar, 2012 1 commit
  24. 23 Jul, 2011 1 commit
    • streaming: free git_istream upon closing · 95dea6eb
      Jeff King committed
      Kirill Smelkov noticed that post-1.7.6 "git checkout"
      started leaking tons of memory. The streaming_write_entry
      function properly calls close_istream(), but that function
      did not actually free() the allocated git_istream struct.
      
      The git_istream struct is totally opaque to calling code,
      and must be heap-allocated by open_istream. Therefore it's
      not appropriate for callers to have to free it.
      
      This patch makes close_istream() into "close and de-allocate
      all associated resources". We could add a new "free_istream"
      call, but there's not much point in letting callers inspect
      the istream after close. And this patch's semantics make us
      match fopen/fclose, which is well-known and understood.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  25. 27 May, 2011 2 commits
    • Add streaming filter API · b6691092
      Junio C Hamano committed
      This introduces an API to plug custom filters to an input stream.
      
      The caller calls get_stream_filter("path") to obtain an appropriate
      filter for the path, and then uses it when opening an input stream
      via open_istream().  After that, the caller can read from the stream
      with read_istream(), and close it with close_istream(), just like an
      unfiltered stream.
      
      This only adds a "null" filter that is a pass-thru filter, but later
      changes can add LF-to-CRLF and other filters, and the callers of the
      streaming API do not have to change.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • stream filter: add "no more input" to the filters · 4ae66704
      Junio C Hamano committed
      Some filters may need to buffer the input and look-ahead inside it
      to decide what to output, and they may consume more than zero bytes
      of input and still not produce any output. After feeding all the
      input, pass NULL as input and keep calling stream_filter() to let
      such filters know there is no more input coming, and it is time for
      them to produce the remaining output based on the buffered input.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
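The "no more input" convention described above can be illustrated with a toy look-ahead filter. This is only a sketch: hold_filter and hold_filter_run are hypothetical names with a simplified signature that does not match Git's stream_filter(), and the caller is assumed to supply an output buffer at least as large as the input (and at least one byte when flushing).

```c
#include <stddef.h>

/* Toy filter that always holds back the most recent byte as look-ahead. */
struct hold_filter {
	int have;	/* 1 if "held" contains a buffered byte */
	char held;
};

/*
 * Consumes all of input and returns the number of bytes written to output.
 * input == NULL signals "no more input": flush whatever is still buffered.
 */
static size_t hold_filter_run(struct hold_filter *f, const char *input,
			      size_t ilen, char *output)
{
	size_t o = 0, i;

	if (!input) {
		if (f->have) {
			output[o++] = f->held;
			f->have = 0;
		}
		return o;
	}
	for (i = 0; i < ilen; i++) {
		if (f->have)
			output[o++] = f->held;	/* emit the previously held byte */
		f->held = input[i];
		f->have = 1;
	}
	return o;
}
```

The first call may consume input and emit nothing, which is why the final NULL call is needed to drain the held byte.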
  26. 21 May, 2011 3 commits