1. 24 May, 2018 1 commit
  2. 06 May, 2018 1 commit
  3. 26 April, 2018 2 commits
  4. 16 April, 2018 14 commits
    • pack-objects: show some progress when counting kept objects · 5af05043
      Committed by Nguyễn Thái Ngọc Duy
      We only show progress when there are new objects to be packed. But
      when --keep-pack is specified on the base pack, we will exclude most
      objects. This makes 'pack-objects' stay silent for a long time
      while the counting phase is running.
      
      Let's show some progress whenever we visit an object instead. The old
      "Counting objects" is renamed to "Enumerating objects" and a new
      progress "Counting objects" line is added.
      
      This new "Counting objects" line should progress pretty quickly when the
      system is beefy. But when the system is under pressure, reading
      object headers in this phase could be slow, and showing progress is
      an improvement over staying silent as the current code does.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • gc --auto: exclude base pack if not enough mem to "repack -ad" · 9806f5a7
      Committed by Nguyễn Thái Ngọc Duy
      pack-objects could be a big memory hog, especially on large repos;
      everybody knows that. The suggestion to stick a .keep file on the
      giant base pack to avoid this problem has also been known for a long time.
      
      Recent patches add an option to do just this, but it has to be either
      configured or activated manually. This patch lets `git gc --auto`
      activate this mode automatically when it thinks `repack -ad` will use
      a lot of memory and start affecting the system due to swapping or
      flushing OS cache.
      
      gc --auto decides to do this based on an estimation of pack-objects
      memory usage, which is quite accurate at least for the heap part, and
      whether that fits in half of system memory (the assumption here is for
      desktop environment where there are many other applications running).
      
      This mechanism only kicks in if gc.bigBasePackThreshold is not configured.
      If it is, it is assumed that the user already knows what they want.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
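The decision rule described in the gc --auto message can be sketched roughly as follows. This is only an illustration under stated assumptions: `should_keep_base_pack()` and its parameters are hypothetical names, not Git's actual code, and the memory estimate itself is taken as an input rather than computed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not Git's actual code: decide whether gc --auto
 * should keep the base pack out of the repack.
 *
 * estimated_bytes:      estimated pack-objects memory usage
 * total_ram_bytes:      physical memory of the machine (must be non-zero)
 * threshold_configured: true if the user set gc.bigBasePackThreshold
 */
static bool should_keep_base_pack(uint64_t estimated_bytes,
				  uint64_t total_ram_bytes,
				  bool threshold_configured)
{
	/* Precondition of this sketch: the caller knows the memory size. */
	assert(total_ram_bytes > 0);

	/* The user configured the threshold, so the automatic mode stays off. */
	if (threshold_configured)
		return false;

	/* Keep the base pack when the estimate does not fit in half of RAM. */
	return estimated_bytes > total_ram_bytes / 2;
}
```

With 8 GiB of memory, an estimate of 6 GiB would trigger the mode and an estimate of 1 GiB would not, unless the threshold is configured, in which case the function always returns false.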
    • repack: add --keep-pack option · ed7e5fc3
      Committed by Nguyễn Thái Ngọc Duy
      We allow keeping existing packs by having companion .keep files. This
      is helpful when a pack is permanently kept. In the next patch, git-gc
      just wants to keep a pack temporarily, for one pack-objects
      run. git-gc can use --keep-pack for this use case.
      
      A note about why the pack_keep field cannot be reused and
      pack_keep_in_core has to be added. This is about the case when
      --keep-pack is specified together with either --keep-unreachable or
      --unpack-unreachable, but --honor-pack-keep is NOT specified.
      
      In this case, we want to exclude objects from the packs specified on
      command line, not from ones with .keep files. If only one bit flag is
      used, we have to clear pack_keep on pack files with the .keep file.
      
      But we can't make any assumption about unreachable objects in .keep
      packs. If the "pack_keep" field is false for .keep packs, we could
      potentially pull lots of unreachable objects into the new pack, or
      unpack them loose. The safer approach is to ignore all packs with
      either a .keep file or --keep-pack.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
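The reasoning for keeping two separate flags can be illustrated with a small sketch. The field names pack_keep and pack_keep_in_core come from the message above; the struct `pack_sketch` and both helper functions are hypothetical simplifications, not Git's real data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a pack descriptor. */
struct pack_sketch {
	bool pack_keep;         /* an on-disk .keep file exists */
	bool pack_keep_in_core; /* named via --keep-pack, for this run only */
};

/*
 * When handling unreachable objects (--keep-unreachable or
 * --unpack-unreachable), a pack is left alone if either flag is set.
 */
static bool pack_is_kept(const struct pack_sketch *p)
{
	assert(p != NULL);
	return p->pack_keep || p->pack_keep_in_core;
}

/*
 * When choosing objects for the new pack, --keep-pack always excludes,
 * while .keep files exclude only if --honor-pack-keep was given.
 */
static bool exclude_from_new_pack(const struct pack_sketch *p,
				  bool honor_pack_keep)
{
	assert(p != NULL);
	return p->pack_keep_in_core || (honor_pack_keep && p->pack_keep);
}
```

With a single flag, the second function could not tell a .keep pack from a --keep-pack pack when --honor-pack-keep is absent, which is the case the message describes.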
    • pack-objects: shrink delta_size field in struct object_entry · 0aca34e8
      Committed by Nguyễn Thái Ngọc Duy
      Allowing a delta size of 64 bits is crazy. Shrink this field down to
      20 bits with one overflow bit.
      
      If we find an existing delta larger than 1MB, we do not cache
      delta_size at all and will get the value from oe_size(), potentially
      from disk if it's larger than 4GB.
      
      Note, since DELTA_SIZE() is used in try_delta() code, it must be
      thread-safe. Luckily oe_size() does guarantee this, so it is
      thread-safe.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
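A minimal sketch of a 20-bit cached size with a validity bit might look like this. The struct `entry_sketch`, the `full_size` field (standing in for the slower oe_size() lookup) and both accessors are assumptions made for illustration; Git's real struct and helpers differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DELTA_SIZE_BITS 20 /* width taken from the message above */

/* Hypothetical, trimmed-down entry; the real struct has many more fields. */
struct entry_sketch {
	uint64_t full_size;            /* stands in for the slow size lookup */
	unsigned delta_size_ : DELTA_SIZE_BITS;
	unsigned delta_size_valid : 1; /* cleared when the value did not fit */
};

static void set_delta_size(struct entry_sketch *e, uint64_t size)
{
	assert(e != NULL);
	if (size < (UINT64_C(1) << DELTA_SIZE_BITS)) {
		e->delta_size_ = (unsigned)size;
		e->delta_size_valid = 1;
	} else {
		/* Too large to cache: keep it only in the slow location. */
		e->full_size = size;
		e->delta_size_valid = 0;
	}
}

static uint64_t get_delta_size(const struct entry_sketch *e)
{
	assert(e != NULL);
	return e->delta_size_valid ? e->delta_size_ : e->full_size;
}
```

Sizes below 1MB are served from the bit-field; anything larger takes the slow path, which is the trade-off the message accepts for the memory saving.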
    • pack-objects: shrink size field in struct object_entry · ac77d0c3
      Committed by Nguyễn Thái Ngọc Duy
      It's very very rare that an uncompressed object is larger than 4GB
      (partly because Git does not handle those large files very well to
      begin with). Let's optimize it for the common case where object size
      is smaller than this limit.
      
      Shrink size field down to 31 bits and one overflow bit. If the size is
      too large, we read it back from disk. As noted in the previous patch,
      we need to return the delta size instead of canonical size when the
      to-be-reused object entry type is a delta instead of a canonical one.
      
      Add two compare helpers that can take advantage of the overflow
      bit (e.g. if the file is 4GB+, chances are it's already larger than
      core.bigFileThreshold and there's no point in comparing the actual
      value).
      
      Another note about oe_get_size_slow(). This function MUST be thread
      safe because SIZE() macro is used inside try_delta() which may run in
      parallel. Outside parallel code, no-contention locking should be dirt
      cheap (or insignificant compared to i/o access anyway). To exercise
      this code, it's best to run the test suite with something like
      
          make test GIT_TEST_OE_SIZE=4
      
      which forces this code on all objects larger than 3 bytes.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
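The idea behind a compare helper that exploits the overflow bit can be sketched as below. Everything here is a hypothetical illustration: `sized_entry`, `size_on_disk`, `get_size_slow()` and the `slow_lookups` counter are invented names that simulate the disk read, and only the 31-bit width is taken from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIZE_BITS 31 /* width taken from the message above */
#define SIZE_LIMIT (UINT64_C(1) << SIZE_BITS)

struct sized_entry {
	unsigned size_ : SIZE_BITS;
	unsigned size_valid : 1;
	uint64_t size_on_disk; /* stand-in for re-reading the size from disk */
};

static int slow_lookups; /* counts simulated disk reads */

static void set_size(struct sized_entry *e, uint64_t size)
{
	assert(e != NULL);
	e->size_on_disk = size;
	e->size_valid = size < SIZE_LIMIT;
	if (e->size_valid)
		e->size_ = (unsigned)size;
}

static uint64_t get_size_slow(const struct sized_entry *e)
{
	slow_lookups++;
	return e->size_on_disk;
}

/* Is the entry's size below limit? Avoids the slow path when possible. */
static bool size_less_than(const struct sized_entry *e, uint64_t limit)
{
	assert(e != NULL);
	if (e->size_valid)
		return e->size_ < limit;
	if (limit < SIZE_LIMIT) /* limit < 2^31 <= actual size */
		return false;
	return get_size_slow(e) < limit;
}
```

When the stored size overflowed and the limit is itself below the bit-field capacity, the answer is known without any lookup; the slow path is only needed when the limit is also very large.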
    • pack-objects: clarify the use of object_entry::size · 27a7d067
      Committed by Nguyễn Thái Ngọc Duy
      While this field most of the time contains the canonical object size,
      there is one case it does not: when we have found that the base object
      of the delta in question is also to be packed, we will very happily
      reuse the delta by copying it over instead of regenerating the new
      delta.
      
      "size" in this case will record the delta size, not canonical object
      size. Later on in write_reuse_object(), we reconstruct the delta
      header and "size" is used for this purpose. When this happens, the
      "type" field contains a delta type instead of a canonical type.
      Highlight this in the code since it could be tricky to see.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • pack-objects: don't check size when the object is bad · 660b3735
      Committed by Nguyễn Thái Ngọc Duy
      sha1_object_info() in check_objects() may fail to locate an object in
      the pack and return type OBJ_BAD. In that case, it will likely leave
      the "size" field untouched. We delay error handling until later in
      prepare_pack() though. Until then, do not touch "size" field.
      
      This field should contain the default value zero, but we can't say
      sha1_object_info() cannot damage it. This becomes more important later
      when the object size may have to be retrieved back from the
      (non-existing) pack.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • pack-objects: shrink z_delta_size field in struct object_entry · 0cb3c142
      Committed by Nguyễn Thái Ngọc Duy
      We only cache a delta when it is smaller than a certain limit. This limit
      defaults to 1000, but we save the compressed length in a 64-bit field.
      Shrink that field down to 20 bits, so you can only cache 1MB deltas.
      Larger deltas must be recomputed when the pack is written.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • pack-objects: refer to delta objects by index instead of pointer · 898eba5e
      Committed by Nguyễn Thái Ngọc Duy
      These delta pointers always point to elements in the objects[] array
      in packing_data struct. We can hold at most 4G of those objects
      because the array size in nr_objects is uint32_t. We could use
      uint32_t indexes to address these elements instead of pointers. On
      64-bit architectures (8 bytes per pointer) this would save 4 bytes per
      pointer.
      
      Convert these delta pointers to indexes. Since we need to handle NULL
      pointers as well, the index is shifted by one [1].
      
      [1] This means we can only index 2^32-2 objects even though nr_objects
          could contain 2^32-1 objects. It should not be a problem in
          practice because when we grow objects[], nr_alloc would probably
          blow up long before nr_objects hits the wall.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
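The shifted-index encoding described in this message can be sketched as follows. The types `obj_sketch` and `packing_sketch` and the two accessors are hypothetical and much smaller than Git's real structures; they only show how zero can stand for NULL when indexes are stored as index plus one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct obj_sketch {
	uint32_t delta_idx; /* 0 means "no delta base"; otherwise index + 1 */
};

struct packing_sketch {
	struct obj_sketch *objects;
	uint32_t nr_objects;
};

/* Decode the stored index back into a pointer, or NULL for 0. */
static struct obj_sketch *get_delta(const struct packing_sketch *pack,
				    const struct obj_sketch *e)
{
	if (!e->delta_idx)
		return NULL;
	assert(e->delta_idx <= pack->nr_objects);
	return &pack->objects[e->delta_idx - 1];
}

/* Encode a pointer into objects[] as index + 1; NULL becomes 0. */
static void set_delta(const struct packing_sketch *pack,
		      struct obj_sketch *e, struct obj_sketch *base)
{
	if (!base) {
		e->delta_idx = 0;
		return;
	}
	ptrdiff_t idx = base - pack->objects;
	assert(idx >= 0 && (uint64_t)idx < pack->nr_objects);
	e->delta_idx = (uint32_t)idx + 1;
}
```

Because one value is reserved for NULL, the largest usable index is one less than what uint32_t could otherwise address, which matches footnote [1] in the message.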
    • pack-objects: move in_pack out of struct object_entry · 43fa44fa
      Committed by Nguyễn Thái Ngọc Duy
      Instead of using 8 bytes (on 64-bit architectures) to store a pointer
      to a pack, use an index, since the number of packs should be
      relatively small.
      
      This limits the number of packs we can handle to 1k. Since we can't be
      sure people will never run into the situation where they have more than
      1k pack files, provide a fallback route for it.
      
      If we find out they have too many packs, the new in_pack_by_idx[]
      array (which has at most 1k elements) will not be used. Instead we
      allocate in_pack[] array that holds nr_objects elements. This is
      similar to how the optional in_pack_pos field is handled.
      
      The new simple test is just to make sure the too-many-packs code path
      is at least executed. The true test is running
      
          make test GIT_TEST_FULL_IN_PACK_ARRAY=1
      
      to take advantage of other special case tests.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
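The fallback between a small per-pack table and a per-object array might be sketched like this. The array names in_pack_by_idx[] and in_pack[] come from the message; the 10-bit width, the structs `pack_stub`, `entry2` and `packing2`, the reserved slot 0, and all three functions are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define IN_PACK_BITS 10 /* assumed: 2^10 slots, roughly "1k packs" */
#define MAX_INDEXED_PACKS (1u << IN_PACK_BITS)

struct pack_stub { int id; };

struct entry2 { unsigned in_pack_idx : IN_PACK_BITS; }; /* 0 = no pack */

struct packing2 {
	struct entry2 *objects;
	uint32_t nr_objects;
	struct pack_stub **in_pack_by_idx; /* small table, used if packs fit */
	struct pack_stub **in_pack;        /* fallback: one pointer per object */
};

/* Pick the compact table when the packs fit, else the per-object array. */
static int packing_init(struct packing2 *pd, uint32_t nr_objects,
			size_t nr_packs)
{
	pd->nr_objects = nr_objects;
	pd->objects = calloc(nr_objects, sizeof(*pd->objects));
	pd->in_pack_by_idx = NULL;
	pd->in_pack = NULL;
	if (nr_packs < MAX_INDEXED_PACKS) /* slot 0 is reserved for "none" */
		pd->in_pack_by_idx = calloc(MAX_INDEXED_PACKS,
					    sizeof(*pd->in_pack_by_idx));
	else
		pd->in_pack = calloc(nr_objects, sizeof(*pd->in_pack));
	return pd->objects && (pd->in_pack_by_idx || pd->in_pack) ? 0 : -1;
}

/* pack_idx is 1-based; it is ignored on the fallback path. */
static void set_in_pack(struct packing2 *pd, uint32_t obj,
			struct pack_stub *p, unsigned pack_idx)
{
	assert(obj < pd->nr_objects);
	if (pd->in_pack_by_idx) {
		assert(pack_idx >= 1 && pack_idx < MAX_INDEXED_PACKS);
		pd->in_pack_by_idx[pack_idx] = p;
		pd->objects[obj].in_pack_idx = pack_idx;
	} else {
		pd->in_pack[obj] = p;
	}
}

static struct pack_stub *get_in_pack(const struct packing2 *pd, uint32_t obj)
{
	assert(obj < pd->nr_objects);
	if (pd->in_pack_by_idx)
		return pd->in_pack_by_idx[pd->objects[obj].in_pack_idx];
	return pd->in_pack[obj];
}
```

In the common case each object carries only a 10-bit index; the per-object pointer array is allocated only when there are too many packs, so its cost is paid solely by repositories that need it.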
    • pack-objects: move in_pack_pos out of struct object_entry · 06af3bba
      Committed by Nguyễn Thái Ngọc Duy
      This field is only needed for pack-bitmap, which is an optional
      feature. Move it to a separate array that is only allocated when
      pack-bitmap is used (like objects[], it is not freed, since we need it
      until the end of the process).
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • pack-objects: use bitfield for object_entry::depth · b5c0cbd8
      Committed by Nguyễn Thái Ngọc Duy
      Because of struct packing, from now on we can only handle a max depth
      of 4095 (or even lower when new booleans are added to this struct). This
      should be ok since long delta chains cause significant slowdown
      anyway.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • N
    • pack-objects: turn type and in_pack_type to bitfields · fd9b1bae
      Committed by Nguyễn Thái Ngọc Duy
      An extra field type_valid is added to carry the equivalent of OBJ_BAD
      in the original "type" field. in_pack_type always contains a valid
      type so we only need 3 bits for it.
      
      A note about accepting OBJ_NONE as a "valid" type. The function
      read_object_list_from_stdin() can pass this value [1] and it
      eventually calls create_object_entry(), where the current code skips
      setting the "type" field if the incoming type is zero. This does not have
      any bad side effects because the "type" field should be memset()'d anyway.
      
      But since we also need to set type_valid now, skipping oe_set_type()
      leaves type_valid zero/false, which will make oe_type() return
      OBJ_BAD, not OBJ_NONE anymore. Apparently we do care about OBJ_NONE in
      prepare_pack(). This switch from OBJ_NONE to OBJ_BAD may trigger
      
          fatal: unable to get type of object ...
      
      Accepting OBJ_NONE [2] does sound wrong, but this is how it has
      been for a very long time and I haven't had time to dig in further.
      
      [1] See 5c49c116 (pack-objects: better check_object() performances -
          2007-04-16)
      
      [2] 21666f1a (convert object type handling from a string to a number
          - 2007-02-26)
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
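The interaction between a narrow type bit-field and a separate type_valid bit can be sketched as below. The enum values are redefined locally so the example is self-contained, and `typed_entry`, `set_type()` and `get_type()` are hypothetical stand-ins for the helpers the message calls oe_set_type() and oe_type().

```c
#include <assert.h>
#include <stddef.h>

/* Local copy of the relevant object type values, for illustration only. */
enum object_type {
	OBJ_BAD = -1,
	OBJ_NONE = 0,
	OBJ_COMMIT = 1,
	OBJ_TREE = 2,
	OBJ_BLOB = 3,
	OBJ_TAG = 4
};

#define TYPE_BITS 3

struct typed_entry {
	unsigned type_ : TYPE_BITS;
	unsigned type_valid : 1; /* 0 carries the meaning of OBJ_BAD */
};

static void set_type(struct typed_entry *e, enum object_type type)
{
	assert(e != NULL);
	if (type >= OBJ_NONE) {
		/* OBJ_NONE is accepted as "valid", as the message explains. */
		e->type_ = (unsigned)type;
		e->type_valid = 1;
	} else {
		e->type_valid = 0;
	}
}

static enum object_type get_type(const struct typed_entry *e)
{
	assert(e != NULL);
	return e->type_valid ? (enum object_type)e->type_ : OBJ_BAD;
}
```

A zero-initialized entry on which set_type() was never called reads back as OBJ_BAD rather than OBJ_NONE, which is the behavior change the message warns about when the setter is skipped.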
  5. 03 April, 2018 2 commits
  6. 27 March, 2018 3 commits
  7. 15 March, 2018 4 commits
  8. 07 March, 2018 1 commit
  9. 15 February, 2018 1 commit
  10. 03 February, 2018 1 commit
  11. 31 January, 2018 1 commit
  12. 25 January, 2018 1 commit
  13. 09 December, 2017 1 commit
  14. 22 November, 2017 1 commit
    • pack-objects: add list-objects filtering · 9535ce73
      Committed by Jeff Hostetler
      Teach pack-objects to use the filtering provided by the
      traverse_commit_list_filtered() interface to omit unwanted
      objects from the resulting packfile.
      
      Filtering requires the use of the "--stdout" option.
      
      Add t5317 test.
      
      In the future, we will introduce a "partial clone" mechanism
      wherein an object in a repo, obtained from a remote, may
      reference a missing object that can be dynamically fetched from
      that remote once needed.  This "partial clone" mechanism will
      have a way, sometimes slow, of determining if a missing link
      is one of the links expected to be produced by this mechanism.
      
      This patch introduces handling of missing objects to help
      debugging and development of the "partial clone" mechanism,
      and once the mechanism is implemented, for a power user to
      perform operations that are missing-object aware without
      incurring the cost of checking if a missing link is expected.
      Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
      Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  15. 16 October, 2017 3 commits
  16. 10 October, 2017 1 commit
  17. 01 October, 2017 1 commit
  18. 22 September, 2017 1 commit