1. 03 Nov 2008, 1 commit
    • close another possibility for propagating pack corruption · 0e8189e2
      Committed by Nicolas Pitre
      Abstract
      --------
      
      With index v2 we have a per-object CRC that allows quick and safe reuse
      of pack data when repacking.  This, however, doesn't currently prevent
      stealth corruption from being propagated into a new pack when _not_
      reusing pack data, as demonstrated by the modification to t5302 included
      here.
      
      The Context
      -----------
      
      The Git database is entirely checksummed with SHA1 hashes.  Any kind of
      corruption can be confirmed by verifying this per-object hash against
      the corresponding data.  However, doing so systematically is costly,
      and therefore this check is often not performed at run time when
      accessing the object database.
      
      First, the loose object format is entirely compressed with zlib, which
      already provides a checksum verification of its own when inflating data.
      Any disk corruption would be caught in that case.
      
      Then, packed objects are also compressed with zlib, but only for their
      actual payload.  The object headers and delta base references are not
      deflated, for obvious performance reasons; however, this leaves them
      vulnerable to potentially undetected disk corruption.  Object types
      are often validated against the expected type when they're requested,
      and the size after inflation must always match the size recorded in the
      object header, so those cases are pretty much covered as well.
      
      Where corruption could go unnoticed is in the delta base reference.
      Of course, in the OBJ_REF_DELTA case, the odds of a SHA1 reference
      getting corrupted so that it actually matches the SHA1 of another object
      with the same size (the delta header stores the expected size of the
      base object to apply against) are virtually zero.  In the OBJ_OFS_DELTA
      case, the reference is a pack offset which would have to match the start
      boundary of a different base object, still with the same size; although
      this is relatively much more "probable" than the OBJ_REF_DELTA case, the
      probability is also about zero in absolute terms.  Still, the possibility
      exists, as demonstrated in t5302, and is certainly greater than that of
      a SHA1 collision, especially in the OBJ_OFS_DELTA case, which is now the
      default when repacking.
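
      To make the OBJ_OFS_DELTA scenario concrete, here is a minimal Python
      sketch of the variable-length base-offset encoding that pack files use
      for such entries (the function names are mine, not git's).  The point is
      that any byte sequence of this form decodes to *some* offset, so a
      corrupted reference still parses cleanly and only the base-size check
      stands in its way:

```python
def encode_ofs_delta_offset(value):
    """Encode a base offset the way OBJ_OFS_DELTA entries store it:
    big-endian groups of 7 bits, continuation bit (0x80) set on every
    byte but the last, with an implicit +1 folded into each
    continuation step so that no offset has two encodings."""
    out = [value & 0x7F]
    value >>= 7
    while value:
        value -= 1                      # undo the implicit +1
        out.append(0x80 | (value & 0x7F))
        value >>= 7
    return bytes(reversed(out))

def decode_ofs_delta_offset(buf):
    """Decode the offset back; returns (offset, bytes consumed)."""
    c = buf[0]
    value = c & 0x7F
    i = 1
    while c & 0x80:
        c = buf[i]
        i += 1
        value = ((value + 1) << 7) | (c & 0x7F)
    return value, i
```

      A single flipped bit in such a reference still yields a well-formed
      offset pointing somewhere else in the pack, which is why the odds of a
      silently accepted corruption, while tiny, are not zero.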
      
      Again, repacking by reusing existing pack data is OK, since the per-object
      CRC provided by index v2 guards against any such corruption.  What t5302
      failed to test is a full repack in such a case.
      
      The Solution
      ------------
      
      As unlikely as this kind of stealth corruption may be in practice, it
      certainly isn't acceptable to propagate it into a freshly created pack.
      But because it is so unlikely, we don't want to pay the run-time cost of
      extra validation checks all the time either.  Furthermore, the
      consequences of such corruption in anything but repacking should be
      rather visible; even if quite unpleasant, they are far less severe than
      actively creating bad packs.
      
      So the best compromise is to check the packed object CRC when unpacking
      objects, but only during the compression/writing phase of a repack, and
      only when not streaming the result.  The cost of this is minimal (less
      than 1% CPU time), and visible only with a full repack.
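
      As a sketch of the idea only (not git's actual code; the helper and its
      arguments are hypothetical), the check amounts to recomputing a CRC32
      over the raw on-disk bytes of a packed object and comparing it with the
      per-object value recorded in the version-2 pack index:

```python
import zlib

def packed_object_crc_ok(pack_data, offset, size, recorded_crc):
    """Recompute CRC32 over the raw on-disk representation of one
    packed object (its header plus deflated payload) and compare it
    with the per-object CRC stored in the version-2 pack index.
    Returns True when the bytes are intact."""
    raw = pack_data[offset:offset + size]
    return (zlib.crc32(raw) & 0xFFFFFFFF) == recorded_crc
```

      Because only bytes that are already being read for repacking get hashed,
      the overhead stays small, consistent with the less-than-1%-CPU figure
      above.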
      
      Someone with a statistics background could provide an objective
      evaluation of this, but I suspect that bad RAM has more potential for
      data corruption at this point, even in those cases where this extra
      check is not performed.  Still, it is best to close a known hole for
      corruption when recreating object data into a new pack.
      
      What about the streamed pack case?  Well, any client receiving a pack
      must always consider that pack untrusted and perform full validation
      anyway, so no such stealth corruption could be propagated to remote
      repositories in the first place.  It is therefore pointless to do local
      validation in that case.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  2. 23 Oct 2008, 1 commit
  3. 22 Oct 2008, 1 commit
  4. 04 Sep 2008, 1 commit
  5. 14 Jul 2008, 1 commit
  6. 25 Jun 2008, 1 commit
  7. 14 May 2008, 1 commit
  8. 13 Mar 2008, 1 commit
  9. 02 Feb 2008, 1 commit
    • Sane use of test_expect_failure · 41ac414e
      Committed by Junio C Hamano
      Originally, test_expect_failure was designed to be the opposite
      of test_expect_success, but this was a bad decision.  Most tests
      run a series of commands that leads to the single command that
      needs to be tested, like this:
      
          test_expect_{success,failure} 'test title' '
              setup1 &&
              setup2 &&
              setup3 &&
              what is to be tested
          '
      
      And expecting a failure exit from the whole sequence misses the
      point of writing tests.  Your setup$N steps, which are supposed to
      succeed, may have failed without even reaching what you are
      trying to test.  The only valid use of test_expect_failure is to
      check a trivial single command that is expected to fail, which
      is a minority among tests of Porcelain-ish commands.
      
      This large-ish patch rewrites all uses of test_expect_failure to
      use test_expect_success and rewrites the condition of what is
      tested, like this:
      
          test_expect_success 'test title' '
              setup1 &&
              setup2 &&
              setup3 &&
              ! this command should fail
          '
      
      test_expect_failure is redefined to serve as a reminder that the
      test *should* succeed but, due to a known breakage in git, it
      currently does not pass.  So if the git-foo command should create a
      file 'bar' but you discovered a bug that it doesn't, you can
      write a test like this:
      
          test_expect_failure 'git-foo should create bar' '
              rm -f bar &&
              git foo &&
              test -f bar
          '
      
      This construct acts similarly to test_expect_success, but instead
      of reporting "ok/FAIL" as test_expect_success does, the
      outcome is reported as "FIXED/still broken".
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  10. 16 Nov 2007, 1 commit
  11. 15 Nov 2007, 1 commit
  12. 03 Jul 2007, 1 commit
  13. 03 May 2007, 1 commit
  14. 24 Apr 2007, 1 commit
  15. 12 Apr 2007, 1 commit