1. 23 Feb, 2006 6 commits
    • J
      Make "empty ident" error message a bit more helpful. · 2fb4a210
      Junio C Hamano committed
      It appears that some people who did not care about having bogus
      names in their own commit messages are bitten by the recent
      change to require a sane environment [*1*].
      
      While it was a good idea to prevent people from using bogus
      names to create commits and doing sign-offs, the error message
      is not very informative.  This patch attempts to warn about the
      problem up front and hint at how people can fix their environments.
      
      [Footnote]
      
      *1* The thread is this one.
      
          http://marc.theaimsgroup.com/?t=113868084800004
      
          Especially this message.
      
          http://marc.theaimsgroup.com/?m=113932830015032
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      2fb4a210
    • J
      pack-objects: avoid delta chains that are too long. · 15b4d577
      Junio C Hamano committed
      This tries to rework the solution for the excess delta chain
      problem. An earlier commit worked around it ``cheaply'', but
      repeated repacking risks unbounded growth of delta chains.
      
      This version counts the length of delta chain we are reusing
      from the existing pack, and makes sure a base object that has
      sufficiently long delta chain does not get deltified.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      15b4d577
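The counting described above can be sketched roughly as follows. This is an illustrative Python model, not the C code from the commit: the names `longest_chain_on` and `may_deltify`, the `base_of` mapping, and the `limit` threshold are all hypothetical, and the message does not say which threshold counts as "sufficiently long".

```python
from typing import Dict, Optional

def longest_chain_on(base_of: Dict[str, Optional[str]]) -> Dict[str, int]:
    """Map each object to the length of the longest reused delta chain built on it.

    base_of[x] is the delta base of x in the existing pack, or None when x is
    stored in full. Every base is assumed to appear as a key.
    """
    longest = {name: 0 for name in base_of}
    for name in base_of:
        hops, cur = 0, name
        while base_of[cur] is not None:
            cur = base_of[cur]
            hops += 1
            longest[cur] = max(longest[cur], hops)
    return longest

def may_deltify(name: str, longest: Dict[str, int], limit: int) -> bool:
    """A base that already carries a chain of `limit` or more is left undeltified."""
    return longest[name] < limit

# A<--B<--C<--D are reused deltas; E is stored in full and carries no chain.
base_of = {"A": None, "B": "A", "C": "B", "D": "C", "E": None}
longest = longest_chain_on(base_of)
```

With this input A carries a chain of length 3, so `may_deltify("A", longest, 3)` is false while the same call for E is true.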
    • J
      git-repack: allow passing a couple of flags to pack-objects. · 4181bda1
      Junio C Hamano committed
      A new flag -q makes underlying pack-objects less chatty.
      A new flag -f forces delta to be recomputed from scratch.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      4181bda1
    • J
      pack-objects: finishing touches. · ab7cd7bb
      Junio C Hamano committed
      This introduces --no-reuse-delta option to disable reusing of
      existing delta, which is a large part of the optimization
      introduced by this series.  This may become necessary if
      repeated repacking makes delta chain too long.  With this, the
      output of the command becomes identical to that of the older
      implementation.  But the performance suffers greatly.
      
      It still allows reusing non-deltified representations; there is
      no point uncompressing and recompressing the whole text.
      
      It also adds a couple more statistics output, while squelching
      it under -q flag, which the last round forgot to do.
      
        $ time old-git-pack-objects --stdout >/dev/null <RL
        Generating pack...
        Done counting 184141 objects.
        Packing 184141 objects....................
        real    12m8.530s       user    11m1.450s       sys     0m57.920s
        $ time git-pack-objects --stdout >/dev/null <RL
        Generating pack...
        Done counting 184141 objects.
        Packing 184141 objects.....................
        Total 184141, written 184141 (delta 138297), reused 178833 (delta 134081)
        real    0m59.549s       user    0m56.670s       sys     0m2.400s
        $ time git-pack-objects --stdout --no-reuse-delta >/dev/null <RL
        Generating pack...
        Done counting 184141 objects.
        Packing 184141 objects.....................
        Total 184141, written 184141 (delta 134833), reused 47904 (delta 0)
        real    11m13.830s      user    9m45.240s       sys     0m44.330s
      
      There is one remaining issue when --no-reuse-delta option is not
      used.  It can create delta chains that are deeper than specified.
      
          A<--B<--C<--D   E   F   G
      
      Suppose we have a delta chain A to D (A is stored in full either
      in a pack or as a loose object. B is depth1 delta relative to A,
      C is depth2 delta relative to B...) with loose objects E, F, G.
      And we are going to pack all of them.
      
      B, C and D are left as delta against A, B and C respectively.
      So A, E, F, and G are examined for deltification, and let's say
      we decided to keep E expanded, and store the rest as deltas like
      this:
      
          E<--F<--G<--A
      
      Oops.  We ended up making D a bit too deep, didn't we?  B, C and
      D form a chain on top of A!
      
      This is because we did not know what the final depth of A would
      be, when we checked objects and decided to keep the existing
      delta.  Unfortunately, deferring the decision until just before
      the deltification is not an option.  To be able to make B, C,
      and D candidates for deltification with the rest, we need to
      know the type and final unexpanded size of them, but the major
      part of the optimization comes from the fact that we do not read
      the delta data to do so -- getting the final size is quite an
      expensive operation.
      
      To prevent this from happening, we should keep A from being
      deltified.  But how would we tell that, cheaply?
      
      To do this most precisely, after check_object() runs, each
      object that is used as the base object of some existing delta
      needs to be marked with the maximum depth of the objects we
      decided to keep deltified (in this case, D is depth 3 relative
      to A, so if no other delta chain that is longer than 3 based on
      A exists, mark A with 3).  Then when attempting to deltify A, we
      would take that number into account to see if the final delta
      chain that leads to D becomes too deep.
      
      However, this is a bit cumbersome to compute, so we would cheat
      and reduce the maximum depth for A arbitrarily to depth/4 in
      this implementation.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      ab7cd7bb
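The A..G example from the message can be worked through in a few lines. This is a sketch only: the helper names are hypothetical, the maximum depth of 10 is an assumed value, and integer division for depth/4 is an assumption about how the shortcut rounds.

```python
from typing import Dict, Optional

def final_depth(name: str, base_of: Dict[str, Optional[str]]) -> int:
    """Count delta hops from name down to an object stored in full."""
    depth = 0
    while base_of[name] is not None:
        name = base_of[name]
        depth += 1
    return depth

def precise_budget(max_depth: int, longest_chain_on_base: int) -> int:
    """Deepest position a base may take without pushing its chain past max_depth."""
    return max_depth - longest_chain_on_base

def cheap_budget(max_depth: int) -> int:
    """The shortcut from the message: cap such a base at depth/4."""
    return max_depth // 4

# Reused deltas B, C, D on top of A, plus the new chain E<--F<--G<--A.
reused = {"A": None, "B": "A", "C": "B", "D": "C"}
fresh = {"E": None, "F": "E", "G": "F", "A": "G"}
combined = {**reused, **fresh}  # A is now a delta against G

depth_of_a = final_depth("A", combined)
depth_of_d = final_depth("D", combined)
```

Here A lands at depth 3 and D at depth 6, three hops deeper than the reused chain suggested. With an assumed maximum of 10, the precise rule would allow A a depth of up to 7, while the shortcut caps it at 2.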
    • J
      pack-objects: reuse data from existing packs. · 3f9ac8d2
      Junio C Hamano committed
      When generating a new pack, notice if we have already needed
      objects in existing packs.  If an object is stored deltified,
      and its base object is also what we are going to pack, then
      reuse the existing deltified representation unconditionally,
      bypassing all the expensive find_deltas() and try_deltas()
      calls.
      
      Also, notice if what we are going to write out exactly matches
      what is already in an existing pack (either deltified or just
      compressed).  In such a case, we can just copy it instead of
      going through the usual uncompressing & recompressing cycle.
      
      Without this patch, in linux-2.6 repository with about 1500
      loose objects and a single mega pack:
      
          $ git-rev-list --objects v2.6.16-rc3 >RL
          $ wc -l RL
          184141 RL
          $ time git-pack-objects p <RL
          Generating pack...
          Done counting 184141 objects.
          Packing 184141 objects....................
          a1fc7b3e537fcb9b3c46b7505df859f0a11e79d2
      
          real    12m4.323s
          user    11m2.560s
          sys     0m55.950s
      
      With this patch, the same input:
      
          $ time ../git.junio/git-pack-objects q <RL
          Generating pack...
          Done counting 184141 objects.
          Packing 184141 objects.....................
          a1fc7b3e537fcb9b3c46b7505df859f0a11e79d2
          Total 184141, written 184141, reused 182441
      
          real    1m2.608s
          user    0m55.090s
          sys     0m1.830s
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      3f9ac8d2
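The reuse decision described in this message might be modelled as below. The `PackedEntry` type, the `plan` function, and the three result labels are hypothetical names for illustration; the real code works on pack entries in C and is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional, Set

@dataclass
class PackedEntry:
    """How an object is stored in an existing pack (hypothetical model)."""
    name: str
    delta_base: Optional[str]  # None when the object is stored in full

def plan(entry: Optional[PackedEntry], to_pack: Set[str]) -> str:
    """Decide how to write one object into the new pack."""
    if entry is None:
        # Loose object: nothing to reuse, go through the normal delta search.
        return "compute"
    if entry.delta_base is None:
        # Already compressed in a pack: copy the bytes, skip recompression.
        return "copy"
    if entry.delta_base in to_pack:
        # The base is also being packed, so the existing delta stays valid.
        return "reuse-delta"
    # The base is not part of this pack, so the old delta cannot be used.
    return "compute"

to_pack = {"A", "B", "E"}
```

Only the last two branches pay for the expensive delta search, which is consistent with the drop from about twelve minutes to about one minute in the timings above.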
    • J
      detect broken alternates. · 26125f6b
      Junio C Hamano committed
      The real problem that triggered an earlier fix was that an alternate
      entry was pointing at a removed directory.  Complaining on
      object/pack directory that cannot be opendir-ed produces noise
      in an ancient repository that does not have object/pack
      directory and has never been packed.
      
      Detect the real user error and report it.  Also if opendir
      failed for other reasons (e.g. no read permissions), report that
      as well.
      
      Spotted by Andrew Vasquez <andrew.vasquez@qlogic.com>.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      26125f6b
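The three cases in this message (object directory gone, pack directory never created, other failures) can be told apart as in the sketch below. The function name `scan_packs`, the `warn` callback, and the warning wording are assumptions made for illustration and are not the messages Git prints.

```python
import os
import tempfile
from typing import Callable, List

def scan_packs(objdir: str, warn: Callable[[str], None]) -> List[str]:
    """Return pack file names under objdir/pack, reporting only real problems."""
    if not os.path.isdir(objdir):
        # An alternate that points at a removed directory is a user error.
        warn(f"object directory {objdir} does not exist; check the alternates file")
        return []
    packdir = os.path.join(objdir, "pack")
    try:
        names = os.listdir(packdir)
    except FileNotFoundError:
        # A repository that has never been packed: stay quiet.
        return []
    except OSError as err:
        # Anything else (e.g. no read permission) is worth reporting.
        warn(f"unable to open {packdir}: {err.strerror}")
        return []
    return sorted(n for n in names if n.endswith(".pack"))

# Small demonstration in a scratch directory.
root = tempfile.mkdtemp()
messages: List[str] = []
missing = scan_packs(os.path.join(root, "gone"), messages.append)
objdir = os.path.join(root, "objects")
os.mkdir(objdir)
unpacked = scan_packs(objdir, messages.append)
os.mkdir(os.path.join(objdir, "pack"))
open(os.path.join(objdir, "pack", "pack-1.pack"), "w").close()
packed = scan_packs(objdir, messages.append)
```

In the demonstration only the first call produces a warning; the never-packed directory is silent.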
  2. 22 Feb, 2006 3 commits
  3. 19 Feb, 2006 2 commits
    • M
      Fix retries in git-cvsimport · 39ba7d54
      Martin Mares committed
      Fixed a couple of bugs in recovering from broken connections:
      
      The _line() method now returns undef correctly when the connection
      is broken instead of falling off the function and returning garbage.
      
      Retries are now reported to stderr and the eventual partially
      downloaded file is discarded instead of being appended to.
      
      The "Server gone away" test has been removed, because it was
      reachable only if the garbage return bug bit.
      Signed-off-by: Martin Mares <mj@ucw.cz>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      39ba7d54
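git-cvsimport itself is a Perl script; the sketch below only restates the two fixes in Python under assumed names (`read_line`, `fetch_with_retries`, and the `download` callback are hypothetical). It shows an explicit "no data" return on a broken connection, and a retry loop that logs to stderr and starts from an empty buffer each time.

```python
import io
import sys
from typing import Callable, Optional, TextIO

def read_line(stream: io.BufferedIOBase) -> Optional[bytes]:
    """Return the next line, or None explicitly when the connection is gone."""
    line = stream.readline()
    if not line:
        return None
    return line

def fetch_with_retries(download: Callable[[io.BytesIO], None],
                       attempts: int = 3,
                       log: TextIO = sys.stderr) -> bytes:
    """Call download(buffer) until it succeeds, discarding partial data."""
    for attempt in range(1, attempts + 1):
        buf = io.BytesIO()  # fresh buffer: a partial file is never appended to
        try:
            download(buf)
            return buf.getvalue()
        except ConnectionError as err:
            print(f"retry {attempt}/{attempts}: {err}", file=log)
    raise ConnectionError(f"giving up after {attempts} attempts")

# Demonstration: the first attempt writes part of the file, then fails.
calls = []
def flaky(buf: io.BytesIO) -> None:
    calls.append(1)
    if len(calls) == 1:
        buf.write(b"part")
        raise ConnectionError("connection reset")
    buf.write(b"complete")

log = io.StringIO()
data = fetch_with_retries(flaky, log=log)
```

The result contains only the bytes from the successful attempt, and the failed attempt leaves one line in the log.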
    • E
      archimport: remove files from the index before adding/updating · 3ff903bf
      Eric Wong committed
      This fixes a bug when importing where a directory gets removed/renamed
      but is immediately replaced by a file of the same name in the same
      changeset.
      
      This fix only applies to the accurate (default) strategy at the moment.
      
      This patch should also fix the fast strategy if/when it is updated
      to handle the cases that would've triggered this bug.
      
      This bug was originally found in git-svn, but I remembered I did the
      same thing with archimport as well.
      Signed-off-by: Eric Wong <normalperson@yhbt.net>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      3ff903bf
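Why the order matters can be seen with a toy model of the index. `ToyIndex` is hypothetical and far simpler than Git's index; it only enforces that a path cannot be both a file and a directory prefix at the same time.

```python
from typing import Iterable, List, Set

class ToyIndex:
    """A set of file paths that rejects file/directory name clashes."""

    def __init__(self, paths: Iterable[str] = ()) -> None:
        self.paths: Set[str] = set(paths)

    def add(self, path: str) -> None:
        for existing in self.paths:
            if existing.startswith(path + "/") or path.startswith(existing + "/"):
                raise ValueError(f"{path} conflicts with {existing}")
        self.paths.add(path)

    def remove(self, path: str) -> None:
        self.paths.discard(path)

def apply_changeset(index: ToyIndex, removed: List[str], added: List[str]) -> None:
    """Process removals first so a directory can be replaced by a file."""
    for path in removed:
        index.remove(path)
    for path in added:
        index.add(path)

# Directory foo/ is removed and a file named foo appears in the same changeset.
good = ToyIndex(["foo/a", "foo/b"])
apply_changeset(good, removed=["foo/a", "foo/b"], added=["foo"])

# Adding before removing hits the clash.
bad = ToyIndex(["foo/a", "foo/b"])
try:
    bad.add("foo")
    clash = False
except ValueError:
    clash = True
```

With removals first the changeset applies cleanly; with the add first the toy index rejects it.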
  4. 18 Feb, 2006 6 commits
  5. 16 Feb, 2006 2 commits
  6. 15 Feb, 2006 1 commit
  7. 14 Feb, 2006 5 commits
  8. 13 Feb, 2006 8 commits
  9. 12 Feb, 2006 7 commits
    • J
      hashtable-based objects: minimum fixups. · 2b796360
      Junio C Hamano committed
      Calling hashtable_index from find_object before objs is created
      would result in division by zero failure.  Avoid it.
      
      Also the given object name may not be aligned suitably for
      unsigned int; avoid dereferencing casted pointer.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      2b796360
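Both fixups have a direct analogue in a small Python model. This is a sketch under assumptions: the function bodies, the slot layout, and the little-endian byte order are made up for illustration, and Python has no alignment concerns of its own. Reading the leading bytes with `int.from_bytes` plays the role of copying bytes into an unsigned int instead of dereferencing a cast pointer, and the emptiness check avoids a modulo by zero.

```python
from typing import List, Optional, Tuple

Slot = Optional[Tuple[bytes, str]]

def hashtable_index(sha1: bytes, table_size: int) -> int:
    """Derive a slot index from the first four bytes of the object name."""
    word = int.from_bytes(sha1[:4], "little")  # byte-wise read, no pointer cast
    return word % table_size

def find_object(sha1: bytes, table: List[Slot]) -> Optional[str]:
    """Look up sha1 with linear probing; safe to call before the table exists."""
    if not table:
        return None  # avoids `word % 0` when nothing has been allocated yet
    i = hashtable_index(sha1, len(table))
    for _ in range(len(table)):
        slot = table[i]
        if slot is None:
            return None
        if slot[0] == sha1:
            return slot[1]
        i = (i + 1) % len(table)
    return None

key = bytes([5, 0, 0, 0]) + bytes(16)  # a made-up 20-byte object name
table: List[Slot] = [None] * 4
table[hashtable_index(key, 4)] = (key, "blob")
```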
    • J
      Use a hashtable for objects instead of a sorted list · 070879ca
      Johannes Schindelin committed
      In a simple test, this brings down the CPU time from 47 sec to 22 sec.
      Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      070879ca
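A hashtable keyed by object name could look like the sketch below. The `ObjectTable` class, linear probing, the growth policy (double when more than half full), and the big-endian index are all assumptions for illustration; the commit message does not describe these details.

```python
import hashlib
from typing import List, Optional, Tuple

class ObjectTable:
    """Open-addressing table keyed by 20-byte object names (hypothetical)."""

    def __init__(self, size: int = 8) -> None:
        self.slots: List[Optional[Tuple[bytes, str]]] = [None] * size
        self.count = 0

    def _probe(self, sha1: bytes) -> int:
        """Return the slot holding sha1, or the empty slot where it belongs."""
        i = int.from_bytes(sha1[:4], "big") % len(self.slots)
        while self.slots[i] is not None and self.slots[i][0] != sha1:
            i = (i + 1) % len(self.slots)
        return i

    def _grow(self) -> None:
        old = [s for s in self.slots if s is not None]
        self.slots = [None] * (2 * len(self.slots))
        for entry in old:
            self.slots[self._probe(entry[0])] = entry

    def insert(self, sha1: bytes, obj: str) -> None:
        # Keep the table at most half full so probing always finds a free slot.
        if 2 * (self.count + 1) > len(self.slots):
            self._grow()
        i = self._probe(sha1)
        if self.slots[i] is None:
            self.count += 1
        self.slots[i] = (sha1, obj)

    def find(self, sha1: bytes) -> Optional[str]:
        slot = self.slots[self._probe(sha1)]
        return slot[1] if slot is not None else None

# Demonstration with 100 made-up object names.
keys = [hashlib.sha1(str(i).encode()).digest() for i in range(100)]
objects = ObjectTable()
for n, k in enumerate(keys):
    objects.insert(k, f"object-{n}")
```

Lookups take expected constant time here, whereas inserting into a sorted list has to shift entries; that is one plausible reading of the CPU time improvement reported above, though the message itself does not break the saving down.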
    • K
      Add howto about separating topics. · 5b766ea9
      kent@lysator.liu.se committed
      This howto consists of a footnote from an email by JC to the git
      mailing list (<7vfyms0x4p.fsf@assigned-by-dhcp.cox.net>).
      Signed-off-by: Kent Engstrom <kent@lysator.liu.se>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      5b766ea9
    • J
      Merge branch 'pb/repo' · af8c28e1
      Junio C Hamano committed
      * pb/repo:
        Add support for explicit type specifiers when calling git-repo-config
      af8c28e1
    • J
      Merge branch 'jc/fixdiff' · c611db19
      Junio C Hamano committed
      * jc/fixdiff:
        diff-tree: do not default to -c
      c611db19
    • J
      Avoid using "git-var -l" until it gets fixed. · 4890f62b
      Junio C Hamano committed
      This is to be nicer to people with unusable GECOS field.
      
      "git-var -l" is currently broken in that when used by a user who
      does not have a usable GECOS field and has not corrected it by
      exporting GIT_COMMITTER_NAME environment variable it dies when
      it tries to output GIT_COMMITTER_IDENT (same thing for AUTHOR).
      
      "git-pull" used "git-var -l" only because it needed to get a
      configuration variable before "git-repo-config --get" was
      introduced.  Use the latter tool designed exactly for this
      purpose.
      
      "git-sh-setup" used "git-var GIT_AUTHOR_IDENT" without actually
      wanting to use its value.  The only purpose was to cause the
      command to check and barf if the repository format version
      recorded in the $GIT_DIR/config file is too new for us to deal
      with correctly.  Instead, use "repo-config --get" on a random
      property and see if it die()s, and check if the exit status is
      128 (comes from die -- missing variable is reported with exit
      status 1, so we can tell that case apart).
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      4890f62b
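The exit-status distinction in the last paragraph (1 for a missing variable, 128 from die) can be expressed as a small classifier. The sketch does not invoke Git; `classify` and `run_and_classify` are hypothetical helpers, and the demonstration uses the Python interpreter to produce the two exit codes.

```python
import subprocess
import sys
from typing import List

def classify(status: int) -> str:
    """Interpret an exit status using the convention described in the message."""
    if status == 0:
        return "found"
    if status == 1:
        return "missing"  # the variable is simply not set
    if status == 128:
        return "died"     # die() was called, e.g. repository format too new
    return "unknown"

def run_and_classify(argv: List[str]) -> str:
    """Run a command quietly and classify how it exited."""
    done = subprocess.run(argv, stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL)
    return classify(done.returncode)

# Stand-ins for the two failure modes, produced without calling Git.
missing_case = run_and_classify([sys.executable, "-c", "raise SystemExit(1)"])
died_case = run_and_classify([sys.executable, "-c", "raise SystemExit(128)"])
```

A setup script following the message would treat only the "died" case as fatal and carry on when the variable is merely missing.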
    • P
      Add support for explicit type specifiers when calling git-repo-config · 7162dff3
      Petr Baudis committed
      Currently, git-repo-config will just return the raw value of option
      as specified in the config file; this makes things difficult for scripts
      calling it, especially if the value is supposed to be boolean.
      
      This patch makes it possible to ask git-repo-config to check if the option
      is of the given type (int or bool) and write out the value in its
      canonical form. If you do not pass --int or --bool, the behaviour stays
      unchanged and the raw value is emitted.
      
      This also incidentally fixes the segfault when option with no value is
      encountered.
      
      [jc: tweaked the option parsing a bit to make it easier to see
       that the patch does not change anything but the type stuff in
       the diff output.  Also changed to avoid "foo ? : bar" construct. ]
      Signed-off-by: Petr Baudis <pasky@suse.cz>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      7162dff3
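Writing a value out in canonical form might work along the lines below. This is a sketch, not the parser from the patch: the function names are hypothetical, the accepted spellings (true/yes/on, false/no/off) are an assumption, and treating an option with no value as true follows Git's documented config convention rather than anything stated in this message.

```python
from typing import Optional

def canonical_bool(raw: Optional[str]) -> str:
    """Normalize a raw config value to "true" or "false"."""
    if raw is None:
        return "true"  # assumed: an option given without a value means true
    text = raw.strip().lower()
    if text in ("true", "yes", "on"):
        return "true"
    if text in ("false", "no", "off", ""):
        return "false"
    # Otherwise fall back to a number: nonzero is true, zero is false.
    return "true" if int(text) != 0 else "false"

def canonical_int(raw: str) -> str:
    """Normalize a raw config value to a plain decimal integer."""
    return str(int(raw.strip()))
```

A script can then compare against the fixed strings "true" and "false" instead of handling every spelling itself. A value that is neither a known word nor a number raises `ValueError` in this sketch, which stands in for the type check the message mentions.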