1. 23 Oct, 2005 (2 commits)
    • git-rev-list: add "--dense" flag · 1b9e059d
      Committed by Linus Torvalds
      This is what the recent git-rev-list changes have all been gearing up for.
      
      When we use a path filter to git-rev-list, the new "--dense" flag asks
      git-rev-list to compress the history so that it _only_ contains commits
      that change files in the path filter.  It also rewrites the parent
      information so that tools like "gitk" will see the result as a dense
      history tree.
      
      For example, on the current kernel archive:
      
      	[torvalds@g5 linux]$ git-rev-list HEAD | wc -l
      	9904
      	[torvalds@g5 linux]$ git-rev-list HEAD -- kernel | wc -l
      	5442
      	[torvalds@g5 linux]$ git-rev-list --dense HEAD -- kernel | wc -l
      	356
      
      which shows that while we have almost ten thousand commits, we can prune
      down the work to slightly more than half by only following the merges
      that are interesting. But further, we can then compress the history to
      just 356 entries that actually make changes to the kernel subdirectory.
      
      To see this in action, try something like
      
      	gitk --dense -- gitk
      
      to see just the history that affects gitk.  Or, to show that true
      parallel development still remains parallel, do
      
      	gitk --dense -- daemon.c
      
      which shows some parallel commits in the current git tree.
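
The parent rewriting described above can be illustrated with a toy model. This is a minimal sketch, not git's implementation: the dictionary-based commit graph, the `touching` set, and the function names `nearest_kept` and `dense_history` are all assumptions made for illustration.

```python
# Toy model of "--dense": keep only commits that touch the filtered path and
# rewrite each kept commit's parents to its nearest kept ancestors.
# All names and data structures here are hypothetical, not git internals.

def nearest_kept(commits, touching, cid):
    """Return cid itself if it touches the path, else its closest kept ancestors."""
    if cid in touching:
        return [cid]
    found = []
    for parent in commits[cid]:
        for kept in nearest_kept(commits, touching, parent):
            if kept not in found:
                found.append(kept)
    return found


def dense_history(commits, touching, head):
    """Map every kept commit reachable from head to its rewritten parent list."""
    dense = {}
    todo = nearest_kept(commits, touching, head)
    while todo:
        cid = todo.pop()
        if cid in dense:
            continue
        parents = []
        for parent in commits[cid]:
            for kept in nearest_kept(commits, touching, parent):
                if kept not in parents:
                    parents.append(kept)
        dense[cid] = parents
        todo.extend(parents)
    return dense


# Linear history a <- b <- c <- d <- e, where only b and e touch the path.
commits = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"], "e": ["d"]}
touching = {"b", "e"}
print(dense_history(commits, touching, "e"))
```

In this example the five-commit chain collapses to two entries, with "e" reporting "b" as its parent, which is the kind of rewritten graph a viewer such as gitk would then draw.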

      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
    • Teach git-rev-list to follow just a specified set of files · cf484544
      Committed by Linus Torvalds
      This is the first cut at a git-rev-list that knows to ignore commits that
      don't change a certain file (or set of files).
      
      NOTE! For now it only prunes _merge_ commits, and follows the parent where
      there are no differences in the set of files specified. In the long run,
      I'd like to make it re-write the straight-line history too, but for now
      the merge simplification is much more fundamentally important (the
      rewriting of straight-line history is largely a separate simplification
      phase, but the merge simplification needs to happen early if we want to
      optimize away unnecessary commit parsing).
      
      If all parents of a merge change some of the files, the merge is left as
      is, so the end result is in no way guaranteed to be a linear history, but
      it will often be a lot /more/ linear than the full tree, since it prunes
      out parents that didn't matter for that set of files.
      
      As an example from the current kernel:
      
      	[torvalds@g5 linux]$ git-rev-list HEAD | wc -l
      	9885
      	[torvalds@g5 linux]$ git-rev-list HEAD -- Makefile | wc -l
      	4084
      	[torvalds@g5 linux]$ git-rev-list HEAD -- drivers/usb | wc -l
      	5206
      
      and you can also use 'gitk' to more visually see the pruning of the
      history tree, with something like
      
      	gitk -- drivers/usb
      
      showing a simplified history that tries to follow the first parent in a
      merge that is the parent that fully defines drivers/usb/.
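
The merge rule described above can be sketched as follows. This is a hedged illustration only: the function `simplify_merge` and the path-to-blob dictionaries are hypothetical stand-ins for the tree comparison, not the code in git-rev-list.

```python
# Toy model of merge simplification under a path filter.
# merge_files and parent_files map filtered paths to content ids; both
# structures and the function name are assumptions made for illustration.

def simplify_merge(parents, merge_files, parent_files):
    """Return the parents worth following for one merge commit."""
    for parent in parents:
        if parent_files[parent] == merge_files:
            # This parent fully defines the filtered paths: follow only it.
            return [parent]
    # Every parent differs in the filtered paths: keep the merge as is.
    return list(parents)


# The merge result matches p2 for the filtered path, so only p2 is followed.
print(simplify_merge(
    ["p1", "p2"],
    {"Makefile": "blob-x"},
    {"p1": {"Makefile": "blob-y"}, "p2": {"Makefile": "blob-x"}},
))
```

Because the parents are checked in order, the first parent that matches wins, which mirrors the "tries to follow the first parent" wording above.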

      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  2. 19 Oct, 2005 (1 commit)
    • Optimize common case of git-rev-list · fe5f51ce
      Committed by Linus Torvalds
      I took a look at webgit, and it looks like at least for the "projects"
      page, the most common operation ends up being basically
      
      	git-rev-list --header --parents --max-count=1 HEAD
      
      Now, the thing is, the way "git-rev-list" works, it always keeps on
      popping the parents and parsing them in order to build the list of
      parents, and it turns out that even though we just want a single commit,
      git-rev-list will invariably look up _three_ generations of commits.
      
      It will parse:
       - the commit we want (it obviously needs this)
       - its parent(s) as part of the "pop_most_recent_commit()" logic
       - it will then pop one of the parents before it notices that it doesn't
         need any more
       - and as part of popping the parent, it will parse the grandparent (again
         due to "pop_most_recent_commit()").
      
      Now, I've strace'd it, and it really is pretty efficient on the whole, but
      if things aren't nicely cached, and with long-latency IO, doing those two
      extra objects (at a minimum - if the parent is a merge it will be more) is
      just wasted time, and potentially a lot of it.
      
      So here's a quick special-case for the trivial case of "just one commit,
      and no date-limits or other special rules".

      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  3. 06 Oct, 2005 (1 commit)
    • upload-pack: Do not choke on too many heads request. · e091eb93
      Committed by Junio C Hamano
      Cloning from a repository with more than 256 refs (heads and tags
      included) will choke, because upload-pack has a built-in limit of
      feeding not more than MAX_NEEDS (currently 256) heads to underlying
      git-rev-list.  This is a problem when cloning a repository with many
      tags, like http://www.linux-mips.org/pub/scm/linux.git, which has 290+
      tags.
      
      This commit introduces a new flag, --all, to git-rev-list, to include
      all refs in the repository.  The updated upload-pack detects requests that
      ask for more than MAX_NEEDS refs, and sends everything back instead.
      
      We probably want to tweak the definitions of MAX_NEEDS and
      MAX_HAS, but that is a separate topic.
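
The fallback described above amounts to a simple threshold check. The sketch below is an assumption-laden illustration: `rev_list_args` is a hypothetical helper, and the real upload-pack passes additional arguments that are omitted here. Only the limit of 256 and the `--all` flag come from the commit message.

```python
# Minimal sketch of the "too many heads" fallback.
# rev_list_args is a hypothetical helper, not a function in upload-pack.

MAX_NEEDS = 256  # limit quoted in the commit message


def rev_list_args(needs):
    """Choose the ref arguments to hand to git-rev-list for a request."""
    if len(needs) > MAX_NEEDS:
        # Too many refs to pass individually: ask for everything instead.
        return ["--all"]
    return list(needs)


print(rev_list_args(["refs/heads/master"]))
print(rev_list_args(["refs/tags/v%d" % i for i in range(290)]))
```

The trade-off is that a client asking for 290 of a repository's refs receives everything, which is acceptable for a clone since a clone wants all refs anyway.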

      Signed-off-by: Junio C Hamano <junkio@cox.net>
  4. 03 Oct, 2005 (1 commit)
  5. 21 Sep, 2005 (1 commit)
  6. 18 Sep, 2005 (1 commit)
  7. 17 Sep, 2005 (2 commits)
    • [PATCH] Avoid building object ref lists when not needed · 8805ccac
      Committed by Linus Torvalds
      The object parsing code builds a generic "this object references that
      object" list because doing a full connectivity check for fsck requires it.
      
      However, nothing else really needs it, and it's quite expensive for
      git-rev-list that can have tons of objects in flight.
      
      So, exactly like the commit buffer save thing, add a global flag to
      disable it, and use it in git-rev-list.
      
      Before:
      
      	$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l
      	12.28user 0.29system 0:12.57elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
      	0inputs+0outputs (0major+26718minor)pagefaults 0swaps
      	59124
      
      After this change:
      
      	$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l
      	10.33user 0.18system 0:10.54elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
      	0inputs+0outputs (0major+18509minor)pagefaults 0swaps
      	59124
      
      and note how the number of pages touched by git-rev-list for this
      particular object list has shrunk from 26,718 (104 MB) to 18,509 (72 MB).
      
      Calculating the total object difference between two git revisions is still
      clearly the most expensive git operation (both in memory and CPU time),
      but it's now less than 40% of what it used to be.
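
The page-to-megabyte figures above can be checked with a little arithmetic. This sketch assumes 4 KiB pages, which the commit message does not state but which matches its numbers. The 47,947 baseline is taken from the "Before" measurement of the earlier reorganization commit in this log.

```python
# Worked arithmetic for the minor-pagefault figures.
# PAGE_SIZE is an assumption (4 KiB); the fault counts come from the log.

PAGE_SIZE = 4096


def pages_to_mib(pages):
    """Convert a count of touched pages into mebibytes."""
    return pages * PAGE_SIZE / (1024 * 1024)


before = pages_to_mib(26718)   # about 104 MiB
after = pages_to_mib(18509)    # about 72 MiB
ratio = 18509 / 47947          # share of the original footprint

print(round(before), round(after), round(ratio * 100))
```

The ratio comes out just under 40%, consistent with the "less than 40% of what it used to be" claim.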

      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
    • [PATCH] Improve git-rev-list memory usage further · b0d8923e
      Committed by Linus Torvalds
      This avoids keeping tree entries around, and frees them as it traverses
      the list. This avoids building up a huge memory footprint just for these
      small but very common allocations.
      
      Before:
      
      	$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l
      	11.65user 0.38system 0:12.65elapsed 95%CPU (0avgtext+0avgdata 0maxresident)k
      	0inputs+0outputs (0major+42934minor)pagefaults 0swaps
      	59124
      
      After:
      
      	$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l
      	12.28user 0.29system 0:12.57elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
      	0inputs+0outputs (0major+26718minor)pagefaults 0swaps
      	59124
      
      Note how the minor fault numbers - which end up being how many pages we
      needed to map - go down from 42934 (167 MB) to 26718 (104 MB).  That is:
      
      Before:
      	42934 minor pagefaults
      
      After:
      
      	26718 minor pagefaults
      
      This is all in _addition_ to the previous fixes.  It used to be
      ~48,000 pagefaults.
      
      That's still a honking big memory footprint, but it's about half of what
      it was just a day or two ago (and this is the object list for a pretty big
      update - almost 60,000 objects. Smaller updates need less memory).

      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  8. 16 Sep, 2005 (2 commits)
    • [PATCH] Re-organize "git-rev-list --objects" logic · 5bdbaaa4
      Committed by Linus Torvalds
      The logic to calculate the full object list used to be very intertwined
      with the logic that looked up the commits.
      
      For no good reason - it's actually a lot simpler to just do that logic
      as a separate pass.
      
      This improves performance a bit, and uses slightly less memory in my
      tests, but more importantly it makes the code simpler to work with and
      follow what it does.
      
      The performance win is less than I had hoped for, but I get:
      
      Before:
      
      	[torvalds@g5 linux]$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l
      	13.64user 0.42system 0:14.13elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
      	0inputs+0outputs (0major+47947minor)pagefaults 0swaps
      	58945
      
      After:
      
      	[torvalds@g5 linux]$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l
      	11.80user 0.36system 0:12.16elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
      	0inputs+0outputs (0major+42684minor)pagefaults 0swaps
      	58945
      
      i.e. it improved by 2 seconds, and took 5000+ fewer pages (hey, that's
      20MB out of 174MB to go). And got the same number of objects (in theory,
      the more expensive one might find some more shared objects to avoid. In
      practice it obviously doesn't).
      
      I know how to make it use _lots_ less memory, which will probably speed it
      up. But that's for another time, and I'd prefer to see this go in first.

      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
    • [PATCH] Avoid wasting memory in git-rev-list · 60ab26de
      Committed by Linus Torvalds
      As pointed out on the list, git-rev-list can use a lot of memory.
      
      One low-hanging fruit is to free the commit buffer for commits that we
      parse. By default, parse_commit() will save away the buffer, since a lot
      of cases do want it, and re-reading it continually would be unnecessary.
      However, in many cases the buffer isn't actually necessary and saving it
      just wastes memory.
      
      We could just free the buffer ourselves, but especially in git-rev-list,
      we actually end up using the helper functions that automatically add
      parent commits to the commit lists, so we don't actually control the
      commit parsing directly.
      
      Instead, just make this behaviour of "parse_commit()" a global flag.
      Maybe this is a bit tasteless, but it's very simple, and it makes a
      noticeable difference in memory usage.
      
      Before the change:
      
      	[torvalds@g5 linux]$ /usr/bin/time git-rev-list v2.6.12..HEAD > /dev/null
      	0.26user 0.02system 0:00.28elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
      	0inputs+0outputs (0major+3714minor)pagefaults 0swaps
      
      After the change:
      
      	[torvalds@g5 linux]$ /usr/bin/time git-rev-list v2.6.12..HEAD > /dev/null
      	0.26user 0.00system 0:00.27elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
      	0inputs+0outputs (0major+2433minor)pagefaults 0swaps
      
      Note how the minor faults have decreased from 3714 pages to 2433 pages.
      That's all due to the fewer anonymous pages allocated to hold the commit
      buffers and their metadata.

      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  9. 25 Aug, 2005 (1 commit)
    • [PATCH] Fix "prefix" mixup in git-rev-list · d998a089
      Committed by Pavel Roskin
      Recent changes in git have broken cg-log.  git-rev-list no longer
      prints "commit" in front of commit hashes.  It turns out a local
      "prefix" variable in main() shadows a file-scoped "prefix" variable.
      
      This patch removes the local "prefix" variable since its value is never
      used (in the intended way, that is).  The call to
      setup_git_directory() is kept since it has useful side effects.
      
      The file-scoped "prefix" variable is renamed to "commit_prefix" just
      in case someone reintroduces "prefix" to hold the return value of
      setup_git_directory().

      Signed-off-by: Pavel Roskin <proski@gnu.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  10. 24 Aug, 2005 (1 commit)
  11. 20 Aug, 2005 (1 commit)
  12. 10 Aug, 2005 (2 commits)
  13. 05 Aug, 2005 (1 commit)
    • Teach rev-list since..til notation. · 1215879c
      Committed by Junio C Hamano
      The King Penguin says:
      
          Now, for extra bonus points, maybe you should make "git-rev-list" also
          understand the "rev..rev" format (which you can't do with just the
          get_sha1() interface, since it expands into more).
      
      The faithful servant makes it so.
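
The reason "rev..rev" cannot go through a single-object lookup is that it names two things: a revision to exclude and a revision to include. A minimal sketch of that expansion follows. The function `expand_range` is hypothetical and ignores edge cases such as an empty side of the range.

```python
# Sketch of expanding "since..til" into an exclude/include pair.
# expand_range is a hypothetical helper, not the parser used by git-rev-list.

def expand_range(arg):
    """Turn 'A..B' into ['^A', 'B']; pass any other argument through."""
    if ".." in arg:
        since, til = arg.split("..", 1)
        return ["^" + since, til]
    return [arg]


print(expand_range("v2.6.12..HEAD"))
print(expand_range("HEAD"))
```

With this expansion, `v2.6.12..HEAD` selects the commits reachable from `HEAD` but not from `v2.6.12`, the same range used in the timing examples elsewhere in this log.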

      Signed-off-by: Junio C Hamano <junkio@cox.net>
  14. 30 Jul, 2005 (2 commits)
    • [PATCH] Support for NO_OPENSSL · dd53c7ab
      Committed by Petr Baudis
      Support for completely OpenSSL-less builds. The FSF considers distributing
      GPL binaries with OpenSSL linked in to be a legal problem, which is trouble
      e.g. for Debian, and some people might not want to install OpenSSL
      anyway. If you
      
      	make NO_OPENSSL=1
      
      you get a completely OpenSSL-less build, disabling --merge-order and using
      Mozilla's SHA1 implementation.
      
      Ported from Cogito.

      Signed-off-by: Petr Baudis <pasky@ucw.cz>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
    • [PATCH] Fix interesting git-rev-list corner case · 6c3b84c8
      Committed by Linus Torvalds
      This corner-case was triggered by a kernel commit that was not in date
      order, due to a misconfigured time zone that made the commit appear three
      hours older than it was.
      
      That caused git-rev-list to traverse the commit tree in a non-obvious
      order, and made it parse several of the _parents_ of the misplaced commit
      before it actually parsed the commit itself. That's fine, but it meant
      that the grandparents of the commit didn't get marked uninteresting,
      because they had been reached through an "interesting" branch.
      
      The reason was that "mark_parents_uninteresting()" (which is supposed to
      mark all existing parents as being uninteresting - duh) didn't actually
      traverse more than one level down the parent chain.
      
      NORMALLY this is fine, since with the date-based traversal order,
      grandparents won't ever even have been looked at before their parents (so
      traversing the chain down isn't needed, because the next time around when
      we pick out the parent we'll mark _its_ parents uninteresting), but since
      we'd gotten out of order, we'd already seen the parent and thus never got
      around to marking the grandparents.
      
      Anyway, the fix is simple. Just traverse parent chains recursively.
      Normally the chain won't even exist (since the parent hasn't been parsed
      yet), so this is not actually going to trigger except in this strange
      corner-case.
      
      Add a comment to the simple one-liner, since this was a bit subtle, and I
      had to really think things through to understand how it could happen.
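
The fix can be pictured with a toy commit graph. This is a sketch under stated assumptions: the `Commit` class is hypothetical, and an empty `parents` list stands in for "this commit has not been parsed yet". Only the function name and the recursive behavior come from the description above.

```python
# Toy model of the recursive fix to mark_parents_uninteresting().
# The Commit class is an assumption made for illustration.

class Commit:
    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)  # empty until the commit is "parsed"
        self.uninteresting = False


def mark_parents_uninteresting(commit):
    """Mark every already-known ancestor of commit as uninteresting."""
    for parent in commit.parents:
        parent.uninteresting = True
        # Usually parent.parents is still empty, so this returns at once.
        # It only descends when a parent was parsed out of date order.
        mark_parents_uninteresting(parent)


# The out-of-order case: the parent was already parsed, so its own
# parent list is populated when we reach it.
grandparent = Commit("grandparent")
parent = Commit("parent", [grandparent])
tip = Commit("tip", [parent])
mark_parents_uninteresting(tip)
print(parent.uninteresting, grandparent.uninteresting)
```

A one-level version of the loop would have left the grandparent unmarked in this example, which is the corner case the commit describes.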

      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  15. 28 Jul, 2005 (1 commit)
  16. 24 Jul, 2005 (1 commit)
    • Be more aggressive about marking trees uninteresting · 4311d328
      Committed by Linus Torvalds
      We'll mark all the trees at the edges (as deep as we had to go to
      realize that we have all the commits needed) as uninteresting.
      Otherwise we'll occasionally list a lot of objects that were actually
      available at the edge in a commit that we just never ended up parsing
      because we could determine early that we had all relevant commits.
      
      NOTE! The object listing is still just a _heuristic_.  It's guaranteed
      to list a superset of the actual new objects, but there might be the
      occasional old object in the list, just because the commit that
      referenced it was much further back in the history.
      
      For example, let's say that a recent commit is a revert of part of the
      tree to much older state: since we didn't walk _that_ far back in the
      commit history tree to list the commits necessary, git-rev-tree will
      never have marked the old objects uninteresting, and we'll end up
      listing them as "new".
      
      That's ok.
  17. 12 Jul, 2005 (1 commit)
    • [PATCH] Dereference tag repeatedly until we get a non-tag. · 013aab82
      Committed by Junio C Hamano
      When we allow a tag object in place of a commit object, we only
      dereferenced the given tag once, which causes a tag that points at a tag
      that points at a commit to be rejected.  Instead, dereference tag
      repeatedly until we get a non-tag.
      
      This patch changes two functions:
      
       - commit.c::lookup_commit_reference() is used by merge-base,
         rev-tree and rev-parse to convert user supplied SHA1 to that of
         a commit.
       - rev-list uses its own get_commit_reference() to do the same.
      
      Dereferencing tags this way helps both of these uses.
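
The repeated dereference can be sketched as a loop over a toy object store. The dictionary layout and the function name `deref_tag` are assumptions made for illustration. They do not reproduce `lookup_commit_reference()` or `get_commit_reference()`.

```python
# Toy model of dereferencing a tag until a non-tag object is reached.
# The object store layout and deref_tag are hypothetical.

def deref_tag(objects, name):
    """Follow tag objects from name until something that is not a tag."""
    obj = objects[name]
    while obj["type"] == "tag":
        name = obj["target"]
        obj = objects[name]
    return name


# A tag pointing at a tag pointing at a commit.
objects = {
    "outer-tag": {"type": "tag", "target": "inner-tag"},
    "inner-tag": {"type": "tag", "target": "commit-1"},
    "commit-1": {"type": "commit"},
}
print(deref_tag(objects, "outer-tag"))
```

Dereferencing only once would stop at "inner-tag", which is the case the commit message says used to be rejected.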
      Signed-off-by: NJunio C Hamano <junkio@cox.net>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      013aab82
  18. 11 7月, 2005 1 次提交
  19. 07 7月, 2005 8 次提交
  20. 06 7月, 2005 1 次提交
  21. 05 7月, 2005 2 次提交
  22. 04 7月, 2005 3 次提交
  23. 30 6月, 2005 2 次提交
  24. 27 6月, 2005 1 次提交