- 18 6月, 2006 2 次提交
-
-
由 Linus Torvalds 提交于
This is really the dregs of my effort to not waste memory in git-rev-list, and makes barely one percent of a difference in the memory footprint, but hey, it's also a pretty small patch. It discards the parent lists and the commit buffer after the commit has been shown by git-rev-list (and "git log" - which already did the commit buffer part), and frees the commit list entry that was used by the revision walker. The big win would be to get rid of the "refs" pointer in the object structure (another 5%), because it's only used by fsck. That would require some pretty major surgery to fsck, though, so I'm timid and did the less interesting but much easier part instead. This (percentually) makes a bigger difference to "git log" and friends, since those are walking _just_ commits, and thus the list entries tend to be a bigger percentage of the memory use. But the "list all objects" case does improve too. Signed-off-by: NLinus Torvalds <torvalds@osdl.org> Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
由 Linus Torvalds 提交于
This shrinks "struct object" by a small amount, by getting rid of the "struct type *" pointer and replacing it with a 3-bit bitfield instead. In addition, we merge the bitfields and the "flags" field, which incidentally should also remove a useless 4-byte padding from the object when in 64-bit mode. Now, our "struct object" is still too damn large, but it's now less obviously bloated, and of the remaining fields, only the "util" (which is not used by most things) is clearly something that should be eventually discarded. This shrinks the "git-rev-list --all" memory use by about 2.5% on the kernel archive (and, perhaps more importantly, on the larger mozilla archive). That may not sound like much, but I suspect it's more on a 64-bit platform. There are other remaining inefficiencies (the parent lists, for example, probably have horrible malloc overhead), but this was pretty obvious. Most of the patch is just changing the comparison of the "type" pointer from one of the constant string pointers to the appropriate new TYPE_xxx small integer constant. Signed-off-by: NLinus Torvalds <torvalds@osdl.org> Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 06 6月, 2006 1 次提交
-
-
由 Linus Torvalds 提交于
The tree-walking conversion of the "process_tree()" function broke packing by using an unrelated variable from outer scope. Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 31 5月, 2006 1 次提交
-
-
由 Linus Torvalds 提交于
This adds a "tree_entry()" function that combines the common operation of doing a "tree_entry_extract()" + "update_tree_entry()". It also has a simplified calling convention, designed for simple loops that traverse over a whole tree: the arguments are pointers to the tree descriptor and a name_entry structure to fill in, and it returns a boolean "true" if there was an entry left to be gotten in the tree. This allows tree traversal with struct tree_desc desc; struct name_entry entry; desc.buf = tree->buffer; desc.size = tree->size; while (tree_entry(&desc, &entry) { ... use "entry.{path, sha1, mode, pathlen}" ... } which is not only shorter than writing it out in full, it's hopefully less error prone too. [ It's actually a tad faster too - we don't need to recalculate the entry pathlength in both extract and update, but need to do it only once. Also, some callers can avoid doing a "strlen()" on the result, since it's returned as part of the name_entry structure. However, by now we're talking just 1% speedup on "git-rev-list --objects --all", and we're definitely at the point where tree walking is no longer the issue any more. ] NOTE! Not everybody wants to use this new helper function, since some of the tree walkers very much on purpose do the descriptor update separately from the entry extraction. So the "extract + update" sequence still remains as the core sequence, this is just a simplified interface. We should probably add a silly two-line inline helper function for initializing the descriptor from the "struct tree" too, just to cut down on the noise from that common "desc" initializer. Signed-off-by: NLinus Torvalds <torvalds@osdl.org> Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 30 5月, 2006 3 次提交
-
-
由 Linus Torvalds 提交于
Instead, just use the tree buffer directly, and use the tree-walk infrastructure to walk the buffers instead of the tree-entry list. The tree-entry list is inefficient, and generates tons of small allocations for no good reason. The tree-walk infrastructure is generally no harder to use than following a linked list, and allows us to do most tree parsing in-place. Some programs still use the old tree-entry lists, and are a bit painful to convert without major surgery. For them we have a helper function that creates a temporary tree-entry list on demand. Signed-off-by: NLinus Torvalds <torvalds@osdl.org> Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
由 Linus Torvalds 提交于
This is preparatory work for further cleanups, where we try to make tree_entry look more like the more efficient tree-walk descriptor. Signed-off-by: NLinus Torvalds <torvalds@osdl.org> Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
由 Linus Torvalds 提交于
This allows us to avoid allocating information for names etc, because we can just use the information from the tree buffer directly. Signed-off-by: NLinus Torvalds <torvalds@osdl.org> Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 29 5月, 2006 4 次提交
-
-
由 Linus Torvalds 提交于
This finally removes the tree-entry list from "struct tree", since most of the users can just use the tree-walk infrastructure to walk the raw tree buffers instead of the tree-entry list. The tree-entry list is inefficient, and generates tons of small allocations for no good reason. The tree-walk infrastructure is generally no harder to use than following a linked list, and allows us to do most tree parsing in-place. Some programs still use the old tree-entry lists, and are a bit painful to convert without major surgery. For them we have a helper function that creates a temporary tree-entry list on demand. We can convert those too eventually, but with this they no longer affect any users who don't need the explicit lists. Signed-off-by: NLinus Torvalds <torvalds@osdl.org> Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
由 Linus Torvalds 提交于
This is preparatory work for further cleanups, where we try to make tree_entry look more like the more efficient tree-walk descriptor. Instead of having a union of pointers to blob/tree/objects, this just makes "struct tree_entry" have the raw SHA1, and makes all the users use that instead (often that implies adding a "lookup_tree(..)" on the sha1, but sometimes the user just wanted the SHA1 in the first place, and it just avoids an unnecessary indirection). Signed-off-by: NLinus Torvalds <torvalds@osdl.org> Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
由 Linus Torvalds 提交于
This allows us to avoid allocating information for names etc, because we can just use the information from the tree buffer directly. We still keep the old "tree_entry_list" in struct tree as well, so old users aren't affected, apart from the fact that the allocations are different (if you free a tree entry, you should no longer free the name allocation for it, since it's allocated as part of "tree->buffer") Signed-off-by: NLinus Torvalds <torvalds@osdl.org> Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
由 Linus Torvalds 提交于
Martin Langhoff points out that "git repack -a" ends up using up a lot of memory for big archives, and that git cvsimport probably should do only incremental repacks in order to avoid having repacking flush all the caches. The big majority of the memory usage of repacking is from git rev-list tracking all objects, and this patch should go a long way in avoiding the excessive memory usage: the bulk of it was due to the object names being leaked from the tree parser. For the historic Linux kernel archive, this simple patch does: Before: /usr/bin/time git-rev-list --all --objects > /dev/null 72.45user 0.82system 1:13.55elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+125376minor)pagefaults 0swaps After: /usr/bin/time git-rev-list --all --objects > /dev/null 75.22user 0.48system 1:16.34elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+43921minor)pagefaults 0swaps where we do end up wasting a bit of time on some extra strdup()s (which could be avoided, but that would require tracking where the pathnames came from), but we avoid a lot of memory usage. Minor page faults track maximum RSS very closely (each page fault maps in one page into memory), so the reduction from 125376 page faults to 43921 means a rough reduction of VM footprint from almost half a gigabyte to about a third of that. Those numbers were also double-checked by looking at "top" while the process was running. (Side note: at least part of the remaining VM footprint is the mapping of the 177MB pack-file, so the remaining memory use is at least partly "well behaved" from a project caching perspective). For the current git archive itself, the memory usage for a "--all --objects" rev-list invocation dropped from 7128 pages to 2318 (27MB to 9MB), so the reduction seems to hold for much smaller projects too. For regular "git-rev-list" usage (ie without the "--objects" flag) this patch has no impact. Signed-off-by: NLinus Torvalds <torvalds@osdl.org> Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 21 5月, 2006 1 次提交
-
-
由 Johannes Schindelin 提交于
This patch touches a couple of files, because it adds options to print a custom text just after the subject of a commit, and just after the diffstat. [jc: made "many dashes" used as the boundary leader into a single variable, to reduce the possibility of later tweaks to miscount the number of dashes to break it.] Signed-off-by: NJohannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 19 5月, 2006 1 次提交
-
-
由 Linus Torvalds 提交于
This was surprisingly easy. The diff is truly minimal: rename "main()" to "cmd_rev_list()" in rev-list.c, and rename the whole file to reflect its new built-in status. Signed-off-by: NLinus Torvalds <torvalds@osdl.org> Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 06 5月, 2006 1 次提交
-
-
由 Johannes Schindelin 提交于
Signed-off-by: NJohannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 18 4月, 2006 2 次提交
-
-
由 Linus Torvalds 提交于
On Sun, 16 Apr 2006, Junio C Hamano wrote: > > In the mid-term, I am hoping we can drop the generate_header() > callchain _and_ the custom code that formats commit log in-core, > found in cmd_log_wc(). Ok, this was nastier than expected, just because the dependencies between the different log-printing stuff were absolutely _everywhere_, but here's a patch that does exactly that. The patch is not very easy to read, and the "--patch-with-stat" thing is still broken (it does not call the "show_log()" thing properly for merges). That's not a new bug. In the new world order it _should_ do something like if (rev->logopt) show_log(rev, rev->logopt, "---\n"); but it doesn't. I haven't looked at the --with-stat logic, so I left it alone. That said, this patch removes more lines than it adds, and in particular, the "cmd_log_wc()" loop is now a very clean: while ((commit = get_revision(rev)) != NULL) { log_tree_commit(rev, commit); free(commit->buffer); commit->buffer = NULL; } so it doesn't get much prettier than this. All the complexity is entirely hidden in log-tree.c, and any code that needs to flush the log literally just needs to do the "if (rev->logopt) show_log(...)" incantation. I had to make the combined_diff() logic take a "struct rev_info" instead of just a "struct diff_options", but that part is pretty clean. This does change "git whatchanged" from using "diff-tree" as the commit descriptor to "commit", and I changed one of the tests to reflect that new reality. Otherwise everything still passes, and my other tests look fine too. Signed-off-by: NLinus Torvalds <torvalds@osdl.org> Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
由 Junio C Hamano 提交于
Initial fix prepared by Johannes, but I did it slightly differently. Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 17 4月, 2006 1 次提交
-
-
由 Junio C Hamano 提交于
The boundary commits are shown for UI like gitk to draw them as soon as topo-order sorting allows, and should not be omitted by get_revision() filtering logic. As long as their immediate child commits are shown, we should not filter them out. Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 16 4月, 2006 1 次提交
-
-
由 Junio C Hamano 提交于
The big option parser unification broke rev-list the big way; this makes it use options from the parsed revs structure. Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 15 4月, 2006 2 次提交
-
-
由 Junio C Hamano 提交于
rev-list does not take diff options, so barf after seeing some. Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
由 Junio C Hamano 提交于
I noticed bisect does not work well without both good and bad. Running this script in git.git repository would give you quite different results: #!/bin/sh initial=e83c5163 mid0=`git rev-list --bisect ^$initial --all` git rev-list $mid0 | wc -l git rev-list ^$mid0 --all | wc -l mid1=`git rev-list --bisect --all` git rev-list $mid1 | wc -l git rev-list ^$mid1 --all | wc -l The $initial commit is the very first commit you made. The first midpoint bisects things evenly as designed, but the latter does not. The reason I got interested in this was because I was wondering if something like the following would help people converting a huge repository from foreign SCM, or preparing a repository to be fetched over plain dumb HTTP only: #!/bin/sh N=4 P=.git/objects/pack bottom= while test 0 \< $N do N=$((N-1)) if test -z "$bottom" then newbottom=`git rev-list --bisect --all` else newbottom=`git rev-list --bisect ^$bottom --all` fi if test -z "$bottom" then rev_list="$newbottom" elif test 0 = $N then rev_list="^$bottom --all" else rev_list="^$bottom $newbottom" fi p=$(git rev-list --unpacked --objects $rev_list | git pack-objects $P/pack) git show-index <$P/pack-$p.idx | wc -l bottom=$newbottom done The idea is to pack older half of the history to one pack, then older half of the remaining history to another, to continue a few times, using finer granularity as we get closer to the tip. This may not matter, since for a truly huge history, running bisect number of times could be quite time consuming, and we might be better off running "git rev-list --all" once into a temporary file, and manually pick cut-off points from the resulting list of commits. After all we are talking about "approximately half" for such an usage, and older history does not matter much. Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 11 4月, 2006 1 次提交
-
-
由 Junio C Hamano 提交于
This makes things that include revision.h build again. Blame is also built, but I am not sure how well it works (or how well it worked to begin with) -- it was relying on tree-diff to be using whatever pathspec was used the last time, which smells a bit suspicious. Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 09 4月, 2006 1 次提交
-
-
由 Linus Torvalds 提交于
The parent rewriting feature caused us to create the whole history in one go, and then simplify it later, because of how rewrite_parents() had been written. However, with a little tweaking, it's perfectly possible to do even that one incrementally. Right now, this doesn't really much matter, because every user of "--parents" will probably generally _also_ use "--topo-order", which will cause the old non-incremental behaviour anyway. However, I'm hopeful that we could make even the topological sort incremental, or at least _partially_ so (for example, make it incremental up to the first merge). In the meantime, this at least moves things in the right direction, and removes a strange special case. Signed-off-by: NLinus Torvalds <torvalds@osdl.org> Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 07 4月, 2006 1 次提交
-
-
由 Junio C Hamano 提交于
This should make --pretty=oneline a whole lot more readable for people using 80-column terminals. Originally from Eric Wong. Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 01 4月, 2006 1 次提交
-
-
由 Linus Torvalds 提交于
Not only do we do it in both rev-list.c and git.c, the revision walking code will soon want to know whether we should rewrite parenthood information or not. Signed-off-by: NLinus Torvalds <torvalds@osdl.org> Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 30 3月, 2006 1 次提交
-
-
由 Junio C Hamano 提交于
Introduce tree-walk.[ch] and move "struct tree_desc" and associated functions from various places. Rename DIFF_FILE_CANON_MODE(mode) macro to canon_mode(mode) and move it to cache.h. This macro returns the canonicalized st_mode value in the host byte order for files, symlinks and directories -- to be compared with a tree_desc entry. create_ce_mode(mode) in cache.h is similar but is intended to be used for index entries (so it does not work for directories) and returns the value in the network byte order. Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 29 3月, 2006 2 次提交
-
-
由 Junio C Hamano 提交于
With the new --boundary flag, the output from rev-list includes the UNINTERESING commits at the boundary, which are usually not shown. Their object names are prefixed with '-'. For example, with this graph: C side / A---B---D master You would get something like this: $ git rev-list --boundary --header --parents side..master D B tree D^{tree} parent B ... log message for commit D here ... \0-B A tree B^{tree} parent A ... log message for commit B here ... \0 Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
由 Junio C Hamano 提交于
We do not need to track object refs, neither we need to save commit unless we are doing verbose header. A lot of traversal happens inside prepare_revision_walk() these days so setting things up before calling that function is necessary. Signed-off-by: NJunio C Hamano <junkio@cox.net> Acked-by: NLinus Torvalds <torvalds@osdl.org>
-
- 22 3月, 2006 1 次提交
-
-
由 Junio C Hamano 提交于
This prefixes the raw commit timestamp to the output. Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 11 3月, 2006 1 次提交
-
-
由 Fredrik Kuivinen 提交于
prune_fn in the rev_info structure is called in place of try_to_simplify_commit. This makes it possible to do rename tracking with a custom try_to_simplify_commit-like function. This commit also introduces init_revisions which initialises the rev_info structure with default values. Signed-off-by: NFredrik Kuivinen <freku045@student.liu.se> Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 01 3月, 2006 3 次提交
-
-
由 Junio C Hamano 提交于
This ports the following options from rev-list based git-log implementation: * -<n>, -n<n>, and -n <n>. I am still wondering if we want this natively supported by setup_revisions(), which already takes --max-count. We may want to move them in the next round. Also I am not sure if we can get away with not setting revs->limited when we set max-count. The latest rev-list.c and revision.c in this series do not, so I left them as they are. * --pretty and --pretty=<fmt>. * --abbrev=<n> and --no-abbrev. The previous commit already handles time-based limiters (--since, --until and friends). The remaining things that rev-list based git-log happens to do are not useful in a pure log-viewing purposes, and not ported: * --bisect (obviously). * --header. I am actually in favor of doing the NUL terminated record format, but rev-list based one always passed --pretty, which defeated this option. Maybe next round. * --parents. I do not think of a reason a log viewer wants this. The flag is primarily for feeding squashed history via pipe to downstream tools. Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
由 Linus Torvalds 提交于
Well, assuming breaking --merge-order is fine, here's a patch (on top of the other ones) that makes git log <filename> actually work, as far as I can tell. I didn't add the logic for --before/--after flags, but that should be pretty trivial, and is independent of this anyway. Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
由 Linus Torvalds 提交于
This actually moves the "meat" of the revision walking from rev-list.c to the new library code in revision.h. It introduces the new functions void prepare_revision_walk(struct rev_info *revs); struct commit *get_revision(struct rev_info *revs); to prepare and then walk the revisions that we have. Signed-off-by: NLinus Torvalds <torvalds@osdl.org> Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 28 2月, 2006 1 次提交
-
-
由 Linus Torvalds 提交于
This makes the rewrite easier to validate in that revision flag parsing and warlking part are now all in rev_info structure. Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 27 2月, 2006 2 次提交
-
-
由 Junio C Hamano 提交于
This fixes "the other end has commit X but since then we tagged that commit with tag T, and he says he wants T -- what is the list of objects we need to send him?" question: git-rev-list --objects ^X T We ended up sending everything since the beginning of time X-<. Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
由 Linus Torvalds 提交于
This really just splits things up partially, and creates the interface to set things up by parsing the command line. No real code changes so far, although the parsing of filenames is a bit stricter. In particular, if there is a "--", then we do not accept any filenames before it, and if there isn't any "--", then we check that _all_ paths listed are valid, not just the first one. The new argument parsing automatically also gives us "--default" and "--not" handling as in git-rev-parse. Signed-off-by: NLinus Torvalds <torvalds@osdl.org> Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 24 2月, 2006 2 次提交
-
-
由 Junio C Hamano 提交于
This helps to group the same files from different revs together, while spreading files with the same basename in different directories, to help pack-object. Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
由 Junio C Hamano 提交于
Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 22 2月, 2006 1 次提交
-
-
由 Junio C Hamano 提交于
Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 20 2月, 2006 1 次提交
-
-
由 Junio C Hamano 提交于
This new flag is similar to --objects, but causes rev-list to show list of "uninteresting" commits that appear on the edge commit prefixed with '-'. Downstream pack-objects will be changed to take these as hints to use the trees and blobs contained with them as base objects of resulting pack, producing an incomplete (not self-contained) pack. Such a pack cannot be used in .git/objects/pack (it is prevented by git-index-pack erroring out if it is fed to git-fetch-pack -k or git-clone-pack), but would be useful when transferring only small changes to huge blobs. Signed-off-by: NJunio C Hamano <junkio@cox.net>
-
- 16 2月, 2006 1 次提交
-
-
由 Junio C Hamano 提交于
This adds --date-order to rev-list; it is similar to topo order in the sense that no parent comes before all of its children, but otherwise things are still ordered in the commit timestamp order. The same flag is also added to show-branch. Signed-off-by: NJunio C Hamano <junkio@cox.net>
-