1. 09 Jul, 2009 1 commit
    • Add 'fill_directory()' helper function for directory traversal · 1d8842d9
      Linus Torvalds committed
      Most of the users of "read_directory()" actually want a much simpler
      interface than the whole complex (but rather powerful) one.
      
      In fact 'git add' had already largely abstracted out the core interface
      issues into a private "fill_directory()" function that was largely
      applicable almost as-is to a number of callers.  Yes, 'git add' wants to
      do some extra work of its own, specific to the add semantics, but we can
      easily split that out, and use the core as a generic function.
      
      This function does exactly that, and now that much simplified
      'fill_directory()' function can be shared with a number of callers,
      while also ensuring that the rather more complex calling conventions of
      read_directory() are used by fewer call-sites.
      
      This also makes the 'common_prefix()' helper function private to dir.c,
      since all callers are now in that file.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  2. 28 Jun, 2009 1 commit
  3. 01 Jun, 2009 1 commit
    • git-add: no need for -f when resolving a conflict in already tracked path · 6e4f981f
      Jeff King committed
      When a path F that matches ignore pattern has a conflict, "git add F"
      insisted the -f option be given, which did not make sense.  It would have
      required -f when the path was originally added, but when resolving a
      conflict, it already is tracked.
      
      So this should work (and does):
      
        $ echo file >.gitignore
        $ echo content >file
        $ git add -f file ;# need -f because we are adding new path
        $ echo more content >>file
        $ git add file ;# don't need -f; it is not actually an "other" file
      
      This is handled under the hood by the COLLECT_IGNORED option to
      read_directory. When that code finds an ignored file, it checks the
      index to make sure it is not actually a tracked file. However, the test
      it uses does not take into account unmerged entries, and considers them
      to still be ignored. "git ls-files" uses a more elaborate test and gets
      the right answer and the same test should be used here.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  4. 17 May, 2009 1 commit
    • dir.c: clean up handling of 'path' parameter in read_directory_recursive() · da4b3e8c
      Linus Torvalds committed
      Right now we pass two different pathnames ('path' and 'base') down to
      read_directory_recursive(), and the only real reason for that is that we
      want to allow an empty 'base' parameter, but when we do so, we need the
      pathname to "opendir()" to be "." rather than the empty string.
      
      And rather than handle that confusion in the caller, we can just fix
      read_directory_recursive() to handle the case of an empty path itself,
      by just passing opendir() a "." ourselves if the path is empty.
      
      This would allow us to then drop one of the pathnames entirely from the
      calling convention, but rather than do that, we'll start separating them
      out as a "filesystem pathname" (the one we use for filesystem accesses)
      and a "git internal base name" (which is the name that we use for git
      internally).
      
      That will eventually allow us to do things like handle different
      encodings (eg the filesystem pathnames might be Latin1, while git itself
      would use UTF-8 for filename information).
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
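The empty-path handling described in this commit can be sketched roughly as follows. This is a minimal illustration, not git's actual code: the helper name open_dir_or_cwd() is hypothetical, and the only assumption is that an empty base name should map to "." for filesystem access.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from git): open the directory named by `path`,
 * treating the empty string as the current directory, so that callers can
 * keep using "" as their internal base name.
 */
static DIR *open_dir_or_cwd(const char *path)
{
	assert(path != NULL);
	return opendir(*path ? path : ".");
}
```

With a helper like this, the caller no longer needs to carry a second pathname just to avoid handing opendir() an empty string.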
  5. 06 May, 2009 1 commit
  6. 02 May, 2009 1 commit
  7. 19 Feb, 2009 1 commit
  8. 13 Feb, 2009 1 commit
    • Support "\" in non-wildcard exclusion entries · dd482eea
      Finn Arne Gangstad committed
      "\" was treated differently in exclude rules depending on whether a
      wildcard match was done. For wildcard rules, "\" was de-escaped in
      fnmatch, but this was not done for other rules since they used strcmp
      instead.  A file named "#foo" would not be excluded by "\#foo", but would
      be excluded by "\#foo*".
      
      We now treat all rules with "\" as wildcard rules.
      
      Another solution could be to de-escape all non-wildcard rules as we
      read them, but we would have to do the de-escaping exactly as fnmatch
      does it to avoid inconsistencies.
      Signed-off-by: Finn Arne Gangstad <finnag@pvv.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
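The inconsistency this commit describes can be reproduced with plain strcmp() and POSIX fnmatch(). The sketch below is illustrative only; the two helper names are hypothetical and do not appear in git.

```c
#include <assert.h>
#include <fnmatch.h>
#include <string.h>

/* Literal comparison, as used for rules without wildcards: no de-escaping. */
static int matches_literal(const char *pattern, const char *name)
{
	assert(pattern != NULL && name != NULL);
	return strcmp(pattern, name) == 0;
}

/* fnmatch() comparison: with flags 0, a backslash escapes the next character. */
static int matches_glob(const char *pattern, const char *name)
{
	assert(pattern != NULL && name != NULL);
	return fnmatch(pattern, name, 0) == 0;
}
```

For the pattern "\#foo" and the file name "#foo", matches_literal() fails while matches_glob() succeeds, which is why routing every rule that contains "\" through the wildcard path gives consistent results.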
  9. 10 Feb, 2009 1 commit
  10. 18 Jan, 2009 1 commit
    • Change NUL char handling of isspecial() · 8cc32992
      René Scharfe committed
      Replace isspecial() by the new macro is_glob_special(), which is more,
      well, specialized.  The former included the NUL char in its character
      class, while the letter only included characters that are special to
      file name globbing.
      
      The new name contains underscores because they enhance readability
      considerably now that it's made up of three words.  Renaming the
      function is necessary to document its changed scope.
      
      The call sites of isspecial() are updated to check explicitly for NUL.
      Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
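A rough sketch of the changed scope follows. The commit introduces a macro; a function is used here for readability, and the exact set of characters is an assumption made for illustration. literal_prefix_len() is a hypothetical call site showing the explicit NUL check.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed set of glob metacharacters; NUL is deliberately not included. */
static int is_glob_special(int c)
{
	return c == '*' || c == '?' || c == '[' || c == '\\';
}

/* Length of the leading part of `s` that contains no glob metacharacter. */
static size_t literal_prefix_len(const char *s)
{
	size_t n = 0;

	assert(s != NULL);
	/* The call site checks for NUL itself, since the class no longer does. */
	while (s[n] != '\0' && !is_glob_special((unsigned char)s[n]))
		n++;
	return n;
}
```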
  11. 15 Jan, 2009 2 commits
  12. 12 Jan, 2009 2 commits
  13. 03 Oct, 2008 1 commit
  14. 29 Sep, 2008 1 commit
  15. 29 Aug, 2008 1 commit
  16. 14 Aug, 2008 1 commit
  17. 05 Aug, 2008 1 commit
  18. 27 Apr, 2008 1 commit
    • Optimize match_pathspec() to avoid fnmatch() · 88ea8112
      Linus Torvalds committed
      "git add *" is actually fundamentally different from "git add .", and
      yeah, you should generally use the latter.
      
      The reason? The argument list is actually something different from what
      you think it is. For git, it's a "pathspec", so what actually happens is
      that in *both* cases, it will really traverse the whole tree, and then
      match every file it finds against the pathspec.
      
      So think of the arguments not as a file list, but as a random bunch of
      patterns to match against the files you have!
      
      Which is why the cost is actually approximately O(n*m), where "n" is the
      size of the working tree, and "m" is the number of pathspecs.
      
      So the reason "git add ." is fast is actually that "m" in that case is
      just 1 (just one trivial pattern), and then "git add *" is slow because
      "m" is large (lots of complicated patterns). In both cases, 'n' is the
      same (== the whole set of files in your working tree).
      
      Anyway, here's a trivial patch that doesn't change this fundamental fact,
      but that avoids doing anything *expensive* until we've done some cheap
      initial tests. It may or may not help your test-case, but it's pretty
      simple and it matches the other git optimizations in this area (ie
      "conceptually handle the general case, but optimize the simple cases where
      we can exit early")
      
      Notice how this patch doesn't actually change the fundamental O(n^2)
      behaviour, but it makes it much cheaper by generally avoiding the
      expensive 'fnmatch' and 'strlen/strncmp' when they are obviously not
      needed.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
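The "cheap test first" idea can be sketched as below. This is a simplified illustration under stated assumptions, not the code from the patch: match_one() here takes only a pattern and a name, and the first-character check stands in for whatever cheap tests the real function performs.

```c
#include <assert.h>
#include <fnmatch.h>
#include <string.h>

static int is_glob_special(char c)
{
	return c == '*' || c == '?' || c == '[' || c == '\\';
}

/* Nonzero if `name` matches `spec` exactly, as a leading directory, or as a glob. */
static int match_one(const char *spec, const char *name)
{
	size_t len;

	assert(spec != NULL && name != NULL);

	/* Cheap rejection: a literal first character must match the name's. */
	if (*spec != '\0' && !is_glob_special(*spec) && *spec != *name)
		return 0;

	len = strlen(spec);
	if (len == 0)
		return 1;	/* an empty pathspec matches everything */

	/* Exact match or leading-directory match needs no fnmatch() at all. */
	if (strncmp(spec, name, len) == 0 &&
	    (name[len] == '\0' || name[len] == '/' || spec[len - 1] == '/'))
		return 1;

	/* Only now pay for the expensive general case. */
	return fnmatch(spec, name, 0) == 0;
}
```

Calling this for every file against every pathspec is still O(n*m); the early return only makes most of those n*m checks very cheap.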
  19. 15 Apr, 2008 1 commit
  20. 09 Apr, 2008 2 commits
  21. 23 Feb, 2008 1 commit
    • Avoid unnecessary "if-before-free" tests. · 8e0f7003
      Jim Meyering committed
      This change removes all obvious useless if-before-free tests.
      E.g., it replaces code like this:
      
              if (some_expression)
                      free (some_expression);
      
      with the now-equivalent:
      
              free (some_expression);
      
      It is equivalent not just because POSIX has required free(NULL)
      to work for a long time, but simply because it has worked for
      so long that no reasonable porting target fails the test.
      Here's some evidence from nearly 1.5 years ago:
      
          http://www.winehq.org/pipermail/wine-patches/2006-October/031544.html
      
      FYI, the change below was prepared by running the following:
      
        git ls-files -z | xargs -0 \
        perl -0x3b -pi -e \
          's/\bif\s*\(\s*(\S+?)(?:\s*!=\s*NULL)?\s*\)\s+(free\s*\(\s*\1\s*\))/$2/s'
      
      Note however, that it doesn't handle brace-enclosed blocks like
      "if (x) { free (x); }".  But that's ok, since there were none like
      that in git sources.
      
      Beware: if you do use the above snippet, note that it can
      produce syntactically invalid C code.  That happens when the
      affected "if"-statement has a matching "else".
      E.g., it would transform this
      
        if (x)
          free (x);
        else
          foo ();
      
      into this:
      
        free (x);
        else
          foo ();
      
      There were none of those here, either.
      
      If you're interested in automating detection of the useless
      tests, you might like the useless-if-before-free script in gnulib:
      [it *does* detect brace-enclosed free statements, and has a --name=S
       option to make it detect free-like functions with different names]
      
        http://git.sv.gnu.org/gitweb/?p=gnulib.git;a=blob;f=build-aux/useless-if-before-free
      
      Addendum:
        Remove one more (in imap-send.c), spotted by Jean-Luc Herren <jlh@gmx.ch>.
      Signed-off-by: Jim Meyering <meyering@redhat.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  22. 05 Feb, 2008 2 commits
    • gitignore: lazily find dtype · 6831a88a
      Junio C Hamano committed
      When we process "foo/" entries in gitignore files on a system
      that does not have d_type member in "struct dirent", the earlier
      implementation ran lstat(2) separately when matching with
      entries that came from the command line, in-tree .gitignore
      files, and $GIT_DIR/info/excludes file.
      
      This optimizes it by delaying the lstat(2) call until it becomes
      absolutely necessary.
      
      The initial idea for this change was by Jeff King, but I
      optimized it further to pass pointers around.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • gitignore(5): Allow "foo/" in ignore list to match directory "foo" · d6b8fc30
      Junio C Hamano committed
      A pattern "foo/" in the exclude list did not match directory
      "foo", but a pattern "foo" did.  This extends the exclude
      mechanism so that "foo/" matches the directory "foo", while still
      not matching a regular file or a symbolic link named "foo".  In
      order to tell a directory from a non-directory, this passes the
      type of the path being checked down to the excluded() function.
      
      A downside is that the recursive directory walk may need to run
      lstat(2) more often on systems whose "struct dirent" do not give
      the type of the entry; earlier it did not have to do so for an
      excluded path, but we now need to figure out if a path is a
      directory before deciding to exclude it.  This is especially bad
      because an idea similar to the earlier CE_UPTODATE optimization
      to reduce number of lstat(2) calls would by definition not apply
      to the codepaths involved, as (1) directories will not be
      registered in the index, and (2) excluded paths will not be in
      the index anyway.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
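The intended matching rule can be sketched as follows. This is an illustration only: pattern_excludes() is a hypothetical function that handles plain names without wildcards, and it is not how excluded() is actually implemented.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <dirent.h>
#include <string.h>

/* Hypothetical: does `pattern` exclude the entry `name` of type `dtype`? */
static int pattern_excludes(const char *pattern, const char *name, int dtype)
{
	size_t len;

	assert(pattern != NULL && name != NULL);
	len = strlen(pattern);
	if (len > 0 && pattern[len - 1] == '/') {
		/* A trailing slash restricts the pattern to directories. */
		if (dtype != DT_DIR)
			return 0;
		len--;
	}
	return strlen(name) == len && strncmp(pattern, name, len) == 0;
}
```

Note that the type is only consulted for patterns ending in a slash, which is what makes it possible to look the type up on demand.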
  23. 23 Jan, 2008 1 commit
    • Create pathname-based hash-table lookup into index · cf558704
      Linus Torvalds committed
      This creates a hash index of every single file added to the index.
      Right now that hash index isn't actually used for much: I implemented a
      "cache_name_exists()" function that uses it to efficiently look up a
      filename in the index without having to do the O(logn) binary search,
      but quite frankly, that's not why this patch is interesting.
      
      No, the whole and only reason to create the hash of the filenames in the
      index is that by modifying the hash function, you can fairly easily do
      things like making it always hash equivalent names into the same bucket.
      
      That, in turn, means that suddenly questions like "does this name exist
      in the index under an _equivalent_ name?" become much much cheaper.
      
      Guiding principles behind this patch:
      
       - it shouldn't be too costly. In fact, my primary goal here was to
         actually speed up "git commit" with a fully populated kernel tree, by
         being faster at checking whether a file already existed in the index. I
         did succeed, but only barely:
      
      	Best before:
      		[torvalds@woody linux]$ time git commit > /dev/null
      		real    0m0.255s
      		user    0m0.168s
      		sys     0m0.088s
      
      	Best after:
      
      		[torvalds@woody linux]$ time ~/git/git commit > /dev/null
      		real    0m0.233s
      		user    0m0.144s
      		sys     0m0.088s
      
         so some things are actually faster (~8%).
      
         Caveat: that's really the best case. Other things are invariably going
         to be slightly slower, since we populate that index cache, and quite
         frankly, few things really use it to look things up.
      
         That said, the cost is really quite small. The worst case is probably
         doing a "git ls-files", which will do very little except populate the
         index, and never actually looks anything up in it, just lists it.
      
      	Before:
      		[torvalds@woody linux]$ time git ls-files > /dev/null
      		real    0m0.016s
      		user    0m0.016s
      		sys     0m0.000s
      
      	After:
      		[torvalds@woody linux]$ time ~/git/git ls-files > /dev/null
      		real    0m0.021s
      		user    0m0.012s
      		sys     0m0.008s
      
         and while the thing has really gotten relatively much slower, we're
         still talking about something almost unmeasurable (eg 5ms). And that
         really should be pretty much the worst case.
      
         So we lose 5ms on one "benchmark", but win 22ms on another. Pick your
         poison - this patch has the advantage that it will _likely_ speed up
         the cases that are complex and expensive more than it slows down the
         cases that are already so fast that nobody cares. But if you look at
         relative speedups/slowdowns, it doesn't look so good.
      
       - It should be simple and clean
      
         The code may be a bit subtle (the reasons I do hash removal the way I
         do etc), but it re-uses the existing hash.c files, so it really is
         fairly small and straightforward apart from a few odd details.
      
      Now, this patch on its own doesn't really do much, but I think it's worth
      looking at, if only because if done correctly, the name hashing really can
      make an improvement to the whole issue of "do we have a filename that
      looks like this in the index already". And at least it gets real testing
      by being used even by default (ie there is a real use-case for it even
      without any insane filesystems).
      
      NOTE NOTE NOTE! The current hash is a joke. I'm ashamed of it, I'm just
      not ashamed of it enough to really care. I took all the numbers out of my
      nether regions - I'm sure it's good enough that it works in practice, but
      the whole point was that you can make a really much fancier hash that
      hashes characters not directly, but by their upper-case value or something
      like that, and thus you get a case-insensitive hash, while still keeping
      the name and the index itself totally case sensitive.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
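The "fancier hash" alluded to at the end of this message might look something like the sketch below. The function name, seed, and multiplier are arbitrary choices for illustration, not values from the patch; the point is only that hashing the upper-cased character sends names that differ in case to the same bucket, while the stored names stay case sensitive.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical case-insensitive name hash; constants are arbitrary. */
static unsigned int hash_name_icase(const char *name, size_t len)
{
	unsigned int hash = 0x123;
	size_t i;

	assert(name != NULL);
	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)name[i];
		hash = hash * 101 + (unsigned int)toupper(c);
	}
	return hash;
}
```

A lookup would still compare the candidates found in the bucket with whatever equivalence rule applies; the hash only narrows the search.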
  24. 22 Jan, 2008 1 commit
    • Make on-disk index representation separate from in-core one · 7a51ed66
      Linus Torvalds committed
      This converts the index explicitly on read and write to its on-disk
      format, allowing the in-core format to contain more flags, and be
      simpler.
      
      In particular, the in-core format is now host-endian (as opposed to the
      on-disk one that is network endian in order to be able to be shared
      across machines) and as a result we can dispense with all the
      htonl/ntohl on accesses to the cache_entry fields.
      
      This will make it easier to make use of various temporary flags that do
      not exist in the on-disk format.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  25. 17 Dec, 2007 1 commit
  26. 29 Nov, 2007 2 commits
    • per-directory-exclude: lazily read .gitignore files · 63d285c8
      Junio C Hamano committed
      Operations that walk directories or trees, which potentially need to
      consult the .gitignore files, used to always try to open the .gitignore
      file every time they entered a new directory, even when they ended up
      not needing to call excluded() function to see if a path in the
      directory is ignored.  This was done by push/pop exclude_per_directory()
      functions that managed the data in a stack.
      
      This changes the directory walking API to remove the need to call these
      two functions.  Instead, the directory walk data structure caches the
      data used by excluded() function the last time, and lazily reuses it as
      much as possible.  Among the data the last check used, the ones from
      deeper directories that the path we are checking is outside are
      discarded, data from the common leading directories are reused, and then
      the directories between the common directory and the directory the path
      being checked is in are checked for .gitignore file.  This is very
      similar to the way gitattributes are handled.
      
      This API change also fixes "ls-files -c -i", which called excluded()
      without setting up the gitignore data via the old push/pop functions.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • dir.c: minor clean-up · 686a4a06
      Junio C Hamano committed
      Replace handcrafted reallocation with ALLOC_GROW().
      Reindent "file_exists()" helper function.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
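For readers unfamiliar with this kind of helper, a grow-on-demand macro can be sketched as below. GROW_ARRAY(), GROW_NR(), and push_int() are hypothetical names, and the growth formula is an assumption made for the example rather than a statement about ALLOC_GROW() itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed growth policy: a small constant plus 50%. */
#define GROW_NR(x) (((x) + 16) * 3 / 2)

/* Ensure `arr` has room for `nr` elements, updating `alloc` if it grows. */
#define GROW_ARRAY(arr, nr, alloc) \
	do { \
		if ((nr) > (alloc)) { \
			size_t new_alloc_ = GROW_NR(alloc); \
			void *new_arr_; \
			if (new_alloc_ < (size_t)(nr)) \
				new_alloc_ = (nr); \
			new_arr_ = realloc((arr), new_alloc_ * sizeof(*(arr))); \
			if (new_arr_ == NULL) \
				abort(); \
			(arr) = new_arr_; \
			(alloc) = new_alloc_; \
		} \
	} while (0)

/* Example use: append one value to a growable int array. */
static void push_int(int **arr, size_t *nr, size_t *alloc, int value)
{
	assert(arr != NULL && nr != NULL && alloc != NULL);
	GROW_ARRAY(*arr, *nr + 1, *alloc);
	(*arr)[(*nr)++] = value;
}
```

Centralizing the growth logic in one macro removes the repeated, slightly different reallocation code that each call site would otherwise carry.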
  27. 23 Nov, 2007 1 commit
  28. 17 Nov, 2007 2 commits
    • Fix per-directory exclude handling for "git add" · 0e06cc8b
      Junio C Hamano committed
      In "dir_struct", each exclusion element in the exclusion stack records a
      base string (pointer to the beginning with length) so that we can tell
      where it came from, but this pointer is just pointing at the parameter
      that is given by the caller to the push_exclude_per_directory()
      function.
      
      While read_directory_recursive() runs, calls to excluded() make use of
      the data in the exclusion elements, including this base string.  The
      caller of read_directory_recursive() is not supposed to free the
      buffer it gave to push_exclude_per_directory() earlier, until it
      returns.
      
      The test case Bruce Stephens gave in the mailing list discussion
      was simplified and added to the t3700 test.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • core.excludesfile clean-up · dcf0c16e
      Junio C Hamano committed
      There are inconsistencies in the way commands currently handle
      the core.excludesfile configuration variable.  The problem is
      the variable is too new to be noticed by anything other than
      git-add and git-status.
      
       * git-ls-files does not notice any of the "ignore" files by
         default, as it predates the standardized set of ignore files.
         The calling scripts established the convention to use
         .git/info/exclude, .gitignore, and later core.excludesfile.
      
       * git-add and git-status know about it because they call
         add_excludes_from_file() directly with their own notion of
         which standard set of ignore files to use.  This is just a
         stupid duplication of code that needs to be updated every time
         the definition of the standard set of ignore files is
         changed.
      
       * git-read-tree takes --exclude-per-directory=<gitignore>,
         not because the flexibility was needed.  Again, this was
         because the option predates the standardization of the ignore
         files.
      
       * git-merge-recursive uses hardcoded per-directory .gitignore
         and nothing else.  git-clean (scripted version) does not
         honor core.* because its call to underlying ls-files does not
         know about it.  git-clean in C (parked in 'pu') doesn't either.
      
      We probably could change git-ls-files to use the standard set
      when no excludes are specified on the command line and ignore
      processing was asked, or something like that, but that will be a
      change in semantics and might break people's scripts in a subtle
      way.  I am somewhat reluctant to make such a change.
      
      On the other hand, I think it makes perfect sense to fix
      git-read-tree, git-merge-recursive and git-clean to follow the
      same rule as other commands.  I do not think of a valid use case
      to give an exclude-per-directory that is nonstandard to
      read-tree command, outside a "negative" test in the t1004 test
      script.
      
      This patch is the first step to untangle this mess.
      
      The next step would be to teach read-tree, merge-recursive and
      clean (in C) to use setup_standard_excludes().
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  29. 16 Nov, 2007 1 commit
    • Fix per-directory exclude handling for "git add" · 41a7aa58
      Junio C Hamano committed
      In "dir_struct", each exclusion element in the exclusion stack records a
      base string (pointer to the beginning with length) so that we can tell
      where it came from, but this pointer is just pointing at the parameter
      that is given by the caller to the push_exclude_per_directory()
      function.
      
      While read_directory_recursive() runs, calls to excluded() make use of
      the data in the exclusion elements, including this base string.  The
      caller of read_directory_recursive() is not supposed to free the
      buffer it gave to push_exclude_per_directory() earlier, until it
      returns.
      
      The test case Bruce Stephens gave in the mailing list discussion
      was simplified and added to the t3700 test.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  30. 15 Nov, 2007 1 commit
    • core.excludesfile clean-up · 039bc64e
      Junio C Hamano committed
      There are inconsistencies in the way commands currently handle
      the core.excludesfile configuration variable.  The problem is
      the variable is too new to be noticed by anything other than
      git-add and git-status.
      
       * git-ls-files does not notice any of the "ignore" files by
         default, as it predates the standardized set of ignore files.
         The calling scripts established the convention to use
         .git/info/exclude, .gitignore, and later core.excludesfile.
      
       * git-add and git-status know about it because they call
         add_excludes_from_file() directly with their own notion of
         which standard set of ignore files to use.  This is just a
         stupid duplication of code that needs to be updated every time
         the definition of the standard set of ignore files is
         changed.
      
       * git-read-tree takes --exclude-per-directory=<gitignore>,
         not because the flexibility was needed.  Again, this was
         because the option predates the standardization of the ignore
         files.
      
       * git-merge-recursive uses hardcoded per-directory .gitignore
         and nothing else.  git-clean (scripted version) does not
         honor core.* because its call to underlying ls-files does not
         know about it.  git-clean in C (parked in 'pu') doesn't either.
      
      We probably could change git-ls-files to use the standard set
      when no excludes are specified on the command line and ignore
      processing was asked, or something like that, but that will be a
      change in semantics and might break people's scripts in a subtle
      way.  I am somewhat reluctant to make such a change.
      
      On the other hand, I think it makes perfect sense to fix
      git-read-tree, git-merge-recursive and git-clean to follow the
      same rule as other commands.  I do not think of a valid use case
      to give an exclude-per-directory that is nonstandard to
      read-tree command, outside a "negative" test in the t1004 test
      script.
      
      This patch is the first step to untangle this mess.
      
      The next step would be to teach read-tree, merge-recursive and
      clean (in C) to use setup_standard_excludes().
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  31. 09 Nov, 2007 1 commit
  32. 30 Oct, 2007 1 commit
    • Speedup scanning for excluded files. · 68492fc7
      Lars Knoll committed
      Try to avoid a lot of work scanning for excluded files,
      by caching some more information when setting up the exclusion
      data structure.
      
      Speeds up 'git runstatus' on a repository containing the Qt sources by 30% and
      reduces the amount of instructions executed (as measured by valgrind) by a
      factor of 2.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  33. 21 Oct, 2007 1 commit
    • Fix directory scanner to correctly ignore files without d_type · 07134421
      Linus Torvalds committed
      On Fri, 19 Oct 2007, Todd T. Fries wrote:
      > If DT_UNKNOWN exists, then we have to do a stat() of some form to
      > find out the right type.
      
      That happened in the case of a pathname that was ignored, and we did
      not ask for "dir->show_ignored". That test used to be *together*
      with the "DTYPE(de) != DT_DIR", but splitting the two tests up
      means that we can do that (common) test before we even bother to
      calculate the real dtype.
      
      Of course, that optimization only matters for systems that don't
      have, or don't fill in DTYPE properly.
      
      I also clarified the real relationship between "exclude" and
      "dir->show_ignored". It used to do
      
      	if (exclude != dir->show_ignored) {
      		..
      
      which wasn't exactly obvious, because it triggers for two different
      cases:
      
       - the path is marked excluded, but we are not interested in ignored
         files: ignore it
      
       - the path is *not* excluded, but we *are* interested in ignored
         files: ignore it unless it's a directory, in which case we might
         have ignored files inside the directory and need to recurse
         into it).
      
      so this splits them into those two cases, since the first case
      doesn't even care about the type.
      
      I also made the DT_UNKNOWN case a separate helper function,
      and added some commentary to the cases.
      
      		Linus
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
  34. 30 Sep, 2007 1 commit