Commits · 1c2b1f7018ba7d5f6a5d949e29e4eaeeef3261e2 · 李少辉-开发者 / git

11 Oct, 2016 1 commit

ls-files: add pathspec matching for submodules · 75a6315f

Authored by Brandon Williams on Oct 07, 2016

Pathspecs can be a bit tricky when trying to apply them to submodules.
The main challenge is that the pathspecs will be with respect to the
superproject and not with respect to paths in the submodule. The
approach this patch takes is to pass the identical pathspec from the
superproject to the submodule, along with the submodule-prefix, which
is the path from the root of the superproject to the submodule. An
entry in the submodule, prepended with the submodule-prefix, can then
be compared to the pathspec to determine if there is a match.

This patch also permits the pathspec logic to perform a prefix match against
submodules since a pathspec could refer to a file inside of a submodule.
Due to limitations in the wildmatch logic, a prefix match is only done
literally. If any wildcard character is encountered we'll simply punt
and produce a false positive match. More accurate matching will be done
once inside the submodule. This is due to the superproject not knowing
what files could exist in the submodule.
Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

75a6315f
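
The matching scheme described in this commit can be sketched roughly as follows. This is a minimal illustration in Python, not git's C code: the function names are hypothetical, and Python's `fnmatchcase` stands in for git's wildmatch, so details will differ from the real implementation.

```python
from fnmatch import fnmatchcase

WILDCARDS = "*?[\\"  # characters treated as glob-special in this sketch


def literal_prefix(pathspec):
    """Return the part of the pathspec before its first wildcard character."""
    for i, ch in enumerate(pathspec):
        if ch in WILDCARDS:
            return pathspec[:i]
    return pathspec


def may_match_inside_submodule(pathspec, submodule_path):
    """Superproject side: could the pathspec refer to a file inside the submodule?"""
    prefix = submodule_path.rstrip("/") + "/"
    literal = literal_prefix(pathspec)
    if len(literal) < len(pathspec):
        # A wildcard was found: compare only the overlapping literal part and
        # otherwise punt, accepting a possible false positive.
        n = min(len(literal), len(prefix))
        return literal[:n] == prefix[:n]
    return pathspec.startswith(prefix)


def matches_in_submodule(pathspec, submodule_prefix, entry):
    """Submodule side: prepend the submodule-prefix, then match accurately."""
    full = submodule_prefix + entry
    if full == pathspec or full.startswith(pathspec.rstrip("/") + "/"):
        return True
    return fnmatchcase(full, pathspec)
```

For example, `may_match_inside_submodule("*.c", "sub")` is true in the superproject even though nothing is known about the submodule's files; the precise answer comes from `matches_in_submodule("*.c", "sub/", entry)` once inside the submodule.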

05 May, 2016 1 commit

remove_dir_recursively(): add docstring · 728af283

Authored by Michael Haggerty on Apr 24, 2016

Add a docstring for the remove_dir_recursively() function and the
REMOVE_DIR_* flags that can be passed to it.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>

728af283

23 Apr, 2016 2 commits

dir.c: rename str(n)cmp_icase to fspath(n)cmp · ba0897e6

Authored by Nguyễn Thái Ngọc Duy on Apr 22, 2016

These functions compare two paths that are taken from the file system.
Depending on the running file system, paths may need to be compared
case-sensitively or not, and maybe even something else in the future. The
current names do not convey that well.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ba0897e6

dir.c: remove dead function fnmatch_icase() · 423b592a

Authored by Nguyễn Thái Ngọc Duy on Apr 22, 2016

It was largely replaced by fnmatch_icase_mem() and its last use was in
84b8b5d1 (remove match_pathspec() in favor of match_pathspec_depth() -
2013-07-14).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

423b592a

19 Mar, 2016 1 commit

Revert "Merge branch 'nd/exclusion-regression-fix'" · 5cee3493

Authored by Junio C Hamano on Mar 18, 2016

This reverts commit 5e57f9c3, reversing
changes made to e79112d2.

We will be postponing the nd/exclusion-regression-fix topic to a later
cycle.

5cee3493

02 Mar, 2016 1 commit

dir: store EXC_FLAG_* values in unsigned integers · f8708998

Authored by Saurav Sachidanand on Mar 01, 2016

The values defined by the macro EXC_FLAG_* (1, 4, 8, 16) are stored
in fields of the structs "pattern" and "exclude", some function
arguments and a local variable.  None of these uses its most
significant bit in any special way and there is no good reason to
use a signed integer for them.

And while we're at it, document "flags" of "exclude" to explicitly
state the values it's supposed to take on.
Signed-off-by: Saurav Sachidanand <sauravsachidanand@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

f8708998

16 Feb, 2016 1 commit

dir.c: support marking some patterns already matched · c62a9173

Authored by Nguyễn Thái Ngọc Duy on Feb 15, 2016

Given path "a" and the pattern "a", it's matched. But if we throw path
"a/b" to pattern "a", the code fails to realize that if "a" matches
"a" then "a/b" should also be matched.

When the pattern is matched the first time, we can mark it "sticky", so
that all files and dirs inside the matched path also match. This is a
simpler solution than modifying all match scenarios to fix that.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

c62a9173
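
The "sticky" idea can be illustrated with a small Python sketch. The class and attribute names are hypothetical and the pattern matching is reduced to exact string comparison, so this only shows the bookkeeping, not git's actual exclude logic.

```python
class StickyPattern:
    """A pattern that remembers the paths it has already matched."""

    def __init__(self, text):
        self.text = text
        self.sticky_paths = []  # paths matched earlier by this pattern

    def matches(self, path):
        # Anything below an already matched path is treated as matched too.
        for base in self.sticky_paths:
            if path.startswith(base + "/"):
                return True
        # Simplification: a real pattern would go through wildcard matching.
        if path == self.text:
            self.sticky_paths.append(path)
            return True
        return False
```

Calling `matches("a/b")` before `matches("a")` returns false, which mirrors the failure described above; once "a" has matched, the same call returns true.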

26 Jan, 2016 3 commits

dir: simplify untracked cache "ident" field · 0e0f7618

Authored by Christian Couder on Jan 24, 2016

It is not a good idea to compare kernel versions and disable
the untracked cache if it changes, as people may upgrade and
still want the untracked cache to work. So let's just
compare work tree locations and kernel name to decide if we
should disable it.

Also storing many locations in the ident field and comparing
to any of them can be dangerous if GIT_WORK_TREE is used with
different values. So let's just store one location, the
location of the current work tree.

The downside is that untracked cache can only be used by one
type of OS for now. Exporting a git repo to different clients
via a network to e.g. Linux and Windows means that only one
can use the untracked cache.

If the location changed in the ident field and we still want
an untracked cache, let's delete the cache and recreate it.

Note that if an untracked cache has been created by a
previous Git version, then the kernel version is stored in
the ident field. As we now compare with just the kernel
name the comparison will fail and the untracked cache will
be disabled until it's recreated.
Helped-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

0e0f7618
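
A minimal Python sketch of the comparison described above, assuming the ident field is a single string built from the work tree location and the kernel name. The string layout and the function names are assumptions made for illustration, not git's on-disk format.

```python
def make_ident(work_tree, kernel_name):
    """Build the ident string for the current work tree (assumed layout)."""
    return f"Location {work_tree}, system {kernel_name}"


def cache_usable(stored_ident, work_tree, kernel_name):
    """The cache is kept only if the stored ident matches the current one."""
    return stored_ident == make_ident(work_tree, kernel_name)
```

Under this scheme an ident written with a kernel version appended, or with a different work tree location, no longer compares equal, so the cache is disabled until it is recreated.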

dir: add remove_untracked_cache() · 07b29bfd

Authored by Christian Couder on Jan 24, 2016

Factor out code into remove_untracked_cache(), which will be used
in a later commit.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

07b29bfd

dir: add {new,add}_untracked_cache() · 4a4ca479

Authored by Christian Couder on Jan 24, 2016

Factor out code into new_untracked_cache() and
add_untracked_cache(), which will be used
in later commits.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

4a4ca479

25 Mar, 2015 1 commit

report_path_error(): move to dir.c · 777c55a6

Authored by Junio C Hamano on Mar 24, 2015

The expected call sequence is for the caller to use match_pathspec()
repeatedly on a set of pathspecs, accumulating the "hits" in a
separate array, and then call this function to diagnose a pathspec
that never matched anything, as that can indicate a typo from the
command line, e.g. "git commit Maekfile".

Many builtin commands use this function from builtin/ls-files.c,
which is not a very healthy arrangement.  ls-files might have been
the first command to feel the need for such a helper, but the need
is shared by everybody who uses the "match and then report" pattern.

Move it to dir.c where match_pathspec() is defined.
Signed-off-by: Junio C Hamano <gitster@pobox.com>

777c55a6
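
The "match and then report" call sequence can be sketched in Python as below. The function names echo the ones in the commit message, but the signatures and the simplified prefix matching are assumptions for illustration only.

```python
def match_pathspec(pathspecs, path, seen):
    """Return True if path matches any pathspec, recording hits in seen[]."""
    matched = False
    for i, spec in enumerate(pathspecs):
        if path == spec or path.startswith(spec.rstrip("/") + "/"):
            seen[i] = True
            matched = True
    return matched


def unmatched_pathspecs(pathspecs, seen):
    """Return the pathspecs that never matched anything (likely typos)."""
    return [spec for spec, hit in zip(pathspecs, seen) if not hit]
```

A caller would allocate `seen = [False] * len(pathspecs)`, run `match_pathspec()` over every candidate path, and only then ask which pathspecs were never hit, which is how a typo such as "Maekfile" would surface.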

13 Mar, 2015 9 commits

untracked cache: guard and disable on system changes · 1e8fef60

Authored by Nguyễn Thái Ngọc Duy on Mar 08, 2015

If the user enables untracked cache, then

 - moves the worktree to an unsupported filesystem
 - or simply upgrades the OS
 - or moves the whole (portable) disk from one machine to another
 - or accesses a shared fs from another machine

there's no guarantee that untracked cache can still function properly.
Record the worktree location and OS footprint in the cache. If it
changes, err on the safe side and disable the cache. The user can
'update-index --untracked-cache' again to make sure all conditions are
met.

This adds a new requirement that setup_git_directory* must be called
before read_cache() because we need worktree location by then, or the
cache is dropped.

This change does not cover all bases; you can fool it if you try
hard. The point is to stop accidents.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: brian m. carlson <sandals@crustytoothpaste.net>
Helped-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

1e8fef60

untracked cache: save to an index extension · 83c094ad

Authored by Nguyễn Thái Ngọc Duy on Mar 08, 2015

Helped-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

83c094ad

untracked cache: invalidate at index addition or removal · e931371a

Authored by Nguyễn Thái Ngọc Duy on Mar 08, 2015

Ideally we should implement untracked_cache_remove_from_index() and
untracked_cache_add_to_index() so that they update untracked cache
right away instead of invalidating it and waiting for read_directory()
next time to deal with it. But that may need some more work in
unpack-trees.c. So stay simple as the first step.

The new call in add_index_entry_with_check() may look strange because
new calls usually stay close to cache_tree_invalidate_path(). We do it
a bit later than c_t_i_p() in this function because if it's about
replacing the entry with the same name, we don't care (but cache-tree
does).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

e931371a

untracked cache: load from UNTR index extension · f9e6c649

Authored by Nguyễn Thái Ngọc Duy on Mar 08, 2015

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

f9e6c649

untracked cache: mark what dirs should be recursed/saved · 26cb0182

Authored by Nguyễn Thái Ngọc Duy on Mar 08, 2015

If we redo this thing in a functional style, we would have one struct
untracked_dir as input tree and another as output. The input is used
for verification. The output is a brand new tree, reflecting current
worktree.

But that means recreating a lot of dir nodes even if a lot could be
shared between input and output trees in good cases. So we go with the
messy but efficient way, combining both input and output trees into
one. We need a way to know which node in this combined tree belongs to
the output. This is the purpose of this "recurse" flag.

"valid" bit can't be used for this because it's about data of the node
except the subdirs. When we invalidate a directory, we want to keep
cached data of the subdirs intact even though we don't really know
which subdirs still exist (yet). Then we check the worktree to see which
subdirs actually remain on disk. Those will have the 'recurse' bit set
again. If cached data for those is still valid, we may be able to
avoid computing exclude files for them. Subdirs that were deleted
will have 'recurse' remain clear, and their 'valid' bits do not
matter.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

26cb0182

untracked cache: record/validate dir mtime and reuse cached output · 91a2288b

Authored by Nguyễn Thái Ngọc Duy on Mar 08, 2015

The main readdir loop in read_directory_recursive() is replaced with a
new one that checks if cached results of a directory are still valid.

If a file is added or removed from the index, the containing directory
is invalidated (but not its subdirs). If directory's mtime is changed,
the same happens. If a .gitignore is updated, the containing directory
and all subdirs are invalidated recursively. If dir_struct#flags or
other conditions change, the cache is ignored.

If a directory is invalidated, we opendir/readdir/closedir and run the
exclude machinery on that directory listing as usual. If untracked
cache is also enabled, we'll update the cache along the way. If a
directory is validated, we simply pull the untracked listing out from
the cache. The cache also records the list of direct subdirs that we
have to recurse in. Fully excluded directories are seen as "untracked
files".

In the best case when no dirs are invalidated, read_directory()
becomes a series of

  stat(dir), open(.gitignore), fstat(), read(), close() and optionally
  hash_sha1_file()

For comparison, standard read_directory() is a sequence of

  opendir(), readdir(), open(.gitignore), fstat(), read(), close(), the
  expensive last_exclude_matching() and closedir().

We already try not to open(.gitignore) if we know it does not exist,
so open/fstat/read/close sequence does not apply to every
directory. The sequence could be reduced further, as noted in
prep_exclude() in another patch. So in theory, the entire best-case
read_directory sequence could be reduced to a series of stat() and
nothing else.

This is not a silver bullet approach. When you compile a C file, for
example, the old .o file is removed and a new one with the same name
created, effectively invalidating the containing directory's cache
(but not its subdirectories). If your build process touches every
directory, this cache adds extra overhead for nothing, so it's a good
idea to separate generated files from tracked files. Editors may use
the same strategy for saving files. And of course you're out of luck
running your repo on an unsupported filesystem and/or operating system.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

91a2288b
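
The core of the mtime check can be sketched in a few lines of Python. This is a deliberately reduced model: it caches the raw directory listing keyed by the directory's mtime and ignores the index, .gitignore handling and subdirectory recursion that the real code deals with. The class and attribute names are hypothetical.

```python
import os


class DirListingCache:
    """Reuse a directory listing as long as the directory's mtime is unchanged."""

    def __init__(self):
        self.dirs = {}  # path -> (mtime_ns, sorted entry names)
        self.scans = 0  # number of real directory scans performed

    def list_dir(self, path):
        mtime = os.stat(path).st_mtime_ns
        cached = self.dirs.get(path)
        if cached is not None and cached[0] == mtime:
            # Directory validated: one stat() and no readdir pass.
            return cached[1]
        # Directory invalidated: rescan and refresh the cache along the way.
        self.scans += 1
        names = sorted(os.listdir(path))
        self.dirs[path] = (mtime, names)
        return names
```

The sketch depends on the same requirement the series states: the directory's stat data must change whenever an entry is added or removed, otherwise stale listings would be returned.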

untracked cache: initial untracked cache validation · ccad261f

Authored by Nguyễn Thái Ngọc Duy on Mar 08, 2015

Make sure the starting conditions and all global exclude files are
good to go. If not, either disable untracked cache completely, or wipe
out the cache and start fresh.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ccad261f

untracked cache: record .gitignore information and dir hierarchy · 0dcb8d7f

Authored by Nguyễn Thái Ngọc Duy on Mar 08, 2015

The idea is if we can capture all input and (non-recursive) output of
read_directory_recursive(), and can verify later that all the input is
the same, then the second r_d_r() should produce the same output as in
the first run.

The requirement for this to work is stat info of a directory MUST
change if an entry is added to or removed from that directory (and
should not change often otherwise). If your OS and filesystem do not
meet this requirement, untracked cache is not for you. Most file
systems on *nix should be fine. On Windows, NTFS is fine while FAT may
not be [1] even though FAT on Linux seems to be fine.

The list of input of r_d_r() is in the big comment block in dir.h. In
short, the output of a directory (not counting subdirs) mainly depends
on stat info of the directory in question, all .gitignore leading to
it and the check_only flag when r_d_r() is called recursively. This
patch records all this info (and the output) as r_d_r() runs.

Two hash_sha1_file() are required for $GIT_DIR/info/exclude and
core.excludesfile unless their stat data matches. hash_sha1_file() is
only needed when .gitignore files in the worktree are modified,
otherwise their SHA-1 in index is used (see the previous patch).

We could store stat data for .gitignore files so we don't have to
rehash them if their content is different from index, but I think
.gitignore files are rarely modified, so not worth extra cache data
(and hashing penalty read-cache.c:verify_hdr(), as we will be storing
this as an index extension).

The implication is, if you change .gitignore, you better add it to the
index soon or you lose all the benefit of untracked cache because a
modified .gitignore invalidates all subdirs recursively. This is
especially bad for .gitignore at root.

This cached output is about untracked files only, not ignored files,
because the number of untracked files is usually small, so the cache
overhead is small, while the number of ignored files could go really high
(e.g. *.o files mixing with source code).

[1] "Description of NTFS date and time stamps for files and folders"
    http://support.microsoft.com/kb/299648
Helped-by: Torsten Bögershausen <tboegi@web.de>
Helped-by: David Turner <dturner@twopensource.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

0dcb8d7f

dir.c: optionally compute sha-1 of a .gitignore file · 55fe6f51

Authored by Nguyễn Thái Ngọc Duy on Mar 08, 2015

This is not used anywhere yet. But the goal is to compare quickly if a
.gitignore file has changed when we have the SHA-1 of both old (cached
somewhere) and new (from index or a tree) versions.
Signed-off-by...
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

55fe6f51
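
Comparing two versions of a .gitignore file by hash can be sketched as follows, assuming the SHA-1 in question is the usual git blob object name (SHA-1 over a `blob <size>\0` header followed by the content). The helper names are hypothetical.

```python
import hashlib


def blob_sha1(content: bytes) -> str:
    """Compute the git blob object name for the given file content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def gitignore_changed(cached_sha1: str, current_content: bytes) -> bool:
    """True if the current content no longer matches the cached SHA-1."""
    return blob_sha1(current_content) != cached_sha1
```

When the new SHA-1 is already available from the index or a tree, the comparison needs no file read at all; hashing the content is only the fallback for a file modified in the worktree.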

15 Jul, 2014 2 commits

prep_exclude: remove the artificial PATH_MAX limit · aceb9429

Authored by Nguyễn Thái Ngọc Duy on Jul 14, 2014

This fixes a segfault in git-status with long paths on Windows,
where PATH_MAX is only 260.

This also fixes the problem of silently ignoring .gitignore if the
full path exceeds PATH_MAX. Now add_excludes_from_file() will report
if it gets ENAMETOOLONG.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

aceb9429

dir.h: move struct exclude declaration to top level · 709359c8

Authored by Nguyễn Thái Ngọc Duy on Jul 14, 2014

There is no actual nested struct here. Move it out for clarity.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

709359c8

25 Feb, 2014 4 commits

pathspec: pass directory indicator to match_pathspec_item() · ae8d0824

Authored by Nguyễn Thái Ngọc Duy on Jan 24, 2014

This patch activates the DO_MATCH_DIRECTORY code in m_p_i(), which
makes "git diff HEAD submodule/" and "git diff HEAD submodule" produce
the same output. Previously only the version without trailing slash
returned the difference (if any).

That's the effect of new ce_path_match(). dir_path_match() is not
executed by the new tests. And it should not introduce regressions.

Previously if path "dir/" is passed in with pathspec "dir/", they
obviously match. With new dir_path_match(), the path becomes
_directory_ "dir" vs pathspec "dir/", which is not executed by the old
code path in m_p_i(). The new code path is executed and produces the
same result.

The other case is pathspec "dir" with path "dir/", which is now turned
into "dir" (with DO_MATCH_DIRECTORY). Still the same result before or
after the patch.

So why change? Because of the next patch about clean.c.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ae8d0824
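
The effect of passing a directory indicator can be sketched in Python as below. The function name and the `is_directory` parameter are hypothetical stand-ins for the DO_MATCH_DIRECTORY behavior and cover only literal pathspecs.

```python
def match_item(pathspec, path, is_directory=False):
    """Match a literal pathspec against a path, honoring a trailing slash."""
    if pathspec.endswith("/"):
        base = pathspec[:-1]
        if path == base:
            # "dir/" names a directory, so it matches "dir" only when the
            # caller says the path is one (e.g. a submodule entry).
            return is_directory
        return path.startswith(pathspec)
    return path == pathspec or path.startswith(pathspec + "/")
```

With the indicator set, `submodule/` and `submodule` both match the entry "submodule", which corresponds to the two `git diff HEAD` invocations above producing the same output.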

pathspec: rename match_pathspec_depth() to match_pathspec() · 854b0959

Authored by Nguyễn Thái Ngọc Duy on Jan 24, 2014

A long time ago, for some reason I was not happy with
match_pathspec(). I created a better version, match_pathspec_depth(),
that was supposed to replace match_pathspec()
eventually. match_pathspec() has now been gone for 6 months.
Use the shorter name for match_pathspec_depth().
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

854b0959

pathspec: convert some match_pathspec_depth() to dir_path_match() · ebb32893

Authored by Nguyễn Thái Ngọc Duy on Jan 24, 2014

This helps reduce the number of match_pathspec_depth() call sites and
show how m_p_d() is used. Its usage is:

 - match against an index entry (ce_path_match or match_pathspec_depth
   in ls-files)

 - match against a dir_entry from read_directory (dir_path_match and
   match_pathspec_depth in clean.c, which will be converted later)

 - resolve-undo (rerere.c and ls-files.c)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ebb32893

pathspec: convert some match_pathspec_depth() to ce_path_match() · 429bb40a

Authored by Nguyễn Thái Ngọc Duy on Jan 24, 2014

This helps reduce the number of match_pathspec_depth() call sites and
show how match_pathspec_depth() is used.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

429bb40a

16 Aug, 2013 1 commit

ls-files -k: a directory only can be killed if the index has a non-directory · 2eac2a4c

Authored by Junio C Hamano on Aug 15, 2013

"ls-files -o" and "ls-files -k" both traverse the working tree down
to find either all untracked paths or those that will be "killed"
(removed from the working tree to make room) when the paths recorded
in the index are checked out.  It is necessary to traverse the
working tree fully when enumerating all the "other" paths, but when
we are only interested in "killed" paths, we can take advantage of
the fact that paths that do not overlap with entries in the index
can never be killed.

The treat_one_path() helper function, which is called during the
recursive traversal, is the ideal place to implement an
optimization.

When we are looking at a directory P in the working tree, there are
three cases:

 (1) P exists in the index.  Everything inside the directory P in
     the working tree needs to go when P is checked out from the
     index.

 (2) P does not exist in the index, but there is P/Q in the index.
     We know P will stay a directory when we check out the contents
     of the index, but we do not know yet if there is a directory
     P/Q in the working tree to be killed, so we need to recurse.

 (3) P does not exist in the index, and there is no P/Q in the index
     to require P to be a directory, either.  Only in this case, we
     know that everything inside P will not be killed without
     recursing.

Note that this helper is called by treat_leading_path() that decides
if we need to traverse only subdirectories of a single common
leading directory, which is essential for this optimization to be
correct.  This caller checks each level of the leading path
component from shallower directory to deeper ones, and that is what
allows us to only check if the path appears in the index.  If the
call to treat_one_path() weren't there, given a path P/Q/R, the real
traversal may start from directory P/Q/R, even when the index
records P as a regular file, and we would end up having to check if
any leading subpath in P/Q/R, e.g. P, appears in the index.
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2eac2a4c
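
The three cases for a working-tree directory P can be written down as a small decision function. This Python sketch models the index as a plain set of path strings; the function name and return values are hypothetical.

```python
def directory_action(index_paths, p):
    """Decide how to treat working-tree directory p when looking for killed paths."""
    if p in index_paths:
        # Case 1: the index has P itself, so everything inside the
        # directory must go when P is checked out.
        return "killed"
    if any(q.startswith(p + "/") for q in index_paths):
        # Case 2: the index has some P/Q, so P stays a directory, but
        # something below it may still be killed: recurse.
        return "recurse"
    # Case 3: nothing in the index overlaps with P, so nothing inside
    # it can be killed and the traversal can skip it.
    return "skip"
```

Only the third case saves work, which is the optimization the commit describes: directories that do not overlap with any index entry are never descended into.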

16 Jul, 2013 5 commits

pathspec: support :(glob) syntax · bd30c2e4

Authored by Nguyễn Thái Ngọc Duy on Jul 14, 2013

:(glob)path differs from a plain pathspec in that it uses wildmatch with
WM_PATHNAME while the other uses fnmatch without FNM_PATHNAME. The
difference lies in how '*' (and '**') is processed.

With the introduction of :(glob) and :(literal) and their global
options --[no]glob-pathspecs, the user can:

 - make everything literal by default via --noglob-pathspecs
   (--literal-pathspecs cannot be used for this purpose as it
   disables _all_ pathspec magic)

 - individually turn on globbing with :(glob)

 - make everything globbing by default via --glob-pathspecs

 - individually turn off globbing with :(literal)

The implication behind this is, there is no way to get back the default
matching behavior (i.e. fnmatch without FNM_PATHNAME). You either get
new globbing or literal. The old fnmatch behavior is considered
deprecated and its use is discouraged.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

bd30c2e4
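
The difference in how '*' and '**' are processed can be demonstrated with a short Python sketch. Python's `fnmatchcase` behaves like fnmatch without FNM_PATHNAME ('*' may cross '/'); `glob_match` is a hypothetical, heavily simplified stand-in for wildmatch with WM_PATHNAME that handles only '*', '?' and a `**/` path component, not bracket expressions or a trailing `/**`.

```python
import re
from fnmatch import fnmatchcase


def glob_to_regex(pattern):
    """Translate a pathname-aware glob into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        at_component_start = i == 0 or pattern[i - 1] == "/"
        if at_component_start and pattern.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more whole directories
            i += 3
        elif pattern[i] == "*":
            parts.append("[^/]*")     # '*' stops at a slash
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")      # '?' never matches a slash
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def glob_match(pattern, path):
    """Pathname-aware match: only '**/' may span directory levels."""
    return glob_to_regex(pattern).match(path) is not None
```

So `fnmatchcase("src/lib/a.c", "src/*.c")` is true, while `glob_match("src/*.c", "src/lib/a.c")` is false and `glob_match("src/**/a.c", "src/lib/a.c")` is true.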

remove match_pathspec() in favor of match_pathspec_depth() · 84b8b5d1

Authored by Nguyễn Thái Ngọc Duy on Jul 14, 2013

match_pathspec_depth was created to replace match_pathspec (see
61cf2820 (pathspec: add match_pathspec_depth() - 2010-12-15)). It took
more than two years, but the replacement finally happens :-)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

84b8b5d1

convert common_prefix() to use struct pathspec · 827f4d6c

Authored by Nguyễn Thái Ngọc Duy on Jul 14, 2013

The code now takes advantage of nowildcard_len field.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
827f4d6c

convert {read,fill}_directory to take struct pathspec · 7327d3d1

Authored by Nguyễn Thái Ngọc Duy on Jul 14, 2013

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

7327d3d1

add parse_pathspec() that converts cmdline args to struct pathspec · 87323bda

Authored by Nguyễn Thái Ngọc Duy on Jul 14, 2013

Currently to fill a struct pathspec, we do:

   const char **paths;
   paths = get_pathspec(prefix, argv);
   ...
   init_pathspec(&pathspec, paths);

"paths" can only carry bare strings, which loses information from
command line arguments such as pathspec magic or the prefix part's
length for each argument.

parse_pathspec() is introduced to combine the two calls into one. The
plan is to gradually replace all get_pathspec() and init_pathspec() with
parse_pathspec(). get_pathspec() now becomes a thin wrapper of
parse_pathspec().

parse_pathspec() allows the caller to reject the pathspec magics that
it does not support. When a new pathspec magic is introduced, we can
enable it per command after making sure that all underlying code has no
problem with the new magic.

"flags" parameter is currently unused. But it would allow callers to
pass certain instructions to parse_pathspec, for example forcing
literal pathspec when no magic is used.

With the introduction of parse_pathspec, there are now two functions
that can initialize struct pathspec: init_pathspec and
parse_pathspec. Any semantic changes in struct pathspec must be
reflected in both functions. init_pathspec() will be phased out in
favor of parse_pathspec().
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

87323bda
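
What a single parsing step keeps that a bare string would lose can be sketched in Python as below. This is an illustration under assumptions, not the C API: `PathspecItem`, `parse_pathspec_item` and the `supported` parameter are hypothetical names, and only the long `:(magic,...)` form is handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathspecItem:
    original: str     # the argument as typed on the command line
    match: str        # the string actually used for matching
    magic: frozenset  # names of the pathspec magic in effect
    prefix_len: int   # length of the prefix part within match


def parse_pathspec_item(arg, prefix="",
                        supported=frozenset({"top", "literal", "glob", "icase"})):
    """Parse one command line argument, rejecting magic the caller cannot handle."""
    magic = set()
    rest = arg
    if arg.startswith(":("):
        end = arg.find(")")
        if end < 0:
            raise ValueError(f"missing ')' in pathspec magic: {arg!r}")
        magic = {name for name in arg[2:end].split(",") if name}
        rest = arg[end + 1:]
    unsupported = magic - supported
    if unsupported:
        raise ValueError(f"unsupported pathspec magic: {sorted(unsupported)}")
    # "top" anchors the pattern at the repository root, so no prefix is added.
    used_prefix = "" if "top" in magic else prefix
    return PathspecItem(arg, used_prefix + rest, frozenset(magic), len(used_prefix))
```

A command that cannot handle, say, `glob` would pass a smaller `supported` set and get an error up front, which is the per-command opt-in the commit message describes.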

16 Apr, 2013 3 commits

dir.c: git-status --ignored: don't scan the work tree twice · 0aaf62b6

Authored by Karsten Blees on Apr 15, 2013

'git-status --ignored' still scans the work tree twice to collect
untracked and ignored files, respectively.

fill_directory / read_directory already supports collecting untracked and
ignored files in a single directory scan. However, the DIR_COLLECT_IGNORED
flag to enable this has some git-add specific side-effects (e.g. it
doesn't recurse into ignored directories, so listing ignored files with
--untracked=all doesn't work).

The DIR_SHOW_IGNORED flag doesn't list untracked files and returns ignored
files in dir_struct.entries[] (instead of dir_struct.ignored[] as
DIR_COLLECT_IGNORED). DIR_SHOW_IGNORED is used all throughout git.

We don't want to break the existing API, so let's introduce a new flag
DIR_SHOW_IGNORED_TOO that lists untracked as well as ignored files similar
to DIR_COLLECT_IGNORED, but will recurse into sub-directories based on the
other flags as DIR_SHOW_IGNORED does.

In dir.c::read_directory_recursive, add ignored files to either
dir_struct.entries[] or dir_struct.ignored[] based on the flags. Also move
the DIR_COLLECT_IGNORED case here so that filling result lists is in a
common place.

In wt-status.c::wt_status_collect_untracked, use the new flag and read
results from dir_struct.ignored[]. Remove the extra fill_directory call.

builtin/check-ignore.c doesn't call fill_directory, so setting the git-add
specific DIR_COLLECT_IGNORED flag has no effect here. Remove it for clarity.

Update API documentation to reflect the changes.

Performance: with this patch, 'git-status --ignored' is typically as fast
as 'git-status'.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

0aaf62b6

dir.c: replace is_path_excluded with now equivalent is_excluded API · b07bc8c8

Authored by Karsten Blees on Apr 15, 2013

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

b07bc8c8

dir.c: unify is_excluded and is_path_excluded APIs · 95c6f271

Authored by Karsten Blees on Apr 15, 2013

The is_excluded and is_path_excluded APIs are very similar, except for a
few noteworthy differences:

is_excluded doesn't handle ignored directories, so results for paths within
ignored directories are incorrect. This is probably based on the premise
that recursive directory scans should stop at ignored directories, which
is no longer true (in certain cases, read_directory_recursive currently
calls is_excluded *and* is_path_excluded to get correct ignored state).

is_excluded caches parsed .gitignore files of the last directory in struct
dir_struct. If the directory changes, it finds a common parent directory
and is very careful to drop only as much state as necessary. On the other
hand, is_excluded will also read and parse .gitignore files in already
ignored directories, which are completely irrelevant.

is_path_excluded correctly handles ignored directories by checking if any
component in the path is excluded. As it uses is_excluded internally, this
unfortunately forces is_excluded to drop and re-read all .gitignore files,
as there is no common parent directory for the root dir.

is_path_excluded tracks state in a separate struct path_exclude_check,
which is essentially a wrapper of dir_struct with two more fields. However,
as is_path_excluded also modifies dir_struct, it is not possible to e.g.
use multiple path_exclude_check structures with the same dir_struct in
parallel. The additional structure just unnecessarily complicates the API.

Teach is_excluded / prep_exclude about ignored directories: whenever
entering a new directory, first check if the entire directory is excluded.
Remember the excluded state in dir_struct. Don't traverse into already
ignored directories (i.e. don't read irrelevant .gitignore files).

Directories could also be excluded by exclude patterns specified on the
command line or .git/info/exclude, so we cannot simply skip prep_exclude
entirely if there's no .gitignore file name (dir_struct.exclude_per_dir).
Move this check to just before actually reading the file.

is_path_excluded is now equivalent to is_excluded, so we can simply
redirect to it (the public API is cleaned up in the next patch).

The performance impact of the additional ignored check per directory is
hardly noticeable when reading directories recursively (e.g. 'git status').
However, performance of git commands using the is_path_excluded API (e.g.
'git ls-files --cached --ignored --exclude-standard') is greatly improved
as this no longer re-reads .gitignore files on each call.

Here's some performance data from the linux and WebKit repos (best of 10
runs on a Debian Linux on SSD, core.preloadIndex=true):

       | ls-files -ci   |    status      | status --ignored
       | linux | WebKit | linux | WebKit | linux | WebKit
-------+-------+--------+-------+--------+-------+---------
before | 0.506 |  6.539 | 0.212 |  1.555 | 0.323 |  2.541
after  | 0.080 |  1.191 | 0.218 |  1.583 | 0.321 |  2.579
gain   | 6.325 |  5.490 | 0.972 |  0.982 | 1.006 |  0.985
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

95c6f271
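
The "gain" row in the table above appears to be the "before" timing divided by the "after" timing, so values above 1 mean a speedup. A short Python check of that reading, with the numbers copied from the table (the dictionary layout and function names are just for illustration):

```python
# (command, repository) -> (timing before, timing after), copied from the table
TIMINGS = {
    ("ls-files -ci", "linux"): (0.506, 0.080),
    ("ls-files -ci", "WebKit"): (6.539, 1.191),
    ("status", "linux"): (0.212, 0.218),
    ("status", "WebKit"): (1.555, 1.583),
    ("status --ignored", "linux"): (0.323, 0.321),
    ("status --ignored", "WebKit"): (2.541, 2.579),
}


def gain(before, after):
    """Speedup factor: how many times faster the 'after' run is."""
    return before / after


def gain_row():
    """Recompute the gain row, formatted to three decimals like the table."""
    return {key: f"{gain(b, a):.3f}" for key, (b, a) in TIMINGS.items()}
```

Read this way, only the `ls-files -ci` column changes substantially (roughly 5.5x to 6.3x), while the `status` columns stay within about 3% of their previous timings.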

07 Jan, 2013 4 commits

dir.c: improve docs for match_pathspec() and match_pathspec_depth() · 52ed1894

Authored by Adam Spiers on Jan 06, 2013

Fix a grammatical issue in the description of these functions, and
make it more obvious how and why seen[] can be reused across multiple
invocations.
Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

52ed1894

dir.c: provide clear_directory() for reclaiming dir_struct memory · 270be816

Authored by Adam Spiers on Jan 06, 2013

By the end of a directory traversal, a dir_struct instance will
typically contain pointers to various data structures on the heap.
clear_directory() provides a convenient way to reclaim that memory.
Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

270be816

dir.c: keep track of where patterns came from · c04318e4

Authored by Adam Spiers on Jan 06, 2013

For exclude patterns read in from files, the filename is stored in the
exclude list, and the originating line number is stored in the
individual exclude (counting starting at 1).

For exclude patterns provided on the command line, a string describing
the source of the patterns is stored in the exclude list, and the
sequence number assigned to each exclude pattern is negative, with
counting starting at -1.  So for example the 2nd pattern provided via
--exclude would be numbered -2.  This allows any future consumers of
that data to easily distinguish between exclude patterns from files
vs. from the CLI.
Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

c04318e4
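
The numbering convention can be illustrated with a few lines of Python. The function names and the list-of-tuples representation are hypothetical; only the sign convention comes from the commit message.

```python
def number_patterns(patterns, from_file):
    """Pair each pattern with its source position.

    Patterns read from a file get their 1-based line number; patterns given
    on the command line get negative sequence numbers starting at -1.
    """
    if from_file:
        return [(i, p) for i, p in enumerate(patterns, start=1)]
    return [(-i, p) for i, p in enumerate(patterns, start=1)]


def came_from_command_line(srcpos):
    """A negative position marks a pattern supplied on the command line."""
    return srcpos < 0
```

A consumer therefore needs no extra field to tell the two kinds of source apart: the sign of the position is enough.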

dir.c: use a single struct exclude_list per source of excludes · c082df24

Authored by Adam Spiers on Jan 06, 2013

Previously each exclude_list could potentially contain patterns
from multiple sources.  For example dir->exclude_list[EXC_FILE]
would typically contain patterns from .git/info/exclude and
core.excludesfile, and dir->exclude_list[EXC_DIRS] could contain
patterns from multiple per-directory .gitignore files during
directory traversal (i.e. when dir->exclude_stack was more than
one item deep).

We split these composite exclude_lists up into three groups of
exclude_lists (EXC_CMDL / EXC_DIRS / EXC_FILE as before), so that each
exclude_list now contains patterns from a single source.  This will
allow us to cleanly track the origin of each pattern simply by adding
a src field to struct exclude_list, rather than to struct exclude,
which would make memory management of the source string tricky in the
EXC_DIRS case where its contents are dynamically generated.

Similarly, by moving the filebuf member from struct exclude_stack to
struct exclude_list, it allows us to track and subsequently free
memory buffers allocated during the parsing of all exclude files,
rather than only tracking buffers allocated for files in the EXC_DIRS
group.
Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

c082df24

29 Dec, 2012 1 commit

dir.c: rename free_excludes() to clear_exclude_list() · f6198812

Authored by Adam Spiers on Dec 27, 2012

It is clearer to use a 'clear_' prefix for functions which empty
and deallocate the contents of a data structure without freeing
the structure itself, and a 'free_' prefix for functions which
also free the structure itself.

http://article.gmane.org/gmane.comp.version-control.git/206128
Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

f6198812
