1. 15 Oct, 2014 1 commit
    • color_parse: do not mention variable name in error message · f6c5a296
      Jeff King committed
      Originally the color-parsing function was used only for
      config variables. It made sense to pass the variable name so
      that the die() message could be something like:
      
        $ git -c color.branch.plain=bogus branch
        fatal: bad color value 'bogus' for variable 'color.branch.plain'
      
      These days we call it in other contexts, and the resulting
      error messages are a little confusing:
      
        $ git log --pretty='%C(bogus)'
        fatal: bad color value 'bogus' for variable '--pretty format'
      
        $ git config --get-color foo.bar bogus
        fatal: bad color value 'bogus' for variable 'command line'
      
      This patch teaches color_parse to complain only about the
      value, and then return an error code. Config callers can
      then propagate that up to the config parser, which mentions
      the variable name. Other callers can provide a custom
      message. After this patch these three cases now look like:
      
        $ git -c color.branch.plain=bogus branch
        error: invalid color value: bogus
        fatal: unable to parse 'color.branch.plain' from command-line config
      
        $ git log --pretty='%C(bogus)'
        error: invalid color value: bogus
        fatal: unable to parse --pretty format
      
        $ git config --get-color foo.bar bogus
        error: invalid color value: bogus
        fatal: unable to parse default color value
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
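The split between "complain about the value" and "name the context" can be sketched as follows. This is a Python model, not git's C code: `VALID_COLORS` and the two caller functions are hypothetical names chosen for illustration, and the messages are the ones quoted in the commit message above.

```python
import sys

# Hypothetical subset of accepted color names, for illustration only.
VALID_COLORS = {"normal", "black", "red", "green", "yellow",
                "blue", "magenta", "cyan", "white"}

def color_parse(value):
    """Return 0 on success; on failure report only the value and return -1."""
    if value in VALID_COLORS:
        return 0
    print(f"error: invalid color value: {value}", file=sys.stderr)
    return -1

def parse_config_color(var, value):
    """Config-style caller: the layer above adds the variable name."""
    if color_parse(value) < 0:
        return f"fatal: unable to parse '{var}' from command-line config"
    return None

def parse_pretty_color(value):
    """Non-config caller: supplies its own context instead of a variable name."""
    if color_parse(value) < 0:
        return "fatal: unable to parse --pretty format"
    return None
```

The parser itself never needs to know who called it, so each caller can word the fatal message for its own context.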
  2. 21 Mar, 2014 1 commit
  3. 12 Mar, 2014 1 commit
  4. 07 Mar, 2014 1 commit
  5. 11 May, 2013 1 commit
    • grep: allow to use textconv filters · 335ec3bf
      Jeff King committed
      Recently and not so recently, we made sure that log/grep type operations
      use textconv filters when a user-facing diff would do the same:
      
      ef90ab66 (pickaxe: use textconv for -S counting, 2012-10-28)
      b1c2f57d (diff_grep: use textconv buffers for add/deleted files, 2012-10-28)
      0508fe53 (combine-diff: respect textconv attributes, 2011-05-23)
      
      "git grep" currently does not use textconv filters at all, that is
      neither for displaying the match and context nor for the actual grepping,
      even when requested by --textconv.
      
      Introduce an option "--textconv" which makes git grep use any configured
      textconv filters for grepping and output purposes. It is off by default.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  6. 25 Feb, 2013 1 commit
  7. 12 Oct, 2012 1 commit
  8. 10 Oct, 2012 3 commits
  9. 30 Sep, 2012 3 commits
  10. 21 Sep, 2012 1 commit
  11. 16 Sep, 2012 1 commit
  12. 15 Sep, 2012 2 commits
    • log --grep/--author: honor --all-match for multiple --grep patterns · 13e4fc7e
      Junio C Hamano committed
      When we have both header expression (which has to be an OR node by
      construction) and a pattern expression (which could be anything), we
      create a new top-level OR node to bind them together, and the
      resulting expression structure looks like this:
      
                   OR
              /          \
             /            \
         pattern            OR
           / \           /     \
          .....    committer    OR
                               /   \
                           author   TRUE
      
      The three elements on the top-level backbone that are inspected by
      the "all-match" logic are "pattern", "committer" and "author".  When
      there is more than one element in the "pattern", the top-level
      node of the "pattern" part of the subtree is an OR, and that node is
      inspected by "all-match".
      
      The result ends up ignoring the "--all-match" given from the command
      line.  A match on either side of the pattern is considered a match,
      hence:
      
              git log --grep=A --grep=B --author=C --all-match
      
      shows the same "authored by C and has either A or B" that is correct
      only when run without "--all-match".
      
      Fix this by turning the resulting expression around when "--all-match"
      is in effect, like this:
      
                    OR
                /        \
               /          \
              /              OR
          committer        /    \
                       author    \
                                 pattern
      
      The set of nodes on the top-level backbone in the resulting
      expression becomes "committer", "author", and the nodes that are on
      the top-level backbone of the "pattern" subexpression.  This makes
      the "all-match" logic inspect the same nodes in "pattern" as the
      case without the author and/or the committer restriction, and makes
      the earlier "log" example show "authored by C and has A and has
      B", which is what the command line expects.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
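The effect of the two tree shapes on "all-match" can be illustrated with a small Python model. It is a simplification under stated assumptions: the `Node` class, `backbone()`, and `all_match()` are hypothetical names, and a commit is modelled as the set of atoms it matches rather than through git's hit markers.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    kind: str                        # "OR", "ATOM", or "TRUE"
    name: str = ""
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def atom(name):
    return Node("ATOM", name=name)

def either(left, right):
    return Node("OR", left=left, right=right)

def hit(node, matched):
    """Ordinary OR evaluation of a subtree against the set of matched atoms."""
    if node.kind == "TRUE":
        return True
    if node.kind == "ATOM":
        return node.name in matched
    return hit(node.left, matched) or hit(node.right, matched)

def backbone(root):
    """Left children along the right spine of ORs, plus the final right node."""
    items, node = [], root
    while node.kind == "OR":
        items.append(node.left)
        node = node.right
    items.append(node)
    return items

def all_match(root, matched):
    """Every element on the top-level backbone must have matched."""
    return all(hit(item, matched) for item in backbone(root))

pattern = either(atom("A"), atom("B"))
old_tree = either(pattern, either(atom("author C"), Node("TRUE")))
new_tree = either(atom("author C"), pattern)

commit = {"author C", "A"}           # authored by C, mentions A but not B
print(all_match(old_tree, commit))   # old shape: "pattern" counts as one element
print(all_match(new_tree, commit))   # new shape: A and B are checked separately
```

In the old shape the whole `pattern` subtree is a single backbone element, so a hit on A alone satisfies it. In the new shape A and B each sit on the backbone, so both must hit.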
    • grep: teach --debug option to dump the parse tree · 17bf35a3
      Junio C Hamano committed
      Our "grep" allows complex boolean expressions to be formed to match
      each individual line with operators like --and, '(', ')' and --not.
      Introduce the "--debug" option to show the parse tree to help people
      who want to debug and enhance it.
      
      Also "log" learns "--grep-debug" option to do the same.  The command
      line parser to the log family is a lot more limited than the general
      "git grep" parser, but it has special handling for header matching
      (e.g. "--author"), and a parse tree is valuable when working on it.
      
      Note that "--all-match" is *not* any individual node in the parse
      tree.  It is an instruction to the evaluator to check all the nodes
      in the top-level backbone have matched and reject a document as
      non-matching otherwise.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
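As a rough idea of what dumping a boolean parse tree involves, here is a minimal Python sketch. The tuple-based tree and the output layout are assumptions made for illustration; they do not reproduce the actual format printed by "--debug".

```python
def dump_tree(node, depth=0):
    """Return one line per node, indented two spaces per nesting level."""
    pad = "  " * depth
    if isinstance(node, str):            # leaf: a plain pattern
        return [f"{pad}pattern: {node}"]
    op, *children = node                 # inner node: ("and" | "or" | "not", ...)
    lines = [f"{pad}({op}"]
    for child in children:
        lines.extend(dump_tree(child, depth + 1))
    lines.append(f"{pad})")
    return lines

# Roughly: -e foo --and --not -e bar --or -e baz
tree = ("or", ("and", "foo", ("not", "bar")), "baz")
print("\n".join(dump_tree(tree)))
```

Note that nothing in such a tree represents "--all-match"; as the message says, it is an instruction to the evaluator, not a node.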
  13. 21 May, 2012 3 commits
  14. 08 May, 2012 1 commit
  15. 29 Feb, 2012 1 commit
    • grep: use static trans-case table · 0f871cf5
      Junio C Hamano committed
      In order to prepare the kwset machinery for a case-insensitive search, we
      used to use a static table of 256 elements and filled it every time before
      calling kwsalloc().  Because the kwset machinery will never modify this
      table, just allocate a single instance globally and fill it at compile
      time.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
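The idea of one shared, read-only case-folding table can be sketched in Python, where "filled at compile time" becomes "built once at import time". The names `TOLOWER_TRANS_TBL` and `fold_case` are hypothetical; git's table is a C array.

```python
# Built once when the module is loaded and never modified afterwards.
# Maps each byte to itself, except ASCII 'A'..'Z', which map to 'a'..'z'.
TOLOWER_TRANS_TBL = bytes(b + 32 if 65 <= b <= 90 else b for b in range(256))

def fold_case(data: bytes) -> bytes:
    """Translate a buffer through the shared 256-entry table."""
    return data.translate(TOLOWER_TRANS_TBL)
```

Because the table is indexed by byte value, NUL bytes pass through like any other byte.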
  16. 27 Feb, 2012 1 commit
    • grep -P: Fix matching ^ and $ · fba4f125
      Michał Kiedrowicz committed
      When "git grep" is run with -P/--perl-regexp, it doesn't match ^ and $ at
      the beginning/end of the line.  This is because PCRE normally matches ^
      and $ at the beginning/end of the whole text, not for each line, and "git
      grep" passes a large chunk of text (possibly containing many lines) to
      pcre_exec() and then splits the text into lines.
      
      This makes "git grep -P" behave differently from "git grep -E" and also
      from "grep -P" and "pcregrep":
      
      	$ cat file
      	a
      	 b
      	$ git grep --no-index -P '^ ' file
      	$ git grep --no-index -E '^ ' file
      	file: b
      	$ grep -c -P '^ ' file
      	 b
      	$ pcregrep -c '^ ' file
      	 b
      Reported-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
      Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
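The anchoring behaviour described above can be reproduced with Python's `re` module, used here only as a stand-in for PCRE. The helper `matching_lines` is hypothetical; it searches a multi-line buffer in one go and then maps each match back to its line, which is the situation the message describes. The message does not spell out the fix, so the multiline flag below is shown as one way to get per-line anchors, not as the patch itself.

```python
import re

def matching_lines(pattern, text, flags=0):
    """Search the whole buffer, then report the line containing each match."""
    lines = []
    for m in re.finditer(pattern, text, flags):
        start = text.rfind("\n", 0, m.start()) + 1
        end = text.find("\n", m.start())
        lines.append(text[start:end if end != -1 else len(text)])
    return lines

text = "a\n b\n"                                   # same content as "file" above
print(matching_lines(r"^ ", text))                 # ^ anchors only at buffer start
print(matching_lines(r"^ ", text, re.MULTILINE))   # ^ anchors at every line start
```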
  17. 03 Feb, 2012 7 commits
    • grep: load file data after checking binary-ness · 08265798
      Jeff King committed
      Usually we load each file to grep into memory, check whether
      it's binary, and then either grep it (the default) or not
      (if "-I" was given).
      
      In the "-I" case, we can skip loading the file entirely if
      it is marked as binary via gitattributes. On my giant
      3-gigabyte media repository, doing "git grep -I foo" went
      from:
      
        real    0m0.712s
        user    0m0.044s
        sys     0m4.780s
      
      to:
      
        real    0m0.026s
        user    0m0.016s
        sys     0m0.020s
      
      Obviously this is an extreme example. The repo is almost
      entirely binary files, and you can see that we spent all of
      our time asking the kernel to read() the data. However, with
      a cold disk cache, even avoiding a few binary files can have
      an impact.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
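The reordering can be sketched as a loop that consults the attribute before touching the file. This is a Python illustration with hypothetical names (`grep_files`, `read_file`, `binary_attr`); it returns the list of paths actually read so the saving is visible.

```python
def grep_files(paths, read_file, binary_attr, needle, skip_binary):
    """Return (hits, loaded): matching paths and the paths actually read."""
    hits, loaded = [], []
    for path in paths:
        # Consult the attribute first: with -I, a path marked binary is never read.
        if skip_binary and binary_attr.get(path, False):
            continue
        data = read_file(path)
        loaded.append(path)
        if needle in data:
            hits.append(path)
    return hits, loaded

files = {"a.txt": b"foo bar", "movie.bin": b"\x00foo"}
attrs = {"movie.bin": True}          # marked binary via attributes
hits, loaded = grep_files(sorted(files), files.__getitem__, attrs, b"foo", True)
print(hits, loaded)
```

Files without a binary attribute still have to be read before any content-based detection can run, so the saving applies only to paths the attributes already classify.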
    • grep: respect diff attributes for binary-ness · 41b59bfc
      Jeff King committed
      There is currently no way for users to tell git-grep that a
      particular path is or is not a binary file; instead, grep
      always relies on its auto-detection (or the user specifying
      "-a" to treat all binary-looking files like text).
      
      This patch teaches git-grep to use the same attribute lookup
      that is used by git-diff. We could add a new "grep" flag,
      but that is unnecessarily complex and unlikely to be useful.
      Despite the name, the "-diff" attribute (or "diff=foo" and
      the associated diff.foo.binary config option) are really
      about describing the contents of the path. It's simply
      historical that diff was the only thing that cared about
      these attributes in the past.
      
      And if this simple approach turns out to be insufficient, we
      still have a backwards-compatible path forward: we can add a
      separate "grep" attribute, and fall back to respecting
      "diff" if it is unset.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
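One way to picture the resulting decision order is the Python sketch below. The function names, the encoding of the attribute as `True`/`False`/driver name, and the 8000-byte detection window are assumptions for illustration; only the "-diff" and "diff=foo" plus diff.foo.binary cases come from the message above.

```python
FIRST_FEW_BYTES = 8000   # assumed size of the auto-detection window

def looks_binary(data: bytes) -> bool:
    """Content-based fallback: a NUL byte near the start suggests binary."""
    return b"\0" in data[:FIRST_FEW_BYTES]

def is_binary(data, diff_attr=None, driver_binary=None):
    """diff_attr: False for "-diff", True for "diff", a driver name for
    "diff=foo", or None when unset. driver_binary models diff.foo.binary."""
    if diff_attr is False:
        return True                      # "-diff": treat the path as binary
    if diff_attr is True:
        return False                     # "diff": treat the path as text
    if isinstance(diff_attr, str) and driver_binary is not None:
        return driver_binary             # "diff=foo" with diff.foo.binary set
    return looks_binary(data)            # nothing configured: auto-detect
```

A separate "grep" attribute, as floated in the last paragraph, would simply become an earlier check in the same chain.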
    • grep: cache userdiff_driver in grep_source · 94ad9d9e
      Jeff King committed
      Right now, grep only uses the userdiff_driver for one thing:
      looking up funcname patterns for "-p" and "-W".  As new uses
      for userdiff drivers are added to the grep code, we want to
      minimize attribute lookups, which can be expensive.
      
      It might seem at first that this would also optimize multiple
      lookups when the funcname pattern for a file is needed
      multiple times. However, the compiled funcname pattern is
      already cached in struct grep_opt's "priv" member, so
      multiple lookups are already suppressed.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • grep: drop grep_buffer's "name" parameter · c876d6da
      Jeff King committed
      Before the grep_source interface existed, grep_buffer was
      used by two types of callers:
      
        1. Ones which pulled a file into a buffer, and then wanted
           to supply the file's name for the output (i.e.,
           git grep).
      
        2. Ones which really just wanted to grep a buffer (i.e.,
           git log --grep).
      
      Callers in set (1) should now be using grep_source. Callers
      in set (2) always pass NULL for the "name" parameter of
      grep_buffer. We can therefore get rid of this now-useless
      parameter.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • grep: refactor the concept of "grep source" into an object · e1327023
      Jeff King committed
      The main interface to the low-level grep code is
      grep_buffer, which takes a pointer to a buffer and a size.
      This is convenient and flexible (we use it to grep commit
      bodies, files on disk, and blobs by sha1), but it makes it
      hard to pass extra information about what we are grepping
      (either for correctness, like overriding binary
      auto-detection, or for optimizations, like lazily loading
      blob contents).
      
      Instead, let's encapsulate the idea of a "grep source",
      including the buffer, its size, and where the data is coming
      from. This is similar to the diff_filespec structure used by
      the diff code (unsurprising, since future patches will
      implement some of the same optimizations found there).
      
      The diffstat is slightly scarier than the actual patch
      content. Most of the modified lines are simply replacing
      access to raw variables with their counterparts that are now
      in a "struct grep_source". Most of the added lines were
      taken from builtin/grep.c, which partially abstracted the
      idea of grep sources (for file vs sha1 sources).
      
      Instead of dropping the now-redundant code, this patch
      leaves builtin/grep.c using the traditional grep_buffer
      interface (which now wraps the grep_source interface). That
      makes it easy to test that there is no change of behavior
      (yet).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
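The shape of such a "grep source" object, with the old buffer entry point kept as a wrapper, might look like the following Python sketch. The field names, the `loader` callback, and the substring search standing in for real matching are all assumptions; git's structure is a C struct with different members.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class GrepSource:
    name: str                       # label used when printing matches
    kind: str                       # where the data comes from: "file", "blob", "buffer"
    loader: Callable[[], bytes]     # how to fetch the data when it is needed
    buf: Optional[bytes] = None     # filled lazily by load()

    def load(self) -> bytes:
        if self.buf is None:
            self.buf = self.loader()
        return self.buf

    def clear(self) -> None:
        """Drop the buffer so memory can be reclaimed between sources."""
        self.buf = None

def grep_source(needle: bytes, source: GrepSource) -> bool:
    return needle in source.load()

def grep_buffer(needle: bytes, buf: bytes) -> bool:
    """Traditional entry point, kept as a thin wrapper over the source interface."""
    return grep_source(needle, GrepSource("(buffer)", "buffer", lambda: buf))
```

Carrying the origin and a loader alongside the buffer is what makes room for later steps such as overriding binary detection or skipping the load entirely.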
    • grep: move sha1-reading mutex into low-level code · b3aeb285
      Jeff King committed
      The multi-threaded git-grep code needs to serialize access
      to the thread-unsafe read_sha1_file call. It does this with
      a mutex that is local to builtin/grep.c.
      
      Let's instead push this down into grep.c, where it can be
      used by both builtin/grep.c and grep.c. This will let us
      safely teach the low-level grep.c code tricks that involve
      reading from the object db.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • grep: make locking flag global · 78db6ea9
      Jeff King committed
      The low-level grep code traditionally didn't care about
      threading, as it doesn't do any threading itself and didn't
      call out to other non-thread-safe code.  That changed with
      0579f91d (grep: enable threading with -p and -W using lazy
      attribute lookup, 2011-12-12), which pushed the lookup of
      funcname attributes (which is not thread-safe) into the
      low-level grep code.
      
      As a result, the low-level code learned about a new global
      "grep_attr_mutex" to serialize access to the attribute code.
      A multi-threaded caller (e.g., builtin/grep.c) is expected
      to initialize the mutex and set "use_threads" in the
      grep_opt structure. The low-level code only uses the lock if
      use_threads is set.
      
      However, putting the use_threads flag into the grep_opt
      struct is not the most logical place. Whether threading is
      in use is not something that matters for each call to
      grep_buffer, but is instead global to the whole program
      (i.e., if any thread is doing multi-threaded grep, every
      other thread, even if it thinks it is doing its own
      single-threaded grep, would need to use the locking).  In
      practice, this distinction isn't a problem for us, because
      the only user of multi-threaded grep is "git-grep", which
      does nothing except call grep.
      
      This patch turns the opt->use_threads flag into a global
      flag. More important than the nit-picking semantic argument
      above is that this means that the locking functions don't
      need to actually have access to a grep_opt to know whether
      to lock. Which in turn can make adding new locks simpler, as
      we don't need to pass around a grep_opt.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
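The practical benefit of a global flag, namely that lock helpers need no options structure, can be sketched in Python. `grep_attr_mutex` is named in the message above; `grep_use_locks`, `grep_enable_locking`, and `lookup_funcname_attr` are hypothetical names for this illustration.

```python
import threading

grep_use_locks = False                 # global: is any multi-threaded grep running?
grep_attr_mutex = threading.Lock()

def grep_enable_locking():
    """Called once by a multi-threaded caller before it starts its workers."""
    global grep_use_locks
    grep_use_locks = True

def grep_attr_lock():
    # No options structure is needed to decide whether to lock.
    if grep_use_locks:
        grep_attr_mutex.acquire()

def grep_attr_unlock():
    if grep_use_locks:
        grep_attr_mutex.release()

def lookup_funcname_attr(path, attrs):
    """Serialize an attribute lookup that is assumed not to be thread-safe."""
    grep_attr_lock()
    try:
        return attrs.get(path)
    finally:
        grep_attr_unlock()
```

Adding another lock then means adding another mutex and helper pair, with no signature changes in the functions that call them.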
  18. 17 Dec, 2011 1 commit
  19. 13 Dec, 2011 1 commit
  20. 21 Aug, 2011 1 commit
    • Use kwset in grep · 9eceddee
      Fredrik Kuivinen committed
      Benchmarks for the hot cache case:
      
      before:
      $ perf stat --repeat=5 git grep qwerty > /dev/null
      
      Performance counter stats for 'git grep qwerty' (5 runs):
      
              3,478,085 cache-misses             #      2.322 M/sec   ( +-   2.690% )
             11,356,177 cache-references         #      7.582 M/sec   ( +-   2.598% )
              3,872,184 branch-misses            #      0.363 %       ( +-   0.258% )
          1,067,367,848 branches                 #    712.673 M/sec   ( +-   2.622% )
          3,828,370,782 instructions             #      0.947 IPC     ( +-   0.033% )
          4,043,832,831 cycles                   #   2700.037 M/sec   ( +-   0.167% )
                  8,518 page-faults              #      0.006 M/sec   ( +-   3.648% )
                    847 CPU-migrations           #      0.001 M/sec   ( +-   3.262% )
                  6,546 context-switches         #      0.004 M/sec   ( +-   2.292% )
            1497.695495 task-clock-msecs         #      3.303 CPUs    ( +-   2.550% )
      
             0.453394396  seconds time elapsed   ( +-   0.912% )
      
      after:
      $ perf stat --repeat=5 git grep qwerty > /dev/null
      
      Performance counter stats for 'git grep qwerty' (5 runs):
      
              2,989,918 cache-misses             #      3.166 M/sec   ( +-   5.013% )
             10,986,041 cache-references         #     11.633 M/sec   ( +-   4.899% )  (scaled from 95.06%)
              3,511,993 branch-misses            #      1.422 %       ( +-   0.785% )
            246,893,561 branches                 #    261.433 M/sec   ( +-   3.967% )
          1,392,727,757 instructions             #      0.564 IPC     ( +-   0.040% )
          2,468,142,397 cycles                   #   2613.494 M/sec   ( +-   0.110% )
                  7,747 page-faults              #      0.008 M/sec   ( +-   3.995% )
                    897 CPU-migrations           #      0.001 M/sec   ( +-   2.383% )
                  6,535 context-switches         #      0.007 M/sec   ( +-   1.993% )
             944.384228 task-clock-msecs         #      3.177 CPUs    ( +-   0.268% )
      
             0.297257643  seconds time elapsed   ( +-   0.450% )
      
      So we gain about 35% by using the kwset code.
      
      As a side effect of using kwset two grep tests are fixed by this
      patch. The first is fixed because kwset can deal with case-insensitive
      search containing NULs, something strcasestr cannot do. The second one
      is fixed because we consider patterns containing NULs as fixed strings
      (regcomp cannot accept patterns with NULs).
      Signed-off-by: Fredrik Kuivinen <frekui@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
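The "about 35%" figure can be checked against the elapsed times reported above. The helper name `elapsed_reduction` is hypothetical; the two inputs are the "seconds time elapsed" values from the before and after runs.

```python
def elapsed_reduction(before: float, after: float) -> float:
    """Fraction of wall-clock time saved going from `before` to `after`."""
    return (before - after) / before

before = 0.453394396   # seconds elapsed, before the patch
after = 0.297257643    # seconds elapsed, after the patch

print(f"{elapsed_reduction(before, after):.1%} less elapsed time")
print(f"{before / after:.2f}x speedup")
```

The same calculation on the instruction counts (3,828,370,782 versus 1,392,727,757) gives a larger reduction, so the gain in elapsed time understates how much less work the matcher does.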
  21. 20 Aug, 2011 1 commit
    • color: delay auto-color decision until point of use · daa0c3d9
      Jeff King committed
      When we read a color value either from a config file or from
      the command line, we use git_config_colorbool to convert it
      from the tristate always/never/auto into a single yes/no
      boolean value.
      
      This has some timing implications with respect to starting
      a pager.
      
      If we start (or decide not to start) the pager before
      checking the colorbool, everything is fine. Either isatty(1)
      will give us the right information, or we will properly
      check for pager_in_use().
      
      However, if we decide to start a pager after we have checked
      the colorbool, things are not so simple. If stdout is a tty,
      then we will have already decided to use color. However, the
      user may also have configured color.pager not to use color
      with the pager. In this case, we need to actually turn off
      color. Unfortunately, the pager code has no idea which color
      variables were turned on (and there are many of them
      throughout the code, and they may even have been manipulated
      after the colorbool selection by something like "--color" on
      the command line).
      
      This bug can be seen any time a pager is started after
      config and command line options are checked. This has
      affected "git diff" since 89d07f75 (diff: don't run pager if
      user asked for a diff style exit code, 2007-08-12). It has
      also affected the log family since 1fda91b5 (Fix 'git log'
      early pager startup error case, 2010-08-24).
      
      This patch splits the notion of parsing a colorbool and
      actually checking the configuration. The "use_color"
      variables now have an additional possible value,
      GIT_COLOR_AUTO. Users of the variable should use the new
      "want_color()" wrapper, which will lazily determine and
      cache the auto-color decision.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
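The separation of "parse the setting" from "decide whether to color" can be sketched as below. This is a Python model under assumptions: the numeric values of the constants, the `ColorDecider` class, and the injected `check_auto` callback (standing in for the tty and pager checks) are illustrative, while `GIT_COLOR_AUTO` and `want_color` are the names used in the message.

```python
GIT_COLOR_NEVER, GIT_COLOR_ALWAYS, GIT_COLOR_AUTO = 0, 1, 2   # assumed values

def parse_colorbool(value: str) -> int:
    """Parse only: "auto" stays symbolic instead of being resolved here."""
    return {"never": GIT_COLOR_NEVER,
            "always": GIT_COLOR_ALWAYS,
            "auto": GIT_COLOR_AUTO}[value]

class ColorDecider:
    def __init__(self, check_auto):
        self._check_auto = check_auto   # callable that inspects tty/pager state
        self._cached = None             # auto decision, computed at most once

    def want_color(self, use_color: int) -> bool:
        if use_color != GIT_COLOR_AUTO:
            return use_color == GIT_COLOR_ALWAYS
        if self._cached is None:
            self._cached = self._check_auto()
        return self._cached
```

Because the environment is inspected only when `want_color` is first called with the auto value, a pager started after option parsing is still taken into account.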
  22. 02 Aug, 2011 1 commit
  23. 06 Jun, 2011 3 commits
    • grep: add --heading · 1d84f72e
      René Scharfe committed
      With --heading, the filename is printed once before matches from that
      file instead of at the start of each line, giving more screen space to
      the actual search results.
      
      This option is taken from ack (http://betterthangrep.com/).  And now
      git grep can dress up like it:
      
      	$ git config alias.ack "grep --break --heading --line-number"
      
      	$ git ack -e --heading
      	Documentation/git-grep.txt
      	154:--heading::
      
      	t/t7810-grep.sh
      	785:test_expect_success 'grep --heading' '
      	786:    git grep --heading -e char -e lo_w hello.c hello_world >actual &&
      	808:    git grep --break --heading -n --color \
      Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • grep: add --break · a8f0e764
      René Scharfe committed
      With --break, an empty line is printed between matches from different
      files, increasing readability.  This option is taken from ack
      (http://betterthangrep.com/).
      Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
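How "--heading" and "--break" change the layout can be illustrated with a small Python formatter. The function `format_matches` and its tuple input are hypothetical, and line numbers are always shown here, as with "--line-number".

```python
from itertools import groupby

def format_matches(matches, heading=False, brk=False):
    """matches: (filename, line_number, text) tuples, already grouped by file."""
    out = []
    for i, (name, group) in enumerate(groupby(matches, key=lambda m: m[0])):
        if brk and i > 0:
            out.append("")                   # --break: blank line between files
        if heading:
            out.append(name)                 # --heading: filename once, on its own line
        for _, lineno, text in group:
            prefix = f"{lineno}:" if heading else f"{name}:{lineno}:"
            out.append(prefix + text)
    return out

matches = [("a.c", 1, "x"), ("a.c", 3, "y"), ("b.c", 2, "z")]
print("\n".join(format_matches(matches, heading=True, brk=True)))
```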
    • grep: fix coloring of hunk marks between files · 08303c36
      René Scharfe committed
      Commit 431d6e7b (grep: enable threading for context line printing)
      split the printing of the "--\n" mark between results from different
      files out into two places: show_line() in grep.c for the non-threaded
      case and work_done() in builtin/grep.c for the threaded case.  Commit
      55f638bd (grep: Colorize filename, line number, and separator) updated
      the former, but not the latter, so the separators between files are
      not colored if threads are used.
      
      This patch merges the two.  In the threaded case, hunk marks are now
      printed by show_line() for every file, including the first one, and the
      very first mark is simply skipped in work_done().  This ensures that the
      output is properly colored and works just as well.
      Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  24. 10 May, 2011 2 commits