  1. 22 Sep, 2018 (1 commit)
  2. 10 Jul, 2018 (1 commit)
    • grep.c: teach 'git grep --only-matching' · 9d8db06e
      Committed by Taylor Blau
      Teach 'git grep --only-matching', a new option to only print the
      matching part(s) of a line.
      
      For instance, a line containing the following (taken from README.md:27):
      
        (`man gitcvs-migration` or `git help cvs-migration` if git is
      
      Is printed as follows:
      
        $ git grep --line-number --column --only-matching -e git -- \
          README.md | grep ":27"
        README.md:27:7:git
        README.md:27:16:git
        README.md:27:38:git
      
      The patch works mostly as one would expect, with the exception of a few
      considerations that are worth mentioning here.
      
      Like GNU grep, this patch ignores --only-matching when --invert-match (-v)
      is given. There is a sensible answer here, but parity with the behavior of
      other tools is preferred.
      
      Because a line might contain more than one match, there are special
      considerations pertaining to when to print line headers, newlines, and
      how to increment the match column offset. The line header and newlines
      are handled as a special case within the main loop to avoid polluting
      the surrounding code with conditionals that have large blocks.
      Signed-off-by: Taylor Blau <me@ttaylorr.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
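As a rough illustration of the output format described in this commit, the following Python sketch emits one `path:line:column:match` record per match on a line. It is a model, not git's C implementation: the `only_matching` helper, the `demo.txt` name, and the sample text are hypothetical, and columns here count characters from 1, whereas git reports byte offsets.

```python
import re


def only_matching(path, text, pattern):
    """Return one 'path:line:column:match' record per match.

    Mimics the shape of --line-number --column --only-matching output.
    """
    regex = re.compile(pattern)
    records = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        # A line may contain several matches; each gets its own record
        # with its own 1-indexed column.
        for match in regex.finditer(line):
            column = match.start() + 1
            records.append(f"{path}:{line_no}:{column}:{match.group(0)}")
    return records


sample = "first line\nuse git grep, not plain grep\n"
for record in only_matching("demo.txt", sample, "grep"):
    print(record)
```

The second line of the sample contains two matches, so it yields two records that share a line number but differ in column.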
  3. 23 Jun, 2018 (1 commit)
  4. 29 May, 2018 (1 commit)
  5. 13 Nov, 2017 (1 commit)
    • grep: fix NO_LIBPCRE1_JIT to fully disable JIT · 2fff1e19
      Committed by Charles Bailey
      If you have a pcre1 library which is compiled with JIT enabled then
      PCRE_STUDY_JIT_COMPILE will be defined whether or not the
      NO_LIBPCRE1_JIT configuration is set.
      
      This means that we enable JIT functionality when calling pcre_study
      even if NO_LIBPCRE1_JIT has been explicitly set and we just use plain
      pcre_exec later.
      
      Fix this by using our own macro (GIT_PCRE_STUDY_JIT_COMPILE), which we set
      to PCRE_STUDY_JIT_COMPILE only if NO_LIBPCRE1_JIT is not set and define to
      0 otherwise, as before.
      Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
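The selection logic this fix describes can be modeled in a few lines. The Python sketch below is only an analogy for the C preprocessor conditionals: the function and its boolean parameters are hypothetical, and the flag value is a placeholder rather than something taken from the commit.

```python
# Placeholder for the library's PCRE_STUDY_JIT_COMPILE flag (illustrative value).
PCRE_STUDY_JIT_COMPILE = 0x0001


def git_pcre_study_jit_compile(library_defines_jit, no_libpcre1_jit):
    """Model of GIT_PCRE_STUDY_JIT_COMPILE.

    The JIT study flag is passed on only when the library provides it
    and the build did not set NO_LIBPCRE1_JIT; otherwise the result is 0.
    """
    if library_defines_jit and not no_libpcre1_jit:
        return PCRE_STUDY_JIT_COMPILE
    return 0
```

The case the commit fixes is the one where the library defines the flag but NO_LIBPCRE1_JIT is set: the result must be 0 there as well.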
  6. 03 Aug, 2017 (1 commit)
  7. 01 Jul, 2017 (1 commit)
    • grep: remove regflags from the public grep_opt API · 07a3d411
      Committed by Ævar Arnfjörð Bjarmason
      Refactor calls to the grep machinery to always pass opt.ignore_case &
      opt.extended_regexp_option instead of setting the equivalent regflags
      bits.
      
      The bug fixed when making -i work with -P in commit 9e3cbc59 ("log:
      make --regexp-ignore-case work with --perl-regexp", 2017-05-20) was
      really just plastering over the code smell which this change fixes.
      
      The reason for adding the extensive commentary here is that I
      discovered some subtle complexity in implementing this that really
      should be called out explicitly to future readers.
      
      Before this change we'd rely on the difference between
      `extended_regexp_option` and `regflags` to serve as a membrane between
      our preliminary parsing of grep.extendedRegexp and grep.patternType,
      and what we decided to do internally.
      
      Now that those two are the same thing, it's necessary to unset
      `extended_regexp_option` just before we commit in cases where both of
      those config variables are set. See 84befcd0 ("grep: add a
      grep.patternType configuration setting", 2012-08-03) for the code and
      documentation related to that.
      
      The explanation of why the if/else branches in
      grep_commit_pattern_type() are ordered the way they are exists in that
      commit message, but I think it's worth calling this subtlety out
      explicitly with a comment for future readers.
      
      Even though grep_commit_pattern_type() is the only caller of
      grep_set_pattern_type_option(), it's simpler to reset the
      extended_regexp_option flag in the latter, since two of the three
      branches in the former would otherwise need to reset it. This way we
      can do it in one place.
      Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  8. 02 Jun, 2017 (3 commits)
    • grep: convert to struct object_id · 1c41c82b
      Committed by Brandon Williams
      Convert the remaining parts of grep to use struct object_id.
      Signed-off-by: Brandon Williams <bmwill@google.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • grep: add support for PCRE v2 · 94da9193
      Committed by Ævar Arnfjörð Bjarmason
      Add support for v2 of the PCRE API. This is a new major version of
      PCRE that came out in early 2015[1].
      
      The regular expression syntax is the same, but while the API is
      similar, pretty much every function is either renamed or takes
      different arguments. Thus using it via entirely new functions makes
      sense, as opposed to trying to e.g. have one compile_pcre_pattern()
      that would call either PCRE v1 or v2 functions.
      
      Git can now be compiled with either USE_LIBPCRE1=YesPlease or
      USE_LIBPCRE2=YesPlease, with USE_LIBPCRE=YesPlease currently being a
      synonym for the former. Providing both is a compile-time error.
      
      With earlier patches to enable JIT for PCRE v1 the performance of the
      release versions of both libraries is almost exactly the same, with
      PCRE v2 being around 1% slower.
      
      However, after I reported this to the pcre-dev mailing list[2] I got a
      lot of help with the API use from Zoltán Herczeg, who subsequently
      optimized some of the JIT functionality in v2 of the library.
      
      Running the p7820-grep-engines.sh performance test against the latest
      Subversion trunk of both, with both them and git compiled as -O3, and
      the test run against linux.git, gives the following results. Just the
      /perl/ tests shown:
      
          $ GIT_PERF_REPEAT_COUNT=30 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_MAKE_COMMAND='grep -q LIBPCRE2 Makefile && make -j8 USE_LIBPCRE2=YesPlease CC=~/perl5/installed/bin/gcc NO_R_TO_GCC_LINKER=YesPlease CFLAGS=-O3 LIBPCREDIR=/home/avar/g/pcre2/inst LDFLAGS=-Wl,-rpath,/home/avar/g/pcre2/inst/lib || make -j8 USE_LIBPCRE=YesPlease CC=~/perl5/installed/bin/gcc NO_R_TO_GCC_LINKER=YesPlease CFLAGS=-O3 LIBPCREDIR=/home/avar/g/pcre/inst LDFLAGS=-Wl,-rpath,/home/avar/g/pcre/inst/lib' ./run HEAD~5 HEAD~ HEAD p7820-grep-engines.sh
          [...]
          Test                                            HEAD~5            HEAD~                    HEAD
          -----------------------------------------------------------------------------------------------------------------
          7820.3: perl grep 'how.to'                      0.31(1.10+0.48)   0.21(0.35+0.56) -32.3%   0.21(0.34+0.55) -32.3%
          7820.7: perl grep '^how to'                     0.56(2.70+0.40)   0.24(0.64+0.52) -57.1%   0.20(0.28+0.60) -64.3%
          7820.11: perl grep '[how] to'                   0.56(2.66+0.38)   0.29(0.95+0.45) -48.2%   0.23(0.45+0.54) -58.9%
          7820.15: perl grep '(e.t[^ ]*|v.ry) rare'       1.02(5.77+0.42)   0.31(1.02+0.54) -69.6%   0.23(0.50+0.54) -77.5%
          7820.19: perl grep 'm(ú|u)lt.b(æ|y)te'          0.38(1.57+0.42)   0.27(0.85+0.46) -28.9%   0.21(0.33+0.57) -44.7%
      
      See commit ("perf: add a comparison test of grep regex engines",
      2017-04-19) for details on the machine the above test run was executed
      on.
      
      Here HEAD~5 is git with PCRE v1 without JIT, HEAD~ is PCRE v1 with
      JIT, and HEAD is PCRE v2 (also with JIT). See previous commits of mine
      mentioning p7820-grep-engines.sh for more details on the test setup.
      
      For ease of readability, here is a different run of just HEAD~ and HEAD
      (PCRE v1 with JIT vs. PCRE v2), again with just the /perl/ tests shown:
      
          [...]
          Test                                            HEAD~             HEAD
          ----------------------------------------------------------------------------------------
          7820.3: perl grep 'how.to'                      0.21(0.42+0.52)   0.21(0.31+0.58) +0.0%
          7820.7: perl grep '^how to'                     0.25(0.65+0.50)   0.20(0.31+0.57) -20.0%
          7820.11: perl grep '[how] to'                   0.30(0.90+0.50)   0.23(0.46+0.53) -23.3%
          7820.15: perl grep '(e.t[^ ]*|v.ry) rare'       0.30(1.19+0.38)   0.23(0.51+0.51) -23.3%
          7820.19: perl grep 'm(ú|u)lt.b(æ|y)te'          0.27(0.84+0.48)   0.21(0.34+0.57) -22.2%
      
      I.e. the two are sometimes neck and neck, but PCRE v2 usually pulls
      ahead; when it does, it's around 20% faster.
      
      A brief note on thread safety: As noted in pcre2api(3) & pcre2jit(3)
      the compiled pattern can be shared between threads, but not some of
      the JIT context, however the grep threading support does all pattern &
      JIT compilation in separate threads, so this code doesn't need to
      concern itself with thread safety.
      
      See commit 63e7e9d8 ("git-grep: Learn PCRE", 2011-05-09) for the
      initial addition of PCRE v1. This change follows some of the same
      patterns it did (and which were discussed on list at the time),
      e.g. mocking up types with typedef instead of ifdef-ing them out when
      USE_LIBPCRE2 isn't defined. This adds some trivial memory use to the
      program, but makes the code look nicer.
      
      1. https://lists.exim.org/lurker/message/20150105.162835.0666407a.en.html
      2. https://lists.exim.org/lurker/thread/20170419.172322.833ee099.en.html
      Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
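The percentage columns in the benchmark tables of this commit appear to be the relative change in the first (elapsed) number of each cell. The Python sketch below reproduces that arithmetic under that assumption; `parse_elapsed` and `percent_change` are hypothetical helper names, not part of git's perf tooling.

```python
import re


def parse_elapsed(cell):
    """Extract the leading elapsed time from a cell like '0.31(1.10+0.48)'."""
    match = re.match(r"\s*([0-9.]+)\(", cell)
    if match is None:
        raise ValueError(f"unrecognized timing cell: {cell!r}")
    return float(match.group(1))


def percent_change(baseline_cell, candidate_cell):
    """Relative change of candidate vs. baseline, rounded to one decimal."""
    baseline = parse_elapsed(baseline_cell)
    candidate = parse_elapsed(candidate_cell)
    return round((candidate - baseline) / baseline * 100, 1)


# Rows 7820.3 and 7820.7 from the first table above.
print(percent_change("0.31(1.10+0.48)", "0.21(0.35+0.56)"))
print(percent_change("0.56(2.70+0.40)", "0.24(0.64+0.52)"))
```

A negative value means the candidate took less elapsed time than the baseline.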
    • grep: un-break building with PCRE >= 8.32 without --enable-jit · fb95e2e3
      Committed by Ævar Arnfjörð Bjarmason
      Amend my change earlier in this series ("grep: add support for the
      PCRE v1 JIT API", 2017-04-11) to un-break the build on PCRE v1
      versions later than 8.31 compiled without --enable-jit.
      
      As explained in that change and a later compatibility change in this
      series ("grep: un-break building with PCRE < 8.32", 2017-05-10) the
      pcre_jit_exec() function is a faster path to execute the JIT.
      
      Unfortunately there's no compatibility stub for that function compiled
      into the library if pcre_config(PCRE_CONFIG_JIT, &ret) would return 0,
      and no macro that can be used to check for it, so the only portable
      option to support builds without --enable-jit is via a new
      NO_LIBPCRE1_JIT=UnfortunatelyYes Makefile option[1].
      
      Another option would be to make the JIT opt-in via
      USE_LIBPCRE1_JIT=YesPlease, after all it's not a default option of
      PCRE v1.
      
      I think it makes more sense to make it opt-out since even though it's
      not a default option, most packagers of PCRE seem to turn it on by
      default, with the notable exception of the MinGW package.
      
      Make the MinGW platform work by default by changing the build defaults
      to turn on NO_LIBPCRE1_JIT=UnfortunatelyYes. It is the only platform
      that turns on USE_LIBPCRE=YesPlease by default, see commit
      df5218b4 ("config.mak.uname: support MSys2", 2016-01-13) for that
      change.
      
      1. "How do I support pcre1 JIT on all
         versions?"  (https://lists.exim.org/lurker/thread/20170601.103148.10253788.en.html)
      
      2. https://github.com/Alexpux/MINGW-packages/blob/master/mingw-w64-pcre/PKGBUILD
         (referenced from "Re: PCRE v2 compile error, was Re: What's cooking
         in git.git (May 2017, #01; Mon, 1)";
         <alpine.DEB.2.20.1705021756530.3480@virtualbox>)
      Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  9. 26 May, 2017 (5 commits)
    • grep: un-break building with PCRE < 8.20 · c30cf827
      Committed by Ævar Arnfjörð Bjarmason
      Amend my change earlier in this series ("grep: add support for the
      PCRE v1 JIT API", 2017-04-11) to un-break the build on PCRE v1
      versions earlier than 8.20.
      
      The 8.20 release was the first release to have JIT & pcre_jit_stack in
      the headers, so a mock type needs to be provided for it on those
      releases.
      
      Now git should compile with all PCRE versions that it supported before
      my JIT change.
      
      I've tested it as far back as version 7.5, released on 2008-01-10. Once
      I got down to version 7.0 it wouldn't build anymore with GCC 7.1.1,
      and I couldn't be bothered to test anything older than 7.5, as I'm
      confident that if the build breaks on those older versions it's not
      because of my JIT change.
      
      See the "un-break" change in this series ("grep: un-break building
      with PCRE < 8.32", 2017-05-10) for why this isn't squashed into the
      main PCRE JIT commit.
      Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • grep: un-break building with PCRE < 8.32 · e87de7ca
      Committed by Ævar Arnfjörð Bjarmason
      Amend my change earlier in this series ("grep: add support for the
      PCRE v1 JIT API", 2017-04-11) to un-break the build on PCRE v1
      versions earlier than 8.32.
      
      The JIT support was added in version 8.20 released on 2011-10-21, but
      it wasn't until 8.32 released on 2012-11-30 that the fast code path to
      use the JIT via pcre_jit_exec() was added[1] (see also [2]).
      
      This means that versions 8.20 through 8.31 could still use the JIT,
      but supporting it on those versions would add to the already verbose
      macro soup around JIT support. I don't expect that the use-case
      of compiling a brand new git against a 5 year old PCRE is particularly
      common, and if someone does that they can just get the existing
      pre-JIT slow codepath.
      
      So just take the easy way out and disable the JIT on any version older
      than 8.32.
      
      The reason this change isn't part of the initial PCRE JIT support
      change is to have a cleaner history showing which parts of the
      implementation are only used for ancient PCRE versions. This also
      makes it easier to revert this change if we ever decide to stop
      supporting those old versions.
      
      1. http://www.pcre.org/original/changelog.txt ("28. Introducing a
         native interface for JIT. Through this interface, the
         compiled[...]")
      2. https://bugs.exim.org/show_bug.cgi?id=2121
      Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
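The version thresholds named in this commit and in the PCRE < 8.20 fix can be summarized as a small decision table. The Python sketch below is only a model of what the C preprocessor checks decide; the function name and dictionary keys are hypothetical, and the NO_LIBPCRE1_JIT build option is deliberately left out.

```python
def pcre1_capabilities(major, minor):
    """Summarize what a given PCRE v1 version allows, per the commit messages.

    - JIT types such as pcre_jit_stack first appear in the 8.20 headers,
      so older versions need a mock type.
    - The fast pcre_jit_exec() path exists from 8.32 on; git simply
      disables the JIT on anything older.
    """
    version = (major, minor)
    return {
        "needs_mock_jit_type": version < (8, 20),
        "use_jit": version >= (8, 32),
    }


for version in [(7, 5), (8, 20), (8, 31), (8, 32)]:
    print(version, pcre1_capabilities(*version))
```

Versions 8.20 through 8.31 land in the middle: they have the real JIT types but still take the non-JIT code path.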
    • grep: add support for the PCRE v1 JIT API · fbaceaac
      Committed by Ævar Arnfjörð Bjarmason
      Change the grep PCRE v1 code to use JIT when available. When PCRE
      support was initially added in commit 63e7e9d8 ("git-grep: Learn
      PCRE", 2011-05-09) PCRE had no JIT support; it was integrated into
      8.20, released on 2011-10-21.
      
      Enabling JIT support usually improves performance by more than
      40%. The pattern compilation times are relatively slower, but those
      relative numbers are tiny, and are easily made back in all but the
      most trivial cases of grep. Detailed benchmarks & overview of
      compilation times is at: http://sljit.sourceforge.net/pcre.html
      
      With this change the difference in a t/perf/p7820-grep-engines.sh run
      is, with just the /perl/ tests shown:
      
          $ GIT_PERF_REPEAT_COUNT=30 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_MAKE_OPTS='-j8 USE_LIBPCRE=YesPlease CC=~/perl5/installed/bin/gcc NO_R_TO_GCC_LINKER=YesPlease CFLAGS=-O3 LIBPCREDIR=/home/avar/g/pcre/inst LDFLAGS=-Wl,-rpath,/home/avar/g/pcre/inst/lib' ./run HEAD~ HEAD p7820-grep-engines.sh
          Test                                           HEAD~             HEAD
          ---------------------------------------------------------------------------------------
          7820.3: perl grep 'how.to'                      0.35(1.11+0.43)   0.23(0.42+0.46) -34.3%
          7820.7: perl grep '^how to'                     0.64(2.71+0.36)   0.27(0.66+0.44) -57.8%
          7820.11: perl grep '[how] to'                   0.63(2.51+0.42)   0.33(0.98+0.39) -47.6%
          7820.15: perl grep '(e.t[^ ]*|v.ry) rare'       1.17(5.61+0.35)   0.34(1.08+0.46) -70.9%
          7820.19: perl grep 'm(ú|u)lt.b(æ|y)te'          0.43(1.52+0.44)   0.30(0.88+0.42) -30.2%
      
      The conditional support for JIT is implemented as suggested in the
      pcrejit(3) man page. E.g. defining PCRE_STUDY_JIT_COMPILE to 0 if it's
      not present.
      
      The implementation is relatively verbose because even if
      PCRE_CONFIG_JIT is defined only a call to pcre_config() can determine
      if the JIT is available, and if so the faster pcre_jit_exec() function
      should be called instead of pcre_exec(), and a different (but not
      complementary!) function needs to be called to free pcre1_extra_info.
      
      There's no graceful fallback if pcre_jit_stack_alloc() fails under
      PCRE_CONFIG_JIT, instead the program will simply abort. I don't think
      this is worth handling gracefully, it'll only fail in cases where
      malloc() doesn't work, in which case we're screwed anyway.
      
      That there's no assignment of `p->pcre1_jit_on = 0` when
      PCRE_CONFIG_JIT isn't defined isn't a bug. The create_grep_pat()
      function allocates the grep_pat with calloc(), so it's
      guaranteed to be 0 when PCRE_CONFIG_JIT isn't defined.

      If you're bisecting and find this change, check that your PCRE isn't
      older than 8.32. This change intentionally broke really old versions
      of PCRE, but that's fixed in follow-up commits.
      Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • grep: change internal *pcre* variable & function names to be *pcre1* · 6d4b5747
      Committed by Ævar Arnfjörð Bjarmason
      Change the internal PCRE variable & function names to have a "1"
      suffix. This is in preparation for libpcre2 support, where having
      non-versioned names would be confusing.
      
      An earlier change in this series ("grep: change the internal PCRE
      macro names to be PCRE1", 2017-04-07) elaborates on the motivations
      behind this change.
      Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • grep: change the internal PCRE macro names to be PCRE1 · 3485bea1
      Committed by Ævar Arnfjörð Bjarmason
      Change the internal USE_LIBPCRE define, & build options flag to use a
      naming convention ending in PCRE1, without changing the long-standing
      USE_LIBPCRE Makefile flag which enables this code.
      
      This is in preparation for libpcre2 support, where having things like
      USE_LIBPCRE and USE_LIBPCRE2 in any more places than we absolutely
      need to for backwards compatibility with old Makefile arguments would
      be confusing.
      
      In some ways it would be better to change everything that now uses
      USE_LIBPCRE to use USE_LIBPCRE1, and to make specifying
      USE_LIBPCRE (or --with-pcre) an error. This would impose a one-time
      burden on packagers of git to s/USE_LIBPCRE/USE_LIBPCRE1/ in their
      build scripts.
      
      However I'd like to leave the door open to making
      USE_LIBPCRE=YesPlease eventually mean USE_LIBPCRE2=YesPlease,
      i.e. once PCRE v2 is ubiquitous enough that it makes sense to make it
      the default.
      
      This code and the USE_LIBPCRE Makefile argument was added in commit
      63e7e9d8 ("git-grep: Learn PCRE", 2011-05-09). At the time there was
      no indication that the PCRE project would release an entirely new &
      incompatible API around 3 years later.
      Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  10. 23 Dec, 2016 (1 commit)
    • grep: add submodules as a grep source type · 4538eef5
      Committed by Brandon Williams
      Add `GREP_SOURCE_SUBMODULE` as a grep_source type and cases for this new
      type in the various switch statements in grep.c.
      
      When initializing a grep_source with type `GREP_SOURCE_SUBMODULE` the
      identifier can either be NULL (to indicate that the working tree will be
      used) or a SHA1 (the REV of the submodule to be grep'd).  If the
      identifier is a SHA1 then we want to fall through to the
      `GREP_SOURCE_SHA1` case to handle the copying of the SHA1.
      Signed-off-by: Brandon Williams <bmwill@google.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
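The identifier handling described in this commit can be sketched as follows. This Python model is an approximation of a C switch statement with a fall-through: the class and function names are hypothetical, only the two source types named in the commit message are modeled, and a `bytes` copy stands in for copying a SHA1.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class SourceType(Enum):
    SHA1 = auto()
    SUBMODULE = auto()


@dataclass
class GrepSource:
    type: SourceType
    name: str
    identifier: Optional[bytes]


def init_source(source_type, name, identifier):
    """Initialize a source, copying the identifier when one is given."""
    if source_type is SourceType.SUBMODULE and identifier is None:
        # No identifier: the submodule's working tree will be searched.
        return GrepSource(source_type, name, None)
    # A submodule with an identifier "falls through" to the SHA1 case:
    # the identifier is an object name that must be copied.
    return GrepSource(source_type, name, bytes(identifier))
```

Copying the identifier means the source keeps a stable value even if the caller later reuses or modifies its own buffer.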
  11. 26 Jul, 2016 (1 commit)
    • grep: further simplify setting the pattern type · 8465541e
      Committed by Junio C Hamano
      When c5c31d33 (grep: move pattern-type bits support to top-level
      grep.[ch], 2012-10-03) introduced the grep_commit_pattern_type() helper
      function, the intention was to allow users of the grep API to
      fiddle only with .pattern_type_option (which can be set to "fixed",
      "basic", "extended", and "pcre"), and then immediately before compiling
      the pattern strings for use, call grep_commit_pattern_type() to have
      it prepare various bits in the grep_opt structure (like .fixed,
      .regflags, etc.).

      However, the grep_set_pattern_type_option() helper function that the
      grep API uses internally was left as an external function by mistake.
      This function shouldn't have been made callable by the users of the API.
      
      Later, when the grep API was used in the revision traversal machinery,
      the caller mistakenly started calling the function around
      34a4ae55 (log --grep: use the same helper to set -E/-F options as
      "git grep", 2012-10-03), instead of setting the .pattern_type_option
      field and letting grep_commit_pattern_type() take care of the
      details.
      
      This caused an unnecessary bug that made a configured
      grep.patternType take precedence over the command line options
      (e.g. --basic-regexp, --fixed-strings) in "git log" family of
      commands.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
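The intended division of labor, where callers only record a choice and a single commit step resolves it, might look like the Python sketch below. It is a simplified model with hypothetical names, assuming that an explicit command-line choice should win over configuration and that "basic" is the fallback.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GrepOpt:
    # Set from configuration (e.g. grep.patternType); None means unset.
    pattern_type_option: Optional[str] = None
    # Filled in by commit_pattern_type() just before patterns are compiled.
    effective_type: Optional[str] = None


def commit_pattern_type(command_line_type, opt):
    """Resolve the pattern type once, right before compiling patterns.

    The command line wins when given; otherwise the configured value is
    used; otherwise the sketch falls back to "basic".
    """
    if command_line_type is not None:
        opt.effective_type = command_line_type
    elif opt.pattern_type_option is not None:
        opt.effective_type = opt.pattern_type_option
    else:
        opt.effective_type = "basic"
    return opt.effective_type
```

The bug described above corresponds to a caller bypassing the commit step, so the configured value overrode an explicit command-line choice.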
  12. 02 Jul, 2016 (1 commit)
  13. 29 Oct, 2014 (1 commit)
  14. 11 May, 2013 (1 commit)
    • grep: allow to use textconv filters · 335ec3bf
      Committed by Jeff King
      Recently and not so recently, we made sure that log/grep type operations
      use textconv filters when a user-facing diff would do the same:
      
      ef90ab66 (pickaxe: use textconv for -S counting, 2012-10-28)
      b1c2f57d (diff_grep: use textconv buffers for add/deleted files, 2012-10-28)
      0508fe53 (combine-diff: respect textconv attributes, 2011-05-23)
      
      "git grep" currently does not use textconv filters at all, that is
      neither for displaying the match and context nor for the actual grepping,
      even when requested by --textconv.
      
      Introduce an option "--textconv" which makes git grep use any configured
      textconv filters for grepping and output purposes. It is off by default.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
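To make the effect of the option concrete, here is a minimal Python sketch in which a filter is applied to the content before both matching and output. In git a textconv filter is an external command configured per diff driver; the Python callable, the `hexdump_filter` example, and the `grep_blob` helper are hypothetical stand-ins.

```python
import re


def hexdump_filter(data):
    """Hypothetical textconv filter: render bytes as hexadecimal text."""
    return data.hex().encode("ascii")


def grep_blob(data, pattern, textconv=None, use_textconv=False):
    """Return matching lines; filter the content first only when asked.

    Mirrors the commit's default: the filter is ignored unless
    use_textconv is set.
    """
    if use_textconv and textconv is not None:
        data = textconv(data)
    text = data.decode("utf-8", errors="replace")
    regex = re.compile(pattern)
    return [line for line in text.splitlines() if regex.search(line)]


binary_blob = b"\x00\xca\xfe"
print(grep_blob(binary_blob, "cafe", textconv=hexdump_filter))
print(grep_blob(binary_blob, "cafe", textconv=hexdump_filter, use_textconv=True))
```

Without the flag the raw bytes are searched and nothing matches; with it, the converted text is both searched and printed.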
  15. 25 Feb, 2013 (1 commit)
  16. 12 Oct, 2012 (1 commit)
  17. 10 Oct, 2012 (2 commits)
  18. 30 Sep, 2012 (3 commits)
  19. 16 Sep, 2012 (1 commit)
  20. 15 Sep, 2012 (1 commit)
    • grep: teach --debug option to dump the parse tree · 17bf35a3
      Committed by Junio C Hamano
      Our "grep" allows complex boolean expressions to be formed to match
      each individual line with operators like --and, '(', ')' and --not.
      Introduce the "--debug" option to show the parse tree to help people
      who want to debug and enhance it.
      
      Also "log" learns "--grep-debug" option to do the same.  The command
      line parser to the log family is a lot more limited than the general
      "git grep" parser, but it has special handling for header matching
      (e.g. "--author"), and a parse tree is valuable when working on it.
      
      Note that "--all-match" is *not* any individual node in the parse
      tree.  It is an instruction to the evaluator to check all the nodes
      in the top-level backbone have matched and reject a document as
      non-matching otherwise.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
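The distinction between parse-tree nodes and the "--all-match" instruction can be illustrated with a toy evaluator. The Python sketch below is not git's parser or its debug output format: the node classes, the `dump` layout, and the representation of the top-level backbone as a plain list of OR-ed nodes are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    pattern: str


@dataclass(frozen=True)
class Not:
    child: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


def match_line(node, line):
    """Evaluate one expression node against a single line."""
    if isinstance(node, Atom):
        return re.search(node.pattern, line) is not None
    if isinstance(node, Not):
        return not match_line(node.child, line)
    if isinstance(node, And):
        return match_line(node.left, line) and match_line(node.right, line)
    raise TypeError(f"unknown node: {node!r}")


def dump(node, depth=0):
    """Render the tree as indented lines (format is made up for this sketch)."""
    pad = "  " * depth
    if isinstance(node, Atom):
        return [f"{pad}atom {node.pattern}"]
    if isinstance(node, Not):
        return [f"{pad}not"] + dump(node.child, depth + 1)
    return [f"{pad}and"] + dump(node.left, depth + 1) + dump(node.right, depth + 1)


def document_matches(top_level, lines, all_match=False):
    """top_level is the OR-ed backbone; all_match is an evaluator setting.

    Without all_match, one hit from any top-level node is enough. With
    it, every top-level node must have matched some line in the document.
    """
    hits = [any(match_line(node, line) for line in lines) for node in top_level]
    return all(hits) if all_match else any(hits)


backbone = [Atom("fix"), And(Atom("grep"), Not(Atom("pcre")))]
print("\n".join(dump(backbone[1])))
```

Note that `all_match` never appears in the tree itself; it only changes how the per-node results are combined for the whole document.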
  21. 04 Aug, 2012 (1 commit)
    • grep: add a grep.patternType configuration setting · 84befcd0
      Committed by J Smith
      The grep.extendedRegexp configuration setting enables the -E flag on grep
      by default but there are no equivalents for the -G, -F and -P flags.
      
      Rather than adding an additional setting for grep.fooRegexp for current
      and future pattern matching options, add a grep.patternType setting that
      can accept appropriate values for modifying the default grep pattern
      matching behavior. The current values are "basic", "extended", "fixed",
      "perl" and "default" for setting -G, -E, -F, -P and the default behavior
      respectively.
      
      When grep.patternType is set to a value other than "default", the
      grep.extendedRegexp setting is ignored. The value of "default" restores
      the current default behavior, including the grep.extendedRegexp
      behavior.
      Signed-off-by: J Smith <dark.panda@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
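The interaction between the two configuration settings described in this commit reduces to a short rule. The Python sketch below models only that rule; the function name is hypothetical, and it assumes that the default behavior without grep.extendedRegexp is basic regular expressions.

```python
VALID_PATTERN_TYPES = {"basic", "extended", "fixed", "perl", "default"}


def configured_pattern_type(pattern_type="default", extended_regexp=False):
    """Resolve grep.patternType together with grep.extendedRegexp.

    Any value other than "default" wins outright and extended_regexp is
    ignored. "default" keeps the older behavior, in which
    grep.extendedRegexp alone selects extended regular expressions.
    """
    if pattern_type not in VALID_PATTERN_TYPES:
        raise ValueError(f"unknown grep.patternType: {pattern_type!r}")
    if pattern_type != "default":
        return pattern_type
    return "extended" if extended_regexp else "basic"
```

For example, `configured_pattern_type("fixed", extended_regexp=True)` resolves to `"fixed"`, while `configured_pattern_type("default", extended_regexp=True)` resolves to `"extended"`.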
  22. 21 May, 2012 (1 commit)
  23. 03 Feb, 2012 (6 commits)
    • grep: respect diff attributes for binary-ness · 41b59bfc
      Committed by Jeff King
      There is currently no way for users to tell git-grep that a
      particular path is or is not a binary file; instead, grep
      always relies on its auto-detection (or the user specifying
      "-a" to treat all binary-looking files like text).
      
      This patch teaches git-grep to use the same attribute lookup
      that is used by git-diff. We could add a new "grep" flag,
      but that is unnecessarily complex and unlikely to be useful.
      Despite the name, the "-diff" attribute (or "diff=foo" and
      the associated diff.foo.binary config option) are really
      about describing the contents of the path. It's simply
      historical that diff was the only thing that cared about
      these attributes in the past.
      
      And if this simple approach turns out to be insufficient, we
      still have a backwards-compatible path forward: we can add a
      separate "grep" attribute, and fall back to respecting
      "diff" if it is unset.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
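A compact way to picture the lookup order is shown below. This Python sketch is a model under stated assumptions: `diff_attr` encodes the attribute state as `None` (unspecified), `False` ("-diff"), `True` ("diff"), or a driver name ("diff=foo"); `driver_binary` stands in for the diff.<name>.binary config; and the auto-detection, a NUL byte within the first 8000 bytes, is an assumed heuristic rather than something stated in the commit.

```python
FIRST_FEW_BYTES = 8000  # assumed size of the auto-detection window


def looks_binary(data):
    """Auto-detection fallback: a NUL byte near the start suggests binary."""
    return b"\0" in data[:FIRST_FEW_BYTES]


def is_binary(data, diff_attr=None, driver_binary=None):
    """Decide binary-ness, letting attributes override auto-detection."""
    driver_binary = driver_binary or {}
    if diff_attr is False:  # "-diff": explicitly binary
        return True
    if diff_attr is True:  # "diff": explicitly text
        return False
    if isinstance(diff_attr, str) and diff_attr in driver_binary:
        # "diff=foo" with diff.foo.binary configured
        return driver_binary[diff_attr]
    return looks_binary(data)
```

Only when neither the attribute nor the driver config gives an answer does the content itself get inspected.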
    • grep: cache userdiff_driver in grep_source · 94ad9d9e
      Committed by Jeff King
      Right now, grep only uses the userdiff_driver for one thing:
      looking up funcname patterns for "-p" and "-W".  As new uses
      for userdiff drivers are added to the grep code, we want to
      minimize attribute lookups, which can be expensive.
      
      It might seem at first that this would also optimize multiple
      lookups when the funcname pattern for a file is needed
      multiple times. However, the compiled funcname pattern is
      already cached in struct grep_opt's "priv" member, so
      multiple lookups are already suppressed.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • grep: drop grep_buffer's "name" parameter · c876d6da
      Committed by Jeff King
      Before the grep_source interface existed, grep_buffer was
      used by two types of callers:
      
        1. Ones which pulled a file into a buffer, and then wanted
           to supply the file's name for the output (i.e.,
           git grep).
      
        2. Ones which really just wanted to grep a buffer (i.e.,
           git log --grep).
      
      Callers in set (1) should now be using grep_source. Callers
      in set (2) always pass NULL for the "name" parameter of
      grep_buffer. We can therefore get rid of this now-useless
      parameter.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • grep: refactor the concept of "grep source" into an object · e1327023
      Committed by Jeff King
      The main interface to the low-level grep code is
      grep_buffer, which takes a pointer to a buffer and a size.
      This is convenient and flexible (we use it to grep commit
      bodies, files on disk, and blobs by sha1), but it makes it
      hard to pass extra information about what we are grepping
      (either for correctness, like overriding binary
      auto-detection, or for optimizations, like lazily loading
      blob contents).
      
      Instead, let's encapsulate the idea of a "grep source",
      including the buffer, its size, and where the data is coming
      from. This is similar to the diff_filespec structure used by
      the diff code (unsurprising, since future patches will
      implement some of the same optimizations found there).
      
      The diffstat is slightly scarier than the actual patch
      content. Most of the modified lines are simply replacing
      access to raw variables with their counterparts that are now
      in a "struct grep_source". Most of the added lines were
      taken from builtin/grep.c, which partially abstracted the
      idea of grep sources (for file vs sha1 sources).
      
      Instead of dropping the now-redundant code, this patch
      leaves builtin/grep.c using the traditional grep_buffer
      interface (which now wraps the grep_source interface). That
      makes it easy to test that there is no change of behavior
      (yet).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
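The idea of bundling the buffer, its size, and its origin into one object, with the old buffer interface kept as a thin wrapper, can be sketched as follows. This Python model uses hypothetical names and a callable loader to stand in for reading a file or blob; it is meant to show the shape of the refactoring, including lazy loading, not git's actual structure.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class GrepSource:
    name: str
    loader: Callable[[], bytes]  # where the data comes from
    _buf: Optional[bytes] = field(default=None, repr=False)

    def load(self):
        """Fetch the content on first use and cache it afterwards."""
        if self._buf is None:
            self._buf = self.loader()
        return self._buf

    @property
    def size(self):
        return len(self.load())


def grep_source(source, needle):
    """Search a source; content is only loaded at this point."""
    return [line for line in source.load().splitlines() if needle in line]


def grep_buffer(buf, needle):
    """Traditional interface, now a wrapper around grep_source()."""
    return grep_source(GrepSource("(buffer)", lambda: buf), needle)
```

Because the source object carries its own loader, extra information such as a binary override could later be attached to it without changing every caller's signature.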
    • grep: move sha1-reading mutex into low-level code · b3aeb285
      Committed by Jeff King
      The multi-threaded git-grep code needs to serialize access
      to the thread-unsafe read_sha1_file call. It does this with
      a mutex that is local to builtin/grep.c.
      
      Let's instead push this down into grep.c, where it can be
      used by both builtin/grep.c and grep.c. This will let us
      safely teach the low-level grep.c code tricks that involve
      reading from the object db.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • grep: make locking flag global · 78db6ea9
      Committed by Jeff King
      The low-level grep code traditionally didn't care about
      threading, as it doesn't do any threading itself and didn't
      call out to other non-thread-safe code.  That changed with
      0579f91d (grep: enable threading with -p and -W using lazy
      attribute lookup, 2011-12-12), which pushed the lookup of
      funcname attributes (which is not thread-safe) into the
      low-level grep code.
      
      As a result, the low-level code learned about a new global
      "grep_attr_mutex" to serialize access to the attribute code.
      A multi-threaded caller (e.g., builtin/grep.c) is expected
      to initialize the mutex and set "use_threads" in the
      grep_opt structure. The low-level code only uses the lock if
      use_threads is set.
      
      However, putting the use_threads flag into the grep_opt
      struct is not the most logical place. Whether threading is
      in use is not something that matters for each call to
      grep_buffer, but is instead global to the whole program
      (i.e., if any thread is doing multi-threaded grep, every
      other thread, even if it thinks it is doing its own
      single-threaded grep, would need to use the locking).  In
      practice, this distinction isn't a problem for us, because
      the only user of multi-threaded grep is "git-grep", which
      does nothing except call grep.
      
      This patch turns the opt->use_threads flag into a global
      flag. More important than the nit-picking semantic argument
      above is that this means that the locking functions don't
      need to actually have access to a grep_opt to know whether
      to lock. Which in turn can make adding new locks simpler, as
      we don't need to pass around a grep_opt.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      78db6ea9
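
To illustrate why a global flag simplifies the locking helpers described above, here is a small sketch. It assumes a hypothetical global named `use_threads_global` and hypothetical helpers `attr_lock`/`attr_unlock`; only the mutex name `grep_attr_mutex` comes from the commit message. The `locks_taken` counter exists purely so the behavior can be observed.

```c
#include <pthread.h>

/* Hypothetical global flag: set once by a multi-threaded caller. */
int use_threads_global;

/* Mutex named after the one mentioned in the commit message. */
static pthread_mutex_t grep_attr_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Demonstration-only counter of how often the mutex was really taken. */
int locks_taken;

/* No grep_opt parameter is needed: the helpers consult the global flag. */
void attr_lock(void)
{
	if (use_threads_global) {
		pthread_mutex_lock(&grep_attr_mutex);
		locks_taken++;
	}
}

void attr_unlock(void)
{
	if (use_threads_global)
		pthread_mutex_unlock(&grep_attr_mutex);
}
```

With this shape, adding another lock only requires another pair of parameterless helpers, rather than threading an options struct through every call site.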
  24. 17 Dec, 2011 1 commit
  25. 21 Aug, 2011 1 commit
    • F
      Use kwset in grep · 9eceddee
      Fredrik Kuivinen committed
      Benchmarks for the hot cache case:
      
      before:
      $ perf stat --repeat=5 git grep qwerty > /dev/null
      
      Performance counter stats for 'git grep qwerty' (5 runs):
      
              3,478,085 cache-misses             #      2.322 M/sec   ( +-   2.690% )
             11,356,177 cache-references         #      7.582 M/sec   ( +-   2.598% )
              3,872,184 branch-misses            #      0.363 %       ( +-   0.258% )
          1,067,367,848 branches                 #    712.673 M/sec   ( +-   2.622% )
          3,828,370,782 instructions             #      0.947 IPC     ( +-   0.033% )
          4,043,832,831 cycles                   #   2700.037 M/sec   ( +-   0.167% )
                  8,518 page-faults              #      0.006 M/sec   ( +-   3.648% )
                    847 CPU-migrations           #      0.001 M/sec   ( +-   3.262% )
                  6,546 context-switches         #      0.004 M/sec   ( +-   2.292% )
            1497.695495 task-clock-msecs         #      3.303 CPUs    ( +-   2.550% )
      
             0.453394396  seconds time elapsed   ( +-   0.912% )
      
      after:
      $ perf stat --repeat=5 git grep qwerty > /dev/null
      
      Performance counter stats for 'git grep qwerty' (5 runs):
      
              2,989,918 cache-misses             #      3.166 M/sec   ( +-   5.013% )
             10,986,041 cache-references         #     11.633 M/sec   ( +-   4.899% )  (scaled from 95.06%)
              3,511,993 branch-misses            #      1.422 %       ( +-   0.785% )
            246,893,561 branches                 #    261.433 M/sec   ( +-   3.967% )
          1,392,727,757 instructions             #      0.564 IPC     ( +-   0.040% )
          2,468,142,397 cycles                   #   2613.494 M/sec   ( +-   0.110% )
                  7,747 page-faults              #      0.008 M/sec   ( +-   3.995% )
                    897 CPU-migrations           #      0.001 M/sec   ( +-   2.383% )
                  6,535 context-switches         #      0.007 M/sec   ( +-   1.993% )
             944.384228 task-clock-msecs         #      3.177 CPUs    ( +-   0.268% )
      
             0.297257643  seconds time elapsed   ( +-   0.450% )
      
      So we gain about 35% by using the kwset code.
      
      As a side effect of using kwset two grep tests are fixed by this
      patch. The first is fixed because kwset can deal with case-insensitive
      search containing NULs, something strcasestr cannot do. The second one
      is fixed because we consider patterns containing NULs as fixed strings
      (regcomp cannot accept patterns with NULs).
      Signed-off-by: Fredrik Kuivinen <frekui@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      9eceddee
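
The NUL-handling point above comes down to string length: strcasestr takes NUL-terminated strings, so it stops at the first NUL byte, whereas a length-aware search can look past it. The sketch below is a naive length-aware, case-insensitive search written only to show that difference. It is not the kwset algorithm, and `find_nocase` is a hypothetical name.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Return the offset of the first case-insensitive occurrence of needle
 * in hay, or -1 if there is none. Both buffers carry explicit lengths,
 * so embedded NUL bytes are treated like any other byte.
 */
long find_nocase(const char *hay, size_t hay_len,
		 const char *needle, size_t needle_len)
{
	size_t i, j;

	if (needle_len > hay_len)
		return -1;
	for (i = 0; i + needle_len <= hay_len; i++) {
		for (j = 0; j < needle_len; j++) {
			unsigned char a = (unsigned char)hay[i + j];
			unsigned char b = (unsigned char)needle[j];

			if (tolower(a) != tolower(b))
				break;
		}
		if (j == needle_len)
			return (long)i;
	}
	return -1;
}
```

For example, searching the 7-byte buffer "abc\0DEF" for the 4-byte pattern "\0def" finds a match at offset 3, which a NUL-terminated search could not express at all.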
  26. 02 Aug, 2011 1 commit