    Use kwset in grep · 9eceddee
    Committed by Fredrik Kuivinen
    Benchmarks for the hot cache case:
    
    before:
    $ perf stat --repeat=5 git grep qwerty > /dev/null
    
    Performance counter stats for 'git grep qwerty' (5 runs):
    
            3,478,085 cache-misses             #      2.322 M/sec   ( +-   2.690% )
           11,356,177 cache-references         #      7.582 M/sec   ( +-   2.598% )
            3,872,184 branch-misses            #      0.363 %       ( +-   0.258% )
        1,067,367,848 branches                 #    712.673 M/sec   ( +-   2.622% )
        3,828,370,782 instructions             #      0.947 IPC     ( +-   0.033% )
        4,043,832,831 cycles                   #   2700.037 M/sec   ( +-   0.167% )
                8,518 page-faults              #      0.006 M/sec   ( +-   3.648% )
                  847 CPU-migrations           #      0.001 M/sec   ( +-   3.262% )
                6,546 context-switches         #      0.004 M/sec   ( +-   2.292% )
          1497.695495 task-clock-msecs         #      3.303 CPUs    ( +-   2.550% )
    
           0.453394396  seconds time elapsed   ( +-   0.912% )
    
    after:
    $ perf stat --repeat=5 git grep qwerty > /dev/null
    
    Performance counter stats for 'git grep qwerty' (5 runs):
    
            2,989,918 cache-misses             #      3.166 M/sec   ( +-   5.013% )
           10,986,041 cache-references         #     11.633 M/sec   ( +-   4.899% )  (scaled from 95.06%)
            3,511,993 branch-misses            #      1.422 %       ( +-   0.785% )
          246,893,561 branches                 #    261.433 M/sec   ( +-   3.967% )
        1,392,727,757 instructions             #      0.564 IPC     ( +-   0.040% )
        2,468,142,397 cycles                   #   2613.494 M/sec   ( +-   0.110% )
                7,747 page-faults              #      0.008 M/sec   ( +-   3.995% )
                  897 CPU-migrations           #      0.001 M/sec   ( +-   2.383% )
                6,535 context-switches         #      0.007 M/sec   ( +-   1.993% )
           944.384228 task-clock-msecs         #      3.177 CPUs    ( +-   0.268% )
    
           0.297257643  seconds time elapsed   ( +-   0.450% )
    
    So elapsed time drops by about 35% (from 0.453 s to 0.297 s) when
    using the kwset code.
    
    As a side effect of using kwset, two grep tests are fixed by this
    patch. The first is fixed because kwset can handle case-insensitive
    searches for patterns containing NULs, something strcasestr cannot
    do. The second one is fixed because we treat patterns containing NULs
    as fixed strings (regcomp cannot accept patterns with NULs).
    
    Signed-off-by: Fredrik Kuivinen <frekui@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
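
The commit message says strcasestr cannot search for patterns containing NULs. The reason is that C string functions treat the first NUL byte as the end of both the haystack and the needle. The sketch below illustrates the alternative of passing explicit lengths, so a NUL becomes an ordinary byte. It is a naive illustration only: `memcasemem` is a hypothetical helper name, not part of git or of the kwset API, and it uses a simple quadratic scan rather than the algorithm kwset uses.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not git's kwset API): case-insensitive search for
 * needle[0..nlen) inside hay[0..hlen). Both lengths are explicit, so NUL
 * bytes are compared like any other byte instead of ending the string.
 * Returns the offset of the first match, or -1 if there is none.
 */
static long memcasemem(const char *hay, size_t hlen,
                       const char *needle, size_t nlen)
{
    size_t i, j;

    assert(hay != NULL && needle != NULL);
    if (nlen > hlen)
        return -1;

    /* Naive scan: try every start offset where the needle still fits. */
    for (i = 0; i + nlen <= hlen; i++) {
        for (j = 0; j < nlen; j++) {
            if (tolower((unsigned char)hay[i + j]) !=
                tolower((unsigned char)needle[j]))
                break;
        }
        if (j == nlen)
            return (long)i;
    }
    return -1;
}
```

For example, given `const char hay[] = "foo\0BAR";`, `strlen(hay)` reports 3, so any NUL-terminated search never sees the bytes after the NUL, while `memcasemem(hay, sizeof(hay) - 1, "bar", 3)` finds the match at offset 4.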
grep.c 27.2 KB