    samples/bpf: add map_lookup microbenchmark · 95ff141e
    Committed by Alexei Starovoitov
    $ map_perf_test 128
    speed of HASH bpf_map_lookup_elem() in lookups per second
    	w/o JIT		w/JIT
    before	46M		58M
    after	42M		74M
    
    perf report
    before:
        54.23%  map_perf_test  [kernel.kallsyms]  [k] __htab_map_lookup_elem
        14.24%  map_perf_test  [kernel.kallsyms]  [k] lookup_elem_raw
         8.84%  map_perf_test  [kernel.kallsyms]  [k] htab_map_lookup_elem
         5.93%  map_perf_test  [kernel.kallsyms]  [k] bpf_map_lookup_elem
         2.30%  map_perf_test  [kernel.kallsyms]  [k] bpf_prog_da4fc6a3f41761a2
         1.49%  map_perf_test  [kernel.kallsyms]  [k] kprobe_ftrace_handler
    
    after:
        60.03%  map_perf_test  [kernel.kallsyms]  [k] __htab_map_lookup_elem
        18.07%  map_perf_test  [kernel.kallsyms]  [k] lookup_elem_raw
         2.91%  map_perf_test  [kernel.kallsyms]  [k] bpf_prog_da4fc6a3f41761a2
         1.94%  map_perf_test  [kernel.kallsyms]  [k] _einittext
         1.90%  map_perf_test  [kernel.kallsyms]  [k] __audit_syscall_exit
         1.72%  map_perf_test  [kernel.kallsyms]  [k] kprobe_ftrace_handler
    
    Notice that bpf_map_lookup_elem() and htab_map_lookup_elem() are trivial
    functions, yet they take a sizeable amount of CPU time.
    htab_map_gen_lookup() removes the call to bpf_map_lookup_elem() and converts
    htab_map_lookup_elem() into three BPF insns, which causes the CPU time
    for bpf_prog_da4fc6a3f41761a2() to increase slightly.
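
A rough user-space model of what the three inlined steps compute may help here. This is only a sketch under assumptions: `elem_model`, `map_model`, `raw_lookup()`, `make_elem()` and `inlined_lookup()` are hypothetical names, not kernel code, and the layout (value stored right after the key, with the key size rounded up to 8 bytes) is assumed for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical element layout: a header, then the key padded to 8 bytes,
 * then the value. Not the kernel's struct htab_elem. */
struct elem_model {
    uint64_t hash;          /* placeholder header field */
    unsigned char key[];    /* key bytes, padding, then the value */
};

struct map_model {
    size_t key_size;
    struct elem_model *elems[4];
    size_t n;
};

static size_t round_up8(size_t x)
{
    return (x + 7) & ~(size_t)7;
}

/* Builds one element holding `key` and a 64-bit `value`. */
static struct elem_model *make_elem(size_t key_size, const void *key,
                                    uint64_t value)
{
    size_t off = round_up8(key_size);
    size_t total = sizeof(struct elem_model) + off + sizeof(value);
    struct elem_model *e = malloc(total);

    assert(e != NULL);
    memset(e, 0, total);
    memcpy(e->key, key, key_size);
    memcpy(e->key + off, &value, sizeof(value));
    return e;
}

/* Plays the role of the raw lookup: returns the element or NULL. */
static struct elem_model *raw_lookup(const struct map_model *m,
                                     const void *key)
{
    for (size_t i = 0; i < m->n; i++)
        if (memcmp(m->elems[i]->key, key, m->key_size) == 0)
            return m->elems[i];
    return NULL;
}

/* Models the three-step sequence:
 *   1. call the raw lookup
 *   2. if the result is NULL, skip step 3
 *   3. add the offset of the value to the result
 */
static void *inlined_lookup(const struct map_model *m, const void *key)
{
    unsigned char *ret = (unsigned char *)raw_lookup(m, key);

    if (ret == NULL)
        return NULL;
    return ret + offsetof(struct elem_model, key) + round_up8(m->key_size);
}
```

In this model the pointer arithmetic that a wrapper function would perform after the raw lookup is done directly by the caller, which is the effect the commit message describes for the BPF program.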
    
    $ map_perf_test 256
    speed of ARRAY bpf_map_lookup_elem() in lookups per second
    	w/o JIT		w/JIT
    before	97M		174M
    after	64M		280M
    
    before:
        37.33%  map_perf_test  [kernel.kallsyms]  [k] array_map_lookup_elem
        13.95%  map_perf_test  [kernel.kallsyms]  [k] bpf_map_lookup_elem
         6.54%  map_perf_test  [kernel.kallsyms]  [k] bpf_prog_da4fc6a3f41761a2
         4.57%  map_perf_test  [kernel.kallsyms]  [k] kprobe_ftrace_handler
    
    after:
        32.86%  map_perf_test  [kernel.kallsyms]  [k] bpf_prog_da4fc6a3f41761a2
         6.54%  map_perf_test  [kernel.kallsyms]  [k] kprobe_ftrace_handler
    
    array_map_gen_lookup() removes calls to array_map_lookup_elem()
    and bpf_map_lookup_elem() and replaces them with 7 bpf insns.
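
As an illustration, the inlined array lookup can be modelled in plain C as below. The step numbering in the comments is an assumed breakdown of the 7 insns, not taken from this commit message, and `array_model` and `inlined_array_lookup()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an array map: fixed-size slots in one buffer. */
struct array_model {
    uint32_t max_entries;
    uint32_t elem_size;     /* bytes per slot */
    unsigned char *value;   /* base of the value storage */
};

/* Returns a pointer to the slot for *key, or NULL if out of bounds. */
static void *inlined_array_lookup(const struct array_model *a,
                                  const uint32_t *key)
{
    assert(a->value != NULL);

    unsigned char *base = a->value;  /* 1: point at the value area */
    uint64_t ret = *key;             /* 2: load the 32-bit index */

    if (ret >= a->max_entries)       /* 3: bounds check */
        return NULL;                 /* 7: result is 0 when out of range */
    ret *= a->elem_size;             /* 4: scale the index by the slot size */
    return base + ret;               /* 5: add the base, 6: skip the NULL path */
}
```

No function call remains on the hit path in this model, which matches the profile above where only the BPF program itself and kprobe_ftrace_handler show up after the change.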
    
    Without JIT the lookups are slower, since executing extra insns
    in the interpreter is slower than running native C code,
    but with JIT the performance gains are obvious,
    since native C->x86 code is replaced with fewer bpf->x86 instructions.
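
To put the tables above in relative terms, a small helper can compute the percentage change between the "before" and "after" columns. `pct_change()` is a hypothetical helper written for this note; the inputs are the figures reported above, in millions of lookups per second.

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent. */
static double pct_change(double before, double after)
{
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}
```

For example, pct_change(58, 74) gives roughly +28% for HASH with JIT and pct_change(174, 280) roughly +61% for ARRAY with JIT, while the interpreter cases come out at about -9% (46 to 42) and -34% (97 to 64).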
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
map_perf_test_kern.c 4.4 KB