1. 03 Dec 2021 (4 commits)
  2. 01 Dec 2021 (3 commits)
    • selftest/bpf/benchs: Add bpf_loop benchmark · ec151037
      Authored by Joanne Koong
      Add a benchmark to measure the throughput and latency of the bpf_loop
      helper call.
      
      Testing this on my dev machine on 1 thread, the data is as follows:
      
              nr_loops: 10
      bpf_loop - throughput: 198.519 ± 0.155 M ops/s, latency: 5.037 ns/op
      
              nr_loops: 100
      bpf_loop - throughput: 247.448 ± 0.305 M ops/s, latency: 4.041 ns/op
      
              nr_loops: 500
      bpf_loop - throughput: 260.839 ± 0.380 M ops/s, latency: 3.834 ns/op
      
              nr_loops: 1000
      bpf_loop - throughput: 262.806 ± 0.629 M ops/s, latency: 3.805 ns/op
      
              nr_loops: 5000
      bpf_loop - throughput: 264.211 ± 1.508 M ops/s, latency: 3.785 ns/op
      
              nr_loops: 10000
      bpf_loop - throughput: 265.366 ± 3.054 M ops/s, latency: 3.768 ns/op
      
              nr_loops: 50000
      bpf_loop - throughput: 235.986 ± 20.205 M ops/s, latency: 4.238 ns/op
      
              nr_loops: 100000
      bpf_loop - throughput: 264.482 ± 0.279 M ops/s, latency: 3.781 ns/op
      
              nr_loops: 500000
      bpf_loop - throughput: 309.773 ± 87.713 M ops/s, latency: 3.228 ns/op
      
              nr_loops: 1000000
      bpf_loop - throughput: 262.818 ± 4.143 M ops/s, latency: 3.805 ns/op
      
      From this data, we can see that the latency per loop decreases as the
      number of loops increases. On this particular machine, each loop had an
      overhead of roughly 4 ns, and we were able to run roughly 250 million
      loops per second.
      Signed-off-by: Joanne Koong <joannekoong@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20211130030622.4131246-5-joannekoong@fb.com
      ec151037
    • selftests/bpf: Measure bpf_loop verifier performance · f6e659b7
      Authored by Joanne Koong
      This patch tests bpf_loop in pyperf and strobemeta, and measures the
      verifier performance of replacing the traditional for loop
      with bpf_loop.
      
      The results are as follows:
      
      ~strobemeta~
      
      Baseline
          verification time 6808200 usec
          stack depth 496
          processed 554252 insns (limit 1000000) max_states_per_insn 16
          total_states 15878 peak_states 13489  mark_read 3110
          #192 verif_scale_strobemeta:OK (unrolled loop)
      
      Using bpf_loop
          verification time 31589 usec
          stack depth 96+400
          processed 1513 insns (limit 1000000) max_states_per_insn 2
          total_states 106 peak_states 106 mark_read 60
          #193 verif_scale_strobemeta_bpf_loop:OK
      
      ~pyperf600~
      
      Baseline
          verification time 29702486 usec
          stack depth 368
          processed 626838 insns (limit 1000000) max_states_per_insn 7
          total_states 30368 peak_states 30279 mark_read 748
          #182 verif_scale_pyperf600:OK (unrolled loop)
      
      Using bpf_loop
          verification time 148488 usec
          stack depth 320+40
          processed 10518 insns (limit 1000000) max_states_per_insn 10
          total_states 705 peak_states 517 mark_read 38
          #183 verif_scale_pyperf600_bpf_loop:OK
      
      Using the bpf_loop helper led to approximately a 99% decrease
      in the verification time and in the number of instructions.
      Signed-off-by: Joanne Koong <joannekoong@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20211130030622.4131246-4-joannekoong@fb.com
      f6e659b7
    • selftests/bpf: Add bpf_loop test · 4e5070b6
      Authored by Joanne Koong
      Add a test for bpf_loop covering a variety of cases: various nr_loops
      values, a null callback ctx, invalid flags, and nested callbacks.
      Signed-off-by: Joanne Koong <joannekoong@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20211130030622.4131246-3-joannekoong@fb.com
      4e5070b6
  3. 29 Nov 2021 (1 commit)
  4. 26 Nov 2021 (9 commits)
  5. 23 Nov 2021 (1 commit)
  6. 20 Nov 2021 (1 commit)
  7. 19 Nov 2021 (2 commits)
  8. 18 Nov 2021 (1 commit)
  9. 17 Nov 2021 (3 commits)
  10. 16 Nov 2021 (4 commits)
    • selftests/bpf: Add uprobe triggering overhead benchmarks · d41bc48b
      Authored by Andrii Nakryiko
      Add a benchmark to measure the overhead of uprobes and uretprobes. Also
      add a baseline (no uprobe attached) benchmark.
      
      On my dev machine, the baseline benchmark can trigger 130M user_target()
      invocations per second. When a uprobe is attached, this falls to just
      700K/s. With a uretprobe, we get down to 520K/s:
      
        $ sudo ./bench trig-uprobe-base -a
        Summary: hits  131.289 ± 2.872M/s
      
        # UPROBE
        $ sudo ./bench -a trig-uprobe-without-nop
        Summary: hits    0.729 ± 0.007M/s
      
        $ sudo ./bench -a trig-uprobe-with-nop
        Summary: hits    1.798 ± 0.017M/s
      
        # URETPROBE
        $ sudo ./bench -a trig-uretprobe-without-nop
        Summary: hits    0.508 ± 0.012M/s
      
        $ sudo ./bench -a trig-uretprobe-with-nop
        Summary: hits    0.883 ± 0.008M/s
      
      So there is almost a 2.5x performance difference between probing a nop
      vs a non-nop instruction for an entry uprobe, and a 1.7x difference for
      a uretprobe.
      
      This means the overhead is around 1.4 microseconds for a non-nop uprobe
      and 2 microseconds for a non-nop uretprobe.
      
      For nop variants, uprobe and uretprobe overhead is down to 0.556 and
      1.13 microseconds, respectively.
      
      For comparison, just doing a very low-overhead syscall (with no BPF
      programs attached anywhere) gives:
      
        $ sudo ./bench trig-base -a
        Summary: hits    4.830 ± 0.036M/s
      
      So even nop uprobes are about 2.7x slower than a bare syscall (a pure
      user/kernel context switch).
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20211116013041.4072571-1-andrii@kernel.org
      d41bc48b
    • selftests/bpf: Configure dir paths via env in test_bpftool_synctypes.py · e12cd158
      Authored by Quentin Monnet
      The script test_bpftool_synctypes.py parses a number of files in the
      bpftool directory (or even elsewhere in the repo) to make sure that the
      lists of types or options in those different files are consistent.
      Instead of having fixed paths, let's make the directories configurable
      through environment variables. This should make it easier in the future
      to run the script in a different setup, for example on an out-of-tree
      bpftool mirror with a different layout.
      Signed-off-by: Quentin Monnet <quentin@isovalent.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20211115225844.33943-4-quentin@isovalent.com
      e12cd158
    • bpftool: Update doc (use susbtitutions) and test_bpftool_synctypes.py · b6231815
      Authored by Quentin Monnet
      test_bpftool_synctypes.py helps detect inconsistencies in bpftool
      between the different lists of types and options scattered across the
      sources, the documentation, and the bash completion. For options that
      apply to all bpftool commands, the script had a hardcoded list of
      values, and would use it to check whether the man pages were up to
      date. When writing the script, this list felt acceptable: it avoided
      opening and parsing bpftool's main.h every time, and the list of global
      options in bpftool doesn't change often.
      
      However, this is prone to omissions, and we recently added a new
      -l|--legacy option which was described in common_options.rst, but not
      listed in the options summary of each manual page. The script did not
      complain, because it keeps comparing the hardcoded list to the (now)
      outdated list in the header file.
      
      To address the issue, this commit brings the following changes:
      
      - Options that are common to all bpftool commands (--json, --pretty, and
        --debug) are moved to a dedicated file, and used in the definition of
        a RST substitution. This substitution is used in the sources of all
        the man pages.
      
      - This list of common options is updated, with the addition of the new
        -l|--legacy option.
      
      - The script test_bpftool_synctypes.py is updated to compare:
          - Options specific to a command, found in C files, for the
            interactive help messages, with the same specific options from the
            relevant man page for that command.
          - Common options, checked just once: the list in main.h is
            compared with the new list in substitutions.rst.
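      The substitution mechanism might look roughly like the following sketch (the substitution name and option markup below are illustrative; see substitutions.rst in the patch for the actual definition):

```rst
.. substitutions.rst: define the common options once
.. |COMMON_OPTIONS| replace:: { **-j** | **--json** } [{ **-p** | **--pretty** }] | { **-d** | **--debug** } | { **-l** | **--legacy** }

.. each man page source then includes the file and expands the substitution
.. include:: substitutions.rst

OPTIONS
=======
	|COMMON_OPTIONS|
```

      Defining the list once means adding a future global option touches a single file instead of every man page.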
      Signed-off-by: Quentin Monnet <quentin@isovalent.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20211115225844.33943-3-quentin@isovalent.com
      b6231815
    • selftests/bpf: Add a dedup selftest with equivalent structure types · 47461583
      Authored by Yonghong Song
      Without the previous libbpf patch, the following error occurs:
      
        $ ./test_progs -t btf
        ...
        do_test_dedup:FAIL:check btf_dedup failed errno:-22
        #13/205 btf/dedup: btf_type_tag #5, struct:FAIL
      
      The previous libbpf patch fixes the issue.
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20211115163943.3922547-1-yhs@fb.com
      47461583
  11. 13 Nov 2021 (2 commits)
  12. 12 Nov 2021 (9 commits)