提交 · d80b2fcbe0a023619e0fc73112f2a02c2662f6ab · openeuler / Kernel

17 3月, 2021 1 次提交

selftests/bpf: Build everything in debug mode · 252e3cbf

由 Andrii Nakryiko 提交于 3月 13, 2021

Build selftests, bpftool, and libbpf in debug mode with DWARF data to
facilitate easier debugging.

In terms of impact on building and running selftests. Build is actually faster
now:

BEFORE: make -j60  380.21s user 37.87s system 1466% cpu 28.503 total
AFTER:  make -j60  345.47s user 37.37s system 1599% cpu 23.939 total

test_progs runtime seems to be the same:

BEFORE:
real    1m5.139s
user    0m1.600s
sys     0m43.977s

AFTER:
real    1m3.799s
user    0m1.721s
sys     0m42.420s

Huge difference is being able to debug issues throughout test_progs, bpftool,
and libbpf without constantly updating 3 Makefiles by hand (including GDB
seeing the source code without any extra incantations).
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210313210920.1959628-5-andrii@kernel.org

252e3cbf

09 3月, 2021 1 次提交

selftests/bpf: Fix typo in Makefile · a0d73acc

由 Jean-Philippe Brucker 提交于 3月 08, 2021

The selftest build fails when trying to install the scripts:

rsync: [sender] link_stat "tools/testing/selftests/bpf/test_docs_build.sh" failed: No such file or directory (2)

Fix the filename.

Fixes: a01d935b ("tools/bpf: Remove bpf-helpers from bpftool docs")
Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210308182830.155784-1-jean-philippe@linaro.org

a0d73acc

05 3月, 2021 1 次提交

tools/bpf: Remove bpf-helpers from bpftool docs · a01d935b

由 Joe Stringer 提交于 3月 02, 2021

This logic is used for validating the manual pages from selftests, so
move the infra under tools/testing/selftests/bpf/ and rely on selftests
for validation rather than tying it into the bpftool build.
Signed-off-by: NJoe Stringer <joe@cilium.io>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Reviewed-by: NQuentin Monnet <quentin@isovalent.com>
Acked-by: NToke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210302171947.2268128-12-joe@cilium.io

a01d935b

27 2月, 2021 1 次提交

selftests/bpf: Copy extras in out-of-srctree builds · 86fd1665

由 Ilya Leoshkevich 提交于 2月 24, 2021

Building selftests in a separate directory like this:

    make O="$BUILD" -C tools/testing/selftests/bpf

and then running:

    cd "$BUILD" && ./test_progs -t btf

causes all the non-flavored btf_dump_test_case_*.c tests to fail,
because these files are not copied to where test_progs expects to find
them.

Fix by not skipping EXT-COPY when the original $(OUTPUT) is not empty
(lib.mk sets it to $(shell pwd) in that case) and using rsync instead
of cp: cp fails because e.g. urandom_read is being copied into itself,
and rsync simply skips such cases. rsync is already used by kselftests
and therefore is not a new dependency.
Signed-off-by: NIlya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210224111445.102342-1-iii@linux.ibm.com

86fd1665

12 2月, 2021 1 次提交

selftests/bpf: Integrate the socket_cookie test to test_progs · 61f8c9c8

由 Florent Revest 提交于 2月 10, 2021

Currently, the selftest for the BPF socket_cookie helpers is built and
run independently from test_progs. It's easy to forget and hard to
maintain.

This patch moves the socket cookies test into prog_tests/ and vastly
simplifies its logic by:
- rewriting the loading code with BPF skeletons
- rewriting the server/client code with network helpers
- rewriting the cgroup code with test__join_cgroup
- rewriting the error handling code with CHECKs
Signed-off-by: NFlorent Revest <revest@chromium.org>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NKP Singh <kpsingh@kernel.org>
Acked-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210210111406.785541-3-revest@chromium.org

61f8c9c8

29 1月, 2021 1 次提交

tools: Factor Clang, LLC and LLVM utils definitions · 211a741c

由 Sedat Dilek 提交于 1月 28, 2021

When dealing with BPF/BTF/pahole and DWARF v5 I wanted to build bpftool.

While looking into the source code I found duplicate assignments in misc tools
for the LLVM eco system, e.g. clang and llvm-objcopy.

Move the Clang, LLC and/or LLVM utils definitions to tools/scripts/Makefile.include
file and add missing includes where needed. Honestly, I was inspired by the commit
c8a950d0 ("tools: Factor HOSTCC, HOSTLD, HOSTAR definitions").

I tested with bpftool and perf on Debian/testing AMD64 and LLVM/Clang v11.1.0-rc1.

Build instructions:

[ make and make-options ]
MAKE="make V=1"
MAKE_OPTS="HOSTCC=clang HOSTCXX=clang++ HOSTLD=ld.lld CC=clang LD=ld.lld LLVM=1 LLVM_IAS=1"
MAKE_OPTS="$MAKE_OPTS PAHOLE=/opt/pahole/bin/pahole"

[ clean-up ]
$MAKE $MAKE_OPTS -C tools/ clean

[ bpftool ]
$MAKE $MAKE_OPTS -C tools/bpf/bpftool/

[ perf ]
PYTHON=python3 $MAKE $MAKE_OPTS -C tools/perf/

I was careful with respecting the user's wish to override custom compiler, linker,
GNU/binutils and/or LLVM utils settings.
Signed-off-by: NSedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NAndrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com> # tools/build and tools/perf
Link: https://lore.kernel.org/bpf/20210128015117.20515-1-sedat.dilek@gmail.com

211a741c

21 1月, 2021 1 次提交

bpf, selftests: Fold test_current_pid_tgid_new_ns into test_progs. · 09c02d55

由 Carlos Neira 提交于 1月 14, 2021

Currently tests for bpf_get_ns_current_pid_tgid() are outside test_progs.
This change folds test cases into test_progs.

Changes from v11:

 - Fixed test failure is not detected.
 - Removed EXIT(3) call as it will stop test_progs execution.
Signed-off-by: NCarlos Neira <cneirabustos@gmail.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210114141033.GA17348@localhostSigned-off-by: NAlexei Starovoitov <ast@kernel.org>

09c02d55

15 1月, 2021 1 次提交

bpf: Add tests for new BPF atomic operations · 98d666d0

由 Brendan Jackman 提交于 1月 14, 2021

The prog_test that's added depends on Clang/LLVM features added by
Yonghong in commit 286daafd6512 (was https://reviews.llvm.org/D72184).

Note the use of a define called ENABLE_ATOMICS_TESTS: this is used
to:

 - Avoid breaking the build for people on old versions of Clang
 - Avoid needing separate lists of test objects for no_alu32, where
   atomics are not supported even if Clang has the feature.

The atomics_test.o BPF object is built unconditionally both for
test_progs and test_progs-no_alu32. For test_progs, if Clang supports
atomics, ENABLE_ATOMICS_TESTS is defined, so it includes the proper
test code. Otherwise, progs and global vars are defined anyway, as
stubs; this means that the skeleton user code still builds.

The atomics_test.o userspace object is built once and used for both
test_progs and test_progs-no_alu32. A variable called skip_tests is
defined in the BPF object's data section, which tells the userspace
object whether to skip the atomics test.
Signed-off-by: NBrendan Jackman <jackmanb@google.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NYonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210114181751.768687-11-jackmanb@google.com

98d666d0

14 1月, 2021 5 次提交

selftests/bpf: Install btf_dump test cases · b8d1cbef

由 Jean-Philippe Brucker 提交于 1月 13, 2021

The btf_dump test cannot access the original source files for comparison
when running the selftests out of tree, causing several failures:

awk: btf_dump_test_case_syntax.c: No such file or directory
...

Add those files to $(TEST_FILES) to have "make install" pick them up.
Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210113163319.1516382-6-jean-philippe@linaro.org

b8d1cbef

selftests/bpf: Fix installation of urandom_read · ca1e8467

由 Jean-Philippe Brucker 提交于 1月 13, 2021

For out-of-tree builds, $(TEST_CUSTOM_PROGS) require the $(OUTPUT)
prefix, otherwise the kselftest lib doesn't know how to install them:

rsync: [sender] link_stat "tools/testing/selftests/bpf/urandom_read" failed: No such file or directory (2)
Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Acked-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210113163319.1516382-5-jean-philippe@linaro.org

ca1e8467

selftests/bpf: Move generated test files to $(TEST_GEN_FILES) · d6ac8cad

由 Jean-Philippe Brucker 提交于 1月 13, 2021

During an out-of-tree build, attempting to install the $(TEST_FILES)
into the $(OUTPUT) directory fails, because the objects were already
generated into $(OUTPUT):

rsync: [sender] link_stat "tools/testing/selftests/bpf/test_lwt_ip_encap.o" failed: No such file or directory (2)
rsync: [sender] link_stat "tools/testing/selftests/bpf/test_tc_edt.o" failed: No such file or directory (2)

Use $(TEST_GEN_FILES) instead.
Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210113163319.1516382-4-jean-philippe@linaro.org

d6ac8cad

selftests/bpf: Fix out-of-tree build · 5837cede

由 Jean-Philippe Brucker 提交于 1月 13, 2021

When building out-of-tree, the .skel.h files are generated into the
$(OUTPUT) directory, rather than $(CURDIR). Add $(OUTPUT) to the include
paths.
Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Acked-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210113163319.1516382-3-jean-philippe@linaro.org

5837cede

selftests/bpf: Enable cross-building · de11ae4f

由 Jean-Philippe Brucker 提交于 1月 13, 2021

Build bpftool and resolve_btfids using the host toolchain when
cross-compiling, since they are executed during build to generate the
selftests. Add a host build directory in order to build both host and
target version of libbpf. Build host tools using $(HOSTCC) defined in
Makefile.include.
Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Acked-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210113163319.1516382-2-jean-philippe@linaro.org

de11ae4f

17 12月, 2020 1 次提交

selftests/bpf: Clarify build error if no vmlinux · 1a3449c1

由 Kamal Mostafa 提交于 12月 15, 2020

If Makefile cannot find any of the vmlinux's in its VMLINUX_BTF_PATHS list,
it tries to run btftool incorrectly, with VMLINUX_BTF unset:

    bpftool btf dump file $(VMLINUX_BTF) format c

Such that the keyword 'format' is misinterpreted as the path to vmlinux.
The resulting build error message is fairly cryptic:

      GEN      vmlinux.h
    Error: failed to load BTF from format: No such file or directory

This patch makes the failure reason clearer by yielding this instead:

    Makefile:...: *** Cannot find a vmlinux for VMLINUX_BTF at any of
    "{paths}".  Stop.

Fixes: acbd0620 ("selftests/bpf: Add vmlinux.h selftest exercising tracing of syscalls")
Signed-off-by: NKamal Mostafa <kamal@canonical.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201215182011.15755-1-kamal@canonical.com

1a3449c1

11 12月, 2020 2 次提交

selftests/bpf: Drop the need for LLVM's llc · 89ad7420

由 Andrew Delgadillo 提交于 12月 11, 2020

LLC is meant for compiler development and debugging. Consequently, it
exposes many low level options about its backend. To avoid future bugs
introduced by using the raw LLC tool, use clang directly so that all
appropriate options are passed to the back end.

Additionally, simplify the Makefile by removing the
CLANG_NATIVE_BPF_BUILD_RULE as it is not being use, stop passing
dwarfris attr since elfutils/libdw now supports the bpf backend (which
should work with any recent pahole), and stop passing alu32 since
-mcpu=v3 implies alu32.
Signed-off-by: NAndrew Delgadillo <adelg@google.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Acked-by: NYonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20201211004344.3355074-1-adelg@google.com

89ad7420

selftests/bpf: fix bpf_testmod.ko recompilation logic · a67079b0

由 Andrii Nakryiko 提交于 12月 10, 2020

bpf_testmod.ko build rule declared dependency on VMLINUX_BTF, but the variable
itself was initialized after the rule was declared, which often caused
bpf_testmod.ko to not be re-compiled. Fix by moving VMLINUX_BTF determination
sooner.

Also enforce bpf_testmod.ko recompilation when we detect that vmlinux image
changed by removing bpf_testmod/bpf_testmod.ko. This is necessary to generate
correct module's split BTF. Without it, Kbuild's module build logic might
determine that nothing changed on the kernel side and thus bpf_testmod.ko
shouldn't be rebuilt, so won't re-generate module BTF, which often leads to
module's BTF with wrong string offsets against vmlinux BTF. Removing .ko file
forces Kbuild to re-build the module.
Reported-by: NAlexei Starovoitov <ast@kernel.org>
Fixes: 9f7fa225 ("selftests/bpf: Add bpf_testmod kernel module for testing")
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20201211015946.4062098-1-andrii@kernel.orgSigned-off-by: NAlexei Starovoitov <ast@kernel.org>

a67079b0

10 12月, 2020 1 次提交

selftests/bpf: Drop tcp-{client,server}.py from Makefile · a5b7b119

由 Veronika Kabatova 提交于 12月 10, 2020

The files don't exist anymore so this breaks generic kselftest builds
when using "make install" or "make gen_tar".

Fixes: 247f0ec3 ("selftests/bpf: Drop python client/server in favor of threads")
Signed-off-by: NVeronika Kabatova <vkabatov@redhat.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20201210120134.2148482-1-vkabatov@redhat.com

a5b7b119

09 12月, 2020 2 次提交

selftests/bpf: Xsk selftests - SKB POLL, NOPOLL · facb7cb2

由 Weqaar Janjua 提交于 12月 07, 2020

Adds following tests:

1. AF_XDP SKB mode
   Generic mode XDP is driver independent, used when the driver does
   not have support for XDP. Works on any netdevice using sockets and
   generic XDP path. XDP hook from netif_receive_skb().
   a. nopoll - soft-irq processing
   b. poll - using poll() syscall
Signed-off-by: NWeqaar Janjua <weqaar.a.janjua@intel.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Tested-by: NYonghong Song <yhs@fb.com>
Acked-by: NBjörn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/20201207215333.11586-3-weqaar.a.janjua@intel.com

facb7cb2

selftests/bpf: Xsk selftests framework · a8905257

由 Weqaar Janjua 提交于 12月 07, 2020

This patch adds AF_XDP selftests framework under selftests/bpf.

Topology:
---------
     -----------           -----------
     |  xskX   | --------- |  xskY   |
     -----------     |     -----------
          |          |          |
     -----------     |     ----------
     |  vethX  | --------- |  vethY |
     -----------   peer    ----------
          |          |          |
     namespaceX      |     namespaceY

Prerequisites setup by script test_xsk.sh:

   Set up veth interfaces as per the topology shown ^^:
   * setup two veth interfaces and one namespace
   ** veth<xxxx> in root namespace
   ** veth<yyyy> in af_xdp<xxxx> namespace
   ** namespace af_xdp<xxxx>
   * create a spec file veth.spec that includes this run-time configuration
   *** xxxx and yyyy are randomly generated 4 digit numbers used to avoid
       conflict with any existing interface
   * tests the veth and xsk layers of the topology
Signed-off-by: NWeqaar Janjua <weqaar.a.janjua@intel.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Tested-by: NYonghong Song <yhs@fb.com>
Acked-by: NBjörn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/20201207215333.11586-2-weqaar.a.janjua@intel.com

a8905257

04 12月, 2020 2 次提交

selftests/bpf: Add bpf_testmod kernel module for testing · 9f7fa225

由 Andrii Nakryiko 提交于 12月 03, 2020

Add bpf_testmod module, which is conceptually out-of-tree module and provides
ways for selftests/bpf to test various kernel module-related functionality:
raw tracepoint, fentry/fexit/fmod_ret, etc. This module will be auto-loaded by
test_progs test runner and expected by some of selftests to be present and
loaded.

Pahole currently isn't able to generate BTF for static functions in kernel
modules, so make sure traced function is global.
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NMartin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-7-andrii@kernel.org

9f7fa225

bpf: Fix cold build of test_progs-no_alu32 · 58c185b8

由 Brendan Jackman 提交于 12月 03, 2020

This object lives inside the trunner output dir,
i.e. tools/testing/selftests/bpf/no_alu32/btf_data.o

At some point it gets copied into the parent directory during another
part of the build, but that doesn't happen when building
test_progs-no_alu32 from clean.
Signed-off-by: NBrendan Jackman <jackmanb@google.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Acked-by: NJiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/bpf/20201203120850.859170-1-jackmanb@google.com

58c185b8

01 12月, 2020 1 次提交

selftests/bpf: Fix flavored variants of test_ima · 854055c0

由 KP Singh 提交于 11月 26, 2020

Flavored variants of test_progs (e.g. test_progs-no_alu32) change their
working directory to the corresponding subdirectory (e.g. no_alu32).
Since the setup script required by test_ima (ima_setup.sh) is not
mentioned in the dependencies, it does not get copied to these
subdirectories and causes flavored variants of test_ima to fail.

Adding the script to TRUNNER_EXTRA_FILES ensures that the file is also
copied to the subdirectories for the flavored variants of test_progs.

Fixes: 34b82d3a ("bpf: Add a selftest for bpf_ima_inode_hash")
Reported-by: NYonghong Song <yhs@fb.com>
Suggested-by: NYonghong Song <yhs@fb.com>
Signed-off-by: NKP Singh <kpsingh@google.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20201126184946.1708213-1-kpsingh@chromium.org

854055c0

19 11月, 2020 1 次提交

selftests/bpf: Fix broken riscv build · 6016df8f

由 Björn Töpel 提交于 11月 18, 2020

The selftests/bpf Makefile includes system include directories from
the host, when building BPF programs. On RISC-V glibc requires that
__riscv_xlen is defined. This is not the case for "clang -target bpf",
which messes up __WORDSIZE (errno.h -> ... -> wordsize.h) and breaks
the build.

By explicitly defining __risc_xlen correctly for riscv, we can
workaround this.

Fixes: 167381f3 ("selftests/bpf: Makefile fix "missing" headers on build with -idirafter")
Signed-off-by: NBjörn Töpel <bjorn.topel@gmail.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Acked-by: NLuke Nelson <luke.r.nels@gmail.com>
Link: https://lore.kernel.org/bpf/20201118071640.83773-2-bjorn.topel@gmail.com

6016df8f

06 11月, 2020 1 次提交

selftests/bpf: Add checking of raw type dump in BTF writer APIs selftests · 1306c980

由 Andrii Nakryiko 提交于 11月 04, 2020

Add re-usable btf_helpers.{c,h} to provide BTF-related testing routines. Start
with adding a raw BTF dumping helpers.

Raw BTF dump is the most succinct and at the same time a very human-friendly
way to validate exact contents of BTF types. Cross-validate raw BTF dump and
writable BTF in a single selftest. Raw type dump checks also serve as a good
self-documentation.
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NSong Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201105043402.2530976-7-andrii@kernel.org

1306c980

04 11月, 2020 1 次提交

selftests/bpf: Move test_tcppbf_user into test_progs · aaf376bd

由 Alexander Duyck 提交于 11月 03, 2020

Recently a bug was missed due to the fact that test_tcpbpf_user is not a
part of test_progs. In order to prevent similar issues in the future move
the test functionality into test_progs. By doing this we can make certain
that it is a part of standard testing and will not be overlooked.

As a part of moving the functionality into test_progs it is necessary to
integrate with the test_progs framework and to drop any redundant code.
This patch:
1. Cleans up the include headers
2. Dropped a duplicate definition of bpf_find_map
3. Switched over to using test_progs specific cgroup functions
4. Renamed main to test_tcpbpf_user
5. Dropped return value in favor of CHECK_FAIL to check for errors

The general idea is that I wanted to keep the changes as small as possible
while moving the file into the test_progs framework. The follow-on patches
are meant to clean up the remaining issues such as the use of CHECK_FAIL.
Signed-off-by: NAlexander Duyck <alexanderduyck@fb.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/160443928881.1086697.17661359319919165370.stgit@localhost.localdomain

aaf376bd

09 10月, 2020 1 次提交

kbuild: explicitly specify the build id style · a9684337

由 Bill Wendling 提交于 9月 22, 2020

ld's --build-id defaults to "sha1" style, while lld defaults to "fast".
The build IDs are very different between the two, which may confuse
programs that reference them.
Signed-off-by: NBill Wendling <morbo@google.com>
Acked-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>

a9684337

26 9月, 2020 1 次提交

bpf: selftest: Move sock_fields test into test_progs · 6f521a2b

由 Martin KaFai Lau 提交于 9月 24, 2020

This is a mechanical change to
1. move test_sock_fields.c to prog_tests/sock_fields.c
2. rename progs/test_sock_fields_kern.c to progs/test_sock_fields.c

Minimal change is made to the code itself.  Next patch will make
changes to use new ways of writing test, e.g. use skel and global
variables.
Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200925000427.3857814-1-kafai@fb.com

6f521a2b

16 9月, 2020 2 次提交

selftests/bpf: Merge most of test_btf into test_progs · c64779e2

由 Andrii Nakryiko 提交于 9月 15, 2020

Merge 183 tests from test_btf into test_progs framework to be exercised
regularly. All the test_btf tests that were moved are modeled as proper
sub-tests in test_progs framework for ease of debugging and reporting.

No functional or behavioral changes were intended, I tried to preserve
original behavior as much as possible. E.g., `test_progs -v` will activate
"always_log" flag to emit BTF validation log.

The only difference is in reducing the max_entries limit for pretty-printing
tests from (128 * 1024) to just 128 to reduce tests running time without
reducing the coverage.

Example test run:

  $ sudo ./test_progs -n 8
  ...
  #8 btf:OK
  Summary: 1/183 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: NAndrii Nakryiko <andriin@fb.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200916004819.3767489-1-andriin@fb.com

c64779e2

selftests/bpf: Test load and dump metadata with btftool and skel · d42d1cc4

由 YiFei Zhu 提交于 9月 15, 2020

This is a simple test to check that loading and dumping metadata
in btftool works, whether or not metadata contents are used by the
program.

A C test is also added to make sure the skeleton code can read the
metadata values.
Signed-off-by: NYiFei Zhu <zhuyifei@google.com>
Signed-off-by: NStanislav Fomichev <sdf@google.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NAndrii Nakryiko <andriin@fb.com>
Cc: YiFei Zhu <zhuyifei1999@gmail.com>
Link: https://lore.kernel.org/bpf/20200915234543.3220146-6-sdf@google.com

d42d1cc4

11 9月, 2020 1 次提交

selftests, bpftool: Add bpftool (and eBPF helpers) documentation build · 41d5c37b

由 Quentin Monnet 提交于 9月 09, 2020

eBPF selftests include a script to check that bpftool builds correctly
with different command lines. Let's add one build for bpftool's
documentation so as to detect errors or warning reported by rst2man when
compiling the man pages. Also add a build to the selftests Makefile to
make sure we build bpftool documentation along with bpftool when
building the selftests.

This also builds and checks warnings for the man page for eBPF helpers,
which is built along bpftool's documentation.

This change adds rst2man as a dependency for selftests (it comes with
Python's "docutils").

v2:
- Use "--exit-status=1" option for rst2man instead of counting lines
  from stderr.
- Also build bpftool as part as the selftests build (and not only when
  the tests are actually run).
Signed-off-by: NQuentin Monnet <quentin@isovalent.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NAndrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200909162251.15498-3-quentin@isovalent.com

41d5c37b

22 8月, 2020 1 次提交

selftests/bpf: BPF object files should depend only on libbpf headers · 3ac2e20f

由 Andrii Nakryiko 提交于 8月 20, 2020

There is no need to re-build BPF object files if any of the sources of libbpf
change. So record more precise dependency only on libbpf/bpf_*.h headers. This
eliminates unnecessary re-builds.
Signed-off-by: NAndrii Nakryiko <andriin@fb.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200820231250.1293069-2-andriin@fb.com

3ac2e20f

21 8月, 2020 1 次提交

selftests/bpf: Remove test_align leftovers · 5597432d

由 Veronika Kabatova 提交于 8月 19, 2020

Calling generic selftests "make install" fails as rsync expects all
files from TEST_GEN_PROGS to be present. The binary is not generated
anymore (commit 3b09d27c) so we can safely remove it from there
and also from gitignore.

Fixes: 3b09d27c ("selftests/bpf: Move test_align under test_progs")
Signed-off-by: NVeronika Kabatova <vkabatov@redhat.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NJesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/20200819160710.1345956-1-vkabatov@redhat.com

5597432d

08 8月, 2020 1 次提交

selftests/bpf: Fix silent Makefile output · d5ca5905

由 Andrii Nakryiko 提交于 8月 06, 2020

99aacebe ("selftests: do not use .ONESHELL") removed .ONESHELL, which
changes how Makefile "silences" multi-command target recipes. selftests/bpf's
Makefile relied (a somewhat unknowingly) on .ONESHELL behavior of silencing
all commands within the recipe if the first command contains @ symbol.
Removing .ONESHELL exposed this hack.

This patch fixes the issue by explicitly silencing each command with $(Q).

Also explicitly define fallback rule for building *.o from *.c, instead of
relying on non-silent inherited rule. This was causing a non-silent output for
bench.o object file.

Fixes: 92f7440e ("selftests/bpf: More succinct Makefile output")
Signed-off-by: NAndrii Nakryiko <andriin@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200807033058.848677-1-andriin@fb.com

d5ca5905

07 8月, 2020 1 次提交

selftests/bpf: Prevent runqslower from racing on building bpftool · 6bcaf41f

由 Andrii Nakryiko 提交于 8月 04, 2020

runqslower's Makefile is building/installing bpftool into
$(OUTPUT)/sbin/bpftool, which coincides with $(DEFAULT_BPFTOOL). In practice
this means that often when building selftests from scratch (after `make
clean`), selftests are racing with runqslower to simultaneously build bpftool
and one of the two processes fail due to file being busy. Prevent this race by
explicitly order-depending on $(BPFTOOL_DEFAULT).

Fixes: a2c9652f ("selftests: Refactor build to remove tools/lib/bpf from include path")
Signed-off-by: NAndrii Nakryiko <andriin@fb.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NJohn Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200805004757.2960750-1-andriin@fb.com

6bcaf41f

14 7月, 2020 1 次提交

selftests/bpf: Add test for resolve_btfids · cc15a20d

由 Jiri Olsa 提交于 7月 11, 2020

Adding resolve_btfids test under test_progs suite.

It's possible to use btf_ids.h header and its logic in
user space application, so we can add easy test for it.

The test defines BTF_ID_LIST and checks it gets properly
resolved.

For this reason the test_progs binary (and other binaries
that use TRUNNER* macros) is processed with resolve_btfids
tool, which resolves BTF IDs in .BTF_ids section. The BTF
data are taken from btf_data.o object rceated from
progs/btf_data.c.
Signed-off-by: NJiri Olsa <jolsa@kernel.org>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Tested-by: NAndrii Nakryiko <andriin@fb.com>
Acked-by: NAndrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200711215329.41165-10-jolsa@kernel.org

cc15a20d

01 7月, 2020 1 次提交

selftests/bpf: Allow substituting custom vmlinux.h for selftests build · ca4db638

由 Andrii Nakryiko 提交于 6月 29, 2020

Similarly to bpftool Makefile, allow to specify custom location of vmlinux.h
to be used during the build. This allows simpler testing setups with
checked-in pre-generated vmlinux.h.
Signed-off-by: NAndrii Nakryiko <andriin@fb.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NYonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200630004759.521530-2-andriin@fb.com

ca4db638

03 6月, 2020 1 次提交

selftests/bpf: Add a default $(CXX) value · e7ad28e6

由 Ilya Leoshkevich 提交于 6月 02, 2020

When using make kselftest TARGETS=bpf, tools/bpf is built with
MAKEFLAGS=rR, which causes $(CXX) to be undefined, which in turn causes
the build to fail with

  CXX      test_cpp
/bin/sh: 2: g: not found

Fix by adding a default $(CXX) value, like tools/build/feature/Makefile
already does.
Signed-off-by: NIlya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200602175649.2501580-3-iii@linux.ibm.com

e7ad28e6

02 6月, 2020 1 次提交

bpf: Add BPF ringbuf and perf buffer benchmarks · c97099b0

由 Andrii Nakryiko 提交于 5月 29, 2020

Extend bench framework with ability to have benchmark-provided child argument
parser for custom benchmark-specific parameters. This makes bench generic code
modular and independent from any specific benchmark.

Also implement a set of benchmarks for new BPF ring buffer and existing perf
buffer. 4 benchmarks were implemented: 2 variations for each of BPF ringbuf
and perfbuf:,
  - rb-libbpf utilizes stock libbpf ring_buffer manager for reading data;
  - rb-custom implements custom ring buffer setup and reading code, to
    eliminate overheads inherent in generic libbpf code due to callback
    functions and the need to update consumer position after each consumed
    record, instead of batching updates (due to pessimistic assumption that
    user callback might take long time and thus could unnecessarily hold ring
    buffer space for too long);
  - pb-libbpf uses stock libbpf perf_buffer code with all the default
    settings, though uses higher-performance raw event callback to minimize
    unnecessary overhead;
  - pb-custom implements its own custom consumer code to minimize any possible
    overhead of generic libbpf implementation and indirect function calls.

All of the test support default, no data notification skipped, mode, as well
as sampled mode (with --rb-sampled flag), which allows to trigger epoll
notification less frequently and reduce overhead. As will be shown, this mode
is especially critical for perf buffer, which suffers from high overhead of
wakeups in kernel.

Otherwise, all benchamrks implement similar way to generate a batch of records
by using fentry/sys_getpgid BPF program, which pushes a bunch of records in
a tight loop and records number of successful and dropped samples. Each record
is a small 8-byte integer, to minimize the effect of memory copying with
bpf_perf_event_output() and bpf_ringbuf_output().

Benchmarks that have only one producer implement optional back-to-back mode,
in which record production and consumption is alternating on the same CPU.
This is the highest-throughput happy case, showing ultimate performance
achievable with either BPF ringbuf or perfbuf.

All the below scenarios are implemented in a script in
benchs/run_bench_ringbufs.sh. Tests were performed on 28-core/56-thread
Intel Xeon CPU E5-2680 v4 @ 2.40GHz CPU.

Single-producer, parallel producer
==================================
rb-libbpf            12.054 ± 0.320M/s (drops 0.000 ± 0.000M/s)
rb-custom            8.158 ± 0.118M/s (drops 0.001 ± 0.003M/s)
pb-libbpf            0.931 ± 0.007M/s (drops 0.000 ± 0.000M/s)
pb-custom            0.965 ± 0.003M/s (drops 0.000 ± 0.000M/s)

Single-producer, parallel producer, sampled notification
========================================================
rb-libbpf            11.563 ± 0.067M/s (drops 0.000 ± 0.000M/s)
rb-custom            15.895 ± 0.076M/s (drops 0.000 ± 0.000M/s)
pb-libbpf            9.889 ± 0.032M/s (drops 0.000 ± 0.000M/s)
pb-custom            9.866 ± 0.028M/s (drops 0.000 ± 0.000M/s)

Single producer on one CPU, consumer on another one, both running at full
speed. Curiously, rb-libbpf has higher throughput than objectively faster (due
to more lightweight consumer code path) rb-custom. It appears that faster
consumer causes kernel to send notifications more frequently, because consumer
appears to be caught up more frequently. Performance of perfbuf suffers from
default "no sampling" policy and huge overhead that causes.

In sampled mode, rb-custom is winning very significantly eliminating too
frequent in-kernel wakeups, the gain appears to be more than 2x.

Perf buffer achieves even more impressive wins, compared to stock perfbuf
settings, with 10x improvements in throughput with 1:500 sampling rate. The
trade-off is that with sampling, application might not get next X events until
X+1st arrives, which is not always acceptable. With steady influx of events,
though, this shouldn't be a problem.

Overall, single-producer performance of ring buffers seems to be better no
matter the sampled/non-sampled modes, but it especially beats ring buffer
without sampling due to its adaptive notification approach.

Single-producer, back-to-back mode
==================================
rb-libbpf            15.507 ± 0.247M/s (drops 0.000 ± 0.000M/s)
rb-libbpf-sampled    14.692 ± 0.195M/s (drops 0.000 ± 0.000M/s)
rb-custom            21.449 ± 0.157M/s (drops 0.000 ± 0.000M/s)
rb-custom-sampled    20.024 ± 0.386M/s (drops 0.000 ± 0.000M/s)
pb-libbpf            1.601 ± 0.015M/s (drops 0.000 ± 0.000M/s)
pb-libbpf-sampled    8.545 ± 0.064M/s (drops 0.000 ± 0.000M/s)
pb-custom            1.607 ± 0.022M/s (drops 0.000 ± 0.000M/s)
pb-custom-sampled    8.988 ± 0.144M/s (drops 0.000 ± 0.000M/s)

Here we test a back-to-back mode, which is arguably best-case scenario both
for BPF ringbuf and perfbuf, because there is no contention and for ringbuf
also no excessive notification, because consumer appears to be behind after
the first record. For ringbuf, custom consumer code clearly wins with 21.5 vs
16 million records per second exchanged between producer and consumer. Sampled
mode actually hurts a bit due to slightly slower producer logic (it needs to
fetch amount of data available to decide whether to skip or force notification).

Perfbuf with wakeup sampling gets 5.5x throughput increase, compared to
no-sampling version. There also doesn't seem to be noticeable overhead from
generic libbpf handling code.

Perfbuf back-to-back, effect of sample rate
===========================================
pb-sampled-1         1.035 ± 0.012M/s (drops 0.000 ± 0.000M/s)
pb-sampled-5         3.476 ± 0.087M/s (drops 0.000 ± 0.000M/s)
pb-sampled-10        5.094 ± 0.136M/s (drops 0.000 ± 0.000M/s)
pb-sampled-25        7.118 ± 0.153M/s (drops 0.000 ± 0.000M/s)
pb-sampled-50        8.169 ± 0.156M/s (drops 0.000 ± 0.000M/s)
pb-sampled-100       8.887 ± 0.136M/s (drops 0.000 ± 0.000M/s)
pb-sampled-250       9.180 ± 0.209M/s (drops 0.000 ± 0.000M/s)
pb-sampled-500       9.353 ± 0.281M/s (drops 0.000 ± 0.000M/s)
pb-sampled-1000      9.411 ± 0.217M/s (drops 0.000 ± 0.000M/s)
pb-sampled-2000      9.464 ± 0.167M/s (drops 0.000 ± 0.000M/s)
pb-sampled-3000      9.575 ± 0.273M/s (drops 0.000 ± 0.000M/s)

This benchmark shows the effect of event sampling for perfbuf. Back-to-back
mode for highest throughput. Just doing every 5th record notification gives
3.5x speed up. 250-500 appears to be the point of diminishing return, with
almost 9x speed up. Most benchmarks use 500 as the default sampling for pb-raw
and pb-custom.

Ringbuf back-to-back, effect of sample rate
===========================================
rb-sampled-1         1.106 ± 0.010M/s (drops 0.000 ± 0.000M/s)
rb-sampled-5         4.746 ± 0.149M/s (drops 0.000 ± 0.000M/s)
rb-sampled-10        7.706 ± 0.164M/s (drops 0.000 ± 0.000M/s)
rb-sampled-25        12.893 ± 0.273M/s (drops 0.000 ± 0.000M/s)
rb-sampled-50        15.961 ± 0.361M/s (drops 0.000 ± 0.000M/s)
rb-sampled-100       18.203 ± 0.445M/s (drops 0.000 ± 0.000M/s)
rb-sampled-250       19.962 ± 0.786M/s (drops 0.000 ± 0.000M/s)
rb-sampled-500       20.881 ± 0.551M/s (drops 0.000 ± 0.000M/s)
rb-sampled-1000      21.317 ± 0.532M/s (drops 0.000 ± 0.000M/s)
rb-sampled-2000      21.331 ± 0.535M/s (drops 0.000 ± 0.000M/s)
rb-sampled-3000      21.688 ± 0.392M/s (drops 0.000 ± 0.000M/s)

Similar benchmark for ring buffer also shows a great advantage (in terms of
throughput) of skipping notifications. Skipping every 5th one gives 4x boost.
Also similar to perfbuf case, 250-500 seems to be the point of diminishing
returns, giving roughly 20x better results.

Keep in mind, for this test, notifications are controlled manually with
BPF_RB_NO_WAKEUP and BPF_RB_FORCE_WAKEUP. As can be seen from previous
benchmarks, adaptive notifications based on consumer's positions provides same
(or even slightly better due to simpler load generator on BPF side) benefits in
favorable back-to-back scenario. Over zealous and fast consumer, which is
almost always caught up, will make thoughput numbers smaller. That's the case
when manual notification control might prove to be extremely beneficial.

Ringbuf back-to-back, reserve+commit vs output
==============================================
reserve              22.819 ± 0.503M/s (drops 0.000 ± 0.000M/s)
output               18.906 ± 0.433M/s (drops 0.000 ± 0.000M/s)

Ringbuf sampled, reserve+commit vs output
=========================================
reserve-sampled      15.350 ± 0.132M/s (drops 0.000 ± 0.000M/s)
output-sampled       14.195 ± 0.144M/s (drops 0.000 ± 0.000M/s)

BPF ringbuf supports two sets of APIs with various usability and performance
tradeoffs: bpf_ringbuf_reserve()+bpf_ringbuf_commit() vs bpf_ringbuf_output().
This benchmark clearly shows superiority of reserve+commit approach, despite
using a small 8-byte record size.

Single-producer, consumer/producer competing on the same CPU, low batch count
=============================================================================
rb-libbpf            3.045 ± 0.020M/s (drops 3.536 ± 0.148M/s)
rb-custom            3.055 ± 0.022M/s (drops 3.893 ± 0.066M/s)
pb-libbpf            1.393 ± 0.024M/s (drops 0.000 ± 0.000M/s)
pb-custom            1.407 ± 0.016M/s (drops 0.000 ± 0.000M/s)

This benchmark shows one of the worst-case scenarios, in which producer and
consumer do not coordinate *and* fight for the same CPU. No batch count and
sampling settings were able to eliminate drops for ringbuffer, producer is
just too fast for consumer to keep up. But ringbuf and perfbuf still able to
pass through quite a lot of messages, which is more than enough for a lot of
applications.

Ringbuf, multi-producer contention
==================================
rb-libbpf nr_prod 1  10.916 ± 0.399M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 2  4.931 ± 0.030M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 3  4.880 ± 0.006M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 4  3.926 ± 0.004M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 8  4.011 ± 0.004M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 12 3.967 ± 0.016M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 16 2.604 ± 0.030M/s (drops 0.001 ± 0.002M/s)
rb-libbpf nr_prod 20 2.233 ± 0.003M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 24 2.085 ± 0.015M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 28 2.055 ± 0.004M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 32 1.962 ± 0.004M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 36 2.089 ± 0.005M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 40 2.118 ± 0.006M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 44 2.105 ± 0.004M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 48 2.120 ± 0.058M/s (drops 0.000 ± 0.001M/s)
rb-libbpf nr_prod 52 2.074 ± 0.024M/s (drops 0.007 ± 0.014M/s)

Ringbuf uses a very short-duration spinlock during reservation phase, to check
few invariants, increment producer count and set record header. This is the
biggest point of contention for ringbuf implementation. This benchmark
evaluates the effect of multiple competing writers on overall throughput of
a single shared ringbuffer.

Overall throughput drops almost 2x when going from single to two
highly-contended producers, gradually dropping with additional competing
producers.  Performance drop stabilizes at around 20 producers and hovers
around 2mln even with 50+ fighting producers, which is a 5x drop compared to
non-contended case. Good kernel implementation in kernel helps maintain decent
performance here.

Note, that in the intended real-world scenarios, it's not expected to get even
close to such a high levels of contention. But if contention will become
a problem, there is always an option of sharding few ring buffers across a set
of CPUs.
Signed-off-by: NAndrii Nakryiko <andriin@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200529075424.3139988-5-andriin@fb.comSigned-off-by: NAlexei Starovoitov <ast@kernel.org>

c97099b0

14 5月, 2020 2 次提交

selftest/bpf: Add BPF triggering benchmark · c5d420c3

由 Andrii Nakryiko 提交于 5月 12, 2020

It is sometimes desirable to be able to trigger BPF program from user-space
with minimal overhead. sys_enter would seem to be a good candidate, yet in
a lot of cases there will be a lot of noise from syscalls triggered by other
processes on the system. So while searching for low-overhead alternative, I've
stumbled upon getpgid() syscall, which seems to be specific enough to not
suffer from accidental syscall by other apps.

This set of benchmarks compares tp, raw_tp w/ filtering by syscall ID, kprobe,
fentry and fmod_ret with returning error (so that syscall would not be
executed), to determine the lowest-overhead way. Here are results on my
machine (using benchs/run_bench_trigger.sh script):

  base      :    9.200 ± 0.319M/s
  tp        :    6.690 ± 0.125M/s
  rawtp     :    8.571 ± 0.214M/s
  kprobe    :    6.431 ± 0.048M/s
  fentry    :    8.955 ± 0.241M/s
  fmodret   :    8.903 ± 0.135M/s

So it seems like fmodret doesn't give much benefit for such lightweight
syscall. Raw tracepoint is pretty decent despite additional filtering logic,
but it will be called for any other syscall in the system, which rules it out.
Fentry, though, seems to be adding the least amoung of overhead and achieves
97.3% of performance of baseline no-BPF-attached syscall.

Using getpgid() seems to be preferable to set_task_comm() approach from
test_overhead, as it's about 2.35x faster in a baseline performance.
Signed-off-by: NAndrii Nakryiko <andriin@fb.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NJohn Fastabend <john.fastabend@gmail.com>
Acked-by: NYonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200512192445.2351848-5-andriin@fb.com

c5d420c3

selftest/bpf: Fmod_ret prog and implement test_overhead as part of bench · 4eaf0b5c

由 Andrii Nakryiko 提交于 5月 12, 2020

Add fmod_ret BPF program to existing test_overhead selftest. Also re-implement
user-space benchmarking part into benchmark runner to compare results. Results
with ./bench are consistently somewhat lower than test_overhead's, but relative
performance of various types of BPF programs stay consisten (e.g., kretprobe is
noticeably slower). This slowdown seems to be coming from the fact that
test_overhead is single-threaded, while benchmark always spins off at least
one thread for producer. This has been confirmed by hacking multi-threaded
test_overhead variant and also single-threaded bench variant. Resutls are
below. run_bench_rename.sh script from benchs/ subdirectory was used to
produce results for ./bench.

Single-threaded implementations
===============================

/* bench: single-threaded, atomics */
base      :    4.622 ± 0.049M/s
kprobe    :    3.673 ± 0.052M/s
kretprobe :    2.625 ± 0.052M/s
rawtp     :    4.369 ± 0.089M/s
fentry    :    4.201 ± 0.558M/s
fexit     :    4.309 ± 0.148M/s
fmodret   :    4.314 ± 0.203M/s

/* selftest: single-threaded, no atomics */
task_rename base        4555K events per sec
task_rename kprobe      3643K events per sec
task_rename kretprobe   2506K events per sec
task_rename raw_tp      4303K events per sec
task_rename fentry      4307K events per sec
task_rename fexit       4010K events per sec
task_rename fmod_ret    3984K events per sec

Multi-threaded implementations
==============================

/* bench: multi-threaded w/ atomics */
base      :    3.910 ± 0.023M/s
kprobe    :    3.048 ± 0.037M/s
kretprobe :    2.300 ± 0.015M/s
rawtp     :    3.687 ± 0.034M/s
fentry    :    3.740 ± 0.087M/s
fexit     :    3.510 ± 0.009M/s
fmodret   :    3.485 ± 0.050M/s

/* selftest: multi-threaded w/ atomics */
task_rename base        3872K events per sec
task_rename kprobe      3068K events per sec
task_rename kretprobe   2350K events per sec
task_rename raw_tp      3731K events per sec
task_rename fentry      3639K events per sec
task_rename fexit       3558K events per sec
task_rename fmod_ret    3511K events per sec

/* selftest: multi-threaded, no atomics */
task_rename base        3945K events per sec
task_rename kprobe      3298K events per sec
task_rename kretprobe   2451K events per sec
task_rename raw_tp      3718K events per sec
task_rename fentry      3782K events per sec
task_rename fexit       3543K events per sec
task_rename fmod_ret    3526K events per sec

Note that the fact that ./bench benchmark always uses atomic increments for
counting, while test_overhead doesn't, doesn't influence test results all that
much.
Signed-off-by: NAndrii Nakryiko <andriin@fb.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NJohn Fastabend <john.fastabend@gmail.com>
Acked-by: NYonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200512192445.2351848-4-andriin@fb.com

4eaf0b5c

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功