提交 · ef89654f2bc7459f45b40be80de6cd3765ef8539 · openeuler / Kernel

26 4月, 2022 1 次提交

libbpf: Add kptr type tag macros to bpf_helpers.h · ef89654f

由 Kumar Kartikeya Dwivedi 提交于 4月 25, 2022

Include convenience definitions:
__kptr:	Unreferenced kptr
__kptr_ref: Referenced kptr

Users can use them to tag the pointer type meant to be used with the new
support directly in the map value definition.
Signed-off-by: NKumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220424214901.2743946-11-memxor@gmail.com

ef89654f

25 4月, 2022 1 次提交

libbpf: Remove unnecessary type cast · 003fed59

由 Yuntao Wang 提交于 4月 24, 2022

The link variable is already of type 'struct bpf_link *', casting it to
'struct bpf_link *' is redundant, drop it.
Signed-off-by: NYuntao Wang <ytcoode@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220424143420.457082-1-ytcoode@gmail.com

003fed59

23 4月, 2022 1 次提交

libbpf: Teach bpf_link_create() to fallback to bpf_raw_tracepoint_open() · 8462e0b4

由 Andrii Nakryiko 提交于 4月 20, 2022

Teach bpf_link_create() to fallback to bpf_raw_tracepoint_open() on
older kernels for programs that are attachable through
BPF_RAW_TRACEPOINT_OPEN. This makes bpf_link_create() more unified and
convenient interface for creating bpf_link-based attachments.

With this approach end users can just use bpf_link_create() for
tp_btf/fentry/fexit/fmod_ret/lsm program attachments without needing to
care about kernel support, as libbpf will handle this transparently. On
the other hand, as newer features (like BPF cookie) are added to
LINK_CREATE interface, they will be readily usable though the same
bpf_link_create() API without any major refactoring from user's
standpoint.

bpf_program__attach_btf_id() is now using bpf_link_create() internally
as well and will take advantaged of this unified interface when BPF
cookie is added for fentry/fexit.

Doing proactive feature detection of LINK_CREATE support for
fentry/tp_btf/etc is quite involved. It requires parsing vmlinux BTF,
determining some stable and guaranteed to be in all kernels versions
target BTF type (either raw tracepoint or fentry target function),
actually attaching this program and thus potentially affecting the
performance of the host kernel briefly, etc. So instead we are taking
much simpler "lazy" approach of falling back to
bpf_raw_tracepoint_open() call only if initial LINK_CREATE command
fails. For modern kernels this will mean zero added overhead, while
older kernels will incur minimal overhead with a single fast-failing
LINK_CREATE call.
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Reviewed-by: NKui-Feng Lee <kuifeng@fb.com>
Link: https://lore.kernel.org/bpf/20220421033945.3602803-3-andrii@kernel.org

8462e0b4

22 4月, 2022 1 次提交

libbpf: Remove redundant non-null checks on obj_elf · b71a2ebf

由 Gaosheng Cui 提交于 4月 21, 2022

Obj_elf is already non-null checked at the function entry, so remove
redundant non-null checks on obj_elf.
Signed-off-by: NGaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220421031803.2283974-1-cuigaosheng1@huawei.com

b71a2ebf

21 4月, 2022 3 次提交

libbpf: Add documentation to API functions · a66ab9a9

由 Grant Seltzer 提交于 4月 20, 2022

This adds documentation for the following API functions:

- bpf_program__set_expected_attach_type()
- bpf_program__set_type()
- bpf_program__set_attach_target()
- bpf_program__attach()
- bpf_program__pin()
- bpf_program__unpin()
Signed-off-by: NGrant Seltzer <grantseltzer@gmail.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-3-grantseltzer@gmail.com

a66ab9a9

libbpf: Update API functions usage to check error · df286716

由 Grant Seltzer 提交于 4月 20, 2022

This updates usage of the following API functions within
libbpf so their newly added error return is checked:

- bpf_program__set_expected_attach_type()
- bpf_program__set_type()
Signed-off-by: NGrant Seltzer <grantseltzer@gmail.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-2-grantseltzer@gmail.com

df286716

libbpf: Add error returns to two API functions · 93442f13

由 Grant Seltzer 提交于 4月 20, 2022

This adds an error return to the following API functions:

- bpf_program__set_expected_attach_type()
- bpf_program__set_type()

In both cases, the error occurs when the BPF object has
already been loaded when the function is called. In this
case -EBUSY is returned.
Signed-off-by: NGrant Seltzer <grantseltzer@gmail.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-1-grantseltzer@gmail.com

93442f13

20 4月, 2022 3 次提交

libbpf: Support riscv USDT argument parsing logic · 58ca8b05

由 Pu Lehui 提交于 4月 19, 2022

Add riscv-specific USDT argument specification parsing logic.
riscv USDT argument format is shown below:
- Memory dereference case:
  "size@off(reg)", e.g. "-8@-88(s0)"
- Constant value case:
  "size@val", e.g. "4@5"
- Register read case:
  "size@reg", e.g. "-8@a1"

s8 will be marked as poison while it's a reg of riscv, we need
to alias it in advance. Both RV32 and RV64 have been tested.
Signed-off-by: NPu Lehui <pulehui@huawei.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220419145238.482134-3-pulehui@huawei.com

58ca8b05

libbpf: Fix usdt_cookie being cast to 32 bits · 5af25a41

由 Pu Lehui 提交于 4月 19, 2022

The usdt_cookie is defined as __u64, which should not be
used as a long type because it will be cast to 32 bits
in 32-bit platforms.
Signed-off-by: NPu Lehui <pulehui@huawei.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220419145238.482134-2-pulehui@huawei.com

5af25a41

libbpf: Support opting out from autoloading BPF programs declaratively · a3820c48

由 Andrii Nakryiko 提交于 4月 18, 2022

Establish SEC("?abc") naming convention (i.e., adding question mark in
front of otherwise normal section name) that allows to set corresponding
program's autoload property to false. This is effectively just
a declarative way to do bpf_program__set_autoload(prog, false).

Having a way to do this declaratively in BPF code itself is useful and
convenient for various scenarios. E.g., for testing, when BPF object
consists of multiple independent BPF programs that each needs to be
tested separately. Opting out all of them by default and then setting
autoload to true for just one of them at a time simplifies testing code
(see next patch for few conversions in BPF selftests taking advantage of
this new feature).

Another real-world use case is in libbpf-tools for cases when different
BPF programs have to be picked depending on particulars of the host
kernel due to various incompatible changes (like kernel function renames
or signature change, or to pick kprobe vs fentry depending on
corresponding kernel support for the latter). Marking all the different
BPF program candidates as non-autoloaded declaratively makes this more
obvious in BPF source code and allows simpler code in user-space code.

When BPF program marked as SEC("?abc") it is otherwise treated just like
SEC("abc") and bpf_program__section_name() reported will be "abc".
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220419002452.632125-1-andrii@kernel.org

a3820c48

12 4月, 2022 1 次提交

libbpf: Usdt aarch64 arg parsing support · 0f861992

由 Alan Maguire 提交于 4月 11, 2022

Parsing of USDT arguments is architecture-specific. On aarch64 it is
relatively easy since registers used are x[0-31] and sp. Format is
slightly different compared to x86_64. Possible forms are:

- "size@[reg[,offset]]" for dereferences, e.g. "-8@[sp,76]" and "-4@[sp]";
- "size@reg" for register values, e.g. "-4@x0";
- "size@value" for raw values, e.g. "-8@1".
Signed-off-by: NAlan Maguire <alan.maguire@oracle.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649690496-1902-2-git-send-email-alan.maguire@oracle.com

0f861992

11 4月, 2022 2 次提交

libbpf: Fix a bug with checking bpf_probe_read_kernel() support in old kernels · d252a4a4

由 Runqing Yang 提交于 4月 09, 2022

Background:
Libbpf automatically replaces calls to BPF bpf_probe_read_{kernel,user}
[_str]() helpers with bpf_probe_read[_str](), if libbpf detects that
kernel doesn't support new APIs. Specifically, libbpf invokes the
probe_kern_probe_read_kernel function to load a small eBPF program into
the kernel in which bpf_probe_read_kernel API is invoked and lets the
kernel checks whether the new API is valid. If the loading fails, libbpf
considers the new API invalid and replaces it with the old API.

static int probe_kern_probe_read_kernel(void)
{
	struct bpf_insn insns[] = {
		BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),	/* r1 = r10 (fp) */
		BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -8),	/* r1 += -8 */
		BPF_MOV64_IMM(BPF_REG_2, 8),		/* r2 = 8 */
		BPF_MOV64_IMM(BPF_REG_3, 0),		/* r3 = 0 */
		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_probe_read_kernel),
		BPF_EXIT_INSN(),
	};
	int fd, insn_cnt = ARRAY_SIZE(insns);

	fd = bpf_prog_load(BPF_PROG_TYPE_KPROBE, NULL,
                           "GPL", insns, insn_cnt, NULL);
	return probe_fd(fd);
}

Bug:
On older kernel versions [0], the kernel checks whether the version
number provided in the bpf syscall, matches the LINUX_VERSION_CODE.
If not matched, the bpf syscall fails. eBPF However, the
probe_kern_probe_read_kernel code does not set the kernel version
number provided to the bpf syscall, which causes the loading process
alwasys fails for old versions. It means that libbpf will replace the
new API with the old one even the kernel supports the new one.

Solution:
After a discussion in [1], the solution is using BPF_PROG_TYPE_TRACEPOINT
program type instead of BPF_PROG_TYPE_KPROBE because kernel does not
enfoce version check for tracepoint programs. I test the patch in old
kernels (4.18 and 4.19) and it works well.

  [0] https://elixir.bootlin.com/linux/v4.19/source/kernel/bpf/syscall.c#L1360
  [1] Closes: https://github.com/libbpf/libbpf/issues/473Signed-off-by: NRunqing Yang <rainkin1993@gmail.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220409144928.27499-1-rainkin1993@gmail.com

d252a4a4

libbpf: Add ARC support to bpf_tracing.h · 07385998

由 Vladimir Isaev 提交于 4月 09, 2022

Add PT_REGS macros suitable for ARCompact and ARCv2.
Signed-off-by: NVladimir Isaev <isaev@synopsys.com>
Signed-off-by: NSergey Matyukevich <geomatsi@gmail.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Acked-by: NSong Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220408224442.599566-1-geomatsi@gmail.com

07385998

09 4月, 2022 4 次提交

libbpf: Use weak hidden modifier for USDT BPF-side API functions · 2fa5b0f2

由 Andrii Nakryiko 提交于 4月 08, 2022

Use __weak __hidden for bpf_usdt_xxx() APIs instead of much more
confusing `static inline __noinline`. This was previously impossible due
to libbpf erroring out on CO-RE relocations pointing to eliminated weak
subprogs. Now that previous patch fixed this issue, switch back to
__weak __hidden as it's a more direct way of specifying the desired
behavior.
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408181425.2287230-3-andrii@kernel.org

2fa5b0f2

libbpf: Don't error out on CO-RE relos for overriden weak subprogs · e89d57d9

由 Andrii Nakryiko 提交于 4月 08, 2022

During BPF static linking, all the ELF relocations and .BTF.ext
information (including CO-RE relocations) are preserved for __weak
subprograms that were logically overriden by either previous weak
subprogram instance or by corresponding "strong" (non-weak) subprogram.
This is just how native user-space linkers work, nothing new.

But libbpf is over-zealous when processing CO-RE relocation to error out
when CO-RE relocation belonging to such eliminated weak subprogram is
encountered. Instead of erroring out on this expected situation, log
debug-level message and skip the relocation.

Fixes: db2b8b06 ("libbpf: Support CO-RE relocations for multi-prog sections")
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408181425.2287230-2-andrii@kernel.org

e89d57d9

libbpf: Allow WEAK and GLOBAL bindings during BTF fixup · 3a06ec0a

由 Andrii Nakryiko 提交于 4月 07, 2022

During BTF fix up for global variables, global variable can be global
weak and will have STB_WEAK binding in ELF. Support such global
variables in addition to non-weak ones.

This is not the problem when using BPF static linking, as BPF static
linker "fixes up" BTF during generation so that libbpf doesn't have to
do it anymore during bpf_object__open(), which led to this not being
noticed for a while, along with a pretty rare (currently) use of __weak
variables and maps.
Reported-by: NHengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220407230446.3980075-2-andrii@kernel.org

3a06ec0a

libbpf: Use strlcpy() in path resolution fallback logic · 3c0dfe6e

由 Andrii Nakryiko 提交于 4月 07, 2022

Coverity static analyzer complains that strcpy() can cause buffer
overflow. Use libbpf_strlcpy() instead to be 100% sure this doesn't
happen.
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220407230446.3980075-1-andrii@kernel.org

3c0dfe6e

08 4月, 2022 7 次提交

libbpf: Add s390-specific USDT arg spec parsing logic · bd022685

由 Ilya Leoshkevich 提交于 4月 07, 2022

The logic is superficially similar to that of x86, but the small
differences (no need for register table and dynamic allocation of
register names, no $ sign before constants) make maintaining a common
implementation too burdensome. Therefore simply add a s390x-specific
version of parse_usdt_arg().

Note that while bcc supports index registers, this patch does not. This
should not be a problem in most cases, since s390 uses a default value
"nor" for STAP_SDT_ARG_CONSTRAINT.
Signed-off-by: NIlya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220407214411.257260-4-iii@linux.ibm.com

bd022685

libbpf: Make BPF-side of USDT support work on big-endian machines · 6f403d9d

由 Ilya Leoshkevich 提交于 4月 07, 2022

BPF_USDT_ARG_REG_DEREF handling always reads 8 bytes, regardless of
the actual argument size. On little-endian the relevant argument bits
end up in the lower bits of val, and later on the code that handles
all the argument types expects them to be there.

On big-endian they end up in the upper bits of val, breaking that
expectation. Fix by right-shifting val on big-endian.
Signed-off-by: NIlya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220407214411.257260-3-iii@linux.ibm.com

6f403d9d

libbpf: Minor style improvements in USDT code · e1b6df59

由 Ilya Leoshkevich 提交于 4月 07, 2022

Fix several typos and references to non-existing headers.
Also use __BYTE_ORDER__ instead of __BYTE_ORDER for consistency with
the rest of the bpf code - see commit 45f2bebc ("libbpf: Fix
endianness detection in BPF_CORE_READ_BITFIELD_PROBED()") for
rationale).
Signed-off-by: NIlya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220407214411.257260-2-iii@linux.ibm.com

e1b6df59

libbpf: Fix use #ifdef instead of #if to avoid compiler warning · ded6dffa

由 Andrii Nakryiko 提交于 4月 07, 2022

As reported by Naresh:

  perf build errors on i386 [1] on Linux next-20220407 [2]

  usdt.c:1181:5: error: "__x86_64__" is not defined, evaluates to 0
  [-Werror=undef]
   1181 | #if __x86_64__
        |     ^~~~~~~~~~
  usdt.c:1196:5: error: "__x86_64__" is not defined, evaluates to 0
  [-Werror=undef]
   1196 | #if __x86_64__
        |     ^~~~~~~~~~
  cc1: all warnings being treated as errors

Use #ifdef instead of #if to avoid this.

Fixes: 4c59e584 ("libbpf: Add x86-specific USDT arg spec parsing logic")
Reported-by: NNaresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220407203842.3019904-1-andrii@kernel.org

ded6dffa

libbpf: Potential NULL dereference in usdt_manager_attach_usdt() · e58c5c97

由 Haowen Bai 提交于 4月 07, 2022

link could be null but still dereference bpf_link__destroy(&link->link)
and it will lead to a null pointer access.
Signed-off-by: NHaowen Bai <baihaowen@meizu.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649299098-2069-1-git-send-email-baihaowen@meizu.com

e58c5c97

libbpf: Improve string parsing for uprobe auto-attach · 90db26e6

由 Alan Maguire 提交于 4月 06, 2022

For uprobe auto-attach, the parsing can be simplified for the SEC()
name to a single sscanf(); the return value of the sscanf can then
be used to distinguish between sections that simply specify
"u[ret]probe" (and thus cannot auto-attach), those that specify
"u[ret]probe/binary_path:function+offset" etc.
Suggested-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NAlan Maguire <alan.maguire@oracle.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649245431-29956-3-git-send-email-alan.maguire@oracle.com

90db26e6

libbpf: Improve library identification for uprobe binary path resolution · a1c9d61b

由 Alan Maguire 提交于 4月 06, 2022

In the process of doing path resolution for uprobe attach, libraries are
identified by matching a ".so" substring in the binary_path.
This matches a lot of patterns that do not conform to library.so[.version]
format, so instead match a ".so" _suffix_, and if that fails match a
".so." substring for the versioned library case.
Suggested-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NAlan Maguire <alan.maguire@oracle.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649245431-29956-2-git-send-email-alan.maguire@oracle.com

a1c9d61b

07 4月, 2022 1 次提交

libbpf: Fix spelling mistake "libaries" -> "libraries" · a8d600f6

由 Colin Ian King 提交于 4月 06, 2022

There is a spelling mistake in a pr_warn message. Fix it.
Signed-off-by: NColin Ian King <colin.i.king@gmail.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220406080835.14879-1-colin.i.king@gmail.com

a8d600f6

06 4月, 2022 5 次提交

libbpf: Add x86-specific USDT arg spec parsing logic · 4c59e584

由 Andrii Nakryiko 提交于 4月 04, 2022

Add x86/x86_64-specific USDT argument specification parsing. Each
architecture will require their own logic, as all this is arch-specific
assembly-based notation. Architectures that libbpf doesn't support for
USDTs will pr_warn() with specific error and return -ENOTSUP.

We use sscanf() as a very powerful and easy to use string parser. Those
spaces in sscanf's format string mean "skip any whitespaces", which is
pretty nifty (and somewhat little known) feature.

All this was tested on little-endian architecture, so bit shifts are
probably off on big-endian, which our CI will hopefully prove.
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Reviewed-by: NAlan Maguire <alan.maguire@oracle.com>
Reviewed-by: NDave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-6-andrii@kernel.org

4c59e584

libbpf: Wire up spec management and other arch-independent USDT logic · 999783c8

由 Andrii Nakryiko 提交于 4月 04, 2022

Last part of architecture-agnostic user-space USDT handling logic is to
set up BPF spec and, optionally, IP-to-ID maps from user-space.
usdt_manager performs a compact spec ID allocation to utilize
fixed-sized BPF maps as efficiently as possible. We also use hashmap to
deduplicate USDT arg spec strings and map identical strings to single
USDT spec, minimizing the necessary BPF map size. usdt_manager supports
arbitrary sequences of attachment and detachment, both of the same USDT
and multiple different USDTs and internally maintains a free list of
unused spec IDs. bpf_link_usdt's logic is extended with proper setup and
teardown of this spec ID free list and supporting BPF maps.
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Reviewed-by: NAlan Maguire <alan.maguire@oracle.com>
Reviewed-by: NDave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-5-andrii@kernel.org

999783c8

libbpf: Add USDT notes parsing and resolution logic · 74cc6311

由 Andrii Nakryiko 提交于 4月 04, 2022

Implement architecture-agnostic parts of USDT parsing logic. The code is
the documentation in this case, it's futile to try to succinctly
describe how USDT parsing is done in any sort of concreteness. But
still, USDTs are recorded in special ELF notes section (.note.stapsdt),
where each USDT call site is described separately. Along with USDT
provider and USDT name, each such note contains USDT argument
specification, which uses assembly-like syntax to describe how to fetch
value of USDT argument. USDT arg spec could be just a constant, or
a register, or a register dereference (most common cases in x86_64), but
it technically can be much more complicated cases, like offset relative
to global symbol and stuff like that. One of the later patches will
implement most common subset of this for x86 and x86-64 architectures,
which seems to handle a lot of real-world production application.

USDT arg spec contains a compact encoding allowing usdt.bpf.h from
previous patch to handle the above 3 cases. Instead of recording which
register might be needed, we encode register's offset within struct
pt_regs to simplify BPF-side implementation. USDT argument can be of
different byte sizes (1, 2, 4, and 8) and signed or unsigned. To handle
this, libbpf pre-calculates necessary bit shifts to do proper casting
and sign-extension in a short sequences of left and right shifts.

The rest is in the code with sometimes extensive comments and references
to external "documentation" for USDTs.
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Reviewed-by: NAlan Maguire <alan.maguire@oracle.com>
Reviewed-by: NDave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-4-andrii@kernel.org

74cc6311

libbpf: Wire up USDT API and bpf_link integration · 2e4913e0

由 Andrii Nakryiko 提交于 4月 04, 2022

Wire up libbpf USDT support APIs without yet implementing all the
nitty-gritty details of USDT discovery, spec parsing, and BPF map
initialization.

User-visible user-space API is simple and is conceptually very similar
to uprobe API.

bpf_program__attach_usdt() API allows to programmatically attach given
BPF program to a USDT, specified through binary path (executable or
shared lib), USDT provider and name. Also, just like in uprobe case, PID
filter is specified (0 - self, -1 - any process, or specific PID).
Optionally, USDT cookie value can be specified. Such single API
invocation will try to discover given USDT in specified binary and will
use (potentially many) BPF uprobes to attach this program in correct
locations.

Just like any bpf_program__attach_xxx() APIs, bpf_link is returned that
represents this attachment. It is a virtual BPF link that doesn't have
direct kernel object, as it can consist of multiple underlying BPF
uprobe links. As such, attachment is not atomic operation and there can
be brief moment when some USDT call sites are attached while others are
still in the process of attaching. This should be taken into
consideration by user. But bpf_program__attach_usdt() guarantees that
in the case of success all USDT call sites are successfully attached, or
all the successfuly attachments will be detached as soon as some USDT
call sites failed to be attached. So, in theory, there could be cases of
failed bpf_program__attach_usdt() call which did trigger few USDT
program invocations. This is unavoidable due to multi-uprobe nature of
USDT and has to be handled by user, if it's important to create an
illusion of atomicity.

USDT BPF programs themselves are marked in BPF source code as either
SEC("usdt"), in which case they won't be auto-attached through
skeleton's <skel>__attach() method, or it can have a full definition,
which follows the spirit of fully-specified uprobes:
SEC("usdt/<path>:<provider>:<name>"). In the latter case skeleton's
attach method will attempt auto-attachment. Similarly, generic
bpf_program__attach() will have enought information to go off of for
parameterless attachment.

USDT BPF programs are actually uprobes, and as such for kernel they are
marked as BPF_PROG_TYPE_KPROBE.

Another part of this patch is USDT-related feature probing:
  - BPF cookie support detection from user-space;
  - detection of kernel support for auto-refcounting of USDT semaphore.

The latter is optional. If kernel doesn't support such feature and USDT
doesn't rely on USDT semaphores, no error is returned. But if libbpf
detects that USDT requires setting semaphores and kernel doesn't support
this, libbpf errors out with explicit pr_warn() message. Libbpf doesn't
support poking process's memory directly to increment semaphore value,
like BCC does on legacy kernels, due to inherent raciness and danger of
such process memory manipulation. Libbpf let's kernel take care of this
properly or gives up.

Logistically, all the extra USDT-related infrastructure of libbpf is put
into a separate usdt.c file and abstracted behind struct usdt_manager.
Each bpf_object has lazily-initialized usdt_manager pointer, which is
only instantiated if USDT programs are attempted to be attached. Closing
BPF object frees up usdt_manager resources. usdt_manager keeps track of
USDT spec ID assignment and few other small things.

Subsequent patches will fill out remaining missing pieces of USDT
initialization and setup logic.
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Reviewed-by: NAlan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-3-andrii@kernel.org

2e4913e0

libbpf: Add BPF-side of USDT support · d72e2968

由 Andrii Nakryiko 提交于 4月 04, 2022

Add BPF-side implementation of libbpf-provided USDT support. This
consists of single header library, usdt.bpf.h, which is meant to be used
from user's BPF-side source code. This header is added to the list of
installed libbpf header, along bpf_helpers.h and others.

BPF-side implementation consists of two BPF maps:
  - spec map, which contains "a USDT spec" which encodes information
    necessary to be able to fetch USDT arguments and other information
    (argument count, user-provided cookie value, etc) at runtime;
  - IP-to-spec-ID map, which is only used on kernels that don't support
    BPF cookie feature. It allows to lookup spec ID based on the place
    in user application that triggers USDT program.

These maps have default sizes, 256 and 1024, which are chosen
conservatively to not waste a lot of space, but handling a lot of common
cases. But there could be cases when user application needs to either
trace a lot of different USDTs, or USDTs are heavily inlined and their
arguments are located in a lot of differing locations. For such cases it
might be necessary to size those maps up, which libbpf allows to do by
overriding BPF_USDT_MAX_SPEC_CNT and BPF_USDT_MAX_IP_CNT macros.

It is an important aspect to keep in mind. Single USDT (user-space
equivalent of kernel tracepoint) can have multiple USDT "call sites".
That is, single logical USDT is triggered from multiple places in user
application. This can happen due to function inlining. Each such inlined
instance of USDT invocation can have its own unique USDT argument
specification (instructions about the location of the value of each of
USDT arguments). So while USDT looks very similar to usual uprobe or
kernel tracepoint, under the hood it's actually a collection of uprobes,
each potentially needing different spec to know how to fetch arguments.

User-visible API consists of three helper functions:
  - bpf_usdt_arg_cnt(), which returns number of arguments of current USDT;
  - bpf_usdt_arg(), which reads value of specified USDT argument (by
    it's zero-indexed position) and returns it as 64-bit value;
  - bpf_usdt_cookie(), which functions like BPF cookie for USDT
    programs; this is necessary as libbpf doesn't allow specifying actual
    BPF cookie and utilizes it internally for USDT support implementation.

Each bpf_usdt_xxx() APIs expect struct pt_regs * context, passed into
BPF program. On kernels that don't support BPF cookie it is used to
fetch absolute IP address of the underlying uprobe.

usdt.bpf.h also provides BPF_USDT() macro, which functions like
BPF_PROG() and BPF_KPROBE() and allows much more user-friendly way to
get access to USDT arguments, if USDT definition is static and known to
the user. It is expected that majority of use cases won't have to use
bpf_usdt_arg_cnt() and bpf_usdt_arg() directly and BPF_USDT() will cover
all their needs.

Last, usdt.bpf.h is utilizing BPF CO-RE for one single purpose: to
detect kernel support for BPF cookie. If BPF CO-RE dependency is
undesirable, user application can redefine BPF_USDT_HAS_BPF_COOKIE to
either a boolean constant (or equivalently zero and non-zero), or even
point it to its own .rodata variable that can be specified from user's
application user-space code. It is important that
BPF_USDT_HAS_BPF_COOKIE is known to BPF verifier as static value (thus
.rodata and not just .data), as otherwise BPF code will still contain
bpf_get_attach_cookie() BPF helper call and will fail validation at
runtime, if not dead-code eliminated.
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Reviewed-by: NAlan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-2-andrii@kernel.org

d72e2968

05 4月, 2022 1 次提交

libbpf: Support Debian in resolve_full_path() · 56818931

由 Ilya Leoshkevich 提交于 4月 05, 2022

attach_probe selftest fails on Debian-based distros with `failed to
resolve full path for 'libc.so.6'`. The reason is that these distros
embraced multiarch to the point where even for the "main" architecture
they store libc in /lib/<triple>.

This is configured in /etc/ld.so.conf and in theory it's possible to
replicate the loader's parsing and processing logic in libbpf, however
a much simpler solution is to just enumerate the known library paths.
Signed-off-by: NIlya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220404225020.51029-1-iii@linux.ibm.com

56818931

04 4月, 2022 4 次提交

libbpf: Don't return -EINVAL if hdr_len < offsetofend(core_relo_len) · e93f3999

由 Yuntao Wang 提交于 4月 04, 2022

Since core relos is an optional part of the .BTF.ext ELF section, we should
skip parsing it instead of returning -EINVAL if header size is less than
offsetofend(struct btf_ext_header, core_relo_len).
Signed-off-by: NYuntao Wang <ytcoode@gmail.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220404005320.1723055-1-ytcoode@gmail.com

e93f3999

libbpf: Add auto-attach for uprobes based on section name · 39f8dc43

由 Alan Maguire 提交于 3月 30, 2022

Now that u[ret]probes can use name-based specification, it makes
sense to add support for auto-attach based on SEC() definition.
The format proposed is

        SEC("u[ret]probe/binary:[raw_offset|[function_name[+offset]]")

For example, to trace malloc() in libc:

        SEC("uprobe/libc.so.6:malloc")

...or to trace function foo2 in /usr/bin/foo:

        SEC("uprobe//usr/bin/foo:foo2")

Auto-attach is done for all tasks (pid -1).  prog can be an absolute
path or simply a program/library name; in the latter case, we use
PATH/LD_LIBRARY_PATH to resolve the full path, falling back to
standard locations (/usr/bin:/usr/sbin or /usr/lib64:/usr/lib) if
the file is not found via environment-variable specified locations.
Signed-off-by: NAlan Maguire <alan.maguire@oracle.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1648654000-21758-4-git-send-email-alan.maguire@oracle.com

39f8dc43

libbpf: Support function name-based attach uprobes · 433966e3

由 Alan Maguire 提交于 3月 30, 2022

kprobe attach is name-based, using lookups of kallsyms to translate
a function name to an address.  Currently uprobe attach is done
via an offset value as described in [1].  Extend uprobe opts
for attach to include a function name which can then be converted
into a uprobe-friendly offset.  The calcualation is done in
several steps:

1. First, determine the symbol address using libelf; this gives us
   the offset as reported by objdump
2. If the function is a shared library function - and the binary
   provided is a shared library - no further work is required;
   the address found is the required address
3. Finally, if the function is local, subtract the base address
   associated with the object, retrieved from ELF program headers.

The resultant value is then added to the func_offset value passed
in to specify the uprobe attach address.  So specifying a func_offset
of 0 along with a function name "printf" will attach to printf entry.

The modes of operation supported are then

1. to attach to a local function in a binary; function "foo1" in
   "/usr/bin/foo"
2. to attach to a shared library function in a shared library -
   function "malloc" in libc.

[1] https://www.kernel.org/doc/html/latest/trace/uprobetracer.htmlSigned-off-by: NAlan Maguire <alan.maguire@oracle.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1648654000-21758-3-git-send-email-alan.maguire@oracle.com

433966e3

libbpf: auto-resolve programs/libraries when necessary for uprobes · 1ce3a60e

由 Alan Maguire 提交于 3月 30, 2022

bpf_program__attach_uprobe_opts() requires a binary_path argument
specifying binary to instrument.  Supporting simply specifying
"libc.so.6" or "foo" should be possible too.

Library search checks LD_LIBRARY_PATH, then /usr/lib64, /usr/lib.
This allows users to run BPF programs prefixed with
LD_LIBRARY_PATH=/path2/lib while still searching standard locations.
Similarly for non .so files, we check PATH and /usr/bin, /usr/sbin.

Path determination will be useful for auto-attach of BPF uprobe programs
using SEC() definition.
Signed-off-by: NAlan Maguire <alan.maguire@oracle.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1648654000-21758-2-git-send-email-alan.maguire@oracle.com

1ce3a60e

21 3月, 2022 2 次提交

libbpf: Close fd in bpf_object__reuse_map · d0f325c3

由 Hengqi Chen 提交于 3月 19, 2022

pin_fd is dup-ed and assigned in bpf_map__reuse_fd. Close it
in bpf_object__reuse_map after reuse.
Signed-off-by: NHengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220319030533.3132250-1-hengqi.chen@gmail.com

d0f325c3

libbpf: Avoid NULL deref when initializing map BTF info · a8fee962

由 Andrii Nakryiko 提交于 3月 19, 2022

If BPF object doesn't have an BTF info, don't attempt to search for BTF
types describing BPF map key or value layout.

Fixes: 262cfb74 ("libbpf: Init btf_{key,value}_type_id on internal map open")
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NYonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220320001911.3640917-1-andrii@kernel.org

a8fee962

18 3月, 2022 3 次提交

libbpf: Add subskeleton scaffolding · 430025e5

由 Delyan Kratunov 提交于 3月 16, 2022

In symmetry with bpf_object__open_skeleton(),
bpf_object__open_subskeleton() performs the actual walking and linking
of maps, progs, and globals described by bpf_*_skeleton objects.
Signed-off-by: NDelyan Kratunov <delyank@fb.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/6942a46fbe20e7ebf970affcca307ba616985b15.1647473511.git.delyank@fb.com

430025e5

libbpf: Init btf_{key,value}_type_id on internal map open · 262cfb74

由 Delyan Kratunov 提交于 3月 16, 2022

For internal and user maps, look up the key and value btf
types on open() and not load(), so that `bpf_map_btf_value_type_id`
is usable in `bpftool gen`.
Signed-off-by: NDelyan Kratunov <delyank@fb.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/78dbe4e457b4a05e098fc6c8f50014b680c86e4e.1647473511.git.delyank@fb.com

262cfb74

libbpf: .text routines are subprograms in strict mode · bc380eb9

由 Delyan Kratunov 提交于 3月 16, 2022

Currently, libbpf considers a single routine in .text to be a program. This
is particularly confusing when it comes to library objects - a single routine
meant to be used as an extern will instead be considered a bpf_program.

This patch hides this compatibility behavior behind the pre-existing
SEC_NAME strict mode flag.
Signed-off-by: NDelyan Kratunov <delyank@fb.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/018de8d0d67c04bf436055270d35d394ba393505.1647473511.git.delyank@fb.com

bc380eb9

openeuler / Kernel 大约 2 年 前同步成功

openeuler / Kernel
大约 2 年前同步成功