提交 40901784 编写于 作者: A Andrii Nakryiko 提交者: Daniel Borkmann

libbpf: Generate more efficient BPF_CORE_READ code

Existing BPF_CORE_READ() macro generates slightly suboptimal code. If
there are intermediate pointers to be read, initial source pointer is
going to be assigned into a temporary variable and then temporary
variable is going to be uniformly used as a "source" pointer for all
intermediate pointer reads. Schematically (ignoring all the type casts),
BPF_CORE_READ(s, a, b, c) is expanded into:
({
	const void *__t = src;
	bpf_probe_read(&__t, sizeof(*__t), &__t->a);
	bpf_probe_read(&__t, sizeof(*__t), &__t->b);

	typeof(s->a->b->c) __r;
	bpf_probe_read(&__r, sizeof(*__r), &__t->c);
})

This initial `__t = src` makes calls more uniform, but causes slightly
less optimal register usage sometimes when compiled with Clang. This can
cascase into, e.g., more register spills.

This patch fixes this issue by generating more optimal sequence:
({
	const void *__t;
	bpf_probe_read(&__t, sizeof(*__t), &src->a); /* <-- src here */
	bpf_probe_read(&__t, sizeof(*__t), &__t->b);

	typeof(s->a->b->c) __r;
	bpf_probe_read(&__r, sizeof(*__r), &__t->c);
})

Fixes: 7db3822a ("libbpf: Add BPF_CORE_READ/BPF_CORE_READ_INTO helpers")
Signed-off-by: NAndrii Nakryiko <andriin@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191011023847.275936-1-andriin@fb.com
上级 baead859
...@@ -88,11 +88,11 @@ ...@@ -88,11 +88,11 @@
read_fn((void *)(dst), sizeof(*(dst)), &((src_type)(src))->accessor) read_fn((void *)(dst), sizeof(*(dst)), &((src_type)(src))->accessor)
/* "recursively" read a sequence of inner pointers using local __t var */ /* "recursively" read a sequence of inner pointers using local __t var */
#define ___rd_first(src, a) ___read(bpf_core_read, &__t, ___type(src), src, a);
#define ___rd_last(...) \ #define ___rd_last(...) \
___read(bpf_core_read, &__t, \ ___read(bpf_core_read, &__t, \
___type(___nolast(__VA_ARGS__)), __t, ___last(__VA_ARGS__)); ___type(___nolast(__VA_ARGS__)), __t, ___last(__VA_ARGS__));
#define ___rd_p0(src) const void *__t = src; #define ___rd_p1(...) const void *__t; ___rd_first(__VA_ARGS__)
#define ___rd_p1(...) ___rd_p0(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
#define ___rd_p2(...) ___rd_p1(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__) #define ___rd_p2(...) ___rd_p1(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
#define ___rd_p3(...) ___rd_p2(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__) #define ___rd_p3(...) ___rd_p2(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
#define ___rd_p4(...) ___rd_p3(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__) #define ___rd_p4(...) ___rd_p3(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册