- 30 3月, 2020 4 次提交
-
-
由 KP Singh 提交于
Since BPF_PROG_TYPE_LSM uses the same attaching mechanism as BPF_PROG_TYPE_TRACING, the common logic is refactored into a static function bpf_program__attach_btf_id. A new API call bpf_program__attach_lsm is still added to avoid userspace conflicts if this ever changes in the future. Signed-off-by: NKP Singh <kpsingh@google.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Reviewed-by: NBrendan Jackman <jackmanb@google.com> Reviewed-by: NFlorent Revest <revest@google.com> Reviewed-by: NJames Morris <jamorris@linux.microsoft.com> Acked-by: NYonghong Song <yhs@fb.com> Acked-by: NAndrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200329004356.27286-7-kpsingh@chromium.org
-
由 KP Singh 提交于
Introduce types and configs for bpf programs that can be attached to LSM hooks. The programs can be enabled by the config option CONFIG_BPF_LSM. Signed-off-by: NKP Singh <kpsingh@google.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Reviewed-by: NBrendan Jackman <jackmanb@google.com> Reviewed-by: NFlorent Revest <revest@google.com> Reviewed-by: NThomas Garnier <thgarnie@google.com> Acked-by: NYonghong Song <yhs@fb.com> Acked-by: NAndrii Nakryiko <andriin@fb.com> Acked-by: NJames Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/bpf/20200329004356.27286-2-kpsingh@chromium.org
-
由 Toke Høiland-Jørgensen 提交于
This adds a test to exercise the new bpf_map__set_initial_value() function. The test simply overrides the global data section with all zeroes, and checks that the new value makes it into the kernel map on load. Signed-off-by: NToke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAndrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200329132253.232541-2-toke@redhat.com
-
由 Toke Høiland-Jørgensen 提交于
For internal maps (most notably the maps backing global variables), libbpf uses an internal mmaped area to store the data after opening the object. This data is subsequently copied into the kernel map when the object is loaded. This adds a function to set a new value for that data, which can be used to before it is loaded into the kernel. This is especially relevant for RODATA maps, since those are frozen on load. Signed-off-by: NToke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAndrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200329132253.232541-1-toke@redhat.com
-
- 29 3月, 2020 4 次提交
-
-
由 Toke Høiland-Jørgensen 提交于
This adds tests for the various replacement operations using IFLA_XDP_EXPECTED_FD. Signed-off-by: NToke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/158515700967.92963.15098921624731968356.stgit@toke.dk
-
由 Toke Høiland-Jørgensen 提交于
This adds a new function to set the XDP fd while specifying the FD of the program to replace, using the newly added IFLA_XDP_EXPECTED_FD netlink parameter. The new function uses the opts struct mechanism to be extendable in the future. Signed-off-by: NToke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/158515700857.92963.7052131201257841700.stgit@toke.dk
-
由 Toke Høiland-Jørgensen 提交于
This adds the IFLA_XDP_EXPECTED_FD netlink attribute definition and the XDP_FLAGS_REPLACE flag to if_link.h in tools/include. Signed-off-by: NToke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/158515700747.92963.8615391897417388586.stgit@toke.dk
-
由 Fletcher Dunn 提交于
Fix a sharp edge in xsk_umem__create and xsk_socket__create. Almost all of the members of the ring buffer structs are initialized, but the "cached_xxx" variables are not all initialized. The caller is required to zero them. This is needlessly dangerous. The results if you don't do it can be very bad. For example, they can cause xsk_prod_nb_free and xsk_cons_nb_avail to return values greater than the size of the queue. xsk_ring_cons__peek can return an index that does not refer to an item that has been queued. I have confirmed that without this change, my program misbehaves unless I memset the ring buffers to zero before calling the function. Afterwards, my program works without (or with) the memset. Signed-off-by: NFletcher Dunn <fletcherd@valvesoftware.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NMagnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/85f12913cde94b19bfcb598344701c38@valvesoftware.com
-
- 28 3月, 2020 3 次提交
-
-
由 Daniel Borkmann 提交于
Add various tests to make sure the verifier keeps catching them: # ./test_verifier [...] #230/p pass ctx or null check, 1: ctx OK #231/p pass ctx or null check, 2: null OK #232/p pass ctx or null check, 3: 1 OK #233/p pass ctx or null check, 4: ctx - const OK #234/p pass ctx or null check, 5: null (connect) OK #235/p pass ctx or null check, 6: null (bind) OK #236/p pass ctx or null check, 7: ctx (bind) OK #237/p pass ctx or null check, 8: null (bind) OK [...] Summary: 1595 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/c74758d07b1b678036465ef7f068a49e9efd3548.1585323121.git.daniel@iogearbox.net
-
由 Daniel Borkmann 提交于
Enable the bpf_get_current_cgroup_id() helper for connect(), sendmsg(), recvmsg() and bind-related hooks in order to retrieve the cgroup v2 context which can then be used as part of the key for BPF map lookups, for example. Given these hooks operate in process context 'current' is always valid and pointing to the app that is performing mentioned syscalls if it's subject to a v2 cgroup. Also with same motivation of commit 77236281 ("bpf: Introduce bpf_skb_ancestor_cgroup_id helper") enable retrieval of ancestor from current so the cgroup id can be used for policy lookups which can then forbid connect() / bind(), for example. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/d2a7ef42530ad299e3cbb245e6c12374b72145ef.1585323121.git.daniel@iogearbox.net
-
由 Daniel Borkmann 提交于
In Cilium we're mainly using BPF cgroup hooks today in order to implement kube-proxy free Kubernetes service translation for ClusterIP, NodePort (*), ExternalIP, and LoadBalancer as well as HostPort mapping [0] for all traffic between Cilium managed nodes. While this works in its current shape and avoids packet-level NAT for inter Cilium managed node traffic, there is one major limitation we're facing today, that is, lack of netns awareness. In Kubernetes, the concept of Pods (which hold one or multiple containers) has been built around network namespaces, so while we can use the global scope of attaching to root BPF cgroup hooks also to our advantage (e.g. for exposing NodePort ports on loopback addresses), we also have the need to differentiate between initial network namespaces and non-initial one. For example, ExternalIP services mandate that non-local service IPs are not to be translated from the host (initial) network namespace as one example. Right now, we have an ugly work-around in place where non-local service IPs for ExternalIP services are not xlated from connect() and friends BPF hooks but instead via less efficient packet-level NAT on the veth tc ingress hook for Pod traffic. On top of determining whether we're in initial or non-initial network namespace we also have a need for a socket-cookie like mechanism for network namespaces scope. Socket cookies have the nice property that they can be combined as part of the key structure e.g. for BPF LRU maps without having to worry that the cookie could be recycled. We are planning to use this for our sessionAffinity implementation for services. Therefore, add a new bpf_get_netns_cookie() helper which would resolve both use cases at once: bpf_get_netns_cookie(NULL) would provide the cookie for the initial network namespace while passing the context instead of NULL would provide the cookie from the application's network namespace. We're using a hole, so no size increase; the assignment happens only once. Therefore this allows for a comparison on initial namespace as well as regular cookie usage as we have today with socket cookies. We could later on enable this helper for other program types as well as we would see need. (*) Both externalTrafficPolicy={Local|Cluster} types [0] https://github.com/cilium/cilium/blob/master/bpf/bpf_sock.cSigned-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/c47d2346982693a9cf9da0e12690453aded4c788.1585323121.git.daniel@iogearbox.net
-
- 26 3月, 2020 3 次提交
-
-
由 John Fastabend 提交于
After changes to add update_reg_bounds after ALU ops and adding ALU32 bounds tracking the error message is changed in the 32-bit right shift tests. Test "#70/u bounds check after 32-bit right shift with 64-bit input FAIL" now fails with, Unexpected error message! EXP: R0 invalid mem access RES: func#0 @0 7: (b7) r1 = 2 8: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0) R1_w=invP2 R10=fp0 fp-8_w=mmmmmmmm 8: (67) r1 <<= 31 9: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0) R1_w=invP4294967296 R10=fp0 fp-8_w=mmmmmmmm 9: (74) w1 >>= 31 10: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0) R1_w=invP0 R10=fp0 fp-8_w=mmmmmmmm 10: (14) w1 -= 2 11: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0) R1_w=invP4294967294 R10=fp0 fp-8_w=mmmmmmmm 11: (0f) r0 += r1 math between map_value pointer and 4294967294 is not allowed And test "#70/p bounds check after 32-bit right shift with 64-bit input FAIL" now fails with, Unexpected error message! EXP: R0 invalid mem access RES: func#0 @0 7: (b7) r1 = 2 8: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0) R1_w=inv2 R10=fp0 fp-8_w=mmmmmmmm 8: (67) r1 <<= 31 9: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0) R1_w=inv4294967296 R10=fp0 fp-8_w=mmmmmmmm 9: (74) w1 >>= 31 10: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0) R1_w=inv0 R10=fp0 fp-8_w=mmmmmmmm 10: (14) w1 -= 2 11: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0) R1_w=inv4294967294 R10=fp0 fp-8_w=mmmmmmmm 11: (0f) r0 += r1 last_idx 11 first_idx 0 regs=2 stack=0 before 10: (14) w1 -= 2 regs=2 stack=0 before 9: (74) w1 >>= 31 regs=2 stack=0 before 8: (67) r1 <<= 31 regs=2 stack=0 before 7: (b7) r1 = 2 math between map_value pointer and 4294967294 is not allowed Before this series we did not trip the "math between map_value pointer..." error because check_reg_sane_offset is never called in adjust_ptr_min_max_vals(). Instead we have a register state that looks like this at line 11*, 11: R0_w=map_value(id=0,off=0,ks=8,vs=8, smin_value=0,smax_value=0, umin_value=0,umax_value=0, var_off=(0x0; 0x0)) R1_w=invP(id=0, smin_value=0,smax_value=4294967295, umin_value=0,umax_value=4294967295, var_off=(0xfffffffe; 0x0)) R10=fp(id=0,off=0, smin_value=0,smax_value=0, umin_value=0,umax_value=0, var_off=(0x0; 0x0)) fp-8_w=mmmmmmmm 11: (0f) r0 += r1 In R1 'smin_val != smax_val' yet we have a tnum_const as seen by 'var_off(0xfffffffe; 0x0))' with a 0x0 mask. So we hit this check in adjust_ptr_min_max_vals() if ((known && (smin_val != smax_val || umin_val != umax_val)) || smin_val > smax_val || umin_val > umax_val) { /* Taint dst register if offset had invalid bounds derived from * e.g. dead branches. */ __mark_reg_unknown(env, dst_reg); return 0; } So we don't throw an error here and instead only throw an error later in the verification when the memory access is made. The root cause in verifier without alu32 bounds tracking is having 'umin_value = 0' and 'umax_value = U64_MAX' from BPF_SUB which we set when 'umin_value < umax_val' here, if (dst_reg->umin_value < umax_val) { /* Overflow possible, we know nothing */ dst_reg->umin_value = 0; dst_reg->umax_value = U64_MAX; } else { ...} Later in adjust_calar_min_max_vals we previously did a coerce_reg_to_size() which will clamp the U64_MAX to U32_MAX by truncating to 32bits. But either way without a call to update_reg_bounds the less precise bounds tracking will fall out of the alu op verification. After latest changes we now exit adjust_scalar_min_max_vals with the more precise umin value, due to zero extension propogating bounds from alu32 bounds into alu64 bounds and then calling update_reg_bounds. This then causes the verifier to trigger an earlier error and we get the error in the output above. This patch updates tests to reflect new error message. * I have a local patch to print entire verifier state regardless if we believe it is a constant so we can get a full picture of the state. Usually if tnum_is_const() then bounds are also smin=smax, etc. but this is not always true and is a bit subtle. Being able to see these states helps understand dataflow imo. Let me know if we want something similar upstream. Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/158507161475.15666.3061518385241144063.stgit@john-Precision-5820-Tower
-
由 Stanislav Fomichev 提交于
For each prog/btf load we allocate and free 16 megs of verifier buffer. On production systems it doesn't really make sense because the programs/btf have gone through extensive testing and (mostly) guaranteed to successfully load. Let's assume successful case by default and skip buffer allocation on the first try. If there is an error, start with BPF_LOG_BUF_SIZE and double it on each ENOSPC iteration. v3: * Return -ENOMEM when can't allocate log buffer (Andrii Nakryiko) v2: * Don't allocate the buffer at all on the first try (Andrii Nakryiko) Signed-off-by: NStanislav Fomichev <sdf@google.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAndrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200325195521.112210-1-sdf@google.com
-
由 Tobias Klauser 提交于
Has been unused since commit ef99b02b ("libbpf: capture value in BTF type info for BTF-defined map defs"). Signed-off-by: NTobias Klauser <tklauser@distanz.ch> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Reviewed-by: NQuentin Monnet <quentin@isovalent.com> Acked-by: NAndrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200325113655.19341-1-tklauser@distanz.ch
-
- 24 3月, 2020 2 次提交
-
-
由 Daniel T. Lee 提交于
To reduce the reliance of trace samples (trace*_user) on bpf_load, move read_trace_pipe to trace_helpers. By moving this bpf_loader helper elsewhere, trace functions can be easily migrated to libbbpf. Signed-off-by: NDaniel T. Lee <danieltimlee@gmail.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAndrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200321100424.1593964-2-danieltimlee@gmail.com
-
由 Martin KaFai Lau 提交于
This patch adds test to exercise the bpf_sk_storage_get() and bpf_sk_storage_delete() helper from the bpf_dctcp.c. The setup and check on the sk_storage is done immediately before and after the connect(). This patch also takes this chance to move the pthread_create() after the connect() has been done. That will remove the need of the "wait_thread" label. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NYonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200320152107.2169904-1-kafai@fb.com
-
- 21 3月, 2020 1 次提交
-
-
由 Bill Wendling 提交于
Clang's -Wmisleading-indentation warns about misleading indentations if there's a mixture of spaces and tabs. Remove extraneous spaces. Signed-off-by: NBill Wendling <morbo@google.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200320201510.217169-1-morbo@google.com
-
- 20 3月, 2020 4 次提交
-
-
由 Martin KaFai Lau 提交于
This patch adds struct_ops support to the bpftool. To recap a bit on the recent bpf_struct_ops feature on the kernel side: It currently supports "struct tcp_congestion_ops" to be implemented in bpf. At a high level, bpf_struct_ops is struct_ops map populated with a number of bpf progs. bpf_struct_ops currently supports the "struct tcp_congestion_ops". However, the bpf_struct_ops design is generic enough that other kernel struct ops can be supported in the future. Although struct_ops is map+progs at a high lever, there are differences in details. For example, 1) After registering a struct_ops, the struct_ops is held by the kernel subsystem (e.g. tcp-cc). Thus, there is no need to pin a struct_ops map or its progs in order to keep them around. 2) To iterate all struct_ops in a system, it iterates all maps in type BPF_MAP_TYPE_STRUCT_OPS. BPF_MAP_TYPE_STRUCT_OPS is the current usual filter. In the future, it may need to filter by other struct_ops specific properties. e.g. filter by tcp_congestion_ops or other kernel subsystem ops in the future. 3) struct_ops requires the running kernel having BTF info. That allows more flexibility in handling other kernel structs. e.g. it can always dump the latest bpf_map_info. 4) Also, "struct_ops" command is not intended to repeat all features already provided by "map" or "prog". For example, if there really is a need to pin the struct_ops map, the user can use the "map" cmd to do that. While the first attempt was to reuse parts from map/prog.c, it ended up not a lot to share. The only obvious item is the map_parse_fds() but that still requires modifications to accommodate struct_ops map specific filtering (for the immediate and the future needs). Together with the earlier mentioned differences, it is better to part away from map/prog.c. The initial set of subcmds are, register, unregister, show, and dump. For register, it registers all struct_ops maps that can be found in an obj file. Option can be added in the future to specify a particular struct_ops map. Also, the common bpf_tcp_cc is stateless (e.g. bpf_cubic.c and bpf_dctcp.c). The "reuse map" feature is not implemented in this patch and it can be considered later also. For other subcmds, please see the man doc for details. A sample output of dump: [root@arch-fb-vm1 bpf]# bpftool struct_ops dump name cubic [{ "bpf_map_info": { "type": 26, "id": 64, "key_size": 4, "value_size": 256, "max_entries": 1, "map_flags": 0, "name": "cubic", "ifindex": 0, "btf_vmlinux_value_type_id": 18452, "netns_dev": 0, "netns_ino": 0, "btf_id": 52, "btf_key_type_id": 0, "btf_value_type_id": 0 } },{ "bpf_struct_ops_tcp_congestion_ops": { "refcnt": { "refs": { "counter": 1 } }, "state": "BPF_STRUCT_OPS_STATE_INUSE", "data": { "list": { "next": 0, "prev": 0 }, "key": 0, "flags": 0, "init": "void (struct sock *) bictcp_init/prog_id:138", "release": "void (struct sock *) 0", "ssthresh": "u32 (struct sock *) bictcp_recalc_ssthresh/prog_id:141", "cong_avoid": "void (struct sock *, u32, u32) bictcp_cong_avoid/prog_id:140", "set_state": "void (struct sock *, u8) bictcp_state/prog_id:142", "cwnd_event": "void (struct sock *, enum tcp_ca_event) bictcp_cwnd_event/prog_id:139", "in_ack_event": "void (struct sock *, u32) 0", "undo_cwnd": "u32 (struct sock *) tcp_reno_undo_cwnd/prog_id:144", "pkts_acked": "void (struct sock *, const struct ack_sample *) bictcp_acked/prog_id:143", "min_tso_segs": "u32 (struct sock *) 0", "sndbuf_expand": "u32 (struct sock *) 0", "cong_control": "void (struct sock *, const struct rate_sample *) 0", "get_info": "size_t (struct sock *, u32, int *, union tcp_cc_info *) 0", "name": "bpf_cubic", "owner": 0 } } } ] Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NQuentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200318171656.129650-1-kafai@fb.com
-
由 Martin KaFai Lau 提交于
The kernel struct_ops obj has kernel's func ptrs implemented by bpf_progs. The bpf prog_id is stored as the value of the func ptr for introspection purpose. In the latter patch, a struct_ops dump subcmd will be added to introspect these func ptrs. It is desired to print the actual bpf prog_name instead of only printing the prog_id. Since struct_ops is the only usecase storing prog_id in the func ptr, this patch adds a prog_id_as_func_ptr bool (default is false) to "struct btf_dumper" in order not to mis-interpret the ptr value for the other existing use-cases. While printing a func_ptr as a bpf prog_name, this patch also prefix the bpf prog_name with the ptr's func_proto. [ Note that it is the ptr's func_proto instead of the bpf prog's func_proto ] It reuses the current btf_dump_func() to obtain the ptr's func_proto string. Here is an example from the bpf_cubic.c: "void (struct sock *, u32, u32) bictcp_cong_avoid/prog_id:140" Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NQuentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200318171650.129252-1-kafai@fb.com
-
由 Martin KaFai Lau 提交于
A char[] is currently printed as an integer array. This patch will print it as a string when 1) The array element type is an one byte int 2) The array element type has a BTF_INT_CHAR encoding or the array element type's name is "char" 3) All characters is between (0x1f, 0x7f) and it is terminated by a null character. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAndrii Nakryiko <andriin@fb.com> Acked-by: NQuentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200318171643.129021-1-kafai@fb.com
-
由 Martin KaFai Lau 提交于
This patch prints the enum's name if there is one found in the array of btf_enum. The commit 9eea9849 ("bpf: fix BTF verification of enums") has details about an enum could have any power-of-2 size (up to 8 bytes). This patch also takes this chance to accommodate these non 4 byte enums. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NQuentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200318171637.128862-1-kafai@fb.com
-
- 18 3月, 2020 6 次提交
-
-
由 Wenbo Zhang 提交于
Use PT_REGS_RC instead of PT_REGS_RET to get ret correctly. Fixes: df8ff353 ("libbpf: Merge selftests' bpf_trace_helpers.h into libbpf's bpf_tracing.h") Signed-off-by: NWenbo Zhang <ethercflow@gmail.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAndrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200315083252.22274-1-ethercflow@gmail.com
-
由 Andrii Nakryiko 提交于
Some tests and sub-tests are setting "custom" thread/process affinity and don't reset it back. Instead of requiring each test to undo all this, ensure that thread affinity is restored by test_progs test runner itself. Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200314013932.4035712-3-andriin@fb.com
-
由 Andrii Nakryiko 提交于
When specifying disjoint set of tests, test_progs doesn't set skipped test's array elements to false. This leads to spurious execution of tests that should have been skipped. Fix it by explicitly initializing them to false. Fixes: 3a516a0a ("selftests/bpf: add sub-tests support for test_progs") Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NMartin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200314013932.4035712-2-andriin@fb.com
-
由 Andrii Nakryiko 提交于
Previous attempt to make tcp_rtt more robust introduced a new race, in which server_done might be set to true before server can actually accept any connection. Fix this by unconditionally waiting for accept(). Given socket is non-blocking, if there are any problems with client side, it should eventually close listening FD and let server thread exit with failure. Fixes: 4cd729fa ("selftests/bpf: Make tcp_rtt test more robust to failures") Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NMartin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200314013932.4035712-1-andriin@fb.com
-
由 Andrii Nakryiko 提交于
Amazingly, some libc implementations don't call __NR_nanosleep syscall from their nanosleep() APIs. Hammer it down with explicit syscall() call and never get back to it again. Also simplify code for timespec initialization. I verified that nanosleep is called w/ printk and in exactly same Linux image that is used in Travis CI. So it should both sleep and call correct syscall. v1->v2: - math is too hard, fix usec -> nsec convertion (Martin); - test_vmlinux has explicit nanosleep() call, convert that one as well. Fixes: 4e1fd25d ("selftests/bpf: Fix usleep() implementation") Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NMartin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200314002743.3782677-1-andriin@fb.com
-
由 Andrii Nakryiko 提交于
Remove unused len variable, which causes compilation warnings. Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NMartin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200314001834.3727680-1-andriin@fb.com
-
- 15 3月, 2020 3 次提交
-
-
由 Petr Machata 提交于
Extend RED testsuite to cover the new nodrop mode of RED-ECN. This test is really similar to ECN test, diverging only in the last step, where UDP traffic should go to backlog instead of being dropped. Thus extract a common helper, ecn_test_common(), make do_ecn_test() into a relatively simple wrapper, and add another one, do_ecn_nodrop_test(). Signed-off-by: NPetr Machata <petrm@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Petr Machata 提交于
Add tests for the new "nodrop" flag. Signed-off-by: NPetr Machata <petrm@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Petr Machata 提交于
Add a handful of tests for creating RED with different flags. Signed-off-by: NPetr Machata <petrm@mellanox.com> Reviewed-by: NRoman Mashak <mrv@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 3月, 2020 10 次提交
-
-
由 Andrii Nakryiko 提交于
Add vmlinux.h generation to selftest/bpf's Makefile. Use it from newly added test_vmlinux to trace nanosleep syscall using 5 different types of programs: - tracepoint; - raw tracepoint; - raw tracepoint w/ direct memory reads (tp_btf); - kprobe; - fentry. These programs are realistic variants of real-life tracing programs, excercising vmlinux.h's usage with tracing applications. Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NMartin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200313172336.1879637-5-andriin@fb.com
-
由 Andrii Nakryiko 提交于
Syscall raw tracepoints have struct pt_regs pointer as tracepoint's first argument. After that, reading any of pt_regs fields requires bpf_probe_read(), even for tp_btf programs. Due to that, PT_REGS_PARMx macros are not usable as is. This patch adds CO-RE variants of those macros that use BPF_CORE_READ() to read necessary fields. This provides relocatable architecture-agnostic pt_regs field accesses. Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NMartin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200313172336.1879637-4-andriin@fb.com
-
由 Andrii Nakryiko 提交于
When finding target type candidates, ignore forward declarations, functions, and other named types of incompatible kind. Not doing this can cause false errors. See [0] for one such case (due to struct pt_regs forward declaration). [0] https://github.com/iovisor/bcc/pull/2806#issuecomment-598543645 Fixes: ddc7c304 ("libbpf: implement BPF CO-RE offset relocation algorithm") Reported-by: NWenbo Zhang <ethercflow@gmail.com> Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NMartin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200313172336.1879637-3-andriin@fb.com
-
由 Andrii Nakryiko 提交于
printf() doesn't seem to honor using overwritten stdout/stderr (as part of stdio hijacking), so ensure all "standard" invocations of printf() do fprintf(stdout, ...) instead. Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NMartin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200313172336.1879637-2-andriin@fb.com
-
由 Jakub Sitnicki 提交于
Andrii Nakryiko reports that sockmap_listen test suite is frequently failing due to accept() calls erroring out with EAGAIN: ./test_progs:connect_accept_thread:733: accept: Resource temporarily unavailable connect_accept_thread:FAIL:733 This is because we are using a non-blocking listening TCP socket to accept() connections without polling on the socket. While at first switching to blocking mode seems like the right thing to do, this could lead to test process blocking indefinitely in face of a network issue, like loopback interface being down, as Andrii pointed out. Hence, stick to non-blocking mode for TCP listening sockets but with polling for incoming connection for a limited time before giving up. Apply this approach to all socket I/O calls in the test suite that we expect to block indefinitely, that is accept() for TCP and recv() for UDP. Fixes: 44d28be2 ("selftests/bpf: Tests for sockmap/sockhash holding listening sockets") Reported-by: NAndrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: NJakub Sitnicki <jakub@cloudflare.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAndrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200313161049.677700-1-jakub@cloudflare.com
-
由 Tobias Klauser 提交于
Commit fe4eb069 ("bpftool: Use linux/types.h from source tree for profiler build") added a build dependency on tools/testing/selftests/bpf to tools/bpf/bpftool. This is suboptimal with respect to a possible stand-alone build of bpftool. Fix this by moving tools/testing/selftests/bpf/include/uapi/linux/types.h to tools/include/uapi/linux/types.h. This requires an adjustment in the include search path order for the tests in tools/testing/selftests/bpf so that tools/include/linux/types.h is selected when building host binaries and tools/include/uapi/linux/types.h is selected when building bpf binaries. Verified by compiling bpftool and the bpf selftests on x86_64 with this change. Fixes: fe4eb069 ("bpftool: Use linux/types.h from source tree for profiler build") Suggested-by: NAndrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: NTobias Klauser <tklauser@distanz.ch> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Reviewed-by: NQuentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200313113105.6918-1-tklauser@distanz.ch
-
由 Andrii Nakryiko 提交于
nanosleep syscall expects pointer to struct timespec, not nanoseconds directly. Current implementation fulfills its purpose of invoking nanosleep syscall, but doesn't really provide sleeping capabilities, which can cause flakiness for tests relying on usleep() to wait for something. Fixes: ec12a57b822c ("selftests/bpf: Guarantee that useep() calls nanosleep() syscall") Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200313061837.3685572-1-andriin@fb.comSigned-off-by: NAlexei Starovoitov <ast@kernel.org>
-
由 Andrii Nakryiko 提交于
Switch to non-blocking accept and wait for server thread to exit before proceeding. I noticed that sometimes tcp_rtt server thread failure would "spill over" into other tests (that would run after tcp_rtt), probably just because server thread exits much later and tcp_rtt doesn't wait for it. v1->v2: - add usleep() while waiting on initial non-blocking accept() (Stanislav); Fixes: 8a03222f ("selftests/bpf: test_progs: fix client/server race in tcp_rtt") Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Reviewed-by: NStanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20200311222749.458015-1-andriin@fb.comSigned-off-by: NAlexei Starovoitov <ast@kernel.org>
-
由 Andrii Nakryiko 提交于
Some implementations of C runtime library won't call nanosleep() syscall from usleep(). But a bunch of kprobe/tracepoint selftests rely on nanosleep being called to trigger them. To make this more reliable, "override" usleep implementation and call nanosleep explicitly. Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Cc: Julia Kartseva <hex@fb.com> Link: https://lore.kernel.org/bpf/20200311185345.3874602-1-andriin@fb.comSigned-off-by: NAlexei Starovoitov <ast@kernel.org>
-
由 Quentin Monnet 提交于
In commit 4a3d6c6a ("libbpf: Reduce log level for custom section names"), log level for messages for libbpf_attach_type_by_name() and libbpf_prog_type_by_name() was downgraded from "info" to "debug". The latter function, in particular, is used by bpftool when attempting to load programs, and this change caused bpftool to exit with no hint or error message when it fails to detect the type of the program to load (unless "-d" option was provided). To help users understand why bpftool fails to load the program, let's do a second run of the function with log level in "debug" mode in case of failure. Before: # bpftool prog load sample_ret0.o /sys/fs/bpf/sample_ret0 # echo $? 255 Or really verbose with -d flag: # bpftool -d prog load sample_ret0.o /sys/fs/bpf/sample_ret0 libbpf: loading sample_ret0.o libbpf: section(1) .strtab, size 134, link 0, flags 0, type=3 libbpf: skip section(1) .strtab libbpf: section(2) .text, size 16, link 0, flags 6, type=1 libbpf: found program .text libbpf: section(3) .debug_abbrev, size 55, link 0, flags 0, type=1 libbpf: skip section(3) .debug_abbrev libbpf: section(4) .debug_info, size 75, link 0, flags 0, type=1 libbpf: skip section(4) .debug_info libbpf: section(5) .rel.debug_info, size 32, link 14, flags 0, type=9 libbpf: skip relo .rel.debug_info(5) for section(4) libbpf: section(6) .debug_str, size 150, link 0, flags 30, type=1 libbpf: skip section(6) .debug_str libbpf: section(7) .BTF, size 155, link 0, flags 0, type=1 libbpf: section(8) .BTF.ext, size 80, link 0, flags 0, type=1 libbpf: section(9) .rel.BTF.ext, size 32, link 14, flags 0, type=9 libbpf: skip relo .rel.BTF.ext(9) for section(8) libbpf: section(10) .debug_frame, size 40, link 0, flags 0, type=1 libbpf: skip section(10) .debug_frame libbpf: section(11) .rel.debug_frame, size 16, link 14, flags 0, type=9 libbpf: skip relo .rel.debug_frame(11) for section(10) libbpf: section(12) .debug_line, size 74, link 0, flags 0, type=1 libbpf: skip section(12) .debug_line libbpf: section(13) .rel.debug_line, size 16, link 14, flags 0, type=9 libbpf: skip relo .rel.debug_line(13) for section(12) libbpf: section(14) .symtab, size 96, link 1, flags 0, type=2 libbpf: looking for externs among 4 symbols... libbpf: collected 0 externs total libbpf: failed to guess program type from ELF section '.text' libbpf: supported section(type) names are: socket sk_reuseport kprobe/ [...] After: # bpftool prog load sample_ret0.o /sys/fs/bpf/sample_ret0 libbpf: failed to guess program type from ELF section '.text' libbpf: supported section(type) names are: socket sk_reuseport kprobe/ [...] Signed-off-by: NQuentin Monnet <quentin@isovalent.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200311021205.9755-1-quentin@isovalent.comSigned-off-by: NAlexei Starovoitov <ast@kernel.org>
-