- 28 8月, 2020 2 次提交
-
-
由 Yonghong Song 提交于
bpf_link_info.iter is used by link_query to return bpf_iter_link_info to user space. Fields may be different, e.g., map_fd vs. map_id, so we cannot reuse the exact structure. But make them similar, e.g., struct bpf_link_info { /* common fields */ union { struct { ... } raw_tracepoint; struct { ... } tracing; ... struct { /* common fields for iter */ union { struct { __u32 map_id; } map; /* other structs for other targets */ }; }; }; }; so the structure is extensible the same way as bpf_iter_link_info. Fixes: 6b0a249a ("bpf: Implement link_query for bpf iterators") Signed-off-by: NYonghong Song <yhs@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAndrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200828051922.758950-1-yhs@fb.com
-
由 Jesper Dangaard Brouer 提交于
The system for "Auto-detecting system features" located under tools/build/ are (currently) used by perf, libbpf and bpftool. It can contain stalled feature detection files, which are not cleaned up by libbpf and bpftool on make clean (side-note: perf tool is correct). Fix this by making the users invoke the make clean target. Some details about the changes. The libbpf Makefile already had a clean-config target (which seems to be copy-pasted from perf), but this target was not "connected" (a make dependency) to clean target. Choose not to rename target as someone might be using it. Did change the output from "CLEAN config" to "CLEAN feature-detect", to make it more clear what happens. This is related to the complaint and troubleshooting in the following link: https://lore.kernel.org/lkml/20200818122007.2d1cfe2d@carbon/Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/lkml/20200818122007.2d1cfe2d@carbon/ Link: https://lore.kernel.org/bpf/159851841661.1072907.13770213104521805592.stgit@firesoul
-
- 27 8月, 2020 7 次提交
-
-
由 Andrii Nakryiko 提交于
Fix compilation warnings due to __u64 defined differently as `unsigned long` or `unsigned long long` on different architectures (e.g., ppc64le differs from x86-64). Also cast one argument to size_t to fix printf warning of similar nature. Fixes: eacaaed7 ("libbpf: Implement enum value-based CO-RE relocations") Fixes: 50e09460 ("libbpf: Skip well-known ELF sections when iterating ELF") Reported-by: NNaresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200827041109.3613090-1-andriin@fb.com
-
由 Yonghong Song 提交于
Added some test_verifier bounds check test cases for xor operations. $ ./test_verifier ... #78/u bounds check for reg = 0, reg xor 1 OK #78/p bounds check for reg = 0, reg xor 1 OK #79/u bounds check for reg32 = 0, reg32 xor 1 OK #79/p bounds check for reg32 = 0, reg32 xor 1 OK #80/u bounds check for reg = 2, reg xor 3 OK #80/p bounds check for reg = 2, reg xor 3 OK #81/u bounds check for reg = any, reg xor 3 OK #81/p bounds check for reg = any, reg xor 3 OK #82/u bounds check for reg32 = any, reg32 xor 3 OK #82/p bounds check for reg32 = any, reg32 xor 3 OK #83/u bounds check for reg > 0, reg xor 3 OK #83/p bounds check for reg > 0, reg xor 3 OK #84/u bounds check for reg32 > 0, reg32 xor 3 OK #84/p bounds check for reg32 > 0, reg32 xor 3 OK ... Signed-off-by: NYonghong Song <yhs@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200825064609.2018077-1-yhs@fb.com
-
由 Alex Gartrell 提交于
There are code paths where EINVAL is returned directly without setting errno. In that case, errno could be 0, which would mask the failure. For example, if a careless programmer set log_level to 10000 out of laziness, they would have to spend a long time trying to figure out why. Fixes: 4f33ddb4 ("libbpf: Propagate EPERM to caller on program load") Signed-off-by: NAlex Gartrell <alexgartrell@gmail.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200826075549.1858580-1-alexgartrell@gmail.com
-
由 Udip Pant 提交于
This adds further tests to ensure access permissions and restrictions are applied properly for some map types such as sock-map. It also adds another negative tests to assert static functions cannot be replaced. In the 'unreliable' mode it still fails with error 'tracing progs cannot use bpf_spin_lock yet' with the change in the verifier Signed-off-by: NUdip Pant <udippant@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200825232003.2877030-5-udippant@fb.com
-
由 Udip Pant 提交于
This adds test to enforce same check for the return code for the extended prog as it is enforced for the target program. It asserts failure for a return code, which is permitted without the patch in this series, while it is restricted after the application of this patch. Signed-off-by: NUdip Pant <udippant@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200825232003.2877030-4-udippant@fb.com
-
由 Udip Pant 提交于
This adds a selftest that tests the behavior when a freplace target program attempts to make a write access on a packet. The expectation is that the read or write access is granted based on the program type of the linked program and not itself (which is of type, for e.g., BPF_PROG_TYPE_EXT). This test fails without the associated patch on the verifier. Signed-off-by: NUdip Pant <udippant@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200825232003.2877030-3-udippant@fb.com
-
由 Colin Ian King 提交于
There is a spelling mistake in a check error message. Fix it. Signed-off-by: NColin Ian King <colin.king@canonical.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200826085907.43095-1-colin.king@canonical.com
-
- 26 8月, 2020 13 次提交
-
-
由 Jiri Olsa 提交于
Alexei reported compile breakage on newer systems with following error: In file included from /usr/include/fcntl.h:290:0, 4814 from ./test_progs.h:29, 4815 from .../bpf-next/tools/testing/selftests/bpf/prog_tests/d_path.c:3: 4816In function ‘open’, 4817 inlined from ‘trigger_fstat_events’ at .../bpf-next/tools/testing/selftests/bpf/prog_tests/d_path.c:50:10, 4818 inlined from ‘test_d_path’ at .../bpf-next/tools/testing/selftests/bpf/prog_tests/d_path.c:119:6: 4819/usr/include/x86_64-linux-gnu/bits/fcntl2.h:50:4: error: call to ‘__open_missing_mode’ declared with attribute error: open with O_CREAT or O_TMPFILE in second argument needs 3 arguments 4820 __open_missing_mode (); 4821 ^~~~~~~~~~~~~~~~~~~~~~ We're missing permission bits as 3rd argument for open call with O_CREAT flag specified. Fixes: e4d1af4b ("selftests/bpf: Add test for d_path helper") Reported-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NJiri Olsa <jolsa@kernel.org> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200826101845.747617-1-jolsa@kernel.org
-
由 Jiri Olsa 提交于
Adding test to for sets resolve_btfids. We're checking that testing set gets properly resolved and sorted. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NAndrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200825192124.710397-15-jolsa@kernel.org
-
由 Jiri Olsa 提交于
Adding test for d_path helper which is pretty much copied from Wenbo Zhang's test for bpf_get_fd_path, which never made it in. The test is doing fstat/close on several fd types, and verifies we got the d_path helper working on kernel probes for vfs_getattr/filp_close functions. Original-patch-by: NWenbo Zhang <ethercflow@gmail.com> Signed-off-by: NJiri Olsa <jolsa@kernel.org> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NAndrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200825192124.710397-14-jolsa@kernel.org
-
由 Jiri Olsa 提交于
Adding verifier test for attaching tracing program and calling d_path helper from within and testing that it's allowed for dentry_open function and denied for 'd_path' function with appropriate error. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NAndrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200825192124.710397-13-jolsa@kernel.org
-
由 Jiri Olsa 提交于
Adding d_path helper function that returns full path for given 'struct path' object, which needs to be the kernel BTF 'path' object. The path is returned in buffer provided 'buf' of size 'sz' and is zero terminated. bpf_d_path(&file->f_path, buf, size); The helper calls directly d_path function, so there's only limited set of function it can be called from. Adding just very modest set for the start. Updating also bpf.h tools uapi header and adding 'path' to bpf_helpers_doc.py script. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NAndrii Nakryiko <andriin@fb.com> Acked-by: NKP Singh <kpsingh@google.com> Link: https://lore.kernel.org/bpf/20200825192124.710397-11-jolsa@kernel.org
-
由 Jiri Olsa 提交于
Adding support to define sorted set of BTF ID values. Following defines sorted set of BTF ID values: BTF_SET_START(btf_allowlist_d_path) BTF_ID(func, vfs_truncate) BTF_ID(func, vfs_fallocate) BTF_ID(func, dentry_open) BTF_ID(func, vfs_getattr) BTF_ID(func, filp_close) BTF_SET_END(btf_allowlist_d_path) It defines following 'struct btf_id_set' variable to access values and count: struct btf_id_set btf_allowlist_d_path; Adding 'allowed' callback to struct bpf_func_proto, to allow verifier the check on allowed callers. Adding btf_id_set_contains function, which will be used by allowed callbacks to verify the caller's BTF ID value is within allowed set. Also removing extra '\' in __BTF_ID_LIST macro. Added BTF_SET_START_GLOBAL macro for global sets. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NAndrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200825192124.710397-10-jolsa@kernel.org
-
由 Jiri Olsa 提交于
The set symbol does not have the unique number suffix, so we need to give it a special parsing function. This was omitted in the first batch, because there was no set support yet, so it slipped in the testing. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NAndrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200825192124.710397-3-jolsa@kernel.org
-
由 Jiri Olsa 提交于
To make sure we don't crash on malformed symbols. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NAndrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200825192124.710397-2-jolsa@kernel.org
-
由 KP Singh 提交于
inode_local_storage: * Hook to the file_open and inode_unlink LSM hooks. * Create and unlink a temporary file. * Store some information in the inode's bpf_local_storage during file_open. * Verify that this information exists when the file is unlinked. sk_local_storage: * Hook to the socket_post_create and socket_bind LSM hooks. * Open and bind a socket and set the sk_storage in the socket_post_create hook using the start_server helper. * Verify if the information is set in the socket_bind hook. Signed-off-by: NKP Singh <kpsingh@google.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NAndrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200825182919.1118197-8-kpsingh@chromium.org
-
由 KP Singh 提交于
Adds support for both bpf_{sk, inode}_storage_{get, delete} to be used in LSM programs. These helpers are not used for tracing programs (currently) as their usage is tied to the life-cycle of the object and should only be used where the owning object won't be freed (when the owning object is passed as an argument to the LSM hook). Thus, they are safer to use in LSM hooks than tracing. Usage of local storage in tracing programs will probably follow a per function based whitelist approach. Since the UAPI helper signature for bpf_sk_storage expect a bpf_sock, it, leads to a compilation warning for LSM programs, it's also updated to accept a void * pointer instead. Signed-off-by: NKP Singh <kpsingh@google.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NMartin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200825182919.1118197-7-kpsingh@chromium.org
-
由 KP Singh 提交于
Similar to bpf_local_storage for sockets, add local storage for inodes. The life-cycle of storage is managed with the life-cycle of the inode. i.e. the storage is destroyed along with the owning inode. The BPF LSM allocates an __rcu pointer to the bpf_local_storage in the security blob which are now stackable and can co-exist with other LSMs. Signed-off-by: NKP Singh <kpsingh@google.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200825182919.1118197-6-kpsingh@chromium.org
-
由 KP Singh 提交于
Refactor the functionality in bpf_sk_storage.c so that concept of storage linked to kernel objects can be extended to other objects like inode, task_struct etc. Each new local storage will still be a separate map and provide its own set of helpers. This allows for future object specific extensions and still share a lot of the underlying implementation. This includes the changes suggested by Martin in: https://lore.kernel.org/bpf/20200725013047.4006241-1-kafai@fb.com/ adding new map operations to support bpf_local_storage maps: * storages for different kernel objects to optionally have different memory charging strategy (map_local_storage_charge, map_local_storage_uncharge) * Functionality to extract the storage pointer from a pointer to the owning object (map_owner_storage_ptr) Co-developed-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NKP Singh <kpsingh@google.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200825182919.1118197-4-kpsingh@chromium.org
-
由 KP Singh 提交于
A purely mechanical change to split the renaming from the actual generalization. Flags/consts: SK_STORAGE_CREATE_FLAG_MASK BPF_LOCAL_STORAGE_CREATE_FLAG_MASK BPF_SK_STORAGE_CACHE_SIZE BPF_LOCAL_STORAGE_CACHE_SIZE MAX_VALUE_SIZE BPF_LOCAL_STORAGE_MAX_VALUE_SIZE Structs: bucket bpf_local_storage_map_bucket bpf_sk_storage_map bpf_local_storage_map bpf_sk_storage_data bpf_local_storage_data bpf_sk_storage_elem bpf_local_storage_elem bpf_sk_storage bpf_local_storage The "sk" member in bpf_local_storage is also updated to "owner" in preparation for changing the type to void * in a subsequent patch. Functions: selem_linked_to_sk selem_linked_to_storage selem_alloc bpf_selem_alloc __selem_unlink_sk bpf_selem_unlink_storage_nolock __selem_link_sk bpf_selem_link_storage_nolock selem_unlink_sk __bpf_selem_unlink_storage sk_storage_update bpf_local_storage_update __sk_storage_lookup bpf_local_storage_lookup bpf_sk_storage_map_free bpf_local_storage_map_free bpf_sk_storage_map_alloc bpf_local_storage_map_alloc bpf_sk_storage_map_alloc_check bpf_local_storage_map_alloc_check bpf_sk_storage_map_check_btf bpf_local_storage_map_check_btf Signed-off-by: NKP Singh <kpsingh@google.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NMartin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200825182919.1118197-2-kpsingh@chromium.org
-
- 25 8月, 2020 12 次提交
-
-
由 Yonghong Song 提交于
Currently test_sk_assign failed verifier with llvm11/llvm12. During debugging, I found the default verifier output is truncated like below Verifier analysis: Skipped 2200 bytes, use 'verb' option for the full verbose log. [...] off=23,r=34,imm=0) R5=inv0 R6=ctx(id=0,off=0,imm=0) R7=pkt(id=0,off=0,r=34,imm=0) R10=fp0 80: (0f) r7 += r2 last_idx 80 first_idx 21 regs=4 stack=0 before 78: (16) if w3 == 0x11 goto pc+1 when I am using "./test_progs -vv -t assign". The reason is tc verbose mode is not enabled. This patched enabled tc verbose mode and the output looks like below Verifier analysis: 0: (bf) r6 = r1 1: (b4) w0 = 2 2: (61) r1 = *(u32 *)(r6 +80) 3: (61) r7 = *(u32 *)(r6 +76) 4: (bf) r2 = r7 5: (07) r2 += 14 6: (2d) if r2 > r1 goto pc+61 R0_w=inv2 R1_w=pkt_end(id=0,off=0,imm=0) R2_w=pkt(id=0,off=14,r=14,imm=0) ... Signed-off-by: NYonghong Song <yhs@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200824222807.100200-1-yhs@fb.com
-
由 Lorenz Bauer 提交于
Address review by Yonghong, to bring the new tests in line with the usual code style. Signed-off-by: NLorenz Bauer <lmb@cloudflare.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NYonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200824084523.13104-1-lmb@cloudflare.com
-
由 Andrii Nakryiko 提交于
Fix copy-paste error in types compatibility check. Local type is accidentally used instead of target type for the very first type check strictness check. This can result in potentially less strict candidate comparison. Fix the error. Fixes: 3fc32f40 ("libbpf: Implement type-based CO-RE relocations support") Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200821225653.2180782-1-andriin@fb.com
-
由 Andrii Nakryiko 提交于
Some versions of GCC report uninitialized targ_spec usage. GCC is wrong, but let's avoid unnecessary warnings. Fixes: ddc7c304 ("libbpf: implement BPF CO-RE offset relocation algorithm") Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200821225556.2178419-1-andriin@fb.com
-
由 Martin KaFai Lau 提交于
This patch is adapted from Eric's patch in an earlier discussion [1]. The TCP_SAVE_SYN currently only stores the network header and tcp header. This patch allows it to optionally store the mac header also if the setsockopt's optval is 2. It requires one more bit for the "save_syn" bit field in tcp_sock. This patch achieves this by moving the syn_smc bit next to the is_mptcp. The syn_smc is currently used with the TCP experimental option. Since syn_smc is only used when CONFIG_SMC is enabled, this patch also puts the "IS_ENABLED(CONFIG_SMC)" around it like the is_mptcp did with "IS_ENABLED(CONFIG_MPTCP)". The mac_hdrlen is also stored in the "struct saved_syn" to allow a quick offset from the bpf prog if it chooses to start getting from the network header or the tcp header. [1]: https://lore.kernel.org/netdev/CANn89iLJNWh6bkH7DNhy_kmcAexuUCccqERqe7z2QsvPhGrYPQ@mail.gmail.com/Suggested-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Reviewed-by: NEric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/bpf/20200820190123.2886935-1-kafai@fb.com
-
由 Martin KaFai Lau 提交于
This patch adds tests for the new bpf tcp header option feature. test_tcp_hdr_options.c: - It tests header option writing and parsing in 3WHS: regular connection establishment, fastopen, and syncookie. - In syncookie, the passive side's bpf prog is asking the active side to resend its bpf header option by specifying a RESEND bit in the outgoing SYNACK. handle_active_estab() and write_nodata_opt() has some details. - handle_passive_estab() has comments on fastopen. - It also has test for header writing and parsing in FIN packet. - Most of the tests is writing an experimental option 254 with magic 0xeB9F. - The no_exprm_estab() also tests writing a regular TCP option without any magic. test_misc_tcp_options.c: - It is an one directional test. Active side writes option and passive side parses option. The focus is to exercise the new helpers and API. - Testing the new helper: bpf_load_hdr_opt() and bpf_store_hdr_opt(). - Testing the bpf_getsockopt(TCP_BPF_SYN). - Negative tests for the above helpers. - Testing the sock_ops->skb_data. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200820190117.2886749-1-kafai@fb.com
-
由 Martin KaFai Lau 提交于
This patch adds a fastopen_connect() helper which will be used in a later test. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200820190111.2886196-1-kafai@fb.com
-
由 Martin KaFai Lau 提交于
[ Note: The TCP changes here is mainly to implement the bpf pieces into the bpf_skops_*() functions introduced in the earlier patches. ] The earlier effort in BPF-TCP-CC allows the TCP Congestion Control algorithm to be written in BPF. It opens up opportunities to allow a faster turnaround time in testing/releasing new congestion control ideas to production environment. The same flexibility can be extended to writing TCP header option. It is not uncommon that people want to test new TCP header option to improve the TCP performance. Another use case is for data-center that has a more controlled environment and has more flexibility in putting header options for internal only use. For example, we want to test the idea in putting maximum delay ACK in TCP header option which is similar to a draft RFC proposal [1]. This patch introduces the necessary BPF API and use them in the TCP stack to allow BPF_PROG_TYPE_SOCK_OPS program to parse and write TCP header options. It currently supports most of the TCP packet except RST. Supported TCP header option: ─────────────────────────── This patch allows the bpf-prog to write any option kind. Different bpf-progs can write its own option by calling the new helper bpf_store_hdr_opt(). The helper will ensure there is no duplicated option in the header. By allowing bpf-prog to write any option kind, this gives a lot of flexibility to the bpf-prog. Different bpf-prog can write its own option kind. It could also allow the bpf-prog to support a recently standardized option on an older kernel. Sockops Callback Flags: ────────────────────── The bpf program will only be called to parse/write tcp header option if the following newly added callback flags are enabled in tp->bpf_sock_ops_cb_flags: BPF_SOCK_OPS_PARSE_UNKNOWN_HDR_OPT_CB_FLAG BPF_SOCK_OPS_PARSE_ALL_HDR_OPT_CB_FLAG BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG A few words on the PARSE CB flags. When the above PARSE CB flags are turned on, the bpf-prog will be called on packets received at a sk that has at least reached the ESTABLISHED state. The parsing of the SYN-SYNACK-ACK will be discussed in the "3 Way HandShake" section. The default is off for all of the above new CB flags, i.e. the bpf prog will not be called to parse or write bpf hdr option. There are details comment on these new cb flags in the UAPI bpf.h. sock_ops->skb_data and bpf_load_hdr_opt() ───────────────────────────────────────── sock_ops->skb_data and sock_ops->skb_data_end covers the whole TCP header and its options. They are read only. The new bpf_load_hdr_opt() helps to read a particular option "kind" from the skb_data. Please refer to the comment in UAPI bpf.h. It has details on what skb_data contains under different sock_ops->op. 3 Way HandShake ─────────────── The bpf-prog can learn if it is sending SYN or SYNACK by reading the sock_ops->skb_tcp_flags. * Passive side When writing SYNACK (i.e. sock_ops->op == BPF_SOCK_OPS_WRITE_HDR_OPT_CB), the received SYN skb will be available to the bpf prog. The bpf prog can use the SYN skb (which may carry the header option sent from the remote bpf prog) to decide what bpf header option should be written to the outgoing SYNACK skb. The SYN packet can be obtained by getsockopt(TCP_BPF_SYN*). More on this later. Also, the bpf prog can learn if it is in syncookie mode (by checking sock_ops->args[0] == BPF_WRITE_HDR_TCP_SYNACK_COOKIE). The bpf prog can store the received SYN pkt by using the existing bpf_setsockopt(TCP_SAVE_SYN). The example in a later patch does it. [ Note that the fullsock here is a listen sk, bpf_sk_storage is not very useful here since the listen sk will be shared by many concurrent connection requests. Extending bpf_sk_storage support to request_sock will add weight to the minisock and it is not necessary better than storing the whole ~100 bytes SYN pkt. ] When the connection is established, the bpf prog will be called in the existing PASSIVE_ESTABLISHED_CB callback. At that time, the bpf prog can get the header option from the saved syn and then apply the needed operation to the newly established socket. The later patch will use the max delay ack specified in the SYN header and set the RTO of this newly established connection as an example. The received ACK (that concludes the 3WHS) will also be available to the bpf prog during PASSIVE_ESTABLISHED_CB through the sock_ops->skb_data. It could be useful in syncookie scenario. More on this later. There is an existing getsockopt "TCP_SAVED_SYN" to return the whole saved syn pkt which includes the IP[46] header and the TCP header. A few "TCP_BPF_SYN*" getsockopt has been added to allow specifying where to start getting from, e.g. starting from TCP header, or from IP[46] header. The new getsockopt(TCP_BPF_SYN*) will also know where it can get the SYN's packet from: - (a) the just received syn (available when the bpf prog is writing SYNACK) and it is the only way to get SYN during syncookie mode. or - (b) the saved syn (available in PASSIVE_ESTABLISHED_CB and also other existing CB). The bpf prog does not need to know where the SYN pkt is coming from. The getsockopt(TCP_BPF_SYN*) will hide this details. Similarly, a flags "BPF_LOAD_HDR_OPT_TCP_SYN" is also added to bpf_load_hdr_opt() to read a particular header option from the SYN packet. * Fastopen Fastopen should work the same as the regular non fastopen case. This is a test in a later patch. * Syncookie For syncookie, the later example patch asks the active side's bpf prog to resend the header options in ACK. The server can use bpf_load_hdr_opt() to look at the options in this received ACK during PASSIVE_ESTABLISHED_CB. * Active side The bpf prog will get a chance to write the bpf header option in the SYN packet during WRITE_HDR_OPT_CB. The received SYNACK pkt will also be available to the bpf prog during the existing ACTIVE_ESTABLISHED_CB callback through the sock_ops->skb_data and bpf_load_hdr_opt(). * Turn off header CB flags after 3WHS If the bpf prog does not need to write/parse header options beyond the 3WHS, the bpf prog can clear the bpf_sock_ops_cb_flags to avoid being called for header options. Or the bpf-prog can select to leave the UNKNOWN_HDR_OPT_CB_FLAG on so that the kernel will only call it when there is option that the kernel cannot handle. [1]: draft-wang-tcpm-low-latency-opt-00 https://tools.ietf.org/html/draft-wang-tcpm-low-latency-opt-00Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200820190104.2885895-1-kafai@fb.com
-
由 Martin KaFai Lau 提交于
The bpf prog needs to parse the SYN header to learn what options have been sent by the peer's bpf-prog before writing its options into SYNACK. This patch adds a "syn_skb" arg to tcp_make_synack() and send_synack(). This syn_skb will eventually be made available (as read-only) to the bpf prog. This will be the only SYN packet available to the bpf prog during syncookie. For other regular cases, the bpf prog can also use the saved_syn. When writing options, the bpf prog will first be called to tell the kernel its required number of bytes. It is done by the new bpf_skops_hdr_opt_len(). The bpf prog will only be called when the new BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG is set in tp->bpf_sock_ops_cb_flags. When the bpf prog returns, the kernel will know how many bytes are needed and then update the "*remaining" arg accordingly. 4 byte alignment will be included in the "*remaining" before this function returns. The 4 byte aligned number of bytes will also be stored into the opts->bpf_opt_len. "bpf_opt_len" is a newly added member to the struct tcp_out_options. Then the new bpf_skops_write_hdr_opt() will call the bpf prog to write the header options. The bpf prog is only called if it has reserved spaces before (opts->bpf_opt_len > 0). The bpf prog is the last one getting a chance to reserve header space and writing the header option. These two functions are half implemented to highlight the changes in TCP stack. The actual codes preparing the bpf running context and invoking the bpf prog will be added in the later patch with other necessary bpf pieces. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Reviewed-by: NEric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/bpf/20200820190052.2885316-1-kafai@fb.com
-
由 Martin KaFai Lau 提交于
The patch adds a function bpf_skops_parse_hdr(). It will call the bpf prog to parse the TCP header received at a tcp_sock that has at least reached the ESTABLISHED state. For the packets received during the 3WHS (SYN, SYNACK and ACK), the received skb will be available to the bpf prog during the callback in bpf_skops_established() introduced in the previous patch and in the bpf_skops_write_hdr_opt() that will be added in the next patch. Calling bpf prog to parse header is controlled by two new flags in tp->bpf_sock_ops_cb_flags: BPF_SOCK_OPS_PARSE_UNKNOWN_HDR_OPT_CB_FLAG and BPF_SOCK_OPS_PARSE_ALL_HDR_OPT_CB_FLAG. When BPF_SOCK_OPS_PARSE_UNKNOWN_HDR_OPT_CB_FLAG is set, the bpf prog will only be called when there is unknown option in the TCP header. When BPF_SOCK_OPS_PARSE_ALL_HDR_OPT_CB_FLAG is set, the bpf prog will be called on all received TCP header. This function is half implemented to highlight the changes in TCP stack. The actual codes preparing the bpf running context and invoking the bpf prog will be added in the later patch with other necessary bpf pieces. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Reviewed-by: NEric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/bpf/20200820190046.2885054-1-kafai@fb.com
-
由 Martin KaFai Lau 提交于
This patch adds bpf_setsockopt(TCP_BPF_RTO_MIN) to allow bpf prog to set the min rto of a connection. It could be used together with the earlier patch which has added bpf_setsockopt(TCP_BPF_DELACK_MAX). A later selftest patch will communicate the max delay ack in a bpf tcp header option and then the receiving side can use bpf_setsockopt(TCP_BPF_RTO_MIN) to set a shorter rto. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Reviewed-by: NEric Dumazet <edumazet@google.com> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200820190027.2884170-1-kafai@fb.com
-
由 Martin KaFai Lau 提交于
This change is mostly from an internal patch and adapts it from sysctl config to the bpf_setsockopt setup. The bpf_prog can set the max delay ack by using bpf_setsockopt(TCP_BPF_DELACK_MAX). This max delay ack can be communicated to its peer through bpf header option. The receiving peer can then use this max delay ack and set a potentially lower rto by using bpf_setsockopt(TCP_BPF_RTO_MIN) which will be introduced in the next patch. Another later selftest patch will also use it like the above to show how to write and parse bpf tcp header option. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Reviewed-by: NEric Dumazet <edumazet@google.com> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200820190021.2884000-1-kafai@fb.com
-
- 22 8月, 2020 6 次提交
-
-
由 Andrii Nakryiko 提交于
Make libbpf logs follow similar pattern and provide more context like section name or program name, where appropriate. Also, add BPF_INSN_SZ constant and use it throughout to clean up code a little bit. This commit doesn't have any functional changes and just removes some code changes out of the way before bigger refactoring in libbpf internals. Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200820231250.1293069-6-andriin@fb.com
-
由 Andrii Nakryiko 提交于
Skip and don't log ELF sections that libbpf knows about and ignores during ELF processing. This allows to not unnecessarily log details about those ELF sections and cleans up libbpf debug log. Ignored sections include DWARF data, string table, empty .text section and few special (e.g., .llvm_addrsig) useless sections. With such ELF sections out of the way, log unrecognized ELF sections at pr_info level to increase visibility. Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200820231250.1293069-5-andriin@fb.com
-
由 Andrii Nakryiko 提交于
__noinline is pretty frequently used, especially with BPF subprograms, so add them along the __always_inline, for user convenience and completeness. Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200820231250.1293069-4-andriin@fb.com
-
由 Andrii Nakryiko 提交于
Factor out common ELF operations done throughout the libbpf. This simplifies usage across multiple places in libbpf, as well as hide error reporting from higher-level functions and make error logging more consistent. Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200820231250.1293069-3-andriin@fb.com
-
由 Andrii Nakryiko 提交于
There is no need to re-build BPF object files if any of the sources of libbpf change. So record more precise dependency only on libbpf/bpf_*.h headers. This eliminates unnecessary re-builds. Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200820231250.1293069-2-andriin@fb.com
-
由 Lorenz Bauer 提交于
Add a test which copies a socket from a sockmap into another sockmap or sockhash. This excercises bpf_map_update_elem support from BPF context. Compare the socket cookies from source and destination to ensure that the copy succeeded. Also check that the verifier rejects map_update from unsafe contexts. Signed-off-by: NLorenz Bauer <lmb@cloudflare.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NYonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200821102948.21918-7-lmb@cloudflare.com
-