- 24 8月, 2022 14 次提交
-
-
由 Alexei Starovoitov 提交于
mainline inclusion from mainline-5.14-rc1 commit 79a7f8bd category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=79a7f8bdb159d9914b58740f3d31d602a6e4aca8 ------------------------------------------------- Add placeholders for bpf_sys_bpf() helper and new program type. Make sure to check that expected_attach_type is zero for future extensibility. Allow tracing helper functions to be used in this program type, since they will only execute from user context via bpf_prog_test_run. Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Acked-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210514003623.28033-2-alexei.starovoitov@gmail.com (cherry picked from commit 79a7f8bd) Signed-off-by: NWang Yufen <wangyufen@huawei.com> Conflicts: include/linux/bpf_types.h include/uapi/linux/bpf.h tools/include/uapi/linux/bpf.h Signed-off-by: NWang Yufen <wangyufen@huawei.com>
-
由 Florent Revest 提交于
mainline inclusion from mainline-5.13-rc1 commit 7b15523a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7b15523a989b63927c2bb08e9b5b0bbc10b58bef ------------------------------------------------- The implementation takes inspiration from the existing bpf_trace_printk helper but there are a few differences: To allow for a large number of format-specifiers, parameters are provided in an array, like in bpf_seq_printf. Because the output string takes two arguments and the array of parameters also takes two arguments, the format string needs to fit in one argument. Thankfully, ARG_PTR_TO_CONST_STR is guaranteed to point to a zero-terminated read-only map so we don't need a format string length arg. Because the format-string is known at verification time, we also do a first pass of format string validation in the verifier logic. This makes debugging easier. Signed-off-by: NFlorent Revest <revest@chromium.org> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210419155243.1632274-4-revest@chromium.org (cherry picked from commit 7b15523a) Signed-off-by: NWang Yufen <wangyufen@huawei.com>
-
由 Toke Høiland-Jørgensen 提交于
mainline inclusion from mainline-5.13-rc1 commit 441e8c66 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=441e8c66b23e027c00ccebd70df9fd933918eefe ------------------------------------------------- There is currently no way to discover the target of a tracing program attachment after the fact. Add this information to bpf_link_info and return it when querying the bpf_link fd. Signed-off-by: NToke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210413091607.58945-1-toke@redhat.com (cherry picked from commit 441e8c66) Signed-off-by: NWang Yufen <wangyufen@huawei.com>
-
由 Pedro Tammela 提交于
mainline inclusion from mainline-5.13-rc1 commit 5c507329 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5c507329000e282dce91e6c98ee6ffa61a8a5e49 ------------------------------------------------- In 'bpf_ringbuf_reserve()' we require the flag to '0' at the moment. For 'bpf_ringbuf_{discard,submit,output}' a flag of '0' might send a notification to the process if needed. Signed-off-by: NPedro Tammela <pctammela@mojatatu.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210412192434.944343-1-pctammela@mojatatu.com (cherry picked from commit 5c507329) Signed-off-by: NWang Yufen <wangyufen@huawei.com>
-
由 Martin KaFai Lau 提交于
mainline inclusion from mainline-5.13-rc1 commit e6ac2450 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e6ac2450d6dee3121cd8bbf2907b78a68a8a353d ------------------------------------------------- This patch adds support to BPF verifier to allow bpf program calling kernel function directly. The use case included in this set is to allow bpf-tcp-cc to directly call some tcp-cc helper functions (e.g. "tcp_cong_avoid_ai()"). Those functions have already been used by some kernel tcp-cc implementations. This set will also allow the bpf-tcp-cc program to directly call the kernel tcp-cc implementation, For example, a bpf_dctcp may only want to implement its own dctcp_cwnd_event() and reuse other dctcp_*() directly from the kernel tcp_dctcp.c instead of reimplementing (or copy-and-pasting) them. The tcp-cc kernel functions mentioned above will be white listed for the struct_ops bpf-tcp-cc programs to use in a later patch. The white listed functions are not bounded to a fixed ABI contract. Those functions have already been used by the existing kernel tcp-cc. If any of them has changed, both in-tree and out-of-tree kernel tcp-cc implementations have to be changed. The same goes for the struct_ops bpf-tcp-cc programs which have to be adjusted accordingly. This patch is to make the required changes in the bpf verifier. First change is in btf.c, it adds a case in "btf_check_func_arg_match()". When the passed in "btf->kernel_btf == true", it means matching the verifier regs' states with a kernel function. This will handle the PTR_TO_BTF_ID reg. It also maps PTR_TO_SOCK_COMMON, PTR_TO_SOCKET, and PTR_TO_TCP_SOCK to its kernel's btf_id. In the later libbpf patch, the insn calling a kernel function will look like: insn->code == (BPF_JMP | BPF_CALL) insn->src_reg == BPF_PSEUDO_KFUNC_CALL /* <- new in this patch */ insn->imm == func_btf_id /* btf_id of the running kernel */ [ For the future calling function-in-kernel-module support, an array of module btf_fds can be passed at the load time and insn->off can be used to index into this array. ] At the early stage of verifier, the verifier will collect all kernel function calls into "struct bpf_kfunc_desc". Those descriptors are stored in "prog->aux->kfunc_tab" and will be available to the JIT. Since this "add" operation is similar to the current "add_subprog()" and looking for the same insn->code, they are done together in the new "add_subprog_and_kfunc()". In the "do_check()" stage, the new "check_kfunc_call()" is added to verify the kernel function call instruction: 1. Ensure the kernel function can be used by a particular BPF_PROG_TYPE. A new bpf_verifier_ops "check_kfunc_call" is added to do that. The bpf-tcp-cc struct_ops program will implement this function in a later patch. 2. Call "btf_check_kfunc_args_match()" to ensure the regs can be used as the args of a kernel function. 3. Mark the regs' type, subreg_def, and zext_dst. At the later do_misc_fixups() stage, the new fixup_kfunc_call() will replace the insn->imm with the function address (relative to __bpf_call_base). If needed, the jit can find the btf_func_model by calling the new bpf_jit_find_kfunc_model(prog, insn). With the imm set to the function address, "bpftool prog dump xlated" will be able to display the kernel function calls the same way as it displays other bpf helper calls. gpl_compatible program is required to call kernel function. This feature currently requires JIT. The verifier selftests are adjusted because of the changes in the verbose log in add_subprog_and_kfunc(). Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210325015142.1544736-1-kafai@fb.com (cherry picked from commit e6ac2450) Signed-off-by: NWang Yufen <wangyufen@huawei.com> Conflicts: include/linux/bpf.h Signed-off-by: NWang Yufen <wangyufen@huawei.com>
-
由 Yonghong Song 提交于
mainline inclusion from mainline-5.13-rc1 commit 69c087ba category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=69c087ba6225b574afb6e505b72cb75242a3d844 ------------------------------------------------- The bpf_for_each_map_elem() helper is introduced which iterates all map elements with a callback function. The helper signature looks like long bpf_for_each_map_elem(map, callback_fn, callback_ctx, flags) and for each map element, the callback_fn will be called. For example, like hashmap, the callback signature may look like long callback_fn(map, key, val, callback_ctx) There are two known use cases for this. One is from upstream ([1]) where a for_each_map_elem helper may help implement a timeout mechanism in a more generic way. Another is from our internal discussion for a firewall use case where a map contains all the rules. The packet data can be compared to all these rules to decide allow or deny the packet. For array maps, users can already use a bounded loop to traverse elements. Using this helper can avoid using bounded loop. For other type of maps (e.g., hash maps) where bounded loop is hard or impossible to use, this helper provides a convenient way to operate on all elements. For callback_fn, besides map and map element, a callback_ctx, allocated on caller stack, is also passed to the callback function. This callback_ctx argument can provide additional input and allow to write to caller stack for output. If the callback_fn returns 0, the helper will iterate through next element if available. If the callback_fn returns 1, the helper will stop iterating and returns to the bpf program. Other return values are not used for now. Currently, this helper is only available with jit. It is possible to make it work with interpreter with so effort but I leave it as the future work. [1]: https://lore.kernel.org/bpf/20210122205415.113822-1-xiyou.wangcong@gmail.com/Signed-off-by: NYonghong Song <yhs@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210226204925.3884923-1-yhs@fb.com (cherry picked from commit 69c087ba) Signed-off-by: NWang Yufen <wangyufen@huawei.com> Conflicts: include/linux/bpf.h kernel/bpf/verifier.c Signed-off-by: NWang Yufen <wangyufen@huawei.com>
-
由 Alexei Starovoitov 提交于
mainline inclusion from mainline-5.12-rc1 commit 9ed9e9ba category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9ed9e9ba2337205311398a312796c213737bac35 ------------------------------------------------- Add per-program counter for number of times recursion prevention mechanism was triggered and expose it via show_fdinfo and bpf_prog_info. Teach bpftool to print it. Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210210033634.62081-7-alexei.starovoitov@gmail.com (cherry picked from commit 9ed9e9ba) Signed-off-by: NWang Yufen <wangyufen@huawei.com>
-
由 Brendan Jackman 提交于
mainline inclusion from mainline-5.12-rc1 commit 5ffa2550 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ffa25502b5ab3d639829a2d1e316cff7f59a41e ------------------------------------------------- This adds two atomic opcodes, both of which include the BPF_FETCH flag. XCHG without the BPF_FETCH flag would naturally encode atomic_set. This is not supported because it would be of limited value to userspace (it doesn't imply any barriers). CMPXCHG without BPF_FETCH woulud be an atomic compare-and-write. We don't have such an operation in the kernel so it isn't provided to BPF either. There are two significant design decisions made for the CMPXCHG instruction: - To solve the issue that this operation fundamentally has 3 operands, but we only have two register fields. Therefore the operand we compare against (the kernel's API calls it 'old') is hard-coded to be R0. x86 has similar design (and A64 doesn't have this problem). A potential alternative might be to encode the other operand's register number in the immediate field. - The kernel's atomic_cmpxchg returns the old value, while the C11 userspace APIs return a boolean indicating the comparison result. Which should BPF do? A64 returns the old value. x86 returns the old value in the hard-coded register (and also sets a flag). That means return-old-value is easier to JIT, so that's what we use. Signed-off-by: NBrendan Jackman <jackmanb@google.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NYonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210114181751.768687-8-jackmanb@google.com (cherry picked from commit 5ffa2550) Signed-off-by: NWang Yufen <wangyufen@huawei.com>
-
由 Brendan Jackman 提交于
mainline inclusion from mainline-5.12-rc1 commit 5ca419f2 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ca419f2864a2c60940dcf4bbaeb69546200e36f ------------------------------------------------- The BPF_FETCH field can be set in bpf_insn.imm, for BPF_ATOMIC instructions, in order to have the previous value of the atomically-modified memory location loaded into the src register after an atomic op is carried out. Suggested-by: NYonghong Song <yhs@fb.com> Signed-off-by: NBrendan Jackman <jackmanb@google.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210114181751.768687-7-jackmanb@google.com (cherry picked from commit 5ca419f2) Signed-off-by: NWang Yufen <wangyufen@huawei.com>
-
由 Brendan Jackman 提交于
mainline inclusion from mainline-5.12-rc1 commit 91c960b0 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=91c960b0056672e74627776655c926388350fa30 ------------------------------------------------- A subsequent patch will add additional atomic operations. These new operations will use the same opcode field as the existing XADD, with the immediate discriminating different operations. In preparation, rename the instruction mode BPF_ATOMIC and start calling the zero immediate BPF_ADD. This is possible (doesn't break existing valid BPF progs) because the immediate field is currently reserved MBZ and BPF_ADD is zero. All uses are removed from the tree but the BPF_XADD definition is kept around to avoid breaking builds for people including kernel headers. Signed-off-by: NBrendan Jackman <jackmanb@google.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NBjörn Töpel <bjorn.topel@gmail.com> Link: https://lore.kernel.org/bpf/20210114181751.768687-5-jackmanb@google.com (cherry picked from commit 91c960b0) Signed-off-by: NWang Yufen <wangyufen@huawei.com> Conflicts: arch/x86/net/bpf_jit_comp.c Signed-off-by: NWang Yufen <wangyufen@huawei.com>
-
由 Brendan Jackman 提交于
mainline inclusion from mainline-5.12-rc1 commit c6458e72 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c6458e72f6fd6ac7e390da0d9abe8446084886e5 ------------------------------------------------- When the buffer is too small to contain the input string, these helpers return the length of the buffer, not the length of the original string. This tries to make the docs totally clear about that, since "the length of the [copied ]string" could also refer to the length of the input. Signed-off-by: NBrendan Jackman <jackmanb@google.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NKP Singh <kpsingh@kernel.org> Acked-by: NYonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210112123422.2011234-1-jackmanb@google.com (cherry picked from commit c6458e72) Signed-off-by: NWang Yufen <wangyufen@huawei.com>
-
由 Andrii Nakryiko 提交于
mainline inclusion from mainline-5.11-rc1 commit 290248a5 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=290248a5b7d829871b3ea3c62578613a580a1744 ------------------------------------------------- Add ability for user-space programs to specify non-vmlinux BTF when attaching BTF-powered BPF programs: raw_tp, fentry/fexit/fmod_ret, LSM, etc. For this, attach_prog_fd (now with the alias name attach_btf_obj_fd) should specify FD of a module or vmlinux BTF object. For backwards compatibility reasons, 0 denotes vmlinux BTF. Only kernel BTF (vmlinux or module) can be specified. Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201203204634.1325171-11-andrii@kernel.org (cherry picked from commit 290248a5) Signed-off-by: NWang Yufen <wangyufen@huawei.com> Conflicts: kernel/bpf/syscall.c Signed-off-by: NWang Yufen <wangyufen@huawei.com>
-
由 Andrii Nakryiko 提交于
mainline inclusion from mainline-5.11-rc1 commit 53297220 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5329722057d41aebc31e391907a501feaa42f7d9 ------------------------------------------------- Allocate ID for vmlinux BTF. This makes it visible when iterating over all BTF objects in the system. To allow distinguishing vmlinux BTF (and later kernel module BTF) from user-provided BTFs, expose extra kernel_btf flag, as well as BTF name ("vmlinux" for vmlinux BTF, will equal to module's name for module BTF). We might want to later allow specifying BTF name for user-provided BTFs as well, if that makes sense. But currently this is reserved only for in-kernel BTFs. Having in-kernel BTFs exposed IDs will allow to extend BPF APIs that require in-kernel BTF type with ability to specify BTF types from kernel modules, not just vmlinux BTF. This will be implemented in a follow up patch set for fentry/fexit/fmod_ret/lsm/etc. Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NSong Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201110011932.3201430-3-andrii@kernel.org (cherry picked from commit 53297220) Signed-off-by: NWang Yufen <wangyufen@huawei.com>
-
由 Lorenz Bauer 提交于
mainline inclusion from mainline-5.13-rc1 commit 7c32e8f8 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7c32e8f8bc33a5f4b113a630857e46634e3e143b ------------------------------------------------- Allow to pass sk_lookup programs to PROG_TEST_RUN. User space provides the full bpf_sk_lookup struct as context. Since the context includes a socket pointer that can't be exposed to user space we define that PROG_TEST_RUN returns the cookie of the selected socket or zero in place of the socket pointer. We don't support testing programs that select a reuseport socket, since this would mean running another (unrelated) BPF program from the sk_lookup test handler. Signed-off-by: NLorenz Bauer <lmb@cloudflare.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210303101816.36774-3-lmb@cloudflare.com (cherry picked from commit 7c32e8f8) Signed-off-by: NWang Yufen <wangyufen@huawei.com>
-
- 22 8月, 2022 9 次提交
-
-
由 Hui Tang 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5KUFB CVE: NA -------------------------------- Add helper funciton to set cpus_ptr in task. Signed-off-by: NHui Tang <tanghui20@huawei.com>
-
由 Hui Tang 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5KUFB CVE: NA -------------------------------- Add helper function to check two cpu whehter share same LLC cache. Signed-off-by: NHui Tang <tanghui20@huawei.com>
-
由 Chen Hui 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5KUFB CVE: NA -------------------------------- Add cpumask ops collection, such as cpumask_empty, cpumask_and, cpumask_andnot, cpumask_subset, cpumask_equal, cpumask_copy. Signed-off-by: NChen Hui <judy.chenhui@huawei.com> Signed-off-by: NHui Tang <tanghui20@huawei.com>
-
由 Ren Zhijie 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5KUFB CVE: NA -------------------------------- Add bpf helper function bpf_init_cpu_topology() which obtains cpu topology info through the macros topology_* that are defined by include/linux/topology.h, and save it in BPF MAP. The cpu topology info are useful to select core in userspace. Signed-off-by: NChen Hui <judy.chenhui@huawei.com> Signed-off-by: NRen Zhijie <renzhijie2@huawei.com>
-
由 Chen Hui 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5KUFB CVE: NA -------------------------------- Add four helper functions to get cpu stat, as follows: 1.acquire cfs/rt/irq cpu load statitic. 2.acquire multiple types of nr_running statitic. 3.acquire cpu idle statitic. 4.acquire cpu capacity. Based on CPU statistics in different dimensions, specific scheduling policies can be implemented in bpf program. Signed-off-by: NChen Hui <judy.chenhui@huawei.com> Signed-off-by: NHui Tang <tanghui20@huawei.com> Signed-off-by: NRen Zhijie <renzhijie2@huawei.com>
-
由 Ren Zhijie 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5KUFB CVE: NA -------------------------------- Add helper function bpf_sched_set_tg_tag() and bpf_sched_set_task_tag() to set tag for task group or task. They can not be call when rq->lock has been held. The use case is that the other kernel subsystems, such as the network, can use it to mark key tasks. Signed-off-by: NRen Zhijie <renzhijie2@huawei.com>
-
由 Chen Hui 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5KUFB CVE: NA -------------------------------- Add three helper functions: 1) bpf_sched_entity_is_task is to check whether the sched entity is a task struct. 2) bpf_sched_entity_to_task is to change the sched entity to a task struct. 3) bpf_sched_entity_to_tg is to change the sched entity to a task group. Signed-off-by: NChen Hui <judy.chenhui@huawei.com> Signed-off-by: NRen Zhijie <renzhijie2@huawei.com>
-
由 Chen Hui 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5KUFB CVE: NA -------------------------------- This helper function read the tag of the struct task. The bpf prog obtains the tags to detect different workloads. Signed-off-by: NChen Hui <judy.chenhui@huawei.com> Signed-off-by: NRen Zhijie <renzhijie2@huawei.com>
-
由 Ren Zhijie 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5KUFB CVE: NA -------------------------------- This helper function read the task group tag for a task. The bpf prog obtains the tags to detect different workloads. Signed-off-by: NRen Zhijie <renzhijie2@huawei.com> Signed-off-by: NChen Hui <judy.chenhui@huawei.com>
-
- 27 7月, 2022 2 次提交
-
-
由 Roman Gushchin 提交于
maillist inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5F6X6 CVE: NA Reference: https://lore.kernel.org/all/20210916162451.709260-1-guro@fb.com/ ------------------- This patch adds 3 helpers useful for dealing with sched entities: u64 bpf_sched_entity_to_tgidpid(struct sched_entity *se); u64 bpf_sched_entity_to_cgrpid(struct sched_entity *se); long bpf_sched_entity_belongs_to_cgrp(struct sched_entity *se, u64 cgrpid); Sched entity is a basic structure used by the scheduler to represent schedulable objects: tasks and cgroups (if CONFIG_FAIR_GROUP_SCHED is enabled). It will be passed as an argument to many bpf hooks, so scheduler bpf programs need a convenient way to deal with it. bpf_sched_entity_to_tgidpid() and bpf_sched_entity_to_cgrpid() are useful to identify a sched entity in userspace terms (pid, tgid and cgroup id). bpf_sched_entity_belongs_to_cgrp() allows to check whether a sched entity belongs to sub-tree of a cgroup. It allows to write cgroup-specific scheduler policies even without enabling the cgroup cpu controller. Signed-off-by: NRoman Gushchin <guro@fb.com> Signed-off-by: NChen Hui <judy.chenhui@huawei.com> Signed-off-by: NRen Zhijie <renzhijie2@huawei.com>
-
由 Roman Gushchin 提交于
maillist inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5F6X6 CVE: NA Reference: https://lore.kernel.org/all/20210916162451.709260-1-guro@fb.com/ ------------------- This commit introduces basic definitions and infrastructure for scheduler bpf programs. It defines the BPF_PROG_TYPE_SCHED program type and the BPF_SCHED attachment type. The implementation is inspired by lsm bpf programs and is based on kretprobes. This will allow to add new hooks with a minimal changes to the kernel code and without any changes to libbpf/bpftool. It's very convenient as I anticipate a large number of private patches being used for a long time before (or if at all) reaching upstream. Sched programs are expected to return an int, which meaning will be context defined. This patch doesn't add any real scheduler hooks (only a stub), it will be done by following patches in the series. Scheduler bpf programs as now are very restricted in what they can do: only the bpf_printk() helper is available. The scheduler context can impose significant restrictions on what's safe and what's not. So let's extend their abilities on case by case basis when a need arise. Signed-off-by: NRoman Gushchin <guro@fb.com> Signed-off-by: NChen Hui <judy.chenhui@huawei.com> Signed-off-by: NRen Zhijie <renzhijie2@huawei.com>
-
- 18 7月, 2022 1 次提交
-
-
由 Jakub Sitnicki 提交于
stable inclusion from stable-v5.10.111 commit 995f517888687c0730bc3d2dbca424c27350eaa7 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5GL1Z Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=995f517888687c0730bc3d2dbca424c27350eaa7 -------------------------------- [ Upstream commit 4421a582 ] Menglong Dong reports that the documentation for the dst_port field in struct bpf_sock is inaccurate and confusing. From the BPF program PoV, the field is a zero-padded 16-bit integer in network byte order. The value appears to the BPF user as if laid out in memory as so: offsetof(struct bpf_sock, dst_port) + 0 <port MSB> + 8 <port LSB> +16 0x00 +24 0x00 32-, 16-, and 8-bit wide loads from the field are all allowed, but only if the offset into the field is 0. 32-bit wide loads from dst_port are especially confusing. The loaded value, after converting to host byte order with bpf_ntohl(dst_port), contains the port number in the upper 16-bits. Remove the confusion by splitting the field into two 16-bit fields. For backward compatibility, allow 32-bit wide loads from offsetof(struct bpf_sock, dst_port). While at it, allow loads 8-bit loads at offset [0] and [1] from dst_port. Reported-by: NMenglong Dong <imagedong@tencent.com> Signed-off-by: NJakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/r/20220130115518.213259-2-jakub@cloudflare.comSigned-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
- 06 7月, 2022 2 次提交
-
-
由 Hengqi Chen 提交于
stable inclusion from stable-v5.10.110 commit 73f2f37417b035d9607888be4fd23a9e709a85c6 bugzilla: https://gitee.com/openeuler/kernel/issues/I574AL Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=73f2f37417b035d9607888be4fd23a9e709a85c6 -------------------------------- commit 58617014 upstream. Fix the descriptions of the return values of helper bpf_current_task_under_cgroup(). Fixes: c6b5fb86 ("bpf: add documentation for eBPF helpers (42-50)") Signed-off-by: NHengqi Chen <hengqi.chen@gmail.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NYonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220310155335.1278783-1-hengqi.chen@gmail.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYu Liao <liaoyu15@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Namhyung Kim 提交于
stable inclusion from stable-v5.10.110 commit 90805175a206f784b6a77f16f07b07f6803e286b bugzilla: https://gitee.com/openeuler/kernel/issues/I574AL Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=90805175a206f784b6a77f16f07b07f6803e286b -------------------------------- commit ee2a0988 upstream. Let's say that the caller has storage for num_elem stack frames. Then, the BPF stack helper functions walk the stack for only num_elem frames. This means that if skip > 0, one keeps only 'num_elem - skip' frames. This is because it sets init_nr in the perf_callchain_entry to the end of the buffer to save num_elem entries only. I believe it was because the perf callchain code unwound the stack frames until it reached the global max size (sysctl_perf_event_max_stack). However it now has perf_callchain_entry_ctx.max_stack to limit the iteration locally. This simplifies the code to handle init_nr in the BPF callstack entries and removes the confusion with the perf_event's __PERF_SAMPLE_CALLCHAIN_EARLY which sets init_nr to 0. Also change the comment on bpf_get_stack() in the header file to be more explicit what the return value means. Fixes: c195651e ("bpf: add bpf_get_stack helper") Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NYonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/30a7b5d5-6726-1cc2-eaee-8da2828a9a9c@oracle.com Link: https://lore.kernel.org/bpf/20220314182042.71025-1-namhyung@kernel.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Based-on-patch-by: NEugene Loh <eugene.loh@oracle.com> Signed-off-by: NYu Liao <liaoyu15@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 10 5月, 2022 2 次提交
-
-
由 Liu Jian 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I545NW CVE: NA -------------------------------- Add new optname(BPF_SO_ORIGINAL_DST 800, BPF_SO_REPLY_SRC 801) to get origdst/reply src for bpf progs. Now only support IPv4. Signed-off-by: NWang Yufen <wangyufen@huawei.com> Signed-off-by: NLiu Jian <liujian56@huawei.com> Reviewed-by: NWei Yongjun <weiyongjun1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Liu Jian 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I545NW CVE: NA -------------------------------- Add the function for bpf sock_ops hook to get sock's uid and gid. Signed-off-by: NLiu Jian <liujian56@huawei.com> Reviewed-by: NWei Yongjun <weiyongjun1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 07 1月, 2022 1 次提交
-
-
由 Dave Marchevsky 提交于
mainline inclusion from mainline-v5.15-rc1 commit 6fc88c35 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4GII8?from=project-issue CVE: NA ---------- Add an enum (cgroup_bpf_attach_type) containing only valid cgroup_bpf attach types and a function to map bpf_attach_type values to the new enum. Inspired by netns_bpf_attach_type. Then, migrate cgroup_bpf to use cgroup_bpf_attach_type wherever possible. Functionality is unchanged as attach_type_to_prog_type switches in bpf/syscall.c were preventing non-cgroup programs from making use of the invalid cgroup_bpf array slots. As a result struct cgroup_bpf uses 504 fewer bytes relative to when its arrays were sized using MAX_BPF_ATTACH_TYPE. bpf_cgroup_storage is notably not migrated as struct bpf_cgroup_storage_key is part of uapi and contains a bpf_attach_type member which is not meant to be opaque. Similarly, bpf_cgroup_link continues to report its bpf_attach_type member to userspace via fdinfo and bpf_link_info. To ease disambiguation, bpf_attach_type variables are renamed from 'type' to 'atype' when changed to cgroup_bpf_attach_type. Signed-off-by: NDave Marchevsky <davemarchevsky@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210819092420.1984861-2-davemarchevsky@fb.com Conflicts: include/linux/bpf-cgroup.h kernel/bpf/cgroup.c net/ipv4/af_inet.c net/ipv6/af_inet6.c Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: Wei Yongjun<weiyongjun1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 19 10月, 2021 1 次提交
-
-
由 Kuniyuki Iwashima 提交于
stable inclusion from stable-5.10.65 commit d4213b70931640a327b4693bc3f9b5784f86b6dd bugzilla: 182361 https://gitee.com/openeuler/kernel/issues/I4EH3U Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d4213b70931640a327b4693bc3f9b5784f86b6dd -------------------------------- [ Upstream commit f170acda ] Fix s/BPF_MAP_TYPE_REUSEPORT_ARRAY/BPF_MAP_TYPE_REUSEPORT_SOCKARRAY/ typo in bpf.h. Fixes: 2dbb9b9e ("bpf: Introduce BPF_PROG_TYPE_SK_REUSEPORT") Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NMartin KaFai Lau <kafai@fb.com> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210714124317.67526-1-kuniyu@amazon.co.jpSigned-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 12 12月, 2020 1 次提交
-
-
由 Andrii Nakryiko 提交于
Remove bpf_ prefix, which causes these helpers to be reported in verifier dump as bpf_bpf_this_cpu_ptr() and bpf_bpf_per_cpu_ptr(), respectively. Lets fix it as long as it is still possible before UAPI freezes on these helpers. Fixes: eaa6bcb7 ("bpf: Introduce bpf_per_cpu_ptr()") Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 10月, 2020 1 次提交
-
-
由 Toke Høiland-Jørgensen 提交于
Based on the discussion in [0], update the bpf_redirect_neigh() helper to accept an optional parameter specifying the nexthop information. This makes it possible to combine bpf_fib_lookup() and bpf_redirect_neigh() without incurring a duplicate FIB lookup - since the FIB lookup helper will return the nexthop information even if no neighbour is present, this can simply be passed on to bpf_redirect_neigh() if bpf_fib_lookup() returns BPF_FIB_LKUP_RET_NO_NEIGH. Thus fix & extend it before helper API is frozen. [0] https://lore.kernel.org/bpf/393e17fc-d187-3a8d-2f0d-a627c7c63fca@iogearbox.net/Signed-off-by: NToke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/bpf/160322915615.32199.1187570224032024535.stgit@toke.dk
-
- 12 10月, 2020 3 次提交
-
-
由 Daniel Borkmann 提交于
Recent work in f4d05259 ("bpf: Add map_meta_equal map ops") and 134fede4 ("bpf: Relax max_entries check for most of the inner map types") added support for dynamic inner max elements for most map-in-map types. Exceptions were maps like array or prog array where the map_gen_lookup() callback uses the maps' max_entries field as a constant when emitting instructions. We recently implemented Maglev consistent hashing into Cilium's load balancer which uses map-in-map with an outer map being hash and inner being array holding the Maglev backend table for each service. This has been designed this way in order to reduce overall memory consumption given the outer hash map allows to avoid preallocating a large, flat memory area for all services. Also, the number of service mappings is not always known a-priori. The use case for dynamic inner array map entries is to further reduce memory overhead, for example, some services might just have a small number of back ends while others could have a large number. Right now the Maglev backend table for small and large number of backends would need to have the same inner array map entries which adds a lot of unneeded overhead. Dynamic inner array map entries can be realized by avoiding the inlined code generation for their lookup. The lookup will still be efficient since it will be calling into array_map_lookup_elem() directly and thus avoiding retpoline. The patch adds a BPF_F_INNER_MAP flag to map creation which therefore skips inline code generation and relaxes array_map_meta_equal() check to ignore both maps' max_entries. This also still allows to have faster lookups for map-in-map when BPF_F_INNER_MAP is not specified and hence dynamic max_entries not needed. Example code generation where inner map is dynamic sized array: # bpftool p d x i 125 int handle__sys_enter(void * ctx): ; int handle__sys_enter(void *ctx) 0: (b4) w1 = 0 ; int key = 0; 1: (63) *(u32 *)(r10 -4) = r1 2: (bf) r2 = r10 ; 3: (07) r2 += -4 ; inner_map = bpf_map_lookup_elem(&outer_arr_dyn, &key); 4: (18) r1 = map[id:468] 6: (07) r1 += 272 7: (61) r0 = *(u32 *)(r2 +0) 8: (35) if r0 >= 0x3 goto pc+5 9: (67) r0 <<= 3 10: (0f) r0 += r1 11: (79) r0 = *(u64 *)(r0 +0) 12: (15) if r0 == 0x0 goto pc+1 13: (05) goto pc+1 14: (b7) r0 = 0 15: (b4) w6 = -1 ; if (!inner_map) 16: (15) if r0 == 0x0 goto pc+6 17: (bf) r2 = r10 ; 18: (07) r2 += -4 ; val = bpf_map_lookup_elem(inner_map, &key); 19: (bf) r1 = r0 | No inlining but instead 20: (85) call array_map_lookup_elem#149280 | call to array_map_lookup_elem() ; return val ? *val : -1; | for inner array lookup. 21: (15) if r0 == 0x0 goto pc+1 ; return val ? *val : -1; 22: (61) r6 = *(u32 *)(r0 +0) ; } 23: (bc) w0 = w6 24: (95) exit Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201010234006.7075-4-daniel@iogearbox.net
-
由 Daniel Borkmann 提交于
Add an efficient ingress to ingress netns switch that can be used out of tc BPF programs in order to redirect traffic from host ns ingress into a container veth device ingress without having to go via CPU backlog queue [0]. For local containers this can also be utilized and path via CPU backlog queue only needs to be taken once, not twice. On a high level this borrows from ipvlan which does similar switch in __netif_receive_skb_core() and then iterates via another_round. This helps to reduce latency for mentioned use cases. Pod to remote pod with redirect(), TCP_RR [1]: # percpu_netperf 10.217.1.33 RT_LATENCY: 122.450 (per CPU: 122.666 122.401 122.333 122.401 ) MEAN_LATENCY: 121.210 (per CPU: 121.100 121.260 121.320 121.160 ) STDDEV_LATENCY: 120.040 (per CPU: 119.420 119.910 125.460 115.370 ) MIN_LATENCY: 46.500 (per CPU: 47.000 47.000 47.000 45.000 ) P50_LATENCY: 118.500 (per CPU: 118.000 119.000 118.000 119.000 ) P90_LATENCY: 127.500 (per CPU: 127.000 128.000 127.000 128.000 ) P99_LATENCY: 130.750 (per CPU: 131.000 131.000 129.000 132.000 ) TRANSACTION_RATE: 32666.400 (per CPU: 8152.200 8169.842 8174.439 8169.897 ) Pod to remote pod with redirect_peer(), TCP_RR: # percpu_netperf 10.217.1.33 RT_LATENCY: 44.449 (per CPU: 43.767 43.127 45.279 45.622 ) MEAN_LATENCY: 45.065 (per CPU: 44.030 45.530 45.190 45.510 ) STDDEV_LATENCY: 84.823 (per CPU: 66.770 97.290 84.380 90.850 ) MIN_LATENCY: 33.500 (per CPU: 33.000 33.000 34.000 34.000 ) P50_LATENCY: 43.250 (per CPU: 43.000 43.000 43.000 44.000 ) P90_LATENCY: 46.750 (per CPU: 46.000 47.000 47.000 47.000 ) P99_LATENCY: 52.750 (per CPU: 51.000 54.000 53.000 53.000 ) TRANSACTION_RATE: 90039.500 (per CPU: 22848.186 23187.089 22085.077 21919.130 ) [0] https://linuxplumbersconf.org/event/7/contributions/674/attachments/568/1002/plumbers_2020_cilium_load_balancer.pdf [1] https://github.com/borkmann/netperf_scripts/blob/master/percpu_netperfSigned-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201010234006.7075-3-daniel@iogearbox.net -
由 Daniel Borkmann 提交于
Follow-up to address David's feedback that we should better describe internals of the bpf_redirect_neigh() helper. Suggested-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Reviewed-by: NDavid Ahern <dsahern@gmail.com> Link: https://lore.kernel.org/bpf/20201010234006.7075-2-daniel@iogearbox.net
-
- 09 10月, 2020 1 次提交
-
-
由 Nikita V. Shirokov 提交于
Adding support for TCP_NOTSENT_LOWAT sockoption (https://lwn.net/Articles/560082/) in tcp bpf programs. Signed-off-by: NNikita V. Shirokov <tehnerd@tehnerd.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NYonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20201009070325.226855-1-tehnerd@tehnerd.com
-
- 08 10月, 2020 1 次提交
-
-
由 Jakub Wilk 提交于
Reported-by: NSamanta Navarro <ferivoz@riseup.net> Signed-off-by: NJakub Wilk <jwilk@jwilk.net> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NYonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20201007055717.7319-1-jwilk@jwilk.net
-
- 03 10月, 2020 1 次提交
-
-
由 Hao Luo 提交于
Add bpf_this_cpu_ptr() to help access percpu var on this cpu. This helper always returns a valid pointer, therefore no need to check returned value for NULL. Also note that all programs run with preemption disabled, which means that the returned pointer is stable during all the execution of the program. Signed-off-by: NHao Luo <haoluo@google.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NAndrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200929235049.2533242-6-haoluo@google.com
-