- 01 6月, 2017 1 次提交
-
-
由 Alexei Starovoitov 提交于
free up BPF_JMP | BPF_CALL | BPF_X opcode to be used by actual indirect call by register and use kernel internal opcode to mark call instruction into bpf_tail_call() helper. Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 2月, 2017 2 次提交
-
-
由 Daniel Borkmann 提交于
Long standing issue with JITed programs is that stack traces from function tracing check whether a given address is kernel code through {__,}kernel_text_address(), which checks for code in core kernel, modules and dynamically allocated ftrace trampolines. But what is still missing is BPF JITed programs (interpreted programs are not an issue as __bpf_prog_run() will be attributed to them), thus when a stack trace is triggered, the code walking the stack won't see any of the JITed ones. The same for address correlation done from user space via reading /proc/kallsyms. This is read by tools like perf, but the latter is also useful for permanent live tracing with eBPF itself in combination with stack maps when other eBPF types are part of the callchain. See offwaketime example on dumping stack from a map. This work tries to tackle that issue by making the addresses and symbols known to the kernel. The lookup from *kernel_text_address() is implemented through a latched RB tree that can be read under RCU in fast-path that is also shared for symbol/size/offset lookup for a specific given address in kallsyms. The slow-path iteration through all symbols in the seq file done via RCU list, which holds a tiny fraction of all exported ksyms, usually below 0.1 percent. Function symbols are exported as bpf_prog_<tag>, in order to aide debugging and attribution. This facility is currently enabled for root-only when bpf_jit_kallsyms is set to 1, and disabled if hardening is active in any mode. The rationale behind this is that still a lot of systems ship with world read permissions on kallsyms thus addresses should not get suddenly exposed for them. If that situation gets much better in future, we always have the option to change the default on this. Likewise, unprivileged programs are not allowed to add entries there either, but that is less of a concern as most such programs types relevant in this context are for root-only anyway. If enabled, call graphs and stack traces will then show a correct attribution; one example is illustrated below, where the trace is now visible in tooling such as perf script --kallsyms=/proc/kallsyms and friends. Before: 7fff8166889d bpf_clone_redirect+0x80007f0020ed (/lib/modules/4.9.0-rc8+/build/vmlinux) f5d80 __sendmsg_nocancel+0xffff006451f1a007 (/usr/lib64/libc-2.18.so) After: 7fff816688b7 bpf_clone_redirect+0x80007f002107 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fffa0575728 bpf_prog_33c45a467c9e061a+0x8000600020fb (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fffa07ef1fc cls_bpf_classify+0x8000600020dc (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff81678b68 tc_classify+0x80007f002078 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164d40b __netif_receive_skb_core+0x80007f0025fb (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164d718 __netif_receive_skb+0x80007f002018 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164e565 process_backlog+0x80007f002095 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164dc71 net_rx_action+0x80007f002231 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff81767461 __softirqentry_text_start+0x80007f0020d1 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff817658ac do_softirq_own_stack+0x80007f00201c (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff810a2c20 do_softirq+0x80007f002050 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff810a2cb5 __local_bh_enable_ip+0x80007f002085 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8168d452 ip_finish_output2+0x80007f002152 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8168ea3d ip_finish_output+0x80007f00217d (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8168f2af ip_output+0x80007f00203f (/lib/modules/4.9.0-rc8+/build/vmlinux) [...] 7fff81005854 do_syscall_64+0x80007f002054 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff817649eb return_from_SYSCALL_64+0x80007f002000 (/lib/modules/4.9.0-rc8+/build/vmlinux) f5d80 __sendmsg_nocancel+0xffff01c484812007 (/usr/lib64/libc-2.18.so) Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Remove the dummy bpf_jit_compile() stubs for eBPF JITs and make that a single __weak function in the core that can be overridden similarly to the eBPF one. Also remove stale pr_err() mentions of bpf_jit_compile. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 2月, 2017 1 次提交
-
-
由 Naveen N. Rao 提交于
Introduce __PPC_SH64() as a 64-bit variant to encode shift field in some of the shift and rotate instructions operating on double-words. Convert some of the BPF instruction macros to use the same. Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 25 1月, 2017 2 次提交
-
-
由 Naveen N. Rao 提交于
With bpf_jit_binary_alloc(), we allocate at a page granularity and fill the rest of the space with illegal instructions to mitigate BPF spraying attacks, while having the actual JIT'ed BPF program at a random location within the allocated space. Under this scenario, it would be better to flush the entire allocated buffer rather than just the part containing the actual program. We already flush the buffer from start to the end of the BPF program. Extend this to include the illegal instructions after the BPF program. Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Daniel Borkmann 提交于
We have a check earlier to ensure we don't proceed if image is NULL. As such, the redundant check can be removed. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> [Added similar changes for classic BPF JIT] Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 09 12月, 2016 1 次提交
-
-
由 Martin KaFai Lau 提交于
This patch allows XDP prog to extend/remove the packet data at the head (like adding or removing header). It is done by adding a new XDP helper bpf_xdp_adjust_head(). It also renames bpf_helper_changes_skb_data() to bpf_helper_changes_pkt_data() to better reflect that XDP prog does not work on skb. This patch adds one "xdp_adjust_head" bit to bpf_prog for the XDP-capable driver to check if the XDP prog requires bpf_xdp_adjust_head() support. The driver can then decide to error out during XDP_SETUP_PROG. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NJohn Fastabend <john.r.fastabend@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 10月, 2016 3 次提交
-
-
由 Naveen N. Rao 提交于
In line with similar support for other architectures by Daniel Borkmann. 'MOD Default X' from test_bpf without constant blinding: 84 bytes emitted from JIT compiler (pass:3, flen:7) d0000000058a4688 + <x>: 0: nop 4: nop 8: std r27,-40(r1) c: std r28,-32(r1) 10: xor r8,r8,r8 14: xor r28,r28,r28 18: mr r27,r3 1c: li r8,66 20: cmpwi r28,0 24: bne 0x0000000000000030 28: li r8,0 2c: b 0x0000000000000044 30: divwu r9,r8,r28 34: mullw r9,r28,r9 38: subf r8,r9,r8 3c: rotlwi r8,r8,0 40: li r8,66 44: ld r27,-40(r1) 48: ld r28,-32(r1) 4c: mr r3,r8 50: blr ... and with constant blinding: 140 bytes emitted from JIT compiler (pass:3, flen:11) d00000000bd6ab24 + <x>: 0: nop 4: nop 8: std r27,-40(r1) c: std r28,-32(r1) 10: xor r8,r8,r8 14: xor r28,r28,r28 18: mr r27,r3 1c: lis r2,-22834 20: ori r2,r2,36083 24: rotlwi r2,r2,0 28: xori r2,r2,36017 2c: xoris r2,r2,42702 30: rotlwi r2,r2,0 34: mr r8,r2 38: rotlwi r8,r8,0 3c: cmpwi r28,0 40: bne 0x000000000000004c 44: li r8,0 48: b 0x000000000000007c 4c: divwu r9,r8,r28 50: mullw r9,r28,r9 54: subf r8,r9,r8 58: rotlwi r8,r8,0 5c: lis r2,-17137 60: ori r2,r2,39065 64: rotlwi r2,r2,0 68: xori r2,r2,39131 6c: xoris r2,r2,48399 70: rotlwi r2,r2,0 74: mr r8,r2 78: rotlwi r8,r8,0 7c: ld r27,-40(r1) 80: ld r28,-32(r1) 84: mr r3,r8 88: blr Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Naveen N. Rao 提交于
Tail calls allow JIT'ed eBPF programs to call into other JIT'ed eBPF programs. This can be achieved either by: (1) retaining the stack setup by the first eBPF program and having all subsequent eBPF programs re-using it, or, (2) by unwinding/tearing down the stack and having each eBPF program deal with its own stack as it sees fit. To ensure that this does not create loops, there is a limit to how many tail calls can be done (currently 32). This requires the JIT'ed code to maintain a count of the number of tail calls done so far. Approach (1) is simple, but requires every eBPF program to have (almost) the same prologue/epilogue, regardless of whether they need it. This is inefficient for small eBPF programs which may not sometimes need a prologue at all. As such, to minimize impact of tail call implementation, we use approach (2) here which needs each eBPF program in the chain to use its own prologue/epilogue. This is not ideal when many tail calls are involved and when all the eBPF programs in the chain have similar prologue/epilogue. However, the impact is restricted to programs that do tail calls. Individual eBPF programs are not affected. We maintain the tail call count in a fixed location on the stack and updated tail call count values are passed in through this. The very first eBPF program in a chain sets this up to 0 (the first 2 instructions). Subsequent tail calls skip the first two eBPF JIT instructions to maintain the count. For programs that don't do tail calls themselves, the first two instructions are NOPs. Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Naveen N. Rao 提交于
While at it, ensure that the location of the local save area is consistent whether or not we setup our own stackframe. This property is utilised in the next patch that adds support for tail calls. Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 24 6月, 2016 6 次提交
-
-
由 Naveen N. Rao 提交于
PPC64 eBPF JIT compiler. Enable with: echo 1 > /proc/sys/net/core/bpf_jit_enable or echo 2 > /proc/sys/net/core/bpf_jit_enable ... to see the generated JIT code. This can further be processed with tools/net/bpf_jit_disasm. With CONFIG_TEST_BPF=m and 'modprobe test_bpf': test_bpf: Summary: 305 PASSED, 0 FAILED, [297/297 JIT'ed] ... on both ppc64 BE and LE. The details of the approach are documented through various comments in the code. Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Naveen N. Rao 提交于
Break out classic BPF JIT specifics into a separate header in preparation for eBPF JIT implementation. Note that ppc32 will still need the classic BPF JIT. Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Naveen N. Rao 提交于
1. Per the ISA, ADDIS actually uses RT, rather than RS. Though the result is the same, make the usage clear. 2. The multiply instruction used is a 32-bit multiply. Rename PPC_MUL() to PPC_MULW() to make the same clear. 3. PPC_STW[U] take the entire 16-bit immediate value and do not require word-alignment, per the ISA. Change the macros to use IMM_L(). 4. A few white-space cleanups to satisfy checkpatch.pl. Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Naveen N. Rao 提交于
Since we will be using the rotate immediate instructions for extended BPF JIT, let's introduce macros for the same. And since the shift immediate operations use the rotate immediate instructions, let's redo those macros to use the newly introduced instructions. Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Naveen N. Rao 提交于
Similar to the LI32() optimization, if the value can be represented in 32-bits, use LI32(). Also handle loading a few specific forms of immediate values in an optimum manner. Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Naveen N. Rao 提交于
The existing LI32() macro can sometimes result in a sign-extended 32-bit load that does not clear the top 32-bits properly. As an example, loading 0x7fffffff results in the register containing 0xffffffff7fffffff. While this does not impact classic BPF JIT implementation (since that only uses the lower word for all operations), we would like to share this macro between classic BPF JIT and extended BPF JIT, wherein the entire 64-bit value in the register matters. Fix this by first doing a shifted LI followed by ORI. An additional optimization is with loading values between -32768 to -1, where we now only need a single LI. The new implementation now generates the same or less number of instructions. Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 06 1月, 2016 1 次提交
-
-
由 Rabin Vincent 提交于
The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data instructions since it XORs A with X while all the others replace A with some loaded value. All the BPF JITs fail to clear A if this is used as the first instruction in a filter. This was found using american fuzzy lop. Add a helper to determine if A needs to be cleared given the first instruction in a filter, and use this in the JITs. Except for ARM, the rest have only been compile-tested. Fixes: 34805931 ("net: filter: get rid of BPF_S_* enum") Signed-off-by: NRabin Vincent <rabin@rab.in> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 10月, 2015 1 次提交
-
-
由 Daniel Borkmann 提交于
As we need to add further flags to the bpf_prog structure, lets migrate both bools to a bitfield representation. The size of the base structure (excluding insns) remains unchanged at 40 bytes. Add also tags for the kmemchecker, so that it doesn't throw false positives. Even in case gcc would generate suboptimal code, it's not being accessed in performance critical paths. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@plumgrid.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 2月, 2015 3 次提交
-
-
由 Denis Kirjanov 提交于
Signed-off-by: NDenis Kirjanov <kda@linux-powerpc.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Denis Kirjanov 提交于
Signed-off-by: NDenis Kirjanov <kda@linux-powerpc.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Denis Kirjanov 提交于
Use helpers from the asm-compat.h to wrap up assembly mnemonics Signed-off-by: NDenis Kirjanov <kda@linux-powerpc.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 1月, 2015 1 次提交
-
-
由 Rusty Russell 提交于
Nothing needs the module pointer any more, and the next patch will call it from RCU, where the module itself might no longer exist. Removing the arg is the safest approach. This just codifies the use of the module_alloc/module_free pattern which ftrace and bpf use. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Acked-by: NAlexei Starovoitov <ast@kernel.org> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Ley Foon Tan <lftan@altera.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: x86@kernel.org Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: linux-cris-kernel@axis.com Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: nios2-dev@lists.rocketboards.org Cc: linuxppc-dev@lists.ozlabs.org Cc: sparclinux@vger.kernel.org Cc: netdev@vger.kernel.org
-
- 19 11月, 2014 1 次提交
-
-
由 Denis Kirjanov 提交于
Reduce duplicated code by unifying BPF_ALU | BPF_MOD | BPF_X and BPF_ALU | BPF_DIV | BPF_X CC: Alexei Starovoitov<alexei.starovoitov@gmail.com> CC: Daniel Borkmann<dborkman@redhat.com> CC: Philippe Bergheaud<felix@linux.vnet.ibm.com> Signed-off-by: NDenis Kirjanov <kda@linux-powerpc.org> Acked-by: NAlexei Starovoitov <ast@plumgrid.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 11月, 2014 1 次提交
-
-
由 Denis Kirjanov 提交于
Add BPF extension SKF_AD_HATYPE to ppc JIT to check the hw type of the interface Before: [ 57.723666] test_bpf: #20 LD_HATYPE [ 57.723675] BPF filter opcode 0020 (@0) unsupported [ 57.724168] 48 48 PASS After: [ 103.053184] test_bpf: #20 LD_HATYPE 7 6 PASS CC: Alexei Starovoitov<alexei.starovoitov@gmail.com> CC: Daniel Borkmann<dborkman@redhat.com> CC: Philippe Bergheaud<felix@linux.vnet.ibm.com> Signed-off-by: NDenis Kirjanov <kda@linux-powerpc.org> v2: address Alexei's comments Acked-by: NAlexei Starovoitov <ast@plumgrid.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 11月, 2014 1 次提交
-
-
由 Denis Kirjanov 提交于
Add BPF extension SKF_AD_PKTTYPE to ppc JIT to load skb->pkt_type field. Before: [ 88.262622] test_bpf: #11 LD_IND_NET 86 97 99 PASS [ 88.265740] test_bpf: #12 LD_PKTTYPE 109 107 PASS After: [ 80.605964] test_bpf: #11 LD_IND_NET 44 40 39 PASS [ 80.607370] test_bpf: #12 LD_PKTTYPE 9 9 PASS CC: Alexei Starovoitov<alexei.starovoitov@gmail.com> CC: Michael Ellerman<mpe@ellerman.id.au> Cc: Matt Evans <matt@ozlabs.org> Signed-off-by: NDenis Kirjanov <kda@linux-powerpc.org> v2: Added test rusults Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 9月, 2014 1 次提交
-
-
由 Daniel Borkmann 提交于
Reported by Mikulas Patocka, kmemcheck currently barks out a false positive since we don't have special kmemcheck annotation for bitfields used in bpf_prog structure. We currently have jited:1, len:31 and thus when accessing len while CONFIG_KMEMCHECK enabled, kmemcheck throws a warning that we're reading uninitialized memory. As we don't need the whole bit universe for pages member, we can just split it to u16 and use a bool flag for jited instead of a bitfield. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Acked-by: NAlexei Starovoitov <ast@plumgrid.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 9月, 2014 1 次提交
-
-
由 Daniel Borkmann 提交于
With eBPF getting more extended and exposure to user space is on it's way, hardening the memory range the interpreter uses to steer its command flow seems appropriate. This patch moves the to be interpreted bytecode to read-only pages. In case we execute a corrupted BPF interpreter image for some reason e.g. caused by an attacker which got past a verifier stage, it would not only provide arbitrary read/write memory access but arbitrary function calls as well. After setting up the BPF interpreter image, its contents do not change until destruction time, thus we can setup the image on immutable made pages in order to mitigate modifications to that code. The idea is derived from commit 314beb9b ("x86: bpf_jit_comp: secure bpf jit against spraying attacks"). This is possible because bpf_prog is not part of sk_filter anymore. After setup bpf_prog cannot be altered during its life-time. This prevents any modifications to the entire bpf_prog structure (incl. function/JIT image pointer). Every eBPF program (including classic BPF that are migrated) have to call bpf_prog_select_runtime() to select either interpreter or a JIT image as a last setup step, and they all are being freed via bpf_prog_free(), including non-JIT. Therefore, we can easily integrate this into the eBPF life-time, plus since we directly allocate a bpf_prog, we have no performance penalty. Tested with seccomp and test_bpf testsuite in JIT/non-JIT mode and manual inspection of kernel_page_tables. Brad Spengler proposed the same idea via Twitter during development of this patch. Joint work with Hannes Frederic Sowa. Suggested-by: NBrad Spengler <spender@grsecurity.net> Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: NAlexei Starovoitov <ast@plumgrid.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 8月, 2014 1 次提交
-
-
由 Alexei Starovoitov 提交于
clean up names related to socket filtering and bpf in the following way: - everything that deals with sockets keeps 'sk_*' prefix - everything that is pure BPF is changed to 'bpf_*' prefix split 'struct sk_filter' into struct sk_filter { atomic_t refcnt; struct rcu_head rcu; struct bpf_prog *prog; }; and struct bpf_prog { u32 jited:1, len:31; struct sock_fprog_kern *orig_prog; unsigned int (*bpf_func)(const struct sk_buff *skb, const struct bpf_insn *filter); union { struct sock_filter insns[0]; struct bpf_insn insnsi[0]; struct work_struct work; }; }; so that 'struct bpf_prog' can be used independent of sockets and cleans up 'unattached' bpf use cases split SK_RUN_FILTER macro into: SK_RUN_FILTER to be used with 'struct sk_filter *' and BPF_PROG_RUN to be used with 'struct bpf_prog *' __sk_filter_release(struct sk_filter *) gains __bpf_prog_release(struct bpf_prog *) helper function also perform related renames for the functions that work with 'struct bpf_prog *', since they're on the same lines: sk_filter_size -> bpf_prog_size sk_filter_select_runtime -> bpf_prog_select_runtime sk_filter_free -> bpf_prog_free sk_unattached_filter_create -> bpf_prog_create sk_unattached_filter_destroy -> bpf_prog_destroy sk_store_orig_filter -> bpf_prog_store_orig_filter sk_release_orig_filter -> bpf_release_orig_filter __sk_migrate_filter -> bpf_migrate_filter __sk_prepare_filter -> bpf_prepare_filter API for attaching classic BPF to a socket stays the same: sk_attach_filter(prog, struct sock *)/sk_detach_filter(struct sock *) and SK_RUN_FILTER(struct sk_filter *, ctx) to execute a program which is used by sockets, tun, af_packet API for 'unattached' BPF programs becomes: bpf_prog_create(struct bpf_prog **)/bpf_prog_destroy(struct bpf_prog *) and BPF_PROG_RUN(struct bpf_prog *, ctx) to execute a program which is used by isdn, ppp, team, seccomp, ptp, xt_bpf, cls_bpf, test_bpf Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 6月, 2014 2 次提交
-
-
由 Denis Kirjanov 提交于
We have to return the boolean here if the tag presents or not, not just ANDing the TCI with the mask which results to: [ 709.412097] test_bpf: #18 LD_VLAN_TAG_PRESENT [ 709.412245] ret 4096 != 1 [ 709.412332] ret 4096 != 1 [ 709.412333] FAIL (2 times) Signed-off-by: NDenis Kirjanov <kda@linux-powerpc.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Denis Kirjanov 提交于
To get a full tag (and not just a VID) we should access the TCI except the VLAN_TAG_PRESENT field (which means that 802.1q header is present). Also ensure that the VLAN_TAG_PRESENT stay on its place Signed-off-by: NDenis Kirjanov <kda@linux-powerpc.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 6月, 2014 1 次提交
-
-
由 Daniel Borkmann 提交于
This patch finally allows us to get rid of the BPF_S_* enum. Currently, the code performs unnecessary encode and decode workarounds in seccomp and filter migration itself when a filter is being attached in order to overcome BPF_S_* encoding which is not used anymore by the new interpreter resp. JIT compilers. Keeping it around would mean that also in future we would need to extend and maintain this enum and related encoders/decoders. We can get rid of all that and save us these operations during filter attaching. Naturally, also JIT compilers need to be updated by this. Before JIT conversion is being done, each compiler checks if A is being loaded at startup to obtain information if it needs to emit instructions to clear A first. Since BPF extensions are a subset of BPF_LD | BPF_{W,H,B} | BPF_ABS variants, case statements for extensions can be removed at that point. To ease and minimalize code changes in the classic JITs, we have introduced bpf_anc_helper(). Tested with test_bpf on x86_64 (JIT, int), s390x (JIT, int), arm (JIT, int), i368 (int), ppc64 (JIT, int); for sparc we unfortunately didn't have access, but changes are analogous to the rest. Joint work with Alexei Starovoitov. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Mircea Gherzan <mgherzan@gmail.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: NChema Gonzalez <chemag@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 3月, 2014 1 次提交
-
-
由 Daniel Borkmann 提交于
This patch adds a jited flag into sk_filter struct in order to indicate whether a filter is currently jited or not. The size of sk_filter is not being expanded as the 32 bit 'len' member allows upper bits to be reused since a filter can currently only grow as large as BPF_MAXINSNS. Therefore, there's enough room also for other in future needed flags to reuse 'len' field if necessary. The jited flag also allows for having alternative interpreter functions running as currently, we can only detect jit compiled filters by testing fp->bpf_func to not equal the address of sk_run_filter(). Joint work with Alexei Starovoitov. Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com> Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 3月, 2014 1 次提交
-
-
由 Tom Herbert 提交于
The packet hash can be considered a property of the packet, not just on RX path. This patch changes name of rxhash and l4_rxhash skbuff fields to be hash and l4_hash respectively. This includes changing uses of the field in the code which don't call the access functions. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 1月, 2014 1 次提交
-
-
由 Eric Dumazet 提交于
At first Jakub Zawadzki noticed that some divisions by reciprocal_divide were not correct. (off by one in some cases) http://www.wireshark.org/~darkjames/reciprocal-buggy.c He could also show this with BPF: http://www.wireshark.org/~darkjames/set-and-dump-filter-k-bug.c The reciprocal divide in linux kernel is not generic enough, lets remove its use in BPF, as it is not worth the pain with current cpus. Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NJakub Zawadzki <darkjames-ws@darkjames.pl> Cc: Mircea Gherzan <mgherzan@gmail.com> Cc: Daniel Borkmann <dxchgb@gmail.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Matt Evans <matt@ozlabs.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 10月, 2013 2 次提交
-
-
由 Vladimir Murzin 提交于
commit b6069a95 (filter: add MOD operation) added generic support for modulus operation in BPF. This patch brings JIT support for PPC64 Signed-off-by: NVladimir Murzin <murzin.v@gmail.com> Acked-by: NMatt Evans <matt@ozlabs.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Philippe Bergheaud 提交于
This enables the Berkeley Packet Filter JIT compiler for the PowerPC running in 64bit Little Endian. Signed-off-by: NPhilippe Bergheaud <felix@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 08 10月, 2013 1 次提交
-
-
由 Alexei Starovoitov 提交于
on x86 system with net.core.bpf_jit_enable = 1 sudo tcpdump -i eth1 'tcp port 22' causes the warning: [ 56.766097] Possible unsafe locking scenario: [ 56.766097] [ 56.780146] CPU0 [ 56.786807] ---- [ 56.793188] lock(&(&vb->lock)->rlock); [ 56.799593] <Interrupt> [ 56.805889] lock(&(&vb->lock)->rlock); [ 56.812266] [ 56.812266] *** DEADLOCK *** [ 56.812266] [ 56.830670] 1 lock held by ksoftirqd/1/13: [ 56.836838] #0: (rcu_read_lock){.+.+..}, at: [<ffffffff8118f44c>] vm_unmap_aliases+0x8c/0x380 [ 56.849757] [ 56.849757] stack backtrace: [ 56.862194] CPU: 1 PID: 13 Comm: ksoftirqd/1 Not tainted 3.12.0-rc3+ #45 [ 56.868721] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012 [ 56.882004] ffffffff821944c0 ffff88080bbdb8c8 ffffffff8175a145 0000000000000007 [ 56.895630] ffff88080bbd5f40 ffff88080bbdb928 ffffffff81755b14 0000000000000001 [ 56.909313] ffff880800000001 ffff880800000000 ffffffff8101178f 0000000000000001 [ 56.923006] Call Trace: [ 56.929532] [<ffffffff8175a145>] dump_stack+0x55/0x76 [ 56.936067] [<ffffffff81755b14>] print_usage_bug+0x1f7/0x208 [ 56.942445] [<ffffffff8101178f>] ? save_stack_trace+0x2f/0x50 [ 56.948932] [<ffffffff810cc0a0>] ? check_usage_backwards+0x150/0x150 [ 56.955470] [<ffffffff810ccb52>] mark_lock+0x282/0x2c0 [ 56.961945] [<ffffffff810ccfed>] __lock_acquire+0x45d/0x1d50 [ 56.968474] [<ffffffff810cce6e>] ? __lock_acquire+0x2de/0x1d50 [ 56.975140] [<ffffffff81393bf5>] ? cpumask_next_and+0x55/0x90 [ 56.981942] [<ffffffff810cef72>] lock_acquire+0x92/0x1d0 [ 56.988745] [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380 [ 56.995619] [<ffffffff817628f1>] _raw_spin_lock+0x41/0x50 [ 57.002493] [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380 [ 57.009447] [<ffffffff8118f52a>] vm_unmap_aliases+0x16a/0x380 [ 57.016477] [<ffffffff8118f44c>] ? vm_unmap_aliases+0x8c/0x380 [ 57.023607] [<ffffffff810436b0>] change_page_attr_set_clr+0xc0/0x460 [ 57.030818] [<ffffffff810cfb8d>] ? trace_hardirqs_on+0xd/0x10 [ 57.037896] [<ffffffff811a8330>] ? kmem_cache_free+0xb0/0x2b0 [ 57.044789] [<ffffffff811b59c3>] ? free_object_rcu+0x93/0xa0 [ 57.051720] [<ffffffff81043d9f>] set_memory_rw+0x2f/0x40 [ 57.058727] [<ffffffff8104e17c>] bpf_jit_free+0x2c/0x40 [ 57.065577] [<ffffffff81642cba>] sk_filter_release_rcu+0x1a/0x30 [ 57.072338] [<ffffffff811108e2>] rcu_process_callbacks+0x202/0x7c0 [ 57.078962] [<ffffffff81057f17>] __do_softirq+0xf7/0x3f0 [ 57.085373] [<ffffffff81058245>] run_ksoftirqd+0x35/0x70 cannot reuse jited filter memory, since it's readonly, so use original bpf insns memory to hold work_struct defer kfree of sk_filter until jit completed freeing tested on x86_64 and i386 Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 5月, 2013 1 次提交
-
-
由 Daniel Borkmann 提交于
Followup patch on module_free()/vfree() that takes care of the rest, so no longer this workaround with work_struct is needed. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matt Evans <matt@ozlabs.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 3月, 2013 1 次提交
-
-
由 Daniel Borkmann 提交于
If bpf_jit_enable > 1, then we dump the emitted JIT compiled image after creation. Currently, only SPARC and PowerPC has similar output as in the reference implementation on x86_64. Make a small helper function in order to reduce duplicated code and make the dump output uniform across architectures x86_64, SPARC, PPC, ARM (e.g. on ARM flen, pass and proglen are currently not shown, but would be interesting to know as well), also for future BPF JIT implementations on other archs. Cc: Mircea Gherzan <mgherzan@gmail.com> Cc: Matt Evans <matt@ozlabs.org> Cc: Eric Dumazet <eric.dumazet@google.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 11月, 2012 1 次提交
-
-
由 Daniel Borkmann 提交于
This patch is a follow-up for patch "net: filter: add vlan tag access" to support the new VLAN_TAG/VLAN_TAG_PRESENT accessors in BPF JIT. Signed-off-by: NDaniel Borkmann <daniel.borkmann@tik.ee.ethz.ch> Cc: Matt Evans <matt@ozlabs.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: NMatt Evans <matt@ozlabs.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-