- 04 May 2020, 1 commit
-
-
Committed by Luke Nelson

This patch optimizes the code generated by emit_a32_arsh_r64, which handles the BPF_ALU64 BPF_ARSH BPF_X instruction. The original code uses a conditional B followed by an unconditional ORR. The optimization saves one instruction by removing the B instruction and using a conditional ORR (with an inverted condition).

Example of the code generated for BPF_ALU64_REG(BPF_ARSH, BPF_REG_0, BPF_REG_1), before optimization:

  34: rsb   ip, r2, #32
  38: subs  r9, r2, #32
  3c: lsr   lr, r0, r2
  40: orr   lr, lr, r1, lsl ip
  44: bmi   0x4c
  48: orr   lr, lr, r1, asr r9
  4c: asr   ip, r1, r2
  50: mov   r0, lr
  54: mov   r1, ip

and after optimization:

  34: rsb   ip, r2, #32
  38: subs  r9, r2, #32
  3c: lsr   lr, r0, r2
  40: orr   lr, lr, r1, lsl ip
  44: orrpl lr, lr, r1, asr r9
  48: asr   ip, r1, r2
  4c: mov   r0, lr
  50: mov   r1, ip

Tested on QEMU using lib/test_bpf and test_verifier.

Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200501020210.32294-2-luke.r.nels@gmail.com
-
- 15 Apr 2020, 1 commit
-
-
Committed by Luke Nelson

This patch fixes an incorrect check in how immediate memory offsets are computed for BPF_DW on arm. For BPF_LDX/ST/STX + BPF_DW, the 32-bit arm JIT breaks an 8-byte access down into two separate 4-byte accesses, using off+0 and off+4. If off fits in imm12, the JIT emits a ldr/str instruction with the immediate and avoids the use of a temporary register. While the current check, off <= 0xfff, ensures that the first immediate off+0 doesn't overflow imm12, it is not sufficient for the second immediate off+4, which may cause the second access of BPF_DW to read or write the wrong address. This patch fixes the problem by changing the check to off <= 0xfff - 4 for BPF_DW, ensuring that off+4 will never overflow.

A side effect of simplifying the check is that it now allows negative immediate offsets in ldr/str, which means small negative offsets can also avoid the use of a temporary register.

This patch introduces no new failures in test_verifier or test_bpf.c.

Fixes: c5eae692 ("ARM: net: bpf: improve 64-bit store implementation")
Fixes: ec19e02b ("ARM: net: bpf: fix LDX instructions")
Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200409221752.28448-1-luke.r.nels@gmail.com
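A minimal sketch of the corrected bounds test described above, with illustrative names (the kernel's actual helper and constants may differ):

  #include <stdbool.h>

  /* For BPF_DW the JIT issues two 4-byte accesses, at off and off+4,
   * so both displacements must fit the ARM imm12 field (|imm| <= 4095).
   */
  static bool offset_fits_imm12(int off, bool is_dw)
  {
          int max = is_dw ? 0xfff - 4 : 0xfff;  /* leave room for off+4 */

          return off >= -0xfff && off <= max;   /* negative offsets allowed */
  }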
-
- 09 Apr 2020, 1 commit
-
-
Committed by Luke Nelson

The current arm BPF JIT does not correctly compile RSH or ARSH when the immediate shift amount is 0. This causes the "rsh64 by 0 imm" and "arsh64 by 0 imm" BPF selftests to hang the kernel by reaching an instruction the verifier determines to be unreachable.

The root cause is in how immediate right shifts are encoded on arm. For LSR and ASR (logical and arithmetic right shift), a bit-pattern of 00000 in the immediate field encodes a shift amount of 32. When the BPF immediate is 0, the generated code therefore shifts by 32 instead of performing the expected no-op.

This patch fixes the bugs by adding an extra check for a BPF immediate of 0. After the change, the above-mentioned BPF selftests pass.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200408181229.10909-1-luke.r.nels@gmail.com
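A sketch of the special case described above, assuming hypothetical emit helpers and opcode builders (the kernel's real ones differ in detail):

  /* ARM encodes an immediate LSR/ASR shift field of 0 as "shift by 32",
   * so a BPF shift-by-zero must become a plain register move instead of
   * an LSR/ASR #0 encoding.
   */
  if (imm == 0)
          emit(ARM_MOV_R(rd, rm), ctx);       /* no-op: just copy the value */
  else
          emit(ARM_LSR_I(rd, rm, imm), ctx);  /* imm in 1..31 encodes directly */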
-
- 11 Dec 2019, 1 commit
-
-
Committed by Russell King

Improve the prologue code sequence to take advantage of 64-bit stores, changing the code from:

  push  {r4, r5, r6, r7, r8, r9, fp, lr}
  mov   fp, sp
  sub   ip, sp, #80      ; 0x50
  sub   sp, sp, #600     ; 0x258
  str   ip, [fp, #-100]  ; 0xffffff9c
  mov   r6, #0
  str   r6, [fp, #-96]   ; 0xffffffa0
  mov   r4, #0
  mov   r3, r4
  mov   r2, r0
  str   r4, [fp, #-104]  ; 0xffffff98
  str   r4, [fp, #-108]  ; 0xffffff94

to the tighter:

  push  {r4, r5, r6, r7, r8, r9, fp, lr}
  mov   fp, sp
  mov   r3, #0
  sub   r2, sp, #80      ; 0x50
  sub   sp, sp, #600     ; 0x258
  strd  r2, [fp, #-100]  ; 0xffffff9c
  mov   r2, #0
  strd  r2, [fp, #-108]  ; 0xffffff94
  mov   r2, r0

resulting in a saving of three instructions.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/E1ieH2g-0004ih-Rb@rmk-PC.armlinux.org.uk
-
- 05 Jun 2019, 1 commit
-
-
Committed by Thomas Gleixner

Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation version 2 of the license

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 315 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Armijn Hemel <armijn@tjaldur.nl>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531190115.503150771@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 25 May 2019, 1 commit
-
-
Committed by Jiong Wang

Cc: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 27 Jan 2019, 1 commit
-
-
Committed by Jiong Wang

This patch implements code-gen for the new JMP32 instructions on arm. For JSET, "ands" (AND with flags updated) is used, so a corresponding encoding helper is added.

Cc: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
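A sketch of what such an encoding helper can look like; ARM_AND_R is assumed to be the JIT's existing AND opcode builder, and the macro names here are illustrative:

  /* "ands" is AND with the S (set-flags) bit, bit 20 of an ARM
   * data-processing encoding, so it can be derived from plain AND.
   */
  #define ARM_INST_S_BIT          (1 << 20)
  #define ARM_ANDS_R(rd, rn, rm)  (ARM_AND_R(rd, rn, rm) | ARM_INST_S_BIT)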
-
- 13 Jul 2018, 18 commits
-
-
Committed by Russell King

Improve the 64-bit ALU implementation from:

  movw  r8, #65532
  movt  r8, #65535
  movw  r9, #65535
  movt  r9, #65535
  ldr   r7, [fp, #-44]
  adds  r7, r7, r8
  str   r7, [fp, #-44]
  ldr   r7, [fp, #-40]
  adc   r7, r7, r9
  str   r7, [fp, #-40]

to:

  movw  r8, #65532
  movt  r8, #65535
  movw  r9, #65535
  movt  r9, #65535
  ldrd  r6, [fp, #-44]
  adds  r6, r6, r8
  adc   r7, r7, r9
  strd  r6, [fp, #-44]

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Improve the 64-bit store implementation from:

  ldr   r6, [fp, #-8]
  str   r8, [r6]
  ldr   r6, [fp, #-8]
  mov   r7, #4
  add   r7, r6, r7
  str   r9, [r7]

to:

  ldr   r6, [fp, #-8]
  str   r8, [r6]
  str   r9, [r6, #4]

We leave the store as two separate STR instructions rather than using STRD, as the store may not be aligned, and STR can handle misalignment.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Improve the 64-bit sign-extended immediate load from:

  mov   r6, #1
  str   r6, [fp, #-52]  ; 0xffffffcc
  mov   r6, #0
  str   r6, [fp, #-48]  ; 0xffffffd0

to:

  mov   r6, #1
  mov   r7, #0
  strd  r6, [fp, #-52]  ; 0xffffffcc

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Rather than writing each 32-bit half of the 64-bit immediate value separately when the register is on the stack:

  movw  r6, #45056      ; 0xb000
  movt  r6, #60979      ; 0xee33
  str   r6, [fp, #-44]  ; 0xffffffd4
  mov   r6, #0
  str   r6, [fp, #-40]  ; 0xffffffd8

arrange to use the double-word store when available instead:

  movw  r6, #45056      ; 0xb000
  movt  r6, #60979      ; 0xee33
  mov   r7, #0
  strd  r6, [fp, #-44]  ; 0xffffffd4

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Use double-word loads and stores where these instructions are supported by the CPU architecture.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Always use an odd/even register pair for our 64-bit registers, so that we're able to use the double-word load/store instructions in the future.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Rearranging the order of the initial tail call code a little allows us to avoid reloading the 'array' pointer.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Avoid reloading 'index' after we have validated it - it remains in tmp2[1] up to the point that we begin the code to index the pointer array, so with a little rearrangement of the registers, we can use the already-loaded value.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Rather than pre-shifting the rm register for the ldr in the tail call, shift it in the load instruction itself. This eliminates one unnecessary instruction.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Rather than moving constants to a register and then using them in a subsequent instruction, use them directly in the desired instruction, cutting out the "middle" register. This removes two instructions from the tail call code path.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Provide a version of the imm8m() function that the compiler can optimise when used with a constant expression.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
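For context, a self-contained sketch of what an imm8m()-style test does (an illustrative reimplementation, not the kernel's code): an ARM "modified immediate" is an 8-bit value rotated right by an even amount, and the function returns the 12-bit encoding or -1 if the constant cannot be represented.

  #include <stdint.h>

  static int imm8m_sketch(uint32_t x)
  {
          unsigned int rot;

          for (rot = 0; rot < 16; rot++) {
                  unsigned int s = 2 * rot;
                  /* rotate x left by s; check the result fits in 8 bits */
                  uint32_t v = (x << s) | (x >> ((32 - s) & 31));

                  if (v <= 0xff)
                          return (rot << 8) | v;  /* imm12: rotation | byte */
          }
          return -1;  /* not encodable as a modified immediate */
  }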
-
Committed by Russell King

Access the eBPF scratch space using the frame pointer rather than our stack pointer, as the offsets from the ARM frame pointer are constant across all eBPF programs. Since we no longer reference the scratch space registers from the stack pointer, this simplifies emit_push_r64(), as it no longer needs to know how many words are pushed onto the stack.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Provide a couple of 64-bit register accessors, and use them where appropriate.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Many of the code paths need to know whether a register is stacked or held in a CPU register. Move this decision making to a pair of helper functions instead of having it scattered throughout the code.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
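A minimal sketch of the kind of helper meant here, assuming the kernel's s8 type and the convention (established by the commits below) that negative values mark stacked registers; the name is illustrative:

  /* Central test: is this BPF register spilled to a stack slot rather
   * than held in an ARM CPU register?
   */
  static bool is_stacked(s8 reg)
  {
          return reg < 0;  /* negative values denote stack slots */
  }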
-
Committed by Russell King

The decision about whether a BPF register is on the stack or in a CPU register is made at the top BPF insn processing level, and then percolated throughout the remainder of the code. Since we now use negative register values to represent stacked registers, we can detect where a BPF register is stored without resorting to carrying this additional metadata through all code paths.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Use negative numbers for eBPF registers that live on the stack.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Provide a set of load/store opcode generators that work with negative immediates as well as positive ones.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Enumerate the contents of the JIT scratch stack layout used for storing some of the JIT's 64-bit registers, the tail call counter and the AX register.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
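To illustrate what "enumerating" the layout buys, a sketch with made-up slot names and an invented offset formula (the kernel's actual enum and addressing differ): named 32-bit slots replace magic byte offsets when addressing the scratch area.

  enum jit_scratch_slot {
          SCRATCH_R2_HI, SCRATCH_R2_LO,  /* spilled halves of a 64-bit reg */
          SCRATCH_R3_HI, SCRATCH_R3_LO,
          SCRATCH_TCC,                   /* tail-call counter */
          SCRATCH_AX_HI, SCRATCH_AX_LO,  /* verifier's AX register */
          SCRATCH_WORDS                  /* total number of 32-bit slots */
  };

  /* Illustrative frame-pointer-relative offset for slot n. */
  #define SCRATCH_OFF(n)  (-4 * ((n) + 1))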
-
- 30 Jun 2018, 1 commit
-
-
Committed by Daniel Borkmann

Any eBPF JIT whose underlying arch supports ARCH_HAS_SET_MEMORY needs to use the bpf_jit_binary_{un,}lock_ro() pair instead of the set_memory_{ro,rw}() pair directly, as otherwise changes to the former might break JITs that bypass it. arm32's eBPF conversion missed this change, so fix it up here.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
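A sketch of the intended pattern; bpf_jit_binary_lock_ro() is the real helper being adopted, while the surrounding function is invented for illustration:

  /* Seal the finished JIT image read-only through the bpf_jit helper
   * instead of calling set_memory_ro() on the pages directly.
   */
  static void finalize_image(struct bpf_binary_header *header,
                             struct bpf_prog *prog)
  {
          bpf_jit_binary_lock_ro(header);  /* replaces set_memory_ro() */
          prog->jited = 1;
  }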
-
- 05 Jun 2018, 2 commits
-
-
Committed by Wang YanQing

The names for BPF_ALU64 | BPF_ARSH are emit_a32_arsh_*, and the names for BPF_ALU64 | BPF_LSH are emit_a32_lsh_*, but the names for BPF_ALU64 | BPF_RSH are emit_a32_lsr_*. For consistency, let's rename emit_a32_lsr_* to emit_a32_rsh_*. This patch also corrects a wrong comment.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Signed-off-by: Wang YanQing <udknight@gmail.com>
Cc: Shubham Bansal <illusionist.neo@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux@armlinux.org.uk
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Wang YanQing

imm24 is signed, so the right range is [-(1 << (24 - 1)), (1 << (24 - 1)) - 1].

Note: this patch also fixes a typo.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Signed-off-by: Wang YanQing <udknight@gmail.com>
Cc: Shubham Bansal <illusionist.neo@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux@armlinux.org.uk
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
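A self-contained sketch of the corrected range check (the helper name is illustrative):

  #include <stdbool.h>

  /* A signed 24-bit field holds [-(1 << 23), (1 << 23) - 1]. */
  static bool fits_imm24(int offset)
  {
          return offset >= -(1 << 23) && offset <= (1 << 23) - 1;
  }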
-
- 15 May 2018, 1 commit
-
-
Committed by Daniel Borkmann

The extra skb_copy_bits() buffer is not used anymore, so remove the extra 4-byte stack space requirement.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 04 May 2018, 1 commit
-
-
Committed by Daniel Borkmann

Since the LD_ABS/LD_IND instructions are now removed from the core and reimplemented through a combination of inlined BPF instructions and a slow-path helper, we can get rid of the complexity from the arm32 JIT.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 27 Jan 2018, 1 commit
-
-
Committed by Daniel Borkmann

Since we've changed div/mod exception handling for src_reg in the eBPF verifier itself, remove the leftovers from the arm32 JIT.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 20 Jan 2018, 1 commit
-
-
Committed by Daniel Borkmann

Having a pure_initcall() callback just to permanently enable BPF JITs under CONFIG_BPF_JIT_ALWAYS_ON is unnecessary and could leave a small race window in the future where the JIT is still disabled on boot. Since we know about the setting at compilation time anyway, just initialize it properly there. Also consolidate all the individual bpf_jit_enable variables into a single one and move them under one location. Moreover, don't allow setting unspecified garbage values on them.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 18 Jan 2018, 8 commits
-
-
Committed by Russell King

As per 90caccdd ("bpf: fix bpf_tail_call() x64 JIT"), the index used for the array lookup is defined to be 32-bit wide. Update a misleading comment that suggests it is 64-bit wide.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Committed by Russell King

When the source and destination register are identical, our JIT does not generate correct code, which leads to kernel oopses. Fix this by (a) generating more efficient code, and (b) making use of the temporary earlier if we will overwrite the address register.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Committed by Russell King

When an eBPF program tail-calls another eBPF program, it enters it after the prologue to avoid having complex stack manipulations. This can lead to kernel oopses and similar problems. Resolve this by always using a fixed stack layout and a CPU register frame pointer, and by using that frame pointer when reloading registers before returning.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Committed by Russell King

The stack layout documentation incorrectly suggests that the BPF JIT scratch space starts immediately below BPF_FP. This is not correct, so let's fix the documentation to reflect reality.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Committed by Russell King

Move the stack documentation towards the top of the file, where it's relevant for things like the register layout.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Committed by Russell King

As per 2dede2d8 ("ARM EABI: stack pointer must be 64-bit aligned after a CPU exception"), the stack should be aligned to a 64-bit boundary on EABI systems. Ensure that the eBPF JIT appropriately aligns its stack.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
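A sketch of the rounding this implies (the macro names are illustrative):

  /* EABI requires an 8-byte-aligned stack, so round the JIT's stack
   * allocation up to the next multiple of 8.
   */
  #define STACK_ALIGNMENT    8
  #define ALIGN_STACK(size)  (((size) + STACK_ALIGNMENT - 1) & \
                              ~(STACK_ALIGNMENT - 1))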
-
Committed by Russell King

When a tail call fails, it is documented that the tail call should continue execution at the following instruction. An example tail call sequence is:

  12: (85) call bpf_tail_call#12
  13: (b7) r0 = 0
  14: (95) exit

The ARM assembler for the tail call in this case ends up branching to instruction 14 instead of instruction 13, resulting in the BPF filter returning a non-zero value:

  178: ldr   r8, [sp, #588]   ; insn 12
  17c: ldr   r6, [r8, r6]
  180: ldr   r8, [sp, #580]
  184: cmp   r8, r6
  188: bcs   0x1e8
  18c: ldr   r6, [sp, #524]
  190: ldr   r7, [sp, #528]
  194: cmp   r7, #0
  198: cmpeq r6, #32
  19c: bhi   0x1e8
  1a0: adds  r6, r6, #1
  1a4: adc   r7, r7, #0
  1a8: str   r6, [sp, #524]
  1ac: str   r7, [sp, #528]
  1b0: mov   r6, #104
  1b4: ldr   r8, [sp, #588]
  1b8: add   r6, r8, r6
  1bc: ldr   r8, [sp, #580]
  1c0: lsl   r7, r8, #2
  1c4: ldr   r6, [r6, r7]
  1c8: cmp   r6, #0
  1cc: beq   0x1e8
  1d0: mov   r8, #32
  1d4: ldr   r6, [r6, r8]
  1d8: add   r6, r6, #44
  1dc: bx    r6
  1e0: mov   r0, #0            ; insn 13
  1e4: mov   r1, #0
  1e8: add   sp, sp, #596      ; insn 14
  1ec: pop   {r4, r5, r6, r7, r8, sl, pc}

For other sequences, the tail call could end up branching midway through the following BPF instructions, or maybe off the end of the function, leading to unknown behaviours.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Committed by Russell King

Avoid the 'bx' instruction on CPUs that have no support for Thumb and thus do not implement this instruction, by moving the generation of this opcode to a separate function that selects between:

  bx reg

and

  mov pc, reg

according to the capabilities of the CPU.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
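A sketch of such a selector; emit() and the ARM_* opcode builders stand in for the JIT's own helpers, and the HWCAP_THUMB test is one plausible way to detect Thumb support:

  /* Branch to the address in tgt_reg, using BX only when the CPU
   * implements it, and plain "mov pc, reg" otherwise.
   */
  static void emit_bx_r(u8 tgt_reg, struct jit_ctx *ctx)
  {
          if (elf_hwcap & HWCAP_THUMB)
                  emit(ARM_BX(tgt_reg), ctx);
          else
                  emit(ARM_MOV_R(ARM_PC, tgt_reg), ctx);
  }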
-