- 04 May 2020, 1 commit
-
-
Committed by Luke Nelson

This patch optimizes the code generated by emit_a32_arsh_r64, which handles the BPF_ALU64 BPF_ARSH BPF_X instruction. The original code uses a conditional B followed by an unconditional ORR. The optimization saves one instruction by removing the B instruction and using a conditional ORR (with an inverted condition).

Example of the code generated for BPF_ALU64_REG(BPF_ARSH, BPF_REG_0, BPF_REG_1), before optimization:

  34: rsb   ip, r2, #32
  38: subs  r9, r2, #32
  3c: lsr   lr, r0, r2
  40: orr   lr, lr, r1, lsl ip
  44: bmi   0x4c
  48: orr   lr, lr, r1, asr r9
  4c: asr   ip, r1, r2
  50: mov   r0, lr
  54: mov   r1, ip

and after optimization:

  34: rsb   ip, r2, #32
  38: subs  r9, r2, #32
  3c: lsr   lr, r0, r2
  40: orr   lr, lr, r1, lsl ip
  44: orrpl lr, lr, r1, asr r9
  48: asr   ip, r1, r2
  4c: mov   r0, lr
  50: mov   r1, ip

Tested on QEMU using lib/test_bpf and test_verifier.

Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200501020210.32294-2-luke.r.nels@gmail.com
-
- 15 Apr 2020, 1 commit
-
-
Committed by Luke Nelson

This patch fixes an incorrect check in how immediate memory offsets are computed for BPF_DW on arm. For BPF_LDX/ST/STX + BPF_DW, the 32-bit arm JIT breaks an 8-byte access down into two separate 4-byte accesses, using off+0 and off+4. If off fits in imm12, the JIT emits a ldr/str instruction with the immediate and avoids the use of a temporary register. While the current check, off <= 0xfff, ensures that the first immediate off+0 doesn't overflow imm12, it is not sufficient for the second immediate off+4, which may cause the second access of BPF_DW to read or write the wrong address. This patch fixes the problem by changing the check to off <= 0xfff - 4 for BPF_DW, ensuring that off+4 will never overflow.

A side effect of simplifying the check is that it now allows negative immediate offsets in ldr/str, which means small negative offsets can also avoid the use of a temporary register.

This patch introduces no new failures in test_verifier or test_bpf.c.

Fixes: c5eae692 ("ARM: net: bpf: improve 64-bit store implementation")
Fixes: ec19e02b ("ARM: net: bpf: fix LDX instructions")
Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200409221752.28448-1-luke.r.nels@gmail.com
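A minimal sketch of the corrected bounds test described above, with illustrative names (the kernel's actual helper and constants may differ):

  #include <stdbool.h>

  /* For BPF_DW the JIT issues two 4-byte accesses, at off and off+4,
   * so both displacements must fit the ARM imm12 field (|imm| <= 4095).
   */
  static bool offset_fits_imm12(int off, bool is_dw)
  {
          int max = is_dw ? 0xfff - 4 : 0xfff;  /* leave room for off+4 */

          return off >= -0xfff && off <= max;   /* negative offsets allowed */
  }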
-
- 09 Apr 2020, 1 commit
-
-
Committed by Luke Nelson

The current arm BPF JIT does not correctly compile RSH or ARSH when the immediate shift amount is 0. This causes the "rsh64 by 0 imm" and "arsh64 by 0 imm" BPF selftests to hang the kernel by reaching an instruction the verifier determines to be unreachable.

The root cause is in how immediate right shifts are encoded on arm. For LSR and ASR (logical and arithmetic right shift), a bit-pattern of 00000 in the immediate field encodes a shift amount of 32. When the BPF immediate is 0, the generated code therefore shifts by 32 instead of performing the expected no-op.

This patch fixes the bugs by adding an extra check for a BPF immediate of 0. After the change, the above-mentioned BPF selftests pass.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200408181229.10909-1-luke.r.nels@gmail.com
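A sketch of the special case described above, assuming hypothetical emit helpers and opcode builders (the kernel's real ones differ in detail):

  /* ARM encodes an immediate LSR/ASR shift field of 0 as "shift by 32",
   * so a BPF shift-by-zero must become a plain register move instead of
   * an LSR/ASR #0 encoding.
   */
  if (imm == 0)
          emit(ARM_MOV_R(rd, rm), ctx);       /* no-op: just copy the value */
  else
          emit(ARM_LSR_I(rd, rm, imm), ctx);  /* imm in 1..31 encodes directly */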
-
- 11 Dec 2019, 1 commit
-
-
Committed by Russell King

Improve the prologue code sequence to take advantage of 64-bit stores, changing the code from:

  push  {r4, r5, r6, r7, r8, r9, fp, lr}
  mov   fp, sp
  sub   ip, sp, #80      ; 0x50
  sub   sp, sp, #600     ; 0x258
  str   ip, [fp, #-100]  ; 0xffffff9c
  mov   r6, #0
  str   r6, [fp, #-96]   ; 0xffffffa0
  mov   r4, #0
  mov   r3, r4
  mov   r2, r0
  str   r4, [fp, #-104]  ; 0xffffff98
  str   r4, [fp, #-108]  ; 0xffffff94

to the tighter:

  push  {r4, r5, r6, r7, r8, r9, fp, lr}
  mov   fp, sp
  mov   r3, #0
  sub   r2, sp, #80      ; 0x50
  sub   sp, sp, #600     ; 0x258
  strd  r2, [fp, #-100]  ; 0xffffff9c
  mov   r2, #0
  strd  r2, [fp, #-108]  ; 0xffffff94
  mov   r2, r0

resulting in a saving of three instructions.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/E1ieH2g-0004ih-Rb@rmk-PC.armlinux.org.uk
-
- 05 Jun 2019, 1 commit
-
-
Committed by Thomas Gleixner

Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation version 2 of the license

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 315 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Armijn Hemel <armijn@tjaldur.nl>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531190115.503150771@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 25 May 2019, 1 commit
-
-
Committed by Jiong Wang

Cc: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 27 Jan 2019, 1 commit
-
-
Committed by Jiong Wang

This patch implements code-gen for the new JMP32 instructions on arm. For JSET, "ands" (AND with flags updated) is used, so a corresponding encoding helper is added.

Cc: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
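A sketch of what such an encoding helper can look like; ARM_AND_R is assumed to be the JIT's existing AND opcode builder, and the macro names here are illustrative:

  /* "ands" is AND with the S (set-flags) bit, bit 20 of an ARM
   * data-processing encoding, so it can be derived from plain AND.
   */
  #define ARM_INST_S_BIT          (1 << 20)
  #define ARM_ANDS_R(rd, rn, rm)  (ARM_AND_R(rd, rn, rm) | ARM_INST_S_BIT)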
-
- 13 Jul 2018, 18 commits
-
-
Committed by Russell King

Improve the 64-bit ALU implementation from:

  movw  r8, #65532
  movt  r8, #65535
  movw  r9, #65535
  movt  r9, #65535
  ldr   r7, [fp, #-44]
  adds  r7, r7, r8
  str   r7, [fp, #-44]
  ldr   r7, [fp, #-40]
  adc   r7, r7, r9
  str   r7, [fp, #-40]

to:

  movw  r8, #65532
  movt  r8, #65535
  movw  r9, #65535
  movt  r9, #65535
  ldrd  r6, [fp, #-44]
  adds  r6, r6, r8
  adc   r7, r7, r9
  strd  r6, [fp, #-44]

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Improve the 64-bit store implementation from:

  ldr   r6, [fp, #-8]
  str   r8, [r6]
  ldr   r6, [fp, #-8]
  mov   r7, #4
  add   r7, r6, r7
  str   r9, [r7]

to:

  ldr   r6, [fp, #-8]
  str   r8, [r6]
  str   r9, [r6, #4]

We leave the store as two separate STR instructions rather than using STRD, as the store may not be aligned, and STR can handle misalignment.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Improve the 64-bit sign-extended immediate load from:

  mov   r6, #1
  str   r6, [fp, #-52]  ; 0xffffffcc
  mov   r6, #0
  str   r6, [fp, #-48]  ; 0xffffffd0

to:

  mov   r6, #1
  mov   r7, #0
  strd  r6, [fp, #-52]  ; 0xffffffcc

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Rather than writing each 32-bit half of the 64-bit immediate value separately when the register is on the stack:

  movw  r6, #45056      ; 0xb000
  movt  r6, #60979      ; 0xee33
  str   r6, [fp, #-44]  ; 0xffffffd4
  mov   r6, #0
  str   r6, [fp, #-40]  ; 0xffffffd8

arrange to use the double-word store when available instead:

  movw  r6, #45056      ; 0xb000
  movt  r6, #60979      ; 0xee33
  mov   r7, #0
  strd  r6, [fp, #-44]  ; 0xffffffd4

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Use double-word loads and stores where these instructions are supported by the CPU architecture.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Always use an odd/even register pair for our 64-bit registers, so that we're able to use the double-word load/store instructions in the future.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Rearranging the order of the initial tail call code a little allows us to avoid reloading the 'array' pointer.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Avoid reloading 'index' after we have validated it - it remains in tmp2[1] up to the point that we begin the code to index the pointer array, so with a little rearrangement of the registers, we can use the already-loaded value.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Rather than pre-shifting the rm register for the ldr in the tail call, shift it in the load instruction itself. This eliminates one unnecessary instruction.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Rather than moving constants to a register and then using them in a subsequent instruction, use them directly in the desired instruction, cutting out the "middle" register. This removes two instructions from the tail call code path.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Provide a version of the imm8m() function that the compiler can optimise when used with a constant expression.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
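For context, a self-contained sketch of what an imm8m()-style test does (an illustrative reimplementation, not the kernel's code): an ARM "modified immediate" is an 8-bit value rotated right by an even amount, and the function returns the 12-bit encoding or -1 if the constant cannot be represented.

  #include <stdint.h>

  static int imm8m_sketch(uint32_t x)
  {
          unsigned int rot;

          for (rot = 0; rot < 16; rot++) {
                  unsigned int s = 2 * rot;
                  /* rotate x left by s; check the result fits in 8 bits */
                  uint32_t v = (x << s) | (x >> ((32 - s) & 31));

                  if (v <= 0xff)
                          return (rot << 8) | v;  /* imm12: rotation | byte */
          }
          return -1;  /* not encodable as a modified immediate */
  }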
-
Committed by Russell King

Access the eBPF scratch space using the frame pointer rather than our stack pointer, as the offsets from the ARM frame pointer are constant across all eBPF programs. Since we no longer reference the scratch space registers from the stack pointer, this simplifies emit_push_r64(), as it no longer needs to know how many words are pushed onto the stack.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Provide a couple of 64-bit register accessors, and use them where appropriate.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Many of the code paths need to know whether a register is stacked or held in a CPU register. Move this decision making to a pair of helper functions instead of having it scattered throughout the code.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
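A minimal sketch of the kind of helper meant here, assuming the kernel's s8 type and the convention (established by the commits below) that negative values mark stacked registers; the name is illustrative:

  /* Central test: is this BPF register spilled to a stack slot rather
   * than held in an ARM CPU register?
   */
  static bool is_stacked(s8 reg)
  {
          return reg < 0;  /* negative values denote stack slots */
  }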
-
Committed by Russell King

The decision about whether a BPF register is on the stack or in a CPU register is made at the top BPF insn processing level, and then percolated throughout the remainder of the code. Since we now use negative register values to represent stacked registers, we can detect where a BPF register is stored without resorting to carrying this additional metadata through all code paths.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Use negative numbers for eBPF registers that live on the stack.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Provide a set of load/store opcode generators that work with negative immediates as well as positive ones.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Russell King

Enumerate the contents of the JIT scratch stack layout used for storing some of the JIT's 64-bit registers, the tail call counter and the AX register.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
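To illustrate what "enumerating" the layout buys, a sketch with made-up slot names and an invented offset formula (the kernel's actual enum and addressing differ): named 32-bit slots replace magic byte offsets when addressing the scratch area.

  enum jit_scratch_slot {
          SCRATCH_R2_HI, SCRATCH_R2_LO,  /* spilled halves of a 64-bit reg */
          SCRATCH_R3_HI, SCRATCH_R3_LO,
          SCRATCH_TCC,                   /* tail-call counter */
          SCRATCH_AX_HI, SCRATCH_AX_LO,  /* verifier's AX register */
          SCRATCH_WORDS                  /* total number of 32-bit slots */
  };

  /* Illustrative frame-pointer-relative offset for slot n. */
  #define SCRATCH_OFF(n)  (-4 * ((n) + 1))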
-
- 30 Jun 2018, 1 commit
-
-
Committed by Daniel Borkmann

Any eBPF JIT whose underlying arch supports ARCH_HAS_SET_MEMORY needs to use the bpf_jit_binary_{un,}lock_ro() pair instead of the set_memory_{ro,rw}() pair directly, as otherwise changes to the former might break JITs that bypass it. arm32's eBPF conversion missed this change, so fix it up here.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
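A sketch of the intended pattern; bpf_jit_binary_lock_ro() is the real helper being adopted, while the surrounding function is invented for illustration:

  /* Seal the finished JIT image read-only through the bpf_jit helper
   * instead of calling set_memory_ro() on the pages directly.
   */
  static void finalize_image(struct bpf_binary_header *header,
                             struct bpf_prog *prog)
  {
          bpf_jit_binary_lock_ro(header);  /* replaces set_memory_ro() */
          prog->jited = 1;
  }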
-
- 05 Jun 2018, 2 commits
-
-
Committed by Wang YanQing

The names for BPF_ALU64 | BPF_ARSH are emit_a32_arsh_*, and the names for BPF_ALU64 | BPF_LSH are emit_a32_lsh_*, but the names for BPF_ALU64 | BPF_RSH are emit_a32_lsr_*. For consistency, let's rename emit_a32_lsr_* to emit_a32_rsh_*. This patch also corrects a wrong comment.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Signed-off-by: Wang YanQing <udknight@gmail.com>
Cc: Shubham Bansal <illusionist.neo@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux@armlinux.org.uk
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Wang YanQing

imm24 is signed, so the right range is [-(1 << (24 - 1)), (1 << (24 - 1)) - 1].

Note: this patch also fixes a typo.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Signed-off-by: Wang YanQing <udknight@gmail.com>
Cc: Shubham Bansal <illusionist.neo@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux@armlinux.org.uk
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
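A self-contained sketch of the corrected range check (the helper name is illustrative):

  #include <stdbool.h>

  /* A signed 24-bit field holds [-(1 << 23), (1 << 23) - 1]. */
  static bool fits_imm24(int offset)
  {
          return offset >= -(1 << 23) && offset <= (1 << 23) - 1;
  }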
-
- 15 May 2018, 1 commit
-
-
Committed by Daniel Borkmann

The extra skb_copy_bits() buffer is not used anymore, so remove the extra 4-byte stack space requirement.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 04 May 2018, 1 commit
-
-
Committed by Daniel Borkmann

Since the LD_ABS/LD_IND instructions are now removed from the core and reimplemented through a combination of inlined BPF instructions and a slow-path helper, we can get rid of the complexity from the arm32 JIT.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 27 Jan 2018, 1 commit
-
-
Committed by Daniel Borkmann

Since we've changed div/mod exception handling for src_reg in the eBPF verifier itself, remove the leftovers from the arm32 JIT.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 20 Jan 2018, 1 commit
-
-
Committed by Daniel Borkmann

Having a pure_initcall() callback just to permanently enable BPF JITs under CONFIG_BPF_JIT_ALWAYS_ON is unnecessary and could leave a small race window in the future where the JIT is still disabled on boot. Since we know about the setting at compilation time anyway, just initialize it properly there. Also consolidate all the individual bpf_jit_enable variables into a single one and move them under one location. Moreover, don't allow setting unspecified garbage values on them.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 18 Jan 2018, 8 commits
-
-
Committed by Russell King

As per 90caccdd ("bpf: fix bpf_tail_call() x64 JIT"), the index used for the array lookup is defined to be 32-bit wide. Update a misleading comment that suggests it is 64-bit wide.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Committed by Russell King

When the source and destination register are identical, our JIT does not generate correct code, which leads to kernel oopses. Fix this by (a) generating more efficient code, and (b) making use of the temporary earlier if we will overwrite the address register.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Committed by Russell King

When an eBPF program tail-calls another eBPF program, it enters it after the prologue to avoid having complex stack manipulations. This can lead to kernel oopses and similar problems. Resolve this by always using a fixed stack layout and a CPU register frame pointer, and by using that frame pointer when reloading registers before returning.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Committed by Russell King

The stack layout documentation incorrectly suggests that the BPF JIT scratch space starts immediately below BPF_FP. This is not correct, so let's fix the documentation to reflect reality.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Committed by Russell King

Move the stack documentation towards the top of the file, where it's relevant for things like the register layout.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Committed by Russell King

As per 2dede2d8 ("ARM EABI: stack pointer must be 64-bit aligned after a CPU exception"), the stack should be aligned to a 64-bit boundary on EABI systems. Ensure that the eBPF JIT appropriately aligns its stack.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
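A sketch of the rounding this implies (the macro names are illustrative):

  /* EABI requires an 8-byte-aligned stack, so round the JIT's stack
   * allocation up to the next multiple of 8.
   */
  #define STACK_ALIGNMENT    8
  #define ALIGN_STACK(size)  (((size) + STACK_ALIGNMENT - 1) & \
                              ~(STACK_ALIGNMENT - 1))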
-
Committed by Russell King

When a tail call fails, it is documented that the tail call should continue execution at the following instruction. An example tail call sequence is:

  12: (85) call bpf_tail_call#12
  13: (b7) r0 = 0
  14: (95) exit

The ARM assembler for the tail call in this case ends up branching to instruction 14 instead of instruction 13, resulting in the BPF filter returning a non-zero value:

  178: ldr   r8, [sp, #588]   ; insn 12
  17c: ldr   r6, [r8, r6]
  180: ldr   r8, [sp, #580]
  184: cmp   r8, r6
  188: bcs   0x1e8
  18c: ldr   r6, [sp, #524]
  190: ldr   r7, [sp, #528]
  194: cmp   r7, #0
  198: cmpeq r6, #32
  19c: bhi   0x1e8
  1a0: adds  r6, r6, #1
  1a4: adc   r7, r7, #0
  1a8: str   r6, [sp, #524]
  1ac: str   r7, [sp, #528]
  1b0: mov   r6, #104
  1b4: ldr   r8, [sp, #588]
  1b8: add   r6, r8, r6
  1bc: ldr   r8, [sp, #580]
  1c0: lsl   r7, r8, #2
  1c4: ldr   r6, [r6, r7]
  1c8: cmp   r6, #0
  1cc: beq   0x1e8
  1d0: mov   r8, #32
  1d4: ldr   r6, [r6, r8]
  1d8: add   r6, r6, #44
  1dc: bx    r6
  1e0: mov   r0, #0            ; insn 13
  1e4: mov   r1, #0
  1e8: add   sp, sp, #596      ; insn 14
  1ec: pop   {r4, r5, r6, r7, r8, sl, pc}

For other sequences, the tail call could end up branching midway through the following BPF instructions, or maybe off the end of the function, leading to unknown behaviours.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Committed by Russell King

Avoid the 'bx' instruction on CPUs that have no support for Thumb and thus do not implement this instruction, by moving the generation of this opcode to a separate function that selects between:

  bx reg

and

  mov pc, reg

according to the capabilities of the CPU.

Fixes: 39c13c20 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
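A sketch of such a selector; emit() and the ARM_* opcode builders stand in for the JIT's own helpers, and the HWCAP_THUMB test is one plausible way to detect Thumb support:

  /* Branch to the address in tgt_reg, using BX only when the CPU
   * implements it, and plain "mov pc, reg" otherwise.
   */
  static void emit_bx_r(u8 tgt_reg, struct jit_ctx *ctx)
  {
          if (elf_hwcap & HWCAP_THUMB)
                  emit(ARM_BX(tgt_reg), ctx);
          else
                  emit(ARM_MOV_R(ARM_PC, tgt_reg), ctx);
  }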
-