1. 03 Sep 2019, 1 commit
  2. 25 Jun 2019, 1 commit
  3. 19 Jun 2019, 1 commit
  4. 27 Apr 2019, 2 commits
  5. 27 Jan 2019, 1 commit
  6. 12 Dec 2018, 1 commit
  7. 05 Dec 2018, 1 commit
    • arm64/bpf: don't allocate BPF JIT programs in module memory · 91fc957c
      By Ard Biesheuvel
      The arm64 module region is a 128 MB region that is kept close to
      the core kernel, in order to ensure that relative branches are
      always in range. So using the same region for programs that do
      not have this restriction is wasteful, and preferably avoided.
      
      Now that the core BPF JIT code permits the alloc/free routines to
      be overridden, implement them by vmalloc()/vfree() calls from a
      dedicated 128 MB region set aside for BPF programs. This ensures
      that BPF programs are still in branching range of each other, which
      is something the JIT currently depends upon (and is not guaranteed
      when using module_alloc() on KASLR kernels like we do currently).
      It also ensures that placement of BPF programs does not correlate
      with the placement of the core kernel or modules, making it less
      likely that leaking the former will reveal the latter.
      
      This also solves an issue under KASAN, where shadow memory is
      needlessly allocated for all BPF programs (which don't require KASAN
      shadow pages since they are not KASAN instrumented).
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
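      A minimal sketch of what such an arch-level override could look like, assuming
      the core provides weak bpf_jit_alloc_exec()/bpf_jit_free_exec() hooks and that
      BPF_JIT_REGION_START/BPF_JIT_REGION_END describe the dedicated 128 MB region
      (names are illustrative, not copied from the patch):

        void *bpf_jit_alloc_exec(unsigned long size)
        {
                /* allocate from the dedicated BPF region instead of module space */
                return __vmalloc_node_range(size, PAGE_SIZE,
                                            BPF_JIT_REGION_START, BPF_JIT_REGION_END,
                                            GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
                                            NUMA_NO_NODE, __builtin_return_address(0));
        }

        void bpf_jit_free_exec(void *addr)
        {
                return vfree(addr);
        }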
  8. 30 Nov 2018, 1 commit
  9. 27 Nov 2018, 1 commit
    • bpf, arm64: fix getting subprog addr from aux for calls · 8c11ea5c
      By Daniel Borkmann
      The arm64 JIT has the same issue as the ppc64 JIT in that the relative BPF
      to BPF call offset can be too far away from the core kernel, so that the
      relative encoding into imm is not sufficient and could potentially be
      truncated; see also fd045f6c ("arm64: add support for module PLTs") which
      adds spill-over space for module_alloc() and therefore bpf_jit_binary_alloc().
      Therefore, use the recently added bpf_jit_get_func_addr() helper to properly
      fetch the address through prog->aux->func[off]->bpf_func instead. This also
      has the benefit of optimizing normal helper calls since their address can
      use the optimized emission. Tested on Cavium ThunderX CN8890.
      
      Fixes: db496944 ("bpf: arm64: add JIT support for multi-function programs")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
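      A rough sketch of how a JIT's BPF_JMP | BPF_CALL handling can use this helper
      (tmp, ctx, emit() and the emit_*_mov_i64()/A64_* helpers stand in for the
      arm64 JIT's own context and emitters; illustrative, not the verbatim patch):

        u64 func_addr;
        bool func_addr_fixed;
        int ret;

        /* Resolve the call target: prog->aux->func[off]->bpf_func for
         * BPF-to-BPF calls, the kernel helper address otherwise.
         */
        ret = bpf_jit_get_func_addr(prog, insn, extra_pass,
                                    &func_addr, &func_addr_fixed);
        if (ret < 0)
                return ret;
        if (func_addr_fixed)
                /* helper address known at JIT time: optimized emission */
                emit_a64_mov_i64(tmp, func_addr, ctx);
        else
                /* subprog address: fixed-length 4-insn sequence */
                emit_addr_mov_i64(tmp, func_addr, ctx);
        emit(A64_BLR(tmp), ctx);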
  10. 15 May 2018, 3 commits
    • bpf, arm64: save 4 bytes in prologue when ebpf insns came from cbpf · 56ea6a8b
      By Daniel Borkmann
      We can trivially save 4 bytes in the prologue for cBPF since tail calls
      can never be used from there. The register push/pop is pairwise,
      here x25 (fp) and x26 (tcc), so there is no point in changing that;
      only the reset of the tail call counter to zero is not needed.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
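      A sketch of the conditional emission this implies in the prologue, assuming an
      ebpf_from_cbpf flag derived from the program (illustrative):

        /* Tail calls can never originate from cBPF, so the tail call
         * counter (tcc) only needs to be zeroed for native eBPF programs;
         * the pairwise push/pop of x25/x26 stays as it is.
         */
        if (!ebpf_from_cbpf)
                emit(A64_MOVZ(1, tcc, 0, 0), ctx);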
    • bpf, arm64: optimize 32/64 immediate emission · 6d2eea6f
      By Daniel Borkmann
      Improve the JIT to emit 64 and 32 bit immediates more efficiently; the
      current algorithm is not optimal and we often emit more instructions
      than actually needed. arm64 has movz, movn and movk variants, but for
      64 bit immediates we currently only use movz with a series of movk
      when needed.
      
      For example loading ffffffffffffabab emits the following 4
      instructions in the JIT today:
      
        * movz: abab, shift:  0, result: 000000000000abab
        * movk: ffff, shift: 16, result: 00000000ffffabab
        * movk: ffff, shift: 32, result: 0000ffffffffabab
        * movk: ffff, shift: 48, result: ffffffffffffabab
      
      Whereas after the patch the same load only needs a single
      instruction:
      
        * movn: 5454, shift:  0, result: ffffffffffffabab
      
      Another example where two extra instructions can be saved:
      
        * movz: abab, shift:  0, result: 000000000000abab
        * movk: 1f2f, shift: 16, result: 000000001f2fabab
        * movk: ffff, shift: 32, result: 0000ffff1f2fabab
        * movk: ffff, shift: 48, result: ffffffff1f2fabab
      
      After the patch:
      
        * movn: e0d0, shift: 16, result: ffffffff1f2fffff
        * movk: abab, shift:  0, result: ffffffff1f2fabab
      
      Another example with movz, before:
      
        * movz: 0000, shift:  0, result: 0000000000000000
        * movk: fea0, shift: 32, result: 0000fea000000000
      
      After:
      
        * movz: fea0, shift: 32, result: 0000fea000000000
      
      Moreover, reuse emit_a64_mov_i() for 32 bit immediates that are loaded
      via emit_a64_mov_i64(), a similar optimization to 6fe8b9c1 ("bpf, x64:
      save several bytes by using mov over movabsq when possible"). On arm64,
      the latter allows using a single instruction with movn due to zero
      extension where otherwise two would be needed. Last but not least, add
      a missing optimization in emit_a64_mov_i() where movn is used but the
      subsequent movk is not needed. With some of the Cilium programs in use,
      this shrinks the needed instructions by about three percent. Tested on
      Cavium ThunderX CN8890.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
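      The underlying idea can be illustrated as choosing between a movz and a movn
      base instruction by counting which one leaves more 16-bit chunks untouched.
      The sketch below is a simplified illustration in the JIT's emit() style, not
      the exact emit_a64_mov_i64() from the patch:

        static void emit_mov_imm64_sketch(const int reg, const u64 val,
                                          struct jit_ctx *ctx)
        {
                int zero_hw = 0, ones_hw = 0, shift;
                bool use_movn, base_emitted = false;
                u16 bg, hw;

                for (shift = 0; shift < 64; shift += 16) {
                        hw = (val >> shift) & 0xffff;
                        zero_hw += (hw == 0x0000);
                        ones_hw += (hw == 0xffff);
                }
                use_movn = ones_hw > zero_hw;           /* e.g. ffffffffffffabab */
                bg = use_movn ? 0xffff : 0x0000;        /* background pattern */

                for (shift = 0; shift < 64; shift += 16) {
                        hw = (val >> shift) & 0xffff;
                        if (hw == bg && (base_emitted || shift < 48))
                                continue;               /* chunk already covered */
                        if (!base_emitted) {
                                /* base insn fills the register with bg and
                                 * places this differing chunk
                                 */
                                emit(use_movn ?
                                     A64_MOVN(1, reg, ~hw & 0xffff, shift) :
                                     A64_MOVZ(1, reg, hw, shift), ctx);
                                base_emitted = true;
                        } else {
                                /* movk patches chunks that differ from bg */
                                emit(A64_MOVK(1, reg, hw, shift), ctx);
                        }
                }
        }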
    • bpf, arm64: save 4 bytes of unneeded stack space · 09ece3d0
      By Daniel Borkmann
      Follow-up to 816d9ef3 ("bpf, arm64: remove ld_abs/ld_ind"): the extra
      4 byte JIT scratchpad is not needed anymore, since it was only used by
      ld_abs/ld_ind as a stack buffer for bpf_load_pointer().
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  11. 04 May 2018, 1 commit
  12. 23 Feb 2018, 1 commit
    • bpf, arm64: fix out of bounds access in tail call · 16338a9b
      By Daniel Borkmann
      I recently noticed a crash on arm64 when feeding a bogus index into the
      BPF tail call helper. The crash would not occur when the interpreter is
      used, but only in the case of the JIT. The output looks as follows:
      
        [  347.007486] Unable to handle kernel paging request at virtual address fffb850e96492510
        [...]
        [  347.043065] [fffb850e96492510] address between user and kernel address ranges
        [  347.050205] Internal error: Oops: 96000004 [#1] SMP
        [...]
        [  347.190829] x13: 0000000000000000 x12: 0000000000000000
        [  347.196128] x11: fffc047ebe782800 x10: ffff808fd7d0fd10
        [  347.201427] x9 : 0000000000000000 x8 : 0000000000000000
        [  347.206726] x7 : 0000000000000000 x6 : 001c991738000000
        [  347.212025] x5 : 0000000000000018 x4 : 000000000000ba5a
        [  347.217325] x3 : 00000000000329c4 x2 : ffff808fd7cf0500
        [  347.222625] x1 : ffff808fd7d0fc00 x0 : ffff808fd7cf0500
        [  347.227926] Process test_verifier (pid: 4548, stack limit = 0x000000007467fa61)
        [  347.235221] Call trace:
        [  347.237656]  0xffff000002f3a4fc
        [  347.240784]  bpf_test_run+0x78/0xf8
        [  347.244260]  bpf_prog_test_run_skb+0x148/0x230
        [  347.248694]  SyS_bpf+0x77c/0x1110
        [  347.251999]  el0_svc_naked+0x30/0x34
        [  347.255564] Code: 9100075a d280220a 8b0a002a d37df04b (f86b694b)
        [...]
      
      In this case the index used in BPF r3 is the same as in r1
      at the time of the call, meaning we fed a pointer as index;
      here, it had the value 0xffff808fd7cf0500 which sits in x2.
      
      While I found tail calls to be working in general (also for
      hitting the error cases), I noticed the following in the code
      emission:
      
        # bpftool p d j i 988
        [...]
        38:   ldr     w10, [x1,x10]
        3c:   cmp     w2, w10
        40:   b.ge    0x000000000000007c              <-- signed cmp
        44:   mov     x10, #0x20                      // #32
        48:   cmp     x26, x10
        4c:   b.gt    0x000000000000007c
        50:   add     x26, x26, #0x1
        54:   mov     x10, #0x110                     // #272
        58:   add     x10, x1, x10
        5c:   lsl     x11, x2, #3
        60:   ldr     x11, [x10,x11]                  <-- faulting insn (f86b694b)
        64:   cbz     x11, 0x000000000000007c
        [...]
      
      Meaning, the tests passed because commit ddb55992 ("arm64:
      bpf: implement bpf_tail_call() helper") was using signed compares
      instead of unsigned ones, which as a result had the test wrongly passing.
      
      Change both this and the tail call count test to unsigned compares,
      and cap the index as u32. The latter was also done in 90caccdd
      ("bpf: fix bpf_tail_call() x64 JIT") and is needed here in addition,
      too. Tested on HiSilicon Hi1616.
      
      Result after patch:
      
        # bpftool p d j i 268
        [...]
        38:	ldr	w10, [x1,x10]
        3c:	add	w2, w2, #0x0
        40:	cmp	w2, w10
        44:	b.cs	0x0000000000000080
        48:	mov	x10, #0x20                  	// #32
        4c:	cmp	x26, x10
        50:	b.hi	0x0000000000000080
        54:	add	x26, x26, #0x1
        58:	mov	x10, #0x110                 	// #272
        5c:	add	x10, x1, x10
        60:	lsl	x11, x2, #3
        64:	ldr	x11, [x10,x11]
        68:	cbz	x11, 0x0000000000000080
        [...]
      
      Fixes: ddb55992 ("arm64: bpf: implement bpf_tail_call() helper")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
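      A sketch of what the corrected bounds checks look like in emit_bpf_tail_call(),
      using the JIT's emit()/A64_* helpers (CS/HI are the unsigned counterparts of
      GE/GT; illustrative, not the verbatim patch):

        /* if (index >= array->map.max_entries) goto out;
         * tmp already holds array->map.max_entries (loaded just before);
         * the index is a u32, so zero-extend it and compare unsigned.
         */
        emit(A64_MOV(0, r3, r3), ctx);                /* 32-bit mov zero-extends */
        emit(A64_CMP(0, r3, tmp), ctx);
        emit(A64_B_(A64_COND_CS, jmp_offset), ctx);   /* unsigned >= */

        /* if (tail_call_cnt > MAX_TAIL_CALL_CNT) goto out; */
        emit_a64_mov_i64(tmp, MAX_TAIL_CALL_CNT, ctx);
        emit(A64_CMP(1, tcc, tmp), ctx);
        emit(A64_B_(A64_COND_HI, jmp_offset), ctx);   /* unsigned >  */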
  13. 27 Jan 2018, 1 commit
  14. 20 Jan 2018, 1 commit
  15. 17 Jan 2018, 1 commit
    • bpf, arm64: fix stack_depth tracking in combination with tail calls · a2284d91
      By Daniel Borkmann
      Using dynamic stack_depth tracking in the arm64 JIT is currently broken
      in combination with tail calls. In the prologue, we cache ctx->stack_size
      and adjust the SP register to set up the function call stack, and tear it
      down again in the epilogue. The problem is that when doing a tail call,
      the cached ctx->stack_size might not be the same.
      
      One way to fix the problem with minimal overhead is to re-adjust SP in
      emit_bpf_tail_call() and properly adjust it to the current program's
      ctx->stack_size. Tested on Cavium ThunderX ARMv8.
      
      Fixes: f1c9eed7 ("bpf, arm64: take advantage of stack_depth tracking")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
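      A sketch of the minimal-overhead fix in emit_bpf_tail_call(): before branching
      into the callee (which sets up its own frame after the prologue offset), undo
      the current program's frame so SP is where the callee expects it (illustrative):

        /* goto *(prog->bpf_func + prologue_offset); */
        emit(A64_ADD_I(1, tmp, tmp, sizeof(u32) * PROLOGUE_OFFSET), ctx);
        emit(A64_ADD_I(1, A64_SP, A64_SP, ctx->stack_size), ctx); /* pop current frame */
        emit(A64_BR(tmp), ctx);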
  16. 19 Dec 2017, 1 commit
  17. 18 Dec 2017, 2 commits
    • bpf: arm64: add JIT support for multi-function programs · db496944
      By Alexei Starovoitov
      Similar to x64, add support for bpf-to-bpf calls.
      When a program has calls to in-kernel helpers, the target call offset
      is known at JIT time and the arm64 architecture needs 2 passes.
      With bpf-to-bpf calls the dynamically allocated function start
      is unknown until all functions of the program are JITed.
      Therefore (just like x64) the arm64 JIT needs one extra pass over
      the program to emit correct call offsets.
      
      Implementation detail:
      Avoid being too clever in 64-bit immediate moves and
      always use 4 instructions (instead of 3-4 depending on the address)
      to make sure only one extra pass is needed.
      If some future optimization would make it worthwhile to optimize
      'call 64-bit imm' further, the JIT would need to do 4 passes
      over the program instead of 3 as in this patch.
      For a typical bpf program address the mov needs 3 or 4 insns,
      so unconditionally using 4 insns to save an extra pass is a worthy
      trade-off at this stage of the JIT.
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
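      A sketch of the fixed-length address move this implies, so that the image size
      stays stable across the extra pass (written in the arm64 JIT's emit() style;
      illustrative):

        /* Always use exactly 4 instructions for a call target address, even
         * when fewer would do, so the extra pass only patches immediates and
         * never changes instruction offsets.
         */
        static void emit_addr_mov_i64(const int reg, const u64 val,
                                      struct jit_ctx *ctx)
        {
                u64 tmp = val;
                int shift = 0;

                emit(A64_MOVZ(1, reg, tmp & 0xffff, shift), ctx);
                while (shift < 48) {
                        tmp >>= 16;
                        shift += 16;
                        emit(A64_MOVK(1, reg, tmp & 0xffff, shift), ctx);
                }
        }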
    • bpf: fix net.core.bpf_jit_enable race · 60b58afc
      By Alexei Starovoitov
      The global bpf_jit_enable variable is tested multiple times in the JITs,
      blinding and verifier core. A malicious root user can try to toggle it
      while loading programs. This race condition was accounted for and there
      should be no issues, but it's safer to avoid the race condition
      altogether.
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
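      Conceptually, the race is avoided by sampling the sysctl once per program and
      consulting only that cached copy afterwards; a simplified illustration (field
      and helper names may differ from the actual patch; treat them as illustrative):

        /* at program allocation time, e.g. in bpf_prog_alloc(): */
        fp->jit_requested = ebpf_jit_enabled();

        /* later, in the arch JIT, instead of re-reading the global sysctl: */
        if (!prog->jit_requested)
                return orig_prog;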
  18. 10 Aug 2017, 1 commit
  19. 01 Jul 2017, 1 commit
  20. 12 Jun 2017, 1 commit
  21. 08 Jun 2017, 1 commit
  22. 07 Jun 2017, 1 commit
  23. 01 Jun 2017, 1 commit
  24. 12 May 2017, 1 commit
    • bpf, arm64: fix faulty emission of map access in tail calls · d8b54110
      By Daniel Borkmann
      Shubham was recently asking on netdev why the arm64 JIT doesn't multiply
      the index for accessing the tail call map by 8. That led me to test the
      arm64 JIT with respect to tail calls, and it turned out I got a NULL
      pointer dereference on the tail call.
      
      The buggy access is at:
      
        prog = array->ptrs[index];
        if (prog == NULL)
            goto out;
      
        [...]
        00000060:  d2800e0a  mov x10, #0x70 // #112
        00000064:  f86a682a  ldr x10, [x1,x10]
        00000068:  f862694b  ldr x11, [x10,x2]
        0000006c:  b40000ab  cbz x11, 0x00000080
        [...]
      
      The code triggering the crash is f862694b. x1 at the time contains the
      address of the bpf array, and x10 holds offsetof(struct bpf_array, ptrs).
      Meaning, above we load the pointer to the program at map slot 0 into x10.
      x10 can then be NULL if the slot is not occupied, which we later on try
      to access with a user given offset in x2 that is the map index.
      
      Fix this by emitting the following instead:
      
        [...]
        00000060:  d2800e0a  mov x10, #0x70 // #112
        00000064:  8b0a002a  add x10, x1, x10
        00000068:  d37df04b  lsl x11, x2, #3
        0000006c:  f86b694b  ldr x11, [x10,x11]
        00000070:  b40000ab  cbz x11, 0x00000084
        [...]
      
      This basically adds the offset of ptrs to the base address of the bpf
      array we got, and we later on access the map with an index * 8 offset
      relative to that. The tail call map itself is basically one large area
      with meta data at the head followed by the array of prog pointers.
      This makes tail calls work again, tested on Cavium ThunderX ARMv8.
      
      Fixes: ddb55992 ("arm64: bpf: implement bpf_tail_call() helper")
      Reported-by: Shubham Bansal <illusionist.neo@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
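      In the JIT's emit() notation, the corrected sequence corresponds roughly to
      the following (r2 holds the array pointer, r3 the index; illustrative sketch):

        /* prog = array->ptrs[index]; if (prog == NULL) goto out; */
        off = offsetof(struct bpf_array, ptrs);
        emit_a64_mov_i64(tmp, off, ctx);
        emit(A64_ADD(1, tmp, r2, tmp), ctx);     /* tmp = &array->ptrs        */
        emit(A64_LSL(1, prg, r3, 3), ctx);       /* prg = index * 8           */
        emit(A64_LDR64(prg, tmp, prg), ctx);     /* prg = array->ptrs[index]  */
        emit(A64_CBZ(1, prg, jmp_offset), ctx);  /* NULL slot -> goto out     */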
  25. 09 May 2017, 1 commit
  26. 03 May 2017, 2 commits
    • bpf, arm64: fix jit branch offset related to ldimm64 · ddc665a4
      By Daniel Borkmann
      When the instruction right before the branch destination is
      a 64 bit load immediate, we currently calculate the wrong
      jump offset in the ctx->offset[] array, as we only account
      for one instruction slot for the 64 bit load immediate although
      it uses two BPF instructions. Fix it up by setting the offset
      into the right slot after we have incremented the index.
      
      Before (ldimm64 test 1):
      
        [...]
        00000020:  52800007  mov w7, #0x0 // #0
        00000024:  d2800060  mov x0, #0x3 // #3
        00000028:  d2800041  mov x1, #0x2 // #2
        0000002c:  eb01001f  cmp x0, x1
        00000030:  54ffff82  b.cs 0x00000020
        00000034:  d29fffe7  mov x7, #0xffff // #65535
        00000038:  f2bfffe7  movk x7, #0xffff, lsl #16
        0000003c:  f2dfffe7  movk x7, #0xffff, lsl #32
        00000040:  f2ffffe7  movk x7, #0xffff, lsl #48
        00000044:  d29dddc7  mov x7, #0xeeee // #61166
        00000048:  f2bdddc7  movk x7, #0xeeee, lsl #16
        0000004c:  f2ddddc7  movk x7, #0xeeee, lsl #32
        00000050:  f2fdddc7  movk x7, #0xeeee, lsl #48
        [...]
      
      After (ldimm64 test 1):
      
        [...]
        00000020:  52800007  mov w7, #0x0 // #0
        00000024:  d2800060  mov x0, #0x3 // #3
        00000028:  d2800041  mov x1, #0x2 // #2
        0000002c:  eb01001f  cmp x0, x1
        00000030:  540000a2  b.cs 0x00000044
        00000034:  d29fffe7  mov x7, #0xffff // #65535
        00000038:  f2bfffe7  movk x7, #0xffff, lsl #16
        0000003c:  f2dfffe7  movk x7, #0xffff, lsl #32
        00000040:  f2ffffe7  movk x7, #0xffff, lsl #48
        00000044:  d29dddc7  mov x7, #0xeeee // #61166
        00000048:  f2bdddc7  movk x7, #0xeeee, lsl #16
        0000004c:  f2ddddc7  movk x7, #0xeeee, lsl #32
        00000050:  f2fdddc7  movk x7, #0xeeee, lsl #48
        [...]
      
      Also, add a couple of test cases to make sure JITs pass
      this test. Tested on Cavium ThunderX ARMv8. The added
      test cases all pass after the fix.
      
      Fixes: 8eee539d ("arm64: bpf: fix out-of-bounds read in bpf2a64_offset()")
      Reported-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Cc: Xi Wang <xi.wang@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
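      A sketch of the build_body() bookkeeping after the fix: when build_insn()
      reports that it consumed a second BPF slot (ldimm64), the index is bumped
      first and only then is the offset for that slot recorded (simplified,
      illustrative):

        for (i = 0; i < prog->len; i++) {
                const struct bpf_insn *insn = &prog->insnsi[i];
                int ret;

                ret = build_insn(insn, ctx);
                if (ret > 0) {
                        /* ldimm64 occupied two BPF slots */
                        i++;
                        if (ctx->image == NULL)
                                ctx->offset[i] = ctx->idx;
                        continue;
                }
                if (ctx->image == NULL)
                        ctx->offset[i] = ctx->idx;
                if (ret)
                        return ret;
        }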
    • bpf, arm64: implement jiting of BPF_XADD · 85f68fe8
      By Daniel Borkmann
      This work adds BPF_XADD for BPF_W/BPF_DW to the arm64 JIT and therefore
      completes JITing of all BPF instructions, meaning we can also remove
      the 'notyet' label and no longer need to fall back to the interpreter
      when BPF_XADD is used in a program!
      
      This now also brings arm64 JIT in line with x86_64, s390x, ppc64, sparc64,
      where all current eBPF features are supported.
      
      BPF_W example from test_bpf:
      
        .u.insns_int = {
          BPF_ALU32_IMM(BPF_MOV, R0, 0x12),
          BPF_ST_MEM(BPF_W, R10, -40, 0x10),
          BPF_STX_XADD(BPF_W, R10, R0, -40),
          BPF_LDX_MEM(BPF_W, R0, R10, -40),
          BPF_EXIT_INSN(),
        },
      
        [...]
        00000020:  52800247  mov w7, #0x12 // #18
        00000024:  928004eb  mov x11, #0xffffffffffffffd8 // #-40
        00000028:  d280020a  mov x10, #0x10 // #16
        0000002c:  b82b6b2a  str w10, [x25,x11]
        // start of xadd mapping:
        00000030:  928004ea  mov x10, #0xffffffffffffffd8 // #-40
        00000034:  8b19014a  add x10, x10, x25
        00000038:  f9800151  prfm pstl1strm, [x10]
        0000003c:  885f7d4b  ldxr w11, [x10]
        00000040:  0b07016b  add w11, w11, w7
        00000044:  880b7d4b  stxr w11, w11, [x10]
        00000048:  35ffffab  cbnz w11, 0x0000003c
        // end of xadd mapping:
        [...]
      
      BPF_DW example from test_bpf:
      
        .u.insns_int = {
          BPF_ALU32_IMM(BPF_MOV, R0, 0x12),
          BPF_ST_MEM(BPF_DW, R10, -40, 0x10),
          BPF_STX_XADD(BPF_DW, R10, R0, -40),
          BPF_LDX_MEM(BPF_DW, R0, R10, -40),
          BPF_EXIT_INSN(),
        },
      
        [...]
        00000020:  52800247  mov w7,  #0x12 // #18
        00000024:  928004eb  mov x11, #0xffffffffffffffd8 // #-40
        00000028:  d280020a  mov x10, #0x10 // #16
        0000002c:  f82b6b2a  str x10, [x25,x11]
        // start of xadd mapping:
        00000030:  928004ea  mov x10, #0xffffffffffffffd8 // #-40
        00000034:  8b19014a  add x10, x10, x25
        00000038:  f9800151  prfm pstl1strm, [x10]
        0000003c:  c85f7d4b  ldxr x11, [x10]
        00000040:  8b07016b  add x11, x11, x7
        00000044:  c80b7d4b  stxr w11, x11, [x10]
        00000048:  35ffffab  cbnz w11, 0x0000003c
        // end of xadd mapping:
        [...]
      
      Tested on Cavium ThunderX ARMv8, test suite results after the patch:
      
        No JIT:   [ 3751.855362] test_bpf: Summary: 311 PASSED, 0 FAILED, [0/303 JIT'ed]
        With JIT: [ 3573.759527] test_bpf: Summary: 311 PASSED, 0 FAILED, [303/303 JIT'ed]
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
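      In the JIT's emit() notation, the ldxr/stxr sequence shown above corresponds
      roughly to the following sketch (isdw selects the BPF_W vs BPF_DW variant;
      illustrative, not the verbatim patch):

        case BPF_STX | BPF_XADD | BPF_W:
        case BPF_STX | BPF_XADD | BPF_DW:
                emit_a64_mov_i(1, tmp, off, ctx);
                emit(A64_ADD(1, tmp, tmp, dst), ctx);       /* tmp = dst + off    */
                emit(A64_PRFM(tmp, PST, L1, STRM), ctx);    /* prefetch for store */
                emit(A64_LDXR(isdw, tmp2, tmp), ctx);       /* exclusive load     */
                emit(A64_ADD(isdw, tmp2, tmp2, src), ctx);
                emit(A64_STXR(isdw, tmp2, tmp, tmp3), ctx); /* exclusive store    */
                jmp_offset = -3;
                check_imm19(jmp_offset);
                emit(A64_CBNZ(0, tmp3, jmp_offset), ctx);   /* retry if it failed */
                break;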
  27. 29 Apr 2017, 1 commit
  28. 22 Feb 2017, 1 commit
  29. 18 Feb 2017, 2 commits
    • bpf: make jited programs visible in traces · 74451e66
      By Daniel Borkmann
      A long-standing issue with JITed programs is that stack traces from
      function tracing check whether a given address is kernel code
      through {__,}kernel_text_address(), which checks for code in the core
      kernel, modules and dynamically allocated ftrace trampolines. But
      what is still missing is BPF JITed programs (interpreted programs
      are not an issue as __bpf_prog_run() will be attributed to them),
      thus when a stack trace is triggered, the code walking the stack
      won't see any of the JITed ones. The same goes for address correlation
      done from user space via reading /proc/kallsyms. This is read by
      tools like perf, but the latter is also useful for permanent live
      tracing with eBPF itself in combination with stack maps when other
      eBPF types are part of the callchain. See the offwaketime example on
      dumping a stack from a map.
      
      This work tries to tackle that issue by making the addresses and
      symbols known to the kernel. The lookup from *kernel_text_address()
      is implemented through a latched RB tree that can be read under
      RCU in the fast path and that is also shared for symbol/size/offset
      lookup for a specific given address in kallsyms. The slow-path
      iteration through all symbols in the seq file is done via an RCU list,
      which holds a tiny fraction of all exported ksyms, usually below
      0.1 percent. Function symbols are exported as bpf_prog_<tag>, in order
      to aid debugging and attribution. This facility is currently enabled
      root-only when bpf_jit_kallsyms is set to 1, and disabled if hardening
      is active in any mode. The rationale behind this is that a lot
      of systems still ship with world read permissions on kallsyms, thus
      addresses should not suddenly get exposed for them. If that situation
      gets much better in the future, we always have the option to change
      the default. Likewise, unprivileged programs are not allowed
      to add entries there either, but that is less of a concern as most
      such program types relevant in this context are root-only anyway.
      If enabled, call graphs and stack traces will then show a correct
      attribution; one example is illustrated below, where the trace is
      now visible in tooling such as perf script --kallsyms=/proc/kallsyms
      and friends.
      
      Before:
      
        7fff8166889d bpf_clone_redirect+0x80007f0020ed (/lib/modules/4.9.0-rc8+/build/vmlinux)
               f5d80 __sendmsg_nocancel+0xffff006451f1a007 (/usr/lib64/libc-2.18.so)
      
      After:
      
        7fff816688b7 bpf_clone_redirect+0x80007f002107 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fffa0575728 bpf_prog_33c45a467c9e061a+0x8000600020fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fffa07ef1fc cls_bpf_classify+0x8000600020dc (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff81678b68 tc_classify+0x80007f002078 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164d40b __netif_receive_skb_core+0x80007f0025fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164d718 __netif_receive_skb+0x80007f002018 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164e565 process_backlog+0x80007f002095 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164dc71 net_rx_action+0x80007f002231 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff81767461 __softirqentry_text_start+0x80007f0020d1 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff817658ac do_softirq_own_stack+0x80007f00201c (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff810a2c20 do_softirq+0x80007f002050 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff810a2cb5 __local_bh_enable_ip+0x80007f002085 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8168d452 ip_finish_output2+0x80007f002152 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8168ea3d ip_finish_output+0x80007f00217d (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8168f2af ip_output+0x80007f00203f (/lib/modules/4.9.0-rc8+/build/vmlinux)
        [...]
        7fff81005854 do_syscall_64+0x80007f002054 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff817649eb return_from_SYSCALL_64+0x80007f002000 (/lib/modules/4.9.0-rc8+/build/vmlinux)
               f5d80 __sendmsg_nocancel+0xffff01c484812007 (/usr/lib64/libc-2.18.so)
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: remove stubs for cBPF from arch code · 9383191d
      By Daniel Borkmann
      Remove the dummy bpf_jit_compile() stubs for eBPF JITs and make
      that a single __weak function in the core that can be overridden
      similarly to the eBPF one. Also remove stale pr_err() mentions
      of bpf_jit_compile.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  30. 11 Jun 2016, 3 commits
    • arm64: bpf: optimize LD_ABS, LD_IND · 643c332d
      By Zi Shen Lim
      Remove superfluous stack frame, saving us 3 instructions for every
      LD_ABS or LD_IND.
      Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • arm64: bpf: optimize JMP_CALL · 997ce888
      By Zi Shen Lim
      Remove superfluous stack frame, saving us 3 instructions for
      every JMP_CALL.
      Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • arm64: bpf: implement bpf_tail_call() helper · ddb55992
      By Zi Shen Lim
      Add support for JMP_CALL_X (tail call) introduced by commit 04fd61ab
      ("bpf: allow bpf programs to tail-call other bpf programs").
      
      bpf_tail_call() arguments:
        ctx   - context pointer passed to the next program
        array - pointer to a map of type BPF_MAP_TYPE_PROG_ARRAY
        index - index inside the array that selects the specific program to run
      
      In this implementation the arm64 JIT jumps into the callee program after
      its prologue, so the callee program reuses the same stack. For
      tail_call_cnt, we use the callee-saved R26 (which was already
      saved/restored but previously unused by the JIT).
      
      With this patch a tail call generates the following code on arm64:
      
        if (index >= array->map.max_entries)
            goto out;
      
        34:   mov     x10, #0x10                      // #16
        38:   ldr     w10, [x1,x10]
        3c:   cmp     w2, w10
        40:   b.ge    0x0000000000000074
      
        if (tail_call_cnt > MAX_TAIL_CALL_CNT)
            goto out;
        tail_call_cnt++;
      
        44:   mov     x10, #0x20                      // #32
        48:   cmp     x26, x10
        4c:   b.gt    0x0000000000000074
        50:   add     x26, x26, #0x1
      
        prog = array->ptrs[index];
        if (prog == NULL)
            goto out;
      
        54:   mov     x10, #0x68                      // #104
        58:   ldr     x10, [x1,x10]
        5c:   ldr     x11, [x10,x2]
        60:   cbz     x11, 0x0000000000000074
      
        goto *(prog->bpf_func + prologue_size);
      
        64:   mov     x10, #0x20                      // #32
        68:   ldr     x10, [x11,x10]
        6c:   add     x10, x10, #0x20
        70:   br      x10
        74:
      Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  31. 18 May 2016, 1 commit
  32. 17 May 2016, 1 commit