1. 01 Jun, 2017 1 commit
  2. 18 Feb, 2017 2 commits
    • bpf: make jited programs visible in traces · 74451e66
      Committed by Daniel Borkmann
      A long-standing issue with JITed programs is that stack traces from
      function tracing check whether a given address is kernel code
      through {__,}kernel_text_address(), which checks for code in core
      kernel, modules and dynamically allocated ftrace trampolines. But
      what is still missing is BPF JITed programs (interpreted programs
      are not an issue as __bpf_prog_run() will be attributed to them),
      thus when a stack trace is triggered, the code walking the stack
      won't see any of the JITed ones. The same for address correlation
      done from user space via reading /proc/kallsyms. This is read by
      tools like perf, but the latter is also useful for permanent live
      tracing with eBPF itself in combination with stack maps when other
      eBPF types are part of the callchain. See offwaketime example on
      dumping stack from a map.
      
      This work tries to tackle that issue by making the addresses and
      symbols known to the kernel. The lookup from *kernel_text_address()
      is implemented through a latched RB tree that can be read under
      RCU in fast-path that is also shared for symbol/size/offset lookup
      for a specific given address in kallsyms. The slow-path iteration
      through all symbols in the seq file is done via an RCU list, which holds
      a tiny fraction of all exported ksyms, usually below 0.1 percent.
      Function symbols are exported as bpf_prog_<tag>, in order to aid
      debugging and attribution. This facility is currently enabled for
      root-only when bpf_jit_kallsyms is set to 1, and disabled if hardening
      is active in any mode. The rationale behind this is that still a lot
      of systems ship with world read permissions on kallsyms thus addresses
      should not get suddenly exposed for them. If that situation gets
      much better in future, we always have the option to change the
      default on this. Likewise, unprivileged programs are not allowed
      to add entries there either, but that is less of a concern as most
      such program types relevant in this context are for root-only anyway.
      If enabled, call graphs and stack traces will then show a correct
      attribution; one example is illustrated below, where the trace is
      now visible in tooling such as perf script --kallsyms=/proc/kallsyms
      and friends.
      
      Before:
      
        7fff8166889d bpf_clone_redirect+0x80007f0020ed (/lib/modules/4.9.0-rc8+/build/vmlinux)
               f5d80 __sendmsg_nocancel+0xffff006451f1a007 (/usr/lib64/libc-2.18.so)
      
      After:
      
        7fff816688b7 bpf_clone_redirect+0x80007f002107 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fffa0575728 bpf_prog_33c45a467c9e061a+0x8000600020fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fffa07ef1fc cls_bpf_classify+0x8000600020dc (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff81678b68 tc_classify+0x80007f002078 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164d40b __netif_receive_skb_core+0x80007f0025fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164d718 __netif_receive_skb+0x80007f002018 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164e565 process_backlog+0x80007f002095 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164dc71 net_rx_action+0x80007f002231 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff81767461 __softirqentry_text_start+0x80007f0020d1 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff817658ac do_softirq_own_stack+0x80007f00201c (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff810a2c20 do_softirq+0x80007f002050 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff810a2cb5 __local_bh_enable_ip+0x80007f002085 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8168d452 ip_finish_output2+0x80007f002152 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8168ea3d ip_finish_output+0x80007f00217d (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8168f2af ip_output+0x80007f00203f (/lib/modules/4.9.0-rc8+/build/vmlinux)
        [...]
        7fff81005854 do_syscall_64+0x80007f002054 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff817649eb return_from_SYSCALL_64+0x80007f002000 (/lib/modules/4.9.0-rc8+/build/vmlinux)
               f5d80 __sendmsg_nocancel+0xffff01c484812007 (/usr/lib64/libc-2.18.so)
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
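To illustrate the kallsyms side of the commit above: with bpf_jit_kallsyms set to 1, JITed programs are exported as bpf_prog_<tag>. The following is a minimal Python sketch of picking such entries out of kallsyms-formatted text. The helper name find_bpf_progs, the sample addresses and the module column in the sample are assumptions made for illustration; only the bpf_prog_<tag> naming and the tag value come from the commit message.

```python
import re

# The commit exports symbols as bpf_prog_<tag>; the tag in the example
# trace is 16 hex digits, which is what this pattern assumes.
BPF_SYM = re.compile(r"^bpf_prog_([0-9a-f]{16})$")


def find_bpf_progs(kallsyms_text):
    """Return (address, tag) pairs for bpf_prog_<tag> symbols.

    Expects lines shaped like /proc/kallsyms: "<hex addr> <type> <name> [...]".
    """
    progs = []
    for line in kallsyms_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        addr, _sym_type, name = fields[:3]
        match = BPF_SYM.match(name)
        if match:
            progs.append((int(addr, 16), match.group(1)))
    return progs


# Hypothetical sample input: the addresses are made up for illustration.
SAMPLE = (
    "ffffffff8166881d T bpf_clone_redirect\n"
    "ffffffffa0575000 t bpf_prog_33c45a467c9e061a\t[bpf]\n"
    "ffffffffa07ef120 t cls_bpf_classify\t[cls_bpf]\n"
)

for address, tag in find_bpf_progs(SAMPLE):
    print(f"{address:#x} tag={tag}")
```

On a live system the same function could be fed the contents of /proc/kallsyms, read as root, instead of SAMPLE.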
    • bpf: remove stubs for cBPF from arch code · 9383191d
      Committed by Daniel Borkmann
      Remove the dummy bpf_jit_compile() stubs for eBPF JITs and make
      that a single __weak function in the core that can be overridden
      similarly to the eBPF one. Also remove stale pr_err() mentions
      of bpf_jit_compile.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 25 Jan, 2017 2 commits
  4. 09 Dec, 2016 1 commit
  5. 04 Oct, 2016 3 commits
    • powerpc/bpf: Add support for bpf constant blinding · b7b7013c
      Committed by Naveen N. Rao
      In line with similar support for other architectures by Daniel Borkmann.
      
      'MOD Default X' from test_bpf without constant blinding:
      84 bytes emitted from JIT compiler (pass:3, flen:7)
      d0000000058a4688 + <x>:
         0:	nop
         4:	nop
         8:	std     r27,-40(r1)
         c:	std     r28,-32(r1)
        10:	xor     r8,r8,r8
        14:	xor     r28,r28,r28
        18:	mr      r27,r3
        1c:	li      r8,66
        20:	cmpwi   r28,0
        24:	bne     0x0000000000000030
        28:	li      r8,0
        2c:	b       0x0000000000000044
        30:	divwu   r9,r8,r28
        34:	mullw   r9,r28,r9
        38:	subf    r8,r9,r8
        3c:	rotlwi  r8,r8,0
        40:	li      r8,66
        44:	ld      r27,-40(r1)
        48:	ld      r28,-32(r1)
        4c:	mr      r3,r8
        50:	blr
      
      ... and with constant blinding:
      140 bytes emitted from JIT compiler (pass:3, flen:11)
      d00000000bd6ab24 + <x>:
         0:	nop
         4:	nop
         8:	std     r27,-40(r1)
         c:	std     r28,-32(r1)
        10:	xor     r8,r8,r8
        14:	xor     r28,r28,r28
        18:	mr      r27,r3
        1c:	lis     r2,-22834
        20:	ori     r2,r2,36083
        24:	rotlwi  r2,r2,0
        28:	xori    r2,r2,36017
        2c:	xoris   r2,r2,42702
        30:	rotlwi  r2,r2,0
        34:	mr      r8,r2
        38:	rotlwi  r8,r8,0
        3c:	cmpwi   r28,0
        40:	bne     0x000000000000004c
        44:	li      r8,0
        48:	b       0x000000000000007c
        4c:	divwu   r9,r8,r28
        50:	mullw   r9,r28,r9
        54:	subf    r8,r9,r8
        58:	rotlwi  r8,r8,0
        5c:	lis     r2,-17137
        60:	ori     r2,r2,39065
        64:	rotlwi  r2,r2,0
        68:	xori    r2,r2,39131
        6c:	xoris   r2,r2,48399
        70:	rotlwi  r2,r2,0
        74:	mr      r8,r2
        78:	rotlwi  r8,r8,0
        7c:	ld      r27,-40(r1)
        80:	ld      r28,-32(r1)
        84:	mr      r3,r8
        88:	blr
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
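The two listings above differ in how the constant 66 reaches r8: the unblinded code uses a single li r8,66, while the blinded code loads a scrambled value with lis/ori and then recovers the constant with xori/xoris. A minimal Python sketch of that arithmetic, using the immediates from the blinded disassembly, is shown here. The function names are hypothetical, and the model keeps only the low 32 bits (the listing applies rotlwi with a shift of 0 after each step), so it is an approximation of the instruction semantics rather than an emulator.

```python
MASK32 = 0xFFFFFFFF


def load_imm32(lis_imm, ori_imm):
    """Model `lis rD,lis_imm` followed by `ori rD,rD,ori_imm`, low 32 bits only."""
    return (((lis_imm & 0xFFFF) << 16) | (ori_imm & 0xFFFF)) & MASK32


def unblind(lis_imm, ori_imm, xori_imm, xoris_imm):
    """Recover the original constant from one blinded load sequence.

    lis/ori load (constant XOR random); xori/xoris XOR the same random
    value back in, 16 bits at a time.
    """
    blinded = load_imm32(lis_imm, ori_imm)
    rnd = ((xoris_imm & 0xFFFF) << 16) | (xori_imm & 0xFFFF)
    return blinded ^ rnd


# Immediates copied from offsets 0x1c-0x2c and 0x5c-0x6c of the blinded listing.
first = unblind(-22834, 36083, 36017, 42702)
second = unblind(-17137, 39065, 39131, 48399)
print(first, second)  # 66 66
```

Both sequences yield 66 even though the immediates in the emitted code differ, so the original constant never appears verbatim in the JIT image.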
    • powerpc/bpf: Implement support for tail calls · ce076141
      Committed by Naveen N. Rao
      Tail calls allow JIT'ed eBPF programs to call into other JIT'ed eBPF
      programs. This can be achieved either by:
      (1) retaining the stack setup by the first eBPF program and having all
      subsequent eBPF programs re-using it, or,
      (2) by unwinding/tearing down the stack and having each eBPF program
      deal with its own stack as it sees fit.
      
      To ensure that this does not create loops, there is a limit to how many
      tail calls can be done (currently 32). This requires the JIT'ed code to
      maintain a count of the number of tail calls done so far.
      
      Approach (1) is simple, but requires every eBPF program to have (almost)
      the same prologue/epilogue, regardless of whether they need it. This is
      inefficient for small eBPF programs which may sometimes not need a
      prologue at all. As such, to minimize impact of tail call
      implementation, we use approach (2) here which needs each eBPF program
      in the chain to use its own prologue/epilogue. This is not ideal when
      many tail calls are involved and when all the eBPF programs in the chain
      have similar prologue/epilogue. However, the impact is restricted to
      programs that do tail calls. Individual eBPF programs are not affected.
      
      We maintain the tail call count in a fixed location on the stack and
      updated tail call count values are passed in through this. The very
      first eBPF program in a chain sets this up to 0 (the first 2
      instructions). Subsequent tail calls skip the first two eBPF JIT
      instructions to maintain the count. For programs that don't do tail
      calls themselves, the first two instructions are NOPs.
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
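The counting scheme described in the commit above can be sketched in a few lines of Python. This is a simplified model and not the JIT's implementation: programs are plain functions, the stack slot holding the count is a local variable, and a refused tail call simply falls through to the rest of the caller. The names run_chain, _TailCall and looping are hypothetical, and the exact boundary behaviour of the real limit check is not modelled.

```python
MAX_TAIL_CALLS = 32  # limit quoted in the commit message


class _TailCall(Exception):
    """Raised to abandon the current program and jump to another one."""

    def __init__(self, target):
        super().__init__(target)
        self.target = target


def run_chain(programs, start, ctx):
    """Run programs[start]; return (result, number of tail calls taken)."""
    count = 0  # the first program in a chain starts the count at 0

    def tail_call(target):
        # Refuse the jump once the limit is reached or the target is missing;
        # the caller then continues with its next instruction.
        if count >= MAX_TAIL_CALLS or target not in programs:
            return
        raise _TailCall(target)

    current = start
    while True:
        try:
            return programs[current](ctx, tail_call), count
        except _TailCall as jump:
            count += 1  # later programs inherit the updated count
            current = jump.target


def looping(ctx, tail_call):
    """A program that always tries to tail-call itself."""
    ctx["runs"] += 1
    tail_call(0)
    return ctx["runs"]  # reached only when the tail call is refused


result, taken = run_chain({0: looping}, 0, {"runs": 0})
print(result, taken)  # 33 32
```

The self-calling program runs 33 times in total: once as the entry point and once per permitted tail call, after which the refused call falls through and the chain ends.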
    • powerpc/bpf: Introduce accessors for using the tmp local stack space · 7b847f52
      Committed by Naveen N. Rao
      While at it, ensure that the location of the local save area is
      consistent whether or not we set up our own stack frame. This property is
      utilised in the next patch that adds support for tail calls.
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  6. 24 Jun, 2016 1 commit