1. 19 May 2016, 1 commit
  2. 17 May 2016, 2 commits
  3. 19 Dec 2015, 1 commit
    • D
      bpf: move clearing of A/X into classic to eBPF migration prologue · 8b614aeb
      Daniel Borkmann authored
      Back in the days when eBPF (or, back then, "internal BPF" ;->) was not
      exposed to user space and only classic BPF programs were internally
      translated into eBPF programs, we missed the fact that for classic BPF
      A and X needed to be cleared. It was fixed back then via 83d5b7ef
      ("net: filter: initialize A and X registers"), and thus classic BPF
      specifics were added to the eBPF interpreter core to work around it.
      
      This added some confusion for JIT developers later on who take the
      eBPF interpreter code as an example for deriving their JIT. F.e. in
      f75298f5 ("s390/bpf: clear correct BPF accumulator register"), at
      least X could leak stack memory. Furthermore, since this is only needed
      for classic BPF translations and not for eBPF (verifier takes care
      that read access to regs cannot be done uninitialized), more complexity
      is added to JITs as they need to determine whether they deal with
      migrations or native eBPF where they can just omit clearing A/X in
      their prologue and thus reduce image size a bit, see f.e. cde66c2d
      ("s390/bpf: Only clear A and X for converted BPF programs"). In other
      cases (x86, arm64), A and X are cleared in the prologue also for the
      eBPF case, which is unnecessary.
      
      Let's move this into the BPF migration in bpf_convert_filter(), where it
      actually belongs, while the number of eBPF JITs is still small. It
      can thus be done generically; allowing us to remove the quirk from
      __bpf_prog_run() and to slightly reduce JIT image size in case of eBPF,
      while reducing code duplication on this matter in current(/future) eBPF
      JITs.
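
      For illustration, a hedged sketch of what this migration prologue amounts
      to inside bpf_convert_filter() (an approximation, not the exact upstream hunk):

       /* Classic BPF expects A and X to start out as zero, so emit the
        * clearing as the first two eBPF instructions of the migrated
        * program; native eBPF and its JITs need no special casing.
        */
       *new_insn++ = BPF_ALU64_REG(BPF_XOR, BPF_REG_A, BPF_REG_A);
       *new_insn++ = BPF_ALU64_REG(BPF_XOR, BPF_REG_X, BPF_REG_X);
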
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Tested-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Cc: Zi Shen Lim <zlim.lnx@gmail.com>
      Cc: Yang Shi <yang.shi@linaro.org>
      Acked-by: Yang Shi <yang.shi@linaro.org>
      Acked-by: Zi Shen Lim <zlim.lnx@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8b614aeb
  4. 03 Oct 2015, 1 commit
  5. 12 Aug 2015, 1 commit
  6. 31 Jul 2015, 1 commit
  7. 30 Jul 2015, 5 commits
  8. 27 Jul 2015, 1 commit
  9. 21 Jul 2015, 1 commit
    • A
      bpf: introduce bpf_skb_vlan_push/pop() helpers · 4e10df9a
      Alexei Starovoitov authored
      Allow eBPF programs attached to TC qdiscs to call skb_vlan_push/pop via
      helper functions. These functions may change skb->data/hlen, which are
      cached by some JITs to improve performance of ld_abs/ld_ind instructions.
      Therefore, JITs need to recognize bpf_skb_vlan_push/pop() calls,
      re-compute the header length and re-cache skb->data/hlen back into CPU registers.
      Note, skb->data/hlen are not directly accessible from the programs,
      so any changes to skb->data done either by these helpers or by other
      TC actions are safe.
      
      The eBPF JIT is supported by three architectures:
      - arm64 JIT is using bpf_load_pointer() without caching, so it's ok as-is.
      - x64 JIT re-caches skb->data/hlen unconditionally after vlan_push/pop calls
        (experiments showed that conditional re-caching is slower).
      - s390 JIT falls back to interpreter for now when bpf_skb_vlan_push() is present
        in the program (re-caching is tbd).
      
      These helpers allow more scalable handling of vlan from the programs.
      Instead of creating thousands of vlan netdevs on top of eth0 and attaching
      TC+ingress+bpf to all of them, the program can be attached to eth0 directly
      and manipulate vlans as necessary.
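
      As a usage illustration (not part of this commit), a minimal tc/eBPF C
      program calling the new helpers, assuming a modern clang/libbpf toolchain:

       #include <linux/bpf.h>
       #include <linux/if_ether.h>
       #include <linux/pkt_cls.h>
       #include <bpf/bpf_helpers.h>
       #include <bpf/bpf_endian.h>

       SEC("tc")
       int vlan_retag(struct __sk_buff *skb)
       {
               bpf_skb_vlan_pop(skb);                               /* strip outer tag, if any */
               bpf_skb_vlan_push(skb, bpf_htons(ETH_P_8021Q), 100); /* re-tag with VLAN 100 */
               return TC_ACT_OK;
       }

       char _license[] SEC("license") = "GPL";
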
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4e10df9a
  10. 25 Jun 2015, 1 commit
    • M
      s390/bpf: Fix backward jumps · b035b60d
      Michael Holzheu authored
      Currently all backward jumps crash for JITed s390x eBPF programs
      with an illegal instruction program check and a kernel panic, because
      for negative values the opcode of the jump instruction is overridden
      by the negative branch offset and an illegal instruction is generated
      by the JIT:
      
       000003ff802da378: c01100000002   lgfi    %r1,2
       000003ff802da37e: fffffff52065   unknown <-- illegal instruction
       000003ff802da384: b904002e       lgr     %r2,%r14
      
      So fix this and mask the offset in order not to damage the opcode.
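
      As a userspace model of the problem and the fix (instruction word and
      offset values are made up for the demo; this is not the actual JIT macro):

       #include <stdio.h>
       #include <stdint.h>

       int main(void)
       {
               uint32_t insn = 0xa7f40000;  /* "brc 15,<imm16>": 16-bit offset in the low bits */
               int32_t  off  = -6;          /* backward branch distance, in halfwords */

               uint32_t buggy = insn | (uint32_t)off;            /* sign bits clobber the opcode */
               uint32_t fixed = insn | ((uint32_t)off & 0xffff); /* mask to the 16-bit field */

               printf("buggy: %08x\nfixed: %08x\n", buggy, fixed);
               return 0;
       }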
      
      Cc: stable@vger.kernel.org # 4.0+
      Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      b035b60d
  11. 10 Jun 2015, 1 commit
    • M
      s390/bpf: implement bpf_tail_call() helper · 6651ee07
      Michael Holzheu authored
      bpf_tail_call() arguments:
      
       - ctx......: Context pointer
       - jmp_table: One of BPF_MAP_TYPE_PROG_ARRAY maps used as the jump table
       - index....: Index in the jump table
      
      In this implementation the s390x JIT does stack unwinding and jumps into the
      callee program's prologue. Caller and callee use the same stack.
      
      With this patch a tail call generates the following code on s390x:
      
       if (index >= array->map.max_entries)
               goto out
       000003ff8001c7e4: e31030100016   llgf    %r1,16(%r3)
       000003ff8001c7ea: ec41001fa065   clgrj   %r4,%r1,10,3ff8001c828
      
       if (tail_call_cnt++ > MAX_TAIL_CALL_CNT)
               goto out;
       000003ff8001c7f0: a7080001       lhi     %r0,1
       000003ff8001c7f4: eb10f25000fa   laal    %r1,%r0,592(%r15)
       000003ff8001c7fa: ec120017207f   clij    %r1,32,2,3ff8001c828
      
       prog = array->prog[index];
       if (prog == NULL)
               goto out;
       000003ff8001c800: eb140003000d   sllg    %r1,%r4,3
       000003ff8001c806: e31310800004   lg      %r1,128(%r3,%r1)
       000003ff8001c80c: ec18000e007d   clgij   %r1,0,8,3ff8001c828
      
       Restore registers before calling function
       000003ff8001c812: eb68f2980004   lmg     %r6,%r8,664(%r15)
       000003ff8001c818: ebbff2c00004   lmg     %r11,%r15,704(%r15)
      
       goto *(prog->bpf_func + tail_call_start);
       000003ff8001c81e: e31100200004   lg      %r1,32(%r1,%r0)
       000003ff8001c824: 47f01006       bc      15,6(%r1)
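
      For context, a hedged sketch of how a program drives this from the C side
      (modern libbpf map syntax, not part of this s390x patch):

       #include <linux/bpf.h>
       #include <bpf/bpf_helpers.h>

       struct {
               __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
               __uint(max_entries, 4);
               __type(key, __u32);
               __type(value, __u32);
       } jmp_table SEC(".maps");

       SEC("tc")
       int dispatcher(struct __sk_buff *skb)
       {
               /* jump to jmp_table[0]; execution falls through to here if the
                * slot is empty or MAX_TAIL_CALL_CNT is exceeded */
               bpf_tail_call(skb, &jmp_table, 0);
               return 0;
       }

       char _license[] SEC("license") = "GPL";
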
      Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6651ee07
  12. 04 Jun 2015, 1 commit
  13. 30 Apr 2015, 2 commits
    • M
      s390/bpf: Fix gcov stack space problem · b9b4b1ce
      Michael Holzheu authored
      When compiling the kernel for GCOV (CONFIG_GCOV_KERNEL, -fprofile-arcs),
      gcc allocates a lot of stack space because of the large switch statement
      in bpf_jit_insn().
      
      This leads to the following compile warning:
      
       arch/s390/net/bpf_jit_comp.c: In function 'bpf_jit_prog':
       arch/s390/net/bpf_jit_comp.c:1144:1: warning: frame size of
        function 'bpf_jit_prog' is 12592 bytes which is more than
        half the stack size. The dynamic check would not be reliable.
        No check emitted for this function.
      
       arch/s390/net/bpf_jit_comp.c:1144:1: warning: the frame size of 12504
        bytes is larger than 1024 bytes [-Wframe-larger-than=]
      
      And indeed gcc allocates 12592 bytes of stack space:
      
       # objdump -d arch/s390/net/bpf_jit_comp.o
       ...
       0000000000000c60 <bpf_jit_prog>:
           c60:       eb 6f f0 48 00 24       stmg    %r6,%r15,72(%r15)
           c66:       b9 04 00 ef             lgr     %r14,%r15
           c6a:       e3 f0 fe d0 fc 71       lay     %r15,-12592(%r15)
      
      As a workaround for that problem we now define bpf_jit_insn() as
      noinline, which then reduces the stack space:
      
       # objdump -d arch/s390/net/bpf_jit_comp.o
       ...
       0000000000000070 <bpf_jit_insn>:
            70:       eb 6f f0 48 00 24       stmg    %r6,%r15,72(%r15)
            76:       c0 d0 00 00 00 00       larl    %r13,76 <bpf_jit_insn+0x6>
            7c:       a7 f1 3f 80             tmll    %r15,16256
            80:       b9 04 00 ef             lgr     %r14,%r15
            84:       e3 f0 ff a0 ff 71       lay     %r15,-96(%r15)
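
      A userspace model of the noinline effect (illustrative only; the kernel
      uses its own noinline macro):

       #include <stdio.h>

       /* The large frame stays inside big_switch() instead of being merged
        * into the caller's frame by inlining. */
       static __attribute__((noinline)) int big_switch(int op)
       {
               char scratch[512];      /* stand-in for the per-case locals of the real switch */

               scratch[0] = (char)op;
               switch (op) {
               case 0:
                       return scratch[0] + 1;
               default:
                       return scratch[0];
               }
       }

       int main(void)
       {
               printf("%d\n", big_switch(0));
               return 0;
       }
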
      Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      b9b4b1ce
    • M
      s390/bpf: Adjust ALU64_DIV/MOD to match interpreter change · 771aada9
      Michael Holzheu authored
      The s390x ALU64_DIV/MOD instructions were implemented according to the eBPF
      interpreter, which used do_div(). That function does a 64-bit
      by 32-bit divide. It turned out that this was wrong, and the interpreter
      now uses div64_u64_rem() for full 64-bit division.
      
      So fix this and use full 64-bit division in the s390x eBPF backend code.
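
      A small userspace illustration of the difference (operands are made up):

       #include <stdio.h>
       #include <stdint.h>

       int main(void)
       {
               uint64_t dividend = 0x200000000ULL;   /* 2^33 */
               uint64_t divisor  = 0x100000001ULL;   /* needs more than 32 bits */

               uint64_t full  = dividend / divisor;            /* 64-by-64 divide: 1 */
               uint64_t trunc = dividend / (uint32_t)divisor;  /* do_div()-style 64-by-32:
                                                                  divisor truncated to 1 */

               printf("64/64 = %llu, 64/32 = %llu\n",
                      (unsigned long long)full, (unsigned long long)trunc);
               return 0;
       }
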
      Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      771aada9
  14. 15 Apr 2015, 1 commit
  15. 15 Jan 2015, 1 commit
  16. 09 Jan 2015, 2 commits
  17. 14 Sep 2014, 1 commit
  18. 10 Sep 2014, 2 commits
    • D
      net: bpf: be friendly to kmemcheck · 286aad3c
      Daniel Borkmann authored
      As reported by Mikulas Patocka, kmemcheck currently barks out a
      false positive since we don't have a special kmemcheck annotation
      for bitfields used in the bpf_prog structure.
      
      We currently have jited:1, len:31 and thus when accessing len
      while CONFIG_KMEMCHECK is enabled, kmemcheck throws a warning that
      we're reading uninitialized memory.
      
      As we don't need the whole bit universe for the pages member, we
      can shrink it to a u16 and use a bool flag for jited instead
      of a bitfield.
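
      Roughly, the layout change described above (a hedged sketch of the
      affected bpf_prog fields, other members omitted):

       /* before: bitfield that kmemcheck cannot annotate */
       u32     pages;          /* Number of allocated pages */
       u32     jited:1,        /* Is our filter JIT'ed? */
               len:31;         /* Number of filter blocks */

       /* after: no bitfield access, pages shrunk to make room */
       u16     pages;          /* Number of allocated pages */
       bool    jited;          /* Is our filter JIT'ed? */
       u32     len;            /* Number of filter blocks */
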
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      286aad3c
    • D
      net: bpf: consolidate JIT binary allocator · 738cbe72
      Daniel Borkmann authored
      Commit 314beb9b ("x86: bpf_jit_comp: secure bpf jit against spraying
      attacks"), later replicated in aa2d2c73 ("s390/bpf,jit: address
      randomize and write protect jit code") for the s390 architecture, added
      write protection for BPF JIT images as well as a random start address
      for the JIT code, so that it is no longer placed on a page boundary.
      
      Since both use a very similar allocator for the BPF binary header,
      we can consolidate this code into the BPF core, as it's mostly JIT
      independent anyway.
      
      This will also allow future archs that support DEBUG_SET_MODULE_RONX
      to simply reuse it instead of reimplementing it.
      
      JIT tested on x86_64 and s390x with BPF test suite.
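
      A hedged sketch of how an arch JIT uses the consolidated allocator
      (fragment; error handling trimmed, the fill-callback name is illustrative):

       struct bpf_binary_header *header;
       u8 *image;

       header = bpf_jit_binary_alloc(proglen, &image, sizeof(u32), jit_fill_hole);
       if (!header)
               return;                 /* fall back to the interpreter */

       /* ... emit instructions into image ... */

       set_memory_ro((unsigned long)header, header->pages);
       prog->bpf_func = (void *)image;
       prog->jited = 1;
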
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      738cbe72
  19. 06 Sep 2014, 1 commit
    • D
      net: bpf: make eBPF interpreter images read-only · 60a3b225
      Daniel Borkmann authored
      With eBPF getting more extended and its exposure to user space on the
      way, hardening the memory range the interpreter uses to steer its
      command flow seems appropriate. This patch moves the to-be-interpreted
      bytecode to
      read-only pages.
      
      In case we execute a corrupted BPF interpreter image for some reason,
      e.g. caused by an attacker who got past the verifier stage, it would not
      only provide arbitrary read/write memory access but arbitrary function
      calls as well. After setting up the BPF interpreter image, its contents
      do not change until destruction time, thus we can set up the image on
      pages made immutable in order to mitigate modifications to that code. The idea
      is derived from commit 314beb9b ("x86: bpf_jit_comp: secure bpf jit
      against spraying attacks").
      
      This is possible because bpf_prog is not part of sk_filter anymore.
      After setup, bpf_prog cannot be altered during its lifetime. This prevents
      any modifications to the entire bpf_prog structure (incl. function/JIT
      image pointer).
      
      Every eBPF program (including migrated classic BPF programs) has to
      call bpf_prog_select_runtime() to select either the interpreter or a JIT
      image as a last setup step, and they are all freed via bpf_prog_free(),
      including the non-JIT case. Therefore, we can easily integrate this into
      the eBPF life-time, and since we directly allocate a bpf_prog, we have
      no performance penalty.
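
      A hedged sketch of the life-cycle hooks referred to above (fragment;
      simplified, exact signatures have changed across kernel versions):

       struct bpf_prog *fp = bpf_prog_alloc(bpf_prog_size(len), 0);

       /* ... fill fp->insnsi (native eBPF) or migrate classic BPF into it ... */

       bpf_prog_select_runtime(fp);    /* pick JIT or interpreter, lock the image read-only */

       /* ... BPF_PROG_RUN(fp, ctx) for the program's lifetime ... */

       bpf_prog_free(fp);              /* makes the pages writable again, then frees */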
      
      Tested with seccomp and the test_bpf test suite in JIT/non-JIT mode and
      by manual inspection of kernel_page_tables. Brad Spengler proposed the same idea
      via Twitter during development of this patch.
      
      Joint work with Hannes Frederic Sowa.
      Suggested-by: Brad Spengler <spender@grsecurity.net>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Kees Cook <keescook@chromium.org>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      60a3b225
  20. 03 Aug 2014, 1 commit
    • A
      net: filter: split 'struct sk_filter' into socket and bpf parts · 7ae457c1
      Alexei Starovoitov authored
      Clean up names related to socket filtering and BPF in the following way:
      - everything that deals with sockets keeps 'sk_*' prefix
      - everything that is pure BPF is changed to 'bpf_*' prefix
      
      split 'struct sk_filter' into
      struct sk_filter {
      	atomic_t        refcnt;
      	struct rcu_head rcu;
      	struct bpf_prog *prog;
      };
      and
      struct bpf_prog {
              u32                     jited:1,
                                      len:31;
              struct sock_fprog_kern  *orig_prog;
              unsigned int            (*bpf_func)(const struct sk_buff *skb,
                                                  const struct bpf_insn *filter);
              union {
                      struct sock_filter      insns[0];
                      struct bpf_insn         insnsi[0];
                      struct work_struct      work;
              };
      };
      so that 'struct bpf_prog' can be used independently of sockets, which cleans
      up the 'unattached' BPF use cases
      
      split SK_RUN_FILTER macro into:
          SK_RUN_FILTER to be used with 'struct sk_filter *' and
          BPF_PROG_RUN to be used with 'struct bpf_prog *'
      
      __sk_filter_release(struct sk_filter *) gains
      __bpf_prog_release(struct bpf_prog *) helper function
      
      Also perform related renames for the functions that work
      with 'struct bpf_prog *', since they are along the same lines:
      
      sk_filter_size -> bpf_prog_size
      sk_filter_select_runtime -> bpf_prog_select_runtime
      sk_filter_free -> bpf_prog_free
      sk_unattached_filter_create -> bpf_prog_create
      sk_unattached_filter_destroy -> bpf_prog_destroy
      sk_store_orig_filter -> bpf_prog_store_orig_filter
      sk_release_orig_filter -> bpf_release_orig_filter
      __sk_migrate_filter -> bpf_migrate_filter
      __sk_prepare_filter -> bpf_prepare_filter
      
      API for attaching classic BPF to a socket stays the same:
      sk_attach_filter(prog, struct sock *)/sk_detach_filter(struct sock *)
      and SK_RUN_FILTER(struct sk_filter *, ctx) to execute a program
      which is used by sockets, tun, af_packet
      
      API for 'unattached' BPF programs becomes:
      bpf_prog_create(struct bpf_prog **)/bpf_prog_destroy(struct bpf_prog *)
      and BPF_PROG_RUN(struct bpf_prog *, ctx) to execute a program
      which is used by isdn, ppp, team, seccomp, ptp, xt_bpf, cls_bpf, test_bpf
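
      A hedged in-kernel usage sketch of the renamed 'unattached' API
      (fragment; error handling trimmed):

       struct sock_filter insns[] = {
               BPF_STMT(BPF_RET | BPF_K, 0xffff),      /* classic BPF: accept packet */
       };
       struct sock_fprog_kern fprog = {
               .len    = ARRAY_SIZE(insns),
               .filter = insns,
       };
       struct bpf_prog *prog;

       if (bpf_prog_create(&prog, &fprog))
               return;                         /* invalid filter or out of memory */

       u32 res = BPF_PROG_RUN(prog, skb);      /* run against an skb context */

       bpf_prog_destroy(prog);
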
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7ae457c1
  21. 02 Jun 2014, 1 commit
    • D
      net: filter: get rid of BPF_S_* enum · 34805931
      Daniel Borkmann authored
      This patch finally allows us to get rid of the BPF_S_* enum.
      Currently, the code performs unnecessary encode and decode
      workarounds in seccomp and filter migration itself when a filter
      is being attached, in order to overcome the BPF_S_* encoding, which
      is no longer used by the new interpreter and JIT compilers.
      
      Keeping it around would mean that we would also need to extend and
      maintain this enum and the related encoders/decoders in the future. We
      can get rid of all that and save ourselves these operations during
      filter attaching. Naturally, the JIT compilers also need to be updated
      by this.
      
      Before the JIT conversion is done, each compiler checks whether A
      is loaded at startup, to determine if it needs to emit instructions
      to clear A first. Since BPF extensions are a subset of
      BPF_LD | BPF_{W,H,B} | BPF_ABS variants, case statements for extensions
      can be removed at that point. To ease and minimize code changes in the
      classic JITs, we have introduced bpf_anc_helper().
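
      A hedged sketch of how a classic JIT consumes bpf_anc_helper()
      (cases abbreviated):

       switch (bpf_anc_helper(&filter[i])) {
       case BPF_ANC | SKF_AD_PROTOCOL:
               /* emit a load of skb->protocol */
               break;
       case BPF_ANC | SKF_AD_IFINDEX:
               /* emit a load of skb->dev->ifindex */
               break;
       default:
               /* plain BPF_LD | BPF_{W,H,B} | BPF_ABS packet load,
                * or any other classic BPF opcode */
               break;
       }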
      
      Tested with test_bpf on x86_64 (JIT, int), s390x (JIT, int),
      arm (JIT, int), i386 (int), ppc64 (JIT, int); for sparc we
      unfortunately didn't have access, but changes are analogous to
      the rest.
      
      Joint work with Alexei Starovoitov.
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Mircea Gherzan <mgherzan@gmail.com>
      Cc: Kees Cook <keescook@chromium.org>
      Acked-by: Chema Gonzalez <chemag@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      34805931
  22. 15 May 2014, 1 commit
    • H
      net: filter: s390: fix JIT address randomization · e84d2f8d
      Heiko Carstens authored
      This is the s390 variant of Alexei's JIT bug fix.
      (patch description below stolen from Alexei's patch)
      
      bpf_alloc_binary() adds 128 bytes of room to the JITed program image
      and rounds it up to the nearest page size. If the image size is close
      to the page size (like 4000), it is rounded up to two pages:
      round_up(4000 + 4 + 128) == 8192
      then 'hole' is computed as 8192 - (4000 + 4) = 4188
      If prandom_u32() % hole selects a number >= PAGE_SIZE - sizeof(*header),
      then the kernel will crash during bpf_jit_free():
      
      kernel BUG at arch/x86/mm/pageattr.c:887!
      Call Trace:
       [<ffffffff81037285>] change_page_attr_set_clr+0x135/0x460
       [<ffffffff81694cc0>] ? _raw_spin_unlock_irq+0x30/0x50
       [<ffffffff810378ff>] set_memory_rw+0x2f/0x40
       [<ffffffffa01a0d8d>] bpf_jit_free_deferred+0x2d/0x60
       [<ffffffff8106bf98>] process_one_work+0x1d8/0x6a0
       [<ffffffff8106bf38>] ? process_one_work+0x178/0x6a0
       [<ffffffff8106c90c>] worker_thread+0x11c/0x370
      
      since bpf_jit_free() does:
        unsigned long addr = (unsigned long)fp->bpf_func & PAGE_MASK;
        struct bpf_binary_header *header = (void *)addr;
      to compute the start address of 'bpf_binary_header',
      and header->pages will pass junk to:
        set_memory_rw(addr, header->pages);
      
      Fix it by making sure that &header->image[prandom_u32() % hole] and &header
      are in the same page.
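
      A userspace model of the fix, using the numbers from above (sizes are
      illustrative):

       #include <stdio.h>

       #define PAGE_SIZE 4096u

       int main(void)
       {
               unsigned int proglen = 4000, hdr = 4;   /* hdr stands in for sizeof(*header) */
               unsigned int size = ((proglen + hdr + 128 + PAGE_SIZE - 1) / PAGE_SIZE) * PAGE_SIZE;

               unsigned int hole_old = size - (proglen + hdr);        /* 4188: start may land on page 2 */
               unsigned int hole_new = hole_old < PAGE_SIZE - hdr
                                       ? hole_old : PAGE_SIZE - hdr;  /* clamp: header stays on page 1 */

               printf("size=%u hole(old)=%u hole(new)=%u\n", size, hole_old, hole_new);
               return 0;
       }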
      
      Fixes: aa2d2c73 ("s390/bpf,jit: address randomize and write protect jit code")
      Reported-by: Alexei Starovoitov <ast@plumgrid.com>
      Cc: <stable@vger.kernel.org> # v3.11+
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e84d2f8d
  23. 25 Apr 2014, 1 commit
  24. 31 Mar 2014, 1 commit
    • D
      net: filter: add jited flag to indicate jit compiled filters · f8bbbfc3
      Daniel Borkmann authored
      This patch adds a jited flag into the sk_filter struct in order to
      indicate whether a filter is currently JITed or not. The size of
      sk_filter is not expanded, as the 32 bit 'len' member allows upper
      bits to be reused, since a filter can currently only grow as large as
      BPF_MAXINSNS.
      
      Therefore, there is also enough room for other flags needed in the
      future to reuse the 'len' field if necessary. The jited flag also allows
      for running alternative interpreter functions, as currently we can only
      detect JIT compiled filters by testing that fp->bpf_func does not equal
      the address of sk_run_filter().
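
      A hedged sketch of what the flag enables (fragment; simplified, not the
      exact call sites):

       if (fp->jited)
               bpf_jit_free(fp);       /* arch-specific teardown of the JIT image */
       else
               kfree(fp);              /* plain interpreter case (simplified) */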
      
      Joint work with Alexei Starovoitov.
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f8bbbfc3
  25. 27 Mar 2014, 1 commit
  26. 18 Jan 2014, 1 commit
    • H
      s390/bpf,jit: fix 32 bit divisions, use unsigned divide instructions · 3af57f78
      Heiko Carstens authored
      The s390 bpf jit compiler emits the signed divide instructions "dr" and "d"
      for unsigned divisions.
      This can cause problems: the dividend will be zero extended to a 64 bit value
      and the divisor is the 32 bit signed value specified in the A or X accumulator,
      even though A and X are supposed to be treated as unsigned values.
      
      The divide instructions will generate an exception if the result cannot be
      expressed as a 32 bit signed value.
      This is the case if e.g. the dividend is 0xffffffff and the divisor is either 1
      or also 0xffffffff (signed: -1).
      
      To avoid all these issues simply use unsigned divide instructions.
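
      A small userspace illustration of the failure mode (the signed 64-by-32
      quotient is computed in plain C here; the s390 "d"/"dr" instructions
      would raise a fixed-point divide exception instead):

       #include <stdio.h>
       #include <stdint.h>

       int main(void)
       {
               uint32_t a = 0xffffffffu;       /* dividend (A), zero extended */
               uint32_t x = 0xffffffffu;       /* divisor (X) */

               printf("unsigned divide (expected): %u\n", a / x);      /* 1 */

               /* signed view of the same bits: 4294967295 / -1 */
               int64_t q = (int64_t)a / (int32_t)x;
               printf("signed 64-by-32 quotient: %lld (does not fit in 32 bits)\n",
                      (long long)q);
               return 0;
       }
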
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3af57f78
  27. 16 Jan 2014, 1 commit
  28. 24 Oct 2013, 2 commits
  29. 08 Oct 2013, 1 commit
    • A
      net: fix unsafe set_memory_rw from softirq · d45ed4a4
      Alexei Starovoitov authored
      On an x86 system with net.core.bpf_jit_enable = 1
      
      sudo tcpdump -i eth1 'tcp port 22'
      
      causes the warning:
      [   56.766097]  Possible unsafe locking scenario:
      [   56.766097]
      [   56.780146]        CPU0
      [   56.786807]        ----
      [   56.793188]   lock(&(&vb->lock)->rlock);
      [   56.799593]   <Interrupt>
      [   56.805889]     lock(&(&vb->lock)->rlock);
      [   56.812266]
      [   56.812266]  *** DEADLOCK ***
      [   56.812266]
      [   56.830670] 1 lock held by ksoftirqd/1/13:
      [   56.836838]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff8118f44c>] vm_unmap_aliases+0x8c/0x380
      [   56.849757]
      [   56.849757] stack backtrace:
      [   56.862194] CPU: 1 PID: 13 Comm: ksoftirqd/1 Not tainted 3.12.0-rc3+ #45
      [   56.868721] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
      [   56.882004]  ffffffff821944c0 ffff88080bbdb8c8 ffffffff8175a145 0000000000000007
      [   56.895630]  ffff88080bbd5f40 ffff88080bbdb928 ffffffff81755b14 0000000000000001
      [   56.909313]  ffff880800000001 ffff880800000000 ffffffff8101178f 0000000000000001
      [   56.923006] Call Trace:
      [   56.929532]  [<ffffffff8175a145>] dump_stack+0x55/0x76
      [   56.936067]  [<ffffffff81755b14>] print_usage_bug+0x1f7/0x208
      [   56.942445]  [<ffffffff8101178f>] ? save_stack_trace+0x2f/0x50
      [   56.948932]  [<ffffffff810cc0a0>] ? check_usage_backwards+0x150/0x150
      [   56.955470]  [<ffffffff810ccb52>] mark_lock+0x282/0x2c0
      [   56.961945]  [<ffffffff810ccfed>] __lock_acquire+0x45d/0x1d50
      [   56.968474]  [<ffffffff810cce6e>] ? __lock_acquire+0x2de/0x1d50
      [   56.975140]  [<ffffffff81393bf5>] ? cpumask_next_and+0x55/0x90
      [   56.981942]  [<ffffffff810cef72>] lock_acquire+0x92/0x1d0
      [   56.988745]  [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380
      [   56.995619]  [<ffffffff817628f1>] _raw_spin_lock+0x41/0x50
      [   57.002493]  [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380
      [   57.009447]  [<ffffffff8118f52a>] vm_unmap_aliases+0x16a/0x380
      [   57.016477]  [<ffffffff8118f44c>] ? vm_unmap_aliases+0x8c/0x380
      [   57.023607]  [<ffffffff810436b0>] change_page_attr_set_clr+0xc0/0x460
      [   57.030818]  [<ffffffff810cfb8d>] ? trace_hardirqs_on+0xd/0x10
      [   57.037896]  [<ffffffff811a8330>] ? kmem_cache_free+0xb0/0x2b0
      [   57.044789]  [<ffffffff811b59c3>] ? free_object_rcu+0x93/0xa0
      [   57.051720]  [<ffffffff81043d9f>] set_memory_rw+0x2f/0x40
      [   57.058727]  [<ffffffff8104e17c>] bpf_jit_free+0x2c/0x40
      [   57.065577]  [<ffffffff81642cba>] sk_filter_release_rcu+0x1a/0x30
      [   57.072338]  [<ffffffff811108e2>] rcu_process_callbacks+0x202/0x7c0
      [   57.078962]  [<ffffffff81057f17>] __do_softirq+0xf7/0x3f0
      [   57.085373]  [<ffffffff81058245>] run_ksoftirqd+0x35/0x70
      
      We cannot reuse the JITed filter memory, since it is read-only,
      so use the original BPF insns memory to hold the work_struct.
      
      Defer the kfree of sk_filter until the JIT has completed freeing.
      
      Tested on x86_64 and i386.
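
      A hedged sketch of the deferral pattern described above (names and fields
      approximate the x86 code of that era, not an exact diff):

       static void bpf_jit_free_deferred(struct work_struct *work)
       {
               struct sk_filter *fp = container_of(work, struct sk_filter, work);
               unsigned long addr = (unsigned long)fp->bpf_func & PAGE_MASK;
               struct bpf_binary_header *header = (void *)addr;

               set_memory_rw(addr, header->pages);     /* safe here: process context */
               module_free(NULL, header);
               kfree(fp);
       }

       void bpf_jit_free(struct sk_filter *fp)
       {
               if (fp->bpf_func == sk_run_filter) {    /* never JITed */
                       kfree(fp);
                       return;
               }
               /* the work_struct lives in the now unused insns area of fp */
               INIT_WORK(&fp->work, bpf_jit_free_deferred);
               schedule_work(&fp->work);
       }
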
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d45ed4a4
  30. 04 Sep 2013, 1 commit
    • H
      s390/bpf,jit: fix address randomization · 4784955a
      Heiko Carstens authored
      Add missing braces to the hole calculation. The missing braces resulted
      in an addition instead of a subtraction, which in turn means that the
      JIT compiler could try to write out of bounds of the allocated piece of
      memory.
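
      A userspace illustration of the parenthesization error (numbers made up):

       #include <stdio.h>

       int main(void)
       {
               unsigned int alloc = 8192, proglen = 4000, hdr = 8;

               unsigned int hole_buggy = alloc - proglen + hdr;    /* 4200: "+ hdr" instead of "- hdr" */
               unsigned int hole_fixed = alloc - (proglen + hdr);  /* 4184 */

               printf("buggy=%u fixed=%u (the buggy hole lets the JIT write past the buffer)\n",
                      hole_buggy, hole_fixed);
               return 0;
       }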
      
      This bug was introduced with aa2d2c73 "s390/bpf,jit: address randomize
      and write protect jit code".
      
      Fixes this one:
      
      [   37.320956] Unable to handle kernel pointer dereference at virtual kernel address 000003ff80231000
      [   37.320984] Oops: 0011 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      [   37.320993] Modules linked in: dm_multipath scsi_dh eadm_sch dm_mod ctcm fsm autofs4
      [   37.321007] CPU: 28 PID: 6443 Comm: multipathd Not tainted 3.10.9-61.x.20130829-s390xdefault #1
      [   37.321011] task: 0000004ada778000 ti: 0000004ae3304000 task.ti: 0000004ae3304000
      [   37.321014] Krnl PSW : 0704c00180000000 000000000012d1de (bpf_jit_compile+0x198e/0x23d0)
      [   37.321022]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3
                     Krnl GPRS: 000000004350207d 0000004a00000001 0000000000000007 000003ff80231002
      [   37.321029]            0000000000000007 000003ff80230ffe 00000000a7740000 000003ff80230f76
      [   37.321032]            000003ffffffffff 000003ff00000000 000003ff0000007d 000000000071e820
      [   37.321035]            0000004adbe99950 000000000071ea18 0000004af3d9e7c0 0000004ae3307b80
      [   37.321046] Krnl Code: 000000000012d1d0: 41305004            la      %r3,4(%r5)
                                000000000012d1d4: e330f0f80021        clg     %r3,248(%r15)
                               #000000000012d1da: a7240009            brc     2,12d1ec
                               >000000000012d1de: 50805000            st      %r8,0(%r5)
                                000000000012d1e2: e330f0f00004        lg      %r3,240(%r15)
                                000000000012d1e8: 41303004            la      %r3,4(%r3)
                                000000000012d1ec: e380f0e00004        lg      %r8,224(%r15)
                                000000000012d1f2: e330f0f00024        stg     %r3,240(%r15)
      [   37.321074] Call Trace:
      [   37.321077] ([<000000000012da78>] bpf_jit_compile+0x2228/0x23d0)
      [   37.321083]  [<00000000006007c2>] sk_attach_filter+0xfe/0x214
      [   37.321090]  [<00000000005d2d92>] sock_setsockopt+0x926/0xbdc
      [   37.321097]  [<00000000005cbfb6>] SyS_setsockopt+0x8a/0xe8
      [   37.321101]  [<00000000005ccaa8>] SyS_socketcall+0x264/0x364
      [   37.321106]  [<0000000000713f1c>] sysc_nr_ok+0x22/0x28
      [   37.321113]  [<000003fffce10ea8>] 0x3fffce10ea8
      [   37.321118] INFO: lockdep is turned off.
      [   37.321121] Last Breaking-Event-Address:
      [   37.321124]  [<000000000012d192>] bpf_jit_compile+0x1942/0x23d0
      [   37.321132]
      [   37.321135] Kernel panic - not syncing: Fatal exception: panic_on_oops
      
      Cc: stable@vger.kernel.org # v3.11
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      4784955a
  31. 18 Jul 2013, 1 commit