1. 14 8月, 2020 9 次提交
    • A
      selftest/bpf: Fix compilation warnings in 32-bit mode · 9028bbcc
      Andrii Nakryiko 提交于
      Fix compilation warnings emitted when compiling selftests for 32-bit platform
      (x86 in my case).
      Signed-off-by: NAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200813204945.1020225-3-andriin@fb.com
      9028bbcc
    • A
      tools/bpftool: Fix compilation warnings in 32-bit mode · 09f44b75
      Andrii Nakryiko 提交于
      Fix few compilation warnings in bpftool when compiling in 32-bit mode.
      Abstract away u64 to pointer conversion into a helper function.
      Signed-off-by: NAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200813204945.1020225-2-andriin@fb.com
      09f44b75
    • J
      doc: Add link to bpf helpers man page · a62f68c1
      Joe Stringer 提交于
      The bpf-helpers(7) man pages provide an invaluable description of the
      functions that an eBPF program can call at runtime. Link them here.
      Signed-off-by: NJoe Stringer <joe@wand.net.nz>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20200813180807.2821735-1-joe@wand.net.nz
      a62f68c1
    • J
      bpf, selftests: Add tests to sock_ops for loading sk · 9efa9e49
      John Fastabend 提交于
      Add tests to directly accesse sock_ops sk field. Then use it to
      ensure a bad pointer access will fault if something goes wrong.
      We do three tests:
      
      The first test ensures when we read sock_ops sk pointer into the
      same register that we don't fault as described earlier. Here r9
      is chosen as the temp register.  The xlated code is,
      
        36: (7b) *(u64 *)(r1 +32) = r9
        37: (61) r9 = *(u32 *)(r1 +28)
        38: (15) if r9 == 0x0 goto pc+3
        39: (79) r9 = *(u64 *)(r1 +32)
        40: (79) r1 = *(u64 *)(r1 +0)
        41: (05) goto pc+1
        42: (79) r9 = *(u64 *)(r1 +32)
      
      The second test ensures the temp register selection does not collide
      with in-use register r9. Shown here r8 is chosen because r9 is the
      sock_ops pointer. The xlated code is as follows,
      
        46: (7b) *(u64 *)(r9 +32) = r8
        47: (61) r8 = *(u32 *)(r9 +28)
        48: (15) if r8 == 0x0 goto pc+3
        49: (79) r8 = *(u64 *)(r9 +32)
        50: (79) r9 = *(u64 *)(r9 +0)
        51: (05) goto pc+1
        52: (79) r8 = *(u64 *)(r9 +32)
      
      And finally, ensure we didn't break the base case where dst_reg does
      not equal the source register,
      
        56: (61) r2 = *(u32 *)(r1 +28)
        57: (15) if r2 == 0x0 goto pc+1
        58: (79) r2 = *(u64 *)(r1 +0)
      
      Notice it takes us an extra four instructions when src reg is the
      same as dst reg. One to save the reg, two to restore depending on
      the branch taken and a goto to jump over the second restore.
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NSong Liu <songliubraving@fb.com>
      Acked-by: NMartin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/159718355325.4728.4163036953345999636.stgit@john-Precision-5820-Tower
      9efa9e49
    • J
      bpf, selftests: Add tests for sock_ops load with r9, r8.r7 registers · 8e0c1517
      John Fastabend 提交于
      Loads in sock_ops case when using high registers requires extra logic to
      ensure the correct temporary value is used. We need to ensure the temp
      register does not use either the src_reg or dst_reg. Lets add an asm
      test to force the logic is triggered.
      
      The xlated code is here,
      
        30: (7b) *(u64 *)(r9 +32) = r7
        31: (61) r7 = *(u32 *)(r9 +28)
        32: (15) if r7 == 0x0 goto pc+2
        33: (79) r7 = *(u64 *)(r9 +0)
        34: (63) *(u32 *)(r7 +916) = r8
        35: (79) r7 = *(u64 *)(r9 +32)
      
      Notice r9 and r8 are not used for temp registers and r7 is chosen.
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NSong Liu <songliubraving@fb.com>
      Acked-by: NMartin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/159718353345.4728.8805043614257933227.stgit@john-Precision-5820-Tower
      8e0c1517
    • J
      bpf, selftests: Add tests for ctx access in sock_ops with single register · 86ed4be6
      John Fastabend 提交于
      To verify fix ("bpf: sock_ops ctx access may stomp registers in corner case")
      we want to force compiler to generate the following code when accessing a
      field with BPF_TCP_SOCK_GET_COMMON,
      
           r1 = *(u32 *)(r1 + 96) // r1 is skops ptr
      
      Rather than depend on clang to do this we add the test with inline asm to
      the tcpbpf test. This saves us from having to create another runner and
      ensures that if we break this again test_tcpbpf will crash.
      
      With above code we get the xlated code,
      
        11: (7b) *(u64 *)(r1 +32) = r9
        12: (61) r9 = *(u32 *)(r1 +28)
        13: (15) if r9 == 0x0 goto pc+4
        14: (79) r9 = *(u64 *)(r1 +32)
        15: (79) r1 = *(u64 *)(r1 +0)
        16: (61) r1 = *(u32 *)(r1 +2348)
        17: (05) goto pc+1
        18: (79) r9 = *(u64 *)(r1 +32)
      
      We also add the normal case where src_reg != dst_reg so we can compare
      code generation easily from llvm-objdump and ensure that case continues
      to work correctly. The normal code is xlated to,
      
        20: (b7) r1 = 0
        21: (61) r1 = *(u32 *)(r3 +28)
        22: (15) if r1 == 0x0 goto pc+2
        23: (79) r1 = *(u64 *)(r3 +0)
        24: (61) r1 = *(u32 *)(r1 +2348)
      
      Where the temp variable is not used.
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NSong Liu <songliubraving@fb.com>
      Acked-by: NMartin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/159718351457.4728.3295119261717842496.stgit@john-Precision-5820-Tower
      86ed4be6
    • J
      bpf: sock_ops sk access may stomp registers when dst_reg = src_reg · 84f44df6
      John Fastabend 提交于
      Similar to patch ("bpf: sock_ops ctx access may stomp registers") if the
      src_reg = dst_reg when reading the sk field of a sock_ops struct we
      generate xlated code,
      
        53: (61) r9 = *(u32 *)(r9 +28)
        54: (15) if r9 == 0x0 goto pc+3
        56: (79) r9 = *(u64 *)(r9 +0)
      
      This stomps on the r9 reg to do the sk_fullsock check and then when
      reading the skops->sk field instead of the sk pointer we get the
      sk_fullsock. To fix use similar pattern noted in the previous fix
      and use the temp field to save/restore a register used to do
      sk_fullsock check.
      
      After the fix the generated xlated code reads,
      
        52: (7b) *(u64 *)(r9 +32) = r8
        53: (61) r8 = *(u32 *)(r9 +28)
        54: (15) if r9 == 0x0 goto pc+3
        55: (79) r8 = *(u64 *)(r9 +32)
        56: (79) r9 = *(u64 *)(r9 +0)
        57: (05) goto pc+1
        58: (79) r8 = *(u64 *)(r9 +32)
      
      Here r9 register was in-use so r8 is chosen as the temporary register.
      In line 52 r8 is saved in temp variable and at line 54 restored in case
      fullsock != 0. Finally we handle fullsock == 0 case by restoring at
      line 58.
      
      This adds a new macro SOCK_OPS_GET_SK it is almost possible to merge
      this with SOCK_OPS_GET_FIELD, but I found the extra branch logic a
      bit more confusing than just adding a new macro despite a bit of
      duplicating code.
      
      Fixes: 1314ef56 ("bpf: export bpf_sock for BPF_PROG_TYPE_SOCK_OPS prog type")
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NSong Liu <songliubraving@fb.com>
      Acked-by: NMartin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/159718349653.4728.6559437186853473612.stgit@john-Precision-5820-Tower
      84f44df6
    • J
      bpf: sock_ops ctx access may stomp registers in corner case · fd09af01
      John Fastabend 提交于
      I had a sockmap program that after doing some refactoring started spewing
      this splat at me:
      
      [18610.807284] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
      [...]
      [18610.807359] Call Trace:
      [18610.807370]  ? 0xffffffffc114d0d5
      [18610.807382]  __cgroup_bpf_run_filter_sock_ops+0x7d/0xb0
      [18610.807391]  tcp_connect+0x895/0xd50
      [18610.807400]  tcp_v4_connect+0x465/0x4e0
      [18610.807407]  __inet_stream_connect+0xd6/0x3a0
      [18610.807412]  ? __inet_stream_connect+0x5/0x3a0
      [18610.807417]  inet_stream_connect+0x3b/0x60
      [18610.807425]  __sys_connect+0xed/0x120
      
      After some debugging I was able to build this simple reproducer,
      
       __section("sockops/reproducer_bad")
       int bpf_reproducer_bad(struct bpf_sock_ops *skops)
       {
              volatile __maybe_unused __u32 i = skops->snd_ssthresh;
              return 0;
       }
      
      And along the way noticed that below program ran without splat,
      
      __section("sockops/reproducer_good")
      int bpf_reproducer_good(struct bpf_sock_ops *skops)
      {
              volatile __maybe_unused __u32 i = skops->snd_ssthresh;
              volatile __maybe_unused __u32 family;
      
              compiler_barrier();
      
              family = skops->family;
              return 0;
      }
      
      So I decided to check out the code we generate for the above two
      programs and noticed each generates the BPF code you would expect,
      
      0000000000000000 <bpf_reproducer_bad>:
      ;       volatile __maybe_unused __u32 i = skops->snd_ssthresh;
             0:       r1 = *(u32 *)(r1 + 96)
             1:       *(u32 *)(r10 - 4) = r1
      ;       return 0;
             2:       r0 = 0
             3:       exit
      
      0000000000000000 <bpf_reproducer_good>:
      ;       volatile __maybe_unused __u32 i = skops->snd_ssthresh;
             0:       r2 = *(u32 *)(r1 + 96)
             1:       *(u32 *)(r10 - 4) = r2
      ;       family = skops->family;
             2:       r1 = *(u32 *)(r1 + 20)
             3:       *(u32 *)(r10 - 8) = r1
      ;       return 0;
             4:       r0 = 0
             5:       exit
      
      So we get reasonable assembly, but still something was causing the null
      pointer dereference. So, we load the programs and dump the xlated version
      observing that line 0 above 'r* = *(u32 *)(r1 +96)' is going to be
      translated by the skops access helpers.
      
      int bpf_reproducer_bad(struct bpf_sock_ops * skops):
      ; volatile __maybe_unused __u32 i = skops->snd_ssthresh;
         0: (61) r1 = *(u32 *)(r1 +28)
         1: (15) if r1 == 0x0 goto pc+2
         2: (79) r1 = *(u64 *)(r1 +0)
         3: (61) r1 = *(u32 *)(r1 +2340)
      ; volatile __maybe_unused __u32 i = skops->snd_ssthresh;
         4: (63) *(u32 *)(r10 -4) = r1
      ; return 0;
         5: (b7) r0 = 0
         6: (95) exit
      
      int bpf_reproducer_good(struct bpf_sock_ops * skops):
      ; volatile __maybe_unused __u32 i = skops->snd_ssthresh;
         0: (61) r2 = *(u32 *)(r1 +28)
         1: (15) if r2 == 0x0 goto pc+2
         2: (79) r2 = *(u64 *)(r1 +0)
         3: (61) r2 = *(u32 *)(r2 +2340)
      ; volatile __maybe_unused __u32 i = skops->snd_ssthresh;
         4: (63) *(u32 *)(r10 -4) = r2
      ; family = skops->family;
         5: (79) r1 = *(u64 *)(r1 +0)
         6: (69) r1 = *(u16 *)(r1 +16)
      ; family = skops->family;
         7: (63) *(u32 *)(r10 -8) = r1
      ; return 0;
         8: (b7) r0 = 0
         9: (95) exit
      
      Then we look at lines 0 and 2 above. In the good case we do the zero
      check in r2 and then load 'r1 + 0' at line 2. Do a quick cross-check
      into the bpf_sock_ops check and we can confirm that is the 'struct
      sock *sk' pointer field. But, in the bad case,
      
         0: (61) r1 = *(u32 *)(r1 +28)
         1: (15) if r1 == 0x0 goto pc+2
         2: (79) r1 = *(u64 *)(r1 +0)
      
      Oh no, we read 'r1 +28' into r1, this is skops->fullsock and then in
      line 2 we read the 'r1 +0' as a pointer. Now jumping back to our spat,
      
      [18610.807284] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
      
      The 0x01 makes sense because that is exactly the fullsock value. And
      its not a valid dereference so we splat.
      
      To fix we need to guard the case when a program is doing a sock_ops field
      access with src_reg == dst_reg. This is already handled in the load case
      where the ctx_access handler uses a tmp register being careful to
      store the old value and restore it. To fix the get case test if
      src_reg == dst_reg and in this case do the is_fullsock test in the
      temporary register. Remembering to restore the temporary register before
      writing to either dst_reg or src_reg to avoid smashing the pointer into
      the struct holding the tmp variable.
      
      Adding this inline code to test_tcpbpf_kern will now be generated
      correctly from,
      
        9: r2 = *(u32 *)(r2 + 96)
      
      to xlated code,
      
        12: (7b) *(u64 *)(r2 +32) = r9
        13: (61) r9 = *(u32 *)(r2 +28)
        14: (15) if r9 == 0x0 goto pc+4
        15: (79) r9 = *(u64 *)(r2 +32)
        16: (79) r2 = *(u64 *)(r2 +0)
        17: (61) r2 = *(u32 *)(r2 +2348)
        18: (05) goto pc+1
        19: (79) r9 = *(u64 *)(r2 +32)
      
      And in the normal case we keep the original code, because really this
      is an edge case. From this,
      
        9: r2 = *(u32 *)(r6 + 96)
      
      to xlated code,
      
        22: (61) r2 = *(u32 *)(r6 +28)
        23: (15) if r2 == 0x0 goto pc+2
        24: (79) r2 = *(u64 *)(r6 +0)
        25: (61) r2 = *(u32 *)(r2 +2348)
      
      So three additional instructions if dst == src register, but I scanned
      my current code base and did not see this pattern anywhere so should
      not be a big deal. Further, it seems no one else has hit this or at
      least reported it so it must a fairly rare pattern.
      
      Fixes: 9b1f3d6e ("bpf: Refactor sock_ops_convert_ctx_access")
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NSong Liu <songliubraving@fb.com>
      Acked-by: NMartin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/159718347772.4728.2781381670567919577.stgit@john-Precision-5820-Tower
      fd09af01
    • T
      libbpf: Prevent overriding errno when logging errors · 23ab656b
      Toke Høiland-Jørgensen 提交于
      Turns out there were a few more instances where libbpf didn't save the
      errno before writing an error message, causing errno to be overridden by
      the printf() return and the error disappearing if logging is enabled.
      Signed-off-by: NToke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAndrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/20200813142905.160381-1-toke@redhat.com
      23ab656b
  2. 13 8月, 2020 4 次提交
  3. 11 8月, 2020 8 次提交
  4. 09 8月, 2020 6 次提交
  5. 08 8月, 2020 9 次提交
  6. 07 8月, 2020 4 次提交
    • X
      drivers/net/wan/lapbether: Added needed_headroom and a skb->len check · c7ca03c2
      Xie He 提交于
      1. Added a skb->len check
      
      This driver expects upper layers to include a pseudo header of 1 byte
      when passing down a skb for transmission. This driver will read this
      1-byte header. This patch added a skb->len check before reading the
      header to make sure the header exists.
      
      2. Changed to use needed_headroom instead of hard_header_len to request
      necessary headroom to be allocated
      
      In net/packet/af_packet.c, the function packet_snd first reserves a
      headroom of length (dev->hard_header_len + dev->needed_headroom).
      Then if the socket is a SOCK_DGRAM socket, it calls dev_hard_header,
      which calls dev->header_ops->create, to create the link layer header.
      If the socket is a SOCK_RAW socket, it "un-reserves" a headroom of
      length (dev->hard_header_len), and assumes the user to provide the
      appropriate link layer header.
      
      So according to the logic of af_packet.c, dev->hard_header_len should
      be the length of the header that would be created by
      dev->header_ops->create.
      
      However, this driver doesn't provide dev->header_ops, so logically
      dev->hard_header_len should be 0.
      
      So we should use dev->needed_headroom instead of dev->hard_header_len
      to request necessary headroom to be allocated.
      
      This change fixes kernel panic when this driver is used with AF_PACKET
      SOCK_RAW sockets.
      
      Call stack when panic:
      
      [  168.399197] skbuff: skb_under_panic: text:ffffffff819d95fb len:20
      put:14 head:ffff8882704c0a00 data:ffff8882704c09fd tail:0x11 end:0xc0
      dev:veth0
      ...
      [  168.399255] Call Trace:
      [  168.399259]  skb_push.cold+0x14/0x24
      [  168.399262]  eth_header+0x2b/0xc0
      [  168.399267]  lapbeth_data_transmit+0x9a/0xb0 [lapbether]
      [  168.399275]  lapb_data_transmit+0x22/0x2c [lapb]
      [  168.399277]  lapb_transmit_buffer+0x71/0xb0 [lapb]
      [  168.399279]  lapb_kick+0xe3/0x1c0 [lapb]
      [  168.399281]  lapb_data_request+0x76/0xc0 [lapb]
      [  168.399283]  lapbeth_xmit+0x56/0x90 [lapbether]
      [  168.399286]  dev_hard_start_xmit+0x91/0x1f0
      [  168.399289]  ? irq_init_percpu_irqstack+0xc0/0x100
      [  168.399291]  __dev_queue_xmit+0x721/0x8e0
      [  168.399295]  ? packet_parse_headers.isra.0+0xd2/0x110
      [  168.399297]  dev_queue_xmit+0x10/0x20
      [  168.399298]  packet_sendmsg+0xbf0/0x19b0
      ......
      
      Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
      Cc: Martin Schiller <ms@dev.tdt.de>
      Cc: Brian Norris <briannorris@chromium.org>
      Signed-off-by: NXie He <xie.he.0141@gmail.com>
      Acked-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c7ca03c2
    • J
      bpf: Fix compilation warning of selftests · 929e54a9
      Jianlin Lv 提交于
      Clang compiler version: 12.0.0
      The following warning appears during the selftests/bpf compilation:
      
      prog_tests/send_signal.c:51:3: warning: ignoring return value of ‘write’,
      declared with attribute warn_unused_result [-Wunused-result]
         51 |   write(pipe_c2p[1], buf, 1);
            |   ^~~~~~~~~~~~~~~~~~~~~~~~~~
      prog_tests/send_signal.c:54:3: warning: ignoring return value of ‘read’,
      declared with attribute warn_unused_result [-Wunused-result]
         54 |   read(pipe_p2c[0], buf, 1);
            |   ^~~~~~~~~~~~~~~~~~~~~~~~~
      ......
      
      prog_tests/stacktrace_build_id_nmi.c:13:2: warning: ignoring return value
      of ‘fscanf’,declared with attribute warn_unused_result [-Wunused-resul]
         13 |  fscanf(f, "%llu", &sample_freq);
            |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      test_tcpnotify_user.c:133:2: warning:ignoring return value of ‘system’,
      declared with attribute warn_unused_result [-Wunused-result]
        133 |  system(test_script);
            |  ^~~~~~~~~~~~~~~~~~~
      test_tcpnotify_user.c:138:2: warning:ignoring return value of ‘system’,
      declared with attribute warn_unused_result [-Wunused-result]
        138 |  system(test_script);
            |  ^~~~~~~~~~~~~~~~~~~
      test_tcpnotify_user.c:143:2: warning:ignoring return value of ‘system’,
      declared with attribute warn_unused_result [-Wunused-result]
        143 |  system(test_script);
            |  ^~~~~~~~~~~~~~~~~~~
      
      Add code that fix compilation warning about ignoring return value and
      handles any errors; Check return value of library`s API make the code
      more secure.
      Signed-off-by: NJianlin Lv <Jianlin.Lv@arm.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200806104224.95306-1-Jianlin.Lv@arm.com
      929e54a9
    • J
      selftests: bpf: Switch off timeout · 6fc5916c
      Jiri Benc 提交于
      Several bpf tests are interrupted by the default timeout of 45 seconds added
      by commit 852c8cbf ("selftests/kselftest/runner.sh: Add 45 second
      timeout per test"). In my case it was test_progs, test_tunnel.sh,
      test_lwt_ip_encap.sh and test_xdping.sh.
      
      There's not much value in having a timeout for bpf tests, switch it off.
      Signed-off-by: NJiri Benc <jbenc@redhat.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/7a9198ed10917f4ecab4a3dd74bcda1200791efd.1596739059.git.jbenc@redhat.com
      6fc5916c
    • S
      bpf: Remove inline from bpf_do_trace_printk · 0d360d64
      Stanislav Fomichev 提交于
      I get the following error during compilation on my side:
      kernel/trace/bpf_trace.c: In function 'bpf_do_trace_printk':
      kernel/trace/bpf_trace.c:386:34: error: function 'bpf_do_trace_printk' can never be inlined because it uses variable argument lists
       static inline __printf(1, 0) int bpf_do_trace_printk(const char *fmt, ...)
                                        ^
      
      Fixes: ac5a72ea ("bpf: Use dedicated bpf_trace_printk event instead of trace_printk()")
      Signed-off-by: NStanislav Fomichev <sdf@google.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200806182612.1390883-1-sdf@google.com
      0d360d64