1. 29 Aug, 2017 7 commits
    • bpf: sockmap indicate sock events to listeners · 78aeaaef
      John Fastabend committed
      After userspace pushes sockets into a sockmap it may not be receiving
      data (assuming stream_{parser|verdict} programs are attached). But, it
      may still want to manage the socks. A common pattern is to poll/select
      for a POLLRDHUP event so we can close the sock.
      
      This patch adds the logic to wake up these listeners.
      
      Also add TCP_SYN_SENT to the list of events to handle. We don't want
      to break the connection just because we happen to be in this state.
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
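The poll-for-POLLRDHUP pattern mentioned in this commit can be sketched from the userspace side as follows. This is a minimal illustration, not code from the patch: the helper name `wait_for_peer_hangup` is hypothetical, and the fallback definition of `POLLRDHUP` is an assumption that only matters if the headers do not expose it.

```c
#define _GNU_SOURCE /* POLLRDHUP is a Linux extension exposed by _GNU_SOURCE */
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef POLLRDHUP
#define POLLRDHUP 0x2000 /* assumed fallback: the asm-generic Linux value */
#endif

/*
 * Hypothetical helper: wait up to timeout_ms for the peer of fd to shut
 * down its sending side (or for the connection to hang up or fail).
 * Returns 1 if such an event was reported, 0 on timeout, -1 if poll() failed.
 */
int wait_for_peer_hangup(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLRDHUP };
    int n;

    assert(fd >= 0);
    n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;
    if (n == 0)
        return 0;
    /* POLLHUP and POLLERR are always reported, even when not requested. */
    return (pfd.revents & (POLLRDHUP | POLLHUP | POLLERR)) ? 1 : 0;
}
```

A caller that has handed a socket to a sockmap could run this helper in its event loop and call `close(fd)` once it returns 1.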
    • bpf: harden sockmap program attach to ensure correct map type · 81374aaa
      John Fastabend committed
      When attaching a program to a sockmap we need to check that the map
      type is correct.
      
      Fixes: 174a79ff ("bpf: sockmap with sk redirect support")
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
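The kind of check this commit describes can be modeled in a few lines of plain C. This is only a userspace sketch under stated assumptions: the types `demo_bpf_map` and `demo_map_type`, the function `demo_attach_prog`, and the choice of `-EINVAL` as the error code are all illustrative and are not taken from the kernel source.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel map types; names are illustrative only. */
enum demo_map_type {
    DEMO_MAP_TYPE_HASH,
    DEMO_MAP_TYPE_ARRAY,
    DEMO_MAP_TYPE_SOCKMAP,
};

struct demo_bpf_map {
    enum demo_map_type type;
    int attached_prog; /* id of the attached program, 0 means none */
};

/* Refuse to attach unless the target really is a sockmap. */
int demo_attach_prog(struct demo_bpf_map *map, int prog_id)
{
    assert(map != NULL);
    if (map->type != DEMO_MAP_TYPE_SOCKMAP)
        return -EINVAL; /* assumed error code for a wrong map type */
    map->attached_prog = prog_id;
    return 0;
}
```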
    • bpf: more SK_SKB selftests · ed85054d
      John Fastabend committed
      Tests packet read/writes and additional skb fields.
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: additional sockmap self tests · 6fd28865
      John Fastabend committed
      Add some more sockmap tests to cover:
      
       - forwarding to NULL entries
       - more than two maps to test list ops
       - forwarding to different map
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: sockmap add missing rcu_read_(un)lock in smap_data_ready · d26e597d
      John Fastabend committed
      References to psock must be done inside RCU critical section.
      
      Fixes: 174a79ff ("bpf: sockmap with sk redirect support")
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: sockmap, remove STRPARSER map_flags and add multi-map support · 2f857d04
      John Fastabend committed
      The addition of map_flags BPF_SOCKMAP_STRPARSER flags was to handle a
      specific use case where we want to have BPF parse program disabled on
      an entry in a sockmap.
      
      However, Alexei found the API a bit cumbersome and I agreed. Let's
      remove the STRPARSER flag and support the use case by allowing socks
      to be in multiple maps. This allows users to create two maps one with
      programs attached and one without. When socks are added to maps they
      now inherit any programs attached to the map. This is a nice
      generalization and IMO improves the API.
      
      The API rules are less ambiguous and do not need a flag:
      
        - When a sock is added to a sockmap we have two cases,
      
           i. The sock map does not have any attached programs so
              we can add sock to map without inheriting bpf programs.
              The sock may exist in 0 or more other maps.
      
          ii. The sock map has an attached BPF program. To avoid duplicate
              bpf programs we only add the sock entry if it does not have
              an existing strparser/verdict attached, returning -EBUSY if
              a program is already attached. Otherwise attach the program
              and inherit strparser/verdict programs from the sock map.
      
      This allows for socks to be in multiple maps for redirects and
      inherit a BPF program from a single map.
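The two cases above can be modeled in userspace roughly as follows. This is a sketch under stated assumptions, not the kernel implementation: `demo_sock`, `demo_sockmap`, `demo_sockmap_add`, and the `-EINVAL` bounds check are hypothetical; only the `-EBUSY` result for an already-programmed sock comes from the commit text.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAP_SIZE 4

struct demo_sock {
    bool has_progs; /* strparser/verdict already inherited from some map */
};

struct demo_sockmap {
    bool has_progs; /* programs are attached to this map */
    struct demo_sock *entries[DEMO_MAP_SIZE];
};

/* Add sk at map[idx], following the two cases described in the commit. */
int demo_sockmap_add(struct demo_sockmap *map, size_t idx, struct demo_sock *sk)
{
    assert(map != NULL && sk != NULL);
    if (idx >= DEMO_MAP_SIZE)
        return -EINVAL; /* assumed error code for a bad index */

    if (map->has_progs) {
        /* Case ii: a sock may inherit programs from one map only. */
        if (sk->has_progs)
            return -EBUSY;
        sk->has_progs = true;
    }
    /* Case i, or a successful case ii: record the sock in the map. */
    map->entries[idx] = sk;
    return 0;
}
```

With this model a sock can sit in any number of program-less maps for redirects, while a second map with programs attached rejects it with `-EBUSY`.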
      
      Also this patch simplifies the logic around BPF_{EXIST|NOEXIST|ANY}
      flags. In the original patch I tried to be extra clever and only
      update map entries when necessary. Now I've decided the complexity
      is not worth it. If users constantly update an entry with the same
      sock for no reason (i.e. update an entry without actually changing
      any parameters on map or sock) we still do an alloc/release. Using
      this and allowing multiple entries of a sock to exist in a map the
      logic becomes much simpler.
      
      Note: Now that multiple maps are supported the "maps" pointer called
      when a socket is closed becomes a list of maps to remove the sock from.
      To keep the map up to date when a sock is added to the sockmap we must
      add the map/elem in the list. Likewise when it is removed we must
      remove it from the list. This results in searching the per psock list
      on delete operation. On TCP_CLOSE events we walk the list and remove
      the psock from all map/entry locations. I don't see any perf
      implications in this because at most I have a psock in two maps. If
      a psock were to be in many maps it's possible this might be noticeable
      on delete but I can't think of a reason to dup a psock in many maps.
      The sk_callback_lock is used to protect read/writes to the list. This
      was convenient because in all locations we were taking the lock
      anyways just after working on the list. Also the lock is per sock so
      in normal cases we shouldn't see any contention.
      Suggested-by: Alexei Starovoitov <ast@kernel.org>
      Fixes: 174a79ff ("bpf: sockmap with sk redirect support")
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
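The per-psock list bookkeeping described in the note of this commit (add a map/elem entry on insert, search the list on delete, walk the whole list on close) can be illustrated with a small userspace model. All names here (`demo_map`, `demo_link`, `demo_psock`, and the three functions) are hypothetical, and the locking that the commit attributes to sk_callback_lock is deliberately left out of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_SLOTS 4

struct demo_psock;

struct demo_map {
    struct demo_psock *slots[DEMO_SLOTS];
};

/* One node per map slot that currently references the psock. */
struct demo_link {
    struct demo_map *map;
    size_t idx;
    struct demo_link *next;
};

struct demo_psock {
    struct demo_link *links; /* every map/slot holding this psock */
};

/* Clear map[idx]; this searches the psock's list for the matching link. */
void demo_map_del(struct demo_map *map, size_t idx)
{
    struct demo_psock *ps;
    struct demo_link **pp;

    assert(map != NULL && idx < DEMO_SLOTS);
    ps = map->slots[idx];
    if (!ps)
        return;
    for (pp = &ps->links; *pp; pp = &(*pp)->next) {
        struct demo_link *l = *pp;

        if (l->map == map && l->idx == idx) {
            *pp = l->next;
            free(l);
            break;
        }
    }
    map->slots[idx] = NULL;
}

/* Store ps in map[idx] and remember that location on the psock's list. */
int demo_map_add(struct demo_map *map, size_t idx, struct demo_psock *ps)
{
    struct demo_link *l;

    assert(map != NULL && ps != NULL && idx < DEMO_SLOTS);
    l = malloc(sizeof(*l));
    if (!l)
        return -1;
    demo_map_del(map, idx); /* drop whatever occupied the slot before */
    l->map = map;
    l->idx = idx;
    l->next = ps->links;
    ps->links = l;
    map->slots[idx] = ps;
    return 0;
}

/* Close path: walk the list and remove the psock from every location. */
void demo_psock_close(struct demo_psock *ps)
{
    assert(ps != NULL);
    while (ps->links) {
        struct demo_link *l = ps->links;

        ps->links = l->next;
        l->map->slots[l->idx] = NULL;
        free(l);
    }
}
```

Delete cost grows with the number of locations a psock occupies, which matches the commit's remark that this only becomes noticeable if a psock is placed in many maps.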
    • bpf: convert sockmap field attach_bpf_fd2 to type · 464bc0fd
      John Fastabend committed
      In the initial sockmap API we provided strparser and verdict programs
      using a single attach command by extending the attach API with the
      attach_bpf_fd2 field.
      
      However, if we add other programs in the future we will be adding a
      field for every new possible type, attach_bpf_fd(3,4,..). This
      seems a bit clumsy for an API. So let's push the programs using two
      new type fields.
      
         BPF_SK_SKB_STREAM_PARSER
         BPF_SK_SKB_STREAM_VERDICT
      
      This has the advantage of having a readable name and can easily be
      extended in the future.
      
      Updates to samples and sockmap included here also generalize tests
      slightly to support upcoming patch for multiple map support.
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Fixes: 174a79ff ("bpf: sockmap with sk redirect support")
      Suggested-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
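From userspace, attaching by type might look roughly like the sketch below: one BPF_PROG_ATTACH call per program, each naming its attach type. The helper names `sockmap_attach` and `sockmap_attach_both` are hypothetical, and the sketch assumes kernel headers recent enough to define the two attach types and the `target_fd`, `attach_bpf_fd`, and `attach_type` fields of `union bpf_attr`.

```c
#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: attach prog_fd to the sockmap map_fd using the
 * given attach type. Returns 0 on success, -1 with errno set on failure.
 */
int sockmap_attach(int map_fd, int prog_fd, enum bpf_attach_type type)
{
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.target_fd = map_fd;
    attr.attach_bpf_fd = prog_fd;
    attr.attach_type = type;

    return (int)syscall(__NR_bpf, BPF_PROG_ATTACH, &attr, sizeof(attr));
}

/* Two typed calls take the place of a single call with attach_bpf_fd2. */
int sockmap_attach_both(int map_fd, int parser_fd, int verdict_fd)
{
    if (sockmap_attach(map_fd, parser_fd, BPF_SK_SKB_STREAM_PARSER) < 0)
        return -1;
    return sockmap_attach(map_fd, verdict_fd, BPF_SK_SKB_STREAM_VERDICT);
}
```

Adding a further program kind in the future would then mean a new attach type value rather than another fd field in the attribute struct.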
  2. 28 Aug, 2017 1 commit
  3. 26 Aug, 2017 31 commits
  4. 25 Aug, 2017 1 commit
    • strparser: initialize all callbacks · 3fd87127
      Eric Biggers committed
      commit bbb03029 ("strparser: Generalize strparser") added more
      function pointers to 'struct strp_callbacks'; however, kcm_attach() was
      not updated to initialize them.  This could cause the ->lock() and/or
      ->unlock() function pointers to be set to garbage values, causing a
      crash in strp_work().
      
      Fix the bug by moving the callback structs into static memory, so
      unspecified members are zeroed.  Also constify them while we're at it.
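The effect of the fix can be shown with a small standalone example. The struct `demo_callbacks` and the functions below are hypothetical and only loosely modeled on the idea of a callback table with optional members; they are not the strparser definitions. By contrast, a struct declared on the stack without an initializer and then filled in member by member leaves every unassigned member indeterminate, which is the kind of garbage pointer the commit describes.

```c
#include <stddef.h>

/* Hypothetical callback table with two required and two optional hooks. */
struct demo_callbacks {
    int  (*parse_msg)(const char *buf, size_t len);
    void (*rcv_msg)(const char *buf, size_t len);
    void (*lock)(void);   /* optional */
    void (*unlock)(void); /* optional */
};

static int demo_parse(const char *buf, size_t len)
{
    (void)buf;
    return (int)len; /* treat the whole buffer as one message */
}

static void demo_rcv(const char *buf, size_t len)
{
    (void)buf;
    (void)len;
}

/* Static and const: members not named in the initializer are NULL. */
static const struct demo_callbacks demo_cb = {
    .parse_msg = demo_parse,
    .rcv_msg   = demo_rcv,
};

/* Callers can safely test the optional hooks before invoking them. */
int demo_run(const struct demo_callbacks *cb, const char *buf, size_t len)
{
    int ret;

    if (cb->lock)
        cb->lock();
    ret = cb->parse_msg(buf, len);
    if (ret > 0)
        cb->rcv_msg(buf, (size_t)ret);
    if (cb->unlock)
        cb->unlock();
    return ret;
}
```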
      
      This bug was found by syzkaller, which encountered the following splat:
      
          IP: 0x55
          PGD 3b1ca067
          P4D 3b1ca067
          PUD 3b12f067
          PMD 0
      
          Oops: 0010 [#1] SMP KASAN
          Dumping ftrace buffer:
             (ftrace buffer empty)
          Modules linked in:
          CPU: 2 PID: 1194 Comm: kworker/u8:1 Not tainted 4.13.0-rc4-next-20170811 #2
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
          Workqueue: kstrp strp_work
          task: ffff88006bb0e480 task.stack: ffff88006bb10000
          RIP: 0010:0x55
          RSP: 0018:ffff88006bb17540 EFLAGS: 00010246
          RAX: dffffc0000000000 RBX: ffff88006ce4bd60 RCX: 0000000000000000
          RDX: 1ffff1000d9c97bd RSI: 0000000000000000 RDI: ffff88006ce4bc48
          RBP: ffff88006bb17558 R08: ffffffff81467ab2 R09: 0000000000000000
          R10: ffff88006bb17438 R11: ffff88006bb17940 R12: ffff88006ce4bc48
          R13: ffff88003c683018 R14: ffff88006bb17980 R15: ffff88003c683000
          FS:  0000000000000000(0000) GS:ffff88006de00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 0000000000000055 CR3: 000000003c145000 CR4: 00000000000006e0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
          Call Trace:
           process_one_work+0xbf3/0x1bc0 kernel/workqueue.c:2098
           worker_thread+0x223/0x1860 kernel/workqueue.c:2233
           kthread+0x35e/0x430 kernel/kthread.c:231
           ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
          Code:  Bad RIP value.
          RIP: 0x55 RSP: ffff88006bb17540
          CR2: 0000000000000055
          ---[ end trace f0e4920047069cee ]---
      
      Here is a C reproducer (requires CONFIG_BPF_SYSCALL=y and
      CONFIG_AF_KCM=y):
      
          #include <linux/bpf.h>
          #include <linux/kcm.h>
          #include <linux/types.h>
          #include <stdint.h>
          #include <sys/ioctl.h>
          #include <sys/socket.h>
          #include <sys/syscall.h>
          #include <unistd.h>
      
          static const struct bpf_insn bpf_insns[3] = {
              { .code = 0xb7 }, /* BPF_MOV64_IMM(0, 0) */
              { .code = 0x95 }, /* BPF_EXIT_INSN() */
          };
      
          static const union bpf_attr bpf_attr = {
              .prog_type = 1,
              .insn_cnt = 2,
              .insns = (uintptr_t)&bpf_insns,
              .license = (uintptr_t)"",
          };
      
          int main(void)
          {
              int bpf_fd = syscall(__NR_bpf, BPF_PROG_LOAD,
                                   &bpf_attr, sizeof(bpf_attr));
              int inet_fd = socket(AF_INET, SOCK_STREAM, 0);
              int kcm_fd = socket(AF_KCM, SOCK_DGRAM, 0);
      
              ioctl(kcm_fd, SIOCKCMATTACH,
                    &(struct kcm_attach) { .fd = inet_fd, .bpf_fd = bpf_fd });
          }
      
      Fixes: bbb03029 ("strparser: Generalize strparser")
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Tom Herbert <tom@quantonium.net>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>