1. 29 8月, 2017 20 次提交
  2. 26 8月, 2017 10 次提交
  3. 25 8月, 2017 10 次提交
    • E
      strparser: initialize all callbacks · 3fd87127
      Eric Biggers 提交于
      commit bbb03029 ("strparser: Generalize strparser") added more
      function pointers to 'struct strp_callbacks'; however, kcm_attach() was
      not updated to initialize them.  This could cause the ->lock() and/or
      ->unlock() function pointers to be set to garbage values, causing a
      crash in strp_work().
      
      Fix the bug by moving the callback structs into static memory, so
      unspecified members are zeroed.  Also constify them while we're at it.
      
      This bug was found by syzkaller, which encountered the following splat:
      
          IP: 0x55
          PGD 3b1ca067
          P4D 3b1ca067
          PUD 3b12f067
          PMD 0
      
          Oops: 0010 [#1] SMP KASAN
          Dumping ftrace buffer:
             (ftrace buffer empty)
          Modules linked in:
          CPU: 2 PID: 1194 Comm: kworker/u8:1 Not tainted 4.13.0-rc4-next-20170811 #2
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
          Workqueue: kstrp strp_work
          task: ffff88006bb0e480 task.stack: ffff88006bb10000
          RIP: 0010:0x55
          RSP: 0018:ffff88006bb17540 EFLAGS: 00010246
          RAX: dffffc0000000000 RBX: ffff88006ce4bd60 RCX: 0000000000000000
          RDX: 1ffff1000d9c97bd RSI: 0000000000000000 RDI: ffff88006ce4bc48
          RBP: ffff88006bb17558 R08: ffffffff81467ab2 R09: 0000000000000000
          R10: ffff88006bb17438 R11: ffff88006bb17940 R12: ffff88006ce4bc48
          R13: ffff88003c683018 R14: ffff88006bb17980 R15: ffff88003c683000
          FS:  0000000000000000(0000) GS:ffff88006de00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 0000000000000055 CR3: 000000003c145000 CR4: 00000000000006e0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
          Call Trace:
           process_one_work+0xbf3/0x1bc0 kernel/workqueue.c:2098
           worker_thread+0x223/0x1860 kernel/workqueue.c:2233
           kthread+0x35e/0x430 kernel/kthread.c:231
           ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
          Code:  Bad RIP value.
          RIP: 0x55 RSP: ffff88006bb17540
          CR2: 0000000000000055
          ---[ end trace f0e4920047069cee ]---
      
      Here is a C reproducer (requires CONFIG_BPF_SYSCALL=y and
      CONFIG_AF_KCM=y):
      
          #include <linux/bpf.h>
          #include <linux/kcm.h>
          #include <linux/types.h>
          #include <stdint.h>
          #include <sys/ioctl.h>
          #include <sys/socket.h>
          #include <sys/syscall.h>
          #include <unistd.h>
      
          static const struct bpf_insn bpf_insns[3] = {
              { .code = 0xb7 }, /* BPF_MOV64_IMM(0, 0) */
              { .code = 0x95 }, /* BPF_EXIT_INSN() */
          };
      
          static const union bpf_attr bpf_attr = {
              .prog_type = 1,
              .insn_cnt = 2,
              .insns = (uintptr_t)&bpf_insns,
              .license = (uintptr_t)"",
          };
      
          int main(void)
          {
              int bpf_fd = syscall(__NR_bpf, BPF_PROG_LOAD,
                                   &bpf_attr, sizeof(bpf_attr));
              int inet_fd = socket(AF_INET, SOCK_STREAM, 0);
              int kcm_fd = socket(AF_KCM, SOCK_DGRAM, 0);
      
              ioctl(kcm_fd, SIOCKCMATTACH,
                    &(struct kcm_attach) { .fd = inet_fd, .bpf_fd = bpf_fd });
          }
      
      Fixes: bbb03029 ("strparser: Generalize strparser")
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Tom Herbert <tom@quantonium.net>
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3fd87127
    • J
      ipv6: Use multipath hash from flow info if available · b673d6cc
      Jakub Sitnicki 提交于
      Allow our callers to influence the choice of ECMP link by honoring the
      hash passed together with the flow info. This allows for special
      treatment of ICMP errors which we would like to route over the same path
      as the IPv6 datagram that triggered the error.
      
      Also go through rt6_multipath_hash(), in the usual case when we aren't
      dealing with an ICMP error, so that there is one central place where
      multipath hash is computed.
      Signed-off-by: NJakub Sitnicki <jkbs@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b673d6cc
    • J
      ipv6: Fold rt6_info_hash_nhsfn() into its only caller · 956b4531
      Jakub Sitnicki 提交于
      Commit 644d0e65 ("ipv6 Use get_hash_from_flowi6 for rt6 hash") has
      turned rt6_info_hash_nhsfn() into a one-liner, so it no longer makes
      sense to keep it around. Also remove the accompanying comment that has
      become outdated.
      Signed-off-by: NJakub Sitnicki <jkbs@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      956b4531
    • J
      ipv6: Compute multipath hash for ICMP errors from offending packet · 23aebdac
      Jakub Sitnicki 提交于
      When forwarding or sending out an ICMPv6 error, look at the embedded
      packet that triggered the error and compute a flow hash over its
      headers.
      
      This let's us route the ICMP error together with the flow it belongs to
      when multipath (ECMP) routing is in use, which in turn makes Path MTU
      Discovery work in ECMP load-balanced or anycast setups (RFC 7690).
      
      Granted, end-hosts behind the ECMP router (aka servers) need to reflect
      the IPv6 Flow Label for PMTUD to work.
      
      The code is organized to be in parallel with ipv4 stack:
      
        ip_multipath_l3_keys -> ip6_multipath_l3_keys
        fib_multipath_hash   -> rt6_multipath_hash
      Signed-off-by: NJakub Sitnicki <jkbs@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      23aebdac
    • J
      ipv6: Add sysctl for per namespace flow label reflection · 22b6722b
      Jakub Sitnicki 提交于
      Reflecting IPv6 Flow Label at server nodes is useful in environments
      that employ multipath routing to load balance the requests. As "IPv6
      Flow Label Reflection" standard draft [1] points out - ICMPv6 PTB error
      messages generated in response to a downstream packets from the server
      can be routed by a load balancer back to the original server without
      looking at transport headers, if the server applies the flow label
      reflection. This enables the Path MTU Discovery past the ECMP router in
      load-balance or anycast environments where each server node is reachable
      by only one path.
      
      Introduce a sysctl to enable flow label reflection per net namespace for
      all newly created sockets. Same could be earlier achieved only per
      socket by setting the IPV6_FL_F_REFLECT flag for the IPV6_FLOWLABEL_MGR
      socket option.
      
      [1] https://tools.ietf.org/html/draft-wang-6man-flow-label-reflection-01Signed-off-by: NJakub Sitnicki <jkbs@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      22b6722b
    • J
      xdp: remove net_device names from xdp_redirect tracepoint · a8735855
      Jesper Dangaard Brouer 提交于
      There is too much overhead in the current trace_xdp_redirect
      tracepoint as it does strcpy and strlen on the net_device names.
      
      Besides, exposing the ifindex/index is actually the information that
      is needed in the tracepoint to diagnose issues.  When a lookup fails
      (either ifindex or devmap index) then there is a need for saying which
      to_index that have issues.
      
      V2: Adjust args to be aligned with trace_xdp_exception.
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a8735855
    • J
      xdp: make generic xdp redirect use tracepoint trace_xdp_redirect · 2facaad6
      Jesper Dangaard Brouer 提交于
      If the xdp_do_generic_redirect() call fails, it trigger the
      trace_xdp_exception tracepoint.  It seems better to use the same
      tracepoint trace_xdp_redirect, as the native xdp_do_redirect{,_map} does.
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2facaad6
    • J
      xdp: remove bpf_warn_invalid_xdp_redirect · d08adb82
      Jesper Dangaard Brouer 提交于
      Given there is a tracepoint that can track the error code
      of xdp_do_redirect calls, the WARN_ONCE in bpf_warn_invalid_xdp_redirect
      doesn't seem relevant any longer.  Simply remove the function.
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d08adb82
    • A
      devlink: Move dpipe entry clear function into devlink · 35807324
      Arkadi Sharshevsky 提交于
      The entry clear routine can be shared between the drivers, thus it is
      moved inside devlink.
      Signed-off-by: NArkadi Sharshevsky <arkadis@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      35807324
    • A
      devlink: Add support for dynamic table size · ffd3cdcc
      Arkadi Sharshevsky 提交于
      Up until now the dpipe table's size was static and known at registration
      time. The host table does not have constant size and it is resized in
      dynamic manner. In order to support this behavior the size is changed
      to be obtained dynamically via an op.
      
      This patch also adjust the current dpipe table for the new API.
      Signed-off-by: NArkadi Sharshevsky <arkadis@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ffd3cdcc