1. 15 Oct, 2015 1 commit
  2. 08 Oct, 2015 1 commit
  3. 05 Oct, 2015 1 commit
    • bpf: fix panic in SO_GET_FILTER with native ebpf programs · 93d08b69
      Daniel Borkmann committed
      When a socket has a native eBPF program attached through
      setsockopt(sk, SOL_SOCKET, SO_ATTACH_BPF, ...), and user space then
      tries to dump it over getsockopt(sk, SOL_SOCKET, SO_GET_FILTER, ...),
      the following panic appears:
      
        [49904.178642] BUG: unable to handle kernel NULL pointer dereference at (null)
        [49904.178762] IP: [<ffffffff81610fd9>] sk_get_filter+0x39/0x90
        [49904.182000] PGD 86fc9067 PUD 531a1067 PMD 0
        [49904.185196] Oops: 0000 [#1] SMP
        [...]
        [49904.224677] Call Trace:
        [49904.226090]  [<ffffffff815e3d49>] sock_getsockopt+0x319/0x740
        [49904.227535]  [<ffffffff812f59e3>] ? sock_has_perm+0x63/0x70
        [49904.228953]  [<ffffffff815e2fc8>] ? release_sock+0x108/0x150
        [49904.230380]  [<ffffffff812f5a43>] ? selinux_socket_getsockopt+0x23/0x30
        [49904.231788]  [<ffffffff815dff36>] SyS_getsockopt+0xa6/0xc0
        [49904.233267]  [<ffffffff8171b9ae>] entry_SYSCALL_64_fastpath+0x12/0x71
      
      The underlying issue is the very same as in commit b382c086
      ("sock, diag: fix panic in sock_diag_put_filterinfo"), that is,
      native eBPF programs don't store an original program since this
      is only needed in cBPF ones.
      
      However, sk_get_filter() wasn't updated to test for this at the
      time when eBPF could be attached. Just throw an error to the user
      to indicate that eBPF cannot be dumped over this interface.
      That way, it can also be known that a program _is_ attached (as
      opposed to just returning 0), and that a different (future) method
      needs to be consulted for a dump.
      
      Fixes: 89aa0758 ("net: sock: allow eBPF programs to be attached to sockets")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
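
The check described in this commit can be sketched as a small userspace model. This is only an illustration under stated assumptions: the `model_*` types and function are hypothetical stand-ins rather than kernel definitions, and the specific errno (`-EACCES`) is an assumption, since the message only says that an error is returned.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for kernel structures; not the real definitions. */
struct model_insn {
    unsigned short code;
    unsigned char jt, jf;
    unsigned int k;
};

struct model_orig_prog {
    unsigned int len;               /* number of cBPF instructions */
    const struct model_insn *insns;
};

struct model_filter {
    /* NULL for native eBPF programs: only cBPF keeps its original form. */
    const struct model_orig_prog *orig_prog;
};

/*
 * Returns 0 when no filter is attached, the instruction count on a
 * successful (or length-only) query, or a negative errno.
 */
int model_get_filter(const struct model_filter *filter,
                     struct model_insn *ubuf, unsigned int len)
{
    if (!filter)
        return 0;                   /* nothing attached */

    if (!filter->orig_prog)
        return -EACCES;             /* assumed errno: eBPF cannot be dumped here */

    if (len == 0)
        return (int)filter->orig_prog->len;  /* caller only asked for the size */

    if (len < filter->orig_prog->len)
        return -EINVAL;             /* user buffer too small */

    memcpy(ubuf, filter->orig_prog->insns,
           filter->orig_prog->len * sizeof(*ubuf));
    return (int)filter->orig_prog->len;
}
```

Returning an error instead of 0 for the eBPF case lets a caller distinguish "no filter attached" from "a filter is attached but cannot be dumped this way".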
  4. 30 Sep, 2015 2 commits
  5. 25 Sep, 2015 2 commits
    • net: fix net_device refcounting · 9861f720
      Russell King committed
      of_find_net_device_by_node() uses class_find_device() internally to
      look up the corresponding network device.  class_find_device() returns
      a reference to the embedded struct device, with its refcount
      incremented.
      
      Add a comment to the definition in net/core/net-sysfs.c indicating the
      need to drop this refcount, and fix the DSA code to drop this refcount
      when the OF-generated platform data is cleaned up and freed.  Also
      arrange for the ref to be dropped when handling errors.
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
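
The refcounting rule from this commit (a successful lookup takes a reference, and every exit path must drop it) can be illustrated with a toy model. All names here (`model_device`, `model_find`, `model_put`, `model_setup`) are hypothetical and only mimic the get/put pattern; they are not the kernel or DSA functions.

```c
#include <stddef.h>
#include <string.h>

/* Toy reference-counted device; stands in for struct device. */
struct model_device {
    const char *name;
    int refcount;
};

/* Like a class lookup: a successful find takes a reference for the caller. */
struct model_device *model_find(struct model_device *devs, size_t n,
                                const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(devs[i].name, name) == 0) {
            devs[i].refcount++;     /* caller now owns one reference */
            return &devs[i];
        }
    }
    return NULL;
}

/* Drop a reference previously taken by model_find(). */
void model_put(struct model_device *dev)
{
    if (dev)
        dev->refcount--;
}

/* Hypothetical setup path: returns 0 on success, -1 on failure. */
int model_setup(struct model_device *devs, size_t n, const char *name,
                int fail_later)
{
    struct model_device *dev = model_find(devs, n, name);

    if (!dev)
        return -1;                  /* nothing found, nothing to drop */

    if (fail_later) {
        model_put(dev);             /* error path must drop the reference */
        return -1;
    }

    /* ... use dev ... */
    model_put(dev);                 /* cleanup path drops it as well */
    return 0;
}
```

In this sketch the count returns to its initial value on both the success path and the error path, which is the property the fix restores.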
    • fib_rules: fix fib rule dumps across multiple skbs · 41fc0143
      Wilson Kok committed
      dump_rules() returns the skb length rather than an error.
      But when family == AF_UNSPEC, the caller of dump_rules()
      assumes that it returns an error. Hence, when family == AF_UNSPEC,
      we keep trying to dump after -EMSGSIZE errors, which results in an
      incorrect dump idx being carried between skbs belonging to the same
      dump. As a result, a fib rule dump only ever dumps the rules that fit
      into the first skb.
      
      This patch fixes dump_rules() to return an error so that we exit
      correctly and idx is correctly maintained between skbs that are part
      of the same dump.
      Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
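
The resume logic this commit relies on can be modelled in a few lines. The sketch below is an assumption-laden simplification: `model_skb` is just a counter with a capacity, `model_cb` holds the two resume indices, and the function names are hypothetical. It shows why the per-family dump has to return an error once the buffer is full, so the AF_UNSPEC-style caller stops and keeps the saved index intact.

```c
#include <errno.h>

#define MODEL_FAMILIES 2

/* Resume state carried between skbs of the same dump. */
struct model_cb {
    int family_idx;
    int rule_idx;
};

/* A buffer that can hold at most `room` rules. */
struct model_skb {
    int room;
    int used;
};

/* Dump one family starting at cb->rule_idx. Returns 0 or -EMSGSIZE. */
static int model_dump_rules(struct model_skb *skb, struct model_cb *cb,
                            int nrules)
{
    int idx;

    for (idx = cb->rule_idx; idx < nrules; idx++) {
        if (skb->used == skb->room) {
            cb->rule_idx = idx;     /* remember where to resume */
            return -EMSGSIZE;
        }
        skb->used++;
    }
    cb->rule_idx = 0;               /* family done; next one starts at 0 */
    return 0;
}

/* Walk all families; stop as soon as the current skb is full. */
int model_dump_all(struct model_skb *skb, struct model_cb *cb,
                   const int *nrules)
{
    int fam;

    for (fam = cb->family_idx; fam < MODEL_FAMILIES; fam++) {
        if (model_dump_rules(skb, cb, nrules[fam]) < 0)
            break;                  /* keep cb->rule_idx for the next skb */
    }
    cb->family_idx = fam;
    return skb->used;
}
```

Calling `model_dump_all()` repeatedly with fresh buffers emits every rule exactly once; if the inner function reported a length instead of an error, the loop would move on to the next family with a stale index.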
  6. 24 Sep, 2015 1 commit
    • netpoll: Close race condition between poll_one_napi and napi_disable · 2d8bff12
      Neil Horman committed
      Drivers might call napi_disable while not holding the napi instance poll_lock.
      In those instances, it's possible for a race condition to exist between
      poll_one_napi and napi_disable.  That is to say, poll_one_napi only tests the
      NAPI_STATE_SCHED bit to see if there is work to do during a poll, and as such
      the following may happen:
      
      CPU0				CPU1
      ndo_tx_timeout			napi_poll_dev
       napi_disable			 poll_one_napi
        test_and_set_bit (ret 0)
      				  test_bit (ret 1)
         reset adapter		   napi_poll_routine
      
      If the adapter gets a tx timeout without a napi instance scheduled, it's possible
      for the adapter to think it has exclusive access to the hardware  (as the napi
      instance is now scheduled via the napi_disable call), while the netpoll code
      thinks there is simply work to do.  The result is parallel hardware access
      leading to corrupt data structures in the driver, and a crash.
      
      Additionally, there is another, more critical race between netpoll and
      napi_disable.  The disabled napi state is actually identical to the scheduled
      state for a given napi instance.  The implication is that, if a napi instance
      is disabled, a netconsole instance would see the napi state of the device as
      having been scheduled, and poll it, likely while the driver was doing something
      requiring exclusive access.  In the case above, it's fairly clear that not having
      the rings in a state ready to be polled will cause any number of crashes.
      
      The fix should be pretty easy.  netpoll uses its own bit to indicate that
      the napi instance is in a state of being serviced by netpoll (NAPI_STATE_NPSVC).
      We can just gate disabling on that bit as well as the sched bit.  That should
      prevent netpoll from conducting a napi poll if we convert its set bit to a
      test_and_set_bit operation to provide mutual exclusion.
      
      Change notes:
      V2)
      	Remove a trailing whitespace
      	Resubmit with proper subject prefix
      
      V3)
      	Clean up spacing nits
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: jmaxwell@redhat.com
      Tested-by: jmaxwell@redhat.com
      Signed-off-by: David S. Miller <davem@davemloft.net>
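
The mutual exclusion described in this commit can be approximated in userspace with C11 atomics. This is a sketch under assumptions: the `MODEL_*` bits and `model_*` functions are hypothetical, and the disable side is modelled as a single non-blocking attempt that claims both bits with one compare-and-swap, which simplifies the sleep-and-retry behaviour a driver-facing disable would have. The netpoll side uses an atomic fetch-or as its test-and-set.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_SCHED 0x1u   /* poll scheduled (also set while disabled) */
#define MODEL_NPSVC 0x2u   /* being serviced by netpoll */

struct model_napi {
    atomic_uint state;
    int polls;
};

void model_napi_init(struct model_napi *n, unsigned int state)
{
    atomic_init(&n->state, state);
    n->polls = 0;
}

/* Atomically set a bit and report whether it was already set. */
static bool model_test_and_set(struct model_napi *n, unsigned int bit)
{
    return (atomic_fetch_or(&n->state, bit) & bit) != 0;
}

/* One attempt at disabling: succeeds only if neither bit is held. */
bool model_try_disable(struct model_napi *n)
{
    unsigned int old = atomic_load(&n->state);

    if (old & (MODEL_SCHED | MODEL_NPSVC))
        return false;               /* busy; a real caller would retry */
    return atomic_compare_exchange_strong(&n->state, &old,
                                          old | MODEL_SCHED | MODEL_NPSVC);
}

/* netpoll side: poll only if scheduled and not already claimed. */
bool model_netpoll_poll(struct model_napi *n)
{
    if (!(atomic_load(&n->state) & MODEL_SCHED))
        return false;               /* no work scheduled */
    if (model_test_and_set(n, MODEL_NPSVC))
        return false;               /* disabled or already being serviced */
    n->polls++;                     /* stand-in for the actual poll routine */
    atomic_fetch_and(&n->state, ~MODEL_NPSVC);
    return true;
}
```

After a successful `model_try_disable()`, the instance still looks scheduled, but `model_netpoll_poll()` backs off because the servicing bit is already taken.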
  7. 16 Sep, 2015 2 commits
  8. 12 Sep, 2015 1 commit
  9. 10 Sep, 2015 1 commit
    • net: ipv6: use common fib_default_rule_pref · f53de1e9
      Phil Sutter committed
      This switches IPv6 policy routing to use the shared
      fib_default_rule_pref() function of IPv4 and DECnet. It is also used in
      multicast routing for IPv4 as well as IPv6.
      
      The motivation for this patch is a complaint about iproute2 behaving
      inconsistently between IPv4 and IPv6 when adding policy rules: Formerly,
      IPv6 rules were assigned a fixed priority of 0x3FFF whereas for IPv4 the
      assigned priority value was decreased with each rule added.
      
      Since then all users of the default_pref field have been converted to
      assign the generic function fib_default_rule_pref(), fib_nl_newrule()
      may just use it directly instead. Therefore get rid of the function
      pointer altogether and make fib_default_rule_pref() static, as it's not
      used outside fib_rules.c anymore.
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Signed-off-by: David S. Miller <davem@davemloft.net>
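
The "decreasing priority" behaviour mentioned for IPv4 can be sketched as follows. The exact rule used here (one less than the preference of the second entry in the sorted rule list, or 0 if that is not possible) is an assumption made for illustration, as are the function names and the example preference values; the commit message itself only states that the assigned value decreases with each rule added.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed default-preference rule: look at the second entry of the
 * ascending rule list and pick a value just below it.
 */
uint32_t model_default_rule_pref(const uint32_t *prefs, size_t n)
{
    if (n >= 2 && prefs[1] != 0)
        return prefs[1] - 1;
    return 0;
}

/*
 * Hypothetical helper: insert a preference keeping the array in
 * ascending order. The caller must provide room for n + 1 entries.
 * Returns the new element count.
 */
size_t model_add_rule(uint32_t *prefs, size_t n, uint32_t pref)
{
    size_t i = n;

    while (i > 0 && prefs[i - 1] > pref) {
        prefs[i] = prefs[i - 1];    /* shift larger entries up */
        i--;
    }
    prefs[i] = pref;
    return n + 1;
}
```

Starting from example preferences 0, 32766 and 32767, successive rules added without an explicit priority receive 32765, then 32764, and so on, rather than one fixed value.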
  10. 03 Sep, 2015 1 commit
    • sock, diag: fix panic in sock_diag_put_filterinfo · b382c086
      Daniel Borkmann committed
      diag socket's sock_diag_put_filterinfo() dumps classic BPF programs
      upon request to user space (ss -0 -b). However, native eBPF programs
      attached to sockets (SO_ATTACH_BPF) cannot be dumped with this method:
      
      Their orig_prog is always NULL. However, sock_diag_put_filterinfo()
      unconditionally tries to access its filter length and to copy
      the filter insns from there. Internal cBPF to eBPF transformations
      attached to sockets don't have this issue, as orig_prog state is kept.
      
      It's currently only used by packet sockets. If we wanted to add
      native eBPF support in the future, this needs to be done through
      a different attribute than PACKET_DIAG_FILTER to not confuse possible
      user space disassemblers that work on diag data.
      
      Fixes: 89aa0758 ("net: sock: allow eBPF programs to be attached to sockets")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 02 Sep, 2015 13 commits
  12. 01 Sep, 2015 2 commits
    • tun_dst: Remove opts_size · 63b6c13d
      Pravin B Shelar committed
      opts_size is only written and never read. This patch
      removes this unused variable.
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: use dctcp if enabled on the route to the initiator · c3a8d947
      Daniel Borkmann committed
      Currently, the following case doesn't use DCTCP, even if it should:
      A responder has, for example, Cubic as system-wide default, but for a specific
      route to the initiating host, DCTCP is being set in RTAX_CC_ALGO. The
      initiating host then uses DCTCP as congestion control, but since the
      initiator sets ECT(0), tcp_ecn_create_request() doesn't set ecn_ok,
      and we have to fall back to Reno after 3WHS completes.
      
      We were thinking about how to solve this in a minimal, non-intrusive
      way without bloating tcp_ecn_create_request() needlessly: let's cache
      the CA ecn option flag in RTAX_FEATURES. In other words, when ECT(0)
      is set on the SYN packet, set ecn_ok=1 iff route RTAX_FEATURES
      contains the unexposed (internal-only) DST_FEATURE_ECN_CA. This allows
      us to do only a single metric feature lookup inside tcp_ecn_create_request().
      
      Joint work with Florian Westphal.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
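
The decision described in this commit can be expressed as one boolean function. The sketch below is a simplified model, not the kernel routine: the `MODEL_FEATURE_*` bit values, the `model_syn` structure and the parameter names are hypothetical, and the way the pre-existing conditions are combined is an assumption. The part taken from the message is the last term: a SYN carrying ECT is accepted for ECN when the cached route feature flag for an ECN-requiring congestion control is set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit for the cached route feature flag. */
#define MODEL_FEATURE_ECN_CA (1u << 1)  /* route's CA needs ECN (internal) */

struct model_syn {
    bool th_ecn;    /* SYN carries the ECN-setup TCP flags */
    bool ect;       /* IP header of the SYN has an ECT codepoint set */
};

/*
 * ecn_enabled:           ECN generally permitted for this connection
 * listener_ca_needs_ecn: the listener's own congestion control needs ECN
 * route_features:        cached per-route feature bits
 */
bool model_ecn_ok(const struct model_syn *syn, bool ecn_enabled,
                  bool listener_ca_needs_ecn, uint32_t route_features)
{
    if (!syn->th_ecn)
        return false;               /* peer did not ask for ECN at all */

    return (!syn->ect && ecn_enabled) ||
           listener_ca_needs_ecn ||
           (route_features & MODEL_FEATURE_ECN_CA) != 0;
}
```

With this shape, the per-route case costs a single feature-bit test instead of a lookup of the route's congestion control algorithm.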
  13. 31 Aug, 2015 1 commit
  14. 30 Aug, 2015 2 commits
  15. 29 Aug, 2015 1 commit
  16. 28 Aug, 2015 4 commits
  17. 27 Aug, 2015 1 commit
  18. 26 Aug, 2015 2 commits
    • route: fix a use-after-free · e252b3d1
      WANG Cong committed
      This patch fixes the following crash:
      
       general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
       CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.2.0-rc7+ #166
       Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
       task: ffff88010656d280 ti: ffff880106570000 task.ti: ffff880106570000
       RIP: 0010:[<ffffffff8182f91b>]  [<ffffffff8182f91b>] dst_destroy+0xa6/0xef
       RSP: 0018:ffff880107603e38  EFLAGS: 00010202
       RAX: 0000000000000001 RBX: ffff8800d225a000 RCX: ffffffff82250fd0
       RDX: 0000000000000001 RSI: ffffffff82250fd0 RDI: 6b6b6b6b6b6b6b6b
       RBP: ffff880107603e58 R08: 0000000000000001 R09: 0000000000000001
       R10: 000000000000b530 R11: ffff880107609000 R12: 0000000000000000
       R13: ffffffff82343c40 R14: 0000000000000000 R15: ffffffff8182fb4f
       FS:  0000000000000000(0000) GS:ffff880107600000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
       CR2: 00007fcabd9d3000 CR3: 00000000d7279000 CR4: 00000000000006e0
       Stack:
        ffffffff82250fd0 ffff8801077d6f00 ffffffff82253c40 ffff8800d225a000
        ffff880107603e68 ffffffff8182fb5d ffff880107603f08 ffffffff810d795e
        ffffffff810d7648 ffff880106574000 ffff88010656d280 ffff88010656d280
       Call Trace:
        <IRQ>
        [<ffffffff8182fb5d>] dst_destroy_rcu+0xe/0x1d
        [<ffffffff810d795e>] rcu_process_callbacks+0x618/0x7eb
        [<ffffffff810d7648>] ? rcu_process_callbacks+0x302/0x7eb
        [<ffffffff8182fb4f>] ? dst_gc_task+0x1eb/0x1eb
        [<ffffffff8107e11b>] __do_softirq+0x178/0x39f
        [<ffffffff8107e52e>] irq_exit+0x41/0x95
        [<ffffffff81a4f215>] smp_apic_timer_interrupt+0x34/0x40
        [<ffffffff81a4d5cd>] apic_timer_interrupt+0x6d/0x80
        <EOI>
        [<ffffffff8100b968>] ? default_idle+0x21/0x32
        [<ffffffff8100b966>] ? default_idle+0x1f/0x32
        [<ffffffff8100bf19>] arch_cpu_idle+0xf/0x11
        [<ffffffff810b0bc7>] default_idle_call+0x1f/0x21
        [<ffffffff810b0dce>] cpu_startup_entry+0x1ad/0x273
        [<ffffffff8102fe67>] start_secondary+0x135/0x156
      
      dst is freed right before the call to lwtstate_put(), which is not correct.
      
      Fixes: 61adedf3 ("route: move lwtunnel state to dst_entry")
      Acked-by: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: Cong Wang <cwang@twopensource.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
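
The ordering problem behind this crash can be shown with a minimal userspace model. The `model_*` types and functions are hypothetical stand-ins; the only point is that anything reachable through the object must be released before the object itself is freed.

```c
#include <stdlib.h>

/* Toy reference-counted tunnel state attached to a route entry. */
struct model_lwtstate {
    int refcnt;
};

struct model_dst {
    struct model_lwtstate *lwtstate;
};

/* Drop one reference; free the state when the last one goes away. */
static void model_lwtstate_put(struct model_lwtstate *lws)
{
    if (lws && --lws->refcnt == 0)
        free(lws);
}

/* Release what the dst points at first, then free the dst itself. */
void model_dst_destroy(struct model_dst *dst)
{
    model_lwtstate_put(dst->lwtstate);  /* reads dst, so it comes first */
    free(dst);                          /* dst must not be touched after this */
}
```

Swapping the two statements in `model_dst_destroy()` would read `dst->lwtstate` from freed memory, which is the kind of access the poison pattern `6b6b6b6b6b6b6b6b` in the register dump above points to.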
    • net-next: Fix warning while make xmldocs caused by skbuff.c · d7499160
      Masanari Iida committed
      This patch fixes the following warnings.
      
      .//net/core/skbuff.c:407: warning: No description found
      for parameter 'len'
      .//net/core/skbuff.c:407: warning: Excess function parameter
       'length' description in '__netdev_alloc_skb'
      .//net/core/skbuff.c:476: warning: No description found
       for parameter 'len'
      .//net/core/skbuff.c:476: warning: Excess function parameter
      'length' description in '__napi_alloc_skb'
      Signed-off-by: Masanari Iida <standby24x7@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  19. 25 Aug, 2015 1 commit