1. 10 9月, 2015 4 次提交
    • W
      ipv6: fix ifnullfree.cocci warnings · 52fe51f8
      Wu Fengguang 提交于
      net/ipv6/route.c:2946:3-8: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values.
      
       NULL check before some freeing functions is not needed.
      
       Based on checkpatch warning
       "kfree(NULL) is safe this check is probably not required"
       and kfreeaddr.cocci by Julia Lawall.
      
      Generated by: scripts/coccinelle/free/ifnullfree.cocci
      
      CC: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: NFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      52fe51f8
    • P
      net: ipv6: use common fib_default_rule_pref · f53de1e9
      Phil Sutter 提交于
      This switches IPv6 policy routing to use the shared
      fib_default_rule_pref() function of IPv4 and DECnet. It is also used in
      multicast routing for IPv4 as well as IPv6.
      
      The motivation for this patch is a complaint about iproute2 behaving
      inconsistent between IPv4 and IPv6 when adding policy rules: Formerly,
      IPv6 rules were assigned a fixed priority of 0x3FFF whereas for IPv4 the
      assigned priority value was decreased with each rule added.
      
      Since then all users of the default_pref field have been converted to
      assign the generic function fib_default_rule_pref(), fib_nl_newrule()
      may just use it directly instead. Therefore get rid of the function
      pointer altogether and make fib_default_rule_pref() static, as it's not
      used outside fib_rules.c anymore.
      Signed-off-by: NPhil Sutter <phil@nwl.cc>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f53de1e9
    • R
      ipv6: fix multipath route replace error recovery · 6b9ea5a6
      Roopa Prabhu 提交于
      Problem:
      The ecmp route replace support for ipv6 in the kernel, deletes the
      existing ecmp route too early, ie when it installs the first nexthop.
      If there is an error in installing the subsequent nexthops, its too late
      to recover the already deleted existing route leaving the fib
      in an inconsistent state.
      
      This patch reduces the possibility of this by doing the following:
      a) Changes the existing multipath route add code to a two stage process:
        build rt6_infos + insert them
      	ip6_route_add rt6_info creation code is moved into
      	ip6_route_info_create.
      b) This ensures that most errors are caught during building rt6_infos
        and we fail early
      c) Separates multipath add and del code. Because add needs the special
        two stage mode in a) and delete essentially does not care.
      d) In any event if the code fails during inserting a route again, a
        warning is printed (This should be unlikely)
      
      Before the patch:
      $ip -6 route show
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
      
      /* Try replacing the route with a duplicate nexthop */
      $ip -6 route change 3000:1000:1000:1000::2/128 nexthop via
      fe80::202:ff:fe00:b dev swp49s0 nexthop via fe80::202:ff:fe00:d dev
      swp49s1 nexthop via fe80::202:ff:fe00:d dev swp49s1
      RTNETLINK answers: File exists
      
      $ip -6 route show
      /* previously added ecmp route 3000:1000:1000:1000::2 dissappears from
       * kernel */
      
      After the patch:
      $ip -6 route show
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
      
      /* Try replacing the route with a duplicate nexthop */
      $ip -6 route change 3000:1000:1000:1000::2/128 nexthop via
      fe80::202:ff:fe00:b dev swp49s0 nexthop via fe80::202:ff:fe00:d dev
      swp49s1 nexthop via fe80::202:ff:fe00:d dev swp49s1
      RTNETLINK answers: File exists
      
      $ip -6 route show
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
      
      Fixes: 27596472 ("ipv6: fix ECMP route replacement")
      Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
      Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Acked-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6b9ea5a6
    • S
      RDS: verify the underlying transport exists before creating a connection · 74e98eb0
      Sasha Levin 提交于
      There was no verification that an underlying transport exists when creating
      a connection, this would cause dereferencing a NULL ptr.
      
      It might happen on sockets that weren't properly bound before attempting to
      send a message, which will cause a NULL ptr deref:
      
      [135546.047719] kasan: GPF could be caused by NULL-ptr deref or user memory accessgeneral protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
      [135546.051270] Modules linked in:
      [135546.051781] CPU: 4 PID: 15650 Comm: trinity-c4 Not tainted 4.2.0-next-20150902-sasha-00041-gbaa1222-dirty #2527
      [135546.053217] task: ffff8800835bc000 ti: ffff8800bc708000 task.ti: ffff8800bc708000
      [135546.054291] RIP: __rds_conn_create (net/rds/connection.c:194)
      [135546.055666] RSP: 0018:ffff8800bc70fab0  EFLAGS: 00010202
      [135546.056457] RAX: dffffc0000000000 RBX: 0000000000000f2c RCX: ffff8800835bc000
      [135546.057494] RDX: 0000000000000007 RSI: ffff8800835bccd8 RDI: 0000000000000038
      [135546.058530] RBP: ffff8800bc70fb18 R08: 0000000000000001 R09: 0000000000000000
      [135546.059556] R10: ffffed014d7a3a23 R11: ffffed014d7a3a21 R12: 0000000000000000
      [135546.060614] R13: 0000000000000001 R14: ffff8801ec3d0000 R15: 0000000000000000
      [135546.061668] FS:  00007faad4ffb700(0000) GS:ffff880252000000(0000) knlGS:0000000000000000
      [135546.062836] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [135546.063682] CR2: 000000000000846a CR3: 000000009d137000 CR4: 00000000000006a0
      [135546.064723] Stack:
      [135546.065048]  ffffffffafe2055c ffffffffafe23fc1 ffffed00493097bf ffff8801ec3d0008
      [135546.066247]  0000000000000000 00000000000000d0 0000000000000000 ac194a24c0586342
      [135546.067438]  1ffff100178e1f78 ffff880320581b00 ffff8800bc70fdd0 ffff880320581b00
      [135546.068629] Call Trace:
      [135546.069028] ? __rds_conn_create (include/linux/rcupdate.h:856 net/rds/connection.c:134)
      [135546.069989] ? rds_message_copy_from_user (net/rds/message.c:298)
      [135546.071021] rds_conn_create_outgoing (net/rds/connection.c:278)
      [135546.071981] rds_sendmsg (net/rds/send.c:1058)
      [135546.072858] ? perf_trace_lock (include/trace/events/lock.h:38)
      [135546.073744] ? lockdep_init (kernel/locking/lockdep.c:3298)
      [135546.074577] ? rds_send_drop_to (net/rds/send.c:976)
      [135546.075508] ? __might_fault (./arch/x86/include/asm/current.h:14 mm/memory.c:3795)
      [135546.076349] ? __might_fault (mm/memory.c:3795)
      [135546.077179] ? rds_send_drop_to (net/rds/send.c:976)
      [135546.078114] sock_sendmsg (net/socket.c:611 net/socket.c:620)
      [135546.078856] SYSC_sendto (net/socket.c:1657)
      [135546.079596] ? SYSC_connect (net/socket.c:1628)
      [135546.080510] ? trace_dump_stack (kernel/trace/trace.c:1926)
      [135546.081397] ? ring_buffer_unlock_commit (kernel/trace/ring_buffer.c:2479 kernel/trace/ring_buffer.c:2558 kernel/trace/ring_buffer.c:2674)
      [135546.082390] ? trace_buffer_unlock_commit (kernel/trace/trace.c:1749)
      [135546.083410] ? trace_event_raw_event_sys_enter (include/trace/events/syscalls.h:16)
      [135546.084481] ? do_audit_syscall_entry (include/trace/events/syscalls.h:16)
      [135546.085438] ? trace_buffer_unlock_commit (kernel/trace/trace.c:1749)
      [135546.085515] rds_ib_laddr_check(): addr 36.74.25.172 ret -99 node type -1
      Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      74e98eb0
  2. 09 9月, 2015 3 次提交
    • K
      net: tipc: fix stall during bclink wakeup procedure · 7845989c
      Kolmakov Dmitriy 提交于
      If an attempt to wake up users of broadcast link is made when there is
      no enough place in send queue than it may hang up inside the
      tipc_sk_rcv() function since the loop breaks only after the wake up
      queue becomes empty. This can lead to complete CPU stall with the
      following message generated by RCU:
      
      INFO: rcu_sched self-detected stall on CPU { 0}  (t=2101 jiffies
      					g=54225 c=54224 q=11465)
      Task dump for CPU 0:
      tpch            R  running task        0 39949  39948 0x0000000a
       ffffffff818536c0 ffff88181fa037a0 ffffffff8106a4be 0000000000000000
       ffffffff818536c0 ffff88181fa037c0 ffffffff8106d8a8 ffff88181fa03800
       0000000000000001 ffff88181fa037f0 ffffffff81094a50 ffff88181fa15680
      Call Trace:
       <IRQ>  [<ffffffff8106a4be>] sched_show_task+0xae/0x120
       [<ffffffff8106d8a8>] dump_cpu_task+0x38/0x40
       [<ffffffff81094a50>] rcu_dump_cpu_stacks+0x90/0xd0
       [<ffffffff81097c3b>] rcu_check_callbacks+0x3eb/0x6e0
       [<ffffffff8106e53f>] ? account_system_time+0x7f/0x170
       [<ffffffff81099e64>] update_process_times+0x34/0x60
       [<ffffffff810a84d1>] tick_sched_handle.isra.18+0x31/0x40
       [<ffffffff810a851c>] tick_sched_timer+0x3c/0x70
       [<ffffffff8109a43d>] __run_hrtimer.isra.34+0x3d/0xc0
       [<ffffffff8109aa95>] hrtimer_interrupt+0xc5/0x1e0
       [<ffffffff81030d52>] ? native_smp_send_reschedule+0x42/0x60
       [<ffffffff81032f04>] local_apic_timer_interrupt+0x34/0x60
       [<ffffffff810335bc>] smp_apic_timer_interrupt+0x3c/0x60
       [<ffffffff8165a3fb>] apic_timer_interrupt+0x6b/0x70
       [<ffffffff81659129>] ? _raw_spin_unlock_irqrestore+0x9/0x10
       [<ffffffff8107eb9f>] __wake_up_sync_key+0x4f/0x60
       [<ffffffffa313ddd1>] tipc_write_space+0x31/0x40 [tipc]
       [<ffffffffa313dadf>] filter_rcv+0x31f/0x520 [tipc]
       [<ffffffffa313d699>] ? tipc_sk_lookup+0xc9/0x110 [tipc]
       [<ffffffff81659259>] ? _raw_spin_lock_bh+0x19/0x30
       [<ffffffffa314122c>] tipc_sk_rcv+0x2dc/0x3e0 [tipc]
       [<ffffffffa312e7ff>] tipc_bclink_wakeup_users+0x2f/0x40 [tipc]
       [<ffffffffa313ce26>] tipc_node_unlock+0x186/0x190 [tipc]
       [<ffffffff81597c1c>] ? kfree_skb+0x2c/0x40
       [<ffffffffa313475c>] tipc_rcv+0x2ac/0x8c0 [tipc]
       [<ffffffffa312ff58>] tipc_l2_rcv_msg+0x38/0x50 [tipc]
       [<ffffffff815a76d3>] __netif_receive_skb_core+0x5a3/0x950
       [<ffffffff815a98d3>] __netif_receive_skb+0x13/0x60
       [<ffffffff815a993e>] netif_receive_skb_internal+0x1e/0x90
       [<ffffffff815aa138>] napi_gro_receive+0x78/0xa0
       [<ffffffffa07f93f4>] tg3_poll_work+0xc54/0xf40 [tg3]
       [<ffffffff81597c8c>] ? consume_skb+0x2c/0x40
       [<ffffffffa07f9721>] tg3_poll_msix+0x41/0x160 [tg3]
       [<ffffffff815ab0f2>] net_rx_action+0xe2/0x290
       [<ffffffff8104b92a>] __do_softirq+0xda/0x1f0
       [<ffffffff8104bc26>] irq_exit+0x76/0xa0
       [<ffffffff81004355>] do_IRQ+0x55/0xf0
       [<ffffffff8165a12b>] common_interrupt+0x6b/0x6b
       <EOI>
      
      The issue occurs only when tipc_sk_rcv() is used to wake up postponed
      senders:
      
      	tipc_bclink_wakeup_users()
      		// wakeupq - is a queue which consists of special
      		// 		 messages with SOCK_WAKEUP type.
      		tipc_sk_rcv(wakeupq)
      			...
      			while (skb_queue_len(inputq)) {
      				filter_rcv(skb)
      					// Here the type of message is checked
      					// and if it is SOCK_WAKEUP then
      					// it tries to wake up a sender.
      					tipc_write_space(sk)
      						wake_up_interruptible_sync_poll()
      			}
      
      After the sender thread is woke up it can gather control and perform
      an attempt to send a message. But if there is no enough place in send
      queue it will call link_schedule_user() function which puts a message
      of type SOCK_WAKEUP to the wakeup queue and put the sender to sleep.
      Thus the size of the queue actually is not changed and the while()
      loop never exits.
      
      The approach I proposed is to wake up only senders for which there is
      enough place in send queue so the described issue can't occur.
      Moreover the same approach is already used to wake up senders on
      unicast links.
      
      I have got into the issue on our product code but to reproduce the
      issue I changed a benchmark test application (from
      tipcutils/demos/benchmark) to perform the following scenario:
      	1. Run 64 instances of test application (nodes). It can be done
      	   on the one physical machine.
      	2. Each application connects to all other using TIPC sockets in
      	   RDM mode.
      	3. When setup is done all nodes start simultaneously send
      	   broadcast messages.
      	4. Everything hangs up.
      
      The issue is reproducible only when a congestion on broadcast link
      occurs. For example, when there are only 8 nodes it works fine since
      congestion doesn't occur. Send queue limit is 40 in my case (I use a
      critical importance level) and when 64 nodes send a message at the
      same moment a congestion occurs every time.
      Signed-off-by: NDmitry S Kolmakov <kolmakov.dmitriy@huawei.com>
      Reviewed-by: NJon Maloy <jon.maloy@ericsson.com>
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7845989c
    • V
      net: bridge: remove unnecessary switchdev include · 7a577f01
      Vivien Didelot 提交于
      Remove the unnecessary switchdev.h include from br_netlink.c.
      Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
      Acked-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7a577f01
    • V
      net: bridge: check __vlan_vid_del for error · bf361ad3
      Vivien Didelot 提交于
      Since __vlan_del can return an error code, change its inner function
      __vlan_vid_del to return an eventual error from switchdev_port_obj_del.
      Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
      Acked-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bf361ad3
  3. 07 9月, 2015 1 次提交
  4. 06 9月, 2015 3 次提交
  5. 04 9月, 2015 8 次提交
  6. 03 9月, 2015 4 次提交
    • D
      netfilter: nf_conntrack: make nf_ct_zone_dflt built-in · 62da9865
      Daniel Borkmann 提交于
      Fengguang reported, that some randconfig generated the following linker
      issue with nf_ct_zone_dflt object involved:
      
        [...]
        CC      init/version.o
        LD      init/built-in.o
        net/built-in.o: In function `ipv4_conntrack_defrag':
        nf_defrag_ipv4.c:(.text+0x93e95): undefined reference to `nf_ct_zone_dflt'
        net/built-in.o: In function `ipv6_defrag':
        nf_defrag_ipv6_hooks.c:(.text+0xe3ffe): undefined reference to `nf_ct_zone_dflt'
        make: *** [vmlinux] Error 1
      
      Given that configurations exist where we have a built-in part, which is
      accessing nf_ct_zone_dflt such as the two handlers nf_ct_defrag_user()
      and nf_ct6_defrag_user(), and a part that configures nf_conntrack as a
      module, we must move nf_ct_zone_dflt into a fixed, guaranteed built-in
      area when netfilter is configured in general.
      
      Therefore, split the more generic parts into a common header under
      include/linux/netfilter/ and move nf_ct_zone_dflt into the built-in
      section that already holds parts related to CONFIG_NF_CONNTRACK in the
      netfilter core. This fixes the issue on my side.
      
      Fixes: 308ac914 ("netfilter: nf_conntrack: push zone object into functions")
      Reported-by: NFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      62da9865
    • D
      netfilter: nf_dup{4, 6}: fix build error when nf_conntrack disabled · a82b0e63
      Daniel Borkmann 提交于
      While testing various Kconfig options on another issue, I found that
      the following one triggers as well on allmodconfig and nf_conntrack
      disabled:
      
        net/ipv4/netfilter/nf_dup_ipv4.c: In function ‘nf_dup_ipv4’:
        net/ipv4/netfilter/nf_dup_ipv4.c:72:20: error: ‘nf_skb_duplicated’ undeclared (first use in this function)
          if (this_cpu_read(nf_skb_duplicated))
        [...]
        net/ipv6/netfilter/nf_dup_ipv6.c: In function ‘nf_dup_ipv6’:
        net/ipv6/netfilter/nf_dup_ipv6.c:66:20: error: ‘nf_skb_duplicated’ undeclared (first use in this function)
          if (this_cpu_read(nf_skb_duplicated))
      
      Fix it by including directly the header where it is defined.
      
      Fixes: bbde9fc1 ("netfilter: factor out packet duplication for IPv4/IPv6")
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a82b0e63
    • D
      ipv6: fix exthdrs offload registration in out_rt path · e41b0bed
      Daniel Borkmann 提交于
      We previously register IPPROTO_ROUTING offload under inet6_add_offload(),
      but in error path, we try to unregister it with inet_del_offload(). This
      doesn't seem correct, it should actually be inet6_del_offload(), also
      ipv6_exthdrs_offload_exit() from that commit seems rather incorrect (it
      also uses rthdr_offload twice), but it got removed entirely later on.
      
      Fixes: 3336288a ("ipv6: Switch to using new offload infrastructure.")
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e41b0bed
    • D
      sock, diag: fix panic in sock_diag_put_filterinfo · b382c086
      Daniel Borkmann 提交于
      diag socket's sock_diag_put_filterinfo() dumps classic BPF programs
      upon request to user space (ss -0 -b). However, native eBPF programs
      attached to sockets (SO_ATTACH_BPF) cannot be dumped with this method:
      
      Their orig_prog is always NULL. However, sock_diag_put_filterinfo()
      unconditionally tries to access its filter length resp. wants to copy
      the filter insns from there. Internal cBPF to eBPF transformations
      attached to sockets don't have this issue, as orig_prog state is kept.
      
      It's currently only used by packet sockets. If we would want to add
      native eBPF support in the future, this needs to be done through
      a different attribute than PACKET_DIAG_FILTER to not confuse possible
      user space disassemblers that work on diag data.
      
      Fixes: 89aa0758 ("net: sock: allow eBPF programs to be attached to sockets")
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: NAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b382c086
  7. 02 9月, 2015 14 次提交
  8. 01 9月, 2015 3 次提交