1. 12 Jul, 2013 3 commits
    • C
      inet: fix spacing in assignment · 3b8ccd44
      Camelia Groza committed
      Found using checkpatch.pl
      Signed-off-by: Camelia Groza <camelia.groza@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3b8ccd44
    • H
      ipv6: fix route selection if kernel is not compiled with CONFIG_IPV6_ROUTER_PREF · afc154e9
      Hannes Frederic Sowa committed
      This is a follow-up patch to 3630d400
      ("ipv6: rt6_check_neigh should successfully verify neigh if no NUD
      information are available").
      
      Since the removal of rt->n in rt6_info we can end up with a dst ==
      NULL in rt6_check_neigh. In case the kernel is not compiled with
      CONFIG_IPV6_ROUTER_PREF we should also select a route with unknown
      NUD state but we must not avoid doing round robin selection on routes
      with the same target. So introduce and pass down a boolean ``do_rr'' to
      indicate when we should update rt->rr_ptr. As soon as no route is valid
      we do backtracking and do a lookup on a higher level in the fib trie.
      
      v2:
      a) Improved rt6_check_neigh logic (no need to create neighbour there)
         and documented return values.
      
      v3:
      a) Introduce enum rt6_nud_state to get rid of the magic numbers
         (thanks to David Miller).
      b) Update and shorten commit message a bit to actually reflect
         the source.
      Reported-by: Pierre Emeriaud <petrus.lt@gmail.com>
      Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      afc154e9
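The selection rule described in this commit message can be sketched in plain C. This is only an illustration for a kernel built without CONFIG_IPV6_ROUTER_PREF, not the kernel code: the enum members, their values, and the function `select_route_sketch` are hypothetical names chosen here. Only the idea of an enum replacing magic numbers and of a `do_rr` flag comes from the message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the enum the commit introduces; the member
 * names and values are assumptions made for this illustration. */
enum nud_state_sketch {
	NUD_SKETCH_FAIL_HARD = -2, /* neighbour known to be unusable */
	NUD_SKETCH_FAIL_SOFT = -1, /* no NUD information available */
	NUD_SKETCH_SUCCEED   = 1   /* neighbour verified */
};

/*
 * Pick a route among states[0..n-1].
 * A verified route wins immediately. Otherwise the first route with
 * unknown NUD state is used and *do_rr is set, telling the caller to
 * advance its round-robin pointer. Returns -1 when no route is valid,
 * in which case the caller would backtrack to a higher level.
 */
int select_route_sketch(const enum nud_state_sketch *states, size_t n,
			bool *do_rr)
{
	int fallback = -1;
	size_t i;

	*do_rr = false;
	for (i = 0; i < n; i++) {
		if (states[i] == NUD_SKETCH_SUCCEED)
			return (int)i;
		if (states[i] == NUD_SKETCH_FAIL_SOFT && fallback < 0)
			fallback = (int)i;
	}
	if (fallback >= 0)
		*do_rr = true;
	return fallback;
}
```

Returning the decision and the round-robin hint separately keeps the "which route" question apart from the "should the pointer move" question, which is the separation the message describes.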
    • S
      9p: fix off by one causing access violations and memory corruption · 110ecd69
      Sasha Levin committed
      p9_release_pages() would attempt to dereference one value past the end of
      pages[]. This would cause the following crashes:
      
      [ 6293.171817] BUG: unable to handle kernel paging request at ffff8807c96f3000
      [ 6293.174146] IP: [<ffffffff8412793b>] p9_release_pages+0x3b/0x60
      [ 6293.176447] PGD 79c5067 PUD 82c1e3067 PMD 82c197067 PTE 80000007c96f3060
      [ 6293.180060] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      [ 6293.180060] Modules linked in:
      [ 6293.180060] CPU: 62 PID: 174043 Comm: modprobe Tainted: G        W    3.10.0-next-20130710-sasha #3954
      [ 6293.180060] task: ffff8807b803b000 ti: ffff880787dde000 task.ti: ffff880787dde000
      [ 6293.180060] RIP: 0010:[<ffffffff8412793b>]  [<ffffffff8412793b>] p9_release_pages+0x3b/0x60
      [ 6293.214316] RSP: 0000:ffff880787ddfc28  EFLAGS: 00010202
      [ 6293.214316] RAX: 0000000000000001 RBX: ffff8807c96f2ff8 RCX: 0000000000000000
      [ 6293.222017] RDX: ffff8807b803b000 RSI: 0000000000000001 RDI: ffffea001c7e3d40
      [ 6293.222017] RBP: ffff880787ddfc48 R08: 0000000000000000 R09: 0000000000000000
      [ 6293.222017] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001
      [ 6293.222017] R13: 0000000000000001 R14: ffff8807cc50c070 R15: ffff8807cc50c070
      [ 6293.222017] FS:  00007f572641d700(0000) GS:ffff8807f3600000(0000) knlGS:0000000000000000
      [ 6293.256784] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [ 6293.256784] CR2: ffff8807c96f3000 CR3: 00000007c8e81000 CR4: 00000000000006e0
      [ 6293.256784] Stack:
      [ 6293.256784]  ffff880787ddfcc8 ffff880787ddfcc8 0000000000000000 ffff880787ddfcc8
      [ 6293.256784]  ffff880787ddfd48 ffffffff84128be8 ffff880700000002 0000000000000001
      [ 6293.256784]  ffff8807b803b000 ffff880787ddfce0 0000100000000000 0000000000000000
      [ 6293.256784] Call Trace:
      [ 6293.256784]  [<ffffffff84128be8>] p9_virtio_zc_request+0x598/0x630
      [ 6293.256784]  [<ffffffff8115c610>] ? wake_up_bit+0x40/0x40
      [ 6293.256784]  [<ffffffff841209b1>] p9_client_zc_rpc+0x111/0x3a0
      [ 6293.256784]  [<ffffffff81174b78>] ? sched_clock_cpu+0x108/0x120
      [ 6293.256784]  [<ffffffff84122a21>] p9_client_read+0xe1/0x2c0
      [ 6293.256784]  [<ffffffff81708a90>] v9fs_file_read+0x90/0xc0
      [ 6293.256784]  [<ffffffff812bd073>] vfs_read+0xc3/0x130
      [ 6293.256784]  [<ffffffff811a78bd>] ? trace_hardirqs_on+0xd/0x10
      [ 6293.256784]  [<ffffffff812bd5a2>] SyS_read+0x62/0xa0
      [ 6293.256784]  [<ffffffff841a1a00>] tracesys+0xdd/0xe2
      [ 6293.256784] Code: 66 90 48 89 fb 41 89 f5 48 8b 3f 48 85 ff 74 29 85 f6 74 25 45 31 e4 66 0f 1f 84 00 00 00 00 00 e8 eb 14 12 fd 41 ff c4 49 63 c4 <48> 8b 3c c3 48 85 ff 74 05 45 39 e5 75 e7 48 83 c4 08 5b 41 5c
      [ 6293.256784] RIP  [<ffffffff8412793b>] p9_release_pages+0x3b/0x60
      [ 6293.256784]  RSP <ffff880787ddfc28>
      [ 6293.256784] CR2: ffff8807c96f3000
      [ 6293.256784] ---[ end trace 50822ee72cd360fc ]---
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      110ecd69
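An off-by-one of this kind typically comes from a loop that reads `pages[i]` before checking whether `i` is still in range. The userspace sketch below shows the safe ordering. It assumes a hypothetical `struct page_sketch` with a plain reference counter and a helper `put_page_sketch()` standing in for the kernel's page release, and it is not the actual patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a page reference; not the kernel's struct page. */
struct page_sketch {
	int refcount;
};

/* Drop one reference; stands in for the kernel's page release helper. */
void put_page_sketch(struct page_sketch *page)
{
	page->refcount--;
}

/*
 * Release the first nr_pages entries of pages[].
 * The index is compared against nr_pages before pages[i] is read,
 * so the element at pages[nr_pages] is never touched.
 */
void release_pages_sketch(struct page_sketch **pages, int nr_pages)
{
	int i;

	for (i = 0; i < nr_pages; i++)
		if (pages[i])
			put_page_sketch(pages[i]);
}
```

Putting the bound in the loop condition means the array access is only ever evaluated for a valid index, regardless of what the entries contain.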
  2. 11 Jul, 2013 4 commits
    • H
      ipv6: in case of link failure remove route directly instead of letting it expire · 1eb4f758
      Hannes Frederic Sowa committed
      We could end up expiring a route which is part of an ecmp route set. Doing
      so would invalidate the rt->rt6i_nsiblings calculations and could provoke
      the following panic:
      
      [   80.144667] ------------[ cut here ]------------
      [   80.145172] kernel BUG at net/ipv6/ip6_fib.c:733!
      [   80.145172] invalid opcode: 0000 [#1] SMP
      [   80.145172] Modules linked in: 8021q nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables
      +snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer virtio_balloon snd soundcore i2c_piix4 i2c_core virtio_net virtio_blk
      [   80.145172] CPU: 1 PID: 786 Comm: ping6 Not tainted 3.10.0+ #118
      [   80.145172] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [   80.145172] task: ffff880117fa0000 ti: ffff880118770000 task.ti: ffff880118770000
      [   80.145172] RIP: 0010:[<ffffffff815f3b5d>]  [<ffffffff815f3b5d>] fib6_add+0x75d/0x830
      [   80.145172] RSP: 0018:ffff880118771798  EFLAGS: 00010202
      [   80.145172] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88011350e480
      [   80.145172] RDX: ffff88011350e238 RSI: 0000000000000004 RDI: ffff88011350f738
      [   80.145172] RBP: ffff880118771848 R08: ffff880117903280 R09: 0000000000000001
      [   80.145172] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88011350f680
      [   80.145172] R13: ffff880117903280 R14: ffff880118771890 R15: ffff88011350ef90
      [   80.145172] FS:  00007f02b5127740(0000) GS:ffff88011fd00000(0000) knlGS:0000000000000000
      [   80.145172] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [   80.145172] CR2: 00007f981322a000 CR3: 00000001181b1000 CR4: 00000000000006e0
      [   80.145172] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   80.145172] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [   80.145172] Stack:
      [   80.145172]  0000000000000001 ffff880100000000 ffff880100000000 ffff880117903280
      [   80.145172]  0000000000000000 ffff880119a4cf00 0000000000000400 00000000000007fa
      [   80.145172]  0000000000000000 0000000000000000 0000000000000000 ffff88011350f680
      [   80.145172] Call Trace:
      [   80.145172]  [<ffffffff815eeceb>] ? rt6_bind_peer+0x4b/0x90
      [   80.145172]  [<ffffffff815ed985>] __ip6_ins_rt+0x45/0x70
      [   80.145172]  [<ffffffff815eee35>] ip6_ins_rt+0x35/0x40
      [   80.145172]  [<ffffffff815ef1e4>] ip6_pol_route.isra.44+0x3a4/0x4b0
      [   80.145172]  [<ffffffff815ef34a>] ip6_pol_route_output+0x2a/0x30
      [   80.145172]  [<ffffffff81616077>] fib6_rule_action+0xd7/0x210
      [   80.145172]  [<ffffffff815ef320>] ? ip6_pol_route_input+0x30/0x30
      [   80.145172]  [<ffffffff81553026>] fib_rules_lookup+0xc6/0x140
      [   80.145172]  [<ffffffff81616374>] fib6_rule_lookup+0x44/0x80
      [   80.145172]  [<ffffffff815ef320>] ? ip6_pol_route_input+0x30/0x30
      [   80.145172]  [<ffffffff815edea3>] ip6_route_output+0x73/0xb0
      [   80.145172]  [<ffffffff815dfdf3>] ip6_dst_lookup_tail+0x2c3/0x2e0
      [   80.145172]  [<ffffffff813007b1>] ? list_del+0x11/0x40
      [   80.145172]  [<ffffffff81082a4c>] ? remove_wait_queue+0x3c/0x50
      [   80.145172]  [<ffffffff815dfe4d>] ip6_dst_lookup_flow+0x3d/0xa0
      [   80.145172]  [<ffffffff815fda77>] rawv6_sendmsg+0x267/0xc20
      [   80.145172]  [<ffffffff815a8a83>] inet_sendmsg+0x63/0xb0
      [   80.145172]  [<ffffffff8128eb93>] ? selinux_socket_sendmsg+0x23/0x30
      [   80.145172]  [<ffffffff815218d6>] sock_sendmsg+0xa6/0xd0
      [   80.145172]  [<ffffffff81524a68>] SYSC_sendto+0x128/0x180
      [   80.145172]  [<ffffffff8109825c>] ? update_curr+0xec/0x170
      [   80.145172]  [<ffffffff81041d09>] ? kvm_clock_get_cycles+0x9/0x10
      [   80.145172]  [<ffffffff810afd1e>] ? __getnstimeofday+0x3e/0xd0
      [   80.145172]  [<ffffffff8152509e>] SyS_sendto+0xe/0x10
      [   80.145172]  [<ffffffff8164efd9>] system_call_fastpath+0x16/0x1b
      [   80.145172] Code: fe ff ff 41 f6 45 2a 06 0f 85 ca fe ff ff 49 8b 7e 08 4c 89 ee e8 94 ef ff ff e9 b9 fe ff ff 48 8b 82 28 05 00 00 e9 01 ff ff ff <0f> 0b 49 8b 54 24 30 0d 00 00 40 00 89 83 14 01 00 00 48 89 53
      [   80.145172] RIP  [<ffffffff815f3b5d>] fib6_add+0x75d/0x830
      [   80.145172]  RSP <ffff880118771798>
      [   80.387413] ---[ end trace 02f20b7a8b81ed95 ]---
      [   80.390154] Kernel panic - not syncing: Fatal exception in interrupt
      
      Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1eb4f758
    • E
      net: rename busy poll socket op and globals · 64b0dc51
      Eliezer Tamir committed
      Rename LL_SO to BUSY_POLL_SO
      Rename sysctl_net_ll_{read,poll} to sysctl_busy_{read,poll}
      Fix up users of these variables.
      Fix documentation for sysctl.
      
      A patch for the socket.7 man page will follow separately,
      because of limitations of my mail setup.
      Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      64b0dc51
    • E
      net: rename ll methods to busy-poll · 8b80cda5
      Eliezer Tamir committed
      Rename ndo_ll_poll to ndo_busy_poll.
      Rename sk_mark_ll to sk_mark_napi_id.
      Rename skb_mark_ll to skb_mark_napi_id.
      Correct all users of these functions.
      Update comments and defines in include/net/busy_poll.h
      Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8b80cda5
    • E
      net: rename include/net/ll_poll.h to include/net/busy_poll.h · 076bb0c8
      Eliezer Tamir committed
      Rename the file and correct all the places where it is included.
      Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      076bb0c8
  3. 10 Jul, 2013 1 commit
    • D
      net: sctp: confirm route during forward progress · 8c2f414a
      Daniel Borkmann committed
      This fix has been proposed originally by Vlad Yasevich. He says:
      
        When SCTP makes forward progress (receives a SACK that acks new chunks,
        renegs, or answers 0-window probes) or when HB-ACK arrives, mark
        the route as confirmed so we don't unnecessarily send NUD probes.
      
      Having a simple SCTP client/server that exchange data chunks every 1sec,
      without this patch ARP requests are sent periodically every 40-60sec.
      With this fix applied, an ARP request is only done once right at the
      "session" beginning. Also, when clearing the related ARP cache entry
      manually during the session, a new request is correctly done. I have
      only "backported" this to net-next and tested that it works, so full
      credit goes to Vlad.
      Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8c2f414a
  4. 09 Jul, 2013 1 commit
  5. 07 Jul, 2013 1 commit
    • C
      bridge: fix some kernel warning in multicast timer · c7e8e8a8
      Cong Wang committed
      Several people reported the warning: "kernel BUG at kernel/timer.c:729!"
      and the stack trace is:
      
      	#7 [ffff880214d25c10] mod_timer+501 at ffffffff8106d905
      	#8 [ffff880214d25c50] br_multicast_del_pg.isra.20+261 at ffffffffa0731d25 [bridge]
      	#9 [ffff880214d25c80] br_multicast_disable_port+88 at ffffffffa0732948 [bridge]
      	#10 [ffff880214d25cb0] br_stp_disable_port+154 at ffffffffa072bcca [bridge]
      	#11 [ffff880214d25ce8] br_device_event+520 at ffffffffa072a4e8 [bridge]
      	#12 [ffff880214d25d18] notifier_call_chain+76 at ffffffff8164aafc
      	#13 [ffff880214d25d50] raw_notifier_call_chain+22 at ffffffff810858f6
      	#14 [ffff880214d25d60] call_netdevice_notifiers+45 at ffffffff81536aad
      	#15 [ffff880214d25d80] dev_close_many+183 at ffffffff81536d17
      	#16 [ffff880214d25dc0] rollback_registered_many+168 at ffffffff81537f68
      	#17 [ffff880214d25de8] rollback_registered+49 at ffffffff81538101
      	#18 [ffff880214d25e10] unregister_netdevice_queue+72 at ffffffff815390d8
      	#19 [ffff880214d25e30] __tun_detach+272 at ffffffffa074c2f0 [tun]
      	#20 [ffff880214d25e88] tun_chr_close+45 at ffffffffa074c4bd [tun]
      	#21 [ffff880214d25ea8] __fput+225 at ffffffff8119b1f1
      	#22 [ffff880214d25ef0] ____fput+14 at ffffffff8119b3fe
      	#23 [ffff880214d25f00] task_work_run+159 at ffffffff8107cf7f
      	#24 [ffff880214d25f30] do_notify_resume+97 at ffffffff810139e1
      	#25 [ffff880214d25f50] int_signal+18 at ffffffff8164f292
      
      This happens because I forgot to check whether mp->timer is armed in
      br_multicast_del_pg(). The bug was introduced by
      commit 9f00b2e7 (bridge: only expire the mdb entry
      when query is received).
      
      Same for __br_mdb_del().
      Tested-by: poma <pomidorabelisima@gmail.com>
      Reported-by: LiYonghua <809674045@qq.com>
      Reported-by: Robert Hancock <hancockrwd@gmail.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Cong Wang <amwang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c7e8e8a8
  6. 05 Jul, 2013 1 commit
  7. 04 Jul, 2013 15 commits
  8. 03 Jul, 2013 8 commits
    • P
      ip_tunnels: Use skb-len to PMTU check. · 23a3647b
      Pravin B Shelar committed
      In the path MTU check, the IP header total length works for a gre device
      but not for a gre-tap device. Use the skb len, which is consistent
      for all tunneling types. This is an old bug in gre.
      This also fixes an MTU calculation bug introduced by
      commit c5441932 (GRE: Refactor GRE tunneling code).
      Reported-by: Timo Teras <timo.teras@iki.fi>
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      23a3647b
    • J
      l2tp: make datapath resilient to packet loss when sequence numbers enabled · a0dbd822
      James Chapman committed
      If L2TP data sequence numbers are enabled and reordering is not
      enabled, data reception stops if a packet is lost since the kernel
      waits for a sequence number that is never resent. (When reordering is
      enabled, data reception restarts when the reorder timeout expires.) If
      no reorder timeout is set, we should count the number of in-sequence
      packets after the out-of-sequence (OOS) condition is detected, and reset
      sequence number state after a number of such packets are received.
      
      For now, the number of in-sequence packets while in OOS state which
      cause the sequence number state to be reset is hard-coded to 5. This
      could be configurable later.
      Signed-off-by: James Chapman <jchapman@katalix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a0dbd822
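The recovery rule in this message (count in-sequence packets while out of sequence, then reset after 5) can be sketched as a small state machine. This is a simplified userspace illustration, not the L2TP datapath: `struct seq_state_sketch`, `seq_rx_sketch()`, and the choice to accept the packet that triggers the reset are assumptions, and sequence-number wrap is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* The commit message hard-codes this value to 5. */
#define OOS_RESET_THRESHOLD 5

/* Hypothetical receive state; field names are made up for this sketch. */
struct seq_state_sketch {
	uint32_t expected_nr;  /* sequence number we are waiting for */
	uint32_t last_oos_nr;  /* last number seen while out of sequence */
	unsigned int oos_run;  /* consecutive in-order packets seen while OOS */
};

/* Returns true if the packet is accepted, false if it is discarded. */
bool seq_rx_sketch(struct seq_state_sketch *s, uint32_t nr)
{
	if (nr == s->expected_nr) {
		s->expected_nr = nr + 1;
		s->oos_run = 0;
		return true;
	}

	/* Out of sequence: extend the run only if this packet directly
	 * follows the previous out-of-sequence one. */
	if (s->oos_run > 0 && nr == s->last_oos_nr + 1)
		s->oos_run++;
	else
		s->oos_run = 1;
	s->last_oos_nr = nr;

	if (s->oos_run >= OOS_RESET_THRESHOLD) {
		/* Give up on the lost packet and resynchronise here. */
		s->expected_nr = nr + 1;
		s->oos_run = 0;
		return true;
	}
	return false;
}
```

With `expected_nr` at 11 and packet 11 lost, packets 12 to 15 are discarded and packet 16 resynchronises the state, so reception resumes instead of stalling forever.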
    • J
      l2tp: make datapath sequence number support RFC-compliant · 8a1631d5
      James Chapman committed
      The L2TP datapath is not currently RFC-compliant when sequence numbers
      are used in L2TP data packets. According to the L2TP RFC, any received
      sequence number NR greater than or equal to the next expected NR is
      acceptable, where the "greater than or equal to" test is determined by
      the NR wrap point. This differs for L2TPv2 and L2TPv3, so add state in
      the session context to hold the max NR value and the NR window size in
      order to do the acceptable sequence number value check. These might be
      configurable later, but for now we derive it from the tunnel L2TP
      version, which determines the sequence number field size.
      Signed-off-by: James Chapman <jchapman@katalix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8a1631d5
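A wrap-aware "greater than or equal to" test of the kind described here can be sketched as follows. The struct and function names are hypothetical. Setting the window to half the sequence space is an assumption of this sketch, and the example values assume a 16-bit sequence field as used by L2TPv2.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-session parameters; names are made up for this sketch. */
struct seq_window_sketch {
	uint32_t nr_max;  /* largest sequence number before wrapping to 0 */
	uint32_t window;  /* how far ahead of the expected value is acceptable */
};

/*
 * Returns true if nr is at or ahead of expected by less than the window,
 * measuring the distance modulo (nr_max + 1) so that numbers just past
 * the wrap point still count as "ahead".
 */
bool seq_in_window_sketch(const struct seq_window_sketch *w,
			  uint32_t expected, uint32_t nr)
{
	uint32_t ahead;

	if (nr >= expected)
		ahead = nr - expected;
	else
		ahead = (w->nr_max + 1) - (expected - nr);

	return ahead < w->window;
}
```

For a 16-bit space (`nr_max = 0xffff`, `window = 0x8000`) with `expected = 0xfffe`, a received value of 1 is 3 steps ahead and is accepted, while 0xfffd is 0xffff steps ahead and is rejected as old.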
    • J
      l2tp: do data sequence number handling in a separate func · b6dc01a4
      James Chapman committed
      This change moves some code handling data sequence numbers into a
      separate function to avoid too much indentation. This is to prepare
      for some changes to data sequence number handling in subsequent
      patches.
      Signed-off-by: James Chapman <jchapman@katalix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b6dc01a4
    • Y
      sctp: use get_unused_fd_flags(0) instead of get_unused_fd() · 8a59bd3e
      Yann Droneaud committed
      Macro get_unused_fd() is used to allocate a file descriptor with
      default flags. Those default flags (0) can be "unsafe":
      O_CLOEXEC must be used by default to not leak file descriptor
      across exec().
      
      Instead of macro get_unused_fd(), functions anon_inode_getfd()
      or get_unused_fd_flags() should be used with flags given by userspace.
      If not possible, flags should be set to O_CLOEXEC to provide userspace
      with a safe default behavior.
      
      In a further patch, get_unused_fd() will be removed so that
      new code starts using anon_inode_getfd() or get_unused_fd_flags()
      with correct flags.
      
      This patch replaces calls to get_unused_fd() with equivalent calls to
      get_unused_fd_flags(0) to preserve current behavior for existing code.
      
      The hard coded flag value (0) should be reviewed on a per-subsystem basis,
      and, if possible, set to O_CLOEXEC.
      Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8a59bd3e
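The userspace effect of the flag discussed in this message can be observed with standard POSIX calls. The sketch below does not touch the kernel helpers named above. It opens `/dev/null` with and without `O_CLOEXEC` and reads the descriptor flags back with `fcntl(F_GETFD)`. The helper names are made up for the illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for O_CLOEXEC under strict -std modes */
#endif
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Open /dev/null read-only with optional extra flags; returns fd or -1. */
int open_devnull_sketch(int extra_flags)
{
	return open("/dev/null", O_RDONLY | extra_flags);
}

/* Report whether the descriptor would be closed automatically on exec(). */
bool fd_has_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	return flags != -1 && (flags & FD_CLOEXEC) != 0;
}
```

A descriptor opened with flags 0 reports no `FD_CLOEXEC` and would be inherited by a program started with exec(), which is the leak the message warns about. One opened with `O_CLOEXEC` reports the flag as set.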
    • I
      core/dev: set pkt_type after eth_type_trans() in dev_forward_skb() · 06a23fe3
      Isaku Yamahata committed
      The dev_forward_skb() assignment of pkt_type should be done
      after the call to eth_type_trans().
      
      ip-encapsulated packets can be handled by localhost. But skb->pkt_type
      can be PACKET_OTHERHOST when packet comes via veth into ip tunnel device.
      In that case, the packet is dropped by ip_rcv().
      Although this example uses gretap, l2tp-eth has the same issue.
      For the l2tp-eth case, add a dummy device for the IP address and the ip l2tp command.
      
      netns A |                     root netns                      | netns B
         veth<->veth=bridge=gretap <-loop back-> gretap=bridge=veth<->veth
      
      arp packet ->
      pkt_type
               BROADCAST------------>ip_rcv()------------------------>
      
                                                                   <- arp reply
                                                                      pkt_type
                                     ip_rcv()<-----------------OTHERHOST
                                     drop
      
      sample operations
        ip link add tapa type gretap remote 172.17.107.4 local 172.17.107.3
        ip link add tapb type gretap remote 172.17.107.3 local 172.17.107.4
        ip link set tapa up
        ip link set tapb up
        ip address add 172.17.107.3 dev tapa
        ip address add 172.17.107.4 dev tapb
        ip route get 172.17.107.3
        > local 172.17.107.3 dev lo  src 172.17.107.3
        >    cache <local>
        ip route get 172.17.107.4
        > local 172.17.107.4 dev lo  src 172.17.107.4
        >    cache <local>
        ip link add vetha type veth peer name vetha-peer
        ip link add vethb type veth peer name vethb-peer
        brctl addbr bra
        brctl addbr brb
        brctl addif bra tapa
        brctl addif bra vetha-peer
        brctl addif brb tapb
        brctl addif brb vethb-peer
        brctl show
        > bridge name     bridge id               STP enabled     interfaces
        > bra             8000.6ea21e758ff1       no              tapa
        >                                                         vetha-peer
        > brb             8000.420020eb92d5       no              tapb
        >                                                         vethb-peer
        ip link set vetha-peer up
        ip link set vethb-peer up
        ip link set bra up
        ip link set brb up
        ip netns add a
        ip netns add b
        ip link set vetha netns a
        ip link set vethb netns b
        ip netns exec a ip address add 10.0.0.3/24 dev vetha
        ip netns exec b ip address add 10.0.0.4/24 dev vethb
        ip netns exec a ip link set vetha up
        ip netns exec b ip link set vethb up
        ip netns exec a arping -I vetha 10.0.0.4
        ARPING 10.0.0.4 from 10.0.0.3 vetha
        ^CSent 2 probes (2 broadcast(s))
        Received 0 response(s)
      
      Cc: Jason Wang <jasowang@redhat.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Hong Zhiguo <honkiko@gmail.com>
      Cc: Rami Rosen <ramirose@gmail.com>
      Cc: Tom Parkin <tparkin@katalix.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Pravin B Shelar <pshelar@nicira.com>
      Cc: Jesse Gross <jesse@nicira.com>
      Cc: dev@openvswitch.org
      Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      06a23fe3
    • H
      ipv6: ip6_append_data_mtu did not care about pmtudisc and frag_size · 75a493e6
      Hannes Frederic Sowa committed
      If the socket had an IPV6_MTU value set, ip6_append_data_mtu lost track
      of this when appending the second frame on a corked socket. This results
      in the following splat:
      
      [37598.993962] ------------[ cut here ]------------
      [37598.994008] kernel BUG at net/core/skbuff.c:2064!
      [37598.994008] invalid opcode: 0000 [#1] SMP
      [37598.994008] Modules linked in: tcp_lp uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev media vfat fat usb_storage fuse ebtable_nat xt_CHECKSUM bridge stp llc ipt_MASQUERADE nf_conntrack_netbios_ns nf_conntrack_broadcast ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat
      +nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio libcxgbi ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp libiscsi_tcp libiscsi
      +scsi_transport_iscsi rfcomm bnep iTCO_wdt iTCO_vendor_support snd_hda_codec_conexant arc4 iwldvm mac80211 snd_hda_intel acpi_cpufreq mperf coretemp snd_hda_codec microcode cdc_wdm cdc_acm
      [37598.994008]  snd_hwdep cdc_ether snd_seq snd_seq_device usbnet mii joydev btusb snd_pcm bluetooth i2c_i801 e1000e lpc_ich mfd_core ptp iwlwifi pps_core snd_page_alloc mei cfg80211 snd_timer thinkpad_acpi snd tpm_tis soundcore rfkill tpm tpm_bios vhost_net tun macvtap macvlan kvm_intel kvm uinput binfmt_misc
      +dm_crypt i915 i2c_algo_bit drm_kms_helper drm i2c_core wmi video
      [37598.994008] CPU 0
      [37598.994008] Pid: 27320, comm: t2 Not tainted 3.9.6-200.fc18.x86_64 #1 LENOVO 27744PG/27744PG
      [37598.994008] RIP: 0010:[<ffffffff815443a5>]  [<ffffffff815443a5>] skb_copy_and_csum_bits+0x325/0x330
      [37598.994008] RSP: 0018:ffff88003670da18  EFLAGS: 00010202
      [37598.994008] RAX: ffff88018105c018 RBX: 0000000000000004 RCX: 00000000000006c0
      [37598.994008] RDX: ffff88018105a6c0 RSI: ffff88018105a000 RDI: ffff8801e1b0aa00
      [37598.994008] RBP: ffff88003670da78 R08: 0000000000000000 R09: ffff88018105c040
      [37598.994008] R10: ffff8801e1b0aa00 R11: 0000000000000000 R12: 000000000000fff8
      [37598.994008] R13: 00000000000004fc R14: 00000000ffff0504 R15: 0000000000000000
      [37598.994008] FS:  00007f28eea59740(0000) GS:ffff88023bc00000(0000) knlGS:0000000000000000
      [37598.994008] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [37598.994008] CR2: 0000003d935789e0 CR3: 00000000365cb000 CR4: 00000000000407f0
      [37598.994008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [37598.994008] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [37598.994008] Process t2 (pid: 27320, threadinfo ffff88003670c000, task ffff88022c162ee0)
      [37598.994008] Stack:
      [37598.994008]  ffff88022e098a00 ffff88020f973fc0 0000000000000008 00000000000004c8
      [37598.994008]  ffff88020f973fc0 00000000000004c4 ffff88003670da78 ffff8801e1b0a200
      [37598.994008]  0000000000000018 00000000000004c8 ffff88020f973fc0 00000000000004c4
      [37598.994008] Call Trace:
      [37598.994008]  [<ffffffff815fc21f>] ip6_append_data+0xccf/0xfe0
      [37598.994008]  [<ffffffff8158d9f0>] ? ip_copy_metadata+0x1a0/0x1a0
      [37598.994008]  [<ffffffff81661f66>] ? _raw_spin_lock_bh+0x16/0x40
      [37598.994008]  [<ffffffff8161548d>] udpv6_sendmsg+0x1ed/0xc10
      [37598.994008]  [<ffffffff812a2845>] ? sock_has_perm+0x75/0x90
      [37598.994008]  [<ffffffff815c3693>] inet_sendmsg+0x63/0xb0
      [37598.994008]  [<ffffffff812a2973>] ? selinux_socket_sendmsg+0x23/0x30
      [37598.994008]  [<ffffffff8153a450>] sock_sendmsg+0xb0/0xe0
      [37598.994008]  [<ffffffff810135d1>] ? __switch_to+0x181/0x4a0
      [37598.994008]  [<ffffffff8153d97d>] sys_sendto+0x12d/0x180
      [37598.994008]  [<ffffffff810dfb64>] ? __audit_syscall_entry+0x94/0xf0
      [37598.994008]  [<ffffffff81020ed1>] ? syscall_trace_enter+0x231/0x240
      [37598.994008]  [<ffffffff8166a7e7>] tracesys+0xdd/0xe2
      [37598.994008] Code: fe 07 00 00 48 c7 c7 04 28 a6 81 89 45 a0 4c 89 4d b8 44 89 5d a8 e8 1b ac b1 ff 44 8b 5d a8 4c 8b 4d b8 8b 45 a0 e9 cf fe ff ff <0f> 0b 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 48
      [37598.994008] RIP  [<ffffffff815443a5>] skb_copy_and_csum_bits+0x325/0x330
      [37598.994008]  RSP <ffff88003670da18>
      [37599.007323] ---[ end trace d69f6a17f8ac8eee ]---
      
      While there, also check if path mtu discovery is activated for this
      socket. The logic was adapted from ip6_append_data when first writing
      on the corked socket.
      
      This bug was introduced with commit
      0c183379 ("ipv6: fix incorrect ipsec
      fragment").
      
      v2:
      a) Replace IPV6_PMTU_DISC_DO with IPV6_PMTUDISC_PROBE.
      b) Don't pass ipv6_pinfo to ip6_append_data_mtu (suggestion by Gao
         feng, thanks!).
      c) Change mtu to unsigned int, else we get a warning about
         non-matching types because of the min()-macro type-check.
      Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
      Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      75a493e6
    • H
      ipv6: call udp_push_pending_frames when uncorking a socket with AF_INET pending data · 8822b64a
      Hannes Frederic Sowa committed
      We accidentally call down to ip6_push_pending_frames when uncorking
      pending AF_INET data on an ipv6 socket. This results in the following
      splat (from Dave Jones):
      
      skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev:<NULL>
      ------------[ cut here ]------------
      kernel BUG at net/core/skbuff.c:126!
      invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth
      +netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c
      CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ #37
      task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000
      RIP: 0010:[<ffffffff816e759c>]  [<ffffffff816e759c>] skb_panic+0x63/0x65
      RSP: 0018:ffff8801e6431de8  EFLAGS: 00010282
      RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006
      RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520
      RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800
      R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800
      FS:  00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
      Stack:
       ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4
       ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6
       ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0
      Call Trace:
       [<ffffffff8159a9aa>] skb_push+0x3a/0x40
       [<ffffffff816765f6>] ip6_push_pending_frames+0x1f6/0x4d0
       [<ffffffff810b756b>] ? mark_held_locks+0xbb/0x140
       [<ffffffff81694919>] udp_v6_push_pending_frames+0x2b9/0x3d0
       [<ffffffff81694660>] ? udplite_getfrag+0x20/0x20
       [<ffffffff8162092a>] udp_lib_setsockopt+0x1aa/0x1f0
       [<ffffffff811cc5e7>] ? fget_light+0x387/0x4f0
       [<ffffffff816958a4>] udpv6_setsockopt+0x34/0x40
       [<ffffffff815949f4>] sock_common_setsockopt+0x14/0x20
       [<ffffffff81593c31>] SyS_setsockopt+0x71/0xd0
       [<ffffffff816f5d54>] tracesys+0xdd/0xe2
      Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55
      RIP  [<ffffffff816e759c>] skb_panic+0x63/0x65
       RSP <ffff8801e6431de8>
      
      This patch adds a check if the pending data is of address family AF_INET
      and directly calls udp_push_pending_frames from udp_v6_push_pending_frames
      if that is the case.
      
      This bug was found by Dave Jones with trinity.
      
      (Also move the initialization of fl6 below the AF_INET check, even if
      not strictly necessary.)
      
      Cc: Dave Jones <davej@redhat.com>
      Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8822b64a
  9. 02 Jul, 2013 6 commits
    • C
      ipip: fix a regression in ioctl · 3b7b514f
      Cong Wang committed
      This is a regression introduced by
      commit fd58156e (IPIP: Use ip-tunneling code.)
      
      As with the GRE tunnel, we previously checked the parameters only
      for SIOCADDTUNNEL and SIOCCHGTUNNEL; after that commit, the
      check is applied to all commands.
      
      So, just check for SIOCADDTUNNEL and SIOCCHGTUNNEL.
      
      The check for i_key, o_key, etc. is also suspicious, since it
      did not exist before; reset these fields before passing them
      to ip_tunnel_ioctl().
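
A minimal userspace sketch of the behaviour described above, under stated assumptions: the command enum, `struct tunnel_parm_sketch`, `PROTO_IPIP` and `ipip_ioctl_precheck()` are hypothetical stand-ins, and the field checks shown are illustrative rather than the exact set the driver performs.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the tunnel ioctl commands. */
enum tunnel_cmd { CMD_GET, CMD_ADD, CMD_CHG, CMD_DEL };

/* Hypothetical, reduced version of the tunnel parameter block. */
struct tunnel_parm_sketch {
    uint8_t  version, ihl, protocol;
    uint32_t i_key, o_key;
    uint16_t i_flags, o_flags;
};

#define PROTO_IPIP 4  /* IP-in-IP protocol number */

/*
 * Validate the header fields only for the commands that create or change
 * a tunnel, then clear the key/flag fields, which plain IPIP does not use,
 * before the parameters would be handed on to the generic tunnel code.
 */
static int ipip_ioctl_precheck(enum tunnel_cmd cmd, struct tunnel_parm_sketch *p)
{
    if (cmd == CMD_ADD || cmd == CMD_CHG) {
        if (p->version != 4 || p->protocol != PROTO_IPIP || p->ihl != 5)
            return -EINVAL;
    }
    p->i_key = p->o_key = 0;
    p->i_flags = p->o_flags = 0;
    return 0;
}
```

With this shape, a get or delete request carrying stale header fields is no longer rejected, which is the regression the commit message describes.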
      
      Cc: Pravin B Shelar <pshelar@nicira.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Cong Wang <amwang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3b7b514f
    • W
      l2tp: add missing .owner to struct pppox_proto · e1558a93
      Wei Yongjun committed
      Add missing .owner of struct pppox_proto. This prevents the
      module from being removed from underneath its users.
      Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e1558a93
    • M
      ethtool: make .get_dump_data() harder to misuse by drivers · c590b5e2
      Michal Schmidt committed
      As the patch "bnx2x: remove zeroing of dump data buffer" showed,
      it is too easy to implement .get_dump_data incorrectly in a driver.
      
      Let's make sure drivers cannot get confused by userspace requesting
      too large a dump.
      
      Also WARN if the driver sets dump->len to something weird and make
      sure the length reported to userspace is the actual length of data
      copied to userspace.
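
The length handling described above can be sketched as a small userspace helper. This is an assumption-laden illustration, not the ethtool core code: `fill_dump()` and its parameters are hypothetical names, and `memcpy` stands in for the copy to userspace.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy at most `available` bytes of device dump data into the caller's
 * buffer, even if the caller asked for more, and return the number of
 * bytes actually copied so that this value (not the requested one) is
 * what gets reported back.
 */
static size_t fill_dump(void *user_buf, size_t requested,
                        const void *dev_data, size_t available)
{
    /* Clamp the request to what the device really has. */
    size_t len = requested < available ? requested : available;

    if (len > 0)
        memcpy(user_buf, dev_data, len);
    return len;
}
```

Clamping in the common code means an individual driver never sees an oversized request, so it cannot overrun or leave part of the buffer unfilled because of it.
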
      Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
      Reviewed-by: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c590b5e2
    • D
      net: sctp: get rid of SCTP_DBG_TSNS entirely · e02010ad
      Daniel Borkmann committed
      After having reworked the debugging framework, Neil and Vlad agreed to
      get rid of the leftover SCTP_DBG_TSNS code for a couple of reasons:
      
      We can use systemtap scripts to investigate these things, we now have
      pr_debug() helpers that make life easier, and if we really need anything
      else besides those tools, we will be forced to come up with something
      better than we have there. Therefore, get rid of this ifdef debugging
      code entirely for now.
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      CC: Vlad Yasevich <vyasevich@gmail.com>
      CC: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e02010ad
    • A
      ipv6,mcast: always hold idev->lock before mca_lock · 8965779d
      Amerigo Wang committed
      dingtianhong reported the following deadlock detected by lockdep:
      
       ======================================================
       [ INFO: possible circular locking dependency detected ]
       3.4.24.05-0.1-default #1 Not tainted
       -------------------------------------------------------
       ksoftirqd/0/3 is trying to acquire lock:
        (&ndev->lock){+.+...}, at: [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120
      
       but task is already holding lock:
        (&mc->mca_lock){+.+...}, at: [<ffffffff8149d130>] mld_send_report+0x40/0x150
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #1 (&mc->mca_lock){+.+...}:
              [<ffffffff810a8027>] validate_chain+0x637/0x730
              [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
              [<ffffffff810a8734>] lock_acquire+0x114/0x150
              [<ffffffff814f691a>] rt_spin_lock+0x4a/0x60
              [<ffffffff8149e4bb>] igmp6_group_added+0x3b/0x120
              [<ffffffff8149e5d8>] ipv6_mc_up+0x38/0x60
              [<ffffffff81480a4d>] ipv6_find_idev+0x3d/0x80
              [<ffffffff81483175>] addrconf_notify+0x3d5/0x4b0
              [<ffffffff814fae3f>] notifier_call_chain+0x3f/0x80
              [<ffffffff81073471>] raw_notifier_call_chain+0x11/0x20
              [<ffffffff813d8722>] call_netdevice_notifiers+0x32/0x60
              [<ffffffff813d92d4>] __dev_notify_flags+0x34/0x80
              [<ffffffff813d9360>] dev_change_flags+0x40/0x70
              [<ffffffff813ea627>] do_setlink+0x237/0x8a0
              [<ffffffff813ebb6c>] rtnl_newlink+0x3ec/0x600
              [<ffffffff813eb4d0>] rtnetlink_rcv_msg+0x160/0x310
              [<ffffffff814040b9>] netlink_rcv_skb+0x89/0xb0
              [<ffffffff813eb357>] rtnetlink_rcv+0x27/0x40
              [<ffffffff81403e20>] netlink_unicast+0x140/0x180
              [<ffffffff81404a9e>] netlink_sendmsg+0x33e/0x380
              [<ffffffff813c4252>] sock_sendmsg+0x112/0x130
              [<ffffffff813c537e>] __sys_sendmsg+0x44e/0x460
              [<ffffffff813c5544>] sys_sendmsg+0x44/0x70
              [<ffffffff814feab9>] system_call_fastpath+0x16/0x1b
      
       -> #0 (&ndev->lock){+.+...}:
              [<ffffffff810a798e>] check_prev_add+0x3de/0x440
              [<ffffffff810a8027>] validate_chain+0x637/0x730
              [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
              [<ffffffff810a8734>] lock_acquire+0x114/0x150
              [<ffffffff814f6c82>] rt_read_lock+0x42/0x60
              [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120
              [<ffffffff8149b036>] mld_newpack+0xb6/0x160
              [<ffffffff8149b18b>] add_grhead+0xab/0xc0
              [<ffffffff8149d03b>] add_grec+0x3ab/0x460
              [<ffffffff8149d14a>] mld_send_report+0x5a/0x150
              [<ffffffff8149f99e>] igmp6_timer_handler+0x4e/0xb0
              [<ffffffff8105705a>] call_timer_fn+0xca/0x1d0
              [<ffffffff81057b9f>] run_timer_softirq+0x1df/0x2e0
              [<ffffffff8104e8c7>] handle_pending_softirqs+0xf7/0x1f0
              [<ffffffff8104ea3b>] __do_softirq_common+0x7b/0xf0
              [<ffffffff8104f07f>] __thread_do_softirq+0x1af/0x210
              [<ffffffff8104f1c1>] run_ksoftirqd+0xe1/0x1f0
              [<ffffffff8106c7de>] kthread+0xae/0xc0
              [<ffffffff814fff74>] kernel_thread_helper+0x4/0x10
      
      Actually, we can just hold idev->lock before taking pmc->mca_lock,
      and avoid taking idev->lock again when iterating idev->addr_list,
      since the upper callers of mld_newpack() already take
      read_lock_bh(&idev->lock).
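
The ordering rule above (idev->lock first, then mca_lock) can be illustrated with a tiny rank-based checker. This is a much simplified sketch in the spirit of lockdep, not how lockdep is implemented: the ranks, `struct held_locks` and both helper functions are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical ranks: a lock may only be taken while holding lower ranks. */
enum lock_rank { RANK_IDEV = 1, RANK_MCA = 2 };

#define MAX_HELD 4

/* Per-task record of the locks currently held, in acquisition order. */
struct held_locks {
    int rank[MAX_HELD];
    int depth;
};

/* Returns false if taking `rank` now would violate the agreed order. */
static bool lock_acquire_ordered(struct held_locks *h, int rank)
{
    if (h->depth == MAX_HELD)
        return false;
    if (h->depth > 0 && h->rank[h->depth - 1] >= rank)
        return false;  /* e.g. idev lock requested while mca lock is held */
    h->rank[h->depth++] = rank;
    return true;
}

/* Drop the most recently acquired lock. */
static void lock_release_last(struct held_locks *h)
{
    if (h->depth > 0)
        h->depth--;
}
```

Taking RANK_IDEV and then RANK_MCA passes, while the reverse order, which corresponds to the chain in the report, is rejected.
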
      Reported-by: dingtianhong <dingtianhong@huawei.com>
      Cc: dingtianhong <dingtianhong@huawei.com>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Tested-by: Ding Tianhong <dingtianhong@huawei.com>
      Tested-by: Chen Weilong <chenweilong@huawei.com>
      Signed-off-by: Cong Wang <amwang@redhat.com>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8965779d
    • C
      vti: remove duplicated code to fix a memory leak · ab6c7a0a
      Cong Wang committed
      The vti module allocates dev->tstats twice: in vti_fb_tunnel_init()
      and in vti_tunnel_init(). This leads to a memory leak of
      dev->tstats.
      
      Just remove the duplicated operations in vti_fb_tunnel_init().
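
Why the double allocation leaks, and what removing the duplicate achieves, can be shown with a userspace sketch. All names here (`struct dev_sketch`, `tunnel_init()`, `fb_tunnel_init()`, `tunnel_uninit()`) are hypothetical, `calloc` stands in for the kernel's per-CPU allocation, and the call order is an assumption made for illustration.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the net device and its statistics block. */
struct dev_sketch {
    long *tstats;
    int   allocs;  /* counts allocations, for illustration only */
};

/* Common init: the single place where tstats is allocated. */
static int tunnel_init(struct dev_sketch *dev)
{
    dev->tstats = calloc(1, sizeof(*dev->tstats));
    if (!dev->tstats)
        return -1;
    dev->allocs++;
    return 0;
}

/*
 * Fallback-device init after the fix: fallback-specific setup only.
 * If it allocated tstats as well, the second allocation would overwrite
 * the first pointer and the first block could never be freed.
 */
static int fb_tunnel_init(struct dev_sketch *dev)
{
    (void)dev;
    return 0;
}

/* Teardown frees the one allocation made by tunnel_init(). */
static void tunnel_uninit(struct dev_sketch *dev)
{
    free(dev->tstats);
    dev->tstats = NULL;
}
```

With one owner for the allocation, a single free in the teardown path releases everything that was allocated.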
      
      (candidate for -stable)
      
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Saurabh Mohan <saurabh.mohan@vyatta.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Cong Wang <amwang@redhat.com>
      Acked-by: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ab6c7a0a