1. 08 Sep, 2012 1 commit
  2. 06 Sep, 2012 1 commit
  3. 01 Sep, 2012 1 commit
  4. 20 Aug, 2012 2 commits
    • net: tcp: move sk_rx_dst_set call after tcp_create_openreq_child() · fae6ef87
      Committed by Neal Cardwell
      This commit removes the sk_rx_dst_set calls from
      tcp_create_openreq_child(), because at that point the icsk_af_ops
      field of ipv6_mapped TCP sockets has not been set to its proper final
      value.
      
      Instead, to make sure we get the right sk_rx_dst_set variant
      appropriate for the address family of the new connection, we have
      tcp_v{4,6}_syn_recv_sock() directly call the appropriate function
      shortly after the call to tcp_create_openreq_child() returns.
      
      This also moves inet6_sk_rx_dst_set() to avoid a forward declaration
      with the new approach.
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Reported-by: Artem Savkov <artem.savkov@gmail.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipv6: fix oops in inet_putpeer() · 9d7b0fc1
      Committed by Patrick McHardy
      Commit 97bab73f (inet: Hide route peer accesses behind helpers.) introduced
      a bug in xfrm6_policy_destroy(). The xfrm_dst's _rt6i_peer member is not
      initialized, causing a false positive result from inetpeer_ptr_is_peer(),
      which in turn causes a NULL pointer dereference in inet_putpeer().
      
      Pid: 314, comm: kworker/0:1 Not tainted 3.6.0-rc1+ #17 To Be Filled By O.E.M. To Be Filled By O.E.M./P4S800D-X
      EIP: 0060:[<c03abf93>] EFLAGS: 00010246 CPU: 0
      EIP is at inet_putpeer+0xe/0x16
      EAX: 00000000 EBX: f3481700 ECX: 00000000 EDX: 000dd641
      ESI: f3481700 EDI: c05e949c EBP: f551def4 ESP: f551def4
       DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
      CR0: 8005003b CR2: 00000070 CR3: 3243d000 CR4: 00000750
      DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
      DR6: ffff0ff0 DR7: 00000400
       f551df04 c0423de1 00000000 f3481700 f551df18 c038d5f7 f254b9f8 f551df28
       f34f85d8 f551df20 c03ef48d f551df3c c0396870 f30697e8 f24e1738 c05e98f4
       f5509540 c05cd2b4 f551df7c c0142d2b c043feb5 f5509540 00000000 c05cd2e8
       [<c0423de1>] xfrm6_dst_destroy+0x42/0xdb
       [<c038d5f7>] dst_destroy+0x1d/0xa4
       [<c03ef48d>] xfrm_bundle_flo_delete+0x2b/0x36
       [<c0396870>] flow_cache_gc_task+0x85/0x9f
       [<c0142d2b>] process_one_work+0x122/0x441
       [<c043feb5>] ? apic_timer_interrupt+0x31/0x38
       [<c03967eb>] ? flow_cache_new_hashrnd+0x2b/0x2b
       [<c0143e2d>] worker_thread+0x113/0x3cc
      
      Fix by adding an init_dst() callback to struct xfrm_policy_afinfo to
      properly initialize the dst's peer pointer.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 15 Aug, 2012 2 commits
    • ipv6: addrconf: Avoid calling netdevice notifiers with RCU read-side lock · 4acd4945
      Committed by Ben Hutchings
      Cong Wang reports that lockdep detected suspicious RCU usage while
      enabling IPv6 forwarding:
      
       [ 1123.310275] ===============================
       [ 1123.442202] [ INFO: suspicious RCU usage. ]
       [ 1123.558207] 3.6.0-rc1+ #109 Not tainted
       [ 1123.665204] -------------------------------
       [ 1123.768254] include/linux/rcupdate.h:430 Illegal context switch in RCU read-side critical section!
       [ 1123.992320]
       [ 1123.992320] other info that might help us debug this:
       [ 1123.992320]
       [ 1124.307382]
       [ 1124.307382] rcu_scheduler_active = 1, debug_locks = 0
       [ 1124.522220] 2 locks held by sysctl/5710:
       [ 1124.648364]  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff81768498>] rtnl_trylock+0x15/0x17
       [ 1124.882211]  #1:  (rcu_read_lock){.+.+.+}, at: [<ffffffff81871df8>] rcu_lock_acquire+0x0/0x29
       [ 1125.085209]
       [ 1125.085209] stack backtrace:
       [ 1125.332213] Pid: 5710, comm: sysctl Not tainted 3.6.0-rc1+ #109
       [ 1125.441291] Call Trace:
       [ 1125.545281]  [<ffffffff8109d915>] lockdep_rcu_suspicious+0x109/0x112
       [ 1125.667212]  [<ffffffff8107c240>] rcu_preempt_sleep_check+0x45/0x47
       [ 1125.781838]  [<ffffffff8107c260>] __might_sleep+0x1e/0x19b
      [...]
       [ 1127.445223]  [<ffffffff81757ac5>] call_netdevice_notifiers+0x4a/0x4f
      [...]
       [ 1127.772188]  [<ffffffff8175e125>] dev_disable_lro+0x32/0x6b
       [ 1127.885174]  [<ffffffff81872d26>] dev_forward_change+0x30/0xcb
       [ 1128.013214]  [<ffffffff818738c4>] addrconf_forward_change+0x85/0xc5
      [...]
      
      addrconf_forward_change() uses RCU iteration over the netdev list,
      which is unnecessary since it already holds the RTNL lock.  We also
      cannot reasonably require netdevice notifier functions not to sleep.
      Reported-by: Cong Wang <amwang@redhat.com>
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipv6: proc: Fix error handling · 4855d6f3
      Committed by Igor Maravic
      Fix error handling for the case where creating the dev_snmp6
      directory fails.
      Signed-off-by: Igor Maravic <igorm@etf.rs>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 10 Aug, 2012 1 commit
    • net: tcp: ipv6_mapped needs sk_rx_dst_set method · 63d02d15
      Committed by Eric Dumazet
      Commit 5d299f3d (net: ipv6: fix TCP early demux) introduced a
      regression for the ipv6_mapped case.
      
      [   67.422369] SELinux: initialized (dev autofs, type autofs), uses
      genfs_contexts
      [   67.449678] SELinux: initialized (dev autofs, type autofs), uses
      genfs_contexts
      [   92.631060] BUG: unable to handle kernel NULL pointer dereference at
      (null)
      [   92.631435] IP: [<          (null)>]           (null)
      [   92.631645] PGD 0
      [   92.631846] Oops: 0010 [#1] SMP
      [   92.632095] Modules linked in: autofs4 sunrpc ipv6 dm_mirror
      dm_region_hash dm_log dm_multipath dm_mod video sbs sbshc battery ac lp
      parport sg snd_hda_intel snd_hda_codec snd_seq_oss snd_seq_midi_event
      snd_seq snd_seq_device pcspkr snd_pcm_oss snd_mixer_oss snd_pcm
      snd_timer serio_raw button floppy snd i2c_i801 i2c_core soundcore
      snd_page_alloc shpchp ide_cd_mod cdrom microcode ehci_hcd ohci_hcd
      uhci_hcd
      [   92.634294] CPU 0
      [   92.634294] Pid: 4469, comm: sendmail Not tainted 3.6.0-rc1 #3
      [   92.634294] RIP: 0010:[<0000000000000000>]  [<          (null)>]
      (null)
      [   92.634294] RSP: 0018:ffff880245fc7cb0  EFLAGS: 00010282
      [   92.634294] RAX: ffffffffa01985f0 RBX: ffff88024827ad00 RCX:
      0000000000000000
      [   92.634294] RDX: 0000000000000218 RSI: ffff880254735380 RDI:
      ffff88024827ad00
      [   92.634294] RBP: ffff880245fc7cc8 R08: 0000000000000001 R09:
      0000000000000000
      [   92.634294] R10: 0000000000000000 R11: ffff880245fc7bf8 R12:
      ffff880254735380
      [   92.634294] R13: ffff880254735380 R14: 0000000000000000 R15:
      7fffffffffff0218
      [   92.634294] FS:  00007f4516ccd6f0(0000) GS:ffff880256600000(0000)
      knlGS:0000000000000000
      [   92.634294] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [   92.634294] CR2: 0000000000000000 CR3: 0000000245ed1000 CR4:
      00000000000007f0
      [   92.634294] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
      0000000000000000
      [   92.634294] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
      0000000000000400
      [   92.634294] Process sendmail (pid: 4469, threadinfo ffff880245fc6000,
      task ffff880254b8cac0)
      [   92.634294] Stack:
      [   92.634294]  ffffffff813837a7 ffff88024827ad00 ffff880254b6b0e8
      ffff880245fc7d68
      [   92.634294]  ffffffff81385083 00000000001d2680 ffff8802547353a8
      ffff880245fc7d18
      [   92.634294]  ffffffff8105903a ffff88024827ad60 0000000000000002
      00000000000000ff
      [   92.634294] Call Trace:
      [   92.634294]  [<ffffffff813837a7>] ? tcp_finish_connect+0x2c/0xfa
      [   92.634294]  [<ffffffff81385083>] tcp_rcv_state_process+0x2b6/0x9c6
      [   92.634294]  [<ffffffff8105903a>] ? sched_clock_cpu+0xc3/0xd1
      [   92.634294]  [<ffffffff81059073>] ? local_clock+0x2b/0x3c
      [   92.634294]  [<ffffffff8138caf3>] tcp_v4_do_rcv+0x63a/0x670
      [   92.634294]  [<ffffffff8133278e>] release_sock+0x128/0x1bd
      [   92.634294]  [<ffffffff8139f060>] __inet_stream_connect+0x1b1/0x352
      [   92.634294]  [<ffffffff813325f5>] ? lock_sock_nested+0x74/0x7f
      [   92.634294]  [<ffffffff8104b333>] ? wake_up_bit+0x25/0x25
      [   92.634294]  [<ffffffff813325f5>] ? lock_sock_nested+0x74/0x7f
      [   92.634294]  [<ffffffff8139f223>] ? inet_stream_connect+0x22/0x4b
      [   92.634294]  [<ffffffff8139f234>] inet_stream_connect+0x33/0x4b
      [   92.634294]  [<ffffffff8132e8cf>] sys_connect+0x78/0x9e
      [   92.634294]  [<ffffffff813fd407>] ? sysret_check+0x1b/0x56
      [   92.634294]  [<ffffffff81088503>] ? __audit_syscall_entry+0x195/0x1c8
      [   92.634294]  [<ffffffff811cc26e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [   92.634294]  [<ffffffff813fd3e2>] system_call_fastpath+0x16/0x1b
      [   92.634294] Code:  Bad RIP value.
      [   92.634294] RIP  [<          (null)>]           (null)
      [   92.634294]  RSP <ffff880245fc7cb0>
      [   92.634294] CR2: 0000000000000000
      [   92.648982] ---[ end trace 24e2bed94314c8d9 ]---
      [   92.649146] Kernel panic - not syncing: Fatal exception in interrupt
      
      Fix this using inet_sk_rx_dst_set(), and export this function in case
      IPv6 is modular.
      Reported-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 07 Aug, 2012 1 commit
  8. 01 Aug, 2012 2 commits
  9. 31 Jul, 2012 1 commit
    • net: TCP early demux cleanup · cca32e4b
      Committed by Eric Dumazet
      early_demux() handlers should be called in RCU context, and because we
      use skb_dst_set_noref(skb, dst), the caller must not exit RCU context
      before the dst is used (skb_dst(skb)) or released (skb_dst_drop(skb)).
      
      Therefore, rcu_read_lock()/rcu_read_unlock() pairs around
      ->early_demux() are confusing and unnecessary:
      
      Protocol handlers already run inside an RCU read-side critical section
      (__netif_receive_skb() takes rcu_read_lock()).
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 30 Jul, 2012 1 commit
    • ipv6: fix incorrect route 'expires' value passed to userspace · 8253947e
      Committed by Li Wei
      When userspace uses RTM_GETROUTE to dump the route table and an entry
      has already expired, the reported 'expires' value is always 2147157,
      which is derived from INT_MAX.
      
      The cause is the following statement:
      	rt->dst.expires - jiffies < INT_MAX
      gcc promotes both sides of '<' to unsigned long, so a small negative
      difference is treated as greater than INT_MAX.
      
      With the help of Eric Dumazet, do the out-of-bounds checks in
      rtnl_put_cacheinfo(), _after_ conversion to clock_t.
      Signed-off-by: Li Wei <lw@cn.fujitsu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
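The promotion issue described in commit 8253947e can be reproduced outside the kernel. The following is a minimal user-space sketch, not the kernel patch itself: the function names and tick values are hypothetical, and the signed variant only illustrates why the comparison misbehaves. The actual fix, per the commit message, moves the bounds checks into rtnl_put_cacheinfo() after conversion to clock_t.

```c
#include <assert.h>
#include <limits.h>

/* Mirrors the shape of the original test: INT_MAX is converted to
 * unsigned long, so an already expired entry (expires < now) wraps to
 * a huge value and falls through to the INT_MAX clamp. */
long remaining_unsigned_compare(unsigned long expires, unsigned long now)
{
    if (expires - now < INT_MAX)
        return (long)(expires - now);
    return INT_MAX;
}

/* Interpreting the difference as signed first keeps an expired entry
 * negative instead of clamping it to INT_MAX. */
long remaining_signed_compare(unsigned long expires, unsigned long now)
{
    long delta = (long)(expires - now);

    if (delta > INT_MAX)
        return INT_MAX;
    return delta;
}

/* Hypothetical tick values: the entry expired 5 ticks ago. */
void check_expired_entry(void)
{
    assert(remaining_unsigned_compare(100UL, 105UL) == INT_MAX);
    assert(remaining_signed_compare(100UL, 105UL) == -5);
}
```

Calling check_expired_entry() from a small main() exercises both variants. For an entry that has not yet expired, the two functions return the same remaining tick count.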
  11. 27 Jul, 2012 1 commit
  12. 23 Jul, 2012 1 commit
    • tcp: dont drop MTU reduction indications · 563d34d0
      Committed by Eric Dumazet
      ICMP messages generated in the output path when the frame length
      exceeds the MTU are actually lost, because the socket is owned by the
      user (who is doing the xmit).
      
      One example is ipgre_tunnel_xmit() calling
      icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));
      
      We had a similar case fixed in commit a34a101e (ipv6: disable GSO on
      sockets hitting dst_allfrag).
      
      The problem with that fix is that it relied on retransmit timers, so
      short TCP sessions paid too high a latency price.
      
      This patch uses the tcp_release_cb() infrastructure so that MTU
      reduction messages (ICMP messages) are not lost, and no extra delay
      is added to TCP transmits.
      Reported-by: Maciej Żenczykowski <maze@google.com>
      Diagnosed-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Nandita Dukkipati <nanditad@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Tore Anderson <tore@fud.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 21 Jul, 2012 1 commit
  14. 20 Jul, 2012 1 commit
    • net-tcp: Fast Open base · 2100c8d2
      Committed by Yuchung Cheng
      This patch implements the common code for both the client and server.
      
      1. TCP Fast Open option processing. Since Fast Open does not have an
         option number assigned by IANA yet, it shares the experimental option
         code 254 by implementing draft-ietf-tcpm-experimental-options
         with a 16-bit magic number 0xF989. This enables global experiments
         without clashing over the scarce (2) experimental options available
         for TCP.
      
         When the draft status becomes standard (maybe), the client should
         switch to the newly assigned option number while the server supports
         both numbers for the transition.
      
      2. The new sysctl tcp_fastopen
      
      3. A placeholder init function
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
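As an illustration of the shared experimental option described in commit 2100c8d2, here is a minimal user-space sketch of how such an option could be laid out on the wire. It assumes the layout from draft-ietf-tcpm-experimental-options: kind, total length, the 16-bit magic in network byte order, then the option data. The function names, macro names, and the 16-byte cookie cap are assumptions of this sketch, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OPT_KIND_EXP      254      /* shared experimental option code */
#define FASTOPEN_MAGIC    0xF989   /* 16-bit magic from the commit message */
#define EXP_HEADER_LEN    4        /* kind + length + 2-byte magic */
#define SKETCH_MAX_COOKIE 16       /* assumption of this sketch */

/* Writes kind, length, magic (network byte order) and the cookie.
 * Returns the option length, or 0 if the input does not fit. */
size_t build_exp_fastopen_opt(uint8_t *buf, size_t buflen,
                              const uint8_t *cookie, size_t cookie_len)
{
    size_t total = EXP_HEADER_LEN + cookie_len;

    if (cookie_len > SKETCH_MAX_COOKIE || buflen < total)
        return 0;
    buf[0] = OPT_KIND_EXP;
    buf[1] = (uint8_t)total;
    buf[2] = (uint8_t)(FASTOPEN_MAGIC >> 8);
    buf[3] = (uint8_t)(FASTOPEN_MAGIC & 0xFF);
    if (cookie_len > 0)
        memcpy(buf + EXP_HEADER_LEN, cookie, cookie_len);
    return total;
}

/* Returns 1 if opt looks like an experimental option carrying the magic. */
int is_exp_fastopen_opt(const uint8_t *opt, size_t len)
{
    return len >= EXP_HEADER_LEN &&
           opt[0] == OPT_KIND_EXP &&
           opt[1] == len &&
           opt[2] == 0xF9 && opt[3] == 0x89;
}

void check_fastopen_opt(void)
{
    const uint8_t cookie[4] = { 0xde, 0xad, 0xbe, 0xef };  /* made-up cookie */
    uint8_t buf[20];
    size_t n = build_exp_fastopen_opt(buf, sizeof(buf), cookie, sizeof(cookie));

    assert(n == 8);
    assert(buf[0] == 254 && buf[1] == 8 && buf[2] == 0xF9 && buf[3] == 0x89);
    assert(is_exp_fastopen_opt(buf, n));
}
```

Checking the magic after the kind byte is what lets several experiments share option code 254: a receiver that sees kind 254 with a different magic can ignore the option.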
  15. 19 Jul, 2012 1 commit
    • ipv6: add ipv6_addr_hash() helper · ddbe5032
      Committed by Eric Dumazet
      Introduce ipv6_addr_hash() helper doing an XOR on all bits
      of an IPv6 address, with an optimized x86_64 version.
      
      Use it in flow dissector, as suggested by Andrew McGregor,
      to reduce hash collision probabilities in fq_codel (and other
      users of flow dissector)
      
      Use it in ip6_tunnel.c and use more bit shuffling, as suggested
      by David Laight, as existing hash was ignoring most of them.
      
      Use it in sunrpc and use more bit shuffling, using hash_32().
      
      Use it in net/ipv6/addrconf.c, using hash_32() as well.
      
      As a cleanup, use it in net/ipv4/tcp_metrics.c
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Andrew McGregor <andrewmcgr@gmail.com>
      Cc: Dave Taht <dave.taht@gmail.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: David Laight <David.Laight@ACULAB.COM>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
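The XOR folding described in commit ddbe5032 can be sketched in user space as follows. This is an illustration under stated assumptions rather than the kernel helper: the function names are hypothetical, the address is passed as a plain 16-byte array, and the 64-bit variant is only in the spirit of the "optimized x86_64 version" the message mentions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* XOR the four 32-bit words of a 16-byte address. */
uint32_t addr_hash_xor32(const uint8_t addr[16])
{
    uint32_t w[4];

    memcpy(w, addr, sizeof(w));      /* copy to avoid unaligned loads */
    return w[0] ^ w[1] ^ w[2] ^ w[3];
}

/* Same result using two 64-bit loads, then folding the halves. */
uint32_t addr_hash_xor64(const uint8_t addr[16])
{
    uint64_t q[2];
    uint64_t x;

    memcpy(q, addr, sizeof(q));
    x = q[0] ^ q[1];
    return (uint32_t)(x ^ (x >> 32));
}

void check_addr_hash(void)
{
    uint8_t zero[16] = {0};
    uint8_t a[16] = {0};

    a[5] = 0x01;                     /* one bit set in the second word */
    assert(addr_hash_xor32(zero) == 0);
    assert(addr_hash_xor32(a) != 0);
    assert(addr_hash_xor32(a) == addr_hash_xor64(a));
}
```

Because every bit of the address feeds the result, two addresses that differ in any single bit hash differently. A plain XOR still maps addresses with pairwise-equal words to 0, which may be one reason the callers listed in the commit message add further bit shuffling such as hash_32().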
  16. 18 Jul, 2012 1 commit
  17. 17 Jul, 2012 3 commits
    • net: Pass optional SKB and SK arguments to dst_ops->{update_pmtu,redirect}() · 6700c270
      Committed by David S. Miller
      This will be used so that we can compose a full flow key.
      
      Even though we have a route in this context, we need more.  In the
      future the routes will be without destination address, source address,
      etc. keying.  One ipv4 route will cover entire subnets, etc.
      
      In this environment we need persistent storage for redirects and
      PMTU information.  This persistent storage will exist in the FIB
      tables, which is why we'll need to be able to rebuild a full lookup
      flow key here.  That flow key will be used to do a fib_lookup() and
      create/update the persistent entry.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: fix unappropriate errno returned for non-multicast address · a858d64b
      Committed by Li Wei
      We need to check the passed-in multicast address and return the
      appropriate errno (EINVAL) if it is not valid. There is also no need
      to walk through the ipv6_mc_list in this situation.
      Signed-off-by: Li Wei <lw@cn.fujitsu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
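The early validation described in commit a858d64b can be pictured with a small user-space sketch. It uses the POSIX IN6_IS_ADDR_MULTICAST() macro rather than any kernel helper, and both function names are hypothetical. The point is only the ordering: reject a non-multicast address with EINVAL before doing any list walk.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>

/* Reject non-multicast addresses up front, before any list walk. */
int validate_mcast_addr(const struct in6_addr *addr)
{
    if (!IN6_IS_ADDR_MULTICAST(addr))
        return -EINVAL;
    return 0;
}

/* Convenience wrapper for textual addresses (hypothetical helper). */
int validate_mcast_text(const char *text)
{
    struct in6_addr addr;

    if (inet_pton(AF_INET6, text, &addr) != 1)
        return -EINVAL;
    return validate_mcast_addr(&addr);
}

void check_mcast_validation(void)
{
    assert(validate_mcast_text("ff02::1") == 0);             /* all-nodes multicast */
    assert(validate_mcast_text("2001:db8::1") == -EINVAL);   /* unicast */
}
```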
    • ipv6: fix RTPROT_RA markup of RA routes w/nexthops · f0396f60
      Committed by Denis Ovsienko
      Userspace implementations of network routing protocols sometimes need to
      tell RA-originated IPv6 routes from other kernel routes to make proper
      routing decisions. This makes most sense for RA routes with nexthops,
      namely, default routes and Route Information routes.
      
      The intended means of preserving RA route origin in a netlink message
      is to indicate RTPROT_RA as the protocol code. Function rt6_fill_node()
      tried to do that for default routes, but its test condition was
      wrong. This change is modeled after the original mailing list posting
      by Jeff Haran. It fixes the test condition for the default route case
      and sets the same behaviour for the Route Information case (both types
      use nexthops). Handling of the 3rd RA route type, Prefix Information,
      is left unchanged, as it stands for interface connected routes
      (without nexthops).
      Signed-off-by: Denis Ovsienko <infrastation@yandex.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  18. 16 Jul, 2012 1 commit
  19. 14 Jul, 2012 1 commit
  20. 12 Jul, 2012 9 commits
  21. 11 Jul, 2012 7 commits