1. 22 2月, 2012 2 次提交
    • G
      rtnetlink: Fix problem with buffer allocation · 115c9b81
      Greg Rose 提交于
      Implement a new netlink attribute type IFLA_EXT_MASK.  The mask
      is a 32 bit value that can be used to indicate to the kernel that
      certain extended ifinfo values are requested by the user application.
      At this time the only mask value defined is RTEXT_FILTER_VF to
      indicate that the user wants the ifinfo dump to send information
      about the VFs belonging to the interface.
      
      This patch fixes a bug in which certain applications do not have
      large enough buffers to accommodate the extra information returned
      by the kernel with large numbers of SR-IOV virtual functions.
      Those applications will not send the new netlink attribute with
      the interface info dump request netlink messages so they will
      not get unexpectedly large request buffers returned by the kernel.
      
      Modifies the rtnl_calcit function to traverse the list of net
      devices and compute the minimum buffer size that can hold the
      info dumps of all matching devices based upon the filter passed
      in via the new netlink attribute filter mask.  If no filter
      mask is sent then the buffer allocation defaults to NLMSG_GOODSIZE.
      
      With this change it is possible to add yet to be defined netlink
      attributes to the dump request which should make it fairly extensible
      in the future.
      Signed-off-by: NGreg Rose <gregory.v.rose@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      115c9b81
    • M
      neighbour: Fixed race condition at tbl->nht · 84338a6c
      Michel Machado 提交于
      When the fixed race condition happens:
      
      1. While function neigh_periodic_work scans the neighbor hash table
      pointed by field tbl->nht, it unlocks and locks tbl->lock between
      buckets in order to call cond_resched.
      
      2. Assume that function neigh_periodic_work calls cond_resched, that is,
      the lock tbl->lock is available, and function neigh_hash_grow runs.
      
      3. Once function neigh_hash_grow finishes, and RCU calls
      neigh_hash_free_rcu, the original struct neigh_hash_table that function
      neigh_periodic_work was using doesn't exist anymore.
      
      4. Once back at neigh_periodic_work, whenever the old struct
      neigh_hash_table is accessed, things can go badly.
      Signed-off-by: NMichel Machado <michel@digirati.com.br>
      Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      84338a6c
  2. 20 2月, 2012 1 次提交
  3. 16 2月, 2012 2 次提交
  4. 15 2月, 2012 12 次提交
  5. 13 2月, 2012 2 次提交
  6. 11 2月, 2012 5 次提交
  7. 10 2月, 2012 2 次提交
    • M
      mac80211: Fix a rwlock bad magic bug · b57e6b56
      Mohammed Shafi Shajakhan 提交于
      read_lock(&tpt_trig->trig.leddev_list_lock) is accessed via the path
      ieee80211_open (->) ieee80211_do_open (->) ieee80211_mod_tpt_led_trig
      (->) ieee80211_start_tpt_led_trig (->) tpt_trig_timer before initializing
      it.
      the intilization of this read/write lock happens via the path
      ieee80211_led_init (->) led_trigger_register, but we are doing
      'ieee80211_led_init'  after 'ieeee80211_if_add' where we
      register netdev_ops.
      so we access leddev_list_lock before initializing it and causes the
      following bug in chrome laptops with AR928X cards with the following
      script
      
      while true
      do
      sudo modprobe -v ath9k
      sleep 3
      sudo modprobe -r ath9k
      sleep 3
      done
      
      	BUG: rwlock bad magic on CPU#1, wpa_supplicant/358, f5b9eccc
      	Pid: 358, comm: wpa_supplicant Not tainted 3.0.13 #1
      	Call Trace:
      
      	[<8137b9df>] rwlock_bug+0x3d/0x47
      	[<81179830>] do_raw_read_lock+0x19/0x29
      	[<8137f063>] _raw_read_lock+0xd/0xf
      	[<f9081957>] tpt_trig_timer+0xc3/0x145 [mac80211]
      	[<f9081f3a>] ieee80211_mod_tpt_led_trig+0x152/0x174 [mac80211]
      	[<f9076a3f>] ieee80211_do_open+0x11e/0x42e [mac80211]
      	[<f9075390>] ? ieee80211_check_concurrent_iface+0x26/0x13c [mac80211]
      	[<f9076d97>] ieee80211_open+0x48/0x4c [mac80211]
      	[<812dbed8>] __dev_open+0x82/0xab
      	[<812dc0c9>] __dev_change_flags+0x9c/0x113
      	[<812dc1ae>] dev_change_flags+0x18/0x44
      	[<8132144f>] devinet_ioctl+0x243/0x51a
      	[<81321ba9>] inet_ioctl+0x93/0xac
      	[<812cc951>] sock_ioctl+0x1c6/0x1ea
      	[<812cc78b>] ? might_fault+0x20/0x20
      	[<810b1ebb>] do_vfs_ioctl+0x46e/0x4a2
      	[<810a6ebb>] ? fget_light+0x2f/0x70
      	[<812ce549>] ? sys_recvmsg+0x3e/0x48
      	[<810b1f35>] sys_ioctl+0x46/0x69
      	[<8137fa77>] sysenter_do_call+0x12/0x2
      
      Cc: <stable@vger.kernel.org>
      Cc: Gary Morain <gmorain@google.com>
      Cc: Paul Stewart <pstew@google.com>
      Cc: Abhijit Pradhan <abhijit@qca.qualcomm.com>
      Cc: Vasanthakumar Thiagarajan <vthiagar@qca.qualcomm.com>
      Cc: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
      Acked-by: NJohannes Berg <johannes.berg@intel.com>
      Tested-by: NMohammed Shafi Shajakhan <mohammed@qca.qualcomm.com>
      Signed-off-by: NMohammed Shafi Shajakhan <mohammed@qca.qualcomm.com>
      Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
      b57e6b56
    • D
      net: Make qdisc_skb_cb upper size bound explicit. · 16bda13d
      David S. Miller 提交于
      Just like skb->cb[], so that qdisc_skb_cb can be encapsulated inside
      of other data structures.
      
      This is intended to be used by IPoIB so that it can remember
      addressing information stored at hard_header_ops->create() time that
      it can fetch when the packet gets to the transmit routine.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      16bda13d
  8. 09 2月, 2012 1 次提交
    • E
      gro: more generic L2 header check · 5ca3b72c
      Eric Dumazet 提交于
      Shlomo Pongratz reported GRO L2 header check was suited for Ethernet
      only, and failed on IB/ipoib traffic.
      
      He provided a patch faking a zeroed header to let GRO aggregates frames.
      
      Roland Dreier, Herbert Xu, and others suggested we change GRO L2 header
      check to be more generic, ie not assuming L2 header is 14 bytes, but
      taking into account hard_header_len.
      
      __napi_gro_receive() has special handling for the common case (Ethernet)
      to avoid a memcmp() call and use an inline optimized function instead.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Reported-by: NShlomo Pongratz <shlomop@mellanox.com>
      Cc: Roland Dreier <roland@kernel.org>
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Tested-by: NSean Hefty <sean.hefty@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5ca3b72c
  9. 08 2月, 2012 1 次提交
  10. 05 2月, 2012 2 次提交
  11. 03 2月, 2012 4 次提交
  12. 02 2月, 2012 3 次提交
  13. 31 1月, 2012 3 次提交
    • E
      af_unix: fix EPOLLET regression for stream sockets · 6f01fd6e
      Eric Dumazet 提交于
      Commit 0884d7aa (AF_UNIX: Fix poll blocking problem when reading from
      a stream socket) added a regression for epoll() in Edge Triggered mode
      (EPOLLET)
      
      Appropriate fix is to use skb_peek()/skb_unlink() instead of
      skb_dequeue(), and only call skb_unlink() when skb is fully consumed.
      
      This remove the need to requeue a partial skb into sk_receive_queue head
      and the extra sk->sk_data_ready() calls that added the regression.
      
      This is safe because once skb is given to sk_receive_queue, it is not
      modified by a writer, and readers are serialized by u->readlock mutex.
      
      This also reduce number of spinlock acquisition for small reads or
      MSG_PEEK users so should improve overall performance.
      Reported-by: NNick Mathewson <nickm@freehaven.net>
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Cc: Alexey Moiseytsev <himeraster@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6f01fd6e
    • N
      tcp: fix tcp_trim_head() to adjust segment count with skb MSS · 5b35e1e6
      Neal Cardwell 提交于
      This commit fixes tcp_trim_head() to recalculate the number of
      segments in the skb with the skb's existing MSS, so trimming the head
      causes the skb segment count to be monotonically non-increasing - it
      should stay the same or go down, but not increase.
      
      Previously tcp_trim_head() used the current MSS of the connection. But
      if there was a decrease in MSS between original transmission and ACK
      (e.g. due to PMTUD), this could cause tcp_trim_head() to
      counter-intuitively increase the segment count when trimming bytes off
      the head of an skb. This violated assumptions in tcp_tso_acked() that
      tcp_trim_head() only decreases the packet count, so that packets_acked
      in tcp_tso_acked() could underflow, leading tcp_clean_rtx_queue() to
      pass u32 pkts_acked values as large as 0xffffffff to
      ca_ops->pkts_acked().
      
      As an aside, if tcp_trim_head() had really wanted the skb to reflect
      the current MSS, it should have called tcp_set_skb_tso_segs()
      unconditionally, since a decrease in MSS would mean that a
      single-packet skb should now be sliced into multiple segments.
      Signed-off-by: NNeal Cardwell <ncardwell@google.com>
      Acked-by: NNandita Dukkipati <nanditad@google.com>
      Acked-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5b35e1e6
    • G
      net/tcp: Fix tcp memory limits initialization when !CONFIG_SYSCTL · 4acb4190
      Glauber Costa 提交于
      sysctl_tcp_mem() initialization was moved to sysctl_tcp_ipv4.c
      in commit 3dc43e3e, since it
      became a per-ns value.
      
      That code, however, will never run when CONFIG_SYSCTL is
      disabled, leading to bogus values on those fields - causing hung
      TCP sockets.
      
      This patch fixes it by keeping an initialization code in
      tcp_init(). It will be overwritten by the first net namespace
      init if CONFIG_SYSCTL is compiled in, and do the right thing if
      it is compiled out.
      
      It is also named properly as tcp_init_mem(), to properly signal
      its non-sysctl side effect on TCP limits.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: David S. Miller <davem@davemloft.net>
      Link: http://lkml.kernel.org/r/4F22D05A.8030604@parallels.com
      [ renamed the function, tidied up the changelog a bit ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4acb4190