1. 28 10月, 2013 9 次提交
    • E
      inet: restore gso for vxlan · 8c3a897b
      Eric Dumazet 提交于
      Alexei reported a performance regression on vxlan, caused
      by commit 3347c960 "ipv4: gso: make inet_gso_segment() stackable"
      
      GSO vxlan packets were not properly segmented, adding IP fragments
      while they were not expected.
      
      Rename 'bool tunnel' to 'bool encap', and add a new boolean
      to express the fact that UDP should be fragmented.
      This fragmentation is triggered by skb->encapsulation being set.
      
      Remove a "skb->encapsulation = 1" added in above commit,
      as its not needed, as frags inherit skb->frag from original
      GSO skb.
      Reported-by: NAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Tested-by: NAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8c3a897b
    • D
      Revert "Merge branch 'bonding_monitor_locking'" · 1f2cd845
      David S. Miller 提交于
      This reverts commit 4d961a10, reversing
      changes made to a00f6fcc.
      
      Revert bond locking changes, they cause regressions and Veaceslav Falico
      doesn't like how the commit messages were done at all.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1f2cd845
    • S
      be2net: add support for ndo_busy_poll · 6384a4d0
      Sathya Perla 提交于
      Includes:
      - ndo_busy_poll implementation
      - Locking between napi and busy_poll
      - Fix rx_post_starvation (replenish rx-queues in out-of-mememory scenario)
        logic to accomodate busy_poll.
      
      v2 changes:
      [Eric D.'s comment] call alloc_pages() with GFP_ATOMIC even in ndo_busy_poll
      context as it is not allowed to sleep.
      Signed-off-by: NSathya Perla <sathya.perla@emulex.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6384a4d0
    • D
      Merge branch 'bonding_monitor_locking' · 4d961a10
      David S. Miller 提交于
      Ding Tianhong says:
      
      ====================
      bonding: patchset for rcu use in bonding
      
      The slave list will add and del by bond_master_upper_dev_link() and
      bond_upper_dev_unlink(), which will call call_netdevice_notifiers(),
      even it is safe to call it in write bond lock now, but we can't sure
      that whether it is safe later, because other drivers may deal
      NETDEV_CHANGEUPPER in sleep way, so I didn't admit move the
      bond_upper_dev_unlink() in write bond lock.
      
      now the bond_for_each_slave only protect by rtnl_lock(), maybe use
      bond_for_each_slave_rcu is a good way to protect slave list for bond,
      but as a system slow path, it is no need to transform
      bond_for_each_slave() to bond_for_each_slave_rcu() in slow path, so in
      the patchset, I will remove the unused read bond lock for monitor
      function, maybe it is a better way, I will wait to accept any relay
      for it.
      
      Thanks for the Veaceslav Falico opinion.
      
      v2: add and modify commit for patchset and patch, it will be the first
      step for the whole patchset.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4d961a10
    • D
      bonding: remove bond read lock for bond_3ad_state_machine_handler() · 5cc172c6
      dingtianhong 提交于
      The bond slave list may change when the monitor is running, the slave list is no longer
      protected by bond->lock, only protected by rtnl lock(), so we have 3 ways to modify it:
      1.add bond_master_upper_dev_link() and bond_upper_dev_unlink() in bond->lock, but it is unsafe
      to call call_netdevice_notifiers() in write lock.
      2.remove unused bond->lock for monitor function, only use the existing rtnl lock().
      3.use rcu_read_lock() to protect it, of course, it will transform bond_for_each_slave to
      bond_for_each_slave_rcu() and performance is better, but in slow path, it is ignored.
      so I remove the bond->lock and move the rtnl lock to protect the whole monitor function.
      Signed-off-by: NDing Tianhong <dingtianhong@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5cc172c6
    • D
      bonding: remove bond read lock for bond_activebackup_arp_mon() · 80b9d236
      dingtianhong 提交于
      The bond slave list may change when the monitor is running, the slave list is no longer
      protected by bond->lock, only protected by rtnl lock(), so we have 3 ways to modify it:
      1.add bond_master_upper_dev_link() and bond_upper_dev_unlink() in bond->lock, but it is unsafe
      to call call_netdevice_notifiers() in write lock.
      2.remove unused bond->lock for monitor function, only use the existing rtnl lock().
      3.use rcu_read_lock() to protect it, of course, it will transform bond_for_each_slave to
      bond_for_each_slave_rcu() and performance is better, but in slow path, it is ignored.
      so I remove the bond->lock and move the rtnl lock to protect the whole monitor function.
      Signed-off-by: NDing Tianhong <dingtianhong@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      80b9d236
    • D
      bonding: remove bond read lock for bond_loadbalance_arp_mon() · 7f1bb571
      dingtianhong 提交于
      The bond slave list may change when the monitor is running, the slave list is no longer
      protected by bond->lock, only protected by rtnl lock(), so we have 3 ways to modify it:
      1.add bond_master_upper_dev_link() and bond_upper_dev_unlink() in bond->lock, but it is unsafe
      to call call_netdevice_notifiers() in write lock.
      2.remove unused bond->lock for monitor function, only use the existing rtnl lock().
      3.use rcu_read_lock() to protect it, of course, it will transform bond_for_each_slave to
      bond_for_each_slave_rcu() and performance is better, but in slow path, it is ignored.
      so I remove the bond->lock and add the rtnl lock to protect the whole monitor function.
      Signed-off-by: NDing Tianhong <dingtianhong@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7f1bb571
    • D
      bonding: remove bond read lock for bond_alb_monitor() · 2d0dafb0
      dingtianhong 提交于
      The bond slave list may change when the monitor is running, the slave list is no longer
      protected by bond->lock, only protected by rtnl lock(), so we have 3 ways to modify it:
      1.add bond_master_upper_dev_link() and bond_upper_dev_unlink() in bond->lock, but it is unsafe
      to call call_netdevice_notifiers() in write lock.
      2.remove unused bond->lock for monitor function, only use the existing rtnl lock().
      3.use rcu_read_lock() to protect it, of course, it will transform bond_for_each_slave to
      bond_for_each_slave_rcu() and performance is better, but in slow path, it is ignored.
      so I remove the bond->lock and move the rtnl lock to protect the whole monitor function.
      Signed-off-by: NDing Tianhong <dingtianhong@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2d0dafb0
    • D
      bonding: remove bond read lock for bond_mii_monitor() · 6b6c5261
      dingtianhong 提交于
      The bond slave list may change when the monitor is running, the slave list is no longer
      protected by bond->lock, only protected by rtnl lock(), so we have 3 ways to modify it:
      1.add bond_master_upper_dev_link() and bond_upper_dev_unlink() in bond->lock, but it is unsafe
      to call call_netdevice_notifiers() in write lock.
      2.remove unused bond->lock for monitor function, only use the existing rtnl lock().
      3.use rcu_read_lock() to protect it, of course, it will transform bond_for_each_slave to
      bond_for_each_slave_rcu() and performance is better, but in slow path, it is ignored.
      so I remove the bond->lock and move the rtnl lock to protect the whole monitor function.
      Signed-off-by: NDing Tianhong <dingtianhong@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6b6c5261
  2. 26 10月, 2013 6 次提交
    • D
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next · a00f6fcc
      David S. Miller 提交于
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates
      
      This series contains updates to igb, igbvf, i40e, ixgbe and ixgbevf.
      
      Dan Carpenter provides a patch for igbvf to fix a bug found by a static
      checker.  If the new MTU is very large, then "new_mtu + ETH_HLEN +
      ETH_FCS_LEN" can wrap and the check on the next line can underflow.
      
      Wei Yongjun provides 2 patches, the first against igbvf adds a missing
      iounmap() before the return from igbvf_probe().  The second against
      i40e, removes the include <linux/version.h> because it is not needed.
      
      Carolyn provides a patch for igb to fix a call to set the master/slave
      mode for all m88 generation 2 PHY's and removes the call for I210
      devices which do not need it.
      
      Stefan Assmann provides a patch for igb to fix an issue which was broke
      by:
         commit fa44f2f1
         Author: Greg Rose <gregory.v.rose@intel.com>
         Date:   Thu Jan 17 01:03:06 2013 -0800
         igb: Enable SR-IOV configuration via PCI sysfs interface
      which breaks the reloading of igb when VFs are assigned to a guest, in
      several ways.
      
      Jacob provides a patch for ixgbe and ixgbevf.  First, against ixgbe,
      cleans up ixgbe_enumerate_functions to reduce code complexity.  The
      second, against ixgbevf, adds support for ethtool's get_coalesce and
      set_coalesce command for the ixgbevf driver.
      
      Yijing Wang provides a patch for ixgbe to use pcie_capability_read_word()
      to simplify the code.
      
      Emil provides a ixgbe patch to fix an issue where the logic used to
      detect changes in rx-usecs was incorrect and was masked by the call to
      ixgbe_update_rsc().
      
      Don provides 2 patches for ixgbevf.  First creates a new function to set
      PSRTYPE.  The second bumps the ixgbevf driver version.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a00f6fcc
    • A
      net: fix rtnl notification in atomic context · 7f294054
      Alexei Starovoitov 提交于
      commit 991fb3f7 "dev: always advertise rx_flags changes via netlink"
      introduced rtnl notification from __dev_set_promiscuity(),
      which can be called in atomic context.
      
      Steps to reproduce:
      ip tuntap add dev tap1 mode tap
      ifconfig tap1 up
      tcpdump -nei tap1 &
      ip tuntap del dev tap1 mode tap
      
      [  271.627994] device tap1 left promiscuous mode
      [  271.639897] BUG: sleeping function called from invalid context at mm/slub.c:940
      [  271.664491] in_atomic(): 1, irqs_disabled(): 0, pid: 3394, name: ip
      [  271.677525] INFO: lockdep is turned off.
      [  271.690503] CPU: 0 PID: 3394 Comm: ip Tainted: G        W    3.12.0-rc3+ #73
      [  271.703996] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
      [  271.731254]  ffffffff81a58506 ffff8807f0d57a58 ffffffff817544e5 ffff88082fa0f428
      [  271.760261]  ffff8808071f5f40 ffff8807f0d57a88 ffffffff8108bad1 ffffffff81110ff8
      [  271.790683]  0000000000000010 00000000000000d0 00000000000000d0 ffff8807f0d57af8
      [  271.822332] Call Trace:
      [  271.838234]  [<ffffffff817544e5>] dump_stack+0x55/0x76
      [  271.854446]  [<ffffffff8108bad1>] __might_sleep+0x181/0x240
      [  271.870836]  [<ffffffff81110ff8>] ? rcu_irq_exit+0x68/0xb0
      [  271.887076]  [<ffffffff811a80be>] kmem_cache_alloc_node+0x4e/0x2a0
      [  271.903368]  [<ffffffff810b4ddc>] ? vprintk_emit+0x1dc/0x5a0
      [  271.919716]  [<ffffffff81614d67>] ? __alloc_skb+0x57/0x2a0
      [  271.936088]  [<ffffffff810b4de0>] ? vprintk_emit+0x1e0/0x5a0
      [  271.952504]  [<ffffffff81614d67>] __alloc_skb+0x57/0x2a0
      [  271.968902]  [<ffffffff8163a0b2>] rtmsg_ifinfo+0x52/0x100
      [  271.985302]  [<ffffffff8162ac6d>] __dev_notify_flags+0xad/0xc0
      [  272.001642]  [<ffffffff8162ad0c>] __dev_set_promiscuity+0x8c/0x1c0
      [  272.017917]  [<ffffffff81731ea5>] ? packet_notifier+0x5/0x380
      [  272.033961]  [<ffffffff8162b109>] dev_set_promiscuity+0x29/0x50
      [  272.049855]  [<ffffffff8172e937>] packet_dev_mc+0x87/0xc0
      [  272.065494]  [<ffffffff81732052>] packet_notifier+0x1b2/0x380
      [  272.080915]  [<ffffffff81731ea5>] ? packet_notifier+0x5/0x380
      [  272.096009]  [<ffffffff81761c66>] notifier_call_chain+0x66/0x150
      [  272.110803]  [<ffffffff8108503e>] __raw_notifier_call_chain+0xe/0x10
      [  272.125468]  [<ffffffff81085056>] raw_notifier_call_chain+0x16/0x20
      [  272.139984]  [<ffffffff81620190>] call_netdevice_notifiers_info+0x40/0x70
      [  272.154523]  [<ffffffff816201d6>] call_netdevice_notifiers+0x16/0x20
      [  272.168552]  [<ffffffff816224c5>] rollback_registered_many+0x145/0x240
      [  272.182263]  [<ffffffff81622641>] rollback_registered+0x31/0x40
      [  272.195369]  [<ffffffff816229c8>] unregister_netdevice_queue+0x58/0x90
      [  272.208230]  [<ffffffff81547ca0>] __tun_detach+0x140/0x340
      [  272.220686]  [<ffffffff81547ed6>] tun_chr_close+0x36/0x60
      Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
      Acked-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7f294054
    • H
      net: initialize hashrnd in flow_dissector with net_get_random_once · 66415cf8
      Hannes Frederic Sowa 提交于
      We also can defer the initialization of hashrnd in flow_dissector
      to its first use. Since net_get_random_once is irq safe now we don't
      have to audit the call paths if one of this functions get called by an
      interrupt handler.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      66415cf8
    • H
      net: make net_get_random_once irq safe · f84be2bd
      Hannes Frederic Sowa 提交于
      I initial build non irq safe version of net_get_random_once because I
      would liked to have the freedom to defer even the extraction process of
      get_random_bytes until the nonblocking pool is fully seeded.
      
      I don't think this is a good idea anymore and thus this patch makes
      net_get_random_once irq safe. Now someone using net_get_random_once does
      not need to care from where it is called.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f84be2bd
    • N
      net: add missing dev_put() in __netdev_adjacent_dev_insert · 974daef7
      Nikolay Aleksandrov 提交于
      I think that a dev_put() is needed in the error path to preserve the
      proper dev refcount.
      
      CC: Veaceslav Falico <vfalico@redhat.com>
      Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
      Acked-by: NVeaceslav Falico <vfalico@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      974daef7
    • H
      netem: markov loss model transition fix · 4a3ad7b3
      Hagen Paul Pfeifer 提交于
      The transition from markov state "3 => lost packets within a burst
      period" to "1 => successfully transmitted packets within a gap period"
      has no *additional* loss event. The loss already happen for transition
      from 1 -> 3, this additional loss will make things go wild.
      
      E.g. transition probabilities:
      
      p13:   10%
      p31:  100%
      
      Expected:
      
      Ploss = p13 / (p13 + p31)
      Ploss = ~9.09%
      
      ... but it isn't. Even worse: we get a double loss - each time.
      So simple don't return true to indicate loss, rather break and return
      false.
      Signed-off-by: NHagen Paul Pfeifer <hagen@jauu.net>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Stefano Salsano <stefano.salsano@uniroma2.it>
      Cc: Fabio Ludovici <fabio.ludovici@yahoo.it>
      Signed-off-by: NHagen Paul Pfeifer <hagen@jauu.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4a3ad7b3
  3. 24 10月, 2013 25 次提交