1. 27 5月, 2011 1 次提交
  2. 26 5月, 2011 2 次提交
    • F
      bonding: documentation and code cleanup for resend_igmp · 94265cf5
      Flavio Leitner 提交于
      Improves the documentation about how IGMP resend parameter
      works, fix two missing checks and coding style issues.
      Signed-off-by: NFlavio Leitner <fbl@redhat.com>
      Acked-by: NRick Jones <rick.jones2@hp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      94265cf5
    • N
      bonding: prevent deadlock on slave store with alb mode (v3) · 9fe0617d
      Neil Horman 提交于
      This soft lockup was recently reported:
      
      [root@dell-per715-01 ~]# echo +bond5 > /sys/class/net/bonding_masters
      [root@dell-per715-01 ~]# echo +eth1 > /sys/class/net/bond5/bonding/slaves
      bonding: bond5: doing slave updates when interface is down.
      bonding bond5: master_dev is not up in bond_enslave
      [root@dell-per715-01 ~]# echo -eth1 > /sys/class/net/bond5/bonding/slaves
      bonding: bond5: doing slave updates when interface is down.
      
      BUG: soft lockup - CPU#12 stuck for 60s! [bash:6444]
      CPU 12:
      Modules linked in: bonding autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc
      be2d
      Pid: 6444, comm: bash Not tainted 2.6.18-262.el5 #1
      RIP: 0010:[<ffffffff80064bf0>]  [<ffffffff80064bf0>]
      .text.lock.spinlock+0x26/00
      RSP: 0018:ffff810113167da8  EFLAGS: 00000286
      RAX: ffff810113167fd8 RBX: ffff810123a47800 RCX: 0000000000ff1025
      RDX: 0000000000000000 RSI: ffff810123a47800 RDI: ffff81021b57f6f8
      RBP: ffff81021b57f500 R08: 0000000000000000 R09: 000000000000000c
      R10: 00000000ffffffff R11: ffff81011d41c000 R12: ffff81021b57f000
      R13: 0000000000000000 R14: 0000000000000282 R15: 0000000000000282
      FS:  00002b3b41ef3f50(0000) GS:ffff810123b27940(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00002b3b456dd000 CR3: 000000031fc60000 CR4: 00000000000006e0
      
      Call Trace:
       [<ffffffff80064af9>] _spin_lock_bh+0x9/0x14
       [<ffffffff886937d7>] :bonding:tlb_clear_slave+0x22/0xa1
       [<ffffffff8869423c>] :bonding:bond_alb_deinit_slave+0xba/0xf0
       [<ffffffff8868dda6>] :bonding:bond_release+0x1b4/0x450
       [<ffffffff8006457b>] __down_write_nested+0x12/0x92
       [<ffffffff88696ae4>] :bonding:bonding_store_slaves+0x25c/0x2f7
       [<ffffffff801106f7>] sysfs_write_file+0xb9/0xe8
       [<ffffffff80016b87>] vfs_write+0xce/0x174
       [<ffffffff80017450>] sys_write+0x45/0x6e
       [<ffffffff8005d28d>] tracesys+0xd5/0xe0
      
      It occurs because we are able to change the slave configuarion of a bond while
      the bond interface is down.  The bonding driver initializes some data structures
      only after its ndo_open routine is called.  Among them is the initalization of
      the alb tx and rx hash locks.  So if we add or remove a slave without first
      opening the bond master device, we run the risk of trying to lock/unlock a
      spinlock that has garbage for data in it, which results in our above softlock.
      
      Note that sometimes this works, because in many cases an unlocked spinlock has
      the raw_lock parameter initialized to zero (meaning that the kzalloc of the
      net_device private data is equivalent to calling spin_lock_init), but thats not
      true in all cases, and we aren't guaranteed that condition, so we need to pass
      the relevant spinlocks through the spin_lock_init function.
      
      Fix it by moving the spin_lock_init calls for the tx and rx hashtable locks to
      the ndo_init path, so they are ready for use by the bond_store_slaves path.
      
      Change notes:
      v2) Based on conversation with Jay and Nicolas it seems that the ability to
      enslave devices while the bond master is down should be safe to do.  As such
      this is an outlier bug, and so instead we'll just initalize the errant spinlocks
      in the init path rather than the open path, solving the problem.  We'll also
      remove the warnings about the bond being down during enslave operations, since
      it should be safe
      
      v3) Fix spelling error
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      Reported-by: jtluka@redhat.com
      CC: Jay Vosburgh <fubar@us.ibm.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: nicolas.2p.debian@gmail.com
      CC: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: NJay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9fe0617d
  3. 23 5月, 2011 2 次提交
  4. 13 5月, 2011 1 次提交
  5. 10 5月, 2011 1 次提交
  6. 06 5月, 2011 1 次提交
  7. 30 4月, 2011 1 次提交
    • B
      ipv4, ipv6, bonding: Restore control over number of peer notifications · ad246c99
      Ben Hutchings 提交于
      For backward compatibility, we should retain the module parameters and
      sysfs attributes to control the number of peer notifications
      (gratuitous ARPs and unsolicited NAs) sent after bonding failover.
      Also, it is possible for failover to take place even though the new
      active slave does not have link up, and in that case the peer
      notification should be deferred until it does.
      
      Change ipv4 and ipv6 so they do not automatically send peer
      notifications on bonding failover.
      
      Change the bonding driver to send separate NETDEV_NOTIFY_PEERS
      notifications when the link is up, as many times as requested.  Since
      it does not directly control which protocols send notifications, make
      num_grat_arp and num_unsol_na aliases for a single parameter.  Bump
      the bonding version number and update its documentation.
      Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
      Signed-off-by: NJay Vosburgh <fubar@us.ibm.com>
      Acked-by: NBrian Haley <brian.haley@hp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ad246c99
  8. 26 4月, 2011 1 次提交
    • J
      bonding: move processing of recv handlers into handle_frame() · 3aba891d
      Jiri Pirko 提交于
      Since now when bonding uses rx_handler, all traffic going into bond
      device goes thru bond_handle_frame. So there's no need to go back into
      bonding code later via ptype handlers. This patch converts
      original ptype handlers into "bonding receive probes". These functions
      are called from bond_handle_frame and they are registered per-mode.
      
      Note that vlan packets are also handled because they are always untagged
      thanks to vlan_untag()
      
      Note that this also allows arpmon for eth-bond-bridge-vlan topology.
      Signed-off-by: NJiri Pirko <jpirko@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3aba891d
  9. 18 4月, 2011 1 次提交
  10. 17 4月, 2011 1 次提交
  11. 15 4月, 2011 3 次提交
  12. 05 4月, 2011 1 次提交
    • T
      net: Allow no-cache copy from user on transmit · c6e1a0d1
      Tom Herbert 提交于
      This patch uses __copy_from_user_nocache on transmit to bypass data
      cache for a performance improvement.  skb_add_data_nocache and
      skb_copy_to_page_nocache can be called by sendmsg functions to use
      this feature, initial support is in tcp_sendmsg.  This functionality is
      configurable per device using ethtool.
      
      Presumably, this feature would only be useful when the driver does
      not touch the data.  The feature is turned on by default if a device
      indicates that it does some form of checksum offload; it is off by
      default for devices that do no checksum offload or indicate no checksum
      is necessary.  For the former case copy-checksum is probably done
      anyway, in the latter case the device is likely loopback in which case
      the no cache copy is probably not beneficial.
      
      This patch was tested using 200 instances of netperf TCP_RR with
      1400 byte request and one byte reply.  Platform is 16 core AMD x86.
      
      No-cache copy disabled:
         672703 tps, 97.13% utilization
         50/90/99% latency:244.31 484.205 1028.41
      
      No-cache copy enabled:
         702113 tps, 96.16% utilization,
         50/90/99% latency 238.56 467.56 956.955
      
      Using 14000 byte request and response sizes demonstrate the
      effects more dramatically:
      
      No-cache copy disabled:
         79571 tps, 34.34 %utlization
         50/90/95% latency 1584.46 2319.59 5001.76
      
      No-cache copy enabled:
         83856 tps, 34.81% utilization
         50/90/95% latency 2508.42 2622.62 2735.88
      
      Note especially the effect on latency tail (95th percentile).
      
      This seems to provide a nice performance improvement and is
      consistent in the tests I ran.  Presumably, this would provide
      the greatest benfits in the presence of an application workload
      stressing the cache and a lot of transmit data happening.
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c6e1a0d1
  13. 24 3月, 2011 1 次提交
  14. 20 3月, 2011 1 次提交
  15. 17 3月, 2011 6 次提交
  16. 16 3月, 2011 2 次提交
  17. 13 3月, 2011 1 次提交
  18. 10 3月, 2011 1 次提交
  19. 08 3月, 2011 2 次提交
  20. 03 3月, 2011 1 次提交
  21. 28 2月, 2011 3 次提交
  22. 14 2月, 2011 2 次提交
  23. 25 1月, 2011 2 次提交
  24. 21 1月, 2011 1 次提交
    • N
      bonding: Ensure that we unshare skbs prior to calling pskb_may_pull · b3053251
      Neil Horman 提交于
      Recently reported oops:
      
      kernel BUG at net/core/skbuff.c:813!
      invalid opcode: 0000 [#1] SMP
      last sysfs file: /sys/devices/virtual/net/bond0/broadcast
      CPU 8
      Modules linked in: sit tunnel4 cpufreq_ondemand acpi_cpufreq freq_table bonding
      ipv6 dm_mirror dm_region_hash dm_log cdc_ether usbnet mii serio_raw i2c_i801
      i2c_core iTCO_wdt iTCO_vendor_support shpchp ioatdma i7core_edac edac_core bnx2
      ixgbe dca mdio sg ext4 mbcache jbd2 sd_mod crc_t10dif mptsas mptscsih mptbase
      scsi_transport_sas dm_mod [last unloaded: microcode]
      
      Modules linked in: sit tunnel4 cpufreq_ondemand acpi_cpufreq freq_table bonding
      ipv6 dm_mirror dm_region_hash dm_log cdc_ether usbnet mii serio_raw i2c_i801
      i2c_core iTCO_wdt iTCO_vendor_support shpchp ioatdma i7core_edac edac_core bnx2
      ixgbe dca mdio sg ext4 mbcache jbd2 sd_mod crc_t10dif mptsas mptscsih mptbase
      scsi_transport_sas dm_mod [last unloaded: microcode]
      Pid: 0, comm: swapper Not tainted 2.6.32-71.el6.x86_64 #1 BladeCenter HS22
      -[7870AC1]-
      RIP: 0010:[<ffffffff81405b16>]  [<ffffffff81405b16>]
      pskb_expand_head+0x36/0x1e0
      RSP: 0018:ffff880028303b70  EFLAGS: 00010202
      RAX: 0000000000000002 RBX: ffff880c6458ec80 RCX: 0000000000000020
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880c6458ec80
      RBP: ffff880028303bc0 R08: ffffffff818a6180 R09: ffff880c6458ed64
      R10: ffff880c622b36c0 R11: 0000000000000400 R12: 0000000000000000
      R13: 0000000000000180 R14: ffff880c622b3000 R15: 0000000000000000
      FS:  0000000000000000(0000) GS:ffff880028300000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 00000038653452a4 CR3: 0000000001001000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process swapper (pid: 0, threadinfo ffff8806649c2000, task ffff880c64f16ab0)
      Stack:
       ffff880028303bc0 ffffffff8104fff9 000000000000001c 0000000100000000
      <0> ffff880000047d80 ffff880c6458ec80 000000000000001c ffff880c6223da00
      <0> ffff880c622b3000 0000000000000000 ffff880028303c10 ffffffff81407f7a
      Call Trace:
      <IRQ>
       [<ffffffff8104fff9>] ? __wake_up_common+0x59/0x90
       [<ffffffff81407f7a>] __pskb_pull_tail+0x2aa/0x360
       [<ffffffffa0244530>] bond_arp_rcv+0x2c0/0x2e0 [bonding]
       [<ffffffff814a0857>] ? packet_rcv+0x377/0x440
       [<ffffffff8140f21b>] netif_receive_skb+0x2db/0x670
       [<ffffffff8140f788>] napi_skb_finish+0x58/0x70
       [<ffffffff8140fc89>] napi_gro_receive+0x39/0x50
       [<ffffffffa01286eb>] ixgbe_clean_rx_irq+0x35b/0x900 [ixgbe]
       [<ffffffffa01290f6>] ixgbe_clean_rxtx_many+0x136/0x240 [ixgbe]
       [<ffffffff8140fe53>] net_rx_action+0x103/0x210
       [<ffffffff81073bd7>] __do_softirq+0xb7/0x1e0
       [<ffffffff810d8740>] ? handle_IRQ_event+0x60/0x170
       [<ffffffff810142cc>] call_softirq+0x1c/0x30
       [<ffffffff81015f35>] do_softirq+0x65/0xa0
       [<ffffffff810739d5>] irq_exit+0x85/0x90
       [<ffffffff814cf915>] do_IRQ+0x75/0xf0
       [<ffffffff81013ad3>] ret_from_intr+0x0/0x11
       <EOI>
       [<ffffffff8101bc01>] ? mwait_idle+0x71/0xd0
       [<ffffffff814cd80a>] ? atomic_notifier_call_chain+0x1a/0x20
       [<ffffffff81011e96>] cpu_idle+0xb6/0x110
       [<ffffffff814c17c8>] start_secondary+0x1fc/0x23f
      
      Resulted from bonding driver registering packet handlers via dev_add_pack and
      then trying to call pskb_may_pull. If another packet handler (like for AF_PACKET
      sockets) gets called first, the delivered skb will have a user count > 1, which
      causes pskb_may_pull to BUG halt when it does its skb_shared check.  Fix this by
      calling skb_share_check prior to the may_pull call sites in the bonding driver
      to clone the skb when needed.  Tested by myself and the reported successfully.
      
      Signed-off-by: Neil Horman
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: Jay Vosburgh <fubar@us.ibm.com>
      CC: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: NJay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: NAndy Gospodarek <andy@greyhouse.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b3053251
  25. 17 12月, 2010 1 次提交