1. 09 7月, 2008 2 次提交
    • D
      netdev: Move queue_lock into struct netdev_queue. · dc2b4847
      David S. Miller 提交于
      The lock is now an attribute of the device queue.
      
      One thing to notice is that "suspicious" places
      emerge which will need specific training about
      multiple queue handling.  They are so marked with
      explicit "netdev->rx_queue" and "netdev->tx_queue"
      references.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dc2b4847
    • D
      netdev: Create netdev_queue abstraction. · bb949fbd
      David S. Miller 提交于
      A netdev_queue is an entity managed by a qdisc.
      
      Currently there is one RX and one TX queue, and a netdev_queue merely
      contains a backpointer to the net_device.
      
      The Qdisc struct is augmented with a netdev_queue pointer as well.
      
      Eventually the 'dev' Qdisc member will go away and we will have the
      resulting hierarchy:
      
      	net_device --> netdev_queue --> Qdisc
      
      Also, qdisc_alloc() and qdisc_create_dflt() now take a netdev_queue
      pointer argument.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bb949fbd
  2. 07 7月, 2008 1 次提交
  3. 06 7月, 2008 1 次提交
  4. 02 7月, 2008 4 次提交
  5. 28 6月, 2008 2 次提交
  6. 21 6月, 2008 1 次提交
    • E
      netns: Don't receive new packets in a dead network namespace. · b9f75f45
      Eric W. Biederman 提交于
      Alexey Dobriyan <adobriyan@gmail.com> writes:
      > Subject: ICMP sockets destruction vs ICMP packets oops
      
      > After icmp_sk_exit() nuked ICMP sockets, we get an interrupt.
      > icmp_reply() wants ICMP socket.
      >
      > Steps to reproduce:
      >
      > 	launch shell in new netns
      > 	move real NIC to netns
      > 	setup routing
      > 	ping -i 0
      > 	exit from shell
      >
      > BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
      > IP: [<ffffffff803fce17>] icmp_sk+0x17/0x30
      > PGD 17f3cd067 PUD 17f3ce067 PMD 0 
      > Oops: 0000 [1] PREEMPT SMP DEBUG_PAGEALLOC
      > CPU 0 
      > Modules linked in: usblp usbcore
      > Pid: 0, comm: swapper Not tainted 2.6.26-rc6-netns-ct #4
      > RIP: 0010:[<ffffffff803fce17>]  [<ffffffff803fce17>] icmp_sk+0x17/0x30
      > RSP: 0018:ffffffff8057fc30  EFLAGS: 00010286
      > RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff81017c7db900
      > RDX: 0000000000000034 RSI: ffff81017c7db900 RDI: ffff81017dc41800
      > RBP: ffffffff8057fc40 R08: 0000000000000001 R09: 000000000000a815
      > R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff8057fd28
      > R13: ffffffff8057fd00 R14: ffff81017c7db938 R15: ffff81017dc41800
      > FS:  0000000000000000(0000) GS:ffffffff80525000(0000) knlGS:0000000000000000
      > CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      > CR2: 0000000000000000 CR3: 000000017fcda000 CR4: 00000000000006e0
      > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      > Process swapper (pid: 0, threadinfo ffffffff8053a000, task ffffffff804fa4a0)
      > Stack:  0000000000000000 ffff81017c7db900 ffffffff8057fcf0 ffffffff803fcfe4
      >  ffffffff804faa38 0000000000000246 0000000000005a40 0000000000000246
      >  000000000001ffff ffff81017dd68dc0 0000000000005a40 0000000055342436
      > Call Trace:
      >  <IRQ>  [<ffffffff803fcfe4>] icmp_reply+0x44/0x1e0
      >  [<ffffffff803d3a0a>] ? ip_route_input+0x23a/0x1360
      >  [<ffffffff803fd645>] icmp_echo+0x65/0x70
      >  [<ffffffff803fd300>] icmp_rcv+0x180/0x1b0
      >  [<ffffffff803d6d84>] ip_local_deliver+0xf4/0x1f0
      >  [<ffffffff803d71bb>] ip_rcv+0x33b/0x650
      >  [<ffffffff803bb16a>] netif_receive_skb+0x27a/0x340
      >  [<ffffffff803be57d>] process_backlog+0x9d/0x100
      >  [<ffffffff803bdd4d>] net_rx_action+0x18d/0x250
      >  [<ffffffff80237be5>] __do_softirq+0x75/0x100
      >  [<ffffffff8020c97c>] call_softirq+0x1c/0x30
      >  [<ffffffff8020f085>] do_softirq+0x65/0xa0
      >  [<ffffffff80237af7>] irq_exit+0x97/0xa0
      >  [<ffffffff8020f198>] do_IRQ+0xa8/0x130
      >  [<ffffffff80212ee0>] ? mwait_idle+0x0/0x60
      >  [<ffffffff8020bc46>] ret_from_intr+0x0/0xf
      >  <EOI>  [<ffffffff80212f2c>] ? mwait_idle+0x4c/0x60
      >  [<ffffffff80212f23>] ? mwait_idle+0x43/0x60
      >  [<ffffffff8020a217>] ? cpu_idle+0x57/0xa0
      >  [<ffffffff8040f380>] ? rest_init+0x70/0x80
      > Code: 10 5b 41 5c 41 5d 41 5e c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 53
      > 48 83 ec 08 48 8b 9f 78 01 00 00 e8 2b c7 f1 ff 89 c0 <48> 8b 04 c3 48 83 c4 08
      > 5b c9 c3 66 66 66 66 66 2e 0f 1f 84 00
      > RIP  [<ffffffff803fce17>] icmp_sk+0x17/0x30
      >  RSP <ffffffff8057fc30>
      > CR2: 0000000000000000
      > ---[ end trace ea161157b76b33e8 ]---
      > Kernel panic - not syncing: Aiee, killing interrupt handler!
      
      Receiving packets while we are cleaning up a network namespace is a
      racy proposition. It is possible when the packet arrives that we have
      removed some but not all of the state we need to fully process it.  We
      have the choice of either playing wack-a-mole with the cleanup routines
      or simply dropping packets when we don't have a network namespace to
      handle them.
      
      Since the check looks inexpensive in netif_receive_skb let's just
      drop the incoming packets.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b9f75f45
  7. 20 6月, 2008 2 次提交
  8. 18 6月, 2008 4 次提交
    • W
      netdevice: Fix promiscuity and allmulti overflow · dad9b335
      Wang Chen 提交于
      Max of promiscuity and allmulti plus positive @inc can cause overflow.
      Fox example: when allmulti=0xFFFFFFFF, any caller give dev_set_allmulti() a
      positive @inc will cause allmulti be off.
      This is not what we want, though it's rare case.
      The fix is that only negative @inc will cause allmulti or promiscuity be off
      and when any caller makes the counters touch the roof, we return error.
      
      Change of v2:
      Change void function dev_set_promiscuity/allmulti to return int.
      So callers can get the overflow error.
      Caller's fix will be done later.
      
      Change of v3:
      1. Since we return error to caller, we don't need to print KERN_ERROR,
      KERN_WARNING is enough.
      2. In dev_set_promiscuity(), if __dev_set_promiscuity() failed, we
      return at once.
      Signed-off-by: NWang Chen <wangchen@cn.fujitsu.com>
      Acked-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dad9b335
    • D
      net: Add sk_set_socket() helper. · 972692e0
      David S. Miller 提交于
      In order to more easily grep for all things that set
      sk->sk_socket, add sk_set_socket() helper inline function.
      
      Suggested (although only half-seriously) by Evgeniy Polyakov.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      972692e0
    • J
      bonding: Allow setting max_bonds to zero · b8a9787e
      Jay Vosburgh 提交于
      	Permit bonding to function rationally if max_bonds is set to
      zero.  This will load the module, but create no master devices (which can
      be created via sysfs).
      
      	Requires some change to bond_create_sysfs; currently, the
      netdev sysfs directory is determined from the first bonding device created,
      but this is no longer possible.  Instead, an interface from net/core is
      created to create and destroy files in net_class.
      
      	Based on a patch submitted by Phil Oester <kernel@linuxaces.com>.
      Modified by Jay Vosburgh to fix the sysfs issue mentioned above and to
      update the documentation.
      Signed-off-by: NPhil Oester <kernel@linuxace.com>
      Signed-off-by: NJay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
      b8a9787e
    • O
      net/core: add NETDEV_BONDING_FAILOVER event · c1da4ac7
      Or Gerlitz 提交于
      Add NETDEV_BONDING_FAILOVER event to be used in a successive patch
      by bonding to announce fail-over for the active-backup mode through the
      netdev events notifier chain mechanism. Such an event can be of use for the
      RDMA CM (communication manager) to let native RDMA ULPs (eg NFS-RDMA, iSER)
      always be aligned with the IP stack, in the sense that they use the same
      ports/links as the stack does. More usages can be done to allow monitoring
      tools based on netlink events being aware to bonding fail-over.
      Signed-off-by: NOr Gerlitz <ogerlitz@voltaire.com>
      Signed-off-by: NJay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
      c1da4ac7
  9. 17 6月, 2008 1 次提交
    • B
      net: Fix test for VLAN TX checksum offload capability · 6de329e2
      Ben Hutchings 提交于
      Selected device feature bits can be propagated to VLAN devices, so we
      can make use of TX checksum offload and TSO on VLAN-tagged packets.
      However, if the physical device does not do VLAN tag insertion or
      generic checksum offload then the test for TX checksum offload in
      dev_queue_xmit() will see a protocol of htons(ETH_P_8021Q) and yield
      false.
      
      This splits the checksum offload test into two functions:
      
      - can_checksum_protocol() tests a given protocol against a feature bitmask
      
      - dev_can_checksum() first tests the skb protocol against the device
        features; if that fails and the protocol is htons(ETH_P_8021Q) then
        it tests the encapsulated protocol against the effective device
        features for VLANs
      Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
      Acked-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6de329e2
  10. 12 6月, 2008 1 次提交
  11. 05 6月, 2008 1 次提交
  12. 04 6月, 2008 3 次提交
  13. 22 5月, 2008 1 次提交
  14. 21 5月, 2008 2 次提交
  15. 20 5月, 2008 1 次提交
  16. 15 5月, 2008 1 次提交
  17. 14 5月, 2008 1 次提交
  18. 13 5月, 2008 1 次提交
  19. 08 5月, 2008 2 次提交
    • B
      net: Added ASSERT_RTNL() to dev_open() and dev_close(). · e46b66bc
      Ben Hutchings 提交于
      dev_open() and dev_close() must be called holding the RTNL, since they
      call device functions and netdevice notifiers that are promised the RTNL.
      Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e46b66bc
    • P
      netns: Fix arbitrary net_device-s corruptions on net_ns stop. · aca51397
      Pavel Emelyanov 提交于
      When a net namespace is destroyed, some devices (those, not killed
      on ns stop explicitly) are moved back to init_net.
      
      The problem, is that this net_ns change has one point of failure -
      the __dev_alloc_name() may be called if a name collision occurs (and
      this is easy to trigger). This allocator performs a likely-to-fail
      GFP_ATOMIC allocation to find a suitable number. Other possible 
      conditions that may cause error (for device being ns local or not
      registered) are always false in this case.
      
      So, when this call fails, the device is unregistered. But this is
      *not* the right thing to do, since after this the device may be
      released (and kfree-ed) improperly. E. g. bridges require more
      actions (sysfs update, timer disarming, etc.), some other devices 
      want to remove their private areas from lists, etc.
      
      I. e. arbitrary use-after-free cases may occur.
      
      The proposed fix is the following: since the only reason for the
      dev_change_net_namespace to fail is the name generation, we may
      give it a unique fall-back name w/o %d-s in it - the dev<ifindex>
      one, since ifindexes are still unique.
      
      So make this change, raise the failure-case printk loglevel to 
      EMERG and replace the unregister_netdevice call with BUG().
      
      [ Use snprintf() -DaveM ]
      Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      aca51397
  20. 04 5月, 2008 1 次提交
  21. 03 5月, 2008 4 次提交
  22. 02 5月, 2008 1 次提交
  23. 29 4月, 2008 1 次提交
  24. 25 4月, 2008 1 次提交
    • M
      ethtool: EEPROM dump no longer works for tg3 and natsemi · c5835df9
      Mandeep Singh Baines 提交于
      In the ethtool user-space application, tg3 and natsemi over-ride the
      default implementation of dump_eeprom(). In both tg3_dump_eeprom() and
      natsemi_dump_eeprom(), there is a magic number check which is not
      present in the default implementation.
      
      Commit b131dd5d ("[ETHTOOL]: Add support for large eeproms") snipped
      the code which copied the ethtool_eeprom structure back to
      user-space. tg3 and natsemi are over-writing the magic number field
      and then checking it in user-space. With the ethtool_eeprom copy
      removed, the check is failing.
      
      The fix is simple. Add the ethtool_eeprom copy back.
      Signed-off-by: NMandeep Singh Baines <msb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c5835df9