1. 17 Oct, 2008 1 commit
  2. 08 Oct, 2008 2 commits
    • net: Fix netdev_run_todo dead-lock · 58ec3b4d
      Committed by Herbert Xu
      Benjamin Thery tracked down a bug that explains many instances
      of the error
      
      unregister_netdevice: waiting for %s to become free. Usage count = %d
      
      It turns out that netdev_run_todo can dead-lock with itself if
      a second instance of it is run in a thread that will then free
      a reference to the device waited on by the first instance.
      
      The problem is really quite silly.  We were trying to create
      parallelism where none was required.  As netdev_run_todo always
      follows a RTNL section, and todo tasks can only be added
      with the RTNL held, by definition you should only need to wait
      for the very ones that you've added and be done with it.
      
      There is no need for a second mutex or spinlock.
      
      This is exactly what the following patch does.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
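The reasoning in the netdev_run_todo commit message can be pictured with a small single-threaded model. This is only a sketch under stated assumptions, not the kernel code: `todo_item`, `todo_add`, `todo_run_and_unlock` and the `model_rtnl_*` names are hypothetical, and a plain flag stands in for the RTNL. The point it illustrates is that the caller detaches everything queued during its own locked section and then waits only on that private snapshot, so no second lock is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a device queued for final teardown. */
struct todo_item {
    struct todo_item *next;
    int done;                        /* set once the item has been processed */
};

static int model_rtnl_held;          /* 1 while the modelled RTNL is held */
static struct todo_item *todo_head;  /* shared list of pending items */

static void model_rtnl_lock(void)
{
    assert(!model_rtnl_held);
    model_rtnl_held = 1;
}

/* Items may only be queued while the lock is held. */
static void todo_add(struct todo_item *item)
{
    assert(model_rtnl_held);
    item->done = 0;
    item->next = todo_head;
    todo_head = item;
}

/*
 * Ends a locked section: detach everything queued so far, release the
 * lock, then process only the private snapshot. Returns the item count.
 */
static int todo_run_and_unlock(void)
{
    struct todo_item *snapshot;
    int n = 0;

    assert(model_rtnl_held);
    snapshot = todo_head;
    todo_head = NULL;
    model_rtnl_held = 0;

    while (snapshot) {
        struct todo_item *next = snapshot->next;

        snapshot->next = NULL;
        snapshot->done = 1;          /* stands in for waiting on the refcount */
        snapshot = next;
        n++;
    }
    return n;
}
```

Because the shared list is already empty when the lock is released, a later caller in this model can only ever see items it queued itself.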
    • net: only invoke dev->change_rx_flags when device is UP · b6c40d68
      Committed by Patrick McHardy
      Jesper Dangaard Brouer <hawk@comx.dk> reported a bug when setting a VLAN
      device down that is in promiscuous mode:
      
      When the VLAN device is set down, the promiscuous count on the real
      device is decremented by one by vlan_dev_stop(). When removing the
      promiscuous flag from the VLAN device afterwards, the promiscuous
      count on the real device is decremented a second time by the
      vlan_change_rx_flags() callback.
      
      The root cause for this is that the ->change_rx_flags() callback is
      invoked while the device is down. The synchronization is meant to mirror
      the behaviour of the ->set_rx_mode callbacks, meaning the ->open function
      is responsible for doing a full sync on open, the ->close() function is
      responsible for doing full cleanup on ->stop() and ->change_rx_flags()
      is meant to do incremental changes while the device is UP.
      
      Only invoke ->change_rx_flags() while the device is UP to provide the
      intended behaviour.
      Tested-by: Jesper Dangaard Brouer <jdb@comx.dk>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
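The double decrement described in the change_rx_flags commit can be reproduced in a toy model. Everything here is a hypothetical sketch (the `model_*` names, the flag values, and the single `lower_promisc` counter are assumptions for illustration, not the kernel's structures): the open and stop paths do the full sync and cleanup, and the incremental callback is only invoked while the device is UP.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_IFF_UP      0x1u
#define MODEL_IFF_PROMISC 0x2u

struct model_dev {
    unsigned int flags;
    int lower_promisc;   /* promiscuity count held on the underlying device */
    void (*change_rx_flags)(struct model_dev *dev, unsigned int changed);
};

/* Incremental update, modelled on what a VLAN-like device would do. */
static void model_vlan_change_rx_flags(struct model_dev *dev, unsigned int changed)
{
    if (changed & MODEL_IFF_PROMISC)
        dev->lower_promisc += (dev->flags & MODEL_IFF_PROMISC) ? 1 : -1;
}

/* Full sync on open. */
static void model_open(struct model_dev *dev)
{
    if (dev->flags & MODEL_IFF_PROMISC)
        dev->lower_promisc++;
    dev->flags |= MODEL_IFF_UP;
}

/* Full cleanup on stop. */
static void model_stop(struct model_dev *dev)
{
    if (dev->flags & MODEL_IFF_PROMISC)
        dev->lower_promisc--;
    dev->flags &= ~MODEL_IFF_UP;
}

static void model_set_promisc(struct model_dev *dev, int on)
{
    unsigned int old;

    assert(dev != NULL);
    old = dev->flags;
    if (on)
        dev->flags |= MODEL_IFF_PROMISC;
    else
        dev->flags &= ~MODEL_IFF_PROMISC;

    /* The guard from the commit: only call back while the device is UP. */
    if ((old ^ dev->flags) && (dev->flags & MODEL_IFF_UP) && dev->change_rx_flags)
        dev->change_rx_flags(dev, old ^ dev->flags);
}
```

Without the `MODEL_IFF_UP` check, clearing the promiscuous flag after `model_stop()` would drive `lower_promisc` to -1 in this model.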
  3. 30 Sep, 2008 2 commits
  4. 24 Sep, 2008 1 commit
  5. 23 Sep, 2008 1 commit
    • net: network device name ifalias support · 0b815a1a
      Committed by Stephen Hemminger
      This patch adds support for keeping an additional character alias
      associated with a network interface. This is useful for maintaining
      the SNMP ifAlias value which is a user defined value. Routers use this
      to hold information like which circuit or line it is connected to. It
      is just an arbitrary text label on the network device.
      
      There are two exposed interfaces with this patch, the value can be
      read/written either via netlink or sysfs.
      
      This could be maintained just by the snmp daemon, but it is more
      generally useful for other management tools, and the kernel is a good
      place to act as an agreed-upon interface to store it.
      Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
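As an illustration of the sysfs side of the ifalias interface, the alias can be read from user space as an ordinary file. The sketch below assumes the attribute is exposed at `/sys/class/net/<ifname>/ifalias`; the helper names `ifalias_sysfs_path` and `ifalias_read` are made up for this example, and the netlink path is not shown.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "/sys/class/net/<ifname>/ifalias"; 0 on success, -1 if it does not fit. */
static int ifalias_sysfs_path(char *buf, size_t len, const char *ifname)
{
    int n;

    assert(buf != NULL && ifname != NULL);
    n = snprintf(buf, len, "/sys/class/net/%s/ifalias", ifname);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read the alias into out (newline stripped); 0 on success, -1 on failure. */
static int ifalias_read(const char *ifname, char *out, size_t len)
{
    char path[128];
    FILE *f;

    if (len == 0 || ifalias_sysfs_path(path, sizeof(path), ifname) != 0)
        return -1;
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(out, (int)len, f))
        out[0] = '\0';               /* an empty file means no alias is set */
    fclose(f);
    out[strcspn(out, "\n")] = '\0';
    return 0;
}
```

A caller would pass an interface name such as `"eth0"` and a buffer; a return of -1 covers both a missing interface and a kernel without the attribute.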
  6. 21 Sep, 2008 2 commits
  7. 09 Sep, 2008 1 commit
    • net: Enable TSO if supported by at least one device · e2a6b852
      Committed by Herbert Xu
      As it stands users of netdev_compute_features (e.g., bridges/bonding)
      will only enable TSO if all constituent devices support it.  This
      is unnecessarily pessimistic since even on devices that do not
      support hardware TSO and SG, emulated TSO still performs on a par
      with TSO off.
      
      This patch enables TSO if at least one constituent device supports
      it in hardware.
      
      The direct beneficiaries will be virtualisation that uses bridging
      since this means that TSO will always be enabled for communication
      from the host to the guests.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
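The policy change in the TSO commit amounts to combining one feature bit with OR instead of AND. A rough sketch, assuming hypothetical `MODEL_F_*` bit values and a made-up `model_compute_features()` helper (the real netdev_compute_features also deals with checksum dependencies that are not modelled here):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature bits for this sketch only. */
#define MODEL_F_SG   0x1u
#define MODEL_F_TSO  0x2u
#define MODEL_F_CSUM 0x4u

/*
 * Combine the features of n constituent devices: most bits survive only
 * if every device has them, while TSO survives if any device has it.
 */
static unsigned int model_compute_features(const unsigned int *feat, size_t n)
{
    unsigned int all = ~0u;
    unsigned int any_tso = 0;
    size_t i;

    assert(feat != NULL && n > 0);
    for (i = 0; i < n; i++) {
        all &= feat[i];
        any_tso |= feat[i] & MODEL_F_TSO;
    }
    return (all & ~MODEL_F_TSO) | any_tso;
}
```

For a bridge with one TSO-capable port and one plain port, the result keeps `MODEL_F_TSO` but drops `MODEL_F_SG`, which matches the commit's claim that emulation covers the ports lacking hardware support.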
  8. 08 Sep, 2008 1 commit
  9. 19 Aug, 2008 1 commit
    • pkt_sched: Prevent livelock in TX queue running. · 195648bb
      Committed by David S. Miller
      If dev_deactivate() is trying to quiesce the queue, it
      is theoretically possible for another cpu to livelock
      trying to process that queue.  This happens because
      dev_deactivate() grabs the queue spinlock as it checks
      the queue state, whereas net_tx_action() does a trylock
      and reschedules the qdisc if it hits the lock.
      
      This breaks the livelock by adding a check on
      __QDISC_STATE_DEACTIVATED to net_tx_action() when
      the trylock fails.
      
      Based upon feedback from Herbert Xu and Jarek Poplawski.
      Signed-off-by: David S. Miller <davem@davemloft.net>
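The decision the livelock commit adds to the trylock-failure path can be modelled in a few lines. This is a sketch with hypothetical names (`model_qdisc`, `model_tx_action_one`, and the outcome enum are not kernel identifiers), where an integer flag stands in for the queue spinlock being held by another CPU.

```c
#include <assert.h>
#include <stddef.h>

enum model_outcome {
    MODEL_RAN,           /* lock obtained, queue processed */
    MODEL_RESCHEDULED,   /* lock busy, try again later */
    MODEL_SKIPPED        /* lock busy and queue is being deactivated */
};

struct model_qdisc {
    int locked;          /* 1 if another CPU holds the queue lock */
    int deactivated;     /* models __QDISC_STATE_DEACTIVATED */
    int run_count;
};

/* One pass of the TX softirq over a single scheduled qdisc. */
static enum model_outcome model_tx_action_one(struct model_qdisc *q)
{
    assert(q != NULL);
    if (!q->locked) {                /* trylock succeeded */
        q->run_count++;
        return MODEL_RAN;
    }
    if (!q->deactivated)             /* contended but still active: retry */
        return MODEL_RESCHEDULED;
    return MODEL_SKIPPED;            /* do not spin against the deactivation */
}
```

Before the fix, the last branch did not exist in this model's terms: a busy lock always led to a reschedule, which is how the softirq could keep chasing a queue that dev_deactivate() was waiting to quiesce.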
  10. 18 Aug, 2008 3 commits
  11. 07 Aug, 2008 3 commits
  12. 05 Aug, 2008 1 commit
  13. 04 Aug, 2008 1 commit
  14. 03 Aug, 2008 2 commits
  15. 01 Aug, 2008 1 commit
  16. 30 Jul, 2008 1 commit
    • pkt_sched: Fix OOPS on ingress qdisc add. · 8d50b53d
      Committed by David S. Miller
      Bug report from Steven Jan Springl:
      
      	Issuing the following command causes a kernel oops:
      		tc qdisc add dev eth0 handle ffff: ingress
      
      The problem mostly stems from all of the special case handling of
      ingress qdiscs.
      
      So, to fix this, do the grafting operation the same way we do for TX
      qdiscs.  Which means that dev_activate() and dev_deactivate() now do
      the "qdisc_sleeping <--> qdisc" transitions on dev->rx_queue too.
      
      Future simplifications are possible now, mainly because it is
      impossible for dev_queue->{qdisc,qdisc_sleeping} to be NULL.  There
      are NULL checks all over to handle the ingress qdisc special case
      that used to exist before this commit.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  17. 26 Jul, 2008 1 commit
  18. 24 Jul, 2008 1 commit
    • netdev: Remove warning from __netif_schedule(). · 5b3ab1db
      Committed by David S. Miller
      It isn't helping anything and we aren't going to be able to change all
      the drivers that do queue wakeups in strange situations.
      
      Just letting a noop_qdisc get scheduled will work because when
      qdisc_run() executes via net_tx_work() it will simply find no packets
      pending when it makes the ->dequeue() call in qdisc_restart.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  19. 23 Jul, 2008 2 commits
  20. 22 Jul, 2008 4 commits
  21. 20 Jul, 2008 1 commit
  22. 19 Jul, 2008 1 commit
    • pkt_sched: Manage qdisc list inside of root qdisc. · 30723673
      Committed by David S. Miller
      Idea is from Patrick McHardy.
      
      Instead of managing the list of qdiscs on the device level, manage it
      in the root qdisc of a netdev_queue.  This solves all kinds of
      visibility issues during qdisc destruction.
      
      The way to iterate over all qdiscs of a netdev_queue is to visit
      the netdev_queue->qdisc, and then traverse its list.
      
      The only special case is to ignore builtin qdiscs at the root when
      dumping or doing a qdisc_lookup().  That was not needed previously
      because builtin qdiscs were not added to the device's qdisc_list.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  23. 18 Jul, 2008 6 commits
    • pkt_sched: Kill netdev_queue lock. · 83874000
      Committed by David S. Miller
      We can simply use the qdisc->q.lock for all of the
      qdisc tree synchronization.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdevice: Move qdisc_list back into net_device proper. · ead81cc5
      Committed by David S. Miller
      And give it its own lock.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pkt_sched: Schedule qdiscs instead of netdev_queue. · 37437bb2
      Committed by David S. Miller
      When we have shared qdiscs, packets come out of the qdiscs
      for multiple transmit queues.
      
      Therefore it doesn't make any sense to schedule the transmit
      queue when logically we cannot know ahead of time the TX
      queue of the SKB that the qdisc->dequeue() will give us.
      
      Just for sanity I added a BUG check to make sure we never
      get into a state where the noop_qdisc is scheduled.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Implement simple sw TX hashing. · 8f0f2223
      Committed by David S. Miller
      It just xor hashes over IPv4/IPv6 addresses and ports of transport.
      
      The only assumption it makes is that skb_network_header() is set
      correctly.
      
      With bug fixes from Eric Dumazet.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdev: Add netdev->select_queue() method. · eae792b7
      Committed by David S. Miller
      Devices or device layers can set this to control the queue selection
      performed by dev_pick_tx().
      
      This function runs under RCU protection, which allows overriding
      functions to have some way of synchronizing with things like dynamic
      ->real_num_tx_queues adjustments.
      
      This makes the spinlock prefetch in dev_queue_xmit() a little bit
      less effective, but that's the price right now for correctness.
      Signed-off-by: David S. Miller <davem@davemloft.net>
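The override mechanism from the select_queue commit is an optional function pointer consulted before the default choice. The sketch below uses hypothetical names (`model_netdev`, `model_pick_tx`, `model_pick_last`) and a simplified signature that takes a precomputed flow hash instead of a packet; RCU protection and the final range clamp are not taken from the commit and are either omitted or added here as assumptions.

```c
#include <assert.h>
#include <stddef.h>

struct model_netdev;
typedef unsigned int (*model_select_queue_fn)(const struct model_netdev *dev,
                                              unsigned int flow_hash);

struct model_netdev {
    unsigned int real_num_tx_queues;
    model_select_queue_fn select_queue;  /* optional driver override */
};

/* Choose a TX queue index: driver hook first, default hashing otherwise. */
static unsigned int model_pick_tx(const struct model_netdev *dev,
                                  unsigned int flow_hash)
{
    unsigned int idx = 0;

    assert(dev != NULL && dev->real_num_tx_queues > 0);
    if (dev->select_queue)
        idx = dev->select_queue(dev, flow_hash);
    else if (dev->real_num_tx_queues > 1)
        idx = flow_hash % dev->real_num_tx_queues;

    if (idx >= dev->real_num_tx_queues)  /* defensive clamp, an assumption */
        idx = 0;
    return idx;
}

/* Example override: always use the last active queue. */
static unsigned int model_pick_last(const struct model_netdev *dev,
                                    unsigned int flow_hash)
{
    (void)flow_hash;
    return dev->real_num_tx_queues - 1;
}
```

A device that leaves the hook NULL gets the default behaviour, so single-queue drivers need no changes in this model.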
    • net: Use queue aware tests throughout. · fd2ea0a7
      Committed by David S. Miller
      This effectively "flips the switch" by making the core networking
      and multiqueue-aware drivers use the new TX multiqueue structures.
      
      Non-multiqueue drivers need no changes.  The interfaces they use such
      as netif_stop_queue() degenerate into an operation on TX queue zero.
      So everything "just works" for them.
      
      Code that really wants to do "X" to all TX queues now invokes a
      routine that does so, such as netif_tx_wake_all_queues(),
      netif_tx_stop_all_queues(), etc.
      
      pktgen and netpoll required a little bit more surgery than the others.
      
      In particular the pktgen changes, whilst functional, could be largely
      improved.  The initial check in pktgen_xmit() will sometimes check the
      wrong queue, which is mostly harmless.  The thing to do is probably to
      invoke fill_packet() earlier.
      
      The bulk of the netpoll changes is to make the code operate solely on
      the TX queue indicated by the SKB queue mapping.
      
      Setting of the SKB queue mapping is entirely confined inside of
      net/core/dev.c:dev_pick_tx().  If we end up needing any kind of
      special semantics (drops, for example) it will be implemented here.
      
      Finally, we now have a "real_num_tx_queues" which is where the driver
      indicates how many TX queues are actually active.
      
      With IGB changes from Jeff Kirsher.
      Signed-off-by: David S. Miller <davem@davemloft.net>
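The compatibility story in the queue-aware commit, where legacy calls act on TX queue zero and explicit helpers cover all queues, can be sketched as follows. The `model_*` types and functions are hypothetical stand-ins named after the helpers the commit message mentions; they are not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_TXQ 8

struct model_txq {
    int stopped;
};

struct model_mq_dev {
    unsigned int num_tx_queues;
    struct model_txq tx[MODEL_MAX_TXQ];
};

/* Per-queue primitive used by multiqueue-aware code. */
static void model_tx_stop_queue(struct model_txq *q)
{
    q->stopped = 1;
}

/* Legacy single-queue call: degenerates into an operation on queue 0. */
static void model_netif_stop_queue(struct model_mq_dev *dev)
{
    model_tx_stop_queue(&dev->tx[0]);
}

/* "Do X to all TX queues" helpers. */
static void model_tx_stop_all_queues(struct model_mq_dev *dev)
{
    unsigned int i;

    assert(dev->num_tx_queues <= MODEL_MAX_TXQ);
    for (i = 0; i < dev->num_tx_queues; i++)
        model_tx_stop_queue(&dev->tx[i]);
}

static void model_tx_wake_all_queues(struct model_mq_dev *dev)
{
    unsigned int i;

    assert(dev->num_tx_queues <= MODEL_MAX_TXQ);
    for (i = 0; i < dev->num_tx_queues; i++)
        dev->tx[i].stopped = 0;
}
```

A driver with a single queue sees no difference between the legacy call and the all-queues helper, which is why such drivers needed no changes.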