1. 19 May, 2009 1 commit
    • net: release dst entry in dev_hard_start_xmit() · 93f154b5
      Eric Dumazet committed
      One point of contention under high network load is the dst_release() performed
      when a transmitted skb is freed. This is because NIC tx completion calls
      dev_kfree_skb() long after the original call to dev_queue_xmit(skb).
      
      CPU cache is cold and the atomic op in dst_release() stalls. On SMP, this is
      quite visible if one CPU is 100% handling softirqs for a network device,
      since dst_clone() is done by other cpus, involving cache line ping pongs.
      
      It seems the right place to release the dst is dev_hard_start_xmit(), for
      most devices except virtual ones and a few exceptions.
      
      David Miller suggested defining a new device flag, set in alloc_netdev_mq()
      (so that most devices set it at init time), and carefully unset in devices
      which don't want a NULL skb->dst in their ndo_start_xmit().
      
      The devices that must clear this flag are:
      
      - the loopback device, because it calls netif_rx() and, quoting Patrick:
          "ip_route_input() doesn't accept loopback addresses, so loopback packets
           already need to have a dst_entry attached."
      - appletalk/ipddp.c: needs skb->dst in its xmit function
      - all devices that call dev_queue_xmit() again from their xmit function
        (as some classifiers need skb->dst): bonding, vlan, macvlan, eql, ifb, hdlc_fr
      
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
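
The flag-based release described in this commit can be sketched as a small user-space model. Everything here is a hypothetical stand-in: the `model_*` types, the `MODEL_XMIT_DST_RELEASE` flag name, and the plain integer refcount (the kernel uses an atomic operation) are assumptions for illustration, not the kernel's definitions.

```c
#include <stddef.h>

/* Hypothetical flag name for this sketch; set by default at allocation. */
#define MODEL_XMIT_DST_RELEASE 0x1u

struct model_dst { int refcnt; };
struct model_skb { struct model_dst *dst; };

struct model_dev;
typedef int (*model_xmit_fn)(struct model_skb *skb, struct model_dev *dev);

struct model_dev {
    unsigned int priv_flags;
    model_xmit_fn start_xmit;
};

/* Models alloc_netdev_mq() setting the flag so most devices get it. */
static void model_dev_init(struct model_dev *dev, model_xmit_fn xmit)
{
    dev->priv_flags = MODEL_XMIT_DST_RELEASE;
    dev->start_xmit = xmit;
}

/* Plain decrement here; the real dst_release() uses an atomic op. */
static void model_dst_release(struct model_dst *dst)
{
    if (dst)
        dst->refcnt--;
}

/* Release the dst early, while the cache is still warm, unless the
 * device cleared the flag because its xmit hook needs skb->dst. */
static int model_hard_start_xmit(struct model_skb *skb, struct model_dev *dev)
{
    if ((dev->priv_flags & MODEL_XMIT_DST_RELEASE) && skb->dst) {
        model_dst_release(skb->dst);
        skb->dst = NULL;
    }
    return dev->start_xmit(skb, dev);
}

/* Example xmit hook: returns 1 if it still sees a dst, 0 otherwise. */
static int model_xmit_sees_dst(struct model_skb *skb, struct model_dev *dev)
{
    (void)dev;
    return skb->dst != NULL;
}
```

A device such as ifb would clear the flag after initialization (`dev.priv_flags &= ~MODEL_XMIT_DST_RELEASE;`), so its xmit hook still receives a non-NULL dst.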
  2. 21 Nov, 2008 1 commit
  3. 20 Nov, 2008 1 commit
  4. 01 Aug, 2008 1 commit
  5. 18 Jul, 2008 2 commits
    • pkt_sched: Kill netdev_queue lock. · 83874000
      David S. Miller committed
      We can simply use the qdisc->q.lock for all of the
      qdisc tree synchronization.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdev: Allocate multiple queues for TX. · e8a0464c
      David S. Miller committed
      alloc_netdev_mq() now allocates an array of netdev_queue
      structures for TX, based upon the queue_count argument.
      
      Furthermore, all accesses to the TX queues are now vectored
      through the netdev_get_tx_queue() and netdev_for_each_tx_queue()
      interfaces.  This makes it easy to grep the tree for all
      things that want to get to a TX queue of a net device.
      
      Problem spots which are not really multiqueue aware yet, and
      only work with one queue, can easily be spotted by grepping
      for all netdev_get_tx_queue() calls that pass in a zero index.
      Signed-off-by: David S. Miller <davem@davemloft.net>
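
The access pattern this commit describes (an array of TX queues sized at allocation, reached only through accessor helpers) might look roughly like the user-space sketch below. The `model_*` names and struct layouts are hypothetical; the kernel's netdev_for_each_tx_queue() has its own signature, and the callback form here is only an approximation of the idea.

```c
#include <stdlib.h>

struct model_txq { unsigned long tx_packets; };

struct model_netdev {
    unsigned int num_tx_queues;
    struct model_txq *tx;       /* array of num_tx_queues entries */
};

/* Allocate the device together with queue_count TX queues
 * (queue_count is assumed to be at least 1). */
static struct model_netdev *model_alloc_netdev_mq(unsigned int queue_count)
{
    struct model_netdev *dev = calloc(1, sizeof(*dev));

    if (!dev)
        return NULL;
    dev->tx = calloc(queue_count, sizeof(*dev->tx));
    if (!dev->tx) {
        free(dev);
        return NULL;
    }
    dev->num_tx_queues = queue_count;
    return dev;
}

static void model_free_netdev(struct model_netdev *dev)
{
    free(dev->tx);
    free(dev);
}

/* Single accessor: every lookup of a TX queue goes through here, so a
 * grep for calls passing index 0 finds code that is not multiqueue aware. */
static struct model_txq *model_get_tx_queue(const struct model_netdev *dev,
                                            unsigned int index)
{
    return &dev->tx[index];
}

typedef void (*model_txq_fn)(struct model_netdev *dev,
                             struct model_txq *txq, void *arg);

/* Apply fn to every TX queue of the device. */
static void model_for_each_tx_queue(struct model_netdev *dev,
                                    model_txq_fn fn, void *arg)
{
    unsigned int i;

    for (i = 0; i < dev->num_tx_queues; i++)
        fn(dev, &dev->tx[i], arg);
}

/* Example callback: counts the queues it is called for. */
static void model_count_queue(struct model_netdev *dev,
                              struct model_txq *txq, void *arg)
{
    (void)dev;
    (void)txq;
    (*(unsigned int *)arg)++;
}
```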
  6. 09 Jul, 2008 2 commits
  7. 21 Mar, 2008 1 commit
    • [NET] ifb: set separate lockdep classes for queue locks · 94833dfb
      Jarek Poplawski committed
      [   10.536424] =======================================================
      [   10.536424] [ INFO: possible circular locking dependency detected ]
      [   10.536424] 2.6.25-rc3-devel #3
      [   10.536424] -------------------------------------------------------
      [   10.536424] swapper/0 is trying to acquire lock:
      [   10.536424]  (&dev->queue_lock){-+..}, at: [<c0299b4a>] dev_queue_xmit+0x175/0x2f3
      [   10.536424]
      [   10.536424] but task is already holding lock:
      [   10.536424]  (&p->tcfc_lock){-+..}, at: [<f8a67154>] tcf_mirred+0x20/0x178 [act_mirred]
      [   10.536424]
      [   10.536424] which lock already depends on the new lock.
      
      lockdep warns about the locking order when ifb is used with sch_ingress
      and act_mirred: ingress_lock, tcfc_lock, queue_lock (usually queue_lock
      comes first). This patch only tells lockdep that ifb is a different
      device (e.g. from eth) and has its own pair of queue locks. (This
      warning is a false positive in the common scenario of using ifb; yet
      there are possible situations where this order could be dangerous, and
      lockdep should warn in such a case.) (With suggestions by
      David S. Miller)
      Reported-and-tested-by: Denys Fedoryshchenko <denys@visp.net.lb>
      Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
      Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 11 Oct, 2007 3 commits
    • [NET] drivers/net: statistics cleanup #1 -- save memory and shrink code · 09f75cd7
      Jeff Garzik committed
      We now have struct net_device_stats embedded in struct net_device,
      and the default ->get_stats() hook does the obvious thing for us.
      
      Run through drivers/net/* and remove the driver-local storage of
      statistics, and driver-local ->get_stats() hook where applicable.
      
      This was just the low-hanging fruit in drivers/net; plenty more drivers
      remain to be updated.
      
      [ Resolved conflicts with napi_struct changes and fix sunqe build
        regression... -DaveM ]
      Signed-off-by: Jeff Garzik <jeff@garzik.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
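
As a rough illustration of why this cleanup shrinks drivers: once the statistics live inside the device structure and a default hook returns them, a driver only has to bump counters. The sketch below is a user-space model with hypothetical `model_*` names and a reduced two-field stats struct, not the kernel's struct net_device_stats.

```c
#include <string.h>

/* Reduced stand-in for the statistics block. */
struct model_stats {
    unsigned long rx_packets;
    unsigned long tx_packets;
};

struct model_netdev {
    struct model_stats stats;   /* embedded: no driver-private copy needed */
    struct model_stats *(*get_stats)(struct model_netdev *dev);
};

/* Default hook: "the obvious thing" is to return the embedded stats. */
static struct model_stats *model_default_get_stats(struct model_netdev *dev)
{
    return &dev->stats;
}

/* Device setup installs the default hook; a driver overrides it only
 * if it keeps its counters somewhere else. */
static void model_netdev_init(struct model_netdev *dev)
{
    memset(dev, 0, sizeof(*dev));
    dev->get_stats = model_default_get_stats;
}

/* A driver transmit path now just updates the embedded counters. */
static void model_driver_xmit(struct model_netdev *dev)
{
    dev->stats.tx_packets++;
}
```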
    • [NET]: Nuke SET_MODULE_OWNER macro. · 10d024c1
      Ralf Baechle committed
      It's been a useless no-op for long enough in 2.6 so I figured it's time to
      remove it.  The number of people that could object because they're
      maintaining unified 2.4 and 2.6 drivers is probably rather small.
      
      [ Handled drivers added by netdev tree and some missed IRDA cases... -DaveM ]
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Jeff Garzik <jeff@garzik.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: Make the device list and device lookups per namespace. · 881d966b
      Eric W. Biederman committed
      This patch makes most of the generic device layer network
      namespace safe.  This patch makes dev_base_head a
      network namespace variable, and then it picks up
      a few associated variables.  The functions:
      dev_getbyhwaddr
      dev_getfirsthwbytype
      dev_get_by_flags
      dev_get_by_name
      __dev_get_by_name
      dev_get_by_index
      __dev_get_by_index
      dev_ioctl
      dev_ethtool
      dev_load
      wireless_process_ioctl
      
      were modified to take a network namespace argument, and
      deal with it.
      
      vlan_ioctl_set and brioctl_set were modified so their
      hooks will receive a network namespace argument.
      
      So basically anything in the core of the network stack that was
      affected by the change of dev_base was modified to handle
      multiple network namespaces.  The rest of the network stack was
      simply modified to explicitly use &init_net, the initial network
      namespace.  This can be fixed when those components of the network
      stack are modified to handle multiple network namespaces.
      
      For now the ifindex generator is left global.
      
      Fundamentally, ifindex numbers are per namespace, or else
      we will have corner case problems with migration when
      we get that far.
      
      At the same time there are assumptions in the network stack
      that the ifindex of a network device won't change.  Making
      the ifindex number global seems a good compromise until
      the network stack can cope with ifindex changes when
      you change namespaces, and the like.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
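
The shape of this change, lookups taking a namespace argument while the ifindex generator stays global, can be sketched in user space as follows. The `model_*` types, the fixed-size device table, and the registration helper are all hypothetical simplifications and do not mirror the kernel's data structures.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_DEVS 4

struct model_netdev {
    const char *name;
    int ifindex;
};

/* Each namespace owns its own device list (the role of dev_base_head). */
struct model_net {
    struct model_netdev *devs[MODEL_MAX_DEVS];
    size_t ndevs;
};

/* The ifindex generator is deliberately left global, as in the commit. */
static int model_ifindex_counter;

/* Add a device to one namespace; returns 0 on success, -1 if full. */
static int model_register_netdev(struct model_net *net,
                                 struct model_netdev *dev)
{
    if (net->ndevs >= MODEL_MAX_DEVS)
        return -1;
    dev->ifindex = ++model_ifindex_counter;
    net->devs[net->ndevs++] = dev;
    return 0;
}

/* Lookup takes the namespace explicitly and never sees other namespaces. */
static struct model_netdev *model_dev_get_by_name(const struct model_net *net,
                                                  const char *name)
{
    size_t i;

    for (i = 0; i < net->ndevs; i++)
        if (strcmp(net->devs[i]->name, name) == 0)
            return net->devs[i];
    return NULL;
}
```

Two namespaces can then each hold a device named `eth0`; a lookup resolves to the one registered in the namespace passed in, and the two devices still receive distinct ifindex values from the shared counter.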
  9. 12 Jul, 2007 2 commits
  10. 11 Jul, 2007 2 commits
  11. 30 Mar, 2007 1 commit
  12. 31 Jan, 2007 1 commit
    • Revert "net: ifb error path loop fix" · bcdddfb6
      Linus Torvalds committed
      This reverts commit 0c0b3ae6.
      
      Quoth David:
      
        "Jeff, please revert
      
         It's wrong.  We had a lengthy analysis of this piece of code
         several months ago, and it is correct.
      
         Consider, if we run the loop and we get an error
         the following happens:
      
         1) attempt of ifb_init_one(i) fails, therefore we should
            not try to "ifb_free_one()" on "i" since it failed
         2) the loop iteration first increments "i", then it
            checks for error
      
         Therefore we must decrement "i" twice before the first
         free during the cleanup.  One to "undo" the for() loop
         increment, and one to "skip" the ifb_init_one() case which
         failed."
      Reported-by: David Miller <davem@davemloft.net>
      Acked-by: Jeff Garzik <jgarzik@pobox.com>
      Cc: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
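
The double decrement argued for in the quoted analysis can be checked with a small user-space model. The loop shape follows the description above; the `model_*` helpers, the failure injection via `model_fail_at`, and the bookkeeping arrays are hypothetical additions for illustration.

```c
#define MODEL_MAX 8

static int model_live[MODEL_MAX];   /* 1 while instance i is initialized */
static int model_fail_at = -1;      /* index whose init should fail */
static int model_bad_frees;         /* frees of never-initialized slots */

/* Stand-in for ifb_init_one(): fails at the injected index. */
static int model_init_one(int i)
{
    if (i == model_fail_at)
        return -1;
    model_live[i] = 1;
    return 0;
}

/* Stand-in for ifb_free_one(): records frees of uninitialized slots. */
static void model_free_one(int i)
{
    if (!model_live[i])
        model_bad_frees++;
    model_live[i] = 0;
}

/* Initialize n instances (n <= MODEL_MAX); unwind on failure. */
static int model_init_all(int n)
{
    int i, err = 0;

    for (i = 0; i < n && !err; i++)
        err = model_init_one(i);

    if (err) {
        i--;                    /* undo the for() increment: i == failed index */
        while (--i >= 0)        /* pre-decrement skips the failed index */
            model_free_one(i);
    }
    return err;
}
```

With `model_fail_at = 2` and `n = 5`, the loop exits with `i == 3`; the first decrement brings it back to the failed index 2, and the pre-decrement in the `while` condition starts freeing at index 1, so only instances 1 and 0 are freed.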
  13. 30 Jan, 2007 1 commit
  14. 04 Jan, 2007 1 commit
  15. 03 Oct, 2006 1 commit
  16. 14 Sep, 2006 1 commit
  17. 22 Jul, 2006 1 commit
  18. 01 Jul, 2006 1 commit
  19. 18 Jun, 2006 1 commit
    • [NET]: Add netif_tx_lock · 932ff279
      Herbert Xu committed
      Various drivers use xmit_lock internally to synchronise with their
      transmission routines.  They do so without setting xmit_lock_owner.
      This is fine as long as netpoll is not in use.
      
      With netpoll it is possible for deadlocks to occur if xmit_lock_owner
      isn't set.  This is because a printk that occurs while xmit_lock is held
      and xmit_lock_owner is not set can cause netpoll to attempt to take
      xmit_lock recursively.
      
      While it is possible to resolve this by getting netpoll to use
      trylock, it is suboptimal because netpoll's sole objective is to
      maximise the chance of getting the printk out on the wire.  So
      delaying or dropping the message is to be avoided as much as possible.
      
      So the only alternative is to always set xmit_lock_owner.  The
      following patch does this by introducing the netif_tx_lock family of
      functions that take care of setting/unsetting xmit_lock_owner.
      
      I renamed xmit_lock to _xmit_lock to indicate that it should not be
      used directly.  I didn't provide irq versions of the netif_tx_lock
      functions since xmit_lock is meant to be a BH-disabling lock.
      
      This is pretty much a straight text substitution except for a small
      bug fix in winbond.  It currently uses
      netif_stop_queue/spin_unlock_wait to stop transmission.  This is
      unsafe as an IRQ can potentially wake up the queue.  So it is safer to
      use netif_tx_disable.
      
      The hamradio bits used spin_lock_irq but it is unnecessary as
      xmit_lock must never be taken in an IRQ handler.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
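
The recursion hazard this commit closes can be illustrated with a single-threaded model in which the lock is a flag and CPUs are plain integers. This is only a sketch: the `model_*` names are hypothetical, the real lock is a BH-disabling spinlock, and the netpoll check here is a simplification of the idea that netpoll consults the owner before locking.

```c
struct model_netdev {
    int xmit_locked;        /* 1 while the tx lock is held */
    int xmit_lock_owner;    /* CPU holding the lock, -1 if none */
};

static void model_netdev_init(struct model_netdev *dev)
{
    dev->xmit_locked = 0;
    dev->xmit_lock_owner = -1;
}

/* Old driver habit: take the lock directly and never record the owner. */
static void model_raw_xmit_lock(struct model_netdev *dev)
{
    dev->xmit_locked = 1;
}

/* Wrapper in the spirit of netif_tx_lock(): lock, then record the owner. */
static void model_netif_tx_lock(struct model_netdev *dev, int cpu)
{
    dev->xmit_locked = 1;
    dev->xmit_lock_owner = cpu;
}

/* Clear the owner before dropping the lock. */
static void model_netif_tx_unlock(struct model_netdev *dev)
{
    dev->xmit_lock_owner = -1;
    dev->xmit_locked = 0;
}

/* Netpoll-style check: returns 1 if this CPU believes it may take the
 * lock, 0 if the owner field says this CPU already holds it. */
static int model_netpoll_may_lock(const struct model_netdev *dev, int cpu)
{
    return dev->xmit_lock_owner != cpu;
}
```

After `model_raw_xmit_lock()`, the check still returns 1 on the locking CPU even though the lock is held, which in the real kernel would mean a recursive lock attempt; after `model_netif_tx_lock(&dev, cpu)`, it returns 0 for that CPU.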
  20. 24 Feb, 2006 1 commit
  21. 10 Jan, 2006 1 commit
    • [NET]: Add IFB (Intermediate Functional Block) network device. · 253af423
      Jamal Hadi Salim committed
      A new device that provides an intermediate functional block in a
      system-shared manner.  To use the new functionality, you need to turn on
      qos/classifier actions.
      
      The new functionality can be grouped as:
      
      1) qdiscs/policies that are per device as opposed to system wide.  ifb
      provides a device that traffic can be redirected to, thus giving an
      impression of sharing.
      
      2) Allows for queueing incoming traffic for shaping instead of
      dropping.
      
      Packets are redirected to this device using the tc/action mirred redirect
      construct. If they are instead sent to it by plain routing, they
      will merely be dropped and the stats will indicate that.
      Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
      Signed-off-by: David S. Miller <davem@davemloft.net>