1. 02 Jun, 2010 1 commit
  2. 02 Apr, 2010 1 commit
    • gen_estimator: deadlock fix · 5d944c64
      Eric Dumazet committed
      One of my test machines got a deadlock during "tc" sessions,
      adding/deleting classes & filters, using traffic estimators.
      
      After some analysis, I believe we have a potential use after free case
      in est_timer() :
      
      spin_lock(e->stats_lock); << HERE >>
      read_lock(&est_lock);
      if (e->bstats == NULL)   << TEST >>
      	goto skip;
      
      The test is done a bit late: after the estimator is killed, and before
      the RCU grace period has elapsed, we might already have freed/reused the
      memory that e->stats_lock points to (some qdisc->q.lock).
      
      A possible fix is to respect an RCU grace period at Qdisc dismantle time.
      
      On 64bit, sizeof(struct Qdisc) is exactly 192 bytes. Adding 16 bytes to
      it (for struct rcu_head) is a problem because it might change
      performance, given QDISC_ALIGNTO is 32 bytes.
      
      This is why I also change QDISC_ALIGNTO to 64 bytes, to satisfy most
      current alignment requirements.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
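The size arithmetic in the message above can be checked with a small sketch. The helper `align_up` and the two constants are hypothetical names introduced here for illustration; only the numbers (192, 16, 32, 64) come from the commit message, and the sketch does not reproduce how the kernel actually applies QDISC_ALIGNTO during allocation.

```c
#include <stddef.h>

/* Sizes quoted in the commit message for a 64-bit build. */
enum {
    QDISC_SIZE_BEFORE = 192, /* sizeof(struct Qdisc) before the patch */
    RCU_HEAD_SIZE     = 16   /* sizeof(struct rcu_head) */
};

/*
 * Hypothetical helper: round size up to the next multiple of align.
 * align must be a power of two.
 */
static size_t align_up(size_t size, size_t align)
{
    return (size + align - 1) & ~(align - 1);
}
```

Under these assumptions, `align_up(QDISC_SIZE_BEFORE + RCU_HEAD_SIZE, 32)` gives 224, which is not a multiple of 64, while `align_up(QDISC_SIZE_BEFORE + RCU_HEAD_SIZE, 64)` gives 256. The original 192 bytes are already a multiple of both 32 and 64.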
  3. 29 Jan, 2010 1 commit
  4. 04 Nov, 2009 1 commit
  5. 02 Sep, 2009 1 commit
    • pkt_sched: Revert tasklet_hrtimer changes. · 2fbd3da3
      David S. Miller committed
      These are full of unresolved problems, mainly that conversions don't
      work 1-1 from hrtimers to tasklet_hrtimers because unlike hrtimers
      tasklets can't be killed from softirq context.
      
      And when a qdisc gets reset, that's exactly what we need to do here.
      
      We'll work this out in the net-next-2.6 tree and if warranted we'll
      backport that work to -stable.
      
      This reverts the following 3 changesets:
      
      a2cb6a4d
      ("pkt_sched: Fix bogon in tasklet_hrtimer changes.")
      
      38acce2d
      ("pkt_sched: Convert CBQ to tasklet_hrtimer.")
      
      ee5f9757
      ("pkt_sched: Convert qdisc_watchdog to tasklet_hrtimer")
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 23 Aug, 2009 1 commit
  7. 07 Aug, 2009 1 commit
    • net: Avoid enqueuing skb for default qdiscs · bbd8a0d3
      Krishna Kumar committed
      dev_queue_xmit enqueues an skb and calls qdisc_run, which
      dequeues the skb and transmits it. In most cases, the skb that
      is enqueued is the same one that is dequeued (unless the
      queue gets stopped, or multiple CPUs write to the same queue
      and end up racing with qdisc_run). For default qdiscs, we
      can remove the redundant enqueue/dequeue and simply xmit the
      skb since the default qdisc is work-conserving.
      
      The patch uses a new flag - TCQ_F_CAN_BYPASS to identify the
      default fast queue. The controversial part of the patch is
      incrementing qlen when a skb is requeued - this is to avoid
      checks like the second line below:
      
      +  } else if ((q->flags & TCQ_F_CAN_BYPASS) && !qdisc_qlen(q) &&
      >>         !q->gso_skb &&
      +          !test_and_set_bit(__QDISC_STATE_RUNNING, &q->state)) {
      
      Results of 2 hours of testing with multiple netperf sessions (1,
      2, 4, 8, 12 sessions on a 4-CPU System-X). The BW numbers are
      aggregate Mb/s across iterations, tested with this version on
      System-X boxes with Chelsio 10 Gbps cards:
      
      ----------------------------------
      Size |  ORG BW          NEW BW   |
      ----------------------------------
      128K |  156964          159381   |
      256K |  158650          162042   |
      ----------------------------------
      
      Changes from ver1:
      
      1. Move sch_direct_xmit declaration from sch_generic.h to
         pkt_sched.h
      2. Update qdisc basic statistics for direct xmit path.
      3. Set qlen to zero in qdisc_reset.
      4. Changed some function names to more meaningful ones.
      Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
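The bypass condition described in the message above can be sketched in user space. Everything prefixed `mock_` or `MOCK_` is hypothetical and stands in for kernel state; the flag value is illustrative, and C11 `atomic_flag` is swapped in for the kernel's `test_and_set_bit()`. The point is the one the message calls controversial: because a requeued skb bumps `qlen`, testing `qlen == 0` is enough and no separate check for a held skb is needed.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MOCK_CAN_BYPASS 0x4u /* illustrative value, an assumption */

/* Hypothetical stand-in for the parts of struct Qdisc that matter here. */
struct mock_qdisc {
    unsigned int flags;
    unsigned int qlen;   /* queued packets, including a requeued one */
    atomic_flag running; /* stand-in for __QDISC_STATE_RUNNING */
};

/* Requeueing counts the held packet in qlen, as the commit describes. */
static void mock_requeue(struct mock_qdisc *q)
{
    q->qlen++;
}

/*
 * Returns true when the packet may be sent directly. On success the
 * caller owns the running flag and must clear it when done.
 */
static bool mock_can_bypass(struct mock_qdisc *q)
{
    return (q->flags & MOCK_CAN_BYPASS) && q->qlen == 0 &&
           !atomic_flag_test_and_set(&q->running);
}
```

The short-circuit order matters in this sketch: the running flag is only taken once the cheaper flag and length checks have passed, so a failed bypass attempt never leaves the flag set.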
  8. 15 Jun, 2009 1 commit
  9. 09 Jun, 2009 2 commits
  10. 01 Feb, 2009 1 commit
  11. 23 Sep, 2008 1 commit
  12. 22 Aug, 2008 1 commit
  13. 13 Aug, 2008 1 commit
    • pkt_sched: Add queue stopped test back to qdisc_run(). · 83f36f3f
      David S. Miller committed
      Based upon a bug report by Andrew Gallatin on netdev
      with subject "CPU utilization increased in 2.6.27rc"
      
      In commit 37437bb2
      ("pkt_sched: Schedule qdiscs instead of netdev_queue.")
      the test of the queue being stopped was erroneously
      removed from qdisc_run().
      
      When the TX queue of the device fills up, this omission
      causes lots of extraneous useless work to be queued up
      to softirq context, where we'll just return immediately
      because the device is still stuffed up.
      Signed-off-by: David S. Miller <davem@davemloft.net>
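A minimal user-space sketch of the restored check follows. The `mock_` types and functions are hypothetical and only model the behaviour the message describes: when the TX queue is stopped, the run is skipped up front instead of queueing work that would return immediately.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for a TX queue and the qdisc attached to it. */
struct mock_txq {
    bool stopped; /* device TX queue is full */
};

struct mock_qdisc {
    struct mock_txq *txq;
    unsigned int runs; /* counts how often real work was started */
};

/*
 * Returns true if a run was started. With the stopped test in place,
 * a full TX queue costs one cheap check and no extra work.
 */
static bool mock_qdisc_run(struct mock_qdisc *q)
{
    if (q->txq->stopped)
        return false;
    q->runs++;
    return true;
}
```

In this model, calling `mock_qdisc_run()` repeatedly while `stopped` is set leaves `runs` unchanged, which corresponds to the useless softirq work the commit removes.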
  14. 20 Jul, 2008 1 commit
  15. 18 Jul, 2008 3 commits
    • pkt_sched: Schedule qdiscs instead of netdev_queue. · 37437bb2
      David S. Miller committed
      When we have shared qdiscs, packets come out of the qdiscs
      for multiple transmit queues.
      
      Therefore it doesn't make any sense to schedule the transmit
      queue when logically we cannot know ahead of time the TX
      queue of the SKB that the qdisc->dequeue() will give us.
      
      Just for sanity I added a BUG check to make sure we never
      get into a state where the noop_qdisc is scheduled.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pkt_sched: Make QDISC_RUNNING a qdisc state. · e2627c8c
      David S. Miller committed
      Currently it is associated with a netdev_queue, but when we have
      qdisc sharing that no longer makes any sense.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Use queue aware tests throughout. · fd2ea0a7
      David S. Miller committed
      This effectively "flips the switch" by making the core networking
      and multiqueue-aware drivers use the new TX multiqueue structures.
      
      Non-multiqueue drivers need no changes.  The interfaces they use such
      as netif_stop_queue() degenerate into an operation on TX queue zero.
      So everything "just works" for them.
      
      Code that really wants to do "X" to all TX queues now invokes a
      routine that does so, such as netif_tx_wake_all_queues(),
      netif_tx_stop_all_queues(), etc.
      
      pktgen and netpoll required a little bit more surgery than the others.
      
      In particular the pktgen changes, whilst functional, could be largely
      improved.  The initial check in pktgen_xmit() will sometimes check the
      wrong queue, which is mostly harmless.  The thing to do is probably to
      invoke fill_packet() earlier.
      
      The bulk of the netpoll changes is to make the code operate solely on
      the TX queue indicated by the SKB queue mapping.
      
      Setting of the SKB queue mapping is entirely confined inside of
      net/core/dev.c:dev_pick_tx().  If we end up needing any kind of
      special semantics (drops, for example) it will be implemented here.
      
      Finally, we now have a "real_num_tx_queues" which is where the driver
      indicates how many TX queues are actually active.
      
      With IGB changes from Jeff Kirsher.
      Signed-off-by: David S. Miller <davem@davemloft.net>
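The relationship between the single-queue and all-queue helpers described above can be modelled in a few lines. All `mock_` names and `MOCK_MAX_TXQ` are hypothetical; the sketch only mirrors the stated behaviour that the legacy interface acts on TX queue zero while the "all queues" routine walks every queue.

```c
#include <stdbool.h>
#include <stddef.h>

#define MOCK_MAX_TXQ 8 /* arbitrary bound for this sketch */

struct mock_txq {
    bool stopped;
};

/* Hypothetical stand-in for a net device with several TX queues. */
struct mock_netdev {
    struct mock_txq tx[MOCK_MAX_TXQ];
    size_t num_tx_queues; /* queues in use, at most MOCK_MAX_TXQ */
};

/* Queue-aware primitive: stop one specific TX queue. */
static void mock_tx_stop_queue(struct mock_txq *txq)
{
    txq->stopped = true;
}

/* Legacy-style interface: degenerates into an operation on queue zero. */
static void mock_stop_queue(struct mock_netdev *dev)
{
    mock_tx_stop_queue(&dev->tx[0]);
}

/* "Do X to all TX queues" routine: applies the primitive to each queue. */
static void mock_tx_stop_all_queues(struct mock_netdev *dev)
{
    for (size_t i = 0; i < dev->num_tx_queues; i++)
        mock_tx_stop_queue(&dev->tx[i]);
}
```

A driver with a single queue sees no difference between the two calls in this model, which is why non-multiqueue drivers need no changes.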
  16. 09 Jul, 2008 2 commits
  17. 06 Jul, 2008 1 commit
  18. 29 Jan, 2008 1 commit
  19. 11 Oct, 2007 1 commit
  20. 15 Jul, 2007 1 commit
    • [NET_SCHED]: act_api: qdisc internal reclassify support · 73ca4918
      Patrick McHardy committed
      The behaviour of NET_CLS_POLICE for TC_POLICE_RECLASSIFY was to return
      it to the qdisc, which could handle it internally or ignore it. With
      NET_CLS_ACT however, tc_classify starts over at the first classifier
      and never returns it to the qdisc. This makes it impossible to support
      qdisc-internal reclassification, which in turn makes it impossible to
      remove the old NET_CLS_POLICE code without breaking compatibility since
      we have two qdiscs (CBQ and ATM) that support this.
      
      This patch adds a tc_classify_compat function that handles
      reclassification the old way and changes CBQ and ATM to use it.
      
      This again is of course not fully backwards compatible with the previous
      NET_CLS_ACT behaviour. Unfortunately there is no way to fully maintain
      compatibility *and* support qdisc internal reclassification with
      NET_CLS_ACT, but this seems like the better choice over keeping the two
      incompatible options around forever.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  21. 26 Apr, 2007 11 commits
  22. 25 Jul, 2006 1 commit
  23. 30 Jun, 2006 1 commit
  24. 20 Jun, 2006 1 commit
    • [NET]: Prevent multiple qdisc runs · 48d83325
      Herbert Xu committed
      Having two or more qdisc_run's contend against each other is bad because
      it can induce packet reordering if the packets have to be requeued.  It
      appears that this is an unintended consequence of relinquishing the queue
      lock while transmitting.  That in turn is needed for devices that spend a
      lot of time in their transmit routine.
      
      There are no advantages to be had as devices with queues are inherently
      single-threaded (the loopback device is not but then it doesn't have a
      queue).
      
      Even if you were to add a queue to a parallel virtual device (e.g., bolt
      a tbf filter in front of an ipip tunnel device), you would still want to
      process the queue in sequence to ensure that the packets are ordered
      correctly.
      
      The solution here is to steal a bit from net_device to prevent this.
      
      BTW, as qdisc_restart is no longer used by anyone as a module inside the
      kernel (IIRC it used to with netif_wake_queue), I have not exported the
      new __qdisc_run function.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
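The "steal a bit" idea from the message above amounts to a test-and-set guard around the run loop. The sketch below is a user-space model, not the kernel code: `mock_dev` and both functions are hypothetical, and C11 `atomic_flag` stands in for the bit taken from net_device.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state holding the stolen bit. */
struct mock_dev {
    atomic_flag qdisc_running;
};

/*
 * Try to become the single runner. Returns true for exactly one caller
 * until mock_qdisc_run_end() is called; everyone else backs off.
 */
static bool mock_qdisc_run_begin(struct mock_dev *dev)
{
    return !atomic_flag_test_and_set(&dev->qdisc_running);
}

/* Release the guard so that a later caller can run the queue. */
static void mock_qdisc_run_end(struct mock_dev *dev)
{
    atomic_flag_clear(&dev->qdisc_running);
}
```

Because a second caller simply returns instead of dequeueing in parallel, packets leave the queue in a single sequence even while the queue lock is dropped during transmission, which is the reordering concern the commit addresses.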
  25. 10 Jan, 2006 1 commit
  26. 06 Jul, 2005 1 commit