1. 30 Sep, 2014 2 commits
    • J
      net: sched: restrict use of qstats qlen · 64015853
      John Fastabend committed
      This removes the use of qstats->qlen variable from the classifiers
      and makes it an explicit argument to gnet_stats_copy_queue().
      
      The qlen represents the qdisc queue length and is packed into
      the qstats at the last moment before passing to user space. By
      handling it explicitly we avoid, in the percpu stats case, having
      to figure out which per_cpu variable to put it in.
      
      It would probably be best to remove it from qstats completely
      but qstats is a user space ABI and can't be broken. A future
      patch could make an internal only qstats structure that would
      avoid having to allocate an additional u32 variable on the
      Qdisc struct. This would make the qstats struct 128 bits instead
      of 128+32 bits.
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
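
The idea of passing the queue length explicitly can be illustrated with a small toy model. This is only a sketch in Python, not the kernel code: `QueueStats` and `copy_queue_stats` are hypothetical names standing in for the qstats record and for the role that `gnet_stats_copy_queue()` plays in the commit message.

```python
from dataclasses import dataclass, asdict


@dataclass
class QueueStats:
    """Toy stand-in for the user-visible queue statistics record."""
    qlen: int = 0        # only meaningful in the exported copy
    backlog: int = 0
    drops: int = 0
    requeues: int = 0
    overlimits: int = 0


def copy_queue_stats(qstats: QueueStats, qlen: int) -> dict:
    """Build the exported record, taking qlen as an explicit argument.

    The caller's qstats object is not modified, so no code path has to
    decide which (possibly per-CPU) copy should hold the queue length.
    """
    exported = asdict(qstats)
    exported["qlen"] = qlen  # packed in at the last moment
    return exported


stats = QueueStats(backlog=3000, drops=2)
exported = copy_queue_stats(stats, qlen=2)
```

In this model the exported layout keeps its `qlen` field, mirroring the ABI constraint described above, while the internal record never needs to be updated with it.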
    • J
      net: sched: make bstats per cpu and estimator RCU safe · 22e0f8b9
      John Fastabend committed
      In order to run qdiscs without locking, statistics and estimators
      need to be handled correctly.
      
      To resolve bstats, make the statistics per cpu. Because this is
      only needed for qdiscs that run without locks, which will not be
      the case for most qdiscs in the near future, only create percpu
      stats when qdiscs set the TCQ_F_CPUSTATS flag.
      
      Next because estimators use the bstats to calculate packets per
      second and bytes per second the estimator code paths are updated
      to use the per cpu statistics.
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
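
A rough way to picture per-CPU statistics is one counter set per CPU that is only summed when somebody reads it. The sketch below is a toy Python model under that assumption: `ToyQdisc`, `BasicStats`, and the `cpu_stats` argument (playing the role of the TCQ_F_CPUSTATS flag) are hypothetical names, and real per-CPU access rules are not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BasicStats:
    nbytes: int = 0
    packets: int = 0


class ToyQdisc:
    def __init__(self, num_cpus: int, cpu_stats: bool):
        # Per-CPU counters are allocated only when the flag is set.
        self.cpu_bstats: Optional[List[BasicStats]] = (
            [BasicStats() for _ in range(num_cpus)] if cpu_stats else None
        )
        self.bstats = BasicStats()  # shared counters for the locked case

    def account(self, cpu: int, length: int) -> None:
        """Record one packet of `length` bytes seen on `cpu`."""
        if self.cpu_bstats is not None:
            target = self.cpu_bstats[cpu]
        else:
            target = self.bstats
        target.nbytes += length
        target.packets += 1

    def read_bstats(self) -> BasicStats:
        """Return totals; a rate estimator would sample these values."""
        if self.cpu_bstats is None:
            return BasicStats(self.bstats.nbytes, self.bstats.packets)
        return BasicStats(
            nbytes=sum(s.nbytes for s in self.cpu_bstats),
            packets=sum(s.packets for s in self.cpu_bstats),
        )


q = ToyQdisc(num_cpus=4, cpu_stats=True)
q.account(cpu=0, length=1500)
q.account(cpu=3, length=500)
total = q.read_bstats()
```

Writers touch only their own slot, and the cost of combining the slots is paid by the comparatively rare reader.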
  2. 10 Dec, 2013 1 commit
  3. 31 Aug, 2013 1 commit
    • S
      qdisc: allow setting default queuing discipline · 6da7c8fc
      stephen hemminger committed
      Until now, the pfifo_fast queue discipline has been used by default
      for all devices. But we have better choices now.
      
      This patch allows setting the default queueing discipline with sysctl.
      This allows easy use of better queueing disciplines on all devices
      without having to use tc qdisc scripts. It is intended to allow
      an easy path for distributions to make fq_codel or sfq the default
      qdisc.
      
      This patch also makes pfifo_fast more of a first class qdisc, since
      it is now possible to manually override the default and explicitly
      use pfifo_fast. The behavior for systems that do not use the sysctl
      is unchanged; they still get pfifo_fast.
      
      Also removes leftover random # in sysctl net core.
      Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
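
The fallback behaviour described in this commit can be summarised in a few lines. This is a toy Python sketch, not kernel code: `effective_default_qdisc` is a hypothetical helper, and its argument stands for whatever an administrator wrote to the sysctl (or `None` if it was never set).

```python
from typing import Optional

BUILTIN_DEFAULT = "pfifo_fast"  # what devices get when nothing is configured


def effective_default_qdisc(sysctl_value: Optional[str]) -> str:
    """Return the qdisc name new devices would use in this toy model."""
    if sysctl_value is None:
        return BUILTIN_DEFAULT
    name = sysctl_value.strip()
    # An empty setting is treated like "not set" in this sketch.
    return name or BUILTIN_DEFAULT


untouched = effective_default_qdisc(None)
overridden = effective_default_qdisc("fq_codel\n")
```

On kernels that carry this patch the setting is exposed as `net.core.default_qdisc`; the sketch only models the choice between the configured name and the built-in default.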
  4. 12 Dec, 2012 1 commit
    • E
      pkt_sched: avoid requeues if possible · 1abbe139
      Eric Dumazet committed
      With BQL being deployed, we are more likely to see the following behavior:
      
      We dequeue a packet from qdisc in dequeue_skb(), then we realize target
      tx queue is in XOFF state in sch_direct_xmit(), and we have to hold the
      skb in gso_skb for later.
      
      This shows in stats (tc -s qdisc dev eth0) as requeues.
      
      The problem with these requeues is that high priority packets cannot be
      dequeued as long as this (possibly low prio and big TSO) packet is not
      removed from gso_skb.
      
      At 1Gbps speed, a full size TSO packet is 500 us of extra latency.
      
      In some cases, we know that all packets dequeued from a qdisc are
      for a particular and known txq:
      
      - If device is non multi queue
      - For all MQ/MQPRIO slave qdiscs
      
      This patch introduces a new qdisc flag, TCQ_F_ONETXQUEUE to mark
      this capability, so that dequeue_skb() is allowed to dequeue a packet
      only if the associated txq is not stopped.
      
      This indeed reduces latencies for high prio packets (or improves fairness
      with sfq/fq_codel), and almost removes qdisc 'requeues'.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
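
Both the latency figure and the effect of the new flag can be reproduced in a toy model. The Python sketch below makes several assumptions: a "full size" TSO packet is taken to be 64 KiB, `ToyQdisc` and its `one_tx_queue` attribute are hypothetical stand-ins for a qdisc and TCQ_F_ONETXQUEUE, and the split between dequeue_skb() and sch_direct_xmit() is collapsed into one method.

```python
from collections import deque


def serialization_delay_us(size_bytes: int, link_bps: float) -> float:
    """Time needed to put size_bytes on the wire, in microseconds."""
    return size_bytes * 8 / link_bps * 1e6


class ToyQdisc:
    def __init__(self, one_tx_queue: bool):
        self.one_tx_queue = one_tx_queue
        self.queue = deque()
        self.held = None     # plays the role of gso_skb
        self.requeues = 0

    def enqueue(self, pkt) -> None:
        self.queue.append(pkt)

    def dequeue(self, txq_stopped: bool):
        """Return a packet that can be sent now, or None."""
        if self.held is not None:
            if txq_stopped:
                return None
            pkt, self.held = self.held, None
            return pkt
        if not self.queue:
            return None
        if self.one_tx_queue and txq_stopped:
            return None      # leave the packet where the qdisc can reorder it
        pkt = self.queue.popleft()
        if txq_stopped:
            self.held = pkt  # stuck in front of everything else
            self.requeues += 1
            return None
        return pkt


delay = serialization_delay_us(64 * 1024, 1e9)

old = ToyQdisc(one_tx_queue=False)
old.enqueue("big-tso")
old.dequeue(txq_stopped=True)

new = ToyQdisc(one_tx_queue=True)
new.enqueue("big-tso")
new.dequeue(txq_stopped=True)
```

With the 64 KiB assumption the delay comes out slightly above the 500 us quoted in the message. In the flagged case the packet stays inside the queue, so a later high priority packet could still be scheduled ahead of it.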
  5. 01 Nov, 2011 1 commit
  6. 22 Jan, 2011 1 commit
    • E
      net_sched: TCQ_F_CAN_BYPASS generalization · 23624935
      Eric Dumazet committed
      Now that qdisc stab is handled before the TCQ_F_CAN_BYPASS test in
      __dev_xmit_skb(), we can generalize TCQ_F_CAN_BYPASS to qdiscs other
      than pfifo_fast: pfifo, bfifo, pfifo_head_drop and sfq.
      
      SFQ is special because it can have external classifiers, and in these
      cases, we cannot bypass queue discipline (packet could be dropped by
      classifier) without admin asking it, or further changes.
      
      It's worth doing this, especially for SFQ, to avoid dirtying memory
      when no packets are already waiting in the queue.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
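
The bypass decision can be sketched as a simple predicate. This is a toy Python model, not the logic of __dev_xmit_skb() itself: `ToyQdisc`, `xmit`, and the `running` attribute are hypothetical, and `can_bypass` stands for the TCQ_F_CAN_BYPASS flag (which, per the message, an SFQ with external classifiers would not have).

```python
from collections import deque


class ToyQdisc:
    def __init__(self, can_bypass: bool):
        self.can_bypass = can_bypass
        self.queue = deque()
        self.running = False  # True while another context is draining the queue


def xmit(qdisc: ToyQdisc, pkt, wire: list) -> str:
    """Send pkt directly when allowed, otherwise enqueue it."""
    if qdisc.can_bypass and not qdisc.queue and not qdisc.running:
        wire.append(pkt)      # queue data structures are never touched
        return "bypass"
    qdisc.queue.append(pkt)
    return "queued"


wire = []
fifo = ToyQdisc(can_bypass=True)
first = xmit(fifo, "p1", wire)

classified_sfq = ToyQdisc(can_bypass=False)
second = xmit(classified_sfq, "p2", wire)
```

Skipping the enqueue/dequeue pair on an empty queue is what avoids dirtying the qdisc's memory in the common uncongested case.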
  7. 21 Oct, 2010 1 commit
  8. 18 May, 2010 1 commit
  9. 30 Mar, 2010 1 commit
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo committed
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include
      those headers directly instead of assuming availability.  As this
      conversion needs to touch a large number of source files, the
      following script is used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to place the new include so that its order conforms
        to its surroundings.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have a fitting include block), it prints
        out an error message indicating which .h file needs to be added to
        the file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
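
The first rule of the script ("if only gfp is used, gfp.h, if slab is used, slab.h") can be approximated as below. This is not the slabh-sweep.py script referenced above: the two regular expressions and the `needed_includes` helper are hypothetical, deliberately incomplete, and only meant to show the shape of the decision.

```python
import re

# Hypothetical, incomplete usage patterns for illustration only.
GFP_PATTERN = re.compile(r"\bGFP_[A-Z_]+\b|\bgfp_t\b")
SLAB_PATTERN = re.compile(r"\bk[mzc]alloc\b|\bkfree\b|\bkmem_cache_\w+\b")


def needed_includes(source: str) -> list:
    """Return the headers a C source text needs under the simplified rule."""
    if SLAB_PATTERN.search(source):
        # slab.h pulls in gfp.h, so one include covers both usages.
        return ["linux/slab.h"]
    if GFP_PATTERN.search(source):
        return ["linux/gfp.h"]
    return []


slab_user = needed_includes("p = kmalloc(n, GFP_KERNEL);")
gfp_only = needed_includes("gfp_t flags = GFP_ATOMIC;")
neither = needed_includes("int x = 0;")
```

A real sweep also has to decide where in the include block the new line goes and what to do when no suitable block exists, which is what the remaining bullets and the manual steps cover.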
  10. 18 Sep, 2009 1 commit
  11. 15 Sep, 2009 1 commit
    • J
      pkt_sched: Fix tx queue selection in tc_modify_qdisc · 926e61b7
      Jarek Poplawski committed
      After the recent mq change there is the new select_queue qdisc class
      method used in tc_modify_qdisc, but it works OK only for direct child
      qdiscs of mq qdisc. Grandchildren always get the first tx queue, which
      would give wrong qdisc_root etc. results (e.g. for sch_htb as child of
      sch_prio). This patch fixes it by using parent's dev_queue for such
      grandchildren qdiscs. The select_queue method's return type is changed
      BTW.
      
      With feedback from: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
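
The fixed selection rule can be pictured with a toy hierarchy. The Python sketch below is only a model: `ToyQdisc` and `pick_dev_queue` are hypothetical names, and the mapping from a class index to a tx queue number is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyQdisc:
    kind: str
    dev_queue: int                      # tx queue this qdisc is bound to
    parent: Optional["ToyQdisc"] = None


def pick_dev_queue(parent: Optional[ToyQdisc], class_index: int) -> int:
    """Choose the tx queue for a qdisc being attached under `parent`."""
    if parent is None:
        return 0                        # root qdisc: first tx queue
    if parent.kind == "mq":
        return class_index              # direct child: role of select_queue
    return parent.dev_queue             # grandchild: inherit from the parent


mq = ToyQdisc("mq", dev_queue=0)
prio = ToyQdisc("prio", pick_dev_queue(mq, class_index=2), parent=mq)
htb = ToyQdisc("htb", pick_dev_queue(prio, class_index=1), parent=prio)
```

Before the fix, the `htb` grandchild in this example would have ended up on the first tx queue instead of sharing queue 2 with its `prio` parent.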
  12. 10 Sep, 2009 1 commit
    • P
      net_sched: fix estimator lock selection for mq child qdiscs · 23bcf634
      Patrick McHardy committed
      When new child qdiscs are attached to the mq qdisc, they are actually
      attached as root qdiscs to the device queues. The lock selection for
      new estimators incorrectly picks the root lock of the existing and
      to be replaced qdisc, which results in a use-after-free once the old
      qdisc has been destroyed.
      
      Mark mq qdisc instances with a new flag and treat qdiscs attached to
      mq as children similar to regular root qdiscs.
      
      Additionally prevent estimators from being attached to the mq qdisc
      itself since it only updates its byte and packet counters during dumps.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 06 Sep, 2009 1 commit
    • D
      net_sched: add classful multiqueue dummy scheduler · 6ec1c69a
      David S. Miller committed
      This patch adds a classful dummy scheduler which can be used as root qdisc
      for multiqueue devices and exposes each device queue as a child class.
      
      This allows queues to be addressed individually and grafted like regular
      classes. Additionally it presents an accumulated view of the statistics of
      all real root qdiscs in the dummy root.
      
      Two new callbacks are added to the qdisc_ops and qdisc_class_ops:
      
      - cl_ops->select_queue selects the tx queue number for new child classes.
      
      - qdisc_ops->attach() overrides root qdisc device grafting to attach
        non-shared qdiscs to the queues.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
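
A dummy root that exposes one child per tx queue and sums their counters on request might look like the toy model below. This is a Python sketch with hypothetical names (`ToyLeaf`, `ToyMq`, `graft`, `dump_stats`); it does not reproduce the qdisc_ops or qdisc_class_ops interfaces.

```python
class ToyLeaf:
    """Stands for the real root qdisc attached to one tx queue."""

    def __init__(self, name: str = "default"):
        self.name = name
        self.nbytes = 0
        self.packets = 0

    def account(self, length: int) -> None:
        self.nbytes += length
        self.packets += 1


class ToyMq:
    """Dummy root: holds no packets itself, only one child per tx queue."""

    def __init__(self, num_tx_queues: int):
        self.children = [ToyLeaf() for _ in range(num_tx_queues)]

    def graft(self, queue_index: int, qdisc: ToyLeaf) -> ToyLeaf:
        """Replace the qdisc on one queue and return the old one."""
        old = self.children[queue_index]
        self.children[queue_index] = qdisc
        return old

    def dump_stats(self) -> dict:
        """Accumulated view, computed only when statistics are dumped."""
        return {
            "bytes": sum(c.nbytes for c in self.children),
            "packets": sum(c.packets for c in self.children),
        }


mq = ToyMq(num_tx_queues=4)
replaced = mq.graft(1, ToyLeaf("custom"))
mq.children[0].account(1500)
mq.children[1].account(500)
totals = mq.dump_stats()
```

Each queue keeps its own independent child, so grafting on one queue leaves the others untouched while the root still reports a device-wide total.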