1. 24 7月, 2012 1 次提交
    • D
      ipv4: Prepare for change of rt->rt_iif encoding. · 92101b3b
      David S. Miller 提交于
      Use inet_iif() consistently, and for TCP record the input interface of
      cached RX dst in inet sock.
      
      rt->rt_iif is going to be encoded differently, so that we can
      legitimately cache input routes in the FIB info more aggressively.
      
      When the input interface is "use SKB device index" the rt->rt_iif will
      be set to zero.
      
      This forces us to move the TCP RX dst cache installation into the ipv4
      specific code, and as well it should since doing the route caching for
      ipv6 is pointless at the moment since it is not inspected in the ipv6
      input paths yet.
      
      Also, remove the unlikely on dst->obsolete, all ipv4 dsts have
      obsolete set to a non-zero value to force invocation of the check
      callback.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      92101b3b
  2. 17 7月, 2012 1 次提交
    • E
      netem: refine early skb orphaning · 5a308f40
      Eric Dumazet 提交于
      netem does an early orphaning of skbs. Doing so breaks TCP Small Queue
      or any mechanism relying on socket sk_wmem_alloc feedback.
      
      Ideally, we should perform this orphaning after the rate module and
      before the delay module, to mimic what happens on a real link :
      
      skb orphaning is indeed normally done at TX completion, before the
      transit on the link.
      
      +-------+   +--------+  +---------------+  +-----------------+
      + Qdisc +---> Device +--> TX completion +--> links / hops    +->
      +       +   +  xmit  +  + skb orphaning +  + propagation     +
      +-------+   +--------+  +---------------+  +-----------------+
            < rate limiting >                  < delay, drops, reorders >
      
      If netem is used without delay feature (drops, reorders, rate
      limiting), then we should avoid early skb orphaning, to keep pressure
      on sockets as long as packets are still in qdisc queue.
      
      Ideally, netem should be refactored to implement delay module
      as the last stage. Current algorithm merges the two phases
      (rate limiting + delay) so its not correct.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Hagen Paul Pfeifer <hagen@jauu.net>
      Cc: Mark Gordon <msg@google.com>
      Cc: Andreas Terzis <aterzis@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5a308f40
  3. 12 7月, 2012 2 次提交
  4. 09 7月, 2012 1 次提交
    • E
      netem: add limitation to reordered packets · 960fb66e
      Eric Dumazet 提交于
      Fix two netem bugs :
      
      1) When a frame was dropped by tfifo_enqueue(), drop counter
         was incremented twice.
      
      2) When reordering is triggered, we enqueue a packet without
         checking queue limit. This can OOM pretty fast when this
         is repeated enough, since skbs are orphaned, no socket limit
         can help in this situation.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Mark Gordon <msg@google.com>
      Cc: Andreas Terzis <aterzis@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Hagen Paul Pfeifer <hagen@jauu.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      960fb66e
  5. 05 7月, 2012 1 次提交
  6. 04 7月, 2012 1 次提交
  7. 27 6月, 2012 3 次提交
  8. 01 6月, 2012 1 次提交
  9. 18 5月, 2012 1 次提交
  10. 17 5月, 2012 1 次提交
    • E
      fq_codel: should use qdisc backlog as threshold · 865ec552
      Eric Dumazet 提交于
      codel_should_drop() logic allows a packet being not dropped if queue
      size is under max packet size.
      
      In fq_codel, we have two possible backlogs : The qdisc global one, and
      the flow local one.
      
      The meaningful one for codel_should_drop() should be the global backlog,
      not the per flow one, so that thin flows can have a non zero drop/mark
      probability.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Dave Taht <dave.taht@bufferbloat.net>
      Cc: Kathleen Nichols <nichols@pollere.com>
      Cc: Van Jacobson <van@pollere.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      865ec552
  11. 16 5月, 2012 1 次提交
  12. 15 5月, 2012 2 次提交
    • S
      net: codel: fix build errors · 669d67bf
      Sasha Levin 提交于
      Fix the following build error:
      
      net/sched/sch_fq_codel.c: In function 'fq_codel_dump_stats':
      net/sched/sch_fq_codel.c:464:3: error: unknown field 'qdisc_stats' specified in initializer
      net/sched/sch_fq_codel.c:464:3: warning: missing braces around initializer
      net/sched/sch_fq_codel.c:464:3: warning: (near initialization for 'st.<anonymous>')
      net/sched/sch_fq_codel.c:465:3: error: unknown field 'qdisc_stats' specified in initializer
      net/sched/sch_fq_codel.c:465:3: warning: excess elements in struct initializer
      net/sched/sch_fq_codel.c:465:3: warning: (near initialization for 'st')
      net/sched/sch_fq_codel.c:466:3: error: unknown field 'qdisc_stats' specified in initializer
      net/sched/sch_fq_codel.c:466:3: warning: excess elements in struct initializer
      net/sched/sch_fq_codel.c:466:3: warning: (near initialization for 'st')
      net/sched/sch_fq_codel.c:467:3: error: unknown field 'qdisc_stats' specified in initializer
      net/sched/sch_fq_codel.c:467:3: warning: excess elements in struct initializer
      net/sched/sch_fq_codel.c:467:3: warning: (near initialization for 'st')
      make[1]: *** [net/sched/sch_fq_codel.o] Error 1
      Signed-off-by: NSasha Levin <levinsasha928@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      669d67bf
    • G
      net/codel: Add missing #include <linux/prefetch.h> · ce5b4b97
      Geert Uytterhoeven 提交于
      m68k allmodconfig:
      
      net/sched/sch_codel.c: In function ‘dequeue’:
      net/sched/sch_codel.c:70: error: implicit declaration of function ‘prefetch’
      make[1]: *** [net/sched/sch_codel.o] Error 1
      Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ce5b4b97
  13. 13 5月, 2012 1 次提交
    • E
      fq_codel: Fair Queue Codel AQM · 4b549a2e
      Eric Dumazet 提交于
      Fair Queue Codel packet scheduler
      
      Principles :
      
      - Packets are classified (internal classifier or external) on flows.
      - This is a Stochastic model (as we use a hash, several flows might
                                    be hashed on same slot)
      - Each flow has a CoDel managed queue.
      - Flows are linked onto two (Round Robin) lists,
        so that new flows have priority on old ones.
      
      - For a given flow, packets are not reordered (CoDel uses a FIFO)
      - head drops only.
      - ECN capability is on by default.
      - Very low memory footprint (64 bytes per flow)
      
      tc qdisc ... fq_codel [ limit PACKETS ] [ flows number ]
                            [ target TIME ] [ interval TIME ] [ noecn ]
                            [ quantum BYTES ]
      
      defaults : 1024 flows, 10240 packets limit, quantum : device MTU
                 target : 5ms (CoDel default)
                 interval : 100ms (CoDel default)
      
      Impressive results on load :
      
      class htb 1:1 root leaf 10: prio 0 quantum 1514 rate 200000Kbit ceil 200000Kbit burst 1475b/8 mpu 0b overhead 0b cburst 1475b/8 mpu 0b overhead 0b level 0
       Sent 43304920109 bytes 33063109 pkt (dropped 0, overlimits 0 requeues 0)
       rate 201691Kbit 28595pps backlog 0b 312p requeues 0
       lended: 33063109 borrowed: 0 giants: 0
       tokens: -912 ctokens: -912
      
      class fq_codel 10:1735 parent 10:
       (dropped 1292, overlimits 0 requeues 0)
       backlog 15140b 10p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 7.1ms
      class fq_codel 10:4524 parent 10:
       (dropped 1291, overlimits 0 requeues 0)
       backlog 16654b 11p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 7.1ms
      class fq_codel 10:4e74 parent 10:
       (dropped 1290, overlimits 0 requeues 0)
       backlog 6056b 4p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 6.4ms dropping drop_next 92.0ms
      class fq_codel 10:628a parent 10:
       (dropped 1289, overlimits 0 requeues 0)
       backlog 7570b 5p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 5.4ms dropping drop_next 90.9ms
      class fq_codel 10:a4b3 parent 10:
       (dropped 302, overlimits 0 requeues 0)
       backlog 16654b 11p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 7.1ms
      class fq_codel 10:c3c2 parent 10:
       (dropped 1284, overlimits 0 requeues 0)
       backlog 13626b 9p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 5.9ms
      class fq_codel 10:d331 parent 10:
       (dropped 299, overlimits 0 requeues 0)
       backlog 15140b 10p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 7.0ms
      class fq_codel 10:d526 parent 10:
       (dropped 12160, overlimits 0 requeues 0)
       backlog 35870b 211p requeues 0
        deficit 1508 count 12160 lastcount 1 ldelay 15.3ms dropping drop_next 247us
      class fq_codel 10:e2c6 parent 10:
       (dropped 1288, overlimits 0 requeues 0)
       backlog 15140b 10p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 7.1ms
      class fq_codel 10:eab5 parent 10:
       (dropped 1285, overlimits 0 requeues 0)
       backlog 16654b 11p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 5.9ms
      class fq_codel 10:f220 parent 10:
       (dropped 1289, overlimits 0 requeues 0)
       backlog 15140b 10p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 7.1ms
      
      qdisc htb 1: root refcnt 6 r2q 10 default 1 direct_packets_stat 0 ver 3.17
       Sent 43331086547 bytes 33092812 pkt (dropped 0, overlimits 66063544 requeues 71)
       rate 201697Kbit 28602pps backlog 0b 260p requeues 71
      qdisc fq_codel 10: parent 1:1 limit 10240p flows 65536 target 5.0ms interval 100.0ms ecn
       Sent 43331086547 bytes 33092812 pkt (dropped 949359, overlimits 0 requeues 0)
       rate 201697Kbit 28602pps backlog 189352b 260p requeues 0
        maxpacket 1514 drop_overlimit 0 new_flow_count 5582 ecn_mark 125593
        new_flows_len 0 old_flows_len 11
      
      PING 172.30.42.18 (172.30.42.18) 56(84) bytes of data.
      64 bytes from 172.30.42.18: icmp_req=1 ttl=64 time=0.227 ms
      64 bytes from 172.30.42.18: icmp_req=2 ttl=64 time=0.165 ms
      64 bytes from 172.30.42.18: icmp_req=3 ttl=64 time=0.166 ms
      64 bytes from 172.30.42.18: icmp_req=4 ttl=64 time=0.151 ms
      64 bytes from 172.30.42.18: icmp_req=5 ttl=64 time=0.164 ms
      64 bytes from 172.30.42.18: icmp_req=6 ttl=64 time=0.172 ms
      64 bytes from 172.30.42.18: icmp_req=7 ttl=64 time=0.175 ms
      64 bytes from 172.30.42.18: icmp_req=8 ttl=64 time=0.183 ms
      64 bytes from 172.30.42.18: icmp_req=9 ttl=64 time=0.158 ms
      64 bytes from 172.30.42.18: icmp_req=10 ttl=64 time=0.200 ms
      
      10 packets transmitted, 10 received, 0% packet loss, time 8999ms
      rtt min/avg/max/mdev = 0.151/0.176/0.227/0.022 ms
      
      Much better than SFQ because of priority given to new flows, and fast
      path dirtying less cache lines.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4b549a2e
  14. 11 5月, 2012 2 次提交
    • E
      codel: Controlled Delay AQM · 76e3cc12
      Eric Dumazet 提交于
      An implementation of CoDel AQM, from Kathleen Nichols and Van Jacobson.
      
      http://queue.acm.org/detail.cfm?id=2209336
      
      This AQM main input is no longer queue size in bytes or packets, but the
      delay packets stay in (FIFO) queue.
      
      As we don't have infinite memory, we still can drop packets in enqueue()
      in case of massive load, but mean of CoDel is to drop packets in
      dequeue(), using a control law based on two simple parameters :
      
      target : target sojourn time (default 5ms)
      interval : width of moving time window (default 100ms)
      
      Based on initial work from Dave Taht.
      
      Refactored to help future codel inclusion as a plugin for other linux
      qdisc (FQ_CODEL, ...), like RED.
      
      include/net/codel.h contains codel algorithm as close as possible than
      Kathleen reference.
      
      net/sched/sch_codel.c contains the linux qdisc specific glue.
      
      Separate structures permit a memory efficient implementation of fq_codel
      (to be sent as a separate work) : Each flow has its own struct
      codel_vars.
      
      timestamps are taken at enqueue() time with 1024 ns precision, allowing
      a range of 2199 seconds in queue, and 100Gb links support. iproute2 uses
      usec as base unit.
      
      Selected packets are dropped, unless ECN is enabled and packets can get
      ECN mark instead.
      
      Tested from 2Mb to 10Gb speeds with no particular problems, on ixgbe and
      tg3 drivers (BQL enabled).
      
      Usage: tc qdisc ... codel [ limit PACKETS ] [ target TIME ]
                                [ interval TIME ] [ ecn ]
      
      qdisc codel 10: parent 1:1 limit 2000p target 3.0ms interval 60.0ms ecn
       Sent 13347099587 bytes 8815805 pkt (dropped 0, overlimits 0 requeues 0)
       rate 202365Kbit 16708pps backlog 113550b 75p requeues 0
        count 116 lastcount 98 ldelay 4.3ms dropping drop_next 816us
        maxpacket 1514 ecn_mark 84399 drop_overlimit 0
      
      CoDel must be seen as a base module, and should be used keeping in mind
      there is still a FIFO queue. So a typical setup will probably need a
      hierarchy of several qdiscs and packet classifiers to be able to meet
      whatever constraints a user might have.
      
      One possible example would be to use fq_codel, which combines Fair
      Queueing and CoDel, in replacement of sfq / sfq_red.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDave Taht <dave.taht@bufferbloat.net>
      Cc: Kathleen Nichols <nichols@pollere.com>
      Cc: Van Jacobson <van@pollere.net>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Matt Mathis <mattmathis@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      76e3cc12
    • E
      net_sched: update bstats in dequeue() · 2dd875ff
      Eric Dumazet 提交于
      Class bytes/packets stats can be misleading because they are updated in
      enqueue() while packet might be dropped later.
      
      We already fixed all qdiscs but sch_atm.
      
      This patch makes the final cleanup.
      
      class rate estimators can now match qdisc ones.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2dd875ff
  15. 04 5月, 2012 1 次提交
  16. 02 5月, 2012 1 次提交
  17. 01 5月, 2012 1 次提交
  18. 17 4月, 2012 1 次提交
    • D
      net_sched: gred: Fix oops in gred_dump() in WRED mode · 244b65db
      David Ward 提交于
      A parameter set exists for WRED mode, called wred_set, to hold the same
      values for qavg and qidlestart across all VQs. The WRED mode values had
      been previously held in the VQ for the default DP. After these values
      were moved to wred_set, the VQ for the default DP was no longer created
      automatically (so that it could be omitted on purpose, to have packets
      in the default DP enqueued directly to the device without using RED).
      
      However, gred_dump() was overlooked during that change; in WRED mode it
      still reads qavg/qidlestart from the VQ for the default DP, which might
      not even exist. As a result, this command sequence will cause an oops:
      
      tc qdisc add dev $DEV handle $HANDLE parent $PARENT gred setup \
          DPs 3 default 2 grio
      tc qdisc change dev $DEV handle $HANDLE gred DP 0 prio 8 $RED_OPTIONS
      tc qdisc change dev $DEV handle $HANDLE gred DP 1 prio 8 $RED_OPTIONS
      
      This fixes gred_dump() in WRED mode to use the values held in wred_set.
      Signed-off-by: NDavid Ward <david.ward@ll.mit.edu>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      244b65db
  19. 02 4月, 2012 3 次提交
    • D
      pkt_sched: Stop using NLA_PUT*(). · 1b34ec43
      David S. Miller 提交于
      These macros contain a hidden goto, and are thus extremely error
      prone and make code hard to audit.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1b34ec43
    • T
      cgroup: convert all non-memcg controllers to the new cftype interface · 4baf6e33
      Tejun Heo 提交于
      Convert debug, freezer, cpuset, cpu_cgroup, cpuacct, net_prio, blkio,
      net_cls and device controllers to use the new cftype based interface.
      Termination entry is added to cftype arrays and populate callbacks are
      replaced with cgroup_subsys->base_cftypes initializations.
      
      This is functionally identical transformation.  There shouldn't be any
      visible behavior change.
      
      memcg is rather special and will be converted separately.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizf@cn.fujitsu.com>
      Cc: Paul Menage <paul@paulmenage.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      4baf6e33
    • T
      cgroup: relocate cftype and cgroup_subsys definitions in controllers · 676f7c8f
      Tejun Heo 提交于
      blk-cgroup, netprio_cgroup, cls_cgroup and tcp_memcontrol
      unnecessarily define cftype array and cgroup_subsys structures at the
      top of the file, which is unconventional and necessiates forward
      declaration of methods.
      
      This patch relocates those below the definitions of the methods and
      removes the forward declarations.  Note that forward declaration of
      tcp_files[] is added in tcp_memcontrol.c for tcp_init_cgroup().  This
      will be removed soon by another patch.
      
      This patch doesn't introduce any functional change.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizf@cn.fujitsu.com>
      676f7c8f
  20. 16 3月, 2012 1 次提交
    • E
      sch_sfq: revert dont put new flow at the end of flows · cc34eb67
      Eric Dumazet 提交于
      This reverts commit d47a0ac7 (sch_sfq: dont put new flow at the end of
      flows)
      
      As Jesper found out, patch sounded great but has bad side effects.
      
      In stress situation, pushing new flows in front of the queue can prevent
      old flows doing any progress. Packets can stay in SFQ queue for
      unlimited amount of time.
      
      It's possible to add heuristics to limit this problem, but this would
      add complexity outside of SFQ scope.
      
      A more sensible answer to Dave Taht concerns (who reported the issued I
      tried to solve in original commit) is probably to use a qdisc hierarchy
      so that high prio packets dont enter a potentially crowded SFQ qdisc.
      Reported-by: NJesper Dangaard Brouer <jdb@comx.dk>
      Cc: Dave Taht <dave.taht@gmail.com>
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cc34eb67
  21. 20 2月, 2012 1 次提交
  22. 14 2月, 2012 1 次提交
  23. 10 2月, 2012 1 次提交
  24. 08 2月, 2012 1 次提交
    • S
      net/sched: sch_plug - Queue traffic until an explicit release command · c3059be1
      Shriram Rajagopalan 提交于
      The qdisc supports two operations - plug and unplug. When the
      qdisc receives a plug command via netlink request, packets arriving
      henceforth are buffered until a corresponding unplug command is received.
      Depending on the type of unplug command, the queue can be unplugged
      indefinitely or selectively.
      
      This qdisc can be used to implement output buffering, an essential
      functionality required for consistent recovery in checkpoint based
      fault-tolerance systems. Output buffering enables speculative execution
      by allowing generated network traffic to be rolled back. It is used to
      provide network protection for Xen Guests in the Remus high availability
      project, available as part of Xen.
      
      This module is generic enough to be used by any other system that wishes
      to add speculative execution and output buffering to its applications.
      
      This module was originally available in the linux 2.6.32 PV-OPS tree,
      used as dom0 for Xen.
      
      For more information, please refer to http://nss.cs.ubc.ca/remus/
      and http://wiki.xensource.com/xenwiki/Remus
      
      Changes in V3:
        * Removed debug output (printk) on queue overflow
        * Added TCQ_PLUG_RELEASE_INDEFINITE - that allows the user to
          use this qdisc, for simple plug/unplug operations.
        * Use of packet counts instead of pointers to keep track of
          the buffers in the queue.
      Signed-off-by: NShriram Rajagopalan <rshriram@cs.ubc.ca>
      Signed-off-by: NBrendan Cully <brendan@cs.ubc.ca>
      [author of the code in the linux 2.6.32 pvops tree]
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c3059be1
  25. 07 2月, 2012 1 次提交
  26. 03 2月, 2012 1 次提交
    • L
      cgroup: remove cgroup_subsys argument from callbacks · 761b3ef5
      Li Zefan 提交于
      The argument is not used at all, and it's not necessary, because
      a specific callback handler of course knows which subsys it
      belongs to.
      
      Now only ->pupulate() takes this argument, because the handlers of
      this callback always call cgroup_add_file()/cgroup_add_files().
      
      So we reduce a few lines of code, though the shrinking of object size
      is minimal.
      
       16 files changed, 113 insertions(+), 162 deletions(-)
      
         text    data     bss     dec     hex filename
      5486240  656987 7039960 13183187         c928d3 vmlinux.o.orig
      5486170  656987 7039960 13183117         c9288d vmlinux.o
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      761b3ef5
  27. 23 1月, 2012 1 次提交
  28. 13 1月, 2012 1 次提交
    • E
      net_sched: sfq: add optional RED on top of SFQ · ddecf0f4
      Eric Dumazet 提交于
      Adds an optional Random Early Detection on each SFQ flow queue.
      
      Traditional SFQ limits count of packets, while RED permits to also
      control number of bytes per flow, and adds ECN capability as well.
      
      1) We dont handle the idle time management in this RED implementation,
      since each 'new flow' begins with a null qavg. We really want to address
      backlogged flows.
      
      2) if headdrop is selected, we try to ecn mark first packet instead of
      currently enqueued packet. This gives faster feedback for tcp flows
      compared to traditional RED [ marking the last packet in queue ]
      
      Example of use :
      
      tc qdisc add dev $DEV parent 1:1 handle 10: est 1sec 4sec sfq \
      	limit 3000 headdrop flows 512 divisor 16384 \
      	redflowlimit 100000 min 8000 max 60000 probability 0.20 ecn
      
      qdisc sfq 10: parent 1:1 limit 3000p quantum 1514b depth 127 headdrop
      flows 512/16384 divisor 16384
       ewma 6 min 8000b max 60000b probability 0.2 ecn
       prob_mark 0 prob_mark_head 4876 prob_drop 6131
       forced_mark 0 forced_mark_head 0 forced_drop 0
       Sent 1175211782 bytes 777537 pkt (dropped 6131, overlimits 11007
      requeues 0)
       rate 99483Kbit 8219pps backlog 689392b 456p requeues 0
      
      In this test, with 64 netperf TCP_STREAM sessions, 50% using ECN enabled
      flows, we can see number of packets CE marked is smaller than number of
      drops (for non ECN flows)
      
      If same test is run, without RED, we can check backlog is much bigger.
      
      qdisc sfq 10: parent 1:1 limit 3000p quantum 1514b depth 127 headdrop
      flows 512/16384 divisor 16384
       Sent 1148683617 bytes 795006 pkt (dropped 0, overlimits 0 requeues 0)
       rate 98429Kbit 8521pps backlog 1221290b 841p requeues 0
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      CC: Stephen Hemminger <shemminger@vyatta.com>
      CC: Dave Taht <dave.taht@gmail.com>
      Tested-by: NDave Taht <dave.taht@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ddecf0f4
  29. 06 1月, 2012 3 次提交
    • E
      net_sched: red: split red_parms into parms and vars · eeca6688
      Eric Dumazet 提交于
      This patch splits the red_parms structure into two components.
      
      One holding the RED 'constant' parameters, and one containing the
      variables.
      
      This permits a size reduction of GRED qdisc, and is a preliminary step
      to add an optional RED unit to SFQ.
      
      SFQRED will have a single red_parms structure shared by all flows, and a
      private red_vars per flow.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      CC: Dave Taht <dave.taht@gmail.com>
      CC: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      eeca6688
    • E
      net_sched: sfq: extend limits · 18cb8098
      Eric Dumazet 提交于
      SFQ as implemented in Linux is very limited, with at most 127 flows
      and limit of 127 packets. [ So if 127 flows are active, we have one
      packet per flow ]
      
      This patch brings to SFQ following features to cope with modern needs.
      
      - Ability to specify a smaller per flow limit of inflight packets.
          (default value being at 127 packets)
      
      - Ability to have up to 65408 active flows (instead of 127)
      
      - Ability to have head drops instead of tail drops
        (to drop old packets from a flow)
      
      Example of use : No more than 20 packets per flow, max 8000 flows, max
      20000 packets in SFQ qdisc, hash table of 65536 slots.
      
      tc qdisc add ... sfq \
              flows 8000 \
              depth 20 \
              headdrop \
              limit 20000 \
      	divisor 65536
      
      Ram usage :
      
      2 bytes per hash table entry (instead of previous 1 byte/entry)
      32 bytes per flow on 64bit arches, instead of 384 for QFQ, so much
      better cache hit ratio.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      CC: Dave Taht <dave.taht@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      18cb8098
    • H
      net_sched: Bug in netem reordering · eb101924
      Hagen Paul Pfeifer 提交于
      Not now, but it looks you are correct. q->qdisc is NULL until another
      additional qdisc is attached (beside tfifo). See 50612537.
      The following patch should work.
      
      From: Hagen Paul Pfeifer <hagen@jauu.net>
      
      netem: catch NULL pointer by updating the real qdisc statistic
      Reported-by: NVijay Subramanian <subramanian.vijay@gmail.com>
      Signed-off-by: NHagen Paul Pfeifer <hagen@jauu.net>
      Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      eb101924
  30. 05 1月, 2012 2 次提交