1. 14 9月, 2014 1 次提交
  2. 23 8月, 2014 1 次提交
  3. 07 3月, 2014 1 次提交
  4. 23 1月, 2014 1 次提交
  5. 01 1月, 2014 2 次提交
  6. 12 12月, 2013 2 次提交
  7. 21 9月, 2013 2 次提交
  8. 12 9月, 2013 1 次提交
  9. 15 8月, 2013 1 次提交
    • J
      net_sched: restore "linklayer atm" handling · 8a8e3d84
      Jesper Dangaard Brouer 提交于
      commit 56b765b7 ("htb: improved accuracy at high rates")
      broke the "linklayer atm" handling.
      
       tc class add ... htb rate X ceil Y linklayer atm
      
      The linklayer setting is implemented by modifying the rate table
      which is send to the kernel.  No direct parameter were
      transferred to the kernel indicating the linklayer setting.
      
      The commit 56b765b7 ("htb: improved accuracy at high rates")
      removed the use of the rate table system.
      
      To keep compatible with older iproute2 utils, this patch detects
      the linklayer by parsing the rate table.  It also supports future
      versions of iproute2 to send this linklayer parameter to the
      kernel directly. This is done by using the __reserved field in
      struct tc_ratespec, to convey the choosen linklayer option, but
      only using the lower 4 bits of this field.
      
      Linklayer detection is limited to speeds below 100Mbit/s, because
      at high rates the rtab is gets too inaccurate, so bad that
      several fields contain the same values, this resembling the ATM
      detect.  Fields even start to contain "0" time to send, e.g. at
      1000Mbit/s sending a 96 bytes packet cost "0", thus the rtab have
      been more broken than we first realized.
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8a8e3d84
  10. 03 8月, 2013 1 次提交
  11. 20 6月, 2013 1 次提交
    • E
      htb: refactor struct htb_sched fields for performance · c9364636
      Eric Dumazet 提交于
      htb_sched structures are big, and source of false sharing on SMP.
      
      Every time a packet is queued or dequeue, many cache lines must be
      touched because structures are not lay out properly.
      
      By carefully splitting htb_sched in two parts, and define sub structures
      to increase data locality, we can improve performance dramatically on
      SMP.
      
      New htb_prio structure can also be used in htb_class to increase data
      locality.
      
      I got 26 % performance increase on a 24 threads machine, with 200
      concurrent netperf in TCP_RR mode, using a HTB hierarchy of 4 classes.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c9364636
  12. 14 6月, 2013 1 次提交
  13. 12 6月, 2013 1 次提交
  14. 11 6月, 2013 1 次提交
    • E
      net_sched: add 64bit rate estimators · 45203a3b
      Eric Dumazet 提交于
      struct gnet_stats_rate_est contains u32 fields, so the bytes per second
      field can wrap at 34360Mbit.
      
      Add a new gnet_stats_rate_est64 structure to get 64bit bps/pps fields,
      and switch the kernel to use this structure natively.
      
      This structure is dumped to user space as a new attribute :
      
      TCA_STATS_RATE_EST64
      
      Old tc command will now display the capped bps (to 34360Mbit), instead
      of wrapped values, and updated tc command will display correct
      information.
      
      Old tc command output, after patch :
      
      eric:~# tc -s -d qd sh dev lo
      qdisc pfifo 8001: root refcnt 2 limit 1000p
       Sent 80868245400 bytes 1978837 pkt (dropped 0, overlimits 0 requeues 0)
       rate 34360Mbit 189696pps backlog 0b 0p requeues 0
      
      This patch carefully reorganizes "struct Qdisc" layout to get optimal
      performance on SMP.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      45203a3b
  15. 05 6月, 2013 1 次提交
  16. 03 6月, 2013 1 次提交
  17. 07 3月, 2013 1 次提交
  18. 28 2月, 2013 1 次提交
    • S
      hlist: drop the node parameter from iterators · b67bfe0d
      Sasha Levin 提交于
      I'm not sure why, but the hlist for each entry iterators were conceived
      
              list_for_each_entry(pos, head, member)
      
      The hlist ones were greedy and wanted an extra parameter:
      
              hlist_for_each_entry(tpos, pos, head, member)
      
      Why did they need an extra pos parameter? I'm not quite sure. Not only
      they don't really need it, it also prevents the iterator from looking
      exactly like the list iterator, which is unfortunate.
      
      Besides the semantic patch, there was some manual work required:
      
       - Fix up the actual hlist iterators in linux/list.h
       - Fix up the declaration of other iterators based on the hlist ones.
       - A very small amount of places were using the 'node' parameter, this
       was modified to use 'obj->member' instead.
       - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
       properly, so those had to be fixed up manually.
      
      The semantic patch which is mostly the work of Peter Senna Tschudin is here:
      
      @@
      iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
      
      type T;
      expression a,c,d,e;
      identifier b;
      statement S;
      @@
      
      -T b;
          <+... when != b
      (
      hlist_for_each_entry(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue(a,
      - b,
      c) S
      |
      hlist_for_each_entry_from(a,
      - b,
      c) S
      |
      hlist_for_each_entry_rcu(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_rcu_bh(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue_rcu_bh(a,
      - b,
      c) S
      |
      for_each_busy_worker(a, c,
      - b,
      d) S
      |
      ax25_uid_for_each(a,
      - b,
      c) S
      |
      ax25_for_each(a,
      - b,
      c) S
      |
      inet_bind_bucket_for_each(a,
      - b,
      c) S
      |
      sctp_for_each_hentry(a,
      - b,
      c) S
      |
      sk_for_each(a,
      - b,
      c) S
      |
      sk_for_each_rcu(a,
      - b,
      c) S
      |
      sk_for_each_from
      -(a, b)
      +(a)
      S
      + sk_for_each_from(a) S
      |
      sk_for_each_safe(a,
      - b,
      c, d) S
      |
      sk_for_each_bound(a,
      - b,
      c) S
      |
      hlist_for_each_entry_safe(a,
      - b,
      c, d, e) S
      |
      hlist_for_each_entry_continue_rcu(a,
      - b,
      c) S
      |
      nr_neigh_for_each(a,
      - b,
      c) S
      |
      nr_neigh_for_each_safe(a,
      - b,
      c, d) S
      |
      nr_node_for_each(a,
      - b,
      c) S
      |
      nr_node_for_each_safe(a,
      - b,
      c, d) S
      |
      - for_each_gfn_sp(a, c, d, b) S
      + for_each_gfn_sp(a, c, d) S
      |
      - for_each_gfn_indirect_valid_sp(a, c, d, b) S
      + for_each_gfn_indirect_valid_sp(a, c, d) S
      |
      for_each_host(a,
      - b,
      c) S
      |
      for_each_host_safe(a,
      - b,
      c, d) S
      |
      for_each_mesh_entry(a,
      - b,
      c, d) S
      )
          ...+>
      
      [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
      [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
      [akpm@linux-foundation.org: checkpatch fixes]
      [akpm@linux-foundation.org: fix warnings]
      [akpm@linux-foudnation.org: redo intrusive kvm changes]
      Tested-by: NPeter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b67bfe0d
  19. 13 2月, 2013 5 次提交
  20. 22 12月, 2012 1 次提交
  21. 07 11月, 2012 1 次提交
    • E
      htb: fix two bugs · 196d97f6
      Eric Dumazet 提交于
      Commit 56b765b7 (htb: improved accuracy at high rates)
      introduced two bugs :
      
      1) one bstats_update() was inadvertently removed from
         htb_dequeue_tree(), breaking statistics/rate estimation.
      
      2) Missing qdisc_put_rtab() calls in htb_change_class(),
         leaking kernel memory, now struct htb_class no longer
         retains pointers to qdisc_rate_table structs.
      
         Since only rate is used, dont use qdisc_get_rtab() calls
         copying data we ignore anyway.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Vimalkumar <j.vimal@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      196d97f6
  22. 04 11月, 2012 1 次提交
    • V
      htb: improved accuracy at high rates · 56b765b7
      Vimalkumar 提交于
      Current HTB (and TBF) uses rate table computed by the "tc"
      userspace program, which has the following issue:
      
      The rate table has 256 entries to map packet lengths
      to token (time units).  With TSO sized packets, the
      256 entry granularity leads to loss/gain of rate,
      making the token bucket inaccurate.
      
      Thus, instead of relying on rate table, this patch
      explicitly computes the time and accounts for packet
      transmission times with nanosecond granularity.
      
      This greatly improves accuracy of HTB with a wide
      range of packet sizes.
      
      Example:
      
      tc qdisc add dev $dev root handle 1: \
              htb default 1
      
      tc class add dev $dev classid 1:1 parent 1: \
              rate 5Gbit mtu 64k
      
      Here is an example of inaccuracy:
      
      $ iperf -c host -t 10 -i 1
      
      With old htb:
      eth4:   34.76 Mb/s In  5827.98 Mb/s Out -  65836.0 p/s In  481273.0 p/s Out
      [SUM]  9.0-10.0 sec   669 MBytes  5.61 Gbits/sec
      [SUM]  0.0-10.0 sec  6.50 GBytes  5.58 Gbits/sec
      
      With new htb:
      eth4:   28.36 Mb/s In  5208.06 Mb/s Out -  53704.0 p/s In  430076.0 p/s Out
      [SUM]  9.0-10.0 sec   594 MBytes  4.98 Gbits/sec
      [SUM]  0.0-10.0 sec  5.80 GBytes  4.98 Gbits/sec
      
      The bits per second on the wire is still 5200Mb/s with new HTB
      because qdisc accounts for packet length using skb->len, which
      is smaller than total bytes on the wire if GSO is used.  But
      that is for another patch regardless of how time is accounted.
      
      Many thanks to Eric Dumazet for review and feedback.
      Signed-off-by: NVimalkumar <j.vimal@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      56b765b7
  23. 11 5月, 2012 1 次提交
  24. 04 5月, 2012 1 次提交
  25. 02 4月, 2012 1 次提交
  26. 31 3月, 2011 1 次提交
  27. 21 1月, 2011 2 次提交
    • E
      net_sched: accurate bytes/packets stats/rates · 9190b3b3
      Eric Dumazet 提交于
      In commit 44b82883 (net_sched: pfifo_head_drop problem), we fixed
      a problem with pfifo_head drops that incorrectly decreased
      sch->bstats.bytes and sch->bstats.packets
      
      Several qdiscs (CHOKe, SFQ, pfifo_head, ...) are able to drop a
      previously enqueued packet, and bstats cannot be changed, so
      bstats/rates are not accurate (over estimated)
      
      This patch changes the qdisc_bstats updates to be done at dequeue() time
      instead of enqueue() time. bstats counters no longer account for dropped
      frames, and rates are more correct, since enqueue() bursts dont have
      effect on dequeue() rate.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Acked-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9190b3b3
    • E
      net_sched: move TCQ_F_THROTTLED flag · fd245a4a
      Eric Dumazet 提交于
      In commit 37112105 (net: QDISC_STATE_RUNNING dont need atomic bit
      ops) I moved QDISC_STATE_RUNNING flag to __state container, located in
      the cache line containing qdisc lock and often dirtied fields.
      
      I now move TCQ_F_THROTTLED bit too, so that we let first cache line read
      mostly, and shared by all cpus. This should speedup HTB/CBQ for example.
      
      Not using test_bit()/__clear_bit()/__test_and_set_bit allows to use an
      "unsigned int" for __state container, reducing by 8 bytes Qdisc size.
      
      Introduce helpers to hide implementation details.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      CC: Patrick McHardy <kaber@trash.net>
      CC: Jesper Dangaard Brouer <hawk@diku.dk>
      CC: Jarek Poplawski <jarkao2@gmail.com>
      CC: Jamal Hadi Salim <hadi@cyberus.ca>
      CC: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fd245a4a
  28. 20 1月, 2011 1 次提交
  29. 11 1月, 2011 1 次提交
  30. 21 10月, 2010 1 次提交
  31. 10 10月, 2010 1 次提交
  32. 07 6月, 2010 1 次提交