1. 20 Jan, 2011 2 commits
    • net_sched: cleanups · cc7ec456
      Eric Dumazet committed
      Cleanup net/sched code to current CodingStyle and practices.
      
      Reduce inline abuse
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net_sched: implement a root container qdisc sch_mqprio · b8970f0b
      John Fastabend committed
      This implements a mqprio queueing discipline that by default creates
      a pfifo_fast qdisc per tx queue and provides the needed configuration
      interface.
      
      Using the mqprio qdisc, the number of traffic classes (tcs) currently
      in use, along with the range of queues allotted to each class, can be
      configured. By default skbs are mapped to traffic classes using the
      skb priority. This mapping is configurable.
      
      Configurable parameters:
      
      struct tc_mqprio_qopt {
      	__u8    num_tc;
      	__u8    prio_tc_map[TC_BITMASK + 1];
      	__u8    hw;
      	__u16   count[TC_MAX_QUEUE];
      	__u16   offset[TC_MAX_QUEUE];
      };
      
      Here the count/offset pairing gives the queue alignment and the
      prio_tc_map gives the mapping from skb->priority to tc.
      
      The hw bit determines if the hardware should configure the count
      and offset values. If the hardware bit is set then the operation
      will fail if the hardware does not implement the ndo_setup_tc
      operation. This is to avoid undetermined states where the hardware
      may or may not control the queue mapping. Also minimal bounds
      checking is done on the count/offset to verify a queue does not
      exceed num_tx_queues and that queue ranges do not overlap. Otherwise
      it is left to user policy or hardware configuration to create
      useful mappings.
      
      It is expected that hardware QOS schemes can be implemented by
      creating appropriate mappings of queues in ndo_setup_tc().
      
      One expected use case is that drivers will use ndo_setup_tc to map
      queue ranges onto 802.1Q traffic classes. This provides a generic
      mechanism to map network traffic onto these traffic classes and
      removes the need for lower layer drivers to know specifics about
      traffic types.
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
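The mapping described in the mqprio commit message can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the kernel implementation: every `sketch_` name is hypothetical, the two size constants are assumed values standing in for TC_BITMASK and TC_MAX_QUEUE, picking a queue with a modulo of a flow hash is an assumption, and rejecting empty ranges goes beyond the minimal bounds checking the message describes.

```c
#include <stdint.h>

/* Assumed sizes for illustration; the real TC_BITMASK and TC_MAX_QUEUE
 * come from the kernel headers. */
#define SKETCH_TC_BITMASK   15
#define SKETCH_TC_MAX_QUEUE 16

/* User-space mirror of the option layout quoted in the commit message. */
struct sketch_mqprio_qopt {
	uint8_t  num_tc;
	uint8_t  prio_tc_map[SKETCH_TC_BITMASK + 1];
	uint8_t  hw;
	uint16_t count[SKETCH_TC_MAX_QUEUE];
	uint16_t offset[SKETCH_TC_MAX_QUEUE];
};

/* Map a packet priority to a traffic class through prio_tc_map. */
static unsigned int sketch_prio_to_tc(const struct sketch_mqprio_qopt *q,
				      uint32_t priority)
{
	return q->prio_tc_map[priority & SKETCH_TC_BITMASK];
}

/* Pick a tx queue inside the class range [offset, offset + count).
 * Using "hash % count" is an assumption made for this sketch.
 * Returns -1 if the class is not configured. */
static int sketch_pick_queue(const struct sketch_mqprio_qopt *q,
			     uint32_t priority, uint32_t hash)
{
	unsigned int tc = sketch_prio_to_tc(q, priority);

	if (tc >= q->num_tc || q->count[tc] == 0)
		return -1;
	return q->offset[tc] + (int)(hash % q->count[tc]);
}

/* Minimal bounds check: every range must fit in num_tx_queues and no two
 * ranges may overlap. Returns 1 if the layout is acceptable, 0 otherwise. */
static int sketch_validate(const struct sketch_mqprio_qopt *q,
			   unsigned int num_tx_queues)
{
	unsigned int i, j;

	if (q->num_tc > SKETCH_TC_MAX_QUEUE)
		return 0;
	for (i = 0; i < q->num_tc; i++) {
		unsigned int last = q->offset[i] + q->count[i];

		/* Rejecting empty ranges is an assumption of this sketch. */
		if (q->count[i] == 0 || last > num_tx_queues)
			return 0;
		for (j = i + 1; j < q->num_tc; j++) {
			unsigned int jlast = q->offset[j] + q->count[j];

			if (q->offset[i] < jlast && q->offset[j] < last)
				return 0;
		}
	}
	return 1;
}
```

With two classes of four queues each (offsets 0 and 4) on an 8-queue device, a packet whose priority maps to class 1 always lands in queues 4 to 7, whatever its hash.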
  2. 14 Jan, 2011 1 commit
    • net: remove dev_txq_stats_fold() · 1ac9ad13
      Eric Dumazet committed
      After recent changes (percpu stats on vlan/tunnels...), we no longer
      need the per struct netdev_queue tx_bytes/tx_packets/tx_dropped counters.
      
      The only remaining users are ixgbe, sch_teql, gianfar and macvlan:
      
      1) ixgbe can be converted to use existing tx_ring counters.
      
      2) macvlan incremented txq->tx_dropped, it can use the
      dev->stats.tx_dropped counter.
      
      3) sch_teql : almost revert ab35cd4b (Use net_device internal stats).
      Now that we have ndo_get_stats64(), use it, even for "unsigned long"
      fields (no need to bring back a struct net_device_stats).
      
      4) gianfar adds a stats structure per tx queue to hold
      tx_bytes/tx_packets
      
      This removes a lockdep warning (and possible lockup) in rndis gadget,
      calling dev_get_stats() from hard IRQ context.
      
      Ref: http://www.spinics.net/lists/netdev/msg149202.html
      Reported-by: Neil Jones <neiljay@gmail.com>
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Jarek Poplawski <jarkao2@gmail.com>
      CC: Alexander Duyck <alexander.h.duyck@intel.com>
      CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      CC: Sandeep Gopalpet <sandeep.kumar@freescale.com>
      CC: Michal Nazarewicz <mina86@mina86.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 11 Jan, 2011 1 commit
  4. 06 Jan, 2011 1 commit
    • net_sched: pfifo_head_drop problem · 44b82883
      Eric Dumazet committed
      Commit 57dbb2d8 (sched: add head drop fifo queue)
      introduced pfifo_head_drop, and broke the invariant that
      sch->bstats.bytes and sch->bstats.packets are COUNTERs (counters
      that only ever increase).
      
      This can break estimators because est_timer() handles unsigned deltas
      only. A decreasing counter can then give a huge unsigned delta.
      
      My mid-term suggestion would be to change things so that
      sch->bstats.bytes and sch->bstats.packets are incremented in dequeue()
      only, not at enqueue() time. We also could add drop_bytes/drop_packets
      and provide estimations of drop rates.
      
      It would be more sensible anyway for very low speeds and big bursts.
      Right now, if we drop packets, they are still accounted in the
      bytes/packets absolute counters and rate estimators.
      
      Until this mid-term change, this patch makes pfifo_head_drop behave
      like other qdiscs in case of drops:
      do not decrement sch->bstats.bytes and sch->bstats.packets.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Acked-by: Hagen Paul Pfeifer <hagen@jauu.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
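The "huge unsigned delta" mentioned in the pfifo_head_drop commit message is easy to reproduce outside the kernel. The sketch below is a hedged illustration: the `sketch_` names are hypothetical, the struct only stands in for the qdisc byte/packet counters, and the delta helper models just the unsigned subtraction that a periodic estimator would perform, not est_timer() itself.

```c
#include <stdint.h>

/* Hypothetical stand-in for the byte/packet counters of a qdisc. */
struct sketch_bstats {
	uint64_t bytes;
	uint32_t packets;
};

/* Delta between two samples, computed with unsigned arithmetic as an
 * estimator that assumes ever-increasing counters would do. */
static uint64_t sketch_delta(uint64_t now, uint64_t last)
{
	return now - last;
}

/* Problematic behaviour: a head drop takes the dropped packet back out
 * of the counters, so they can go down between two samples. */
static void sketch_head_drop_decrement(struct sketch_bstats *b, uint32_t len)
{
	b->bytes -= len;
	b->packets--;
}

/* Behaviour described by the patch: a drop leaves the counters alone,
 * so they stay monotonic. */
static void sketch_head_drop_keep(struct sketch_bstats *b, uint32_t len)
{
	(void)b;
	(void)len;
}
```

If a sample is taken at 3000 bytes and a 1500-byte packet is then removed from the counter, the next unsigned delta is 1500 - 3000, which wraps around to a value close to 2^64 instead of a small negative number.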
  5. 04 Jan, 2011 1 commit
  6. 01 Jan, 2011 2 commits
  7. 23 Dec, 2010 1 commit
    • sfq: fix sfq class stats handling · ee09b3c1
      Eric Dumazet committed
      sfq_walk() runs without the qdisc lock. By the time it selects a
      non-empty hash slot and sfq_dump_class_stats() is run (with the lock
      held), the slot might have been freed. We then access
      q->slots[SFQ_EMPTY_SLOT], out of bounds, and crash in
      slot_queue_walk().
      
      On previous kernels the bug is present too, but the out-of-bounds
      qs[SFQ_DEPTH] and allot[SFQ_DEPTH] are located inside struct
      sfq_sched_data, so no illegal memory access happens; only possibly
      wrong data is reported to the user.
      
      Also, slot_dequeue_tail() should make sure slot skb chain is correctly
      terminated, or sfq_dump_class_stats() can access freed skbs.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 21 Dec, 2010 3 commits
    • net_sched: sch_sfq: better struct layouts · eda83e3b
      Eric Dumazet committed
      Here is a respin of patch.
      
      I'll send a short patch to make SFQ more fair in presence of large
      packets as well.
      
      Thanks
      
      [PATCH v3 net-next-2.6] net_sched: sch_sfq: better struct layouts
      
      This patch shrinks sizeof(struct sfq_sched_data)
      from 0x14f8 (or more if spinlocks are bigger) to 0x1180 bytes, and
      reduces text size as well.
      
         text    data     bss     dec     hex filename
         4821     152       0    4973    136d old/net/sched/sch_sfq.o
         4627     136       0    4763    129b new/net/sched/sch_sfq.o
      
      All data for a slot/flow is now grouped in a compact and cache-friendly
      structure, instead of being spread across many different places.
      
      struct sfq_slot {
              struct sk_buff  *skblist_next;
              struct sk_buff  *skblist_prev;
              sfq_index       qlen; /* number of skbs in skblist */
              sfq_index       next; /* next slot in sfq chain */
              struct sfq_head dep; /* anchor in dep[] chains */
              unsigned short  hash; /* hash value (index in ht[]) */
              short           allot; /* credit for this slot */
      };
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Jarek Poplawski <jarkao2@gmail.com>
      Cc: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net_sched: sch_sfq: fix allot handling · aa3e2199
      Eric Dumazet committed
      When deploying SFQ/IFB here at work, I found the allot management was
      pretty wrong in sfq, even after changing allot from short to int...
      
      We should init allot for each new flow, not reuse a previous value
      found in the slot.
      
      Before the patch, I saw bursts of several packets per flow, apparently
      ignoring the default "quantum 1514" limit I had on my SFQ class.
      
      class sfq 11:1 parent 11: 
       (dropped 0, overlimits 0 requeues 0) 
       backlog 0b 7p requeues 0 
       allot 11546 
      
      class sfq 11:46 parent 11: 
       (dropped 0, overlimits 0 requeues 0) 
       backlog 0b 1p requeues 0 
       allot -23873 
      
      class sfq 11:78 parent 11: 
       (dropped 0, overlimits 0 requeues 0) 
       backlog 0b 5p requeues 0 
       allot 11393 
      
      After the patch, fairness among flows is better, the allot limit is
      respected, and allot is positive:
      
      class sfq 11:e parent 11: 
       (dropped 0, overlimits 0 requeues 86) 
       backlog 0b 3p requeues 86 
       allot 596 
      
      class sfq 11:94 parent 11: 
       (dropped 0, overlimits 0 requeues 0) 
       backlog 0b 3p requeues 0 
       allot 1468 
      
      class sfq 11:a4 parent 11: 
       (dropped 0, overlimits 0 requeues 0) 
       backlog 0b 4p requeues 0 
       allot 650 
      
      class sfq 11:bb parent 11: 
       (dropped 0, overlimits 0 requeues 0) 
       backlog 0b 3p requeues 0 
       allot 596 
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
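The rule stated in the allot commit message, "init allot for each new flow", can be sketched as a tiny credit scheme. This is an assumption-laden illustration rather than the sch_sfq code: the `sketch_` names and the struct are hypothetical, and the refill-on-exhaustion step is a generic deficit-style round robin chosen for the example.

```c
/* Hypothetical per-flow state; these are not the sch_sfq field names. */
struct sketch_flow {
	int qlen;   /* packets currently queued for this flow */
	int allot;  /* byte credit left in the current round */
};

/* Enqueue one packet. A flow that becomes active starts from a fresh
 * quantum instead of inheriting whatever value was left in the slot. */
static void sketch_enqueue(struct sketch_flow *f, int quantum)
{
	if (f->qlen == 0)
		f->allot = quantum;
	f->qlen++;
}

/* Dequeue one packet of len bytes. Returns 1 when the flow has used up
 * its credit and must yield to the next flow; its credit is refilled by
 * one quantum for the next round. Returns 0 otherwise. */
static int sketch_dequeue(struct sketch_flow *f, int len, int quantum)
{
	f->qlen--;
	f->allot -= len;
	if (f->allot <= 0) {
		f->allot += quantum;
		return 1;
	}
	return 0;
}
```

Starting from a stale credit such as the -23873 shown in the "before" output, the first enqueue resets the credit to the quantum, so the flow cannot send a long burst or be starved because of a leftover value.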
    • net_sched: sch_sfq: add backlog info in sfq_dump_class_stats() · c4266263
      Eric Dumazet committed
      We currently return, for each active SFQ slot, the number of packets in
      the queue. We can also give the number of bytes accounted for these
      packets.
      
      tc -s class show dev ifb0
      
      Before patch :
      
      class sfq 11:3d9 parent 11:
       (dropped 0, overlimits 0 requeues 0)
       backlog 0b 3p requeues 0
       allot 1266
      
      After patch :
      
      class sfq 11:3e4 parent 11:
       (dropped 0, overlimits 0 requeues 0)
       backlog 4380b 3p requeues 0
       allot 1212
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 17 Dec, 2010 1 commit
  10. 02 Dec, 2010 1 commit
  11. 29 Nov, 2010 1 commit
  12. 16 Nov, 2010 1 commit
    • Docs/Kconfig: Update: osdl.org -> linuxfoundation.org · c996d8b9
      Michael Witten committed
      Some of the documentation refers to web pages under
      the domain `osdl.org'. However, `osdl.org' now
      redirects to `linuxfoundation.org'.
      
      Rather than rely on redirections, this patch updates
      the addresses appropriately; for the most part, only
      documentation that is meant to be current has been
      updated.
      
      The patch should be pretty quick to scan and check;
      each new web-page URL was obtained by trying out the
      original URL in a browser and then simply copying the
      redirected URL (formatting as necessary).
      
      There is some conflict as to which one of these domain
      names is preferred:
      
        linuxfoundation.org
        linux-foundation.org
      
      So, I wrote:
      
        info@linuxfoundation.org
      
      and got this reply:
      
        Message-ID: <4CE17EE6.9040807@linuxfoundation.org>
        Date: Mon, 15 Nov 2010 10:41:42 -0800
        From: David Ames <david@linuxfoundation.org>
      
        ...
      
        linuxfoundation.org is preferred. The canonical name for our web site is
        www.linuxfoundation.org. Our list site is actually
        lists.linux-foundation.org.
      
        Regarding email linuxfoundation.org is preferred there are a few people
        who choose to use linux-foundation.org for their own reasons.
      
      Consequently, I used `linuxfoundation.org' for web pages and
      `lists.linux-foundation.org' for mailing-list web pages and email addresses;
      the only personal email address I updated from `@osdl.org' was that of
      Andrew Morton, who prefers `linux-foundation.org' according to `git log'.
      Signed-off-by: Michael Witten <mfwitten@gmail.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
  13. 09 Nov, 2010 1 commit
  14. 04 Nov, 2010 1 commit
  15. 01 Nov, 2010 1 commit
  16. 21 Oct, 2010 2 commits
  17. 19 Oct, 2010 1 commit
    • sched: Fix softirq time accounting · 75e1056f
      Venkatesh Pallipadi committed
      Peter Zijlstra found a bug in the way softirq time is accounted in
      VIRT_CPU_ACCOUNTING on this thread:
      
         http://lkml.indiana.edu/hypermail//linux/kernel/1009.2/01366.html
      
      The problem is that softirq processing uses local_bh_disable internally.
      There is no way, later in the flow, to tell whether a softirq is being
      processed or bh has merely been disabled. So a hardirq arriving while bh
      is disabled results in time being wrongly accounted as softirq.
      
      Looking at the code a bit more, the problem exists in !VIRT_CPU_ACCOUNTING
      as well, because account_system_time() in normal tick-based accounting
      also uses softirq_count, which is set even when bh is disabled outside
      softirq processing.
      
      Peter also suggested the solution of using 2*SOFTIRQ_OFFSET as the irq
      count for local_bh_{disable,enable} and using just SOFTIRQ_OFFSET during
      softirq processing. The patch below does that and adds the API
      in_serving_softirq(), which returns whether we are currently processing
      a softirq or not.
      
      Also changes one of the usages of softirq_count in net/sched/cls_cgroup.c
      to in_serving_softirq.
      
      Looks like many usages of in_softirq really want in_serving_softirq. Those
      changes can be made individually on a case by case basis.
      Signed-off-by: Venkatesh Pallipadi <venki@google.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1286237003-12406-2-git-send-email-venki@google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
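The counting trick described in the softirq accounting commit message can be modelled in a few lines of user-space C. This is a sketch under assumptions, not the kernel code: all `sketch_` names are hypothetical, the bit position of the softirq field is an illustrative choice, and a plain global variable stands in for the real per-context counter.

```c
/* Illustrative bit layout; the real constants live in the kernel headers
 * and may differ. */
#define SKETCH_SOFTIRQ_SHIFT     8
#define SKETCH_SOFTIRQ_OFFSET    (1U << SKETCH_SOFTIRQ_SHIFT)
#define SKETCH_SOFTIRQ_MASK      (0xffU << SKETCH_SOFTIRQ_SHIFT)
#define SKETCH_BH_DISABLE_OFFSET (2 * SKETCH_SOFTIRQ_OFFSET)

/* Stand-in for the per-context count the kernel keeps. */
static unsigned int sketch_count;

/* Disabling/enabling bh moves the count by two offsets... */
static void sketch_local_bh_disable(void) { sketch_count += SKETCH_BH_DISABLE_OFFSET; }
static void sketch_local_bh_enable(void)  { sketch_count -= SKETCH_BH_DISABLE_OFFSET; }

/* ...while entering/leaving softirq processing moves it by one. */
static void sketch_softirq_enter(void) { sketch_count += SKETCH_SOFTIRQ_OFFSET; }
static void sketch_softirq_exit(void)  { sketch_count -= SKETCH_SOFTIRQ_OFFSET; }

static unsigned int sketch_softirq_count(void)
{
	return sketch_count & SKETCH_SOFTIRQ_MASK;
}

/* Non-zero while bh is disabled or a softirq is being served. */
static int sketch_in_softirq(void)
{
	return sketch_softirq_count() != 0;
}

/* The lowest bit of the field is set only while a softirq is being
 * served, because bh disabling always adds an even number of offsets. */
static int sketch_in_serving_softirq(void)
{
	return (sketch_softirq_count() & SKETCH_SOFTIRQ_OFFSET) != 0;
}
```

With this split, code that only disabled bh sees sketch_in_softirq() as true but sketch_in_serving_softirq() as false, which is the distinction the time accounting needs.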
  18. 14 Oct, 2010 2 commits
  19. 12 Oct, 2010 1 commit
  20. 10 Oct, 2010 1 commit
  21. 05 Oct, 2010 2 commits
  22. 30 Sep, 2010 1 commit
  23. 13 Sep, 2010 1 commit
  24. 02 Sep, 2010 2 commits
  25. 25 Aug, 2010 1 commit
  26. 24 Aug, 2010 2 commits
  27. 23 Aug, 2010 1 commit
  28. 22 Aug, 2010 1 commit
  29. 20 Aug, 2010 3 commits