1. 05 4月, 2013 1 次提交
    • J
      net: frag queue per hash bucket locking · 19952cc4
      Jesper Dangaard Brouer 提交于
      This patch implements per hash bucket locking for the frag queue
      hash.  This removes two write locks, and the only remaining write
      lock is for protecting hash rebuild.  This essentially reduce the
      readers-writer lock to a rebuild lock.
      
      This patch is part of "net: frag performance followup"
       http://thread.gmane.org/gmane.linux.network/263644
      of which two patches have already been accepted:
      
      Same test setup as previous:
       (http://thread.gmane.org/gmane.linux.network/257155)
       Two 10G interfaces, on seperate NUMA nodes, are under-test, and uses
       Ethernet flow-control.  A third interface is used for generating the
       DoS attack (with trafgen).
      
      Notice, I have changed the frag DoS generator script to be more
      efficient/deadly.  Before it would only hit one RX queue, now its
      sending packets causing multi-queue RX, due to "better" RX hashing.
      
      Test types summary (netperf UDP_STREAM):
       Test-20G64K     == 2x10G with 65K fragments
       Test-20G3F      == 2x10G with 3x fragments (3*1472 bytes)
       Test-20G64K+DoS == Same as 20G64K with frag DoS
       Test-20G3F+DoS  == Same as 20G3F  with frag DoS
       Test-20G64K+MQ  == Same as 20G64K with Multi-Queue frag DoS
       Test-20G3F+MQ   == Same as 20G3F  with Multi-Queue frag DoS
      
      When I rebased this-patch(03) (on top of net-next commit a210576c) and
      removed the _bh spinlock, I saw a performance regression.  BUT this
      was caused by some unrelated change in-between.  See tests below.
      
      Test (A) is what I reported before for patch-02, accepted in commit 1b5ab0de.
      Test (B) verifying-retest of commit 1b5ab0de corrospond to patch-02.
      Test (C) is what I reported before for this-patch
      
      Test (D) is net-next master HEAD (commit a210576c), which reveals some
      (unknown) performance regression (compared against test (B)).
      Test (D) function as a new base-test.
      
      Performance table summary (in Mbit/s):
      
      (#) Test-type:  20G64K    20G3F    20G64K+DoS  20G3F+DoS  20G64K+MQ 20G3F+MQ
          ----------  -------   -------  ----------  ---------  --------  -------
      (A) Patch-02  : 18848.7   13230.1   4103.04     5310.36     130.0    440.2
      (B) 1b5ab0de  : 18841.5   13156.8   4101.08     5314.57     129.0    424.2
      (C) Patch-03v1: 18838.0   13490.5   4405.11     6814.72     196.6    461.6
      
      (D) a210576c  : 18321.5   11250.4   3635.34     5160.13     119.1    405.2
      (E) with _bh  : 17247.3   11492.6   3994.74     6405.29     166.7    413.6
      (F) without bh: 17471.3   11298.7   3818.05     6102.11     165.7    406.3
      
      Test (E) and (F) is this-patch(03), with(V1) and without(V2) the _bh spinlocks.
      
      I cannot explain the slow down for 20G64K (but its an artificial
      "lab-test" so I'm not worried).  But the other results does show
      improvements.  And test (E) "with _bh" version is slightly better.
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: NEric Dumazet <edumazet@google.com>
      
      ----
      V2:
      - By analysis from Hannes Frederic Sowa and Eric Dumazet, we don't
        need the spinlock _bh versions, as Netfilter currently does a
        local_bh_disable() before entering inet_fragment.
      - Fold-in desc from cover-mail
      V3:
      - Drop the chain_len counter per hash bucket.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      19952cc4
  2. 28 3月, 2013 1 次提交
  3. 25 3月, 2013 1 次提交
  4. 19 3月, 2013 1 次提交
    • H
      inet: limit length of fragment queue hash table bucket lists · 5a3da1fe
      Hannes Frederic Sowa 提交于
      This patch introduces a constant limit of the fragment queue hash
      table bucket list lengths. Currently the limit 128 is choosen somewhat
      arbitrary and just ensures that we can fill up the fragment cache with
      empty packets up to the default ip_frag_high_thresh limits. It should
      just protect from list iteration eating considerable amounts of cpu.
      
      If we reach the maximum length in one hash bucket a warning is printed.
      This is implemented on the caller side of inet_frag_find to distinguish
      between the different users of inet_fragment.c.
      
      I dropped the out of memory warning in the ipv4 fragment lookup path,
      because we already get a warning by the slab allocator.
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Jesper Dangaard Brouer <jbrouer@redhat.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5a3da1fe
  5. 23 2月, 2013 1 次提交
  6. 30 1月, 2013 6 次提交
  7. 20 9月, 2012 1 次提交
  8. 27 8月, 2012 1 次提交
  9. 18 5月, 2012 1 次提交
  10. 01 7月, 2010 1 次提交
    • C
      fragment: add fast path for in-order fragments · d6bebca9
      Changli Gao 提交于
      add fast path for in-order fragments
      
      As the fragments are sent in order in most of OSes, such as Windows, Darwin and
      FreeBSD, it is likely the new fragments are at the end of the inet_frag_queue.
      In the fast path, we check if the skb at the end of the inet_frag_queue is the
      prev we expect.
      Signed-off-by: NChangli Gao <xiaosuo@gmail.com>
      ----
       include/net/inet_frag.h |    1 +
       net/ipv4/ip_fragment.c  |   12 ++++++++++++
       net/ipv6/reassembly.c   |   11 +++++++++++
       3 files changed, 24 insertions(+)
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d6bebca9
  11. 27 2月, 2009 1 次提交
  12. 29 3月, 2008 1 次提交
  13. 29 1月, 2008 8 次提交
  14. 18 10月, 2007 5 次提交
  15. 16 10月, 2007 8 次提交