1. 22 Jan 2014, 1 commit
    • reciprocal_divide: update/correction of the algorithm · 809fa972
      Committed by Hannes Frederic Sowa
      Jakub Zawadzki noticed that some divisions by
      reciprocal_divide() were not correct [1][2]. He could also
      demonstrate this with BPF code, where divisions are transformed
      via reciprocal_value() into a runtime-invariant multiplier that
      is later passed to reciprocal_divide(); reversing that
      transformation in a BPF dump ended up with a different,
      off-by-one K in some situations.
      
      This has been fixed by Eric Dumazet in commit aee636c4
      ("bpf: do not use reciprocal divide"). This follow-up patch
      improves reciprocal_value() and reciprocal_divide() to work in
      all cases by using the Granlund and Montgomery method, so that
      future use is also safe and free of non-obvious side-effects.
      Known problems with the old implementation were that division
      by 1 always returned 0, and that results were off by one when
      the dividend and divisor were very large. As far as we can
      tell, this was not problematic for its current users: Eric
      Dumazet verified the slab usage, but we cannot say so with
      certainty for flex_array. Still, in order to fix this, we
      propose an extension of the original implementation from
      commit 6a2d7a95 (cf. [3][4]), using the algorithm from
      "Division by Invariant Integers Using Multiplication" [5] by
      Torbjörn Granlund and Peter L. Montgomery. In pseudocode, for
      q = n/d where q, n and d are all in the u32 universe:
      
      1) Initialization:
      
        int l = ceil(log_2 d)
        uword m' = floor((1<<32)*((1<<l)-d)/d)+1
        int sh_1 = min(l,1)
        int sh_2 = max(l-1,0)
      
      2) For q = n/d, all uword:
      
        uword t = (n*m')>>32
        q = (t+((n-t)>>sh_1))>>sh_2
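
      To make the two steps concrete, below is a minimal user-space C
      sketch of the pseudocode above. The struct and function names
      mirror the kernel's reciprocal_value()/reciprocal_divide() API,
      but this is an illustration of the Granlund/Montgomery method
      under the stated u32 assumptions, not the patch itself:

        #include <stdint.h>
        #include <stdio.h>

        struct reciprocal_value {
                uint32_t m;             /* m' from step 1 */
                uint8_t sh1, sh2;       /* sh_1, sh_2 from step 1 */
        };

        /* Step 1: precompute m', sh_1 and sh_2 for a divisor d > 0. */
        static struct reciprocal_value reciprocal_value(uint32_t d)
        {
                struct reciprocal_value R;
                int l = 0;                      /* l = ceil(log_2 d) */

                while ((1ULL << l) < d)
                        l++;
                /* m' = floor(2^32 * (2^l - d) / d) + 1, done in 64 bits;
                 * the result is guaranteed to fit into 32 bits. */
                R.m = (uint32_t)((((1ULL << 32) * ((1ULL << l) - d)) / d) + 1);
                R.sh1 = l > 1 ? 1 : l;          /* min(l, 1) */
                R.sh2 = l > 0 ? l - 1 : 0;      /* max(l - 1, 0) */
                return R;
        }

        /* Step 2: q = n/d with one multiply and two shifts. */
        static uint32_t reciprocal_divide(uint32_t n, struct reciprocal_value R)
        {
                uint32_t t = (uint32_t)(((uint64_t)n * R.m) >> 32);

                return (t + ((n - t) >> R.sh1)) >> R.sh2;
        }

        int main(void)
        {
                /* Spot check against real division, including the old
                 * trouble cases: d = 1 and very large n and d. */
                uint32_t ds[] = { 1, 2, 3, 7, 10, 0x80000000u, 0xffffffffu };
                uint32_t ns[] = { 0, 1, 6, 7, 0xffffffffu };
                unsigned int i, j;

                for (i = 0; i < sizeof(ds) / sizeof(ds[0]); i++) {
                        struct reciprocal_value R = reciprocal_value(ds[i]);

                        for (j = 0; j < sizeof(ns) / sizeof(ns[0]); j++)
                                if (reciprocal_divide(ns[j], R) != ns[j] / ds[i])
                                        printf("mismatch: %u / %u\n", ns[j], ds[i]);
                }
                return 0;
        }

      Note that d = 1 now yields l = 0, m' = 1 and sh_1 = sh_2 = 0, so
      q = n as required, which is exactly the case the old code got wrong.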
      
      The assembler implementation from Agner Fog [6] also helped a
      lot while implementing. We have tested the implementation on
      x86_64, ppc64, i686 and s390x; on x86_64/Haswell it still has
      half the latency of a normal divide.
      
      Joint work with Daniel Borkmann.
      
        [1] http://www.wireshark.org/~darkjames/reciprocal-buggy.c
        [2] http://www.wireshark.org/~darkjames/set-and-dump-filter-k-bug.c
        [3] https://gmplib.org/~tege/division-paper.pdf
        [4] http://homepage.cs.uiowa.edu/~jones/bcd/divide.html
        [5] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1.2556
        [6] http://www.agner.org/optimize/asmlib.zip

      Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>
      Cc: linux-kernel@vger.kernel.org
      Cc: Jesse Gross <jesse@nicira.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Cc: Veaceslav Falico <vfalico@redhat.com>
      Cc: Jay Vosburgh <fubar@us.ibm.com>
      Cc: Jakub Zawadzki <darkjames-ws@darkjames.pl>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 08 Mar 2012, 1 commit
  3. 27 May 2011, 1 commit
  4. 29 Apr 2011, 5 commits
  5. 14 Jan 2011, 1 commit
  6. 10 Aug 2010, 1 commit
  7. 25 Apr 2010, 1 commit
  8. 22 Sep 2009, 5 commits
  9. 27 Aug 2009, 3 commits
  10. 05 Aug 2009, 1 commit
  11. 30 Jul 2009, 1 commit
    • lib: flexible array implementation · 534acc05
      Committed by Dave Hansen
      Once a structure goes over PAGE_SIZE*2, we see occasional
      allocation failures.  Some people have chosen to switch over to
      things like vmalloc() that will let them keep array-like access
      to such large structures.  But vmalloc() has plenty of downsides.
      
      Here's an alternative.  I think it's what Andrew was suggesting here:
      
      	http://lkml.org/lkml/2009/7/2/518
      
      I call it a flexible array.  It does all of its work in
      PAGE_SIZE chunks, so it never does an order>0 allocation.  The
      base level has PAGE_SIZE-2*sizeof(int) bytes of storage for
      pointers to the second level.  So, with a 32-bit arch, you get
      about 4MB (4183112 bytes) of total storage when the objects pack
      nicely into a page.  It is half that on 64-bit because the
      pointers are twice the size.  There's a table detailing this in
      the code.
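
      As a back-of-the-envelope check of that capacity math, here is a
      small user-space calculation; PAGE_SIZE = 4096 and 4-byte pointers
      on a 32-bit arch are assumptions, and the table in the code is the
      authoritative source since it also accounts for how elements of a
      given size pack into each page:

        #include <stdio.h>

        int main(void)
        {
                const long page_size = 4096;    /* assumed PAGE_SIZE */
                const long ptr_size = 4;        /* sizeof(void *), 32-bit */

                /* part pointers that fit in the base page after its
                 * two int members */
                long nr_parts = (page_size - 2 * (long)sizeof(int)) / ptr_size;

                /* each second-level page adds PAGE_SIZE bytes of storage */
                printf("%ld parts, ~%ld bytes total\n",
                       nr_parts, nr_parts * page_size);  /* 1022, ~4MB */
                return 0;
        }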
      
      There are kerneldocs for the functions, but here's an overview
      (a short usage sketch follows the list):
      
      flex_array_alloc()      - dynamically allocate a base structure
      flex_array_free()       - free the array and all of the
                                second-level pages
      flex_array_free_parts() - free the second-level pages, but not
                                the base (for static bases)
      flex_array_put()        - copy into the array at the given index
      flex_array_get()        - copy out of the array at the given index
      flex_array_prealloc()   - preallocate the second-level pages
                                between the given indexes to guarantee
                                no allocs will occur at put() time
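
      And here is a minimal, hypothetical kernel-style usage sketch
      built only from the overview above (error handling abbreviated;
      the exact signatures, including prealloc()'s start/end index
      arguments, are in the kerneldocs and may differ in later kernels):

        #include <linux/flex_array.h>
        #include <linux/gfp.h>
        #include <linux/kernel.h>

        struct sample { int a, b; };

        static int sample_use_flex_array(void)
        {
                struct flex_array *fa;
                struct sample s = { .a = 1, .b = 2 }, *p;
                int i, err;

                /* allocates the base structure only; no second-level
                 * pages exist yet */
                fa = flex_array_alloc(sizeof(struct sample), 128, GFP_KERNEL);
                if (!fa)
                        return -ENOMEM;

                /* preallocate pages for indexes 0..127 so the put()
                 * calls below cannot fail with -ENOMEM */
                err = flex_array_prealloc(fa, 0, 127, GFP_KERNEL);
                if (err) {
                        flex_array_free(fa);
                        return err;
                }

                for (i = 0; i < 128; i++)
                        flex_array_put(fa, i, &s, GFP_KERNEL); /* copies s in */

                p = flex_array_get(fa, 64);     /* points into the array */
                if (p)
                        pr_info("element 64: a=%d b=%d\n", p->a, p->b);

                flex_array_free(fa);    /* base plus second-level pages */
                return 0;
        }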
      
      We could also potentially just pass the "element_size" into each of the
      API functions instead of storing it internally.  That would get us one
      more base pointer on 32-bit.
      
      I've been testing this by running it in userspace.  The header and patch
      that I've been using are here, as well as the little script I'm using to
      generate the size table which goes in the kerneldocs.
      
      	http://sr71.net/~dave/linux/flexarray/
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>