1. 22 1月, 2014 1 次提交
    • H
      reciprocal_divide: update/correction of the algorithm · 809fa972
      Hannes Frederic Sowa 提交于
      Jakub Zawadzki noticed that some divisions by reciprocal_divide()
      were not correct [1][2], which he could also show with BPF code
      after divisions are transformed into reciprocal_value() for runtime
      invariance which can be passed to reciprocal_divide() later on;
      reverse in BPF dump ended up with a different, off-by-one K in
      some situations.
      
      This has been fixed by Eric Dumazet in commit aee636c4
      ("bpf: do not use reciprocal divide"). This follow-up patch
      improves reciprocal_value() and reciprocal_divide() to work in
      all cases by using Granlund and Montgomery method, so that also
      future use is safe and without any non-obvious side-effects.
      Known problems with the old implementation were that division by 1
      always returned 0 and some off-by-ones when the dividend and divisor
      where very large. This seemed to not be problematic with its
      current users, as far as we can tell. Eric Dumazet checked for
      the slab usage, we cannot surely say so in the case of flex_array.
      Still, in order to fix that, we propose an extension from the
      original implementation from commit 6a2d7a95 resp. [3][4],
      by using the algorithm proposed in "Division by Invariant Integers
      Using Multiplication" [5], Torbjörn Granlund and Peter L.
      Montgomery, that is, pseudocode for q = n/d where q, n, d is in
      u32 universe:
      
      1) Initialization:
      
        int l = ceil(log_2 d)
        uword m' = floor((1<<32)*((1<<l)-d)/d)+1
        int sh_1 = min(l,1)
        int sh_2 = max(l-1,0)
      
      2) For q = n/d, all uword:
      
        uword t = (n*m')>>32
        q = (t+((n-t)>>sh_1))>>sh_2
      
      The assembler implementation from Agner Fog [6] also helped a lot
      while implementing. We have tested the implementation on x86_64,
      ppc64, i686, s390x; on x86_64/haswell we're still half the latency
      compared to normal divide.
      
      Joint work with Daniel Borkmann.
      
        [1] http://www.wireshark.org/~darkjames/reciprocal-buggy.c
        [2] http://www.wireshark.org/~darkjames/set-and-dump-filter-k-bug.c
        [3] https://gmplib.org/~tege/division-paper.pdf
        [4] http://homepage.cs.uiowa.edu/~jones/bcd/divide.html
        [5] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1.2556
        [6] http://www.agner.org/optimize/asmlib.zipReported-by: NJakub Zawadzki <darkjames-ws@darkjames.pl>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>
      Cc: linux-kernel@vger.kernel.org
      Cc: Jesse Gross <jesse@nicira.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Cc: Veaceslav Falico <vfalico@redhat.com>
      Cc: Jay Vosburgh <fubar@us.ibm.com>
      Cc: Jakub Zawadzki <darkjames-ws@darkjames.pl>
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      809fa972
  2. 07 1月, 2014 1 次提交
  3. 04 1月, 2014 2 次提交
  4. 20 12月, 2013 5 次提交
  5. 18 12月, 2013 5 次提交
  6. 14 12月, 2013 9 次提交
  7. 07 12月, 2013 2 次提交
  8. 29 11月, 2013 1 次提交
  9. 16 11月, 2013 1 次提交
  10. 15 11月, 2013 1 次提交
  11. 14 11月, 2013 1 次提交
  12. 08 11月, 2013 1 次提交
  13. 20 10月, 2013 3 次提交
  14. 18 10月, 2013 1 次提交
  15. 04 10月, 2013 1 次提交
    • N
      bonding: modify the old and add new xmit hash policies · 32819dc1
      Nikolay Aleksandrov 提交于
      This patch adds two new hash policy modes which use skb_flow_dissect:
      3 - Encapsulated layer 2+3
      4 - Encapsulated layer 3+4
      There should be a good improvement for tunnel users in those modes.
      It also changes the old hash functions to:
      hash ^= (__force u32)flow.dst ^ (__force u32)flow.src;
      hash ^= (hash >> 16);
      hash ^= (hash >> 8);
      
      Where hash will be initialized either to L2 hash, that is
      SRCMAC[5] XOR DSTMAC[5], or to flow->ports which should be extracted
      from the upper layer. Flow's dst and src are also extracted based on the
      xmit policy either directly from the buffer or by using skb_flow_dissect,
      but in both cases if the protocol is IPv6 then dst and src are obtained by
      ipv6_addr_hash() on the real addresses. In case of a non-dissectable
      packet, the algorithms fall back to L2 hashing.
      The bond_set_mode_ops() function is now obsolete and thus deleted
      because it was used only to set the proper hash policy. Also we trim a
      pointer from struct bonding because we no longer need to keep the hash
      function, now there's only a single hash function - bond_xmit_hash that
      works based on bond->params.xmit_policy.
      
      The hash function and skb_flow_dissect were suggested by Eric Dumazet.
      The layer names were suggested by Andy Gospodarek, because I suck at
      semantics.
      Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NVeaceslav Falico <vfalico@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      32819dc1
  16. 27 9月, 2013 5 次提交
    • T
      sysfs: make attr namespace interface less convoluted · 58292cbe
      Tejun Heo 提交于
      sysfs ns (namespace) implementation became more convoluted than
      necessary while trying to hide ns information from visible interface.
      The relatively recent attr ns support is a good example.
      
      * attr ns tag is determined by sysfs_ops->namespace() callback while
        dir tag is determined by kobj_type->namespace().  The placement is
        arbitrary.
      
      * Instead of performing operations with explicit ns tag, the namespace
        callback is routed through sysfs_attr_ns(), sysfs_ops->namespace(),
        class_attr_namespace(), class_attr->namespace().  It's not simpler
        in any sense.  The only thing this convolution does is traversing
        the whole stack backwards.
      
      The namespace callbacks are unncessary because the operations involved
      are inherently synchronous.  The information can be provided in in
      straight-forward top-down direction and reversing that direction is
      unnecessary and against basic design principles.
      
      This backward interface is unnecessarily convoluted and hinders
      properly separating out sysfs from driver model / kobject for proper
      layering.  This patch updates attr ns support such that
      
      * sysfs_ops->namespace() and class_attr->namespace() are dropped.
      
      * sysfs_{create|remove}_file_ns(), which take explicit @ns param, are
        added and sysfs_{create|remove}_file() are now simple wrappers
        around the ns aware functions.
      
      * ns handling is dropped from sysfs_chmod_file().  Nobody uses it at
        this point.  sysfs_chmod_file_ns() can be added later if necessary.
      
      * Explicit @ns is propagated through class_{create|remove}_file_ns()
        and netdev_class_{create|remove}_file_ns().
      
      * driver/net/bonding which is currently the only user of attr
        namespace is updated to use netdev_class_{create|remove}_file_ns()
        with @bh->net as the ns tag instead of using the namespace callback.
      
      This patch should be an equivalent conversion without any functional
      difference.  It makes the code easier to follow, reduces lines of code
      a bit and helps proper separation and layering.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Kay Sievers <kay@vrfy.org>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      58292cbe
    • V
      net: create sysfs symlinks for neighbour devices · 5831d66e
      Veaceslav Falico 提交于
      Also, remove the same functionality from bonding - it will be already done
      for any device that links to its lower/upper neighbour.
      
      The links will be created for dev's kobject, and will look like
      lower_eth0 for lower device eth0 and upper_bridge0 for upper device
      bridge0.
      
      CC: Jay Vosburgh <fubar@us.ibm.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: Eric Dumazet <edumazet@google.com>
      CC: Jiri Pirko <jiri@resnulli.us>
      CC: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: NVeaceslav Falico <vfalico@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5831d66e
    • V
      net: expose the master link to sysfs, and remove it from bond · 842d67a7
      Veaceslav Falico 提交于
      Currently, we can have only one master upper neighbour, so it would be
      useful to create a symlink to it in the sysfs device directory, the way
      that bonding now does it, for every device. Lower devices from
      bridge/team/etc will automagically get it, so we could rely on it.
      
      Also, remove the same functionality from bonding.
      
      CC: Jay Vosburgh <fubar@us.ibm.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: Eric Dumazet <edumazet@google.com>
      CC: Jiri Pirko <jiri@resnulli.us>
      CC: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: NVeaceslav Falico <vfalico@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      842d67a7
    • V
      bonding: add bond_has_slaves() and use it · 0965a1f3
      Veaceslav Falico 提交于
      Currently we verify if we have slaves by checking if bond->slave_list is
      empty. Create a define bond_has_slaves() and use it, a bit more readable
      and easier to change in the future.
      
      CC: Jay Vosburgh <fubar@us.ibm.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: NVeaceslav Falico <vfalico@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0965a1f3
    • V
      bonding: make bond_for_each_slave() use lower neighbour's private · 9caff1e7
      Veaceslav Falico 提交于
      It needs a list_head *iter, so add it wherever needed. Use both non-rcu and
      rcu variants.
      
      CC: Jay Vosburgh <fubar@us.ibm.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: Dimitris Michailidis <dm@chelsio.com>
      Signed-off-by: NVeaceslav Falico <vfalico@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9caff1e7