1. 18 7月, 2011 1 次提交
  2. 10 6月, 2011 1 次提交
    • G
      rtnetlink: Compute and store minimum ifinfo dump size · c7ac8679
      Greg Rose 提交于
      The message size allocated for rtnl ifinfo dumps was limited to
      a single page.  This is not enough for additional interface info
      available with devices that support SR-IOV and caused a bug in
      which VF info would not be displayed if more than approximately
      40 VFs were created per interface.
      
      Implement a new function pointer for the rtnl_register service that will
      calculate the amount of data required for the ifinfo dump and allocate
      enough data to satisfy the request.
      Signed-off-by: NGreg Rose <gregory.v.rose@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      c7ac8679
  3. 03 5月, 2011 1 次提交
    • E
      net: dont hold rtnl mutex during netlink dump callbacks · e67f88dd
      Eric Dumazet 提交于
      Four years ago, Patrick made a change to hold rtnl mutex during netlink
      dump callbacks.
      
      I believe it was a wrong move. This slows down concurrent dumps, making
      good old /proc/net/ files faster than rtnetlink in some situations.
      
      This occurred to me because one "ip link show dev ..." was _very_ slow
      on a workload adding/removing network devices in background.
      
      All dump callbacks are able to use RCU locking now, so this patch does
      roughly a revert of commits :
      
      1c2d670f : [RTNETLINK]: Hold rtnl_mutex during netlink dump callbacks
      6313c1e0 : [RTNETLINK]: Remove unnecessary locking in dump callbacks
      
      This let writers fight for rtnl mutex and readers going full speed.
      
      It also takes care of phonet : phonet_route_get() is now called from rcu
      read section. I renamed it to phonet_route_get_rcu()
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Remi Denis-Courmont <remi.denis-courmont@nokia.com>
      Acked-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e67f88dd
  4. 23 4月, 2011 1 次提交
  5. 13 3月, 2011 1 次提交
  6. 17 10月, 2010 1 次提交
  7. 11 6月, 2010 1 次提交
  8. 21 4月, 2010 1 次提交
  9. 31 3月, 2010 1 次提交
    • Y
      ipv6 fib: Use "Sweezle" to optimize addr_bit_test(). · 02cdce53
      YOSHIFUJI Hideaki / 吉藤英明 提交于
      addr_bit_test() is used in various places in IPv6 routing table
      subsystem.  It checks if the given fn_bit is set,
      where fn_bit counts bits from MSB in words in network-order.
      
       fn_bit        :   0 .... 31 32 .... 64 65 .... 95 96 ....127
      
      fn_bit >> 5 gives offset of word, and (~fn_bit & 0x1f) gives
      count from LSB in the network-endian word in question.
      
       fn_bit >> 5   :       0          1          2          3
       ~fn_bit & 0x1f:  31 ....  0 31 ....  0 31 ....  0 31 ....  0
      
      Thus, the mask was generated as htonl(1 << (~fn_bit & 0x1f)).
      This can be optimized by "sweezle" (See include/asm-generic/bitops/le.h).
      
      In little-endian,
        htonl(1 << bit) = 1 << (bit ^ BITOP_BE32_SWIZZLE)
      where
        BITOP_BE32_SWIZZLE is (0x1f & ~7)
      So,
        htonl(1 << (~fn_bit & 0x1f)) = 1 << ((~fn_bit & 0x1f) ^ (0x1f & ~7))
                                     = 1 << ((~fn_bit ^ ~7) & 0x1f)
                                     = 1 << ((~fn_bit ^ BITOP_BE32_SWIZZLE) & 0x1f)
      
      In big-endian, BITOP_BE32_SWIZZLE is equal to 0.
        1 << ((~fn_bit ^ BITOP_BE32_SWIZZLE) & 0x1f)
                                     = 1 << ((~fn_bit) & 0x1f)
                                     = htonl(1 << (~fn_bit & 0x1f))
      Signed-off-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      02cdce53
  10. 30 3月, 2010 1 次提交
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo 提交于
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  11. 19 2月, 2010 1 次提交
  12. 13 2月, 2010 1 次提交
    • P
      ipv6: fib: fix crash when changing large fib while dumping it · 2bec5a36
      Patrick McHardy 提交于
      When the fib size exceeds what can be dumped in a single skb, the
      dump is suspended and resumed once the last skb has been received
      by userspace. When the fib is changed while the dump is suspended,
      the walker might contain stale pointers, causing a crash when the
      dump is resumed.
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
      IP: [<ffffffffa01bce04>] fib6_walk_continue+0xbb/0x124 [ipv6]
      PGD 5347a067 PUD 65c7067 PMD 0
      Oops: 0000 [#1] PREEMPT SMP
      ...
      RIP: 0010:[<ffffffffa01bce04>]
      [<ffffffffa01bce04>] fib6_walk_continue+0xbb/0x124 [ipv6]
      ...
      Call Trace:
       [<ffffffff8104aca3>] ? mutex_spin_on_owner+0x59/0x71
       [<ffffffffa01bd105>] inet6_dump_fib+0x11b/0x1b9 [ipv6]
       [<ffffffff81371af4>] netlink_dump+0x5b/0x19e
       [<ffffffff8134f288>] ? consume_skb+0x28/0x2a
       [<ffffffff81373b69>] netlink_recvmsg+0x1ab/0x2c6
       [<ffffffff81372781>] ? netlink_unicast+0xfa/0x151
       [<ffffffff813483e0>] __sock_recvmsg+0x6d/0x79
       [<ffffffff81348a53>] sock_recvmsg+0xca/0xe3
       [<ffffffff81066d4b>] ? autoremove_wake_function+0x0/0x38
       [<ffffffff811ed1f8>] ? radix_tree_lookup_slot+0xe/0x10
       [<ffffffff810b3ed7>] ? find_get_page+0x90/0xa5
       [<ffffffff810b5dc5>] ? filemap_fault+0x201/0x34f
       [<ffffffff810ef152>] ? fget_light+0x2f/0xac
       [<ffffffff813519e7>] ? verify_iovec+0x4f/0x94
       [<ffffffff81349a65>] sys_recvmsg+0x14d/0x223
      
      Store the serial number when beginning to walk the fib and reload
      pointers when continuing to walk after a change occured. Similar
      to other dumping functions, this might cause unrelated entries to
      be missed when entries are deleted.
      Tested-by: NBen Greear <greearb@candelatech.com>
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2bec5a36
  13. 18 1月, 2010 1 次提交
  14. 31 7月, 2009 1 次提交
    • N
      xfrm: select sane defaults for xfrm[4|6] gc_thresh · a33bc5c1
      Neil Horman 提交于
      Choose saner defaults for xfrm[4|6] gc_thresh values on init
      
      Currently, the xfrm[4|6] code has hard-coded initial gc_thresh values
      (set to 1024).  Given that the ipv4 and ipv6 routing caches are sized
      dynamically at boot time, the static selections can be non-sensical.
      This patch dynamically selects an appropriate gc threshold based on
      the corresponding main routing table size, using the assumption that
      we should in the worst case be able to handle as many connections as
      the routing table can.
      
      For ipv4, the maximum route cache size is 16 * the number of hash
      buckets in the route cache.  Given that xfrm4 starts garbage
      collection at the gc_thresh and prevents new allocations at 2 *
      gc_thresh, we set gc_thresh to half the maximum route cache size.
      
      For ipv6, its a bit trickier.  there is no maximum route cache size,
      but the ipv6 dst_ops gc_thresh is statically set to 1024.  It seems
      sane to select a simmilar gc_thresh for the xfrm6 code that is half
      the number of hash buckets in the v6 route cache times 16 (like the v4
      code does).
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a33bc5c1
  15. 14 1月, 2009 1 次提交
  16. 15 8月, 2008 1 次提交
  17. 26 7月, 2008 1 次提交
  18. 23 7月, 2008 5 次提交
  19. 22 7月, 2008 1 次提交
  20. 12 6月, 2008 1 次提交
  21. 22 4月, 2008 1 次提交
  22. 18 4月, 2008 1 次提交
  23. 26 3月, 2008 1 次提交
  24. 05 3月, 2008 2 次提交
  25. 04 3月, 2008 10 次提交
  26. 19 2月, 2008 1 次提交