1. 31 7月, 2012 12 次提交
  2. 20 7月, 2012 1 次提交
    • T
      debug: Do not permit CONFIG_DEBUG_STACK_USAGE=y on IA64 or PARISC · e9c31b32
      Tony Luck 提交于
      The stack_not_used() function in <linux/sched.h> assumes that stacks
      grow downwards. This is not true on IA64 or PARISC, so this function
      would walk off in the wrong direction and into the weeds.
      
      Found on IA64 because of a compilation failure with recursive dependencies
      on IA64_TASKSIZE and IA64_THREAD_INFO_SIZE.
      
      Fixing the code is possible, but should be combined with other
      infrastructure additions to set up the "canary" at the end of the stack.
      
      Reported-by: Fengguang Wu <fengguang.wu@intel.com> (failed allmodconfig build)
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      e9c31b32
  3. 06 7月, 2012 1 次提交
  4. 02 7月, 2012 1 次提交
  5. 30 6月, 2012 1 次提交
    • P
      netlink: add netlink_kernel_cfg parameter to netlink_kernel_create · a31f2d17
      Pablo Neira Ayuso 提交于
      This patch adds the following structure:
      
      struct netlink_kernel_cfg {
              unsigned int    groups;
              void            (*input)(struct sk_buff *skb);
              struct mutex    *cb_mutex;
      };
      
      That can be passed to netlink_kernel_create to set optional configurations
      for netlink kernel sockets.
      
      I've populated this structure by looking for NULL and zero parameters at the
      existing code. The remaining parameters that always need to be set are still
      left in the original interface.
      
      That includes optional parameters for the netlink socket creation. This allows
      easy extensibility of this interface in the future.
      
      This patch also adapts all callers to use this new interface.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a31f2d17
  6. 28 6月, 2012 1 次提交
  7. 21 6月, 2012 1 次提交
  8. 08 6月, 2012 2 次提交
  9. 06 6月, 2012 2 次提交
  10. 01 6月, 2012 5 次提交
    • D
      vsprintf: further optimize decimal conversion · 133fd9f5
      Denys Vlasenko 提交于
      Previous code was using optimizations which were developed to work well
      even on narrow-word CPUs (by today's standards).  But Linux runs only on
      32-bit and wider CPUs.  We can use that.
      
      First: using 32x32->64 multiply and trivial 32-bit shift, we can correctly
      divide by 10 much larger numbers, and thus we can print groups of 9 digits
      instead of groups of 5 digits.
      
      Next: there are two algorithms to print larger numbers.  One is generic:
      divide by 1000000000 and repeatedly print groups of (up to) 9 digits.
      It's conceptually simple, but requires an (unsigned long long) /
      1000000000 division.
      
      Second algorithm splits 64-bit unsigned long long into 16-bit chunks,
      manipulates them cleverly and generates groups of 4 decimal digits.  It so
      happens that it does NOT require long long division.
      
      If long is > 32 bits, division of 64-bit values is relatively easy, and we
      will use the first algorithm.  If long long is > 64 bits (strange
      architecture with VERY large long long), second algorithm can't be used,
      and we again use the first one.
      
      Else (if long is 32 bits and long long is 64 bits) we use second one.
      
      And third: there is a simple optimization which takes fast path not only
      for zero as was done before, but for all one-digit numbers.
      
      In all tested cases new code is faster than old one, in many cases by 30%,
      in few cases by more than 50% (for example, on x86-32, conversion of
      12345678).  Code growth is ~0 in 32-bit case and ~130 bytes in 64-bit
      case.
      
      This patch is based upon an original from Michal Nazarewicz.
      
      [akpm@linux-foundation.org: checkpatch fixes]
      Signed-off-by: NMichal Nazarewicz <mina86@mina86.com>
      Signed-off-by: NDenys Vlasenko <vda.linux@googlemail.com>
      Cc: Douglas W Jones <jones@cs.uiowa.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      133fd9f5
    • G
      vsprintf: correctly handle width when '#' flag used in %#p format · 725fe002
      Grant Likely 提交于
      The '%p' output of the kernel's vsprintf() uses spec.field_width to
      determine how many digits to output based on 2 * sizeof(void*) so that all
      digits of a pointer are shown.  ie.  a pointer will be output as
      "001A2B3C" instead of "1A2B3C".  However, if the '#' flag is used in the
      format (%#p), then the code doesn't take into account the width of the
      '0x' prefix and will end up outputing "0x1A2B3C" instead of "0x001A2B3C".
      
      This patch reworks the "pointer()" format hook to include 2 characters for
      the '0x' prefix if the '#' flag is included.
      
      [akpm@linux-foundation.org: checkpatch fixes]
      Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      725fe002
    • H
      bql: Avoid possible inconsistent calculation. · 914bec10
      Hiroaki SHIMODA 提交于
      dql->num_queued could change while processing dql_completed().
      To provide consistent calculation, added an on stack variable.
      Signed-off-by: NHiroaki SHIMODA <shimoda.hiroaki@gmail.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Denys Fedoryshchenko <denys@visp.net.lb>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      914bec10
    • H
      bql: Avoid unneeded limit decrement. · 25426b79
      Hiroaki SHIMODA 提交于
      When below pattern is observed,
      
                                                     TIME
             dql_queued()         dql_completed()     |
            a) initial state                          |
                                                      |
            b) X bytes queued                         V
      
            c) Y bytes queued
                                 d) X bytes completed
            e) Z bytes queued
                                 f) Y bytes completed
      
      a) dql->limit has already some value and there is no in-flight packet.
      b) X bytes queued.
      c) Y bytes queued and excess limit.
      d) X bytes completed and dql->prev_ovlimit is set and also
         dql->prev_num_queued is set Y.
      e) Z bytes queued.
      f) Y bytes completed. inprogress and prev_inprogress are true.
      
      At f), according to the comment, all_prev_completed becomes
      true and limit should be increased. But POSDIFF() ignores
      (completed == dql->prev_num_queued) case, so limit is decreased.
      Signed-off-by: NHiroaki SHIMODA <shimoda.hiroaki@gmail.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Denys Fedoryshchenko <denys@visp.net.lb>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      25426b79
    • H
      bql: Fix POSDIFF() to integer overflow aware. · 0cfd32b7
      Hiroaki SHIMODA 提交于
      POSDIFF() fails to take into account integer overflow case.
      Signed-off-by: NHiroaki SHIMODA <shimoda.hiroaki@gmail.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Denys Fedoryshchenko <denys@visp.net.lb>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0cfd32b7
  11. 30 5月, 2012 9 次提交
  12. 28 5月, 2012 2 次提交
  13. 27 5月, 2012 2 次提交
    • L
      lib: add generic strnlen_user() function · a08c5356
      Linus Torvalds 提交于
      This adds a new generic optimized strnlen_user() function that uses the
      <asm/word-at-a-time.h> infrastructure to portably do efficient string
      handling.
      
      In many ways, strnlen is much simpler than strncpy, and in particular we
      can always pre-align the words we load from memory.  That means that all
      the worries about alignment etc are a non-issue, so this one can easily
      be used on any architecture.  You obviously do have to do the
      appropriate word-at-a-time.h macros.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a08c5356
    • L
      word-at-a-time: make the interfaces truly generic · 36126f8f
      Linus Torvalds 提交于
      This changes the interfaces in <asm/word-at-a-time.h> to be a bit more
      complicated, but a lot more generic.
      
      In particular, it allows us to really do the operations efficiently on
      both little-endian and big-endian machines, pretty much regardless of
      machine details.  For example, if you can rely on a fast population
      count instruction on your architecture, this will allow you to make your
      optimized <asm/word-at-a-time.h> file with that.
      
      NOTE! The "generic" version in include/asm-generic/word-at-a-time.h is
      not truly generic, it actually only works on big-endian.  Why? Because
      on little-endian the generic algorithms are wasteful, since you can
      inevitably do better. The x86 implementation is an example of that.
      
      (The only truly non-generic part of the asm-generic implementation is
      the "find_zero()" function, and you could make a little-endian version
      of it.  And if the Kbuild infrastructure allowed us to pick a particular
      header file, that would be lovely)
      
      The <asm/word-at-a-time.h> functions are as follows:
      
       - WORD_AT_A_TIME_CONSTANTS: specific constants that the algorithm
         uses.
      
       - has_zero(): take a word, and determine if it has a zero byte in it.
         It gets the word, the pointer to the constant pool, and a pointer to
         an intermediate "data" field it can set.
      
         This is the "quick-and-dirty" zero tester: it's what is run inside
         the hot loops.
      
       - "prep_zero_mask()": take the word, the data that has_zero() produced,
         and the constant pool, and generate an *exact* mask of which byte had
         the first zero.  This is run directly *outside* the loop, and allows
         the "has_zero()" function to answer the "is there a zero byte"
         question without necessarily getting exactly *which* byte is the
         first one to contain a zero.
      
         If you do multiple byte lookups concurrently (eg "hash_name()", which
         looks for both NUL and '/' bytes), after you've done the prep_zero_mask()
         phase, the result of those can be or'ed together to get the "either
         or" case.
      
       - The result from "prep_zero_mask()" can then be fed into "find_zero()"
         (to find the byte offset of the first byte that was zero) or into
         "zero_bytemask()" (to find the bytemask of the bytes preceding the
         zero byte).
      
         The existence of zero_bytemask() is optional, and is not necessary
         for the normal string routines.  But dentry name hashing needs it, so
         if you enable DENTRY_WORD_AT_A_TIME you need to expose it.
      
      This changes the generic strncpy_from_user() function and the dentry
      hashing functions to use these modified word-at-a-time interfaces.  This
      gets us back to the optimized state of the x86 strncpy that we lost in
      the previous commit when moving over to the generic version.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      36126f8f