1. 01 Jun, 2012 5 commits
    • vsprintf: further optimize decimal conversion · 133fd9f5
      Denys Vlasenko committed
      Previous code was using optimizations which were developed to work well
      even on narrow-word CPUs (by today's standards).  But Linux runs only on
      32-bit and wider CPUs.  We can use that.
      
      First: using 32x32->64 multiply and trivial 32-bit shift, we can correctly
      divide by 10 much larger numbers, and thus we can print groups of 9 digits
      instead of groups of 5 digits.
      
      Next: there are two algorithms to print larger numbers.  One is generic:
      divide by 1000000000 and repeatedly print groups of (up to) 9 digits.
      It's conceptually simple, but requires an (unsigned long long) /
      1000000000 division.
      
      Second algorithm splits 64-bit unsigned long long into 16-bit chunks,
      manipulates them cleverly and generates groups of 4 decimal digits.  It so
      happens that it does NOT require long long division.
      
      If long is > 32 bits, division of 64-bit values is relatively easy, and we
      will use the first algorithm.  If long long is > 64 bits (strange
      architecture with VERY large long long), second algorithm can't be used,
      and we again use the first one.
      
      Else (if long is 32 bits and long long is 64 bits) we use second one.
      
      And third: there is a simple optimization which takes fast path not only
      for zero as was done before, but for all one-digit numbers.
      
      In all tested cases new code is faster than old one, in many cases by 30%,
      in few cases by more than 50% (for example, on x86-32, conversion of
      12345678).  Code growth is ~0 in 32-bit case and ~130 bytes in 64-bit
      case.
      
      This patch is based upon an original from Michal Nazarewicz.
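
The generic algorithm described above (split off groups of 9 digits, then divide each group by 10 with a 32x32->64 multiply and a shift) can be sketched in user-space C. This is only an illustration, not the patch itself: the function names are hypothetical, and the multiplier 0xCCCCCCCD with a 35-bit shift is a well-known exact reciprocal for all 32-bit inputs, not necessarily the constant the patch uses.

```c
#include <stddef.h>
#include <stdint.h>

/* n / 10 for any 32-bit n, using one 32x32->64 multiply and a shift.
 * 0xCCCCCCCD is ceil(2^35 / 10); an illustrative constant (assumption). */
static uint32_t div10_u32(uint32_t n)
{
	return (uint32_t)(((uint64_t)n * 0xCCCCCCCDu) >> 35);
}

/* Emit the decimal digits of n into buf, least significant digit first.
 * Returns the number of digits written (at most 20 for a 64-bit value). */
static size_t put_dec_reversed(char *buf, uint64_t n)
{
	size_t len = 0;

	/* Fast path for all one-digit numbers, not only zero. */
	if (n < 10) {
		buf[len++] = (char)('0' + n);
		return len;
	}

	/* Generic algorithm: peel off full groups of 9 digits. */
	while (n >= 1000000000u) {
		uint32_t chunk = (uint32_t)(n % 1000000000u);

		n /= 1000000000u;
		for (int i = 0; i < 9; i++) {
			uint32_t q = div10_u32(chunk);

			buf[len++] = (char)('0' + (chunk - q * 10));
			chunk = q;
		}
	}

	/* Most significant group: no leading zeros. */
	uint32_t rest = (uint32_t)n;

	while (rest != 0) {
		uint32_t q = div10_u32(rest);

		buf[len++] = (char)('0' + (rest - q * 10));
		rest = q;
	}
	return len;
}

/* Convenience wrapper: out must hold at least 21 bytes. */
static void u64_to_dec(char *out, uint64_t n)
{
	char tmp[20];
	size_t len = put_dec_reversed(tmp, n);

	for (size_t i = 0; i < len; i++)
		out[i] = tmp[len - 1 - i];
	out[len] = '\0';
}
```

The sketch still performs a 64-bit division by 1000000000 per group, which is the cost the second (16-bit chunk) algorithm avoids on 32-bit machines.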
      
      [akpm@linux-foundation.org: checkpatch fixes]
      Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
      Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
      Cc: Douglas W Jones <jones@cs.uiowa.edu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • vsprintf: correctly handle width when '#' flag used in %#p format · 725fe002
      Grant Likely committed
      The '%p' output of the kernel's vsprintf() uses spec.field_width to
      determine how many digits to output based on 2 * sizeof(void*) so that all
      digits of a pointer are shown, i.e. a pointer will be output as
      "001A2B3C" instead of "1A2B3C".  However, if the '#' flag is used in the
      format (%#p), then the code doesn't take into account the width of the
      '0x' prefix and will end up outputting "0x1A2B3C" instead of "0x001A2B3C".
      
      This patch reworks the "pointer()" format hook to include 2 characters for
      the '0x' prefix if the '#' flag is included.
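
The width rule can be illustrated with standard snprintf(), whose field width also counts the "0x" prefix. This is a user-space sketch with hypothetical helper names, not the kernel's pointer() code, and it prints lowercase hex digits where the message above uses uppercase.

```c
#include <stddef.h>
#include <stdio.h>

/* Field width needed to show every digit of a pointer of ptr_size bytes,
 * plus 2 characters for "0x" when the '#' flag is present. */
static int pointer_field_width(size_t ptr_size, int has_hash_flag)
{
	int width = (int)(2 * ptr_size);

	if (has_hash_flag)
		width += 2;
	return width;
}

/* Format value as a zero-padded pointer of ptr_size bytes. */
static int format_pointer_value(char *buf, size_t size, unsigned long value,
				size_t ptr_size, int has_hash_flag)
{
	int width = pointer_field_width(ptr_size, has_hash_flag);

	if (has_hash_flag)
		return snprintf(buf, size, "%#0*lx", width, value);
	return snprintf(buf, size, "%0*lx", width, value);
}
```

With a 4-byte pointer and the value 0x1a2b3c, the width is 8 without '#' and 10 with it, so the zero padding is preserved in both cases.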
      
      [akpm@linux-foundation.org: checkpatch fixes]
      Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • bql: Avoid possible inconsistent calculation. · 914bec10
      Hiroaki SHIMODA committed
      dql->num_queued could change while processing dql_completed().
      To provide consistent calculation, added an on stack variable.
      Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Denys Fedoryshchenko <denys@visp.net.lb>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bql: Avoid unneeded limit decrement. · 25426b79
      Hiroaki SHIMODA committed
      Consider the pattern below:
      
                                                     TIME
             dql_queued()         dql_completed()     |
            a) initial state                          |
                                                      |
            b) X bytes queued                         V
      
            c) Y bytes queued
                                 d) X bytes completed
            e) Z bytes queued
                                 f) Y bytes completed
      
      a) dql->limit already has some value and there are no in-flight packets.
      b) X bytes queued.
      c) Y bytes queued, exceeding the limit.
      d) X bytes completed; dql->prev_ovlimit is set and
         dql->prev_num_queued is set to Y.
      e) Z bytes queued.
      f) Y bytes completed. inprogress and prev_inprogress are true.
      
      At f), according to the comment, all_prev_completed becomes
      true and limit should be increased. But POSDIFF() ignores
      (completed == dql->prev_num_queued) case, so limit is decreased.
      Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Denys Fedoryshchenko <denys@visp.net.lb>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bql: Fix POSDIFF() to integer overflow aware. · 0cfd32b7
      Hiroaki SHIMODA committed
      POSDIFF() fails to take the integer overflow case into account.
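
A minimal sketch of the problem, assuming 32-bit unsigned counters that are allowed to wrap. The two functions are hypothetical stand-ins for the macro before and after such a fix, not the kernel's definitions.

```c
#include <stdint.h>

/* Plain comparison: wrong once a has wrapped past zero but b has not. */
static uint32_t posdiff_naive(uint32_t a, uint32_t b)
{
	return a > b ? a - b : 0;
}

/* Wrap-aware: take the modular difference and treat it as positive only
 * if it lies in (0, INT32_MAX], i.e. a is "ahead of" b by less than 2^31. */
static uint32_t posdiff_wrap_aware(uint32_t a, uint32_t b)
{
	uint32_t diff = a - b;	/* well-defined modulo 2^32 */

	return (diff != 0 && diff <= (uint32_t)INT32_MAX) ? diff : 0;
}
```

For a = 5 (already wrapped) and b = 0xFFFFFFF0, the naive version reports 0 while the wrap-aware version reports the real distance of 21.
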
      Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Denys Fedoryshchenko <denys@visp.net.lb>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 30 May, 2012 9 commits
  3. 28 May, 2012 1 commit
  4. 27 May, 2012 2 commits
    • lib: add generic strnlen_user() function · a08c5356
      Linus Torvalds committed
      This adds a new generic optimized strnlen_user() function that uses the
      <asm/word-at-a-time.h> infrastructure to portably do efficient string
      handling.
      
      In many ways, strnlen is much simpler than strncpy, and in particular we
      can always pre-align the words we load from memory.  That means that all
      the worries about alignment etc are a non-issue, so this one can easily
      be used on any architecture.  You obviously do have to do the
      appropriate word-at-a-time.h macros.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • word-at-a-time: make the interfaces truly generic · 36126f8f
      Linus Torvalds committed
      This changes the interfaces in <asm/word-at-a-time.h> to be a bit more
      complicated, but a lot more generic.
      
      In particular, it allows us to really do the operations efficiently on
      both little-endian and big-endian machines, pretty much regardless of
      machine details.  For example, if you can rely on a fast population
      count instruction on your architecture, this will allow you to make your
      optimized <asm/word-at-a-time.h> file with that.
      
      NOTE! The "generic" version in include/asm-generic/word-at-a-time.h is
      not truly generic, it actually only works on big-endian.  Why? Because
      on little-endian the generic algorithms are wasteful, since you can
      inevitably do better. The x86 implementation is an example of that.
      
      (The only truly non-generic part of the asm-generic implementation is
      the "find_zero()" function, and you could make a little-endian version
      of it.  And if the Kbuild infrastructure allowed us to pick a particular
      header file, that would be lovely)
      
      The <asm/word-at-a-time.h> functions are as follows:
      
       - WORD_AT_A_TIME_CONSTANTS: specific constants that the algorithm
         uses.
      
       - has_zero(): take a word, and determine if it has a zero byte in it.
         It gets the word, the pointer to the constant pool, and a pointer to
         an intermediate "data" field it can set.
      
         This is the "quick-and-dirty" zero tester: it's what is run inside
         the hot loops.
      
       - "prep_zero_mask()": take the word, the data that has_zero() produced,
         and the constant pool, and generate an *exact* mask of which byte had
         the first zero.  This is run directly *outside* the loop, and allows
         the "has_zero()" function to answer the "is there a zero byte"
         question without necessarily getting exactly *which* byte is the
         first one to contain a zero.
      
         If you do multiple byte lookups concurrently (eg "hash_name()", which
         looks for both NUL and '/' bytes), after you've done the prep_zero_mask()
         phase, the result of those can be or'ed together to get the "either
         or" case.
      
       - The result from "prep_zero_mask()" can then be fed into "find_zero()"
         (to find the byte offset of the first byte that was zero) or into
         "zero_bytemask()" (to find the bytemask of the bytes preceding the
         zero byte).
      
         The existence of zero_bytemask() is optional, and is not necessary
         for the normal string routines.  But dentry name hashing needs it, so
         if you enable DENTRY_WORD_AT_A_TIME you need to expose it.
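
The division of labour between these helpers can be sketched on plain 64-bit values, treating byte 0 as the least significant byte (a little-endian view). The names carry an `_le` suffix because they are hypothetical: the real <asm/word-at-a-time.h> functions take a constant pool and an intermediate data pointer, which this sketch leaves out.

```c
#include <stdint.h>

#define WAAT_ONES  UINT64_C(0x0101010101010101)
#define WAAT_HIGHS UINT64_C(0x8080808080808080)

/* Quick test for the hot loop: nonzero iff some byte of word is zero.
 * Bits above the first zero byte may be false positives caused by borrow. */
static uint64_t has_zero_le(uint64_t word)
{
	return (word - WAAT_ONES) & ~word & WAAT_HIGHS;
}

/* Run once outside the loop: keep only the lowest flagged byte, which is
 * exactly the first zero byte. */
static uint64_t prep_zero_mask_le(uint64_t mask)
{
	return mask & (~mask + 1);
}

/* Byte offset of the first zero byte; returns 8 if exact is 0. */
static unsigned int find_zero_le(uint64_t exact)
{
	unsigned int idx;

	for (idx = 0; idx < 8; idx++) {
		if (exact & 0x80)
			break;
		exact >>= 8;
	}
	return idx;
}

/* Mask covering the bytes that precede the first zero byte.
 * exact must be a nonzero result of prep_zero_mask_le(). */
static uint64_t zero_bytemask_le(uint64_t exact)
{
	return (exact >> 7) - 1;
}
```

For the word 0x6261 (bytes 'a', 'b', then zeros), has_zero_le() is nonzero, find_zero_le() yields offset 2, and zero_bytemask_le() yields 0xFFFF, the two bytes before the terminator.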
      
      This changes the generic strncpy_from_user() function and the dentry
      hashing functions to use these modified word-at-a-time interfaces.  This
      gets us back to the optimized state of the x86 strncpy that we lost in
      the previous commit when moving over to the generic version.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 25 May, 2012 1 commit
  6. 22 May, 2012 4 commits
  7. 17 May, 2012 1 commit
  8. 10 May, 2012 1 commit
  9. 08 May, 2012 2 commits
  10. 05 May, 2012 1 commit
  11. 02 May, 2012 1 commit
  12. 01 May, 2012 9 commits
    • dynamic_debug: init with early_initcall, not arch_initcall · 3ec5652a
      Jim Cromie committed
      1- Call dynamic_debug_init() from early_initcall, not arch_initcall.
      2- Call dynamic_debug_init_debugfs() from fs_initcall, not module_init.
      
      RFC: This works for me on a 64 bit desktop and an i586 SBC, but is
      untested on other arches.  I presume there is or was a reason
      original code used arch_initcall, maybe the constraints have changed.
      
      This makes facility available as soon as possible.
      
      2nd change has a downside when dynamic_debug.verbose=1; all the
      vpr_info()s called in the proc-fs code are activated, causing
      voluminous output from dmesg.  TBD: I'm unsure of this explanation, but
      the output is there.  This could be fixed by changing those callsites
      to v2pr_info(if verbose > 1).
      
      1st change is still not early enough to enable pr_debugs in
      kernel/params, so parsing of boot-args isn't logged.  The reparse of
      those args is however visible after params.dyndbg="+p" is processed.
      Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
      Acked-by: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dynamic_debug: update Documentation/*, Kconfig.debug · 29e36c9f
      Jim Cromie committed
      In dynamic-debug-howto.txt:
      
      - add section: Debug Messages at Module Initialization Time
      - update flags indicators in example outputs to include '='
      - make flags descriptions tabular
      - add item on '_' flag-char
      - add dyndbg, boot-args examples
      - rewrap some paragraphs with long lines
      
      In Kconfig.debug, note that compiling with -DDEBUG enables all
      pr_debug()s in that code.
      
      In kernel-parameters.txt, add dyndbg and module.dyndbg items,
      and deprecate ddebug_query.
      Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
      Acked-by: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dynamic_debug: add modname arg to exec_query callchain · 8e59b5cf
      Jim Cromie committed
      Pass module name into ddebug_exec_queries(), ddebug_exec_query(), and
      ddebug_parse_query() as separate parameter.  In ddebug_parse_query(),
      the module name is added into the query struct before the query-string
      is parsed.  This allows the query-string to be shorter:
      
      instead of:
         $modname.dyndbg="module $modname +fp"
      do this:
         $modname.dyndbg="+fp"
      
      Omitting "module $modname" from the query string is actually required
      for $modname.dyndbg rules; the set-only-once check added in a previous
      patch will throw an error if it's added again.  ddebug_query="..." has
      no $modname associated with it, so the query string may include it.
      
      This also fixes redundant "module $modname" otherwise needed to handle
      multiple queries per string:
      
         $modname.dyndbg="func foo +fp; func bar +fp"
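
The interaction between the pre-filled module name and the set-only-once check can be sketched as follows. The struct, the helper names and the pre-tokenised keyword/value input are all assumptions made for illustration; the real ddebug_parse_query() works on a different structure and token stream.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced query: only the two fields the example needs. */
struct query_sketch {
	const char *module;
	const char *function;
};

/* Refuse to overwrite a field that has already been set. */
static int set_once(const char **dest, const char *value)
{
	if (*dest)
		return -EINVAL;
	*dest = value;
	return 0;
}

/* words holds keyword/value pairs, e.g. { "func", "foo" }.
 * modname is NULL for a bare ddebug_query/dyndbg string. */
static int parse_query_sketch(const char *const words[], size_t nwords,
			      const char *modname, struct query_sketch *q)
{
	size_t i;

	/* The module name goes in before the query string is parsed. */
	q->module = modname;
	q->function = NULL;

	if (nwords % 2)
		return -EINVAL;

	for (i = 0; i < nwords; i += 2) {
		int rc;

		if (!strcmp(words[i], "module"))
			rc = set_once(&q->module, words[i + 1]);
		else if (!strcmp(words[i], "func"))
			rc = set_once(&q->function, words[i + 1]);
		else
			rc = -EINVAL;
		if (rc)
			return rc;
	}
	return 0;
}
```

With modname set to "pci", the pair list { "func", "foo" } parses cleanly, while { "module", "pci", "func", "foo" } is rejected because the module field is already filled.
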
      Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
      Acked-by: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dynamic_debug: print ram usage by ddebug tables if verbose · 41076927
      Jim Cromie committed
      Print ram usage of dynamic-debug tables and verbose section so user
      knows cost of enabling CONFIG_DYNAMIC_DEBUG.  This only counts the
      size of the _ddebug tables for builtins and the __verbose section that
      they refer to, not those used in loadable modules.
      Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
      Acked-by: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dynamic_debug: simplify dynamic_debug_init error exit · af442399
      Jim Cromie committed
      We don't want errors while parsing ddebug_query to unload ddebug
      tables, so set success after tables are loaded, and return 0 after
      query parsing is done.
      
      Simplify error handling code since it's no longer used for success,
      and change goto label to out_err to clarify this.
      Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
      Acked-by: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dynamic_debug: combine parse_args callbacks together · 6ab676e9
      Jim Cromie committed
      Refactor ddebug_dyndbg_boot_param_cb and ddebug_dyndbg_module_param_cb
      into a common helper function, and call it from both.  The handling of
      foo.dyndbg is unneeded by the latter, but harmless.
      
      The 2 callers differ only by pr_info and the return code they pass to
      the helper for when an unknown param is handled.  I could slightly
      reduce dmesg clutter by putting the vpr_info in the common helper,
      after the return on_err, but that loses __func__ context, is overly
      silent on module_cb unknown param errors, and the clutter is only when
      dynamic_debug.verbose=1 anyway.
      Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
      Acked-by: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dynamic_debug: deprecate ddebug_query, suggest dyndbg instead · f0b919d9
      Jim Cromie committed
      With ddebug_dyndbg_boot_params_cb() handling bare dyndbg params, we
      don't need ddebug_query param anymore.  Add a warning when processing
      ddebug_query= param that it is deprecated, and to change it to dyndbg=
      
      Add a deprecation notice for v3.8 to feature-removal-schedule.txt, and
      add a suggested deprecation period of 3 releases to the header.
      Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
      Acked-by: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dynamic_debug: make dynamic-debug work for module initialization · b48420c1
      Jim Cromie committed
      This introduces a fake module param $module.dyndbg.  It's based upon
      Thomas Renninger's $module.ddebug boot-time debugging patch from
      https://lkml.org/lkml/2010/9/15/397
      
      The 'fake' module parameter is provided for all modules, whether or
      not they need it.  It is not explicitly added to each module, but is
      implemented in callbacks invoked from parse_args.
      
      For builtin modules, dynamic_debug_init() now directly calls
      parse_args(..., &ddebug_dyndbg_boot_params_cb), to process the params
      undeclared in the modules, just after the ddebug tables are processed.
      
      While it's slightly weird to reprocess the boot params, parse_args() is
      already called repeatedly by do_initcall_levels().  More importantly,
      the dyndbg queries (given in ddebug_query or dyndbg params) cannot be
      activated until after the ddebug tables are ready, and reusing
      parse_args is cleaner than doing an ad-hoc parse.  This reparse would
      break options like inc_verbosity, but they probably should be params,
      like verbosity=3.
      
      ddebug_dyndbg_boot_params_cb() handles both bare dyndbg (aka:
      ddebug_query) and module-prefixed dyndbg params, and ignores all other
      parameters.  For example, the following will enable pr_debug()s in 4
      builtin modules, in the order given:
      
        dyndbg="module params +p; module aio +p" module.dyndbg=+p pci.dyndbg
      
      For loadable modules, parse_args() in load_module() calls
      ddebug_dyndbg_module_params_cb().  This handles bare dyndbg params as
      passed from modprobe, and errors on other unknown params.
      
      Note that modprobe reads /proc/cmdline, so "modprobe foo" grabs all
      foo.params, strips the "foo.", and passes these to the kernel.
      ddebug_dyndbg_module_params_cb() is again called for the unknown
      params; it handles dyndbg, and errors on others.  The "doing" arg
      added previously contains the module name.
      
      For non CONFIG_DYNAMIC_DEBUG builds, the stub function accepts
      and ignores $module.dyndbg params, other unknowns get -ENOENT.
      
      If no param value is given (as in pci.dyndbg example above), "+p" is
      assumed, which enables all pr_debug callsites in the module.
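
The parameter handling described above (bare dyndbg, module-prefixed dyndbg, "+p" as the default value, -ENOENT for other unknowns) can be sketched as a small classifier. The function name and its output parameters are hypothetical and much simpler than the real parse_args() callbacks.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Classify one "param[=val]" pair.
 * On success returns 0, copies the module prefix into module ("" for a
 * bare dyndbg) and points *query at the query text to execute.
 * Any other unknown parameter yields -ENOENT. */
static int dyndbg_param_sketch(const char *param, const char *val,
			       char *module, size_t module_size,
			       const char **query)
{
	const char *dot = strchr(param, '.');
	const char *name = dot ? dot + 1 : param;
	size_t modlen = dot ? (size_t)(dot - param) : 0;

	if (strcmp(name, "dyndbg") != 0)
		return -ENOENT;
	if (modlen >= module_size)
		return -ENOENT;

	memcpy(module, param, modlen);
	module[modlen] = '\0';

	/* No value given (as in "pci.dyndbg"): assume "+p". */
	*query = val ? val : "+p";
	return 0;
}
```

Fed the boot line from the example, "pci.dyndbg" with no value resolves to module "pci" and query "+p", while a bare dyndbg="module aio +p" keeps an empty module prefix and passes the query through unchanged.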
      
      The dyndbg fake parameter is not shown in /sys/module/*/parameters,
      thus it does not use any resources.  Changes to it are made via the
      control file.
      
      Also change pr_info in ddebug_exec_queries to vpr_info,
      no need to see it all the time.
      Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
      CC: Thomas Renninger <trenn@suse.de>
      CC: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dynamic_debug: replace if (verbose) pr_info with macro vpr_info · b8ccd5de
      Jim Cromie committed
      Use vpr_info to declutter code, reduce indenting, and change one
      additional pr_info call in ddebug_exec_queries.
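
The shape of such a macro can be sketched in user space. Here pr_info_stub() and log_buf are stand-ins invented for the example (the kernel's pr_info() is not available outside the kernel), and the macro body is an assumption about the general pattern rather than a copy of the patch.

```c
#include <stdarg.h>
#include <stdio.h>

static int verbose;		/* stands in for the module's verbose flag */
static char log_buf[128];	/* captures the last message for inspection */

/* User-space stand-in for pr_info(): formats into log_buf. */
static void pr_info_stub(const char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	vsnprintf(log_buf, sizeof(log_buf), fmt, args);
	va_end(args);
}

/* Replaces the repeated "if (verbose) pr_info(...)" at each call site. */
#define vpr_info(...)					\
	do {						\
		if (verbose)				\
			pr_info_stub(__VA_ARGS__);	\
	} while (0)
```

Wrapping the body in do { } while (0) keeps the macro safe inside an unbraced if/else at the call site.
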
      Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
      Acked-by: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  13. 25 Apr, 2012 1 commit
  14. 24 Apr, 2012 1 commit
  15. 21 Apr, 2012 1 commit
    • lib: add support for stmp-style devices · 4ccf4bea
      Wolfram Sang committed
      MX23/28 use IP cores which follow a register layout I have first seen on
      STMP3xxx SoCs. In this layout, every register actually has four u32:
      
       1.) to store a value directly
       2.) a SET register where every 1-bit sets the corresponding bit,
           others are unaffected
       3.) same with a CLR register
       4.) same with a TOG (toggle) register
      
      Also, the 2 MSBs in register 0 are always the same and can be used to reset
      the IP core.
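
The four-word layout can be modelled in plain C to show what each write does. This is a software simulation with hypothetical names: on real hardware the four words are memory-mapped at consecutive u32 offsets and the SET/CLR/TOG behaviour is implemented by the IP core, not by driver code.

```c
#include <stdint.h>

/* Word index within one stmp-style register, in the order listed above.
 * As consecutive u32 words these sit at byte offsets 0x0, 0x4, 0x8, 0xc. */
enum stmp_word_sketch {
	STMP_SKETCH_VAL = 0,	/* store a value directly */
	STMP_SKETCH_SET = 1,	/* every 1-bit sets the corresponding bit */
	STMP_SKETCH_CLR = 2,	/* every 1-bit clears the corresponding bit */
	STMP_SKETCH_TOG = 3	/* every 1-bit toggles the corresponding bit */
};

/* Simulated register: only the resulting value is stored. */
struct stmp_reg_sketch {
	uint32_t value;
};

/* Model of what the hardware does when one of the four words is written. */
static void stmp_write_sketch(struct stmp_reg_sketch *reg,
			      enum stmp_word_sketch word, uint32_t bits)
{
	switch (word) {
	case STMP_SKETCH_VAL:
		reg->value = bits;
		break;
	case STMP_SKETCH_SET:
		reg->value |= bits;
		break;
	case STMP_SKETCH_CLR:
		reg->value &= ~bits;
		break;
	case STMP_SKETCH_TOG:
		reg->value ^= bits;
		break;
	}
}
```

The practical benefit is that a driver can change individual bits with a single write, without a read-modify-write sequence on the value word.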
      
      All this is strictly speaking not mach-specific (but IP core specific) and,
      thus, doesn't need to be in mach-mxs/include. At least mx6 also uses IP cores
      following this stmp-style. So:
      
      Introduce a stmp-style device, put the code and defines for that in a public
      place (lib/), and let drivers for stmp-style devices select that code.
      To avoid regressions and ease reviewing, the actual code is simply copied from
      mach-mxs. It definitely wants updates, but those need a separate patch series.
      
      Voila, mach dependency gone, reusable code introduced. Note that I didn't
      remove the duplicated code from mach-mxs yet, first the drivers have to be
      converted.
      Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
      Acked-by: Shawn Guo <shawn.guo@linaro.org>
      Acked-by: Dong Aisheng <dong.aisheng@linaro.org>