1. 10 Feb, 2008 1 commit
    • x86: trivial printk optimizations · 9b706aee
      Denys Vlasenko committed
      In arch/x86/boot/printf.c, get rid of the unused tail of digits: const char
      *digits = "0123456789abcdefghijklmnopqrstuvwxyz"; (we use only
      0-9a-f)
      
      Use smaller/faster lowercasing (by ORing with 0x20)
      where we know that we are working on numbers/digits. This makes
      strtoul smaller, and we also get rid of
      
        static const char small_digits[] = "0123456789abcdefx";
        static const char large_digits[] = "0123456789ABCDEFX";
      
      since this works equally well:
      
        static const char digits[16] = "0123456789ABCDEF";
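
The OR-with-0x20 trick described above can be sketched in userspace C as follows. This is a minimal illustration, not the kernel code: `format_hex` and `hex_digit_value` are hypothetical names introduced here. It relies on ASCII, where '0'-'9' already have bit 0x20 set (so the OR leaves them unchanged) and 'A'-'F' differ from 'a'-'f' only in that bit.

```c
#include <stddef.h>

/* Exactly 16 bytes, no terminating NUL, as in the commit message. */
static const char digits[16] = "0123456789ABCDEF";

/*
 * Hypothetical helper: format value as hex into out (NUL-terminated),
 * lowercase if requested. Returns the number of digits written.
 * out must hold at least 2 * sizeof(unsigned long) + 1 bytes.
 */
static size_t format_hex(unsigned long value, int lowercase, char *out)
{
    char tmp[2 * sizeof(unsigned long)];
    char locase = lowercase ? 0x20 : 0;
    size_t n = 0, i;

    do {
        /* ORing with 0x20 lowercases A-F and leaves 0-9 unchanged. */
        tmp[n++] = digits[value & 0xf] | locase;
        value >>= 4;
    } while (value);

    for (i = 0; i < n; i++)     /* digits were produced in reverse */
        out[i] = tmp[n - 1 - i];
    out[n] = '\0';
    return n;
}

/*
 * Hypothetical helper for the parsing direction: value of one hex digit,
 * or -1 if c is not a hex digit. The OR folds 'A'-'F' onto 'a'-'f', so
 * one range check covers both cases.
 */
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c |= 0x20;
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}
```

One table plus one OR replaces the two separate lowercase and uppercase tables, which is where the size saving in the commit message comes from.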
      
      Size savings:
      
      $ size vmlinux.org vmlinux
         text    data     bss     dec     hex filename
       877320  112252   90112 1079684  107984 vmlinux.org
       877048  112252   90112 1079412  107874 vmlinux
      
      It may also be a tiny bit faster because the code has fewer
      branches now, but I doubt it is measurable.
      
      [ hugh@veritas.com: uppercase pointers fix ]
      Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  2. 09 Feb, 2008 1 commit
    • Add new string functions strict_strto* and convert kernel params to use them · 06b2a76d
      Yi Yang committed
      Currently, the callers for every sysfs node are responsible for
      implementing the store operation, so many callers duplicate the same
      input validation. They also share the same mistakes because they call
      simple_strtol/ul/ll/ull. Module params in particular are just numeric,
      yet you can echo values such as 0x1234xxx, 07777888 and 1234aaa. In
      these cases the module params store operation ignores the successive
      invalid chars and converts the prefix part to a number, although the
      input is actually invalid.
      
      This patch tries to fix the aforementioned issues by implementing the
      strict_strtox family of functions. kernel/params.c uses them to strictly
      validate input, so module params will reject values such as 0x1234xxxx
      and return an error:
      
      write error: Invalid argument
      
      Any module that exports a numeric sysfs node can use strict_strtox instead
      of simple_strtox to reject invalid input.
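
The difference between lenient and strict parsing can be sketched in userspace C. This is an illustration under stated assumptions, not the kernel's strict_strtoul(): `parse_ulong_strict` is a hypothetical name, it is built on the standard strtoul(), and accepting a single trailing newline is an assumption made here so that input written by a plain `echo` is accepted.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical userspace helper (not the kernel API).
 * Parses s in the given base (0 means auto-detect 0x / 0 prefixes).
 * Returns 0 and stores the value in *res on success, -EINVAL otherwise.
 */
static int parse_ulong_strict(const char *s, int base, unsigned long *res)
{
    char *end;
    unsigned long val;

    *res = 0;

    /* strtoul() would skip whitespace and accept a sign; reject both. */
    if (*s == '\0' || isspace((unsigned char)*s) || *s == '+' || *s == '-')
        return -EINVAL;

    errno = 0;
    val = strtoul(s, &end, base);
    if (end == s || errno == ERANGE)
        return -EINVAL;         /* no digits at all, or overflow */

    if (*end == '\n')           /* assumption: allow one trailing newline */
        end++;
    if (*end != '\0')
        return -EINVAL;         /* trailing garbage such as "0x1000g" */

    *res = val;
    return 0;
}
```

The only difference from a simple_strtoul()-style call is the check on what follows the parsed prefix: a lenient parser returns the prefix value and ignores the rest, while this sketch reports an error unless the whole string was consumed.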
      
      Here are some test results:
      
      Before applying this patch:
      
      [root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
      4096
      [root@yangyi-dev /]# echo 0x1000 > /sys/module/e1000/parameters/copybreak
      [root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
      4096
      [root@yangyi-dev /]# echo 0x1000g > /sys/module/e1000/parameters/copybreak
      [root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
      4096
      [root@yangyi-dev /]# echo 0x1000gggggggg > /sys/module/e1000/parameters/copybreak
      [root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
      4096
      [root@yangyi-dev /]# echo 010000 > /sys/module/e1000/parameters/copybreak
      [root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
      4096
      [root@yangyi-dev /]# echo 0100008 > /sys/module/e1000/parameters/copybreak
      [root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
      4096
      [root@yangyi-dev /]# echo 010000aaaaa > /sys/module/e1000/parameters/copybreak
      [root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
      4096
      [root@yangyi-dev /]#
      
      After applying this patch:
      
      [root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
      4096
      [root@yangyi-dev /]# echo 0x1000 > /sys/module/e1000/parameters/copybreak
      [root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
      4096
      [root@yangyi-dev /]# echo 0x1000g > /sys/module/e1000/parameters/copybreak
      -bash: echo: write error: Invalid argument
      [root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
      4096
      [root@yangyi-dev /]# echo 0x1000gggggggg > /sys/module/e1000/parameters/copybreak
      -bash: echo: write error: Invalid argument
      [root@yangyi-dev /]# echo 010000 > /sys/module/e1000/parameters/copybreak
      [root@yangyi-dev /]# echo 0100008 > /sys/module/e1000/parameters/copybreak
      -bash: echo: write error: Invalid argument
      [root@yangyi-dev /]# echo 010000aaaaa > /sys/module/e1000/parameters/copybreak
      -bash: echo: write error: Invalid argument
      [root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
      4096
      [root@yangyi-dev /]# echo -n 4096 > /sys/module/e1000/parameters/copybreak
      [root@yangyi-dev /]# cat /sys/module/e1000/parameters/copybreak
      4096
      [root@yangyi-dev /]#
      
      [akpm@linux-foundation.org: fix compiler warnings]
      [akpm@linux-foundation.org: fix off-by-one found by tiwai@suse.de]
      Signed-off-by: Yi Yang <yi.y.yang@intel.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
      Cc: Takashi Iwai <tiwai@suse.de>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 01 Aug, 2007 1 commit
  4. 17 Jul, 2007 2 commits
    • vsprintf.c: optimizing, part 2: base 10 conversion speedup, v2 · 4277eedd
      Denis Vlasenko committed
      Optimize integer-to-string conversion in vsprintf.c for base 10.  This is
      by far the most used conversion, and in some use cases it impacts
      performance.  For example, top reads /proc/$PID/stat for every process, and
      with 4000 processes decimal conversion alone takes noticeable time.
      
      Using code from
      
      http://www.cs.uiowa.edu/~jones/bcd/decimal.html
      (with permission from the author, Douglas W. Jones)
      
      binary-to-decimal-string conversion is done in groups of five digits at
      once, using only additions/subtractions/shifts (with -O2; -Os throws in
      some multiply instructions).
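
The grouping idea can be sketched in userspace C as follows. This is only an illustration of converting five digits at a time, and it is not the algorithm from the page cited above: it uses ordinary division and modulus rather than additions, subtractions and shifts. The names `put_dec5`, `put_dec_tail` and `u64_to_dec` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Emit exactly five decimal digits of q (q < 100000), least significant first. */
static char *put_dec5(char *buf, uint32_t q)
{
    int i;

    for (i = 0; i < 5; i++) {
        *buf++ = (char)('0' + q % 10);
        q /= 10;
    }
    return buf;
}

/* Emit the digits of the most significant group without leading zeros. */
static char *put_dec_tail(char *buf, uint32_t q)
{
    do {
        *buf++ = (char)('0' + q % 10);
        q /= 10;
    } while (q);
    return buf;
}

/*
 * Convert n to a NUL-terminated decimal string in out (at least 21 bytes).
 * Returns the number of digits written.
 */
static size_t u64_to_dec(uint64_t n, char *out)
{
    char tmp[20];               /* UINT64_MAX has 20 decimal digits */
    char *p = tmp;
    size_t len, i;

    /* One 64-bit division per five digits instead of one per digit. */
    while (n >= 100000) {
        uint32_t group = (uint32_t)(n % 100000);

        n /= 100000;
        p = put_dec5(p, group);
    }
    p = put_dec_tail(p, (uint32_t)n);

    len = (size_t)(p - tmp);
    for (i = 0; i < len; i++)   /* digits were produced in reverse */
        out[i] = tmp[len - 1 - i];
    out[len] = '\0';
    return len;
}
```

Working on small groups keeps most of the arithmetic in 32-bit values, which is the part the patch then replaces with shift-and-add sequences.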
      
      On i386, gcc 4.1.2 -O2 generates ~500 bytes of code.
      
      This patch is run tested. A userspace benchmark/test is also attached.
      I tested it on PIII and AMD64, and the new code is generally ~2.5 times
      faster. On AMD64:
      
      # ./vsprintf_verify-O2
      Original decimal conv: .......... 151 ns per iteration
      Patched decimal conv:  .......... 62 ns per iteration
      Testing correctness
      12895992590592 ok...        [Ctrl-C]
      # ./vsprintf_verify-O2
      Original decimal conv: .......... 151 ns per iteration
      Patched decimal conv:  .......... 62 ns per iteration
      Testing correctness
      26025406464 ok...        [Ctrl-C]
      
      More realistic test: top from the busybox project was modified to
      report how many microseconds it took to scan /proc (this does not
      account for any processing done after that, like sorting the process
      list), and then I tested it with 4000 processes:
      
      #!/bin/sh
      i=4000
      while test $i != 0; do
          sleep 30 &
          let i--
      done
      busybox top -b -n3 >/dev/null
      
      on unpatched kernel:
      
      top: 4120 processes took 102864 microseconds to scan
      top: 4120 processes took 91757 microseconds to scan
      top: 4120 processes took 92517 microseconds to scan
      top: 4120 processes took 92581 microseconds to scan
      
      on patched kernel:
      
      top: 4120 processes took 75460 microseconds to scan
      top: 4120 processes took 66451 microseconds to scan
      top: 4120 processes took 67267 microseconds to scan
      top: 4120 processes took 67618 microseconds to scan
      
      The speedup comes from much faster generation of /proc/PID/stat
      by sprintf() calls inside the kernel.
      Signed-off-by: Douglas W Jones <jones@cs.uiowa.edu>
      Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • vsprintf.c: optimizing, part 1 (easy and obvious stuff) · b39a7340
      Denis Vlasenko committed
      * There is no point in having the full "0...9a...z" constant vector
        if we use only "0...9a...f" (and "x" for "0x").
      
      * Post-decrement usually needs a few more instructions, so use
        pre-decrement instead where it makes sense:
      -       while (i < precision--) {
      +       while (i <= --precision) {
      
      * if base != 10 (=> base 8 or 16), we can avoid using division
        in a loop and use mask/shift, obtaining much faster conversion.
        (More complex optimization for base 10 case is in the second patch).
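
The mask/shift conversion for bases 8 and 16 can be sketched in userspace C. This is a minimal illustration, not the code from the patch: `format_pow2` is a hypothetical name, and it handles only the two power-of-two bases mentioned above.

```c
#include <stddef.h>

/*
 * Hypothetical helper: format value in base 8 or 16 into out
 * (NUL-terminated, lowercase). Returns the number of digits written,
 * or 0 with an empty string for any other base.
 * out must hold at least sizeof(unsigned long) * 8 / 3 + 2 bytes.
 */
static size_t format_pow2(unsigned long value, unsigned int base, char *out)
{
    static const char digits[] = "0123456789abcdef";
    char tmp[sizeof(unsigned long) * 8 / 3 + 1];  /* enough for octal */
    unsigned int shift, mask;
    size_t n = 0, i;

    if (base != 8 && base != 16) {
        out[0] = '\0';
        return 0;
    }
    shift = (base == 16) ? 4 : 3;   /* log2(base) */
    mask = base - 1;                /* 0xf or 0x7 */

    do {
        /* value % base and value / base, without a division. */
        tmp[n++] = digits[value & mask];
        value >>= shift;
    } while (value);

    for (i = 0; i < n; i++)         /* digits were produced in reverse */
        out[i] = tmp[n - 1 - i];
    out[n] = '\0';
    return n;
}
```

Because the base is a power of two, `value & mask` equals the remainder and `value >> shift` equals the quotient, so the loop needs no divide instruction.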
      
      Overall, size vsprintf.o shows a text section ~80 bytes smaller
      with this patch applied.
      Signed-off-by: Douglas W Jones <jones@cs.uiowa.edu>
      Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 09 May, 2007 1 commit
  6. 01 May, 2007 1 commit
  7. 13 Feb, 2007 1 commit
  8. 12 Feb, 2007 1 commit
  9. 29 Jun, 2006 1 commit
  10. 26 Jun, 2006 2 commits
  11. 31 Oct, 2005 1 commit
    • [PATCH] fix missing includes · 4e57b681
      Tim Schmielau committed
      I recently picked up my older work to remove unnecessary #includes of
      sched.h, starting from a patch by Dave Jones to not include sched.h
      from module.h. This reduces the number of indirect includes of sched.h
      by ~300. Another ~400 pointless direct includes can be removed after
      this disentangling (patch to follow later).
      However, quite a few indirect includes need to be fixed up for this.
      
      In order to feed the patches through -mm with as little disturbance as
      possible, I've split out the fixes I accumulated up to now (complete for
      i386 and x86_64, more archs to follow later) and post them before the real
      patch.  This way this large part of the patch is kept simple with only
      adding #includes, and all hunks are independent of each other.  So if any
      hunk rejects or gets in the way of other patches, just drop it.  My scripts
      will pick it up again in the next round.
      Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  12. 24 Aug, 2005 1 commit
  13. 17 Apr, 2005 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds committed
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!