1. June 23, 2006 (5 commits)
  2. June 22, 2006 (1 commit)
  3. June 21, 2006 (1 commit)
  4. June 06, 2006 (1 commit)
  5. May 22, 2006 (1 commit)
  6. May 13, 2006 (1 commit)
  7. April 28, 2006 (2 commits)
  8. April 21, 2006 (2 commits)
    • [RBTREE] Merge colour and parent fields of struct rb_node. · 55a98102
      David Woodhouse authored
      We only used a single bit for colour information, so having a whole
      machine word of space allocated for it was a bit wasteful. Instead,
      store it in the lowest bit of the 'parent' pointer, since that was
      always going to be aligned anyway.
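      As a rough illustration of the trick (the struct layout and helper names
      below are a sketch, not necessarily the exact ones this patch uses), the
      colour can live in bit 0 of the parent pointer, which is always clear
      because rb_node pointers are at least word-aligned:

      	#include <stdint.h>

      	#define RB_RED   0
      	#define RB_BLACK 1

      	struct rb_node {
      		uintptr_t rb_parent_colour;	/* parent pointer | colour bit */
      		struct rb_node *rb_right;
      		struct rb_node *rb_left;
      	};

      	static inline struct rb_node *rb_parent(const struct rb_node *n)
      	{
      		/* Mask off the colour bit to recover the aligned pointer. */
      		return (struct rb_node *)(n->rb_parent_colour & ~(uintptr_t)1);
      	}

      	static inline int rb_colour(const struct rb_node *n)
      	{
      		return (int)(n->rb_parent_colour & 1);
      	}

      	static inline void rb_set_parent(struct rb_node *n, struct rb_node *p)
      	{
      		n->rb_parent_colour = (n->rb_parent_colour & 1) | (uintptr_t)p;
      	}

      	static inline void rb_set_colour(struct rb_node *n, int colour)
      	{
      		n->rb_parent_colour = (n->rb_parent_colour & ~(uintptr_t)1) | (uintptr_t)colour;
      	}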
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
    • [RBTREE] Remove dead code in rb_erase() · 1975e593
      David Woodhouse authored
      Observe rb_erase(), when the victim node 'old' has two children so
      neither of the simple cases at the beginning are taken.
      
      Observe that it effectively does an 'rb_next()' operation to find the
      next (by value) node in the tree. That is, we go to the victim's
      right-hand child and then follow left-hand pointers all the way
      down the tree as far as we can until we find the next node 'node'. We
      end up with 'node' being either the same immediate right-hand child of
      'old', or one of its descendants on the far left-hand side.
      
      For a start, we _know_ that 'node' has a parent. We can drop that check.
      
      We also know that if 'node's parent is 'old', then 'node' is the
      right-hand child of its parent. And that if 'node's parent is _not_
      'old', then 'node' is the left-hand child of its parent.
      
      So instead of checking for 'node->rb_parent == old' in one place and
      also checking 'node's heritage separately when we're trying to change
      its link from its parent, we can shuffle things around a bit and do
      it like this...
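      A hedged sketch of that two-children case (written with the separate
      rb_parent field that existed before the colour/parent merge; not the
      verbatim kernel source):

      	struct rb_node *node, *child, *parent;

      	node = old->rb_right;			/* the successor is down here */
      	while (node->rb_left != NULL)		/* follow left-hand pointers */
      		node = node->rb_left;

      	child  = node->rb_right;		/* 'node' never has a left child */
      	parent = node->rb_parent;		/* known non-NULL: no check needed */

      	if (child)
      		child->rb_parent = parent;

      	if (parent == old) {
      		/* 'node' is old's immediate right-hand child. */
      		parent->rb_right = child;
      		parent = node;
      	} else {
      		/* 'node' is the left-hand child of its parent. */
      		parent->rb_left = child;
      	}

      	/* ... splice 'node' into old's place and rebalance from 'parent' ... */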
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
  9. April 20, 2006 (1 commit)
    • [PATCH] Kconfig.debug: Set DEBUG_MUTEX to off by default · cca57c5b
      Tim Chen authored
      The DEBUG_MUTEX flag is on by default in the current kernel configuration.
      
      During performance testing, we saw that mutex debug functions such as
      mutex_debug_check_no_locks_freed() (called by kfree()) are expensive: they
      walk a global list of memory areas under a mutex lock to do the checking.
      For benchmarks such as Volanomark and Hackbench, we have seen more than a
      40% drop in performance on some platforms.  We suggest setting DEBUG_MUTEX
      off by default, or at least doing so once we feel that the mutex changes
      in the current code have stabilized.
      Signed-off-by: Tim Chen <tim.c.chen@intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  10. April 15, 2006 (1 commit)
    • [PATCH] sysfs: Allow sysfs attribute files to be pollable · 4508a7a7
      NeilBrown authored
      It works like this:
        Open the file.
        Read all the contents.
        Call poll requesting POLLERR or POLLPRI (so select/exceptfds works).
        When poll returns, either close the file and go back to the top of
        the loop, or lseek to the start of the file and go back to the read.
      
      Events are signaled by an object manager calling
         sysfs_notify(kobj, dir, attr);
      
      If the dir is non-NULL, it is used to find a subdirectory which
      contains the attribute (presumably created by sysfs_create_group).
      
      This has a cost of one int per attribute, one wait_queue_head per
      kobject, and one int per open file.
      
      The name "sysfs_notify" may be confused with the inotify
      functionality.  Maybe it would be nice to support inotify for sysfs
      attributes as well?
      
      This patch also uses sysfs_notify to allow /sys/block/md*/md/sync_action
      to be pollable
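      A hedged userspace sketch of that loop (the md sync_action path is just
      an example of a pollable attribute):

      	#include <fcntl.h>
      	#include <poll.h>
      	#include <stdio.h>
      	#include <unistd.h>

      	int main(void)
      	{
      		int fd = open("/sys/block/md0/md/sync_action", O_RDONLY);
      		if (fd < 0)
      			return 1;

      		for (;;) {
      			char buf[128];
      			ssize_t n = read(fd, buf, sizeof(buf) - 1);	/* read all the contents */
      			if (n < 0)
      				break;
      			buf[n] = '\0';
      			printf("sync_action: %s", buf);

      			struct pollfd pfd = { .fd = fd, .events = POLLERR | POLLPRI };
      			if (poll(&pfd, 1, -1) < 0)	/* wakes when sysfs_notify() fires */
      				break;

      			lseek(fd, 0, SEEK_SET);		/* rewind and re-read */
      		}
      		close(fd);
      		return 0;
      	}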
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  11. April 11, 2006 (3 commits)
  12. March 31, 2006 (1 commit)
  13. March 27, 2006 (6 commits)
    • [PATCH] Don't make debugfs depend on DEBUG_KERNEL · ae36b883
      Jens Axboe authored
      We use it generally now, at least blktrace isn't a specific debug
      kernel feature.
      Signed-off-by: Jens Axboe <axboe@suse.de>
    • [PATCH] bitops: hweight() speedup · f9b41929
      Akinobu Mita authored
      <linux@horizon.com> wrote:
      
      This is an extremely well-known technique.  You can see a similar version that
      uses a multiply for the last few steps at
      http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel which
      refers to "Software Optimization Guide for AMD Athlon 64 and Opteron
      Processors"
      http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25112.PDF
      
      It's section 8.6, "Efficient Implementation of Population-Count Function in
      32-bit Mode", pages 179-180.
      
      It uses the name that I am more familiar with, "popcount" (population count),
      although "Hamming weight" also makes sense.
      
      Anyway, the proof of correctness proceeds as follows:
      
      	b = a - ((a >> 1) & 0x55555555);
      	c = (b & 0x33333333) + ((b >> 2) & 0x33333333);
      	d = (c + (c >> 4)) & 0x0f0f0f0f;
      #if SLOW_MULTIPLY
      	e = d + (d >> 8)
      	f = e + (e >> 16);
      	return f & 63;
      #else
      	/* Useful if multiply takes at most 4 cycles */
      	return (d * 0x01010101) >> 24;
      #endif
      
      The input value a can be thought of as 32 1-bit fields each holding their own
      hamming weight.  Now look at it as 16 2-bit fields.  Each 2-bit field a1..a0
      has the value 2*a1 + a0.  This can be converted into the hamming weight of the
      2-bit field a1+a0 by subtracting a1.
      
      That's what the (a >> 1) & mask subtraction does.  Since there can be no
      borrows, you can just do it all at once.
      
      Enumerating the 4 possible cases:
      
      0b00 = 0  ->  0 - 0 = 0
      0b01 = 1  ->  1 - 0 = 1
      0b10 = 2  ->  2 - 1 = 1
      0b11 = 3  ->  3 - 1 = 2
      
      The next step consists of breaking up b (made of 16 2-bit fields) into
      even and odd halves and adding them into 4-bit fields.  Since the largest
      possible sum is 2+2 = 4, which will not fit into a 2-bit field, the 2-bit
      fields have to be masked before they are added.
      
      After this point, the masking can be delayed.  Each 4-bit field holds a
      population count from 0..4, taking at most 3 bits.  These numbers can be added
      without overflowing a 4-bit field, so we can compute c + (c >> 4), and only
      then mask off the unwanted bits.
      
      This produces d, a number made of 4 8-bit fields, each in the range 0..8.  From
      this point, we can shift and add d multiple times without overflowing an 8-bit
      field, and only do a final mask at the end.
      
      The number to mask with has to be at least 63 (so that 32 won't be truncated),
      but can also be 128 or 255.  The x86 has a special encoding for signed
      immediate byte values -128..127, so the value of 255 is slower.  On other
      processors, a special "sign extend byte" instruction might be faster.
      
      On a processor with fast integer multiplies (Athlon but not P4), you can
      reduce the final few serially dependent instructions to a single integer
      multiply.  Consider d to be 4 8-bit values d3, d2, d1 and d0, each in the
      range 0..8.  The multiply forms the partial products:
      
      	           d3 d2 d1 d0
      	        d3 d2 d1 d0
      	     d3 d2 d1 d0
      	+ d3 d2 d1 d0
      	----------------------
      	           e3 e2 e1 e0
      
      Where e3 = d3 + d2 + d1 + d0.   e2, e1 and e0 obviously cannot generate
      any carries.
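      Assembled into one self-contained function (a sketch that simply follows
      the steps above with the multiply variant; the kernel's own hweight32()
      may be written differently):

      	#include <stdint.h>

      	static unsigned int popcount32(uint32_t a)
      	{
      		uint32_t b, c, d;

      		b = a - ((a >> 1) & 0x55555555);			/* 16 x 2-bit counts */
      		c = (b & 0x33333333) + ((b >> 2) & 0x33333333);		/*  8 x 4-bit counts */
      		d = (c + (c >> 4)) & 0x0f0f0f0f;			/*  4 x 8-bit counts */
      		return (d * 0x01010101) >> 24;				/* e3 = d3+d2+d1+d0  */
      	}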
      Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] bitops: hweight() related cleanup · 37d54111
      Akinobu Mita authored
      By defining generic hweight*() routines:
      
      - hweight64() will be defined on all architectures
      - hweight_long() will use the architecture-optimized hweight32() or hweight64()
      
      For these reasons, I found two possible cleanups.
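      For example, the dispatch can be as simple as this (a sketch assuming
      hweight32() and hweight64() are provided elsewhere, as in the companion
      patches):

      	static inline unsigned long hweight_long(unsigned long w)
      	{
      		/* On 32-bit hosts sizeof(long) == 4, so hweight32() is used. */
      		return sizeof(w) == 4 ? hweight32(w) : hweight64(w);
      	}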
      Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] bitops: generic ext2_{set,clear,test,find_first_zero,find_next_zero}_bit() · 930ae745
      Akinobu Mita authored
      This patch introduces the C-language equivalents of the functions below:
      
      int ext2_set_bit(int nr, volatile unsigned long *addr);
      int ext2_clear_bit(int nr, volatile unsigned long *addr);
      int ext2_test_bit(int nr, const volatile unsigned long *addr);
      unsigned long ext2_find_first_zero_bit(const unsigned long *addr,
                                             unsigned long size);
      unsigned long ext2_find_next_zero_bit(const unsigned long *addr,
                                            unsigned long size);
      
      In include/asm-generic/bitops/ext2-non-atomic.h
      
      This code largely copied from:
      
      include/asm-powerpc/bitops.h
      include/asm-parisc/bitops.h
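      A hedged sketch of the non-atomic ext2_set_bit() idea (ext2 bitmaps use
      little-endian bit numbering, so addressing the bitmap byte by byte keeps
      the on-disk layout the same on any host; this is not the verbatim kernel
      code):

      	static inline int ext2_set_bit(int nr, volatile unsigned long *addr)
      	{
      		volatile unsigned char *p = (volatile unsigned char *)addr;
      		unsigned char mask;
      		int old;

      		p += nr >> 3;			/* byte that holds bit 'nr' */
      		mask = 1 << (nr & 7);		/* bit within that byte */
      		old = (*p & mask) != 0;		/* old value is returned */
      		*p |= mask;
      		return old;
      	}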
      Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] bitops: generic hweight{64,32,16,8}() · 3b9ed1a5
      Akinobu Mita authored
      This patch introduces the C-language equivalents of the functions below:
      
      unsigned int hweight32(unsigned int w);
      unsigned int hweight16(unsigned int w);
      unsigned int hweight8(unsigned int w);
      unsigned long hweight64(__u64 w);
      
      In include/asm-generic/bitops/hweight.h
      
      This code largely copied from: include/linux/bitops.h
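      One way the 64-bit count can be built on top of the 32-bit one (a sketch;
      the generic header can also use native 64-bit mask-and-add steps on
      64-bit hosts):

      	unsigned long hweight64(__u64 w)
      	{
      		/* Count each 32-bit half with the existing hweight32(). */
      		return hweight32((unsigned int)(w >> 32)) +
      		       hweight32((unsigned int)w);
      	}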
      Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] bitops: generic find_{next,first}{,_zero}_bit() · c7f612cd
      Akinobu Mita authored
      This patch introduces the C-language equivalents of the functions below:
      
      unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
                                  unsigned long offset);
      unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
                                       unsigned long offset);
      unsigned long find_first_zero_bit(const unsigned long *addr,
                                        unsigned long size);
      unsigned long find_first_bit(const unsigned long *addr, unsigned long size);
      
      In include/asm-generic/bitops/find.h
      
      This code largely copied from: arch/powerpc/lib/bitops.c
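      A hedged sketch of the word-at-a-time idea behind find_first_bit()
      (BITS_PER_LONG is the usual kernel constant; the GCC builtin stands in
      for the in-word scan, and the kernel version uses __ffs() and handles
      the tail slightly differently):

      	unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
      	{
      		unsigned long idx;

      		for (idx = 0; idx * BITS_PER_LONG < size; idx++) {
      			if (addr[idx]) {
      				unsigned long bit = idx * BITS_PER_LONG +
      						    __builtin_ctzl(addr[idx]);
      				return bit < size ? bit : size;	/* clamp a hit past 'size' */
      			}
      		}
      		return size;	/* no set bit within the first 'size' bits */
      	}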
      Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  14. March 26, 2006 (8 commits)
  15. March 25, 2006 (1 commit)
  16. March 24, 2006 (4 commits)
    • [PATCH] CONFIG_UNWIND_INFO · 604bf5a2
      Jan Beulich authored
      As a foundation for reliable stack unwinding, this adds a config option
      (available to all architectures except IA64 and those where the module
      loader might have problems with the resulting relocations) to enable the
      generation of frame unwind information.
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Paul Mundt <lethal@linux-sh.org>,
      Cc: Andi Kleen <ak@muc.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] bitmap: region restructuring · 3cf64b93
      Paul Jackson authored
      Restructure the bitmap_*_region() operations, to avoid code duplication.
      
      Also reduces binary text size by about 100 bytes (ia64 arch).  The original
      Bottomley bitmap_*_region patch added about 1000 bytes of compiled kernel text
      (ia64).  The Mundt multiword extension added another 600 bytes, and this
      restructuring patch gets back about 100 bytes.
      
      But the real motivation was the reduced amount of duplicated code.
      
      Tested by Paul Mundt using <= BITS_PER_LONG as well as power of
      2 aligned multiword spanning allocations.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: Paul Jackson <pj@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] bitmap: region multiword spanning support · 74373c6a
      Paul Mundt authored
      Add support to the lib/bitmap.c bitmap_*_region() routines for bitmap
      regions larger than one word (nbits > BITS_PER_LONG).  This removes a
      BUG_ON() in lib/bitmap.c.
      
      I have an updated store queue API for SH that is currently using this with
      relative success, and at first glance, it seems like this could be useful for
      x86 (arch/i386/kernel/pci-dma.c) as well.  Particularly for anything using
      dma_declare_coherent_memory() on large areas and that attempts to allocate
      large buffers from that space.
      
      Paul Jackson also did some cleanup to this patch.
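      A hedged usage sketch of what the multiword support allows (the region
      API names are lib/bitmap.c's; the pool and sizes here are made up):

      	#include <linux/bitmap.h>

      	#define POOL_BITS 256
      	static DECLARE_BITMAP(pool, POOL_BITS);

      	static int grab_big_region(void)
      	{
      		/* order 7 => a 128-bit region, larger than BITS_PER_LONG anywhere */
      		int pos = bitmap_find_free_region(pool, POOL_BITS, 7);
      		if (pos < 0)
      			return pos;		/* no free region of that size */

      		/* ... use bits pos .. pos + 127 ... */

      		bitmap_release_region(pool, pos, 7);
      		return 0;
      	}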
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: Paul Jackson <pj@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] bitmap: region cleanup · 87e24802
      Paul Jackson authored
      Paul Mundt <lethal@linux-sh.org> says:
      
      This patch set implements a number of patches to clean up and restructure the
      bitmap region code, in addition to extending the interface to support
      multiword spanning allocations.
      
      The current implementation (before this patch set) is limited by only being
      able to allocate pages <= BITS_PER_LONG, as noted by the strategically
      positioned BUG_ON() at lib/bitmap.c:752:
      
              /* We don't do regions of pages > BITS_PER_LONG.  The
      	 * algorithm would be a simple look for multiple zeros in the
      	 * array, but there's no driver today that needs this.  If you
      	 * trip this BUG(), you get to code it... */
              BUG_ON(pages > BITS_PER_LONG);
      
      As I seem to have been the first person to trigger this, the result ends up
      being the following patch set with the help of Paul Jackson.
      
      The final patch in the series eliminates quite a bit of code duplication, so
      the bitmap code size ends up being smaller than the current implementation as
      an added bonus.
      
      After these are applied, it should already be possible to do multiword
      allocations with dma_alloc_coherent() out of ranges established by
      dma_declare_coherent_memory() on x86 without having to change any of the code,
      and the SH store queue API will follow up on this as the other user that needs
      support for this.
      
      This patch:
      
      Some code cleanup on the lib/bitmap.c bitmap_*_region() routines:
      
       * spacing
       * variable names
       * comments
      
      Has no change to code function.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: Paul Jackson <pj@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  17. March 23, 2006 (1 commit)