1. 27 Jun, 2006 1 commit
  2. 26 Jun, 2006 8 commits
  3. 23 Jun, 2006 8 commits
  4. 22 Jun, 2006 1 commit
  5. 21 Jun, 2006 1 commit
  6. 06 Jun, 2006 1 commit
  7. 22 May, 2006 1 commit
  8. 13 May, 2006 1 commit
  9. 28 Apr, 2006 2 commits
  10. 21 Apr, 2006 2 commits
    • D
      [RBTREE] Merge colour and parent fields of struct rb_node. · 55a98102
      Committed by David Woodhouse
      We only used a single bit for colour information, so having a whole
      machine word of space allocated for it was a bit wasteful. Instead,
      store it in the lowest bit of the 'parent' pointer, since that was
      always going to be aligned anyway.
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
      55a98102
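The packing described above can be sketched as follows. This is a minimal illustration, not the kernel's actual definitions: the struct, the enum, and every `sketch_` name are hypothetical, and it assumes nodes are aligned to at least two bytes so that bit 0 of a node's address is always zero.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical colour values; the single bit is all that needs storing. */
enum { SKETCH_RED = 0, SKETCH_BLACK = 1 };

/* Hypothetical node layout: parent pointer and colour share one word. */
struct sketch_rb_node {
    uintptr_t parent_colour;            /* (parent address) | colour bit */
    struct sketch_rb_node *left;
    struct sketch_rb_node *right;
};

/* Mask off the colour bit to recover the parent pointer. */
static struct sketch_rb_node *sketch_parent(const struct sketch_rb_node *n)
{
    return (struct sketch_rb_node *)(n->parent_colour & ~(uintptr_t)1);
}

/* The colour lives in the lowest bit. */
static int sketch_colour(const struct sketch_rb_node *n)
{
    return (int)(n->parent_colour & 1);
}

/* Replace the parent while keeping the current colour bit. */
static void sketch_set_parent(struct sketch_rb_node *n,
                              struct sketch_rb_node *p)
{
    n->parent_colour = (n->parent_colour & 1) | (uintptr_t)p;
}

/* Replace the colour while keeping the current parent pointer. */
static void sketch_set_colour(struct sketch_rb_node *n, int colour)
{
    n->parent_colour = (n->parent_colour & ~(uintptr_t)1)
                     | (uintptr_t)(colour & 1);
}
```

The saving is one machine word per node, at the cost of a mask on every parent or colour access.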
    • D
      [RBTREE] Remove dead code in rb_erase() · 1975e593
      Committed by David Woodhouse
      Observe rb_erase(), when the victim node 'old' has two children so
      neither of the simple cases at the beginning is taken.
      
      Observe that it effectively does an 'rb_next()' operation to find the
      next (by value) node in the tree. That is, we go to the victim's
      right-hand child and then follow left-hand pointers all the way
      down the tree as far as we can until we find the next node 'node'. We
      end up with 'node' being either the same immediate right-hand child of
      'old', or one of its descendants on the far left-hand side.
      
      For a start, we _know_ that 'node' has a parent. We can drop that check.
      
      We also know that if 'node's parent is 'old', then 'node' is the
      right-hand child of its parent. And that if 'node's parent is _not_
      'old', then 'node' is the left-hand child of its parent.
      
      So instead of checking for 'node->rb_parent == old' in one place and
      also checking 'node's heritage separately when we're trying to change
      its link from its parent, we can shuffle things around a bit and do
      it like this...
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
      1975e593
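The invariant argued above can be checked on a small tree. The sketch below is not the rb_erase() code itself: it uses a hypothetical plain binary-tree node (`sketch_node`) with no colour handling, and only demonstrates that the successor of a two-child node is the right-hand child of 'old' when its parent is 'old', and a left-hand child otherwise.

```c
#include <stddef.h>

/* Hypothetical minimal tree node; colour and payload are omitted. */
struct sketch_node {
    struct sketch_node *parent;
    struct sketch_node *left;
    struct sketch_node *right;
};

/*
 * For a node with two children, the next node by value is the leftmost
 * node of the right-hand subtree. The caller guarantees old->right != NULL.
 */
static struct sketch_node *sketch_successor(struct sketch_node *old)
{
    struct sketch_node *node = old->right;

    while (node->left)
        node = node->left;
    return node;
}

/* Returns 1 if the successor's position matches the commit's reasoning. */
static int sketch_invariant_holds(struct sketch_node *old)
{
    struct sketch_node *node = sketch_successor(old);
    struct sketch_node *parent = node->parent;

    if (parent == NULL)          /* cannot happen: we walked down from old */
        return 0;
    if (parent == old)           /* immediate right-hand child of old */
        return parent->right == node;
    return parent->left == node; /* reached by following left pointers */
}
```

Because both facts follow from how the successor was found, a single test of the parent against 'old' is enough to know which link to update.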
  11. 20 Apr, 2006 1 commit
    • T
      [PATCH] Kconfig.debug: Set DEBUG_MUTEX to off by default · cca57c5b
      Committed by Tim Chen
      DEBUG_MUTEX flag is on by default in current kernel configuration.
      
      During performance testing, we saw that mutex debug functions such as
      mutex_debug_check_no_locks_freed (called by kfree()) are expensive, as
      they go through a global list of memory areas under a mutex lock and do
      the checking.  For benchmarks such as Volanomark and Hackbench, we have
      seen a performance drop of more than 40% on some platforms.  We suggest
      setting DEBUG_MUTEX off by default, or at least doing so later, once we
      feel that the mutex changes in the current code have stabilized.
      Signed-off-by: Tim Chen <tim.c.chen@intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      cca57c5b
  12. 15 Apr, 2006 1 commit
    • N
      [PATCH] sysfs: Allow sysfs attribute files to be pollable · 4508a7a7
      Committed by NeilBrown
      It works like this:
        Open the file.
        Read all the contents.
        Call poll requesting POLLERR or POLLPRI (so select/exceptfds works).
        When poll returns, either
          close the file and go to the top of the loop, or
          lseek to the start of the file and go back to the 'read'.
      
      Events are signaled by an object manager calling
         sysfs_notify(kobj, dir, attr);
      
      If the dir is non-NULL, it is used to find a subdirectory which
      contains the attribute (presumably created by sysfs_create_group).
      
      This has a cost of one int per attribute, one wait_queue_head_t per
      kobject, and one int per open file.
      
      The name "sysfs_notify" may be confused with the inotify
      functionality.  Maybe it would be nice to support inotify for sysfs
      attributes as well?
      
      This patch also uses sysfs_notify to allow /sys/block/md*/md/sync_action
      to be pollable.
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      4508a7a7
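From user space, the read / poll / lseek loop described above might look like the sketch below. It uses only standard POSIX calls (open, read, lseek, poll, close); the `sketch_` function names, the buffer size, and the bounded `rounds` parameter are assumptions made for illustration, and the attribute path is whatever the caller passes in.

```c
#include <fcntl.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Seek back to the start and re-read the attribute. Returns bytes read or -1. */
static ssize_t sketch_read_attr(int fd, char *buf, size_t len)
{
    ssize_t n;

    if (len == 0)
        return -1;
    if (lseek(fd, 0, SEEK_SET) == (off_t)-1)
        return -1;
    n = read(fd, buf, len - 1);
    if (n >= 0)
        buf[n] = '\0';          /* keep the contents usable as a C string */
    return n;
}

/* Wait for a change notification. Returns poll()'s result unchanged. */
static int sketch_wait_attr(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLERR | POLLPRI };

    return poll(&pfd, 1, timeout_ms);
}

/*
 * Read the attribute, wait for an event, then go back to the read.
 * Returns the number of events seen, or -1 if the file cannot be opened.
 */
static int sketch_watch_attr(const char *path, int rounds, int timeout_ms)
{
    char buf[256];
    int events = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    for (int i = 0; i < rounds; i++) {
        if (sketch_read_attr(fd, buf, sizeof(buf)) < 0)
            break;
        /* A real caller would act on the contents of buf here. */
        int ready = sketch_wait_attr(fd, timeout_ms);
        if (ready < 0)
            break;
        if (ready > 0)
            events++;
    }
    close(fd);
    return events;
}
```

This variant keeps the file open and seeks back, which is the second of the two options in the commit message; closing and reopening the file on each pass would work equally well.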
  13. 11 Apr, 2006 3 commits
  14. 31 Mar, 2006 1 commit
  15. 27 Mar, 2006 6 commits
    • J
      [PATCH] Don't make debugfs depend on DEBUG_KERNEL · ae36b883
      Committed by Jens Axboe
      We use it generally now; blktrace, at least, isn't a feature specific
      to debug kernels.
      Signed-off-by: Jens Axboe <axboe@suse.de>
      ae36b883
    • A
      [PATCH] bitops: hweight() speedup · f9b41929
      Committed by Akinobu Mita
      <linux@horizon.com> wrote:
      
      This is an extremely well-known technique.  You can see a similar version that
      uses a multiply for the last few steps at
      http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel which
      refers to "Software Optimization Guide for AMD Athlon 64 and Opteron
      Processors"
      http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25112.PDF
      
      It's section 8.6, "Efficient Implementation of Population-Count Function in
      32-bit Mode", pages 179-180.
      
      It uses the name that I am more familiar with, "popcount" (population count),
      although "Hamming weight" also makes sense.
      
      Anyway, the proof of correctness proceeds as follows:
      
      	b = a - ((a >> 1) & 0x55555555);
      	c = (b & 0x33333333) + ((b >> 2) & 0x33333333);
      	d = (c + (c >> 4)) & 0x0f0f0f0f;
      #if SLOW_MULTIPLY
      	e = d + (d >> 8);
      	f = e + (e >> 16);
      	return f & 63;
      #else
      	/* Useful if multiply takes at most 4 cycles */
      	return (d * 0x01010101) >> 24;
      #endif
      
      The input value a can be thought of as 32 1-bit fields each holding their own
      Hamming weight.  Now look at it as 16 2-bit fields.  Each 2-bit field a1..a0
      has the value 2*a1 + a0.  This can be converted into the Hamming weight of the
      2-bit field a1+a0 by subtracting a1.
      
      That's what the (a >> 1) & mask subtraction does.  Since there can be no
      borrows, you can just do it all at once.
      
      Enumerating the 4 possible cases:
      
      0b00 = 0  ->  0 - 0 = 0
      0b01 = 1  ->  1 - 0 = 1
      0b10 = 2  ->  2 - 1 = 1
      0b11 = 3  ->  3 - 1 = 2
      
      The next step consists of breaking up b (made of 16 2-bit fields) into
      even and odd halves and adding them into 4-bit fields.  Since the largest
      possible sum is 2+2 = 4, which will not fit into a 2-bit field, the 2-bit
      fields have to be masked before they are added.
      
      After this point, the masking can be delayed.  Each 4-bit field holds a
      population count from 0..4, taking at most 3 bits.  These numbers can be added
      without overflowing a 4-bit field, so we can compute c + (c >> 4), and only
      then mask off the unwanted bits.
      
      This produces d, a number made of four 8-bit fields, each in the range 0..8.  From
      this point, we can shift and add d multiple times without overflowing an 8-bit
      field, and only do a final mask at the end.
      
      The number to mask with has to be at least 63 (so that 32 won't be truncated),
      but can also be 127 or 255.  The x86 has a special encoding for signed
      immediate byte values -128..127, so the value of 255 is slower.  On other
      processors, a special "sign extend byte" instruction might be faster.
      
      On a processor with fast integer multiplies (Athlon but not P4), you can
      reduce the final few serially dependent instructions to a single integer
      multiply.  Consider d to be 4 8-bit values d3, d2, d1 and d0, each in the
      range 0..8.  The multiply forms the partial products:
      
      	           d3 d2 d1 d0
      	        d3 d2 d1 d0
      	     d3 d2 d1 d0
      	+ d3 d2 d1 d0
      	----------------------
      	           e3 e2 e1 e0
      
      Where e3 = d3 + d2 + d1 + d0.   e2, e1 and e0 obviously cannot generate
      any carries.
      Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      f9b41929
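The two variants from the message above can be written out and compared against a bit-by-bit count. This is a sketch for checking the reasoning, not the kernel's hweight32(); the `sketch_` names are hypothetical, and the steps follow the b, c, d, e, f derivation in the text.

```c
#include <stdint.h>

/* Shift-and-add variant: no multiply, final mask with 63. */
static unsigned int sketch_popcount_shift(uint32_t a)
{
    uint32_t b = a - ((a >> 1) & 0x55555555u);                 /* 16 2-bit counts */
    uint32_t c = (b & 0x33333333u) + ((b >> 2) & 0x33333333u); /* 8 4-bit counts  */
    uint32_t d = (c + (c >> 4)) & 0x0f0f0f0fu;                 /* 4 8-bit counts  */
    uint32_t e = d + (d >> 8);
    uint32_t f = e + (e >> 16);

    return f & 63;              /* low byte holds d3 + d2 + d1 + d0 */
}

/* Multiply variant: the top byte of the product is d3 + d2 + d1 + d0. */
static unsigned int sketch_popcount_mul(uint32_t a)
{
    uint32_t b = a - ((a >> 1) & 0x55555555u);
    uint32_t c = (b & 0x33333333u) + ((b >> 2) & 0x33333333u);
    uint32_t d = (c + (c >> 4)) & 0x0f0f0f0fu;

    return (d * 0x01010101u) >> 24;
}

/* Reference implementation: count one bit at a time. */
static unsigned int sketch_popcount_naive(uint32_t a)
{
    unsigned int n = 0;

    while (a) {
        n += a & 1u;
        a >>= 1;
    }
    return n;
}
```

Which variant is faster depends on the cost of an integer multiply on the target processor, as the message notes.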
    • A
      [PATCH] bitops: hweight() related cleanup · 37d54111
      Committed by Akinobu Mita
      By defining generic hweight*() routines
      
      - hweight64() will be defined on all architectures
      - hweight_long() will use architecture optimized hweight32() or hweight64()
      
      For these reasons, I found two possible cleanups.
      Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      37d54111
    • A
      [PATCH] bitops: generic ext2_{set,clear,test,find_first_zero,find_next_zero}_bit() · 930ae745
      Committed by Akinobu Mita
      This patch introduces the C-language equivalents of the functions below:
      
      int ext2_set_bit(int nr, volatile unsigned long *addr);
      int ext2_clear_bit(int nr, volatile unsigned long *addr);
      int ext2_test_bit(int nr, const volatile unsigned long *addr);
      unsigned long ext2_find_first_zero_bit(const unsigned long *addr,
                                             unsigned long size);
      unsigned long ext2_find_next_zero_bit(const unsigned long *addr,
                                            unsigned long size,
                                            unsigned long offset);
      
      In include/asm-generic/bitops/ext2-non-atomic.h
      
      This code is largely copied from:
      
      include/asm-powerpc/bitops.h
      include/asm-parisc/bitops.h
      Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      930ae745
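The ext2 bitmap operations use little-endian bit numbering regardless of the host byte order: bit nr lives in byte nr / 8 at position nr % 8. The sketch below illustrates that numbering one byte at a time. It is an assumption-laden simplification, not the code in ext2-non-atomic.h: the `sketch_le_` names are hypothetical, the operations are non-atomic, and the search walks bit by bit rather than a word at a time.

```c
/* Test bit nr in a byte-addressed, little-endian bitmap. */
static int sketch_le_test_bit(unsigned long nr, const void *addr)
{
    const unsigned char *p = (const unsigned char *)addr;

    return (p[nr >> 3] >> (nr & 7)) & 1;
}

/* Set bit nr and return its previous value (non-atomic). */
static int sketch_le_set_bit(unsigned long nr, void *addr)
{
    unsigned char *p = (unsigned char *)addr + (nr >> 3);
    unsigned char mask = (unsigned char)(1u << (nr & 7));
    int old = (*p & mask) != 0;

    *p |= mask;
    return old;
}

/* Clear bit nr and return its previous value (non-atomic). */
static int sketch_le_clear_bit(unsigned long nr, void *addr)
{
    unsigned char *p = (unsigned char *)addr + (nr >> 3);
    unsigned char mask = (unsigned char)(1u << (nr & 7));
    int old = (*p & mask) != 0;

    *p &= (unsigned char)~mask;
    return old;
}

/* Return the first zero bit at or after offset, or size if there is none. */
static unsigned long sketch_le_find_next_zero_bit(const void *addr,
                                                  unsigned long size,
                                                  unsigned long offset)
{
    for (; offset < size; offset++)
        if (!sketch_le_test_bit(offset, addr))
            return offset;
    return size;
}
```

Addressing the bitmap by byte is what makes the layout identical on big-endian and little-endian machines, which matters for an on-disk format.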
    • A
      [PATCH] bitops: generic hweight{64,32,16,8}() · 3b9ed1a5
      Committed by Akinobu Mita
      This patch introduces the C-language equivalents of the functions below:
      
      unsigned int hweight32(unsigned int w);
      unsigned int hweight16(unsigned int w);
      unsigned int hweight8(unsigned int w);
      unsigned long hweight64(__u64 w);
      
      In include/asm-generic/bitops/hweight.h
      
      This code is largely copied from: include/linux/bitops.h
      Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      3b9ed1a5
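A plain-C family with the four widths listed above could look like the sketch below. It is an illustration of the approach rather than the contents of hweight.h: the `sketch_hweight` names and the fixed-width parameter types are assumptions, and the 64-bit version is simply built from two 32-bit counts.

```c
#include <stdint.h>

/* 32-bit count using the parallel mask-and-add steps. */
static unsigned int sketch_hweight32(uint32_t w)
{
    uint32_t res = w - ((w >> 1) & 0x55555555u);

    res = (res & 0x33333333u) + ((res >> 2) & 0x33333333u);
    res = (res + (res >> 4)) & 0x0f0f0f0fu;
    res = res + (res >> 8);
    return (res + (res >> 16)) & 0xffu;
}

/* 16-bit count: same steps with 16-bit masks. */
static unsigned int sketch_hweight16(uint16_t w)
{
    unsigned int res = w - ((w >> 1) & 0x5555u);

    res = (res & 0x3333u) + ((res >> 2) & 0x3333u);
    res = (res + (res >> 4)) & 0x0f0fu;
    return (res + (res >> 8)) & 0xffu;
}

/* 8-bit count: same steps with 8-bit masks. */
static unsigned int sketch_hweight8(uint8_t w)
{
    unsigned int res = w - ((w >> 1) & 0x55u);

    res = (res & 0x33u) + ((res >> 2) & 0x33u);
    return (res + (res >> 4)) & 0x0fu;
}

/* 64-bit count as the sum of the two 32-bit halves. */
static unsigned long sketch_hweight64(uint64_t w)
{
    return sketch_hweight32((uint32_t)(w >> 32))
         + sketch_hweight32((uint32_t)w);
}
```

An architecture with a native population-count instruction would override these with its own versions; the generic ones only need to be correct everywhere.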
    • A
      [PATCH] bitops: generic find_{next,first}{,_zero}_bit() · c7f612cd
      Committed by Akinobu Mita
      This patch introduces the C-language equivalents of the functions below:
      
      unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
                                  unsigned long offset);
      unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
                                       unsigned long offset);
      unsigned long find_first_zero_bit(const unsigned long *addr,
                                        unsigned long size);
      unsigned long find_first_bit(const unsigned long *addr, unsigned long size);
      
      In include/asm-generic/bitops/find.h
      
      This code is largely copied from: arch/powerpc/lib/bitops.c
      Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      c7f612cd
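The contract shared by these four functions is that they return the index of the first matching bit at or after the starting point, or `size` when there is none. A simplified sketch of that behaviour follows. It is not the code copied from arch/powerpc/lib/bitops.c: the `sketch_` names and the `SKETCH_BITS_PER_LONG` macro are hypothetical, and the inner scan is a plain loop rather than an optimized bit search.

```c
#include <limits.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Shared helper: `invert` is 0 to search for set bits and ~0UL to search
 * for zero bits (each word is XORed with it before being examined).
 */
static unsigned long sketch_find_next(const unsigned long *addr,
                                      unsigned long size,
                                      unsigned long offset,
                                      unsigned long invert)
{
    while (offset < size) {
        unsigned long word = addr[offset / SKETCH_BITS_PER_LONG] ^ invert;

        /* Drop the bits below offset; zeros are shifted in at the top. */
        word >>= offset % SKETCH_BITS_PER_LONG;
        if (word) {
            while (!(word & 1UL)) {
                word >>= 1;
                offset++;
            }
            return offset < size ? offset : size;
        }
        /* Nothing left in this word: jump to the start of the next one. */
        offset = (offset / SKETCH_BITS_PER_LONG + 1) * SKETCH_BITS_PER_LONG;
    }
    return size;
}

static unsigned long sketch_find_next_bit(const unsigned long *addr,
                                          unsigned long size,
                                          unsigned long offset)
{
    return sketch_find_next(addr, size, offset, 0UL);
}

static unsigned long sketch_find_next_zero_bit(const unsigned long *addr,
                                               unsigned long size,
                                               unsigned long offset)
{
    return sketch_find_next(addr, size, offset, ~0UL);
}

static unsigned long sketch_find_first_bit(const unsigned long *addr,
                                           unsigned long size)
{
    return sketch_find_next_bit(addr, size, 0);
}

static unsigned long sketch_find_first_zero_bit(const unsigned long *addr,
                                                unsigned long size)
{
    return sketch_find_next_zero_bit(addr, size, 0);
}
```

Returning `size` for "not found" lets callers write a loop of the form `for (i = first; i < size; i = next(i + 1))` without a separate sentinel.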
  16. 26 Mar, 2006 2 commits