1. 05 Aug, 2015 1 commit
  2. 17 Apr, 2015 1 commit
    • lib: find_*_bit reimplementation · 2c57a0e2
      Committed by Yury Norov
      This patchset reworks the find_bit function family to improve
      performance and reduce the size of text.  All of the rework is done in
      patch 1; patches 2 and 3 only move and rename code.
      
      It was boot-tested on x86_64 and MIPS (big-endian) machines.
      Performance tests were run in userspace with code like this:
      
      	/* addr[] is filled from /dev/urandom */
      	start = clock();
      	while (ret < nbits)
      		ret = find_next_bit(addr, nbits, ret + 1);
      
      	end = clock();
      	printf("%lu\t", (unsigned long)(end - start));
      
      On an Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz the measurements are as
      follows (nbits is 8M for find_next_bit and 80K for find_first_bit):
      
      	find_next_bit:		find_first_bit:
      	new	current		new	current
      	26932	43151		14777	14925
      	26947	43182		14521	15423
      	26507	43824		15053	14705
      	27329	43759		14473	14777
      	26895	43367		14847	15023
      	26990	43693		15103	15163
      	26775	43299		15067	15232
      	27282	42752		14544	15121
      	27504	43088		14644	14858
      	26761	43856		14699	15193
      	26692	43075		14781	14681
      	27137	42969		14451	15061
      	...			...
      
      find_next_bit performance gain is 35-40%;
      find_first_bit - no measurable difference.
      
      ARM has its own arch-specific implementation of find_bit.
      
      Thanks a lot to George Spelvin and Rasmus Villemoes for hints and
      helpful discussions.
      
      This patch (of 3):
      
      The new implementation takes less space in the source file (see
      diffstat) and in the object file.  For me the text shrinks from 710 to
      453 bytes.  It also shows better performance.
      
      The find_last_bit description is also fixed, as it had an obvious typo.
      
      [akpm@linux-foundation.org: include linux/bitmap.h, per Rasmus]
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      Reviewed-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Reviewed-by: George Spelvin <linux@horizon.com>
      Cc: Alexey Klimov <klimov.linux@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Daniel Borkmann <dborkman@redhat.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Thomas Graf <tgraf@suug.ch>
      Cc: Valentin Rothberg <valentinrothberg@gmail.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
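The word-at-a-time idea behind a compact find_next_bit can be sketched in userspace as follows. This is only an illustration under stated assumptions, not the code from the patch: the name `sketch_find_next_bit` is hypothetical, and `__builtin_ctzl` is a GCC/Clang builtin standing in for the kernel's `__ffs()`.

```c
#include <limits.h>

#define BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))

/*
 * Userspace sketch of a word-at-a-time find_next_bit().
 * Returns the index of the first set bit at or after 'start',
 * or 'nbits' if there is none.
 */
unsigned long sketch_find_next_bit(const unsigned long *addr,
				   unsigned long nbits, unsigned long start)
{
	unsigned long tmp, pos;

	if (start >= nbits)
		return nbits;

	/* First word: mask off the bits below 'start'. */
	tmp = addr[start / BITS_PER_LONG] & (~0UL << (start % BITS_PER_LONG));
	start -= start % BITS_PER_LONG;

	/* Skip whole words that contain no set bits. */
	while (!tmp) {
		start += BITS_PER_LONG;
		if (start >= nbits)
			return nbits;
		tmp = addr[start / BITS_PER_LONG];
	}

	/* tmp is non-zero here, so counting trailing zeros is well defined. */
	pos = start + (unsigned long)__builtin_ctzl(tmp);

	/* A set bit past 'nbits' in the last word does not count. */
	return pos < nbits ? pos : nbits;
}
```

A caller can walk all set bits with the same loop shape as the benchmark in the commit message, passing the previous result plus one as the next `start`.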
  3. 16 Nov, 2014 1 commit
  4. 13 Aug, 2014 1 commit
    • locking: Remove deprecated smp_mb__() barriers · 2e39465a
      Committed by Peter Zijlstra
      It's been a while and there are no in-tree users left, so remove the
      deprecated barriers.
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Chen, Gong <gong.chen@linux.intel.com>
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: John Sullivan <jsrhbz@kanargh.force9.co.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  5. 18 Apr, 2014 1 commit
    • arch: Prepare for smp_mb__{before,after}_atomic() · febdbfe8
      Committed by Peter Zijlstra
      Since the smp_mb__{before,after}*() ops are fundamentally dependent on
      how an arch can implement atomics it doesn't make sense to have 3
      variants of them. They must all be the same.
      
      Furthermore, the 3 variants suggest they're only valid for those 3
      atomic ops, while we have many more where they could be applied.
      
      So move away from
      smp_mb__{before,after}_{atomic,clear}_{dec,inc,bit}() and reduce the
      interface to just the two: smp_mb__{before,after}_atomic().
      
      This patch prepares the way by introducing default implementations in
      asm-generic/barrier.h that default to a full barrier and providing
      __deprecated inlines for the previous 6 barriers if they're not
      provided by the arch.
      
      This should allow for a mostly painless transition (lots of deprecated
      warns in the interim).
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/n/tip-wr59327qdyi9mbzn6x937s4e@git.kernel.org
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: "Chen, Gong" <gong.chen@linux.intel.com>
      Cc: John Sullivan <jsrhbz@kanargh.force9.co.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mauro Carvalho Chehab <m.chehab@samsung.com>
      Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
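The shape of the reduced interface can be illustrated with a userspace sketch. Everything here is an assumption made for illustration: C11 `atomic_thread_fence` stands in for the kernel's `smp_mb()`, the macros merely mimic the "default to a full barrier" fallback described above, and `put_ref` is a hypothetical caller. In the kernel these barriers pair with non-value-returning atomics such as `atomic_dec()`.

```c
#include <stdatomic.h>

/* Userspace stand-in for the kernel's smp_mb(): a full memory fence. */
#define smp_mb()		atomic_thread_fence(memory_order_seq_cst)

/* Default implementations: both reduce to a full barrier. */
#define smp_mb__before_atomic()	smp_mb()
#define smp_mb__after_atomic()	smp_mb()

/* One of the six old names, kept as a deprecated wrapper for the interim. */
__attribute__((deprecated))
static inline void smp_mb__before_atomic_dec(void)
{
	smp_mb__before_atomic();
}

/* Hypothetical caller: order an earlier store before a relaxed decrement. */
void put_ref(atomic_int *refs, int *payload)
{
	*payload = 0;			/* store that must be visible first */
	smp_mb__before_atomic();	/* same barrier for any atomic op */
	atomic_fetch_sub_explicit(refs, 1, memory_order_relaxed);
}
```

The point of the design is that the barrier name no longer encodes which atomic operation follows it, so one pair of names covers every such operation.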
  6. 31 Mar, 2014 1 commit
  7. 22 Oct, 2013 1 commit
  8. 17 Oct, 2013 1 commit
  9. 24 Mar, 2012 3 commits
  10. 16 Feb, 2012 1 commit
  11. 15 Jan, 2012 1 commit
  12. 06 Dec, 2011 1 commit
  13. 27 May, 2011 2 commits
  14. 16 Nov, 2010 1 commit
  15. 10 Oct, 2010 2 commits
    • bitops: remove duplicated extern declarations · d852a6af
      Committed by Akinobu Mita
      If CONFIG_GENERIC_FIND_NEXT_BIT is enabled, find_next_bit() and
      find_next_zero_bit() are doubly declared in asm-generic/bitops/find.h
      and linux/bitops.h.
      
      asm/bitops.h includes asm-generic/bitops/find.h if and only if the
      architecture enables CONFIG_GENERIC_FIND_NEXT_BIT. And asm/bitops.h
      is included by linux/bitops.h.
      
      So we can just remove the extern declarations of find_next_bit() and
      find_next_zero_bit() in linux/bitops.h.
      
      Also we can remove unneeded #ifndef CONFIG_GENERIC_FIND_NEXT_BIT in
      asm-generic/bitops/find.h.
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • bitops: make asm-generic/bitops/find.h more generic · 708ff2a0
      Committed by Akinobu Mita
      asm-generic/bitops/find.h has the extern declarations of find_next_bit()
      and find_next_zero_bit() and the macro definitions of find_first_bit()
      and find_first_zero_bit(). It is only usable by architectures that
      enable CONFIG_GENERIC_FIND_NEXT_BIT and disable
      CONFIG_GENERIC_FIND_FIRST_BIT.
      
      x86 and tile enable both CONFIG_GENERIC_FIND_NEXT_BIT and
      CONFIG_GENERIC_FIND_FIRST_BIT. These architectures cannot include
      asm-generic/bitops/find.h in their asm/bitops.h, so ifdefed extern
      declarations of find_first_bit() and find_first_zero_bit() are put in
      linux/bitops.h.
      
      This change makes asm-generic/bitops/find.h usable by these
      architectures and switches them over to it. It is also needed for the
      forthcoming cleanup of duplicated extern declarations.
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: Chris Metcalf <cmetcalf@tilera.com>
  16. 05 May, 2010 1 commit
  17. 07 Apr, 2010 2 commits
  18. 07 Mar, 2010 1 commit
  19. 04 Feb, 2010 1 commit
  20. 29 Jan, 2010 1 commit
  21. 23 Apr, 2009 1 commit
  22. 01 Jan, 2009 1 commit
  23. 29 Apr, 2008 2 commits
    • bitops: remove "optimizations" · fee4b19f
      Committed by Thomas Gleixner
      The mapsize optimizations which were moved from x86 to the generic
      code in commit 64970b68 increased the binary size on non-x86
      architectures.
      
      Looking into the real effects of the "optimizations" it turned out
      that they are not used in find_next_bit() and find_next_zero_bit().
      
      The ones in find_first_bit() and find_first_zero_bit() are used in a
      couple of places but none of them is a real hot path.
      
      Remove the "optimizations" altogether and call the library functions
      unconditionally.
      
      Boot-tested on x86 and compile tested on every cross compiler I have.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Avoid divides in BITS_TO_LONGS · ede9c697
      Committed by Eric Dumazet
      BITS_PER_LONG is a signed value (32 or 64).
      
      DIV_ROUND_UP(nr, BITS_PER_LONG) performs signed arithmetic if "nr" is
      signed too.
      
      Converting BITS_TO_LONGS(nr) to DIV_ROUND_UP(nr, BITS_PER_BYTE *
      sizeof(long)) ensures the compiler can perform a right shift, even if
      "nr" is a signed value, instead of an expensive integer divide.
      
      Applying this patch saves 141 bytes on x86 when
      CONFIG_CC_OPTIMIZE_FOR_SIZE=y and speeds up bitmap operations.
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
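The effect of the BITS_TO_LONGS change can be sketched with the macros as the commit message describes them. The macro bodies below are written out from that description, and `bitmap_longs` is a hypothetical helper added only to show a signed argument flowing through the macro.

```c
#include <stddef.h>

#define BITS_PER_BYTE		8
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/*
 * sizeof(long) has type size_t, which is unsigned, so the whole
 * expression is evaluated in unsigned arithmetic even when 'nr' is a
 * signed int. Dividing an unsigned value by a constant power of two
 * can be compiled to a plain right shift.
 */
#define BITS_TO_LONGS(nr)	DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))

/* Hypothetical helper: number of longs needed for a bitmap of 'nbits'. */
size_t bitmap_longs(int nbits)
{
	return BITS_TO_LONGS(nbits);
}
```

With a signed divisor such as a plain `32` or `64`, the compiler has to preserve round-toward-zero semantics for negative dividends, which is what makes the signed form more expensive.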
  24. 27 Apr, 2008 3 commits
    • x86: optimize find_first_bit for small bitmaps · 3a483050
      Committed by Alexander van Heukelum
      Avoid a call to find_first_bit if the bitmap size is known at
      compile time and small enough to fit in a single long integer.
      Modeled after an optimization in the original x86_64-specific
      code.
      Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: generic versions of find_first_(zero_)bit, convert i386 · 77b9bd9c
      Committed by Alexander van Heukelum
      Generic versions of __find_first_bit and __find_first_zero_bit
      are introduced as simplified versions of __find_next_bit and
      __find_next_zero_bit. Their compilation and use are guarded by
      a new config variable GENERIC_FIND_FIRST_BIT.
      
      The generic versions of find_first_bit and find_first_zero_bit
      are implemented in terms of the newly introduced __find_first_bit
      and __find_first_zero_bit.
      
      This patch does not remove the i386-specific implementation,
      but it does switch i386 to use the generic functions by setting
      GENERIC_FIND_FIRST_BIT=y for X86_32.
      Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, generic: optimize find_next_(zero_)bit for small constant-size bitmaps · 64970b68
      Committed by Alexander van Heukelum
      This moves an optimization for searching constant-sized small
      bitmaps from x86_64-specific to generic code.
      
      On an i386 defconfig (the x86#testing one), the size of vmlinux hardly
      changes with this applied. I have observed only four places where this
      optimization avoids a call into find_next_bit:
      
      In the functions return_unused_surplus_pages, alloc_fresh_huge_page,
      and adjust_pool_surplus, this patch avoids a call for a 1-bit bitmap.
      In __next_cpu a call is avoided for a 32-bit bitmap. That's it.
      
      On x86_64, 52 locations are optimized with a minimal increase in
      code size:
      
      Current #testing defconfig:
      	146 x bsf, 27 x find_next_*bit
         text    data     bss     dec     hex filename
      5392637  846592  724424 6963653  6a41c5 vmlinux
      
      After removing the x86_64 specific optimization for find_next_*bit:
      	94 x bsf, 79 x find_next_*bit
         text    data     bss     dec     hex filename
      5392358  846592  724424 6963374  6a40ae vmlinux
      
      After this patch (making the optimization generic):
      	146 x bsf, 27 x find_next_*bit
         text    data     bss     dec     hex filename
      5392396  846592  724424 6963412  6a40d4 vmlinux
      
      [ tglx@linutronix.de: build fixes ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
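The small-constant-bitmap optimization discussed in the three commits of this group can be sketched for find_first_bit as follows. This is a userspace illustration under assumptions, not the kernel code: the `sketch_` names are hypothetical, `__builtin_ctzl` and `__builtin_constant_p` are GCC/Clang builtins, and the out-of-line function is a deliberately naive stand-in for the library routine.

```c
#include <limits.h>

#define BITS_PER_LONG ((unsigned long)(CHAR_BIT * sizeof(unsigned long)))

/* Out-of-line fallback, standing in for the library routine. */
unsigned long sketch_find_first_bit_slow(const unsigned long *addr,
					 unsigned long size)
{
	unsigned long i;

	for (i = 0; i < size; i++)
		if (addr[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG)))
			return i;
	return size;
}

static inline unsigned long
sketch_find_first_bit(const unsigned long *addr, unsigned long size)
{
	/*
	 * Constant size below one word: plant a sentinel bit at 'size' so
	 * that an empty bitmap yields 'size' without a function call.
	 */
	if (__builtin_constant_p(size) && size < BITS_PER_LONG)
		return (unsigned long)__builtin_ctzl(*addr | (1UL << size));

	/* Exactly one word: ctz of zero is undefined, so test for it first. */
	if (__builtin_constant_p(size) && size == BITS_PER_LONG)
		return *addr ? (unsigned long)__builtin_ctzl(*addr)
			     : BITS_PER_LONG;

	return sketch_find_first_bit_slow(addr, size);
}
```

Whether the fast paths are taken depends on the compiler inlining the wrapper and seeing a constant `size`; the result is the same either way, only the generated code differs.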
  25. 29 Mar, 2008 1 commit
  26. 20 Oct, 2007 3 commits
  27. 17 Oct, 2007 1 commit
  28. 27 Jan, 2007 1 commit
  29. 27 Mar, 2006 1 commit
  30. 26 Mar, 2006 1 commit
    • [PATCH] roundup_pow_of_two() 64-bit fix · 962749af
      Committed by Andrew Morton
      fls() takes an integer, so roundup_pow_of_two() is busted for ulongs larger
      than 2^32-1.
      
      Fix this by implementing and using fls_long().
      
      (Why does roundup_pow_of_two() return a long?)
      
      (Why is roundup_pow_of_two() __attribute_const__ whereas long_log2() is
      __attribute_pure__?)
      
      (Why does long_log2() suck so much?  Because we were missing fls_long()?)
      
      Cc: Roland Dreier <rdreier@cisco.com>
      Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
      Cc: John Hawkes <hawkes@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
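The fix described above can be sketched in userspace as follows. The `sketch_` helpers are hypothetical stand-ins built on the GCC/Clang builtins `__builtin_clz` and `__builtin_clzll`; they are meant to show why a width-aware fls_long() matters, not to reproduce the kernel's implementation.

```c
/* Stand-in for fls(): 1-based index of the highest set bit, 0 for x == 0. */
static inline int sketch_fls(unsigned int x)
{
	return x ? (int)(sizeof(x) * 8) - __builtin_clz(x) : 0;
}

/* Same for 64-bit values. */
static inline int sketch_fls64(unsigned long long x)
{
	return x ? (int)(sizeof(x) * 8) - __builtin_clzll(x) : 0;
}

/* Pick the variant that matches the width of unsigned long. */
static inline int sketch_fls_long(unsigned long l)
{
	if (sizeof(l) == 4)
		return sketch_fls((unsigned int)l);
	return sketch_fls64(l);
}

/*
 * Round up to the next power of two. The caller must pass a non-zero x
 * whose result still fits in unsigned long; otherwise the shift below
 * is undefined.
 */
unsigned long sketch_roundup_pow_of_two(unsigned long x)
{
	return 1UL << sketch_fls_long(x - 1);
}
```

Using the int-sized `sketch_fls()` directly would truncate the argument on a 64-bit machine, which is the failure mode the commit message calls out for values larger than 2^32-1.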