1. 17 Apr, 2015 7 commits
    • util_macros.h: add find_closest() macro · 95d11952
      Committed by Bartosz Golaszewski
      This series unduplicates the code used to find the member in an array
      closest to 'x'.
      
      The first patch adds a macro implementing the algorithm in two flavors -
      for arrays sorted in ascending and descending order.  The second updates
      Documentation/CodingStyle on the naming convention for local variables in
      macros resembling functions.  The other three patches replace duplicated
      code with calls to one of these macros in some hwmon drivers.
      
      This patch (of 5):
      
      Searching for the member of an array closest to 'x' is duplicated in
      several places.
      
      Add a new include - util_macros.h - and two macros that implement this
      algorithm for arrays sorted both in ascending and descending order.
      
      Uses linear search.
      Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
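The linear search described in this commit can be sketched as two plain C functions. This is a minimal illustration and not the kernel's find_closest() macros: the names closest_ascending() and closest_descending() are hypothetical, the element type is fixed to long, and the tie-breaking at an exact midpoint is a choice made for this sketch. It assumes the values are small enough that 2 * x and a[i] + a[i + 1] do not overflow.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the element closest to x in a[], which must be
 * sorted in ascending order and hold n > 0 elements.
 * Linear search: stop at the first i where x lies at or below the
 * midpoint between a[i] and a[i + 1].
 */
size_t closest_ascending(long x, const long *a, size_t n)
{
    size_t i;

    assert(a != NULL && n > 0);
    for (i = 0; i + 1 < n; i++) {
        /* 2 * x <= a[i] + a[i + 1] compares against the midpoint
         * without rounding it. */
        if (2 * x <= a[i] + a[i + 1])
            break;
    }
    return i;
}

/* Same search for an array sorted in descending order. */
size_t closest_descending(long x, const long *a, size_t n)
{
    size_t i;

    assert(a != NULL && n > 0);
    for (i = 0; i + 1 < n; i++) {
        if (2 * x >= a[i] + a[i + 1])
            break;
    }
    return i;
}
```

Both functions return an index rather than a value, so a caller can use the result to look up a matching entry in a parallel table, which is a common pattern when mapping a requested value to a register setting.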
    • lib/dma-debug: fix bucket_find_contain() · a7a2c02a
      Committed by Sebastian Ott
      bucket_find_contain() will search the bucket list for a dma_debug_entry.
      When the entry isn't found it needs to search other buckets too, since
      only the start address of a dma range is hashed (which might be in a
      different bucket).
      
      A copy of the dma_debug_entry is used to get the previous hash bucket,
      but when that bucket's list is searched the original dma_debug_entry
      must be used, not its modified copy.
      
      This fixes false "device driver tries to sync DMA memory it has not allocated"
      warnings.
      Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Horia Geanta <horia.geanta@freescale.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
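The idea behind this fix can be illustrated with a heavily simplified model. Everything below is hypothetical and is not the lib/dma-debug code: the struct names, the constants BUCKET_SHIFT, NUM_BUCKETS and MAX_RANGE, and the one-range-per-bucket table are assumptions chosen to keep the sketch short. What it shows is the split between a modified copy of the address, used only to pick the previous bucket, and the original reference, used for the containment check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUCKET_SHIFT 4   /* assumed: each bucket spans 16 addresses */
#define NUM_BUCKETS  16  /* assumed table size */
#define MAX_RANGE    64  /* assumed upper bound on a mapping's size */

struct range {
    uint32_t start;
    uint32_t size;
};

/* At most one range per bucket; enough for an illustration. */
struct table {
    const struct range *bucket[NUM_BUCKETS];
};

/* Only the start address of a range is hashed. */
static unsigned int hash(uint32_t addr)
{
    return (addr >> BUCKET_SHIFT) % NUM_BUCKETS;
}

static int contains(const struct range *r, const struct range *ref)
{
    return r != NULL && r->start <= ref->start &&
           ref->start + ref->size <= r->start + r->size;
}

void table_add(struct table *t, const struct range *r)
{
    assert(t != NULL && r != NULL);
    t->bucket[hash(r->start)] = r;
}

const struct range *find_containing(const struct table *t,
                                    const struct range *ref)
{
    uint32_t index = ref->start;  /* modified copy: used for hashing only */
    uint32_t walked = 0;

    assert(t != NULL && ref != NULL);
    while (walked <= MAX_RANGE) {
        const struct range *r = t->bucket[hash(index)];

        /* Match against the original ref, never against index. */
        if (contains(r, ref))
            return r;

        /* Nothing found: step back one bucket. */
        walked += 1u << BUCKET_SHIFT;
        index -= 1u << BUCKET_SHIFT;
    }
    return NULL;
}
```

Comparing against the shifted copy instead of the original would test a different address range than the caller asked about, which is the kind of mismatch that can produce a false "not found" result.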
    • lib/vsprintf.c: even faster binary to decimal conversion · 7c43d9a3
      Committed by Rasmus Villemoes
      The most expensive part of decimal conversion is the divisions by 10
      (albeit done using reciprocal multiplication with appropriately chosen
      constants).  I decided to see if one could eliminate around half of
      these multiplications by emitting two digits at a time, at the cost of a
      200 byte lookup table, and it does indeed seem like there is something
      to be gained, especially on 64 bits.  Microbenchmarking shows
      improvements ranging from -50% (for numbers uniformly distributed in [0,
      2^64-1]) to -25% (for numbers heavily biased toward the smaller end, a
      more realistic distribution).
      
      On a larger scale, perf shows that top, one of the big consumers of /proc
      data, uses 0.5-1.0% fewer cpu cycles.
      
      I had to jump through some hoops to get the 32 bit code to compile and run
      on my 64 bit machine, so I'm not sure how relevant these numbers are, but
      just for comparison the microbenchmark showed improvements between -30%
      and -10%.
      
      The bloat-o-meter costs are around 150 bytes (the generated code is a
      little smaller, so it's not the full 200 bytes) on both 32 and 64 bit.
      I'm aware that extra cache misses won't show up in a microbenchmark as
      used above, but on the other hand decimal conversions often happen in bulk
      (for example in the case of top).
      
      I have of course tested that the new code generates the same output as the
      old, for both the first and last 1e10 numbers in [0,2^64-1] and 4e9
      'random' numbers in-between.
      
      Test and verification code on github: https://github.com/Villemoes/dec.
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Tested-by: Jeff Epler <jepler@unpythonic.net>
      Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
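The two-digits-at-a-time technique with a 200 byte lookup table can be sketched in portable C as follows. This is an illustration of the general idea only, not the code added to lib/vsprintf.c: the function name u64_to_dec(), the table name digit_pairs, and the use of plain / and % (rather than hand-picked reciprocal constants) are assumptions of this sketch. Each loop iteration performs one division by 100 and emits two digits, which roughly halves the number of divisions compared with dividing by 10 per digit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 100 two-character entries: "00", "01", ..., "99" (200 bytes of digits). */
static const char digit_pairs[] =
    "00010203040506070809"
    "10111213141516171819"
    "20212223242526272829"
    "30313233343536373839"
    "40414243444546474849"
    "50515253545556575859"
    "60616263646566676869"
    "70717273747576777879"
    "80818283848586878889"
    "90919293949596979899";

/*
 * Write v in decimal to out (at least 21 bytes), NUL-terminated.
 * Returns the number of digits written.
 */
size_t u64_to_dec(uint64_t v, char *out)
{
    char tmp[20];               /* 2^64 - 1 has 20 decimal digits */
    size_t pos = sizeof(tmp);
    size_t len;

    assert(out != NULL);

    /* Peel off two digits per division, least significant pair first. */
    while (v >= 100) {
        unsigned int r = (unsigned int)(v % 100);

        v /= 100;
        tmp[--pos] = digit_pairs[2 * r + 1];
        tmp[--pos] = digit_pairs[2 * r];
    }

    /* One or two leading digits remain. */
    if (v >= 10) {
        tmp[--pos] = digit_pairs[2 * v + 1];
        tmp[--pos] = digit_pairs[2 * v];
    } else {
        tmp[--pos] = (char)('0' + v);
    }

    len = sizeof(tmp) - pos;
    memcpy(out, tmp + pos, len);
    out[len] = '\0';
    return len;
}
```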
    • lib: rename lib/find_next_bit.c to lib/find_bit.c · 840620a1
      Committed by Yury Norov
      This file contains the implementations of all find_*_bit{,_le}
      functions, so giving it a more generic name looks reasonable.
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      Reviewed-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Reviewed-by: George Spelvin <linux@horizon.com>
      Cc: Alexey Klimov <klimov.linux@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Daniel Borkmann <dborkman@redhat.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Thomas Graf <tgraf@suug.ch>
      Cc: Valentin Rothberg <valentinrothberg@gmail.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lib: move find_last_bit to lib/find_next_bit.c · 8f6f19dd
      Committed by Yury Norov
      Currently the whole 'find_*_bit' family is located in lib/find_next_bit.c,
      except 'find_last_bit', which is in lib/find_last_bit.c.  There seems to
      be no major benefit in keeping it separate.
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      Reviewed-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Reviewed-by: George Spelvin <linux@horizon.com>
      Cc: Alexey Klimov <klimov.linux@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Daniel Borkmann <dborkman@redhat.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Thomas Graf <tgraf@suug.ch>
      Cc: Valentin Rothberg <valentinrothberg@gmail.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lib: find_*_bit reimplementation · 2c57a0e2
      Committed by Yury Norov
      This patchset reworks the find_bit function family to achieve better
      performance and decrease the size of text.  All rework is done in patch 1.
      Patches 2 and 3 are about code moving and renaming.
      
      It was boot-tested on x86_64 and MIPS (big-endian) machines.
      Performance tests were run in userspace with code like this:
      
      	/* addr[] is filled from /dev/urandom */
      	start = clock();
      	while (ret < nbits)
      		ret = find_next_bit(addr, nbits, ret + 1);
      
      	end = clock();
      	printf("%ld\t", (long)(end - start));
      
      On Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz measurements are: (for
      find_next_bit, nbits is 8M, for find_first_bit - 80K)
      
      	find_next_bit:		find_first_bit:
      	new	current		new	current
      	26932	43151		14777	14925
      	26947	43182		14521	15423
      	26507	43824		15053	14705
      	27329	43759		14473	14777
      	26895	43367		14847	15023
      	26990	43693		15103	15163
      	26775	43299		15067	15232
      	27282	42752		14544	15121
      	27504	43088		14644	14858
      	26761	43856		14699	15193
      	26692	43075		14781	14681
      	27137	42969		14451	15061
      	...			...
      
      find_next_bit performance gain is 35-40%;
      find_first_bit - no measurable difference.
      
      On ARM machines, there is an arch-specific implementation of find_bit.
      
      Thanks a lot to George Spelvin and Rasmus Villemoes for hints and
      helpful discussions.
      
      This patch (of 3):
      
      The new implementation takes less space in the source file (see diffstat)
      and in the object file.  For me it's 710 vs 453 bytes of text.  It also
      shows better performance.
      
      The find_last_bit description is fixed to correct an obvious typo.
      
      [akpm@linux-foundation.org: include linux/bitmap.h, per Rasmus]
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      Reviewed-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Reviewed-by: George Spelvin <linux@horizon.com>
      Cc: Alexey Klimov <klimov.linux@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Daniel Borkmann <dborkman@redhat.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Thomas Graf <tgraf@suug.ch>
      Cc: Valentin Rothberg <valentinrothberg@gmail.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
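A word-at-a-time search of the kind this commit describes might look like the userspace sketch below. It is an illustration under assumptions, not the code in lib/find_next_bit.c: the names next_set_bit(), lowest_set_bit() and WORD_BITS are hypothetical, and the lowest set bit is found with a simple loop where real code would typically use a dedicated instruction or compiler builtin.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Position of the lowest set bit of a non-zero word. */
static unsigned long lowest_set_bit(unsigned long w)
{
    unsigned long n = 0;

    while (!(w & 1UL)) {
        w >>= 1;
        n++;
    }
    return n;
}

/*
 * Return the index of the first set bit at or after start in a bitmap of
 * nbits bits, or nbits if there is none.
 */
unsigned long next_set_bit(const unsigned long *addr, unsigned long nbits,
                           unsigned long start)
{
    unsigned long word;

    assert(addr != NULL);
    if (start >= nbits)
        return nbits;

    /* First word: mask off the bits below start. */
    word = addr[start / WORD_BITS] & (~0UL << (start % WORD_BITS));
    start -= start % WORD_BITS;

    /* Skip whole words that are zero. */
    while (word == 0) {
        start += WORD_BITS;
        if (start >= nbits)
            return nbits;
        word = addr[start / WORD_BITS];
    }

    /* The last word may contain set bits beyond nbits; clamp the result. */
    start += lowest_set_bit(word);
    return start < nbits ? start : nbits;
}
```

Handling the first word with a mask and then looping over whole words keeps the loop body to one load and one comparison per word.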
    • alpha: forward declare struct pt_regs in processor.h · 396ada68
      Committed by Richard Weinberger
      Removal of exec domains uncovered this new warning.  processor.h re-used
      struct pt_regs from personality.h which is now gone.
      
        ./arch/alpha/include/asm/processor.h:47:33: warning: 'struct pt_regs' declared inside parameter list [enabled by default]
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
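The warning quoted above appears when a struct tag is first seen inside a function prototype, because the type is then scoped to that prototype alone. A forward declaration at file scope avoids this. The sketch below is a standalone illustration, not the alpha processor.h change: the function saved_pc() and the layout given to struct pt_regs are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Forward declaration at file scope. Without this line, the prototype
 * below would declare a new 'struct pt_regs' visible only inside its own
 * parameter list, which is what the compiler warns about.
 */
struct pt_regs;

/* A header can now refer to the incomplete type through a pointer. */
unsigned long saved_pc(const struct pt_regs *regs);

/* Hypothetical full definition, normally provided by another header. */
struct pt_regs {
    unsigned long pc;
    unsigned long sp;
};

unsigned long saved_pc(const struct pt_regs *regs)
{
    assert(regs != NULL);
    return regs->pc;
}
```

The forward declaration is enough because the prototype only needs a pointer to the type, not its size or members.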
  2. 16 Apr, 2015 33 commits