提交 · fb5ac54263ef3fcb5c469a61e0ab6b06e45e2307 · openanolis / cloud-kernel

07 8月, 2014 40 次提交

lib: bitmap: make the start index of bitmap_set unsigned · fb5ac542

由 Rasmus Villemoes 提交于 8月 06, 2014

The compiler can generate slightly smaller and simpler code when it
knows that "start" is non-negative.

Also, use the names "start" and "len" for the two parameters in both
header file and implementation, instead of the previous mix.
Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

fb5ac542

lib: bitmap: make nbits parameter of bitmap_weight unsigned · 877d9f3b

由 Rasmus Villemoes 提交于 8月 06, 2014

The compiler can generate slightly smaller and simpler code when it
knows that "nbits" is non-negative.  Since no-one passes a negative
bit-count, this shouldn't affect the semantics.

I didn't change the return type, since that might change the semantics
of some expression containing a call to bitmap_weight(). Certainly an
int is capable of holding the result.
Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

877d9f3b

lib: bitmap: make nbits parameter of bitmap_subset unsigned · 5be20213

由 Rasmus Villemoes 提交于 8月 06, 2014

The compiler can generate slightly smaller and simpler code when it
knows that "nbits" is non-negative.  Since no-one passes a negative
bit-count, this shouldn't affect the semantics.
Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5be20213

lib: bitmap: make nbits parameter of bitmap_intersects unsigned · 6dfe9799

由 Rasmus Villemoes 提交于 8月 06, 2014

The compiler can generate slightly smaller and simpler code when it
knows that "nbits" is non-negative.  Since no-one passes a negative
bit-count, this shouldn't affect the semantics.
Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6dfe9799

lib: bitmap: make nbits parameter of bitmap_{and,or,xor,andnot} unsigned · 2f9305eb

由 Rasmus Villemoes 提交于 8月 06, 2014

This change is only for consistency with the changes to the other
bitmap_* functions; it doesn't change the size of the generated code:
inside BITS_TO_LONGS there is a sizeof(long), which causes bits to be
interpreted as unsigned anyway.
Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2f9305eb

lib: bitmap: remove unnecessary mask from bitmap_complement · 65b4ee62

由 Rasmus Villemoes 提交于 8月 06, 2014

Since the extra bits are "don't care", there is no reason to mask the
last word to the used bits when complementing.  This shaves off yet a
few bytes.
Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

65b4ee62

lib: bitmap: make nbits parameter of bitmap_complement unsigned · 3d6684f4

由 Rasmus Villemoes 提交于 8月 06, 2014

The compiler can generate slightly smaller and simpler code when it
knows that "nbits" is non-negative.  Since no-one passes a negative
bit-count, this shouldn't affect the semantics.
Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3d6684f4

lib: bitmap: make nbits parameter of bitmap_equal unsigned · 5e068069

由 Rasmus Villemoes 提交于 8月 06, 2014

The compiler can generate slightly smaller and simpler code when it
knows that "nbits" is non-negative.  Since no-one passes a negative
bit-count, this shouldn't affect the semantics.
Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5e068069

lib: bitmap: make nbits parameter of bitmap_full unsigned · 8397927c

由 Rasmus Villemoes 提交于 8月 06, 2014

The compiler can generate slightly smaller and simpler code when it
knows that "nbits" is non-negative.  Since no-one passes a negative
bit-count, this shouldn't affect the semantics.
Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8397927c

lib: bitmap: make nbits parameter of bitmap_empty unsigned · 0679cc48

由 Rasmus Villemoes 提交于 8月 06, 2014

Many functions in lib/bitmap.c start with an expression such as lim =
bits/BITS_PER_LONG.  Since bits has type (signed) int, and since gcc
cannot know that it is in fact non-negative, it generates worse code
than it could.  These patches, mostly consisting of changing various
parameters to unsigned, gives a slight overall code reduction:

  add/remove: 1/1 grow/shrink: 8/16 up/down: 251/-414 (-163)
  function                                     old     new   delta
  tick_device_uses_broadcast                   335     425     +90
  __irq_alloc_descs                            498     554     +56
  __bitmap_andnot                               73     115     +42
  __bitmap_and                                  70     101     +31
  bitmap_weight                                  -      11     +11
  copy_hugetlb_page_range                      752     762     +10
  follow_hugetlb_page                          846     854      +8
  hugetlb_init                                1415    1417      +2
  hugetlb_nrpages_setup                        130     131      +1
  hugetlb_add_hstate                           377     376      -1
  bitmap_allocate_region                        82      80      -2
  select_task_rq_fair                         2202    2191     -11
  hweight_long                                  66      55     -11
  __reg_op                                     230     219     -11
  dm_stats_message                            2849    2833     -16
  bitmap_parselist                              92      74     -18
  __bitmap_weight                              115      97     -18
  __bitmap_subset                              153     129     -24
  __bitmap_full                                128     104     -24
  __bitmap_empty                               120      96     -24
  bitmap_set                                   179     149     -30
  bitmap_clear                                 185     155     -30
  __bitmap_equal                               136     105     -31
  __bitmap_intersects                          148     108     -40
  __bitmap_complement                          109      67     -42
  tick_device_setup_broadcast_func.isra         81       -     -81

[The increases in __bitmap_and{,not} are due to bug fixes 17/18,18/18.
No idea why bitmap_weight suddenly appears.] While 163 bytes treewide is
insignificant, I believe the bitmap functions are often called with
locks held, so saving even a few cycles might be worth it.

While making these changes, I found a few other things that might be
worth including.  16,17,18 are actual bug fixes.  The rest shouldn't
change the behaviour of any of the functions, provided no-one passed
negative nbits values.  If something should come up, it should be fairly
bisectable.

A few issues I thought about, but didn't know what to do with:

* Many of the functions misbehave if nbits is compile-time 0; the
  out-of-line functions generally handle 0 correctly.  bitmap_fill() is
  particularly bad, whether the 0 is known at compile time or not.  It
  would probably be nice to add detection of at least compile-time 0 and
  handle that appropriately.

* I didn't change __bitmap_shift_{left,right} to use unsigned because I
  want to fully understand why the algorithm works before making that
  change.  However, AFAICT, they behave correctly for all (positive) shift
  amounts.  This is not the case for the small_const_nbits versions.  If
  for example nbits = n = BITS_PER_LONG, the shift operators turn into
  no-ops (at least on x86), so one get *dst = *src, whereas one would
  expect to get *dst=0.  That difference in behaviour is somewhat
  annoying.

This patch (of 18):

The compiler can generate slightly smaller and simpler code when it
knows that "nbits" is non-negative.  Since no-one passes a negative
bit-count, this shouldn't affect the semantics.
Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0679cc48

lib/list_sort.c: convert to pr_foo · d0da23b0

由 Andrew Morton 提交于 8月 06, 2014

Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Don Mullis <don.mullis@gmail.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d0da23b0

lib: list_sort.c: Limit number of unused cmp callbacks · 61b3d6c4

由 Rasmus Villemoes 提交于 8月 06, 2014

The helper merge_and_restore_back_links() makes sure to call the
caller's cmp function during the final ->prev pointer fixup, so that the
cmp function may call cond_resched().  However, if the cmp function does
not call cond_resched() at all, this is entirely redundant.  If it does,
doing at least two function calls for every two pointer assignments is a
bit excessive.  This patch limits the calls to once for every 256
iterations.
Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Don Mullis <don.mullis@gmail.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

61b3d6c4

lib: list_sort_test(): simplify and harden cleanup · 69412303

由 Rasmus Villemoes 提交于 8月 06, 2014

There is no reason to maintain the list structure while freeing the
debug elements.  Aside from the redundant pointer manipulations, it is
also inefficient from a locality-of-reference viewpoint, since they are
visited in a random order (wrt.  the order they were allocated).
Furthermore, if we jumped to exit: after detecting list corruption, it
is actually dangerous.

So just free the elements in the order they were allocated, using the
backing array elts.  Allocate that using kcalloc(), so that if
allocation of one of the debug element fails, we just end up calling
kfree(NULL) for the trailing elements.

Minor details: Use sizeof(*elts) instead of sizeof(void *), and return
err immediately when allocation of elts fails, to avoid introducing
another label just before the final return statement.
Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Don Mullis <don.mullis@gmail.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

69412303

lib: list_sort_test(): add extra corruption check · 9d418dcc

由 Rasmus Villemoes 提交于 8月 06, 2014

Add a check to make sure that the prev pointer of the list head points
to the last element on the list.
Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Don Mullis <don.mullis@gmail.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9d418dcc

lib: list_sort_test(): return -ENOMEM when allocation fails · 27d555d1

由 Rasmus Villemoes 提交于 8月 06, 2014

Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Don Mullis <don.mullis@gmail.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

27d555d1

kernel.h: remove deprecated pack_hex_byte · 087face5

由 Joe Perches 提交于 8月 06, 2014

It's been nearly 3 years now since commit 55036ba7 ("lib: rename
pack_hex_byte() to hex_byte_pack()") so it's time to remove this
deprecated and unused static inline.
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

087face5

lib/test-kstrtox.c: use ARRAY_SIZE instead of sizeof/sizeof[0] · 129965a9

由 Fabian Frederick 提交于 8月 06, 2014

Use kernel.h definition.
Signed-off-by: NFabian Frederick <fabf@skynet.be>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

129965a9

lib/string_helpers.c: constify static arrays · 142cda5d

由 Mathias Krause 提交于 8月 06, 2014

Complement commit 68aecfb9 ("lib/string_helpers.c: make arrays
static") by making the arrays const -- not only pointing to const
strings.  This moves them out of the data section to the r/o data
section:

   text    data     bss     dec     hex filename
   1150     176       0    1326     52e lib/string_helpers.old.o
   1326       0       0    1326     52e lib/string_helpers.new.o
Signed-off-by: NMathias Krause <minipli@googlemail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

142cda5d

lib/cmdline.c: add size unit t/p/e to memparse · e004f3c7

由 Gui Hecheng 提交于 8月 06, 2014

For modern filesystems such as btrfs, t/p/e size level operations are
common.  add size unit t/p/e parsing to memparse
Signed-off-by: NGui Hecheng <guihc.fnst@cn.fujitsu.com>
Acked-by: NDavid Rientjes <rientjes@google.com>
Reviewed-by: NSatoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e004f3c7

libata: Use glob_match from lib/glob.c · 428ac5fc

由 George Spelvin 提交于 8月 06, 2014

The function may be useful for other drivers, so export it.  (Suggested
by Tejun Heo.)

Note that I inverted the return value of glob_match; returning true on
match seemed to make more sense.
Signed-off-by: NGeorge Spelvin <linux@horizon.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

428ac5fc

lib/glob.c: add CONFIG_GLOB_SELFTEST · 5f9be824

由 George Spelvin 提交于 8月 06, 2014

This was useful during development, and is retained for future
regression testing.

GCC appears to have no way to place string literals in a particular
section; adding __initconst to a char pointer leaves the string itself
in the default string section, where it will not be thrown away after
module load.

Thus all string constants are kept in explicitly declared and named
arrays.  Sorry this makes printk a bit harder to read.  At least the
tests are more compact.
Signed-off-by: NGeorge Spelvin <linux@horizon.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5f9be824

lib: add lib/glob.c · b0125085

由 George Spelvin 提交于 8月 06, 2014

This is a helper function from drivers/ata/libata_core.c, where it is
used to blacklist particular device models.  It's being moved to lib/ so
other drivers may use it for the same purpose.

This implementation in non-recursive, so is safe for the kernel stack.

[akpm@linux-foundation.org: fix sparse warning]
Signed-off-by: NGeorge Spelvin <linux@horizon.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b0125085

zlib: clean up some dead code · 62e7ca52

由 Sergey Senozhatsky 提交于 8月 06, 2014

Cleanup unused `if 0'-ed functions, which have been dead since 2006
(commits 87c2ce3b ("lib/zlib*: cleanups") by Adrian Bunk and
4f3865fb ("zlib_inflate: Upgrade library code to a recent version")
by Richard Purdie):

 - zlib_deflateSetDictionary
 - zlib_deflateParams
 - zlib_deflateCopy
 - zlib_inflateSync
 - zlib_syncsearch
 - zlib_inflateSetDictionary
 - zlib_inflatePrime
Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

62e7ca52

klist: use same naming scheme as hlist for klist_add_after() · 0f9859ca

由 Ken Helias 提交于 8月 06, 2014

The name was modified from hlist_add_after() to hlist_add_behind() when
adjusting the order of arguments to match the one with
klist_add_after().  This is necessary to break old code when it would
use it the wrong way.

Make klist follow this naming scheme for consistency.
Signed-off-by: NKen Helias <kenhelias@firemail.de>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0f9859ca

list: fix order of arguments for hlist_add_after(_rcu) · 1d023284

由 Ken Helias 提交于 8月 06, 2014

All other add functions for lists have the new item as first argument
and the position where it is added as second argument.  This was changed
for no good reason in this function and makes using it unnecessary
confusing.

The name was changed to hlist_add_behind() to cause unconverted code to
generate a compile error instead of using the wrong parameter order.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: NKen Helias <kenhelias@firemail.de>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	[intel driver bits]
Cc: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1d023284

list: make hlist_add_after() argument names match hlist_add_after_rcu() · bc18dd33

由 Ken Helias 提交于 8月 06, 2014

The argument names for hlist_add_after() are poorly chosen because they
look the same as the ones for hlist_add_before() but have to be used
differently.

hlist_add_after_rcu() has made a better choice.
Signed-off-by: NKen Helias <kenhelias@firemail.de>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

bc18dd33

kernel/printk/printk.c: fix bool assignements · d25d9fec

由 Neil Zhang 提交于 8月 06, 2014

Fix coccinelle warnings.
Signed-off-by: NNeil Zhang <zhangwm@marvell.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d25d9fec

printk: enable interrupts before calling console_trylock_for_printk() · 5874af20

由 Jan Kara 提交于 8月 06, 2014

We need interrupts disabled when calling console_trylock_for_printk()
only so that cpu id we pass to can_use_console() remains valid (for
other things console_sem provides all the exclusion we need and
deadlocks on console_sem due to interrupts are impossible because we use
down_trylock()).  However if we are rescheduled, we are guaranteed to
run on an online cpu so we can easily just get the cpu id in
can_use_console().

We can lose a bit of performance when we enable interrupts in
vprintk_emit() and then disable them again in console_unlock() but OTOH
it can somewhat reduce interrupt latency caused by console_unlock().

We differ from (reverted) commit 939f04be in that we avoid calling
console_unlock() from vprintk_emit() with lockdep enabled as that has
unveiled quite some bugs leading to system freezes during boot (e.g.
  https://lkml.org/lkml/2014/5/30/242,
  https://lkml.org/lkml/2014/6/28/521).
Signed-off-by: NJan Kara <jack@suse.cz>
Tested-by: NAndreas Bombe <aeb@debian.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5874af20

printk: miscellaneous cleanups · 249771b8

由 Alex Elder 提交于 8月 06, 2014

Some small cleanups to kernel/printk/printk.c.  None of them should
cause any change in behavior.

  - When CONFIG_PRINTK is defined, parenthesize the value of LOG_LINE_MAX.
  - When CONFIG_PRINTK is *not* defined, there is an extra LOG_LINE_MAX
    definition; delete it.
  - Pull an assignment out of a conditional expression in console_setup().
  - Use isdigit() in console_setup() rather than open coding it.
  - In update_console_cmdline(), drop a NUL-termination assignment;
    the strlcpy() call that precedes it guarantees it's not needed.
  - Simplify some logic in printk_timed_ratelimit().
Signed-off-by: NAlex Elder <elder@linaro.org>
Reviewed-by: NPetr Mladek <pmladek@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jan Kara <jack@suse.cz>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

249771b8

printk: use a clever macro · e99aa461

由 Alex Elder 提交于 8月 06, 2014

Use the IS_ENABLED() macro rather than #ifdef blocks to set certain
global values.
Signed-off-by: NAlex Elder <elder@linaro.org>
Acked-by: NBorislav Petkov <bp@suse.de>
Reviewed-by: NPetr Mladek <pmladek@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e99aa461

printk: fix some comments · 0b90fec3

由 Alex Elder 提交于 8月 06, 2014

Fix a few comments that don't accurately describe their corresponding
code.  It also fixes some minor typographical errors.
Signed-off-by: NAlex Elder <elder@linaro.org>
Reviewed-by: NPetr Mladek <pmladek@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jan Kara <jack@suse.cz>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0b90fec3

printk: rename DEFAULT_MESSAGE_LOGLEVEL · 42a9dc0b

由 Alex Elder 提交于 8月 06, 2014

Commit a8fe19eb ("kernel/printk: use symbolic defines for console
loglevels") makes consistent use of symbolic values for printk() log
levels.

The naming scheme used is different from the one used for
DEFAULT_MESSAGE_LOGLEVEL though.  Change that symbol name to be
MESSAGE_LOGLEVEL_DEFAULT for consistency.  And because the value of that
symbol comes from a similarly-named config option, rename
CONFIG_DEFAULT_MESSAGE_LOGLEVEL as well.
Signed-off-by: NAlex Elder <elder@linaro.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jan Kara <jack@suse.cz>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Petr Mladek <pmladek@suse.cz>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

42a9dc0b

printk: tweak do_syslog() to match comments · e97e1267

由 Alex Elder 提交于 8月 06, 2014

In do_syslog() there's a path used by kmsg_poll() and kmsg_read() that
only needs to know whether there's any data available to read (and not
its size).  These callers only check for non-zero return.  As a
shortcut, do_syslog() returns the difference between what has been
logged and what has been "seen."

The comments say that the "count of records" should be returned but it's
not.  Instead it returns (log_next_idx - syslog_idx), which is a
difference between buffer offsets--and the result could be negative.

The behavior is the same (it'll be zero or not in the same cases), but
the count of records is more meaningful and it matches what the comments
say.  So change the code to return that.
Signed-off-by: NAlex Elder <elder@linaro.org>
Cc: Petr Mladek <pmladek@suse.cz>
Cc: Jan Kara <jack@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e97e1267

printk: allow increasing the ring buffer depending on the number of CPUs · 23b2899f

由 Luis R. Rodriguez 提交于 8月 06, 2014

The default size of the ring buffer is too small for machines with a
large amount of CPUs under heavy load.  What ends up happening when
debugging is the ring buffer overlaps and chews up old messages making
debugging impossible unless the size is passed as a kernel parameter.
An idle system upon boot up will on average spew out only about one or
two extra lines but where this really matters is on heavy load and that
will vary widely depending on the system and environment.

There are mechanisms to help increase the kernel ring buffer for tracing
through debugfs, and those interfaces even allow growing the kernel ring
buffer per CPU.  We also have a static value which can be passed upon
boot.  Relying on debugfs however is not ideal for production, and
relying on the value passed upon bootup is can only used *after* an
issue has creeped up.  Instead of being reactive this adds a proactive
measure which lets you scale the amount of contributions you'd expect to
the kernel ring buffer under load by each CPU in the worst case
scenario.

We use num_possible_cpus() to avoid complexities which could be
introduced by dynamically changing the ring buffer size at run time,
num_possible_cpus() lets us use the upper limit on possible number of
CPUs therefore avoiding having to deal with hotplugging CPUs on and off.
This introduces the kernel configuration option LOG_CPU_MAX_BUF_SHIFT
which is used to specify the maximum amount of contributions to the
kernel ring buffer in the worst case before the kernel ring buffer flips
over, the size is specified as a power of 2.  The total amount of
contributions made by each CPU must be greater than half of the default
kernel ring buffer size (1 << LOG_BUF_SHIFT bytes) in order to trigger
an increase upon bootup.  The kernel ring buffer is increased to the
next power of two that would fit the required minimum kernel ring buffer
size plus the additional CPU contribution.  For example if LOG_BUF_SHIFT
is 18 (256 KB) you'd require at least 128 KB contributions by other CPUs
in order to trigger an increase of the kernel ring buffer.  With a
LOG_CPU_BUF_SHIFT of 12 (4 KB) you'd require at least anything over > 64
possible CPUs to trigger an increase.  If you had 128 possible CPUs the
amount of minimum required kernel ring buffer bumps to:

   ((1 << 18) + ((128 - 1) * (1 << 12))) / 1024 = 764 KB

Since we require the ring buffer to be a power of two the new required
size would be 1024 KB.

This CPU contributions are ignored when the "log_buf_len" kernel
parameter is used as it forces the exact size of the ring buffer to an
expected power of two value.

[pmladek@suse.cz: fix build]
Signed-off-by: NLuis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: NPetr Mladek <pmladek@suse.cz>
Tested-by: NDavidlohr Bueso <davidlohr@hp.com>
Tested-by: NPetr Mladek <pmladek@suse.cz>
Reviewed-by: NDavidlohr Bueso <davidlohr@hp.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Petr Mladek <pmladek@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Arun KS <arunks.linux@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

23b2899f

printk: make dynamic units clear for the kernel ring buffer · f5405172

由 Luis R. Rodriguez 提交于 8月 06, 2014

Signed-off-by: NLuis R. Rodriguez <mcgrof@suse.com>
Suggested-by: NDavidlohr Bueso <davidlohr@hp.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Petr Mladek <pmladek@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Arun KS <arunks.linux@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f5405172

printk: move power of 2 practice of ring buffer size to a helper · c0a318a3

由 Luis R. Rodriguez 提交于 8月 06, 2014

In practice the power of 2 practice of the size of the kernel ring
buffer remains purely historical but not a requirement, specially now
that we have LOG_ALIGN and use it for both static and dynamic
allocations.  It could have helped with implicit alignment back in the
days given the even the dynamically sized ring buffer was guaranteed to
be aligned so long as CONFIG_LOG_BUF_SHIFT was set to produce a
__LOG_BUF_LEN which is architecture aligned, since log_buf_len=n would
be allowed only if it was > __LOG_BUF_LEN and we always ended up
rounding the log_buf_len=n to the next power of 2 with
roundup_pow_of_two(), any multiple of 2 then should be also architecture
aligned.  These assumptions of course relied heavily on
CONFIG_LOG_BUF_SHIFT producing an aligned value but users can always
change this.

We now have precise alignment requirements set for the log buffer size
for both static and dynamic allocations, but lets upkeep the old
practice of using powers of 2 for its size to help with easy expected
scalable values and the allocators for dynamic allocations.  We'll reuse
this later so move this into a helper.
Signed-off-by: NLuis R. Rodriguez <mcgrof@suse.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Petr Mladek <pmladek@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Arun KS <arunks.linux@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c0a318a3

printk: make dynamic kernel ring buffer alignment explicit · 70300177

由 Luis R. Rodriguez 提交于 8月 06, 2014

We have to consider alignment for the ring buffer both for the default
static size, and then also for when an dynamic allocation is made when
the log_buf_len=n kernel parameter is passed to set the size
specifically to a size larger than the default size set by the
architecture through CONFIG_LOG_BUF_SHIFT.

The default static kernel ring buffer can be aligned properly if
architectures set CONFIG_LOG_BUF_SHIFT properly, we provide ranges for
the size though so even if CONFIG_LOG_BUF_SHIFT has a sensible aligned
value it can be reduced to a non aligned value.  Commit 6ebb017d
("printk: Fix alignment of buf causing crash on ARM EABI") by Andrew
Lunn ensures the static buffer is always aligned and the decision of
alignment is done by the compiler by using __alignof__(struct log).

When log_buf_len=n is used we allocate the ring buffer dynamically.
Dynamic allocation varies, for the early allocation called before
setup_arch() memblock_virt_alloc() requests a page aligment and for the
default kernel allocation memblock_virt_alloc_nopanic() requests no
special alignment, which in turn ends up aligning the allocation to
SMP_CACHE_BYTES, which is L1 cache aligned.

Since we already have the required alignment for the kernel ring buffer
though we can do better and request explicit alignment for LOG_ALIGN.
This does that to be safe and make dynamic allocation alignment
explicit.
Signed-off-by: NLuis R. Rodriguez <mcgrof@suse.com>
Tested-by: NPetr Mladek <pmladek@suse.cz>
Acked-by: NPetr Mladek <pmladek@suse.cz>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Petr Mladek <pmladek@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Arun KS <arunks.linux@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

70300177

include/linux/byteorder/generic.h: minor comment fix · 90a85643

由 Geoff Levand 提交于 8月 06, 2014

Signed-off-by: NGeoff Levand <geoff@infradead.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

90a85643

fs.h, drivers/hwmon/asus_atk0110.c: fix DEFINE_SIMPLE_ATTRIBUTE semicolon definition and use · 68be3029

由 Joe Perches 提交于 8月 06, 2014

The DEFINE_SIMPLE_ATTRIBUTE macro should not end in a ; Fix the one use
in the kernel tree that did not have a semicolon.
Signed-off-by: NJoe Perches <joe@perches.com>
Acked-by: NGuenter Roeck <linux@roeck-us.net>
Acked-by: NLuca Tettamanti <kronos.it@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

68be3029

./Makefile: tell gcc optimizer to never introduce new data races · 69102311

由 Jiri Kosina 提交于 8月 06, 2014

We have been chasing a memory corruption bug, which turned out to be
caused by very old gcc (4.3.4), which happily turned conditional load
into a non-conditional one, and that broke correctness (the condition
was met only if lock was held) and corrupted memory.

This particular problem with that particular code did not happen when
never gccs were used.  I've brought this up with our gcc folks, as I
wanted to make sure that this can't really happen again, and it turns
out it actually can.

Quoting Martin Jambor <mjambor@suse.cz>:
 "More current GCCs are more careful when it comes to replacing a
  conditional load with a non-conditional one, most notably they check
  that a store happens in each iteration of _a_ loop but they assume
  loops are executed.  They also perform a simple check whether the
  store cannot trap which currently passes only for non-const
  variables.  A simple testcase demonstrating it on an x86_64 is for
  example the following:

  $ cat cond_store.c

  int g_1 = 1;

  int g_2[1024] __attribute__((section ("safe_section"), aligned (4096)));

  int c = 4;

  int __attribute__ ((noinline))
  foo (void)
  {
    int l;
    for (l = 0; (l != 4); l++) {
      if (g_1)
        return l;
      for (g_2[0] = 0; (g_2[0] >= 26); ++g_2[0])
        ;
    }
    return 2;
  }

  int main (int argc, char* argv[])
  {
    if (mprotect (g_2, sizeof(g_2), PROT_READ) == -1)
      {
        int e = errno;
        error (e, e, "mprotect error %i", e);
      }
    foo ();
    __builtin_printf("OK\n");
    return 0;
  }
  /* EOF */
  $ ~/gcc/trunk/inst/bin/gcc cond_store.c -O2 --param allow-store-data-races=0
  $ ./a.out
  OK
  $ ~/gcc/trunk/inst/bin/gcc cond_store.c -O2 --param allow-store-data-races=1
  $ ./a.out
  Segmentation fault

  The testcase fails the same at least with 4.9, 4.8 and 4.7.  Therefore
  I would suggest building kernels with this parameter set to zero. I
  also agree with Jikos that the default should be changed for -O2.  I
  have run most of the SPEC 2k6 CPU benchmarks (gamess and dealII
  failed, at -O2, not sure why) compiled with and without this option
  and did not see any real difference between respective run-times"

Hopefully the default will be changed in newer gccs, but let's force it
for kernel builds so that we are on a safe side even when older gcc are
used.

The code in question was out-of-tree printk-in-NMI (yeah, surprise
suprise, once again) patch written by Petr Mladek, let me quote his
comment from our internal bugzilla:

 "I have spent few days investigating inconsistent state of kernel ring buffer.
  It went out that it was caused by speculative store generated by
  gcc-4.3.4.

  The problem is in assembly generated for make_free_space(). The functions is
  called the following way:

  + vprintk_emit();
      + log = MAIN_LOG; // with logbuf_lock
         or
         log = NMI_LOG; // with nmi_logbuf_lock
         cont_add(log, ...);
          + cont_flush(log, ...);
              + log_store(log, ...);
                    + log_make_free_space(log, ...);

  If called with log = NMI_LOG then only nmi_log_* global variables are safe to
  modify but the generated code does store also into (main_)log_* global
  variables:

  <log_make_free_space>:
         55                      push   %rbp
         89 f6                   mov    %esi,%esi

         48 8b 05 03 99 51 01    mov    0x1519903(%rip),%rax       # ffffffff82620868 <nmi_log_next_id>
         44 8b 1d ec 98 51 01    mov    0x15198ec(%rip),%r11d      # ffffffff82620858 <log_next_idx>
         8b 35 36 60 14 01       mov    0x1146036(%rip),%esi       # ffffffff8224cfa8 <log_buf_len>
         44 8b 35 33 60 14 01    mov    0x1146033(%rip),%r14d      # ffffffff8224cfac <nmi_log_buf_len>
         4c 8b 2d d0 98 51 01    mov    0x15198d0(%rip),%r13       # ffffffff82620850 <log_next_seq>
         4c 8b 25 11 61 14 01    mov    0x1146111(%rip),%r12       # ffffffff8224d098 <log_buf>
         49 89 c2                mov    %rax,%r10
         48 21 c2                and    %rax,%rdx
         48 8b 1d 0c 99 55 01    mov    0x155990c(%rip),%rbx       # ffffffff826608a0 <nmi_log_buf>
         49 c1 ea 20             shr    $0x20,%r10
         48 89 55 d0             mov    %rdx,-0x30(%rbp)
         44 29 de                sub    %r11d,%esi
         45 29 d6                sub    %r10d,%r14d
         4c 8b 0d 97 98 51 01    mov    0x1519897(%rip),%r9	# ffffffff82620840 <log_first_seq>
         eb 7e                   jmp    ffffffff81107029	<log_make_free_space+0xe9>
  [...]
         85 ff                   test   %edi,%edi                  # edi = 1 for NMI_LOG
         4c 89 e8                mov    %r13,%rax
         4c 89 ca                mov    %r9,%rdx
         74 0a                   je     ffffffff8110703d	<log_make_free_space+0xfd>
         8b 15 27 98 51 01       mov    0x1519827(%rip),%edx       # ffffffff82620860 <nmi_log_first_id>
         48 8b 45 d0             mov    -0x30(%rbp),%rax
         48 39 c2                cmp    %rax,%rdx                  # end of loop
         0f 84 da 00 00 00       je     ffffffff81107120 <log_make_free_space+0x1e0>
  [...]
         85 ff                   test   %edi,%edi                  # edi = 1 for NMI_LOG
         4c 89 0d 17 97 51 01    mov    %r9,0x1519717(%rip)        # ffffffff82620840 <log_first_seq>
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
                                 KABOOOM
         74 35                   je     ffffffff81107160		 <log_make_free_space+0x220>

  It stores log_first_seq when edi == NMI_LOG. This instructions are used also
  when edi == MAIN_LOG but the store is done speculatively before the condition
  is decided.  It is unsafe because we do not have "logbuf_lock" in NMI context
  and some other process migh modify "log_first_seq" in parallel"

I believe that the best course of action is both

 - building kernel (and anything multi-threaded, I guess) with that
   optimization turned off
 - persuade gcc folks to change the default for future releases
Signed-off-by: NJiri Kosina <jkosina@suse.cz>
Cc: Martin Jambor <mjambor@suse.cz>
Cc: Petr Mladek <pmladek@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Marek Polacek <polacek@redhat.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Steven Noonan <steven@uplinklabs.net>
Cc: Richard Biener <richard.guenther@gmail.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

69102311

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功