Commits · 4b58841149dcaa500ceba1d5378ae70622fe4899 · openanolis / cloud-kernel

20 Mar, 2014 · 1 commit

audit: Add generic compat syscall support · 4b588411

Committed by AKASHI Takahiro on Mar 15, 2014

lib/audit.c provides a generic function for auditing system calls.
This patch extends it for compat syscall support on bi-architectures
(32/64-bit) by adding lib/compat_audit.c.
Supporting this feature requires an architecture to:
 * add asm/unistd32.h for compat system call names
 * select CONFIG_AUDIT_ARCH_COMPAT_GENERIC
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>

4b588411

17 Jan, 2014 · 1 commit

percpu_counter: unbreak __percpu_counter_add() · d1969a84

Committed by Hugh Dickins on Jan 16, 2014

Commit 74e72f89 ("lib/percpu_counter.c: fix __percpu_counter_add()")
looked very plausible, but its arithmetic was badly wrong: obvious once
you see the fix, but maddening to get there from the weird tmpfs ENOSPCs
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Ming Lei <tom.leiming@gmail.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Fan Du <fan.du@windriver.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d1969a84
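
The arithmetic at stake in d1969a84 can be illustrated with a small user-space model. This is only a sketch: the struct and the demo_* names are hypothetical, a single CPU is modelled without locking, and the shape of the broken variant (subtracting the whole sum from the local slot) is an assumption, since the commit message does not quote the code.

```c
#include <stdint.h>

/* Hypothetical single-CPU, user-space model of a batched per-CPU counter. */
struct demo_counter {
	int64_t global;	/* stands in for the shared fbc->count */
	int32_t local;	/* stands in for this CPU's slot in fbc->counters */
};

/* Fold-or-accumulate step with the arithmetic the fix is assumed to use. */
static void demo_add(struct demo_counter *c, int32_t amount, int32_t batch)
{
	int64_t count = (int64_t)c->local + amount;

	if (count >= batch || count <= -batch) {
		c->global += count;
		/* local never absorbed 'amount', so remove only the old local value */
		c->local -= (int32_t)(count - amount);
	} else {
		c->local += amount;
	}
}

/* Assumed shape of the broken variant: subtracts the whole sum, leaving -amount. */
static void demo_add_broken(struct demo_counter *c, int32_t amount, int32_t batch)
{
	int64_t count = (int64_t)c->local + amount;

	if (count >= batch || count <= -batch) {
		c->global += count;
		c->local -= (int32_t)count;
	} else {
		c->local += amount;
	}
}

/* Usage: apply 'n' equal additions and return the counter's total. */
static int64_t demo_total(int fixed, int n, int32_t amount, int32_t batch)
{
	struct demo_counter c = { 0, 0 };

	for (int i = 0; i < n; i++) {
		if (fixed)
			demo_add(&c, amount, batch);
		else
			demo_add_broken(&c, amount, batch);
	}
	return c.global + c.local;
}
```

With ten additions of 5 and a batch of 32, demo_total(1, 10, 5, 32) yields 50, while the assumed broken variant loses one addition at the fold and yields 45.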

15 Jan, 2014 · 1 commit

lib/percpu_counter.c: fix __percpu_counter_add() · 74e72f89

Committed by Ming Lei on Jan 14, 2014

__percpu_counter_add() may be called from a softirq/hardirq handler
(blk_mq_queue_exit(), for example, is typically called from one),
so we need to call this_cpu_add() (an irq-safe helper) to update the
percpu counter; otherwise counts may be lost.

This fixes the problem that 'rmmod null_blk' hangs in blk_cleanup_queue()
because of miscounting of request_queue->mq_usage_counter.

This patch is v1 of the previous one, "lib/percpu_counter.c:
disable local irq when updating percpu couter", and takes Andrew's
approach, which may be more efficient for architectures (x86, s390)
that have an optimized this_cpu_add().
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Fan Du <fan.du@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

74e72f89

02 Dec, 2013 · 1 commit

KEYS: Fix multiple key add into associative array · 23fd78d7

Committed by David Howells on Dec 02, 2013

If sufficient keys (or keyrings) are added into a keyring such that a node in
the associative array's tree overflows (each node has a capacity N, currently
16) and such that all N+1 keys have the same index key segment for that level
of the tree (the level'th nibble of the index key), then assoc_array_insert()
calls ops->diff_objects() to indicate at which bit position the two index keys
vary.

However, __key_link_begin() passes a NULL object to assoc_array_insert() with
the intention of supplying the correct pointer later before we commit the
change.  This means that keyring_diff_objects() is given a NULL pointer as one
of its arguments which it does not expect.  This results in an oops like the
attached.

With the previous patch to fix the keyring hash function, this can be forced
much more easily by creating a keyring and only adding keyrings to it.  Add any
other sort of key and a different insertion path is taken - all 16+1 objects
must want to cluster in the same node slot.

This can be tested by:

	r=`keyctl newring sandbox @s`
	for ((i=0; i<=16; i++)); do keyctl newring ring$i $r; done

This should work fine, but oopses when the 17th keyring is added.

Since ops->diff_objects() is always called with the first pointer pointing to
the object to be inserted (ie. the NULL pointer), we can fix the problem by
changing the to-be-inserted object pointer to point to the index key passed
into assoc_array_insert() instead.

Whilst we're at it, we also switch the arguments so that they are the same as
for ->compare_object().

BUG: unable to handle kernel NULL pointer dereference at 0000000000000088
IP: [<ffffffff81191ee4>] hash_key_type_and_desc+0x18/0xb0
...
RIP: 0010:[<ffffffff81191ee4>] hash_key_type_and_desc+0x18/0xb0
...
Call Trace:
 [<ffffffff81191f9d>] keyring_diff_objects+0x21/0xd2
 [<ffffffff811f09ef>] assoc_array_insert+0x3b6/0x908
 [<ffffffff811929a7>] __key_link_begin+0x78/0xe5
 [<ffffffff81191a2e>] key_create_or_update+0x17d/0x36a
 [<ffffffff81192e0a>] SyS_add_key+0x123/0x183
 [<ffffffff81400ddb>] tracesys+0xdd/0xe2
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Stephen Gallagher <sgallagh@redhat.com>

23fd78d7

28 Nov, 2013 · 1 commit

lockref: include mutex.h rather than reinvent arch_mutex_cpu_relax · 14058d20

Committed by Will Deacon on Nov 27, 2013

arch_mutex_cpu_relax is already conditionally defined in mutex.h, so
simply include that header rather than replicate the code here.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

14058d20

20 Nov, 2013 · 1 commit

percpu-refcount: Add percpu-refcount.o to obj-y · fcd40d69

Committed by Randy Dunlap on Nov 19, 2013

Drop percpu_ida.o from lib-y since it is also listed in obj-y
and it doesn't need to be listed in both places.

Move percpu-refcount.o from lib-y to obj-y to fix build errors
in target_core_mod:

ERROR: "percpu_ref_cancel_init" [drivers/target/target_core_mod.ko] undefined!
ERROR: "percpu_ref_kill_and_confirm" [drivers/target/target_core_mod.ko] undefined!
ERROR: "percpu_ref_init" [drivers/target/target_core_mod.ko] undefined!
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

fcd40d69

15 Nov, 2013 · 7 commits

kfifo: kfifo_copy_{to,from}_user: fix copied bytes calculation · a019e48c

Committed by Lars-Peter Clausen on Nov 14, 2013

'copied' and 'len' are in bytes, while 'ret' is in elements, so we need to
multiply 'ret' by the size of one element to get the correct result.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: Stefani Seibold <stefani@seibold.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a019e48c
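
The unit mismatch described in a019e48c comes down to one line of arithmetic. A minimal sketch, assuming 'ret' counts whole elements that were not copied; the function and parameter names are hypothetical and this is not the kfifo code itself:

```c
#include <stddef.h>

/*
 * Hypothetical helper: 'len' is the requested size in bytes,
 * 'ret_elems' the number of elements that were NOT copied, and
 * 'esize' the size of one element in bytes. The caller is assumed
 * to guarantee ret_elems * esize <= len.
 */
static size_t demo_copied_bytes(size_t len, size_t ret_elems, size_t esize)
{
	/* Convert elements to bytes before subtracting from a byte count. */
	return len - ret_elems * esize;
}
```

For a 64-byte request with 4-byte elements and 3 elements left uncopied, the result is 52 bytes; subtracting the raw element count would have reported 61.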

llists-move-llist_reverse_order-from-raid5-to-llistc-fix · 0791a605

Committed by Andrew Morton on Nov 14, 2013

fix comment typo, per Jan

Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0791a605

llists: move llist_reverse_order from raid5 to llist.c · b89241e8

Committed by Christoph Hellwig on Nov 14, 2013

Make this useful helper available for other users.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b89241e8
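
For readers unfamiliar with the helper moved in b89241e8, reversing a singly linked list in place can be sketched as follows. This is a user-space illustration with hypothetical demo_* names, not the kernel's llist implementation, and it assumes the list has already been detached so no concurrent access occurs.

```c
#include <stddef.h>

/* Hypothetical node type standing in for a lock-less list entry. */
struct demo_node {
	struct demo_node *next;
	int value;
};

/* Reverse the list in place and return the new head. */
static struct demo_node *demo_reverse_order(struct demo_node *head)
{
	struct demo_node *new_head = NULL;

	while (head) {
		struct demo_node *tmp = head;

		head = head->next;	/* advance before relinking */
		tmp->next = new_head;	/* push onto the reversed list */
		new_head = tmp;
	}
	return new_head;
}

/* Usage: build 1 -> 2 -> 3, reverse it, and return the new first value. */
static int demo_first_after_reverse(void)
{
	struct demo_node c = { NULL, 3 };
	struct demo_node b = { &c, 2 };
	struct demo_node a = { &b, 1 };

	return demo_reverse_order(&a)->value;
}
```

Entries pushed onto a LIFO list come off newest-first, so a consumer that wants them in submission order can reverse the detached list once before walking it.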

vsprintf: ignore %n again · 9196436a

Committed by Kees Cook on Nov 14, 2013

This ignores %n in printf again, as was originally documented.
Implementing %n poses a greater security risk than utility, so it should
stay ignored.  To help anyone attempting to use %n, a warning will be
emitted if it is encountered.

Based on an earlier patch by Joe Perches.

Because %n was designed to write to pointers on the stack, it has been
frequently used as an attack vector when bugs are found that leak
user-controlled strings into functions that ultimately process format
strings.  While this class of bug can still be turned into an
information leak, removing %n eliminates the common method of elevating
such a bug into an arbitrary kernel memory writing primitive,
significantly reducing the danger of this class of bug.

For seq_file users that need to know the length of a written string for
padding, please see seq_setwidth() and seq_pad() instead.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Joe Perches <joe@perches.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9196436a

lockref: use BLOATED_SPINLOCKS to avoid explicit config dependencies · 57f4257e

Committed by Peter Zijlstra on Nov 14, 2013

Avoid the fragile Kconfig construct guestimating spinlock_t sizes; use a
friendly compile-time test to determine this.

[kirill.shutemov@linux.intel.com: drop CONFIG_CMPXCHG_LOCKREF]
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

57f4257e

random32: use msecs_to_jiffies for reseed timer · 0125737a

Committed by Daniel Borkmann on Nov 12, 2013

Use msecs_to_jiffies() for these calculations, as it takes the
different HZ configurations into account when converting the timer
interval, and it also makes the code more readable.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

0125737a

random32: add __init prefix to prandom_start_seed_timer · 66b25142

Committed by Daniel Borkmann on Nov 12, 2013

We only call that in functions annotated with __init, so add the __init
prefix to prandom_start_seed_timer() as well, so that the kernel can
make use of this hint and we can possibly free up resources after its
use. And since it's an internal function, rename it to
__prandom_start_seed_timer().
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

66b25142

13 Nov, 2013 · 7 commits

lib/genalloc: add a helper function for DMA buffer allocation · 684f0d3d

Committed by Nicolin Chen on Nov 12, 2013

When using pool space for DMA buffer, there might be duplicated calling of
gen_pool_alloc() and gen_pool_virt_to_phys() in each implementation.

Thus it's better to add a simple helper function, a compatible one to the
common dma_alloc_coherent(), to save some code.
Signed-off-by: Nicolin Chen <b42378@freescale.com>
Cc: "Hans J. Koch" <hjk@hansjkoch.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Eric Miao <eric.y.miao@gmail.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haojian Zhuang <haojian.zhuang@gmail.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Kevin Hilman <khilman@deeprootsystems.com>
Cc: Liam Girdwood <lgirdwood@gmail.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Mauro Carvalho Chehab <m.chehab@samsung.com>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Sekhar Nori <nsekhar@ti.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

684f0d3d

lib/digsig.c: use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...)) · ff6092a8

Committed by Duan Jiong on Nov 12, 2013

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Cc: James Morris <james.l.morris@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ff6092a8
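
The idiom named in the title of ff6092a8 can be modelled in user space as below. The demo_* helpers are hypothetical stand-ins written for this sketch, not the kernel's definitions, and the 4095 limit is an assumption about the range of encoded error values.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed upper bound on error codes encoded in a pointer. */
#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *demo_err_ptr(long error) { return (void *)error; }

/* Decode the errno value carried by an error pointer. */
static long demo_ptr_err(const void *ptr) { return (long)ptr; }

/* True when the pointer lies in the top DEMO_MAX_ERRNO addresses. */
static int demo_is_err(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-DEMO_MAX_ERRNO;
}

/* Re-type an error pointer without decoding and re-encoding it. */
static void *demo_err_cast(const void *ptr) { return (void *)ptr; }

struct demo_key;	/* hypothetical opaque types */
struct demo_sig;

/* Hypothetical lookup that always fails with -ENOENT. */
static struct demo_key *demo_lookup_key(void)
{
	return demo_err_ptr(-ENOENT);
}

/* Usage: forward a lookup failure through a function of another type. */
static struct demo_sig *demo_verify(void)
{
	struct demo_key *key = demo_lookup_key();

	if (demo_is_err(key))
		return demo_err_cast(key);	/* not demo_err_ptr(demo_ptr_err(key)) */
	return NULL;
}
```

Both spellings carry the same error value; the cast form simply states the intent (same error, different pointer type) in one call.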

lib/vsprintf.c: document formats for dentry and struct file · c0d92a57

Committed by Olof Johansson on Nov 12, 2013

Looks like these were added to Documentation/printk-formats.txt but
not the in-file table.
Signed-off-by: Olof Johansson <olof@lixom.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c0d92a57

lib/debugobjects.c: remove unnecessary work pending test · d3773ba1

Committed by Xie XiuQi on Nov 12, 2013

Remove unnecessary work pending test before calling schedule_work().  It
has been tested in queue_work_on() already.  No functional change.
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d3773ba1

vsprintf: check real user/group id for %pK · 312b4e22

Committed by Ryan Mallon on Nov 12, 2013

Some setuid binaries will allow reading of files which have read
permission by the real user id.  This is problematic with files which
use %pK because the file access permission is checked at open() time,
but the kptr_restrict setting is checked at read() time.  If a setuid
binary opens a %pK file as an unprivileged user, and then elevates
permissions before reading the file, then kernel pointer values may be
leaked.

This happens for example with the setuid pppd application on Ubuntu 12.04:

  $ head -1 /proc/kallsyms
  00000000 T startup_32

  $ pppd file /proc/kallsyms
  pppd: In file /proc/kallsyms: unrecognized option 'c1000000'

This will only leak the pointer value from the first line, but other
setuid binaries may leak more information.

Fix this by adding a check that, in addition to the current process having
CAP_SYSLOG, the effective user and group ids are equal to the real ids.
If a setuid binary reads the contents of a file which uses %pK, then the
pointer values will be printed as NULL if the real user is unprivileged.

Update the sysctl documentation to reflect the changes, and also correct
the documentation to state that kptr_restrict=0 is the default.

This is only a temporary solution to the issue.  The correct solution is
to do the permission check at open() time on files, and to replace %pK
with a function which checks the open() time permission.  %pK uses in
printk should be removed, since no sane permission check can be done, and
instead protected by using dmesg_restrict.
Signed-off-by: Ryan Mallon <rmallon@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Joe Perches <joe@perches.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

312b4e22
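
The check that 312b4e22 describes can be sketched as a predicate over the caller's credentials. This is a user-space model under stated assumptions: struct demo_cred and demo_pointer_visible() are hypothetical, and the restricted kptr_restrict mode is assumed to be in effect.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the credentials relevant to the check. */
struct demo_cred {
	unsigned int uid, euid;	/* real and effective user id */
	unsigned int gid, egid;	/* real and effective group id */
	bool cap_syslog;	/* whether the process holds CAP_SYSLOG */
};

/*
 * Returns true when a %pK pointer may be shown. A setuid binary has
 * euid != uid, so it fails the check even if it holds CAP_SYSLOG,
 * and the pointer would be printed as NULL.
 */
static bool demo_pointer_visible(const struct demo_cred *cred)
{
	return cred->cap_syslog &&
	       cred->euid == cred->uid &&
	       cred->egid == cred->gid;
}
```

In the pppd case from the commit message, the real uid is an unprivileged user while the effective uid is root, so the predicate is false and no kernel address leaks.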

percpu: add test module for various percpu operations · 623fd807

Committed by Greg Thelen on Nov 12, 2013

Tests various percpu operations.

Enable with CONFIG_PERCPU_TEST=m.
Signed-off-by: Greg Thelen <gthelen@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

623fd807

mm: do not walk all of system memory during show_mem · c78e9363

Committed by Mel Gorman on Nov 12, 2013

It has been reported on very large machines that show_mem is taking almost
5 minutes to display information.  This is a serious problem if there is
an OOM storm.  The bulk of the cost is in show_mem doing a very expensive
PFN walk to give us the following information

  Total RAM:       Also available as totalram_pages
  Highmem pages:   Also available as totalhigh_pages
  Reserved pages:  Can be inferred from the zone structure
  Shared pages:    PFN walk required
  Unshared pages:  PFN walk required
  Quick pages:     Per-cpu walk required

Only the shared/unshared page counts require a full PFN walk, but that
information is useless.  It is also inaccurate as page pins of unshared
pages would be accounted for as shared.  Even if the information was
accurate, I'm struggling to think how the shared/unshared information
could be useful for debugging OOM conditions.  Maybe it was useful before
rmap existed when reclaiming shared pages was costly but it is less
relevant today.

The PFN walk could be optimised a bit but why bother as the information is
useless.  This patch deletes the PFN walker and infers the total RAM,
highmem and reserved pages count from struct zone.  It omits the
shared/unshared page usage on the grounds that it is useless.  It also
corrects the reporting of HighMem as HighMem/MovableOnly as ZONE_MOVABLE
has similar problems to HighMem with respect to lowmem/highmem exhaustion.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c78e9363

12 Nov, 2013 · 5 commits

random32: add test cases for taus113 implementation · a6a9c0f1

Committed by Daniel Borkmann on Nov 11, 2013

We generated a battery of 100 test cases from the GSL taus113
implementation and compare the results from a particular seed and a
particular iteration with our implementation in the kernel. We have
verified on 32 and 64 bit machines that our taus113 kernel
implementation gives the same results as the GSL taus113 implementation:

  [    0.147370] prandom: seed boundary self test passed
  [    0.148078] prandom: 100 self tests passed

This is a Kconfig option that is disabled by default, just like the
crc32 init selftests, in order not to unnecessarily slow down the boot
process. We also refactored out prandom_seed_very_weak() as it's now
used in multiple places in order to reduce redundant code.

GSL code we used for generating test cases:

  int i, j;
  srand(time(NULL));
  for (i = 0; i < 100; ++i) {
    int iteration = 500 + (rand() % 500);
    gsl_rng_default_seed = rand() + 1;
    gsl_rng *r = gsl_rng_alloc(gsl_rng_taus113);
    printf("\t{ %lu, ", gsl_rng_default_seed);
    for (j = 0; j < iteration - 1; ++j)
      gsl_rng_get(r);
    printf("%u, %lu },\n", iteration, gsl_rng_get(r));
    gsl_rng_free(r);
  }

Joint work with Hannes Frederic Sowa.

Cc: Florian Weimer <fweimer@redhat.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

a6a9c0f1

random32: upgrade taus88 generator to taus113 from errata paper · a98814ce

Committed by Daniel Borkmann on Nov 11, 2013

Since we use prandom*() functions quite often in networking code
i.e. in UDP port selection, netfilter code, etc, upgrade the PRNG
from Pierre L'Ecuyer's original paper "Maximally Equidistributed
Combined Tausworthe Generators", Mathematics of Computation, 65,
213 (1996), 203--213 to the version published in his errata paper [1].

The Tausworthe generator is a maximally-equidistributed generator,
that is fast and has good statistical properties [1].

The version presented there upgrades the 3 state LFSR to a 4 state
LFSR with increased periodicity from about 2^88 to 2^113. The
algorithm is presented in [1] by the very same author who also
designed the original algorithm in [2].

Also, by increasing the state, we make it a bit harder for attackers
to "guess" the PRNGs internal state. See also discussion in [3].

Now, as we use the sort of weak initialization discussed in [3]
only from core_initcall() until late_initcall() time [*] for
prandom32*() users, namely in prandom_init(), it is less relevant
from late_initcall() onwards, as we overwrite the seeds through
prandom_reseed() anyway with a seed source of higher entropy, that
is, get_random_bytes(). In other words, an exhaustive key search of
96 bits would be needed. With this patch, the state search
increases further to 128 bits. Initialization needs
to make sure that s1 > 1, s2 > 7, s3 > 15, s4 > 127.

The taus88 and taus113 algorithms are also part of GSL. I added a test
case in the next patch to verify the internal behaviour of this patch
against GSL, and ran tests with the dieharder 3.31.1 RNG test suite:

$ dieharder -g 052 -a -m 10 -s 1 -S 4137730333 #taus88
$ dieharder -g 054 -a -m 10 -s 1 -S 4137730333 #taus113

With this seed configuration, in order to compare both, we get
the following differences:

algorithm                 taus88           taus113
rands/second [**]         1.61e+08         1.37e+08
sts_serial(4, 1st run)    WEAK             PASSED
sts_serial(9, 2nd run)    WEAK             PASSED
rgb_lagged_sum(31)        WEAK             PASSED

We took out diehard_sums test as according to the authors it is
considered broken and unusable [4]. Despite that and the slight
decrease in performance (which is acceptable), taus113 here passes
all 113 tests (only rgb_minimum_distance_5 in WEAK, the rest PASSED).
In general, taus/taus113 is considered "very good" by the authors
of dieharder [5].

The papers [1][2] state that a single warm-up step, running quicktaus
once on each state, is sufficient to ensure proper initialization
of ~s_{0}:

Our selection of (s) according to Table 1 of [1] row 1 holds the
condition L - k <= r - s, that is,

  (32 32 32 32) - (31 29 28 25) <= (25 27 15 22) - (18 2 7 13)

with r = k - q and q = (6 2 13 3) as also stated by the paper.
So according to [2] we are safe with one round of quicktaus for
initialization. However we decided to include the warm-up phase
of the PRNG as done in GSL in every case as a safety net. We also
use the warm up phase to make the output of the RNG easier to
verify by the GSL output.

In prandom_init(), we also mix random_get_entropy() into the seed, just
as drivers/char/random.c does: jiffies ^ random_get_entropy().
random_get_entropy() is get_cycles(). XOR is entropy-preserving, so
it is fine if get_cycles() is not implemented by some architectures.

Note, this PRNG is *not* used for cryptography in the kernel, but
rather as a fast PRNG for various randomizations i.e. in the
networking code, or elsewhere for debugging purposes, for example.

[*]: In order to generate some "sort of pseudo-randomness", since
get_random_bytes() is not yet available for us, we use jiffies and
initialize states s1 - s3 with a simple linear congruential generator
(LCG), that is x <- x * 69069; and derive s2, s3 from the 32bit
initialization of s1. So the above quote from [3] accounts only
for the time from core to late initcall, not afterwards.
[**] Single threaded run on MacBook Air w/ Intel Core i5-3317U

 [1] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme2.ps
 [2] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme.ps
 [3] http://thread.gmane.org/gmane.comp.encryption.general/12103/
 [4] http://code.google.com/p/dieharder/source/browse/trunk/libdieharder/diehard_sums.c?spec=svn490&r=490#20
 [5] http://www.phy.duke.edu/~rgb/General/dieharder.php

Joint work with Hannes Frederic Sowa.

Cc: Florian Weimer <fweimer@redhat.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

a98814ce
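
The parameters quoted in a98814ce, k = (31 29 28 25), q = (6 2 13 3) and s = (18 2 7 13), are enough to sketch the four-component generator in user space. The demo_* names are hypothetical; each component is assumed to shift right by k - s and to mask off its low 32 - k bits, which is how the constants below were derived. This is an illustration of the technique, not the kernel source.

```c
#include <stdint.h>

/* Hypothetical state: one 32-bit word per LFSR component. */
struct demo_rnd_state {
	uint32_t s1, s2, s3, s4;
};

/*
 * One quicktaus step for a single component:
 *   a = q, b = k - s, c = mask clearing the low (32 - k) bits, d = s.
 */
#define DEMO_TAUSWORTHE(s, a, b, c, d) \
	((((s) & (c)) << (d)) ^ ((((s) << (a)) ^ (s)) >> (b)))

/* Advance all four components and combine them with XOR. */
static uint32_t demo_taus113(struct demo_rnd_state *st)
{
	st->s1 = DEMO_TAUSWORTHE(st->s1,  6U, 13U, 4294967294U, 18U);
	st->s2 = DEMO_TAUSWORTHE(st->s2,  2U, 27U, 4294967288U,  2U);
	st->s3 = DEMO_TAUSWORTHE(st->s3, 13U, 21U, 4294967280U,  7U);
	st->s4 = DEMO_TAUSWORTHE(st->s4,  3U, 12U, 4294967168U, 13U);

	return st->s1 ^ st->s2 ^ st->s3 ^ st->s4;
}

/* Usage: two generators started from the same valid state stay in step. */
static int demo_streams_match(int steps)
{
	struct demo_rnd_state x = { 2U, 8U, 16U, 128U };
	struct demo_rnd_state y = x;

	for (int i = 0; i < steps; i++)
		if (demo_taus113(&x) != demo_taus113(&y))
			return 0;
	return 1;
}
```

The starting state used in demo_streams_match() is the smallest one that satisfies s1 > 1, s2 > 7, s3 > 15, s4 > 127; a real caller would seed from a better source and discard a few outputs first, as the message describes.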

random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized · 4af712e8

Committed by Hannes Frederic Sowa on Nov 11, 2013

The Tausworthe PRNG is initialized at late_initcall time. At that time the
entropy pool serving get_random_bytes is not filled sufficiently. This
patch adds an additional reseeding step as soon as the nonblocking pool
gets marked as initialized.

On some machines it might be possible that late_initcall gets called after
the pool has been initialized. In this situation we won't reseed again.

(A call to prandom_reseed_late() blocks later invocations of early
reseed attempts.)

Joint work with Daniel Borkmann.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>

4af712e8

random32: add periodic reseeding · 6d319202

Committed by Hannes Frederic Sowa on Nov 11, 2013

The current Tausworthe PRNG is never reseeded with truly random data after
the first attempt in late_initcall. As this PRNG is used for some critical
random data, e.g. UDP port randomization, we should do better and reseed
the PRNG once in a while with truly random data from get_random_bytes().

When we reseed with prandom_seed we now also make sure to throw the first
output away. This suffices for the reseeding procedure.

The delay calculation is based on a proposal from Eric Dumazet.

Joint work with Daniel Borkmann.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6d319202

random32: fix off-by-one in seeding requirement · 51c37a70

Committed by Daniel Borkmann on Nov 11, 2013

For properly initialising the Tausworthe generator [1], we have
a strict seeding requirement, that is, s1 > 1, s2 > 7, s3 > 15.

Commit 697f8d03 ("random32: seeding improvement") introduced
a __seed() function that imposes boundary checks proposed by the
errata paper [2] to properly ensure above conditions.

However, we're off by one, as the function is implemented as:
"return (x < m) ? x + m : x;", and called with __seed(X, 1),
__seed(X, 7), __seed(X, 15). Thus, an unwanted seed of 1, 7, 15
would be possible, whereas the lower boundary should actually
be at least 2, 8, 16, just as GSL does. Fix this, as otherwise
an initialization with an unwanted seed could mean that the
Tausworthe PRNG's properties cannot be ensured.

Note that this PRNG is *not* used for cryptography in the kernel.

 [1] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme.ps
 [2] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme2.ps

Joint work with Hannes Frederic Sowa.

Fixes: 697f8d03 ("random32: seeding improvement")
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

51c37a70
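
The off-by-one in 51c37a70 is easy to reproduce with the expression quoted in the message. In this sketch demo_seed_min() is a hypothetical user-space copy of that expression; passing 2, 8 and 16 as the bound is assumed here as one way to meet the requirement, since the message does not quote the fixed code.

```c
#include <stdint.h>

/* The expression quoted in the commit message: bump x if it is below m. */
static uint32_t demo_seed_min(uint32_t x, uint32_t m)
{
	return (x < m) ? x + m : x;
}

/* Usage: does a seed bounded with 'm' satisfy the strict requirement x > limit? */
static int demo_seed_ok(uint32_t x, uint32_t m, uint32_t limit)
{
	return demo_seed_min(x, m) > limit;
}
```

With the old bound, demo_seed_min(1, 1) returns 1, which violates s1 > 1; with a bound of 2, the same input becomes 3, and an input of 0 becomes 2.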

08 Nov, 2013 · 1 commit

percpu-refcount: Add EXPORT_SYMBOL to use percpu_ref from modules · c9e8d128

Committed by Nicholas Bellinger on Nov 07, 2013

This patch adds EXPORT_SYMBOL() for percpu_ref_init(),
percpu_ref_cancel_init() and percpu_ref_kill_and_confirm() so
that percpu refcounting can be used by external modules.

Cc: Kent Overstreet <kmo@daterainc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

c9e8d128

07 Nov, 2013 · 1 commit

Revert "sysfs: drop kobj_ns_type handling" · a1212d27

Committed by Linus Torvalds on Nov 07, 2013

This reverts commit cb26a311.

It mysteriously causes NetworkManager to not find the wireless device
for me.  As far as I can tell, Tejun *meant* for this commit to not make
any semantic changes, but there clearly are some.  So revert it, taking
into account some of the calling convention changes that happened in
this area in subsequent commits.

Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a1212d27

06 Nov, 2013 · 3 commits

locking: Move the percpu-rwsem code to kernel/locking/ · 32cf7c3c

Committed by Peter Zijlstra on Nov 04, 2013

32cf7c3c

locking: Move the rwsem code to kernel/locking/ · ed428bfc

Committed by Peter Zijlstra on Oct 31, 2013

Notably: changed lib/rwsem* targets from lib- to obj-, no idea about
the ramifications of that.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-g0kynfh5feriwc6p3h6kpbw6@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

ed428bfc

locking: Move the spinlock code to kernel/locking/ · 60fc2874

Committed by Peter Zijlstra on Oct 31, 2013

60fc2874

05 Nov, 2013 · 2 commits

lib: crc32: reduce number of cases for crc32{, c}_combine · 16514839

Committed by Daniel Borkmann on Nov 04, 2013

We can safely reduce the number of test cases by a tenth.
There is no particular need to run as many as we're running
now for crc32{,c}_combine, that gives us still ~8000 tests
we're doing if people run kernels with crc selftests enabled
which is perfectly fine.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

16514839

lib: crc32: conditionally resched when running testcases · cc0ac199

Committed by Daniel Borkmann on Nov 04, 2013

Fengguang reports that when crc32 selftests are running on startup, on
some e.g. 32bit systems, we can get a CPU stall like "INFO: rcu_sched
self-detected stall on CPU { 0} (t=2101 jiffies g=4294967081 c=4294967080
q=41)". As this is not intended, add a cond_resched() at the end of a
test case to fix it. Introduced by efba721f ("lib: crc32: add test cases
for crc32{, c}_combine routines").
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cc0ac199

04 Nov, 2013 · 3 commits

lib: crc32: add test cases for crc32{, c}_combine routines · efba721f

Committed by Daniel Borkmann on Oct 30, 2013

We already have 100 test cases for the CRCs themselves, so split the
test buffer with a priori known checksums, and test the combined CRC of
two blocks against the CRC of the whole block for the same result.

Output/result with CONFIG_CRC32_SELFTEST=y:

  [    2.687095] crc32: CRC_LE_BITS = 64, CRC_BE BITS = 64
  [    2.687097] crc32: self tests passed, processed 225944 bytes in 278177 nsec
  [    2.687383] crc32c: CRC_LE_BITS = 64
  [    2.687385] crc32c: self tests passed, processed 225944 bytes in 141708 nsec
  [    7.336771] crc32_combine: 113072 self tests passed
  [   12.050479] crc32c_combine: 113072 self tests passed
  [   17.633089] alg: No test for crc32 (crc32-pclmul)
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

efba721f

lib: crc32: add functionality to combine two crc32{, c}s in GF(2) · 6e95fcaa

Committed by Daniel Borkmann on Oct 30, 2013

This patch adds a combinator to merge two or more crc32{,c}s
into a new one. This is useful for checksum computations of
fragmented skbs that use crc32/crc32c as checksums.

The arithmetic for combining both in GF(2) was taken from zlib and
slightly modified. Passing only the two CRCs is insufficient, as
merging needs both CRCs and the length of the second piece.
The code is generic, so that only the polynomial needs to be
passed for crc32_le and crc32c_le respectively.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

6e95fcaa
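
Why 6e95fcaa needs the length of the second piece can be seen in a user-space sketch. The demo_* functions are hypothetical; the CRC here is a plain bitwise little-endian CRC-32 without pre- or post-inversion, the second CRC is assumed to be computed with a zero seed, and the zlib-style matrix arithmetic is replaced by a simple loop over zero bytes, which is slower but shows the same linearity.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_CRC32_POLY_LE 0xEDB88320u	/* reflected CRC-32 polynomial */

/* Bitwise CRC update, no inversion of the seed or the result. */
static uint32_t demo_crc32_le(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ ((crc & 1u) ? DEMO_CRC32_POLY_LE : 0u);
	}
	return crc;
}

/* Advance 'crc' as if 'len' zero bytes had followed (naive O(len) shift). */
static uint32_t demo_crc32_le_shift(uint32_t crc, size_t len)
{
	static const unsigned char zero = 0;

	while (len--)
		crc = demo_crc32_le(crc, &zero, 1);
	return crc;
}

/* Combine crc1 (first piece) with crc2 (second piece, seeded with 0). */
static uint32_t demo_crc32_le_combine(uint32_t crc1, uint32_t crc2, size_t len2)
{
	return demo_crc32_le_shift(crc1, len2) ^ crc2;
}

/* Usage: check that combining two fragments matches the CRC of the whole. */
static int demo_combine_matches(const char *msg, size_t len, size_t split,
				uint32_t seed)
{
	const unsigned char *p = (const unsigned char *)msg;
	uint32_t full = demo_crc32_le(seed, p, len);
	uint32_t crc1 = demo_crc32_le(seed, p, split);
	uint32_t crc2 = demo_crc32_le(0, p + split, len - split);

	return demo_crc32_le_combine(crc1, crc2, len - split) == full;
}
```

Because the CRC register update is linear over GF(2), the first CRC only needs to be shifted past len2 bytes and XORed with the second; a production version would replace the zero-byte loop with repeated squaring so the cost grows with log(len2).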

lib: crc32: clean up spacing in test cases · d921e049

Committed by Daniel Borkmann on Oct 30, 2013

This is nothing more than a whitespace cleanup, as 80 chars is not a
hard but a soft limit, and otherwise makes the test cases array really
look ugly. So fix it up.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

d921e049

01 Nov, 2013 · 1 commit

lib/scatterlist.c: don't flush_kernel_dcache_page on slab page · 3d77b50c

Committed by Ming Lei on Oct 31, 2013

Commit b1adaf65 ("[SCSI] block: add sg buffer copy helper
functions") introduces two sg buffer copy helpers, and calls
flush_kernel_dcache_page() on pages in SG list after these pages are
written to.

Unfortunately, the commit may introduce a potential bug:

 - Before sending some SCSI commands, a kmalloc() buffer may be passed to
   the block layer, so flush_kernel_dcache_page() can end up seeing a
   slab page

 - According to cachetlb.txt, flush_kernel_dcache_page() is only called
   on "a user page", which surely can't be a slab page.

 - ARCH's implementation of flush_kernel_dcache_page() may use page
   mapping information to do optimization so page_mapping() will see the
   slab page, then VM_BUG_ON() is triggered.

Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled,
and this patch fixes the bug by adding test of '!PageSlab(miter->page)'
before calling flush_kernel_dcache_page().
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Tested-by: Simon Baatz <gmbnomis@gmail.com>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Tejun Heo <tj@kernel.org>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: <stable@vger.kernel.org>	[3.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3d77b50c

29 Oct, 2013 · 1 commit

Kconfig: make KOBJECT_RELEASE debugging require timer debugging · 2a999aa0

Committed by Linus Torvalds on Oct 29, 2013

Without the timer debugging, the delayed kobject release will just
result in undebuggable oopses if it triggers any latent bugs.  That
doesn't actually help debugging at all.

So make DEBUG_KOBJECT_RELEASE depend on DEBUG_OBJECTS_TIMERS to avoid
having people enable one without the other.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2a999aa0

25 Oct, 2013 · 3 commits

percpu_ida: add an API to return free tags · 1dddc01a

Committed by Shaohua Li on Oct 15, 2013

Add an API to return free tags, blk-mq-tag will use it.

Note that this just returns a snapshot of the number of free tags.
blk-mq-tag uses it in two ways: one is informational output for
diagnosis, the other is a quick check for free tags when dispatching
requests. Neither requires high precision.

Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

1dddc01a

percpu_ida: add percpu_ida_for_each_free · 7fc2ba17

Committed by Shaohua Li on Oct 15, 2013

Add a new API to iterate free ids. blk-mq-tag will use it.

Note that this is not guaranteed to iterate over all free ids strictly;
callers should be aware of this. blk-mq uses it as a sanity check for
timed-out requests, so it can tolerate the limitation.

Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

7fc2ba17

percpu_ida: make percpu_ida percpu size/batch configurable · e26b53d0

Committed by Shaohua Li on Oct 15, 2013

Make percpu_ida percpu size/batch configurable. The block-mq-tag will
use it.

With block-mq using percpu_ida to manage tags, performance improves.
My test ran on a 2-socket machine with 12 processes spread across the
2 sockets, so any lock contention or IPI overhead should be stressed
heavily. Testing was done with null-blk.

hw_queue_depth	nopatch iops	patch iops
64		~800k/s		~1470k/s
2048		~4470k/s	~4340k/s

Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

e26b53d0

openanolis / cloud-kernel · synced successfully 11 months ago