1. May 29, 2015 (1 commit)
    • D
      percpu_counter: batch size aware __percpu_counter_compare() · 80188b0d
      Committed by Dave Chinner
      XFS uses non-standard batch sizes to avoid frequent global
      counter updates on its allocated inode counters, as they increment
      or decrement in batches of 64 inodes. Hence the standard percpu
      counter batch of 32 means that the counter is effectively a global
      counter. Currently XFS uses a batch size of 128 so that it doesn't
      take the global lock on every single modification.
      
      However, XFS also needs to compare accurately against zero, which
      means we need to use percpu_counter_compare(), and that has a
      hard-coded batch size of 32, and hence will spuriously fail to
      detect when it is supposed to use precise comparisons and hence
      the accounting goes wrong.
      
      Add __percpu_counter_compare() to take a custom batch size so we can
      use it sanely in XFS and factor percpu_counter_compare() to use it.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      80188b0d
  2. May 28, 2015 (2 commits)
    • R
      cpumask_set_cpu_local_first => cpumask_local_spread, lament · f36963c9
      Committed by Rusty Russell
      da91309e (cpumask: Utility function to set n'th cpu...) created a
      genuinely weird function.  I never saw it before, it went through DaveM.
      (He only does this to make us other maintainers feel better about our own
      mistakes.)
      
      cpumask_set_cpu_local_first's purpose is to say "I need to spread things
      across N online cpus, choose the ones on this numa node first"; you call
      it in a loop.
      
      It can fail.  One of the two callers ignores this, the other aborts and
      fails the device open.
      
      It can fail in two ways: allocating the off-stack cpumask, or through a
      convoluted codepath which AFAICT can only occur if cpu_online_mask
      changes.  Which shouldn't happen, because if cpu_online_mask can change
      while you call this, it could return a now-offline cpu anyway.
      
      It contains a nonsensical test "!cpumask_of_node(numa_node)".  This was
      drawn to my attention by Geert, who said this causes a warning on Sparc.
      It sets a single bit in a cpumask instead of returning a cpu number,
      because that's what the callers want.
      
      It could be made more efficient by passing the previous cpu rather than
      an index, but that would be more invasive to the callers.
      
      Fixes: da91309e
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (then rebased)
      Tested-by: Amir Vadai <amirv@mellanox.com>
      Acked-by: Amir Vadai <amirv@mellanox.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      f36963c9
    • D
      test_bpf: add similarly conflicting jump test case only for classic · bde28bc6
      Committed by Daniel Borkmann
      While 3b529602 ("test_bpf: add more eBPF jump torture cases")
      added the int3 bug test case only for eBPF, which needs exactly 11
      passes to converge, here's a version for classic BPF with 11 passes,
      and one that would need 70 passes on x86_64 to actually converge for
      being successfully JITed. Effectively, all jumps are being optimized
      out resulting in a JIT image of just 89 bytes (from originally max
      BPF insns), only returning K.
      
      Might be useful as a recipe for folks wanting to craft a test case
      when backporting the fix in commit 3f7352bf ("x86: bpf_jit: fix
      compilation of large bpf programs") while not having eBPF. The 2nd
      one is delegated to the interpreter as the last pass still results
      in shrinking, in other words, this one won't be JITed on x86_64.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bde28bc6
  3. May 25, 2015 (1 commit)
  4. May 23, 2015 (1 commit)
  5. May 17, 2015 (1 commit)
    • H
      rhashtable: Add cap on number of elements in hash table · 07ee0722
      Committed by Herbert Xu
      We currently have no limit on the number of elements in a hash table.
      This is a problem because some users (tipc) set a ceiling on the
      maximum table size and when that is reached the hash table may
      degenerate.  Others may encounter OOM when growing and if we allow
      insertions when that happens, the hash table performance may also
      suffer.
      
      This patch adds a new parameter insecure_max_entries which becomes
      the cap on the table.  If unset it defaults to max_size * 2.  If
      it is also zero it means that there is no cap on the number of
      elements in the table.  However, the table will grow whenever the
      utilisation hits 100% and if that growth fails, you will get ENOMEM
      on insertion.
      
      As allowing oversubscription is potentially dangerous, the name
      contains the word insecure.
      
      Note that the cap is not a hard limit.  This is done for performance
      reasons as enforcing a hard limit will result in use of atomic ops
      that are heavier than the ones we currently use.
      
      The reasoning is that we're only guarding against a gross over-
      subscription of the table, rather than a small breach of the limit.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      07ee0722
  6. May 15, 2015 (2 commits)
    • M
      test_bpf: fix sparse warnings · 56cbaa45
      Committed by Michael Holzheu
      Fix several sparse warnings like:
      lib/test_bpf.c:1824:25: sparse: constant 4294967295 is so big it is long
      lib/test_bpf.c:1878:25: sparse: constant 0x0000ffffffff0000 is so big it is long
      
      Fixes: cffc642d ("test_bpf: add 173 new testcases for eBPF")
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      56cbaa45
    • D
      test_bpf: add tests related to BPF_MAXINSNS · a4afd37b
      Committed by Daniel Borkmann
      Couple of torture test cases related to the bug fixed in 0b59d880
      ("ARM: net: delegate filter to kernel interpreter when imm_offset()
      return value can't fit into 12bits.").
      
      I've added a helper to allocate and fill the insn space. Output on
      x86_64 from my laptop:
      
      test_bpf: #233 BPF_MAXINSNS: Maximum possible literals jited:0 7 PASS
      test_bpf: #234 BPF_MAXINSNS: Single literal jited:0 8 PASS
      test_bpf: #235 BPF_MAXINSNS: Run/add until end jited:0 11553 PASS
      test_bpf: #236 BPF_MAXINSNS: Too many instructions PASS
      test_bpf: #237 BPF_MAXINSNS: Very long jump jited:0 9 PASS
      test_bpf: #238 BPF_MAXINSNS: Ctx heavy transformations jited:0 20329 20398 PASS
      test_bpf: #239 BPF_MAXINSNS: Call heavy transformations jited:0 32178 32475 PASS
      test_bpf: #240 BPF_MAXINSNS: Jump heavy test jited:0 10518 PASS
      
      test_bpf: #233 BPF_MAXINSNS: Maximum possible literals jited:1 4 PASS
      test_bpf: #234 BPF_MAXINSNS: Single literal jited:1 4 PASS
      test_bpf: #235 BPF_MAXINSNS: Run/add until end jited:1 1625 PASS
      test_bpf: #236 BPF_MAXINSNS: Too many instructions PASS
      test_bpf: #237 BPF_MAXINSNS: Very long jump jited:1 8 PASS
      test_bpf: #238 BPF_MAXINSNS: Ctx heavy transformations jited:1 3301 3174 PASS
      test_bpf: #239 BPF_MAXINSNS: Call heavy transformations jited:1 24107 23491 PASS
      test_bpf: #240 BPF_MAXINSNS: Jump heavy test jited:1 8651 PASS
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Nicolas Schichan <nschichan@freebox.fr>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a4afd37b
  7. May 13, 2015 (1 commit)
  8. May 11, 2015 (1 commit)
  9. May 6, 2015 (4 commits)
  10. May 4, 2015 (7 commits)
    • D
      lib: make memzero_explicit more robust against dead store elimination · 7829fb09
      Committed by Daniel Borkmann
      In commit 0b053c95 ("lib: memzero_explicit: use barrier instead
      of OPTIMIZER_HIDE_VAR"), we made memzero_explicit() more robust in
      case LTO would decide to inline memzero_explicit() and eventually
      find out it could be eliminated as a dead store.
      
      While using barrier() works well for the case of gcc, recent efforts
      from the LLVMLinux people suggest using llvm as an alternative to gcc,
      and there, Stephan found in a simple stand-alone user space example
      that llvm could nevertheless optimize and thus eliminate the memset().
      A similar issue has been observed in the referenced llvm bug report,
      which is regarded as not-a-bug.
      
      Based on some experiments, icc is a bit special on its own; while it
      doesn't seem to eliminate the memset(), it could do so with its own
      implementation, and then result in similar findings as with llvm.
      
      The fix in this patch now works for all three compilers (also tested
      with more aggressive optimization levels). Arguably, in the current
      kernel tree it's more of a theoretical issue, but imho, it's better
      to be pedantic about it.
      
      It's clearly visible with gcc/llvm though, with the below code: if we
      would have used barrier() only here, llvm would have omitted clearing,
      not so with barrier_data() variant:
      
        static inline void memzero_explicit(void *s, size_t count)
        {
          memset(s, 0, count);
          barrier_data(s);
        }
      
        int main(void)
        {
          char buff[20];
          memzero_explicit(buff, sizeof(buff));
          return 0;
        }
      
        $ gcc -O2 test.c
        $ gdb a.out
        (gdb) disassemble main
        Dump of assembler code for function main:
         0x0000000000400400  <+0>: lea   -0x28(%rsp),%rax
         0x0000000000400405  <+5>: movq  $0x0,-0x28(%rsp)
         0x000000000040040e <+14>: movq  $0x0,-0x20(%rsp)
         0x0000000000400417 <+23>: movl  $0x0,-0x18(%rsp)
         0x000000000040041f <+31>: xor   %eax,%eax
         0x0000000000400421 <+33>: retq
        End of assembler dump.
      
        $ clang -O2 test.c
        $ gdb a.out
        (gdb) disassemble main
        Dump of assembler code for function main:
         0x00000000004004f0  <+0>: xorps  %xmm0,%xmm0
         0x00000000004004f3  <+3>: movaps %xmm0,-0x18(%rsp)
         0x00000000004004f8  <+8>: movl   $0x0,-0x8(%rsp)
         0x0000000000400500 <+16>: lea    -0x18(%rsp),%rax
         0x0000000000400505 <+21>: xor    %eax,%eax
         0x0000000000400507 <+23>: retq
        End of assembler dump.
      
      As gcc, clang, but also icc defines __GNUC__, it's sufficient to define
      this in compiler-gcc.h only to be picked up. For a fallback or otherwise
      unsupported compiler, we define it as a barrier. Similarly, for ecc which
      does not support gcc inline asm.
      
      Reference: https://llvm.org/bugs/show_bug.cgi?id=15495
      Reported-by: Stephan Mueller <smueller@chronox.de>
      Tested-by: Stephan Mueller <smueller@chronox.de>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Stephan Mueller <smueller@chronox.de>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: mancha security <mancha1@zoho.com>
      Cc: Mark Charlebois <charlebm@gmail.com>
      Cc: Behan Webster <behanw@converseincode.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      7829fb09
    • T
      rhashtable-test: Detect insertion failures · 67b7cbf4
      Committed by Thomas Graf
      Account for failed inserts due to memory pressure or EBUSY and
      ignore failed entries during the consistency check.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      67b7cbf4
    • T
      rhashtable-test: Use walker to test bucket statistics · 246b23a7
      Committed by Thomas Graf
      As resizes may continue to run in the background, use a walker to
      ensure we see all entries. Also print the number of rehashes
      queued up while traversing.

      This may lead to warnings due to entries being seen multiple
      times. We consider them non-fatal.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      246b23a7
    • T
      rhashtable-test: Do not allocate individual test objects · fcc57020
      Committed by Thomas Graf
      By far the most expensive part of the selftest was the allocation
      of entries. Using a static array allows us to measure the rhashtable
      operations themselves.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fcc57020
    • T
      rhashtable-test: Get rid of ptr in test_obj structure · c2c8a901
      Committed by Thomas Graf
      This only blows up the size of the test structure for no gain
      in test coverage. Reduces the size of test_obj from 24 to 16 bytes.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c2c8a901
    • T
      rhashtable-test: Measure time to insert, remove & traverse entries · 1aa661f5
      Committed by Thomas Graf
      Make the test configurable by allowing all relevant knobs to be
      specified through module parameters.

      Do several test runs and measure the average time it takes to
      insert & remove all entries. Note, a deferred resize might still
      continue to run in the background.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1aa661f5
    • T
      f54e84b6
  11. May 1, 2015 (1 commit)
    • D
      test_bpf: indicate whether bpf prog got jited in test suite · 327941f8
      Committed by Daniel Borkmann
      I think this is useful to verify whether a filter could be JITed or
      not in case of bpf_jit_enable >= 1, which otherwise the test suite
      doesn't tell besides taking a good peek at the performance numbers.
      
      Nicolas Schichan reported a bug in the ARM JIT compiler that rejected
      and waved the filter to the interpreter although it shouldn't have.
      Nevertheless, the test passes as expected, but such information is
      not visible.
      
      It's useful, for instance, for the remaining classic JITs, but also for
      implementing remaining opcodes that are not yet present in eBPF JITs
      (e.g. ARM64 waves some of them to the interpreter). This minor patch
      allows to grep through dmesg to find those accordingly, but also
      provides a total summary, i.e.: [<X>/53 JIT'ed]
      
        # echo 1 > /proc/sys/net/core/bpf_jit_enable
        # insmod lib/test_bpf.ko
        # dmesg | grep "jited:0"
      
      dmesg example on the ARM issue with JIT rejection:
      
      [...]
      [   67.925387] test_bpf: #2 ADD_SUB_MUL_K jited:1 24 PASS
      [   67.930889] test_bpf: #3 DIV_MOD_KX jited:0 794 PASS
      [   67.943940] test_bpf: #4 AND_OR_LSH_K jited:1 20 20 PASS
      [...]
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Nicolas Schichan <nschichan@freebox.fr>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      327941f8
  12. April 23, 2015 (2 commits)
  13. April 22, 2015 (4 commits)
    • M
      md/raid6 algorithms: xor_syndrome() for SSE2 · a582564b
      Committed by Markus Stockhausen
      The second (and last) optimized XOR syndrome calculation. This version
      supports right and left side optimization. All CPUs with architecture
      older than Haswell will benefit from it.
      
      It should be noted that SSE2 movntdq kills performance for memory areas
      that are read and written simultaneously in chunks smaller than cache
      line size. So use movdqa instead for P/Q writes in sse21 and sse22 XOR
      functions.
      Signed-off-by: Markus Stockhausen <stockhausen@collogia.de>
      Signed-off-by: NeilBrown <neilb@suse.de>
      a582564b
    • M
      md/raid6 algorithms: xor_syndrome() for generic int · 9a5ce91d
      Committed by Markus Stockhausen
      Start the algorithms with the very basic one. It is left and right
      optimized. That means we can avoid all calculations for unneeded pages
      above the right stop offset. For pages below the left start offset we
      still need the syndrome multiplication but without reading data pages.
      Signed-off-by: Markus Stockhausen <stockhausen@collogia.de>
      Signed-off-by: NeilBrown <neilb@suse.de>
      9a5ce91d
    • M
      md/raid6 algorithms: improve test program · 7e92e1d7
      Committed by Markus Stockhausen
      It is always helpful to have a test tool in place if we implement
      new data-critical algorithms. So add some test routines to the raid6
      checker that can verify that the new xor_syndrome() works as expected.
      
      Run through all permutations of start/stop pages per algorithm and
      simulate an xor_syndrome()-assisted rmw run. After each rmw, check
      that the recovery algorithm still confirms that the stripe is fine.
      Signed-off-by: Markus Stockhausen <stockhausen@collogia.de>
      Signed-off-by: NeilBrown <neilb@suse.de>
      7e92e1d7
    • M
      md/raid6 algorithms: delta syndrome functions · fe5cbc6e
      Committed by Markus Stockhausen
      v3: s-o-b comment, explanation of performance and decision for
      the start/stop implementation
      
      Implementing rmw functionality for RAID6 requires optimized syndrome
      calculation. Up to now we can only generate a complete syndrome. The
      target P/Q pages are always overwritten. With this patch we provide
      a framework for inplace P/Q modification. In the first place simply
      fill those functions with NULL values.
      
      xor_syndrome() has two additional parameters: start & stop. These
      will indicate the first and last page that are changing during a
      rmw run. That makes it possible to avoid several unnecessary loops
      and speed up calculation. The caller needs to implement the following
      logic to make the functions work.
      
      1) xor_syndrome(disks, start, stop, ...): "Remove" all data of source
      blocks inside P/Q between (and including) start and end.
      
      2) modify any block with start <= block <= stop
      
      3) xor_syndrome(disks, start, stop, ...): "Reinsert" all data of
      source blocks into P/Q between (and including) start and end.
      
      Pages between start and stop that won't be changed should be filled
      with a pointer to the kernel zero page. The reasons for not taking NULL
      pages are:
      
      1) Algorithms cross the whole source data line by line. Thus avoid
      additional branches.
      
      2) Having a NULL page avoids calculating the XOR P parity but still
      needs calculation steps for the Q parity. Depending on the algorithm
      unrolling that might be only a difference of 2 instructions per loop.
      
      The benchmark numbers of the gen_syndrome() functions are displayed in
      the kernel log. Do the same for the xor_syndrome() functions. This
      will help to analyze performance problems and give a rough estimate of
      how well the algorithm works. The choice of the fastest algorithm will
      still depend on the gen_syndrome() performance.
      
      With the start/stop page implementation the speed can vary a lot in real
      life. E.g. a change of page 0 & page 15 on a stripe will be harder to
      compute than the case where page 0 & page 1 are XOR candidates. To not
      be too enthusiastic about the expected speeds we will run a worst-case
      test that simulates a change on the upper half of the stripe. So we do:
      
      1) calculation of P/Q for the upper pages
      
      2) continuation of Q for the lower (empty) pages
      Signed-off-by: Markus Stockhausen <stockhausen@collogia.de>
      Signed-off-by: NeilBrown <neilb@suse.de>
      fe5cbc6e
  14. April 21, 2015 (2 commits)
  15. April 20, 2015 (1 commit)
    • L
      hexdump: avoid warning in test function · 17974c05
      Committed by Linus Torvalds
      The test_data_1_le[] array is a const array of const char *.  To avoid
      dropping any const information, we need to use "const char * const *",
      not just "const char **".
      
      I'm not sure why the different test arrays end up having different
      const'ness, but let's make the pointer we use to traverse them as const
      as possible, since we modify neither the array of pointers _nor_ the
      pointers we find in the array.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      17974c05
  16. April 19, 2015 (4 commits)
  17. April 18, 2015 (1 commit)
  18. April 17, 2015 (4 commits)