提交 · 84a1c2109d23df3543d96231c4fee1757299bb1a · openeuler / Kernel

11 12月, 2018 1 次提交

mtd: spi-nor: add macros related to MICRON flash · 0005aad0

由 Yogesh Narayan Gaur 提交于 10月 12, 2018

Some MICRON related macros in spi-nor domain were ST.
Rename entries related to STMicroelectronics under macro SNOR_MFR_ST.

Added entry of MFR Id for Micron flashes, 0x002C.
Signed-off-by: NYogesh Gaur <yogeshnarayan.gaur@nxp.com>
Reviewed-by: NTudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: NBoris Brezillon <boris.brezillon@bootlin.com>

0005aad0

01 12月, 2018 1 次提交

psi: make disabling/enabling easier for vendor kernels · e0c27447

由 Johannes Weiner 提交于 11月 30, 2018

Mel Gorman reports a hackbench regression with psi that would prohibit
shipping the suse kernel with it default-enabled, but he'd still like
users to be able to opt in at little to no cost to others.

With the current combination of CONFIG_PSI and the psi_disabled bool set
from the commandline, this is a challenge.  Do the following things to
make it easier:

1. Add a config option CONFIG_PSI_DEFAULT_DISABLED that allows distros
   to enable CONFIG_PSI in their kernel but leave the feature disabled
   unless a user requests it at boot-time.

   To avoid double negatives, rename psi_disabled= to psi=.

2. Make psi_disabled a static branch to eliminate any branch costs
   when the feature is disabled.

In terms of numbers before and after this patch, Mel says:

: The following is a comparision using CONFIG_PSI=n as a baseline against
: your patch and a vanilla kernel
:
:                          4.20.0-rc4             4.20.0-rc4             4.20.0-rc4
:                 kconfigdisable-v1r1                vanilla        psidisable-v1r1
: Amean     1       1.3100 (   0.00%)      1.3923 (  -6.28%)      1.3427 (  -2.49%)
: Amean     3       3.8860 (   0.00%)      4.1230 *  -6.10%*      3.8860 (  -0.00%)
: Amean     5       6.8847 (   0.00%)      8.0390 * -16.77%*      6.7727 (   1.63%)
: Amean     7       9.9310 (   0.00%)     10.8367 *  -9.12%*      9.9910 (  -0.60%)
: Amean     12     16.6577 (   0.00%)     18.2363 *  -9.48%*     17.1083 (  -2.71%)
: Amean     18     26.5133 (   0.00%)     27.8833 *  -5.17%*     25.7663 (   2.82%)
: Amean     24     34.3003 (   0.00%)     34.6830 (  -1.12%)     32.0450 (   6.58%)
: Amean     30     40.0063 (   0.00%)     40.5800 (  -1.43%)     41.5087 (  -3.76%)
: Amean     32     40.1407 (   0.00%)     41.2273 (  -2.71%)     39.9417 (   0.50%)
:
: It's showing that the vanilla kernel takes a hit (as the bisection
: indicated it would) and that disabling PSI by default is reasonably
: close in terms of performance for this particular workload on this
: particular machine so;

Link: http://lkml.kernel.org/r/20181127165329.GA29728@cmpxchg.orgSigned-off-by: NJohannes Weiner <hannes@cmpxchg.org>
Tested-by: NMel Gorman <mgorman@techsingularity.net>
Reported-by: NMel Gorman <mgorman@techsingularity.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e0c27447

30 11月, 2018 2 次提交

tracepoint: Use __idx instead of idx in DO_TRACE macro to make it unique · 0c7a52e4

由 Zenghui Yu 提交于 11月 28, 2018

After enabling KVM event tracing, almost all of trace_kvm_exit()'s
printk shows

	"kvm_exit: IRQ: ..."

even if the actual exception_type is NOT IRQ.  More specifically,
trace_kvm_exit() is defined in virt/kvm/arm/trace.h by TRACE_EVENT.

This slight problem may have existed after commit e6753f23
("tracepoint: Make rcuidle tracepoint callers use SRCU"). There are
two variables in trace_kvm_exit() and __DO_TRACE() which have the
same name, *idx*. Thus the actual value of *idx* will be overwritten
when tracing. Fix it by adding a simple prefix.

Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Wang Haibin <wanghaibin.wang@huawei.com>
Cc: linux-trace-devel@vger.kernel.org
Cc: stable@vger.kernel.org
Fixes: e6753f23 ("tracepoint: Make rcuidle tracepoint callers use SRCU")
Reviewed-by: NJoel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: NZenghui Yu <yuzenghui@huawei.com>
Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>

0c7a52e4

pstore/ram: Correctly calculate usable PRZ bytes · 89d328f6

由 Kees Cook 提交于 11月 01, 2018

The actual number of bytes stored in a PRZ is smaller than the
bytes requested by platform data, since there is a header on each
PRZ. Additionally, if ECC is enabled, there are trailing bytes used
as well. Normally this mismatch doesn't matter since PRZs are circular
buffers and the leading "overflow" bytes are just thrown away. However, in
the case of a compressed record, this rather badly corrupts the results.

This corruption was visible with "ramoops.mem_size=204800 ramoops.ecc=1".
Any stored crashes would not be uncompressable (producing a pstorefs
"dmesg-*.enc.z" file), and triggering errors at boot:

  [    2.790759] pstore: crypto_comp_decompress failed, ret = -22!

Backporting this depends on commit 70ad35db ("pstore: Convert console
write to use ->write_buf")
Reported-by: NJoel Fernandes <joel@joelfernandes.org>
Fixes: b0aad7a9 ("pstore: Add compression support to pstore")
Signed-off-by: NKees Cook <keescook@chromium.org>
Reviewed-by: NJoel Fernandes (Google) <joel@joelfernandes.org>

89d328f6

28 11月, 2018 7 次提交

fscache: Fix race in fscache_op_complete() due to split atomic_sub & read · 3f2b7b90

由 kiran.modukuri 提交于 11月 26, 2018

The code in fscache_retrieval_complete is using atomic_sub followed by an
atomic_read:

        atomic_sub(n_pages, &op->n_pages);
        if (atomic_read(&op->n_pages) <= 0)
                fscache_op_complete(&op->op, true);

This causes two threads doing a decrement of n_pages to race with each
other seeing the op->refcount 0 at same time - and they end up calling
fscache_op_complete() in both the threads leading to an assertion failure.

Fix this by using atomic_sub_return_relaxed() instead of two calls.  Note
that I'm using 'relaxed' rather than, say, 'release' as there aren't
multiple variables that appear to need ordering across the release.

The oops looks something like:

FS-Cache: Assertion failed
FS-Cache: 0 > 0 is false
...
kernel BUG at /usr/src/linux-4.4.0/fs/fscache/operation.c:449!
...
Workqueue: fscache_operation fscache_op_work_func [fscache]
...
RIP: 0010:[<ffffffffc037eacd>] fscache_op_complete+0x10d/0x180 [fscache]
...
Call Trace:
 [<ffffffffc1464cf9>] cachefiles_read_copier+0x3a9/0x410 [cachefiles]
 [<ffffffffc037e272>] fscache_op_work_func+0x22/0x50 [fscache]
 [<ffffffff81096da0>] process_one_work+0x150/0x3f0
 [<ffffffff8109751a>] worker_thread+0x11a/0x470
 [<ffffffff81808e59>] ? __schedule+0x359/0x980
 [<ffffffff81097400>] ? rescuer_thread+0x310/0x310
 [<ffffffff8109cdd6>] kthread+0xd6/0xf0
 [<ffffffff8109cd00>] ? kthread_park+0x60/0x60
 [<ffffffff8180d0cf>] ret_from_fork+0x3f/0x70
 [<ffffffff8109cd00>] ? kthread_park+0x60/0x60

This seen this in 4.4.x kernels and the same bug affects fscache in latest
upstreams kernels.

Fixes: 1bb4b7f9 ("FS-Cache: The retrieval remaining-pages counter needs to be atomic_t")
Signed-off-by: NKiran Kumar Modukuri <kiran.modukuri@gmail.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>

3f2b7b90

x86/speculation: Add prctl() control for indirect branch speculation · 9137bb27

由 Thomas Gleixner 提交于 11月 25, 2018

Add the PR_SPEC_INDIRECT_BRANCH option for the PR_GET_SPECULATION_CTRL and
PR_SET_SPECULATION_CTRL prctls to allow fine grained per task control of
indirect branch speculation via STIBP and IBPB.

Invocations:
 Check indirect branch speculation status with
 - prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, 0, 0, 0);

 Enable indirect branch speculation with
 - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_ENABLE, 0, 0);

 Disable indirect branch speculation with
 - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE, 0, 0);

 Force disable indirect branch speculation with
 - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0);

See Documentation/userspace-api/spec_ctrl.rst.
Signed-off-by: NTim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NIngo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185005.866780996@linutronix.de

9137bb27

ptrace: Remove unused ptrace_may_access_sched() and MODE_IBRS · 46f7ecb1

由 Thomas Gleixner 提交于 11月 25, 2018

The IBPB control code in x86 removed the usage. Remove the functionality
which was introduced for this.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NIngo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185005.559149393@linutronix.de

46f7ecb1

x86/speculation: Rework SMT state change · a74cfffb

由 Thomas Gleixner 提交于 11月 25, 2018

arch_smt_update() is only called when the sysfs SMT control knob is
changed. This means that when SMT is enabled in the sysfs control knob the
system is considered to have SMT active even if all siblings are offline.

To allow finegrained control of the speculation mitigations, the actual SMT
state is more interesting than the fact that siblings could be enabled.

Rework the code, so arch_smt_update() is invoked from each individual CPU
hotplug function, and simplify the update function while at it.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NIngo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185004.521974984@linutronix.de

a74cfffb

sched/smt: Expose sched_smt_present static key · 321a874a

由 Thomas Gleixner 提交于 11月 25, 2018

Make the scheduler's 'sched_smt_present' static key globaly available, so
it can be used in the x86 speculation control code.

Provide a query function and a stub for the CONFIG_SMP=n case.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NIngo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185004.430168326@linutronix.de

321a874a

function_graph: Use new curr_ret_depth to manage depth instead of curr_ret_stack · 39eb456d

由 Steven Rostedt (VMware) 提交于 11月 19, 2018

Currently, the depth of the ret_stack is determined by curr_ret_stack index.
The issue is that there's a race between setting of the curr_ret_stack and
calling of the callback attached to the return of the function.

Commit 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling
trace return callback") moved the calling of the callback to after the
setting of the curr_ret_stack, even stating that it was safe to do so, when
in fact, it was the reason there was a barrier() there (yes, I should have
commented that barrier()).

Not only does the curr_ret_stack keep track of the current call graph depth,
it also keeps the ret_stack content from being overwritten by new data.

The function profiler, uses the "subtime" variable of ret_stack structure
and by moving the curr_ret_stack, it allows for interrupts to use the same
structure it was using, corrupting the data, and breaking the profiler.

To fix this, there needs to be two variables to handle the call stack depth
and the pointer to where the ret_stack is being used, as they need to change
at two different locations.

Cc: stable@kernel.org
Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>

39eb456d

function_graph: Make ftrace_push_return_trace() static · d125f3f8

由 Steven Rostedt (VMware) 提交于 11月 19, 2018

As all architectures now call function_graph_enter() to do the entry work,
no architecture should ever call ftrace_push_return_trace(). Make it static.

This is needed to prepare for a fix of a design bug on how the curr_ret_stack
is used.

Cc: stable@kernel.org
Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>

d125f3f8

27 11月, 2018 2 次提交

bpf, ppc64: generalize fetching subprog into bpf_jit_get_func_addr · e2c95a61

由 Daniel Borkmann 提交于 11月 26, 2018

Make fetching of the BPF call address from ppc64 JIT generic. ppc64
was using a slightly different variant rather than through the insns'
imm field encoding as the target address would not fit into that space.
Therefore, the target subprog number was encoded into the insns' offset
and fetched through fp->aux->func[off]->bpf_func instead. Given there
are other JITs with this issue and the mechanism of fetching the address
is JIT-generic, move it into the core as a helper instead. On the JIT
side, we get information on whether the retrieved address is a fixed
one, that is, not changing through JIT passes, or a dynamic one. For
the former, JITs can optimize their imm emission because this doesn't
change jump offsets throughout JIT process.
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Reviewed-by: NSandipan Das <sandipan@linux.ibm.com>
Tested-by: NSandipan Das <sandipan@linux.ibm.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>

e2c95a61

function_graph: Create function_graph_enter() to consolidate architecture code · 8114865f

由 Steven Rostedt (VMware) 提交于 11月 18, 2018

Currently all the architectures do basically the same thing in preparing the
function graph tracer on entry to a function. This code can be pulled into a
generic location and then this will allow the function graph tracer to be
fixed, as well as extended.

Create a new function graph helper function_graph_enter() that will call the
hook function (ftrace_graph_entry) and the shadow stack operation
(ftrace_push_return_trace), and remove the need of the architecture code to
manage the shadow stack.

This is needed to prepare for a fix of a design bug on how the curr_ret_stack
is used.

Cc: stable@kernel.org
Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>

8114865f

26 11月, 2018 2 次提交

gpio: davinci: restore a way to manually specify the GPIO base · 786a9ab1

由 Bartosz Golaszewski 提交于 11月 21, 2018

Commit 587f7a69 ("gpio: davinci: Use dev name for label and
automatic base selection") broke the network support in legacy boot
mode for da850-evm since we can no longer request the MDIO clock GPIO.

Other boards may be broken too, which I haven't tested.

The problem is in the fact that most board files still use the legacy
GPIO API where lines are requested by numbers rather than descriptors.

While this should be fixed eventually, in order to unbreak the board
for now - provide a way to manually specify the GPIO base in platform
data.

Fixes: 587f7a69 ("gpio: davinci: Use dev name for label and automatic base selection")
Cc: stable@vger.kernel.org
Signed-off-by: NBartosz Golaszewski <bgolaszewski@baylibre.com>
Acked-by: NLinus Walleij <linus.walleij@linaro.org>
Signed-off-by: NSekhar Nori <nsekhar@ti.com>

786a9ab1

netfilter: nfnetlink_cttimeout: fetch timeouts for udplite and gre, too · 89259088

由 Florian Westphal 提交于 11月 17, 2018

syzbot was able to trigger the WARN in cttimeout_default_get() by
passing UDPLITE as l4protocol.  Alias UDPLITE to UDP, both use
same timeout values.

Furthermore, also fetch GRE timeouts.  GRE is a bit more complicated,
as it still can be a module and its netns_proto_gre struct layout isn't
visible outside of the gre module. Can't move timeouts around, it
appears conntrack sysctl unregister assumes net_generic() returns
nf_proto_net, so we get crash. Expose layout of netns_proto_gre instead.

A followup nf-next patch could make gre tracker be built-in as well
if needed, its not that large.

Last, make the WARN() mention the missing protocol value in case
anything else is missing.

Reported-by: syzbot+2fae8fa157dd92618cae@syzkaller.appspotmail.com
Fixes: 8866df92 ("netfilter: nfnetlink_cttimeout: pass default timeout policy to obj_to_nlattr")
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

89259088

24 11月, 2018 1 次提交

packet: copy user buffers before orphan or clone · 5cd8d46e

由 Willem de Bruijn 提交于 11月 20, 2018

tpacket_snd sends packets with user pages linked into skb frags. It
notifies that pages can be reused when the skb is released by setting
skb->destructor to tpacket_destruct_skb.

This can cause data corruption if the skb is orphaned (e.g., on
transmit through veth) or cloned (e.g., on mirror to another psock).

Create a kernel-private copy of data in these cases, same as tun/tap
zerocopy transmission. Reuse that infrastructure: mark the skb as
SKBTX_ZEROCOPY_FRAG, which will trigger copy in skb_orphan_frags(_rx).

Unlike other zerocopy packets, do not set shinfo destructor_arg to
struct ubuf_info. tpacket_destruct_skb already uses that ptr to notify
when the original skb is released and a timestamp is recorded. Do not
change this timestamp behavior. The ubuf_info->callback is not needed
anyway, as no zerocopy notification is expected.

Mark destructor_arg as not-a-uarg by setting the lower bit to 1. The
resulting value is not a valid ubuf_info pointer, nor a valid
tpacket_snd frame address. Add skb_zcopy_.._nouarg helpers for this.

The fix relies on features introduced in commit 52267790 ("sock:
add MSG_ZEROCOPY"), so can be backported as is only to 4.14.

Tested with from `./in_netns.sh ./txring_overwrite` from
http://github.com/wdebruij/kerneltools/tests

Fixes: 69e3c75f ("net: TX_RING and packet mmap")
Reported-by: NAnand H. Krishnan <anandhkrishnan@gmail.com>
Signed-off-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5cd8d46e

23 11月, 2018 1 次提交

net/dim: Update DIM start sample after each DIM iteration · 0211dda6

由 Tal Gilboa 提交于 11月 21, 2018

On every iteration of net_dim, the algorithm may choose to
check for the system state by comparing current data sample
with previous data sample. After each of these comparison,
regardless of the action taken, the sample used as baseline
is needed to be updated.

This patch fixes a bug that causes DIM to take wrong decisions,
due to never updating the baseline sample for comparison between
iterations. This way, DIM always compares current sample with
zeros.

Although this is a functional fix, it also improves and stabilizes
performance as the algorithm works properly now.

Performance:
Tested single UDP TX stream with pktgen:
samples/pktgen/pktgen_sample03_burst_single_flow.sh -i p4p2 -d 1.1.1.1
-m 24:8a:07:88:26:8b -f 3 -b 128

ConnectX-5 100GbE packet rate improved from 15-19Mpps to 19-20Mpps.
Also, toggling between profiles is less frequent with the fix.

Fixes: 8115b750 ("net/dim: use struct net_dim_sample as arg to net_dim")
Signed-off-by: NTal Gilboa <talgi@mellanox.com>
Reviewed-by: NTariq Toukan <tariqt@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0211dda6

22 11月, 2018 3 次提交

Revert "HID: input: Create a utility class for counting scroll events" · f1539a0c

由 Benjamin Tissoires 提交于 11月 21, 2018

This reverts commit 1ff2e1a4.

It turns out the current API is not that compatible with
some Microsoft mice, so better start again from scratch.
Signed-off-by: NBenjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: NHarry Cutts <hcutts@chromium.org>
Acked-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: NJiri Kosina <jkosina@suse.cz>

f1539a0c

tcp: defer SACK compression after DupThresh · 86de5921

由 Eric Dumazet 提交于 11月 20, 2018

Jean-Louis reported a TCP regression and bisected to recent SACK
compression.

After a loss episode (receiver not able to keep up and dropping
packets because its backlog is full), linux TCP stack is sending
a single SACK (DUPACK).

Sender waits a full RTO timer before recovering losses.

While RFC 6675 says in section 5, "Algorithm Details",

   (2) If DupAcks < DupThresh but IsLost (HighACK + 1) returns true --
       indicating at least three segments have arrived above the current
       cumulative acknowledgment point, which is taken to indicate loss
       -- go to step (4).
...
   (4) Invoke fast retransmit and enter loss recovery as follows:

there are old TCP stacks not implementing this strategy, and
still counting the dupacks before starting fast retransmit.

While these stacks probably perform poorly when receivers implement
LRO/GRO, we should be a little more gentle to them.

This patch makes sure we do not enable SACK compression unless
3 dupacks have been sent since last rcv_nxt update.

Ideally we should even rearm the timer to send one or two
more DUPACK if no more packets are coming, but that will
be work aiming for linux-4.21.

Many thanks to Jean-Louis for bisecting the issue, providing
packet captures and testing this patch.

Fixes: 5d9f4262 ("tcp: add SACK compression")
Reported-by: NJean-Louis Dupond <jean-louis@dupond.be>
Tested-by: NJean-Louis Dupond <jean-louis@dupond.be>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Acked-by: NNeal Cardwell <ncardwell@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

86de5921

dma-direct: Make DIRECT_MAPPING_ERROR viable for SWIOTLB · b3408715

由 Robin Murphy 提交于 11月 21, 2018

With the overflow buffer removed, we no longer have a unique address
which is guaranteed not to be a valid DMA target to use as an error
token. The DIRECT_MAPPING_ERROR value of 0 tries to at least represent
an unlikely DMA target, but unfortunately there are already SWIOTLB
users with DMA-able memory at physical address 0 which now gets falsely
treated as a mapping failure and leads to all manner of misbehaviour.

The best we can do to mitigate that is flip DIRECT_MAPPING_ERROR to the
other commonly-used error value of all-bits-set, since the last single
byte of memory is by far the least-likely-valid DMA target.

Fixes: dff8d6c1 ("swiotlb: remove the overflow buffer")
Reported-by: NJohn Stultz <john.stultz@linaro.org>
Tested-by: NJohn Stultz <john.stultz@linaro.org>
Acked-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

b3408715

16 11月, 2018 1 次提交

iio/hid-sensors: Fix IIO_CHAN_INFO_RAW returning wrong values for signed numbers · 0145b505

由 Hans de Goede 提交于 10月 31, 2018

Before this commit sensor_hub_input_attr_get_raw_value() failed to take
the signedness of 16 and 8 bit values into account, returning e.g.
65436 instead of -100 for the z-axis reading of an accelerometer.

This commit adds a new is_signed parameter to the function and makes all
callers pass the appropriate value for this.

While at it, this commit also fixes up some neighboring lines where
statements were needlessly split over 2 lines to improve readability.
Signed-off-by: NHans de Goede <hdegoede@redhat.com>
Acked-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: NBenjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: NJonathan Cameron <Jonathan.Cameron@huawei.com>

0145b505

15 11月, 2018 1 次提交

efi/arm: Defer persistent reservations until after paging_init() · eff89628

由 Ard Biesheuvel 提交于 11月 14, 2018

The new memory EFI reservation feature we introduced to allow memory
reservations to persist across kexec may trigger an unbounded number
of calls to memblock_reserve(). The memblock subsystem can deal with
this fine, but not before memblock resizing is enabled, which we can
only do after paging_init(), when the memory we reallocate the array
into is actually mapped.

So break out the memreserve table processing into a separate routine
and call it after paging_init() on arm64. On ARM, because of limited
reviewing bandwidth of the maintainer, we cannot currently fix this,
so instead, disable the EFI persistent memreserve entirely on ARM so
we can fix it later.
Tested-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20181114175544.12860-5-ard.biesheuvel@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

eff89628

12 11月, 2018 1 次提交

x86/ptrace: Fix documentation for tracehook_report_syscall_entry() · d19f9130

由 Elvira Khabirova 提交于 11月 10, 2018

tracehook_report_syscall_entry() is called not only
if %TIF_SYSCALL_TRACE is set, but also if %TIF_SYSCALL_EMU is set,
as appears from x86's entry code.
Signed-off-by: NElvira Khabirova <lineprinter@altlinux.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: ldv@altlinux.org
Cc: oleg@redhat.com
Cc: rostedt@goodmis.org
Link: http://lkml.kernel.org/r/20181110042209.26333972@akathisiaSigned-off-by: NIngo Molnar <mingo@kernel.org>

d19f9130

10 11月, 2018 3 次提交

can: rx-offload: rename can_rx_offload_irq_queue_err_skb() to can_rx_offload_queue_tail() · 4530ec36

由 Oleksij Rempel 提交于 9月 18, 2018

This function has nothing todo with error.
Signed-off-by: NOleksij Rempel <o.rempel@pengutronix.de>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

4530ec36

can: rx-offload: introduce can_rx_offload_get_echo_skb() and... · 55059f2b

由 Oleksij Rempel 提交于 9月 18, 2018

can: rx-offload: introduce can_rx_offload_get_echo_skb() and can_rx_offload_queue_sorted() functions

Current CAN framework can't guarantee proper/chronological order
of RX and TX-ECHO messages. To make this possible, drivers should use
this functions instead of can_get_echo_skb().
Signed-off-by: NOleksij Rempel <o.rempel@pengutronix.de>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

55059f2b

can: dev: can_get_echo_skb(): factor out non sending code to __can_get_echo_skb() · a4310fa2

由 Marc Kleine-Budde 提交于 10月 31, 2018

This patch factors out all non sending parts of can_get_echo_skb() into
a seperate function __can_get_echo_skb(), so that it can be re-used in
an upcoming patch.

Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

a4310fa2

09 11月, 2018 1 次提交

libceph: assume argonaut on the server side · 23c625ce

由 Ilya Dryomov 提交于 11月 08, 2018

No one is running pre-argonaut.  In addition one of the argonaut
features (NOSRCADDR) has been required since day one (and a half,
2.6.34 vs 2.6.35) of the kernel client.

Allow for the possibility of reusing these feature bits later.
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
Reviewed-by: NSage Weil <sage@redhat.com>

23c625ce

08 11月, 2018 1 次提交

Compiler Attributes: improve explanation of header · 24efee41

由 Miguel Ojeda 提交于 11月 06, 2018

Explain better what "optional" attributes are, and avoid calling
them so to avoid confusion. Simply retain "Optional" as a word
to look for in the comments.

Moreover, add a couple sentences to explain a bit more the intention
and the documentation links.
Signed-off-by: NMiguel Ojeda <miguel.ojeda.sandonis@gmail.com>

24efee41

07 11月, 2018 4 次提交

USB: Wait for extra delay time after USB_PORT_FEAT_RESET for quirky hub · 781f0766

由 Kai-Heng Feng 提交于 10月 19, 2018

Devices connected under Terminus Technology Inc. Hub (1a40:0101) may
fail to work after the system resumes from suspend:
[  206.063325] usb 3-2.4: reset full-speed USB device number 4 using xhci_hcd
[  206.143691] usb 3-2.4: device descriptor read/64, error -32
[  206.351671] usb 3-2.4: device descriptor read/64, error -32

Info for this hub:
T:  Bus=03 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#=  2 Spd=480 MxCh= 4
D:  Ver= 2.00 Cls=09(hub  ) Sub=00 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=1a40 ProdID=0101 Rev=01.11
S:  Product=USB 2.0 Hub
C:  #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr=100mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=09(hub  ) Sub=00 Prot=00 Driver=hub

Some expirements indicate that the USB devices connected to the hub are
innocent, it's the hub itself is to blame. The hub needs extra delay
time after it resets its port.

Hence wait for extra delay, if the device is connected to this quirky
hub.
Signed-off-by: NKai-Heng Feng <kai.heng.feng@canonical.com>
Cc: stable <stable@vger.kernel.org>
Acked-by: NAlan Stern <stern@rowland.harvard.edu>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

781f0766

net/mlx5: Fix XRC SRQ umem valid bits · 99b77fef

由 Yishai Hadas 提交于 10月 31, 2018

Adapt XRC SRQ to the latest HW specification with fixed definition
around umem valid bits. The previous definition relied on a bit which
was taken for other purposes in legacy FW.

Fixes: bd371975 ("net/mlx5: Update mlx5_ifc with DEVX UID bits")
Signed-off-by: NYishai Hadas <yishaih@mellanox.com>
Reviewed-by: NArtemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>

99b77fef

watchdog/core: Add missing prototypes for weak functions · 81bd415c

由 Mathieu Malaterre 提交于 6月 06, 2018

The split out of the hard lockup detector exposed two new weak functions,
but no prototypes for them, which triggers the build warning:

kernel/watchdog.c:109:12: warning: no previous prototype for ‘watchdog_nmi_enable’ [-Wmissing-prototypes]
kernel/watchdog.c:115:13: warning: no previous prototype for ‘watchdog_nmi_disable’ [-Wmissing-prototypes]

Add the prototypes.

Fixes: 73ce0511 ("kernel/watchdog.c: move hardlockup detector to separate file")
Signed-off-by: NMathieu Malaterre <malat@debian.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Babu Moger <babu.moger@oracle.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180606194232.17653-1-malat@debian.org

81bd415c

mtd: nand: Fix nanddev_pos_next_page() kernel-doc header · 98ee3fc7

由 Boris Brezillon 提交于 11月 06, 2018

Function name is wrong in the kernel-doc header.

Fixes: 9c3736a3 ("mtd: nand: Add core infrastructure to deal with NAND devices")
Signed-off-by: NBoris Brezillon <boris.brezillon@bootlin.com>
Reviewed-by: NMiquel Raynal <miquel.raynal@bootlin.com>

98ee3fc7

06 11月, 2018 8 次提交

include/linux/compiler*.h: define asm_volatile_goto · 8bd66d14

由 ndesaulniers@google.com 提交于 10月 31, 2018

asm_volatile_goto should also be defined for other compilers that support
asm goto.

Fixes commit 815f0ddb ("include/linux/compiler*.h: make compiler-*.h
mutually exclusive").
Signed-off-by: NNick Desaulniers <ndesaulniers@google.com>
Signed-off-by: NMiguel Ojeda <miguel.ojeda.sandonis@gmail.com>

8bd66d14

HID: fix up .raw_event() documentation · aa9b760c

由 Linus Walleij 提交于 11月 04, 2018

The documentation for the .raw_event() callback says that if the
driver return 1, there will be no further processing of the event,
but this is not true, the actual code in hid-core.c looks like this:

  if (hdrv && hdrv->raw_event && hid_match_report(hid, report)) {
           ret = hdrv->raw_event(hid, report, data, size);
           if (ret < 0)
                   goto unlock;
   }

   ret = hid_report_raw_event(hid, type, data, size, interrupt);

The only return value that has any effect on the processing is
a negative error.

Correct this as it seems to confuse people: I found bogus code in
the Razer out-of-tree driver attempting to return 1 here.
Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

aa9b760c

M
XArray: Fix Documentation · 804dfaf0
由 Matthew Wilcox 提交于 11月 05, 2018
```
Minor fixes.
Signed-off-by: NMatthew Wilcox <willy@infradead.org>
```
804dfaf0

XArray: Add xa_store_bh() and xa_store_irq() · 84e5acb7

由 Matthew Wilcox 提交于 10月 26, 2018

These convenience wrappers disable interrupts while taking the spinlock.
A number of drivers would otherwise have to open-code these functions.
Signed-off-by: NMatthew Wilcox <willy@infradead.org>

84e5acb7

XArray: Turn xa_erase into an exported function · 9c16bb88

由 Matthew Wilcox 提交于 11月 05, 2018

Make xa_erase() take the spinlock and then call __xa_erase(), but make
it out of line since it's such a common function.
Signed-off-by: NMatthew Wilcox <willy@infradead.org>

9c16bb88

XArray: Unify xa_cmpxchg and __xa_cmpxchg · c5beb07e

由 Matthew Wilcox 提交于 10月 31, 2018

xa_cmpxchg() was one of the largest functions in the xarray
implementation.  By turning it into a wrapper and having the callers
take the lock (like several other functions), we save 160 bytes on a
tinyconfig build and reduce the duplication in xarray.c.
Signed-off-by: NMatthew Wilcox <willy@infradead.org>

c5beb07e

XArray: Regularise xa_reserve · 4c0608f4

由 Matthew Wilcox 提交于 10月 30, 2018

The xa_reserve() function was a little unusual in that it attempted to
be callable for all kinds of locking scenarios. Make it look like the
other APIs with __xa_reserve, xa_reserve_bh and xa_reserve_irq variants.
Signed-off-by: NMatthew Wilcox <willy@infradead.org>

4c0608f4

compiler: remove __no_sanitize_address_or_inline again · 163c8d54

由 Martin Schwidefsky 提交于 11月 05, 2018

The __no_sanitize_address_or_inline and __no_kasan_or_inline defines
are almost identical. The only difference is that __no_kasan_or_inline
does not have the 'notrace' attribute.

To be able to replace __no_sanitize_address_or_inline with the older
definition, add 'notrace' to __no_kasan_or_inline and change to two
users of __no_sanitize_address_or_inline in the s390 code.

The 'notrace' option is necessary for e.g. the __load_psw_mask function
in arch/s390/include/asm/processor.h. Without the option it is possible
to trace __load_psw_mask which leads to kernel stack overflow.
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Pointed-out-by: NAndrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

163c8d54

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功