提交 · ace6485a03266cc3c198ce8e927a1ce0ce139699 · openeuler / Kernel

25 10月, 2018 2 次提交

x86/cpufeatures: Enumerate MOVDIR64B instruction · ace6485a

由 Fenghua Yu 提交于 10月 24, 2018

MOVDIR64B moves 64-bytes as direct-store with 64-bytes write atomicity.
Direct store is implemented by using write combining (WC) for writing
data directly into memory without caching the data.

In low latency offload (e.g. Non-Volatile Memory, etc), MOVDIR64B writes
work descriptors (and data in some cases) to device-hosted work-queues
atomically without cache pollution.

Availability of the MOVDIR64B instruction is indicated by the
presence of the CPUID feature flag MOVDIR64B (CPUID.0x07.0x0:ECX[bit 28]).

Please check the latest Intel Architecture Instruction Set Extensions
and Future Features Programming Reference for more details on the CPUID
feature MOVDIR64B flag.
Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1540418237-125817-3-git-send-email-fenghua.yu@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

ace6485a

x86/cpufeatures: Enumerate MOVDIRI instruction · 33823f4d

由 Fenghua Yu 提交于 10月 24, 2018

MOVDIRI moves doubleword or quadword from register to memory through
direct store which is implemented by using write combining (WC) for
writing data directly into memory without caching the data.

Programmable agents can handle streaming offload (e.g. high speed packet
processing in network). Hardware implements a doorbell (tail pointer)
register that is updated by software when adding new work-elements to
the streaming offload work-queue.

MOVDIRI can be used as the doorbell write which is a 4-byte or 8-byte
uncachable write to MMIO. MOVDIRI has lower overhead than other ways
to write the doorbell.

Availability of the MOVDIRI instruction is indicated by the presence of
the CPUID feature flag MOVDIRI(CPUID.0x07.0x0:ECX[bit 27]).

Please check the latest Intel Architecture Instruction Set Extensions
and Future Features Programming Reference for more details on the CPUID
feature MOVDIRI flag.
Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1540418237-125817-2-git-send-email-fenghua.yu@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

33823f4d

22 10月, 2018 3 次提交

s390/kasan: support preemptible kernel build · cf3dbe5d

由 Vasily Gorbik 提交于 10月 19, 2018

When the kernel is built with:
CONFIG_PREEMPT=y
CONFIG_PREEMPT_COUNT=y
"stfle" function used by kasan initialization code makes additional
call to preempt_count_add/preempt_count_sub. To avoid removing kasan
instrumentation from sched code where those functions leave split stfle
function and provide __stfle variant without preemption handling to be
used by Kasan.
Reported-by: NBenjamin Block <bblock@linux.ibm.com>
Acked-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

cf3dbe5d

x86/stackprotector: Remove the call to boot_init_stack_canary() from cpu_startup_entry() · 977e4be5

由 Christophe Leroy 提交于 10月 20, 2018

The following commit:

  d7880812 ("idle: Add the stack canary init to cpu_startup_entry()")

... added an x86 specific boot_init_stack_canary() call to the generic
cpu_startup_entry() as a temporary hack, with the intention to remove
the #ifdef CONFIG_X86 later.

More than 5 years later let's finally realize that plan! :-)

While implementing stack protector support for PowerPC, we found
that calling boot_init_stack_canary() is also needed for PowerPC
which uses per task (TLS) stack canary like the X86.

However, calling boot_init_stack_canary() would break architectures
using a global stack canary (ARM, SH, MIPS and XTENSA).

Instead of modifying the #ifdef CONFIG_X86 to an even messier:

   #if defined(CONFIG_X86) || defined(CONFIG_PPC)

PowerPC implemented the call to boot_init_stack_canary() in the function
calling cpu_startup_entry().

Let's try the same cleanup on the x86 side as well.

On x86 we have two functions calling cpu_startup_entry():

 - start_secondary()
 - cpu_bringup_and_idle()

start_secondary() already calls boot_init_stack_canary(), so
it's good, and this patch adds the call to boot_init_stack_canary()
in cpu_bringup_and_idle().

I.e. now x86 catches up to the rest of the world and the ugly init
sequence in init/main.c can be removed from cpu_startup_entry().

As a final benefit we can also remove the <linux/stackprotector.h>
dependency from <linux/sched.h>.

[ mingo: Improved the changelog a bit, added language explaining x86 borkage and sched.h change. ]
Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: NJuergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20181020072649.5B59310483E@pc16082vm.idsi0.si.c-s.frSigned-off-by: NIngo Molnar <mingo@kernel.org>

977e4be5

kprobes/x86: Use preempt_enable() in optimized_callback() · 2e62024c

由 Masami Hiramatsu 提交于 10月 20, 2018

The following commit:

  a19b2e3d ("kprobes/x86: Remove IRQ disabling from ftrace-based/optimized kprobes”)

removed local_irq_save/restore() from optimized_callback(), the handler
might be interrupted by the rescheduling interrupt and might be
rescheduled - so we must not use the preempt_enable_no_resched() macro.

Use preempt_enable() instead, to not lose preemption events.

[ mingo: Improved the changelog. ]
Reported-by: NNadav Amit <namit@vmware.com>
Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dwmw@amazon.co.uk
Fixes: a19b2e3d ("kprobes/x86: Remove IRQ disabling from ftrace-based/optimized kprobes”)
Link: http://lkml.kernel.org/r/154002887331.7627.10194920925792947001.stgit@devboxSigned-off-by: NIngo Molnar <mingo@kernel.org>

2e62024c

21 10月, 2018 1 次提交

x86/mm: Kill stray kernel fault handling comment · 16204142

由 Dave Hansen 提交于 10月 19, 2018

I originally had matching user and kernel comments, but the kernel
one got improved.  Some errant conflict resolution kicked the commment
somewhere wrong.  Kill it.
Reported-by: NEric W. Biederman <ebiederm@xmission.com>
Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: aa37c51b ("x86/mm: Break out user address space handling")
Link: http://lkml.kernel.org/r/20181019140842.12F929FA@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

16204142

19 10月, 2018 5 次提交

arm64: KVM: Guests can skip __install_bp_hardening_cb()s HYP work · 4debef55

由 James Morse 提交于 9月 21, 2018

enable_smccc_arch_workaround_1() passes NULL as the hyp_vecs start and
end if the HVC conduit is in use, and ARM_SMCCC_ARCH_WORKAROUND_1 is
detected.

If the guest kernel happened to be built with KVM_INDIRECT_VECTORS,
we go on to allocate a slot, memcpy() the empty workaround in and
do the appropriate cache maintenance.

This works as we always tell memcpy() the range is 0, so it never
accesses the NULL src pointer, but we still do the cache maintenance.

If hyp_vecs_start is NULL we know we're a guest, just update the fn
like the !KVM_INDIRECT_VECTORS version.
Reviewed-by: NJulien Thierry <julien.thierry@arm.com>
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NJames Morse <james.morse@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

4debef55

x86/intel_rdt: Prevent pseudo-locking from using stale pointers · b61b8bba

由 Jithu Joseph 提交于 10月 12, 2018

When the last CPU in an rdt_domain goes offline, its rdt_domain struct gets
freed. Current pseudo-locking code is unaware of this scenario and tries to
dereference the freed structure in a few places.

Add checks to prevent pseudo-locking code from doing this.

While further work is needed to seamlessly restore resource groups (not
just pseudo-locking) to their configuration when the domain is brought back
online, the immediate issue of invalid pointers is addressed here.

Fixes: f4e80d67 ("x86/intel_rdt: Resctrl files reflect pseudo-locked information")
Fixes: 443810fe ("x86/intel_rdt: Create debugfs files for pseudo-locking testing")
Fixes: 746e0859 ("x86/intel_rdt: Create character device exposing pseudo-locked region")
Fixes: 33dc3e41 ("x86/intel_rdt: Make CPU information accessible for pseudo-locked regions")
Signed-off-by: NJithu Joseph <jithu.joseph@intel.com>
Signed-off-by: NReinette Chatre <reinette.chatre@intel.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: gavin.hindman@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/231f742dbb7b00a31cc104416860e27dba6b072d.1539384145.git.reinette.chatre@intel.com

b61b8bba

s390/perf: Return error when debug_register fails · ec0c0bb4

由 Thomas Richter 提交于 10月 15, 2018

Return an error when the function debug_register() fails allocating
the debug handle.
Also remove the registered debug handle when the initialization fails
later on.
Signed-off-by: NThomas Richter <tmricht@linux.ibm.com>
Reviewed-by: NHendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

ec0c0bb4

x86/swiotlb: Enable swiotlb for > 4GiG RAM on 32-bit kernels · 485734f3

由 Christoph Hellwig 提交于 10月 14, 2018

We already build the swiotlb code for 32-bit kernels with PAE support,
but the code to actually use swiotlb has only been enabled for 64-bit
kernels for an unknown reason.

Before Linux v4.18 we paper over this fact because the networking code,
the SCSI layer and some random block drivers implemented their own
bounce buffering scheme.

[ mingo: Changelog fixes. ]

Fixes: 21e07dba ("scsi: reduce use of block bounce buffers")
Fixes: ab74cfeb ("net: remove the PCI_DMA_BUS_IS_PHYS check in illegal_highdma")
Reported-by: NMatthew Whitehead <tedheadster@gmail.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Tested-by: NMatthew Whitehead <tedheadster@gmail.com>
Cc: konrad.wilk@oracle.com
Cc: iommu@lists.linux-foundation.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181014075208.2715-1-hch@lst.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

485734f3

ubd: remove use of blk_rq_map_sg · ecb0a83e

由 Christoph Hellwig 提交于 10月 18, 2018

There is no good reason to create a scatterlist in the ubd driver,
it can just iterate the request directly.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
[rw: Folded in improvements as discussed with hch and jens]
Signed-off-by: NRichard Weinberger <richard@nod.at>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

ecb0a83e

18 10月, 2018 4 次提交

kprobes, x86/ptrace.h: Make regs_get_kernel_stack_nth() not fault on bad stack · c2712b85

由 Steven Rostedt (VMware) 提交于 10月 17, 2018

Andy had some concerns about using regs_get_kernel_stack_nth() in a new
function regs_get_kernel_argument() as if there's any error in the stack
code, it could cause a bad memory access. To be on the safe side, call
probe_kernel_read() on the stack address to be extra careful in accessing
the memory. A helper function, regs_get_kernel_stack_nth_addr(), was added
to just return the stack address (or NULL if not on the stack), that will be
used to find the address (and could be used by other functions) and read the
address with kernel_probe_read().
Requested-by: NAndy Lutomirski <luto@amacapital.net>
Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: NJoel Fernandes (Google) <joel@joelfernandes.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20181017165951.09119177@gandalf.local.homeSigned-off-by: NIngo Molnar <mingo@kernel.org>

c2712b85

sparc: vDSO: Silence an uninitialized variable warning · 62d6f3b7

由 Dan Carpenter 提交于 10月 13, 2018

Smatch complains that "val" would be uninitialized if kstrtoul() fails.

Fixes: 9a08862a ("vDSO for sparc")
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

62d6f3b7

sparc: Fix syscall fallback bugs in VDSO. · 776ca154

由 David S. Miller 提交于 10月 17, 2018

First, the trap number for 32-bit syscalls is 0x10.

Also, only negate the return value when syscall error is indicated by
the carry bit being set.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

776ca154

x86/mcelog: Remove one mce_helper definition · 711f76a3

由 Sebastian Andrzej Siewior 提交于 10月 17, 2018

Commit

  5de97c9f ("x86/mce: Factor out and deprecate the /dev/mcelog driver")

moved the old interface into one file including mce_helper definition as
static and "extern". Remove one.

Fixes: 5de97c9f ("x86/mce: Factor out and deprecate the /dev/mcelog driver")
Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: NBorislav Petkov <bp@suse.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Ingo Molnar <mingo@redhat.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Tony Luck <tony.luck@intel.com>
CC: linux-edac <linux-edac@vger.kernel.org>
CC: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20181017170554.18841-3-bigeasy@linutronix.de

711f76a3

17 10月, 2018 5 次提交

x86/fpu: Fix i486 + no387 boot crash by only saving FPU registers on context... · 2224d616

由 Sebastian Andrzej Siewior 提交于 10月 16, 2018

x86/fpu: Fix i486 + no387 boot crash by only saving FPU registers on context switch if there is an FPU

Booting an i486 with "no387 nofxsr" ends with with the following crash:

   math_emulate: 0060:c101987d
   Kernel panic - not syncing: Math emulation needed in kernel

on the first context switch in user land.

The reason is that copy_fpregs_to_fpstate() tries FNSAVE which does not work
as the FPU is turned off.

This bug was introduced in:

  f1c8cd01 ("x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active")

Add a check for X86_FEATURE_FPU before trying to save FPU registers (we
have such a check in switch_fpu_finish() already).
Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: NAndy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: f1c8cd01 ("x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active")
Link: http://lkml.kernel.org/r/20181016202525.29437-4-bigeasy@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

2224d616

x86/fpu: Remove second definition of fpu in __fpu__restore_sig() · 6aa67676

由 Sebastian Andrzej Siewior 提交于 10月 16, 2018

Commit:

  c5bedc68 ("x86/fpu: Get rid of PF_USED_MATH usage, convert it to fpu->fpstate_active")

introduced the 'fpu' variable at top of __restore_xstate_sig(),
which now shadows the other definition:

  arch/x86/kernel/fpu/signal.c:318:28: warning: symbol 'fpu' shadows an earlier one
  arch/x86/kernel/fpu/signal.c:271:20: originally declared here

Remove the shadowed definition of 'fpu', as the two definitions are the same.
Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: NAndy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: c5bedc68 ("x86/fpu: Get rid of PF_USED_MATH usage, convert it to fpu->fpstate_active")
Link: http://lkml.kernel.org/r/20181016202525.29437-3-bigeasy@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

6aa67676

x86/entry/64: Further improve paranoid_entry comments · ae852495

由 Andy Lutomirski 提交于 10月 14, 2018

Commit:

  16561f27 ("x86/entry: Add some paranoid entry/exit CR3 handling comments")

... added some comments.  This improves them a bit:

 - When I first read the new comments, it was unclear to me whether
   they were referring to the case where paranoid_entry interrupted
   other entry code or where paranoid_entry was itself interrupted.
   Clarify it.

 - Remove the EBX comment.  We no longer use EBX as a SWAPGS
   indicator.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/c47daa1888dc2298e7e1d3f82bd76b776ea33393.1539542111.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

ae852495

x86/entry/32: Clear the CS high bits · 04f4f954

由 Jan Kiszka 提交于 10月 15, 2018

Even if not on an entry stack, the CS's high bits must be
initialized because they are unconditionally evaluated in
PARANOID_EXIT_TO_KERNEL_MODE.

Failing to do so broke the boot on Galileo Gen2 and IOT2000 boards.

 [ bp: Make the commit message tone passive and impartial. ]

Fixes: b92a165d ("x86/entry/32: Handle Entry from Kernel-Mode on Entry-Stack")
Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Reviewed-by: NJoerg Roedel <jroedel@suse.de>
Acked-by: NJoerg Roedel <jroedel@suse.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Andrea Arcangeli <aarcange@redhat.com>
CC: Andy Lutomirski <luto@kernel.org>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Brian Gerst <brgerst@gmail.com>
CC: Dave Hansen <dave.hansen@intel.com>
CC: David Laight <David.Laight@aculab.com>
CC: Denys Vlasenko <dvlasenk@redhat.com>
CC: Eduardo Valentin <eduval@amazon.com>
CC: Greg KH <gregkh@linuxfoundation.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Jiri Kosina <jkosina@suse.cz>
CC: Josh Poimboeuf <jpoimboe@redhat.com>
CC: Juergen Gross <jgross@suse.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Will Deacon <will.deacon@arm.com>
CC: aliguori@amazon.com
CC: daniel.gruss@iaik.tugraz.at
CC: hughd@google.com
CC: keescook@google.com
CC: linux-mm <linux-mm@kvack.org>
CC: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/f271c747-1714-5a5b-a71f-ae189a093b8d@siemens.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

04f4f954

x86/kconfig: Remove redundant 'default n' lines from all x86 Kconfig's · b3569d3a

由 Bartlomiej Zolnierkiewicz 提交于 10月 16, 2018

'default n' is the default value for any bool or tristate Kconfig
setting so there is no need to write it explicitly.

Also, since commit:

  f467c564 ("kconfig: only write '# CONFIG_FOO is not set' for visible symbols")

... the Kconfig behavior is the same regardless of 'default n' being present or not:

    ...
    One side effect of (and the main motivation for) this change is making
    the following two definitions behave exactly the same:

        config FOO
                bool

        config FOO
                bool
                default n

    With this change, neither of these will generate a
    '# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied).
    That might make it clearer to people that a bare 'default n' is
    redundant.
    ...
Signed-off-by: NBartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Juergen Gross <jgross@suse.co>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20181016134217eucas1p2102984488b89178a865162553369025b%7EeGpI5NlJo0851008510eucas1p2D@eucas1p2.samsung.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

b3569d3a

16 10月, 2018 9 次提交

ataflop: fold headers into C file · 3e6b8c3c

由 Omar Sandoval 提交于 10月 11, 2018

atafd.h and atafdreg.h are only used from ataflop.c, so merge them in
there.
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

3e6b8c3c

locking/qspinlock, x86: Provide liveness guarantee · 7aa54be2

由 Peter Zijlstra 提交于 9月 26, 2018

On x86 we cannot do fetch_or() with a single instruction and thus end up
using a cmpxchg loop, this reduces determinism. Replace the fetch_or()
with a composite operation: tas-pending + load.

Using two instructions of course opens a window we previously did not
have. Consider the scenario:

	CPU0		CPU1		CPU2

 1)	lock
	  trylock -> (0,0,1)

 2)			lock
			  trylock /* fail */

 3)	unlock -> (0,0,0)

 4)					lock
					  trylock -> (0,0,1)

 5)			  tas-pending -> (0,1,1)
			  load-val <- (0,1,0) from 3

 6)			  clear-pending-set-locked -> (0,0,1)

			  FAIL: _2_ owners

where 5) is our new composite operation. When we consider each part of
the qspinlock state as a separate variable (as we can when
_Q_PENDING_BITS == 8) then the above is entirely possible, because
tas-pending will only RmW the pending byte, so the later load is able
to observe prior tail and lock state (but not earlier than its own
trylock, which operates on the whole word, due to coherence).

To avoid this we need 2 things:

 - the load must come after the tas-pending (obviously, otherwise it
   can trivially observe prior state).

 - the tas-pending must be a full word RmW instruction, it cannot be an XCHGB for
   example, such that we cannot observe other state prior to setting
   pending.

On x86 we can realize this by using "LOCK BTS m32, r32" for
tas-pending followed by a regular load.

Note that observing later state is not a problem:

 - if we fail to observe a later unlock, we'll simply spin-wait for
   that store to become visible.

 - if we observe a later xchg_tail(), there is no difference from that
   xchg_tail() having taken place before the tas-pending.
Suggested-by: NWill Deacon <will.deacon@arm.com>
Reported-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NWill Deacon <will.deacon@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: andrea.parri@amarulasolutions.com
Cc: longman@redhat.com
Fixes: 59fb586b ("locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath")
Link: https://lkml.kernel.org/r/20181003130957.183726335@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

7aa54be2

x86/asm: 'Simplify' GEN_*_RMWcc() macros · 288e4521

由 Peter Zijlstra 提交于 10月 03, 2018

Currently the GEN_*_RMWcc() macros include a return statement, which
pretty much mandates we directly wrap them in a (inline) function.

Macros with return statements are tricky and, as per the above, limit
use, so remove the return statement and make them
statement-expressions. This allows them to be used more widely.

Also, shuffle the arguments a bit. Place the @cc argument as 3rd, this
makes it consistent between UNARY and BINARY, but more importantly, it
makes the @arg0 argument last.

Since the @arg0 argument is now last, we can do CPP trickery and make
it an optional argument, simplifying the users; 17 out of 18
occurences do not need this argument.

Finally, change to asm symbolic names, instead of the numeric ordering
of operands, which allows us to get rid of __BINARY_RMWcc_ARG and get
cleaner code overall.
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: JBeulich@suse.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@alien8.de
Cc: hpa@linux.intel.com
Link: https://lkml.kernel.org/r/20181003130957.108960094@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

288e4521

perf/x86/intel: Export mem events only if there's PEBS support · d4ae5529

由 Jiri Olsa 提交于 9月 06, 2018

Memory events depends on PEBS support and access to LDLAT MSR, but we
display them in /sys/devices/cpu/events even if the CPU does not
provide those, like for KVM guests.

That brings the false assumption that those events should be
available, while they fail event to open.

Separating the mem-* events attributes and merging them with
cpu_events only if there's PEBS support detected.

We could also check if LDLAT MSR is available, but the PEBS check
seems to cover the need now.
Suggested-by: NPeter Zijlstra <peterz@infradead.org>
Signed-off-by: NJiri Olsa <jolsa@kernel.org>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20180906135748.GC9577@kravaSigned-off-by: NIngo Molnar <mingo@kernel.org>

d4ae5529

arm64: cpufeature: Trap CTR_EL0 access only where it is necessary · 4afe8e79

由 Suzuki K Poulose 提交于 10月 09, 2018

When there is a mismatch in the CTR_EL0 field, we trap
access to CTR from EL0 on all CPUs to expose the safe
value. However, we could skip trapping on a CPU which
matches the safe value.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

4afe8e79

arm64: cpufeature: Fix handling of CTR_EL0.IDC field · 1602df02

由 Suzuki K Poulose 提交于 10月 09, 2018

CTR_EL0.IDC reports the data cache clean requirements for instruction
to data coherence. However, if the field is 0, we need to check the
CLIDR_EL1 fields to detect the status of the feature. Currently we
don't do this and generate a warning with tainting the kernel, when
there is a mismatch in the field among the CPUs. Also the userspace
doesn't have a reliable way to check the CLIDR_EL1 register to check
the status.

This patch fixes the problem by checking the CLIDR_EL1 fields, when
(CTR_EL0.IDC == 0) and updates the kernel's copy of the CTR_EL0 for
the CPU with the actual status of the feature. This would allow the
sanity check infrastructure to do the proper checking of the fields
and also allow the CTR_EL0 emulation code to supply the real status
of the feature.

Now, if a CPU has raw CTR_EL0.IDC == 0 and effective IDC == 1 (with
overall system wide IDC == 1), we need to expose the real value to
the user. So, we trap CTR_EL0 access on the CPU which reports incorrect
CTR_EL0.IDC.

Fixes: commit 6ae4b6e0 ("arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC")
Cc: Shanker Donthineni <shankerd@codeaurora.org>
Cc: Philip Elcan <pelcan@codeaurora.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

1602df02

arm64: cpufeature: ctr: Fix cpu capability check for late CPUs · 8ab66cbe

由 Suzuki K Poulose 提交于 10月 09, 2018

The matches() routine for a capability must honor the "scope"
passed to it and return the proper results.
i.e, when passed with SCOPE_LOCAL_CPU, it should check the
status of the capability on the current CPU. This is used by
verify_local_cpu_capabilities() on a late secondary CPU to make
sure that it's compliant with the established system features.
However, ARM64_HAS_CACHE_{IDC/DIC} always checks the system wide
registers and this could mean that a late secondary CPU could return
"true" (since the CPU hasn't updated the system wide registers yet)
and thus lead the system in an inconsistent state, where
the system assumes it has IDC/DIC feature, while the new CPU
doesn't.

Fixes: commit 6ae4b6e0 ("arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC")
Cc: Philip Elcan <pelcan@codeaurora.org>
Cc: Shanker Donthineni <shankerd@codeaurora.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

8ab66cbe

parisc: Fix uninitialized variable usage in unwind.c · cf8afe5c

由 Helge Deller 提交于 10月 16, 2018

As noticed by Dave Anglin, the last commit introduced a small bug where
the potentially uninitialized r struct is used instead of the regs
pointer as input for unwind_frame_init(). Fix it.
Signed-off-by: NHelge Deller <deller@gmx.de>
Reported-by: NJohn David Anglin <dave.anglin@bell.net>

cf8afe5c

Revert "sparc: Convert to using %pOFn instead of device_node.name" · a06ecbfe

由 David S. Miller 提交于 10月 15, 2018

This reverts commit 0b9871a3.

Causes crashes with qemu, interacts badly with commit commit
6d0a70a2 ("vsprintf: print OF node name using full_name")
etc.
Reported-by: NGuenter Roeck <linux@roeck-us.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a06ecbfe

15 10月, 2018 4 次提交

s390/sthyi: Fix machine name validity indication · b5130dc2

由 Janosch Frank 提交于 10月 02, 2018

When running as a level 3 guest with no host provided sthyi support
sclp_ocf_cpc_name_copy() will only return zeroes. Zeroes are not a
valid group name, so let's not indicate that the group name field is
valid.

Also the group name is not dependent on stsi, let's not return based
on stsi before setting it.

Fixes: 95ca2cb5 ("KVM: s390: Add sthyi emulation")
Signed-off-by: NJanosch Frank <frankja@linux.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

b5130dc2

sparc64: Set %l4 properly on trap return after handling signals. · d1f1f98c

由 David S. Miller 提交于 10月 14, 2018

If we did some signal processing, we have to reload the pt_regs
tstate register because it's value may have changed.

In doing so we also have to extract the %pil value contained in there
anre load that into %l4.

This value is at bit 20 and thus needs to be shifted down before we
later write it into the %pil register.

Most of the time this is harmless as we are returning to userspace
and the %pil is zero for that case.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d1f1f98c

sparc64: Make proc_id signed. · b3e1eb8e

由 David S. Miller 提交于 10月 14, 2018

So that when it is unset, ie. '-1', userspace can see it
properly.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b3e1eb8e

um: Convert ubd driver to blk-mq · 4e6da0fe

由 Richard Weinberger 提交于 11月 26, 2017

Convert the driver to the modern blk-mq framework.
As byproduct we get rid of our open coded restart logic and let
blk-mq handle it.
Signed-off-by: NRichard Weinberger <richard@nod.at>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

4e6da0fe

14 10月, 2018 5 次提交

x86/boot: Add -Wno-pointer-sign to KBUILD_CFLAGS · dca5203e

由 Nathan Chancellor 提交于 10月 12, 2018

When compiling the kernel with Clang, this warning appears even though
it is disabled for the whole kernel because this folder has its own set
of KBUILD_CFLAGS. It was disabled before the beginning of git history.

In file included from arch/x86/boot/compressed/kaslr.c:29:
In file included from arch/x86/boot/compressed/misc.h:21:
In file included from ./include/linux/elf.h:5:
In file included from ./arch/x86/include/asm/elf.h:77:
In file included from ./arch/x86/include/asm/vdso.h:11:
In file included from ./include/linux/mm_types.h:9:
In file included from ./include/linux/spinlock.h:88:
In file included from ./arch/x86/include/asm/spinlock.h:43:
In file included from ./arch/x86/include/asm/qrwlock.h:6:
./include/asm-generic/qrwlock.h:101:53: warning: passing 'u32 *' (aka
'unsigned int *') to parameter of type 'int *' converts between pointers
to integer types with different sign [-Wpointer-sign]
        if (likely(atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED)))
                                                           ^~~~~
./include/linux/compiler.h:76:40: note: expanded from macro 'likely'
# define likely(x)      __builtin_expect(!!(x), 1)
                                            ^
./include/asm-generic/atomic-instrumented.h:69:66: note: passing
argument to parameter 'old' here
static __always_inline bool atomic_try_cmpxchg(atomic_t *v, int *old, int new)
                                                                 ^
Signed-off-by: NNathan Chancellor <natechancellor@gmail.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lkml.kernel.org/r/20181013010713.6999-1-natechancellor@gmail.com

dca5203e

x86/time: Correct the attribute on jiffies' definition · 53c13ba8

由 Nathan Chancellor 提交于 10月 12, 2018

Clang warns that the declaration of jiffies in include/linux/jiffies.h
doesn't match the definition in arch/x86/time/kernel.c:

arch/x86/kernel/time.c:29:42: warning: section does not match previous declaration [-Wsection]
__visible volatile unsigned long jiffies __cacheline_aligned = INITIAL_JIFFIES;
                                         ^
./include/linux/cache.h:49:4: note: expanded from macro '__cacheline_aligned'
                 __section__(".data..cacheline_aligned")))
                 ^
./include/linux/jiffies.h:81:31: note: previous attribute is here
extern unsigned long volatile __cacheline_aligned_in_smp __jiffy_arch_data jiffies;
                              ^
./arch/x86/include/asm/cache.h:20:2: note: expanded from macro '__cacheline_aligned_in_smp'
        __page_aligned_data
        ^
./include/linux/linkage.h:39:29: note: expanded from macro '__page_aligned_data'
#define __page_aligned_data     __section(.data..page_aligned) __aligned(PAGE_SIZE)
                                ^
./include/linux/compiler_attributes.h:233:56: note: expanded from macro '__section'
#define __section(S)                    __attribute__((__section__(#S)))
                                                       ^
1 warning generated.

The declaration was changed in commit 7c30f352 ("jiffies.h: declare
jiffies and jiffies_64 with ____cacheline_aligned_in_smp") but wasn't
updated here. Make them match so Clang no longer warns.

Fixes: 7c30f352 ("jiffies.h: declare jiffies and jiffies_64 with ____cacheline_aligned_in_smp")
Signed-off-by: NNathan Chancellor <natechancellor@gmail.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181013005311.28617-1-natechancellor@gmail.com

53c13ba8

x86/entry: Add some paranoid entry/exit CR3 handling comments · 16561f27

由 Dave Hansen 提交于 10月 12, 2018

Andi Kleen was just asking me about the NMI CR3 handling and why
we restore it unconditionally.  I was *sure* we had documented it
well.  We did not.

Add some documentation.  We have common entry code where the CR3
value is stashed, but three places in two big code paths where we
restore it.  I put bulk of the comments in this common path and
then refer to it from the other spots.
Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: luto@kernel.org
Cc: bp@alien8.de
Cc: "H. Peter Anvin" <hpa@zytor.come
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/20181012232118.3EAAE77B@viggo.jf.intel.com

16561f27

x86/percpu: Fix this_cpu_read() · b59167ac

由 Peter Zijlstra 提交于 10月 11, 2018

Eric reported that a sequence count loop using this_cpu_read() got
optimized out. This is wrong, this_cpu_read() must imply READ_ONCE()
because the interface is IRQ-safe, therefore an interrupt can have
changed the per-cpu value.

Fixes: 7c3576d2 ("[PATCH] i386: Convert PDA into the percpu section")
Reported-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NEric Dumazet <edumazet@google.com>
Cc: hpa@zytor.com
Cc: eric.dumazet@gmail.com
Cc: bp@alien8.de
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181011104019.748208519@infradead.org

b59167ac

x86/tsc: Force inlining of cyc2ns bits · 4907c68a

由 Peter Zijlstra 提交于 10月 11, 2018

Looking at the asm for native_sched_clock() I noticed we don't inline
enough. Mostly caused by sharing code with cyc2ns_read_begin(), which
we didn't used to do. So mark all that __force_inline to make it DTRT.

Fixes: 59eaef78 ("x86/tsc: Remodel cyc2ns to use seqcount_latch()")
Reported-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: hpa@zytor.com
Cc: eric.dumazet@gmail.com
Cc: bp@alien8.de
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181011104019.695196158@infradead.org

4907c68a

13 10月, 2018 2 次提交

KVM: vmx: hyper-v: don't pass EPT configuration info to vmx_hv_remote_flush_tlb() · 5f8bb004

由 Vitaly Kuznetsov 提交于 10月 11, 2018

I'm observing random crashes in multi-vCPU L2 guests running on KVM on
Hyper-V. I bisected the issue to the commit 877ad952 ("KVM: vmx: Add
tlb_remote_flush callback support"). Hyper-V TLFS states:

"AddressSpace specifies an address space ID (an EPT PML4 table pointer)"

So apparently, Hyper-V doesn't expect us to pass naked EPTP, only PML4
pointer should be used. Strip off EPT configuration information before
calling into vmx_hv_remote_flush_tlb().

Fixes: 877ad952 ("KVM: vmx: Add tlb_remote_flush callback support")
Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

5f8bb004

sparc: Throttle perf events properly. · 455adb31

由 David S. Miller 提交于 10月 12, 2018

Like x86 and arm, call perf_sample_event_took() in perf event
NMI interrupt handler.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

455adb31

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功