提交 · bd657aa3dd8514e62486ce7f90b5e484c18d684d · openeuler / Kernel

08 7月, 2020 1 次提交

x86/cpufeatures: Add Architectural LBRs feature bit · bd657aa3

由 Kan Liang 提交于 7月 03, 2020

CPUID.(EAX=07H, ECX=0):EDX[19] indicates whether an Intel CPU supports
Architectural LBRs.

The "X86_FEATURE_..., word 18" is already mirrored from CPUID
"0x00000007:0 (EDX)". Add X86_FEATURE_ARCH_LBR under the "word 18"
section.

The feature will appear as "arch_lbr" in /proc/cpuinfo.

The Architectural Last Branch Records (LBR) feature enables recording
of software path history by logging taken branches and other control
flows. The feature will be supported in the perf_events subsystem.
Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NDave Hansen <dave.hansen@intel.com>
Link: https://lkml.kernel.org/r/1593780569-62993-2-git-send-email-kan.liang@linux.intel.com

bd657aa3

02 7月, 2020 5 次提交

perf/x86: Keep LBR records unchanged in host context for guest usage · e1ad1ac2

由 Like Xu 提交于 6月 13, 2020

When a guest wants to use the LBR registers, its hypervisor creates a guest
LBR event and let host perf schedules it. The LBR records msrs are
accessible to the guest when its guest LBR event is scheduled on
by the perf subsystem.

Before scheduling this event out, we should avoid host changes on
IA32_DEBUGCTLMSR or LBR_SELECT. Otherwise, some unexpected branch
operations may interfere with guest behavior, pollute LBR records, and even
cause host branches leakage. In addition, the read operation
on host is also avoidable.

To ensure that guest LBR records are not lost during the context switch,
the guest LBR event would enable the callstack mode which could
save/restore guest unread LBR records with the help of
intel_pmu_lbr_sched_task() naturally.

However, the guest LBR_SELECT may changes for its own use and the host
LBR event doesn't save/restore it. To ensure that we doesn't lost the guest
LBR_SELECT value when the guest LBR event is running, the vlbr_constraint
is bound up with a new constraint flag PERF_X86_EVENT_LBR_SELECT.
Signed-off-by: NLike Xu <like.xu@linux.intel.com>
Signed-off-by: NWei Wang <wei.w.wang@intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200514083054.62538-6-like.xu@linux.intel.com

e1ad1ac2

perf/x86: Add constraint to create guest LBR event without hw counter · 097e4311

由 Like Xu 提交于 6月 13, 2020

The hypervisor may request the perf subsystem to schedule a time window
to directly access the LBR records msrs for its own use. Normally, it would
create a guest LBR event with callstack mode enabled, which is scheduled
along with other ordinary LBR events on the host but in an exclusive way.

To avoid wasting a counter for the guest LBR event, the perf tracks its
hw->idx via INTEL_PMC_IDX_FIXED_VLBR and assigns it with a fake VLBR
counter with the help of new vlbr_constraint. As with the BTS event,
there is actually no hardware counter assigned for the guest LBR event.
Signed-off-by: NLike Xu <like.xu@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200514083054.62538-5-like.xu@linux.intel.com

097e4311

perf/x86/lbr: Add interface to get LBR information · b2d65047

由 Like Xu 提交于 6月 13, 2020

The LBR records msrs are model specific. The perf subsystem has already
obtained the base addresses of LBR records based on the cpu model.

Therefore, an interface is added to allow callers outside the perf
subsystem to obtain these LBR information. It's useful for hypervisors
to emulate the LBR feature for guests with less code.
Signed-off-by: NLike Xu <like.xu@linux.intel.com>
Signed-off-by: NWei Wang <wei.w.wang@intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200613080958.132489-4-like.xu@linux.intel.com

b2d65047

perf/x86/core: Refactor hw->idx checks and cleanup · 027440b5

由 Like Xu 提交于 6月 13, 2020

For intel_pmu_en/disable_event(), reorder the branches checks for hw->idx
and make them sorted by probability: gp,fixed,bts,others.

Clean up the x86_assign_hw_event() by converting multiple if-else
statements to a switch statement.

To skip x86_perf_event_update() and x86_perf_event_set_period(),
it's generic to replace "idx == INTEL_PMC_IDX_FIXED_BTS" check with
'!hwc->event_base' because that should be 0 for all non-gp/fixed cases.

Wrap related bit operations into intel_set/clear_masks() and make the main
path more cleaner and readable.

No functional changes.
Signed-off-by: NLike Xu <like.xu@linux.intel.com>
Original-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200613080958.132489-3-like.xu@linux.intel.com

027440b5

perf/x86: Fix variable types for LBR registers · 3cb9d546

由 Wei Wang 提交于 6月 13, 2020

The MSR variable type can be 'unsigned int', which uses less memory than
the longer 'unsigned long'. Fix 'struct x86_pmu' for that. The lbr_nr won't
be a negative number, so make it 'unsigned int' as well.
Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NWei Wang <wei.w.wang@intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200613080958.132489-2-like.xu@linux.intel.com

3cb9d546

28 6月, 2020 1 次提交

Revert "ARM: sti: Implement dummy L2 cache's write_sec" · 0f77ce26

由 Patrice Chotard 提交于 6月 18, 2020

This reverts commit 7b8e0188.

Initially, STiH410-B2260 was supposed to be secured, that's why
l2c_write_sec was stubbed to avoid secure register access from
non secure world.

But by default, STiH410-B2260 is running in non secure mode,
so L2 cache register accesses are authorized, l2c_write_sec stub
is not needed.

With this patch, L2 cache is configured and performance are enhanced.

Link: https://lore.kernel.org/r/20200618172456.29475-1-patrice.chotard@st.comSigned-off-by: NPatrice Chotard <patrice.chotard@st.com>
Cc: Alain Volmat <alain.volmat@st.com>
Signed-off-by: NArnd Bergmann <arnd@arndb.de>

0f77ce26

26 6月, 2020 7 次提交

arm64: use PAGE_KERNEL_ROX directly in alloc_insn_page · 10d5e97c

由 Christoph Hellwig 提交于 6月 25, 2020

Use PAGE_KERNEL_ROX directly instead of allocating RWX and setting the
page read-only just after the allocation.

Link: http://lkml.kernel.org/r/20200618064307.32739-3-hch@lst.deSigned-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NDavid Hildenbrand <david@redhat.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

10d5e97c

x86/hyperv: allocate the hypercall page with only read and execute bits · 800e26b8

由 Christoph Hellwig 提交于 6月 25, 2020

Patch series "fix a hyperv W^X violation and remove vmalloc_exec"

Dexuan reported a W^X violation due to the fact that the hyper hypercall
page due switching it to be allocated using vmalloc_exec.

The problem is that PAGE_KERNEL_EXEC as used by vmalloc_exec actually
sets writable permissions in the pte.  This series fixes the issue by
switching to the low-level __vmalloc_node_range interface that allows
specifing more detailed permissions instead.  It then also open codes
the other two callers and removes the somewhat confusing vmalloc_exec
interface.

Peter noted that the hyper hypercall page allocation also has another
long standing issue in that it shouldn't use the full vmalloc but just
the module space.  This issue is so far theoretical as the allocation is
done early in the boot process.  I plan to fix it with another bigger
series for 5.9.

This patch (of 3):

Avoid a W^X violation cause by the fact that PAGE_KERNEL_EXEC includes
the writable bit.

For this resurrect the removed PAGE_KERNEL_RX definition, but as
PAGE_KERNEL_ROX to match arm64 and powerpc.

Link: http://lkml.kernel.org/r/20200618064307.32739-2-hch@lst.de
Fixes: 78bb17f7 ("x86/hyperv: use vmalloc_exec for the hypercall page")
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reported-by: NDexuan Cui <decui@microsoft.com>
Tested-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: NWei Liu <wei.liu@kernel.org>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

800e26b8

openrisc: fix boot oops when DEBUG_VM is enabled · 313a5257

由 Stafford Horne 提交于 6月 25, 2020

Since v5.8-rc1 OpenRISC Linux fails to boot when DEBUG_VM is enabled.
This has been bisected to commit 42fc5414 ("mmap locking API: add
mmap_assert_locked() and mmap_assert_write_locked()").

The added locking checks exposed the issue that OpenRISC was not taking
this mmap lock when during page walks for DMA operations.  This patch
locks and unlocks the mmap lock for page walking.

Link: http://lkml.kernel.org/r/20200617090247.1680188-1-shorne@gmail.com
Fixes: 42fc5414 ("mmap locking API: add mmap_assert_locked() and mmap_assert_write_locked()"
Signed-off-by: NStafford Horne <shorne@gmail.com>
Reviewed-by: NMichel Lespinasse <walken@google.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Steven Price <steven.price@arm.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

313a5257

riscv: Fixup __vdso_gettimeofday broke dynamic ftrace · e05d57dc

由 Guo Ren 提交于 6月 23, 2020

For linux-5.8-rc1, enable ftrace of riscv will cause boot panic:

[    2.388980] Run /sbin/init as init process
[    2.529938] init[39]: unhandled signal 4 code 0x1 at 0x0000003ff449e000
[    2.531078] CPU: 0 PID: 39 Comm: init Not tainted 5.8.0-rc1-dirty #13
[    2.532719] epc: 0000003ff449e000 ra : 0000003ff449e954 sp : 0000003fffedb900
[    2.534005]  gp : 00000000000e8528 tp : 0000003ff449d800 t0 : 000000000000001e
[    2.534965]  t1 : 000000000000000a t2 : 0000003fffedb89e s0 : 0000003fffedb920
[    2.536279]  s1 : 0000003fffedb940 a0 : 0000003ff43d4b2c a1 : 0000000000000000
[    2.537334]  a2 : 0000000000000001 a3 : 0000000000000000 a4 : fffffffffbad8000
[    2.538466]  a5 : 0000003ff449e93a a6 : 0000000000000000 a7 : 0000000000000000
[    2.539511]  s2 : 0000000000000000 s3 : 0000003ff448412c s4 : 0000000000000010
[    2.541260]  s5 : 0000000000000016 s6 : 00000000000d0a30 s7 : 0000003fffedba70
[    2.542152]  s8 : 0000000000000000 s9 : 0000000000000000 s10: 0000003fffedb960
[    2.543335]  s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000003fffedb8a0
[    2.544471]  t5 : 0000000000000000 t6 : 0000000000000000
[    2.545730] status: 0000000000004020 badaddr: 00000000464c457f cause: 0000000000000002
[    2.549867] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
[    2.551267] CPU: 0 PID: 1 Comm: init Not tainted 5.8.0-rc1-dirty #13
[    2.552061] Call Trace:
[    2.552626] [<ffffffe00020374a>] walk_stackframe+0x0/0xc4
[    2.553486] [<ffffffe0002039f4>] show_stack+0x40/0x4c
[    2.553995] [<ffffffe00054a6ae>] dump_stack+0x7a/0x98
[    2.554615] [<ffffffe00020b9b8>] panic+0x114/0x2f4
[    2.555395] [<ffffffe00020ebd6>] do_exit+0x89c/0x8c2
[    2.555949] [<ffffffe00020f930>] do_group_exit+0x3a/0x90
[    2.556715] [<ffffffe000219e08>] get_signal+0xe2/0x6e6
[    2.557388] [<ffffffe000202d72>] do_notify_resume+0x6a/0x37a
[    2.558089] [<ffffffe000201c16>] ret_from_exception+0x0/0xc

"ra:0x3ff449e954" is the return address of "call _mcount" in the
prologue of __vdso_gettimeofday(). Without proper relocate, pc jmp
to 0x0000003ff449e000 (vdso map base) with a illegal instruction
trap.

The solution comes from arch/arm64/kernel/vdso/Makefile:

CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os $(CC_FLAGS_SCS)

 - CC_FLAGS_SCS is ShadowCallStack feature in Clang and only
   implemented for arm64, no use for riscv.

Fixes: ad5d1122 ("riscv: use vDSO common flow to reduce the latency of the time-related functions")
Cc: stable@vger.kernel.org
Signed-off-by: NGuo Ren <guoren@linux.alibaba.com>
Reviewed-by: NVincent Chen <vincent.chen@sifive.com>
Signed-off-by: NPalmer Dabbelt <palmerdabbelt@google.com>

e05d57dc

riscv: Add extern declarations for vDSO time-related functions · e93b327d

由 Vincent Chen 提交于 6月 23, 2020

Add extern declarations for vDSO time-related functions to notify the
compiler these functions will be used in somewhere to avoid
"no previous prototype" compile warning.
Reported-by: Nkernel test robot <lkp@intel.com>
Signed-off-by: NVincent Chen <vincent.chen@sifive.com>
Signed-off-by: NPalmer Dabbelt <palmerdabbelt@google.com>

e93b327d

riscv: Add -fPIC option to CFLAGS_vgettimeofday.o · a0fc3b32

由 Vincent Chen 提交于 6月 23, 2020

The time related vDSO functions use a variable, vdso_data, to access the
vDSO data page to get the system time information. Because the vdso_data
for CFLAGS_vgettimeofday.o is an external variable defined in vdso.o,
the CFLAGS_vgettimeofday.o should be compiled with -fPIC to ensure
that vdso_data is addressable.
Reported-by: Nkernel test robot <lkp@intel.com>
Signed-off-by: NVincent Chen <vincent.chen@sifive.com>
Signed-off-by: NPalmer Dabbelt <palmerdabbelt@google.com>

a0fc3b32

arm64: Add KRYO{3,4}XX silver CPU cores to SSB safelist · 108447fd

由 Sai Prakash Ranjan 提交于 6月 25, 2020

QCOM KRYO{3,4}XX silver/LITTLE CPU cores are based on
Cortex-A55 and are SSB safe, hence add them to SSB
safelist -> arm64_ssb_cpus[].
Reported-by: NStephen Boyd <swboyd@chromium.org>
Signed-off-by: NSai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Reviewed-by: NDouglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200625103123.7240-1-saiprakash.ranjan@codeaurora.orgSigned-off-by: NWill Deacon <will@kernel.org>

108447fd

25 6月, 2020 4 次提交

arm64: perf: Report the PC value in REGS_ABI_32 mode · 8dfe804a

由 Jiping Ma 提交于 5月 11, 2020

A 32-bit perf querying the registers of a compat task using REGS_ABI_32
will receive zeroes from w15, when it expects to find the PC.

Return the PC value for register dwarf register 15 when returning register
values for a compat task to perf.

Cc: <stable@vger.kernel.org>
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NJiping Ma <jiping.ma2@windriver.com>
Link: https://lore.kernel.org/r/1589165527-188401-1-git-send-email-jiping.ma2@windriver.com
[will: Shuffled code and added a comment]
Signed-off-by: NWill Deacon <will@kernel.org>

8dfe804a

x86/entry: Fix #UD vs WARN more · 145a773a

由 Peter Zijlstra 提交于 6月 16, 2020

vmlinux.o: warning: objtool: exc_invalid_op()+0x47: call to probe_kernel_read() leaves .noinstr.text section

Since we use UD2 as a short-cut for 'CALL __WARN', treat it as such.
Have the bare exception handler do the report_bug() thing.

Fixes: 15a416e8 ("x86/entry: Treat BUG/WARN as NMI-like entries")
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: NAndy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200622114713.GE577403@hirez.programming.kicks-ass.net

145a773a

x86/entry: Increase entry_stack size to a full page · c7aadc09

由 Peter Zijlstra 提交于 6月 17, 2020

Marco crashed in bad_iret with a Clang11/KCSAN build due to
overflowing the stack. Now that we run C code on it, expand it to a
full page.
Suggested-by: NAndy Lutomirski <luto@amacapital.net>
Reported-by: NMarco Elver <elver@google.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NLai Jiangshan <jiangshanlai@gmail.com>
Tested-by: NMarco Elver <elver@google.com>
Link: https://lkml.kernel.org/r/20200618144801.819246178@infradead.org

c7aadc09

x86/entry: Fixup bad_iret vs noinstr · e3a9e681

由 Peter Zijlstra 提交于 6月 17, 2020

vmlinux.o: warning: objtool: fixup_bad_iret()+0x8e: call to memcpy() leaves .noinstr.text section

Worse, when KASAN there is no telling what memcpy() actually is. Force
the use of __memcpy() which is our assmebly implementation.
Reported-by: NMarco Elver <elver@google.com>
Suggested-by: NMarco Elver <elver@google.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: NMarco Elver <elver@google.com>
Link: https://lkml.kernel.org/r/20200618144801.760070502@infradead.org

e3a9e681

24 6月, 2020 6 次提交

arm64: kpti: Add KRYO{3, 4}XX silver CPU cores to kpti safelist · f4617be3

由 Sai Prakash Ranjan 提交于 6月 24, 2020

QCOM KRYO{3,4}XX silver/LITTLE CPU cores are based on Cortex-A55
and are meltdown safe, hence add them to kpti_safe_list[].
Signed-off-by: NSai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Link: https://lore.kernel.org/r/20200624123406.3472-1-saiprakash.ranjan@codeaurora.orgSigned-off-by: NWill Deacon <will@kernel.org>

f4617be3

arm64: Don't insert a BTI instruction at inner labels · 2d21889f

由 Jean-Philippe Brucker 提交于 6月 24, 2020

Some ftrace features are broken since commit 714a8d02 ("arm64: asm:
Override SYM_FUNC_START when building the kernel with BTI"). For example
the function_graph tracer:

$ echo function_graph > /sys/kernel/debug/tracing/current_tracer
[   36.107016] WARNING: CPU: 0 PID: 115 at kernel/trace/ftrace.c:2691 ftrace_modify_all_code+0xc8/0x14c

When ftrace_modify_graph_caller() attempts to write a branch at
ftrace_graph_call, it finds the "BTI J" instruction inserted by
SYM_INNER_LABEL() instead of a NOP, and aborts.

It turns out we don't currently need the BTI landing pads inserted by
SYM_INNER_LABEL:

* ftrace_call and ftrace_graph_call are only used for runtime patching
  of the active tracer. The patched code is not reached from a branch.
* install_el2_stub is reached from a CBZ instruction, which doesn't
  change PSTATE.BTYPE.
* __guest_exit is reached from B instructions in the hyp-entry vectors,
  which aren't subject to BTI checks either.

Remove the BTI annotation from SYM_INNER_LABEL.

Fixes: 714a8d02 ("arm64: asm: Override SYM_FUNC_START when building the kernel with BTI")
Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: NMark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20200624112253.1602786-1-jean-philippe@linaro.orgSigned-off-by: NWill Deacon <will@kernel.org>

2d21889f

arm64: vdso: Don't use gcc plugins for building vgettimeofday.c · e56404e8

由 Alexander Popov 提交于 6月 24, 2020

Don't use gcc plugins for building arch/arm64/kernel/vdso/vgettimeofday.c
to avoid unneeded instrumentation.
Signed-off-by: NAlexander Popov <alex.popov@linux.com>
Link: https://lore.kernel.org/r/20200624123330.83226-4-alex.popov@linux.comSigned-off-by: NWill Deacon <will@kernel.org>

e56404e8

arm64: vdso: Only pass --no-eh-frame-hdr when linker supports it · 49a3b0e1

由 Will Deacon 提交于 6月 24, 2020

Commit 87676cfc ("arm64: vdso: Disable dwarf unwinding through the
sigreturn trampoline") unconditionally passes the '--no-eh-frame-hdr'
option to the linker when building the native vDSO in an attempt to
prevent generation of the .eh_frame_hdr section, the presence of which
has been implicated in segfaults originating from the libgcc unwinder.

Unfortunately, not all versions of binutils support this option, which
has been shown to cause build failures in linux-next:

  |   CALL    scripts/atomic/check-atomics.sh
  |   CALL    scripts/checksyscalls.sh
  |   LD      arch/arm64/kernel/vdso/vdso.so.dbg
  | ld: unrecognized option '--no-eh-frame-hdr'
  | ld: use the --help option for usage information
  | arch/arm64/kernel/vdso/Makefile:64: recipe for target
  | 'arch/arm64/kernel/vdso/vdso.so.dbg' failed
  | make[1]: *** [arch/arm64/kernel/vdso/vdso.so.dbg] Error 1
  | arch/arm64/Makefile:175: recipe for target 'vdso_prepare' failed
  | make: *** [vdso_prepare] Error 2

Only link the vDSO with '--no-eh-frame-hdr' when the linker supports it.
If we end up with the section due to linker defaults, the absence of CFI
information in the sigreturn trampoline will prevent the unwinder from
breaking.

Link: https://lore.kernel.org/r/7a7e31a8-9a7b-2428-ad83-2264f20bdc2d@hisilicon.com
Fixes: 87676cfc ("arm64: vdso: Disable dwarf unwinding through the sigreturn trampoline")
Reported-by: NShaokun Zhang <zhangshaokun@hisilicon.com>
Tested-by: NJon Hunter <jonathanh@nvidia.com>
Signed-off-by: NWill Deacon <will@kernel.org>

49a3b0e1

ARM: imx6: add missing put_device() call in imx6q_suspend_init() · 48454460

由 yu kuai 提交于 6月 04, 2020

if of_find_device_by_node() succeed, imx6q_suspend_init() doesn't have a
corresponding put_device(). Thus add a jump target to fix the exception
handling for this function implementation.
Signed-off-by: Nyu kuai <yukuai3@huawei.com>
Signed-off-by: NShawn Guo <shawnguo@kernel.org>

48454460

ARM: imx5: add missing put_device() call in imx_suspend_alloc_ocram() · 586745f1

由 yu kuai 提交于 6月 04, 2020

if of_find_device_by_node() succeed, imx_suspend_alloc_ocram() doesn't
have a corresponding put_device(). Thus add a jump target to fix the
exception handling for this function implementation.

Fixes: 1579c7b9 ("ARM: imx53: Set DDR pins to high impedance when in suspend to RAM.")
Signed-off-by: Nyu kuai <yukuai3@huawei.com>
Signed-off-by: NShawn Guo <shawnguo@kernel.org>

586745f1

23 6月, 2020 16 次提交

arm64: Depend on newer binutils when building PAC · 4dc9b282

由 Mark Brown 提交于 6月 19, 2020

Versions of binutils prior to 2.33.1 don't understand the ELF notes that
are added by modern compilers to indicate the PAC and BTI options used
to build the code. This causes them to emit large numbers of warnings in
the form:

aarch64-linux-gnu-nm: warning: .tmp_vmlinux.kallsyms2: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0000000

during the kernel build which is currently causing quite a bit of
disruption for automated build testing using clang.

In commit 15cd0e67 (arm64: Kconfig: ptrauth: Add binutils version
check to fix mismatch) we added a dependency on binutils to avoid this
issue when building with versions of GCC that emit the notes but did not
do so for clang as it was believed that the existing check for
.cfi_negate_ra_state was already requiring a new enough binutils. This
does not appear to be the case for some versions of binutils (eg, the
binutils in Debian 10) so instead refactor so we require a new enough
GNU binutils in all cases other than when we are using an old GCC
version that does not emit notes.

Other, more exotic, combinations of tools are possible such as using
clang, lld and gas together are possible and may have further problems
but rather than adding further version checks it looks like the most
robust thing will be to just test that we can build cleanly with the
configured tools but that will require more review and discussion so do
this for now to address the immediate problem disrupting build testing.
Reported-by: NKernelCI <bot@kernelci.org>
Reported-by: NNick Desaulniers <ndesaulniers@google.com>
Signed-off-by: NMark Brown <broonie@kernel.org>
Reviewed-by: NNick Desaulniers <ndesaulniers@google.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1054
Link: https://lore.kernel.org/r/20200619123550.48098-1-broonie@kernel.orgSigned-off-by: NWill Deacon <will@kernel.org>

4dc9b282

arm64: compat: Remove 32-bit sigreturn code from the vDSO · 2d071968

由 Will Deacon 提交于 6月 22, 2020

The sigreturn code in the compat vDSO is unused. Remove it.
Reviewed-by: NVincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: NArd Biesheuvel <ardb@kernel.org>
Reviewed-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NWill Deacon <will@kernel.org>

2d071968

arm64: compat: Always use sigpage for sigreturn trampoline · 8e411be6

由 Will Deacon 提交于 6月 22, 2020

The 32-bit sigreturn trampoline in the compat sigpage matches the binary
representation of the arch/arm/ sigpage exactly. This is important for
debuggers (e.g. GDB) and unwinders (e.g. libunwind) since they rely
on matching the instruction sequence in order to identify that they are
unwinding through a signal. The same cannot be said for the sigreturn
trampoline in the compat vDSO, which defeats the unwinder heuristics and
instead attempts to use unwind directives for the unwinding. This is in
contrast to arch/arm/, which never uses the vDSO for sigreturn.

Ensure compatibility with arch/arm/ and existing unwinders by always
using the sigpage for the sigreturn trampoline, regardless of the
presence of the compat vDSO.
Reviewed-by: NVincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: NArd Biesheuvel <ardb@kernel.org>
Reviewed-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NWill Deacon <will@kernel.org>

8e411be6

arm64: compat: Allow 32-bit vdso and sigpage to co-exist · a39060b0

由 Will Deacon 提交于 6月 22, 2020

In preparation for removing the signal trampoline from the compat vDSO,
allow the sigpage and the compat vDSO to co-exist.

For the moment the vDSO signal trampoline will still be used when built.
Subsequent patches will move to the sigpage consistently.
Acked-by: NDave Martin <Dave.Martin@arm.com>
Reviewed-by: NVincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: NArd Biesheuvel <ardb@kernel.org>
Reviewed-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NWill Deacon <will@kernel.org>

a39060b0

arm64: vdso: Disable dwarf unwinding through the sigreturn trampoline · 87676cfc

由 Will Deacon 提交于 6月 22, 2020

Commit 7e9f5e66 ("arm64: vdso: Add --eh-frame-hdr to ldflags") results
in a .eh_frame_hdr section for the vDSO, which in turn causes the libgcc
unwinder to unwind out of signal handlers using the .eh_frame information
populated by our .cfi directives. In conjunction with a4eb355a
("arm64: vdso: Fix CFI directives in sigreturn trampoline"), this has
been shown to cause segmentation faults originating from within the
unwinder during thread cancellation:

 | Thread 14 "virtio-net-rx" received signal SIGSEGV, Segmentation fault.
 | 0x0000000000435e24 in uw_frame_state_for ()
 | (gdb) bt
 | #0  0x0000000000435e24 in uw_frame_state_for ()
 | #1  0x0000000000436e88 in _Unwind_ForcedUnwind_Phase2 ()
 | #2  0x00000000004374d8 in _Unwind_ForcedUnwind ()
 | #3  0x0000000000428400 in __pthread_unwind (buf=<optimized out>) at unwind.c:121
 | #4  0x0000000000429808 in __do_cancel () at ./pthreadP.h:304
 | #5  sigcancel_handler (sig=32, si=0xffff33c743f0, ctx=<optimized out>) at nptl-init.c:200
 | #6  sigcancel_handler (sig=<optimized out>, si=0xffff33c743f0, ctx=<optimized out>) at nptl-init.c:165
 | #7  <signal handler called>
 | #8  futex_wait_cancelable (private=0, expected=0, futex_word=0x3890b708) at ../sysdeps/unix/sysv/linux/futex-internal.h:88

After considerable bashing of heads, it appears that our CFI directives
for unwinding out of the sigreturn trampoline are only processed by libgcc
when both a .eh_frame_hdr section is present *and* the mysterious NOP is
covered by an entry in .eh_frame. With both of these now in place, it has
highlighted that our CFI directives are not comprehensive enough to
restore the stack pointer of the interrupted context. This results in libgcc
falling back to an arm64-specific unwinder after computing a bogus PC value
from the unwind tables. The unwinder promptly dereferences this bogus address
in an attempt to see if the pointed-to instruction sequence looks like
the sigreturn trampoline.

Restore the old unwind behaviour, which relied solely on heuristics in
the unwinder, by removing the .eh_frame_hdr section from the vDSO and
commenting out the insufficient CFI directives for now. Add comments to
explain the current, miserable state of affairs.

Cc: Tamas Zsoldos <tamas.zsoldos@arm.com>
Cc: Szabolcs Nagy <szabolcs.nagy@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Daniel Kiss <daniel.kiss@arm.com>
Acked-by: NDave Martin <Dave.Martin@arm.com>
Reviewed-by: NVincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: NArd Biesheuvel <ardb@kernel.org>
Reported-by: NArd Biesheuvel <ardb@kernel.org>
Signed-off-by: NWill Deacon <will@kernel.org>

87676cfc

s390/debug: avoid kernel warning on too large number of pages · 827c4913

由 Christian Borntraeger 提交于 3月 31, 2020

When specifying insanely large debug buffers a kernel warning is
printed. The debug code does handle the error gracefully, though.
Instead of duplicating the check let us silence the warning to
avoid crashes when panic_on_warn is used.
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>

827c4913

s390/kasan: fix early pgm check handler execution · 998f5bbe

由 Vasily Gorbik 提交于 6月 17, 2020

Currently if early_pgm_check_handler is called it ends up in pgm check
loop. The problem is that early_pgm_check_handler is instrumented by
KASAN but executed without DAT flag enabled which leads to addressing
exception when KASAN checks try to access shadow memory.

Fix that by executing early handlers with DAT flag on under KASAN as
expected.
Reported-and-tested-by: NAlexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>

998f5bbe

s390: fix system call single stepping · e64a1618

由 Sven Schnelle 提交于 6月 17, 2020

When single stepping an svc instruction on s390, the kernel is entered
with a PER program check interruption. The program check handler than
jumps to the system call handler by reloading the PSW. The code didn't
set GPR13 to the thread pointer in struct task_struct. This made the
kernel access invalid memory while trying to fetch the syscall function
address. Fix this by always assigned GPR13 after .Lsysc_per.

Fixes: 0b0ed657 ("s390: remove critical section cleanup from entry.S")
Reported-and-tested-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NSven Schnelle <svens@linux.ibm.com>
Reviewed-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>

e64a1618

KVM: VMX: Remove vcpu_vmx's defunct copy of host_pkru · e4553b49

由 Sean Christopherson 提交于 6月 16, 2020

Remove vcpu_vmx.host_pkru, which got left behind when PKRU support was
moved to common x86 code.

No functional change intended.

Fixes: 37486135 ("KVM: x86: Fix pkru save/restore when guest CR4.PKE=0, move it to x86.c")
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200617034123.25647-1-sean.j.christopherson@intel.com>
Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: NJim Mattson <jmattson@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

e4553b49

KVM: x86: allow TSC to differ by NTP correction bounds without TSC scaling · 26769f96

由 Marcelo Tosatti 提交于 6月 16, 2020

The Linux TSC calibration procedure is subject to small variations
(its common to see +-1 kHz difference between reboots on a given CPU, for example).

So migrating a guest between two hosts with identical processor can fail, in case
of a small variation in calibrated TSC between them.

Without TSC scaling, the current kernel interface will either return an error
(if user_tsc_khz <= tsc_khz) or enable TSC catchup mode.

This change enables the following TSC tolerance check to
accept KVM_SET_TSC_KHZ within tsc_tolerance_ppm (which is 250ppm by default).

        /*
         * Compute the variation in TSC rate which is acceptable
         * within the range of tolerance and decide if the
         * rate being applied is within that bounds of the hardware
         * rate.  If so, no scaling or compensation need be done.
         */
        thresh_lo = adjust_tsc_khz(tsc_khz, -tsc_tolerance_ppm);
        thresh_hi = adjust_tsc_khz(tsc_khz, tsc_tolerance_ppm);
        if (user_tsc_khz < thresh_lo || user_tsc_khz > thresh_hi) {
                pr_debug("kvm: requested TSC rate %u falls outside tolerance [%u,%u]\n", user_tsc_khz, thresh_lo, thresh_hi);
                use_scaling = 1;
        }

NTP daemon in the guest can correct this difference (NTP can correct upto 500ppm).
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

Message-Id: <20200616114741.GA298183@fuller.cnet>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

26769f96

KVM: X86: Fix MSR range of APIC registers in X2APIC mode · bf10bd0b

由 Xiaoyao Li 提交于 6月 16, 2020

Only MSR address range 0x800 through 0x8ff is architecturally reserved
and dedicated for accessing APIC registers in x2APIC mode.

Fixes: 0105d1a5 ("KVM: x2apic interface to lapic")
Signed-off-by: NXiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20200616073307.16440-1-xiaoyao.li@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: NSean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: NJim Mattson <jmattson@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

bf10bd0b

ARM: dts: imx6ul-kontron: Change WDOG_ANY signal from push-pull to open-drain · d22a16cc

由 Frieder Schrempf 提交于 5月 28, 2020

The WDOG_ANY signal is connected to the RESET_IN signal of the SoM
and baseboard. It is currently configured as push-pull, which means
that if some external device like a programmer wants to assert the
RESET_IN signal by pulling it to ground, it drives against the high
level WDOG_ANY output of the SoC.

To fix this we set the WDOG_ANY signal to open-drain configuration.
That way we make sure that the RESET_IN can be asserted by the
watchdog as well as by external devices.

Fixes: 1ea4b76c ("ARM: dts: imx6ul-kontron-n6310: Add Kontron i.MX6UL N6310 SoM and boards")
Cc: stable@vger.kernel.org
Signed-off-by: NFrieder Schrempf <frieder.schrempf@kontron.de>
Signed-off-by: NShawn Guo <shawnguo@kernel.org>

d22a16cc

ARM: dts: imx6ul-kontron: Move watchdog from Kontron i.MX6UL/ULL board to SoM · 04a2c051

由 Frieder Schrempf 提交于 5月 28, 2020

The watchdog's WDOG_ANY signal is used to trigger a POR of the SoC,
if a soft reset is issued. As the SoM hardware connects the WDOG_ANY
and the POR signals, the watchdog node itself and the pin
configuration should be part of the common SoM devicetree.
Let's move it from the baseboard's devicetree to its proper place.

Fixes: 1ea4b76c ("ARM: dts: imx6ul-kontron-n6310: Add Kontron i.MX6UL N6310 SoM and boards")
Cc: stable@vger.kernel.org
Signed-off-by: NFrieder Schrempf <frieder.schrempf@kontron.de>
Signed-off-by: NShawn Guo <shawnguo@kernel.org>

04a2c051

KVM: VMX: Stop context switching MSR_IA32_UMWAIT_CONTROL · bf09fb6c

由 Sean Christopherson 提交于 6月 22, 2020

Remove support for context switching between the guest's and host's
desired UMWAIT_CONTROL. Propagating the guest's value to hardware isn't
required for correct functionality, e.g. KVM intercepts reads and writes
to the MSR, and the latency effects of the settings controlled by the
MSR are not architecturally visible.

As a general rule, KVM should not allow the guest to control power
management settings unless explicitly enabled by userspace, e.g. see
KVM_CAP_X86_DISABLE_EXITS. E.g. Intel's SDM explicitly states that C0.2
can improve the performance of SMT siblings. A devious guest could
disable C0.2 so as to improve the performance of their workloads at the
detriment to workloads running in the host or on other VMs.

Wholesale removal of UMWAIT_CONTROL context switching also fixes a race
condition where updates from the host may cause KVM to enter the guest
with the incorrect value. Because updates are are propagated to all
CPUs via IPI (SMP function callback), the value in hardware may be
stale with respect to the cached value and KVM could enter the guest
with the wrong value in hardware. As above, the guest can't observe the
bad value, but it's a weird and confusing wart in the implementation.

Removal also fixes the unnecessary usage of VMX's atomic load/store MSR
lists. Using the lists is only necessary for MSRs that are required for
correct functionality immediately upon VM-Enter/VM-Exit, e.g. EFER on
old hardware, or for MSRs that need to-the-uop precision, e.g. perf
related MSRs. For UMWAIT_CONTROL, the effects are only visible in the
kernel via TPAUSE/delay(), and KVM doesn't do any form of delay in
vcpu_vmx_run(). Using the atomic lists is undesirable as they are more
expensive than direct RDMSR/WRMSR.

Furthermore, even if giving the guest control of the MSR is legitimate,
e.g. in pass-through scenarios, it's not clear that the benefits would
outweigh the overhead. E.g. saving and restoring an MSR across a VMX
roundtrip costs ~250 cycles, and if the guest diverged from the host
that cost would be paid on every run of the guest. In other words, if
there is a legitimate use case then it should be enabled by a new
per-VM capability.

Note, KVM still needs to emulate MSR_IA32_UMWAIT_CONTROL so that it can
correctly expose other WAITPKG features to the guest, e.g. TPAUSE,
UMWAIT and UMONITOR.

Fixes: 6e3ba4ab ("KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL")
Cc: stable@vger.kernel.org
Cc: Jingqi Liu <jingqi.liu@intel.com>
Cc: Tao Xu <tao3.xu@intel.com>
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200623005135.10414-1-sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

bf09fb6c

KVM: nVMX: Plumb L2 GPA through to PML emulation · 2dbebf7a

由 Sean Christopherson 提交于 6月 22, 2020

Explicitly pass the L2 GPA to kvm_arch_write_log_dirty(), which for all
intents and purposes is vmx_write_pml_buffer(), instead of having the
latter pull the GPA from vmcs.GUEST_PHYSICAL_ADDRESS. If the dirty bit
update is the result of KVM emulation (rare for L2), then the GPA in the
VMCS may be stale and/or hold a completely unrelated GPA.

Fixes: c5f983f6 ("nVMX: Implement emulated Page Modification Logging")
Cc: stable@vger.kernel.org
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200622215832.22090-2-sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

2dbebf7a

KVM: x86/mmu: Avoid mixing gpa_t with gfn_t in walk_addr_generic() · 312d16c7

由 Vitaly Kuznetsov 提交于 6月 22, 2020

translate_gpa() returns a GPA, assigning it to 'real_gfn' seems obviously
wrong. There is no real issue because both 'gpa_t' and 'gfn_t' are u64 and
we don't use the value in 'real_gfn' as a GFN, we do

 real_gfn = gpa_to_gfn(real_gfn);

instead. 'If you see a "buffalo" sign on an elephant's cage, do not trust
your eyes', but let's fix it for good.

No functional change intended.
Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200622151435.752560-1-vkuznets@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

312d16c7

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功