1. 17 Mar, 2018 (6 commits)
    • x86/kvm/hyper-v: add reenlightenment MSRs support · a2e164e7
      Vitaly Kuznetsov authored
      Nested Hyper-V/Windows guest running on top of KVM will use TSC page
      clocksource in two cases:
      - L0 exposes invariant TSC (CPUID.80000007H:EDX[8]).
      - L0 provides Hyper-V Reenlightenment support (CPUID.40000003H:EAX[13]).
      
      Exposing invariant TSC effectively blocks migration to hosts with different
      TSC frequencies; providing reenlightenment support will be needed when we
      start migrating nested workloads.
      
      Implement rudimentary support for reenlightenment MSRs. For now, these are
      just read/write MSRs with no effect.
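The "read/write with no effect" behaviour amounts to a plain store/load model. A minimal userspace sketch (the struct name and helpers are hypothetical; the MSR indices follow the Hyper-V TLFS):

```c
#include <stdint.h>

/* MSR indices per the Hyper-V TLFS (shown here for illustration). */
#define HV_X64_MSR_REENLIGHTENMENT_CONTROL 0x40000106u
#define HV_X64_MSR_TSC_EMULATION_CONTROL   0x40000107u
#define HV_X64_MSR_TSC_EMULATION_STATUS    0x40000108u

/* Hypothetical per-VM state mirroring the three new MSRs. */
struct hv_reenlightenment_state {
    uint64_t reenlightenment_control;
    uint64_t tsc_emulation_control;
    uint64_t tsc_emulation_status;
};

/* Rudimentary semantics: writes are stored, reads return the stored value. */
static int hv_set_msr(struct hv_reenlightenment_state *hv,
                      uint32_t msr, uint64_t data)
{
    switch (msr) {
    case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
        hv->reenlightenment_control = data;
        return 0;
    case HV_X64_MSR_TSC_EMULATION_CONTROL:
        hv->tsc_emulation_control = data;
        return 0;
    case HV_X64_MSR_TSC_EMULATION_STATUS:
        hv->tsc_emulation_status = data;
        return 0;
    }
    return 1;   /* unhandled MSR */
}

static uint64_t hv_get_msr(const struct hv_reenlightenment_state *hv,
                           uint32_t msr)
{
    switch (msr) {
    case HV_X64_MSR_REENLIGHTENMENT_CONTROL: return hv->reenlightenment_control;
    case HV_X64_MSR_TSC_EMULATION_CONTROL:   return hv->tsc_emulation_control;
    case HV_X64_MSR_TSC_EMULATION_STATUS:    return hv->tsc_emulation_status;
    }
    return 0;
}
```

Later changes can then attach real behaviour (reenlightenment notifications) to these stored values without changing the guest-visible MSR interface.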
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      a2e164e7
    • KVM: x86: Update the exit_qualification access bits while walking an address · ddd6f0e9
      KarimAllah Ahmed authored
      ... to avoid having a stale value when handling an EPT misconfig for MMIO
      regions.
      
      MMIO regions that are not passed-through to the guest are handled through
      EPT misconfigs. The first time a certain MMIO page is touched it causes an
      EPT violation, then KVM marks the EPT entry to cause an EPT misconfig
      instead. Any subsequent accesses to the entry will generate an EPT
      misconfig.
      
      Things get slightly complicated with nested guest handling for MMIO
      regions that are not passed through from L0 (i.e. emulated by L0
      user-space).
      
      An EPT violation for one of these MMIO regions from L2 exits to the L0
      hypervisor. L0 would then look at the EPT12 mapping for L1 hypervisor and
      realize it is not present (or not sufficient to serve the request). Then L0
      injects an EPT violation to L1. L1 would then update its EPT mappings. The
      EXIT_QUALIFICATION value for L1 would come from exit_qualification variable
      in "struct vcpu". The problem is that this variable is only updated on EPT
      violation and not on EPT misconfig. So if an EPT violation caused by a
      read happens first, followed by an EPT misconfig caused by a write, the
      L0 hypervisor will still hold the exit_qualification value from the
      previous read instead of the write, and it ends up injecting an EPT
      violation to the L1 hypervisor with an out-of-date EXIT_QUALIFICATION.
      
      The EPT violation that is injected from L0 to L1 needs to have the correct
      EXIT_QUALIFICATION, especially for the access bits, because the individual
      access bits for MMIO EPTs are updated only on an actual access of that
      specific type. So for the example above, the L1 hypervisor will keep
      updating only the read bit in the EPT then resume the L2 guest. The L2
      guest would end up causing another exit where the L0 *again* will inject
      another EPT violation to L1 hypervisor with *again* an out of date
      exit_qualification which indicates a read and not a write. Then this
      ping-pong just keeps happening without making any forward progress.
      
      The behavior of mapping MMIO regions changed in:
      
         commit a340b3e2 ("kvm: Map PFN-type memory regions as writable (if possible)")
      
      ... where an EPT violation for a read would also fix up the write bits to
      avoid another EPT violation, which by accident would fix the bug mentioned
      above.
      
      This commit fixes this situation and ensures that the access bits for the
      exit_qualification are up to date. That ensures that even an L1 hypervisor
      running with a KVM version before the commit mentioned above would still
      work.
      
      ( The description above assumes EPT to be available and used by L1
        hypervisor + the L1 hypervisor is passing through the MMIO region to the L2
        guest while this MMIO region is emulated by the L0 user-space ).
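A minimal model of the fix: rebuild the low access bits of the qualification from the fault currently being walked, instead of trusting whatever the last EPT violation left behind. The bit layout follows the Intel SDM's EPT-violation exit qualification; the helper name is hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* EPT-violation exit-qualification access bits (Intel SDM layout). */
#define EPT_VIOLATION_ACC_READ  (1ull << 0)
#define EPT_VIOLATION_ACC_WRITE (1ull << 1)
#define EPT_VIOLATION_ACC_INSTR (1ull << 2)

/* Clear the access bits left over from the previous EPT violation and
 * derive fresh ones from the current fault, so an EPT misconfig that
 * follows an earlier violation no longer injects stale access bits. */
static uint64_t refresh_exit_qualification(uint64_t stale_qual,
                                           bool read_fault,
                                           bool write_fault,
                                           bool fetch_fault)
{
    uint64_t qual = stale_qual & ~(EPT_VIOLATION_ACC_READ |
                                   EPT_VIOLATION_ACC_WRITE |
                                   EPT_VIOLATION_ACC_INSTR);

    if (read_fault)
        qual |= EPT_VIOLATION_ACC_READ;
    if (write_fault)
        qual |= EPT_VIOLATION_ACC_WRITE;
    if (fetch_fault)
        qual |= EPT_VIOLATION_ACC_INSTR;
    return qual;
}
```

In the read-then-write ping-pong described above, this makes the injected EXIT_QUALIFICATION report a write on the second fault, so L1 sets the write bit and the guest makes progress.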
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: kvm@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      ddd6f0e9
    • KVM: x86: Make enum conversion explicit in kvm_pdptr_read() · 1df372f4
      Matthias Kaehlcke authored
      The type 'enum kvm_reg_ex' is an extension of 'enum kvm_reg'; however,
      the extension is only semantic, and the compiler doesn't know about the
      relationship between the two types. In kvm_pdptr_read() a value of the
      extended type is passed to kvm_x86_ops->cache_reg(), which expects a
      value of the base type. Clang raises the following warning about the
      type mismatch:
      
      arch/x86/kvm/kvm_cache_regs.h:44:32: warning: implicit conversion from
        enumeration type 'enum kvm_reg_ex' to different enumeration type
        'enum kvm_reg' [-Wenum-conversion]
          kvm_x86_ops->cache_reg(vcpu, VCPU_EXREG_PDPTR);
      
      Cast VCPU_EXREG_PDPTR to 'enum kvm_reg' to make the compiler happy.
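The pattern can be reproduced in miniature (all names here are hypothetical demo names, not the kernel's): the "extended" enum continues the base enum's numbering, and an explicit cast documents the intentional cross-enum conversion.

```c
/* An extension enum that continues the base enum's numbering. */
enum demo_reg { DEMO_REG_RAX = 0, DEMO_NR_REGS };
enum demo_reg_ex { DEMO_EXREG_PDPTR = DEMO_NR_REGS };

static int demo_cache_reg(enum demo_reg reg)   /* expects the base type */
{
    return (int)reg;
}

static int demo_read_pdptr(void)
{
    /* Without the cast, clang emits -Wenum-conversion here. */
    return demo_cache_reg((enum demo_reg)DEMO_EXREG_PDPTR);
}
```

The cast changes nothing at runtime; it only tells the compiler the cross-enum conversion is deliberate.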
      Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
      Reviewed-by: Guenter Roeck <groeck@chromium.org>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      1df372f4
    • KVM: lapic: stop advertising DIRECTED_EOI when in-kernel IOAPIC is in use · 0bcc3fb9
      Vitaly Kuznetsov authored
      Devices which use level-triggered interrupts under Windows 2016 with
      Hyper-V role enabled don't work: Windows disables EOI broadcast in SPIV
      unconditionally. Our in-kernel IOAPIC implementation emulates an old IOAPIC
      version which has no EOI register so EOI never happens.
      
      The issue was discovered and discussed a while ago:
      https://www.spinics.net/lists/kvm/msg148098.html
      
      While this is a guest OS bug (it should check that the IOAPIC has the
      required capabilities before disabling EOI broadcast), we can work around
      it in KVM: advertising DIRECTED_EOI with the in-kernel IOAPIC makes little
      sense anyway.
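The resulting logic amounts to gating one bit of the Local APIC version register on where the IOAPIC is emulated. A sketch with an illustrative version value (helper name hypothetical; the real code lives in arch/x86/kvm/lapic.c):

```c
#include <stdbool.h>
#include <stdint.h>

#define APIC_LVR_VERSION      0x14u        /* illustrative version value */
#define APIC_LVR_DIRECTED_EOI (1u << 24)   /* EOI-broadcast suppression */

/* Only advertise directed EOI when the IOAPIC is emulated in userspace:
 * the in-kernel IOAPIC models an old 82093AA with no EOI register, so a
 * guest that disables EOI broadcast would never EOI level-triggered
 * interrupts. */
static uint32_t lapic_version_reg(bool ioapic_in_kernel)
{
    uint32_t v = APIC_LVR_VERSION;

    if (!ioapic_in_kernel)
        v |= APIC_LVR_DIRECTED_EOI;
    return v;
}
```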
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      0bcc3fb9
    • KVM: x86: Add support for AMD Core Perf Extension in guest · c51eb52b
      Janakarajan Natarajan authored
      Add support for AMD Core Performance counters in the guest. The base
      event select and counter MSRs are changed. In addition, with the core
      extension, there are 2 extra counters available for performance
      measurements for a total of 6.
      
      With the new MSRs, the logic to map them to the gp_counters[] is changed.
      New functions are added to check the validity of the get/set MSRs.
      
      If the guest has the X86_FEATURE_PERFCTR_CORE cpuid flag set, the number
      of counters available to the vcpu is set to 6. If the flag is not set,
      then it is 4.
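The counter-count selection reduces to a single branch on the cpuid flag (macro names as in the kernel's perf headers, used here illustratively):

```c
#include <stdbool.h>

#define AMD64_NUM_COUNTERS      4   /* legacy AMD PMU */
#define AMD64_NUM_COUNTERS_CORE 6   /* with Core Perf Extension */

/* Number of general-purpose counters exposed to the vcpu, keyed off the
 * guest's X86_FEATURE_PERFCTR_CORE cpuid flag. */
static int amd_pmu_nr_gp_counters(bool has_perfctr_core)
{
    return has_perfctr_core ? AMD64_NUM_COUNTERS_CORE : AMD64_NUM_COUNTERS;
}
```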
      Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
      [Squashed "Expose AMD Core Perf Extension flag to guests" - Radim.]
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      c51eb52b
    • x86/msr: Add AMD Core Perf Extension MSRs · e84b7119
      Janakarajan Natarajan authored
      Add the EventSelect and Counter MSRs for AMD Core Perf Extension.
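With the core extension the EventSelect and Counter MSRs interleave in pairs starting at 0xc0010200, so the index-to-MSR mapping can be sketched as follows (values per the AMD APM; the helper names are hypothetical):

```c
#include <stdint.h>

/* Base MSRs of the interleaved EventSelect/Counter pairs. */
#define MSR_F15H_PERF_CTL 0xc0010200u
#define MSR_F15H_PERF_CTR 0xc0010201u

/* EventSelect MSR for counter pair 'idx' (0..5 with the core extension). */
static uint32_t amd_core_perf_ctl(unsigned int idx)
{
    return MSR_F15H_PERF_CTL + 2 * idx;
}

/* Counter MSR for counter pair 'idx'. */
static uint32_t amd_core_perf_ctr(unsigned int idx)
{
    return MSR_F15H_PERF_CTR + 2 * idx;
}
```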
      Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      e84b7119
  2. 08 Mar, 2018 (1 commit)
  3. 07 Mar, 2018 (8 commits)
  4. 02 Mar, 2018 (11 commits)
    • parisc: Reduce irq overhead when run in qemu · 636a415b
      Helge Deller authored
      When run under QEMU, calling mfctl(16) creates some overhead because the
      qemu timer has to be scaled and moved into the register. This patch
      reduces the number of calls to mfctl(16) by moving the calls out of the
      loops.
      
      Additionally, increase the minimal time interval to 8000 cycles instead
      of 500 to compensate for possible QEMU delays when delivering interrupts.
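The shape of the change can be sketched as a pure helper that advances the timer target from one cached counter value instead of re-reading cr16 (mfctl(16)) inside the loop (hypothetical helper; the real code is in arch/parisc/kernel/time.c):

```c
#include <stdint.h>

/* Compute the next timer target from a single cached cycle-counter read
 * ('now'), and enforce a minimal distance (e.g. 8000 cycles) so that late
 * interrupt delivery under QEMU doesn't leave us programming a target
 * that is already too close or in the past. No counter re-read occurs
 * inside the loop. */
static uint64_t next_timer_target(uint64_t now, uint64_t period,
                                  uint64_t min_delta)
{
    uint64_t next = now + period;

    while (next - now < min_delta)
        next += period;
    return next;
}
```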
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: stable@vger.kernel.org # 4.14+
      636a415b
    • parisc: Use cr16 interval timers unconditionally on qemu · 5ffa8518
      Helge Deller authored
      When running on qemu we know that the (emulated) cr16 cpu-internal
      clocks are synchronized. So let's use them unconditionally on qemu.
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: stable@vger.kernel.org # 4.14+
      5ffa8518
    • parisc: Check if secondary CPUs want own PDC calls · 0ed1fe4a
      Helge Deller authored
      The architecture specification says (for 64-bit systems): PDC is a per
      processor resource, and operating system software must be prepared to
      manage separate pointers to PDCE_PROC for each processor.  The address
      of PDCE_PROC for the monarch processor is stored in the Page Zero
      location MEM_PDC. The address of PDCE_PROC for each non-monarch
      processor is passed in gr26 when PDCE_RESET invokes OS_RENDEZ.
      
      Currently we still use one PDC for all CPUs, but in case we face a
      machine that follows the specification, let's warn about it.
      Signed-off-by: Helge Deller <deller@gmx.de>
      0ed1fe4a
    • parisc: Hide virtual kernel memory layout · fd8d0ca2
      Helge Deller authored
      For security reasons do not expose the virtual kernel memory layout to
      userspace.
      Signed-off-by: Helge Deller <deller@gmx.de>
      Suggested-by: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org # 4.15
      Reviewed-by: Kees Cook <keescook@chromium.org>
      fd8d0ca2
    • parisc: Fix ordering of cache and TLB flushes · 0adb24e0
      John David Anglin authored
      The change to flush_kernel_vmap_range() wasn't sufficient to avoid the
      SMP stalls.  The problem is some drivers call these routines with
      interrupts disabled.  Interrupts need to be enabled for flush_tlb_all()
      and flush_cache_all() to work.  This version adds checks to ensure
      interrupts are not disabled before calling routines that need IPI
      interrupts.  When interrupts are disabled, we now drop into slower code.
      
      The attached change fixes the ordering of cache and TLB flushes in
      several cases.  When we flush the cache using the existing PTE/TLB
      entries, we need to flush the TLB after doing the cache flush.  We don't
      need to do this when we flush the entire instruction and data caches as
      these flushes don't use the existing TLB entries.  The same is true for
      tmpalias region flushes.
      
      The flush_kernel_vmap_range() and invalidate_kernel_vmap_range()
      routines have been updated.
      
      Secondly, we added a new purge_kernel_dcache_range_asm() routine to
      pacache.S and use it in invalidate_kernel_vmap_range().  Nominally,
      purges are faster than flushes as the cache lines don't have to be
      written back to memory.
      
      Hopefully, this is sufficient to resolve the remaining problems due to
      cache speculation.  So far, testing indicates that this is the case.  I
      did work up a patch using tmpalias flushes, but there is a performance
      hit because we need the physical address for each page, and we also need
      to sequence access to the tmpalias flush code.  This increases the
      probability of stalls.
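The ordering requirement can be stated as a tiny model (purely illustrative): when flushing through the existing PTE/TLB entries, the cache flush must complete before those translations are invalidated, because the cache flush relies on them.

```c
#include <string.h>

/* Record the required sequence into 'log': "C" = cache flush,
 * "T" = TLB flush. The cache flush goes first because it walks the
 * range via the translations the TLB flush is about to remove. */
static void flush_via_existing_mappings(char *log)
{
    strcat(log, "C");   /* flush cache using the current translations */
    strcat(log, "T");   /* only then invalidate the TLB entries */
}
```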
      
      Signed-off-by: John David Anglin <dave.anglin@bell.net>
      Cc: stable@vger.kernel.org # 4.9+
      Signed-off-by: Helge Deller <deller@gmx.de>
      0adb24e0
    • sh: fix build error for empty CONFIG_BUILTIN_DTB_SOURCE · 1b1e4ee8
      Masahiro Yamada authored
      If CONFIG_USE_BUILTIN_DTB is enabled, but CONFIG_BUILTIN_DTB_SOURCE
      is empty (for example, allmodconfig), it fails to build, like this:
      
        make[2]: *** No rule to make target 'arch/sh/boot/dts/.dtb.o',
        needed by 'arch/sh/boot/dts/built-in.o'.  Stop.
      
      Surround obj-y with ifneq ... endif.
      
      I replaced $(CONFIG_USE_BUILTIN_DTB) with 'y' since this is always
      the case from the following code from arch/sh/Makefile:
      
        core-$(CONFIG_USE_BUILTIN_DTB)  += arch/sh/boot/dts/
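Given that substitution, the resulting Makefile fragment looks roughly like this (a sketch reconstructed from the commit text, not a verbatim diff):

```make
# arch/sh/boot/dts/Makefile (sketch): skip the rule entirely when the
# DTB source name is empty, e.g. under allmodconfig.
ifneq ($(CONFIG_BUILTIN_DTB_SOURCE),)
obj-y += $(patsubst "%",%,$(CONFIG_BUILTIN_DTB_SOURCE)).dtb.o
endif
```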
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      1b1e4ee8
    • KVM: x86: fix vcpu initialization with userspace lapic · b7e31be3
      Radim Krčmář authored
      Moving the code around broke this rare configuration.
      Use this opportunity to finally call lapic reset from vcpu reset.
      
      Reported-by: syzbot+fb7a33a4b6c35007a72b@syzkaller.appspotmail.com
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Fixes: 0b2e9904 ("KVM: x86: move LAPIC initialization after VMCS creation")
      Cc: stable@vger.kernel.org
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      b7e31be3
    • KVM: X86: Allow userspace to define the microcode version · 518e7b94
      Wanpeng Li authored
      Linux (among others) has checks to make sure that certain features
      aren't enabled on a certain family/model/stepping if the microcode version
      isn't greater than or equal to a known good version.
      
      By exposing the real microcode version, we're preventing buggy guests that
      don't check that they are running virtualized (i.e., they should trust the
      hypervisor) from disabling features that are effectively not buggy.
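The class of guest-side check this targets looks roughly like the following schematic model (not code from any particular guest):

```c
#include <stdbool.h>
#include <stdint.h>

/* Schematic guest-side quirk check: keep a feature enabled only when the
 * microcode revision reported via MSR_IA32_UCODE_REV is at least a known
 * good version for this family/model/stepping. If the hypervisor reports
 * a fake revision of 0, the guest needlessly disables the feature. */
static bool feature_usable(uint32_t ucode_rev, uint32_t known_good_rev)
{
    return ucode_rev >= known_good_rev;
}
```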
      Suggested-by: Filippo Sironi <sironi@amazon.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Liran Alon <liran.alon@oracle.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      518e7b94
    • KVM: X86: Introduce kvm_get_msr_feature() · 66421c1e
      Wanpeng Li authored
      Introduce kvm_get_msr_feature() to handle the MSRs which are supported
      by different vendors and share the same emulation logic.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Liran Alon <liran.alon@oracle.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      66421c1e
    • KVM: SVM: Add MSR-based feature support for serializing LFENCE · d1d93fa9
      Tom Lendacky authored
      In order to determine if LFENCE is a serializing instruction on AMD
      processors, MSR 0xc0011029 (MSR_F10H_DECFG) must be read and the state
      of bit 1 checked.  This patch will add support to allow a guest to
      properly make this determination.
      
      Add the MSR feature callback operation to svm.c and add MSR 0xc0011029
      to the list of MSR-based features.  If LFENCE is serializing, then the
      feature is supported, allowing the hypervisor to set the value of the
      MSR that the guest will see.  Support is also added to write (hypervisor only)
      and read the MSR value for the guest.  A write by the guest will result in
      a #GP.  A read by the guest will return the value as set by the host.  In
      a #GP.  A read by the guest will return the value as set by the host.  In
      this way, the support to expose the feature to the guest is controlled by
      the hypervisor.
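A sketch of the reporting side (MSR index and bit position per the commit text; the function name is hypothetical):

```c
#include <stdint.h>

#define MSR_F10H_DECFG                  0xc0011029u
#define MSR_F10H_DECFG_LFENCE_SERIALIZE (1ull << 1)   /* bit 1 */

/* Report the DECFG feature MSR only when LFENCE is already serializing
 * on the host (bit 1 of the host's MSR 0xc0011029 is set); the exposed
 * value then carries the serializing bit for the guest to read. In the
 * real code a guest write raises #GP. */
static int decfg_msr_feature(uint64_t host_decfg, uint64_t *val)
{
    if (!(host_decfg & MSR_F10H_DECFG_LFENCE_SERIALIZE))
        return -1;                  /* feature not exposed */
    *val = MSR_F10H_DECFG_LFENCE_SERIALIZE;
    return 0;
}
```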
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      d1d93fa9
    • KVM: x86: Add a framework for supporting MSR-based features · 801e459a
      Tom Lendacky authored
      Provide a new KVM capability that allows bits within MSRs to be recognized
      as features.  Two new ioctls are added to the /dev/kvm ioctl routine to
      retrieve the list of these MSRs and then retrieve their values. A kvm_x86_ops
      callback is used to determine support for the listed MSR-based features.
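The framework's core idea, a generic MSR list filtered through a per-vendor callback, can be modelled as a pure function (all names here are hypothetical):

```c
#include <stddef.h>
#include <stdint.h>

/* Per-vendor callback, mirroring the shape of the kvm_x86_ops hook:
 * returns 0 and fills *data when the MSR is a supported feature MSR. */
typedef int (*get_msr_feature_fn)(uint32_t msr, uint64_t *data);

/* Keep only the MSRs the vendor callback recognizes: the list returned
 * to userspace contains just the feature MSRs the backend supports. */
static size_t filter_msr_features(const uint32_t *all, size_t n,
                                  get_msr_feature_fn cb, uint32_t *out)
{
    size_t k = 0;
    uint64_t tmp;

    for (size_t i = 0; i < n; i++)
        if (cb(all[i], &tmp) == 0)
            out[k++] = all[i];
    return k;
}

/* Hypothetical vendor callback recognizing one demo MSR index. */
static int demo_get_msr_feature(uint32_t msr, uint64_t *data)
{
    if (msr == 0x10au) {            /* demo value only */
        *data = 1;
        return 0;
    }
    return -1;
}
```

Userspace would then retrieve the filtered list with one ioctl and read each listed MSR's value with another, as the commit describes.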
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      [Tweaked documentation. - Radim]
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      801e459a
  5. 01 Mar, 2018 (7 commits)
  6. 28 Feb, 2018 (7 commits)