- 12 Aug 2017, 2 commits
-
-
Submitted by Wanpeng Li

watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [warn_test:3089]
irq event stamp: 20532
hardirqs last enabled at (20531): [<ffffffff8e9b6908>] restore_regs_and_iret+0x0/0x1d
hardirqs last disabled at (20532): [<ffffffff8e9b7ae8>] apic_timer_interrupt+0x98/0xb0
softirqs last enabled at (8266): [<ffffffff8e9badc6>] __do_softirq+0x206/0x4c1
softirqs last disabled at (8253): [<ffffffff8e083918>] irq_exit+0xf8/0x100
CPU: 5 PID: 3089 Comm: warn_test Tainted: G OE 4.13.0-rc3+ #8
RIP: 0010:kvm_mmu_prepare_zap_page+0x72/0x4b0 [kvm]
Call Trace:
 make_mmu_pages_available.isra.120+0x71/0xc0 [kvm]
 kvm_mmu_load+0x1cf/0x410 [kvm]
 kvm_arch_vcpu_ioctl_run+0x1316/0x1bf0 [kvm]
 kvm_vcpu_ioctl+0x340/0x700 [kvm]
 ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
 ? __fget+0xfc/0x210
 do_vfs_ioctl+0xa4/0x6a0
 ? __fget+0x11d/0x210
 SyS_ioctl+0x79/0x90
 entry_SYSCALL_64_fastpath+0x23/0xc2
 ? __this_cpu_preempt_check+0x13/0x20

This can be reproduced readily with ept=N while running syzkaller tests, since many syzkaller testcases don't set up any memory regions. With ept=Y, however, the rmode identity map is created, and kvm_mmu_calculate_mmu_pages() then extends the VM's number of mmu pages to at least KVM_MIN_ALLOC_MMU_PAGES, which merely hides the issue.

I observed the scenario kvm->arch.n_max_mmu_pages == 0 && kvm->arch.n_used_mmu_pages == 1: there is one active mmu page on the list, kvm_mmu_prepare_zap_page() fails to zap any pages, yet prepare_zap_oldest_mmu_page() always returns true. This results in an infinite loop in make_mmu_pages_available(), which causes a soft lockup while holding mmu->lock.

This patch fixes it by setting the return value of prepare_zap_oldest_mmu_page() according to whether or not a mmu page was actually zapped.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
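A minimal sketch of the shape of the described change, assuming the 4.13-era KVM MMU code; the function body is abridged and is an illustration, not a quote of the patch:

    /* Sketch only: propagate whether anything was zapped instead of
     * unconditionally reporting success, so make_mmu_pages_available()
     * can stop looping when no page can be reclaimed. */
    static bool prepare_zap_oldest_mmu_page(struct kvm *kvm,
                                            struct list_head *invalid_list)
    {
        struct kvm_mmu_page *sp;

        if (list_empty(&kvm->arch.active_mmu_pages))
            return false;

        sp = list_last_entry(&kvm->arch.active_mmu_pages,
                             struct kvm_mmu_page, link);

        /* Previously: kvm_mmu_prepare_zap_page(...); return true; */
        return kvm_mmu_prepare_zap_page(kvm, sp, invalid_list);
    }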
-
Submitted by David Hildenbrand

Let's reuse the function introduced with EPTP switching. We don't explicitly have to check against enable_ept_ad_bits, as this is implicitly done when checking against nested_vmx_ept_caps in valid_ept_address().

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 10 Aug 2017, 3 commits
-
-
Submitted by Paolo Bonzini

This is the same as commit 14727754 ("kvm: svm: Add support for additional SVM NPF error codes", 2016-11-23), but for Intel processors. In this case, bit 8 of the exit qualification field says whether the EPT violation occurred while translating the guest's final physical address or rather while translating the guest's page tables.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
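A rough sketch of how the exit qualification bit could be folded into the page-fault error code in the EPT violation handler; the mask names follow the SVM NPF error codes added by the commit referenced above, and their exact use here is an assumption for illustration, not a quote of the patch:

    /* Sketch: exit qualification bit 8 distinguishes a fault on the final
     * guest-physical translation from one taken while walking the guest's
     * own page tables, and is folded into the PF error code accordingly. */
    unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
    u64 error_code = 0;

    if (exit_qualification & (1UL << 8))
        error_code |= PFERR_GUEST_FINAL_MASK;   /* final GPA translation */
    else
        error_code |= PFERR_GUEST_PAGE_MASK;    /* guest page-table walk */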
-
Submitted by Brijesh Singh

Commit 14727754 ("kvm: svm: Add support for additional SVM NPF error codes", 2016-11-23) added a new error code to aid nested page fault handling. The commit unprotects (kvm_mmu_unprotect_page) the page when we get an NPF due to a guest page table walk where the page was marked RO. However, an L0->L2 shadow nested page table can also be marked read-only when a page is read-only in L1's nested page table. If such a page is accessed by L2 while walking page tables, it can cause a nested page fault (page table walks are write accesses); after kvm_mmu_unprotect_page we may then get another page fault, and again, in an endless stream.

To cover this use case, we qualify the new error_code check with vcpu->arch.mmu_direct_map so that the error_code check only runs for the L1 guest, and not the L2 guest. This avoids hitting the above scenario.

Fixes: 14727754
Cc: stable@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
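A minimal sketch of the described qualification in the page-fault path; the field and mask names are assumptions based on the commit text and the earlier SVM change, not a quote of the patch:

    /* Sketch: only take the unprotect shortcut when the MMU is in
     * direct-map mode (i.e. the fault was reported for L1, not for an
     * L0->L2 shadow nested page table), so a read-only L1 NPT mapping
     * cannot trap us in an endless unprotect/refault loop. */
    if (vcpu->arch.mmu.direct_map &&
        (error_code & PFERR_NESTED_GUEST_PAGE) == PFERR_NESTED_GUEST_PAGE) {
        kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(cr2));
        return 1;
    }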
-
Submitted by Wanpeng Li

Reported by syzkaller (with kvm-intel.unrestricted_guest=0):

WARNING: CPU: 5 PID: 1014 at /home/kernel/data/kvm/arch/x86/kvm//x86.c:7227 kvm_arch_vcpu_ioctl_run+0x38b/0x1be0 [kvm]
CPU: 5 PID: 1014 Comm: warn_test Tainted: G W OE 4.13.0-rc3+ #8
RIP: 0010:kvm_arch_vcpu_ioctl_run+0x38b/0x1be0 [kvm]
Call Trace:
 ? put_pid+0x3a/0x50
 ? rcu_read_lock_sched_held+0x79/0x80
 ? kmem_cache_free+0x2f2/0x350
 kvm_vcpu_ioctl+0x340/0x700 [kvm]
 ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
 ? __fget+0xfc/0x210
 do_vfs_ioctl+0xa4/0x6a0
 ? __fget+0x11d/0x210
 SyS_ioctl+0x79/0x90
 entry_SYSCALL_64_fastpath+0x23/0xc2
 ? __this_cpu_preempt_check+0x13/0x20

The syzkaller folks reported a residual mmio emulation request to userspace: vm86 fails to emulate injecting a real-mode interrupt (it fails to read CS) and incurs a triple fault. The vCPU returns to userspace with vcpu->mmio_needed == true and the KVM_EXIT_SHUTDOWN exit reason. However, the syzkaller testcase constructs several threads that launch the same vCPU; a thread which launches this vCPU after the thread that observed vcpu->mmio_needed == true and KVM_EXIT_SHUTDOWN will trigger the warning.

  #define _GNU_SOURCE
  #include <pthread.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <sys/wait.h>
  #include <sys/types.h>
  #include <sys/stat.h>
  #include <sys/mman.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <linux/kvm.h>

  int kvmcpu;
  struct kvm_run *run;

  void* thr(void* arg)
  {
      int res;
      res = ioctl(kvmcpu, KVM_RUN, 0);
      printf("ret1=%d exit_reason=%d suberror=%d\n",
             res, run->exit_reason, run->internal.suberror);
      return 0;
  }

  void test()
  {
      int i, kvm, kvmvm;
      pthread_t th[4];

      kvm = open("/dev/kvm", O_RDWR);
      kvmvm = ioctl(kvm, KVM_CREATE_VM, 0);
      kvmcpu = ioctl(kvmvm, KVM_CREATE_VCPU, 0);
      run = (struct kvm_run*)mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, kvmcpu, 0);

      srand(getpid());
      for (i = 0; i < 4; i++) {
          pthread_create(&th[i], 0, thr, 0);
          usleep(rand() % 10000);
      }
      for (i = 0; i < 4; i++)
          pthread_join(th[i], 0);
  }

  int main()
  {
      for (;;) {
          int pid = fork();
          if (pid < 0)
              exit(1);
          if (pid == 0) {
              test();
              exit(0);
          }
          int status;
          while (waitpid(pid, &status, __WALL) != pid) {}
      }
      return 0;
  }

This patch fixes it by resetting vcpu->mmio_needed once we receive the triple fault, to avoid the residue.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
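A minimal sketch of where such a reset could live, assuming the triple-fault request handling in vcpu_enter_guest(); this illustrates the described fix rather than quoting the patch:

    /* Sketch: when a triple fault is pending, drop any half-finished MMIO
     * emulation state before reporting KVM_EXIT_SHUTDOWN, so a later
     * KVM_RUN does not try to complete stale userspace MMIO. */
    if (kvm_check_request(KVM_REQ_TRIPLE_FAULT, vcpu)) {
        vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
        vcpu->mmio_needed = 0;
        r = 0;
        goto out;
    }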
-
- 08 Aug 2017, 4 commits
-
-
Submitted by Longpeng(Mike)

This implements kvm_arch_vcpu_in_kernel() for ARM, and adjusts the calls to kvm_vcpu_on_spin().

Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Longpeng(Mike)

This implements kvm_arch_vcpu_in_kernel() for s390. DIAG is a privileged operation, so it cannot be called from problem state (user mode).

Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Longpeng(Mike)

get_cpl requires vcpu_load, so we must cache the result (i.e. whether the vcpu was preempted while its cpl was 0) in kvm_vcpu_arch.

Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Longpeng(Mike)

If a vcpu exits due to a request for a user-mode spinlock, then the spinlock holder may have been preempted in either user mode or kernel mode. (Note that not all architectures trap spin loops in user mode; only AMD x86 and ARM/ARM64 currently do.) But if a vcpu exits in kernel mode, then the holder must have been preempted in kernel mode, so we should choose a vcpu in kernel mode as a more likely candidate for the lock holder.

This introduces kvm_arch_vcpu_in_kernel() to decide whether the vcpu was in kernel mode when it was preempted. kvm_vcpu_on_spin's new argument says the same of the spinning VCPU.

Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
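A rough sketch of how the new argument could be used in the directed-yield loop of kvm_vcpu_on_spin(); the surrounding heuristics are abridged, so this is an illustration of the idea above, not a quote of the patch:

    /* Sketch: skip candidates that were preempted in user mode when the
     * spinning vcpu exited from kernel mode, since the lock holder must
     * then have been preempted in kernel mode as well. */
    void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
    {
        struct kvm_vcpu *vcpu;
        int i;

        kvm_for_each_vcpu(i, vcpu, me->kvm) {
            if (vcpu == me)
                continue;
            if (yield_to_kernel_mode && !kvm_arch_vcpu_in_kernel(vcpu))
                continue;
            if (kvm_vcpu_yield_to(vcpu) > 0)
                break;
        }
    }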
-
- 07 Aug 2017, 10 commits
-
-
Submitted by Radim Krčmář

Add guest_cpuid_clear() and use it instead of kvm_find_cpuid_entry(). Also replace some uses of kvm_find_cpuid_entry() with guest_cpuid_has().

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Radim Krčmář

This patch turns guest_cpuid_has_XYZ(cpuid) into guest_cpuid_has(cpuid, X86_FEATURE_XYZ), which gets rid of many very similar helpers. When seeing an X86_FEATURE_*, we can know which cpuid it belongs to, but this information isn't in common code, so we recreate it for KVM. Add some BUILD_BUG_ONs to make sure that it runs nicely.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
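A hedged sketch of the shape of such a unified helper; the feature-to-leaf lookup helper and cpuid_reg structure used here are assumptions for illustration, not a quote of the patch:

    /* Sketch: map the X86_FEATURE_* word back to the CPUID leaf/register
     * it was read from, then test the bit in the guest's copy of that
     * leaf. BUILD_BUG_ONs in the lookup helper can reject features whose
     * leaf KVM does not track. */
    static __always_inline bool guest_cpuid_has(struct kvm_vcpu *vcpu,
                                                unsigned int x86_feature)
    {
        const struct cpuid_reg cpuid = x86_feature_cpuid(x86_feature); /* assumed lookup helper */
        struct kvm_cpuid_entry2 *entry;

        entry = kvm_find_cpuid_entry(vcpu, cpuid.function, cpuid.index);
        if (!entry)
            return false;

        switch (cpuid.reg) {
        case CPUID_EAX: return entry->eax & bit(x86_feature);
        case CPUID_EBX: return entry->ebx & bit(x86_feature);
        case CPUID_ECX: return entry->ecx & bit(x86_feature);
        case CPUID_EDX: return entry->edx & bit(x86_feature);
        default:        return false;
        }
    }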
-
Submitted by Radim Krčmář

bit(X86_FEATURE_NRIPS) is 3 since 2ccd71f1 ("x86/cpufeature: Move some of the scattered feature bits to x86_capability"), so we can simplify the code.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Bandan Das

When L2 uses vmfunc, L0 utilizes the associated vmexit to emulate a switching of the ept pointer by reloading the guest MMU.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Submitted by Bandan Das

Expose VMFUNC in MSRs and VMCS fields. No actual VMFUNCs are enabled.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Submitted by Bandan Das

Enable VMFUNC in the secondary execution controls. This simplifies the changes necessary to expose it to nested hypervisors. VMFUNCs still cause #UD when invoked.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
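A hedged sketch of an exit handler that keeps VMFUNC raising #UD while the control is enabled; the handler name and its wiring into the exit-handler table are assumptions:

    /* Sketch: with the control enabled, VMFUNC now causes a VM exit
     * instead of a native #UD; since no VM functions are exposed, the
     * handler simply injects #UD back into the guest. */
    static int handle_vmfunc(struct kvm_vcpu *vcpu)
    {
        kvm_queue_exception(vcpu, UD_VECTOR);
        return 1;
    }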
-
Submitted by David Hildenbrand

Let's also just use the underlying functions directly here.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
[Rebased on top of 9f744c59 ("KVM: nVMX: do not pin the VMCS12")]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Submitted by David Hildenbrand

nested_get_page() just sounds confusing. All we want is a page from G1; this is not even related to nesting. Let's introduce kvm_vcpu_gpa_to_page() so we don't get overly long lines.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
[Squash pasto fix from Wanpeng Li. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
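The new helper is presumably a thin wrapper along these lines, built from the existing gfn conversion helpers (a sketch, not a quote of the patch):

    /* Sketch: convert a guest physical address to its gfn and reuse the
     * existing per-vcpu gfn-to-page lookup. */
    static inline struct page *kvm_vcpu_gpa_to_page(struct kvm_vcpu *vcpu,
                                                    gpa_t gpa)
    {
        return kvm_vcpu_gfn_to_page(vcpu, gpa_to_gfn(gpa));
    }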
-
Submitted by Paolo Bonzini

Expose the "Enable INVPCID" secondary execution control to the guest and properly reflect the exit reason. In addition, before this patch the guest was always running with INVPCID enabled, causing pcid.flat's "Test on INVPCID when disabled" test to fail.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Ladi Prosek

It has been experimentally confirmed that supporting these two MSRs is one of the necessary conditions for nested Hyper-V to use the TSC page. Modern Windows guests are noticeably slower when they fall back to reading timestamps from the HV_X64_MSR_TIME_REF_COUNT MSR instead of using the TSC page.

The newly supported MSRs are advertised with the AccessFrequencyRegs partition privilege flag and CPUID.40000003H:EDX[8] "Support for determining timer frequencies is available" (both outside the scope of this KVM patch).

Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 05 Aug 2017, 1 commit
-
-
Submitted by David S. Miller

Mikael Pettersson reported that some test programs in the strace-4.18 test suite cause an OOPS. After some debugging it turns out that garbage values are returned when an exception occurs, causing the fixup memset() to be run with bogus arguments. The problem is that two of the exception handler stubs write the successfully-copied length into the wrong register.

Fixes: ee841d0a ("sparc64: Convert U3copy_{from,to}_user to accurate exception reporting.")
Reported-by: Mikael Pettersson <mikpelinux@gmail.com>
Tested-by: Mikael Pettersson <mikpelinux@gmail.com>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 04 Aug 2017, 5 commits
-
-
Submitted by Nick Desaulniers

The bitmask used to define these values produces an overflow, as seen in this compiler warning:

arch/arm64/kernel/head.S:47:8: warning: integer overflow in preprocessor expression
#elif (PAGE_OFFSET & 0x1fffff) != 0
       ^~~~~~~~~~~
arch/arm64/include/asm/memory.h:52:46: note: expanded from macro 'PAGE_OFFSET'
#define PAGE_OFFSET (UL(0xffffffffffffffff) << (VA_BITS - 1))
                     ~~~~~~~~~~~~~~~~~~ ^

It would be preferable to use GENMASK_ULL() instead, but it's not set up to be used from assembly (the UL() macro token-pastes UL suffixes when not included in assembly sources).

Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Suggested-by: Yury Norov <ynorov@caviumnetworks.com>
Suggested-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
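A hedged sketch of an overflow-free way to express the same address without GENMASK_ULL(), computing the base with a subtraction rather than shifting an all-ones value; this illustrates the problem described above and is not necessarily the exact fix:

    /* Sketch: avoid left-shifting a value whose top bit is set (which
     * overflows in preprocessor arithmetic) by deriving the upper-half
     * base addresses via subtraction. Usable from C and assembly via UL(). */
    #define VA_START     (UL(0xffffffffffffffff) - \
                          (UL(1) << VA_BITS) + 1)
    #define PAGE_OFFSET  (UL(0xffffffffffffffff) - \
                          (UL(1) << (VA_BITS - 1)) + 1)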
-
Submitted by Catalin Marinas

In a system with DBM (dirty bit management) capable agents there is a possible race between a CPU executing ptep_set_access_flags() (maybe non-DBM capable) and a hardware update of the dirty state (clearing of PTE_RDONLY). The scenario:

a) the pte is writable (PTE_WRITE set), clean (PTE_RDONLY set) and old (PTE_AF clear)
b) ptep_set_access_flags() is called as a result of a read access and it needs to set the pte to writable, clean and young (PTE_AF set)
c) a DBM-capable agent, as a result of a different write access, is marking the entry as young (setting PTE_AF) and dirty (clearing PTE_RDONLY)

The current ptep_set_access_flags() implementation would set the PTE_RDONLY bit in the resulting value, overriding the DBM update and losing the dirty state. This patch fixes such a race by setting PTE_RDONLY to the most permissive (lowest value) of the current entry and the new one.

Fixes: 66dbd6e6 ("arm64: Implement ptep_set_access_flags() for hardware AF/DBM")
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Submitted by Marc Gonzalez

RX and TX clock delays are required. Request them explicitly.

Fixes: cad008b8 ("ARM: dts: tango4: Initial device trees")
Cc: stable@vger.kernel.org
Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-
Submitted by Nicholas Piggin

If the decrementer wraps again and de-asserts the decrementer exception while hard-disabled, __check_irq_replay() has a test to notice the wrap when interrupts are re-enabled. The decrementer check must be done when clearing the PACA_IRQ_HARD_DIS flag, not when the PACA_IRQ_DEC flag is tested. Previously this worked because the decrementer interrupt was always the first one checked after clearing the hard-disable flag, but the HMI check was moved ahead of it, which introduced this bug.

This can cause a missed decrementer interrupt if we soft-disable interrupts, then take an HMI which is recorded in irq_happened, then hard-disable interrupts for more than 4 seconds so the decrementer wraps.

Fixes: e0e0d6b7 ("powerpc/64: Replay hypervisor maintenance interrupt first")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Nicholas Piggin

The POWER9 DD2 PMU can stop after a state-loss idle in some conditions. A solution is to set and then clear MMCRA[60] after waking from state-loss idle. MMCRA[60] is a non-architected bit; see the user manual for details.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 03 Aug 2017, 6 commits
-
-
Submitted by Wanpeng Li

------------[ cut here ]------------
WARNING: CPU: 5 PID: 2288 at arch/x86/kvm/vmx.c:11124 nested_vmx_vmexit+0xd64/0xd70 [kvm_intel]
CPU: 5 PID: 2288 Comm: qemu-system-x86 Not tainted 4.13.0-rc2+ #7
RIP: 0010:nested_vmx_vmexit+0xd64/0xd70 [kvm_intel]
Call Trace:
 vmx_check_nested_events+0x131/0x1f0 [kvm_intel]
 ? vmx_check_nested_events+0x131/0x1f0 [kvm_intel]
 kvm_arch_vcpu_ioctl_run+0x5dd/0x1be0 [kvm]
 ? vmx_vcpu_load+0x1be/0x220 [kvm_intel]
 ? kvm_arch_vcpu_load+0x62/0x230 [kvm]
 kvm_vcpu_ioctl+0x340/0x700 [kvm]
 ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
 ? __fget+0xfc/0x210
 do_vfs_ioctl+0xa4/0x6a0
 ? __fget+0x11d/0x210
 SyS_ioctl+0x79/0x90
 do_syscall_64+0x8f/0x750
 ? trace_hardirqs_on_thunk+0x1a/0x1c
 entry_SYSCALL64_slow_path+0x25/0x25

This can be reproduced by booting the L1 guest with the 'noapic' grub parameter, which tells the kernel not to make use of any IOAPICs that may be present in the system.

The external_intr variable in nested_vmx_vmexit() is actually the req_int_win variable passed from vcpu_enter_guest(), which means that L0's userspace requested an irq window. I observed the scenario (!kvm_cpu_has_interrupt(vcpu) && L0's userspace requests an irq window) to be true, so there is no interrupt which L1 requires to inject into L2, and we should not attempt to emulate "Acknowledge interrupt on exit" just because of the irq-window request in this scenario.

This patch fixes it by not attempting to emulate "Acknowledge interrupt on exit" if there is no L1 requirement to inject an interrupt into L2.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
[Added a code comment to make it obvious that the behavior is not correct. We should do a userspace exit with an open interrupt window instead of the nested VM exit. This patch still improves the behavior, so it was accepted as a (temporary) workaround.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Submitted by David Matlack

The host physical addresses of L1's Virtual APIC Page and Posted Interrupt descriptor are loaded into the VMCS02. The CPU may write to these pages via their host physical address while L2 is running, bypassing address-translation-based dirty tracking (e.g. EPT write protection). Mark them dirty on every exit from L2 to prevent them from getting out of sync with dirty tracking. Also mark the virtual APIC page and the posted interrupt descriptor dirty when KVM is virtualizing posted interrupt processing.

Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Submitted by David Matlack

According to the Intel SDM, software cannot rely on the current VMCS to be coherent after a VMXOFF or shutdown, so this is a valid way to handle VMCS12 flushes.

24.11.1 Software Use of Virtual-Machine Control Structures
...
If a logical processor leaves VMX operation, any VMCSs active on that logical processor may be corrupted (see below). To prevent such corruption of a VMCS that may be used either after a return to VMX operation or on another logical processor, software should execute VMCLEAR for that VMCS before executing the VMXOFF instruction or removing power from the processor (e.g., as part of a transition to the S3 and S4 power states).
...

This fixes a "suspicious rcu_dereference_check() usage!" warning during kvm_vm_release(), because nested_release_vmcs12() calls kvm_vcpu_write_guest_page() without holding kvm->srcu.

Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Submitted by Paolo Bonzini

Since the current implementation of VMCS12 does a memcpy in and out of guest memory, we do not need current_vmcs12 and current_vmcs12_page anymore; current_vmptr is enough to read and write the VMCS12.

And David Matlack noted: this patch also fixes dirty tracking (memslot->dirty_bitmap) of the VMCS12 page by using kvm_write_guest. nested_release_page() only marks the struct page dirty.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[Added David Matlack's note and nested_release_page_clean() fix.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Submitted by Longpeng(Mike)

'lapic_irq' is a local variable and its 'level' field isn't initialized, so 'level' is random. It doesn't actually matter, but it makes UBSAN unhappy:

UBSAN: Undefined behaviour in .../lapic.c:...
load of value 10 is not a valid value for type '_Bool'
...
Call Trace:
 [<ffffffff81f030b6>] dump_stack+0x1e/0x20
 [<ffffffff81f03173>] ubsan_epilogue+0x12/0x55
 [<ffffffff81f03b96>] __ubsan_handle_load_invalid_value+0x118/0x162
 [<ffffffffa1575173>] kvm_apic_set_irq+0xc3/0xf0 [kvm]
 [<ffffffffa1575b20>] kvm_irq_delivery_to_apic_fast+0x450/0x910 [kvm]
 [<ffffffffa15858ea>] kvm_irq_delivery_to_apic+0xfa/0x7a0 [kvm]
 [<ffffffffa1517f4e>] kvm_emulate_hypercall+0x62e/0x760 [kvm]
 [<ffffffffa113141a>] handle_vmcall+0x1a/0x30 [kvm_intel]
 [<ffffffffa114e592>] vmx_handle_exit+0x7a2/0x1fa0 [kvm_intel]
 ...

Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Submitted by Wanpeng Li

When an SMP VM starts, an AP may lose its INIT because the INIT is delivered between kvm_vcpu_ioctl_x86_get_vcpu_events and kvm_vcpu_ioctl_x86_set_vcpu_events:

   vcpu 0                             vcpu 1
                                      kvm_vcpu_ioctl_x86_get_vcpu_events
                                        events->smi.latched_init = 0
   send INIT to vcpu1
     set vcpu1's pending_events
                                      kvm_vcpu_ioctl_x86_set_vcpu_events
                                        if (events->smi.latched_init == 0)
                                          clear INIT in pending_events

This patch fixes it by only updating the SMM-related flags if we are in SMM. Thanks to Peng Hao for the report and the original commit message.

Reported-by: Peng Hao <peng.hao2@zte.com.cn>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
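A hedged sketch of one plausible shape of the guard in kvm_vcpu_ioctl_x86_set_vcpu_events(); gating the latched_init update on the SMM-events validity flag is an assumption drawn from the commit text, not a quote of the patch:

    /* Sketch: only rewrite latched INIT state together with the rest of
     * the SMM fields, so an INIT delivered between the get/set ioctls is
     * not silently dropped from pending_events. */
    if (events->flags & KVM_VCPUEVENT_VALID_SMM) {
        /* ... other SMM state updates ... */
        if (lapic_in_kernel(vcpu)) {
            if (events->smi.latched_init)
                set_bit(KVM_APIC_INIT, &vcpu->arch.apic->pending_events);
            else
                clear_bit(KVM_APIC_INIT, &vcpu->arch.apic->pending_events);
        }
    }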
-
- 02 Aug 2017, 5 commits
-
-
Submitted by Gregory CLEMENT

The number of pins in the South Bridge is 30, not 29. There is a fix for the pinctrl driver, but a fix is also needed at the device tree level for the GPIO.

Fixes: afda007f ("ARM64: dts: marvell: Add pinctrl nodes for Armada 3700")
Cc: <stable@vger.kernel.org>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
-
Submitted by Sergei Shtylyov

of_irq_to_resource() has recently been fixed to return negative error numbers along with 0 in case of failure, but the Freescale MPC832x RDB board code still regards only 0 as a failure indication -- fix it up.

Fixes: 7a4228bb ("of: irq: use of_irq_get() in of_irq_to_resource()")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Scott Wood <oss@buserror.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Wanpeng Li

WARNING: CPU: 5 PID: 1242 at kernel/rcu/tree_plugin.h:323 rcu_note_context_switch+0x207/0x6b0
CPU: 5 PID: 1242 Comm: unity-settings- Not tainted 4.13.0-rc2+ #1
RIP: 0010:rcu_note_context_switch+0x207/0x6b0
Call Trace:
 __schedule+0xda/0xba0
 ? kvm_async_pf_task_wait+0x1b2/0x270
 schedule+0x40/0x90
 kvm_async_pf_task_wait+0x1cc/0x270
 ? prepare_to_swait+0x22/0x70
 do_async_page_fault+0x77/0xb0
 ? do_async_page_fault+0x77/0xb0
 async_page_fault+0x28/0x30
RIP: 0010:__d_lookup_rcu+0x90/0x1e0

I encountered this when trying to stress the async page fault in an L1 guest with L2 guests running.

Commit 9b132fbe ("Add rcu user eqs exception hooks for async page fault") added rcu_irq_enter/exit() to kvm_async_pf_task_wait() to exit CPU idle eqs when needed, to protect the code that needs to use RCU. However, we need to call the pair even if the function calls schedule(), as seen from the above backtrace.

This patch fixes it by informing the RCU subsystem of the irq entry/exit towards/away from idle for both the n.halted and the !n.halted cases.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Submitted by Paolo Bonzini

There are three issues in nested_vmx_check_exception:

1) it is not taking PFEC_MATCH/PFEC_MASK into account, as reported by Wanpeng Li;
2) it should rebuild the interruption info and exit qualification fields from scratch, as reported by Jim Mattson, because the values from the L2->L0 vmexit may be invalid (e.g. if an emulated instruction causes a page fault, the EPT misconfig's exit qualification is incorrect);
3) CR2 and DR6 should not be written for exception intercept vmexits (CR2 only for AMD).

This patch fixes the first two and adds a comment about the last, outlining the fix.

Cc: Jim Mattson <jmattson@google.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Paolo Bonzini

Do this in the caller of nested_vmx_vmexit instead. nested_vmx_check_exception was doing a vmwrite to the vmcs02's VM_EXIT_INTR_ERROR_CODE field, so that prepare_vmcs12 would move the field to vmcs12->vm_exit_intr_error_code. However that isn't possible on pre-Haswell machines. Moving the vmcs12 write to the callers fixes it.

Reported-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Changed nested_vmx_reflect_vmexit() return type to (int)1 from (bool)1, thanks to fengguang.wu@intel.com]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
- 01 Aug 2017, 2 commits
-
-
Submitted by Marc Zyngier

In an ideal world, CNTFRQ_EL0 always contains the timer frequency for the kernel to use. Sadly, we get quite a few broken systems where the firmware authors cannot be bothered to program that register on all CPUs, and rely on DT to provide the frequency. So when trapping CNTFRQ_EL0, make sure to return the actual rate (as known by the kernel), and not CNTFRQ_EL0.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Submitted by Thomas Gleixner

The HPET resume path abuses irq_domain_[de]activate_irq() to restore the MSI message in the HPET chip for the boot CPU on resume, and it relies on an implementation detail of the interrupt core code which magically makes the HPET unmask call be invoked via an irq_disable/enable pair. This worked as long as the irq code unconditionally invoked the unmask() callback. With the recent changes which keep track of the masked state to avoid expensive hardware access, this no longer works. As a consequence the HPET timer interrupts are not unmasked, which breaks resume because the boot CPU waits forever for a timer interrupt to arrive.

Make the restore of the MSI message explicit and invoke the unmask() function directly. While at it, get rid of the pointless affinity setting, as nothing can change the affinity of the interrupt and the vector across suspend/resume. The restore of the MSI message reestablishes the previous affinity setting, which is the correct one.

Fixes: bf22ff45 ("genirq: Avoid unnecessary low level irq function calls")
Reported-and-tested-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Reported-by: Martin Peres <martin.peres@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: jeffy.chen@rock-chips.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1707312158590.2287@nanos
-
- 31 Jul 2017, 2 commits
-
-
Submitted by Babu Moger

While working on enabling queued rwlocks on SPARC, I found the following code in include/asm-generic/qrwlock.h, which uses CONFIG_CPU_BIG_ENDIAN to locate the byte to clear:

  static inline u8 *__qrwlock_write_byte(struct qrwlock *lock)
  {
      return (u8 *)lock + 3 * IS_BUILTIN(CONFIG_CPU_BIG_ENDIAN);
  }

The problem is that many of the fixed big-endian architectures don't define CPU_BIG_ENDIAN and therefore clear the wrong byte. Define CPU_BIG_ENDIAN for the parisc architecture to fix it.

Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Helge Deller <deller@gmx.de>
-
Submitted by Nicholas Piggin

The watchdog soft-NMI exception stack setup loads a stack pointer twice, which is an obvious error. It ends up using the system reset interrupt (true-NMI) stack, which is also a bug because the watchdog could be preempted by a system reset interrupt that overwrites the NMI stack.

Change the soft-NMI to use the "emergency stack". The current kernel stack is not used, because of the longer-term goal of preventing asynchronous stack access using soft-disable.

Fixes: 2104180a ("powerpc/64s: implement arch-specific hardlockup watchdog")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-