- 16 9月, 2019 37 次提交
-
-
由 Sean Christopherson 提交于
[ Upstream commit beb8d93b3e423043e079ef3dda19dad7b28467a8 ] A previous fix to prevent KVM from consuming stale VMCS state after a failed VM-Entry inadvertantly blocked KVM's handling of machine checks that occur during VM-Entry. Per Intel's SDM, a #MC during VM-Entry is handled in one of three ways, depending on when the #MC is recognoized. As it pertains to this bug fix, the third case explicitly states EXIT_REASON_MCE_DURING_VMENTRY is handled like any other VM-Exit during VM-Entry, i.e. sets bit 31 to indicate the VM-Entry failed. If a machine-check event occurs during a VM entry, one of the following occurs: - The machine-check event is handled as if it occurred before the VM entry: ... - The machine-check event is handled after VM entry completes: ... - A VM-entry failure occurs as described in Section 26.7. The basic exit reason is 41, for "VM-entry failure due to machine-check event". Explicitly handle EXIT_REASON_MCE_DURING_VMENTRY as a one-off case in vmx_vcpu_run() instead of binning it into vmx_complete_atomic_exit(). Doing so allows vmx_vcpu_run() to handle VMX_EXIT_REASONS_FAILED_VMENTRY in a sane fashion and also simplifies vmx_complete_atomic_exit() since VMCS.VM_EXIT_INTR_INFO is guaranteed to be fresh. Fixes: b060ca3b ("kvm: vmx: Handle VMLAUNCH/VMRESUME failure properly") Cc: stable@vger.kernel.org Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: NJim Mattson <jmattson@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Sean Christopherson 提交于
[ Upstream commit d28f4290b53a157191ed9991ad05dffe9e8c0c89 ] The behavior of WRMSR is in no way dependent on whether or not KVM consumes the value. Fixes: 4566654b ("KVM: vmx: Inject #GP on invalid PAT CR") Cc: stable@vger.kernel.org Cc: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Paolo Bonzini 提交于
[ Upstream commit 674ea351cdeb01d2740edce31db7f2d79ce6095d ] This check will soon be done on every nested vmentry and vmexit, "parallelize" it using bitwise operations. Reviewed-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Peter Xu 提交于
[ Upstream commit 654f1f13ea56b92bacade8ce2725aea0457f91c0 ] When assigning kvm irqfd we didn't check the irqchip mode but we allow KVM_IRQFD to succeed with all the irqchip modes. However it does not make much sense to create irqfd even without the kernel chips. Let's provide a arch-dependent helper to check whether a specific irqfd is allowed by the arch. At least for x86, it should make sense to check: - when irqchip mode is NONE, all irqfds should be disallowed, and, - when irqchip mode is SPLIT, irqfds that are with resamplefd should be disallowed. For either of the case, previously we'll silently ignore the irq or the irq ack event if the irqchip mode is incorrect. However that can cause misterious guest behaviors and it can be hard to triage. Let's fail KVM_IRQFD even earlier to detect these incorrect configurations. CC: Paolo Bonzini <pbonzini@redhat.com> CC: Radim Krčmář <rkrcmar@redhat.com> CC: Alex Williamson <alex.williamson@redhat.com> CC: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Eugeniy Paltsev 提交于
[ Upstream commit a8c715b4dd73c26a81a9cc8dc792aa715d8b4bb2 ] As of today if userspace process tries to access a kernel virtual addres (0x7000_0000 to 0x7ffff_ffff) such that a legit kernel mapping already exists, that process hangs instead of being killed with SIGSEGV Fix that by ensuring that do_page_fault() handles kenrel vaddr only if in kernel mode. And given this, we can also simplify the code a bit. Now a vmalloc fault implies kernel mode so its failure (for some reason) can reuse the @no_context label and we can remove @bad_area_nosemaphore. Reproduce user test for original problem: ------------------------>8----------------- #include <stdlib.h> #include <stdint.h> int main(int argc, char *argv[]) { volatile uint32_t temp; temp = *(uint32_t *)(0x70000000); } ------------------------>8----------------- Cc: <stable@vger.kernel.org> Signed-off-by: NEugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Eugeniy Paltsev 提交于
[ Upstream commit 121e38e5acdc8e1e4cdb750fcdcc72f94e420968 ] Commit 15773ae938d8 ("signal/arc: Use force_sig_fault where appropriate") introduced undefined behaviour by leaving si_code unitiailized and leaking random kernel values to user space. Fixes: 15773ae938d8 ("signal/arc: Use force_sig_fault where appropriate") Signed-off-by: NEugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Eric W. Biederman 提交于
[ Upstream commit 15773ae938d8d93d982461990bebad6e1d7a1830 ] Acked-by: NVineet Gupta <vgupta@synopsys.com> Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Christian Lamparter 提交于
[ Upstream commit f3e35357cd460a8aeb48b8113dc4b761a7d5c828 ] David Bauer reported that the VDSL modem (attached via PCIe) on his AVM Fritz!Box 7530 was complaining about not having enough space in the BAR. A closer inspection of the old qcom-ipq40xx.dtsi pulled from the GL-iNet repository listed: | qcom,pcie@80000 { | compatible = "qcom,msm_pcie"; | reg = <0x80000 0x2000>, | <0x99000 0x800>, | <0x40000000 0xf1d>, | <0x40000f20 0xa8>, | <0x40100000 0x1000>, | <0x40200000 0x100000>, | <0x40300000 0xd00000>; | reg-names = "parf", "phy", "dm_core", "elbi", | "conf", "io", "bars"; Matching the reg-names with the listed reg leads to <0xd00000> as the size for the "bars". Cc: stable@vger.kernel.org BugLink: https://www.mail-archive.com/openwrt-devel@lists.openwrt.org/msg45212.htmlReported-by: NDavid Bauer <mail@david-bauer.net> Signed-off-by: NChristian Lamparter <chunkeey@gmail.com> Signed-off-by: NAndy Gross <agross@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Niklas Cassel 提交于
[ Upstream commit 97131f85c08e024df49480ed499aae8fb754067f ] The databook clearly states that the MSI IRQ (msi_ctrl_int) is a level triggered interrupt. The msi_ctrl_int will be high for as long as any MSI status bit is set, thus the IRQ type should be set to IRQ_TYPE_LEVEL_HIGH, causing the IRQ handler to keep getting called, as long as any MSI status bit is set. A git grep shows that ipq4019 is the only SoC using snps,dw-pcie that has configured this IRQ incorrectly. Not having the correct IRQ type defined will cause us to lose interrupts, which in turn causes timeouts in the PCIe endpoint drivers. Signed-off-by: NNiklas Cassel <niklas.cassel@linaro.org> Reviewed-by: NBjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: NAndy Gross <andy.gross@linaro.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Mathias Kresin 提交于
[ Upstream commit da89f500cb55fb3f19c4b399b46d8add0abbd4d6 ] The PCI range is invalid and PCI attached devices doen't work. Signed-off-by: NMathias Kresin <dev@kresin.me> Signed-off-by: NJohn Crispin <john@phrozen.org> Signed-off-by: NAndy Gross <andy.gross@linaro.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Sean Christopherson 提交于
[ Upstream commit b68f3cc7d978943fcf85148165b00594c38db776 ] Invoking the 64-bit variation on a 32-bit kenrel will crash the guest, trigger a WARN, and/or lead to a buffer overrun in the host, e.g. rsm_load_state_64() writes r8-r15 unconditionally, but enum kvm_reg and thus x86_emulate_ctxt._regs only define r8-r15 for CONFIG_X86_64. KVM allows userspace to report long mode support via CPUID, even though the guest is all but guaranteed to crash if it actually tries to enable long mode. But, a pure 32-bit guest that is ignorant of long mode will happily plod along. SMM complicates things as 64-bit CPUs use a different SMRAM save state area. KVM handles this correctly for 64-bit kernels, e.g. uses the legacy save state map if userspace has hid long mode from the guest, but doesn't fare well when userspace reports long mode support on a 32-bit host kernel (32-bit KVM doesn't support 64-bit guests). Since the alternative is to crash the guest, e.g. by not loading state or explicitly requesting shutdown, unconditionally use the legacy SMRAM save state map for 32-bit KVM. If a guest has managed to get far enough to handle SMIs when running under a weird/buggy userspace hypervisor, then don't deliberately crash the guest since there are no downsides (from KVM's perspective) to allow it to continue running. Fixes: 660a5d51 ("KVM: x86: save/load state on SMM switch") Cc: stable@vger.kernel.org Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 WANG Chao 提交于
[ Upstream commit 1811d979c71621aafc7b879477202d286f7e863b ] guest xcr0 could leak into host when MCE happens in guest mode. Because do_machine_check() could schedule out at a few places. For example: kvm_load_guest_xcr0 ... kvm_x86_ops->run(vcpu) { vmx_vcpu_run vmx_complete_atomic_exit kvm_machine_check do_machine_check do_memory_failure memory_failure lock_page In this case, host_xcr0 is 0x2ff, guest vcpu xcr0 is 0xff. After schedule out, host cpu has guest xcr0 loaded (0xff). In __switch_to { switch_fpu_finish copy_kernel_to_fpregs XRSTORS If any bit i in XSTATE_BV[i] == 1 and xcr0[i] == 0, XRSTORS will generate #GP (In this case, bit 9). Then ex_handler_fprestore kicks in and tries to reinitialize fpu by restoring init fpu state. Same story as last #GP, except we get DOUBLE FAULT this time. Cc: stable@vger.kernel.org Signed-off-by: NWANG Chao <chao.wang@ucloud.cn> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Ben Gardon 提交于
[ Upstream commit bc8a3d8925a8fa09fa550e0da115d95851ce33c6 ] KVM bases its memory usage limits on the total number of guest pages across all memslots. However, those limits, and the calculations to produce them, use 32 bit unsigned integers. This can result in overflow if a VM has more guest pages that can be represented by a u32. As a result of this overflow, KVM can use a low limit on the number of MMU pages it will allocate. This makes KVM unable to map all of guest memory at once, prompting spurious faults. Tested: Ran all kvm-unit-tests on an Intel Haswell machine. This patch introduced no new failures. Signed-off-by: NBen Gardon <bgardon@google.com> Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Dinh Nguyen 提交于
[ Upstream commit 8efd6365417a044db03009724ecc1a9521524913 ] The gmac ethernet driver uses the "altr,sysmgr-syscon" property to configure phy settings for the gmac controller. Add the "altr,sysmgr-syscon" property to all gmac nodes. This patch fixes: [ 0.917530] socfpga-dwmac ff800000.ethernet: No sysmgr-syscon node found [ 0.924209] socfpga-dwmac ff800000.ethernet: Unable to parse OF data Cc: stable@vger.kernel.org Reported-by: NLey Foon Tan <ley.foon.tan@intel.com> Signed-off-by: NDinh Nguyen <dinguyen@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Michael Ellerman 提交于
[ Upstream commit c3c7470c75566a077c8dc71dcf8f1948b8ddfab4 ] When the hash MMU is active the AMR, IAMR and UAMOR are used for pkeys. The AMR is directly writable by user space, and the UAMOR masks those writes, meaning both registers are effectively user register state. The IAMR is used to create an execute only key. Also we must maintain the value of at least the AMR when running in process context, so that any memory accesses done by the kernel on behalf of the process are correctly controlled by the AMR. Although we are correctly switching all registers when going into a guest, on returning to the host we just write 0 into all regs, except on Power9 where we restore the IAMR correctly. This could be observed by a user process if it writes the AMR, then runs a guest and we then return immediately to it without rescheduling. Because we have written 0 to the AMR that would have the effect of granting read/write permission to pages that the process was trying to protect. In addition, when using the Radix MMU, the AMR can prevent inadvertent kernel access to userspace data, writing 0 to the AMR disables that protection. So save and restore AMR, IAMR and UAMOR. Fixes: cf43d3b2 ("powerpc: Enable pkey subsystem") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: NRussell Currey <ruscur@russell.cc> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Acked-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Pavel Tatashin 提交于
[ Upstream commit b5179ec4187251a751832193693d6e474d3445ac ] VMs may show incorrect uptime and dmesg printk offsets on hypervisors with unstable clock. The problem is produced when VM is rebooted without exiting from qemu. The fix is to calculate clock offset not only for stable clock but for unstable clock as well, and use kvm_sched_clock_read() which substracts the offset for both clocks. This is safe, because pvclock_clocksource_read() does the right thing and makes sure that clock always goes forward, so once offset is calculated with unstable clock, we won't get new reads that are smaller than offset, and thus won't get negative results. Thank you Jon DeVree for helping to reproduce this issue. Fixes: 857baa87 ("sched/clock: Enable sched clock early") Cc: stable@vger.kernel.org Reported-by: NDominique Martinet <asmadeus@codewreck.org> Signed-off-by: NPavel Tatashin <pasha.tatashin@soleen.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Sean Christopherson 提交于
[ Upstream commit 61c08aa9606d4e48a8a50639c956448a720174c3 ] The vCPU-run asm blob does a manual comparison of a VMCS' launched status to execute the correct VM-Enter instruction, i.e. VMLAUNCH vs. VMRESUME. The launched flag is a bool, which is a typedef of _Bool. C99 does not define an exact size for _Bool, stating only that is must be large enough to hold '0' and '1'. Most, if not all, compilers use a single byte for _Bool, including gcc[1]. Originally, 'launched' was of type 'int' and so the asm blob used 'cmpl' to check the launch status. When 'launched' was moved to be stored on a per-VMCS basis, struct vcpu_vmx's "temporary" __launched flag was added in order to avoid having to pass the current VMCS into the asm blob. The new '__launched' was defined as a 'bool' and not an 'int', but the 'cmp' instruction was not updated. This has not caused any known problems, likely due to compilers aligning variables to 4-byte or 8-byte boundaries and KVM zeroing out struct vcpu_vmx during allocation. I.e. vCPU-run accesses "junk" data, it just happens to always be zero and so doesn't affect the result. [1] https://gcc.gnu.org/ml/gcc-patches/2000-10/msg01127.html Fixes: d462b819 ("KVM: VMX: Keep list of loaded VMCSs, instead of vcpus") Cc: <stable@vger.kernel.org> Reviewed-by: NJim Mattson <jmattson@google.com> Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Vineet Gupta 提交于
[ Upstream commit 4d447455e73b47c43dd35fcc38ed823d3182a474 ] do_page_fault() forgot to relinquish mmap_sem if a signal came while handling handle_mm_fault() - due to say a ctl+c or oom etc. This would later cause a deadlock by acquiring it twice. This came to light when running libc testsuite tst-tls3-malloc test but is likely also the cause for prior seen LTP failures. Using lockdep clearly showed what the issue was. | # while true; do ./tst-tls3-malloc ; done | Didn't expect signal from child: got `Segmentation fault' | ^C | ============================================ | WARNING: possible recursive locking detected | 4.17.0+ #25 Not tainted | -------------------------------------------- | tst-tls3-malloc/510 is trying to acquire lock: | 606c7728 (&mm->mmap_sem){++++}, at: __might_fault+0x28/0x5c | |but task is already holding lock: |606c7728 (&mm->mmap_sem){++++}, at: do_page_fault+0x9c/0x2a0 | | other info that might help us debug this: | Possible unsafe locking scenario: | | CPU0 | ---- | lock(&mm->mmap_sem); | lock(&mm->mmap_sem); | | *** DEADLOCK *** | ------------------------------------------------------------ What the change does is not obvious (note to myself) prior code was | do_page_fault | | down_read() <-- lock taken | handle_mm_fault <-- signal pending as this runs | if fatal_signal_pending | if VM_FAULT_ERROR | up_read | if user_mode | return <-- lock still held, this was the BUG New code | do_page_fault | | down_read() <-- lock taken | handle_mm_fault <-- signal pending as this runs | if fatal_signal_pending | if VM_FAULT_RETRY | return <-- not same case as above, but still OK since | core mm already relinq lock for FAULT_RETRY | ... | | < Now falls through for bug case above > | | up_read() <-- lock relinquished Cc: stable@vger.kernel.org Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Vineet Gupta 提交于
[ Upstream commit f731a8e89f8c78985707c626680f3e24c7a60772 ] signal handling core calls show_regs() with preemption disabled which on ARC takes mmap_sem for mm/vma access, causing lockdep splat. | [ARCLinux]# ./segv-null-ptr | potentially unexpected fatal signal 11. | BUG: sleeping function called from invalid context at kernel/fork.c:1011 | in_atomic(): 1, irqs_disabled(): 0, pid: 70, name: segv-null-ptr | no locks held by segv-null-ptr/70. | CPU: 0 PID: 70 Comm: segv-null-ptr Not tainted 4.18.0+ #69 | | Stack Trace: | arc_unwind_core+0xcc/0x100 | ___might_sleep+0x17a/0x190 | mmput+0x16/0xb8 | show_regs+0x52/0x310 | get_signal+0x5ee/0x610 | do_signal+0x2c/0x218 | resume_user_mode_begin+0x90/0xd8 Workaround by re-enabling preemption temporarily. Note that the preemption disabling in core code around show_regs() was introduced by commit 3a9f84d3 ("signals, debug: fix BUG: using smp_processor_id() in preemptible code in print_fatal_signal()") to silence a differnt lockdep seen on x86 bakc in 2009. Cc: <stable@vger.kernel.org> Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Ram Pai 提交于
[ Upstream commit 2cd4bd192ee94848695c1c052d87913260e10f36 ] Protection key tracking information is not copied over to the mm_struct of the child during fork(). This can cause the child to erroneously allocate keys that were already allocated. Any allocated execute-only key is lost aswell. Add code; called by dup_mmap(), to copy the pkey state from parent to child explicitly. This problem was originally found by Dave Hansen on x86, which turns out to be a problem on powerpc aswell. Fixes: cf43d3b2 ("powerpc: Enable pkey subsystem") Cc: stable@vger.kernel.org # v4.16+ Reviewed-by: NThiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: NRam Pai <linuxram@us.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Paul Mackerras 提交于
[ Upstream commit 234ff0b729ad882d20f7996591a964965647addf ] Testing has revealed an occasional crash which appears to be caused by a race between kvmppc_switch_mmu_to_hpt and kvm_unmap_hva_range_hv. The symptom is a NULL pointer dereference in __find_linux_pte() called from kvm_unmap_radix() with kvm->arch.pgtable == NULL. Looking at kvmppc_switch_mmu_to_hpt(), it does indeed clear kvm->arch.pgtable (via kvmppc_free_radix()) before setting kvm->arch.radix to NULL, and there is nothing to prevent kvm_unmap_hva_range_hv() or the other MMU callback functions from being called concurrently with kvmppc_switch_mmu_to_hpt() or kvmppc_switch_mmu_to_radix(). This patch therefore adds calls to spin_lock/unlock on the kvm->mmu_lock around the assignments to kvm->arch.radix, and makes sure that the partition-scoped radix tree or HPT is only freed after changing kvm->arch.radix. This also takes the kvm->mmu_lock in kvmppc_rmap_reset() to make sure that the clearing of each rmap array (one per memslot) doesn't happen concurrently with use of the array in the kvm_unmap_hva_range_hv() or the other MMU callbacks. Fixes: 18c3640c ("KVM: PPC: Book3S HV: Add infrastructure for running HPT guests on radix host") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Bartosz Golaszewski 提交于
[ Upstream commit adcf60ce14c8250761af9de907eb6c7d096c26d3 ] Since commit eb3744a2 ("gpio: davinci: Do not assume continuous IRQ numbering") the davinci GPIO driver fails to probe if we boot in legacy mode from any of the board files. Since the driver now expects every interrupt to be defined as a separate resource, split the definition of IRQ resources instead of having a single continuous interrupt range. Fixes: eb3744a2 ("gpio: davinci: Do not assume continuous IRQ numbering") Cc: stable@vger.kernel.org Signed-off-by: NBartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: NSekhar Nori <nsekhar@ti.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Bartosz Golaszewski 提交于
[ Upstream commit 27db7baab640ea28d7994eda943fef170e347081 ] Since commit eb3744a2 ("gpio: davinci: Do not assume continuous IRQ numbering") the davinci GPIO driver fails to probe if we boot in legacy mode from any of the board files. Since the driver now expects every interrupt to be defined as a separate resource, split the definition of IRQ resources instead of having a single continuous interrupt range. Fixes: eb3744a2 ("gpio: davinci: Do not assume continuous IRQ numbering") Cc: stable@vger.kernel.org Signed-off-by: NBartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: NSekhar Nori <nsekhar@ti.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Bartosz Golaszewski 提交于
[ Upstream commit 2c9c83491f30afbce25796e185cd4d5e36080e31 ] Since commit eb3744a2 ("gpio: davinci: Do not assume continuous IRQ numbering") the davinci GPIO driver fails to probe if we boot in legacy mode from any of the board files. Since the driver now expects every interrupt to be defined as a separate resource, split the definition of IRQ resources instead of having a single continuous interrupt range. Fixes: eb3744a2 ("gpio: davinci: Do not assume continuous IRQ numbering") Cc: stable@vger.kernel.org Signed-off-by: NBartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: NSekhar Nori <nsekhar@ti.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Bartosz Golaszewski 提交于
[ Upstream commit 193c04374e281a56c7d4f96e66d329671945bebe ] Since commit eb3744a2 ("gpio: davinci: Do not assume continuous IRQ numbering") the davinci GPIO driver fails to probe if we boot in legacy mode from any of the board files. Since the driver now expects every interrupt to be defined as a separate resource, split the definition of IRQ resources instead of having a single continuous interrupt range. Fixes: eb3744a2 ("gpio: davinci: Do not assume continuous IRQ numbering") Cc: stable@vger.kernel.org Signed-off-by: NBartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: NSekhar Nori <nsekhar@ti.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Bartosz Golaszewski 提交于
[ Upstream commit 58a0afbf4c99ac355df16773af835b919b9432ee ] Since commit eb3744a2 ("gpio: davinci: Do not assume continuous IRQ numbering") the davinci GPIO driver fails to probe if we boot in legacy mode from any of the board files. Since the driver now expects every interrupt to be defined as a separate resource, split the definition of IRQ resources instead of having a single continuous interrupt range. Fixes: eb3744a2 ("gpio: davinci: Do not assume continuous IRQ numbering") Cc: stable@vger.kernel.org Signed-off-by: NBartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: NSekhar Nori <nsekhar@ti.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Vitaly Kuznetsov 提交于
[ Upstream commit a7c42bb6da6b1b54b2e7bd567636d72d87b10a79 ] vcpu->arch.pv_eoi is accessible through both HV_X64_MSR_VP_ASSIST_PAGE and MSR_KVM_PV_EOI_EN so on migration userspace may try to restore them in any order. Values match, however, kvm_lapic_enable_pv_eoi() uses different length: for Hyper-V case it's the whole struct hv_vp_assist_page, for KVM native case it is 8. In case we restore KVM-native MSR last cache will be reinitialized with len=8 so trying to access VP assist page beyond 8 bytes with kvm_read_guest_cached() will fail. Check if we re-initializing cache for the same address and preserve length in case it was greater. Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Ladi Prosek 提交于
[ Upstream commit 72bbf9358c3676bd89dc4bd8fb0b1f2a11c288fc ] The state related to the VP assist page is still managed by the LAPIC code in the pv_eoi field. Signed-off-by: NLadi Prosek <lprosek@redhat.com> Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: NLiran Alon <liran.alon@oracle.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Vitaly Kuznetsov 提交于
[ Upstream commit 87ee613d076351950b74383215437f841ebbeb75 ] In most common cases VP index of a vcpu matches its vcpu index. Userspace is, however, free to set any mapping it wishes and we need to account for that when we need to find a vCPU with a particular VP index. To keep search algorithms optimal in both cases introduce 'num_mismatched_vp_indexes' counter showing how many vCPUs with mismatching VP index we have. In case the counter is zero we can assume vp_index == vcpu_idx. Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: NRoman Kagan <rkagan@virtuozzo.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Vitaly Kuznetsov 提交于
[ Upstream commit 1779a39f786397760ae7a7cc03cf37697d8ae58d ] Rename 'hv' to 'hv_vcpu' in kvm_hv_set_msr/kvm_hv_get_msr(); 'hv' is 'reserved' for 'struct kvm_hv' variables across the file. Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: NRoman Kagan <rkagan@virtuozzo.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Vitaly Kuznetsov 提交于
[ Upstream commit 9170200ec0ebad70e5b9902bc93e2b1b11456a3b ] Hyper-V TLFS (5.0b) states: > Virtual processors are identified by using an index (VP index). The > maximum number of virtual processors per partition supported by the > current implementation of the hypervisor can be obtained through CPUID > leaf 0x40000005. A virtual processor index must be less than the > maximum number of virtual processors per partition. Forbid userspace to set VP_INDEX above KVM_MAX_VCPUS. get_vcpu_by_vpidx() can now be optimized to bail early when supplied vpidx is >= KVM_MAX_VCPUS. Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: NRoman Kagan <rkagan@virtuozzo.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Zhimin Gu 提交于
[ Upstream commit cc55f7537db6af371e9c1c6a71161ee40f918824 ] On 32bit systems, nosave_regions(non RAM areas) located between max_low_pfn and max_pfn are not excluded from hibernation snapshot currently, which may result in a machine check exception when trying to access these unsafe regions during hibernation: [ 612.800453] Disabling lock debugging due to kernel taint [ 612.805786] mce: [Hardware Error]: CPU 0: Machine Check Exception: 5 Bank 6: fe00000000801136 [ 612.814344] mce: [Hardware Error]: RIP !INEXACT! 60:<00000000d90be566> {swsusp_save+0x436/0x560} [ 612.823167] mce: [Hardware Error]: TSC 1f5939fe276 ADDR dd000000 MISC 30e0000086 [ 612.830677] mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1529487426 SOCKET 0 APIC 0 microcode 24 [ 612.839581] mce: [Hardware Error]: Run the above through 'mcelog --ascii' [ 612.846394] mce: [Hardware Error]: Machine check: Processor context corrupt [ 612.853380] Kernel panic - not syncing: Fatal machine check [ 612.858978] Kernel Offset: 0x18000000 from 0xc1000000 (relocation range: 0xc0000000-0xf7ffdfff) This is because on 32bit systems, pages above max_low_pfn are regarded as high memeory, and accessing unsafe pages might cause expected MCE. On the problematic 32bit system, there are reserved memory above low memory, which triggered the MCE: e820 memory mapping: [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009d7ff] usable [ 0.000000] BIOS-e820: [mem 0x000000000009d800-0x000000000009ffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000d160cfff] usable [ 0.000000] BIOS-e820: [mem 0x00000000d160d000-0x00000000d1613fff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000d1614000-0x00000000d1a44fff] usable [ 0.000000] BIOS-e820: [mem 0x00000000d1a45000-0x00000000d1ecffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000d1ed0000-0x00000000d7eeafff] usable [ 0.000000] BIOS-e820: [mem 0x00000000d7eeb000-0x00000000d7ffffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000d8000000-0x00000000d875ffff] usable [ 0.000000] BIOS-e820: [mem 0x00000000d8760000-0x00000000d87fffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000d8800000-0x00000000d8fadfff] usable [ 0.000000] BIOS-e820: [mem 0x00000000d8fae000-0x00000000d8ffffff] ACPI data [ 0.000000] BIOS-e820: [mem 0x00000000d9000000-0x00000000da71bfff] usable [ 0.000000] BIOS-e820: [mem 0x00000000da71c000-0x00000000da7fffff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000da800000-0x00000000dbb8bfff] usable [ 0.000000] BIOS-e820: [mem 0x00000000dbb8c000-0x00000000dbffffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000dd000000-0x00000000df1fffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000f8000000-0x00000000fbffffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fed00000-0x00000000fed03fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000041edfffff] usable Fix this problem by changing pfn limit from max_low_pfn to max_pfn. This fix does not impact 64bit system because on 64bit max_low_pfn is the same as max_pfn. Signed-off-by: NZhimin Gu <kookoo.gu@intel.com> Acked-by: NPavel Machek <pavel@ucw.cz> Signed-off-by: NChen Yu <yu.c.chen@intel.com> Acked-by: NThomas Gleixner <tglx@linutronix.de> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 David Abdurachmanov 提交于
[ Upstream commit 397182e0db56b8894a43631ce72de14d90a29834 ] Noticed while building kernel-4.20.0-0.rc5.git2.1.fc30 for Fedora 30/RISCV. [..] BUILDSTDERR: arch/riscv/kernel/ftrace.c: In function 'prepare_ftrace_return': BUILDSTDERR: arch/riscv/kernel/ftrace.c:135:6: warning: unused variable 'err' [-Wunused-variable] BUILDSTDERR: int err; BUILDSTDERR: ^~~ [..] Signed-off-by: NDavid Abdurachmanov <david.abdurachmanov@gmail.com> Fixes: e949b6db51dc1 ("riscv/function_graph: Simplify with function_graph_enter()") Reviewed-by: NOlof Johansson <olof@lixom.net> Acked-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NPalmer Dabbelt <palmer@sifive.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Dmitry Voytik 提交于
[ Upstream commit 26e2d7b03ea7ff254bf78305aa44dda62e70b78e ] After commit ef05bcb60c1a, boot from USB drives is broken. Fix this problem by enabling usb-host regulators during boot time. Fixes: ef05bcb60c1a ("arm64: dts: rockchip: fix vcc_host1_5v pin assign on rk3328-rock64") Cc: stable@vger.kernel.org Signed-off-by: NDmitry Voytik <voytikd@gmail.com> Signed-off-by: NHeiko Stuebner <heiko@sntech.de> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Christophe Leroy 提交于
[ Upstream commit 9c4e4c90ec24652921e31e9551fcaedc26eec86d ] Otherwise, the following warning is encountered: WARNING: vmlinux.o(.text+0x3dc6): Section mismatch in reference from the variable start_here_multiplatform to the function .init.text:.early_setup() The function start_here_multiplatform() references the function __init .early_setup(). This is often because start_here_multiplatform lacks a __init annotation or the annotation of .early_setup is wrong. Fixes: 56c46bba9bbf ("powerpc/64: Fix booting large kernels with STRICT_KERNEL_RWX") Cc: Russell Currey <ruscur@russell.cc> Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Steven Rostedt (VMware) 提交于
[ Upstream commit 745cfeaac09ce359130a5451d90cb0bd4094c290 ] Arnd reported the following compiler warning: arch/x86/kernel/ftrace.c:669:23: error: 'ftrace_jmp_replace' defined but not used [-Werror=unused-function] The ftrace_jmp_replace() function now only has a single user and should be simply moved by that user. But looking at the code, it shows that ftrace_jmp_replace() is similar to ftrace_call_replace() except that instead of using the opcode of 0xe8 it uses 0xe9. It makes more sense to consolidate that function into one implementation that both ftrace_jmp_replace() and ftrace_call_replace() use by passing in the op code separate. The structure in ftrace_code_union is also modified to replace the "e8" field with the more appropriate name "op". Cc: stable@vger.kernel.org Reported-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NArnd Bergmann <arnd@arndb.de> Link: http://lkml.kernel.org/r/20190304200748.1418790-1-arnd@arndb.de Fixes: d2a68c4effd8 ("x86/ftrace: Do not call function graph from dynamic trampolines") Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Gustavo Romero 提交于
commit 8205d5d98ef7f155de211f5e2eb6ca03d95a5a60 upstream. When we take an FP unavailable exception in a transaction we have to account for the hardware FP TM checkpointed registers being incorrect. In this case for this process we know the current and checkpointed FP registers must be the same (since FP wasn't used inside the transaction) hence in the thread_struct we copy the current FP registers to the checkpointed ones. This copy is done in tm_reclaim_thread(). We use thread->ckpt_regs.msr to determine if FP was on when in userspace. thread->ckpt_regs.msr represents the state of the MSR when exiting userspace. This is setup by check_if_tm_restore_required(). Unfortunatley there is an optimisation in giveup_all() which returns early if tsk->thread.regs->msr (via local variable `usermsr`) has FP=VEC=VSX=SPE=0. This optimisation means that check_if_tm_restore_required() is not called and hence thread->ckpt_regs.msr is not updated and will contain an old value. This can happen if due to load_fp=255 we start a userspace process with MSR FP=1 and then we are context switched out. In this case thread->ckpt_regs.msr will contain FP=1. If that same process is then context switched in and load_fp overflows, MSR will have FP=0. If that process now enters a transaction and does an FP instruction, the FP unavailable will not update thread->ckpt_regs.msr (the bug) and MSR FP=1 will be retained in thread->ckpt_regs.msr. tm_reclaim_thread() will then not perform the required memcpy and the checkpointed FP regs in the thread struct will contain the wrong values. The code path for this happening is: Userspace: Kernel Start userspace with MSR FP/VEC/VSX/SPE=0 TM=1 < ----- ... tbegin bne fp instruction FP unavailable ---- > fp_unavailable_tm() tm_reclaim_current() tm_reclaim_thread() giveup_all() return early since FP/VMX/VSX=0 /* ckpt MSR not updated (Incorrect) */ tm_reclaim() /* thread_struct ckpt FP regs contain junk (OK) */ /* Sees ckpt MSR FP=1 (Incorrect) */ no memcpy() performed /* thread_struct ckpt FP regs not fixed (Incorrect) */ tm_recheckpoint() /* Put junk in hardware checkpoint FP regs */ .... < ----- Return to userspace with MSR TM=1 FP=1 with junk in the FP TM checkpoint TM rollback reads FP junk This is a data integrity problem for the current process as the FP registers are corrupted. It's also a security problem as the FP registers from one process may be leaked to another. This patch moves up check_if_tm_restore_required() in giveup_all() to ensure thread->ckpt_regs.msr is updated correctly. A simple testcase to replicate this will be posted to tools/testing/selftests/powerpc/tm/tm-poison.c Similarly for VMX. This fixes CVE-2019-15030. Fixes: f48e91e8 ("powerpc/tm: Fix FP and VMX register corruption") Cc: stable@vger.kernel.org # 4.12+ Signed-off-by: NGustavo Romero <gromero@linux.vnet.ibm.com> Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190904045529.23002-1-gromero@linux.vnet.ibm.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 10 9月, 2019 3 次提交
-
-
由 Linus Torvalds 提交于
[ Upstream commit 950b07c14e8c59444e2359f15fd70ed5112e11a0 ] This reverts commit 558682b5291937a70748d36fd9ba757fb25b99ae. Chris Wilson reports that it breaks his CPU hotplug test scripts. In particular, it breaks offlining and then re-onlining the boot CPU, which we treat specially (and the BIOS does too). The symptoms are that we can offline the CPU, but it then does not come back online again: smpboot: CPU 0 is now offline smpboot: Booting Node 0 Processor 0 APIC 0x0 smpboot: do_boot_cpu failed(-1) to wakeup CPU#0 Thomas says he knows why it's broken (my personal suspicion: our magic handling of the "cpu0_logical_apicid" thing), but for 5.3 the right fix is to just revert it, since we've never touched the LDR bits before, and it's not worth the risk to do anything else at this stage. [ Hotpluging of the boot CPU is special anyway, and should be off by default. See the "BOOTPARAM_HOTPLUG_CPU0" config option and the cpu0_hotplug kernel parameter. In general you should not do it, and it has various known limitations (hibernate and suspend require the boot CPU, for example). But it should work, even if the boot CPU is special and needs careful treatment - Linus ] Link: https://lore.kernel.org/lkml/156785100521.13300.14461504732265570003@skylake-alporthouse-com/Reported-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NThomas Gleixner <tglx@linutronix.de> Cc: Bandan Das <bsd@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Kirill A. Shutemov 提交于
[ Upstream commit c96e8483cb2da6695c8b8d0896fe7ae272a07b54 ] Gustavo noticed that 'new' can be left uninitialized if 'bios_start' happens to be less or equal to 'entry->addr + entry->size'. Initialize the variable at the begin of the iteration to the current value of 'bios_start'. Fixes: 0a46fff2f910 ("x86/boot/compressed/64: Fix boot on machines with broken E820 table") Reported-by: N"Gustavo A. R. Silva" <gustavo@embeddedor.com> Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190826133326.7cxb4vbmiawffv2r@boxSigned-off-by: NSasha Levin <sashal@kernel.org>
-
由 Kirill A. Shutemov 提交于
[ Upstream commit 0a46fff2f9108c2c44218380a43a736cf4612541 ] BIOS on Samsung 500C Chromebook reports very rudimentary E820 table that consists of 2 entries: BIOS-e820: [mem 0x0000000000000000-0x0000000000000fff] usable BIOS-e820: [mem 0x00000000fffff000-0x00000000ffffffff] reserved It breaks logic in find_trampoline_placement(): bios_start lands on the end of the first 4k page and trampoline start gets placed below 0. Detect underflow and don't touch bios_start for such cases. It makes kernel ignore E820 table on machines that doesn't have two usable pages below BIOS_START_MAX. Fixes: 1b3a6264 ("x86/boot/compressed/64: Validate trampoline placement against E820") Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://bugzilla.kernel.org/show_bug.cgi?id=203463 Link: https://lkml.kernel.org/r/20190813131654.24378-1-kirill.shutemov@linux.intel.comSigned-off-by: NSasha Levin <sashal@kernel.org>
-