- 08 9月, 2016 2 次提交
-
-
由 Suraj Jitindar Singh 提交于
The struct kvmppc_vcore is a structure used to store various information about a virtual core for a kvm guest. The runnable_threads element of the struct provides a list of all of the currently runnable vcpus on the core (those in the KVMPPC_VCPU_RUNNABLE state). The previous implementation of this list was a linked_list. The next patch requires that the list be able to be iterated over without holding the vcore lock. Reimplement the runnable_threads list in the kvmppc_vcore struct as an array. Implement function to iterate over valid entries in the array and update access sites accordingly. Signed-off-by: NSuraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
由 Suraj Jitindar Singh 提交于
The next commit will introduce a member to the kvmppc_vcore struct which references MAX_SMT_THREADS which is defined in kvm_book3s_asm.h, however this file isn't included in kvm_host.h directly. Thus compiling for certain platforms such as pmac32_defconfig and ppc64e_defconfig with KVM fails due to MAX_SMT_THREADS not being defined. Move the struct kvmppc_vcore definition to kvm_book3s.h which explicitly includes kvm_book3s_asm.h. Signed-off-by: NSuraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 25 8月, 2016 1 次提交
-
-
由 Paul Mackerras 提交于
As discussed recently on the kvm mailing list, David Gibson's intention in commit 178a7875 ("vfio: Enable VFIO device for powerpc", 2016-02-01) was to have the KVM VFIO device built in on all powerpc platforms. This patch adds the "select KVM_VFIO" statement that makes this happen. Currently, arch/powerpc/kvm/Makefile doesn't include vfio.o for the 64-bit kvm module, because the list of objects doesn't use the $(common-objs-y) list. The reason it doesn't is because we don't necessarily want coalesced_mmio.o or emulate.o (for example if HV KVM is the only target), and common-objs-y includes both. Since this is confusing, this patch adjusts the definitions so that we now use $(common-objs-y) in the list for the 64-bit kvm.ko module, emulate.o is removed from common-objs-y and added in the places that need it, and the inclusion of coalesced_mmio.o now depends on CONFIG_KVM_MMIO. Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 19 8月, 2016 2 次提交
-
-
由 Paul Mackerras 提交于
It doesn't make sense to create irqfds for a VM that doesn't have in-kernel interrupt controller emulation. There is an existing interface for architecture code to tell the irqfd code whether or not any interrupt controller has been initialized, called kvm_arch_intc_initialized(), so let's implement that for powerpc. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
由 Paul Mackerras 提交于
It turns out that if userspace creates a pseries-type VM without in-kernel XICS (interrupt controller) emulation, and then connects an eventfd to the VM as an irqfd, and the eventfd gets signalled, that the code will try to deliver an interrupt via the non-existent XICS object and crash the host kernel with a NULL pointer dereference. To fix this, we check for the presence of the XICS object before trying to deliver the interrupt, and return with an error if not. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 13 8月, 2016 7 次提交
-
-
由 Guenter Roeck 提交于
h8300 builds fail with arch/h8300/include/asm/io.h:9:15: error: unknown type name ‘u8’ arch/h8300/include/asm/io.h:15:15: error: unknown type name ‘u16’ arch/h8300/include/asm/io.h:21:15: error: unknown type name ‘u32’ and many related errors. Fixes: 23c82d41bdf4 ("kexec-allow-architectures-to-override-boot-mapping-fix") Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
-
由 Guenter Roeck 提交于
unicore32 fails to compile with the following errors. mm/memory.c: In function ‘__handle_mm_fault’: mm/memory.c:3381: error: too many arguments to function ‘arch_vma_access_permitted’ mm/gup.c: In function ‘check_vma_flags’: mm/gup.c:456: error: too many arguments to function ‘arch_vma_access_permitted’ mm/gup.c: In function ‘vma_permits_fault’: mm/gup.c:640: error: too many arguments to function ‘arch_vma_access_permitted’ Fixes: d61172b4 ("mm/core, x86/mm/pkeys: Differentiate instruction fetches") Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: NGuenter Roeck <linux@roeck-us.net> Acked-by: NGuan Xuetao <gxt@mprc.pku.edu.cn>
-
由 Masahiro Yamada 提交于
When CONFIG_LOCALVERSION_AUTO is disabled, the version string is just a tag name (or with a '+' appended if HEAD is not a tagged commit). During the development (and especially when git-bisecting), longer version string would be helpful to identify the commit we are running. This is a default y option, so drop the unset to enable it. Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Riku Voipio 提交于
Enable options commonly needed by popular virtualization and container applications. Use modules when possible to avoid too much overhead for users not interested. - add namespace and cgroup options needed - add seccomp - optional, but enhances Qemu etc - bridge, nat, veth, macvtap and multicast for routing guests and containers - btfrs and overlayfs modules for container COW backends - while near it, make fuse a module instead of built-in. Generated with make saveconfig and dropping unrelated spurious change hunks while commiting. bloat-o-meter old-vmlinux vmlinux: add/remove: 905/390 grow/shrink: 767/229 up/down: 183513/-94861 (88652) .... Total: Before=10515408, After=10604060, chg +0.84% Signed-off-by: NRiku Voipio <riku.voipio@linaro.org> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Mark Rutland 提交于
In create_safe_exec_page(), we create a copy of the hibernate exit text, along with some page tables to map this via TTBR0. We then install the new tables in TTBR0. In swsusp_arch_resume() we call create_safe_exec_page() before trying a number of operations which may fail (e.g. copying the linear map page tables). If these fail, we bail out of swsusp_arch_resume() and return an error code, but leave TTBR0 as-is. Subsequently, the core hibernate code will call free_basic_memory_bitmaps(), which will free all of the memory allocations we made, including the page tables installed in TTBR0. Thus, we may have TTBR0 pointing at dangling freed memory for some period of time. If the hibernate attempt was triggered by a user requesting a hibernate test via the reboot syscall, we may return to userspace with the clobbered TTBR0 value. Avoid these issues by reorganising swsusp_arch_resume() such that we have no failure paths after create_safe_exec_page(). We also add a check that the zero page allocation succeeded, matching what we have for other allocations. Fixes: 82869ac5 ("arm64: kernel: Add support for hibernate/suspend-to-disk") Signed-off-by: NMark Rutland <mark.rutland@arm.com> Acked-by: NJames Morse <james.morse@arm.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> # 4.7+ Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Mark Rutland 提交于
In create_safe_exec_page we install a set of global mappings in TTBR0, then subsequently invalidate TLBs. While TTBR0 points at the zero page, and the TLBs should be free of stale global entries, we may have stale ASID-tagged entries (e.g. from the EFI runtime services mappings) for the same VAs. Per the ARM ARM these ASID-tagged entries may conflict with newly-allocated global entries, and we must follow a Break-Before-Make approach to avoid issues resulting from this. This patch reworks create_safe_exec_page to invalidate TLBs while the zero page is still in place, ensuring that there are no potential conflicts when the new TTBR0 value is installed. As a single CPU is online while this code executes, we do not need to perform broadcast TLB maintenance, and can call local_flush_tlb_all(), which also subsumes some barriers. The remaining assembly is converted to use write_sysreg() and isb(). Other than this, we safely manipulate TTBRs in the hibernate dance. The code we install as part of the new TTBR0 mapping (the hibernated kernel's swsusp_arch_suspend_exit) installs a zero page into TTBR1, invalidates TLBs, then installs its preferred value. Upon being restored to the middle of swsusp_arch_suspend, the new image will call __cpu_suspend_exit, which will call cpu_uninstall_idmap, installing the zero page in TTBR0 and invalidating all TLB entries. Fixes: 82869ac5 ("arm64: kernel: Add support for hibernate/suspend-to-disk") Signed-off-by: NMark Rutland <mark.rutland@arm.com> Acked-by: NJames Morse <james.morse@arm.com> Tested-by: NJames Morse <james.morse@arm.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> # 4.7+ Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Laura Abbott 提交于
Executing from a non-executable area gives an ugly message: lkdtm: Performing direct entry EXEC_RODATA lkdtm: attempting ok execution at ffff0000084c0e08 lkdtm: attempting bad execution at ffff000008880700 Bad mode in Synchronous Abort handler detected on CPU2, code 0x8400000e -- IABT (current EL) CPU: 2 PID: 998 Comm: sh Not tainted 4.7.0-rc2+ #13 Hardware name: linux,dummy-virt (DT) task: ffff800077e35780 ti: ffff800077970000 task.ti: ffff800077970000 PC is at lkdtm_rodata_do_nothing+0x0/0x8 LR is at execute_location+0x74/0x88 The 'IABT (current EL)' indicates the error but it's a bit cryptic without knowledge of the ARM ARM. There is also no indication of the specific address which triggered the fault. The increase in kernel page permissions makes hitting this case more likely as well. Handling the case in the vectors gives a much more familiar looking error message: lkdtm: Performing direct entry EXEC_RODATA lkdtm: attempting ok execution at ffff0000084c0840 lkdtm: attempting bad execution at ffff000008880680 Unable to handle kernel paging request at virtual address ffff000008880680 pgd = ffff8000089b2000 [ffff000008880680] *pgd=00000000489b4003, *pud=0000000048904003, *pmd=0000000000000000 Internal error: Oops: 8400000e [#1] PREEMPT SMP Modules linked in: CPU: 1 PID: 997 Comm: sh Not tainted 4.7.0-rc1+ #24 Hardware name: linux,dummy-virt (DT) task: ffff800077f9f080 ti: ffff800008a1c000 task.ti: ffff800008a1c000 PC is at lkdtm_rodata_do_nothing+0x0/0x8 LR is at execute_location+0x74/0x88 Acked-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NLaura Abbott <labbott@redhat.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 12 8月, 2016 12 次提交
-
-
由 James Hogan 提交于
Propagate errors from kvm_mips_handle_kseg0_tlb_fault() and kvm_mips_handle_mapped_seg_tlb_fault(), usually triggering an internal error since they normally indicate the guest accessed bad physical memory or the commpage in an unexpected way. Fixes: 858dd5d4 ("KVM/MIPS32: MMU/TLB operations for the Guest.") Fixes: e685c689 ("KVM/MIPS32: Privileged instruction/target branch emulation.") Signed-off-by: NJames Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: <stable@vger.kernel.org> # 3.10.x- Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 James Hogan 提交于
Two consecutive gfns are loaded into host TLB, so ensure the range check isn't off by one if guest_pmap_npages is odd. Fixes: 858dd5d4 ("KVM/MIPS32: MMU/TLB operations for the Guest.") Signed-off-by: NJames Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: <stable@vger.kernel.org> # 3.10.x- Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 James Hogan 提交于
kvm_mips_handle_mapped_seg_tlb_fault() calculates the guest frame number based on the guest TLB EntryLo values, however it is not range checked to ensure it lies within the guest_pmap. If the physical memory the guest refers to is out of range then dump the guest TLB and emit an internal error. Fixes: 858dd5d4 ("KVM/MIPS32: MMU/TLB operations for the Guest.") Signed-off-by: NJames Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: <stable@vger.kernel.org> # 3.10.x- Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 James Hogan 提交于
kvm_mips_handle_mapped_seg_tlb_fault() appears to map the guest page at virtual address 0 to PFN 0 if the guest has created its own mapping there. The intention is unclear, but it may have been an attempt to protect the zero page from being mapped to anything but the comm page in code paths you wouldn't expect from genuine commpage accesses (guest kernel mode cache instructions on that address, hitting trapping instructions when executing from that address with a coincidental TLB eviction during the KVM handling, and guest user mode accesses to that address). Fix this to check for mappings exactly at KVM_GUEST_COMMPAGE_ADDR (it may not be at address 0 since commit 42aa12e7 ("MIPS: KVM: Move commpage so 0x0 is unmapped")), and set the corresponding EntryLo to be interpreted as 0 (invalid). Fixes: 858dd5d4 ("KVM/MIPS32: MMU/TLB operations for the Guest.") Signed-off-by: NJames Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: <stable@vger.kernel.org> # 3.10.x- Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Christoffer Dall 提交于
KVM devices were manipulating list data structures without any form of synchronization, and some implementations of the create operations also suffered from a lack of synchronization. Now when we've split the xics create operation into create and init, we can hold the kvm->lock mutex while calling the create operation and when manipulating the devices list. The error path in the generic code gets slightly ugly because we have to take the mutex again and delete the device from the list, but holding the mutex during anon_inode_getfd or releasing/locking the mutex in the common non-error path seemed wrong. Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Christoffer Dall 提交于
As we are about to hold the kvm->lock during the create operation on KVM devices, we should move the call to xics_debugfs_init into its own function, since holding a mutex over extended amounts of time might not be a good idea. Introduce an init operation on the kvm_device_ops struct which cannot fail and call this, if configured, after the device has been created. Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Julius Niedworok 提交于
When triggering KVM_RUN without a user memory region being mapped (KVM_SET_USER_MEMORY_REGION) a validity intercept occurs. This could happen, if the user memory region was not mapped initially or if it was unmapped after the vcpu is initialized. The function kvm_s390_handle_requests checks for the KVM_REQ_MMU_RELOAD bit. The check function always clears this bit. If gmap_mprotect_notify returns an error code, the mapping failed, but the KVM_REQ_MMU_RELOAD was not set anymore. So the next time kvm_s390_handle_requests is called, the execution would fall trough the check for KVM_REQ_MMU_RELOAD. The bit needs to be resetted, if gmap_mprotect_notify returns an error code. Resetting the bit with kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu) fixes the bug. Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: NJulius Niedworok <jniedwor@linux.vnet.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 Julius Niedworok 提交于
When KVM_RUN is triggered on a VCPU without an initial reset, a validity intercept occurs. Setting the prefix will set the KVM_REQ_MMU_RELOAD bit initially, thus preventing the bug. Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com> Acked-by: NCornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: NJulius Niedworok <jniedwor@linux.vnet.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 Kan Liang 提交于
There are bug reports about miscounting uncore counters on some client machines like Sandybridge, Broadwell and Skylake. It is very likely to be observed on idle systems. This issue is caused by a hardware issue. PERF_GLOBAL_CTL could be cleared after Package C7, and nothing will be count. The related errata (HSD 158) could be found in: www.intel.com/content/dam/www/public/us/en/documents/specification-updates/4th-gen-core-family-desktop-specification-update.pdf This patch tries to work around this issue by re-enabling PERF_GLOBAL_CTL in ->enable_box(). The workaround does not cover all cases. It helps for new events after returning from C7. But it cannot prevent C7, it will still miscount if a counter is already active. There is no drawback in leaving it enabled, so it does not need disable_box() here. Signed-off-by: NKan Liang <kan.liang@intel.com> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1470925874-59943-1-git-send-email-kan.liang@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Kan Liang 提交于
Some uncore boxes' num_counters value for Haswell server and Broadwell server are not correct (too large, off by one). This issue was found by comparing the code with the document. Although there is no bug report from users yet, accessing non-existent counters is dangerous and the behavior is undefined: it may cause miscounting or even crashes. This patch makes them consistent with the uncore document. Reported-by: NLukasz Odzioba <lukasz.odzioba@intel.com> Signed-off-by: NKan Liang <kan.liang@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/1470925820-59847-1-git-send-email-kan.liang@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Denys Vlasenko 提交于
Since instruction decoder now supports EVEX-encoded instructions, two fixes are needed to correctly handle them in uprobes. Extended bits for MODRM.rm field need to be sanitized just like we do it for VEX3, to avoid encoding wrong register for register-relative access. EVEX has _two_ extended bits: b and x. Theoretically, EVEX.x should be ignored by the CPU (since GPRs go only up to 15, not 31), but let's be paranoid here: proper encoding for register-relative access should have EVEX.x = 1. Secondly, we should fetch vex.vvvv for EVEX too. This is now super easy because instruction decoder populates vex_prefix.bytes[2] for all flavors of (e)vex encodings, even for VEX2. Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Cc: <stable@vger.kernel.org> # v4.1+ Fixes: 8a764a87 ("x86/asm/decoder: Create artificial 3rd byte for 2-byte VEX") Link: http://lkml.kernel.org/r/20160811154521.20469-1-dvlasenk@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 David A. Long 提交于
Because the arm64 calling standard allows stacked function arguments to be anywhere in the stack frame, do not attempt to duplicate the stack frame for jprobes handler functions. Documentation changes to describe this issue have been broken out into a separate patch in order to simultaneously address them in other architecture(s). Signed-off-by: NDavid A. Long <dave.long@linaro.org> Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Acked-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 11 8月, 2016 16 次提交
-
-
I made a mistake while converting the driver to the hotplug state machine and as a result x2apic_cluster_probe() was accessing cpus_in_cluster before allocating it. This patch fixes it by setting the cpumask after the allocation the memory succeeded. While at it, I marked two functions static which are only used within this file. Reported-by: NLaura Abbott <labbott@redhat.com> Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 6b2c2847 ("x86/x2apic: Convert to CPU hotplug state machine") Link: http://lkml.kernel.org/r/1470924515-9444-1-git-send-email-bigeasy@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Alex Thorlton 提交于
This problem has actually been in the UV code for a while, but we didn't catch it until recently, because we had been relying on EFI_OLD_MEMMAP to allow our systems to boot for a period of time. We noticed the issue when trying to kexec a recent community kernel, where we hit this NULL pointer dereference in efi_sync_low_kernel_mappings(): [ 0.337515] BUG: unable to handle kernel NULL pointer dereference at 0000000000000880 [ 0.346276] IP: [<ffffffff8105df8d>] efi_sync_low_kernel_mappings+0x5d/0x1b0 The problem doesn't show up with EFI_OLD_MEMMAP because we skip the chunk of setup_efi_state() that sets the efi_loader_signature for the kexec'd kernel. When the kexec'd kernel boots, it won't set EFI_BOOT in setup_arch, so we completely avoid the bug. We always kexec with noefi on the command line, so this shouldn't be an issue, but since we're not actually checking for efi_runtime_disabled in uv_bios_init(), we end up trying to do EFI runtime callbacks when we shouldn't be. This patch just adds a check for efi_runtime_disabled in uv_bios_init() so that we don't map in uv_systab when runtime_disabled == true. Signed-off-by: NAlex Thorlton <athorlton@sgi.com> Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk> Cc: <stable@vger.kernel.org> # v4.7 Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Travis <travis@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russ Anderson <rja@sgi.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1470912120-22831-2-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
On my Dell XPS 13 9350 with firmware 1.4.4 and SGX on, if I boot Fedora 24's grub2-efi off a hard disk, my first 1MB of RAM looks like: efi: mem00: [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC] range=[0x0000000000000000-0x0000000000000fff] (0MB) efi: mem01: [Boot Data | | | | | | | | |WB|WT|WC|UC] range=[0x0000000000001000-0x0000000000027fff] (0MB) efi: mem02: [Loader Data | | | | | | | | |WB|WT|WC|UC] range=[0x0000000000028000-0x0000000000029fff] (0MB) efi: mem03: [Reserved | | | | | | | | |WB|WT|WC|UC] range=[0x000000000002a000-0x000000000002bfff] (0MB) efi: mem04: [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000000002c000-0x000000000002cfff] (0MB) efi: mem05: [Loader Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000000002d000-0x000000000002dfff] (0MB) efi: mem06: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x000000000002e000-0x0000000000057fff] (0MB) efi: mem07: [Reserved | | | | | | | | |WB|WT|WC|UC] range=[0x0000000000058000-0x0000000000058fff] (0MB) efi: mem08: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x0000000000059000-0x000000000009ffff] (0MB) My EBDA is at 0x2c000, which blocks off everything from 0x2c000 and up, and my trampoline is 0x6000 bytes (6 pages), so it doesn't fit in the loader data range at 0x28000. Without this patch, it panics due to a failure to allocate the trampoline. With this patch, it works: [ +0.001744] Base memory trampoline at [ffff880000001000] 1000 size 24576 Signed-off-by: NAndy Lutomirski <luto@kernel.org> Reviewed-by: NMatt Fleming <matt@codeblueprint.co.uk> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mario Limonciello <mario_limonciello@dell.com> Cc: Matt Fleming <mfleming@suse.de> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/998c77b3bf709f3dfed85cb30701ed1a5d8a438b.1470821230.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
If reserve_real_mode() fails, panicing immediately means we're doomed. Make it safe to try more than once to allocate the trampoline: - Degrade a failure from panic() to pr_info(). (If we make it to setup_real_mode() without reserving the trampoline, we'll panic them.) - Factor out helpers so that platform code can supply a specific address to try. - Warn if reserve_real_mode() is called after we're done with the memblock allocator. If that were to happen, we would behave unpredictably. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mario Limonciello <mario_limonciello@dell.com> Cc: Matt Fleming <mfleming@suse.de> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/876e383038f3e9971aa72fd20a4f5da05f9d193d.1470821230.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
There's no need to run setup_real_mode() as early as we run it. Defer it to the same early_initcall that sets up the page permissions for the real mode code. This should be a code size reduction. More importantly, it give us a longer window in which we can allocate the real mode trampoline. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mario Limonciello <mario_limonciello@dell.com> Cc: Matt Fleming <mfleming@suse.de> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/fd62f0da4f79357695e9bf3e365623736b05f119.1470821230.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
The initialization process for trampoline_cr4_features and mmu_cr4_features was confusing. The intent is for mmu_cr4_features and *trampoline_cr4_features to stay in sync, but trampoline_cr4_features is NULL until setup_real_mode() runs. The old code synchronized *trampoline_cr4_features *twice*, once in setup_real_mode() and once in setup_arch(). It also initialized mmu_cr4_features in setup_real_mode(), which causes the actual value of mmu_cr4_features to potentially depend on when setup_real_mode() is called. With this patch, mmu_cr4_features is initialized directly in setup_arch(), and *trampoline_cr4_features is synchronized to mmu_cr4_features when the trampoline is set up. After this patch, it should be safe to defer setup_real_mode(). Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mario Limonciello <mario_limonciello@dell.com> Cc: Matt Fleming <mfleming@suse.de> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/d48a263f9912389b957dd495a7127b009259ffe0.1470821230.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
reserve_bios_regions() is a quirk that reserves memory that we might otherwise think is available. There's no need to run it so early, and running it before we have the memory map initialized with its non-quirky inputs makes it hard to make reserve_bios_regions() more intelligent. Move it right after we populate the memblock state. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mario Limonciello <mario_limonciello@dell.com> Cc: Matt Fleming <mfleming@suse.de> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/59f58618911005c799c6c9979ce6ae4881d907c2.1470821230.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Aaron Lu 提交于
Since commit: 52aec330 ("x86/tlb: replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR") the TLB remote shootdown is done through call function vector. That commit didn't take care of irq_tlb_count, which a later commit: fd0f5869 ("x86: Distinguish TLB shootdown interrupts from other functions call interrupts") ... tried to fix. The fix assumes every increase of irq_tlb_count has a corresponding increase of irq_call_count. So the irq_call_count is always bigger than irq_tlb_count and we could substract irq_tlb_count from irq_call_count. Unfortunately this is not true for the smp_call_function_single() case. The IPI is only sent if the target CPU's call_single_queue is empty when adding a csd into it in generic_exec_single. That means if two threads are both adding flush tlb csds to the same CPU's call_single_queue, only one IPI is sent. In other words, the irq_call_count is incremented by 1 but irq_tlb_count is incremented by 2. Over time, irq_tlb_count will be bigger than irq_call_count and the substract will produce a very large irq_call_count value due to overflow. Considering that: 1) it's not worth to send more IPIs for the sake of accurate counting of irq_call_count in generic_exec_single(); 2) it's not easy to tell if the call function interrupt is for TLB shootdown in __smp_call_function_single_interrupt(). Not to exclude TLB shootdown from call function count seems to be the simplest fix and this patch just does that. This bug was found by LKP's cyclic performance regression tracking recently with the vm-scalability test suite. I have bisected to commit: 3dec0ba0 ("mm/rmap: share the i_mmap_rwsem") This commit didn't do anything wrong but revealed the irq_call_count problem. IIUC, the commit makes rwc->remap_one in rmap_walk_file concurrent with multiple threads. When remap_one is try_to_unmap_one(), then multiple threads could queue flush TLB to the same CPU but only one IPI will be sent. Since the commit was added in Linux v3.19, the counting problem only shows up from v3.19 onwards. Signed-off-by: NAaron Lu <aaron.lu@intel.com> Cc: Alex Shi <alex.shi@linaro.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tomoki Sekiyama <tomoki.sekiyama.qu@hitachi.com> Link: http://lkml.kernel.org/r/20160811074430.GA18163@aaronlu.sh.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Dave Hansen 提交于
A recent patch changed the format of a swap PTE. The comment explaining the format of the swap PTE is wrong about the bits used for the swap type field. Amusingly, the ASCII art and the patch description are correct, but the comment itself is wrong. As I was looking at this, I also noticed that the SWP_OFFSET_FIRST_BIT has an off-by-one error. This does not really hurt anything. It just wasted a bit of space in the PTE, giving us 2^59 bytes of addressable space in our swapfiles instead of 2^60. But, it doesn't match with the comments, and it wastes a bit of space, so fix it. Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Fixes: 00839ee3 ("x86/mm: Move swap offset/type up in PTE to work around erratum") Link: http://lkml.kernel.org/r/20160810172325.E56AD7DA@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Nicolas Iooss 提交于
debug_putstr() is used to output strings without using printf-like formatting but debug_putstr(v) is defined as early_printk(v) in arch/x86/lib/kaslr.c. This makes clang reports the following warning when building with -Wformat-security: arch/x86/lib/kaslr.c:57:15: warning: format string is not a string literal (potentially insecure) [-Wformat-security] debug_putstr(purpose); ^~~~~~~ Fix this by using "%s" in early_printk(). Signed-off-by: NNicolas Iooss <nicolas.iooss_linux@m4x.org> Acked-by: NKees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160806102039.27221-1-nicolas.iooss_linux@m4x.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
https://github.com/rjarzmik/linux由 Arnd Bergmann 提交于
This is the pxa changes for v4.8 cycle. This is a tiny fix couple to enable changes in includes in gpio API without breaking pxa boards. * tag 'pxa-fixes-v4.8' of https://github.com/rjarzmik/linux: ARM: pxa: add module.h for corgi symbol_get/symbol_put usage ARM: pxa: add module.h for spitz symbol_get/symbol_put usage
-
由 Andrew Morton 提交于
Revert commit 51d5d12b ("ARM: keystone: dts: add psci command definition"), which was inadvertently added twice. Cc: Russell King - ARM Linux <linux@armlinux.org.uk> Cc: Vitaly Andrianov <vitalya@ti.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Sudeep Holla 提交于
Even when PCI is disabled, ARCH_HISI selects HISILICON_IRQ_MBIGEN triggerring the following config warning: warning: (ARM64 && HISILICON_IRQ_MBIGEN) selects ARM_GIC_V3_ITS which has unmet direct dependencies (PCI && PCI_MSI) This patch makes selection of HISILICON_IRQ_MBIGEN conditional on PCI. Cc: Ma Jun <majun258@huawei.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: NSudeep Holla <sudeep.holla@arm.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Sudeep Holla 提交于
Even when PCI is disabled, ARCH_ALPINE selects ALPINE_MSI triggerring the following config warning: warning: (ARCH_ALPINE) selects ALPINE_MSI which has unmet direct dependencies (PCI) This patch makes selection of ALPINE_MSI conditional on PCI. Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: NAntoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: NSudeep Holla <sudeep.holla@arm.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Robin Murphy 提交于
Clearly QEMU is very permissive in how its PL310 model may be set up, but the real hardware turns out to be far more particular about things actually being correct. Fix up the DT description so that the real thing actually boots: - The arm,data-latency and arm,tag-latency properties need 3 cells to be valid, otherwise we end up retaining the default 8-cycle latencies which leads pretty quickly to lockup. - The arm,dirty-latency property is only relevant to L210/L220, so get rid of it. - The cache geometry override also leads to lockup and/or general misbehaviour. Irritatingly, the manual doesn't state the actual PL310 configuration, but based on the boardfile code and poking registers from the Boot Monitor, it would seem to be 8 sets of 16KB ways. With that, we can successfully boot to enjoy the fun of mismatched FPUs... Cc: stable@vger.kernel.org Signed-off-by: NRobin Murphy <robin.murphy@arm.com> Tested-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NLinus Walleij <linus.walleij@linaro.org> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Ralf Ramsauer 提交于
c90bb7b9 enabled the high speed UARTs of the Jetson TK1. Due to a merge quirk, wrong addresses were introduced. Fix it and use the correct addresses. Thierry let me know, that there is another patch (b5896f67 in linux-next) in preparation which removes all the '0,' prefixes of unit addresses on Tegra124 and is planned to go upstream in 4.8, so this patch will get reverted then. But for the moment, this patch is necessary to fix current misbehaviour. Fixes: c90bb7b9 ("ARM: tegra: Add high speed UARTs to Jetson TK1 device tree") Signed-off-by: NRalf Ramsauer <ralf@ramses-pyramidenbau.de> Acked-by: NThierry Reding <thierry.reding@gmail.com> Cc: stable@vger.kernel.org # v4.7 Cc: linux-tegra@vger.kernel.org Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-