- 05 8月, 2017 2 次提交
-
-
由 Allen Pais 提交于
Acked-by: NSam Ravnborg <sam@ravnborg.org> Acked-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NAllen Pais <allen.pais@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Mikael Pettersson reported that some test programs in the strace-4.18 testsuite cause an OOPS. After some debugging it turns out that garbage values are returned when an exception occurs, causing the fixup memset() to be run with bogus arguments. The problem is that two of the exception handler stubs write the successfully copied length into the wrong register. Fixes: ee841d0a ("sparc64: Convert U3copy_{from,to}_user to accurate exception reporting.") Reported-by: NMikael Pettersson <mikpelinux@gmail.com> Tested-by: NMikael Pettersson <mikpelinux@gmail.com> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 8月, 2017 2 次提交
-
-
由 Nicholas Piggin 提交于
If the decrementer wraps again and de-asserts the decrementer exception while hard-disabled, __check_irq_replay() has a test to notice the wrap when interrupts are re-enabled. The decrementer check must be done when clearing the PACA_IRQ_HARD_DIS flag, not when the PACA_IRQ_DEC flag is tested. Previously this worked because the decrementer interrupt was always the first one checked after clearing the hard disable flag, but HMI check was moved ahead of that, which introduced this bug. This can cause a missed decrementer interrupt if we soft-disable interrupts then take an HMI which is recorded in irq_happened, then hard-disable interrupts for > 4s to wrap the decrementer. Fixes: e0e0d6b7 ("powerpc/64: Replay hypervisor maintenance interrupt first") Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Nicholas Piggin 提交于
POWER9 DD2 PMU can stop after a state-loss idle in some conditions. A solution is to set then clear MMCRA[60] after wake from state-loss idle. MMCRA[60] is a non-architected bit, see the user manual for details. Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Acked-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com> Reviewed-by: NVaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Acked-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 02 8月, 2017 1 次提交
-
-
由 Sergei Shtylyov 提交于
of_irq_to_resource() has recently been fixed to return negative error #'s along with 0 in case of failure, however the Freescale MPC832x RDB board code still only regards 0 as a failure indication -- fix it up. Fixes: 7a4228bb ("of: irq: use of_irq_get() in of_irq_to_resource()") Signed-off-by: NSergei Shtylyov <sergei.shtylyov@cogentembedded.com> Acked-by: NScott Wood <oss@buserror.net> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 31 7月, 2017 4 次提交
-
-
由 Babu Moger 提交于
While working on enabling queued rwlock on SPARC, found this following code in include/asm-generic/qrwlock.h which uses CONFIG_CPU_BIG_ENDIAN to clear a byte. static inline u8 *__qrwlock_write_byte(struct qrwlock *lock) { return (u8 *)lock + 3 * IS_BUILTIN(CONFIG_CPU_BIG_ENDIAN); } Problem is many of the fixed big endian architectures don't define CPU_BIG_ENDIAN and clears the wrong byte. Define CPU_BIG_ENDIAN for parisc architecture to fix it. Signed-off-by: NBabu Moger <babu.moger@oracle.com> Signed-off-by: NHelge Deller <deller@gmx.de>
-
由 Nicholas Piggin 提交于
The watchdog soft-NMI exception stack setup loads a stack pointer twice, which is an obvious error. It ends up using the system reset interrupt (true-NMI) stack, which is also a bug because the watchdog could be preempted by a system reset interrupt that overwrites the NMI stack. Change the soft-NMI to use the "emergency stack". The current kernel stack is not used, because of the longer-term goal to prevent asynchronous stack access using soft-disable. Fixes: 2104180a ("powerpc/64s: implement arch-specific hardlockup watchdog") Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Helge Deller 提交于
Since kernel 4.11 the thread and irq stacks on parisc randomly overflow the default size of 16k. The reason why stack usage suddenly grew is yet unknown. Signed-off-by: NHelge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # 4.11+ Signed-off-by: NHelge Deller <deller@gmx.de>
-
由 John David Anglin 提交于
In testing James' patch to drivers/parisc/pdc_stable.c, I hit the BUG statement in flush_cache_range() during a system shutdown: kernel BUG at arch/parisc/kernel/cache.c:595! CPU: 2 PID: 6532 Comm: kworker/2:0 Not tainted 4.13.0-rc2+ #1 Workqueue: events free_ioctx IAOQ[0]: flush_cache_range+0x144/0x148 IAOQ[1]: flush_cache_page+0x0/0x1a8 RP(r2): flush_cache_range+0xec/0x148 Backtrace: [<00000000402910ac>] unmap_page_range+0x84/0x880 [<00000000402918f4>] unmap_single_vma+0x4c/0x60 [<0000000040291a18>] zap_page_range_single+0x110/0x160 [<0000000040291c34>] unmap_mapping_range+0x174/0x1a8 [<000000004026ccd8>] truncate_pagecache+0x50/0xa8 [<000000004026cd84>] truncate_setsize+0x54/0x70 [<000000004033d534>] put_aio_ring_file+0x44/0xb0 [<000000004033d5d8>] aio_free_ring+0x38/0x140 [<000000004033d714>] free_ioctx+0x34/0xa8 [<00000000401b0028>] process_one_work+0x1b8/0x4d0 [<00000000401b04f4>] worker_thread+0x1b4/0x648 [<00000000401b9128>] kthread+0x1b0/0x208 [<0000000040150020>] end_fault_vector+0x20/0x28 [<0000000040639518>] nf_ip_reroute+0x50/0xa8 [<0000000040638ed0>] nf_ip_route+0x10/0x78 [<0000000040638c90>] xfrm4_mode_tunnel_input+0x180/0x1f8 CPU: 2 PID: 6532 Comm: kworker/2:0 Not tainted 4.13.0-rc2+ #1 Workqueue: events free_ioctx Backtrace: [<0000000040163bf0>] show_stack+0x20/0x38 [<0000000040688480>] dump_stack+0xa8/0x120 [<0000000040163dc4>] die_if_kernel+0x19c/0x2b0 [<0000000040164d0c>] handle_interruption+0xa24/0xa48 This patch modifies flush_cache_range() to handle non current contexts. In as much as this occurs infrequently, the simplest approach is to flush the entire cache when this happens. Signed-off-by: NJohn David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org # 4.9+ Signed-off-by: NHelge Deller <deller@gmx.de>
-
- 30 7月, 2017 1 次提交
-
-
由 Rafael J. Wysocki 提交于
After commit f8475cef "x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF" the scaling_cur_freq policy attribute in sysfs only behaves as expected on x86 with APERF/MPERF registers available when it is read from at least twice in a row. The value returned by the first read may not be meaningful, because the computations in there use cached values from the previous iteration of aperfmperf_snapshot_khz() which may be stale. To prevent that from happening, modify arch_freq_get_on_cpu() to call aperfmperf_snapshot_khz() twice, with a short delay between these calls, if the previous invocation of aperfmperf_snapshot_khz() was too far back in the past (specifically, more that 1s ago). Also, as pointed out by Doug Smythies, aperf_delta is limited now and the multiplication of it by cpu_khz won't overflow, so simplify the s->khz computations too. Fixes: f8475cef "x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF" Reported-by: NDoug Smythies <dsmythies@telus.net> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 28 7月, 2017 7 次提交
-
-
由 Alistair Popple 提交于
Commit 8e3f1b1d ("powerpc/powernv/pci: Enable 64-bit devices to access >4GB DMA space") introduced the ability for PCI device drivers to request a DMA mask between 64 and 32 bits and actually get a mask greater than 32-bits. However currently if certain machine configuration dependent conditions are not meet the code silently falls back to a 32-bit mask. This makes it hard for device drivers to detect which mask they actually got. Instead we should return an error when the request could not be fulfilled which allows drivers to either fallback or implement other workarounds as documented in DMA-API-HOWTO.txt. Signed-off-by: NAlistair Popple <alistair@popple.id.au> Acked-by: NRussell Currey <ruscur@russell.cc> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
Historically the boot wrapper was always built 32-bit big endian, even for 64-bit kernels. That was because old firmwares didn't necessarily support booting a 64-bit image. Because of that arch/powerpc/boot/Makefile uses CROSS32CC for compilation. However when we added 64-bit little endian support, we also added support for building the boot wrapper 64-bit. However we kept using CROSS32CC, because in most cases it is just CC and everything works. However if the user doesn't specify CROSS32_COMPILE (which no one ever does AFAIK), and CC is *not* biarch (32/64-bit capable), then CROSS32CC becomes just "gcc". On native systems that is probably OK, but if we're cross building it definitely isn't, leading to eg: gcc ... -m64 -mlittle-endian -mabi=elfv2 ... arch/powerpc/boot/cpm-serial.c gcc: error: unrecognized argument in option ‘-mabi=elfv2’ gcc: error: unrecognized command line option ‘-mlittle-endian’ make: *** [zImage] Error 2 To fix it, stop using CROSS32CC, because we may or may not be building 32-bit. Instead setup a BOOTCC, which defaults to CC, and only use CROSS32_COMPILE if it's set and we're building for 32-bit. Fixes: 147c0516 ("powerpc/boot: Add support for 64bit little endian wrapper") Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Reviewed-by: NCyril Bur <cyrilbur@gmail.com>
-
由 Michael Ellerman 提交于
In smp_cpus_done() we need to call smp_ops->setup_cpu() for the boot CPU, which means it has to run *on* the boot CPU. In the past we ensured it ran on the boot CPU by changing the CPU affinity mask of current directly. That was removed in commit 6d11b87d ("powerpc/smp: Replace open coded task affinity logic"), and replaced with a work queue call. Unfortunately using a work queue leads to a lockdep warning, now that the CPU hotplug lock is a regular semaphore: ====================================================== WARNING: possible circular locking dependency detected ... kworker/0:1/971 is trying to acquire lock: (cpu_hotplug_lock.rw_sem){++++++}, at: [<c000000000100974>] apply_workqueue_attrs+0x34/0xa0 but task is already holding lock: ((&wfc.work)){+.+.+.}, at: [<c0000000000fdb2c>] process_one_work+0x25c/0x800 ... CPU0 CPU1 ---- ---- lock((&wfc.work)); lock(cpu_hotplug_lock.rw_sem); lock((&wfc.work)); lock(cpu_hotplug_lock.rw_sem); Although the deadlock can't happen in practice, because smp_cpus_done() only runs in early boot before CPU hotplug is allowed, lockdep can't tell that. Luckily in commit 8fb12156 ("init: Pin init task to the boot CPU, initially") tglx changed the generic code to pin init to the boot CPU to begin with. The unpinning of init from the boot CPU happens in sched_init_smp(), which is called after smp_cpus_done(). So smp_cpus_done() is always called on the boot CPU, which means we don't need the work queue call at all - and the lockdep warning goes away. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Will Deacon 提交于
The vast majority of virtual allocations in the vmalloc region are followed by a guard page, which can help to avoid overruning on vma into another, which may map a read-sensitive device. This patch adds a guard page to the end of the kernel image mapping (i.e. following the data/bss segments). Cc: Mark Rutland <mark.rutland@arm.com> Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Matthias Kaehlcke 提交于
The clang warning 'address-of-packed-member' is disabled for the general kernel code, also disable it for the x86 boot code. This suppresses a bunch of warnings like this when building with clang: ./arch/x86/include/asm/processor.h:535:30: warning: taking address of packed member 'sp0' of class or structure 'x86_hw_tss' may result in an unaligned pointer value [-Waddress-of-packed-member] return this_cpu_read_stable(cpu_tss.x86_tss.sp0); ^~~~~~~~~~~~~~~~~~~ ./arch/x86/include/asm/percpu.h:391:59: note: expanded from macro 'this_cpu_read_stable' #define this_cpu_read_stable(var) percpu_stable_op("mov", var) ^~~ ./arch/x86/include/asm/percpu.h:228:16: note: expanded from macro 'percpu_stable_op' : "p" (&(var))); ^~~ Signed-off-by: NMatthias Kaehlcke <mka@chromium.org> Cc: Doug Anderson <dianders@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170725215053.135586-1-mka@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Gustavo Romero 提交于
Currently flush_tmregs_to_thread() does not save the TM SPRs (TFHAR, TFIAR, TEXASR) to the thread struct, unless the process is currently inside a suspended transaction. If the process is core dumping, and the TM SPRs have changed since the last time the process was context switched, then we will save stale values of the TM SPRs to the core dump. Fix it by saving the live register state to the thread struct in that case. Fixes: 08e1c01d ("powerpc/ptrace: Enable support for TM SPR state") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: NGustavo Romero <gromero@linux.vnet.ibm.com> Reviewed-by: NCyril Bur <cyrilbur@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Oliver O'Halloran 提交于
The Radix MMU translation tree as defined in ISA v3.0 contains two different types of entry, directories and leaves. Leaves are identified by _PAGE_PTE being set. The formats of the two entries are different, with the directory entries containing no spare bits for use by software. In particular the bit we use for _PAGE_DEVMAP is not reserved for software, and is part of the NLB (Next Level Base) field, essentially the address of the next level in the tree. Note that the Linux pte_t is not == _PAGE_PTE. A huge page pmd entry (or devmap!) is also a leaf and so has _PAGE_PTE set, even though we use a pmd_t for it in Linux. The fix is to ensure that the pmd/pte_devmap() confirm they are looking at a leaf entry (_PAGE_PTE) as well as checking _PAGE_DEVMAP. Fixes: ebd31197 ("powerpc/mm: Add devmap support for ppc64") Signed-off-by: NOliver O'Halloran <oohall@gmail.com> Tested-by: NLaurent Vivier <lvivier@redhat.com> Tested-by: NJose Ricardo Ziviani <joserz@linux.vnet.ibm.com> Reviewed-by: NSuraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Add a comment in the code and flesh out change log] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 27 7月, 2017 7 次提交
-
-
由 Will Deacon 提交于
Since the PMU register interface is banked per CPU, CPU PMU interrrupts cannot be handled by a CPU other than the one with the PMU asserting the interrupt. This means that migrating PMU SPIs, as we do during a CPU hotplug operation doesn't make any sense and can lead to the IRQ being disabled entirely if we route a spurious IRQ to the new affinity target. This has been observed in practice on AMD Seattle, where CPUs on the non-boot cluster appear to take a spurious PMU IRQ when coming online, which is routed to CPU0 where it cannot be handled. This patch passes IRQF_PERCPU for PMU SPIs and forcefully sets their affinity prior to requesting them, ensuring that they cannot be migrated during hotplug events. This interacts badly with the DB8500 erratum workaround that ping-pongs the interrupt affinity from the handler, so we avoid passing IRQF_PERCPU in that case by allowing the IRQ flags to be overridden in the platdata. Fixes: 3cf7ee98 ("drivers/perf: arm_pmu: move irq request/free into probe") Cc: Mark Rutland <mark.rutland@arm.com> Cc: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Aneesh Kumar K.V 提交于
Fixes: dad6f37c ("powerpc: subpage_protect: Increase the array size to take care of 64TB") Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Tested-by: NRam Pai <linuxram@us.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Wanpeng Li 提交于
Preempt can occur in the preemption timer expiration handler: CPU0 CPU1 preemption timer vmexit handle_preemption_timer(vCPU0) kvm_lapic_expired_hv_timer hv_timer_is_use == true sched_out sched_in kvm_arch_vcpu_load kvm_lapic_restart_hv_timer restart_apic_timer start_hv_timer already-expired timer or sw timer triggerd in the window start_sw_timer cancel_hv_timer /* back in kvm_lapic_expired_hv_timer */ cancel_hv_timer WARN_ON(!apic->lapic_timer.hv_timer_in_use); ==> Oops This can be reproduced if CONFIG_PREEMPT is enabled. ------------[ cut here ]------------ WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1563 kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm] CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G OE 4.13.0-rc2+ #16 RIP: 0010:kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm] Call Trace: handle_preemption_timer+0xe/0x20 [kvm_intel] vmx_handle_exit+0xb8/0xd70 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm] ? kvm_arch_vcpu_load+0x47/0x230 [kvm] ? kvm_arch_vcpu_load+0x62/0x230 [kvm] kvm_vcpu_ioctl+0x340/0x700 [kvm] ? kvm_vcpu_ioctl+0x340/0x700 [kvm] ? __fget+0xfc/0x210 do_vfs_ioctl+0xa4/0x6a0 ? __fget+0x11d/0x210 SyS_ioctl+0x79/0x90 do_syscall_64+0x81/0x220 entry_SYSCALL64_slow_path+0x25/0x25 ------------[ cut here ]------------ WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1498 cancel_hv_timer.isra.40+0x4f/0x60 [kvm] CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G W OE 4.13.0-rc2+ #16 RIP: 0010:cancel_hv_timer.isra.40+0x4f/0x60 [kvm] Call Trace: kvm_lapic_expired_hv_timer+0x3e/0xb0 [kvm] handle_preemption_timer+0xe/0x20 [kvm_intel] vmx_handle_exit+0xb8/0xd70 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm] ? kvm_arch_vcpu_load+0x47/0x230 [kvm] ? kvm_arch_vcpu_load+0x62/0x230 [kvm] kvm_vcpu_ioctl+0x340/0x700 [kvm] ? kvm_vcpu_ioctl+0x340/0x700 [kvm] ? __fget+0xfc/0x210 do_vfs_ioctl+0xa4/0x6a0 ? __fget+0x11d/0x210 SyS_ioctl+0x79/0x90 do_syscall_64+0x81/0x220 entry_SYSCALL64_slow_path+0x25/0x25 This patch fixes it by making the caller of cancel_hv_timer, start_hv_timer and start_sw_timer be in preemption-disabled regions, which trivially avoid any reentrancy issue with preempt notifier. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> [Add more WARNs. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
Run kvm-unit-tests/eventinj.flat in L1 w/ ept=0 on both L0 and L1: Before NMI IRET test Sending NMI to self NMI isr running stack 0x461000 Sending nested NMI to self After nested NMI to self Nested NMI isr running rip=40038e After iret After NMI to self FAIL: NMI Commit 4c4a6f79 (KVM: nVMX: track NMI blocking state separately for each VMCS) tracks NMI blocking state separately for vmcs01 and vmcs02. However it is not enough: - The L2 (kvm-unit-tests/eventinj.flat) generates NMI that will fault on IRET, so the L2 can generate #PF which can be intercepted by L0. - L0 walks L1's guest page table and sees the mapping is invalid, it resumes the L1 guest and injects the #PF into L1. At this point the vmcs02 has nmi_known_unmasked=true. - L1 sets set bit 3 (blocking by NMI) in the interruptibility-state field of vmcs12 (and fixes the shadow page table) before resuming L2 guest. - L1 executes VMRESUME to resume L2, causing a vmexit to L0 - during VMRESUME emulation, prepare_vmcs02 sets bit 3 in the interruptibility-state field of vmcs02, but nmi_known_unmasked is still true. - L2 immediately exits to L0 with another page fault, because L0 still has not updated the NGVA->HPA page tables. However, nmi_known_unmasked is true so vmx_recover_nmi_blocking does not do anything. The fix is to update nmi_known_unmasked when preparing vmcs02 from vmcs12. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wincy Van 提交于
The PI vector for L0 and L1 must be different. If dest vcpu0 is in guest mode while vcpu1 is delivering a non-nested PI to vcpu0, there wont't be any vmexit so that the non-nested interrupt will be delayed. Signed-off-by: NWincy Van <fanwenyi0529@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wincy Van 提交于
We are using the same vector for nested/non-nested posted interrupts delivery, this may cause interrupts latency in L1 since we can't kick the L2 vcpu out of vmx-nonroot mode. This patch introduces a new vector which is only for nested posted interrupts to solve the problems above. Signed-off-by: NWincy Van <fanwenyi0529@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
This reverts the change of commit f85c758d, as the behavior it modified was intended. The VM is running in 32-bit PAE mode, and Table 4-7 of the Intel manual says: Table 4-7. Use of CR3 with PAE Paging Bit Position(s) Contents 4:0 Ignored 31:5 Physical address of the 32-Byte aligned page-directory-pointer table used for linear-address translation 63:32 Ignored (these bits exist only on processors supporting the Intel-64 architecture) To placate the static checker, write the mask explicitly as an unsigned long constant instead of using a 32-bit unsigned constant. Cc: Dan Carpenter <dan.carpenter@oracle.com> Fixes: f85c758dSigned-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 26 7月, 2017 10 次提交
-
-
由 Dave Martin 提交于
write_sysreg() may misparse the value argument because it is used without parentheses to protect it. This patch adds the ( ) in order to avoid any surprises. Acked-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NDave Martin <Dave.Martin@arm.com> [will: same change to write_sysreg_s] Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Michael Ellerman 提交于
In commit efe0160c ("powerpc/64: Linker on-demand sfpr functions for modules"), we added an ld version check early in the powerpc top-level Makefile. Because the Makefile runs before the kernel config is setup, the checks for CONFIG_CPU_LITTLE_ENDIAN etc. all take the default case. So we end up configuring ld for 32-bit big endian. That would be OK, except that for historical (or perhaps no) reason, we use 'override LD' to add the endian flags to the LD variable itself, rather than the normal approach of adding them to LDFLAGS. The end result is that when we check the ld version we run it as: $(CROSS_COMPILE)ld -EB -m elf32ppc --version This often works, unless you are using a 64-bit only and/or little endian only, toolchain. In which case you see something like: $ make defconfig powerpc64le-linux-ld: unrecognised emulation mode: elf32ppc Supported emulations: elf64lppc elf32lppc elf32lppclinux elf32lppcsim /bin/sh: 1: [: -ge: unexpected operator The proper fix is to stop using 'override LD', but that will require a fair bit of testing. Instead we can fix it for now just by reordering the Makefile to do the version check earlier. Fixes: efe0160c ("powerpc/64: Linker on-demand sfpr functions for modules") Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Laurent Vivier 提交于
As for commit 68baf692 ("powerpc/pseries: Fix of_node_put() underflow during DLPAR remove"), the call to of_node_put() must be removed from pSeries_reconfig_remove_node(). dlpar_detach_node() and pSeries_reconfig_remove_node() both call of_detach_node(), and thus the node should not be released in both cases. Fixes: 0829f6d1 ("of: device_node kobject lifecycle fixes") Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: NLaurent Vivier <lvivier@redhat.com> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Benjamin Herrenschmidt 提交于
There's a somewhat architectural issue with Radix MMU and KVM. When coming out of a guest with AIL (Alternate Interrupt Location, ie, MMU enabled), we start executing hypervisor code with the PID register still containing whatever the guest has been using. The problem is that the CPU can (and will) then start prefetching or speculatively load from whatever host context has that same PID (if any), thus bringing translations for that context into the TLB, which Linux doesn't know about. This can cause stale translations and subsequent crashes. Fixing this in a way that is neither racy nor a huge performance impact is difficult. We could just make the host invalidations always use broadcast forms but that would hurt single threaded programs for example. We chose to fix it instead by partitioning the PID space between guest and host. This is possible because today Linux only use 19 out of the 20 bits of PID space, so existing guests will work if we make the host use the top half of the 20 bits space. We additionally add support for a property to indicate to Linux the size of the PID register which will be useful if we eventually have processors with a larger PID space available. There is still an issue with malicious guests purposefully setting the PID register to a value in the hosts PID range. Hopefully future HW can prevent that, but in the meantime, we handle it with a pair of kludges: - On the way out of a guest, before we clear the current VCPU in the PACA, we check the PID and if it's outside of the permitted range we flush the TLB for that PID. - When context switching, if the mm is "new" on that CPU (the corresponding bit was set for the first time in the mm cpumask), we check if any sibling thread is in KVM (has a non-NULL VCPU pointer in the PACA). If that is the case, we also flush the PID for that CPU (core). This second part is needed to handle the case where a process is migrated (or starts a new pthread) on a sibling thread of the CPU coming out of KVM, as there's a window where stale translations can exist before we detect it and flush them out. A future optimization could be added by keeping track of whether the PID has ever been used and avoid doing that for completely fresh PIDs. We could similarily mark PIDs that have been the subject of a global invalidation as "fresh". But for now this will do. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Rework the asm to build with CONFIG_PPC_RADIX_MMU=n, drop unneeded include of kvm_book3s_asm.h] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 John David Anglin 提交于
It's always bothered me that we only disable preemption in copy_user_page around the call to flush_dcache_page_asm. This patch extends this to after the copy. Signed-off-by: NJohn David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org # 4.9+ Signed-off-by: NHelge Deller <deller@gmx.de>
-
由 John David Anglin 提交于
Helge noticed that we flush the TLB page in flush_cache_page but not in flush_cache_range or flush_cache_mm. For a long time, we have had random segmentation faults building packages on machines with PA8800/8900 processors. These machines only support equivalent aliases. We don't see these faults on machines that don't require strict coherency. So, it appears TLB speculation sometimes leads to cache corruption on machines that require coherency. This patch adds TLB flushes to flush_cache_range and flush_cache_mm when coherency is required. We only flush the TLB in flush_cache_page when coherency is required. The patch also optimizes flush_cache_range. It turns out we always have the right context to use flush_user_dcache_range_asm and flush_user_icache_range_asm. The patch has been tested for some time on rp3440, rp3410 and A500-44. It's been boot tested on c8000. No random segmentation faults were observed during testing. Signed-off-by: NJohn David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org # 4.9+ Signed-off-by: NHelge Deller <deller@gmx.de>
-
由 Helge Deller 提交于
Some machines can't power off the machine, so disable the lockup detectors to avoid this watchdog BUG to show up every few seconds: watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [systemd-shutdow:1] Signed-off-by: NHelge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # 4.9+
-
由 Helge Deller 提交于
The Page Deallocation Table (PDT) holds the physical addresses of all broken memory addresses. With the physical address we now are able to show which DIMM slot (e.g. 1a, 3c) actually holds the broken memory module so that users are able to replace it. Signed-off-by: NHelge Deller <deller@gmx.de>
-
由 Helge Deller 提交于
Add a firmware wrapper function, which asks PDC firmware for the DIMM slot of a physical address. This is needed to show users which DIMM module needs replacement in case a broken DIMM was encountered. Signed-off-by: NHelge Deller <deller@gmx.de>
-
由 Helge Deller 提交于
Commit c9c2877d ("parisc: Add Page Deallocation Table (PDT) support") introduced the pdc_pat_mem_read_pd_pdt() firmware helper function, which crashed the system because it trashed the stack if the pdc_pat_mem_read_pd_retinfo struct was located on the stack (and which is in size less than the required 32 64-bit values). Fix it by using the pdc_result struct instead when calling firmware and copy the return values back into the result struct when finished sucessfully. While debugging this code I noticed that the pdc_type wasn't set correctly either, so let's fix that too. Fixes: c9c2877d ("parisc: Add Page Deallocation Table (PDT) support") Signed-off-by: NHelge Deller <deller@gmx.de>
-
- 25 7月, 2017 4 次提交
-
-
由 Christian Borntraeger 提交于
The following warning was triggered by missing srcu locks around the storage key handling functions. ============================= WARNING: suspicious RCU usage 4.12.0+ #56 Not tainted ----------------------------- ./include/linux/kvm_host.h:572 suspicious rcu_dereference_check() usage! rcu_scheduler_active = 2, debug_locks = 1 1 lock held by live_migration/4936: #0: (&mm->mmap_sem){++++++}, at: [<0000000000141be0>] kvm_arch_vm_ioctl+0x6b8/0x22d0 CPU: 8 PID: 4936 Comm: live_migration Not tainted 4.12.0+ #56 Hardware name: IBM 2964 NC9 704 (LPAR) Call Trace: ([<000000000011378a>] show_stack+0xea/0xf0) [<000000000055cc4c>] dump_stack+0x94/0xd8 [<000000000012ee70>] gfn_to_memslot+0x1a0/0x1b8 [<0000000000130b76>] gfn_to_hva+0x2e/0x48 [<0000000000141c3c>] kvm_arch_vm_ioctl+0x714/0x22d0 [<000000000013306c>] kvm_vm_ioctl+0x11c/0x7b8 [<000000000037e2c0>] do_vfs_ioctl+0xa8/0x6c8 [<000000000037e984>] SyS_ioctl+0xa4/0xb8 [<00000000008b20a4>] system_call+0xc4/0x27c 1 lock held by live_migration/4936: #0: (&mm->mmap_sem){++++++}, at: [<0000000000141be0>] kvm_arch_vm_ioctl+0x6b8/0x22d0 Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Pierre Morel<pmorel@linux.vnet.ibm.com>
-
由 Stefan Assmann 提交于
When EFI runtime services are disabled, for example by the "noefi" kernel cmdline parameter, the reboot_type could still be set to BOOT_EFI causing reboot to fail. Fix this by checking if EFI runtime services are enabled. Signed-off-by: NStefan Assmann <sassmann@kpanic.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170724122248.24006-1-sassmann@kpanic.de [ Fixed 'not disabled' double negation. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Michael Davidson 提交于
undef memcpy() and friends in boot/string.c so that the functions defined here will have the correct names, otherwise we end up up trying to redefine __builtin_memcpy() etc. Surprisingly, GCC allows this (and, helpfully, discards the __builtin_ prefix from the function name when compiling it), but clang does not. Adding these #undef's appears to preserve what I assume was the original intent of the code. Signed-off-by: NMichael Davidson <md@google.com> Signed-off-by: NMatthias Kaehlcke <mka@chromium.org> Acked-by: NH. Peter Anvin <hpa@zytor.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bernhard.Rosenkranzer@linaro.org Cc: Greg Hackmann <ghackmann@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170724235155.79255-1-mka@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Ard Biesheuvel 提交于
The optional prefetch instructions in the copy_page() routine are inconsistent: at the start of the function, two cachelines are prefetched beyond the one being loaded in the first iteration, but in the loop, the prefetch is one more line ahead. This appears to be unintentional, so let's fix it. While at it, fix the comment style and white space. Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 24 7月, 2017 2 次提交
-
-
由 Dave Martin 提交于
In kernels with CONFIG_IWMMXT=y running on non-iWMMXt hardware, the signal frame can be left partially uninitialised in such a way that userspace cannot parse uc_regspace[] safely. In particular, this means that the VFP registers cannot be located reliably in the signal frame when a multi_v7_defconfig kernel is run on the majority of platforms. The cause is that the uc_regspace[] is laid out statically based on the kernel config, but the decision of whether to save/restore the iWMMXt registers must be a runtime decision. To minimise breakage of software that may assume a fixed layout, this patch emits a dummy block of the same size as iwmmxt_sigframe, for non-iWMMXt threads. However, the magic and size of this block are now filled in to help parsers skip over it. A new DUMMY_MAGIC is defined for this purpose. It is probably legitimate (if non-portable) for userspace to manufacture its own sigframe for sigreturn, and there is no obvious reason why userspace should be required to insert a DUMMY_MAGIC block when running on non-iWMMXt hardware, when omitting it has worked just fine forever in other configurations. So in this case, sigreturn does not require this block to be present. Reported-by: NEdmund Grimley-Evans <Edmund.Grimley-Evans@arm.com> Signed-off-by: NDave Martin <Dave.Martin@arm.com> Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk>
-
由 Dave Martin 提交于
preserve_iwmmxt_context() and restore_iwmmxt_context() lack __user accessors on their arguments pointing to the user signal frame. There does not be appear to be a bug here, but this omission is inconsistent with the crunch and vfp sigframe access functions. This patch adds the annotations, for consistency. Signed-off-by: NDave Martin <Dave.Martin@arm.com> Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk>
-