- 09 Aug 2017, 5 commits
-
-
Submitted by Robin Murphy

Add a clean-to-point-of-persistence cache maintenance helper, and wire up the basic architectural support for the pmem driver based on it.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[catalin.marinas@arm.com: move arch_*_pmem() functions to arch/arm64/mm/flush.c]
[catalin.marinas@arm.com: change dmb(sy) to dmb(osh)]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
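A minimal sketch of what these helpers can look like, assuming an assembler routine __clean_dcache_area_pop() that walks the range issuing DC CVAP, and an __inval_dcache_area() counterpart (the names follow the commit message's arch_*_pmem() wiring; this is illustrative, not the exact mainline code):

    /* Sketch: arch/arm64/mm/flush.c-style pmem helpers. */
    void arch_wb_cache_pmem(void *addr, size_t size)
    {
        /* Order against any prior non-cacheable writes (hence dmb(osh),
         * per the bracketed note above), then clean the range to the
         * point of persistence. */
        dmb(osh);
        __clean_dcache_area_pop(addr, size);
    }

    void arch_invalidate_pmem(void *addr, size_t size)
    {
        __inval_dcache_area(addr, size);
    }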
-
Submitted by Robin Murphy

Cache clean to PoP is subject to the same access controls as to PoC, so if we are trapping userspace cache maintenance with SCTLR_EL1.UCI, we need to be prepared to handle it. To avoid getting into complicated fights with binutils about ARMv8.2 options, we'll just cheat and use the raw SYS instruction rather than the 'proper' DC alias.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
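For illustration, "dc cvap, Xt" is the architectural alias for "sys #3, C7, C12, #1, Xt", so the operation can be emitted on assemblers without ARMv8.2 support roughly like this (a sketch, not the exact mainline macro):

    /* Sketch: DC CVAP via its raw SYS encoding
     * (op1=3, CRn=c7, CRm=c12, op2=1), no -march=armv8.2-a needed. */
    #define dc_cvap(addr)                                   \
        asm volatile("sys 3, c7, c12, 1, %0"                \
                     : : "r" (addr) : "memory")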
-
Submitted by Robin Murphy

The ARMv8.2-DCPoP feature introduces persistent memory support to the architecture, by defining a point of persistence in the memory hierarchy, and a corresponding cache maintenance operation, DC CVAP. Expose the support via HWCAP and MRS emulation.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
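Userspace can then probe for the capability through the auxiliary vector; a minimal check, assuming the HWCAP_DCPOP bit this patch exports in the UAPI headers:

    #include <stdio.h>
    #include <sys/auxv.h>
    #include <asm/hwcap.h>      /* HWCAP_DCPOP */

    int main(void)
    {
        /* AT_HWCAP carries the arm64 feature bits set by the kernel. */
        if (getauxval(AT_HWCAP) & HWCAP_DCPOP)
            printf("DC CVAP (ARMv8.2-DCPoP) supported\n");
        else
            printf("no point-of-persistence cache maintenance\n");
        return 0;
    }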
-
Submitted by Robin Murphy

__inval_cache_range() is already the odd one out among our data cache maintenance routines as the only remaining range-based one; as we're going to want an invalidation routine to call from C code for the pmem API, let's tweak the prototype and name to bring it in line with the clean operations, and to make its relationship with __dma_inv_area() neatly mirror that of __clean_dcache_area_poc() and __dma_clean_area(). The loop clearing the early page tables gets mildly massaged in the process for the sake of consistency.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
Submitted by Robin Murphy

Clearly, set_memory_valid() has never been seen in the same room as its declaration... Whilst the type mismatch is such that kexec probably wasn't broken in practice, fix it to match the definition as it should.

Fixes: 9b0aa14e ("arm64: mm: add set_memory_valid()")
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
- 07 Aug 2017, 3 commits
-
-
Submitted by Julien Thierry

When receiving unhandled faults from the CPU, the description is very sparse. Add information about the fault decoded from the ESR, and add defines to esr.h for the corresponding ESR fields. Values are based on the ARM Architecture Reference Manual (DDI 0487B.a), section D7.2.28 ESR_ELx, Exception Syndrome Register (ELx) (pages D7-2275 to D7-2280). New output is of the form:

    [   77.818059] Mem abort info:
    [   77.820826]   Exception class = DABT (current EL), IL = 32 bits
    [   77.826706]   SET = 0, FnV = 0
    [   77.829742]   EA = 0, S1PTW = 0
    [   77.832849] Data abort info:
    [   77.835713]   ISV = 0, ISS = 0x00000070
    [   77.839522]   CM = 0, WnR = 1

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
[catalin.marinas@arm.com: fix "%lu" in a pr_alert() call]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
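A sketch of the kind of ISS field defines this implies for a data abort (bit positions per DDI 0487B.a, D7.2.28; the exact macro names in esr.h may differ):

    /* Data Abort ISS fields (sketch). */
    #define ESR_ELx_ISV     (1UL << 24)   /* Instruction Syndrome Valid */
    #define ESR_ELx_SET     (3UL << 11)   /* Synchronous Error Type */
    #define ESR_ELx_FnV     (1UL << 10)   /* FAR not Valid */
    #define ESR_ELx_EA      (1UL << 9)    /* External Abort type */
    #define ESR_ELx_CM      (1UL << 8)    /* Cache Maintenance operation */
    #define ESR_ELx_S1PTW   (1UL << 7)    /* fault on Stage-1 table walk */
    #define ESR_ELx_WNR     (1UL << 6)    /* Write, not Read */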
-
Submitted by Dave Martin

The -1 "no syscall" value is written in various ways, shared with the user ABI in some places, and generally obscure. This patch attempts to make things a little more consistent and readable by replacing all these uses with a single #define. A couple of symbolic helpers are provided to clarify the intent further. Because the in-syscall check in do_signal() is changed from >= 0 to != NO_SYSCALL by this patch, different behaviour may be observable if syscallno is set to values less than -1 by a tracer. However, this is not different from the behaviour that is already observable if a tracer sets syscallno to a value >= __NR_(compat_)syscalls. It appears that this can cause spurious syscall restarting, but that is not a new behaviour either, and does not appear harmful.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
Submitted by Dave Martin

The upper 32 bits of the syscallno field in thread_struct are handled inconsistently, being sometimes zero-extended and sometimes sign-extended. In fact, only the lower 32 bits seem to have any real significance for the behaviour of the code: it's been OK to handle the upper bits inconsistently because they don't matter. Currently, the only place I can find where those bits are significant is in calling trace_sys_enter(), which may be unintentional: for example, if a compat tracer attempts to cancel a syscall by passing -1 to (COMPAT_)PTRACE_SET_SYSCALL at the syscall-enter-stop, it will be traced as syscall 4294967295 rather than -1 as might be expected (and as occurs for a native tracer doing the same thing). Elsewhere, reads of syscallno cast it to an int or truncate it. There's also a conspicuous amount of code and casting to bodge around the fact that although semantically an int, syscallno is stored as a u64. Let's not pretend any more. In order to preserve the stp x instruction that stores the syscall number in entry.S, this patch special-cases the layout of struct pt_regs for big endian so that the newly 32-bit syscallno field maps onto the low bits of the stored value. This is not beautiful, but benchmarking of the getpid syscall on Juno suggests a minor slowdown if the stp is split into an stp x and stp w.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
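A sketch of the big-endian special case described above (field names assumed; the point is that a single 64-bit stp store lands the syscall number on the 32-bit syscallno field on either endianness):

    /* Sketch: keep syscallno in the low half of the stored 64-bit slot. */
    struct pt_regs {
        /* ... */
    #ifdef __AARCH64EB__
        u32 unused2;    /* high half of the 64-bit store */
        s32 syscallno;  /* low half: the syscall number */
    #else
        s32 syscallno;
        u32 unused2;
    #endif
        /* ... */
    };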
-
- 05 Aug 2017, 1 commit
-
-
Submitted by David S. Miller

Mikael Pettersson reported that some test programs in the strace-4.18 testsuite cause an OOPS. After some debugging it turns out that garbage values are returned when an exception occurs, causing the fixup memset() to be run with bogus arguments. The problem is that two of the exception handler stubs write the successfully copied length into the wrong register.

Fixes: ee841d0a ("sparc64: Convert U3copy_{from,to}_user to accurate exception reporting.")
Reported-by: Mikael Pettersson <mikpelinux@gmail.com>
Tested-by: Mikael Pettersson <mikpelinux@gmail.com>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 04 Aug 2017, 5 commits
-
-
Submitted by Nick Desaulniers

The bitmask used to define these values produces overflow, as seen by this compiler warning:

    arch/arm64/kernel/head.S:47:8: warning: integer overflow in preprocessor expression
    #elif (PAGE_OFFSET & 0x1fffff) != 0
           ^~~~~~~~~~~
    arch/arm64/include/asm/memory.h:52:46: note: expanded from macro 'PAGE_OFFSET'
    #define PAGE_OFFSET (UL(0xffffffffffffffff) << (VA_BITS - 1))
                         ~~~~~~~~~~~~~~~~~~ ^

It would be preferable to use GENMASK_ULL() instead, but it's not set up to be used from assembly (the UL() macro token-pastes UL suffixes when not included in assembly sources).

Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Suggested-by: Yury Norov <ynorov@caviumnetworks.com>
Suggested-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
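The overflow-free rewrite relies on a small identity: all-ones minus (2^(VA_BITS-1) - 1) equals all-ones shifted left by (VA_BITS - 1), but involves no shift whose result overflows in a preprocessor expression. Sketched:

    /* Sketch: same mask as the shifted form, without the overflow. */
    #define PAGE_OFFSET (UL(0xffffffffffffffff) - \
                         (UL(1) << (VA_BITS - 1)) + 1)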
-
Submitted by Catalin Marinas

In a system with DBM (dirty bit management) capable agents there is a possible race between a CPU executing ptep_set_access_flags() (maybe non-DBM capable) and a hardware update of the dirty state (clearing of PTE_RDONLY). The scenario:

a) the pte is writable (PTE_WRITE set), clean (PTE_RDONLY set) and old (PTE_AF clear)
b) ptep_set_access_flags() is called as a result of a read access and it needs to set the pte to writable, clean and young (PTE_AF set)
c) a DBM-capable agent, as a result of a different write access, is marking the entry as young (setting PTE_AF) and dirty (clearing PTE_RDONLY)

The current ptep_set_access_flags() implementation would set the PTE_RDONLY bit in the resulting value, overriding the DBM update and losing the dirty state. This patch fixes such a race by setting PTE_RDONLY to the most permissive (lowest value) of the current entry and the new one.

Fixes: 66dbd6e6 ("arm64: Implement ptep_set_access_flags() for hardware AF/DBM")
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
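One way to do that atomically is a cmpxchg loop using the identity a & b == ~(~a | ~b): invert PTE_RDONLY on both sides so a plain OR yields the more permissive state, then invert back. A sketch along those lines:

    /* Sketch: merge the new access flags without re-setting PTE_RDONLY
     * if a DBM-capable agent has cleared it (i.e. dirtied the page). */
    pteval_t old_pteval, pteval;

    pte_val(entry) ^= PTE_RDONLY;
    pteval = READ_ONCE(pte_val(*ptep));
    do {
        old_pteval = pteval;
        pteval ^= PTE_RDONLY;      /* clear == "more permissive" */
        pteval |= pte_val(entry);  /* OR now keeps the permissive state */
        pteval ^= PTE_RDONLY;      /* invert back */
        pteval = cmpxchg_relaxed(&pte_val(*ptep), old_pteval, pteval);
    } while (pteval != old_pteval);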
-
Submitted by Marc Gonzalez

RX and TX clock delays are required. Request them explicitly.

Fixes: cad008b8 ("ARM: dts: tango4: Initial device trees")
Cc: stable@vger.kernel.org
Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-
Submitted by Nicholas Piggin

If the decrementer wraps again and de-asserts the decrementer exception while hard-disabled, __check_irq_replay() has a test to notice the wrap when interrupts are re-enabled. The decrementer check must be done when clearing the PACA_IRQ_HARD_DIS flag, not when the PACA_IRQ_DEC flag is tested. Previously this worked because the decrementer interrupt was always the first one checked after clearing the hard-disable flag, but the HMI check was moved ahead of that, which introduced this bug. This can cause a missed decrementer interrupt if we soft-disable interrupts, then take an HMI which is recorded in irq_happened, then hard-disable interrupts for > 4s to wrap the decrementer.

Fixes: e0e0d6b7 ("powerpc/64: Replay hypervisor maintenance interrupt first")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Nicholas Piggin

The POWER9 DD2 PMU can stop after a state-loss idle in some conditions. A solution is to set then clear MMCRA[60] after waking from state-loss idle. MMCRA[60] is a non-architected bit; see the user manual for details.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
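In C terms the pulse looks roughly like this (a sketch; MMCRA[60] uses IBM big-endian bit numbering, i.e. 1UL << (63 - 60) as an LSB-0 mask, and the actual fix sits in the POWER9 idle wakeup path):

    /* Sketch: pulse MMCRA[60] so the PMU restarts after state-loss idle. */
    #define MMCRA_BIT60 (1UL << (63 - 60))

    unsigned long mmcra = mfspr(SPRN_MMCRA);

    mtspr(SPRN_MMCRA, mmcra | MMCRA_BIT60); /* set the workaround bit... */
    mtspr(SPRN_MMCRA, mmcra);               /* ...then restore the old value */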
-
- 03 Aug 2017, 6 commits
-
-
Submitted by Wanpeng Li

    ------------[ cut here ]------------
    WARNING: CPU: 5 PID: 2288 at arch/x86/kvm/vmx.c:11124 nested_vmx_vmexit+0xd64/0xd70 [kvm_intel]
    CPU: 5 PID: 2288 Comm: qemu-system-x86 Not tainted 4.13.0-rc2+ #7
    RIP: 0010:nested_vmx_vmexit+0xd64/0xd70 [kvm_intel]
    Call Trace:
     vmx_check_nested_events+0x131/0x1f0 [kvm_intel]
     ? vmx_check_nested_events+0x131/0x1f0 [kvm_intel]
     kvm_arch_vcpu_ioctl_run+0x5dd/0x1be0 [kvm]
     ? vmx_vcpu_load+0x1be/0x220 [kvm_intel]
     ? kvm_arch_vcpu_load+0x62/0x230 [kvm]
     kvm_vcpu_ioctl+0x340/0x700 [kvm]
     ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
     ? __fget+0xfc/0x210
     do_vfs_ioctl+0xa4/0x6a0
     ? __fget+0x11d/0x210
     SyS_ioctl+0x79/0x90
     do_syscall_64+0x8f/0x750
     ? trace_hardirqs_on_thunk+0x1a/0x1c
     entry_SYSCALL64_slow_path+0x25/0x25

This can be reproduced by booting the L1 guest with the 'noapic' grub parameter, which tells the kernel not to make use of any IOAPICs that may be present in the system. The external_intr variable in nested_vmx_vmexit() is actually the req_int_win variable passed from vcpu_enter_guest(), which means that L0's userspace requests an irq window. I observed the scenario (!kvm_cpu_has_interrupt(vcpu) && L0's userspace requests an irq window) being true, so there is no interrupt which L1 requires to inject to L2, and we should not attempt to emulate "Acknowledge interrupt on exit" for the irq window requirement in this scenario. This patch fixes it by not attempting to emulate "Acknowledge interrupt on exit" if there is no L1 requirement to inject an interrupt to L2.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
[Added code comment to make it obvious that the behavior is not correct. We should do a userspace exit with open interrupt window instead of the nested VM exit. This patch still improves the behavior, so it was accepted as a (temporary) workaround.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Submitted by David Matlack

The host physical addresses of L1's Virtual APIC Page and Posted Interrupt descriptor are loaded into the VMCS02. The CPU may write to these pages via their host physical address while L2 is running, bypassing address-translation-based dirty tracking (e.g. EPT write protection). Mark them dirty on every exit from L2 to prevent them from getting out of sync with dirty tracking. Also mark the virtual APIC page and the posted interrupt descriptor dirty when KVM is virtualizing posted interrupt processing.

Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Submitted by David Matlack

According to the Intel SDM, software cannot rely on the current VMCS to be coherent after a VMXOFF or shutdown. So this is a valid way to handle VMCS12 flushes:

    24.11.1 Software Use of Virtual-Machine Control Structures
    ...
    If a logical processor leaves VMX operation, any VMCSs active on that
    logical processor may be corrupted (see below). To prevent such
    corruption of a VMCS that may be used either after a return to VMX
    operation or on another logical processor, software should execute
    VMCLEAR for that VMCS before executing the VMXOFF instruction or
    removing power from the processor (e.g., as part of a transition to
    the S3 and S4 power states).
    ...

This fixes a "suspicious rcu_dereference_check() usage!" warning during kvm_vm_release(), because nested_release_vmcs12() calls kvm_vcpu_write_guest_page() without holding kvm->srcu.

Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Submitted by Paolo Bonzini

Since the current implementation of VMCS12 does a memcpy in and out of guest memory, we do not need current_vmcs12 and current_vmcs12_page anymore. current_vmptr is enough to read and write the VMCS12. And David Matlack noted: this patch also fixes dirty tracking (memslot->dirty_bitmap) of the VMCS12 page by using kvm_write_guest; nested_release_page() only marks the struct page dirty.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[Added David Matlack's note and nested_release_page_clean() fix.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Submitted by Longpeng(Mike)

'lapic_irq' is a local variable and its 'level' field isn't initialized, so 'level' is random. It doesn't matter functionally, but it makes UBSAN unhappy:

    UBSAN: Undefined behaviour in .../lapic.c:...
    load of value 10 is not a valid value for type '_Bool'
    ...
    Call Trace:
     [<ffffffff81f030b6>] dump_stack+0x1e/0x20
     [<ffffffff81f03173>] ubsan_epilogue+0x12/0x55
     [<ffffffff81f03b96>] __ubsan_handle_load_invalid_value+0x118/0x162
     [<ffffffffa1575173>] kvm_apic_set_irq+0xc3/0xf0 [kvm]
     [<ffffffffa1575b20>] kvm_irq_delivery_to_apic_fast+0x450/0x910 [kvm]
     [<ffffffffa15858ea>] kvm_irq_delivery_to_apic+0xfa/0x7a0 [kvm]
     [<ffffffffa1517f4e>] kvm_emulate_hypercall+0x62e/0x760 [kvm]
     [<ffffffffa113141a>] handle_vmcall+0x1a/0x30 [kvm_intel]
     [<ffffffffa114e592>] vmx_handle_exit+0x7a2/0x1fa0 [kvm_intel]
    ...

Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
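The fix amounts to giving the field a defined value before delivery; sketched against the kick-CPU hypercall path the trace points at (field names per struct kvm_lapic_irq, surrounding usage reconstructed, not quoted from the patch):

    struct kvm_lapic_irq lapic_irq;

    lapic_irq.shorthand = 0;
    lapic_irq.dest_mode = 0;
    lapic_irq.level = 0;    /* previously indeterminate: UBSAN trips on
                             * loading garbage into a _Bool */
    lapic_irq.dest_id = apicid;
    lapic_irq.delivery_mode = APIC_DM_REMRD;

    kvm_irq_delivery_to_apic(kvm, NULL, &lapic_irq, NULL);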
-
Submitted by Wanpeng Li

When an SMP VM starts, an AP may lose its INIT because the INIT is received between kvm_vcpu_ioctl_x86_get_vcpu_events and kvm_vcpu_ioctl_x86_set_vcpu_events:

    vcpu 0                                 vcpu 1
    kvm_vcpu_ioctl_x86_get_vcpu_events
      events->smi.latched_init = 0
                                           send INIT to vcpu1
                                             set vcpu1's pending_events
    kvm_vcpu_ioctl_x86_set_vcpu_events
      if (events->smi.latched_init == 0)
        clear INIT in pending_events

This patch fixes it by only updating the SMM-related flags if we are in SMM. Thanks to Peng Hao for the report and the original commit message.

Reported-by: Peng Hao <peng.hao2@zte.com.cn>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
- 02 Aug 2017, 5 commits
-
-
Submitted by Gregory CLEMENT

The number of pins in the South Bridge is 30, not 29. There is a fix for the pinctrl driver, but a fix is also needed at the device tree level for the GPIO.

Fixes: afda007f ("ARM64: dts: marvell: Add pinctrl nodes for Armada 3700")
Cc: <stable@vger.kernel.org>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
-
Submitted by Sergei Shtylyov

of_irq_to_resource() has recently been fixed to return negative error numbers along with 0 in case of failure, however the Freescale MPC832x RDB board code still only regards 0 as a failure indication -- fix it up.

Fixes: 7a4228bb ("of: irq: use of_irq_get() in of_irq_to_resource()")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Scott Wood <oss@buserror.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
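The updated failure check then looks roughly like this (a sketch of the pattern, not the board file's exact code):

    #include <linux/of_irq.h>

    struct resource res;
    int ret;

    ret = of_irq_to_resource(np, 0, &res);
    if (ret <= 0)   /* 0 (no mapping) or a negative errno both mean failure */
        goto err;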
-
Submitted by Wanpeng Li

    WARNING: CPU: 5 PID: 1242 at kernel/rcu/tree_plugin.h:323 rcu_note_context_switch+0x207/0x6b0
    CPU: 5 PID: 1242 Comm: unity-settings- Not tainted 4.13.0-rc2+ #1
    RIP: 0010:rcu_note_context_switch+0x207/0x6b0
    Call Trace:
     __schedule+0xda/0xba0
     ? kvm_async_pf_task_wait+0x1b2/0x270
     schedule+0x40/0x90
     kvm_async_pf_task_wait+0x1cc/0x270
     ? prepare_to_swait+0x22/0x70
     do_async_page_fault+0x77/0xb0
     ? do_async_page_fault+0x77/0xb0
     async_page_fault+0x28/0x30
    RIP: 0010:__d_lookup_rcu+0x90/0x1e0

I encountered this when trying to stress the async page fault in an L1 guest with L2 guests running. Commit 9b132fbe (Add rcu user eqs exception hooks for async page fault) adds rcu_irq_enter/exit() to kvm_async_pf_task_wait() to exit cpu idle eqs when needed, to protect the code that needs to use rcu. However, we need to call the pair even if the function calls schedule(), as seen from the above backtrace. This patch fixes it by informing the RCU subsystem that we exit/enter the irq towards/away from idle for both n.halted and !n.halted.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Submitted by Paolo Bonzini

There are three issues in nested_vmx_check_exception:

1) it is not taking PFEC_MATCH/PFEC_MASK into account, as reported by Wanpeng Li;
2) it should rebuild the interruption info and exit qualification fields from scratch, as reported by Jim Mattson, because the values from the L2->L0 vmexit may be invalid (e.g. if an emulated instruction causes a page fault, the EPT misconfig's exit qualification is incorrect);
3) CR2 and DR6 should not be written for exception intercept vmexits (CR2 only for AMD).

This patch fixes the first two and adds a comment about the last, outlining the fix.

Cc: Jim Mattson <jmattson@google.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Paolo Bonzini

Do this in the caller of nested_vmx_vmexit instead. nested_vmx_check_exception was doing a vmwrite to the vmcs02's VM_EXIT_INTR_ERROR_CODE field, so that prepare_vmcs12 would move the field to vmcs12->vm_exit_intr_error_code. However, that isn't possible on pre-Haswell machines. Moving the vmcs12 write to the callers fixes it.

Reported-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Changed nested_vmx_reflect_vmexit() return type to (int)1 from (bool)1, thanks to fengguang.wu@intel.com]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
- 01 Aug 2017, 2 commits
-
-
Submitted by Marc Zyngier

In an ideal world, CNTFRQ_EL0 always contains the timer frequency for the kernel to use. Sadly, we get quite a few broken systems where the firmware authors cannot be bothered to program that register on all CPUs, and rely on DT to provide that frequency. So when trapping CNTFRQ_EL0, make sure to return the actual rate (as known by the kernel), and not CNTFRQ_EL0.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
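A sketch of such a trap handler (helper names as found in the arm64 tree are assumed; the Rt field comes from the ESR ISS layout for system-register traps):

    /* Sketch: on a trapped CNTFRQ_EL0 read, hand back the rate the
     * kernel actually uses (from DT/ACPI), not the broken register. */
    static bool cntfrq_read_handler(struct pt_regs *regs, u32 esr)
    {
        int rt = (esr & ESR_ELx_SYS64_ISS_RT_MASK) >>
                 ESR_ELx_SYS64_ISS_RT_SHIFT;

        pt_regs_write_reg(regs, rt, arch_timer_get_rate());
        regs->pc += 4;      /* skip the emulated instruction */
        return true;
    }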
-
Submitted by Thomas Gleixner

The HPET resume path abuses irq_domain_[de]activate_irq() to restore the MSI message in the HPET chip for the boot CPU on resume, and it relies on an implementation detail of the interrupt core code which magically makes the HPET unmask call invoked via an irq_disable/enable pair. This worked as long as the irq code unconditionally invoked the unmask() callback. With the recent changes which keep track of the masked state to avoid expensive hardware access, this no longer works. As a consequence the HPET timer interrupts are not unmasked, which breaks resume as the boot CPU waits forever for a timer interrupt to arrive. Make the restore of the MSI message explicit and invoke the unmask() function directly. While at it, get rid of the pointless affinity setting, as nothing can change the affinity of the interrupt and the vector across suspend/resume; the restore of the MSI message reestablishes the previous affinity setting, which is the correct one.

Fixes: bf22ff45 ("genirq: Avoid unnecessary low level irq function calls")
Reported-and-tested-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Reported-by: Martin Peres <martin.peres@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: jeffy.chen@rock-chips.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1707312158590.2287@nanos
-
- 31 Jul 2017, 4 commits
-
-
Submitted by Babu Moger

While working on enabling queued rwlocks on SPARC, I found the following code in include/asm-generic/qrwlock.h, which uses CONFIG_CPU_BIG_ENDIAN to locate the byte to clear:

    static inline u8 *__qrwlock_write_byte(struct qrwlock *lock)
    {
        return (u8 *)lock + 3 * IS_BUILTIN(CONFIG_CPU_BIG_ENDIAN);
    }

The problem is that many of the fixed-big-endian architectures don't define CPU_BIG_ENDIAN and so clear the wrong byte. Define CPU_BIG_ENDIAN for the parisc architecture to fix it.

Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Helge Deller <deller@gmx.de>
-
Submitted by Nicholas Piggin

The watchdog soft-NMI exception stack setup loads a stack pointer twice, which is an obvious error. It ends up using the system reset interrupt (true-NMI) stack, which is also a bug because the watchdog could be preempted by a system reset interrupt that overwrites the NMI stack. Change the soft-NMI to use the "emergency stack". The current kernel stack is not used, because of the longer-term goal to prevent asynchronous stack access using soft-disable.

Fixes: 2104180a ("powerpc/64s: implement arch-specific hardlockup watchdog")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Helge Deller

Since kernel 4.11 the thread and irq stacks on parisc randomly overflow the default size of 16k. The reason why stack usage suddenly grew is yet unknown.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 4.11+
Signed-off-by: Helge Deller <deller@gmx.de>
-
Submitted by John David Anglin

In testing James' patch to drivers/parisc/pdc_stable.c, I hit the BUG statement in flush_cache_range() during a system shutdown:

    kernel BUG at arch/parisc/kernel/cache.c:595!
    CPU: 2 PID: 6532 Comm: kworker/2:0 Not tainted 4.13.0-rc2+ #1
    Workqueue: events free_ioctx
    IAOQ[0]: flush_cache_range+0x144/0x148
    IAOQ[1]: flush_cache_page+0x0/0x1a8
    RP(r2): flush_cache_range+0xec/0x148
    Backtrace:
     [<00000000402910ac>] unmap_page_range+0x84/0x880
     [<00000000402918f4>] unmap_single_vma+0x4c/0x60
     [<0000000040291a18>] zap_page_range_single+0x110/0x160
     [<0000000040291c34>] unmap_mapping_range+0x174/0x1a8
     [<000000004026ccd8>] truncate_pagecache+0x50/0xa8
     [<000000004026cd84>] truncate_setsize+0x54/0x70
     [<000000004033d534>] put_aio_ring_file+0x44/0xb0
     [<000000004033d5d8>] aio_free_ring+0x38/0x140
     [<000000004033d714>] free_ioctx+0x34/0xa8
     [<00000000401b0028>] process_one_work+0x1b8/0x4d0
     [<00000000401b04f4>] worker_thread+0x1b4/0x648
     [<00000000401b9128>] kthread+0x1b0/0x208
     [<0000000040150020>] end_fault_vector+0x20/0x28
     [<0000000040639518>] nf_ip_reroute+0x50/0xa8
     [<0000000040638ed0>] nf_ip_route+0x10/0x78
     [<0000000040638c90>] xfrm4_mode_tunnel_input+0x180/0x1f8
    CPU: 2 PID: 6532 Comm: kworker/2:0 Not tainted 4.13.0-rc2+ #1
    Workqueue: events free_ioctx
    Backtrace:
     [<0000000040163bf0>] show_stack+0x20/0x38
     [<0000000040688480>] dump_stack+0xa8/0x120
     [<0000000040163dc4>] die_if_kernel+0x19c/0x2b0
     [<0000000040164d0c>] handle_interruption+0xa24/0xa48

This patch modifies flush_cache_range() to handle non-current contexts. Inasmuch as this occurs infrequently, the simplest approach is to flush the entire cache when this happens.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Helge Deller <deller@gmx.de>
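The shape of the fix, sketched (the non-current check and the flush_cache_all() fallback follow the description above; this is not the exact parisc code):

    /* Sketch: fall back to a full cache flush for non-current contexts,
     * where flushing by user virtual address isn't possible. */
    void flush_cache_range(struct vm_area_struct *vma,
                           unsigned long start, unsigned long end)
    {
        if (vma->vm_mm != current->active_mm) {
            flush_cache_all();
            return;
        }

        /* ...existing per-range flushing for the current context... */
    }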
-
- 30 Jul 2017, 1 commit
-
-
Submitted by Rafael J. Wysocki

After commit f8475cef ("x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF"), the scaling_cur_freq policy attribute in sysfs only behaves as expected on x86 with APERF/MPERF registers available when it is read from at least twice in a row. The value returned by the first read may not be meaningful, because the computations in there use cached values from the previous iteration of aperfmperf_snapshot_khz() which may be stale. To prevent that from happening, modify arch_freq_get_on_cpu() to call aperfmperf_snapshot_khz() twice, with a short delay between these calls, if the previous invocation of aperfmperf_snapshot_khz() was too far back in the past (specifically, more than 1s ago). Also, as pointed out by Doug Smythies, aperf_delta is limited now and the multiplication of it by cpu_khz won't overflow, so simplify the s->khz computations too.

Fixes: f8475cef ("x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF")
Reported-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
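A sketch of the two-snapshot idea (the staleness helper and the delay constant here are illustrative, not the mainline names; the point is that the first snapshot only primes the cached APERF/MPERF values, and the second one then computes a frequency over a known, short interval):

    #define APERFMPERF_REFRESH_DELAY_MS 10  /* illustrative */

    unsigned int arch_freq_get_on_cpu(int cpu)
    {
        /* Illustrative helper: was the last snapshot within ~1s? */
        if (!aperfmperf_snapshot_is_recent(cpu)) {
            smp_call_function_single(cpu, aperfmperf_snapshot_khz,
                                     NULL, 1);
            msleep(APERFMPERF_REFRESH_DELAY_MS);
        }

        smp_call_function_single(cpu, aperfmperf_snapshot_khz, NULL, 1);
        return per_cpu(samples.khz, cpu);
    }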
-
- 29 Jul 2017, 1 commit
-
-
Submitted by Geert Uytterhoeven

Simon Horman reported that Koelsch and Lager hang during boot, and bisected this to commit 1c3c5eab ("sched/core: Enable might_sleep() and smp_processor_id() checks early"). The da9063/da9210 regulator quirk for R-Car Gen2 boards uses a bus notifier, and unregisters the notifier when it is no longer needed. However, a notifier must not be unregistered from within the call chain. This bug went unnoticed, as blocking_notifier_chain_unregister() didn't take the semaphore during early boot. The aforementioned commit changed that behavior, leading to a deadlock. Fix this by removing the call to bus_unregister_notifier(), and keeping local completion state instead.

Reported-by: Simon Horman <horms+renesas@verge.net.au>
Fixes: 663fbb52 ("ARM: shmobile: R-Car Gen2: Add da9063/da9210 regulator quirk")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
-
- 28 Jul 2017, 7 commits
-
-
Submitted by Alistair Popple

Commit 8e3f1b1d ("powerpc/powernv/pci: Enable 64-bit devices to access >4GB DMA space") introduced the ability for PCI device drivers to request a DMA mask between 64 and 32 bits and actually get a mask greater than 32 bits. However, currently, if certain machine-configuration-dependent conditions are not met, the code silently falls back to a 32-bit mask. This makes it hard for device drivers to detect which mask they actually got. Instead we should return an error when the request could not be fulfilled, which allows drivers to either fall back or implement other workarounds as documented in DMA-API-HOWTO.txt.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
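From the driver side, the documented pattern then looks roughly like this (a sketch of the DMA-API-HOWTO.txt convention, not code from the patch):

    #include <linux/dma-mapping.h>

    /* Sketch: ask for a wide mask, fall back explicitly if refused. */
    if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(48))) {
        dev_warn(&pdev->dev, "48-bit DMA unavailable, trying 32-bit\n");
        if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32)))
            return -EIO;    /* no usable DMA addressing */
    }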
-
Submitted by Michael Ellerman

Historically the boot wrapper was always built 32-bit big endian, even for 64-bit kernels. That was because old firmwares didn't necessarily support booting a 64-bit image. Because of that, arch/powerpc/boot/Makefile uses CROSS32CC for compilation. However, when we added 64-bit little endian support, we also added support for building the boot wrapper 64-bit. But we kept using CROSS32CC, because in most cases it is just CC and everything works. However, if the user doesn't specify CROSS32_COMPILE (which no one ever does AFAIK), and CC is *not* biarch (32/64-bit capable), then CROSS32CC becomes just "gcc". On native systems that is probably OK, but if we're cross building it definitely isn't, leading to e.g.:

    gcc ... -m64 -mlittle-endian -mabi=elfv2 ... arch/powerpc/boot/cpm-serial.c
    gcc: error: unrecognized argument in option ‘-mabi=elfv2’
    gcc: error: unrecognized command line option ‘-mlittle-endian’
    make: *** [zImage] Error 2

To fix it, stop using CROSS32CC, because we may or may not be building 32-bit. Instead set up a BOOTCC, which defaults to CC, and only use CROSS32_COMPILE if it's set and we're building for 32-bit.

Fixes: 147c0516 ("powerpc/boot: Add support for 64bit little endian wrapper")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
-
Submitted by Michael Ellerman

In smp_cpus_done() we need to call smp_ops->setup_cpu() for the boot CPU, which means it has to run *on* the boot CPU. In the past we ensured it ran on the boot CPU by changing the CPU affinity mask of current directly. That was removed in commit 6d11b87d ("powerpc/smp: Replace open coded task affinity logic"), and replaced with a work queue call. Unfortunately using a work queue leads to a lockdep warning, now that the CPU hotplug lock is a regular semaphore:

    ======================================================
    WARNING: possible circular locking dependency detected
    ...
    kworker/0:1/971 is trying to acquire lock:
     (cpu_hotplug_lock.rw_sem){++++++}, at: [<c000000000100974>] apply_workqueue_attrs+0x34/0xa0
    but task is already holding lock:
     ((&wfc.work)){+.+.+.}, at: [<c0000000000fdb2c>] process_one_work+0x25c/0x800
    ...
     CPU0                    CPU1
     ----                    ----
     lock((&wfc.work));
                             lock(cpu_hotplug_lock.rw_sem);
                             lock((&wfc.work));
     lock(cpu_hotplug_lock.rw_sem);

Although the deadlock can't happen in practice, because smp_cpus_done() only runs in early boot before CPU hotplug is allowed, lockdep can't tell that. Luckily, in commit 8fb12156 ("init: Pin init task to the boot CPU, initially") tglx changed the generic code to pin init to the boot CPU to begin with. The unpinning of init from the boot CPU happens in sched_init_smp(), which is called after smp_cpus_done(). So smp_cpus_done() is always called on the boot CPU, which means we don't need the work queue call at all - and the lockdep warning goes away.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
-
Submitted by Will Deacon

The vast majority of virtual allocations in the vmalloc region are followed by a guard page, which can help to avoid overrunning from one vma into another, which may map a read-sensitive device. This patch adds a guard page to the end of the kernel image mapping (i.e. following the data/bss segments).

Cc: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Submitted by Matthias Kaehlcke

The clang warning 'address-of-packed-member' is disabled for the general kernel code; also disable it for the x86 boot code. This suppresses a bunch of warnings like this when building with clang:

    ./arch/x86/include/asm/processor.h:535:30: warning: taking address of packed member 'sp0' of class or structure 'x86_hw_tss' may result in an unaligned pointer value [-Waddress-of-packed-member]
        return this_cpu_read_stable(cpu_tss.x86_tss.sp0);
                                    ^~~~~~~~~~~~~~~~~~~
    ./arch/x86/include/asm/percpu.h:391:59: note: expanded from macro 'this_cpu_read_stable'
    #define this_cpu_read_stable(var)       percpu_stable_op("mov", var)
                                                                    ^~~
    ./arch/x86/include/asm/percpu.h:228:16: note: expanded from macro 'percpu_stable_op'
            : "p" (&(var)));
                   ^~~

Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Cc: Doug Anderson <dianders@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170725215053.135586-1-mka@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Submitted by Gustavo Romero

Currently flush_tmregs_to_thread() does not save the TM SPRs (TFHAR, TFIAR, TEXASR) to the thread struct, unless the process is currently inside a suspended transaction. If the process is core dumping, and the TM SPRs have changed since the last time the process was context switched, then we will save stale values of the TM SPRs to the core dump. Fix it by saving the live register state to the thread struct in that case.

Fixes: 08e1c01d ("powerpc/ptrace: Enable support for TM SPR state")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Oliver O'Halloran

The Radix MMU translation tree as defined in ISA v3.0 contains two different types of entry, directories and leaves. Leaves are identified by _PAGE_PTE being set. The formats of the two entries are different, with the directory entries containing no spare bits for use by software. In particular, the bit we use for _PAGE_DEVMAP is not reserved for software, and is part of the NLB (Next Level Base) field, essentially the address of the next level in the tree. Note that the Linux pte_t is not == _PAGE_PTE. A huge page pmd entry (or devmap!) is also a leaf and so has _PAGE_PTE set, even though we use a pmd_t for it in Linux. The fix is to ensure that pmd/pte_devmap() confirm they are looking at a leaf entry (_PAGE_PTE) as well as checking _PAGE_DEVMAP.

Fixes: ebd31197 ("powerpc/mm: Add devmap support for ppc64")
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Add a comment in the code and flesh out change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
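The strengthened check, sketched (pte_raw()/cpu_to_be64() follow the style of the radix page-table accessors; the pmd variant is analogous):

    /* Sketch: a devmap entry must also be a leaf (_PAGE_PTE), otherwise
     * these could be NLB address bits in a directory entry. */
    static inline int pte_devmap(pte_t pte)
    {
        __be64 mask = cpu_to_be64(_PAGE_DEVMAP | _PAGE_PTE);

        return (pte_raw(pte) & mask) == mask;
    }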
-