- 06 Nov 2017, 27 commits
-
-
Submitted by Cyril Bur
This patch adds an _interruptible version of opal_async_wait_response(). This is useful when a long running OPAL call is performed on behalf of a userspace thread, for example, the opal_flash_{read,write,erase} functions performed by the powernv-flash MTD driver. It is foreseeable that these functions would take upwards of two minutes, causing the wait_event() to block long enough to cause hung task warnings. Furthermore, wait_event_interruptible() is preferable, as otherwise there is no way for signals to stop the process, which is going to be confusing in userspace. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
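For illustration, a minimal sketch of the interruptible-wait pattern being described; the names (example_async_wait, example_wait_response_interruptible, responded) are placeholders, not the actual opal-async.c implementation:

#include <linux/wait.h>
#include <linux/errno.h>

static DECLARE_WAIT_QUEUE_HEAD(example_async_wait);

/* Hypothetical helper: returns 0 once the response has arrived, or
 * -ERESTARTSYS if a signal interrupts the wait, instead of blocking
 * uninterruptibly for however long the OPAL call takes. */
static int example_wait_response_interruptible(bool *responded)
{
	return wait_event_interruptible(example_async_wait, *responded);
}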
-
Submitted by Stewart Smith
Parallel sensor reads could run out of async tokens because opal_get_sensor_data grabs a token but then does the sensor read behind a mutex, essentially serializing the (possibly asynchronous and relatively slow) sensor reads. It turns out that the mutex isn't needed at all: not only should the OPAL interface allow concurrent reads, the implementation is certainly safe for that, and if any particular sensor isn't, the kernel is the wrong place to do the mutual exclusion; OPAL should be doing it for the kernel. So, remove the mutex. Additionally, we shouldn't be printing out an error when we don't get a token, as the only way this should happen is if we've been interrupted in down_interruptible() on the semaphore. Reported-by: Robert Lippert <rlippert@google.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Cyril Bur
Future work will add an opal_async_wait_response_interruptible(), which will call wait_event_interruptible(). This work requires extra token state to be tracked, as wait_event_interruptible() can return and the caller could release the token before OPAL responds. Currently token state is tracked with two bitfields which are 64 bits big but may not need to be, as OPAL informs Linux how many async tokens there are. It also uses an array indexed by token to store response messages for each token. The bitfields make it difficult to add more state and also impose a hard maximum on how many tokens there can be - it is possible that OPAL will inform Linux that there are more than 64 tokens. Rather than add a bitfield to track the extra state, rework the internals slightly. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> [mpe: Fix __opal_async_get_token() when no tokens are free] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
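A sketch of what per-token state tracking could look like; the enum, struct and variable names below are assumptions for illustration, not the actual opal-async.c rework:

#include <asm/opal.h>

enum example_token_state {
	EX_TOKEN_UNALLOCATED,	/* free to hand out */
	EX_TOKEN_DISPATCHED,	/* leased to a caller, awaiting OPAL */
	EX_TOKEN_COMPLETED,	/* OPAL responded, caller not yet done */
};

struct example_async_token {
	enum example_token_state state;
	struct opal_msg response;	/* copy of the response message */
};

/* Allocated at boot from the token count OPAL reports, rather than
 * being capped at 64 by a fixed-size bitmap. */
static struct example_async_token *example_tokens;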
-
Submitted by Cyril Bur
There are no callers of both __opal_async_get_token() and __opal_async_release_token(). This patch also removes the possibility of an "emergency through synchronous call to __opal_async_get_token()", so it makes more sense to initialise opal_sync_sem for the maximum number of async tokens. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Cyril Bur
Because the MTD core might split up a read() or write() from userspace into several calls to the driver, we may fail to get a token after having already done some work. In that case it is best to return -EINTR back to userspace and let it decide what to do. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Cyril Bur
powernv_flash_probe() has pointless goto statements which jump to the end of the function to simply return a variable. Rather than checking for error and going to the label, just return the error as soon as it is detected. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Cyril Bur
While this driver expects to interact asynchronously, OPAL is well within its rights to return OPAL_SUCCESS to indicate that the operation completed without the need for a callback. We shouldn't treat OPAL_SUCCESS as an error; rather, we should wrap up and return promptly to the caller. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
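A sketch of the intended return-code handling, for illustration only; example_handle_opal_rc and need_wait are invented names, and the real powernv_flash code differs in detail:

#include <linux/types.h>
#include <linux/errno.h>
#include <asm/opal.h>

/* Map the OPAL return code: OPAL_ASYNC_COMPLETION means a callback will
 * follow, OPAL_SUCCESS means the op already finished synchronously and
 * is not an error, anything else is a genuine failure. */
static int example_handle_opal_rc(s64 rc, bool *need_wait)
{
	*need_wait = false;
	if (rc == OPAL_ASYNC_COMPLETION) {
		*need_wait = true;
		return 0;
	}
	if (rc == OPAL_SUCCESS)
		return 0;
	return -EIO;
}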
-
Submitted by Cyril Bur
BUG_ON() should be reserved for situations where we can no longer guarantee the integrity of the system. In the case where powernv_flash_async_op() receives an impossible op, we can still guarantee the integrity of the system. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
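A sketch of the kind of check that could replace the BUG_ON(); the enum and function names here are made up for illustration and are not the driver's own:

#include <linux/bug.h>
#include <linux/errno.h>

enum example_flash_op { EX_FLASH_OP_READ, EX_FLASH_OP_WRITE, EX_FLASH_OP_ERASE };

/* Reject an impossible op with an error rather than BUG_ON(), since the
 * integrity of the system is not at stake. */
static int example_check_op(enum example_flash_op op)
{
	switch (op) {
	case EX_FLASH_OP_READ:
	case EX_FLASH_OP_WRITE:
	case EX_FLASH_OP_ERASE:
		return 0;
	default:
		WARN_ON_ONCE(1);
		return -EIO;
	}
}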
-
The current code checks the completion map to look for the first token that is complete. In some cases, a completion can come in but the token can still be on lease to the caller processing the completion. If this completed but unreleased token is the first token found in the bitmap by another task trying to acquire a token, then the __test_and_set_bit call will fail since the token will still be on lease. The acquisition will then fail with an EBUSY. This patch reorganizes the acquisition code to look at the opal_async_token_map for an unleased token. If the token has no lease it must have no outstanding completions, so we should never see an EBUSY unless we have leased out too many tokens. Since opal_async_get_token_interruptible is protected by a semaphore, we will practically never see EBUSY anymore. Fixes: 8d724823 ("powerpc/powernv: Infrastructure to support OPAL async completion") Signed-off-by: William A. Kennington III <wak@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Arnd Bergmann
This interface is inefficient and deprecated because of the y2038 overflow. ktime_get_seconds() is an appropriate replacement here, since it has sufficient granularity but is more efficient and uses monotonic time. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
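The replacement pattern, as a small sketch; it assumes the deprecated interface being removed is get_seconds(), and example_timestamp is an invented wrapper:

#include <linux/timekeeping.h>

/* ktime_get_seconds() returns a 64-bit monotonic timestamp, avoiding
 * both the y2038 overflow and jumps from wall-clock changes. */
static time64_t example_timestamp(void)
{
	return ktime_get_seconds();
}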
-
Submitted by Vaibhav Jain
Presently the PSL9 specific cxl_stop_trace_psl9() only stops the RX0 traces on the CXL adapter when a PSL error irq is triggered. This patch updates the function to stop all the trace arrays and move them to the FIN state. The implementation issues the mmio to the TRACECFG register to stop a trace array only if it is not already in the FIN state. This prevents the issue of trace data being reset when multiple stop mmios are issued for a single trace array. The patch also does some refactoring of the existing cxl_stop_trace_psl9() and cxl_stop_trace_psl8() functions, moving them from 'debugfs.c' to 'pci.c' and marking them as static. Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Sandipan Das
Take advantage of stack_depth tracking, originally introduced for x64, in the powerpc JIT as well. Round the allocated stack up to a multiple of 16 bytes to make sure it stays aligned for functions called from the JITed bpf program. Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
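A sketch of the stack sizing idea; example_bpf_stack_size is an invented helper and this is not the actual ppc64 JIT code:

#include <linux/filter.h>
#include <linux/kernel.h>

/* Use the verifier's tracked stack_depth, rounded up to 16 bytes so the
 * frame stays aligned for functions called from the JITed program. */
static unsigned int example_bpf_stack_size(const struct bpf_prog *prog)
{
	return round_up(prog->aux->stack_depth, 16);
}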
-
Submitted by Michael Ellerman
Currently when we take a TM Bad Thing program check exception, we search the bug table to see if the program check was generated by a WARN/WARN_ON etc. That makes no sense: the WARN macros use trap instructions, which should never generate a TM Bad Thing exception. If they ever did that would be a bug and we should oops. We do have some hand-coded bugs in tm.S, using EMIT_BUG_ENTRY, but those are all BUGs not WARNs, and they all use trap instructions anyway. Almost certainly this check was incorrectly copied from the REASON_TRAP handling in the same function. Remove it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Michael Ellerman
Currently if the hardware supports the radix MMU we will use it, *unless* "disable_radix" is passed on the kernel command line. However some users would like the reverse semantics, ie. the kernel uses the hash MMU by default, unless radix is explicitly requested on the command line. So add a CONFIG option to choose whether we use radix by default or not, and expand the disable_radix command line option to allow "disable_radix=no" which *enables* radix. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
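A sketch of how the extended command line option could be parsed; the variable and function names are assumptions, and the real early-boot code differs in detail:

#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/string.h>
#include <linux/errno.h>

static bool example_disable_radix;

/* Accepts bare "disable_radix", "disable_radix=yes" and "disable_radix=no";
 * the "=no" form explicitly re-enables radix. */
static int __init example_parse_disable_radix(char *p)
{
	bool val;

	if (!p || !*p)
		val = true;
	else if (kstrtobool(p, &val))
		return -EINVAL;

	example_disable_radix = val;
	return 0;
}
early_param("disable_radix", example_parse_disable_radix);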
-
Submitted by Michael Ellerman
CONFIG_PPC_STD_MMU_64 indicates support for the "standard" powerpc MMU on 64-bit CPUs. The "standard" MMU refers to the hash page table MMU found in "server" processors, from IBM mainly. Currently CONFIG_PPC_STD_MMU_64 is == CONFIG_PPC_BOOK3S_64. While it's annoying to have two symbols that always have the same value, it's not quite annoying enough to bother removing one. However with the arrival of Power9, we now have the situation where CONFIG_PPC_STD_MMU_64 is enabled, but the kernel is running using the Radix MMU - *not* the "standard" MMU. So it is now actively confusing to use it, because it implies that code is disabled or inactive when the Radix MMU is in use, however that is not necessarily true. So s/CONFIG_PPC_STD_MMU_64/CONFIG_PPC_BOOK3S_64/, and do some minor formatting updates of some of the affected lines. This will be a pain for backports, but c'est la vie. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Michael Ellerman
The last user of CPU_FTR_ICSWX was removed in commit 6ff4d3e9 ("powerpc: Remove old unused icswx based coprocessor support"), so free the bit up for future use. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Aneesh Kumar K.V
Make the printks look a bit nicer by adding a prefix. A radix config now prints:
radix-mmu: Page sizes from device-tree:
radix-mmu: Page size shift = 12 AP=0x0
radix-mmu: Page size shift = 16 AP=0x5
radix-mmu: Page size shift = 21 AP=0x1
radix-mmu: Page size shift = 30 AP=0x2
This patch updates the hash config to produce similar dmesg output. With the patch we have:
hash-mmu: Page sizes from device-tree:
hash-mmu: base_shift=12: shift=12, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=0
hash-mmu: base_shift=12: shift=16, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=7
hash-mmu: base_shift=12: shift=24, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=56
hash-mmu: base_shift=16: shift=16, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=1
hash-mmu: base_shift=16: shift=24, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=8
hash-mmu: base_shift=20: shift=20, sllp=0x0111, avpnm=0x00000000, tlbiel=0, penc=2
hash-mmu: base_shift=24: shift=24, sllp=0x0100, avpnm=0x00000001, tlbiel=0, penc=0
hash-mmu: base_shift=34: shift=34, sllp=0x0120, avpnm=0x000007ff, tlbiel=0, penc=3
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
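The usual way to get such a prefix is a pr_fmt define at the top of the file, before any includes; this is only a sketch of the mechanism, and the actual patch may do it differently:

#define pr_fmt(fmt) "hash-mmu: " fmt

#include <linux/printk.h>

/* With the define above, every pr_info()/pr_err() in this file is
 * automatically prefixed, e.g.:
 *   pr_info("Page sizes from device-tree:\n");
 * prints "hash-mmu: Page sizes from device-tree:". */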
-
Submitted by Christophe Leroy
IPIC status is provided by the IPIC_SERSR register and not by IPIC_SERMR, which is the mask register. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Alexey Kardashevskiy
In order to make generic IOV code work, the physical function IOV BAR should start from the offset of the first VF. Since M64 segments share PE number space across the PHB, and some PEs may be in use at the time IOV is enabled, the existing code shifts the IOV BAR to the index of the first PE/VF. This creates a hole in IOMEM space which could potentially be taken by some other device. This patch reserves a temporary hole on the parent and releases it when IOV is disabled; the temporary resources are stored in pci_dn to avoid kmalloc/free. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Tyrel Datwyler
When a vdevice is DLPAR removed from the system the vio subsystem doesn't bother unmapping the virq from the irq_domain. As a result we have a virq mapped to a hardware irq that is no longer valid for the irq_domain. A side effect is that we are left with /proc/irq/<irq#> affinity entries, and attempts to modify the smp_affinity of the irq will fail. In the following observed example the kernel log is spammed by ics_rtas_set_affinity errors after the removal of a VSCSI adapter. This is a result of irqbalance trying to adjust the affinity every 10 seconds.
rpadlpar_io: slot U8408.E8E.10A7ACV-V5-C25 removed
ics_rtas_set_affinity: ibm,set-xive irq=655385 returns -3
ics_rtas_set_affinity: ibm,set-xive irq=655385 returns -3
This patch fixes the issue by calling irq_dispose_mapping() on the virq of the viodev on unregister. Fixes: f2ab6219 ("powerpc/pseries: Add PFO support to the VIO bus") Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
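A sketch of the shape of the fix; example_vio_teardown is an invented name and this is not the exact vio.c change:

#include <linux/irqdomain.h>
#include <asm/vio.h>

/* Dispose of the virq-to-hwirq mapping when the vio device is torn down,
 * so the irq_domain no longer points at hardware that has been removed
 * and stale /proc/irq/<irq#> entries go away with it. */
static void example_vio_teardown(struct vio_dev *viodev)
{
	if (viodev->irq)
		irq_dispose_mapping(viodev->irq);
}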
-
Submitted by Nicholas Piggin
According to the architecture, the process table entry cache must be flushed with tlbie RIC=2. Currently the process table entry is set to invalid right before the PID is returned to the allocator, with no invalidation. This works on existing implementations that are known to not cache the process table entry for any except the current PIDR. It is architecturally correct and cleaner to invalidate with RIC=2 after clearing the process table entry and before the PID is returned to the allocator. This can be done in arch_exit_mmap, which runs before the final flush, and by ensuring the final flush (fullmm) is always a RIC=2 variant. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Nicholas Piggin
Preempt should be consistently disabled for mm_is_thread_local tests, so bring the rest of these under preempt_disable(). Preempt does not need to be disabled for the mm->context.id tests, which allows simplification and removal of gotos. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
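The pattern being described, as a sketch; _tlbiel_pid, _tlbie_pid and RIC_FLUSH_TLB are internal to the radix TLB flush code and example_flush_tlb_mm is an invented wrapper, so treat this as illustrative rather than the actual diff:

#include <linux/mm_types.h>
#include <linux/preempt.h>
#include <asm/tlb.h>

/* The local-vs-global decision and the flush it selects must both happen
 * with preemption off, otherwise the task could migrate between the
 * mm_is_thread_local() test and the flush itself. */
static void example_flush_tlb_mm(struct mm_struct *mm)
{
	unsigned long pid = mm->context.id;

	preempt_disable();
	if (mm_is_thread_local(mm))
		_tlbiel_pid(pid, RIC_FLUSH_TLB);	/* this CPU only */
	else
		_tlbie_pid(pid, RIC_FLUSH_TLB);		/* broadcast */
	preempt_enable();
}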
-
Submitted by Nicholas Piggin
Close the recoverability gap for OPAL calls by using FIXUP_ENDIAN_HV in the return path. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Nicholas Piggin
Add an HV variant of FIXUP_ENDIAN which uses HSRR[01] and does not clear MSR[RI], which improves recoverability. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Nicholas Piggin
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Nicholas Piggin
When returning from an exception to a soft-enabled context, pending IRQs are replayed but IRQ tracing is not reset, so a number of them can get chained together into the same IRQ-disabled trace. Fix this by having __check_irq_replay re-set IRQ trace. This is conceptually where we respond to the next interrupt, so it fits the semantics of the IRQ tracer. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Nicholas Piggin
If the host takes a system reset interrupt while a guest is running, the CPU must exit the guest before processing the host exception handler. After this patch, taking a sysrq+x with a CPU running in a guest gives a trace like this:
cpu 0x27: Vector: 100 (System Reset) at [c000000fdf5776f0]
    pc: c008000010158b80: kvmppc_run_core+0x16b8/0x1ad0 [kvm_hv]
    lr: c008000010158b80: kvmppc_run_core+0x16b8/0x1ad0 [kvm_hv]
    sp: c000000fdf577850
   msr: 9000000002803033
  current = 0xc000000fdf4b1e00
  paca    = 0xc00000000fd4d680  softe: 3  irq_happened: 0x01
    pid   = 6608, comm = qemu-system-ppc
Linux version 4.14.0-rc7-01489-g47e1893a404a-dirty #26 SMP
[c000000fdf577a00] c008000010159dd4 kvmppc_vcpu_run_hv+0x3dc/0x12d0 [kvm_hv]
[c000000fdf577b30] c0080000100a537c kvmppc_vcpu_run+0x44/0x60 [kvm]
[c000000fdf577b60] c0080000100a1ae0 kvm_arch_vcpu_ioctl_run+0x118/0x310 [kvm]
[c000000fdf577c00] c008000010093e98 kvm_vcpu_ioctl+0x530/0x7c0 [kvm]
[c000000fdf577d50] c000000000357bf8 do_vfs_ioctl+0xd8/0x8c0
[c000000fdf577df0] c000000000358448 SyS_ioctl+0x68/0x100
[c000000fdf577e30] c00000000000b220 system_call+0x58/0x6c
--- Exception: c01 (System Call) at 00007fff76868df0
SP (7fff7069baf0) is in userspace
Fixes: e36d0a2e ("powerpc/powernv: Implement NMI IPI with OPAL_SIGNAL_SYSTEM_RESET") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 22 Oct 2017, 12 commits
-
-
Submitted by Markus Elfring
Although kfree(NULL) is legal, it's a bit lazy to rely on that to implement the error handling. So do it the normal Linux way using labels for each failure path. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> [mpe: Squash a few patches and rewrite change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
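The usual label-per-failure-path pattern, as a generic sketch rather than the actual patch; example_ctx and example_setup are invented names:

#include <linux/slab.h>

struct example_ctx {
	void *a;
	void *b;
};

static int example_setup(struct example_ctx *ctx)
{
	ctx->a = kzalloc(64, GFP_KERNEL);
	if (!ctx->a)
		return -ENOMEM;

	ctx->b = kzalloc(64, GFP_KERNEL);
	if (!ctx->b)
		goto err_free_a;

	return 0;

err_free_a:
	kfree(ctx->a);		/* unwind only what was actually allocated */
	ctx->a = NULL;
	return -ENOMEM;
}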
-
Submitted by Markus Elfring
Fix a word in these descriptions. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Markus Elfring
The local variable "rc" will eventually be set only to an error code. Thus omit the explicit initialisation at the beginning. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Geert Uytterhoeven
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Michael Ellerman
In the hv-24x7 code there is a function memord() which tries to implement a sort function returning -1, 0, 1. However one of the conditions is incorrect, such that it can never be true, because we will have already returned. I don't believe there is a bug in practice though, because the comparisons are an optimisation prior to calling memcmp(). Fix it by swapping the second comparison, so it can be true. Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
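The general shape of such a three-way comparator, as a sketch rather than the hv-24x7 code itself (example_memord is an invented name): the two size comparisons must point in opposite directions, otherwise one of them can never be reached before memcmp().

#include <linux/types.h>
#include <linux/string.h>

static int example_memord(const void *d1, size_t s1, const void *d2, size_t s2)
{
	if (s1 < s2)
		return -1;
	if (s1 > s2)
		return 1;

	return memcmp(d1, d2, s1);
}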
-
Submitted by Michael Ellerman
Back in 2008 we added support for "fast little-endian switch" in the syscall path. This added a special case syscall number 0x1ebe, which is caught very early in the system call exception and switches endian with as little overhead as possible. See commit 745a14cc ("[POWERPC] Add fast little-endian switch system call") for full details. Although it is fast, it's also completely non standard. The "syscall number" is out of the range of normal syscalls, it can't be traced or audited, and it's a bit of a wart. To the best of our knowledge it was only used by one program, now long since discontinued. So in an effort to shake out any current users, put it behind a config option, and make it default n. If anyone *is* using it they can quickly reinstate it with a rebuild, and we can flip it to default y. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Michael Ellerman
So we can #ifdef them in the next patch. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Michael Ellerman
When dumping the paca in xmon we currently show kstack. Although it's not hard, it's a bit fiddly to work out what the bounds of the kernel stack should be based on the kstack value. To make life easier, add "kstack_base", which is the base (lowest address) of the kernel stack, eg:
kstack      = 0xc0000000f1a7be30  (0x258)
kstack_base = 0xc0000000f1a78000
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Andrew Donnellan
i2c-dev provides an interface for userspace programs to interact with I2C devices, and is very helpful for I2C-related debugging. Enable it in pseries_defconfig and powernv_defconfig. It's already enabled in many other powerpc defconfigs. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Michael Ellerman
We call these functions with a non-NULL mm or vma. Hence we can skip the NULL check in these functions. We also remove the now unused function __local_flush_hugetlb_page(). Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Drop the checks with is_vm_hugetlb_page() as noticed by Nick] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Breno Leitao
Currently xmon can call XIVE functions from OPAL even if XIVE is disabled or does not exist in the system, as on POWER8 machines. This causes the following exception:
1:mon> dx
cpu 0x1: Vector: 700 (Program Check) at [c000000423c93450]
    pc: c00000000009cfa4: opal_xive_dump+0x50/0x68
    lr: c0000000000997b8: opal_return+0x0/0x50
This patch simply checks if XIVE is enabled before calling XIVE functions. Fixes: 243e2511 ("powerpc/xive: Native exploitation of the XIVE interrupt controller") Suggested-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
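A sketch of the guard being added; example_dump_xives is an invented name, and the printf here assumes xmon's own printf helper rather than a standard library call:

#include <asm/xive.h>

static void example_dump_xives(void)
{
	if (!xive_enabled()) {
		printf("XIVE not enabled on this system\n");
		return;
	}

	/* only now is it safe to make the OPAL XIVE calls */
}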
-
Submitted by Vaibhav Jain
The PSL/XSL_DEBUG registers on the adapter provide easy access to the debug facilities of the PSL/XSL. So this patch adds two new files (debug, xsl-debug) to the cxl-adapter-specific debugfs folder located at /sys/kernel/debugfs/cxl/card<n>, which provide direct r/w access to the corresponding debug registers in the adapter config-space. Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 21 Oct 2017, 1 commit
-
-
Submitted by Michael Neuling
Unfortunately userspace can construct a sigcontext which enables suspend. Thus userspace can force Linux into a path where trechkpt is executed. This patch blocks this from happening on POWER9 by sanity checking sigcontexts passed in. ptrace doesn't have this problem as only MSR SE and BE can be changed via ptrace. This patch also adds a number of WARN_ON()s in case we ever enter suspend when we shouldn't. This should not happen, but if it does the symptoms are soft lockup warnings which are not obviously TM related, so the WARN_ON()s should make it obvious what's happening. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-