- 27 3月, 2018 4 次提交
-
-
由 Michael Ellerman 提交于
This commit adds security feature flags to reflect the settings we receive from firmware regarding Spectre/Meltdown mitigations. The feature names reflect the names we are given by firmware on bare metal machines. See the hostboot source for details. Arguably these could be firmware features, but that then requires them to be read early in boot so they're available prior to asm feature patching, but we don't actually want to use them for patching. We may also want to dynamically update them in future, which would be incompatible with the way firmware features work (at the moment at least). So for now just make them separate flags. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
Currently the rfi-flush messages print 'Using <type> flush' for all enabled_flush_types, but that is not necessarily true -- as now the fallback flush is always enabled on pseries, but the fixup function overwrites its nop/branch slot with other flush types, if available. So, replace the 'Using <type> flush' messages with '<type> flush is available'. Also, print the patched flush types in the fixup function, so users can know what is (not) being used (e.g., the slower, fallback flush, or no flush type at all if flush is disabled via the debugfs switch). Suggested-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NMauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
For PowerVM migration we want to be able to call setup_rfi_flush() again after we've migrated the partition. To support that we need to check that we're not trying to allocate the fallback flush area after memblock has gone away (i.e., boot-time only). Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NMauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
rfi_flush_enable() includes a check to see if we're already enabled (or disabled), and in that case does nothing. But that means calling setup_rfi_flush() a 2nd time doesn't actually work, which is a bit confusing. Move that check into the debugfs code, where it really belongs. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NMauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 23 3月, 2018 5 次提交
-
-
由 Paul Mackerras 提交于
POWER9 has hardware bugs relating to transactional memory and thread reconfiguration (changes to hardware SMT mode). Specifically, the core does not have enough storage to store a complete checkpoint of all the architected state for all four threads. The DD2.2 version of POWER9 includes hardware modifications designed to allow hypervisor software to implement workarounds for these problems. This patch implements those workarounds in KVM code so that KVM guests see a full, working transactional memory implementation. The problems center around the use of TM suspended state, where the CPU has a checkpointed state but execution is not transactional. The workaround is to implement a "fake suspend" state, which looks to the guest like suspended state but the CPU does not store a checkpoint. In this state, any instruction that would cause a transition to transactional state (rfid, rfebb, mtmsrd, tresume) or would use the checkpointed state (treclaim) causes a "soft patch" interrupt (vector 0x1500) to the hypervisor so that it can be emulated. The trechkpt instruction also causes a soft patch interrupt. On POWER9 DD2.2, we avoid returning to the guest in any state which would require a checkpoint to be present. The trechkpt in the guest entry path which would normally create that checkpoint is replaced by either a transition to fake suspend state, if the guest is in suspend state, or a rollback to the pre-transactional state if the guest is in transactional state. Fake suspend state is indicated by a flag in the PACA plus a new bit in the PSSCR. The new PSSCR bit is write-only and reads back as 0. On exit from the guest, if the guest is in fake suspend state, we still do the treclaim instruction as we would in real suspend state, in order to get into non-transactional state, but we do not save the resulting register state since there was no checkpoint. Emulation of the instructions that cause a softpatch interrupt is handled in two paths. If the guest is in real suspend mode, we call kvmhv_p9_tm_emulation_early() to handle the cases where the guest is transitioning to transactional state. This is called before we do the treclaim in the guest exit path; because we haven't done treclaim, we can get back to the guest with the transaction still active. If the instruction is a case that kvmhv_p9_tm_emulation_early() doesn't handle, or if the guest is in fake suspend state, then we proceed to do the complete guest exit path and subsequently call kvmhv_p9_tm_emulation() in host context with the MMU on. This handles all the cases including the cases that generate program interrupts (illegal instruction or TM Bad Thing) and facility unavailable interrupts. The emulation is reasonably straightforward and is mostly concerned with checking for exception conditions and updating the state of registers such as MSR and CR0. The treclaim emulation takes care to ensure that the TEXASR register gets updated as if it were the guest treclaim instruction that had done failure recording, not the treclaim done in hypervisor state in the guest exit path. With this, the KVM_CAP_PPC_HTM capability returns true (1) even if transactional memory is not available to host userspace. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Paul Mackerras 提交于
POWER9 processors up to and including "Nimbus" v2.2 have hardware bugs relating to transactional memory and thread reconfiguration. One of these bugs has a workaround which is to get the core into SMT4 state temporarily. This workaround is only needed when running bare-metal. This patch provides a function which gets the core into SMT4 mode by preventing threads from going to a stop state, and waking up those which are already in a stop state. Once at least 3 threads are not in a stop state, the core will be in SMT4 and we can continue. To do this, we add a "dont_stop" flag to the paca to tell the thread not to go into a stop state. If this flag is set, power9_idle_stop() just returns immediately with a return value of 0. The pnv_power9_force_smt4_catch() function does the following: 1. Set the dont_stop flag for each thread in the core, except ourselves (in fact we use an atomic_inc() in case more than one thread is calling this function concurrently). 2. See how many threads are awake, indicated by their requested_psscr field in the paca being 0. If this is at least 3, skip to step 5. 3. Send a doorbell interrupt to each thread that was seen as being in a stop state in step 2. 4. Until at least 3 threads are awake, scan the threads to which we sent a doorbell interrupt and check if they are awake now. This relies on the following properties: - Once dont_stop is non-zero, requested_psccr can't go from zero to non-zero, except transiently (and without the thread doing stop). - requested_psscr being zero guarantees that the thread isn't in a state-losing stop state where thread reconfiguration could occur. - Doing stop with a PSSCR value of 0 won't be a state-losing stop and thus won't allow thread reconfiguration. - Once threads_per_core/2 + 1 (i.e. 3) threads are awake, the core must be in SMT4 mode, since SMT modes are powers of 2. This does add a sync to power9_idle_stop(), which is necessary to provide the correct ordering between setting requested_psscr and checking dont_stop. The overhead of the sync should be unnoticeable compared to the latency of going into and out of a stop state. Because some objected to incurring this extra latency on systems where the XER[SO] bug is not relevant, I have put the test in power9_idle_stop inside a feature section. This means that pnv_power9_force_smt4_catch() WILL NOT WORK correctly on systems without the CPU_FTR_P9_TM_XER_SO_BUG feature bit set, and will probably hang the system. In order to cater for uses where the caller has an operation that has to be done while the core is in SMT4, the core continues to be kept in SMT4 after pnv_power9_force_smt4_catch() function returns, until the pnv_power9_force_smt4_release() function is called. It undoes the effect of step 1 above and allows the other threads to go into a stop state. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Paul Mackerras 提交于
This adds a CPU feature bit which is set for POWER9 "Nimbus" DD2.2 processors which will be used to enable the hypervisor to assist hardware with the handling of checkpointed register values while the CPU is in suspend state, in order to work around hardware bugs. The hardware assistance for these workarounds introduced a new hardware bug relating to the XER[SO] bit. We add a separate feature bit for this bug in case future chips fix it while still requiring the hypervisor assistance with suspend state. When the dt_cpu_ftrs subsystem is in use, the software assistance can be enabled using a "tm-suspend-hypervisor-assist" node in the device tree, and a "tm-suspend-xer-so-bug" node enables the workarounds for the XER[SO] bug. In the absence of such nodes, a quirk enables both for POWER9 "Nimbus" DD2.2 processors. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Paul Mackerras 提交于
This moves all the CPU feature bits that are only used on 32-bit machines to the top 20 bits of the CPU feature word and arranges for them to be defined only in 32-bit builds. The features that are common to 32-bit and 64-bit machines are moved to bits 0-11 of the CPU feature word. This means that for 64-bit platforms, bits 44-63 can now be used for new features that only exist on 64-bit machines. (These bit numbers are counting from the right, i.e. the LSB is bit 0.) Because CPU_FTR_L3_DISABLE_NAP moved from the low 16 bits to the high 16 bits, we have to adjust some assembly code. Also, CPU_FTR_EMB_HV moved from the high 16 bits to the low 16 bits. Note that CPU_FTR_REAL_LE only applies to 64-bit chips, because only 64-bit chips (POWER6, 7, 8, 9) have a true little-endian mode that is a CPU execution mode as opposed to being a page attribute. With this we now have 20 free CPU feature bits on 64-bit machines. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Paul Mackerras 提交于
All PowerPC CPUs other than the original PPC601 have a timebase register rather than the "real-time clock" (RTC) register that the PPC601 (and the original POWER and POWER2 CPUs) had. Currently we have a CPU feature bit to indicate the presence of the timebase, but it makes more sense to use a bit to indicate the unusual situation rather than the common situation. This therefore defines a CPU_FTR_USE_RTC bit in place of the CPU_FTR_USE_TB bit, and arranges for it to be set on PPC601 systems. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 20 3月, 2018 2 次提交
-
-
由 Markus Elfring 提交于
It's slightly less error prone to use sizeof(*foo) rather than specifying the type. Signed-off-by: NMarkus Elfring <elfring@users.sourceforge.net> [mpe: Consolidate into one patch, rewrite change log] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Matt Brown 提交于
The flush_dcache_phys_range() function is no longer used in the kernel. The last usage was removed in c40785ad ("powerpc/dart: Use a cachable DART"). This patch removes the function and declaration. Signed-off-by: NMatt Brown <matthew.brown.dev@gmail.com> [mpe: Munge change log, include commit that removed last user] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 14 3月, 2018 1 次提交
-
-
由 Alexandre Belloni 提交于
The RTC core is always calling rtc_valid_tm after the read_time callback. It is not necessary to call it just before returning from the callback. Signed-off-by: NAlexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 13 3月, 2018 6 次提交
-
-
由 Mathieu Malaterre 提交于
early_init() and machine_init() have no prototype, add one in asm-prototypes.h. Fixes the following warnings (treated as error in W=1): arch/powerpc/kernel/setup_32.c:68:30: error: no previous prototype for ‘early_init’ arch/powerpc/kernel/setup_32.c:99:21: error: no previous prototype for ‘machine_init’ Signed-off-by: NMathieu Malaterre <malat@debian.org> [mpe: Move them to asm-prototypes.h, drop other functions] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Mathieu Malaterre 提交于
These functions can all be static, make it so. Signed-off-by: NMathieu Malaterre <malat@debian.org> [mpe: Combine a patch of Mathieu's with some other static conversions] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Mathieu Malaterre 提交于
When neither CONFIG_ALTIVEC, nor CONFIG_VSX or CONFIG_PPC64 is defined, the array feature_properties is defined as an empty array, which in turn triggers the following warning (treated as error on W=1): arch/powerpc/kernel/prom.c: In function ‘check_cpu_feature_properties’: arch/powerpc/kernel/prom.c:298:16: error: comparison of unsigned expression < 0 is always false for (i = 0; i < ARRAY_SIZE(feature_properties); ++i, ++fp) { ^ Suggested-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NMathieu Malaterre <malat@debian.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Mathieu Malaterre 提交于
Two functions did not have a prototype defined in signal.h header. Fix the following two warnings (treated as errors in W=1): arch/powerpc/kernel/signal_32.c:1135:6: error: no previous prototype for ‘sys_rt_sigreturn’ arch/powerpc/kernel/signal_32.c:1422:6: error: no previous prototype for ‘sys_sigreturn’ Signed-off-by: NMathieu Malaterre <malat@debian.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Mathieu Malaterre 提交于
__giveup_fpu() is never called outside process.c, so it can be static. That also means we don't need an empty definition in switch_to.h Signed-off-by: NMathieu Malaterre <malat@debian.org> [mpe: Also drop the empty version, rewrite change log] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Mathieu Malaterre 提交于
Since the value of `tmp` is never intended to be read, declare both `tmp` variables as unused. Fix warning (treated as error in W=1): arch/powerpc/kernel/signal_32.c: In function ‘sys_swapcontext’: arch/powerpc/kernel/signal_32.c:1048:16: error: variable ‘tmp’ set but not used arch/powerpc/kernel/signal_32.c: In function ‘sys_debug_setcontext’: arch/powerpc/kernel/signal_32.c:1234:16: error: variable ‘tmp’ set but not used Signed-off-by: NMathieu Malaterre <malat@debian.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 06 3月, 2018 2 次提交
-
-
由 Christophe Leroy 提交于
While the implementation of the "slices" address space allows a significant amount of high slices, it limits the number of low slices to 16 due to the use of a single u64 low_slices_psize element in struct mm_context_t On the 8xx, the minimum slice size is the size of the area covered by a single PMD entry, ie 4M in 4K pages mode and 64M in 16K pages mode. This means we could have at least 64 slices. In order to override this limitation, this patch switches the handling of low_slices_psize to char array as done already for high_slices_psize. Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Christophe Leroy 提交于
On the 8xx, the page size is set in the PMD entry and applies to all pages of the page table pointed by the said PMD entry. When an app has some regular pages allocated (e.g. see below) and tries to mmap() a huge page at a hint address covered by the same PMD entry, the kernel accepts the hint allthough the 8xx cannot handle different page sizes in the same PMD entry. 10000000-10001000 r-xp 00000000 00:0f 2597 /root/malloc 10010000-10011000 rwxp 00000000 00:0f 2597 /root/malloc mmap(0x10080000, 524288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0) = 0x10080000 This results the app remaining forever in do_page_fault()/hugetlb_fault() and when interrupting that app, we get the following warning: [162980.035629] WARNING: CPU: 0 PID: 2777 at arch/powerpc/mm/hugetlbpage.c:354 hugetlb_free_pgd_range+0xc8/0x1e4 [162980.035699] CPU: 0 PID: 2777 Comm: malloc Tainted: G W 4.14.6 #85 [162980.035744] task: c67e2c00 task.stack: c668e000 [162980.035783] NIP: c000fe18 LR: c00e1eec CTR: c00f90c0 [162980.035830] REGS: c668fc20 TRAP: 0700 Tainted: G W (4.14.6) [162980.035854] MSR: 00029032 <EE,ME,IR,DR,RI> CR: 24044224 XER: 20000000 [162980.036003] [162980.036003] GPR00: c00e1eec c668fcd0 c67e2c00 00000010 c6869410 10080000 00000000 77fb4000 [162980.036003] GPR08: ffff0001 0683c001 00000000 ffffff80 44028228 10018a34 00004008 418004fc [162980.036003] GPR16: c668e000 00040100 c668e000 c06c0000 c668fe78 c668e000 c6835ba0 c668fd48 [162980.036003] GPR24: 00000000 73ffffff 74000000 00000001 77fb4000 100fffff 10100000 10100000 [162980.036743] NIP [c000fe18] hugetlb_free_pgd_range+0xc8/0x1e4 [162980.036839] LR [c00e1eec] free_pgtables+0x12c/0x150 [162980.036861] Call Trace: [162980.036939] [c668fcd0] [c00f0774] unlink_anon_vmas+0x1c4/0x214 (unreliable) [162980.037040] [c668fd10] [c00e1eec] free_pgtables+0x12c/0x150 [162980.037118] [c668fd40] [c00eabac] exit_mmap+0xe8/0x1b4 [162980.037210] [c668fda0] [c0019710] mmput.part.9+0x20/0xd8 [162980.037301] [c668fdb0] [c001ecb0] do_exit+0x1f0/0x93c [162980.037386] [c668fe00] [c001f478] do_group_exit+0x40/0xcc [162980.037479] [c668fe10] [c002a76c] get_signal+0x47c/0x614 [162980.037570] [c668fe70] [c0007840] do_signal+0x54/0x244 [162980.037654] [c668ff30] [c0007ae8] do_notify_resume+0x34/0x88 [162980.037744] [c668ff40] [c000dae8] do_user_signal+0x74/0xc4 [162980.037781] Instruction dump: [162980.037821] 7fdff378 81370000 54a3463a 80890020 7d24182e 7c841a14 712a0004 4082ff94 [162980.038014] 2f890000 419e0010 712a0ff0 408200e0 <0fe00000> 54a9000a 7f984840 419d0094 [162980.038216] ---[ end trace c0ceeca8e7a5800a ]--- [162980.038754] BUG: non-zero nr_ptes on freeing mm: 1 [162985.363322] BUG: non-zero nr_ptes on freeing mm: -1 In order to fix this, this patch uses the address space "slices" implemented for BOOK3S/64 and enhanced to support PPC32 by the preceding patch. This patch modifies the context.id on the 8xx to be in the range [1:16] instead of [0:15] in order to identify context.id == 0 as not initialised contexts as done on BOOK3S This patch activates CONFIG_PPC_MM_SLICES when CONFIG_HUGETLB_PAGE is selected for the 8xx Alltough we could in theory have as many slices as PMD entries, the current slices implementation limits the number of low slices to 16. This limitation is not preventing us to fix the initial issue allthough it is suboptimal. It will be cured in a subsequent patch. Fixes: 4b914286 ("powerpc/8xx: Implement support of hugepages") Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 22 2月, 2018 1 次提交
-
-
由 Michael Bringmann 提交于
This reverts commit 02ef6dd8. The earlier patch tried to enable support for a new property "ibm,drc-info" on powerpc systems. Unfortunately, some errors in the associated patch set break things in some of the DLPAR operations. In particular when attempting to hot-add a new CPU or set of CPUs, the original patch failed to properly calculate the available resources, and aborted the operation. In addition, the original set missed several opportunities to compress and reuse common code. As the associated patch set was meant to provide an optimization of storage and performance of a set of device-tree properties for future systems with large amounts of resources, reverting just restores the previous behavior for existing systems. It seems unnecessary to enable this feature and introduce the consequent problems in the field that it will cause at this time, so please revert it for now until testing of the corrections are finished properly. Fixes: 02ef6dd8 ("powerpc: Enable support for ibm,drc-info devtree property") Signed-off-by: NMichael W. Bringmann <mwb@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 21 2月, 2018 1 次提交
-
-
由 Juan J. Alvarez 提交于
The notify_resume() callback in eeh_ops is NULL on powernv, leading to crashes: NIP (null) LR eeh_report_resume+0x218/0x220 Call Trace: eeh_report_resume+0x1f0/0x220 (unreliable) eeh_pe_dev_traverse+0x98/0x170 eeh_handle_normal_event+0x3f4/0x650 eeh_handle_event+0x54/0x380 eeh_event_handler+0x14c/0x210 kthread+0x168/0x1b0 ret_from_kernel_thread+0x5c/0xb4 Fix it by adding a check before calling it. Fixes: 856e1eb9 ("PCI/AER: Add uevents in AER and EEH error/resume") Signed-off-by: NJuan J. Alvarez <jjalvare@linux.vnet.ibm.com> Reviewed-by: NBryant G. Ly <bryantly@linux.vnet.ibm.com> Tested-by: NCarol L. Soto <clsoto@us.ibm.com> Reviewed-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com> Tested-by: NMauro S. M. Rodrigues <maurosr@linux.vnet.ibm.com> Acked-by: NMichael Neuling <mikey@neuling.org> [mpe: Rewrite change log] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 15 2月, 2018 1 次提交
-
-
由 Cyril Bur 提交于
The TSCR can only be accessed in hypervisor mode. Fixes: 88b5e12eeb11 ("powerpc: Expose TSCR via sysfs") Signed-off-by: NCyril Bur <cyrilbur@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 12 2月, 2018 1 次提交
-
-
由 Linus Torvalds 提交于
This is the mindless scripted replacement of kernel use of POLL* variables as described by Al, done by this script: for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'` for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done done with de-mangling cleanups yet to come. NOTE! On almost all architectures, the EPOLL* constants have the same values as the POLL* constants do. But they keyword here is "almost". For various bad reasons they aren't the same, and epoll() doesn't actually work quite correctly in some cases due to this on Sparc et al. The next patch from Al will sort out the final differences, and we should be all done. Scripted-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 11 2月, 2018 1 次提交
-
-
由 Alexey Kardashevskiy 提交于
59f47eff ("powerpc/pci: Use of_irq_parse_and_map_pci() helper") replaced of_irq_parse_pci() + irq_create_of_mapping() with of_irq_parse_and_map_pci(), but neglected to capture the virq returned by irq_create_of_mapping(), so virq remained zero, which caused INTx configuration to fail. Save the virq value returned by of_irq_parse_and_map_pci() and correct the virq declaration to match the of_irq_parse_and_map_pci() signature. Fixes: 59f47eff "powerpc/pci: Use of_irq_parse_and_map_pci() helper" Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> [bhelgaas: changelog] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 08 2月, 2018 1 次提交
-
-
由 Nicholas Piggin 提交于
The soft IRQ masking code has to hard-disable interrupts in cases where the exception is not cleared by the masked handler. External interrupts used this approach for soft masking. Now recently PMU interrupts do the same thing. The soft IRQ masking code additionally allowed for interrupt handlers to hard-enable interrupts after soft-disabling them. The idea is to allow PMU interrupts through to profile interrupt handlers. So when interrupts are being replayed when there is a pending interrupt that requires hard-disabling, there is a test to prevent those handlers from hard-enabling them if there is a pending external interrupt. may_hard_irq_enable() handles this. After f442d004 ("powerpc/64s: Add support to mask perf interrupts and replay them"), may_hard_irq_enable() could prematurely enable MSR[EE] when a PMU exception exists, which would result in the interrupt firing again while masked, and MSR[EE] being disabled again. I haven't seen that this could cause a serious problem, but it's more consistent to handle these soft-masked interrupts in the same way. So introduce a define for all types of interrupts that require MSR[EE] masking in their soft-disable handlers, and use that in may_hard_irq_enable(). Fixes: f442d004 ("powerpc/64s: Add support to mask perf interrupts and replay them") Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Reviewed-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 28 1月, 2018 3 次提交
-
-
由 Michael Ellerman 提交于
When a CPU detects its locked up via soft_nmi_interrupt() we have pt_regs, so print the regs->nip, which points to where we took the soft-NMI. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
soft_nmi_interrupt() is called directly from the asm exception handling code, which passes regs as a pointer to the stack. So regs can't be NULL, it may be full of junk, but that's a separate problem. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
Use pr_fmt() in the watchdog code, so we don't have to say "Watchdog" so many times. Rather than "CPU:%d" just spell it "CPU %d", "Hard" doesn't need a capital in the middle of a sentence, and "LOCKUP other CPUS" should be "LOCKUP on other CPUS". Also make it clear when a CPU self detects a lockup by spelling it out. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 27 1月, 2018 5 次提交
-
-
由 Sukadev Bhattiprolu 提交于
clear_thread_tidr() is called in interrupt context as a part of delayed put of the task structure (i.e as a part of timer interrupt). To prevent a deadlock, block interrupts when holding vas_thread_id_lock to set/ clear TIDR for a task. Fixes: ec233ede ("powerpc: Add support for setting SPRN_TIDR") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Bryant G. Ly 提交于
When enabling SR-IOV in pseries platform, the VF bar properties for a PF are reported on the device node in the device tree. This patch adds the IOV Bar resources to Linux structures from the device tree for later use when configuring SR-IOV by PF driver. Signed-off-by: NBryant G. Ly <bryantly@linux.vnet.ibm.com> Signed-off-by: NJuan J. Alvarez <jjalvare@linux.vnet.ibm.com> Acked-by: NRussell Currey <ruscur@russell.cc> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Bryant G. Ly 提交于
Introduce a method for notify resume to be called from sysfs. In this patch one can now call notify resume from sysfs when is supported by platform. Signed-off-by: NBryant G. Ly <bryantly@linux.vnet.ibm.com> Signed-off-by: NJuan J. Alvarez <jjalvare@linux.vnet.ibm.com> Acked-by: NRussell Currey <ruscur@russell.cc> [mpe: Add NULL check, add empty versions to avoid #ifdefs] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Bryant G. Ly 提交于
Devices can go offline when erors reported. This patch adds a change to the kernel object and lets udev know of error. When device resumes, a change is also set reporting device as online. Therefore, EEH and AER events are better propagated to user space for PCI devices in all arches. Signed-off-by: NBryant G. Ly <bryantly@linux.vnet.ibm.com> Signed-off-by: NJuan J. Alvarez <jjalvare@linux.vnet.ibm.com> Acked-by: NBjorn Helgaas <bhelgaas@google.com> Acked-by: NRussell Currey <ruscur@russell.cc> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Bryant G. Ly 提交于
Add EEH platform operations for pseries to update VF config space. With this change after EEH, the VF will have updated config space for pseries platform. Signed-off-by: NBryant G. Ly <bryantly@linux.vnet.ibm.com> Signed-off-by: NJuan J. Alvarez <jjalvare@linux.vnet.ibm.com> Acked-by: NRussell Currey <ruscur@russell.cc> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 23 1月, 2018 3 次提交
-
-
由 Nicholas Piggin 提交于
The fallback RFI flush is used when firmware does not provide a way to flush the cache. It's a "displacement flush" that evicts useful data by displacing it with an uninteresting buffer. The flush has to take care to work with implementation specific cache replacment policies, so the recipe has been in flux. The initial slow but conservative approach is to touch all lines of a congruence class, with dependencies between each load. It has since been determined that a linear pattern of loads without dependencies is sufficient, and is significantly faster. Measuring the speed of a null syscall with RFI fallback flush enabled gives the relative improvement: P8 - 1.83x P9 - 1.75x The flush also becomes simpler and more adaptable to different cache geometries. Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Eric W. Biederman 提交于
There are so many places that build struct siginfo by hand that at least one of them is bound to get it wrong. A handful of cases in the kernel arguably did just that when using the errno field of siginfo to pass no errno values to userspace. The usage is limited to a single si_code so at least does not mess up anything else. Encapsulate this questionable pattern in a helper function so that the userspace ABI is preserved. Update all of the places that use this pattern to use the new helper function. Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
由 Eric W. Biederman 提交于
signal_code is always TRAP_HWBKPT Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
- 22 1月, 2018 3 次提交
-
-
由 Nicholas Piggin 提交于
Platforms with a panic handler that halts the system can have problems getting kernel messages out, because the panic notifiers are called before kernel/panic.c does its flushing of printk buffers an console etc. This was attempted to be solved with commit a3b2cb30 ("powerpc: Do not call ppc_md.panic in fadump panic notifier"), but that wasn't the right approach and caused other problems, and was reverted by commit ab9dbf77. Instead, the powernv shutdown paths have already had a similar problem, fixed by taking the message flushing sequence from kernel/panic.c. That's a little bit ugly, but while we have the code duplicated, it will work for this case as well. So have ppc panic handlers do the same flushing before they terminate. Without this patch, a qemu pseries_le_defconfig guest stops silently when issued the nmi command when xmon is off and no crash dumpers enabled. Afterwards, an oops is printed by each CPU as expected. Fixes: ab9dbf77 ("Revert "powerpc: Do not call ppc_md.panic in fadump panic notifier"") Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Gustavo Romero 提交于
Currently it's possible that a thread on PPC64 LE has its endianness flipped inadvertently to Big-Endian resulting in a crash once the process is back from the signal handler. If giveup_all() is called when regs->msr has the bits MSR.FP and MSR.VEC disabled (and hence MSR.VSX disabled too) it returns without calling check_if_tm_restore_required() which copies regs->msr to ckpt_regs->msr if the process caught a signal whilst in transactional mode. Then once in setup_tm_sigcontexts() MSR from ckpt_regs.msr is used, but since check_if_tm_restore_required() was not called previuosly, gp_regs[PT_MSR] gets a copy of invalid MSR bits as MSR in ckpt_regs was not updated from regs->msr and so is zeroed. Later when leaving the signal handler once in sys_rt_sigreturn() the TS bits of gp_regs[PT_MSR] are checked to determine if restore_tm_sigcontexts() must be called to pull in the correct MSR state into the user context. Because TS bits are zeroed restore_tm_sigcontexts() is never called and MSR restored from the user context on returning from the signal handler has the MSR.LE (the endianness bit) forced to zero (Big-Endian). That leads, for instance, to 'nop' being treated as an illegal instruction in the following sequence: tbegin. beq 1f trap tend. 1: nop on PPC64 LE machines and the process dies just after returning from the signal handler. PPC64 BE is also affected but in a subtle way since forcing Big-Endian on a BE machine does not change the endianness. This commit fixes the issue described above by ensuring that once in setup_tm_sigcontexts() the MSR used is from regs->msr instead of from ckpt_regs->msr and by ensuring that we pull in only the MSR.FP, MSR.VEC, and MSR.VSX bits from ckpt_regs->msr. The fix was tested both on LE and BE machines and no regression regarding the powerpc/tm selftests was observed. Signed-off-by: NGustavo Romero <gromero@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Anton Blanchard 提交于
The thread switch control register (TSCR) is a per core register that configures how the CPU shares resources between SMT threads. Exposing it via sysfs allows us to tune it at run time. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-