Commits · 9a868f634349e62922c226834aa23e3d1329ae7f · openanolis / cloud-kernel

27 Mar, 2018 4 commits

powerpc: Add security feature flags for Spectre/Meltdown · 9a868f63

Committed by Michael Ellerman on Mar 27, 2018

This commit adds security feature flags to reflect the settings we
receive from firmware regarding Spectre/Meltdown mitigations.

The feature names reflect the names we are given by firmware on bare
metal machines. See the hostboot source for details.

Arguably these could be firmware features, but that then requires them
to be read early in boot so they're available prior to asm feature
patching, but we don't actually want to use them for patching. We may
also want to dynamically update them in future, which would be
incompatible with the way firmware features work (at the moment at
least). So for now just make them separate flags.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

9a868f63

powerpc/rfi-flush: Differentiate enabled and patched flush types · 0063d61c

Committed by Mauricio Faria de Oliveira on Mar 14, 2018

Currently the rfi-flush messages print 'Using <type> flush' for all
enabled_flush_types, but that is not necessarily true -- as now the
fallback flush is always enabled on pseries, but the fixup function
overwrites its nop/branch slot with other flush types, if available.

So, replace the 'Using <type> flush' messages with '<type> flush is
available'.

Also, print the patched flush types in the fixup function, so users
can know what is (not) being used (e.g., the slower, fallback flush,
or no flush type at all if flush is disabled via the debugfs switch).
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

0063d61c

powerpc/rfi-flush: Make it possible to call setup_rfi_flush() again · abf110f3

Committed by Michael Ellerman on Mar 14, 2018

For PowerVM migration we want to be able to call setup_rfi_flush()
again after we've migrated the partition.

To support that we need to check that we're not trying to allocate the
fallback flush area after memblock has gone away (i.e., boot-time only).
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

abf110f3

powerpc/rfi-flush: Move the logic to avoid a redo into the debugfs code · 1e2a9fc7

Committed by Michael Ellerman on Mar 14, 2018

rfi_flush_enable() includes a check to see if we're already
enabled (or disabled), and in that case does nothing.

But that means calling setup_rfi_flush() a 2nd time doesn't actually
work, which is a bit confusing.

Move that check into the debugfs code, where it really belongs.
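
The effect of moving the check can be pictured with a small userspace sketch. This is only an illustration under assumptions: the variable names, the `patch_count` counter and the `rfi_flush_set()` setter are hypothetical stand-ins, not the kernel's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state, for illustration only. */
bool rfi_flush;      /* current enabled/disabled state */
int patch_count;     /* how many times the flush fixups were (re)applied */

/* Always (re)applies the patching, even when the state does not change,
 * so a second setup call really takes effect. */
void rfi_flush_enable(bool enable)
{
	patch_count++;   /* stands in for re-patching the flush slots */
	rfi_flush = enable;
}

/* debugfs-style setter: the "already in this state" check lives here. */
int rfi_flush_set(unsigned long long val)
{
	bool enable;

	if (val == 1)
		enable = true;
	else if (val == 0)
		enable = false;
	else
		return -22;  /* -EINVAL */

	/* Only do anything if we're changing state. */
	if (enable != rfi_flush)
		rfi_flush_enable(enable);

	return 0;
}
```

With this split, repeated writes of the same value through the debugfs-style setter stay cheap, while a direct call to rfi_flush_enable() always redoes the work.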
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

1e2a9fc7

23 Mar, 2018 5 commits

KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9 · 4bb3c7a0

Committed by Paul Mackerras on Mar 21, 2018

POWER9 has hardware bugs relating to transactional memory and thread
reconfiguration (changes to hardware SMT mode). Specifically, the core
does not have enough storage to store a complete checkpoint of all the
architected state for all four threads. The DD2.2 version of POWER9
includes hardware modifications designed to allow hypervisor software
to implement workarounds for these problems. This patch implements
those workarounds in KVM code so that KVM guests see a full, working
transactional memory implementation.

The problems center around the use of TM suspended state, where the
CPU has a checkpointed state but execution is not transactional. The
workaround is to implement a "fake suspend" state, which looks to the
guest like suspended state but the CPU does not store a checkpoint.
In this state, any instruction that would cause a transition to
transactional state (rfid, rfebb, mtmsrd, tresume) or would use the
checkpointed state (treclaim) causes a "soft patch" interrupt (vector
0x1500) to the hypervisor so that it can be emulated. The trechkpt
instruction also causes a soft patch interrupt.

On POWER9 DD2.2, we avoid returning to the guest in any state which
would require a checkpoint to be present. The trechkpt in the guest
entry path which would normally create that checkpoint is replaced by
either a transition to fake suspend state, if the guest is in suspend
state, or a rollback to the pre-transactional state if the guest is in
transactional state. Fake suspend state is indicated by a flag in the
PACA plus a new bit in the PSSCR. The new PSSCR bit is write-only and
reads back as 0.

On exit from the guest, if the guest is in fake suspend state, we still
do the treclaim instruction as we would in real suspend state, in order
to get into non-transactional state, but we do not save the resulting
register state since there was no checkpoint.

Emulation of the instructions that cause a softpatch interrupt is
handled in two paths. If the guest is in real suspend mode, we call
kvmhv_p9_tm_emulation_early() to handle the cases where the guest is
transitioning to transactional state. This is called before we do the
treclaim in the guest exit path; because we haven't done treclaim, we
can get back to the guest with the transaction still active. If the
instruction is a case that kvmhv_p9_tm_emulation_early() doesn't
handle, or if the guest is in fake suspend state, then we proceed to
do the complete guest exit path and subsequently call
kvmhv_p9_tm_emulation() in host context with the MMU on. This handles
all the cases including the cases that generate program interrupts
(illegal instruction or TM Bad Thing) and facility unavailable
interrupts.

The emulation is reasonably straightforward and is mostly concerned
with checking for exception conditions and updating the state of
registers such as MSR and CR0. The treclaim emulation takes care to
ensure that the TEXASR register gets updated as if it were the guest
treclaim instruction that had done failure recording, not the treclaim
done in hypervisor state in the guest exit path.

With this, the KVM_CAP_PPC_HTM capability returns true (1) even if
transactional memory is not available to host userspace.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

4bb3c7a0

powerpc/powernv: Provide a way to force a core into SMT4 mode · 7672691a

Committed by Paul Mackerras on Mar 21, 2018

POWER9 processors up to and including "Nimbus" v2.2 have hardware
bugs relating to transactional memory and thread reconfiguration.
One of these bugs has a workaround which is to get the core into
SMT4 state temporarily.  This workaround is only needed when
running bare-metal.

This patch provides a function which gets the core into SMT4 mode
by preventing threads from going to a stop state, and waking up
those which are already in a stop state.  Once at least 3 threads
are not in a stop state, the core will be in SMT4 and we can
continue.

To do this, we add a "dont_stop" flag to the paca to tell the
thread not to go into a stop state.  If this flag is set,
power9_idle_stop() just returns immediately with a return value
of 0.  The pnv_power9_force_smt4_catch() function does the following:

1. Set the dont_stop flag for each thread in the core, except
   ourselves (in fact we use an atomic_inc() in case more than
   one thread is calling this function concurrently).
2. See how many threads are awake, indicated by their
   requested_psscr field in the paca being 0.  If this is at
   least 3, skip to step 5.
3. Send a doorbell interrupt to each thread that was seen as
   being in a stop state in step 2.
4. Until at least 3 threads are awake, scan the threads to which
   we sent a doorbell interrupt and check if they are awake now.

This relies on the following properties:

- Once dont_stop is non-zero, requested_psscr can't go from zero to
  non-zero, except transiently (and without the thread doing stop).
- requested_psscr being zero guarantees that the thread isn't in
  a state-losing stop state where thread reconfiguration could occur.
- Doing stop with a PSSCR value of 0 won't be a state-losing stop
  and thus won't allow thread reconfiguration.
- Once threads_per_core/2 + 1 (i.e. 3) threads are awake, the core
  must be in SMT4 mode, since SMT modes are powers of 2.
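
The counting rule in the last property can be sketched in plain C. This is a userspace illustration only: the array models each thread's requested_psscr field, and the helper names and the fixed THREADS_PER_CORE value are assumptions for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define THREADS_PER_CORE 4   /* assumption: a 4-thread POWER9 core */

/* Count threads whose (simulated) requested_psscr is 0, i.e. awake. */
int count_awake(const unsigned long requested_psscr[THREADS_PER_CORE])
{
	int awake = 0;

	for (int i = 0; i < THREADS_PER_CORE; i++)
		if (requested_psscr[i] == 0)
			awake++;
	return awake;
}

/* SMT modes are powers of 2, so more than half the threads being awake
 * cannot fit in SMT1 or SMT2 and therefore implies SMT4. */
bool core_must_be_smt4(int awake)
{
	return awake >= THREADS_PER_CORE / 2 + 1;
}
```

With two threads awake the core could still be in SMT2, which is why the threshold is 3 rather than 2.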

This does add a sync to power9_idle_stop(), which is necessary to
provide the correct ordering between setting requested_psscr and
checking dont_stop.  The overhead of the sync should be unnoticeable
compared to the latency of going into and out of a stop state.
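
One way to picture the ordering requirement is the sketch below, written with C11 atomics instead of the real powerpc assembly. It is an assumption-laden model: the sequentially consistent fence plays the role of the sync, and the function name and return convention (0 when the stop is refused, the psscr value when it would have stopped) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

atomic_int dont_stop;           /* models the paca flag set by the catcher */
atomic_ulong requested_psscr;   /* 0 means "awake" */

/* Idle side: publish the request, then check dont_stop. */
unsigned long idle_stop_sketch(unsigned long psscr)
{
	atomic_store_explicit(&requested_psscr, psscr, memory_order_relaxed);

	/* Full barrier: the store above must be visible before the load
	 * below is performed (the role of the added sync). */
	atomic_thread_fence(memory_order_seq_cst);

	if (atomic_load_explicit(&dont_stop, memory_order_relaxed)) {
		/* Told not to stop: withdraw the request and return 0. */
		atomic_store_explicit(&requested_psscr, 0, memory_order_relaxed);
		return 0;
	}

	/* A real implementation would execute the stop instruction here. */
	atomic_store_explicit(&requested_psscr, 0, memory_order_relaxed);
	return psscr;
}
```

The intent is that if the catching side sets dont_stop and then reads requested_psscr with a matching barrier, at least one of the two sides observes the other's write, so a thread cannot slip into a stop state unnoticed.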

Because some objected to incurring this extra latency on systems where
the XER[SO] bug is not relevant, I have put the test in
power9_idle_stop inside a feature section.  This means that
pnv_power9_force_smt4_catch() WILL NOT WORK correctly on systems
without the CPU_FTR_P9_TM_XER_SO_BUG feature bit set, and will
probably hang the system.

In order to cater for uses where the caller has an operation that
has to be done while the core is in SMT4, the core continues to be
kept in SMT4 after pnv_power9_force_smt4_catch() function returns,
until the pnv_power9_force_smt4_release() function is called.
It undoes the effect of step 1 above and allows the other threads
to go into a stop state.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

7672691a

powerpc: Add CPU feature bits for TM bug workarounds on POWER9 v2.2 · b5af4f27

Committed by Paul Mackerras on Mar 21, 2018

This adds a CPU feature bit which is set for POWER9 "Nimbus" DD2.2
processors which will be used to enable the hypervisor to assist
hardware with the handling of checkpointed register values while the
CPU is in suspend state, in order to work around hardware bugs. The
hardware assistance for these workarounds introduced a new hardware
bug relating to the XER[SO] bit. We add a separate feature bit for
this bug in case future chips fix it while still requiring the
hypervisor assistance with suspend state.

When the dt_cpu_ftrs subsystem is in use, the software assistance can
be enabled using a "tm-suspend-hypervisor-assist" node in the device
tree, and a "tm-suspend-xer-so-bug" node enables the workarounds for
the XER[SO] bug. In the absence of such nodes, a quirk enables both
for POWER9 "Nimbus" DD2.2 processors.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

b5af4f27

powerpc: Free up CPU feature bits on 64-bit machines · 9bbf0b57

Committed by Paul Mackerras on Mar 20, 2018

This moves all the CPU feature bits that are only used on 32-bit
machines to the top 20 bits of the CPU feature word and arranges
for them to be defined only in 32-bit builds.  The features that
are common to 32-bit and 64-bit machines are moved to bits 0-11
of the CPU feature word.  This means that for 64-bit platforms,
bits 44-63 can now be used for new features that only exist on
64-bit machines.  (These bit numbers are counting from the right,
i.e. the LSB is bit 0.)
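
The bit counts quoted above can be checked with a small sketch. The helper names are hypothetical and the code only does the arithmetic on bit ranges; it does not reproduce the kernel's actual CPU_FTR_* definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with bits lo..hi set (inclusive), counting from the LSB as bit 0.
 * Assumes lo <= hi <= 63. */
uint64_t bit_range_mask(unsigned lo, unsigned hi)
{
	uint64_t upto_hi = (hi >= 63) ? ~UINT64_C(0)
				      : ((UINT64_C(1) << (hi + 1)) - 1);
	uint64_t below_lo = (UINT64_C(1) << lo) - 1;

	return upto_hi & ~below_lo;
}

/* Number of set bits in v. */
unsigned count_bits(uint64_t v)
{
	unsigned n = 0;

	for (; v; v &= v - 1)   /* clear the lowest set bit each round */
		n++;
	return n;
}
```

Bits 0-11 give the 12 common bits, bits 12-31 are the top 20 bits of a 32-bit word, and bits 44-63 are the 20 bits described as available to 64-bit-only features.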

Because CPU_FTR_L3_DISABLE_NAP moved from the low 16 bits to the high
16 bits, we have to adjust some assembly code.  Also, CPU_FTR_EMB_HV
moved from the high 16 bits to the low 16 bits.

Note that CPU_FTR_REAL_LE only applies to 64-bit chips, because only
64-bit chips (POWER6, 7, 8, 9) have a true little-endian mode that is
a CPU execution mode as opposed to being a page attribute.

With this we now have 20 free CPU feature bits on 64-bit machines.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

9bbf0b57

powerpc: Use feature bit for RTC presence rather than timebase presence · c0d64cf9

Committed by Paul Mackerras on Mar 20, 2018

All PowerPC CPUs other than the original PPC601 have a timebase
register rather than the "real-time clock" (RTC) register that the
PPC601 (and the original POWER and POWER2 CPUs) had.  Currently
we have a CPU feature bit to indicate the presence of the timebase,
but it makes more sense to use a bit to indicate the unusual
situation rather than the common situation.  This therefore defines
a CPU_FTR_USE_RTC bit in place of the CPU_FTR_USE_TB bit, and
arranges for it to be set on PPC601 systems.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

c0d64cf9

20 Mar, 2018 2 commits

powerpc: Use sizeof(*foo) rather than sizeof(struct foo) · a0828cf5

Committed by Markus Elfring on Jan 19, 2017

It's slightly less error prone to use sizeof(*foo) rather than
specifying the type.
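
A minimal sketch of why this is less error prone, using made-up struct names (`struct foo` and `struct bar` here are purely illustrative): the sizeof(*p) form follows the pointer's type automatically, so it stays correct if the declaration changes.

```c
#include <assert.h>
#include <stdlib.h>

struct foo { int a; long b; };
struct bar { int a; long b; char pad[64]; };

struct bar *alloc_bar(void)
{
	/* Writing malloc(sizeof(struct foo)) here would compile but
	 * under-allocate; sizeof(*p) cannot name the wrong type. */
	struct bar *p = malloc(sizeof(*p));

	return p;
}
```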
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
[mpe: Consolidate into one patch, rewrite change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

a0828cf5

powerpc: Remove unused flush_dcache_phys_range() · 31513207

Committed by Matt Brown on Jul 20, 2017

The flush_dcache_phys_range() function is no longer used in the
kernel. The last usage was removed in c40785ad ("powerpc/dart: Use
a cachable DART").

This patch removes the function and declaration.
Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com>
[mpe: Munge change log, include commit that removed last user]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

31513207

14 Mar, 2018 1 commit

powerpc/time: stop validating rtc_time in .read_time · 890ae797

Committed by Alexandre Belloni on Feb 21, 2018

The RTC core is always calling rtc_valid_tm after the read_time callback.
It is not necessary to call it just before returning from the callback.
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

890ae797

13 Mar, 2018 6 commits

powerpc/32: Add missing prototypes for (early|machine)_init() · e82d70cf

Committed by Mathieu Malaterre on Mar 08, 2018

early_init() and machine_init() have no prototype, add one in
asm-prototypes.h.

Fixes the following warnings (treated as error in W=1):
  arch/powerpc/kernel/setup_32.c:68:30: error: no previous prototype for ‘early_init’
  arch/powerpc/kernel/setup_32.c:99:21: error: no previous prototype for ‘machine_init’
Signed-off-by: Mathieu Malaterre <malat@debian.org>
[mpe: Move them to asm-prototypes.h, drop other functions]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

e82d70cf

powerpc/32: Make some functions static · d15a261d

Committed by Mathieu Malaterre on Mar 07, 2018

These functions can all be static, make it so.
Signed-off-by: Mathieu Malaterre <malat@debian.org>
[mpe: Combine a patch of Mathieu's with some other static conversions]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

d15a261d

powerpc/prom: Remove warning on array size when empty · 4f1f40f7

Committed by Mathieu Malaterre on Mar 02, 2018

When neither CONFIG_ALTIVEC, nor CONFIG_VSX or CONFIG_PPC64 is
defined, the array feature_properties is defined as an empty array,
which in turn triggers the following warning (treated as error on
W=1):

  arch/powerpc/kernel/prom.c: In function ‘check_cpu_feature_properties’:
  arch/powerpc/kernel/prom.c:298:16: error: comparison of unsigned expression < 0 is always false
    for (i = 0; i < ARRAY_SIZE(feature_properties); ++i, ++fp) {
                  ^
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

4f1f40f7

powerpc: Add missing prototypes for sys_sigreturn() & sys_rt_sigreturn() · b53875c4

Committed by Mathieu Malaterre on Feb 25, 2018

Two functions did not have a prototype defined in signal.h header. Fix
the following two warnings (treated as errors in W=1):

arch/powerpc/kernel/signal_32.c:1135:6: error: no previous prototype for ‘sys_rt_sigreturn’
arch/powerpc/kernel/signal_32.c:1422:6: error: no previous prototype for ‘sys_sigreturn’
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

b53875c4

powerpc/kernel: Make function __giveup_fpu() static · 1cdf039b

Committed by Mathieu Malaterre on Feb 25, 2018

__giveup_fpu() is never called outside process.c, so it can be static.
That also means we don't need an empty definition in switch_to.h.
Signed-off-by: Mathieu Malaterre <malat@debian.org>
[mpe: Also drop the empty version, rewrite change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

1cdf039b

powerpc/32: Mark both tmp variables as unused · 67b464a8

Committed by Mathieu Malaterre on Feb 25, 2018

Since the value of `tmp` is never intended to be read, declare both `tmp`
variables as unused. Fix warning (treated as error in W=1):

arch/powerpc/kernel/signal_32.c: In function ‘sys_swapcontext’:
arch/powerpc/kernel/signal_32.c:1048:16: error: variable ‘tmp’ set but not used
arch/powerpc/kernel/signal_32.c: In function ‘sys_debug_setcontext’:
arch/powerpc/kernel/signal_32.c:1234:16: error: variable ‘tmp’ set but not used
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

67b464a8

06 Mar, 2018 2 commits

powerpc/mm/slice: Allow up to 64 low slices · 15472423

Committed by Christophe Leroy on Feb 22, 2018

While the implementation of the "slices" address space allows
a significant amount of high slices, it limits the number of
low slices to 16 due to the use of a single u64 low_slices_psize
element in struct mm_context_t.

On the 8xx, the minimum slice size is the size of the area
covered by a single PMD entry, ie 4M in 4K pages mode and 64M in
16K pages mode. This means we could have at least 64 slices.

In order to override this limitation, this patch switches the
handling of low_slices_psize to char array as done already for
high_slices_psize.
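
The idea can be sketched as follows, assuming each slice needs a 4-bit page-size index (which is what would cap a single u64 at 64 / 4 = 16 slices). The accessor names and the SLICE_NUM_LOW value are assumptions for this example, not the kernel's actual helpers.

```c
#include <assert.h>

#define SLICE_NUM_LOW 64   /* assumption: 64 low slices */

/* Two 4-bit page-size indexes per byte, so the array can grow with the
 * number of slices instead of being capped by the width of a u64. */
unsigned char low_slices_psize[SLICE_NUM_LOW / 2];

unsigned int get_slice_psize(unsigned int slice)
{
	unsigned int shift = (slice & 1) * 4;   /* low or high nibble */

	return (low_slices_psize[slice >> 1] >> shift) & 0xf;
}

void set_slice_psize(unsigned int slice, unsigned int psize)
{
	unsigned int shift = (slice & 1) * 4;
	unsigned char *b = &low_slices_psize[slice >> 1];

	/* Clear the target nibble, then write the new 4-bit value. */
	*b = (unsigned char)((*b & ~(0xf << shift)) | ((psize & 0xf) << shift));
}
```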
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

15472423

powerpc/mm/slice: Fix hugepage allocation at hint address on 8xx · aa0ab02b

Committed by Christophe Leroy on Feb 22, 2018

On the 8xx, the page size is set in the PMD entry and applies to
all pages of the page table pointed by the said PMD entry.

When an app has some regular pages allocated (e.g. see below) and tries
to mmap() a huge page at a hint address covered by the same PMD entry,
the kernel accepts the hint although the 8xx cannot handle different
page sizes in the same PMD entry.

10000000-10001000 r-xp 00000000 00:0f 2597 /root/malloc
10010000-10011000 rwxp 00000000 00:0f 2597 /root/malloc

mmap(0x10080000, 524288, PROT_READ|PROT_WRITE,
     MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0) = 0x10080000

This results in the app remaining forever in do_page_fault()/hugetlb_fault()
and when interrupting that app, we get the following warning:

[162980.035629] WARNING: CPU: 0 PID: 2777 at arch/powerpc/mm/hugetlbpage.c:354 hugetlb_free_pgd_range+0xc8/0x1e4
[162980.035699] CPU: 0 PID: 2777 Comm: malloc Tainted: G W       4.14.6 #85
[162980.035744] task: c67e2c00 task.stack: c668e000
[162980.035783] NIP:  c000fe18 LR: c00e1eec CTR: c00f90c0
[162980.035830] REGS: c668fc20 TRAP: 0700   Tainted: G W        (4.14.6)
[162980.035854] MSR:  00029032 <EE,ME,IR,DR,RI>  CR: 24044224 XER: 20000000
[162980.036003]
[162980.036003] GPR00: c00e1eec c668fcd0 c67e2c00 00000010 c6869410 10080000 00000000 77fb4000
[162980.036003] GPR08: ffff0001 0683c001 00000000 ffffff80 44028228 10018a34 00004008 418004fc
[162980.036003] GPR16: c668e000 00040100 c668e000 c06c0000 c668fe78 c668e000 c6835ba0 c668fd48
[162980.036003] GPR24: 00000000 73ffffff 74000000 00000001 77fb4000 100fffff 10100000 10100000
[162980.036743] NIP [c000fe18] hugetlb_free_pgd_range+0xc8/0x1e4
[162980.036839] LR [c00e1eec] free_pgtables+0x12c/0x150
[162980.036861] Call Trace:
[162980.036939] [c668fcd0] [c00f0774] unlink_anon_vmas+0x1c4/0x214 (unreliable)
[162980.037040] [c668fd10] [c00e1eec] free_pgtables+0x12c/0x150
[162980.037118] [c668fd40] [c00eabac] exit_mmap+0xe8/0x1b4
[162980.037210] [c668fda0] [c0019710] mmput.part.9+0x20/0xd8
[162980.037301] [c668fdb0] [c001ecb0] do_exit+0x1f0/0x93c
[162980.037386] [c668fe00] [c001f478] do_group_exit+0x40/0xcc
[162980.037479] [c668fe10] [c002a76c] get_signal+0x47c/0x614
[162980.037570] [c668fe70] [c0007840] do_signal+0x54/0x244
[162980.037654] [c668ff30] [c0007ae8] do_notify_resume+0x34/0x88
[162980.037744] [c668ff40] [c000dae8] do_user_signal+0x74/0xc4
[162980.037781] Instruction dump:
[162980.037821] 7fdff378 81370000 54a3463a 80890020 7d24182e 7c841a14 712a0004 4082ff94
[162980.038014] 2f890000 419e0010 712a0ff0 408200e0 <0fe00000> 54a9000a 7f984840 419d0094
[162980.038216] ---[ end trace c0ceeca8e7a5800a ]---
[162980.038754] BUG: non-zero nr_ptes on freeing mm: 1
[162985.363322] BUG: non-zero nr_ptes on freeing mm: -1

In order to fix this, this patch uses the address space "slices"
implemented for BOOK3S/64 and enhanced to support PPC32 by the
preceding patch.

This patch modifies the context.id on the 8xx to be in the range
[1:16] instead of [0:15] in order to identify context.id == 0 as
an uninitialised context, as done on BOOK3S.

This patch activates CONFIG_PPC_MM_SLICES when CONFIG_HUGETLB_PAGE is
selected for the 8xx.

Although we could in theory have as many slices as PMD entries, the
current slices implementation limits the number of low slices to 16.
This limitation does not prevent us from fixing the initial issue, although
it is suboptimal. It will be cured in a subsequent patch.

Fixes: 4b914286 ("powerpc/8xx: Implement support of hugepages")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

aa0ab02b

22 Feb, 2018 1 commit

powerpc/pseries: Revert support for ibm,drc-info devtree property · c7a3275e

Committed by Michael Bringmann on Feb 13, 2018

This reverts commit 02ef6dd8.

The earlier patch tried to enable support for a new property
"ibm,drc-info" on powerpc systems.

Unfortunately, some errors in the associated patch set break things
in some of the DLPAR operations.  In particular when attempting to
hot-add a new CPU or set of CPUs, the original patch failed to
properly calculate the available resources, and aborted the operation.
In addition, the original set missed several opportunities to compress
and reuse common code.

As the associated patch set was meant to provide an optimization of
storage and performance of a set of device-tree properties for future
systems with large amounts of resources, reverting just restores
the previous behavior for existing systems.  It seems unnecessary
to enable this feature and introduce the consequent problems in the
field that it will cause at this time, so please revert it for now
until testing of the corrections is finished properly.

Fixes: 02ef6dd8 ("powerpc: Enable support for ibm,drc-info devtree property")
Signed-off-by: Michael W. Bringmann <mwb@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

c7a3275e

21 Feb, 2018 1 commit

powerpc/eeh: Fix crashes in eeh_report_resume() · 521ca5a9

Committed by Juan J. Alvarez on Feb 15, 2018

The notify_resume() callback in eeh_ops is NULL on powernv, leading to
crashes:

  NIP (null)
  LR  eeh_report_resume+0x218/0x220
  Call Trace:
   eeh_report_resume+0x1f0/0x220 (unreliable)
   eeh_pe_dev_traverse+0x98/0x170
   eeh_handle_normal_event+0x3f4/0x650
   eeh_handle_event+0x54/0x380
   eeh_event_handler+0x14c/0x210
   kthread+0x168/0x1b0
   ret_from_kernel_thread+0x5c/0xb4

Fix it by adding a check before calling it.
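
The shape of such a guard can be sketched in userspace C. The struct and function names below carry a `_sketch` suffix or are otherwise made up to make clear they are illustrative stand-ins, not the kernel's eeh_ops definitions.

```c
#include <assert.h>
#include <stddef.h>

struct pci_dn_sketch {
	int resumed;
};

struct eeh_ops_sketch {
	/* Optional callback: may be left NULL by some platforms. */
	int (*notify_resume)(struct pci_dn_sketch *pdn);
};

int pseries_notify_resume_sketch(struct pci_dn_sketch *pdn)
{
	pdn->resumed = 1;
	return 0;
}

/* Returns 1 if the callback was invoked, 0 if it was skipped. */
int report_resume_sketch(const struct eeh_ops_sketch *ops,
			 struct pci_dn_sketch *pdn)
{
	/* Calling through a NULL function pointer is what produces a
	 * crash with NIP (null); check before calling. */
	if (ops && ops->notify_resume) {
		ops->notify_resume(pdn);
		return 1;
	}
	return 0;
}
```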

Fixes: 856e1eb9 ("PCI/AER: Add uevents in AER and EEH error/resume")
Signed-off-by: Juan J. Alvarez <jjalvare@linux.vnet.ibm.com>
Reviewed-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Tested-by: Carol L. Soto <clsoto@us.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Tested-by: Mauro S. M. Rodrigues <maurosr@linux.vnet.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
[mpe: Rewrite change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

521ca5a9

15 Feb, 2018 1 commit

powerpc: Expose TSCR via sysfs only on powernv · c134f0d5

Committed by Cyril Bur on Feb 14, 2018

The TSCR can only be accessed in hypervisor mode.

Fixes: 88b5e12eeb11 ("powerpc: Expose TSCR via sysfs")
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

c134f0d5

12 Feb, 2018 1 commit

vfs: do bulk POLL* -> EPOLL* replacement · a9a08845

Committed by Linus Torvalds on Feb 11, 2018

This is the mindless scripted replacement of kernel use of POLL*
variables as described by Al, done by this script:

    for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
        L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
        for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
    done

with de-mangling cleanups yet to come.

NOTE! On almost all architectures, the EPOLL* constants have the same
values as the POLL* constants do.  But the keyword here is "almost".
For various bad reasons they aren't the same, and epoll() doesn't
actually work quite correctly in some cases due to this on Sparc et al.

The next patch from Al will sort out the final differences, and we
should be all done.
Scripted-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a9a08845

11 Feb, 2018 1 commit

powerpc/pci: Fix broken INTx configuration via OF · c591c2e3

Committed by Alexey Kardashevskiy on Feb 09, 2018

59f47eff ("powerpc/pci: Use of_irq_parse_and_map_pci() helper")
replaced of_irq_parse_pci() + irq_create_of_mapping() with
of_irq_parse_and_map_pci(), but neglected to capture the virq
returned by irq_create_of_mapping(), so virq remained zero, which
caused INTx configuration to fail.

Save the virq value returned by of_irq_parse_and_map_pci() and correct
the virq declaration to match the of_irq_parse_and_map_pci() signature.

Fixes: 59f47eff ("powerpc/pci: Use of_irq_parse_and_map_pci() helper")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

c591c2e3

08 Feb, 2018 1 commit

powerpc/64s: Fix may_hard_irq_enable() for PMI soft masking · 6cc3f91b

Committed by Nicholas Piggin on Feb 03, 2018

The soft IRQ masking code has to hard-disable interrupts in cases
where the exception is not cleared by the masked handler. External
interrupts used this approach for soft masking. Now recently PMU
interrupts do the same thing.

The soft IRQ masking code additionally allowed for interrupt handlers
to hard-enable interrupts after soft-disabling them. The idea is to
allow PMU interrupts through to profile interrupt handlers.

So when interrupts are being replayed when there is a pending
interrupt that requires hard-disabling, there is a test to prevent
those handlers from hard-enabling them if there is a pending external
interrupt. may_hard_irq_enable() handles this.

After f442d004 ("powerpc/64s: Add support to mask perf interrupts
and replay them"), may_hard_irq_enable() could prematurely enable
MSR[EE] when a PMU exception exists, which would result in the
interrupt firing again while masked, and MSR[EE] being disabled again.

I haven't seen that this could cause a serious problem, but it's
more consistent to handle these soft-masked interrupts in the same
way. So introduce a define for all types of interrupts that require
MSR[EE] masking in their soft-disable handlers, and use that in
may_hard_irq_enable().
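
A minimal sketch of the approach, with hypothetical bit values and names (the IRQ_* constants and the `_sketch` function are made up for illustration and are not the kernel's PACA definitions): one define collects every pending-interrupt type that needs MSR[EE] kept off, and the predicate tests against it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical pending-interrupt bits, for illustration only. */
#define IRQ_DBELL 0x02
#define IRQ_EE    0x04
#define IRQ_DEC   0x08
#define IRQ_PMI   0x40

/* All interrupt types whose masked handler leaves the exception pending
 * and therefore hard-disables interrupts. */
#define IRQ_MUST_HARD_MASK (IRQ_EE | IRQ_PMI)

/* Hard-enabling is only allowed when none of those types is pending. */
bool may_hard_irq_enable_sketch(unsigned char irq_happened)
{
	return !(irq_happened & IRQ_MUST_HARD_MASK);
}
```

Adding a new interrupt type that needs the same treatment then only requires extending the mask.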

Fixes: f442d004 ("powerpc/64s: Add support to mask perf interrupts and replay them")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

6cc3f91b

28 Jan, 2018 3 commits

powerpc/watchdog: Print the NIP in soft_nmi_interrupt() · 0bc00914

Committed by Michael Ellerman on Oct 12, 2017

When a CPU detects it's locked up via soft_nmi_interrupt() we have
pt_regs, so print the regs->nip, which points to where we took the
soft-NMI.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

0bc00914

powerpc/watchdog: regs can't be null in soft_nmi_interrupt() · 3ba45b7e

Committed by Michael Ellerman on Oct 12, 2017

soft_nmi_interrupt() is called directly from the asm exception
handling code, which passes regs as a pointer to the stack. So regs
can't be NULL. It may be full of junk, but that's a separate problem.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

3ba45b7e

powerpc/watchdog: Tweak watchdog printks · d8fa82e0

Committed by Michael Ellerman on Oct 12, 2017

Use pr_fmt() in the watchdog code, so we don't have to say "Watchdog"
so many times.

Rather than "CPU:%d" just spell it "CPU %d", "Hard" doesn't need a
capital in the middle of a sentence, and "LOCKUP other CPUS" should be
"LOCKUP on other CPUS".

Also make it clear when a CPU self detects a lockup by spelling it
out.
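
How a pr_fmt() prefix works can be sketched in userspace with snprintf(). The "watchdog: " prefix, the message text and the helper name are assumptions for illustration; in the kernel the pr_*() macros apply pr_fmt() to their format string in the same way.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* String-literal concatenation glues the prefix onto every format. */
#define pr_fmt(fmt) "watchdog: " fmt

/* Userspace stand-in for a pr_emerg()-style call. */
int format_lockup_sketch(char *buf, size_t size, int cpu)
{
	return snprintf(buf, size,
			pr_fmt("CPU %d self-detected hard LOCKUP\n"), cpu);
}
```

Because the prefix is added once in the macro, the individual messages no longer need to repeat it.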
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

d8fa82e0

27 Jan, 2018 5 commits

powerpc/kernel: Block interrupts when updating TIDR · 384dfd62

Committed by Sukadev Bhattiprolu on Nov 28, 2017

clear_thread_tidr() is called in interrupt context as a part of delayed
put of the task structure (i.e as a part of timer interrupt). To prevent
a deadlock, block interrupts when holding vas_thread_id_lock to set/
clear TIDR for a task.

Fixes: ec233ede ("powerpc: Add support for setting SPRN_TIDR")
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

384dfd62

powerpc/pseries: Add Initialization of VF Bars · fc5f6221

Committed by Bryant G. Ly on Jan 05, 2018

When enabling SR-IOV in pseries platform, the VF bar properties for a
PF are reported on the device node in the device tree.

This patch adds the IOV Bar resources to Linux structures from the
device tree for later use when configuring SR-IOV by PF driver.
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Juan J. Alvarez <jjalvare@linux.vnet.ibm.com>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

fc5f6221

powerpc/eeh: Add EEH notify resume sysfs · 6ea3df69

Committed by Bryant G. Ly on Jan 05, 2018

Introduce a method for notify resume to be called from sysfs. With this
patch one can call notify resume from sysfs when it is supported by
the platform.
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Juan J. Alvarez <jjalvare@linux.vnet.ibm.com>
Acked-by: Russell Currey <ruscur@russell.cc>
[mpe: Add NULL check, add empty versions to avoid #ifdefs]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

6ea3df69

PCI/AER: Add uevents in AER and EEH error/resume · 856e1eb9

Committed by Bryant G. Ly on Jan 05, 2018

Devices can go offline when errors are reported. This patch adds a change
uevent on the kernel object and lets udev know of the error. When the
device resumes, a change uevent is also sent reporting the device as
online. Therefore, EEH and AER events are better propagated to user space
for PCI devices in all arches.
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Juan J. Alvarez <jjalvare@linux.vnet.ibm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

856e1eb9

powerpc/eeh: Update VF config space after EEH · 64ba3dc7

Committed by Bryant G. Ly on Jan 05, 2018

Add EEH platform operations for pseries to update VF config space.
With this change, after EEH the VF will have updated config space on
the pseries platform.
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Juan J. Alvarez <jjalvare@linux.vnet.ibm.com>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

64ba3dc7

23 Jan, 2018 3 commits

powerpc/64s: Improve RFI L1-D cache flush fallback · bdcb1aef

Committed by Nicholas Piggin on Jan 17, 2018

The fallback RFI flush is used when firmware does not provide a way
to flush the cache. It's a "displacement flush" that evicts useful
data by displacing it with an uninteresting buffer.

The flush has to take care to work with implementation specific cache
replacement policies, so the recipe has been in flux. The initial
slow but conservative approach is to touch all lines of a congruence
class, with dependencies between each load. It has since been
determined that a linear pattern of loads without dependencies is
sufficient, and is significantly faster.

Measuring the speed of a null syscall with RFI fallback flush enabled
gives the relative improvement:

P8 - 1.83x
P9 - 1.75x

The flush also becomes simpler and more adaptable to different cache
geometries.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

bdcb1aef
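
The "linear pattern of loads without dependencies" described in bdcb1aef can be illustrated with a small user-space sketch. This is not the kernel code, which runs in the exception return path; `displacement_flush`, `LINE_SIZE`, and `FLUSH_SIZE` are hypothetical names and values chosen only for illustration, and real cache geometry depends on the CPU.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical geometry, for illustration only. */
#define LINE_SIZE   128u             /* assumed cache line size in bytes */
#define FLUSH_SIZE  (64u * 1024u)    /* assumed displacement buffer size */

/*
 * Read one byte from every cache line of buf, in ascending address order.
 * No load computes the address of the next one, so the loads carry no
 * dependencies between them. Returns the number of lines touched.
 */
size_t displacement_flush(const volatile uint8_t *buf, size_t size, size_t line)
{
    size_t touched = 0;

    if (line == 0)
        return 0;                    /* avoid an endless loop on bad input */

    for (size_t off = 0; off < size; off += line) {
        (void)buf[off];              /* volatile read: the load is emitted */
        touched++;
    }
    return touched;
}
```

Calling `displacement_flush(buf, FLUSH_SIZE, LINE_SIZE)` on a buffer of `FLUSH_SIZE` bytes walks the whole buffer once, which is the displacement idea: fill the cache with uninteresting data so earlier contents are evicted.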

signal/ptrace: Add force_sig_ptrace_errno_trap and use it where needed · f71dd7dc

Committed by Eric W. Biederman on Jan 22, 2018

There are so many places that build struct siginfo by hand that at
least one of them is bound to get it wrong.  A handful of cases in the
kernel arguably did just that when using the errno field of siginfo to
pass non-errno values to userspace.  The usage is limited to a single
si_code so at least does not mess up anything else.

Encapsulate this questionable pattern in a helper function so
that the userspace ABI is preserved.

Update all of the places that use this pattern to use the new helper
function.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

f71dd7dc

signal/powerpc: Remove unnecessary signal_code parameter of do_send_trap · 47355040

Committed by Eric W. Biederman on Jan 16, 2018

signal_code is always TRAP_HWBKPT
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

47355040

22 Jan, 2018 3 commits

powerpc/pseries, ps3: panic flush kernel messages before halting system · 35adacd6

Committed by Nicholas Piggin on Dec 24, 2017

Platforms with a panic handler that halts the system can have problems
getting kernel messages out, because the panic notifiers are called
before kernel/panic.c does its flushing of printk buffers and console
etc.

Commit a3b2cb30 ("powerpc: Do not call ppc_md.panic in fadump panic
notifier") attempted to solve this, but that wasn't the right approach
and caused other problems, and it was reverted by commit ab9dbf77.

Instead, the powernv shutdown paths have already had a similar
problem, fixed by taking the message flushing sequence from
kernel/panic.c. That's a little bit ugly, but while we have the code
duplicated, it will work for this case as well. So have ppc panic
handlers do the same flushing before they terminate.

Without this patch, a qemu pseries_le_defconfig guest stops silently
when issued the nmi command when xmon is off and no crash dumpers are
enabled. With the patch, an oops is printed by each CPU as expected.

Fixes: ab9dbf77 ("Revert "powerpc: Do not call ppc_md.panic in fadump panic notifier"")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

35adacd6

powerpc/tm: Fix endianness flip on trap · 1c200e63

Committed by Gustavo Romero on Dec 31, 2017

Currently it's possible that a thread on PPC64 LE has its endianness
flipped inadvertently to Big-Endian, resulting in a crash once the process
is back from the signal handler.

If giveup_all() is called when regs->msr has the bits MSR.FP and MSR.VEC
disabled (and hence MSR.VSX disabled too), it returns without calling
check_if_tm_restore_required(), which copies regs->msr to ckpt_regs->msr if
the process caught a signal whilst in transactional mode. Then, once in
setup_tm_sigcontexts(), the MSR from ckpt_regs.msr is used, but since
check_if_tm_restore_required() was not called previously, gp_regs[PT_MSR]
gets a copy of invalid MSR bits, as the MSR in ckpt_regs was not updated
from regs->msr and so is zeroed. Later, when leaving the signal handler,
once in sys_rt_sigreturn() the TS bits of gp_regs[PT_MSR] are checked to
determine if restore_tm_sigcontexts() must be called to pull the correct
MSR state into the user context. Because the TS bits are zeroed,
restore_tm_sigcontexts() is never called and the MSR restored from the user
context on returning from the signal handler has MSR.LE (the endianness
bit) forced to zero (Big-Endian). That leads, for instance, to 'nop' being
treated as an illegal instruction in the following sequence:

	tbegin.
	beq	1f
	trap
	tend.
1:	nop

on PPC64 LE machines and the process dies just after returning from the
signal handler.

PPC64 BE is also affected but in a subtle way since forcing Big-Endian on
a BE machine does not change the endianness.

This commit fixes the issue described above by ensuring that once in
setup_tm_sigcontexts() the MSR used is from regs->msr instead of from
ckpt_regs->msr and by ensuring that we pull in only the MSR.FP, MSR.VEC,
and MSR.VSX bits from ckpt_regs->msr.

The fix was tested both on LE and BE machines and no regression regarding
the powerpc/tm selftests was observed.
Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

1c200e63
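
The fix described in 1c200e63 takes the MSR from regs->msr and pulls only the FP, VEC, and VSX bits from ckpt_regs->msr. One way to read that is the bit arithmetic below. This is a stand-alone sketch, not the kernel code: `sigcontext_msr` is a hypothetical name, the bit positions are assumptions used for illustration, and combining the two values with OR is an interpretation of the commit message.

```c
#include <stdint.h>

/* Assumed MSR bit positions, for illustration only. */
#define MSR_LE   (1ULL << 0)    /* little-endian mode */
#define MSR_FP   (1ULL << 13)   /* floating point available */
#define MSR_VSX  (1ULL << 23)   /* VSX available */
#define MSR_VEC  (1ULL << 25)   /* vector (AltiVec) available */

/*
 * Start from the live MSR (regs_msr) so bits such as MSR_LE are kept,
 * then add only the FP/VEC/VSX bits taken from the checkpointed MSR.
 */
uint64_t sigcontext_msr(uint64_t regs_msr, uint64_t ckpt_msr)
{
    return regs_msr | (ckpt_msr & (MSR_FP | MSR_VEC | MSR_VSX));
}
```

With a zeroed checkpointed MSR, which is the failing case in the commit message, the result still carries MSR_LE from the live MSR instead of collapsing to zero.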

powerpc: Expose TSCR via sysfs · b6d34eb4

Committed by Anton Blanchard on Sep 08, 2017

The thread switch control register (TSCR) is a per core register
that configures how the CPU shares resources between SMT threads.

Exposing it via sysfs allows us to tune it at run time.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

b6d34eb4
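
Reading a register value exposed through sysfs, as b6d34eb4 does for the TSCR, might look like the sketch below. The helper `read_hex_attr` is hypothetical, and both the attribute path used in the note after the block and the hexadecimal text format are assumptions not stated in the commit message.

```c
#include <stdio.h>

/*
 * Read one hexadecimal value from a sysfs-style text file.
 * Returns 0 on success and stores the value in *out, or -1 on failure.
 */
int read_hex_attr(const char *path, unsigned long long *out)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;                       /* missing file or no permission */

    ok = (fscanf(f, "%llx", out) == 1);  /* parse a single hex number */
    fclose(f);
    return ok ? 0 : -1;
}
```

On a machine carrying this patch, a call such as `read_hex_attr("/sys/devices/system/cpu/cpu0/tscr", &val)` would be the natural use, assuming that is where the attribute appears. Tuning the register at run time would mean writing to the same file with sufficient privileges.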

openanolis / cloud-kernel synced successfully almost 2 years ago
