- 15 May 2019, 1 commit
-
-
Submitted by Josh Poimboeuf

commit 782e69efb3dfed6e8360bc612e8c7827a901a8f9 upstream

Configure powerpc CPU runtime speculation bug mitigations in accordance
with the 'mitigations=' cmdline option. This affects Meltdown, Spectre
v1, Spectre v2, and Speculative Store Bypass.

The default behavior is unchanged.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jiri Kosina <jkosina@suse.cz> (on x86)
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-arch@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Phil Auld <pauld@redhat.com>
Link: https://lkml.kernel.org/r/245a606e1a42a558a310220312d9b6adb9159df6.1555085500.git.jpoimboe@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
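
As a hedged sketch of how a platform consumes the new option (assuming the
generic cpu_mitigations_off() helper this series introduced in kernel/cpu.c,
and the powerpc security-feature flags from asm/security_features.h; the
call site is simplified from the pseries setup path):

    #include <linux/cpu.h>                  /* cpu_mitigations_off() */
    #include <asm/security_features.h>      /* security_ftr_enabled() */
    #include <asm/setup.h>                  /* enum l1d_flush_type */

    enum l1d_flush_type types = L1D_FLUSH_FALLBACK; /* from firmware probing */
    bool enable;

    /* Enable the RFI (Meltdown) flush only if firmware reports the need
     * and the user did not boot with mitigations=off. */
    enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) &&
             (security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR) ||
              security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV)) &&
             !cpu_mitigations_off();

    setup_rfi_flush(types, enable);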
-
- 14 November 2018, 1 commit
-
-
Submitted by Michael Neuling

commit dd9a8c5a87395b6f05552c3b44e42fdc95760552 upstream.

Currently on P9N DD2.1 we end up taking infinite TM facility
unavailable exceptions on the first TM usage by userspace.

In the special case of TM no suspend (P9N DD2.1), Linux is told TM is
off via CPU dt-ftrs but told to (partially) use it via
OPAL_REINIT_CPUS_TM_SUSPEND_DISABLED. So HFSCR[TM] will be off from
dt-ftrs, but we need to turn it on for the no-suspend case.

This patch fixes this by enabling HFSCR[TM] in this case.

Cc: stable@vger.kernel.org # 4.15+
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
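
A hedged sketch of the shape of the fix (tm_suspend_disabled is the
kernel's existing flag for the P9 no-suspend mode; the exact guard in the
real patch may differ):

    /* TM is off in the device-tree features, but the hypervisor asked us
     * to run in no-suspend mode: turn the facility back on in HFSCR. */
    if (cpu_has_feature(CPU_FTR_HVMODE) && tm_suspend_disabled)
        mtspr(SPRN_HFSCR, mfspr(SPRN_HFSCR) | HFSCR_TM);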
-
- 30 July 2018, 1 commit
-
-
Submitted by Christophe Leroy

Files not using feature fixups don't need asm/feature-fixups.h;
files using feature fixups do need asm/feature-fixups.h.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 19 June 2018, 1 commit
-
-
Submitted by Nicholas Piggin

Similarly to commit 855bfe0d ("powerpc: hard disable irqs in
smp_send_stop loop"), irqs should be hard disabled by
panic_smp_self_stop().

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
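
The fix plausibly mirrors the smp_send_stop() loop cited above; a sketch:

    #include <asm/hw_irq.h>      /* hard_irq_disable() */
    #include <asm/processor.h>   /* spin_begin()/spin_cpu_relax() */

    void panic_smp_self_stop(void)
    {
        /* Hard-disable so a pending masked interrupt can't fire while
         * the CPU spins forever during panic. */
        hard_irq_disable();
        spin_begin();
        while (1)
            spin_cpu_relax();
    }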
-
- 03 May 2018, 2 commits
-
-
Submitted by Naveen N. Rao

On the boot cpu, though we enable paca->ftrace_enabled in early_setup()
(via cpu_ready_for_interrupts()), we don't start tracing until much
later since ftrace is not initialized yet and since we only support
DYNAMIC_FTRACE on powerpc.

However, it is possible that ftrace has been initialized by the time
some of the secondary cpus start up. In this case, we will try to trace
some of the early boot code, which can cause problems.

To address this, move setting paca->ftrace_enabled from
cpu_ready_for_interrupts() to early_setup() for the boot cpu, and
towards the end of start_secondary() for secondary cpus.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Naveen N. Rao

We have some C code that we call into from real mode where we cannot
take any exceptions. Though the C functions themselves are mostly safe,
if these functions are traced, there is a possibility that we may take
an exception. For instance, in certain conditions, the ftrace code uses
WARN(), which uses a 'trap' to do its job.

For such scenarios, introduce a new field in the paca,
'ftrace_enabled', which is checked on ftrace entry before continuing.
This field can then be set to zero to disable/pause ftrace, and set to
a non-zero value to resume ftrace.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
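
The C side is a one-byte flag in the paca; a sketch with hypothetical
helper names (the gate itself lives in the asm ftrace entry code, which
bails out early when the flag is zero):

    /* Pause/resume function tracing on this CPU around real-mode code. */
    static inline void my_ftrace_pause(void)    /* hypothetical name */
    {
        get_paca()->ftrace_enabled = 0;
    }

    static inline void my_ftrace_resume(void)   /* hypothetical name */
    {
        get_paca()->ftrace_enabled = 1;
    }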
-
- 17 April 2018, 1 commit
-
-
Submitted by Madhavan Srinivasan

If there is no d-cache-size property in the device tree, l1d_size could
be zero. We don't actually expect that to happen; it's only been seen
on mambo (the simulator) in some configurations.

A zero l1d_size leads to the loop in the asm wrapping around to
2^64-1, and then walking off the end of the fallback area and
eventually causing a page fault which is fatal.

Just default to 64K, which is correct on some CPUs, and sane enough to
not cause a crash on others.

Fixes: aa8a5e00 ("powerpc/64s: Add support for RFI flush of L1-D cache")
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
[mpe: Rewrite comment and change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
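
A sketch of the guard, assuming the ppc64_caches layout from the earlier
caches rework, inside the fallback-area setup:

    u64 l1d_size = ppc64_caches.l1d.size;

    /* No d-cache-size in the device tree (seen on the mambo simulator):
     * fall back to 64K so the asm loop can't wrap around. */
    if (!l1d_size)
        l1d_size = (64 * 1024);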
-
- 10 April 2018, 1 commit
-
-
Submitted by Michael Ellerman

The recent LPM changes to setup_rfi_flush() are causing some section
mismatch warnings because we removed the __init annotation on
setup_rfi_flush():

  The function setup_rfi_flush() references
  the function __init ppc64_bolted_size().
  The function setup_rfi_flush() references
  the function __init memblock_alloc_base().

The references are actually in init_fallback_flush(), but that is
inlined into setup_rfi_flush().

These references are safe because:
 - only pseries calls setup_rfi_flush() at runtime
 - pseries always passes L1D_FLUSH_FALLBACK at boot
 - so the fallback flush area will always be allocated
 - so the check in init_fallback_flush() will always return early:
     /* Only allocate the fallback flush area once (at boot time). */
     if (l1d_flush_fallback_area)
         return;
 - and therefore we won't actually call the freed init routines.

We should rework the code to make it safer by default rather than
relying on the above, but for now, as a quick fix, just add a __ref
annotation to squash the warning.

Fixes: abf110f3 ("powerpc/rfi-flush: Make it possible to call setup_rfi_flush() again")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
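
The quick fix amounts to one annotation; sketched:

    /* __ref tells modpost the reference from this (non-init) function to
     * __init code/data is intentional, silencing the mismatch warning. */
    void __ref setup_rfi_flush(enum l1d_flush_type types, bool enable)
    {
        /* body unchanged; may reach __init symbols via the inlined
         * init_fallback_flush(), which only runs at boot time */
    }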
-
- 30 March 2018, 4 commits
-
-
Submitted by Nicholas Piggin

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Nicholas Piggin

Per-node allocations are possible on 64s with radix, which does not
have the bolted SLB limitation. Hash would be able to do the same if
all CPUs had the bottom of their node-local memory bolted as well.
This is left as an exercise for the reader.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Add dummy definition of boot_cpuid for !SMP]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Nicholas Piggin

Move this into the early setup code, and don't iterate over CPU masks.
We don't want to call into sysfs so early from setup, and a future
patch won't initialize CPU masks by the time this is called.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fold in incremental fix from Nick for DSCR handling]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Nicholas Piggin

Change the paca array into an array of pointers to pacas. Allocate
pacas individually.

This allows flexibility in where the PACAs are allocated. Future work
will allocate them node-local. Platforms that don't have address limits
on PACAs would be able to defer PACA allocations until later in boot
rather than allocate all possible ones up-front and then free the
unused ones.

This is slightly more overhead (one additional indirection) for
cross-CPU paca references, but those aren't too common.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
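
A sketch of the indirection this introduces (the field read is just for
illustration):

    /* Before: a static array, referenced as paca[cpu].field.
     * After: individually allocated pacas behind a pointer array. */
    struct paca_struct **paca_ptrs;

    int hwid = paca_ptrs[cpu]->hw_cpu_id;  /* one extra load per cross-CPU ref */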
-
- 27 March 2018, 4 commits
-
-
Submitted by Michael Ellerman

This landed in setup_64.c for no good reason other than we had nowhere
else to put it. Now that we have a security-related file, that is a
better place for it, so move it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Mauricio Faria de Oliveira

Currently the rfi-flush messages print 'Using <type> flush' for all
enabled_flush_types, but that is not necessarily true -- as now the
fallback flush is always enabled on pseries, but the fixup function
overwrites its nop/branch slot with other flush types, if available.

So, replace the 'Using <type> flush' messages with '<type> flush is
available'.

Also, print the patched flush types in the fixup function, so users can
know what is (not) being used (e.g., the slower fallback flush, or no
flush type at all if flushing is disabled via the debugfs switch).

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Michael Ellerman

For PowerVM migration we want to be able to call setup_rfi_flush()
again after we've migrated the partition.

To support that we need to check that we're not trying to allocate the
fallback flush area after memblock has gone away (i.e., boot-time only).

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Michael Ellerman

rfi_flush_enable() includes a check to see if we're already enabled (or
disabled), and in that case does nothing.

But that means calling setup_rfi_flush() a second time doesn't actually
work, which is a bit confusing.

Move that check into the debugfs code, where it really belongs.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 23 January 2018, 1 commit
-
-
Submitted by Nicholas Piggin

The fallback RFI flush is used when firmware does not provide a way to
flush the cache. It's a "displacement flush" that evicts useful data by
displacing it with an uninteresting buffer.

The flush has to take care to work with implementation-specific cache
replacement policies, so the recipe has been in flux. The initial slow
but conservative approach was to touch all lines of a congruence class,
with dependencies between each load. It has since been determined that
a linear pattern of loads without dependencies is sufficient, and is
significantly faster.

Measuring the speed of a null syscall with the RFI fallback flush
enabled gives the relative improvement:

  P8 - 1.83x
  P9 - 1.75x

The flush also becomes simpler and more adaptable to different cache
geometries.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
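
Conceptually (the real flush is hand-written assembly run at exception
exit, not C), the new recipe is a dependency-free linear walk that
displaces the whole L1-D; the fallback buffer is twice the cache size:

    /* Touch every cache line of the fallback buffer; the loads are
     * independent, so the core can pipeline them freely. */
    for (offset = 0; offset < 2 * l1d_size; offset += l1d_line_size)
        (void)*(volatile char *)(l1d_flush_fallback_area + offset);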
-
- 19 January 2018, 3 commits
-
-
Submitted by Madhavan Srinivasan

Rename paca->soft_enabled to paca->irq_soft_mask, as it is no longer
used as a flag for interrupt state but as a mask.

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Madhavan Srinivasan

Move set_soft_enabled() from powerpc/kernel/irq.c to asm/hw_irq.h, to
encourage updates to paca->soft_enabled to be done via this access
function. Add a "memory" clobber to hint to the compiler that
paca->soft_enabled memory is the target here.

Renaming it soft_enabled_set() makes the namespace work better, as a
prefix rather than a postfix, when new soft_enabled manipulation
functions are introduced.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Madhavan Srinivasan

Two #defines, IRQS_ENABLED and IRQS_DISABLED, are added to be used when
updating paca->soft_enabled. Replace the hardcoded values used when
updating paca->soft_enabled with the IRQS_(EN|DIS)ABLED #defines. No
logic change.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
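
Putting the two patches above together, a hedged sketch (the store targets
the paca via r13, hence the inline asm; the numeric values of the defines
are elided since the encoding flips later in this series when
soft_enabled becomes irq_soft_mask):

    static inline notrace void soft_enabled_set(unsigned long enable)
    {
        /* paca lives in r13 on ppc64; "memory" clobber so the compiler
         * doesn't reorder accesses around the interrupt-state change */
        asm volatile("stb %0,%1(13)"
                     : : "r" (enable),
                         "i" (offsetof(struct paca_struct, soft_enabled))
                     : "memory");
    }

    soft_enabled_set(IRQS_DISABLED);   /* was a hardcoded literal */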
-
- 18 January 2018, 1 commit
-
-
Submitted by Nicholas Piggin

Book3S PACA memory allocation is restricted by the RMA limit and also
must not take SLB faults when accessed in virtual mode. Currently a
fixed 256MB limit is used for this, which is imprecise and sub-optimal.

Update the paca allocation limits to use ppc64_rma_size for the RMA
limit, and share the safe_stack_limit() that is currently used for
stack allocations that must not take virtual mode faults.

The safe_stack_limit() name is changed to ppc64_bolted_size() to match
ppc64_rma_size, and some comments are updated. We also need to use
early_mmu_has_feature() because we are now calling this function prior
to the jump label patching that enables mmu_has_feature().

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Change mmu_has_feature() to early_mmu_has_feature()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
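
A sketch of the renamed helper (segment-size constants from the Book3S
MMU headers; the non-Book3S fallback is elided):

    /* The bolted limit is whatever the first SLB segment covers. */
    u64 __init ppc64_bolted_size(void)
    {
        /* early_ because this now runs before jump-label patching */
        if (early_mmu_has_feature(MMU_FTR_1T_SEGMENT))
            return 1UL << SID_SHIFT_1T;   /* 1T segment */
        return 1UL << SID_SHIFT;          /* 256M segment */
    }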
-
- 17 January 2018, 2 commits
-
-
Submitted by Michael Ellerman

Expose the state of the RFI flush (enabled/disabled) via debugfs, and
allow it to be enabled/disabled at runtime. eg:

  $ cat /sys/kernel/debug/powerpc/rfi_flush
  1
  $ echo 0 > /sys/kernel/debug/powerpc/rfi_flush
  $ cat /sys/kernel/debug/powerpc/rfi_flush
  0

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
-
Submitted by Michael Ellerman

The recent commit 87590ce6 ("sysfs/cpu: Add vulnerability folder")
added a generic folder and set of files for reporting information on
CPU vulnerabilities. One of those was for meltdown:

  /sys/devices/system/cpu/vulnerabilities/meltdown

This commit wires up that file for 64-bit Book3S powerpc.

For now we default to "Vulnerable" unless the RFI flush is enabled.
That may not actually be true on all hardware; further patches will
refine the reporting based on the CPU/platform etc. But for now we
default to being pessimists.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
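
The arch hook has a fixed signature declared by the generic sysfs code; a
sketch of the pessimistic default described above (rfi_flush is the state
flag kept in this file):

    ssize_t cpu_show_meltdown(struct device *dev,
                              struct device_attribute *attr, char *buf)
    {
        if (rfi_flush)
            return sprintf(buf, "Mitigation: RFI Flush\n");

        return sprintf(buf, "Vulnerable\n");
    }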
-
- 10 January 2018, 2 commits
-
-
Submitted by Michael Ellerman

Because there may be some performance overhead of the RFI flush, add
kernel command line options to disable it.

We add a sensibly named 'no_rfi_flush' option, but we also hijack the
x86 option 'nopti'. The RFI flush is not the same as KPTI, but if we
see 'nopti' we can guess that the user is trying to avoid any overhead
of Meltdown mitigations, and it means we don't have to educate everyone
about a different command line option.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
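
A sketch of the cmdline plumbing via early_param() (handler name
hypothetical):

    static bool no_rfi_flush;

    static int __init handle_no_rfi_flush(char *p)
    {
        pr_info("rfi-flush: disabled on command line.");
        no_rfi_flush = true;
        return 0;
    }
    early_param("no_rfi_flush", handle_no_rfi_flush);
    /* 'nopti' is x86's spelling; accept it too, as described above */
    early_param("nopti", handle_no_rfi_flush);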
-
Submitted by Michael Ellerman

On some CPUs we can prevent the Meltdown vulnerability by flushing the
L1-D cache on exit from kernel to user mode, and from hypervisor to
guest.

This is known to be the case on at least Power7, Power8 and Power9. At
this time we do not know the status of the vulnerability on other CPUs
such as the 970 (Apple G5), pasemi CPUs (AmigaOne X1000) or Freescale
CPUs. As more information comes to light we can enable this, or other
mechanisms, on those CPUs.

The vulnerability occurs when the load of an architecturally
inaccessible memory region (eg. userspace load of kernel memory) is
speculatively executed to the point where its result can influence the
address of a subsequent speculatively executed load.

In order for that to happen, the first load must hit in the L1, because
before the load is sent to the L2 the permission check is performed.
Therefore if no kernel addresses hit in the L1 the vulnerability can
not occur. We can ensure that is the case by flushing the L1 whenever
we return to userspace. Similarly for hypervisor vs guest.

In order to flush the L1-D cache on exit, we add a section of nops at
each (h)rfi location that returns to a lower privileged context, and
patch that with some sequence. Newer firmwares are able to advertise to
us that there is a special nop instruction that flushes the L1-D. If we
do not see that advertised, we fall back to doing a displacement flush
in software.

For guest kernels we support migration between some CPU versions, and
different CPUs may use different flush instructions. So that we are
prepared to migrate to a machine with a different flush instruction
activated, we may have to patch more than one flush instruction at boot
if the hypervisor tells us to.

In the end this patch is mostly the work of Nicholas Piggin and Michael
Ellerman. However a cast of thousands contributed to analysis of the
issue, earlier versions of the patch, backport testing etc. Many thanks
to all of them.

Tested-by: Jon Masters <jcm@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 11 December 2017, 1 commit
-
-
Submitted by Benjamin Herrenschmidt

This statement causes some not very useful messages to always be
printed on the serial port at boot, even on quiet boots.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 10 November 2017, 2 commits
-
-
Submitted by Nicholas Piggin

Take the DSCR value set by firmware as the dscr_default value, rather
than zero. POWER9 recommends that DSCR default to a non-zero value.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Make record_spr_defaults() __init]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Nicholas Piggin

OPAL boot does not insert secondaries at 0x60 to wait at the secondary
hold spinloop. Instead they are started later, and inserted at
generic_secondary_smp_init(), which is after the secondary hold
spinloop.

Avoid waiting on this spinloop when booting with OPAL firmware. This
wait always times out in that case.

This saves 100ms boot time on powernv, and tens of seconds of real time
when booting on the simulator in SMP.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 31 August 2017, 2 commits
-
-
Submitted by Nicholas Piggin

This fixes a couple more bits of fallout from the new hard lockup
watchdog patch.

It restores the required hw_nmi_get_sample_period() function for the
perf watchdog, and removes some function declarations on 64e that are
only defined for 64s. This fixes the 64e build when the hardlockup
detector is enabled.

It restores the default behaviour of disabling the perf watchdog, and
also fixes disabling the 64s watchdog when running as a guest.

Fixes: 2104180a ("powerpc/64s: implement arch-specific hardlockup watchdog")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Nicholas Piggin

The radix MMU does not take SLB or TLB interrupts when accessing kernel
linear addresses, so remove this restriction for radix mode.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 13 July 2017, 2 commits
-
-
Submitted by Nicholas Piggin

Implement an arch-specific watchdog rather than use the perf-based
hardlockup detector.

The new watchdog takes the soft-NMI directly, rather than going through
perf. Perf interrupts are to be made maskable in future, so that would
prevent the perf detector from working in those regions.

Additionally, implement an SMP-based detector where all CPUs watch one
another by pinging a shared cpumask. This is because powerpc Book3S
does not have a true periodic local NMI, but some platforms do
implement a true NMI IPI.

If a CPU is stuck with interrupts hard disabled, the soft-NMI watchdog
does not work, but the SMP watchdog will. Even on platforms without a
true NMI IPI to get a good trace from the stuck CPU, other CPUs will
notice the lockup sufficiently to report it and panic.

[npiggin@gmail.com: honor watchdog disable at boot/hotplug]
Link: http://lkml.kernel.org/r/20170621001346.5bb337c9@roar.ozlabs.ibm.com
[npiggin@gmail.com: fix false positive warning at CPU unplug]
Link: http://lkml.kernel.org/r/20170630080740.20766-1-npiggin@gmail.com
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20170616065715.18390-6-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Don Zickus <dzickus@redhat.com>
Tested-by: Babu Moger <babu.moger@oracle.com> [sparc]
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Submitted by Nicholas Piggin

Split SOFTLOCKUP_DETECTOR from LOCKUP_DETECTOR, and split
HARDLOCKUP_DETECTOR_PERF from HARDLOCKUP_DETECTOR.

LOCKUP_DETECTOR implies the general boot, sysctl, and programming
interfaces for the lockup detectors.

An architecture that wants to use a hard lockup detector must define
HAVE_HARDLOCKUP_DETECTOR_PERF or HAVE_HARDLOCKUP_DETECTOR_ARCH.

Alternatively an arch can define HAVE_NMI_WATCHDOG, which provides the
minimum arch_touch_nmi_watchdog; it otherwise does its own thing and
does not implement the LOCKUP_DETECTOR interfaces.

sparc is unusual in that it has started to implement some of the
interfaces, but not fully yet. It should probably be converted to a
full HAVE_HARDLOCKUP_DETECTOR_ARCH.

[npiggin@gmail.com: fix]
Link: http://lkml.kernel.org/r/20170617223522.66c0ad88@roar.ozlabs.ibm.com
Link: http://lkml.kernel.org/r/20170616065715.18390-4-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Don Zickus <dzickus@redhat.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
Tested-by: Babu Moger <babu.moger@oracle.com> [sparc]
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 23 June 2017, 1 commit
-
-
Submitted by Nicholas Piggin

Emergency stacks have their thread_info mostly uninitialised, which in
particular means garbage preempt_count values.

Emergency stack code runs with interrupts disabled entirely, and is
used very rarely, so this has been unnoticed so far. It was found by a
proposed new powerpc watchdog that takes a soft-NMI directly from the
masked_interrupt handler and uses the emergency stack. That crashed at
BUG_ON(in_nmi()) in nmi_enter(). preempt_count()s were found to be
garbage.

To fix this, zero the entire THREAD_SIZE allocation, and initialize the
thread_info.

Cc: stable@vger.kernel.org
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Move it all into setup_64.c, use a function not a macro. Fix
      crashes on Cell by setting preempt_count to 0 not HARDIRQ_OFFSET]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
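
The shape of the fix, sketched (the allocation call and limit are assumed
from the surrounding setup code):

    /* Zero the whole stack allocation, not just parts of thread_info:
     * garbage preempt_count here tripped BUG_ON(in_nmi()) in nmi_enter(). */
    ti = __va(memblock_alloc_base(THREAD_SIZE, THREAD_SIZE, limit));
    memset(ti, 0, THREAD_SIZE);
    ti->cpu = cpu;
    ti->preempt_count = 0;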
-
- 06 June 2017, 1 commit
-
-
Submitted by Michael Ellerman

In commit 8c272261 ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"), we
switched to the generic implementation of cpu_to_node(), which uses a
percpu variable to hold the NUMA node for each CPU.

Unfortunately we neglected to notice that we use cpu_to_node() in the
allocation of our percpu areas, leading to a chicken and egg problem.

In practice what happens is that when we are setting up the percpu
areas, cpu_to_node() reports that all CPUs are on node 0, so we
allocate all percpu areas on node 0. This is visible in the dmesg
output, as all pcpu allocs are in group 0:

  pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07
  pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15
  pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23
  pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31
  pcpu-alloc: [0] 32 33 34 35 [0] 36 37 38 39
  pcpu-alloc: [0] 40 41 42 43 [0] 44 45 46 47

To fix it we need an early_cpu_to_node() which can run prior to percpu
being set up. We already have the numa_cpu_lookup_table we can use, so
just plumb it in. With the patch, dmesg output shows two groups, 0 and 1:

  pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07
  pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15
  pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23
  pcpu-alloc: [1] 24 25 26 27 [1] 28 29 30 31
  pcpu-alloc: [1] 32 33 34 35 [1] 36 37 38 39
  pcpu-alloc: [1] 40 41 42 43 [1] 44 45 46 47

We can also check the data_offset in the paca of various CPUs; with the
fix we see:

  CPU 0:  data_offset = 0x0ffe8b0000
  CPU 24: data_offset = 0x1ffe5b0000

And we can see from dmesg that CPU 24 has an allocation on node 1:

  node 0: [mem 0x0000000000000000-0x0000000fffffffff]
  node 1: [mem 0x0000001000000000-0x0000001fffffffff]

Cc: stable@vger.kernel.org # v3.16+
Fixes: 8c272261 ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
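
Plumbing in the lookup table is small; a sketch consistent with the
description above:

    static inline int early_cpu_to_node(int cpu)
    {
        int nid = numa_cpu_lookup_table[cpu];

        /* Fall back to node 0 if nid is unset (not yet parsed from
         * firmware), which keeps early percpu setup safe. */
        return (nid < 0) ? 0 : nid;
    }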
-
- 09 May 2017, 1 commit
-
-
Submitted by Nicholas Piggin

The ibm,powerpc-cpu-features device tree binding describes CPU features
with ASCII names and extensible compatibility, privilege, and
enablement metadata that allows improved flexibility and compatibility
with new hardware.

The interface is described in detail in ibm,powerpc-cpu-features.txt in
this patch.

Currently this code is not enabled by default, and there are no
released firmwares that provide the binding.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 28 April 2017, 1 commit
-
-
Submitted by Nicholas Piggin

The system reset interrupt is used for crash/debug situations, so it is
desirable to have as little impact on the normal state of the system as
possible.

Currently it uses the current kernel stack to process the exception.
This stores into the stack, which may be involved with the crash. The
stack pointer may be corrupted, or it may have overflowed.

Avoid or minimise these problems by creating a dedicated NMI stack for
the system reset interrupt to use.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 28 March 2017, 2 commits
-
-
Submitted by Benjamin Herrenschmidt

On Power8 & Power9 the early CPU initialisation in __init_HFSCR() turns
on HFSCR[TM] (Hypervisor Facility Status and Control Register
[Transactional Memory]), but that doesn't take into account that TM
might be disabled by CPU features, or disabled by the kernel being
built with CONFIG_PPC_TRANSACTIONAL_MEM=n.

So later in boot, when we have set up the CPU features, clear HFSCR[TM]
if the TM CPU feature has been disabled. We use CPU_FTR_TM_COMP to
account for the CONFIG_PPC_TRANSACTIONAL_MEM=n case.

Without this a KVM guest might try to use TM, even if told not to, and
cause an oops in the host kernel. Typically the oops is seen in
__kvmppc_vcore_entry() and may or may not be fatal to the host, but is
always bad news.

In practice all shipping CPU revisions do support TM, and all host
kernels we are aware of build with TM support enabled, so no one should
actually be able to hit this in the wild.

Fixes: 2a3563b0 ("powerpc: Setup in HFSCR for POWER8")
Cc: stable@vger.kernel.org # v3.10+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[mpe: Rewrite change log with input from Sam, add Fixes/stable]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Submitted by Michael Ellerman

cpu_ready_for_interrupts() is called after feature patching, so there's
no need to use early_cpu_has_feature().

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 06 March 2017, 1 commit
-
-
Submitted by Anton Blanchard

I see a panic in early boot when building with a recent gcc toolchain.
The issue is a divide by zero, which is undefined. Older toolchains let
us get away with it:

  int foo(int a) { return a / 0; }

  foo:
      li 9,0
      divw 3,3,9
      extsw 3,3
      blr

But newer ones catch it:

  foo:
      trap

Add a check to avoid the divide by zero.

Fixes: e2827fe5 ("powerpc/64: Clean up ppc64_caches using a struct per cache")
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
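
The guard is a one-liner; a sketch with field and parameter names assumed
from the caches rework this fixes:

    /* bsize comes from the device tree and may be absent/zero */
    if (bsize)
        info->blocks_per_page = PAGE_SIZE / bsize;
    else
        info->blocks_per_page = 0;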
-
- 15 February 2017, 1 commit
-
-
Submitted by Michael Ellerman

WARN_ONCE() takes a condition and a format string. We were passing a
constant string as the condition, and the function name as the format
string. It would work, but the message would be just the function name.

Fix it by just using WARN_ONCE() directly instead of
if (x) WARN_ONCE().

Noticed-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
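
Illustratively (the condition name is hypothetical), the anti-pattern and
its fix:

    /* before -- compiles, but the "condition" is a constant string and
     * the function name becomes the format string:
     *     if (x) WARN_ONCE("x", __func__);
     * after -- condition first, then a real message: */
    WARN_ONCE(x, "x unexpectedly set in %s\n", __func__);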
-