Commits · 05a308c722822b0fbcc706b54be70f9bb9d52539 · openanolis / cloud-kernel

28 Jul, 2014 6 commits

KVM: PPC: Book3S HV: Fix ABIv2 indirect branch issue · 05a308c7

Committed by Anton Blanchard on Jun 12, 2014

To establish addressability quickly, ABIv2 requires the target
address of the function being called to be in r12.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>

05a308c7

KVM: PPC: Book3S PR: Handle hyp doorbell exits · 568fccc4

Committed by Alexander Graf on Jun 16, 2014

If we're running PR KVM in HV mode, we may get hypervisor doorbell interrupts.
Handle those the same way we treat normal doorbells.
Signed-off-by: Alexander Graf <agraf@suse.de>

568fccc4

KVM: PPC: Book3s HV: Fix tlbie compile error · f6bf3a66

Committed by Alexander Graf on Jun 11, 2014

Some compilers complain about uninitialized variables in the compute_tlbie_rb
function. When you follow the code path you'll realize that we'll never get
to that point, but the compiler isn't all that smart.

So just default to 4k page sizes for everything, making the compiler happy
and the code slightly easier to read.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Paul Mackerras <paulus@samba.org>

f6bf3a66

KVM: PPC: Book3s PR: Disable AIL mode with OPAL · fb4188ba

Committed by Alexander Graf on Jun 09, 2014

When we're using PR KVM we must not allow the CPU to take interrupts
in virtual mode, as the SLB does not contain host kernel mappings
when running inside the guest context.

To make sure we get good performance for non-KVM tasks but still
properly functioning PR KVM, let's just disable AIL whenever a vcpu
is scheduled in.

This is fundamentally different from how we deal with AIL on pSeries
type machines where we disable AIL for the whole machine as soon as
a single KVM VM is up.

The reason for that is easy - on pSeries we do not have control over
per-cpu configuration of AIL. We also don't want to mess with CPU hotplug
races and AIL configuration, so setting it per CPU is easier and more
flexible.

This patch fixes running PR KVM on POWER8 bare metal for me.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Paul Mackerras <paulus@samba.org>

fb4188ba

KVM: PPC: BOOK3S: PR: Emulate instruction counter · 06da28e7

Committed by Aneesh Kumar K.V on Jun 05, 2014

Writing to IC is not allowed in the privileged mode.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

06da28e7

KVM: PPC: BOOK3S: PR: Emulate virtual timebase register · 8f42ab27

Committed by Aneesh Kumar K.V on Jun 05, 2014

The virtual time base register is a per-VM, per-CPU register that needs
to be saved and restored on VM exit and entry. Writing to VTB is not
allowed in the privileged mode.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[agraf: fix compile error]
Signed-off-by: Alexander Graf <agraf@suse.de>

8f42ab27

06 Jul, 2014 1 commit

KVM: PPC: BOOK3S: PR: Fix PURR and SPURR emulation · 3cd60e31

Committed by Aneesh Kumar K.V on Jun 04, 2014

We use time base for PURR and SPURR emulation with PR KVM since we
are emulating a single threaded core. When using time base
we need to make sure that we don't accumulate time spent in the host
in the PURR and SPURR values.

Also we don't need to emulate mtspr because both registers are
hypervisor resources.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

3cd60e31
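
The accounting rule described in commit 3cd60e31 (count only time spent in the guest) can be sketched in userspace C as follows. This is an illustration only, not the kernel code: the struct, its fields, and the function names are hypothetical, and the time base is passed in as a plain number.

```c
#include <stdint.h>

/* Hypothetical per-vcpu state; these field names are assumptions. */
struct vcpu_sketch {
    uint64_t purr;      /* emulated PURR value seen by the guest */
    uint64_t entry_tb;  /* time base sampled when the vcpu was scheduled in */
};

/* Called when the vcpu is scheduled in: remember the current time base. */
static void vcpu_sched_in(struct vcpu_sketch *v, uint64_t tb_now)
{
    v->entry_tb = tb_now;
}

/* Called when the vcpu is scheduled out: add only the guest interval,
 * so time spent in the host between out and the next in is not counted. */
static void vcpu_sched_out(struct vcpu_sketch *v, uint64_t tb_now)
{
    v->purr += tb_now - v->entry_tb;
}
```

With guest intervals of 100..150 and 400..420, the emulated value advances by 70 ticks, and the 250 ticks spent in the host in between are ignored.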

12 Jun, 2014 2 commits

powerpc: Avoid circular dependency with zImage.% · 699c659b

Committed by Michal Marek on Jun 11, 2014

The rule to create the final images uses a zImage.% pattern.
Unfortunately, this also matches the names of the zImage.*.lds linker
scripts, which appear as a dependency of the final images. This somehow
worked when $(srctree) used to be an absolute path, but now the pattern
matches too much. List only the images from $(image-y) as the target of
the rule, to avoid the circular dependency.
Reported-and-tested-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>

699c659b

powerpc/book3s: Fix some ABIv2 issues in machine check code · ad718622

Committed by Anton Blanchard on Jun 12, 2014

Commit 2749a2f2 (powerpc/book3s: Fix machine check handling for
unhandled errors) introduced a few ABIv2 issues.

We can maintain ABIv1 and ABIv2 compatibility by branching to the
function rather than the dot symbol.

Fixes: 2749a2f2 ("powerpc/book3s: Fix machine check handling for unhandled errors")
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

ad718622

11 Jun, 2014 28 commits

powerpc/book3s: Fix guest MC delivery mechanism to avoid soft lockups in guest. · 74845bc2

Committed by Mahesh Salgaonkar on Jun 11, 2014

Currently we forward MCEs to guest which have been recovered by guest.
And for unhandled errors we do not deliver the MCE to guest. It looks like
with no support of FWNMI in qemu, guest just panics whenever we deliver the
recovered MCEs to guest. Also, the existing code used to return to host for
unhandled errors, which was causing the guest to hang with soft lockups inside
the guest and made it difficult to recover the guest instance.

This patch now forwards all fatal MCEs to guest causing guest to crash/panic.
And, for recovered errors we just go back to normal functioning of guest
instead of returning to host. This fixes soft lockup issues in guest.
This patch also fixes an issue where guest MCE events were not logged to
host console.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

74845bc2

powerpc/book3s: Increment the mce counter during machine_check_early call. · e6654d5b

Committed by Mahesh Salgaonkar on Jun 11, 2014

We don't see the MCE counter getting increased in /proc/interrupts, which gives
the false impression that no MCE occurred even when there were MCE events.
The machine check early handling was added for PowerKVM and we forgot to
increment the MCE count in the early handler.

We also increment mce counters in the machine_check_exception call, but
in most cases where we handle the error the hypervisor never reaches there
unless it's fatal and we want to crash. Only in a fatal situation may we
see a double increment of the mce count. We need to fix that. But for
now it is always good to have some count increased instead of zero.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

e6654d5b

powerpc/book3s: Add stack overflow check in machine check handler. · e75ad93a

Committed by Mahesh Salgaonkar on Jun 11, 2014

Currently the machine check handler does not check for stack overflow for
nested machine checks. If we hit another MCE while inside the machine check
handler repeatedly from the same address then we run the risk of a stack
overflow, which can cause huge memory corruption. This patch limits the
nested MCE level to 4 and panics when we cross level 4.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

e75ad93a
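
The nesting limit described in commit e75ad93a can be sketched as a simple depth counter. This is a userspace illustration under stated assumptions: the struct, the function names, and the return-code convention are hypothetical, and the real handler is low-level exception code that panics rather than returning an error.

```c
/* Nesting limit taken from the commit message above. */
#define MAX_MCE_DEPTH 4

/* Hypothetical per-CPU bookkeeping; the field name is an assumption. */
struct cpu_mce_state {
    int in_mce;  /* how many machine check handlers are currently nested */
};

/* Enter the handler. Returns 0 if handling may proceed, or -1 when the
 * nesting level crosses MAX_MCE_DEPTH (where the kernel would panic
 * instead of risking a stack overflow). */
static int mce_enter(struct cpu_mce_state *s)
{
    s->in_mce++;
    if (s->in_mce > MAX_MCE_DEPTH)
        return -1;
    return 0;
}

/* Leave the handler and drop one nesting level. */
static void mce_exit(struct cpu_mce_state *s)
{
    s->in_mce--;
}
```

Four nested entries are accepted; the fifth is rejected because it would cross level 4.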

powerpc/book3s: Fix machine check handling for unhandled errors · 2749a2f2

Committed by Mahesh Salgaonkar on Jun 11, 2014

The current code does not check for unhandled/unrecovered errors and returns
from the interrupt if it is a recoverable exception, which in turn triggers the
same machine check exception in a loop, leaving the hypervisor unresponsive.

This patch fixes this situation and forces hypervisor to panic for
unhandled/unrecovered errors.

This patch also fixes another issue where unrecoverable_exception routine
was called in real mode in case of unrecoverable exception (MSR_RI = 0).
This causes another exception vector 0x300 (data access) during system crash
leading to confusion while debugging cause of the system crash.

Also turn ME bit off while going down, so that when another MCE is hit during
panic path, system will checkstop and hypervisor will get restarted cleanly
by SP.

With the above fixes we now throw correct console messages (see below) while
crashing the system in case of unhandled/unrecoverable machine checks.

--------------
Severe Machine check interrupt [[Not recovered]
  Initiator: CPU
  Error type: UE [Instruction fetch]
    Effective address: 0000000030002864
Oops: Machine check, sig: 7 [#1]
SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in: bork(O) bridge stp llc kvm [last unloaded: bork]
CPU: 36 PID: 55162 Comm: bash Tainted: G           O 3.14.0mce #1
task: c000002d72d022d0 ti: c000000007ec0000 task.ti: c000002d72de4000
NIP: 0000000030002864 LR: 00000000300151a4 CTR: 000000003001518c
REGS: c000000007ec3d80 TRAP: 0200   Tainted: G           O  (3.14.0mce)
MSR: 9000000000041002 <SF,HV,ME,RI>  CR: 28222848  XER: 20000000
CFAR: 0000000030002838 DAR: d0000000004d0000 DSISR: 00000000 SOFTE: 1
GPR00: 000000003001512c 0000000031f92cb0 0000000030078af0 0000000030002864
GPR04: d0000000004d0000 0000000000000000 0000000030002864 ffffffffffffffc9
GPR08: 0000000000000024 0000000030008af0 000000000000002c c00000000150e728
GPR12: 9000000000041002 0000000031f90000 0000000010142550 0000000040000000
GPR16: 0000000010143cdc 0000000000000000 00000000101306fc 00000000101424dc
GPR20: 00000000101424e0 000000001013c6f0 0000000000000000 0000000000000000
GPR24: 0000000010143ce0 00000000100f6440 c000002d72de7e00 c000002d72860250
GPR28: c000002d72860240 c000002d72ac0038 0000000000000008 0000000000040000
NIP [0000000030002864] 0x30002864
LR [00000000300151a4] 0x300151a4
Call Trace:
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
---[ end trace 7285f0beac1e29d3 ]---

Sending IPI to other CPUs
IPI complete
OPAL V3 detected !
--------------
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

2749a2f2

powerpc/eeh: Dump PE location code · 357b2f3d

Committed by Gavin Shan on Jun 11, 2014

As Ben suggested, it's meaningful to dump the PE's location code
for site engineers when hitting EEH errors. The patch introduces
the function eeh_pe_loc_get() to retrieve the location code from
the dev-tree so that we can output it when hitting EEH errors.

If the primary PE bus is the root bus, the PHB's dev-node is tried
prior to the root port's dev-node. Otherwise, the upstream bridge's
dev-node of the primary PE bus is checked for the location code
directly.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

357b2f3d

powerpc/powernv: Enable POWER8 doorbell IPIs · d4e58e59

Committed by Michael Neuling on Jun 11, 2014

This patch enables POWER8 doorbell IPIs on powernv.

Since doorbells can only IPI within a core, we test to see when we can use
doorbells and if not we fall back to XICS.  This also enables hypervisor
doorbells to wake us up from nap/sleep via the LPCR PECEDH bit.

Based on tests by Anton, the best case IPI latency between two threads dropped
from 894ns to 512ns.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

d4e58e59
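
The fallback decision described in commit d4e58e59 (doorbell within a core, XICS otherwise) can be sketched as below. This is a hedged illustration, not the kernel's logic: the enum, the function name, and the assumption that hardware threads of one core are numbered consecutively are all hypothetical.

```c
/* Hypothetical labels for the two IPI delivery paths. */
enum ipi_path { IPI_DOORBELL, IPI_XICS };

/* Pick a delivery path for an IPI from CPU src to CPU dst.
 * Assumption: CPU numbers are laid out so that cpu / threads_per_core
 * identifies the core. Doorbells only work within a core, so any
 * cross-core IPI falls back to XICS. */
static enum ipi_path choose_ipi_path(int src, int dst, int threads_per_core)
{
    if (src / threads_per_core == dst / threads_per_core)
        return IPI_DOORBELL;
    return IPI_XICS;
}
```

With 8 threads per core, CPUs 0 and 7 share a core and can use a doorbell, while CPUs 7 and 8 sit on different cores and take the XICS path.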

powerpc/powernv: Fix killed EEH event · 5c7a35e3

Committed by Gavin Shan on Jun 04, 2014

On the PowerNV platform, EEH errors are reported by IO accessors or by a
poller driven by interrupt. After the PE is isolated, we won't produce an EEH
event for the PE. The current implementation can lose an EEH event in
this way:

The interrupt handler queues one "special" event, which drives the poller.
The EEH thread doesn't pick up the special event yet. IO accessors kick in, the
frozen PE is marked as "isolated" and an EEH event is queued to the list.
The EEH thread runs because of the special event and purges all existing EEH
events. However, we never produce another EEH event for the frozen PE. Eventually,
the PE is marked as "isolated" and we don't have an EEH event to recover it.

The patch fixes the issue by keeping EEH events for PEs that have been
marked as "isolated", with the help of an additional "force" argument to
eeh_remove_event().
Reported-by: Rolf Brudeseth <rolfb@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

5c7a35e3

powerpc: fix typo 'CONFIG_PMAC' · 6e0fdf9a

Committed by Paul Bolle on May 20, 2014

Commit b0d278b7 ("powerpc/perf_event: Reduce latency of calling
perf_event_do_pending") added a check for CONFIG_PMAC where a check for
CONFIG_PPC_PMAC was clearly intended.

Fixes: b0d278b7 ("powerpc/perf_event: Reduce latency of calling perf_event_do_pending")
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

6e0fdf9a

powerpc: fix typo 'CONFIG_PPC_CPU' · b69a1da9

Committed by Paul Bolle on May 20, 2014

Commit cd64d169 ("powerpc: mtmsrd not defined") added a check for
CONFIG_PPC_CPU where a check for CONFIG_PPC_FPU was clearly intended.

Fixes: cd64d169 ("powerpc: mtmsrd not defined")
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

b69a1da9

powerpc/powernv: Don't escalate non-existing frozen PE · 71b540ad

Committed by Gavin Shan on May 05, 2014

Commit cb5b242c ("powerpc/eeh: Escalate error on non-existing PE")
escalates the frozen state on a non-existing PE to a fenced PHB. It
was meant to improve kdump reliability. After that, commit 361f2a2a
("powerpc/powernv: Reset PHB in kdump kernel") was introduced to
issue a complete reset on all PHBs to increase the reliability of
the kdump kernel.

Commit cb5b242c is therefore no longer useful, so revert it.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

71b540ad

powerpc/eeh: Report frozen parent PE prior to child PE · 1ad7a72c

Committed by Gavin Shan on May 05, 2014

When we have the corner case of a frozen parent and child PE at the
same time, we have to handle the frozen parent PE prior to the
child. Without clearing the frozen state on the parent PE, the child
PE can't be recovered successfully.

The patch searches the EEH PE hierarchy tree and returns the topmost
frozen PE to be handled. It ensures the frozen parent PE will be
handled prior to the child PE.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

1ad7a72c
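
The "topmost frozen PE" search from commit 1ad7a72c can be sketched as a walk up a parent chain. This is a minimal illustration, assuming a simplified node type; the struct, its fields, and the function name are hypothetical and do not reflect the kernel's EEH PE structures.

```c
#include <stddef.h>

/* Hypothetical, simplified PE node: only a parent link and a frozen flag. */
struct pe_sketch {
    struct pe_sketch *parent;
    int frozen;
};

/* Walk from pe up to the root and return the highest frozen PE on that
 * path, or NULL if nothing on the path is frozen. Handling that PE first
 * means a frozen parent is always dealt with before its child. */
static struct pe_sketch *topmost_frozen_pe(struct pe_sketch *pe)
{
    struct pe_sketch *top = NULL;

    for (struct pe_sketch *p = pe; p != NULL; p = p->parent) {
        if (p->frozen)
            top = p;  /* later matches are closer to the root */
    }
    return top;
}
```

If both a child and its parent are frozen, the function returns the parent; if only the child is frozen, it returns the child itself.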

powerpc/eeh: Clear frozen state for child PE · 2c665992

Committed by Gavin Shan on May 05, 2014

Since commit cb523e09 ("powerpc/eeh: Avoid I/O access during PE
reset"), the PE is kept in the frozen state at the hardware level until
the PE reset is done completely. After that, we explicitly clear
the frozen state of the affected PE. However, the affected PE might
have frozen child PEs, and we need to clear their frozen state
as well. Otherwise, the recovery is going to fail.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

2c665992

powerpc/powernv: Reduce panic timeout from 180s to 10s · 4817fc32

Committed by Anton Blanchard on May 01, 2014

We've already dropped the default pseries timeout to 10s, do
the same for powernv.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

4817fc32

powerpc/xmon: avoid format string leaking to printk · 50b66dbf

Committed by Kees Cook on Jun 10, 2014

This makes sure format strings cannot leak into printk (the string has
already been correctly processed for format arguments).
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

50b66dbf
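
The class of problem addressed in commit 50b66dbf can be illustrated in userspace with snprintf() standing in for printk(). This is a generic sketch of the safe pattern, not the xmon change itself; the function name log_preformatted is hypothetical.

```c
#include <stdio.h>

/* Copy an already formatted message into dst.
 *
 * The unsafe variant would be snprintf(dst, n, msg): any '%' left in msg
 * would be interpreted as a conversion specifier. Passing msg as an
 * argument to a constant "%s" format keeps it as plain data. */
static int log_preformatted(char *dst, size_t n, const char *msg)
{
    return snprintf(dst, n, "%s", msg);
}
```

A message such as "100%d done" is copied verbatim, because the "%d" inside it is never treated as a format directive.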

powerpc/perf: Ensure all EBB register state is cleared on fork() · 3df48c98

Committed by Michael Ellerman on Jun 10, 2014

In commit 330a1eb7 "Core EBB support for 64-bit book3s" I messed up
clear_task_ebb(). It clears some but not all of the task's Event Based
Branch (EBB) registers when we duplicate a task struct.

That allows a child task to observe the EBBHR & EBBRR of its parent,
which it should not be able to do.

Fix it by clearing EBBHR & EBBRR.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: stable@vger.kernel.org [v3.11+]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

3df48c98

powerpc/powernv: Fix reading of OPAL msglog · caf69ba6

Committed by Joel Stanley on Jun 10, 2014

memory_read_from_buffer returns a signed value, so ret should be
ssize_t.

Fixes the following issue reported by David Binderman:

  [linux-3.15/arch/powerpc/platforms/powernv/opal-msglog.c:65]: (style)
  Checking if unsigned variable 'ret' is less than zero.
  [linux-3.15/arch/powerpc/platforms/powernv/opal-msglog.c:82]: (style)
  Checking if unsigned variable 'ret' is less than zero.

  Local variable "ret" is of type size_t. This is always unsigned,
  so it is pointless to check if it is less than zero.

  https://bugzilla.kernel.org/show_bug.cgi?id=77551

Fixing this exposes a real bug for the case where the entire count
bytes is successfully read from the POS_WRAP case. The second
memory_read_from_buffer will return EINVAL, causing the entire read to
return EINVAL to userspace, despite the data being copied correctly. The
fix is to test for the case where the data has been read and return
early.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

caf69ba6
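
The signedness issue from commit caf69ba6 can be reproduced in a few lines of userspace C. This is a sketch under stated assumptions: read_from_buffer() is a hypothetical stand-in with simplified semantics, not the kernel helper, and the two wrapper functions exist only to contrast the variable types.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for the kernel helper: copies up to count bytes
 * starting at pos, or returns a negative errno for a bad position. */
static ssize_t read_from_buffer(char *to, size_t count, size_t pos,
                                const char *from, size_t available)
{
    if (pos >= available)
        return -EINVAL;
    if (count > available - pos)
        count = available - pos;
    memcpy(to, from + pos, count);
    return (ssize_t)count;
}

/* Buggy pattern: the result lands in an unsigned size_t, so the
 * "ret < 0" test can never be true and the error goes unnoticed. */
static int error_seen_unsigned(const char *src, size_t len, size_t pos)
{
    char out[16];
    size_t ret = read_from_buffer(out, sizeof(out), pos, src, len);

    return ret < 0;  /* always false for an unsigned type */
}

/* Fixed pattern: ssize_t keeps the sign, so the error is detected. */
static int error_seen_signed(const char *src, size_t len, size_t pos)
{
    char out[16];
    ssize_t ret = read_from_buffer(out, sizeof(out), pos, src, len);

    return ret < 0;
}
```

Compilers and static checkers may warn about the always-false comparison in the buggy variant, which is the same kind of diagnostic quoted in the commit message.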

powerpc/spufs: Remove duplicate SPUFS_CNTL_MAP_SIZE define · be8f9642

Committed by Dan Carpenter on Jun 09, 2014

The SPUFS_CNTL_MAP_SIZE define is cut and pasted twice so we can delete
the second instance.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

be8f9642

powerpc/cpm: Remove duplicate FCC_GFMR_TTX define · d9d82123

Committed by Dan Carpenter on Jun 09, 2014

The FCC_GFMR_TTX define is cut and pasted twice so we can remove the
second instance.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

d9d82123

powerpc/powernv: Fix endianness problems in EEH · ddf0322a

Committed by Guo Chao on Jun 09, 2014

EEH information fetched from OPAL needs to be fixed up before use in an LE
environment. To be included in sparse's endian check, declare the fields as
__beXX and access them through accessors.

Cc: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

ddf0322a
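
Why big-endian data from firmware needs an accessor on a little-endian host (commit ddf0322a) can be shown with a portable userspace decoder. This is an illustration only: be64_decode() is a hypothetical helper, whereas the kernel uses its own accessors such as be64_to_cpu() on fields typed __be64.

```c
#include <stdint.h>

/* Decode a big-endian 64-bit value one byte at a time. Unlike a raw
 * pointer cast, this gives the same result on big-endian and
 * little-endian hosts. */
static uint64_t be64_decode(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];  /* most significant byte comes first */
    return v;
}
```

A raw load of the same eight bytes on a little-endian CPU would yield a byte-swapped number, which is the kind of bug that typing the fields as __beXX lets sparse flag.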

powernv: Fix permissions on sysparam sysfs entries · 1bd09890

Committed by Anton Blanchard on Jun 07, 2014

Everyone can write to these files, which is not what we want.

Cc: stable@vger.kernel.org # 3.15
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

1bd09890

powerpc/powernv : Disable subcore for UP configs · ad417330

Committed by Shreyas B. Prabhu on Jun 06, 2014

Build throws following errors when CONFIG_SMP=n
arch/powerpc/platforms/powernv/subcore.c: In function ‘cpu_update_split_mode’:
arch/powerpc/platforms/powernv/subcore.c:274:15: error: ‘setup_max_cpus’ undeclared (first use in this function)
arch/powerpc/platforms/powernv/subcore.c:285:5: error: lvalue required as left operand of assignment

'setup_max_cpus' variable is relevant only on SMP, so there is no point
working around it for UP. Furthermore, subcore itself is relevant only
on SMP and hence the better solution is to exclude subcore.o and
subcore-asm.o for UP builds.
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

ad417330

powerpc/powernv: Include asm/smp.h to fix UP build failure · b2a80878

Committed by Shreyas B. Prabhu on Jun 06, 2014

Build throws following errors when CONFIG_SMP=n
arch/powerpc/platforms/powernv/setup.c: In function ‘pnv_kexec_wait_secondaries_down’:
arch/powerpc/platforms/powernv/setup.c:179:4: error: implicit declaration of function ‘get_hard_smp_processor_id’
    rc = opal_query_cpu_status(get_hard_smp_processor_id(i),

The usage of get_hard_smp_processor_id() needs the declaration from
<asm/smp.h>. The file setup.c includes <linux/sched.h>, which in-turn
includes <linux/smp.h>. However, <linux/smp.h> includes <asm/smp.h>
only on SMP configs and hence UP builds fail.

Fix this by directly including <asm/smp.h> in setup.c unconditionally.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

b2a80878

powerpc: Don't setup CPUs with bad status · 59a53afe

Committed by Michael Neuling on Jun 06, 2014

OPAL will mark a CPU that is guarded as "bad" in the status property of the CPU
node.

Unfortunately Linux doesn't check this property and will put the bad CPU in the
present map.  This has caused hangs on booting when we try to unsplit the core.

This patch checks that the CPU is available via this status property before
putting it in the present map.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Tested-by: Anton Blanchard <anton@samba.org>
cc: stable@vger.kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

59a53afe
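
The status check described in commit 59a53afe can be sketched as a string test on the device tree "status" property. This is an illustration, not the patch: node_is_available() is a hypothetical helper, and the rule it encodes (a missing property, "okay", or "ok" means available) is the common device tree convention, assumed here rather than taken from the commit.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if a node with this "status" property value should be used.
 * Assumption: an absent property (NULL) counts as available, and only
 * "okay" or "ok" mark a present property as usable, so a CPU marked
 * "bad" is skipped. */
static int node_is_available(const char *status)
{
    if (status == NULL)
        return 1;
    return strcmp(status, "okay") == 0 || strcmp(status, "ok") == 0;
}
```

A caller building the present map would skip any CPU node for which this returns 0, which covers the guarded CPUs that OPAL marks as "bad".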

powerpc: Correct DSCR during TM context switch · 96d01610

Committed by Sam Bobroff on Jun 05, 2014

Correct the DSCR SPR becoming temporarily corrupted if a task is
context switched during a transaction.

The problem occurs while suspending the task and is caused by saving
the DSCR to thread.dscr after it has already been set to the CPU's
default value:

__switch_to() calls __switch_to_tm()
	which calls tm_reclaim_task()
	which calls tm_reclaim_thread()
	which calls tm_reclaim()
		where the DSCR is set to the CPU's default
__switch_to() calls _switch()
		where thread.dscr is set to the DSCR

When the task is resumed, its transaction will be doomed (as usual)
and the DSCR SPR will be corrupted, although the checkpointed value
will be correct. Therefore the DSCR will be immediately corrected by
the transaction aborting, unless it has been suspended. In that case
the incorrect value can be seen by the task until it resumes the
transaction.

The fix is to treat the DSCR similarly to the TAR and save it early
in __switch_to().

A program exposing the problem is added to the kernel self tests as:
tools/testing/selftests/powerpc/tm/tm-resched-dscr.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
CC: <stable@vger.kernel.org> [v3.10+]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

96d01610

powerpc: Remove platforms/wsp and associated pieces · fb5a5157

Committed by Michael Ellerman on Jun 02, 2014

__attribute__ ((unused))

WSP is the last user of CONFIG_PPC_A2, so we remove that as well.

Although CONFIG_PPC_ICSWX still exists, it's no longer selectable for
any Book3E platform, so we can remove the code in mmu-book3e.h that
depended on it.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

fb5a5157

powerpc: Remove check for CONFIG_SERIAL_TEXT_DEBUG · 94314290

Committed by Paul Bolle on May 20, 2014

The Kconfig symbol SERIAL_TEXT_DEBUG was removed from
arch/powerpc/Kconfig.debug in v2.6.22. (In v2.6.27 it was also removed
from arch/ppc/Kconfig.debug.) So the check for its macro has evaluated
to false for over five years now. Remove that check and the few lines
of code hidden behind it.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

94314290

powerpc: Add AT_HWCAP2 to indicate V.CRYPTO category support · dd58a092

Committed by Benjamin Herrenschmidt on Jun 10, 2014

The Vector Crypto category instructions are supported by current POWER8
chips, advertise them to userspace using a specific bit to properly
differentiate with chips of the same architecture level that might not
have them.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org> [v3.10+]

dd58a092

booke/watchdog: refine and clean up the codes · d2deebab

Committed by Tang Yuantian on May 08, 2014

Basically, this patch does the following:
1. Move the code that parses boot parameters from setup-common.c
   to the driver. This way, readers of the code can see directly that
   there are boot parameters that can change the timeout.
2. Make the boot parameter 'booke_wdt_period' effective.
   Currently, when the driver is loaded, the default timeout is always
   used instead of booke_wdt_period.
3. Wrap up the watchdog timeout in the device struct and clean up
   unnecessary code.
Signed-off-by: Tang Yuantian <yuantian.tang@freescale.com>
Acked-by: Scott Wood <scottwood@freescale.com>
Reviewed-by: Li Yang <leoli@freescale.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>

d2deebab

07 Jun, 2014 2 commits

powerpc: update comments for generic idle conversion · 0d2b7ea9

Committed by Geert Uytterhoeven on Jun 06, 2014

As of commit 799fef06 ("powerpc: Use generic idle loop"), this
applies to arch_cpu_idle() instead of cpu_idle().
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0d2b7ea9

powerpc/powernv: Add missing include to LPC code · 0c0a3e5a

Committed by Benjamin Herrenschmidt on Jun 07, 2014

kbuild bot spotted that one:

  arch/powerpc/platforms/powernv/opal-lpc.c: In function 'opal_lpc_init_debugfs':
>> arch/powerpc/platforms/powernv/opal-lpc.c:319:35: error: 'powerpc_debugfs_root' undeclared (first use in this function)
     root = debugfs_create_dir("lpc", powerpc_debugfs_root);
                                      ^
We need to include the definition explicitly.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

0c0a3e5a

06 Jun, 2014 1 commit

powerpc/mm: Check paca psize is up to date for huge mappings · 09567e7f

Committed by Michael Ellerman on May 28, 2014

We have a bug in our hugepage handling which exhibits as an infinite
loop of hash faults. If the fault is being taken in the kernel it will
typically trigger the softlockup detector, or the RCU stall detector.

The bug is as follows:

 1. mmap(0xa0000000, ..., MAP_FIXED | MAP_HUGETLB | MAP_ANONYMOUS ..)
 2. Slice code converts the slice psize to 16M.
 3. The code on lines 539-540 of slice.c in slice_get_unmapped_area()
    synchronises the mm->context with the paca->context. So the paca slice
    mask is updated to include the 16M slice.
 3. Either:
    * mmap() fails because there are no huge pages available.
    * mmap() succeeds and the mapping is then munmapped.
    In both cases the slice psize remains at 16M in both the paca & mm.
 4. mmap(0xa0000000, ..., MAP_FIXED | MAP_ANONYMOUS ..)
 5. The slice psize is converted back to 64K. Because of the check on line 539
    of slice.c we DO NOT update the paca->context. The paca slice mask is now
    out of sync with the mm slice mask.
 6. User/kernel accesses 0xa0000000.
 7. The SLB miss handler slb_allocate_realmode() **uses the paca slice mask**
    to create an SLB entry and inserts it in the SLB.
 8. With the 16M SLB entry in place the hardware does a hash lookup, no entry
    is found so a data access exception is generated.
 9. The data access handler calls do_page_fault() -> handle_mm_fault().
10. __handle_mm_fault() creates a THP mapping with do_huge_pmd_anonymous_page().
11. The hardware retries the access, there is still nothing in the hash table
    so once again a data access exception is generated.
12. hash_page() calls into __hash_page_thp() and inserts a mapping in the
    hash. Although the THP mapping maps 16M the hashing is done using 64K
    as the segment page size.
13. hash_page() returns immediately after calling __hash_page_thp(), skipping
    over the code at line 1125. Resulting in the mismatch between the
    paca->context and mm->context not being detected.
14. The hardware retries the access, the hash it generates using the 16M
    SLB entry does NOT match the hash we inserted.
15. We take another data access and go into __hash_page_thp().
16. We see a valid entry in the hpte_slot_array and so we call updatepp()
    which succeeds.
17. Goto 14.

We could fix this in two ways. The first would be to remove or modify
the check on line 539 of slice.c.

The second option is to cause the check of paca psize in hash_page() on
line 1125 to also be done for THP pages.

We prefer the latter, because the check & update of the paca psize is
not done until we know it's necessary. It's also done only on the
current cpu, so we don't need to IPI all other cpus.

Without further rearranging the code, the simplest fix is to pull out
the code that checks paca psize and call it in two places. Firstly for
THP/hugetlb, and secondly for other mappings as before.

Thanks to Dave Jones for trinity, which originally found this bug.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: stable@vger.kernel.org [v3.11+]

09567e7f

openanolis / cloud-kernel synced successfully more than 1 year ago
