提交 · c533b46cc7da58e5ccbd20ac2f607ff60d24eb4e · openeuler / Kernel

10 9月, 2012 11 次提交

powerpc/eeh: Build EEH event based on PE · c533b46c

由 Gavin Shan 提交于 9月 07, 2012

The original implementation builds EEH event based on EEH device.
We already had dedicated struct to depict PE. It's reasonable to
build EEH event based on PE.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

c533b46c

powerpc/eeh: Remove PE at appropriate time · 82e8882f

由 Gavin Shan 提交于 9月 07, 2012

During PCI hotplug and EEH recovery, the PE hierarchy tree might be
changed due to the PCI topology changes. At later point when the
PCI device is added, the PE will be created dynamically again.

The patch introduces new function to remove EEH devices from the
associated PE. That also can cause that the parent PE is removed
from the PE tree if the parent PE doesn't include valid EEH devices
and child PEs.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

82e8882f

powerpc/eeh: Create PEs duing EEH initialization · 9b84348c

由 Gavin Shan 提交于 9月 07, 2012

The patch creates PEs and associated the newly created PEs with
it parent/silbing as well as EEH devices. It would become more
straight to trace EEH errors and recover them accordingly.

Once the EEH functionality on one PCI IOA has been enabled, we
tries to create PE against it. If there's existing PE, to which
the current PCI IOA should be attached, the existing PE will be
converted from "device" type to "bus" type accordingly.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

9b84348c

powerpc/eeh: Search PE based on requirement · 22f4ab12

由 Gavin Shan 提交于 9月 07, 2012

The patch implements searching PE based on the following
requirements:

 * Search PE according to PE address, which is traditional
   PE address that is composed of PCI bus/device/function
   number, or unified PE address assigned by firmware or
   platform.
 * Search parent PE according to the given EEH device. It's
   useful when creating new PE and put it into right position.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

22f4ab12

powerpc/eeh: Create PEs for PHBs · 55037d17

由 Gavin Shan 提交于 9月 07, 2012

For one particular PE, it's only meaningful in the ancestor PHB
domain. Therefore, each PHB should have its own PE hierarchy tree
to trace those PEs created against the PHB.

The patch creates PEs for the PHBs and put those PEs into the
global link list traced by "eeh_phb_pe". The link list of PEs
would be first level of overall PE hierarchy tree across the
system.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

55037d17

powerpc/eeh: Introduce global mutex · 646a8499

由 Gavin Shan 提交于 9月 07, 2012

The patch introduces global mutex for EEH so that the core data
structures can be protected by that. Also, 2 inline functions
are exported for that: eeh_lock() and eeh_unlock().
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

646a8499

powerpc/eeh: Introduce eeh_pe struct · 968f968f

由 Gavin Shan 提交于 9月 07, 2012

As defined in PAPR 2.4, Partitionable Endpoint (PE) is an I/O subtree
that can be treated as a unit for the purposes of partitioning and error
recovery. Therefore, eeh core should be aware of PE. With eeh_pe struct,
we can support PE explicitly. Further more, it makes all the stuff much
more data centralized. Another important reason is for eeh core to support
multiple platforms. Some of them like pSeries figures out PEs through
OF nodes while others like powernv have to do that through PCI bus/device
tree. With explicit PE support, eeh core will be implemented based on
the centrialized data and platform dependent implementations figure it
out by their feasible ways.

When the struct is designed, following factors are taken in account:
  * Reflecting the relationships of PEs. PE might have parent
    as well children.
  * Reflecting the association of PE and (eeh) devices.
  * PEs have PHB boundary.
  * PE should have unique address assigned in the corresponding
    PHB domain.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

968f968f

powerpc/eeh: More logs for EEH initialization · 3ea1ae98

由 Gavin Shan 提交于 9月 07, 2012

The patch adds more logs to EEH initialization functions for
debugging purpose. Also, the machine type (pSeries) is checked
in the platform initialization to assure it's the correct platform
to invoke it.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

3ea1ae98

powerpc/eeh: Use slab to allocate eeh devices · 7e4bbaf0

由 Gavin Shan 提交于 9月 07, 2012

The EEH initialization functions have been postponed until slab/slub
are ready. So we use slab/slub to allocate the memory chunks for newly
creatd EEH devices. That would save lots of memory.

The patch also does cleanup to replace "kmalloc" with "kzalloc" so
that we needn't clear the allocated memory chunk explicitly.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

7e4bbaf0

powerpc/eeh: Move EEH initialization around · 35e5cfe2

由 Gavin Shan 提交于 9月 07, 2012

Currently, we have 3 phases for EEH initialization on pSeries platform.
All of them are done through builtin functions: platform initialization,
EEH device creation, and EEH subsystem enablement. All of them are done
no later than ppc_md.setup_arch. That means that the slab/slub isn't ready
yet, so we have to allocate memory chunks on basis of PAGE_SIZE for those
dynamically created EEH devices. That's pretty expensive.

In order to utilize slab/slub for memory allocation, we have to move the EEH
initialization functions around, but all of them should be called after slab
is ready.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

35e5cfe2

powerpc: Initialise paca.data_offset with poison · 407821a3

由 Michael Ellerman 提交于 9月 07, 2012

It's possible for the cpu_possible_mask to change between the time we
initialise the pacas and the time we setup per_cpu areas.

Obviously impossible cpus shouldn't ever be running, but stranger things
have happened. So be paranoid and initialise data_offset with a poison
value in case we don't set it up later.

Based on a patch from Anton Blanchard.
Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

407821a3

07 9月, 2012 13 次提交

powerpc: Use the XDABR hcall · 06c88766

由 Michael Neuling 提交于 9月 05, 2012

We never use the XDABR hcall since we check for DABR hcall first.
XDABR syscall is better since it allows us to also set the DABRX.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

06c88766

powerpc: Use consistent name info for arch_hw_breakpoint · 3f4693ee

由 Michael Neuling 提交于 9月 05, 2012

Change bp_info to info to be consistent with the rest of this file.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

3f4693ee

powerpc: Pack arch_hw_breakpoint to avoid holes in struct · ab046a93

由 Michael Neuling 提交于 9月 05, 2012

No functional change
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

ab046a93

powerpc: Export memory limit via device tree · 4bc77a5e

由 Suzuki Poulose 提交于 8月 21, 2012

The powerpc kernel doesn't export the memory limit enforced by 'mem='
kernel parameter. This is required for building the ELF header in
kexec-tools to limit the vmcore to capture only the used memory. On
powerpc the kexec-tools depends on the device-tree for memory related
information, unlike /proc/iomem on the x86.

Without this information, the kexec-tools assumes the entire System
RAM and vmcore creates an unnecessarily larger dump.

This patch exports the memory limit, if present, via
chosen/linux,memory-limit
property, so that the vmcore can be limited to the memory limit.

The prom_init seems to export this value in the same node. But doesn't
really
appear there.  Also the memory_limit gets adjusted with the processing of
crashkernel= parameter. This patch makes sure we get the actual limit.

The kexec-tools will use the value to limit the 'end' of the memory
regions.

Tested this patch on ppc64 and ppc32(ppc440) with a kexec-tools
patch by Mahesh.
Signed-off-by: NSuzuki K. Poulose <suzuki@in.ibm.com>
Tested-by: NMahesh J. Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

4bc77a5e

powerpc: Change memory_limit from phys_addr_t to unsigned long long · a84fcd46

由 Suzuki Poulose 提交于 8月 21, 2012

There are some device-tree nodes, whose values are of type phys_addr_t.
The phys_addr_t is variable sized based on the CONFIG_PHSY_T_64BIT.

Change these to a fixed unsigned long long for consistency.

This patch does the change only for memory_limit.

The following is a list of such variables which need the change:

 1) kernel_end, crashk_size - in arch/powerpc/kernel/machine_kexec.c

 2) (struct resource *)crashk_res.start - We could export a local static
    variable from machine_kexec.c.

Changing the above values might break the kexec-tools. So, I will
fix kexec-tools first to handle the different sized values and then change
 the above.
Suggested-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NSuzuki K. Poulose <suzuki@in.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

a84fcd46

powerpc: Fix build dependencies for c files requiring libfdt.h · 0090e02b

由 Matthew McClintock 提交于 9月 06, 2012

Several files in obj-plat depend on libfdt header file. Sometimes
when building one can see the following issue. This patch adds
libfdt as dependency to those object files

| In file included from arch/powerpc/boot/treeboot-iss4xx.c:33:0:
| arch/powerpc/boot/libfdt.h:854:1: error: unterminated comment
| In file included from arch/powerpc/boot/treeboot-iss4xx.c:33:0:
| arch/powerpc/boot/libfdt.h:1:0: error: unterminated #ifndef
|   BOOTCC  arch/powerpc/boot/inffast.o
| make[1]: *** [arch/powerpc/boot/treeboot-iss4xx.o] Error 1
| make[1]: *** Waiting for unfinished jobs....
|   BOOTCC  arch/powerpc/boot/inflate.o
| make: *** [uImage] Error 2
| ERROR: oe_runmake failed
| ERROR: Function failed: do_compile (see /srv/home/pokybuild/yocto-autobuilder/yocto-slave/p1022ds/build/build/tmp/work/p1022ds-poky-linux-gnuspe/linux-qoriq-sdk-3.0.34-r5/temp/log.do_compile.2167 for further information)
NOTE: recipe linux-qoriq-sdk-3.0.34-r5: task do_compile: Failed
Signed-off-by: NMatthew McClintock <msm@freescale.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

0090e02b

powerpc/oprofile: Fix marked events support on Power7+ not set. · adbf115d

由 Carl E. Love 提交于 8月 03, 2012

Starting with Power 7+ we need to check for marked events if the SIAR
register is valid, i.e. it contains the correct address of the instruction
at the time the performance counter overflowed. The mmcra register on
Power 7+, contains a new bit to indicate that the contents of the SIAR
is valid. If the event is not marked, then the sample is recorded
independently of the SIAR valid bit setting. For older processors, there
is no SIAR valid bit to check so the samples are always recorded. This is
done by forcing the cntr_marked_events bit mask to zero. The code will
always record the sample in this case since the bit mask says the event is
not a marked event even if it really is a marked event.
Signed-off-by: NCarl Love <cel@us.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

adbf115d

powerpc: Define Power7+ PV constant PV_POWER7p · 22d8ce88

由 sukadev@linux.vnet.ibm.com 提交于 7月 16, 2012

This definition will be used by subsequent perf and oprofile patches
Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

22d8ce88

powerpc/pseries: Round up MSI-X requests · 752f5216

由 Anton Blanchard 提交于 6月 04, 2012

The pseries firmware currently refuses any non power of two MSI-X
request. Unfortunately most network drivers end up asking for that
because they want a power of two for RX queues and one or two extra
for everything else.

This patch rounds up the firmware request to the next power of two
if the quota allows it. If this fails we fall back to using the
original request size.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

752f5216

powerpc/pci: Save P2P bridge resource if possible · cf1a4cf8

由 Gavin Shan 提交于 6月 03, 2012

When PCI probe flag PCI_REASSIGN_ALL_RSRC has been passed into PCI
core, it's hoped that all resources to be reassigned by PCI core.
As to particular P2P (PCI-to-PCI) bridge, the size of the corresponding
BAR (I/O, MMIO, prefetchable MMIO) is calculated by the resources
required by the PCI devices behind the P2P bridge. That means that
the information like start/end address retrieved from the hardware
registers of the P2P bridge is meainingless in the case. However,
we still count that in and the BARs might have been configured by
firmware with non-zero size. That leads to space waste.

The patch explicitly sets the size of P2P bridge BARs to zero in
case that resource reassignment is expected with PCI probe flag
PCI_REASSIGN_ALL_RSRC. In the result, it will save overall resource
required by the system without waste.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

cf1a4cf8

B
Merge branch 'merge' into next · fff34b34
由 Benjamin Herrenschmidt 提交于 9月 07, 2012
```
Brings in various bug fixes from 3.6-rcX
```
fff34b34

powerpc/kprobes: Rename opcode_t in probes.h to ppc_opcode_t · 28e1e58f

由 Ananth N Mavinakayanahalli 提交于 9月 05, 2012

commit: 8b7b80b9
[24/29] powerpc: Uprobes port to powerpc

Caused a clash with the fore200e driver:

In file included from drivers/atm/fore200e.c:70:0:
drivers/atm/fore200e.h:263:3: error: redefinition of typedef 'opcode_t' with different type
arch/powerpc/include/asm/probes.h:25:13: note: previous declaration of 'opcode_t' was here

Fix the namespace clash by making opcode_t in probes.h to ppc_opcode_t.
Signed-off-by: NAnanth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

28e1e58f

powerpc: Restore VDSO information on critical exception om BookE · 0127262c

由 Mihai Caraman 提交于 9月 06, 2012

Critical exception on 64-bit booke uses user-visible SPRG3 as scratch.
Restore VDSO information in SPRG3 on exception prolog.

Use a common sprg3 field in PACA for all powerpc64 architectures.
Signed-off-by: NMihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

0127262c

05 9月, 2012 16 次提交

powerpc: Don't use __put_user() in patch_instruction · 636802ef

由 Benjamin Herrenschmidt 提交于 9月 04, 2012

patch_instruction() can be called very early on ppc32, when the kernel
isn't yet running at it's linked address. That can cause the !
is_kernel_addr() test in __put_user() to trip and call might_sleep()
which is very bad at that point during boot.

Use a lower level function instead for now, at least until we get to
rework ppc32 boot process to do the code patching later, like ppc64
does.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

636802ef

powerpc: Make sure IPI handlers see data written by IPI senders · 9fb1b36c

由 Paul Mackerras 提交于 9月 04, 2012

We have been observing hangs, both of KVM guest vcpu tasks and more
generally, where a process that is woken doesn't properly wake up and
continue to run, but instead sticks in TASK_WAKING state.  This
happens because the update of rq->wake_list in ttwu_queue_remote()
is not ordered with the update of ipi_message in
smp_muxed_ipi_message_pass(), and the reading of rq->wake_list in
scheduler_ipi() is not ordered with the reading of ipi_message in
smp_ipi_demux().  Thus it is possible for the IPI receiver not to see
the updated rq->wake_list and therefore conclude that there is nothing
for it to do.

In order to make sure that anything done before smp_send_reschedule()
is ordered before anything done in the resulting call to scheduler_ipi(),
this adds barriers in smp_muxed_message_pass() and smp_ipi_demux().
The barrier in smp_muxed_message_pass() is a full barrier to ensure that
there is a full ordering between the smp_send_reschedule() caller and
scheduler_ipi().  In smp_ipi_demux(), we use xchg() rather than
xchg_local() because xchg() includes release and acquire barriers.
Using xchg() rather than xchg_local() makes sense given that
ipi_message is not just accessed locally.

This moves the barrier between setting the message and calling the
cause_ipi() function into the individual cause_ipi implementations.
Most of them -- those that used outb, out_8 or similar -- already had
a full barrier because out_8 etc. include a sync before the MMIO
store.  This adds an explicit barrier in the two remaining cases.

These changes made no measurable difference to the speed of IPIs as
measured using a simple ping-pong latency test across two CPUs on
different cores of a POWER7 machine.

The analysis of the reason why processes were not waking up properly
is due to Milton Miller.

Cc: stable@vger.kernel.org # v3.0+
Reported-by: NMilton Miller <miltonm@bga.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

9fb1b36c

powerpc: Restore correct DSCR in context switch · 71433285

由 Anton Blanchard 提交于 9月 03, 2012

During a context switch we always restore the per thread DSCR value.
If we aren't doing explicit DSCR management
(ie thread.dscr_inherit == 0) and the default DSCR changed while
the process has been sleeping we end up with the wrong value.

Check thread.dscr_inherit and select the default DSCR or per thread
DSCR as required.

This was found with the following test case, when running with
more threads than CPUs (ie forcing context switching):

http://ozlabs.org/~anton/junkcode/dscr_default_test.c

With the four patches applied I can run a combination of all
test cases successfully at the same time:

http://ozlabs.org/~anton/junkcode/dscr_default_test.c
http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c
http://ozlabs.org/~anton/junkcode/dscr_inherit_test.cSigned-off-by: NAnton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

71433285

powerpc: Fix DSCR inheritance in copy_thread() · 1021cb26

由 Anton Blanchard 提交于 9月 03, 2012

If the default DSCR is non zero we set thread.dscr_inherit in
copy_thread() meaning the new thread and all its children will ignore
future updates to the default DSCR. This is not intended and is
a change in behaviour that a number of our users have hit.

We just need to inherit thread.dscr and thread.dscr_inherit from
the parent which ends up being much simpler.

This was found with the following test case:

http://ozlabs.org/~anton/junkcode/dscr_default_test.cSigned-off-by: NAnton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

1021cb26

powerpc: Keep thread.dscr and thread.dscr_inherit in sync · 00ca0de0

由 Anton Blanchard 提交于 9月 03, 2012

When we update the DSCR either via emulation of mtspr(DSCR) or via
a change to dscr_default in sysfs we don't update thread.dscr.
We will eventually update it at context switch time but there is
a period where thread.dscr is incorrect.

If we fork at this point we will copy the old value of thread.dscr
into the child. To avoid this, always keep thread.dscr in sync with
reality.

This issue was found with the following testcase:

http://ozlabs.org/~anton/junkcode/dscr_inherit_test.cSigned-off-by: NAnton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

00ca0de0

powerpc: Update DSCR on all CPUs when writing sysfs dscr_default · 1b6ca2a6

由 Anton Blanchard 提交于 9月 03, 2012

Writing to dscr_default in sysfs doesn't actually change the DSCR -
we rely on a context switch on each CPU to do the work. There is no
guarantee we will get a context switch in a reasonable amount of time
so fire off an IPI to force an immediate change.

This issue was found with the following test case:

http://ozlabs.org/~anton/junkcode/dscr_explicit_test.cSigned-off-by: NAnton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

1b6ca2a6

powerpc/powernv: Always go into nap mode when CPU is offline · 375f561a

由 Paul Mackerras 提交于 7月 26, 2012

The CPU hotplug code for the powernv platform currently only puts
offline CPUs into nap mode if the powersave_nap variable is set.
However, HV-style KVM on this platform requires secondary CPU threads
to be offline and in nap mode.  Since we know nap mode works just
fine on all POWER7 machines, and the only machines that support the
powernv platform are POWER7 machines, this changes the code to
always put offline CPUs into nap mode, regardless of powersave_nap.
Powersave_nap still controls whether or not CPUs go into nap mode
when idle, as before.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

375f561a

powerpc: Give hypervisor decrementer interrupts their own handler · dabe859e

由 Paul Mackerras 提交于 7月 26, 2012

At the moment the handler for hypervisor decrementer interrupts is
the same as for decrementer interrupts, i.e. timer_interrupt().
This is bogus; if we ever do get a hypervisor decrementer interrupt
it won't have anything to do with the next timer event.  In fact
the only time we get hypervisor decrementer interrupts is when one
is left pending on exit from a KVM guest.

When we get a hypervisor decrementer interrupt we don't need to do
anything special to clear it, since they are edge-triggered on the
transition of HDEC from 0 to -1.  Thus this adds an empty handler
function for them.  We don't need to have them masked when interrupts
are soft-disabled, so we use STD_EXCEPTION_HV instead of
MASKABLE_EXCEPTION_HV.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

dabe859e

powerpc/vphn: Fix arch_update_cpu_topology() return value · 79c5fceb

由 Jesse Larrew 提交于 6月 07, 2012

arch_update_cpu_topology() should only return 1 when the topology has
actually changed, and should return 0 otherwise.

This patch fixes a potential bug where rebuild_sched_domains() would
reinitialize the sched domains even when the topology hasn't changed.
Signed-off-by: NJesse Larrew <jlarrew@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

79c5fceb

powerpc/booke64: Use SPRG0/3 scratch for bolted TLB miss & crit int · 8b64a9df

由 Mihai Caraman 提交于 8月 06, 2012

Embedded.Hypervisor category defines GSPRG0..3 physical registers for guests.
Avoid SPRG4-7 usage as scratch in host exception handlers, otherwise guest
SPRG4-7 registers will be clobbered.
For bolted TLB miss exception handlers, which is the version currently
supported by KVM, use SPRN_SPRG_GEN_SCRATCH aka SPRG0 instead of
SPRN_SPRG_TLB_SCRATCH aka SPRG6. Keep using TLB PACA slots to fit in one
64-byte cache line.
For critical exception handlers use SPRG3 instead of SPRG7. Provide a routine
to store and restore user-visible SPRGs. This will be subsequently used
to restore VDSO information in SPRG3. Add EX_R13 to paca slots to free up
SPRG3 and change the critical exception epilog to use it.
Signed-off-by: NMihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

8b64a9df

powerpc/booke64: Eemove mfspr srr1 duplicate in exception prolog · 79b5c8db

由 Mihai Caraman 提交于 8月 06, 2012

Refactor exception prolog to get rid of mfspr srr1 duplicate. This was
introduced by KVM integration, with DO_KVM macro logic expecting srr1 value
earlier in r11.
Reserve r11 to hold srr1's value also required at the end of the prolog and
free up r10 to serve as spare in addition macros.
For syscalls case this change does not add any performance penalty. For irq
soft-disabled case the change adds a store/load of conditional register value
to/from a paca slot. Paca slots fit in one 64-byte cache line so these
additional operations have little impact on performance.
Signed-off-by: NMihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

79b5c8db

powerpc/booke64: Add DO_KVM kernel hooks · fecff0f7

由 Mihai Caraman 提交于 8月 06, 2012

Hook DO_KVM macro into 64-bit booke for KVM integration. Extend interrupt
handlers' parameter list with interrupt vector numbers to accomodate the macro.
Only the bolted version of tlb miss handers is addressed now.
Signed-off-by: NMihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

fecff0f7

powerpc/booke64: Use GSRR registers in Guest Doorbell interrupts · 5473eb1c

由 Mihai Caraman 提交于 8月 06, 2012

Guest Doorbell interrupts use guest save and restore registers. Add a new
Guest Doorbell exception type to accommodate GSRR0/1 SPRs usage in exception
prolog and fix the exception handler.
Signed-off-by: NMihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

5473eb1c

powerpc/booke64: Fix machine check handler to use the right prolog · a1310757

由 Mihai Caraman 提交于 8月 06, 2012

Machine check exception handler was using a wrong prolog. Hypervisors like
KVM which are called early from the exception handler rely on the interrupt
source.
Signed-off-by: NMihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

a1310757

powerpc: Uprobes port to powerpc · 8b7b80b9

由 Ananth N Mavinakayanahalli 提交于 8月 23, 2012

This is the port of uprobes to powerpc. Usage is similar to x86.

[root@xxxx ~]# ./bin/perf probe -x /lib64/libc.so.6 malloc
Added new event:
  probe_libc:malloc    (on 0xb4860)

You can now use it in all perf tools, such as:

	perf record -e probe_libc:malloc -aR sleep 1

[root@xxxx ~]# ./bin/perf record -e probe_libc:malloc -aR sleep 20
[ perf record: Woken up 22 times to write data ]
[ perf record: Captured and wrote 5.843 MB perf.data (~255302 samples) ]
[root@xxxx ~]# ./bin/perf report --stdio
...

    69.05%           tar  libc-2.12.so   [.] malloc
    28.57%            rm  libc-2.12.so   [.] malloc
     1.32%  avahi-daemon  libc-2.12.so   [.] malloc
     0.58%          bash  libc-2.12.so   [.] malloc
     0.28%          sshd  libc-2.12.so   [.] malloc
     0.08%    irqbalance  libc-2.12.so   [.] malloc
     0.05%         bzip2  libc-2.12.so   [.] malloc
     0.04%         sleep  libc-2.12.so   [.] malloc
     0.03%    multipathd  libc-2.12.so   [.] malloc
     0.01%      sendmail  libc-2.12.so   [.] malloc
     0.01%     automount  libc-2.12.so   [.] malloc

The trap_nr addition patch is a prereq.
Signed-off-by: NAnanth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

8b7b80b9

powerpc: Add trap_nr to thread_struct · 41ab5266

由 Ananth N Mavinakayanahalli 提交于 8月 23, 2012

Add thread_struct.trap_nr and use it to store the last exception
the thread experienced. In this patch, we populate the field at
various places where we force_sig_info() to the process.

This is also used in uprobes to determine if the probed instruction
caused an exception.
Signed-off-by: NAnanth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

41ab5266

openeuler / Kernel 大约 1 年 前同步成功

openeuler / Kernel
大约 1 年前同步成功