1. 02 Sep, 2009 (3 commits)
  2. 28 Aug, 2009 (13 commits)
  3. 27 Aug, 2009 (3 commits)
    • powerpc/pseries: Reduce the polling interval in __cpu_up() · 67764263
      Authored by Gautham R Shenoy
      The time taken for a single cpu online operation on a pseries machine
      is as follows:
      Dedicated LPAR (POWER6): ~220ms.
      Shared LPAR (POWER5)   : ~240ms.
      
      Of this time, approximately 200ms is taken up by __cpu_up(). This is because
      we poll every 200ms to check if the new cpu has notified its presence
      through the cpu_callin_map. We repeat this operation until the new cpu sets
      the value in cpu_callin_map or 5 seconds elapse, whichever comes earlier.
      
      However, using completion_structs instead of polling loops,
      the time taken by the new processor to indicate its presence has
      been found to be less than 1ms on pseries. That method, however, may not
      work on all powerpc platforms due to the time-base synchronization code.
      
      Keeping this in mind, we can reduce the msleep polling interval from
      200ms to 1ms while retaining the 5 second timeout (a simplified sketch
      of the resulting loop follows this entry).
      
      With this, the time taken for a cpu online operation changes as follows:
      Dedicated LPAR (POWER6): 20-25ms.
      Shared LPAR (POWER5)   : 60-80ms.
      
      In both these cases, it was found that the code polls through the loop
      only once, indicating that 1ms is a reasonable value, at least on pseries.
      
      The code needs testing on other powerpc platforms.
      Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
      Acked-by: Joel Schopp <jschopp@austin.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      67764263
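      A minimal sketch of the change described above, assuming the cpu_callin_map
      array and the msleep()/printk() helpers used in arch/powerpc/kernel/smp.c;
      the function name and exact loop bounds are illustrative, not a copy of the
      patch:

          /* Hedged sketch: wait up to 5 seconds for the new cpu to check in,
           * polling every 1ms instead of the previous 200ms. */
          static int wait_for_cpu_callin(unsigned int cpu)
          {
                  int c;

                  /* 5000 iterations x 1ms keeps the 5 second timeout. */
                  for (c = 5000; c > 0 && !cpu_callin_map[cpu]; c--)
                          msleep(1);

                  if (!cpu_callin_map[cpu]) {
                          printk(KERN_ERR "Processor %u is stuck.\n", cpu);
                          return -ENOENT;
                  }
                  return 0;
          }

      Because the new cpu typically checks in well under 1ms, the loop usually
      exits on its first pass, which matches the 20-25ms total online time quoted
      above.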
    • powerpc: Fix __flush_icache_range on 44x · 14d75752
      Authored by Josh Boyer
      The ptrace POKETEXT interface allows a process to modify the text pages of
      a child process being ptraced, usually to insert breakpoints via trap
      instructions.  The kernel eventually calls copy_to_user_page, which in turn
      calls __flush_icache_range to invalidate the icache lines for the child
      process.
      
      However, this function does not work on 44x because the icache is virtually
      indexed.  This was noticed when a breakpoint was triggered after it had been
      cleared by ltrace on a 440EPx board.  A convenient fix is to do a flash
      invalidate of the icache in the __flush_icache_range function (sketched
      after this entry).
      Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      14d75752
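      A hedged illustration of the "flash invalidate" idea: on 440 cores the iccci
      instruction invalidates the entire instruction cache in one go (its operands
      are ignored), which sidesteps the virtually indexed lookup problem described
      above. This is a sketch, not the actual assembly change made in the kernel:

          /* Hedged sketch: flash-invalidate the whole icache on a 440 core. */
          static inline void icache_44x_flash_invalidate(void)
          {
                  asm volatile("iccci 0, 0" : : : "memory");
                  asm volatile("isync");  /* discard already-fetched instructions */
          }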
    • powerpc/mm: Cleanup handling of execute permission · ea3cc330
      Authored by Benjamin Herrenschmidt
      This is an attempt at cleaning up a bit the way we handle execute
      permission on powerpc. _PAGE_HWEXEC is gone, _PAGE_EXEC is now only
      defined by CPUs that can do something with it, and the myriad of
      #ifdef's in the I$/D$ coherency code is reduced to 2 cases that
      hopefully should cover everything.
      
      The logic on BookE is a little different from what it was, though not
      by much. Since _PAGE_EXEC will now be set by the generic code for
      executable pages, we need to filter it out when the page isn't yet
      cache-clean and restore it once it is (see the sketch after this entry).
      However, I don't expect the code to be any more bloated than it already
      was in that area due to that change.
      
      I could boast that this brings proper enforcement of per-page execute
      permissions to all BookE and 40x, but in fact we've had that for
      some time as a side effect of my previous rework in that area (and
      I didn't even know it :-). We would only enable execute permission if
      the page was cache-clean, and we would only cache-clean it if we took
      an exec fault. Since we now enforce that the latter only works if
      VM_EXEC is part of the VMA flags, we de facto already enforce per-page
      execute permissions... unless I missed something.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      ea3cc330
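      A hedged sketch of the "filter and recover" idea described above; the helper
      name and the use of PG_arch_1 as the "I-cache clean" marker are illustrative
      assumptions rather than the patch itself:

          /* Hedged sketch: the generic code sets _PAGE_EXEC on executable PTEs;
           * strip it again for pages that are not yet I-cache clean, so that a
           * later exec fault can clean the page and restore exec permission. */
          static pte_t filter_exec_if_unclean(pte_t pte, struct page *page)
          {
                  /* PG_arch_1 is assumed here to mean "flushed / icache clean". */
                  if (!test_bit(PG_arch_1, &page->flags))
                          pte = __pte(pte_val(pte) & ~_PAGE_EXEC);
                  return pte;
          }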
  4. 20 Aug, 2009 (18 commits)
  5. 18 Aug, 2009 (2 commits)
    • perf_counter: powerpc: Add callchain support · 20002ded
      Authored by Paul Mackerras
      This adds support for tracing callchains for powerpc, both 32-bit
      and 64-bit, and both in the kernel and userspace, from PMU interrupt
      context.
      
      The first three entries stored for each callchain are the NIP (next
      instruction pointer), LR (link register), and the contents of the LR
      save area in the second stack frame (the first is ignored because the
      ABI convention on powerpc is that functions save their return address
      in their caller's stack frame).  Because leaf functions don't have to
      save their return address (LR value) and don't have to establish a
      stack frame, it's possible for either or both of LR and the second
      stack frame's LR save area to have valid return addresses in them.
      This is basically impossible to disambiguate without either reading
      the code or looking at auxiliary information such as CFI tables.
      Since we don't want to do either of those things at interrupt time,
      we store both LR and the second stack frame's LR save area.
      
      Once we get past the second stack frame, there is no ambiguity; all
      return addresses we get are reliable.
      
      For kernel traces, we check whether the return addresses are valid kernel
      instruction addresses and store zero instead if they are not (rather than
      omitting them, which would make it impossible for userspace to know
      which was which).  We also store zero instead of the second stack
      frame's LR save area value if it is the same as LR (the capture order is
      sketched after this entry).
      
      For kernel traces, we check for interrupt frames, and for user traces,
      we check for signal frames.  In each case, since we're starting a new
      trace, we store a PERF_CONTEXT_KERNEL/USER marker so that userspace
      knows that the next three entries are NIP, LR and the second stack frame
      for the interrupted context.
      
      We read user memory with __get_user_inatomic.  On 64-bit, if this
      PMU interrupt occurred while interrupts are soft-disabled, and
      there is no MMU hash table entry for the page, we will get an
      -EFAULT return from __get_user_inatomic even if there is a valid
      Linux PTE for the page, since hash_page isn't reentrant.  Thus we
      have code here to read the Linux PTE and access the page via the
      kernel linear mapping.  Since 64-bit doesn't use (or need) highmem
      there is no need to do kmap_atomic.  On 32-bit, we don't do soft
      interrupt disabling, so this complication doesn't occur and there
      is no need to fall back to reading the Linux PTE, since hash_page
      (or the TLB miss handler) will get called automatically if necessary.
      
      Note that we cannot get PMU interrupts in the interval during
      context switch between switch_mm (which switches the user address
      space) and switch_to (which actually changes current to the new
      process).  On 64-bit this is because interrupts are hard-disabled
      in switch_mm and stay hard-disabled until they are soft-enabled
      later, after switch_to has returned.  So there is no possibility
      of trying to do a user stack trace when the user address space is
      not current's address space.
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      20002ded
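      A hedged sketch of the capture order described above, for the 64-bit kernel
      case; store_entry(), valid_kernel_text() and read_stack_word() are
      illustrative placeholders, and the LR save offset of 16 bytes assumes the
      64-bit ABI frame layout:

          /* Hedged sketch: store the context marker, then NIP, LR, and the LR
           * save area of the second stack frame, using 0 for unreliable values. */
          static void sketch_perf_callchain(struct pt_regs *regs)
          {
                  unsigned long sp = regs->gpr[1];   /* interrupted stack pointer */
                  unsigned long lr = regs->link;
                  unsigned long next_sp, lr_save;

                  store_entry(PERF_CONTEXT_KERNEL);  /* next three entries are special */
                  store_entry(regs->nip);
                  store_entry(valid_kernel_text(lr) ? lr : 0);

                  /* Follow the back chain once to reach the second frame, then
                   * read its LR save slot; store 0 if it merely duplicates LR. */
                  if (read_stack_word(sp, &next_sp) &&
                      read_stack_word(next_sp + 16, &lr_save))
                          store_entry(lr_save != lr && valid_kernel_text(lr_save) ?
                                      lr_save : 0);

                  /* From here on, walking the back chain gives unambiguous
                   * return addresses. */
          }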
    • powerpc: Allow perf_counters to access user memory at interrupt time · 9c1e1052
      Authored by Paul Mackerras
      This provides a mechanism to allow the perf_counters code to access
      user memory in a PMU interrupt routine.  Such an access can cause
      various kinds of interrupt: SLB miss, MMU hash table miss, segment
      table miss, or TLB miss, depending on the processor.  This commit
      only deals with 64-bit classic/server processors, which use an MMU
      hash table.  32-bit processors are already able to access user memory
      at interrupt time.  Since we don't soft-disable on 32-bit, and hash_page
      and the TLB miss handlers run with interrupts disabled, there is no
      possibility of a PMU interrupt reentering them.
      
      On 64-bit processors, an SLB miss interrupt on a user address will
      update the slb_cache and slb_cache_ptr fields in the paca.  This is
      OK except in the case where a PMU interrupt occurs in switch_slb,
      which also accesses those fields.  To prevent this, we hard-disable
      interrupts in switch_slb.  Interrupts are already soft-disabled at
      this point, and will get hard-enabled when they get soft-enabled
      later.
      
      This also reworks slb_flush_and_rebolt: to avoid hard-disabling twice,
      and to make sure that slb_cache_ptr is cleared when it is called from
      callers other than switch_slb, the existing routine is renamed to
      __slb_flush_and_rebolt, which is called by switch_slb and by the new
      version of slb_flush_and_rebolt.
      
      Similarly, switch_stab (used on POWER3 and RS64 processors) gets a
      hard_irq_disable() to protect the per-cpu variables used there and
      in ste_allocate.
      
      If an MMU hash table miss interrupt occurs, normally we would call
      hash_page to look up the Linux PTE for the address and create an HPTE.
      However, hash_page is fairly complex and takes some locks, so to
      avoid the possibility of deadlock, we check the preemption count
      to see if we are in a (pseudo-)NMI handler, and if so, we don't call
      hash_page but instead treat it like a bad access that will get
      reported up through the exception table mechanism (sketched after this
      entry).  An interrupt whose handler runs even though the interrupt
      occurred when soft-disabled (such as the PMU interrupt) is considered
      a pseudo-NMI handler, which should use nmi_enter()/nmi_exit() rather
      than irq_enter()/irq_exit().
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      9c1e1052
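      A hedged sketch of the guard described in the last paragraph above; the
      function name, return convention and the use of in_nmi() to express the
      preemption-count check are illustrative assumptions, not the exact kernel
      code:

          /* Hedged sketch: never call hash_page() from a pseudo-NMI handler such
           * as the PMU interrupt; report a fault instead, so the access is fixed
           * up through the exception table like any other failed user access. */
          static int sketch_hash_fault(unsigned long ea, unsigned long access,
                                       unsigned long trap)
          {
                  if (in_nmi())           /* preemption count says pseudo-NMI */
                          return -EFAULT; /* resolved via exception-table fixup */

                  return hash_page(ea, access, trap);
          }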
  6. 10 Aug, 2009 (1 commit)