提交 · f0f0c9ac2051e5da4afa1f3f908ace197a4de80e · openanolis / cloud-kernel

24 8月, 2012 1 次提交

powerpc: Remove unnecessary ifdefs · f0f0c9ac

由 Michael Neuling 提交于 8月 21, 2012

Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

f0f0c9ac

12 4月, 2012 1 次提交

KVM: PPC: add CPU_FTR_EMB_HV to CPU table · 9de6fe91

由 Scott Wood 提交于 4月 11, 2012

e6500 support (commit 10241842,
"powerpc: Add initial e6500 cpu support" and the introduction of
CPU_FTR_EMB_HV (commit 73196cd3,
"KVM: PPC: e500mc support") collided during merge, leaving e6500's CPU
table entry missing CPU_FTR_EMB_HV.
Signed-off-by: NScott Wood <scottwood@freescale.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

9de6fe91

08 4月, 2012 3 次提交

KVM: PPC: e500mc support · 73196cd3

由 Scott Wood 提交于 12月 20, 2011

Add processor support for e500mc, using hardware virtualization support
(GS-mode).

Current issues include:
 - No support for external proxy (coreint) interrupt mode in the guest.

Includes work by Ashish Kalra <Ashish.Kalra@freescale.com>,
Varun Sethi <Varun.Sethi@freescale.com>, and
Liu Yu <yu.liu@freescale.com>.
Signed-off-by: NScott Wood <scottwood@freescale.com>
Signed-off-by: NAlexander Graf <agraf@suse.de>
Signed-off-by: NAvi Kivity <avi@redhat.com>

73196cd3

powerpc/e500: split CPU_FTRS_ALWAYS/CPU_FTRS_POSSIBLE · 06aae867

由 Scott Wood 提交于 12月 20, 2011

Split e500 (v1/v2) and e500mc/e5500 to allow optimization of feature
checks that differ between the two.
Signed-off-by: NScott Wood <scottwood@freescale.com>
Signed-off-by: NAlexander Graf <agraf@suse.de>
Signed-off-by: NAvi Kivity <avi@redhat.com>

06aae867

powerpc/booke: Set CPU_FTR_DEBUG_LVL_EXC on 32-bit · 52b066fa

由 Scott Wood 提交于 12月 20, 2011

Currently 32-bit only cares about this for choice of exception
vector, which is done in core-specific code.  However, KVM will
want to distinguish as well.
Signed-off-by: NScott Wood <scottwood@freescale.com>
Signed-off-by: NAlexander Graf <agraf@suse.de>
Signed-off-by: NAvi Kivity <avi@redhat.com>

52b066fa

16 3月, 2012 1 次提交

powerpc: Add initial e6500 cpu support · 10241842

由 Kumar Gala 提交于 11月 06, 2011

Add basic support for e6500 core in its single threaded mode.
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

10241842

19 12月, 2011 1 次提交

powerpc: POWER7 optimised copy_to_user/copy_from_user using VMX · a66086b8

由 Anton Blanchard 提交于 12月 07, 2011

Implement a POWER7 optimised copy_to_user/copy_from_user using VMX.
For large aligned copies this new loop is over 10% faster, and for
large unaligned copies it is over 200% faster.

If we take a fault we fall back to the old version, this keeps
things relatively simple and easy to verify.

On POWER7 unaligned stores rarely slow down - they only flush when
a store crosses a 4KB page boundary. Furthermore this flush is
handled completely in hardware and should be 20-30 cycles.

Unaligned loads on the other hand flush much more often - whenever
crossing a 128 byte cache line, or a 32 byte sector if either sector
is an L1 miss.

Considering this information we really want to get the loads aligned
and not worry about the alignment of the stores. Microbenchmarks
confirm that this approach is much faster than the current unaligned
copy loop that uses shifts and rotates to ensure both loads and
stores are aligned.

We also want to try and do the stores in cacheline aligned, cacheline
sized chunks. If the store queue is unable to merge an entire
cacheline of stores then the L2 cache will have to do a
read/modify/write. Even worse, we will serialise this with the stores
in the next iteration of the copy loop since both iterations hit
the same cacheline.

Based on this, the new loop does the following things:

1 - 127 bytes
Get the source 8 byte aligned and use 8 byte loads and stores. Pretty
boring and similar to how the current loop works.

128 - 4095 bytes
Get the source 8 byte aligned and use 8 byte loads and stores,
1 cacheline at a time. We aren't doing the stores in cacheline
aligned chunks so we will potentially serialise once per cacheline.
Even so it is much better than the loop we have today.

4096 - bytes
If both source and destination have the same alignment get them both
16 byte aligned, then get the destination cacheline aligned. Do
cacheline sized loads and stores using VMX.

If source and destination do not have the same alignment, we get the
destination cacheline aligned, and use permute to do aligned loads.

In both cases the VMX loop should be optimal - we always do aligned
loads and stores and are always doing stores in cacheline aligned,
cacheline sized chunks.

To be able to use VMX we must be careful about interrupts and
sleeping. We don't use the VMX loop when in an interrupt (which should
be rare anyway) and we wrap the VMX loop in disable/enable_pagefault
and fall back to the existing copy_tofrom_user loop if we do need to
sleep.

The VMX breakpoint of 4096 bytes was chosen using this microbenchmark:

http://ozlabs.org/~anton/junkcode/copy_to_user.c

Since we are using VMX and there is a cost to saving and restoring
the user VMX state there are two broad cases we need to benchmark:

- Best case - userspace never uses VMX

- Worst case - userspace always uses VMX

In reality a userspace process will sit somewhere between these two
extremes. Since we need to test both aligned and unaligned copies we
end up with 4 combinations. The point at which the VMX loop begins to
win is:

0% VMX
aligned		2048 bytes
unaligned	2048 bytes

100% VMX
aligned		16384 bytes
unaligned	8192 bytes

Considering this is a microbenchmark, the data is hot in cache and
the VMX loop has better store queue merging properties we set the
breakpoint to 4096 bytes, a little below the unaligned breakpoints.

Some future optimisations we can look at:

- Looking at the perf data, a significant part of the cost when a
  task is always using VMX is the extra exception we take to restore
  the VMX state. As such we should do something similar to the x86
  optimisation that restores FPU state for heavy users. ie:

        /*
         * If the task has used fpu the last 5 timeslices, just do a full
         * restore of the math state immediately to avoid the trap; the
         * chances of needing FPU soon are obviously high now
         */
        preload_fpu = tsk_used_math(next_p) && next_p->fpu_counter > 5;

  and

        /*
         * fpu_counter contains the number of consecutive context switches
         * that the FPU is used. If this is over a threshold, the lazy fpu
         * saving becomes unlazy to save the trap. This is an unsigned char
         * so that after 256 times the counter wraps and the behavior turns
         * lazy again; this to deal with bursty apps that only use FPU for
         * a short time
         */

- We could create a paca bit to mirror the VMX enabled MSR bit and check
  that first, avoiding multiple calls to calling enable_kernel_altivec.
  That should help with iovec based system calls like readv.

- We could have two VMX breakpoints, one for when we know the user VMX
  state is loaded into the registers and one when it isn't. This could
  be a second bit in the paca so we can calculate the break points quickly.

- One suggestion from Ben was to save and restore the VSX registers
  we use inline instead of using enable_kernel_altivec.

[BenH: Fixed a problem with preempt and fixed build without CONFIG_ALTIVEC]
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

a66086b8

25 11月, 2011 1 次提交

powerpc/book3e: Add ICSWX/ACOP support to Book3e cores like A2 · fac26ad4

由 Jimi Xenidis 提交于 9月 29, 2011

ICSWX is also used by the A2 processor to access coprocessors,
although not all "chips" that contain A2s have coprocessors.
Signed-off-by: NJimi Xenidis <jimix@pobox.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

fac26ad4

12 7月, 2011 1 次提交

powerpc, KVM: Split HVMODE_206 cpu feature bit into separate HV and architecture bits · 969391c5

由 Paul Mackerras 提交于 6月 29, 2011

This replaces the single CPU_FTR_HVMODE_206 bit with two bits, one to
indicate that we have a usable hypervisor mode, and another to indicate
that the processor conforms to PowerISA version 2.06.  We also add
another bit to indicate that the processor conforms to ISA version 2.01
and set that for PPC970 and derivatives.

Some PPC970 chips (specifically those in Apple machines) have a
hypervisor mode in that MSR[HV] is always 1, but the hypervisor mode
is not useful in the sense that there is no way to run any code in
supervisor mode (HV=0 PR=0).  On these processors, the LPES0 and LPES1
bits in HID4 are always 0, and we use that as a way of detecting that
hypervisor mode is not useful.

Where we have a feature section in assembly code around code that
only applies on POWER7 in hypervisor mode, we use a construct like

END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)

The definition of END_FTR_SECTION_IFSET is such that the code will
be enabled (not overwritten with nops) only if all bits in the
provided mask are set.

Note that the CPU feature check in __tlbie() only needs to check the
ARCH_206 bit, not the HVMODE bit, because __tlbie() can only get called
if we are running bare-metal, i.e. in hypervisor mode.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

969391c5

19 5月, 2011 1 次提交
- K
  powerpc/fsl-booke64: Add support for Debug Level exception handler · d36b4c4f
  由 Kumar Gala 提交于 4月 06, 2011
```
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
```
  d36b4c4f
04 5月, 2011 2 次提交

powerpc: Save Come-From Address Register (CFAR) in exception frame · 48404f2e

由 Paul Mackerras 提交于 5月 01, 2011

Recent 64-bit server processors (POWER6 and POWER7) have a "Come-From
Address Register" (CFAR), that records the address of the most recent
branch or rfid (return from interrupt) instruction for debugging purposes.

This saves the value of the CFAR in the exception entry code and stores
it in the exception frame. We also make xmon print the CFAR value in
its register dump code.

Rather than extend the pt_regs struct at this time, we steal the orig_gpr3
field, which is only used for system calls, and use it for the CFAR value
for all exceptions/interrupts other than system calls. This means we
don't save the CFAR on system calls, which is not a great problem since
system calls tend not to happen unexpectedly, and also avoids adding the
overhead of reading the CFAR to the system call entry path.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

48404f2e

powerpc: Add Initiate Coprocessor Store Word (icswx) support · 851d2e2f

由 Tseng-Hui (Frank) Lin 提交于 5月 02, 2011

Icswx is a PowerPC instruction to send data to a co-processor. On Book-S
processors the LPAR_ID and process ID (PID) of the owning process are
registered in the window context of the co-processor at initialization
time. When the icswx instruction is executed the L2 generates a cop-reg
transaction on PowerBus. The transaction has no address and the
processor does not perform an MMU access to authenticate the transaction.
The co-processor compares the LPAR_ID and the PID included in the
transaction and the LPAR_ID and PID held in the window context to
determine if the process is authorized to generate the transaction.

The OS needs to assign a 16-bit PID for the process. This cop-PID needs
to be updated during context switch. The cop-PID needs to be destroyed
when the context is destroyed.
Signed-off-by: NSonny Rao <sonnyrao@linux.vnet.ibm.com>
Signed-off-by: NTseng-Hui (Frank) Lin <thlin@linux.vnet.ibm.com>
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

851d2e2f

27 4月, 2011 2 次提交

powerpc: Free up some CPU feature bits by moving out MMU-related features · 44ae3ab3

由 Matt Evans 提交于 4月 06, 2011

Some of the 64bit PPC CPU features are MMU-related, so this patch moves
them to MMU_FTR_ bits. All cpu_has_feature()-style tests are moved to
mmu_has_feature(), and seven feature bits are freed as a result.
Signed-off-by: NMatt Evans <matt@ozlabs.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

44ae3ab3

powerpc: Add A2 cpu support · 76b4eda8

由 Benjamin Herrenschmidt 提交于 4月 14, 2011

Add the cputable entry, regs and setup & restore entries for
the PowerPC A2 core.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

76b4eda8

20 4月, 2011 1 次提交

powerpc: Define CPU feature for Architected 2.06 HV mode · 24cc67de

由 Benjamin Herrenschmidt 提交于 1月 20, 2011

This bit indicates that we are operating in hypervisor mode on a CPU
compliant to architecture 2.06 or later (currently server only).

We set it on POWER7 and have a boot-time CPU setup function that
clears it if MSR:HV isn't set (booting under a hypervisor).
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

24cc67de

12 4月, 2011 2 次提交

powerpc/e500mc: Remove CPU_FTR_MAYBE_CAN_NAP/CPU_FTR_MAYBE_CAN_DOZE · d51ad915

由 Scott Wood 提交于 5月 27, 2010

e500mc does not support the HID0/MSR mechanism that is used by e500_idle
(and there are also issues with waking on certain types of interrupts).

Further, even if napping is never actually enabled, just having
CPU_FTR_CAN_NAP will cause machine_init() to overwrite the board's supplied
ppc_md.power_save().

We drop CPU_FTR_MAYBE_CAN_DOZE becuase we should use 'wait' instead on
e500mc.
Signed-off-by: NScott Wood <scottwood@freescale.com>
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

d51ad915

powerpc/book3e: Fix CPU feature handling on 64-bit e5500 · 11ed0db9

由 Kumar Gala 提交于 4月 06, 2011

The CPU_FTRS_POSSIBLE and CPU_FTRS_ALWAYS defines did not encompass
e5500 CPU features when built for 64-bit.  This causes issues with
cpu_has_feature() as it utilizes the POSSIBLE & ALWAYS defines as part
of its check.

Create a unique CPU_FTRS_E5500 (as its different from CPU_FTRS_E500MC),
created a new group for 64-bit Book3e based CPUs and add CPU_FTRS_E5500
to that group.
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

11ed0db9

02 2月, 2011 1 次提交

powerpc/476: define specific cpu table entry DD2 core · c48d0dba

由 Dave Kleikamp 提交于 1月 26, 2011

The DD2 core still has some unstability.  Define CPU_FTR_476_DD2 to
enable workarounds in later patches.

This is based on an earlier, unreleased patch for DD1 by Ben Herrenschmidt.
Signed-off-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: NJosh Boyer <jwboyer@linux.vnet.ibm.com>

c48d0dba

29 11月, 2010 1 次提交

powerpc: Add support for popcnt instructions · 64ff3128

由 Anton Blanchard 提交于 8月 12, 2010

POWER5 added popcntb, and POWER7 added popcntw and popcntd. As a first step
this patch does all the work out of line, but it would be nice to implement
them as inlines with an out of line fallback.

The performance issue with hweight was noticed when disabling SMT on a large
(192 thread) POWER7 box. The patch improves that testcase by about 8%.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

64ff3128

02 9月, 2010 1 次提交

powerpc: Feature nop out reservation clear when stcx checks address · f89451fb

由 Anton Blanchard 提交于 8月 11, 2010

The POWER architecture does not require stcx to check that it is operating
on the same address as the larx. This means it is possible for an
an exception handler to execute a larx, get a reservation, decide
not to do the stcx and then return back with an active reservation. If the
interrupted code was in the middle of a larx/stcx sequence the stcx could
incorrectly succeed.

All recent POWER CPUs check the address before letting the stcx succeed
so we can create a CPU feature and nop it out. As Ben suggested, we can
only do this in our syscall path because there is a remote possibility
some kernel code gets interrupted by an exception that ends up operating
on the same cacheline.

Thanks to Paul Mackerras and Derek Williams for the idea.

To test this I used a very simple null syscall (actually getppid) testcase
at http://ozlabs.org/~anton/junkcode/null_syscall.c

I tested against 2.6.35-git10 with the following changes against the
pseries_defconfig:

CONFIG_VIRT_CPU_ACCOUNTING=n
CONFIG_AUDIT=n
CONFIG_PPC_4K_PAGES=n
CONFIG_PPC_64K_PAGES=y
CONFIG_FORCE_MAX_ZONEORDER=9
CONFIG_PPC_SUBPAGE_PROT=n
CONFIG_FUNCTION_TRACER=n
CONFIG_FUNCTION_GRAPH_TRACER=n
CONFIG_IRQSOFF_TRACER=n
CONFIG_STACK_TRACER=n

to remove the overhead of virtual CPU accounting, syscall auditing and
the ftrace mcount tracers. 64kB pages were enabled to minimise TLB misses.

POWER6: +8.2%
POWER7: +7.0%

Another suggestion was to use a larx to something in the L1 instead of a stcx.
This was almost as fast as removing the larx on POWER6, but only 3.5% faster
on POWER7. We can use this to speed up the reservation clear in our
exception exit code.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

f89451fb

22 6月, 2010 1 次提交

powerpc, hw_breakpoints: Implement hw_breakpoints for 64-bit server processors · 5aae8a53

由 K.Prasad 提交于 6月 15, 2010

Implement perf-events based hw-breakpoint interfaces for PowerPC
64-bit server (Book III S) processors.  This allows access to a
given location to be used as an event that can be counted or
profiled by the perf_events subsystem.

This is done using the DABR (data breakpoint register), which can
also be used for process debugging via ptrace.  When perf_event
hw_breakpoint support is configured in, the perf_event subsystem
manages the DABR and arbitrates access to it, and ptrace then
creates a perf_event when it is requested to set a data breakpoint.

[Adopted suggestions from Paul Mackerras <paulus@samba.org> to
- emulate_step() all system-wide breakpoints and single-step only the
  per-task breakpoints
- perform arch-specific cleanup before unregistration through
  arch_unregister_hw_breakpoint()
]
Signed-off-by: NK.Prasad <prasad@linux.vnet.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

5aae8a53

09 6月, 2010 1 次提交

powerpc: Enable asymmetric SMT scheduling on POWER7 · 76cbd8a8

由 Michael Neuling 提交于 6月 08, 2010

The POWER7 core has dynamic SMT mode switching which is controlled by
the hypervisor.  There are 3 SMT modes:
	SMT1 uses thread  0
	SMT2 uses threads 0 & 1
	SMT4 uses threads 0, 1, 2 & 3
When in any particular SMT mode, all threads have the same performance
as each other (ie. at any moment in time, all threads perform the same).

The SMT mode switching works such that when linux has threads 2 & 3 idle
and 0 & 1 active, it will cede (H_CEDE hypercall) threads 2 and 3 in the
idle loop and the hypervisor will automatically switch to SMT2 for that
core (independent of other cores).  The opposite is not true, so if
threads 0 & 1 are idle and 2 & 3 are active, we will stay in SMT4 mode.

Similarly if thread 0 is active and threads 1, 2 & 3 are idle, we'll go
into SMT1 mode.

If we can get the core into a lower SMT mode (SMT1 is best), the threads
will perform better (since they share less core resources).  Hence when
we have idle threads, we want them to be the higher ones.

This adds a feature bit for asymmetric packing to powerpc and then
enables it on POWER7.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@ozlabs.org
LKML-Reference: <20100608045702.31FB5CC8C7@localhost.localdomain>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

76cbd8a8

21 5月, 2010 1 次提交

powerpc/e500mc: Implement machine check handler. · fe04b112

由 Scott Wood 提交于 4月 08, 2010

Most of the MSCR bit assigments are different in e500mc versus
e500, and they are now write-one-to-clear.

Some e500mc machine check conditions are made recoverable (as long as
they aren't stuck on), most notably L1 instruction cache parity errors.
Signed-off-by: NScott Wood <scottwood@freescale.com>
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

fe04b112

05 5月, 2010 2 次提交

powerpc/476: add machine check handler for 47x core · fc5e7097

由 Dave Kleikamp 提交于 3月 05, 2010

The 47x core's MCSR varies from 44x, so it needs it's own machine check
handler.
Signed-off-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: NJosh Boyer <jwboyer@linux.vnet.ibm.com>

fc5e7097

powerpc/47x: Base ppc476 support · e7f75ad0

由 Dave Kleikamp 提交于 3月 05, 2010

This patch adds the base support for the 476 processor.  The code was
primarily written by Ben Herrenschmidt and Torez Smith, but I've been
maintaining it for a while.

The goal is to have a single binary that will run on 44x and 47x, but
we still have some details to work out.  The biggest is that the L1 cache
line size differs on the two platforms, but it's currently a compile-time
option.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NTorez Smith  <lnxtorez@linux.vnet.ibm.com>
Signed-off-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: NJosh Boyer <jwboyer@linux.vnet.ibm.com>

e7f75ad0

17 2月, 2010 1 次提交

powerpc: Use lwsync for acquire barrier if CPU supports it · 5a0e9b57

由 Anton Blanchard 提交于 2月 10, 2010

Nick Piggin discovered that lwsync barriers around locks were faster than isync
on 970. That was a long time ago and I completely dropped the ball in testing
his patches across other ppc64 processors.

Turns out the idea helps on other chips. Using a microbenchmark that
uses a lot of threads to contend on a global pthread mutex (and therefore a
global futex), POWER6 improves 8% and POWER7 improves 2%. I checked POWER5
and while I couldn't measure an improvement, there was no regression.

This patch uses the lwsync patching code to replace the isyncs with lwsyncs
on CPUs that support the instruction. We were marking POWER3 and RS64 as lwsync
capable but in reality they treat it as a full sync (ie slow). Remove the
CPU_FTR_LWSYNC bit from these CPUs so they continue to use the faster isync
method.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

5a0e9b57

17 3月, 2009 1 次提交

powerpc/5200: Enable CPU_FTR_NEED_COHERENT for MPC52xx · c9310920

由 Piotr Ziecik 提交于 3月 17, 2009

BestComm, a DMA engine in MPC52xx SoC, requires snooping when
CPU caches are enabled to work properly.

Adding CPU_FTR_NEED_COHERENT fixes NFS problems on MPC52xx machines
introduced by 'powerpc/mm: Fix handling of _PAGE_COHERENT in BAT setup
code' (sha1: 4c456a67).
Signed-off-by: NPiotr Ziecik <kosmo@semihalf.com>
Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>

c9310920

23 2月, 2009 1 次提交

powerpc: Add support for using doorbells for SMP IPI · 620165f9

由 Kumar Gala 提交于 2月 12, 2009

The e500mc supports the new msgsnd/doorbell mechanisms that were added in
the Power ISA 2.05 architecture.  We use the normal level doorbell for
doing SMP IPIs at this point.
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

620165f9

21 12月, 2008 2 次提交

powerpc/mm: Introduce MMU features · 7c03d653

由 Benjamin Herrenschmidt 提交于 12月 18, 2008

We're soon running out of CPU features and I need to add some new
ones for various MMU related bits, so this patch separates the MMU
features from the CPU features.  I moved over the 32-bit MMU related
ones, added base features for MMU type families, but didn't move
over any 64-bit only feature yet.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: NKumar Gala <galak@kernel.crashing.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

7c03d653

powerpc/4xx: Extended DCR support v2 · 6d2170be

由 Benjamin Herrenschmidt 提交于 12月 18, 2008

This adds supports to the "extended" DCR addressing via the indirect
mfdcrx/mtdcrx instructions supported by some 4xx cores (440H6 and
later).

I enabled the feature for now only on AMCC 460 chips.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: NJosh Boyer <jwboyer@linux.vnet.ibm.com>
Acked-by: NKumar Gala <galak@kernel.crashing.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

6d2170be

16 12月, 2008 1 次提交

powerpc: Fix bogus cache flushing on all 40x and BookE processors v2 · 8309ce72

由 Benjamin Herrenschmidt 提交于 12月 12, 2008

We were missing the CPU_FTR_NOEXECUTE bit in our cputable for all
these processors. The result is that update_mmu_cache() would flush
the cache for all pages mapped to userspace which is totally
unnecessary on those processors since we already handle flushing
on execute in the page fault path.

This should provide a nice speed up ;-)
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: NJosh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

8309ce72

05 11月, 2008 1 次提交

powerpc: Add new CPU feature: CPU_FTR_UNALIGNED_LD_STD · 4ec577a2

由 Mark Nelson 提交于 10月 27, 2008

Add a new CPU feature bit, CPU_FTR_UNALIGNED_LD_STD, to be added
to the 64bit powerpc chips that can do unaligned load double and
store double without any performance hit.

This is added to Power6 and Cell and will be used in the next commit
to disable the code that gets the destination address aligned on
those CPUs where doing that doesn't improve performance.
Signed-off-by: NMark Nelson <markn@au1.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

4ec577a2

16 9月, 2008 1 次提交

powerpc: Add new CPU feature: CPU_FTR_CP_USE_DCBTZ · 2a929436

由 Mark Nelson 提交于 8月 22, 2008

Add a new CPU feature bit, CPU_FTR_CP_USE_DCBTZ, to be added to the
64bit powerpc chips that benefit from having dcbt and dcbz
instructions used in their memory copy routines.

This will be used in a subsequent patch that updates copy_4K_page().
The new bit is added to Cell, PPC970 and Power4 because they show
better performance with the new copy_4K_page() when dcbt and dcbz
instructions are used.
Signed-off-by: NMark Nelson <markn@au1.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

2a929436

20 8月, 2008 1 次提交

powerpc: Expose PMCs & cache topology in sysfs on 32-bit · b950bdd0

由 Benjamin Herrenschmidt 提交于 8月 18, 2008

The file arch/powerpc/kernel/sysfs.c is currently only compiled for
64-bit kernels.  It contain code to register CPU sysdevs in sysfs and
add various properties such as cache topology and raw access by root
to performance monitor counters (PMCs).  A lot of that can be re-used
as is on 32-bits.

This makes the file be built for both, with appropriate ifdef'ing
for the few bits that are really 64-bit specific, and adds some
support for the raw PMCs for 75x and 74xx processors.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

b950bdd0

04 8月, 2008 1 次提交

powerpc: Move include files to arch/powerpc/include/asm · b8b572e1

由 Stephen Rothwell 提交于 8月 01, 2008

from include/asm-powerpc.  This is the result of a

mkdir arch/powerpc/include/asm
git mv include/asm-powerpc/* arch/powerpc/include/asm

Followed by a few documentation/comment fixups and a couple of places
where <asm-powepc/...> was being used explicitly.  Of the latter only
one was outside the arch code and it is a driver only built for powerpc.
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

b8b572e1

25 7月, 2008 1 次提交

powerpc: Enable AT_BASE_PLATFORM aux vector · 9115d134

由 Nathan Lynch 提交于 7月 16, 2008

Stash the first platform string matched by identify_cpu() in
powerpc_base_platform, and supply that to the ELF loader for the value
of AT_BASE_PLATFORM.
Signed-off-by: NNathan Lynch <ntl@pobox.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

9115d134

15 7月, 2008 1 次提交

powerpc: Add PPC_FEATURE_PSERIES_PERFMON_COMPAT · 0f473314

由 Nathan Lynch 提交于 7月 10, 2008

Background from Maynard Johnson:
As of POWER6, a set of 32 common events is defined that must be
supported on all future POWER processors.  The main impetus for this
compat set is the need to support partition migration, especially from
processor P(n) to processor P(n+1), where performance software that's
running in the new partition may not be knowledgeable about processor
P(n+1).  If a performance tool determines it does not support the
physical processor, but is told (via the
PPC_FEATURE_PSERIES_PERFMON_COMPAT bit) that the processor supports
the notion of the PMU compat set, then the performance tool can
surface just those events to the user of the tool.

PPC_FEATURE_PSERIES_PERFMON_COMPAT indicates that the PMU supports at
least this basic subset of events which is compatible across POWER
processor lines.
Signed-off-by: NNathan Lynch <ntl@pobox.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

0f473314

09 7月, 2008 1 次提交

powerpc/mm: Add SAO Feature bit to the cputable · 37907049

由 Dave Kleikamp 提交于 7月 08, 2008

Add the CPU feature bit for the new Strong Access Ordering
facility of Power7
Signed-off-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: NJoel Schopp <jschopp@austin.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

37907049

03 7月, 2008 1 次提交

powerpc: Fixup lwsync at runtime · 2d1b2027

由 Kumar Gala 提交于 7月 02, 2008

To allow for a single kernel image on e500 v1/v2/mc we need to fixup lwsync
at runtime. On e500v1/v2 lwsync causes an illop so we need to patch up
the code. We default to 'sync' since that is always safe and if the cpu
is capable we will replace 'sync' with 'lwsync'.

We introduce CPU_FTR_LWSYNC as a way to determine at runtime if this is
needed. This flag could be moved elsewhere since we dont really use it
for the normal CPU_FTR purpose.

Finally we only store the relative offset in the fixup section to keep it
as small as possible rather than using a full fixup_entry.
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

2d1b2027

01 7月, 2008 1 次提交

powerpc: Add VSX CPU feature · b962ce9d

由 Michael Neuling 提交于 6月 25, 2008

Add a VSX CPU feature.  Also add code to detect if VSX is available
from the device tree.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NJoel Schopp <jschopp@austin.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

b962ce9d

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功