- 16 12月, 2008 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
We were missing the CPU_FTR_NOEXECUTE bit in our cputable for all these processors. The result is that update_mmu_cache() would flush the cache for all pages mapped to userspace which is totally unnecessary on those processors since we already handle flushing on execute in the page fault path. This should provide a nice speed up ;-) Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: NJosh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
-
- 05 11月, 2008 1 次提交
-
-
由 Mark Nelson 提交于
Add a new CPU feature bit, CPU_FTR_UNALIGNED_LD_STD, to be added to the 64bit powerpc chips that can do unaligned load double and store double without any performance hit. This is added to Power6 and Cell and will be used in the next commit to disable the code that gets the destination address aligned on those CPUs where doing that doesn't improve performance. Signed-off-by: NMark Nelson <markn@au1.ibm.com> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 16 9月, 2008 1 次提交
-
-
由 Mark Nelson 提交于
Add a new CPU feature bit, CPU_FTR_CP_USE_DCBTZ, to be added to the 64bit powerpc chips that benefit from having dcbt and dcbz instructions used in their memory copy routines. This will be used in a subsequent patch that updates copy_4K_page(). The new bit is added to Cell, PPC970 and Power4 because they show better performance with the new copy_4K_page() when dcbt and dcbz instructions are used. Signed-off-by: NMark Nelson <markn@au1.ibm.com> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 20 8月, 2008 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
The file arch/powerpc/kernel/sysfs.c is currently only compiled for 64-bit kernels. It contain code to register CPU sysdevs in sysfs and add various properties such as cache topology and raw access by root to performance monitor counters (PMCs). A lot of that can be re-used as is on 32-bits. This makes the file be built for both, with appropriate ifdef'ing for the few bits that are really 64-bit specific, and adds some support for the raw PMCs for 75x and 74xx processors. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 04 8月, 2008 1 次提交
-
-
由 Stephen Rothwell 提交于
from include/asm-powerpc. This is the result of a mkdir arch/powerpc/include/asm git mv include/asm-powerpc/* arch/powerpc/include/asm Followed by a few documentation/comment fixups and a couple of places where <asm-powepc/...> was being used explicitly. Of the latter only one was outside the arch code and it is a driver only built for powerpc. Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 25 7月, 2008 1 次提交
-
-
由 Nathan Lynch 提交于
Stash the first platform string matched by identify_cpu() in powerpc_base_platform, and supply that to the ELF loader for the value of AT_BASE_PLATFORM. Signed-off-by: NNathan Lynch <ntl@pobox.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 15 7月, 2008 1 次提交
-
-
由 Nathan Lynch 提交于
Background from Maynard Johnson: As of POWER6, a set of 32 common events is defined that must be supported on all future POWER processors. The main impetus for this compat set is the need to support partition migration, especially from processor P(n) to processor P(n+1), where performance software that's running in the new partition may not be knowledgeable about processor P(n+1). If a performance tool determines it does not support the physical processor, but is told (via the PPC_FEATURE_PSERIES_PERFMON_COMPAT bit) that the processor supports the notion of the PMU compat set, then the performance tool can surface just those events to the user of the tool. PPC_FEATURE_PSERIES_PERFMON_COMPAT indicates that the PMU supports at least this basic subset of events which is compatible across POWER processor lines. Signed-off-by: NNathan Lynch <ntl@pobox.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 09 7月, 2008 1 次提交
-
-
由 Dave Kleikamp 提交于
Add the CPU feature bit for the new Strong Access Ordering facility of Power7 Signed-off-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: NJoel Schopp <jschopp@austin.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 03 7月, 2008 1 次提交
-
-
由 Kumar Gala 提交于
To allow for a single kernel image on e500 v1/v2/mc we need to fixup lwsync at runtime. On e500v1/v2 lwsync causes an illop so we need to patch up the code. We default to 'sync' since that is always safe and if the cpu is capable we will replace 'sync' with 'lwsync'. We introduce CPU_FTR_LWSYNC as a way to determine at runtime if this is needed. This flag could be moved elsewhere since we dont really use it for the normal CPU_FTR purpose. Finally we only store the relative offset in the fixup section to keep it as small as possible rather than using a full fixup_entry. Signed-off-by: NKumar Gala <galak@kernel.crashing.org> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 01 7月, 2008 3 次提交
-
-
由 Michael Neuling 提交于
Add a VSX CPU feature. Also add code to detect if VSX is available from the device tree. Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NJoel Schopp <jschopp@austin.ibm.com> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Michael Ellerman 提交于
The CPU and firmware feature fixup macros are currently spread across three files, firmware.h, cputable.h and asm-compat.h. Consolidate them into their own file, feature-fixups.h Signed-off-by: NMichael Ellerman <michael@ellerman.id.au> Acked-by: NKumar Gala <galak@kernel.crashing.org> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Adrian Bunk 提交于
asm/asm-compat.h doesn't seem to be intended for userspace usage. Signed-off-by: NAdrian Bunk <bunk@kernel.org> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 30 6月, 2008 1 次提交
-
-
由 Michael Neuling 提交于
Add a cputable entry for the POWER7 processor. Also tell firmware that we know about POWER7. Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NJoel Schopp <jschopp@austin.ibm.com> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 26 6月, 2008 2 次提交
-
-
由 Kumar Gala 提交于
If we have an L2CSR register (e500mc) we need to flush the L2 before going to nap. We use the HW flush mechanism provided in that register. The code reuses the CPU_FTR_604_PERF_MON bit as it is no longer used by any code in the kernel. Additionally we didn't reuse the exist L2CR feature bit as this is intended for the 7xxx L2CR register and L2CSR is part of the new Freescale "Book-E" registers. Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
-
由 Kumar Gala 提交于
The e500 core enter DOZE/NAP power-saving modes when the core go to cpu_idle routine. The power management default running mode is DOZE, If the user echo 1 > /proc/sys/kernel/powersave-nap the system will change to NAP running mode. Signed-off-by: NDave Liu <daveliu@freescale.com> Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
-
- 19 6月, 2008 1 次提交
-
-
由 Kumar Gala 提交于
The new e500mc core from Freescale is based on the e500v2 but with the following changes: * Supports only the Enhanced Debug Architecture (DSRR0/1, etc) * Floating Point * No SPE * Supports lwsync * Doorbell Exceptions * Hypervisor * Cache line size is now 64-bytes (e500v1/v2 have a 32-byte cache line) Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
-
- 06 2月, 2008 1 次提交
-
-
由 Andy Fleming 提交于
Some of the more recent e300 cores have the same performance monitor implementation as the e500. e300 isn't book-e, so the name isn't really appropriate. In preparation for e300 support, rename a bunch of fsl_booke things to say fsl_emb (Freescale Embedded Performance Monitors). Signed-off-by: NAndy Fleming <afleming@freescale.com> Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
-
- 24 12月, 2007 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
This adds a cputable function pointer for the CPU-side machine check handling. The semantic is still the same as the old one, the one in ppc_md. overrides the one in cputable, though ultimately we'll want to change that so the CPU gets first. This removes CONFIG_440A which was a problem for multiplatform kernels and instead fixes up the IVOR at runtime from a setup_cpu function. The "A" version of the machine check also tweaks the regs->trap value to differenciate the 2 versions at the C level. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NJosh Boyer <jwboyer@linux.vnet.ibm.com>
-
- 13 11月, 2007 1 次提交
-
-
由 Becky Bruce 提交于
The context switch code in the kernel issues a dummy stwcx. to clear the reservation, as recommended by the architecture. However, some processors can have issues if this stwcx to address A occurs while the reservation is already held to a different address B. To avoid this problem, the dummy stwcx. needs to be paired with a dummy lwarx to the same address. This adds the dummy lwarx, and creates a cpu feature bit to indicate which cpus are affected. Tested on mpc8641_hpcn_defconfig in arch/powerpc; build tested in arch/ppc. Signed-off-by: NBecky Bruce <becky.bruce@freescale.com> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 17 10月, 2007 1 次提交
-
-
由 Olof Johansson 提交于
PA6T has a bug where the slbie instruction does not honor the large segment bit. As a result, we have to always use slbia when switching context. We don't have to worry about changing the slbie's during fault processing, since they should never be replacing one VSID with another using the same ESID. I.e. there's no risk for inserting duplicate entries due to a failed slbie of the old entry. So as long as we clear it out on context switch we should be fine. Signed-off-by: NOlof Johansson <olof@lixom.net> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 12 10月, 2007 1 次提交
-
-
由 Paul Mackerras 提交于
This makes the kernel use 1TB segments for all kernel mappings and for user addresses of 1TB and above, on machines which support them (currently POWER5+, POWER6 and PA6T). We detect that the machine supports 1TB segments by looking at the ibm,processor-segment-sizes property in the device tree. We don't currently use 1TB segments for user addresses < 1T, since that would effectively prevent 32-bit processes from using huge pages unless we also had a way to revert to using 256MB segments. That would be possible but would involve extra complications (such as keeping track of which segment size was used when HPTEs were inserted) and is not addressed here. Parts of this patch were originally written by Ben Herrenschmidt. Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 11 10月, 2007 1 次提交
-
-
由 Paul Mackerras 提交于
Some IBM machines supply a "logical" PVR (processor version register) value in the device tree in the cpu nodes rather than the real PVR. This is used for instance to indicate that the processors in a POWER6 partition have been configured by the hypervisor to run in POWER5+ mode rather than POWER6 mode. To cope with this, we call identify_cpu a second time with the logical PVR value (the first call is with the real PVR value in the very early setup code). However, POWER5+ machines can also supply a logical PVR value, and use the same value (the value that indicates a v2.04 architecture compliant processor). This causes problems for code that uses the performance monitor (such as oprofile), because the PMU registers are different in POWER6 (even in POWER5+ mode) from the real POWER5+. This change works around this problem by taking out the PMU information from the cputable entries for the logical PVR values, and changing identify_cpu so that the second call to it won't overwrite the PMU information that was established by the first call (the one with the real PVR), but does update the other fields. Specifically, if the cputable entry for the logical PVR value has num_pmcs == 0, none of the PMU-related fields get used. So that we can create a mixed cputable entry, we now make cur_cpu_spec point to a single static struct cpu_spec, and copy stuff from cpu_specs[i] into it. This has the side-effect that we can now make cpu_specs[] be initdata. Ultimately it would be good to move the PMU-related fields out to a separate structure, pointed to by the cputable entries, and change identify_cpu so that it saves the PMU info pointer, copies the whole structure, and restores the PMU info pointer, rather than identify_cpu having to list all the fields that are *not* PMU-related. Signed-off-by: NPaul Mackerras <paulus@samba.org> Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 05 10月, 2007 1 次提交
-
-
由 Scott Wood 提交于
The 8272 (and presumably other PCI PQ2 chips) appear to have the same issue as the 83xx regarding PCI streaming DMA. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
-
- 14 9月, 2007 1 次提交
-
-
由 Kumar Gala 提交于
Make it so that SPE support can be determined at runtime. This is similiar to how we handle AltiVec. This allows us to have SPE support built in and work on processors with and without SPE. Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
-
- 10 7月, 2007 1 次提交
-
-
由 Josh Boyer 提交于
The 750 CPU_FTR macros have quite a bit of duplication in them. Consolidate them to use CPU_FTRS_750 and only list the unique features for derivatives. Signed-off-by: NJosh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 14 6月, 2007 1 次提交
-
-
由 David Gibson 提交于
Currently the powerpc kernel has a 64-bit only feature, COHERENT_ICACHE used for those CPUS which maintain icache/dcache coherency in hardware (POWER5, essentially). It also has a feature, SPLIT_ID_CACHE, which is used on CPUs which have separate i and d-caches, which is to say everything except 601 and Freescale E200. In nearly all the places we check the SPLIT_ID_CACHE, what we actually care about is whether the i and d-caches are coherent (which they will be, trivially, if they're the same cache). This tries to clarify the situation a little. The COHERENT_ICACHE feature becomes availble on 32-bit and is set for all CPUs where i and d-cache are effectively coherent, whether this is due to special logic (POWER5) or because they're unified. We check this, instead of SPLIT_ID_CACHE nearly everywhere. The SPLIT_ID_CACHE feature itself is replaced by a UNIFIED_ID_CACHE feature with reversed sense, set only on 601 and Freescale E200. In the two places (one Freescale BookE specific) where we really care whether it's a unified cache, not whether they're coherent, we check this feature. The CPUs with unified cache are so few, we could consider replacing this feature bit with explicit checks against the PVR. This will make unifying the 32-bit and 64-bit cache flush code a little more straightforward. Signed-off-by: NDavid Gibson <dwg@au1.ibm.com> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 17 5月, 2007 1 次提交
-
-
由 James.Yang 提交于
Remove CPU_FTR_NEED_COHERENT for MPC7448 (and single-core MPC86xx). This prevents needlessly setting M=1 when not SMP. Signed-off-by: NJames.Yang <James.Yang@freescale.com> Acked-by: NJon Loeliger <jdl@freescale.com> Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
-
- 24 4月, 2007 2 次提交
-
-
由 Olof Johansson 提交于
Oprofile support for PA6T, kernel side. Also rename the PA6T_SPRN.* defines to SPRN_PA6T.*. Signed-off-by: NOlof Johansson <olof@lixom.net> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Josh Boyer 提交于
PowerPC 750CL has high BATs. The patch below adds a CPU_FTRS_750CL that includes that. Without it, the original firmware mappings in the high BATs aren't cleared which continue to override the linux translations. It also adds CPU_FTR_COMMON to CPU_FTRS_750GX for completeness. Signed-off-by: NJosh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 07 2月, 2007 2 次提交
-
-
由 Olof Johansson 提交于
Add cputable entries for which type of PMC implementation the processor has. I've only filled in the current 64-bit processors, the unfilled default value will have same behaviour as before so it can be done over time as needed. Also tidy up the dummy_perf implementation a bit, aggregating it into one function with ifdefs instead of several. Signed-off-by: NOlof Johansson <olof@lixom.net> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Michael Neuling 提交于
CPU_FTRS_POWER6X is unused, hence remove it. Signed-off-by Michael Neuling <mikey@neuling.org> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 09 12月, 2006 1 次提交
-
-
由 Anton Blanchard 提交于
POWER6 adds a new SPR, the data stream control register (DSCR). It can be used to adjust how agressive the prefetch mechanisms are. Its possible we may want to context switch this, but for now just export it to userspace via sysfs so we can adjust it. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 08 12月, 2006 1 次提交
-
-
由 Kim Phillips 提交于
The e300c2 has no FPU. Its MSR[FP] is grounded to zero. If an attempt is made to execute a floating point instruction (including floating-point load, store, or move instructions), the e300c2 takes a floating-point unavailable interrupt. This patch adds support for FP emulation on the e300c2 by declaring a new CPU_FTR_FP_TAKES_FPUNAVAIL, where FP unavail interrupts are intercepted and redirected to the ProgramCheck exception path for correct emulation handling. (If we run out of CPU_FTR bits we could look to reclaim this bit by adding support to test the cpu_user_features for PPC_FEATURE_HAS_FPU instead) It adds a nop to the exception path for 32-bit processors with a FPU. Signed-off-by: NKim Phillips <kim.phillips@freescale.com> Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
-
- 04 12月, 2006 4 次提交
-
-
由 Stephen Rothwell 提交于
Remove CPU_FTR_16M_PAGE from the cupfeatures mask at runtime on iSeries. Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Michael Ellerman 提交于
It saves #ifdef'ing in callers if we at least define the 64-bit cpu features for 32-bit also. Signed-off-by: NMichael Ellerman <michael@ellerman.id.au> Signed-off-by: NArnd Bergmann <arnd.bergmann@de.ibm.com>
-
由 Paul Mackerras 提交于
This adds code to look at the properties firmware puts in the device tree to determine what compatibility mode the partition is in on POWER6 machines, and set the ELF aux vector AT_HWCAP and AT_PLATFORM entries appropriately. Specifically, we look at the cpu-version property in the cpu node(s). If that contains a "logical" PVR value (of the form 0x0f00000x), we call identify_cpu again with this PVR value. A value of 0x0f000001 indicates the partition is in POWER5+ compatibility mode, and a value of 0x0f000002 indicates "POWER6 architected" mode, with various extensions disabled. We also look for various other properties: ibm,dfp, ibm,purr and ibm,spurr. Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Maynard Johnson 提交于
Add PPU event-based and cycle-based profiling support to Oprofile for Cell. Oprofile is expected to collect data on all CPUs simultaneously. However, there is one set of performance counters per node. There are two hardware threads or virtual CPUs on each node. Hence, OProfile must multiplex in time the performance counter collection on the two virtual CPUs. The multiplexing of the performance counters is done by a virtual counter routine. Initially, the counters are configured to collect data on the even CPUs in the system, one CPU per node. In order to capture the PC for the virtual CPU when the performance counter interrupt occurs (the specified number of events between samples has occurred), the even processors are configured to handle the performance counter interrupts for their node. The virtual counter routine is called via a kernel timer after the virtual sample time. The routine stops the counters, saves the current counts, loads the last counts for the other virtual CPU on the node, sets interrupts to be handled by the other virtual CPU and restarts the counters, the virtual timer routine is scheduled to run again. The virtual sample time is kept relatively small to make sure sampling occurs on both CPUs on the node with a relatively small granularity. Whenever the counters overflow, the performance counter interrupt is called to collect the PC for the CPU where data is being collected. The oprofile driver relies on a firmware RTAS call to setup the debug bus to route the desired signals to the performance counter hardware to be counted. The RTAS call must set the routing registers appropriately in each of the islands to pass the signals down the debug bus as well as routing the signals from a particular island onto the bus. There is a second firmware RTAS call to reset the debug bus to the non pass thru state when the counters are not in use. Signed-off-by: NCarl Love <carll@us.ibm.com> Signed-off-by: NMaynard Johnson <mpjohn@us.ibm.com> Signed-off-by: NArnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 25 10月, 2006 3 次提交
-
-
由 Benjamin Herrenschmidt 提交于
The Cell CPU timebase has an erratum. When reading the entire 64 bits of the timebase with one mftb instruction, there is a handful of cycles window during which one might read a value with the low order 32 bits already reset to 0x00000000 but the high order bits not yet incremeted by one. This fixes it by reading the timebase again until the low order 32 bits is no longer 0. That might introduce occasional latencies if hitting mftb just at the wrong time, but no more than 70ns on a cell blade, and that was considered acceptable. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: NOlof Johansson <olof@lixom.net> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Benjamin Herrenschmidt 提交于
This patch reworks the feature fixup mecanism so vdso's can be fixed up. The main issue was that the construct: .long label (or .llong on 64 bits) will not work in the case of a shared library like the vdso. It will generate an empty placeholder in the fixup table along with a reloc, which is not something we can deal with in the vdso. The idea here (thanks Alan Modra !) is to instead use something like: 1: .long label - 1b That is, the feature fixup tables no longer contain addresses of bits of code to patch, but offsets of such code from the fixup table entry itself. That is properly resolved by ld when building the .so's. I've modified the fixup mecanism generically to use that method for the rest of the kernel as well. Another trick is that the 32 bits vDSO included in the 64 bits kernel need to have a table in the 64 bits format. However, gas does not support 32 bits code with a statement of the form: .llong label - 1b (Or even just .llong label) That is, it cannot emit the right fixup/relocation for the linker to use to assign a 32 bits address to an .llong field. Thus, in the specific case of the 32 bits vdso built as part of the 64 bits kernel, we are using a modified macro that generates: .long 0xffffffff .llong label - 1b Note that is assumes that the value is negative which is enforced by the .lds (those offsets are always negative as the .text is always before the fixup table and gas doesn't support emiting the reloc the other way around). Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Benjamin Herrenschmidt 提交于
This patch adds some macros that can be used with an explicit label in order to nest cpu features. This should be used very careful but is necessary for the upcoming cell TB fixup. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: NOlof Johansson <olof@lixom.net> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-