- 13 3月, 2010 7 次提交
-
-
由 FUJITA Tomonori 提交于
This converts powerpc to use the generic pci_set_dma_mask and pci_set_consistent_dma_mask (drivers/pci/pci.c). The generic pci_set_dma_mask does what powerpc's pci_set_dma_mask does. Unlike powerpc's pci_set_consistent_dma_mask, the gneric pci_set_consistent_dma_mask sets only coherent_dma_mask. It doesn't work for powerpc? pci_set_consistent_dma_mask API should set only coherent_dma_mask? Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Greg KH <greg@kroah.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 FUJITA Tomonori 提交于
All the architectures properly set NEED_DMA_MAP_STATE now so we can safely add linux/pci-dma.h to linux/pci.h and remove the linux/pci-dma.h inclusion in arch's asm/pci.h Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: NArnd Bergmann <arnd@arndb.de> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 FUJITA Tomonori 提交于
Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Hellwig 提交于
While in theory user_enable_single_step/user_disable_single_step/ user_enable_blockstep could also be provided as an inline or macro there's no good reason to do so, and having the prototype in one places keeps code size and confusion down. Roland said: The original thought there was that user_enable_single_step() et al might well be only an instruction or three on a sane machine (as if we have any of those!), and since there is only one call site inlining would be beneficial. But I agree that there is no strong reason to care about inlining it. As to the arch changes, there is only one thought I'd add to the record. It was always my thinking that for an arch where PTRACE_SINGLESTEP does text-modifying breakpoint insertion, user_enable_single_step() should not be provided. That is, arch_has_single_step()=>true means that there is an arch facility with "pure" semantics that does not have any unexpected side effects. Inserting a breakpoint might do very unexpected strange things in multi-threaded situations. Aside from that, it is a peculiar side effect that user_{enable,disable}_single_step() should cause COW de-sharing of text pages and so forth. For PTRACE_SINGLESTEP, all these peculiarities are the status quo ante for that arch, so having arch_ptrace() itself do those is one thing. But for building other things in the future, it is nicer to have a uniform "pure" semantics that arch-independent code can expect. OTOH, all such arch issues are really up to the arch maintainer. As of today, there is nothing but ptrace using user_enable_single_step() et al so it's a distinction without a practical difference. If/when there are other facilities that use user_enable_single_step() and might care, the affected arch's can revisit the question when someone cares about the quality of the arch support for said new facility. Signed-off-by: NChristoph Hellwig <hch@lst.de> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Roland McGrath <roland@redhat.com> Acked-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Hellwig 提交于
Add generic implementations of the old and really old uname system calls. Note that sh only implements sys_olduname but not sys_oldolduname, but I'm not going to bother with another ifdef for that special case. m32r implemented an old uname but never wired it up, so kill it, too. Signed-off-by: NChristoph Hellwig <hch@lst.de> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: James Morris <jmorris@namei.org> Cc: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Hellwig 提交于
On an architecture that supports 32-bit compat we need to override the reported machine in uname with the 32-bit value. Instead of doing this separately in every architecture introduce a COMPAT_UTS_MACHINE define in <asm/compat.h> and apply it directly in sys_newuname(). Signed-off-by: NChristoph Hellwig <hch@lst.de> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: James Morris <jmorris@namei.org> Cc: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Hellwig 提交于
Add a generic implementation of the ipc demultiplexer syscall. Except for s390 and sparc64 all implementations of the sys_ipc are nearly identical. There are slight differences in the types of the parameters, where mips and powerpc as the only 64-bit architectures with sys_ipc use unsigned long for the "third" argument as it gets casted to a pointer later, while it traditionally is an "int" like most other paramters. frv goes even further and uses unsigned long for all parameters execept for "ptr" which is a pointer type everywhere. The change from int to unsigned long for "third" and back to "int" for the others on frv should be fine due to the in-register calling conventions for syscalls (we already had a similar issue with the generic sys_ptrace), but I'd prefer to have the arch maintainers looks over this in details. Except for that h8300, m68k and m68knommu lack an impplementation of the semtimedop sub call which this patch adds, and various architectures have gets used - at least on i386 it seems superflous as the compat code on x86-64 and ia64 doesn't even bother to implement it. [akpm@linux-foundation.org: add sys_ipc to sys_ni.c] Signed-off-by: NChristoph Hellwig <hch@lst.de> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Reviewed-by: NH. Peter Anvin <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: James Morris <jmorris@namei.org> Cc: Andreas Schwab <schwab@linux-m68k.org> Acked-by: NJesper Nilsson <jesper.nilsson@axis.com> Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk> Acked-by: NDavid Howells <dhowells@redhat.com> Acked-by: NKyle McMartin <kyle@mcmartin.ca> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 3月, 2010 2 次提交
-
-
由 Dave Kleikamp 提交于
powerpc/booke: Fix a couple typos in the advanced ptrace code Found and fixed a couple typos in the advanced ptrace patches. (These patches are currently in benh's next tree.) Signed-off-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: linuxppc-dev list <Linuxppc-dev@ozlabs.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Michael Ellerman 提交于
On 64-bit kernels we currently have a 512 byte struct paca_struct for each cpu (usually just called "the paca"). Currently they are statically allocated, which means a kernel built for a large number of cpus will waste a lot of space if it's booted on a machine with few cpus. We can avoid that by only allocating the number of pacas we need at boot. However this is complicated by the fact that we need to access the paca before we know how many cpus there are in the system. The solution is to dynamically allocate enough space for NR_CPUS pacas, but then later in boot when we know how many cpus we have, we free any unused pacas. Signed-off-by: NMichael Ellerman <michael@ellerman.id.au> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 05 3月, 2010 1 次提交
-
-
由 Scott Wood 提交于
This implements perf_event support for the Freescale embedded performance monitor, based on the existing perf_event.c that supports server/classic chips. Some limitations: - Performance monitor interrupts are regular EE interrupts, and thus you can't profile places with interrupts disabled. We may want to implement soft IRQ-disabling, with perfmon interrupts exempted and treated as NMIs. - When trying to schedule multiple event groups at once, and using restricted events, situations could arise where scheduling fails even though it would be possible. Consider three groups, each with two events. One group has restricted events, the others don't. The two non-restricted groups are scheduled, then one is removed, which happens to occupy the two counters that can't do restricted events. The remaining non-restricted group will not be moved to the non-restricted-capable counters to make room if the restricted group tries to be scheduled. Signed-off-by: NScott Wood <scottwood@freescale.com> Acked-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
-
- 01 3月, 2010 15 次提交
-
-
由 Liu Yu 提交于
Old method prematurely sets ESR and DEAR. Move this part after we decide to inject interrupt, which is more like hardware behave. Signed-off-by: NLiu Yu <yu.liu@freescale.com> Acked-by: NHollis Blanchard <hollis@penguinppc.org> Acked-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Liu Yu 提交于
commit 55fb1027c1cf9797dbdeab48180da530e81b1c39 doesn't update tlbcfg correctly. Fix it. And since guest OS likes 'fixed' hardware, initialize tlbcfg everytime when guest access is useless. So move this part to init code. Signed-off-by: NLiu Yu <yu.liu@freescale.com> Acked-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Liu Yu 提交于
Latest kernel start to access l1csr0 to contron L1. We just tell guest no operation is on going. Signed-off-by: NLiu Yu <yu.liu@freescale.com> Acked-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Alexander Graf 提交于
SRR1 stores more information that just the MSR value. It also stores valuable information about the type of interrupt we received, for example whether the storage interrupt we just got was because of a missing htab entry or not. We use that information to speed up the exit path. Now if we get preempted before we can interpret the shadow_msr values, we get into vcpu_put which then calls the MSR handler, which then sets all the SRR1 information bits in shadow_msr to 0. Great. So let's preserve the SRR1 specific bits in shadow_msr whenever we set the MSR. They don't hurt. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
Commit 7d01b4c3ed2bb33ceaf2d270cb4831a67a76b51b introduced PACA backed vcpu values. With this patch, when a userspace app was setting GPRs before it was actually first loaded, the set values get discarded. This is because vcpu_load loads them from the vcpu backing store that we use whenever we're not owning the PACA. That behavior is not really a major problem, because we don't need it for qemu. Other users (like kvmctl) do have problems with it though, so let's better do it right. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
When our guest starts using either the FPU, Altivec or VSX we need to make sure Linux knows about it and sneak into its process switching code accordingly. This patch makes accesses to the above parts of the system work inside the VM. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
Linux contains quite some bits of code to load FPU, Altivec and VSX lazily for a task. It calls those bits in real mode, coming from an interrupt handler. For KVM we better reuse those, so let's wrap a bit of trampoline magic around them and then we can call them from normal module code. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
An SLB entry contains two pieces of information related to size: 1) PTE size 2) SLB size The L bit defines the PTE be "large" (usually means 16MB), SLB_VSID_B_1T defines that the SLB should span 1 GB instead of the default 256MB. Apparently I messed things up and just put those two in one box, shaked it heavily and came up with the current code which handles large pages incorrectly, because it also treats large page SLB entries as "1TB" segment entries. This patch splits those two features apart, making Linux guests boot even when they have > 256MB. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
Book3S needs some flags in SRR1 to get to know details about an interrupt. One such example is the trap instruction. It tells the guest kernel that a program interrupt is due to a trap using a bit in SRR1. This patch implements above behavior, making WARN_ON behave like WARN_ON. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
Currently we're racy when doing the transition from IR=1 to IR=0, from the module memory entry code to the real mode SLB switching code. To work around that I took a look at the RTAS entry code which is faced with a similar problem and did the same thing: A small helper in linear mapped memory that does mtmsr with IR=0 and then RFIs info the actual handler. Thanks to that trick we can safely take page faults in the entry code and only need to be really wary of what to do as of the SLB switching part. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
To fetch the last instruction we were interrupted on, we enable DR in early exit code, where we are still in a very transitional phase between guest and host state. Most of the time this seemed to work, but another CPU can easily flush our TLB and HTAB which makes us go in the Linux page fault handler which totally breaks because we still use the guest's SLB entries. To work around that, let's introduce a second KVM guest mode that defines that whenever we get a trap, we don't call the Linux handler or go into the KVM exit code, but just jump over the faulting instruction. That way a potentially bad lwz doesn't trigger any faults and we can later on interpret the invalid instruction we fetched as "fetch didn't work". Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
We're being horribly racy right now. All the entry and exit code hijacks random fields from the PACA that could easily be used by different code in case we get interrupted, for example by a #MC or even page fault. After discussing this with Ben, we figured it's best to reserve some more space in the PACA and just shove off some vcpu state to there. That way we can drastically improve the readability of the code, make it less racy and less complex. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
We now have helpers for the GPRs, so let's also add some for CR and XER. Having them in the PACA simplifies code a lot, as we don't need to care about where to store CC or not to overflow any integers. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
All code in PPC KVM currently accesses gprs in the vcpu struct directly. While there's nothing wrong with that wrt the current way gprs are stored and loaded, it doesn't suffice for the PACA acceleration that will follow in this patchset. So let's just create little wrapper inline functions that we call whenever a GPR needs to be read from or written to. The compiled code shouldn't really change at all for now. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
We treated the DEC interrupt like an edge based one. This is not true for Book3s. The DEC keeps firing until mtdec is issued again and thus clears the interrupt line. So let's implement this logic in KVM too. This patch moves the line clearing from the firing of the interrupt to the mtdec emulation. This makes PPC64 guests work without AGGRESSIVE_DEC defined. Signed-off-by: NAlexander Graf <agraf@suse.de> Acked-by: NAcked-by: Hollis Blanchard <hollis@penguinppc.org> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 26 2月, 2010 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
Anton's commit enabling the use of the lwsync fixup mechanism on 64-bit breaks modules. The lwsync fixup section uses .long instead of the FTR_ENTRY_OFFSET macro used by other fixups sections, and thus will generate 32-bit relocations that our module loader cannot resolve. This changes it to use the same type as other feature sections. Note however that we might want to consider using 32-bit for all the feature fixup offsets and add support for R_PPC_REL32 to module_64.c instead as that would reduce the size of the kernel image. I'll leave that as an exercise for the reader for now... Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 21 2月, 2010 1 次提交
-
-
由 Russell King 提交于
On VIVT ARM, when we have multiple shared mappings of the same file in the same MM, we need to ensure that we have coherency across all copies. We do this via make_coherent() by making the pages uncacheable. This used to work fine, until we allowed highmem with highpte - we now have a page table which is mapped as required, and is not available for modification via update_mmu_cache(). Ralf Beache suggested getting rid of the PTE value passed to update_mmu_cache(): On MIPS update_mmu_cache() calls __update_tlb() which walks pagetables to construct a pointer to the pte again. Passing a pte_t * is much more elegant. Maybe we might even replace the pte argument with the pte_t? Ben Herrenschmidt would also like the pte pointer for PowerPC: Passing the ptep in there is exactly what I want. I want that -instead- of the PTE value, because I have issue on some ppc cases, for I$/D$ coherency, where set_pte_at() may decide to mask out the _PAGE_EXEC. So, pass in the mapped page table pointer into update_mmu_cache(), and remove the PTE value, updating all implementations and call sites to suit. Includes a fix from Stephen Rothwell: sparc: fix fallout from update_mmu_cache API change Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-
- 19 2月, 2010 2 次提交
-
-
由 Thomas Gleixner 提交于
mpic_lock, irq_rover_lock and fixup_lock need to be real spinlocks in RT. Convert them to raw_spinlock. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Thomas Gleixner 提交于
feature_lock needs to be a real spinlock in RT. Convert it to raw_spinlock. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 17 2月, 2010 11 次提交
-
-
由 Anatolij Gustschin 提交于
MPC5121 has 12 PSC devices. Enable UART support for all of them by defining the number of max. PSCs depending on selection of PPC_MPC512x platform support. Signed-off-by: NAnatolij Gustschin <agust@denx.de> Acked-by: NGreg Kroah-Hartman <gregkh@suse.de> Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>
-
由 Dave Kleikamp 提交于
powerpc/booke: Add support for advanced debug registers From: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Based on patches originally written by Torez Smith. This patch defines context switch and trap related functionality for BookE specific Debug Registers. It adds support to ptrace() for setting and getting BookE related Debug Registers Signed-off-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: Torez Smith <lnxtorez@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Gibson <dwg@au1.ibm.com> Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Sergio Durigan Junior <sergiodj@br.ibm.com> Cc: Thiago Jung Bauermann <bauerman@br.ibm.com> Cc: linuxppc-dev list <Linuxppc-dev@ozlabs.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Dave Kleikamp 提交于
powerpc/booke: Add definitions for advanced debug registers From: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Based on patches originally written by Torez Smith. This patch adds additional definitions for BookE Debug Registers to the reg_booke.h header file. Signed-off-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com> Acked-by: NDavid Gibson <dwg@au1.ibm.com> Cc: Torez Smith <lnxtorez@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Sergio Durigan Junior <sergiodj@br.ibm.com> Cc: Thiago Jung Bauermann <bauerman@br.ibm.com> Cc: linuxppc-dev list <Linuxppc-dev@ozlabs.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Dave Kleikamp 提交于
powerpc: Extended ptrace interface From: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Based on patches originally written by Torez Smith. Add a new extended ptrace interface so that user-space has a single interface for powerpc, without having to know the specific layout of the debug registers. Implement: PPC_PTRACE_GETHWDEBUGINFO PPC_PTRACE_SETHWDEBUG PPC_PTRACE_DELHWDEBUG Signed-off-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com> Acked-by: NDavid Gibson <dwg@au1.ibm.com> Cc: Torez Smith <lnxtorez@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Sergio Durigan Junior <sergiodj@br.ibm.com> Cc: Thiago Jung Bauermann <bauerman@br.ibm.com> Cc: linuxppc-dev list <Linuxppc-dev@ozlabs.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Anton Blanchard 提交于
Nick Piggin discovered that lwsync barriers around locks were faster than isync on 970. That was a long time ago and I completely dropped the ball in testing his patches across other ppc64 processors. Turns out the idea helps on other chips. Using a microbenchmark that uses a lot of threads to contend on a global pthread mutex (and therefore a global futex), POWER6 improves 8% and POWER7 improves 2%. I checked POWER5 and while I couldn't measure an improvement, there was no regression. This patch uses the lwsync patching code to replace the isyncs with lwsyncs on CPUs that support the instruction. We were marking POWER3 and RS64 as lwsync capable but in reality they treat it as a full sync (ie slow). Remove the CPU_FTR_LWSYNC bit from these CPUs so they continue to use the faster isync method. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Anton Blanchard 提交于
For performance reasons we are about to change ISYNC_ON_SMP to sometimes be lwsync. Now that the macro name doesn't make sense, change it and LWSYNC_ON_SMP to better explain what the barriers are doing. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Anton Blanchard 提交于
This patch implements the lwarx/ldarx hint bit for bit locks. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Anton Blanchard 提交于
Recent versions of the PowerPC architecture added a hint bit to the larx instructions to differentiate between an atomic operation and a lock operation: > 0 Other programs might attempt to modify the word in storage addressed by EA > even if the subsequent Store Conditional succeeds. > > 1 Other programs will not attempt to modify the word in storage addressed by > EA until the program that has acquired the lock performs a subsequent store > releasing the lock. To avoid a binutils dependency this patch create macros for the extended lwarx format and uses it in the spinlock code. To test this change I used a simple test case that acquires and releases a global pthread mutex: pthread_mutex_lock(&mutex); pthread_mutex_unlock(&mutex); On a 32 core POWER6, running 32 test threads we spend almost all our time in the futex spinlock code: 94.37% perf [kernel] [k] ._raw_spin_lock | |--99.95%-- ._raw_spin_lock | | | |--63.29%-- .futex_wake | | | |--36.64%-- .futex_wait_setup Which is a good test for this patch. The results (in lock/unlock operations per second) are: before: 1538203 ops/sec after: 2189219 ops/sec An improvement of 42% A 32 core POWER7 improves even more: before: 1279529 ops/sec after: 2282076 ops/sec An improvement of 78% Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Anton Blanchard 提交于
I often get asked if BAD interrupts are really bad. On some boxes (eg IBM machines running a hypervisor) there are valid cases where are presented with an interrupt that is not for us. These cases are common enough to show up as thousands of BAD interrupts a day. Tone them down by calling them spurious. Since they can be a significant cause of OS jitter, we may as well log them per cpu so we know where they are occurring. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Anton Blanchard 提交于
With NO_HZ it is useful to know how often the decrementer is going off. The patch below adds an entry for it and also adds it into the /proc/stat summaries. While here, I added performance monitoring and machine check exceptions. I found it useful to keep an eye on the PMU exception rate when using the perf tool. Since it's possible to take a completely handled machine check on a System p box it also sounds like a good idea to keep a machine check summary. The event naming matches x86 to keep gratuitous differences to a minimum. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Anton Blanchard 提交于
PowerPC is currently using asm-generic/hardirq.h which statically allocates an NR_CPUS irq_stat array. Switch to an arch specific implementation which uses per cpu data: On a kernel with NR_CPUS=1024, this saves quite a lot of memory: text data bss dec hex filename 8767938 2944132 1636796 13348866 cbb002 vmlinux.baseline 8767779 2944260 1505724 13217763 c9afe3 vmlinux.irq_cpustat A saving of around 128kB. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-