- 18 5月, 2009 5 次提交
-
-
由 Benjamin Herrenschmidt 提交于
I don't think anything guarantees that the objects in data.page_aligned are a multiple of PAGE_SIZE, thus the section may end on any boundary. So the following section, .data.cacheline_aligned needs an explicit alignment. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Geoff Levand 提交于
Refresh and set these options: CONFIG_SYSFS_DEPRECATED_V2: y -> n CONFIG_INPUT_JOYSTICK: y -> n CONFIG_HID_SONY: n -> m CONFIG_RTC_DRV_PS3: - -> m Signed-off-by: NGeoff Levand <geoffrey.levand@am.sony.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Steven Rostedt 提交于
After upgrading my distcc boxes from gcc 4.2.2 to 4.4.0, the function graph tracer broke. This was discovered on my x86 boxes. The issue is that gcc used the same register for an output as it did for an input in an asm statement. I first thought this was a bug in gcc and reported it. I was notified that gcc was correct and that the output had to be flagged as an "early clobber". I noticed that powerpc had the same issue and this patch fixes it. Signed-off-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Michael Ellerman 提交于
pr_debug() can now result in code being generated even when #DEBUG is not defined. That's not really desirable in the ftrace code which we want to be snappy. With CONFIG_DYNAMIC_DEBUG=y: size before: text data bss dec hex filename 3334 672 4 4010 faa arch/powerpc/kernel/ftrace.o size after: text data bss dec hex filename 2616 360 4 2980 ba4 arch/powerpc/kernel/ftrace.o Signed-off-by: NMichael Ellerman <michael@ellerman.id.au> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Mel Gorman 提交于
With CONFIG_DEBUG_VM, an assertion is made when changing the protection flags of a PTE that the PTE is locked. Huge pages use a different pagetable format and the assertion is bogus and will always trigger with a bug looking something like Unable to handle kernel paging request for data at address 0xf1a00235800006f8 Faulting instruction address: 0xc000000000034a80 Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=32 NUMA Maple Modules linked in: dm_snapshot dm_mirror dm_region_hash dm_log dm_mod loop evdev ext3 jbd mbcache sg sd_mod ide_pci_generic pata_amd ata_generic ipr libata tg3 libphy scsi_mod windfarm_pid windfarm_smu_sat windfarm_max6690_sensor windfarm_lm75_sensor windfarm_cpufreq_clamp windfarm_core i2c_powermac NIP: c000000000034a80 LR: c000000000034b18 CTR: 0000000000000003 REGS: c000000003037600 TRAP: 0300 Not tainted (2.6.30-rc3-autokern1) MSR: 9000000000009032 <EE,ME,IR,DR> CR: 28002484 XER: 200fffff DAR: f1a00235800006f8, DSISR: 0000000040010000 TASK = c0000002e54cc740[2960] 'map_high_trunca' THREAD: c000000003034000 CPU: 2 GPR00: 4000000000000000 c000000003037880 c000000000895d30 c0000002e5a2e500 GPR04: 00000000a0000000 c0000002edc40880 0000005700000393 0000000000000001 GPR08: f000000011ac0000 01a00235800006e8 00000000000000f5 f1a00235800006e8 GPR12: 0000000028000484 c0000000008dd780 0000000000001000 0000000000000000 GPR16: fffffffffffff000 0000000000000000 00000000a0000000 c000000003037a20 GPR20: c0000002e5f4ece8 0000000000001000 c0000002edc40880 0000000000000000 GPR24: c0000002e5f4ece8 0000000000000000 00000000a0000000 c0000002e5f4ece8 GPR28: 0000005700000393 c0000002e5a2e500 00000000a0000000 c000000003037880 NIP [c000000000034a80] .assert_pte_locked+0xa4/0xd0 LR [c000000000034b18] .ptep_set_access_flags+0x6c/0xb4 Call Trace: [c000000003037880] [c000000003037990] 0xc000000003037990 (unreliable) [c000000003037910] [c000000000034b18] .ptep_set_access_flags+0x6c/0xb4 [c0000000030379b0] [c00000000014bef8] .hugetlb_cow+0x124/0x674 [c000000003037b00] [c00000000014c930] .hugetlb_fault+0x4e8/0x6f8 [c000000003037c00] [c00000000013443c] .handle_mm_fault+0xac/0x828 [c000000003037cf0] [c0000000000340a8] .do_page_fault+0x39c/0x584 [c000000003037e30] [c0000000000057b0] handle_page_fault+0x20/0x5c Instruction dump: 7d29582a 7d200074 7800d182 0b000000 3c004000 3960ffff 780007c6 796b00c4 7d290214 7929a302 1d290068 7d6b4a14 <800b0010> 7c000074 7800d182 0b000000 This patch fixes the problem by not asseting the PTE is locked for VMAs backed by huge pages. Signed-off-by: NMel Gorman <mel@csn.ul.ie> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 16 5月, 2009 1 次提交
-
-
由 Jeremy Fitzhardinge 提交于
Xiaohui Xin and some other folks at Intel have been looking into what's behind the performance hit of paravirt_ops when running native. It appears that the hit is entirely due to the paravirtualized spinlocks introduced by: | commit 8efcbab6 | Date: Mon Jul 7 12:07:51 2008 -0700 | | paravirt: introduce a "lock-byte" spinlock implementation The extra call/return in the spinlock path is somehow causing an increase in the cycles/instruction of somewhere around 2-7% (seems to vary quite a lot from test to test). The working theory is that the CPU's pipeline is getting upset about the call->call->locked-op->return->return, and seems to be failing to speculate (though I haven't seen anything definitive about the precise reasons). This doesn't entirely make sense, because the performance hit is also visible on unlock and other operations which don't involve locked instructions. But spinlock operations clearly swamp all the other pvops operations, even though I can't imagine that they're nearly as common (there's only a .05% increase in instructions executed). If I disable just the pv-spinlock calls, my tests show that pvops is identical to non-pvops performance on native (my measurements show that it is actually about .1% faster, but Xiaohui shows a .05% slowdown). Summary of results, averaging 10 runs of the "mmperf" test, using a no-pvops build as baseline: nopv Pv-nospin Pv-spin CPU cycles 100.00% 99.89% 102.18% instructions 100.00% 100.10% 100.15% CPI 100.00% 99.79% 102.03% cache ref 100.00% 100.84% 100.28% cache miss 100.00% 90.47% 88.56% cache miss rate 100.00% 89.72% 88.31% branches 100.00% 99.93% 100.04% branch miss 100.00% 103.66% 107.72% branch miss rt 100.00% 103.73% 107.67% wallclock 100.00% 99.90% 102.20% The clear effect here is that the 2% increase in CPI is directly reflected in the final wallclock time. (The other interesting effect is that the more ops are out of line calls via pvops, the lower the cache access and miss rates. Not too surprising, but it suggests that the non-pvops kernel is over-inlined. On the flipside, the branch misses go up correspondingly...) So, what's the fix? Paravirt patching turns all the pvops calls into direct calls, so _spin_lock etc do end up having direct calls. For example, the compiler generated code for paravirtualized _spin_lock is: <_spin_lock+0>: mov %gs:0xb4c8,%rax <_spin_lock+9>: incl 0xffffffffffffe044(%rax) <_spin_lock+15>: callq *0xffffffff805a5b30 <_spin_lock+22>: retq The indirect call will get patched to: <_spin_lock+0>: mov %gs:0xb4c8,%rax <_spin_lock+9>: incl 0xffffffffffffe044(%rax) <_spin_lock+15>: callq <__ticket_spin_lock> <_spin_lock+20>: nop; nop /* or whatever 2-byte nop */ <_spin_lock+22>: retq One possibility is to inline _spin_lock, etc, when building an optimised kernel (ie, when there's no spinlock/preempt instrumentation/debugging enabled). That will remove the outer call/return pair, returning the instruction stream to a single call/return, which will presumably execute the same as the non-pvops case. The downsides arel 1) it will replicate the preempt_disable/enable code at eack lock/unlock callsite; this code is fairly small, but not nothing; and 2) the spinlock definitions are already a very heavily tangled mass of #ifdefs and other preprocessor magic, and making any changes will be non-trivial. The other obvious answer is to disable pv-spinlocks. Making them a separate config option is fairly easy, and it would be trivial to enable them only when Xen is enabled (as the only non-default user). But it doesn't really address the common case of a distro build which is going to have Xen support enabled, and leaves the open question of whether the native performance cost of pv-spinlocks is worth the performance improvement on a loaded Xen system (10% saving of overall system CPU when guests block rather than spin). Still it is a reasonable short-term workaround. [ Impact: fix pvops performance regression when running native ] Analysed-by: N"Xin Xiaohui" <xiaohui.xin@intel.com> Analysed-by: N"Li Xin" <xin.li@intel.com> Analysed-by: N"Nakajima Jun" <jun.nakajima@intel.com> Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: NH. Peter Anvin <hpa@zytor.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Xen-devel <xen-devel@lists.xensource.com> LKML-Reference: <4A0B62F7.5030802@goop.org> [ fixed the help text ] Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 15 5月, 2009 13 次提交
-
-
由 Jason Wessel 提交于
The treatment of the SP register is different on x86_64 and i386. This is a regression fix that lived outside the mainline kernel from 2.6.27 to now. The regression was a result of the original merge consolidation of the i386 and x86_64 archs to x86. The incorrectly reported SP on i386 prevented stack tracebacks from working correctly in gdb. Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
-
由 David Brownell 提交于
This is a build fix, resyncing the DaVinci EVM ASoC board code with the version in the DaVinci tree. That resync includes support for the DM355 EVM, although that board isn't yet in mainline. (NOTE: also includes a bugfix to the platform_add_resources call, recently sent by Chaithrika U S <chaithrika@ti.com> but not yet merged into the DaVinci tree.) Signed-off-by: NDavid Brownell <dbrownell@users.sourceforge.net> Signed-off-by: NMark Brown <broonie@opensource.wolfsonmicro.com>
-
由 Benjamin Herrenschmidt 提交于
A couple of issues crept in since about 2.6.27 related to accessing PCI device ROMs on various powerpc machines. First, historically, we don't allocate the ROM resource in the resource tree. I'm not entirely certain of why, I susepct they often contained garbage on x86 but it's hard to tell. This causes the current generic code to always call pci_assign_resource() when trying to access the said ROM from sysfs, which will try to re-assign some new address regardless of what the ROM BAR was already set to at boot time. This can be a problem on hypervisor platforms like pSeries where we aren't supposed to move PCI devices around (and in fact probably can't). Second, our code that generates the PCI tree from the OF device-tree (instead of doing config space probing) which we mostly use on pseries at the moment, didn't set the (new) flag IORESOURCE_SIZEALIGN on any resource. That means that any attempt at re-assigning such a resource with pci_assign_resource() would fail due to resource_alignment() returning 0. This fixes this by doing these two things: - The code that calculates resource flags based on the OF device-node is improved to set IORESOURCE_SIZEALIGN on any valid BAR, and while at it also set IORESOURCE_READONLY for ROMs since we were lacking that too - We now allocate ROM resources as part of the resource tree. However to limit the chances of nasty conflicts due to busted firmwares, we only do it on the second pass of our two-passes allocation scheme, so that all valid and enabled BARs get precedence. This brings pSeries back the ability to access PCI ROMs via sysfs (and thus initialize various video cards from X etc...). Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
My previous pach for fixing the oprofile CPU type got somewhat mismerged (by my fault) when it collided with another related patch. This should finally (fingers crossed) fix the whole thing. We make sure we keep the -old- oprofile type and CPU type whenever one of them was specified in the first pass through the function. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Gerhard Stenzel 提交于
There have been a series of checkstops on QS21 related to ptcal being set up incorrectly. On systems that only have memory on a single node, ptcal fails when it gets a pointer to memory on the remote node. Moreover, agressive prefetching in memcpy and other functions may accidentally touch the first cache line of the page that we reserve for ptcal, which causes an ECC checkstop. We now allocate pages only from the specified node, moves the ptcal area into the middle of the allocated page to avoid potential prefetch problems and prints the address of the ptcal area to facilitate diagnostics. Signed-off-by: NGerhard Stenzel <gerhard.stenzel@de.ibm.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Becky Bruce 提交于
We're currently choking on mem=4g (and above) due to memory_limit being specified as an unsigned long. Make memory_limit phys_addr_t to fix this. Signed-off-by: NBecky Bruce <beckyb@kernel.crashing.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Kumar Gala 提交于
Before when we were setting up the irq host map for mpic we passed in just isu_size for the size of the linear map. However, for a number of mpic implementations we have no isu (thus pass in 0) and will end up with a no linear map (size = 0). This causes us to always call irq_find_mapping() from mpic_get_irq(). By moving the allocation of the host map to after we've determined the number of sources we can actually benefit from having a linear map for the non-isu users that covers all the interrupt sources. Signed-off-by: NKumar Gala <galak@kernel.crashing.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Maynard Johnson 提交于
Description ----------- Change ppc64 oprofile kernel driver to use the SLOT bits (MMCRA[37:39]only on older processors where those bits are defined. Background ---------- The performance monitor unit of the 64-bit POWER processor family has the ability to collect accurate instruction-level samples when profiling on marked events (i.e., "PM_MRK_<event-name>"). In processors prior to POWER6, the MMCRA register contained "slot information" that the oprofile kernel driver used to adjust the value latched in the SIAR at the time of a PMU interrupt. But as of POWER6, these slot bits in MMCRA are no longer necessary for oprofile to use, since the SIAR itself holds the accurate sampled instruction address. With POWER6, these MMCRA slot bits were zero'ed out by hardware so oprofile's use of these slot bits was, in effect, a NOP. But with POWER7, these bits are no longer zero'ed out; however, they serve some other purpose rather than slot information. Thus, using these bits on POWER7 to adjust the SIAR value results in samples being attributed to the wrong instructions. The attached patch changes the oprofile kernel driver to ignore these slot bits on all newer processors starting with POWER6. Signed-off-by: NMaynard Johnson <maynardj@us.ibm.com> Signed-off-by: NMichael Wolf <mjw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Stephen Rothwell x 提交于
Commit 4fc665b8 "powerpc: Merge 32 and 64-bit dma code" made changes to the PCI initialisation code that added an assignment to archdata.dma_data but only for 32 bit code. Commit 7eef440a "powerpc/pci: Cosmetic cleanups of pci-common.c" removed the conditional compilation. Unfortunately, the iSeries code setup the archdata.dma_data before that assignment was done - effectively overwriting the dma_data with NULL. Fix this up by moving the iSeries setup of dma_data into a pci_dma_dev_setup callback. Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Timur Tabi 提交于
The mktree utility defines some variables as "uint", although this is not a standard C type, and so cross-compiling on Mac OS X fails. Change this to "unsigned int". Signed-off-by: NTimur Tabi <timur@freescale.com> Acked-by: NJosh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 John Linn 提交于
The interrupt controller was not handling level interrupts correctly such that duplicate interrupts were happening. This fixes the problem and adds edge type interrupts which are needed in Xilinx hardware. Signed-off-by: NJohn Linn <john.linn@xilinx.com> Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>
-
由 Grant Likely 提交于
It is common to use U-Boot on Xilinx Virtex platforms. This patch ensures that CONFIG_DEFAULT_UIMAGE is selected for virtex Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>
-
由 Grant Likely 提交于
Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>
-
- 14 5月, 2009 21 次提交
-
-
由 Thomas Bogendoerfer 提交于
Locking of irq_desc is now done in irq_set_affinity; don't lock it again in chip specific set_affinity function. Signed-off-by: NThomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 David Daney 提交于
When init is started it is SIGNAL_UNKILLABLE. If it were to get an address error, we would try to send it SIGBUS, but it would be ignored and the faulting instruction restarted. This results in an endless loop. We need to use force_sig() instead so it will actually die and give us some useful information. Reported-by: NFlorian Fainelli <florian@openwrt.org> Signed-off-by: NDavid Daney <ddaney@caviumnetworks.com> Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Coly Li 提交于
This patch modifies parameter of octeon_cvmcount_read() from 'void' to 'struct clocksource *cs', which fixes compile warning for incompatible parameter type. Signed-off-by: NColy Li <coly.li@suse.de> Cc: David Daney <ddaney@caviumnetworks.com> Cc: Ingo Molnar <mingo@elte.hu> Reviewed-by: NDavid Daney <ddaney@caviumnetworks.com> Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Ralf Baechle 提交于
The inline assembler used on 32-bit kernels was using the "h" constraint which was considered dangerous and removed for gcc 4.4.0. Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Ralf Baechle 提交于
Commit 35133692 (kernel.org) rsp. b3594a089f1c17ff919f8f78505c3f20e1f6f8ce (linux-mips.org): > From: Chris Dearman <chris@mips.com> > Date: Wed, 19 Sep 2007 00:58:24 +0100 > Subject: [PATCH] [MIPS] Allow setting of the cache attribute at run time. > > Slightly tacky, but there is a precedent in the sparc archirecture code. introduces the variable _page_cachable_default, which defaults to zero and. is used to create the prototype PTE for __kmap_atomic in arch/mips/mm/init.c:kmap_init before initialization in arch/mips/mm/c-r4k.c:coherency_setup, so the default value of 0 will be used as the CCA of kmap atomic pages which on many processors is not a defined CCA value and may result in writes to kmap_atomic pages getting corrupted. Debugged by Jon Fraser (jfraser@broadcom.com). Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Ralf Baechle 提交于
Probably nobody does arithmetic on cp0 register values so this has never bitten. Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Shane McDonald 提交于
The RAMROOT function was a successful but non-portable attempt to append the root filesystem to the end of the kernel image. The preferred and portable solution is to use an initramfs instead. Signed-off-by: NShane McDonald <mcdonald.shane@gmail.com> Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Ralf Baechle 提交于
I don't think that in 15 years of Linux/MIPS the zero division checking code generated by gcc by default has ever caught anything. Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Ralf Baechle 提交于
Otherwise indigestable options might be passed to the host compiler. Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Ralf Baechle 提交于
Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Ralf Baechle 提交于
Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Shane McDonald 提交于
There have been a number of compile problems with the msp71xx configuration ever since it was included in the linux-mips.org repository. This patch resolves compilation problems with attempting to reset the board using non-existent GPIO routines. This patch has been compile-tested against the current HEAD. Signed-off-by: NShane McDonald <mcdonald.shane@gmail.com> Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Shane McDonald 提交于
There have been a number of compile problems with the msp71xx configuration ever since it was included in the linux-mips.org repository. This patch resolves the "multiple definition of plat_timer_setup" problem, and creates the required get_c0_compare_int function. This patch has been compile-tested against the current HEAD. Signed-off-by: NShane McDonald <mcdonald.shane@gmail.com> Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Ralf Baechle 提交于
Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Ralf Baechle 提交于
Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Thomas Bogendoerfer 提交于
There was already a define for NMI_OFFSET in asm/sn/addr.h, which now clashes with linux/hardirq.h. Rename the one in sn/addr.h to fix IP27 builds.. Signed-off-by: NThomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Manuel Lauss 提交于
Fix breakage introduced by 8e19608e. Signed-off-by: NManuel Lauss <mano@roarinelk.homelinux.net> Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Ralf Baechle 提交于
Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Ralf Baechle 提交于
Beyond the requirements of the architecture standard Cavium also supports 8k and 32k pages. Signed-off-by: NRalf Baechle <ralf@linux-mips.org> Acked-by: NDavid Daney <ddaney@caviumnetworks.com>
-
由 Atsushi Nemoto 提交于
Addition of -fwrapv option in 2.6.29 discloses possible overflow with signed arithmetics. For example, result of "a * 6 / 12" (int a = 400000000) is 200000000 without -fwrapv but -157913941 with -fwrapv. Change some variable to unsigned to avoid such overflows. Signed-off-by: NAtsushi Nemoto <anemo@mba.ocn.ne.jp> Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Atsushi Nemoto 提交于
Synchronize dma_map_page/dma_unmap_page and dma_map_single/dma_unmap_single. This will reduce unnecessary writebacks and invalidates. [Ralf: make dma_unmap_page an inline function.] Signed-off-by: NAtsushi Nemoto <anemo@mba.ocn.ne.jp> Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-