1. 18 5月, 2009 5 次提交
    • B
      powerpc: Explicit alignment for .data.cacheline_aligned · 0e337b42
      Benjamin Herrenschmidt 提交于
      I don't think anything guarantees that the objects in data.page_aligned
      are a multiple of PAGE_SIZE, thus the section may end on any boundary.
      
      So the following section, .data.cacheline_aligned needs an explicit
      alignment.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      0e337b42
    • G
      powerpc/ps3: Update ps3_defconfig · dc892288
      Geoff Levand 提交于
      Refresh and set these options:
      
       CONFIG_SYSFS_DEPRECATED_V2: y -> n
       CONFIG_INPUT_JOYSTICK:      y -> n
       CONFIG_HID_SONY:            n -> m
       CONFIG_RTC_DRV_PS3:         - -> m
      Signed-off-by: NGeoff Levand <geoffrey.levand@am.sony.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      dc892288
    • S
      powerpc/ftrace: Fix constraint to be early clobber · c3cf8667
      Steven Rostedt 提交于
      After upgrading my distcc boxes from gcc 4.2.2 to 4.4.0, the function
      graph tracer broke. This was discovered on my x86 boxes.
      
      The issue is that gcc used the same register for an output as it did for
      an input in an asm statement. I first thought this was a bug in gcc and
      reported it. I was notified that gcc was correct and that the output had
      to be flagged as an "early clobber".
      
      I noticed that powerpc had the same issue and this patch fixes it.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      c3cf8667
    • M
      powerpc/ftrace: Use pr_devel() in ftrace.c · 021376a3
      Michael Ellerman 提交于
      pr_debug() can now result in code being generated even when #DEBUG
      is not defined. That's not really desirable in the ftrace code
      which we want to be snappy.
      
      With CONFIG_DYNAMIC_DEBUG=y:
      
      size before:
         text	   data	    bss	    dec	    hex	filename
         3334	    672	      4	   4010	    faa	arch/powerpc/kernel/ftrace.o
      
      size after:
         text	   data	    bss	    dec	    hex	filename
         2616	    360	      4	   2980	    ba4	arch/powerpc/kernel/ftrace.o
      Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      021376a3
    • M
      powerpc: Do not assert pte_locked for hugepage PTE entries · af3e4aca
      Mel Gorman 提交于
      With CONFIG_DEBUG_VM, an assertion is made when changing the protection
      flags of a PTE that the PTE is locked. Huge pages use a different pagetable
      format and the assertion is bogus and will always trigger with a bug looking
      something like
      
       Unable to handle kernel paging request for data at address 0xf1a00235800006f8
       Faulting instruction address: 0xc000000000034a80
       Oops: Kernel access of bad area, sig: 11 [#1]
       SMP NR_CPUS=32 NUMA Maple
       Modules linked in: dm_snapshot dm_mirror dm_region_hash
        dm_log dm_mod loop evdev ext3 jbd mbcache sg sd_mod ide_pci_generic
        pata_amd ata_generic ipr libata tg3 libphy scsi_mod windfarm_pid
        windfarm_smu_sat windfarm_max6690_sensor windfarm_lm75_sensor
        windfarm_cpufreq_clamp windfarm_core i2c_powermac
       NIP: c000000000034a80 LR: c000000000034b18 CTR: 0000000000000003
       REGS: c000000003037600 TRAP: 0300   Not tainted (2.6.30-rc3-autokern1)
       MSR: 9000000000009032 <EE,ME,IR,DR>  CR: 28002484  XER: 200fffff
       DAR: f1a00235800006f8, DSISR: 0000000040010000
       TASK = c0000002e54cc740[2960] 'map_high_trunca' THREAD: c000000003034000 CPU: 2
       GPR00: 4000000000000000 c000000003037880 c000000000895d30 c0000002e5a2e500
       GPR04: 00000000a0000000 c0000002edc40880 0000005700000393 0000000000000001
       GPR08: f000000011ac0000 01a00235800006e8 00000000000000f5 f1a00235800006e8
       GPR12: 0000000028000484 c0000000008dd780 0000000000001000 0000000000000000
       GPR16: fffffffffffff000 0000000000000000 00000000a0000000 c000000003037a20
       GPR20: c0000002e5f4ece8 0000000000001000 c0000002edc40880 0000000000000000
       GPR24: c0000002e5f4ece8 0000000000000000 00000000a0000000 c0000002e5f4ece8
       GPR28: 0000005700000393 c0000002e5a2e500 00000000a0000000 c000000003037880
       NIP [c000000000034a80] .assert_pte_locked+0xa4/0xd0
       LR [c000000000034b18] .ptep_set_access_flags+0x6c/0xb4
       Call Trace:
       [c000000003037880] [c000000003037990] 0xc000000003037990 (unreliable)
       [c000000003037910] [c000000000034b18] .ptep_set_access_flags+0x6c/0xb4
       [c0000000030379b0] [c00000000014bef8] .hugetlb_cow+0x124/0x674
       [c000000003037b00] [c00000000014c930] .hugetlb_fault+0x4e8/0x6f8
       [c000000003037c00] [c00000000013443c] .handle_mm_fault+0xac/0x828
       [c000000003037cf0] [c0000000000340a8] .do_page_fault+0x39c/0x584
       [c000000003037e30] [c0000000000057b0] handle_page_fault+0x20/0x5c
       Instruction dump:
       7d29582a 7d200074 7800d182 0b000000 3c004000 3960ffff 780007c6 796b00c4
       7d290214 7929a302 1d290068 7d6b4a14 <800b0010> 7c000074 7800d182 0b000000
      
      This patch fixes the problem by not asseting the PTE is locked for VMAs
      backed by huge pages.
      Signed-off-by: NMel Gorman <mel@csn.ul.ie>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      af3e4aca
  2. 16 5月, 2009 1 次提交
    • J
      x86: Fix performance regression caused by paravirt_ops on native kernels · b4ecc126
      Jeremy Fitzhardinge 提交于
      Xiaohui Xin and some other folks at Intel have been looking into what's
      behind the performance hit of paravirt_ops when running native.
      
      It appears that the hit is entirely due to the paravirtualized
      spinlocks introduced by:
      
       | commit 8efcbab6
       | Date:   Mon Jul 7 12:07:51 2008 -0700
       |
       |     paravirt: introduce a "lock-byte" spinlock implementation
      
      The extra call/return in the spinlock path is somehow
      causing an increase in the cycles/instruction of somewhere around 2-7%
      (seems to vary quite a lot from test to test).  The working theory is
      that the CPU's pipeline is getting upset about the
      call->call->locked-op->return->return, and seems to be failing to
      speculate (though I haven't seen anything definitive about the precise
      reasons).  This doesn't entirely make sense, because the performance
      hit is also visible on unlock and other operations which don't involve
      locked instructions.  But spinlock operations clearly swamp all the
      other pvops operations, even though I can't imagine that they're
      nearly as common (there's only a .05% increase in instructions
      executed).
      
      If I disable just the pv-spinlock calls, my tests show that pvops is
      identical to non-pvops performance on native (my measurements show that
      it is actually about .1% faster, but Xiaohui shows a .05% slowdown).
      
      Summary of results, averaging 10 runs of the "mmperf" test, using a
      no-pvops build as baseline:
      
      		nopv		Pv-nospin	Pv-spin
      CPU cycles	100.00%		99.89%		102.18%
      instructions	100.00%		100.10%		100.15%
      CPI		100.00%		99.79%		102.03%
      cache ref	100.00%		100.84%		100.28%
      cache miss	100.00%		90.47%		88.56%
      cache miss rate	100.00%		89.72%		88.31%
      branches	100.00%		99.93%		100.04%
      branch miss	100.00%		103.66%		107.72%
      branch miss rt	100.00%		103.73%		107.67%
      wallclock	100.00%		99.90%		102.20%
      
      The clear effect here is that the 2% increase in CPI is
      directly reflected in the final wallclock time.
      
      (The other interesting effect is that the more ops are
      out of line calls via pvops, the lower the cache access
      and miss rates.  Not too surprising, but it suggests that
      the non-pvops kernel is over-inlined.  On the flipside,
      the branch misses go up correspondingly...)
      
      So, what's the fix?
      
      Paravirt patching turns all the pvops calls into direct calls, so
      _spin_lock etc do end up having direct calls.  For example, the compiler
      generated code for paravirtualized _spin_lock is:
      
      <_spin_lock+0>:		mov    %gs:0xb4c8,%rax
      <_spin_lock+9>:		incl   0xffffffffffffe044(%rax)
      <_spin_lock+15>:	callq  *0xffffffff805a5b30
      <_spin_lock+22>:	retq
      
      The indirect call will get patched to:
      <_spin_lock+0>:		mov    %gs:0xb4c8,%rax
      <_spin_lock+9>:		incl   0xffffffffffffe044(%rax)
      <_spin_lock+15>:	callq <__ticket_spin_lock>
      <_spin_lock+20>:	nop; nop		/* or whatever 2-byte nop */
      <_spin_lock+22>:	retq
      
      One possibility is to inline _spin_lock, etc, when building an
      optimised kernel (ie, when there's no spinlock/preempt
      instrumentation/debugging enabled).  That will remove the outer
      call/return pair, returning the instruction stream to a single
      call/return, which will presumably execute the same as the non-pvops
      case.  The downsides arel 1) it will replicate the
      preempt_disable/enable code at eack lock/unlock callsite; this code is
      fairly small, but not nothing; and 2) the spinlock definitions are
      already a very heavily tangled mass of #ifdefs and other preprocessor
      magic, and making any changes will be non-trivial.
      
      The other obvious answer is to disable pv-spinlocks.  Making them a
      separate config option is fairly easy, and it would be trivial to
      enable them only when Xen is enabled (as the only non-default user).
      But it doesn't really address the common case of a distro build which
      is going to have Xen support enabled, and leaves the open question of
      whether the native performance cost of pv-spinlocks is worth the
      performance improvement on a loaded Xen system (10% saving of overall
      system CPU when guests block rather than spin).  Still it is a
      reasonable short-term workaround.
      
      [ Impact: fix pvops performance regression when running native ]
      Analysed-by: N"Xin Xiaohui" <xiaohui.xin@intel.com>
      Analysed-by: N"Li Xin" <xin.li@intel.com>
      Analysed-by: N"Nakajima Jun" <jun.nakajima@intel.com>
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Acked-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Xen-devel <xen-devel@lists.xensource.com>
      LKML-Reference: <4A0B62F7.5030802@goop.org>
      [ fixed the help text ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b4ecc126
  3. 15 5月, 2009 13 次提交
  4. 14 5月, 2009 21 次提交