1. 30 May 2009, 2 commits
  2. 22 May 2009, 5 commits
  3. 21 May 2009, 4 commits
  4. 19 May 2009, 2 commits
  5. 18 May 2009, 10 commits
    • microblaze: Fix kind-of-intr checking against number of interrupts · 7b7210d7
      Michal Simek committed
      Fix a typographic error.
      Signed-off-by: Michal Simek <monstr@monstr.eu>
    • microblaze: Update Microblaze defconfig · 3026589c
      Michal Simek committed
      Signed-off-by: Michal Simek <monstr@monstr.eu>
    • [ARM] Double check memmap is actually valid when a memmap has unexpected holes V2 · eb33575c
      Mel Gorman committed
      pfn_valid() is meant to be able to tell if a given PFN has valid memmap
      associated with it or not. In FLATMEM, it is expected that holes always
      have valid memmap as long as there are valid PFNs on either side of the hole.
      In SPARSEMEM, it is assumed that a valid section has a memmap for the
      entire section.
      
      However, ARM and maybe other embedded architectures in the future free
      memmap backing holes to save memory on the assumption the memmap is never
      used. The page_zone linkages are then broken even though pfn_valid()
      returns true. A walker of the full memmap must then do this additional
      check to ensure the memmap it is looking at is sane by making sure the
      zone and PFN linkages are still valid. This is expensive, but walkers of
      the full memmap are extremely rare.
      
      This was caught before for FLATMEM and hacked around but it hits again for
      SPARSEMEM because the page_zone linkages can look ok where the PFN linkages
      are totally screwed. This looks like a hatchet job but the reality is that
      any clean solution would end up consuming all the memory saved by punching
      these unexpected holes in the memmap. For example, we tried marking the
      memmap within the section invalid but the section size exceeds the size of
      the hole in most cases so pfn_valid() starts returning false where valid
      memmap exists. Shrinking the size of the section would increase memory
      consumption offsetting the gains.
      
      This patch lets an architecture indicate that it punches unexpected holes
      in the memmap which the memory model cannot detect automatically, by
      setting ARCH_HAS_HOLES_MEMORYMODEL. At the moment this is restricted to
      EP93xx, the sub-architecture on which the problem has been reported, but
      the list may expand later. When the option is set, walkers of the full
      memmap must call memmap_valid_within() for each PFN, passing in the page
      and zone they expect for that PFN (a minimal usage sketch follows this
      entry). If the linkages are found to be broken, the memmap is assumed to
      be invalid for that PFN.
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
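      A minimal usage sketch (hypothetical walker; it assumes only the
      memmap_valid_within(pfn, page, zone) helper this patch introduces and
      standard zone fields): a walker of the full memmap double-checks the
      page/zone linkages behind a "valid" PFN before trusting the struct page.

          #include <linux/mm.h>
          #include <linux/mmzone.h>

          static unsigned long count_backed_pages(struct zone *zone)
          {
                  unsigned long pfn, count = 0;

                  for (pfn = zone->zone_start_pfn;
                       pfn < zone->zone_start_pfn + zone->spanned_pages; pfn++) {
                          struct page *page;

                          if (!pfn_valid(pfn))
                                  continue;
                          page = pfn_to_page(pfn);
                          /* On ARCH_HAS_HOLES_MEMORYMODEL the memmap behind a
                           * "valid" PFN may have been freed; skip broken linkages. */
                          if (!memmap_valid_within(pfn, page, zone))
                                  continue;
                          count++;
                  }
                  return count;
          }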
    • powerpc: Explicit alignment for .data.cacheline_aligned · 0e337b42
      Benjamin Herrenschmidt committed
      I don't think anything guarantees that the objects in .data.page_aligned
      are a multiple of PAGE_SIZE, thus the section may end on any boundary.
      
      So the following section, .data.cacheline_aligned needs an explicit
      alignment.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/ps3: Update ps3_defconfig · dc892288
      Geoff Levand committed
      Refresh and set these options:
      
       CONFIG_SYSFS_DEPRECATED_V2: y -> n
       CONFIG_INPUT_JOYSTICK:      y -> n
       CONFIG_HID_SONY:            n -> m
       CONFIG_RTC_DRV_PS3:         - -> m
      Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/ftrace: Fix constraint to be early clobber · c3cf8667
      Steven Rostedt committed
      After upgrading my distcc boxes from gcc 4.2.2 to 4.4.0, the function
      graph tracer broke. This was discovered on my x86 boxes.
      
      The issue is that gcc used the same register for an output as it did for
      an input in an asm statement. I first thought this was a bug in gcc and
      reported it. I was notified that gcc was correct and that the output had
      to be flagged as an "early clobber".
      
      I noticed that powerpc had the same issue and this patch fixes it
      (a generic illustration of the early-clobber constraint follows this entry).
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
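      A generic, powerpc-flavoured illustration of the constraint (not the
      actual ftrace asm): the output operand is written before the input has
      been consumed, so without the '&' early-clobber modifier GCC may assign
      the same register to both operands and silently destroy the input.

          static inline unsigned long one_plus(unsigned long in)
          {
                  unsigned long out;

                  asm ("li %0, 1\n\t"        /* out = 1: output written first    */
                       "add %0, %0, %1"      /* out += in: input read afterwards */
                       : "=&r" (out)         /* "=&r", not "=r": early clobber   */
                       : "r" (in));
                  return out;
          }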
    • powerpc/ftrace: Use pr_devel() in ftrace.c · 021376a3
      Michael Ellerman committed
      pr_debug() can now result in code being generated even when DEBUG
      is not defined. That's not really desirable in the ftrace code,
      which we want to be snappy (a short sketch of the difference follows
      this entry).
      
      With CONFIG_DYNAMIC_DEBUG=y:
      
      size before:
         text	   data	    bss	    dec	    hex	filename
         3334	    672	      4	   4010	    faa	arch/powerpc/kernel/ftrace.o
      
      size after:
         text	   data	    bss	    dec	    hex	filename
         2616	    360	      4	   2980	    ba4	arch/powerpc/kernel/ftrace.o
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
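      A small sketch of the difference (hypothetical call site, not the real
      ftrace.c code): pr_devel() compiles away entirely unless DEBUG is
      defined, whereas with CONFIG_DYNAMIC_DEBUG=y pr_debug() always emits a
      call plus per-site metadata so it can be toggled at run time, which is
      where the size delta above comes from.

          #include <linux/printk.h>

          static void report_patched_site(unsigned long ip, unsigned int old_insn,
                                          unsigned int new_insn)
          {
                  /* Generates no code at all in a non-DEBUG build. */
                  pr_devel("patched %lx: %08x -> %08x\n", ip, old_insn, new_insn);
          }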
    • powerpc: Do not assert pte_locked for hugepage PTE entries · af3e4aca
      Mel Gorman committed
      With CONFIG_DEBUG_VM, an assertion is made when changing the protection
      flags of a PTE that the PTE is locked. Huge pages use a different pagetable
      format, so the assertion is bogus there and will always trigger, producing
      an oops that looks something like
      
       Unable to handle kernel paging request for data at address 0xf1a00235800006f8
       Faulting instruction address: 0xc000000000034a80
       Oops: Kernel access of bad area, sig: 11 [#1]
       SMP NR_CPUS=32 NUMA Maple
       Modules linked in: dm_snapshot dm_mirror dm_region_hash
        dm_log dm_mod loop evdev ext3 jbd mbcache sg sd_mod ide_pci_generic
        pata_amd ata_generic ipr libata tg3 libphy scsi_mod windfarm_pid
        windfarm_smu_sat windfarm_max6690_sensor windfarm_lm75_sensor
        windfarm_cpufreq_clamp windfarm_core i2c_powermac
       NIP: c000000000034a80 LR: c000000000034b18 CTR: 0000000000000003
       REGS: c000000003037600 TRAP: 0300   Not tainted (2.6.30-rc3-autokern1)
       MSR: 9000000000009032 <EE,ME,IR,DR>  CR: 28002484  XER: 200fffff
       DAR: f1a00235800006f8, DSISR: 0000000040010000
       TASK = c0000002e54cc740[2960] 'map_high_trunca' THREAD: c000000003034000 CPU: 2
       GPR00: 4000000000000000 c000000003037880 c000000000895d30 c0000002e5a2e500
       GPR04: 00000000a0000000 c0000002edc40880 0000005700000393 0000000000000001
       GPR08: f000000011ac0000 01a00235800006e8 00000000000000f5 f1a00235800006e8
       GPR12: 0000000028000484 c0000000008dd780 0000000000001000 0000000000000000
       GPR16: fffffffffffff000 0000000000000000 00000000a0000000 c000000003037a20
       GPR20: c0000002e5f4ece8 0000000000001000 c0000002edc40880 0000000000000000
       GPR24: c0000002e5f4ece8 0000000000000000 00000000a0000000 c0000002e5f4ece8
       GPR28: 0000005700000393 c0000002e5a2e500 00000000a0000000 c000000003037880
       NIP [c000000000034a80] .assert_pte_locked+0xa4/0xd0
       LR [c000000000034b18] .ptep_set_access_flags+0x6c/0xb4
       Call Trace:
       [c000000003037880] [c000000003037990] 0xc000000003037990 (unreliable)
       [c000000003037910] [c000000000034b18] .ptep_set_access_flags+0x6c/0xb4
       [c0000000030379b0] [c00000000014bef8] .hugetlb_cow+0x124/0x674
       [c000000003037b00] [c00000000014c930] .hugetlb_fault+0x4e8/0x6f8
       [c000000003037c00] [c00000000013443c] .handle_mm_fault+0xac/0x828
       [c000000003037cf0] [c0000000000340a8] .do_page_fault+0x39c/0x584
       [c000000003037e30] [c0000000000057b0] handle_page_fault+0x20/0x5c
       Instruction dump:
       7d29582a 7d200074 7800d182 0b000000 3c004000 3960ffff 780007c6 796b00c4
       7d290214 7929a302 1d290068 7d6b4a14 <800b0010> 7c000074 7800d182 0b000000
      
      This patch fixes the problem by not asserting that the PTE is locked for
      VMAs backed by huge pages (a rough sketch of the shape of the check
      follows this entry).
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
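      A rough sketch of the shape of the check (simplified names, not the
      exact powerpc patch; is_vm_hugetlb_page() and assert_pte_locked() are
      existing kernel helpers): the pte-lock assertion is simply skipped for
      VMAs backed by huge pages, whose PTEs live in a different pagetable
      format.

          #include <linux/mm.h>
          #include <linux/hugetlb.h>

          static void check_pte_locked(struct vm_area_struct *vma, unsigned long addr)
          {
          #ifdef CONFIG_DEBUG_VM
                  if (is_vm_hugetlb_page(vma))    /* hugepage PTE: walk would be bogus */
                          return;
                  assert_pte_locked(vma->vm_mm, addr);
          #endif
          }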
    • [ARM] realview: fix broadcast tick support · ee348d5a
      Russell King committed
      Having discussed broadcast tick support with Thomas Gleixner, the
      broadcast tick devices should be registered with a higher rating
      than the global tick device, and they should have the ONESHOT and
      PERIODIC feature flags set (see the registration sketch after this
      entry).
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
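      A hedged registration sketch (field values illustrative only, callbacks
      omitted): per the message above, the per-CPU broadcast dummy device is
      given a rating above that of the global tick device and advertises both
      feature flags before being passed to clockevents_register_device().

          #include <linux/clockchips.h>

          static struct clock_event_device broadcast_evt = {
                  .name     = "dummy_timer",
                  .features = CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_PERIODIC,
                  .rating   = 400,  /* higher than the global tick device */
          };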
  6. 17 May 2009, 4 commits
  7. 16 May 2009, 4 commits
    • ARM: OMAP2/3: Change omapfb to use clkdev for dispc and rfbi, v2 · 005187ee
      Tony Lindgren committed
      This makes the framebuffer work on omap3.
      
      Also fix the clk_get usage flagged by checkpatch.pl with
      "ERROR: do not use assignment in if condition" (a small before/after
      sketch follows this entry).
      
      Cc: Imre Deak <imre.deak@nokia.com>
      Cc: linux-fbdev-devel@lists.sourceforge.net
      Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
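      A small before/after sketch of the checkpatch fix (device and clock
      names are hypothetical): the assignment is hoisted out of the if()
      condition and the result checked with IS_ERR(), since clk_get() returns
      an ERR_PTR value on failure.

          #include <linux/clk.h>
          #include <linux/err.h>

          static int get_fb_clock(struct device *dev, struct clk **out)
          {
                  struct clk *clk;

                  /* before:  if ((clk = clk_get(dev, "dss1_fck")) == NULL) ...  */
                  clk = clk_get(dev, "dss1_fck");
                  if (IS_ERR(clk))
                          return PTR_ERR(clk);

                  *out = clk;
                  return 0;
          }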
    • ARM: OMAP3: Fix HW SAVEANDRESTORE shift define · 8dbe4393
      Kalle Jokiniemi committed
      The OMAP3430ES2_SAVEANDRESTORE_SHIFT macro is used by powerdomain code
      as "1 << OMAP3430ES2_SAVEANDRESTORE_SHIFT", but the definition was also
      a shifted value, (1 << 4), meaning we actually modified bit 16. So the
      definition needs to be plain 4 (a distilled illustration follows this
      entry).
      
      This also fixes a cold reset HW bug in OMAP3430 ES3.x
      where some of the efuse bits are not isolated during
      wake-up from off mode. This can cause randomish
      cold resets with off mode. Enabling the USBTLL hardware
      SAVEANDRESTORE causes the core power up assert to be
      delayed in a way that we will not get faulty values
      when boot ROM is reading the unisolated registers.
      Signed-off-by: Kalle Jokiniemi <kalle.jokiniemi@digia.com>
      Acked-by: Kevin Hilman <khilman@deeprootsystems.com>
      Acked-by: Paul Walmsley <paul@pwsan.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
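      A distilled illustration of the double-shift (macro name shortened):
      because the macro is already used as a shift count, defining it as a
      shifted value shifts twice and pokes bit 16 instead of bit 4.

          #define SAR_SHIFT_WRONG   (1 << 4)   /* 1 << SAR_SHIFT_WRONG == 1 << 16: bit 16 */
          #define SAR_SHIFT         4          /* 1 << SAR_SHIFT       == 1 << 4:  bit 4  */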
    • ARM: OMAP3: Fix number of GPIO lines for 34xx · e102657e
      Vikram Pandita committed
      As per the 3430 TRM, there are 6 banks [GPIO lines 0 to 191] (see the
      define sketch after this entry).
      Signed-off-by: Tom Rix <Tom.Rix@windriver.com>
      Signed-off-by: Vikram Pandita <vikram.pandita@ti.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
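      The arithmetic behind the fix as a one-line define (the name is
      hypothetical): six banks of 32 GPIO lines each gives 192 lines,
      numbered 0 through 191.

          #define OMAP34XX_NR_GPIOS   (6 * 32)   /* 192 lines: GPIO 0..191 */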
    • x86: Fix performance regression caused by paravirt_ops on native kernels · b4ecc126
      Jeremy Fitzhardinge committed
      Xiaohui Xin and some other folks at Intel have been looking into what's
      behind the performance hit of paravirt_ops when running native.
      
      It appears that the hit is entirely due to the paravirtualized
      spinlocks introduced by:
      
       | commit 8efcbab6
       | Date:   Mon Jul 7 12:07:51 2008 -0700
       |
       |     paravirt: introduce a "lock-byte" spinlock implementation
      
      The extra call/return in the spinlock path is somehow
      causing an increase in the cycles/instruction of somewhere around 2-7%
      (seems to vary quite a lot from test to test).  The working theory is
      that the CPU's pipeline is getting upset about the
      call->call->locked-op->return->return, and seems to be failing to
      speculate (though I haven't seen anything definitive about the precise
      reasons).  This doesn't entirely make sense, because the performance
      hit is also visible on unlock and other operations which don't involve
      locked instructions.  But spinlock operations clearly swamp all the
      other pvops operations, even though I can't imagine that they're
      nearly as common (there's only a .05% increase in instructions
      executed).
      
      If I disable just the pv-spinlock calls, my tests show that pvops is
      identical to non-pvops performance on native (my measurements show that
      it is actually about .1% faster, but Xiaohui shows a .05% slowdown).
      
      Summary of results, averaging 10 runs of the "mmperf" test, using a
      no-pvops build as baseline:
      
      		nopv		Pv-nospin	Pv-spin
      CPU cycles	100.00%		99.89%		102.18%
      instructions	100.00%		100.10%		100.15%
      CPI		100.00%		99.79%		102.03%
      cache ref	100.00%		100.84%		100.28%
      cache miss	100.00%		90.47%		88.56%
      cache miss rate	100.00%		89.72%		88.31%
      branches	100.00%		99.93%		100.04%
      branch miss	100.00%		103.66%		107.72%
      branch miss rt	100.00%		103.73%		107.67%
      wallclock	100.00%		99.90%		102.20%
      
      The clear effect here is that the 2% increase in CPI is
      directly reflected in the final wallclock time.
      
      (The other interesting effect is that the more ops are
      out of line calls via pvops, the lower the cache access
      and miss rates.  Not too surprising, but it suggests that
      the non-pvops kernel is over-inlined.  On the flipside,
      the branch misses go up correspondingly...)
      
      So, what's the fix?
      
      Paravirt patching turns all the pvops calls into direct calls, so
      _spin_lock etc do end up having direct calls.  For example, the compiler
      generated code for paravirtualized _spin_lock is:
      
      <_spin_lock+0>:		mov    %gs:0xb4c8,%rax
      <_spin_lock+9>:		incl   0xffffffffffffe044(%rax)
      <_spin_lock+15>:	callq  *0xffffffff805a5b30
      <_spin_lock+22>:	retq
      
      The indirect call will get patched to:
      <_spin_lock+0>:		mov    %gs:0xb4c8,%rax
      <_spin_lock+9>:		incl   0xffffffffffffe044(%rax)
      <_spin_lock+15>:	callq <__ticket_spin_lock>
      <_spin_lock+20>:	nop; nop		/* or whatever 2-byte nop */
      <_spin_lock+22>:	retq
      
      One possibility is to inline _spin_lock, etc, when building an
      optimised kernel (ie, when there's no spinlock/preempt
      instrumentation/debugging enabled).  That will remove the outer
      call/return pair, returning the instruction stream to a single
      call/return, which will presumably execute the same as the non-pvops
      case.  The downsides are: 1) it will replicate the
      preempt_disable/enable code at each lock/unlock callsite; this code is
      fairly small, but not nothing; and 2) the spinlock definitions are
      already a very heavily tangled mass of #ifdefs and other preprocessor
      magic, and making any changes will be non-trivial.
      
      The other obvious answer is to disable pv-spinlocks.  Making them a
      separate config option is fairly easy, and it would be trivial to
      enable them only when Xen is enabled (as the only non-default user).
      But it doesn't really address the common case of a distro build which
      is going to have Xen support enabled, and leaves the open question of
      whether the native performance cost of pv-spinlocks is worth the
      performance improvement on a loaded Xen system (10% saving of overall
      system CPU when guests block rather than spin).  Still, it is a
      reasonable short-term workaround (a toy C model of the per-lock call
      overhead follows this entry).
      
      [ Impact: fix pvops performance regression when running native ]
      Analysed-by: "Xin Xiaohui" <xiaohui.xin@intel.com>
      Analysed-by: "Li Xin" <xin.li@intel.com>
      Analysed-by: "Nakajima Jun" <jun.nakajima@intel.com>
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Acked-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Xen-devel <xen-devel@lists.xensource.com>
      LKML-Reference: <4A0B62F7.5030802@goop.org>
      [ fixed the help text ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
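      A toy C model (simplified; not the kernel's actual pv_lock_ops layout)
      of where the native-case cost comes from: every lock and unlock goes
      through a pvops function pointer, adding a call/return pair around the
      ticket-lock body even after patching turns the indirect call into a
      direct one.

          struct pv_lock_ops_sketch {
                  void (*spin_lock)(unsigned int *lock);
                  void (*spin_unlock)(unsigned int *lock);
          };

          extern struct pv_lock_ops_sketch pv_ops_sketch;

          static inline void sketch_spin_lock(unsigned int *lock)
          {
                  /* Native: call into the ticket-lock function and return, i.e.
                   * two extra control transfers versus inlining the lock body. */
                  pv_ops_sketch.spin_lock(lock);
          }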
  8. 15 May 2009, 9 commits