1. 30 May 2009, 2 commits
    • [ARM] alternative copy_to_user/clear_user implementation · 39ec58f3
      Authored by Lennert Buytenhek
      This implements {copy_to,clear}_user() by faulting in the userland
      pages and then using the regular kernel mem{cpy,set}() to copy the
      data (while holding the page table lock).  This is a win if the regular
      mem{cpy,set}() implementations are faster than the user copy functions,
      which is the case e.g. on Feroceon, where 8-word STMs (which memcpy()
      uses under the right conditions) give significantly higher memory write
      throughput than a sequence of individual 32bit stores.
      
      Here are numbers for page sized buffers on some Feroceon cores:
      
       - copy_to_user on Orion5x goes from 51 MB/s to 83 MB/s
 - clear_user on Orion5x goes from 89 MB/s to 314 MB/s
       - copy_to_user on Kirkwood goes from 240 MB/s to 356 MB/s
       - clear_user on Kirkwood goes from 367 MB/s to 1108 MB/s
       - copy_to_user on Disco-Duo goes from 248 MB/s to 398 MB/s
       - clear_user on Disco-Duo goes from 328 MB/s to 1741 MB/s
      
      Because the setup cost is non-negligible, this is worthwhile only if
      the amount of data to copy is large enough.  The operation falls back
      to the standard implementation when the amount of data is below a
      certain threshold.  This threshold was determined empirically; some
      targets could eventually benefit from a lower, runtime-determined
      value for optimal results (a hedged sketch of this fallback scheme
      follows this entry).
      
      In the copy_from_user() case, this technique does not provide any
      worthwhile performance gain due to the fact that any kind of read access
      allocates the cache and subsequent 32bit loads are just as fast as the
      equivalent 8-word LDM.
      Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
      Signed-off-by: Nicolas Pitre <nico@marvell.com>
      Tested-by: Martin Michlmayr <tbm@cyrius.com>
      39ec58f3
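      To make the fallback scheme concrete, here is a hedged C sketch of the
      idea: above an empirically chosen threshold, fault the user pages in
      and copy with the regular kernel memcpy() under the page table lock;
      below it, call the standard routine.  The helper pin_page_for_write()
      and the threshold constant are assumptions for the sake of the sketch,
      not quotes from the actual ARM patch.

        unsigned long __copy_to_user(void __user *to, const void *from,
                                     unsigned long n)
        {
                /* Small copies: setup cost dominates, so use the standard
                 * STRT-based routine (see the weak alias in the next commit). */
                if (n < COPY_TO_USER_THRESHOLD)         /* empirically chosen */
                        return __copy_to_user_std(to, from, n);

                while (n) {
                        pte_t *pte;
                        spinlock_t *ptl;
                        unsigned long chunk;

                        /* Fault the destination page in and take the page
                         * table lock so the page cannot go away under us. */
                        if (!pin_page_for_write(to, &pte, &ptl))
                                break;

                        /* Copy at most up to the end of the current page. */
                        chunk = PAGE_SIZE - ((unsigned long)to & ~PAGE_MASK);
                        if (chunk > n)
                                chunk = n;

                        /* Plain kernel memcpy() can now use 8-word STMs. */
                        memcpy((void __force *)to, from, chunk);

                        pte_unmap_unlock(pte, ptl);
                        to += chunk;
                        from += chunk;
                        n -= chunk;
                }
                return n;       /* bytes NOT copied, per the usual convention */
        }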
    • [ARM] allow for alternative __copy_to_user/__clear_user implementations · a1f98849
      Authored by Nicolas Pitre
      This allows for optional alternative implementations of __copy_to_user
      and __clear_user, with a possible runtime fallback to the standard
      version when the alternative provides no gain over that standard
      version. This is done by making the standard __copy_to_user into a weak
      alias for the symbol __copy_to_user_std.  Same thing for __clear_user.
      
      Those two functions are particularly good candidates for alternative
      implementations, since they rely on the STRT instruction, which has
      lower performance than STM instructions on some CPU cores such as
      the ARM1176 and Marvell Feroceon (see the sketch after this entry).
      Signed-off-by: Nicolas Pitre <nico@marvell.com>
      a1f98849
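      For reference, here is a self-contained user-space illustration of the
      weak-alias technique (the kernel does the equivalent in assembly for
      __copy_to_user/__copy_to_user_std; the plain names below are simplified
      stand-ins):

        #include <string.h>

        /* "Standard" implementation, always reachable under its own name. */
        unsigned long copy_to_user_std(void *to, const void *from, unsigned long n)
        {
                memcpy(to, from, n);    /* stand-in for the real STRT loop */
                return 0;               /* 0 bytes left uncopied */
        }

        /*
         * copy_to_user is only a weak alias for copy_to_user_std, so an object
         * file providing a strong copy_to_user() (e.g. an optimised variant)
         * silently overrides it at link time, and can still call
         * copy_to_user_std() as a runtime fallback.
         */
        unsigned long copy_to_user(void *to, const void *from, unsigned long n)
                __attribute__((weak, alias("copy_to_user_std")));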
  2. 22 May 2009, 5 commits
  3. 21 May 2009, 4 commits
  4. 19 May 2009, 2 commits
  5. 18 May 2009, 10 commits
    • microblaze: Fix kind-of-intr checking against number of interrupts · 7b7210d7
      Authored by Michal Simek
      Also fix a typographic error.
      Signed-off-by: Michal Simek <monstr@monstr.eu>
      7b7210d7
    • microblaze: Update Microblaze defconfig · 3026589c
      Authored by Michal Simek
      Signed-off-by: Michal Simek <monstr@monstr.eu>
      3026589c
    • [ARM] Double check memmap is actually valid with a memmap has unexpected holes V2 · eb33575c
      Authored by Mel Gorman
      pfn_valid() is meant to be able to tell if a given PFN has valid memmap
      associated with it or not. In FLATMEM, it is expected that holes always
      have valid memmap as long as there is valid PFNs either side of the hole.
      In SPARSEMEM, it is assumed that a valid section has a memmap for the
      entire section.
      
      However, ARM and maybe other embedded architectures in the future free
      memmap backing holes to save memory on the assumption the memmap is never
      used. The page_zone linkages are then broken even though pfn_valid()
      returns true. A walker of the full memmap must then do this additional
      check to ensure the memmap they are looking at is sane by making sure the
      zone and PFN linkages are still valid. This is expensive, but walkers of
      the full memmap are extremely rare.
      
      This was caught before for FLATMEM and hacked around but it hits again for
      SPARSEMEM because the page_zone linkages can look ok where the PFN linkages
      are totally screwed. This looks like a hatchet job but the reality is that
      any clean solution would end up consuming all the memory saved by punching
      these unexpected holes in the memmap. For example, we tried marking the
      memmap within the section invalid but the section size exceeds the size of
      the hole in most cases so pfn_valid() starts returning false where valid
      memmap exists. Shrinking the size of the section would increase memory
      consumption offsetting the gains.
      
      This patch identifies when an architecture is punching unexpected holes
      in the memmap that the memory model cannot automatically detect and sets
      ARCH_HAS_HOLES_MEMORYMODEL.  At the moment this is restricted to EP93xx,
      the sub-architecture this has been reported on, but it may expand later.
      When set, walkers of the full memmap must call memmap_valid_within() for
      each PFN, passing in the page and zone they expect for that PFN.  If the
      linkages turn out to be broken, the memmap is assumed to be invalid for
      that PFN (a hedged sketch of such a check follows this entry).
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      eb33575c
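      The check the commit describes boils down to verifying that a struct
      page's linkages round-trip correctly; a hedged sketch, reconstructed
      from the description above (the in-tree helper may differ in detail):

        /* Non-zero if the memmap entry for this PFN looks sane: the page must
         * map back to the same PFN and belong to the zone the walker expects. */
        static inline int memmap_valid_within(unsigned long pfn,
                                              struct page *page,
                                              struct zone *zone)
        {
                if (page_to_pfn(page) != pfn)
                        return 0;
                if (page_zone(page) != zone)
                        return 0;
                return 1;
        }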
    • powerpc: Explicit alignment for .data.cacheline_aligned · 0e337b42
      Authored by Benjamin Herrenschmidt
      I don't think anything guarantees that the objects in .data.page_aligned
      are a multiple of PAGE_SIZE, thus the section may end on any boundary.
      
      So the following section, .data.cacheline_aligned needs an explicit
      alignment.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      0e337b42
    • powerpc/ps3: Update ps3_defconfig · dc892288
      Authored by Geoff Levand
      Refresh and set these options:
      
       CONFIG_SYSFS_DEPRECATED_V2: y -> n
       CONFIG_INPUT_JOYSTICK:      y -> n
       CONFIG_HID_SONY:            n -> m
       CONFIG_RTC_DRV_PS3:         - -> m
      Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      dc892288
    • powerpc/ftrace: Fix constraint to be early clobber · c3cf8667
      Authored by Steven Rostedt
      After upgrading my distcc boxes from gcc 4.2.2 to 4.4.0, the function
      graph tracer broke. This was discovered on my x86 boxes.
      
      The issue is that gcc used the same register for an output as it did for
      an input in an asm statement. I first thought this was a bug in gcc and
      reported it. I was notified that gcc was correct and that the output had
      to be flagged as an "early clobber".
      
      I noticed that powerpc had the same issue, and this patch fixes it (an
      illustration of the early-clobber constraint follows this entry).
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      c3cf8667
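      For reference, this is what the early-clobber marker looks like in GCC
      inline asm.  The fragment below is an illustrative powerpc example, not
      the actual ftrace code; the function and operand names are made up.

        /*
         * Without the '&', GCC may allocate "tmp" and "in" to the same
         * register, so writing the output early would corrupt the input
         * that is still needed.  "=&r" tells GCC the output is written
         * before all inputs have been consumed.
         */
        static inline unsigned long copy_then_add(unsigned long in)
        {
                unsigned long tmp;

                asm("mr   %0,%1\n\t"    /* tmp  = in (output written early)  */
                    "add  %0,%0,%1"     /* tmp += in (input must still live) */
                    : "=&r" (tmp)       /* early-clobber output              */
                    : "r" (in));
                return tmp;
        }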
    • powerpc/ftrace: Use pr_devel() in ftrace.c · 021376a3
      Authored by Michael Ellerman
      pr_debug() can now result in code being generated even when DEBUG
      is not defined.  That's not really desirable in the ftrace code,
      which we want to be snappy (a hedged sketch of the difference
      follows this entry).
      
      With CONFIG_DYNAMIC_DEBUG=y:
      
      size before:
         text	   data	    bss	    dec	    hex	filename
         3334	    672	      4	   4010	    faa	arch/powerpc/kernel/ftrace.o
      
      size after:
         text	   data	    bss	    dec	    hex	filename
         2616	    360	      4	   2980	    ba4	arch/powerpc/kernel/ftrace.o
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      021376a3
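      Roughly, the distinction is this (a simplified sketch, not the exact
      kernel macro definitions): pr_devel() compiles away entirely unless
      DEBUG is defined, while with CONFIG_DYNAMIC_DEBUG=y every pr_debug()
      call site emits code plus per-callsite metadata so it can be toggled
      at runtime, which is where the size growth above comes from.

        #ifdef DEBUG
        #define pr_devel(fmt, ...)  printk(KERN_DEBUG fmt, ##__VA_ARGS__)
        #else
        #define pr_devel(fmt, ...)  do { } while (0)   /* no code emitted */
        #endif

        /* pr_debug() under CONFIG_DYNAMIC_DEBUG always generates a call and a
         * descriptor entry for runtime control, even in a non-DEBUG build. */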
    • powerpc: Do not assert pte_locked for hugepage PTE entries · af3e4aca
      Authored by Mel Gorman
      With CONFIG_DEBUG_VM, an assertion is made when changing the protection
      flags of a PTE that the PTE is locked.  Huge pages use a different
      pagetable format, so the assertion is bogus for them and will always
      trigger, producing an oops that looks something like
      
       Unable to handle kernel paging request for data at address 0xf1a00235800006f8
       Faulting instruction address: 0xc000000000034a80
       Oops: Kernel access of bad area, sig: 11 [#1]
       SMP NR_CPUS=32 NUMA Maple
       Modules linked in: dm_snapshot dm_mirror dm_region_hash
        dm_log dm_mod loop evdev ext3 jbd mbcache sg sd_mod ide_pci_generic
        pata_amd ata_generic ipr libata tg3 libphy scsi_mod windfarm_pid
        windfarm_smu_sat windfarm_max6690_sensor windfarm_lm75_sensor
        windfarm_cpufreq_clamp windfarm_core i2c_powermac
       NIP: c000000000034a80 LR: c000000000034b18 CTR: 0000000000000003
       REGS: c000000003037600 TRAP: 0300   Not tainted (2.6.30-rc3-autokern1)
       MSR: 9000000000009032 <EE,ME,IR,DR>  CR: 28002484  XER: 200fffff
       DAR: f1a00235800006f8, DSISR: 0000000040010000
       TASK = c0000002e54cc740[2960] 'map_high_trunca' THREAD: c000000003034000 CPU: 2
       GPR00: 4000000000000000 c000000003037880 c000000000895d30 c0000002e5a2e500
       GPR04: 00000000a0000000 c0000002edc40880 0000005700000393 0000000000000001
       GPR08: f000000011ac0000 01a00235800006e8 00000000000000f5 f1a00235800006e8
       GPR12: 0000000028000484 c0000000008dd780 0000000000001000 0000000000000000
       GPR16: fffffffffffff000 0000000000000000 00000000a0000000 c000000003037a20
       GPR20: c0000002e5f4ece8 0000000000001000 c0000002edc40880 0000000000000000
       GPR24: c0000002e5f4ece8 0000000000000000 00000000a0000000 c0000002e5f4ece8
       GPR28: 0000005700000393 c0000002e5a2e500 00000000a0000000 c000000003037880
       NIP [c000000000034a80] .assert_pte_locked+0xa4/0xd0
       LR [c000000000034b18] .ptep_set_access_flags+0x6c/0xb4
       Call Trace:
       [c000000003037880] [c000000003037990] 0xc000000003037990 (unreliable)
       [c000000003037910] [c000000000034b18] .ptep_set_access_flags+0x6c/0xb4
       [c0000000030379b0] [c00000000014bef8] .hugetlb_cow+0x124/0x674
       [c000000003037b00] [c00000000014c930] .hugetlb_fault+0x4e8/0x6f8
       [c000000003037c00] [c00000000013443c] .handle_mm_fault+0xac/0x828
       [c000000003037cf0] [c0000000000340a8] .do_page_fault+0x39c/0x584
       [c000000003037e30] [c0000000000057b0] handle_page_fault+0x20/0x5c
       Instruction dump:
       7d29582a 7d200074 7800d182 0b000000 3c004000 3960ffff 780007c6 796b00c4
       7d290214 7929a302 1d290068 7d6b4a14 <800b0010> 7c000074 7800d182 0b000000
      
      This patch fixes the problem by not asserting that the PTE is locked for
      VMAs backed by huge pages (a hedged sketch of the check follows this entry).
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      af3e4aca
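      A hedged sketch of the shape of the fix: under CONFIG_DEBUG_VM the
      assertion is simply skipped for hugetlb VMAs, whose PTEs are not covered
      by the ordinary page table lock.  The surrounding function is abridged;
      only the names visible in the oops above are taken from the source.

        int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
                                  pte_t *ptep, pte_t entry, int dirty)
        {
                int changed = !pte_same(*ptep, entry);

                if (changed) {
        #ifdef CONFIG_DEBUG_VM
                        /* Hugepage PTEs use a different pagetable format, so
                         * the "PTE page is locked" assertion does not apply. */
                        if (!is_vm_hugetlb_page(vma))
                                assert_pte_locked(vma->vm_mm, address);
        #endif
                        /* ... update the PTE and flush the TLB as before ... */
                }
                return changed;
        }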
    • [ARM] realview: fix broadcast tick support · ee348d5a
      Authored by Russell King
      Having discussed broadcast tick support with Thomas Gleixner, the
      broadcast tick device should be registered with a higher rating
      than the global tick device, and it should have the ONESHOT and
      PERIODIC feature flags set (a hedged sketch follows this entry).
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      ee348d5a
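      A hedged sketch of what that registration amounts to; the function name
      and the rating value are illustrative (the rating merely has to exceed
      that of the global tick device), and the rest of the clock_event_device
      setup is omitted.

        static void broadcast_timer_setup(struct clock_event_device *evt)
        {
                /* Advertise both modes so the broadcast code can use either. */
                evt->features = CLOCK_EVT_FEAT_ONESHOT |
                                CLOCK_EVT_FEAT_PERIODIC |
                                CLOCK_EVT_FEAT_DUMMY;
                /* Rate the per-CPU dummy device above the global tick device
                 * so it is selected and ticks arrive via the broadcast path. */
                evt->rating = 400;
                evt->cpumask = cpumask_of(smp_processor_id());

                clockevents_register_device(evt);
        }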
  6. 17 May 2009, 4 commits
  7. 16 May 2009, 4 commits
    • ARM: OMAP2/3: Change omapfb to use clkdev for dispc and rfbi, v2 · 005187ee
      Authored by Tony Lindgren
      This makes the framebuffer work on omap3.
      
      Also fix the clk_get usage flagged by checkpatch.pl:
      "ERROR: do not use assignment in if condition" (illustrated after this entry).
      
      Cc: Imre Deak <imre.deak@nokia.com>
      Cc: linux-fbdev-devel@lists.sourceforge.net
      Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      005187ee
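      As an illustration of the checkpatch-friendly pattern referred to above
      (a hedged sketch; the clock name and error handling are only examples):

        /* Flagged:  if (IS_ERR(clk = clk_get(dev, "dispc_ick"))) ...        */
        /* Preferred: keep the assignment out of the if condition.           */
        clk = clk_get(dev, "dispc_ick");        /* clock name is illustrative */
        if (IS_ERR(clk)) {
                dev_err(dev, "can't get dispc_ick\n");
                return PTR_ERR(clk);
        }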
    • ARM: OMAP3: Fix HW SAVEANDRESTORE shift define · 8dbe4393
      Authored by Kalle Jokiniemi
      The OMAP3430ES2_SAVEANDRESTORE_SHIFT macro is used by the powerdomain
      code as "1 << OMAP3430ES2_SAVEANDRESTORE_SHIFT", but the definition was
      also (1 << 4), meaning we actually modified bit 16.  So the definition
      needs to be just 4 (illustrated after this entry).
      
      This also fixes a cold reset HW bug in OMAP3430 ES3.x
      where some of the efuse bits are not isolated during
      wake-up from off mode. This can cause randomish
      cold resets with off mode. Enabling the USBTLL hardware
      SAVEANDRESTORE causes the core power up assert to be
      delayed in a way that we will not get faulty values
      when boot ROM is reading the unisolated registers.
      Signed-off-by: Kalle Jokiniemi <kalle.jokiniemi@digia.com>
      Acked-by: Kevin Hilman <khilman@deeprootsystems.com>
      Acked-by: Paul Walmsley <paul@pwsan.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      8dbe4393
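      Spelled out, the off-by-a-shift bug looks like this (a hedged
      illustration based on the commit text; the _OLD suffix is invented
      to show both versions side by side):

        /* Before: the "shift" macro was already a bit mask ...              */
        #define OMAP3430ES2_SAVEANDRESTORE_SHIFT_OLD    (1 << 4)    /* == 16 */
        /* ... so the powerdomain code's "1 << SHIFT" touched bit 16:        */
        /*      1 << OMAP3430ES2_SAVEANDRESTORE_SHIFT_OLD  ==  1 << 16       */

        /* After: the macro is the shift amount, and bit 4 is modified:      */
        #define OMAP3430ES2_SAVEANDRESTORE_SHIFT        4
        /*      1 << OMAP3430ES2_SAVEANDRESTORE_SHIFT      ==  1 << 4        */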
    • ARM: OMAP3: Fix number of GPIO lines for 34xx · e102657e
      Authored by Vikram Pandita
      As per the 3430 TRM, there are 6 GPIO banks of 32 lines each, i.e.
      GPIO lines 0 to 191.
      Signed-off-by: Tom Rix <Tom.Rix@windriver.com>
      Signed-off-by: Vikram Pandita <vikram.pandita@ti.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      e102657e
    • x86: Fix performance regression caused by paravirt_ops on native kernels · b4ecc126
      Authored by Jeremy Fitzhardinge
      Xiaohui Xin and some other folks at Intel have been looking into what's
      behind the performance hit of paravirt_ops when running native.
      
      It appears that the hit is entirely due to the paravirtualized
      spinlocks introduced by:
      
       | commit 8efcbab6
       | Date:   Mon Jul 7 12:07:51 2008 -0700
       |
       |     paravirt: introduce a "lock-byte" spinlock implementation
      
      The extra call/return in the spinlock path is somehow
      causing an increase in the cycles/instruction of somewhere around 2-7%
      (seems to vary quite a lot from test to test).  The working theory is
      that the CPU's pipeline is getting upset about the
      call->call->locked-op->return->return, and seems to be failing to
      speculate (though I haven't seen anything definitive about the precise
      reasons).  This doesn't entirely make sense, because the performance
      hit is also visible on unlock and other operations which don't involve
      locked instructions.  But spinlock operations clearly swamp all the
      other pvops operations, even though I can't imagine that they're
      nearly as common (there's only a .05% increase in instructions
      executed).
      
      If I disable just the pv-spinlock calls, my tests show that pvops is
      identical to non-pvops performance on native (my measurements show that
      it is actually about .1% faster, but Xiaohui shows a .05% slowdown).
      
      Summary of results, averaging 10 runs of the "mmperf" test, using a
      no-pvops build as baseline:
      
      		nopv		Pv-nospin	Pv-spin
      CPU cycles	100.00%		99.89%		102.18%
      instructions	100.00%		100.10%		100.15%
      CPI		100.00%		99.79%		102.03%
      cache ref	100.00%		100.84%		100.28%
      cache miss	100.00%		90.47%		88.56%
      cache miss rate	100.00%		89.72%		88.31%
      branches	100.00%		99.93%		100.04%
      branch miss	100.00%		103.66%		107.72%
      branch miss rt	100.00%		103.73%		107.67%
      wallclock	100.00%		99.90%		102.20%
      
      The clear effect here is that the 2% increase in CPI is
      directly reflected in the final wallclock time.
      
      (The other interesting effect is that the more ops are
      out of line calls via pvops, the lower the cache access
      and miss rates.  Not too surprising, but it suggests that
      the non-pvops kernel is over-inlined.  On the flipside,
      the branch misses go up correspondingly...)
      
      So, what's the fix?
      
      Paravirt patching turns all the pvops calls into direct calls, so
      _spin_lock etc do end up having direct calls.  For example, the compiler
      generated code for paravirtualized _spin_lock is:
      
      <_spin_lock+0>:		mov    %gs:0xb4c8,%rax
      <_spin_lock+9>:		incl   0xffffffffffffe044(%rax)
      <_spin_lock+15>:	callq  *0xffffffff805a5b30
      <_spin_lock+22>:	retq
      
      The indirect call will get patched to:
      <_spin_lock+0>:		mov    %gs:0xb4c8,%rax
      <_spin_lock+9>:		incl   0xffffffffffffe044(%rax)
      <_spin_lock+15>:	callq <__ticket_spin_lock>
      <_spin_lock+20>:	nop; nop		/* or whatever 2-byte nop */
      <_spin_lock+22>:	retq
      
      One possibility is to inline _spin_lock, etc, when building an
      optimised kernel (ie, when there's no spinlock/preempt
      instrumentation/debugging enabled).  That will remove the outer
      call/return pair, returning the instruction stream to a single
      call/return, which will presumably execute the same as the non-pvops
      case.  The downsides are: 1) it will replicate the
      preempt_disable/enable code at each lock/unlock callsite; this code is
      fairly small, but not nothing; and 2) the spinlock definitions are
      already a very heavily tangled mass of #ifdefs and other preprocessor
      magic, and making any changes will be non-trivial.
      
      The other obvious answer is to disable pv-spinlocks.  Making them a
      separate config option is fairly easy, and it would be trivial to
      enable them only when Xen is enabled (as the only non-default user).
      But it doesn't really address the common case of a distro build which
      is going to have Xen support enabled, and leaves the open question of
      whether the native performance cost of pv-spinlocks is worth the
      performance improvement on a loaded Xen system (10% saving of overall
      system CPU when guests block rather than spin).  Still, it is a
      reasonable short-term workaround (a hedged sketch of the pvops spinlock
      dispatch follows this entry).
      
      [ Impact: fix pvops performance regression when running native ]
      Analysed-by: N"Xin Xiaohui" <xiaohui.xin@intel.com>
      Analysed-by: N"Li Xin" <xin.li@intel.com>
      Analysed-by: N"Nakajima Jun" <jun.nakajima@intel.com>
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Acked-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Xen-devel <xen-devel@lists.xensource.com>
      LKML-Reference: <4A0B62F7.5030802@goop.org>
      [ fixed the help text ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      b4ecc126
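      A hedged sketch of the dispatch structure under discussion: spinlock
      operations become out-of-line calls through an ops table instead of
      inlined ticket-lock code, which is where the extra call/return on
      native comes from.  The types and names below are simplified stand-ins
      for the x86 paravirt code, not the kernel's actual definitions.

        struct raw_spinlock;                    /* opaque for this sketch */

        /* Table of lock operations; Xen installs its own implementations,
         * native boots keep the ticket-lock functions. */
        struct pv_lock_ops {
                void (*spin_lock)(struct raw_spinlock *lock);
                void (*spin_unlock)(struct raw_spinlock *lock);
        };

        extern struct pv_lock_ops pv_lock_ops;

        static inline void pv_spin_lock(struct raw_spinlock *lock)
        {
                /*
                 * Out-of-line call through the ops table.  Paravirt patching
                 * later turns this into a direct call to __ticket_spin_lock(),
                 * but the extra call/return pair around the locked operation
                 * remains - the ~2% CPI hit measured above.
                 */
                pv_lock_ops.spin_lock(lock);
        }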
  8. 15 May 2009, 9 commits