1. 30 May 2009, 2 commits
    • [ARM] alternative copy_to_user/clear_user implementation · 39ec58f3
      Authored by Lennert Buytenhek
      This implements {copy_to,clear}_user() by faulting in the userland
      pages and then using the regular kernel mem{cpy,set}() to copy the
      data (while holding the page table lock).  This is a win if the regular
      mem{cpy,set}() implementations are faster than the user copy functions,
      which is the case e.g. on Feroceon, where 8-word STMs (which memcpy()
      uses under the right conditions) give significantly higher memory write
      throughput than a sequence of individual 32bit stores.
      
      Here are numbers for page sized buffers on some Feroceon cores:
      
       - copy_to_user on Orion5x goes from 51 MB/s to 83 MB/s
 - clear_user on Orion5x goes from 89 MB/s to 314 MB/s
       - copy_to_user on Kirkwood goes from 240 MB/s to 356 MB/s
       - clear_user on Kirkwood goes from 367 MB/s to 1108 MB/s
       - copy_to_user on Disco-Duo goes from 248 MB/s to 398 MB/s
       - clear_user on Disco-Duo goes from 328 MB/s to 1741 MB/s
      
      Because the setup cost is non-negligible, this is worthwhile only if
      the amount of data to copy is large enough.  The operation falls back
      to the standard implementation when the amount of data is below a
      certain threshold.  This threshold was determined empirically; some
      targets could eventually benefit from a lower, runtime-determined
      value for optimal results (a hedged sketch of this fallback scheme
      follows this entry).
      
      In the copy_from_user() case, this technique does not provide any
      worthwhile performance gain due to the fact that any kind of read access
      allocates the cache and subsequent 32bit loads are just as fast as the
      equivalent 8-word LDM.
      Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
      Signed-off-by: Nicolas Pitre <nico@marvell.com>
      Tested-by: Martin Michlmayr <tbm@cyrius.com>
      39ec58f3
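      To make the fallback scheme concrete, here is a hedged C sketch of the
      idea: above an empirically chosen threshold, fault the user pages in
      and copy with the regular kernel memcpy() under the page table lock;
      below it, call the standard routine.  The helper pin_page_for_write()
      and the threshold constant are assumptions for the sake of the sketch,
      not quotes from the actual ARM patch.

        unsigned long __copy_to_user(void __user *to, const void *from,
                                     unsigned long n)
        {
                /* Small copies: setup cost dominates, so use the standard
                 * STRT-based routine (see the weak alias in the next commit). */
                if (n < COPY_TO_USER_THRESHOLD)         /* empirically chosen */
                        return __copy_to_user_std(to, from, n);

                while (n) {
                        pte_t *pte;
                        spinlock_t *ptl;
                        unsigned long chunk;

                        /* Fault the destination page in and take the page
                         * table lock so the page cannot go away under us. */
                        if (!pin_page_for_write(to, &pte, &ptl))
                                break;

                        /* Copy at most up to the end of the current page. */
                        chunk = PAGE_SIZE - ((unsigned long)to & ~PAGE_MASK);
                        if (chunk > n)
                                chunk = n;

                        /* Plain kernel memcpy() can now use 8-word STMs. */
                        memcpy((void __force *)to, from, chunk);

                        pte_unmap_unlock(pte, ptl);
                        to += chunk;
                        from += chunk;
                        n -= chunk;
                }
                return n;       /* bytes NOT copied, per the usual convention */
        }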
    • [ARM] allow for alternative __copy_to_user/__clear_user implementations · a1f98849
      Authored by Nicolas Pitre
      This allows for optional alternative implementations of __copy_to_user
      and __clear_user, with a possible runtime fallback to the standard
      version when the alternative provides no gain over that standard
      version. This is done by making the standard __copy_to_user into a weak
      alias for the symbol __copy_to_user_std.  Same thing for __clear_user.
      
      Those two functions are particularly good candidates for alternative
      implementations, since they rely on the STRT instruction, which has
      lower performance than STM instructions on some CPU cores such as
      the ARM1176 and Marvell Feroceon (see the sketch after this entry).
      Signed-off-by: Nicolas Pitre <nico@marvell.com>
      a1f98849
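      For reference, here is a self-contained user-space illustration of the
      weak-alias technique (the kernel does the equivalent in assembly for
      __copy_to_user/__copy_to_user_std; the plain names below are simplified
      stand-ins):

        #include <string.h>

        /* "Standard" implementation, always reachable under its own name. */
        unsigned long copy_to_user_std(void *to, const void *from, unsigned long n)
        {
                memcpy(to, from, n);    /* stand-in for the real STRT loop */
                return 0;               /* 0 bytes left uncopied */
        }

        /*
         * copy_to_user is only a weak alias for copy_to_user_std, so an object
         * file providing a strong copy_to_user() (e.g. an optimised variant)
         * silently overrides it at link time, and can still call
         * copy_to_user_std() as a runtime fallback.
         */
        unsigned long copy_to_user(void *to, const void *from, unsigned long n)
                __attribute__((weak, alias("copy_to_user_std")));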
  2. 22 May 2009, 5 commits
  3. 21 May 2009, 4 commits
  4. 19 May 2009, 2 commits
  5. 18 May 2009, 10 commits
    • microblaze: Fix kind-of-intr checking against number of interrupts · 7b7210d7
      Authored by Michal Simek
      Also fix a typographic error.
      Signed-off-by: Michal Simek <monstr@monstr.eu>
      7b7210d7
    • microblaze: Update Microblaze defconfig · 3026589c
      Authored by Michal Simek
      Signed-off-by: Michal Simek <monstr@monstr.eu>
      3026589c
    • [ARM] Double check memmap is actually valid with a memmap has unexpected holes V2 · eb33575c
      Authored by Mel Gorman
      pfn_valid() is meant to be able to tell if a given PFN has valid memmap
      associated with it or not. In FLATMEM, it is expected that holes always
      have valid memmap as long as there is valid PFNs either side of the hole.
      In SPARSEMEM, it is assumed that a valid section has a memmap for the
      entire section.
      
      However, ARM and maybe other embedded architectures in the future free
      memmap backing holes to save memory on the assumption the memmap is never
      used. The page_zone linkages are then broken even though pfn_valid()
      returns true. A walker of the full memmap must then do this additional
      check to ensure the memmap they are looking at is sane by making sure the
      zone and PFN linkages are still valid. This is expensive, but walkers of
      the full memmap are extremely rare.
      
      This was caught before for FLATMEM and hacked around but it hits again for
      SPARSEMEM because the page_zone linkages can look ok where the PFN linkages
      are totally screwed. This looks like a hatchet job but the reality is that
      any clean solution would end up consuming all the memory saved by punching
      these unexpected holes in the memmap. For example, we tried marking the
      memmap within the section invalid but the section size exceeds the size of
      the hole in most cases so pfn_valid() starts returning false where valid
      memmap exists. Shrinking the size of the section would increase memory
      consumption offsetting the gains.
      
      This patch identifies when an architecture is punching unexpected holes
      in the memmap that the memory model cannot automatically detect and sets
      ARCH_HAS_HOLES_MEMORYMODEL.  At the moment this is restricted to EP93xx,
      the sub-architecture this has been reported on, but it may expand later.
      When set, walkers of the full memmap must call memmap_valid_within() for
      each PFN, passing in the page and zone they expect for that PFN.  If the
      linkages turn out to be broken, the memmap is assumed to be invalid for
      that PFN (a hedged sketch of such a check follows this entry).
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      eb33575c
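      The check the commit describes boils down to verifying that a struct
      page's linkages round-trip correctly; a hedged sketch, reconstructed
      from the description above (the in-tree helper may differ in detail):

        /* Non-zero if the memmap entry for this PFN looks sane: the page must
         * map back to the same PFN and belong to the zone the walker expects. */
        static inline int memmap_valid_within(unsigned long pfn,
                                              struct page *page,
                                              struct zone *zone)
        {
                if (page_to_pfn(page) != pfn)
                        return 0;
                if (page_zone(page) != zone)
                        return 0;
                return 1;
        }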
    • powerpc: Explicit alignment for .data.cacheline_aligned · 0e337b42
      Authored by Benjamin Herrenschmidt
      I don't think anything guarantees that the objects in .data.page_aligned
      are a multiple of PAGE_SIZE, thus the section may end on any boundary.
      
      So the following section, .data.cacheline_aligned needs an explicit
      alignment.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      0e337b42
    • powerpc/ps3: Update ps3_defconfig · dc892288
      Authored by Geoff Levand
      Refresh and set these options:
      
       CONFIG_SYSFS_DEPRECATED_V2: y -> n
       CONFIG_INPUT_JOYSTICK:      y -> n
       CONFIG_HID_SONY:            n -> m
       CONFIG_RTC_DRV_PS3:         - -> m
      Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      dc892288
    • powerpc/ftrace: Fix constraint to be early clobber · c3cf8667
      Authored by Steven Rostedt
      After upgrading my distcc boxes from gcc 4.2.2 to 4.4.0, the function
      graph tracer broke. This was discovered on my x86 boxes.
      
      The issue is that gcc used the same register for an output as it did for
      an input in an asm statement. I first thought this was a bug in gcc and
      reported it. I was notified that gcc was correct and that the output had
      to be flagged as an "early clobber".
      
      I noticed that powerpc had the same issue, and this patch fixes it (an
      illustration of the early-clobber constraint follows this entry).
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      c3cf8667
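      For reference, this is what the early-clobber marker looks like in GCC
      inline asm.  The fragment below is an illustrative powerpc example, not
      the actual ftrace code; the function and operand names are made up.

        /*
         * Without the '&', GCC may allocate "tmp" and "in" to the same
         * register, so writing the output early would corrupt the input
         * that is still needed.  "=&r" tells GCC the output is written
         * before all inputs have been consumed.
         */
        static inline unsigned long copy_then_add(unsigned long in)
        {
                unsigned long tmp;

                asm("mr   %0,%1\n\t"    /* tmp  = in (output written early)  */
                    "add  %0,%0,%1"     /* tmp += in (input must still live) */
                    : "=&r" (tmp)       /* early-clobber output              */
                    : "r" (in));
                return tmp;
        }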
    • powerpc/ftrace: Use pr_devel() in ftrace.c · 021376a3
      Authored by Michael Ellerman
      pr_debug() can now result in code being generated even when DEBUG
      is not defined.  That's not really desirable in the ftrace code,
      which we want to be snappy (a hedged sketch of the difference
      follows this entry).
      
      With CONFIG_DYNAMIC_DEBUG=y:
      
      size before:
         text	   data	    bss	    dec	    hex	filename
         3334	    672	      4	   4010	    faa	arch/powerpc/kernel/ftrace.o
      
      size after:
         text	   data	    bss	    dec	    hex	filename
         2616	    360	      4	   2980	    ba4	arch/powerpc/kernel/ftrace.o
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      021376a3
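      Roughly, the distinction is this (a simplified sketch, not the exact
      kernel macro definitions): pr_devel() compiles away entirely unless
      DEBUG is defined, while with CONFIG_DYNAMIC_DEBUG=y every pr_debug()
      call site emits code plus per-callsite metadata so it can be toggled
      at runtime, which is where the size growth above comes from.

        #ifdef DEBUG
        #define pr_devel(fmt, ...)  printk(KERN_DEBUG fmt, ##__VA_ARGS__)
        #else
        #define pr_devel(fmt, ...)  do { } while (0)   /* no code emitted */
        #endif

        /* pr_debug() under CONFIG_DYNAMIC_DEBUG always generates a call and a
         * descriptor entry for runtime control, even in a non-DEBUG build. */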
    • powerpc: Do not assert pte_locked for hugepage PTE entries · af3e4aca
      Authored by Mel Gorman
      With CONFIG_DEBUG_VM, an assertion is made when changing the protection
      flags of a PTE that the PTE is locked.  Huge pages use a different
      pagetable format, so the assertion is bogus for them and will always
      trigger, producing an oops that looks something like
      
       Unable to handle kernel paging request for data at address 0xf1a00235800006f8
       Faulting instruction address: 0xc000000000034a80
       Oops: Kernel access of bad area, sig: 11 [#1]
       SMP NR_CPUS=32 NUMA Maple
       Modules linked in: dm_snapshot dm_mirror dm_region_hash
        dm_log dm_mod loop evdev ext3 jbd mbcache sg sd_mod ide_pci_generic
        pata_amd ata_generic ipr libata tg3 libphy scsi_mod windfarm_pid
        windfarm_smu_sat windfarm_max6690_sensor windfarm_lm75_sensor
        windfarm_cpufreq_clamp windfarm_core i2c_powermac
       NIP: c000000000034a80 LR: c000000000034b18 CTR: 0000000000000003
       REGS: c000000003037600 TRAP: 0300   Not tainted (2.6.30-rc3-autokern1)
       MSR: 9000000000009032 <EE,ME,IR,DR>  CR: 28002484  XER: 200fffff
       DAR: f1a00235800006f8, DSISR: 0000000040010000
       TASK = c0000002e54cc740[2960] 'map_high_trunca' THREAD: c000000003034000 CPU: 2
       GPR00: 4000000000000000 c000000003037880 c000000000895d30 c0000002e5a2e500
       GPR04: 00000000a0000000 c0000002edc40880 0000005700000393 0000000000000001
       GPR08: f000000011ac0000 01a00235800006e8 00000000000000f5 f1a00235800006e8
       GPR12: 0000000028000484 c0000000008dd780 0000000000001000 0000000000000000
       GPR16: fffffffffffff000 0000000000000000 00000000a0000000 c000000003037a20
       GPR20: c0000002e5f4ece8 0000000000001000 c0000002edc40880 0000000000000000
       GPR24: c0000002e5f4ece8 0000000000000000 00000000a0000000 c0000002e5f4ece8
       GPR28: 0000005700000393 c0000002e5a2e500 00000000a0000000 c000000003037880
       NIP [c000000000034a80] .assert_pte_locked+0xa4/0xd0
       LR [c000000000034b18] .ptep_set_access_flags+0x6c/0xb4
       Call Trace:
       [c000000003037880] [c000000003037990] 0xc000000003037990 (unreliable)
       [c000000003037910] [c000000000034b18] .ptep_set_access_flags+0x6c/0xb4
       [c0000000030379b0] [c00000000014bef8] .hugetlb_cow+0x124/0x674
       [c000000003037b00] [c00000000014c930] .hugetlb_fault+0x4e8/0x6f8
       [c000000003037c00] [c00000000013443c] .handle_mm_fault+0xac/0x828
       [c000000003037cf0] [c0000000000340a8] .do_page_fault+0x39c/0x584
       [c000000003037e30] [c0000000000057b0] handle_page_fault+0x20/0x5c
       Instruction dump:
       7d29582a 7d200074 7800d182 0b000000 3c004000 3960ffff 780007c6 796b00c4
       7d290214 7929a302 1d290068 7d6b4a14 <800b0010> 7c000074 7800d182 0b000000
      
      This patch fixes the problem by not asserting that the PTE is locked for
      VMAs backed by huge pages (a hedged sketch of the check follows this entry).
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      af3e4aca
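      A hedged sketch of the shape of the fix: under CONFIG_DEBUG_VM the
      assertion is simply skipped for hugetlb VMAs, whose PTEs are not covered
      by the ordinary page table lock.  The surrounding function is abridged;
      only the names visible in the oops above are taken from the source.

        int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
                                  pte_t *ptep, pte_t entry, int dirty)
        {
                int changed = !pte_same(*ptep, entry);

                if (changed) {
        #ifdef CONFIG_DEBUG_VM
                        /* Hugepage PTEs use a different pagetable format, so
                         * the "PTE page is locked" assertion does not apply. */
                        if (!is_vm_hugetlb_page(vma))
                                assert_pte_locked(vma->vm_mm, address);
        #endif
                        /* ... update the PTE and flush the TLB as before ... */
                }
                return changed;
        }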
    • [ARM] realview: fix broadcast tick support · ee348d5a
      Authored by Russell King
      Having discussed broadcast tick support with Thomas Gleixner, the
      broadcast tick device should be registered with a higher rating
      than the global tick device, and it should have the ONESHOT and
      PERIODIC feature flags set (a hedged sketch follows this entry).
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      ee348d5a
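      A hedged sketch of what that registration amounts to; the function name
      and the rating value are illustrative (the rating merely has to exceed
      that of the global tick device), and the rest of the clock_event_device
      setup is omitted.

        static void broadcast_timer_setup(struct clock_event_device *evt)
        {
                /* Advertise both modes so the broadcast code can use either. */
                evt->features = CLOCK_EVT_FEAT_ONESHOT |
                                CLOCK_EVT_FEAT_PERIODIC |
                                CLOCK_EVT_FEAT_DUMMY;
                /* Rate the per-CPU dummy device above the global tick device
                 * so it is selected and ticks arrive via the broadcast path. */
                evt->rating = 400;
                evt->cpumask = cpumask_of(smp_processor_id());

                clockevents_register_device(evt);
        }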
  6. 17 May 2009, 4 commits
  7. 16 May 2009, 4 commits
    • ARM: OMAP2/3: Change omapfb to use clkdev for dispc and rfbi, v2 · 005187ee
      Authored by Tony Lindgren
      This makes the framebuffer work on omap3.
      
      Also fix the clk_get usage flagged by checkpatch.pl:
      "ERROR: do not use assignment in if condition" (illustrated after this entry).
      
      Cc: Imre Deak <imre.deak@nokia.com>
      Cc: linux-fbdev-devel@lists.sourceforge.net
      Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      005187ee
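      As an illustration of the checkpatch-friendly pattern referred to above
      (a hedged sketch; the clock name and error handling are only examples):

        /* Flagged:  if (IS_ERR(clk = clk_get(dev, "dispc_ick"))) ...        */
        /* Preferred: keep the assignment out of the if condition.           */
        clk = clk_get(dev, "dispc_ick");        /* clock name is illustrative */
        if (IS_ERR(clk)) {
                dev_err(dev, "can't get dispc_ick\n");
                return PTR_ERR(clk);
        }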
    • ARM: OMAP3: Fix HW SAVEANDRESTORE shift define · 8dbe4393
      Authored by Kalle Jokiniemi
      The OMAP3430ES2_SAVEANDRESTORE_SHIFT macro is used by the powerdomain
      code as "1 << OMAP3430ES2_SAVEANDRESTORE_SHIFT", but the definition was
      also (1 << 4), meaning we actually modified bit 16.  So the definition
      needs to be just 4 (illustrated after this entry).
      
      This also fixes a cold reset HW bug in OMAP3430 ES3.x
      where some of the efuse bits are not isolated during
      wake-up from off mode. This can cause randomish
      cold resets with off mode. Enabling the USBTLL hardware
      SAVEANDRESTORE causes the core power up assert to be
      delayed in a way that we will not get faulty values
      when boot ROM is reading the unisolated registers.
      Signed-off-by: Kalle Jokiniemi <kalle.jokiniemi@digia.com>
      Acked-by: Kevin Hilman <khilman@deeprootsystems.com>
      Acked-by: Paul Walmsley <paul@pwsan.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      8dbe4393
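      Spelled out, the off-by-a-shift bug looks like this (a hedged
      illustration based on the commit text; the _OLD suffix is invented
      to show both versions side by side):

        /* Before: the "shift" macro was already a bit mask ...              */
        #define OMAP3430ES2_SAVEANDRESTORE_SHIFT_OLD    (1 << 4)    /* == 16 */
        /* ... so the powerdomain code's "1 << SHIFT" touched bit 16:        */
        /*      1 << OMAP3430ES2_SAVEANDRESTORE_SHIFT_OLD  ==  1 << 16       */

        /* After: the macro is the shift amount, and bit 4 is modified:      */
        #define OMAP3430ES2_SAVEANDRESTORE_SHIFT        4
        /*      1 << OMAP3430ES2_SAVEANDRESTORE_SHIFT      ==  1 << 4        */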
    • ARM: OMAP3: Fix number of GPIO lines for 34xx · e102657e
      Authored by Vikram Pandita
      As per the 3430 TRM, there are 6 GPIO banks of 32 lines each, i.e.
      GPIO lines 0 to 191.
      Signed-off-by: Tom Rix <Tom.Rix@windriver.com>
      Signed-off-by: Vikram Pandita <vikram.pandita@ti.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      e102657e
    • x86: Fix performance regression caused by paravirt_ops on native kernels · b4ecc126
      Authored by Jeremy Fitzhardinge
      Xiaohui Xin and some other folks at Intel have been looking into what's
      behind the performance hit of paravirt_ops when running native.
      
      It appears that the hit is entirely due to the paravirtualized
      spinlocks introduced by:
      
       | commit 8efcbab6
       | Date:   Mon Jul 7 12:07:51 2008 -0700
       |
       |     paravirt: introduce a "lock-byte" spinlock implementation
      
      The extra call/return in the spinlock path is somehow
      causing an increase in the cycles/instruction of somewhere around 2-7%
      (seems to vary quite a lot from test to test).  The working theory is
      that the CPU's pipeline is getting upset about the
      call->call->locked-op->return->return, and seems to be failing to
      speculate (though I haven't seen anything definitive about the precise
      reasons).  This doesn't entirely make sense, because the performance
      hit is also visible on unlock and other operations which don't involve
      locked instructions.  But spinlock operations clearly swamp all the
      other pvops operations, even though I can't imagine that they're
      nearly as common (there's only a .05% increase in instructions
      executed).
      
      If I disable just the pv-spinlock calls, my tests show that pvops is
      identical to non-pvops performance on native (my measurements show that
      it is actually about .1% faster, but Xiaohui shows a .05% slowdown).
      
      Summary of results, averaging 10 runs of the "mmperf" test, using a
      no-pvops build as baseline:
      
      		nopv		Pv-nospin	Pv-spin
      CPU cycles	100.00%		99.89%		102.18%
      instructions	100.00%		100.10%		100.15%
      CPI		100.00%		99.79%		102.03%
      cache ref	100.00%		100.84%		100.28%
      cache miss	100.00%		90.47%		88.56%
      cache miss rate	100.00%		89.72%		88.31%
      branches	100.00%		99.93%		100.04%
      branch miss	100.00%		103.66%		107.72%
      branch miss rt	100.00%		103.73%		107.67%
      wallclock	100.00%		99.90%		102.20%
      
      The clear effect here is that the 2% increase in CPI is
      directly reflected in the final wallclock time.
      
      (The other interesting effect is that the more ops are
      out of line calls via pvops, the lower the cache access
      and miss rates.  Not too surprising, but it suggests that
      the non-pvops kernel is over-inlined.  On the flipside,
      the branch misses go up correspondingly...)
      
      So, what's the fix?
      
      Paravirt patching turns all the pvops calls into direct calls, so
      _spin_lock etc do end up having direct calls.  For example, the compiler
      generated code for paravirtualized _spin_lock is:
      
      <_spin_lock+0>:		mov    %gs:0xb4c8,%rax
      <_spin_lock+9>:		incl   0xffffffffffffe044(%rax)
      <_spin_lock+15>:	callq  *0xffffffff805a5b30
      <_spin_lock+22>:	retq
      
      The indirect call will get patched to:
      <_spin_lock+0>:		mov    %gs:0xb4c8,%rax
      <_spin_lock+9>:		incl   0xffffffffffffe044(%rax)
      <_spin_lock+15>:	callq <__ticket_spin_lock>
      <_spin_lock+20>:	nop; nop		/* or whatever 2-byte nop */
      <_spin_lock+22>:	retq
      
      One possibility is to inline _spin_lock, etc, when building an
      optimised kernel (ie, when there's no spinlock/preempt
      instrumentation/debugging enabled).  That will remove the outer
      call/return pair, returning the instruction stream to a single
      call/return, which will presumably execute the same as the non-pvops
      case.  The downsides are: 1) it will replicate the
      preempt_disable/enable code at each lock/unlock callsite; this code is
      fairly small, but not nothing; and 2) the spinlock definitions are
      already a very heavily tangled mass of #ifdefs and other preprocessor
      magic, and making any changes will be non-trivial.
      
      The other obvious answer is to disable pv-spinlocks.  Making them a
      separate config option is fairly easy, and it would be trivial to
      enable them only when Xen is enabled (as the only non-default user).
      But it doesn't really address the common case of a distro build which
      is going to have Xen support enabled, and leaves the open question of
      whether the native performance cost of pv-spinlocks is worth the
      performance improvement on a loaded Xen system (10% saving of overall
      system CPU when guests block rather than spin).  Still, it is a
      reasonable short-term workaround (a hedged sketch of the pvops spinlock
      dispatch follows this entry).
      
      [ Impact: fix pvops performance regression when running native ]
      Analysed-by: N"Xin Xiaohui" <xiaohui.xin@intel.com>
      Analysed-by: N"Li Xin" <xin.li@intel.com>
      Analysed-by: N"Nakajima Jun" <jun.nakajima@intel.com>
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Acked-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Xen-devel <xen-devel@lists.xensource.com>
      LKML-Reference: <4A0B62F7.5030802@goop.org>
      [ fixed the help text ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      b4ecc126
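      A hedged sketch of the dispatch structure under discussion: spinlock
      operations become out-of-line calls through an ops table instead of
      inlined ticket-lock code, which is where the extra call/return on
      native comes from.  The types and names below are simplified stand-ins
      for the x86 paravirt code, not the kernel's actual definitions.

        struct raw_spinlock;                    /* opaque for this sketch */

        /* Table of lock operations; Xen installs its own implementations,
         * native boots keep the ticket-lock functions. */
        struct pv_lock_ops {
                void (*spin_lock)(struct raw_spinlock *lock);
                void (*spin_unlock)(struct raw_spinlock *lock);
        };

        extern struct pv_lock_ops pv_lock_ops;

        static inline void pv_spin_lock(struct raw_spinlock *lock)
        {
                /*
                 * Out-of-line call through the ops table.  Paravirt patching
                 * later turns this into a direct call to __ticket_spin_lock(),
                 * but the extra call/return pair around the locked operation
                 * remains - the ~2% CPI hit measured above.
                 */
                pv_lock_ops.spin_lock(lock);
        }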
  8. 15 May 2009, 9 commits